@perseveranza-pets/milo
TypeScript icon, indicating that this package has built-in type declarations

0.2.1 • Public • Published

Milo

Milo is a fast and embeddable HTTP/1.1 parser written in Rust.

It is usable in JavaScript via WebAssembly.

How to use it

Install it from npm:

npm install @perseveranza-pets/milo

Then create a sample source file:

const milo = require('@perseveranza-pets/milo')

// Prepare a message to parse.
const message = Buffer.from('HTTP/1.1 200 OK\r\nContent-Length: 3\r\n\r\nabc')

// Allocate a memory in the WebAssembly space. This speeds up data copying to the WebAssembly layer.
const ptr = milo.alloc(message.length)

// Create a buffer we can use normally.
const buffer = Buffer.from(milo.memory.buffer, ptr, message.length)

// Create the parser.
const parser = milo.create()

/*
  Milo works using callbacks.

  All callbacks have the same signature, which characterizes the payload:
  
    * The current parent
    * from: The payload offset.
    * size: The payload length.
    
  The payload parameters above are relative to the last data sent to the milo.parse method.

  If the current callback has no payload, both values are set to 0.
*/
milo.setOnData(parser, (p, from, size) => {
  console.log(`Pos=${milo.getPosition(p)} Body: ${message.slice(from, from + size).toString()}`)
})

// Now perform the main parsing using milo.parse. The method returns the number of consumed characters.
buffer.set(message, 0)
const consumed = milo.parse(parser, ptr, message.length)

// Cleanup used resources.
milo.destroy(parser)
milo.dealloc(ptr, message.length)

Finally build and execute it using node:

node index.js
# Pos=38 Body: abc

API

The module exports several constants (* is used to denote a family prefix):

  • FLAG_DEBUG: If the debug informations are enabled or not.
  • MESSAGE_TYPE_*: The type of the parser: it can autodetect (default) or only parse requests or response.
  • ERROR_*: An error code.
  • METHOD_*: An HTTP/RTSP request method.
  • CONNECTION_*: A Connection header value.
  • CALLBACK_*: A parser callback.
  • STATE_*: A parser state.

Callbacks handling

All callback in Milo have the following signature (TypeScript syntax):

(parser: number, offset: number, length: number) => void

where the parameters have the following meaning:

  1. The current parser.
  2. The payload offset. Can be 0.
  3. The data length. Can be 0.

If both offset and length are 0, it means the callback has no payload associated.

MessageTypes

An enum listing all possible message types.

Access is supported from string constant or numeric value.

Errors

An enum listing all possible parser errors.

Access is supported from string constant or numeric value.

Methods

An enum listing all possible HTTP/RTSP methods.

Access is supported from string constant or numeric value.

Connections

An enum listing all possible connection (Connection header value) types.

Access is supported from string constant or numeric value.

Callbacks

An enum listing all possible parser callbacks.

Access is supported from string constant or numeric value.

States

An enum listing all possible parser states.

Access is supported from string constant or numeric value.

alloc

Allocates a shared memory area with the WebAssembly instance which can be used to pass data to the parser.

The returned value MUST be destroyed later using dealloc.

dealloc(ptr)

Deallocates a shared memory area created with alloc.

create

Creates a new parser.

The returned value MUST be destroyed later using destroy.

destroy(parser)

Destroys a parser.

parse(parser, data, limit)

Parses data up to limit characters.

It returns the number of consumed characters.

reset(parser)

Resets a parser. The second parameters specifies if to also reset the parsed counter.

The following fields are not modified:

  • position
  • context
  • mode
  • manage_unconsumed
  • continue_without_data
  • context

clear(parser)

Clears all values about the message in the parser.

The connection and message type fields are not cleared.

pause(parser)

Pauses the parser. The parser will have to be resumed via resume.

resume(parser)

Resumes the parser.

finish(parser)

Marks the parser as finished. Any new invocation of milo::milo_parse will put the parser in the error state.

fail(parser, code, description)

Marks the parsing a failed, setting a error code and and error message.

getMode(parser)

Returns the parser mode.

isPaused(parser)

Returns true if the parser is paused.

manageUnconsumed(parser)

Returns true if the parser should automatically copy and prepend unconsumed data.

continueWithoutData(parser)

Returns true if the next execution of the parse loop should execute even if there is no more data.

isConnect(parser)

Returns true if the current request used CONNECT method.

skipBody(parser)

Returns true if the parser should skip the body.

getState(parser)

Returns the parser state.

getPosition(parser)

Returns the parser position.

getParsed(parser)

Returns the total bytes consumed from this parser.

getErrorCode(parser)

Returns the parser error.

getMessageType(parser)

Returns the parser current message type.

getMethod(parser)

Returns the parser current request method.

getStatus(parser)

Returns the parser current response status.

getVersionMajor(parser)

Returns the parser current message HTTP version major version.

getVersionMinor(parser)

Returns the parser current message HTTP version minor version.

getConnection(parser)

Returns the parser value for the connection header.

getContentLength(parser)

Returns the parser value of the Content-Length header.

getChunkSize(parser)

Returns the parser expected length of the next chunk.

getRemainingContentLength(parser)

Returns the parser missing data length of the body according to the content_length field.

getRemainingChunkSize(parser)

Returns the parser missing data length of the next chunk according to to the chunk_size field.

hasContentLength(parser)

Returns true if the parser the current message has a Content-Length header.

hasChunkedTransferEncoding(parser)

Returns true if the parser the current message has a Transfer-Encoding: chunked header.

hasUpgrade(parser)

Returns true if the parser the current message has a Connection: upgrade header.

hasTrailers(parser)

Returns true if the parser the current message has a Trailers header.

getErrorDescription(parser)

Returns the parser error description or null.

getCallbackError(parser)

Returns the parser callback error or null.

setMode(parser, value)

Sets the parser mode.

setManageUnconsumed(parser, value)

Sets if the parser should automatically copy and prepend unconsumed data.

setContinueWithoutData(parser, value)

Sets if the next execution of the parse loop should execute even if there is no more data.

setSkipBody(parser, value)

Set if the parser should skip the body.

setIsConnect(parser, value)

Sets if the current request used the CONNECT method.

setBeforeStateChange(parser, cb)

Sets the parser before_state_change callback.

setAfterStateChange(parser, cb)

Sets the parser after_state_change callback.

setOnError(parser, cb)

Sets the parser on_error callback.

setOnFinish(parser, cb)

Sets the parser on_finish callback.

setOnMessageStart(parser, cb)

Sets the parser on_message_start callback.

setOnMessageComplete(parser, cb)

Sets the parser on_message_complete callback.

setOnRequest(parser, cb)

Sets the parser on_request callback.

setOnResponse(parser, cb)

Sets the parser on_response callback.

setOnReset(parser, cb)

Sets the parser on_reset callback.

setOnMethod(parser, cb)

Sets the parser on_method callback.

setOnUrl(parser, cb)

Sets the parser on_url callback.

setOnProtocol(parser, cb)

Sets the parser on_protocol callback.

setOnVersion(parser, cb)

Sets the parser on_version callback.

setOnStatus(parser, cb)

Sets the parser on_status callback.

setOnReason(parser, cb)

Sets the parser on_reason callback.

setOnHeaderName(parser, cb)

Sets the parser on_header_name callback.

setOnHeaderValue(parser, cb)

Sets the parser on_header_value callback.

setOnHeaders(parser, cb)

Sets the parser on_headers callback.

setOnConnect(parser, cb)

Sets the parser on_connect callback.

setOnUpgrade(parser, cb)

Sets the parser on_upgrade callback.

setOnChunkLength(parser, cb)

Sets the parser on_chunk_length callback.

setOnChunkExtensionName(parser, cb)

Sets the parser on_chunk_extension_name callback.

setOnChunkExtensionValue(parser, cb)

Sets the parser on_chunk_extension_value callback.

setOnChunk(parser, cb)

Sets the parser on_chunk callback.

setOnBody(parser, cb)

Sets the parser on_body callback.

setOnData(parser, cb)

Sets the parser on_data callback.

setOnTrailerName(parser, cb)

Sets the parser on_trailer_name callback.

setOnTrailerValue(parser, cb)

Sets the parser on_trailer_value callback.

setOnTrailers(parser, cb)

Sets the parser on_trailers callback.

How it works?

Milo leverages Rust's procedural macro, syn and quote crates to allow an easy definition of states and matchers for the parser.

See the macros internal crate for more information.

The data matching is possible thanks to power of the Rust's match statement applied to data slices.

The resulting parser is as simple state machine which copies the data in only one (optional) specific case: to automatically handle unconsumed portion of the input data.

In all other all cases, no data is copied and the memory footprint is very small as only 30 bool, uintprt_t or uint64_t fields can represent the entire parser state.

Why?

The scope of Milo is to replace llhttp as Node.js main HTTP parser.

This project aims to:

  • Make it maintainable and verificable using easy to read Rust code.
  • Be performant by avoiding any unnecessary data copy.
  • Be self-contained and dependency-free.

To see the rationale behind the replacement of llhttp, check Paolo's talk at Vancouver's Node Collab Summit in January 2023 (slides).

To see the initial disclosure of milo, check Paolo's talk at NodeConf EU 2023 in November 2023 (slides).

Sponsored by

NearForm

Contributing to milo

  • Check out the latest master to make sure the feature hasn't been implemented or the bug hasn't been fixed yet.
  • Check out the issue tracker to make sure someone already hasn't requested it and/or contributed it.
  • Fork the project.
  • Start a feature/bugfix branch.
  • Commit and push until you are happy with your contribution.
  • Make sure to add tests for it. This is important so I don't break it in a future version unintentionally.

Copyright

Copyright (C) 2023 and above Paolo Insogna (paolo@cowtech.it) and NearForm (https://nearform.com).

Licensed under the ISC license, which can be found at https://choosealicense.com/licenses/isc or in the LICENSE.md file.

Readme

Keywords

Package Sidebar

Install

npm i @perseveranza-pets/milo

Weekly Downloads

116

Version

0.2.1

License

ISC

Unpacked Size

186 kB

Total Files

7

Last publish

Collaborators

  • shogun_panda