This module mainly defines the http_protocol class, which implements the
exchange of messages with an HTTP client. The request messages are represented
as a sequence of req_token values. The response is encapsulated in a separate
http_response class. The contents of the response are represented as a sequence
of resp_token values.
These are the serious protocol violations after which the daemon stops any further processing.
`Timeout refers to a timeout in the middle of a request.
`Broken_pipe_ignore is the "harmless" version of `Broken_pipe.
Long messages are fatal because they are suspected to be denial-of-service
attacks. The kernel generates `Message_too_long only for long headers, not
for long bodies.
Fatal server errors can happen when exceptions are not properly handled. As a last resort, the HTTP daemon closes the connection without notifying the client.
A bad request is a violation where the current request cannot be decoded, and it is not possible to accept further requests over the current connection.
Convert error to a string, for logging
A data_chunk is a substring of a string. The substring is described by
(s, pos, len), where s is the container, pos is the position where the
substring begins, and len is its length.
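The (s, pos, len) description can be illustrated with a minimal sketch. The type alias and the helper chunk_contents below are hypothetical names introduced only for this illustration; they are not claimed to be part of the module.

```ocaml
(* Hypothetical model of a data chunk: (container, start position, length). *)
type data_chunk = string * int * int

(* Return the substring that the chunk describes.
   String.sub raises Invalid_argument if the range is out of bounds. *)
let chunk_contents ((s, pos, len) : data_chunk) : string =
  String.sub s pos len

(* A chunk that covers only the status part of a status line. *)
let example : data_chunk = ("HTTP/1.1 200 OK", 9, 6)
```

Here chunk_contents example evaluates to "200 OK", while the container string itself is left untouched.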
= (code, phrase)
resp_token represents a textual part of the response to send:
`Resp_info_line is an informational status line (code=100..199). There can be several informational lines, and they can be accompanied with their own headers. Such lines are only sent to HTTP/1.1 clients.
`Resp_status_line is the final status line to send (code >= 200)
`Resp_header is the whole response header to send
`Resp_body is the next part of the response body to send.
`Resp_trailer is the whole response trailer to send (currently ignored)
`Resp_action is special because it does not directly represent a token to send. The argument is a function which is called when the token is the next token on the active event queue. The function is also called when the event queue is dropped because of an error (the state of the response object indicates this). The function must not raise exceptions except Unix_error, and it must not block.
The response state:
`Inhibited = it is not yet allowed to start the response
`Queued = the response waits on the queue for activation
`Active = the response is currently being transmitted
`Processed = the response has been completely sent
`Error = an error occurred during the transmission of this response
`Dropped = an earlier response forced the connection to close, and this response was dequeued
Tokens generated by
`Resp_wire_data are data tokens.
`Resp_end indicates the end of the response.
Represents the action of sending the response
This class has an internal
queue of response tokens that are not yet processed. One can easily add
new tokens to the end of the queue (
The class is responsible for determining the transfer encoding:
TE request header is not taken into account. The trailer
is always empty.
The following headers are set (or removed) by this class:
Server (it is appended to this field)
Responses for HEAD requests have the special behaviour that the body is silently dropped. The calculation of header fields is not affected by this. This means that HEAD can be easily implemented by doing the same as for GET.
Responses for other requests that must not include a body must set
Content-Length to 0.
These methods can be called by the content provider:
The bidirectional phase starts after "100 Continue" has been sent to the client, and stops when the response body begins. The bidirectional phase is special for the calculation of timeout values (input determines the timeout although the response has started).
Return whether the send queue is empty. When the state is
method fakes an empty queue.
Returns whether the connection should be closed after this response.
This flag should be evaluated when the
`Resp_end front token has been
Returns the selected transfer encoding. This is valid after the header
has been passed to this object with
The first token of the queue, represented as
Send_queue_empty when there is currently no front token, or the state
If there is a front token, it will never have length 0.
Unix_error exceptions can be raised when
tokens are processed.
The function will be called either when set_state changes the state or when
the send queue becomes empty. Note that the callback must never fail: it is
called in situations that make it hard to recover from errors.
Accumulated size of the response body
These methods must only be called by the HTTP protocol processor:
Tells this object that n bytes of the front token could really be sent using
Unix.write. If this means that the whole front token has been sent, the next
token is pulled from the queue and is made the new front token. Otherwise,
the data chunk representing the front token is modified such that the
position is advanced by n, and the length is reduced by n.
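The bookkeeping described here can be sketched in isolation. The record send_queue and the function advance below are hypothetical stand-ins for illustration; they model only the front-token arithmetic and do not perform any socket I/O or state handling.

```ocaml
(* Hypothetical model: a chunk is (container, position, length). *)
type data_chunk = string * int * int

(* Hypothetical stand-in for the send queue: the front chunk plus the
   chunks still waiting behind it. *)
type send_queue = {
  mutable front : data_chunk option;
  pending : data_chunk Queue.t;
}

(* Record that n bytes of the front chunk were written. *)
let advance q n =
  match q.front with
  | None -> invalid_arg "advance: no front token"
  | Some (s, pos, len) ->
      if n < 0 || n > len then invalid_arg "advance: bad byte count";
      if n = len then
        (* Whole front chunk sent: pull the next one, if any. *)
        q.front <- Queue.take_opt q.pending
      else
        (* Partial write: move the start forward, shrink the length. *)
        q.front <- Some (s, pos + n, len - n)
```

A partial write of 2 bytes on ("hello", 0, 5) leaves ("hello", 2, 3) as the front chunk; writing the remaining 3 bytes promotes the next pending chunk.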
Encapsulation of the HTTP response for a single request
Sends the string argument as response body, together with the given status and the header (optional). Response header fields are set as follows:
Content-Length is set to the length of the string.
Content-Type is set to "text/html" unless given by the header. If the header object is passed in, these modifications are done directly in this object as a side effect.
Sends the contents of a file as response body, together with the given status and the header (optional). The descriptor must be a file descriptor (that cannot block). The int64 number is the length of the body. Response header fields are set as follows:
Content-Length is set to the length of the string.
Content-Type is set to "text/html" unless given by the header.
Content-Range is not set automatically, even if the file is only
If the header object is passed in, these modifications are done directly in this object as a side effect.
The function does not send the file immediately, but rather sets the object
up so that the next chunk of the file is added when the send queue becomes
empty. The file will be closed when the transfer is done.
req_token represents a textual part of the received request:
`Req_header is the full received header. Together with the header, the corresponding http_response object is returned which must be used to transmit the response.
`Req_expect_100_continue is generated when the client expects that the server sends a "100 Continue" response (or a final status code) now. One should add `Resp_info_line resp_100_continue to the send queue if the header is acceptable, or otherwise generate an error response. In any case, the rest of the request must be read until
`Req_body is a part of the request body. The transfer-coding, if any, is already decoded.
`Req_trailer is the received trailer
`Req_end indicates the end of the request (the next request may begin immediately).
`Eof indicates the end of the stream
`Bad_request_error indicates that the request violated the HTTP protocol in a serious way and cannot be decoded. It is required to send a "400 Bad Request" response. The following token will be
`Fatal_error indicates that the connection crashed. The following token will be
`Timeout means that nothing has been received for a certain amount of time, and the protocol is in a state that the next request can begin. The following token will be
Note that it is always allowed to
send tokens to the client. The protocol
implementation takes care that the response is transmitted at the right point
Maximum size of the request line. Longer lines are immediately answered with a "Request URI too long" response. Suggestion: 32768.
Maximum size of the header, including the request line. Longer headers
are treated as an attack and cause the fatal error `Message_too_long.
Maximum size of the trailer
Limits the length of the pipeline (= unreplied requests). A value of 0 disables pipelining. A value of n allows another request to be received even though there are already n unreplied requests.
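One possible reading of this limit, written as a predicate. The function may_receive_next and its labels are hypothetical and exist only to make the rule concrete; the actual receiver logic is not shown in this documentation.

```ocaml
(* Hypothetical predicate: with [limit] as the pipeline length limit and
   [unreplied] as the number of requests still waiting for a reply,
   may the receiver accept one more request?
   With limit = 0 a new request is accepted only when nothing is pending,
   which amounts to no pipelining. *)
let may_receive_next ~limit ~unreplied = unreplied <= limit
```

Under this reading, the default limit of 5 accepts a sixth request while five are still unreplied, but not a seventh.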
Limits the size of the pipeline in bytes. If the buffered bytes in the input queue exceed this value, the receiver temporarily stops reading more data. The value 0 has the effect that even the read-ahead of data of the current request is disabled. The value (-1) disables the receiver completely (not recommended).
Whether to set the
`Ignore: The kernel does not touch the
`Ocamlnet: Announce this web server as "Ocamlnet/<version>"
`Ocamlnet_and s: Announce this web server as s and append the Ocamlnet string.
`As s: Announce this web server as
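The four announcement variants can be modelled as a function from the variant to an optional announcement string. The function server_field, the ~version label, and in particular the single-space separator used for `Ocamlnet_and are assumptions made for this sketch; the documentation above does not specify the exact formatting.

```ocaml
(* Hypothetical helper: compute the announcement string, or None when the
   field is left alone. The separator in the `Ocamlnet_and case is an
   assumption, not documented behaviour. *)
let server_field ~version = function
  | `Ignore -> None
  | `Ocamlnet -> Some ("Ocamlnet/" ^ version)
  | `Ocamlnet_and s -> Some (s ^ " Ocamlnet/" ^ version)
  | `As s -> Some s
```

For example, server_field ~version:"X" (`As "MyServer") yields Some "MyServer", and `Ignore yields None.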
Whether to suppress `Broken_pipe errors. Instead, `Broken_pipe_ignore is reported.
Configuration values for the HTTP kernel
config_max_reqline_length = 32768
config_max_header_length = 65536
config_max_trailer_length = 32768
config_limit_pipeline_length = 5
config_limit_pipeline_size = 65536
config_announce_server = `Ocamlnet
config_suppress_broken_pipe = false
Modifies the passed config object as specified by the optional arguments
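The idea of a default configuration adjusted through optional arguments can be sketched with a plain record. Everything below (the record sketch_config, sketch_default, sketch_modify, and the shortened field names) is hypothetical and only mirrors the default values listed above; the real configuration is an object, and the announce setting is omitted here for brevity.

```ocaml
(* Hypothetical record mirroring the numeric and boolean defaults. *)
type sketch_config = {
  max_reqline_length : int;
  max_header_length : int;
  max_trailer_length : int;
  limit_pipeline_length : int;
  limit_pipeline_size : int;
  suppress_broken_pipe : bool;
}

let sketch_default = {
  max_reqline_length = 32768;
  max_header_length = 65536;
  max_trailer_length = 32768;
  limit_pipeline_length = 5;
  limit_pipeline_size = 65536;
  suppress_broken_pipe = false;
}

(* Return a copy of [c] in which only the given options are replaced. *)
let sketch_modify ?max_header_length ?limit_pipeline_length
    ?suppress_broken_pipe c =
  let pick opt current = Option.value opt ~default:current in
  { c with
    max_header_length = pick max_header_length c.max_header_length;
    limit_pipeline_length =
      pick limit_pipeline_length c.limit_pipeline_length;
    suppress_broken_pipe =
      pick suppress_broken_pipe c.suppress_broken_pipe }
```

For instance, sketch_modify ~limit_pipeline_length:0 sketch_default gives a configuration with pipelining disabled and all other values unchanged.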
Exchange of HTTP messages
In fd one must pass the already connected socket. It must be in non-blocking mode.
How to use this class: Basically, one invokes cycle until the whole message
exchange on fd is processed. cycle receives data from the socket and sends
data to the socket. There are two internal queues:
The receive queue stores parts of received requests as req_token values.
One can take values from the front of this queue by calling receive.
The response queue stores
http_response objects. Each of the objects
corresponds to a request that was received before. This queue is handled
fully automatically, but one can watch its length to see whether all responses
are actually transmitted over the wire.
The basic algorithm to process messages is:
let rec next_token () =
  if proto # recv_queue_len = 0 then (
    proto # cycle ();
    next_token()
  )
  else
    proto # receive()
in
let cur_token = ref (next_token()) in
while !cur_token <> `Eof do
  (* Process first token of next request: *)
  match !cur_token with
   | `Req_header(req_line, header, resp) ->
       (* Depending on [req_line], read further tokens until [`Req_end] *)
       ...
       (* Switch to the first token of the next message: *)
       cur_token := next_token()
   | `Timeout -> ...
   | `Bad_request_error(e,resp) ->
       (* Generate 400 error, send it to [resp] *)
       ...
       (* Switch to the first token of the next message: *)
       cur_token := next_token()
   | `Fatal_error e -> failwith "Crash"
   | _ -> assert false
done;
(* Flush the responses that are still queued: *)
while proto # resp_queue_len > 0 do
  proto # cycle ();
done;
proto # shutdown()
See the file
tests/easy_daemon.ml for a complete implementation of this.
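The control flow of this loop can be tried without a socket by substituting a scripted stand-in for the protocol object. The sketch below is only a simulation: make_mock, run, and the reduced token type are hypothetical, the mock delivers one scripted token per cycle call, and it pretends that each further cycle call transmits one queued response. Error tokens and shutdown are left out.

```ocaml
(* Reduced, hypothetical token type for the simulation. *)
type token = [ `Req_header of string | `Req_end | `Eof ]

(* Scripted stand-in for the protocol object. The script must end
   with `Eof, otherwise the loop in [run] would never terminate. *)
let make_mock (script : token list) = object
  val recv : token Queue.t = Queue.create ()
  val mutable remaining = script
  val mutable unsent = 0
  method recv_queue_len = Queue.length recv
  method resp_queue_len = unsent
  method cycle () =
    match remaining with
    | tok :: rest ->
        Queue.add tok recv;
        remaining <- rest;
        (* Every request header implies one response to deliver. *)
        (match tok with `Req_header _ -> unsent <- unsent + 1 | _ -> ())
    | [] -> if unsent > 0 then unsent <- unsent - 1
  method receive () = Queue.take recv
end

(* Same shape as the loop above: pull tokens until `Eof, then keep
   cycling until the response queue is drained. Returns the request
   lines in arrival order. *)
let run proto =
  let rec next_token () =
    if proto#recv_queue_len = 0 then (proto#cycle (); next_token ())
    else proto#receive ()
  in
  let seen = ref [] in
  let cur = ref (next_token ()) in
  while !cur <> `Eof do
    (match !cur with
     | `Req_header line -> seen := line :: !seen
     | `Req_end | `Eof -> ());
    cur := next_token ()
  done;
  while proto#resp_queue_len > 0 do proto#cycle () done;
  List.rev !seen
```

Running it on a script with two requests followed by `Eof returns both request lines and leaves the mock's response queue empty, which mirrors the role of the final draining loop.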
As one can see, it is essential to watch the lengths of the queues in order
to figure out what has happened during
When the body of the request is empty,
`Req_body tokens are omitted.
Note that for requests like GET that always have an empty body, it is
still possible that an erroneous client sends a body, and that `Req_body
tokens arrive. One must accept and ignore these tokens.
Error handling: For serious errors, the connection is immediately aborted.
In this case,
receive returns a
`Fatal_error token. Note that the
queued responses cannot be sent! An example of this is
There is a large class of non-serious errors, esp. format errors
in the header and body. It is typical of these errors that one cannot determine
the end of the request properly. For this reason, the daemon stops reading
further data from the request, but the response queue is still delivered.
For these errors, receive returns a `Bad_request_error token.
This token contains an http_response object that must be filled with a
400 error response.
Looks at the file descriptor. If there is data to read from the descriptor,
and there is free space in the input buffer, additional data is read into
the buffer. The method also tries to interpret the new data as req_tokens,
and if possible, new req_tokens are appended to the receive queue.
If the response queue has objects, there is data ready to be sent, and the socket accepts output, the method tries to send as much data as possible.
block (default: 0) can be set to wait until data
can be exchanged with the socket. This avoids busy waiting. The number
is the duration in seconds to wait until the connection times out
(0 means not to wait at all, -1 means to wait infinitely). When a timeout
happens, and there is nothing to send, and the last request was fully
receive will simply return
`Timeout (i.e. when
true). Otherwise, the
`Timeout is generated.
Returns the first
req_token from the receive queue. Raises
Recv_queue_empty when the queue is empty (= has no new data)
Peeks at the first token, but leaves it in the queue. Raises
Recv_queue_empty when the queue is empty.
Returns the length of the receive queue (number of tokens)
Returns the length of the internal response queue (number of http_response
objects that have not yet been fully processed)
Returns the number of unanswered requests = Number of received
minus number of responses in state
`Processed. Note that
-1 when bad requests are responded.
Returns the (estimated) size of the input queue in bytes
Whether the kernel is currently waiting for the beginning of a new
arriving HTTP request. This is
false while the request is being
Suggests the calculation of a timeout value for input:
`Normal: The normal timeout value applies
`Next_message: The timeout value applies while waiting for the next message
`None: The connection is output-driven, no input timeout value
Shuts the socket down. Note: the descriptor is not closed.
Process a timeout condition as
Stops the transmission of data. The receive queue is cleared and filled
with the two tokens
The response queue is cleared. The
method will return immediately without doing anything.
true iff the protocol engine is interested in new data from the
false after EOF and after errors.
true iff the protocol engine has data to output to the socket
true when a lingering close operation is needed to reliably shut
down the socket. In many cases, this expensive operation is not necessary.
See the class
For testing: returns a list of tokens indicating into which cases the program ran.
The core event loop of the HTTP daemon
Closes a file descriptor using the "lingering close" algorithm
while lc # lingering do lc # cycle ~block:true () done
Reads data from the file descriptor until EOF or until a fixed timeout
is over. Finally, the descriptor is closed. If
block is set, the method
blocks until data is available. (Default:
Whether the socket is still lingering
Closes a file descriptor using the "lingering close" algorithm.
preclose function is called just before