Implementing HTTP over UDP in Node.js
An introduction to TCP and UDP in Node.js, and a Quic mention of HTTP/3
Let’s start with the most basic example. We would like to have an HTTP server that can say “Hi” to people. So given that we have a client and a server, a request that says “Eytan” will get a response of “Hi, Eytan”.
This is how it looks in Node.js:
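A minimal sketch, assuming the name travels as the body of a POST request and the server listens on an OS-assigned port on 127.0.0.1. The function names (`createGreetingServer`, `greet`, `main`) are illustrative choices:

```javascript
const http = require('node:http');

// Server: reads the request body (a name) and replies with a greeting.
function createGreetingServer() {
  return http.createServer((req, res) => {
    let name = '';
    req.setEncoding('utf8');
    req.on('data', (chunk) => { name += chunk; });
    req.on('end', () => res.end(`Hi, ${name}`));
  });
}

// Client: sends the name as the request body and resolves with the response body.
// `agent: false` uses a one-off connection, so nothing lingers after the reply.
function greet(port, name) {
  return new Promise((resolve, reject) => {
    const options = { host: '127.0.0.1', port, method: 'POST', agent: false };
    const req = http.request(options, (res) => {
      let body = '';
      res.setEncoding('utf8');
      res.on('data', (chunk) => { body += chunk; });
      res.on('end', () => resolve(body));
    });
    req.on('error', reject);
    req.end(name);
  });
}

// Start on port 0 (the OS picks a free one), send one request, then shut down.
async function main() {
  const server = createGreetingServer();
  await new Promise((resolve) => server.listen(0, '127.0.0.1', resolve));
  const reply = await greet(server.address().port, 'Eytan');
  server.close();
  return reply;
}

main().then((reply) => console.log(reply)); // prints "Hi, Eytan"
```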
The example is built to be simple and thus doesn’t necessarily follow HTTP standards; let’s roll with it for now.
Now I would like to make a hypothesis — HTTP is based on another protocol called “TCP” (which is indeed true according to the HTTP specs, but let’s pretend for a moment that we don’t know that yet). If I could build a proxy that’s based on TCP, and successfully pass messages between HTTP client and server, the hypothesis would be proven right.
Node.js comes with a built-in module called “net”, which provides an API for creating TCP clients and servers. Unlike HTTP:
- A client can’t send anything until a connection has been established. Once it has, messages can be passed freely between a client and a server.
- A TCP connection is a bi-directional stream, meaning that messages can be sent from either party at any point.
- Once the exchange is complete, the connection has to be closed explicitly.
If a TCP client would like to be greeted by a server, this is how it would look:
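A minimal sketch with the “net” module, assuming the name is short enough to arrive in a single chunk and that the server closes the connection right after greeting. The function names are illustrative:

```javascript
const net = require('node:net');

// Server: greets whoever writes a name on the connection, then ends its side.
function createGreetingServer() {
  return net.createServer((socket) => {
    socket.setEncoding('utf8');
    socket.once('data', (name) => socket.end(`Hi, ${name}`));
  });
}

// Client: the connection has to be established before anything can be written.
function greet(port, name) {
  return new Promise((resolve, reject) => {
    let reply = '';
    const socket = net.connect(port, '127.0.0.1', () => socket.write(name));
    socket.setEncoding('utf8');
    socket.on('data', (chunk) => { reply += chunk; });
    socket.on('end', () => resolve(reply)); // the server has closed its side
    socket.on('error', reject);
  });
}

async function main() {
  const server = createGreetingServer();
  await new Promise((resolve) => server.listen(0, '127.0.0.1', resolve));
  const reply = await greet(server.address().port, 'Eytan');
  server.close();
  return reply;
}

main().then((reply) => console.log(reply)); // prints "Hi, Eytan"
```

Since TCP is a stream, a longer message could arrive split across several `data` events; that is why the client accumulates chunks until the server ends the connection.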
So putting together everything that we’ve learned thus far, this is how an HTTP message exchange would look with a TCP proxy in between:
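A minimal sketch, assuming everything runs on 127.0.0.1 with OS-assigned ports. The proxy only copies raw bytes in both directions and records them in a `seen` array (an illustrative name) so they can be logged:

```javascript
const http = require('node:http');
const net = require('node:net');

// Starts a server on a free port and resolves with that port.
const listen = (server) => new Promise((resolve) =>
  server.listen(0, '127.0.0.1', () => resolve(server.address().port)));

async function main() {
  // HTTP server: same greeting logic as before.
  const httpServer = http.createServer((req, res) => {
    let name = '';
    req.setEncoding('utf8');
    req.on('data', (chunk) => { name += chunk; });
    req.on('end', () => res.end(`Hi, ${name}`));
  });
  const httpPort = await listen(httpServer);

  // TCP proxy: forwards bytes both ways without knowing they are HTTP.
  const seen = [];
  const proxy = net.createServer((client) => {
    const upstream = net.connect(httpPort, '127.0.0.1');
    client.on('data', (chunk) => seen.push(`--> ${chunk}`));
    upstream.on('data', (chunk) => seen.push(`<-- ${chunk}`));
    client.pipe(upstream);
    upstream.pipe(client);
    client.on('error', () => upstream.destroy());
    upstream.on('error', () => client.destroy());
  });
  const proxyPort = await listen(proxy);

  // HTTP client: talks to the proxy, not to the HTTP server directly.
  const reply = await new Promise((resolve, reject) => {
    const options = { host: '127.0.0.1', port: proxyPort, method: 'POST', agent: false };
    const req = http.request(options, (res) => {
      let body = '';
      res.setEncoding('utf8');
      res.on('data', (chunk) => { body += chunk; });
      res.on('end', () => resolve(body));
    });
    req.on('error', reject);
    req.end('Eytan');
  });

  proxy.close();
  httpServer.close();
  return { reply, seen };
}

main().then(({ reply, seen }) => {
  console.log(seen.join('\n')); // raw request and response bytes seen by the proxy
  console.log(reply);
});
```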
Note the logs; the request and response contents made it through, despite the fact that the TCP proxy has no awareness whatsoever of HTTP being involved.
As promised in the beginning of this article, I’m going to try to strip out the HTTP abstraction layer, i.e., achieve HTTP communication without an HTTP client or server. We know for a fact that HTTP is based on TCP, which means we can use a TCP client / server to simulate an HTTP request / response, as long as we stick to the HTTP message format. However, without a parser, there’s no way for us to read the messages, at least not a robust one.
After digging into Node’s “http” module, it occurred to me that it uses an internal module called “_http_common” that exports an HTTP parser, which we can use to parse HTTP messages. Since it’s used internally, its API isn’t documented, but we have plenty of usage examples in the test-http-parser.js file. Needless to say, the parser shouldn’t be used in production; internal components are for internal use only. But if you’re on board with my experiment, please ensure that you’re using Node.js v18.10.0 for similar results.
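To make the idea concrete without depending on the undocumented parser, here is a minimal sketch that speaks HTTP over plain TCP sockets and reads messages with a deliberately naive split on the blank line. The helpers `parseMessage`, `createServer` and `request` are hypothetical names made up for illustration, and the code assumes each message arrives whole and well-formed, which a real parser would not assume:

```javascript
const net = require('node:net');

// Naive stand-in for a real HTTP parser: start line, headers, body.
// Assumes the whole message is available and well-formed.
function parseMessage(raw) {
  const split = raw.indexOf('\r\n\r\n');
  const [startLine, ...headerLines] = raw.slice(0, split).split('\r\n');
  const headers = Object.fromEntries(headerLines.map((line) => {
    const colon = line.indexOf(':');
    return [line.slice(0, colon).trim().toLowerCase(), line.slice(colon + 1).trim()];
  }));
  return { startLine, headers, body: raw.slice(split + 4) };
}

// TCP server that answers in the HTTP message format.
function createServer() {
  return net.createServer((socket) => {
    socket.setEncoding('utf8');
    socket.once('data', (raw) => {
      const greeting = `Hi, ${parseMessage(raw).body}`;
      socket.end(
        'HTTP/1.1 200 OK\r\n' +
        `Content-Length: ${Buffer.byteLength(greeting)}\r\n` +
        'Connection: close\r\n\r\n' +
        greeting);
    });
  });
}

// TCP client that writes an HTTP request by hand and parses the reply.
function request(port, name) {
  return new Promise((resolve, reject) => {
    let raw = '';
    const socket = net.connect(port, '127.0.0.1', () => socket.write(
      'POST / HTTP/1.1\r\n' +
      'Host: 127.0.0.1\r\n' +
      `Content-Length: ${Buffer.byteLength(name)}\r\n` +
      'Connection: close\r\n\r\n' +
      name));
    socket.setEncoding('utf8');
    socket.on('data', (chunk) => { raw += chunk; });
    socket.on('end', () => resolve(parseMessage(raw)));
    socket.on('error', reject);
  });
}

async function main() {
  const server = createServer();
  await new Promise((resolve) => server.listen(0, '127.0.0.1', resolve));
  const response = await request(server.address().port, 'Eytan');
  server.close();
  return response;
}

main().then((response) => console.log(response.startLine, '|', response.body));
```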
Before we continue, I would like to mention a few facts about TCP (from Wikipedia):
- TCP is reliable — It manages message acknowledgment, re-transmission and timeouts. Multiple attempts to deliver the message are made. If data gets lost along the way, data will be re-sent. In TCP, there’s either no missing data, or, in case of multiple timeouts, the connection is dropped.
- Packets are ordered — When data segments arrive in the wrong order, TCP buffers the out-of-order data until all data can be properly re-ordered and delivered to the application.
Despite these great attributes, TCP is heavyweight: it requires three packets to set up a socket connection before any user data can be sent. And since an HTTP/1.1 connection serves one request at a time, 2 or more parallel HTTP/1.1 requests will need to utilize multiple TCP connections, which can be expensive.
HTTP/2 tries to mitigate that by managing parallel requests across a single connection. However, as part of TCP’s reliability mechanism, if a packet times out repeatedly, the entire connection will drop, which will cause all requests to fail unnecessarily. In addition, a delayed packet, or one that arrives out of order, can stall all requests across the board (a problem known as head-of-line blocking).
One may ask — is there an alternative protocol for passing parallel HTTP messages without having collective consequences? And the answer is yes — the “UDP” protocol. Unlike TCP:
- There’s no such concept of connection. Either you send data and hopefully someone receives it, or you listen to data and hopefully someone transmits it.
- You can only send small chunks of data which don’t necessarily represent a complete message (see MTU), and there are no delimiters whatsoever unless you explicitly include them.
- As a result, having a request / response mechanism is much less trivial, but still achievable.
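To illustrate what including delimiters explicitly could mean, here is a sketch of a made-up framing scheme: every chunk carries an `<id>:<index>:<total>|` prefix, so the receiver can rebuild a message regardless of the order in which its chunks show up. The format, the helper names and the tiny chunk size are all assumptions for the sake of the example, not something UDP or Node.js defines:

```javascript
// Tiny on purpose; in practice the limit is dictated by the MTU.
const CHUNK_SIZE = 8;

// Splits a message into chunks, each prefixed with "<id>:<index>:<total>|".
function toChunks(id, message, size = CHUNK_SIZE) {
  const payload = Buffer.from(message);
  const total = Math.max(1, Math.ceil(payload.length / size));
  return Array.from({ length: total }, (_, index) => Buffer.concat([
    Buffer.from(`${id}:${index}:${total}|`),
    payload.subarray(index * size, (index + 1) * size),
  ]));
}

// Collects chunks per message id; returns the message once every part is in,
// or null while parts are still missing.
function createAssembler() {
  const pending = new Map();
  return function accept(chunk) {
    const split = chunk.indexOf('|');
    const [id, index, total] = chunk.subarray(0, split).toString().split(':');
    const parts = pending.get(id) ?? [];
    parts[Number(index)] = chunk.subarray(split + 1);
    pending.set(id, parts);
    if (parts.filter(Boolean).length < Number(total)) return null;
    pending.delete(id);
    return Buffer.concat(parts).toString();
  };
}

// Usage: deliver the chunks in reverse order; only the last call completes.
const accept = createAssembler();
const chunks = toChunks('req-1', 'Hi, Eytan! Nice to meet you.').reverse();
for (const chunk of chunks) console.log(accept(chunk));
```

Lost chunks are not handled here; a message with a missing part simply stays pending forever, which is exactly the kind of gap a protocol on top of UDP has to close.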
Similar to TCP, Node.js comes with a built-in module that implements UDP, called “dgram”. If a UDP client would like to be greeted by a server, this is how it would look:
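A minimal sketch, assuming both sockets live on 127.0.0.1, the server binds to an OS-assigned port, and the greeting fits in one datagram:

```javascript
const dgram = require('node:dgram');

async function main() {
  // Server: no connections; it reacts to whatever datagram shows up
  // and replies to the address it came from.
  const server = dgram.createSocket('udp4');
  server.on('message', (name, rinfo) => {
    server.send(`Hi, ${name}`, rinfo.port, rinfo.address);
  });
  await new Promise((resolve) => server.bind(0, '127.0.0.1', resolve));

  // Client: associate the socket with the server's address, then send.
  const client = dgram.createSocket('udp4');
  await new Promise((resolve) =>
    client.connect(server.address().port, '127.0.0.1', resolve));

  const reply = await new Promise((resolve, reject) => {
    client.once('message', (msg) => resolve(msg.toString()));
    client.send('Eytan', (err) => err && reject(err));
  });

  client.close();
  server.close();
  return reply;
}

main().then((reply) => console.log(reply)); // prints "Hi, Eytan"
```

If the datagram got lost, the promise would never settle; on a loopback interface that is unlikely, but a real client would want a timeout.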
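Note that the client.connect() function doesn’t actually create a connection; it only associates the socket with a remote address, so that outgoing messages go there by default and incoming traffic from anyone else is filtered out.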
So given that we have an HTTP parser, this is how an HTTP solution can be implemented over UDP:
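A minimal sketch of the idea, with two caveats: instead of the internal parser it uses a naive, hypothetical `parseBody` helper, and it assumes every HTTP message fits in a single datagram. The `format` helper is likewise made up, and the messages are bare-bones rather than fully standards-compliant:

```javascript
const dgram = require('node:dgram');

// Builds a bare-bones HTTP message: start line, Content-Length, blank line, body.
const format = (startLine, body) =>
  `${startLine}\r\nContent-Length: ${Buffer.byteLength(body)}\r\n\r\n${body}`;

// Naive stand-in for a real HTTP parser: returns whatever follows the blank line.
const parseBody = (datagram) => {
  const raw = datagram.toString();
  return raw.slice(raw.indexOf('\r\n\r\n') + 4);
};

async function main() {
  // "HTTP server": one datagram in, one datagram out.
  const server = dgram.createSocket('udp4');
  server.on('message', (msg, rinfo) => {
    const response = format('HTTP/1.1 200 OK', `Hi, ${parseBody(msg)}`);
    server.send(response, rinfo.port, rinfo.address);
  });
  await new Promise((resolve) => server.bind(0, '127.0.0.1', resolve));

  // "HTTP client": sends the request and waits for a single datagram back.
  const client = dgram.createSocket('udp4');
  await new Promise((resolve) =>
    client.connect(server.address().port, '127.0.0.1', resolve));
  const response = await new Promise((resolve, reject) => {
    client.once('message', (msg) => resolve(msg.toString()));
    client.send(format('POST / HTTP/1.1', 'Eytan'), (err) => err && reject(err));
  });

  client.close();
  server.close();
  return response;
}

main().then((response) => console.log(response));
```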
As much as it looks like an ideal alternative to TCP, I would hold my horses. UDP comes with some major drawbacks (from Wikipedia):
- UDP is unreliable — When a UDP message is sent, it cannot be known if it will reach its destination; it could get lost along the way. There is no concept of acknowledgment, re-transmission, or timeout.
- Packets don’t arrive in order — If two messages are sent to the same recipient, the order in which they arrive cannot be guaranteed.
Despite these issues, because UDP imposes no ordering or connection-wide state of its own, it leaves room for multiplexing independent streams, and so a new protocol was invented, called “Quic”. Quic provides an API somewhat closer to TCP, but it’s built on top of UDP. By using some clever algorithms and techniques, Quic manages to mitigate many of the drawbacks of the UDP protocol. This means:
- Quic is reliable, and will try to re-transmit lost packets.
- Data arrives in order within each stream, so a lost packet only delays the stream it belongs to.
- Quic is lightweight; setting up a connection takes fewer round trips than TCP combined with TLS.
Accordingly, HTTP/3, which is relatively new and still experimental (see browser compatibility table and Node.js discussion), fixes the issues presented by HTTP/2 by taking advantage of the Quic protocol.
Are we going to say goodbye to TCP at some point? Perhaps this is a good question for another article.