Writing a reverse proxy/loadbalancer from the ground up in C, part 1: a trivial single-threaded proxy

12 August 2013

This is the first step along my road to building a simple C-based reverse proxy/loadbalancer so that I can understand how nginx/OpenResty works — more explanation here. It’s called rsp, for Really Simple Proxy. This version listens for connections on a particular port, specified on the command line; when one is made it sends the request down to a backend — another server with an associated port, also specified on the command line — and sends whatever comes back from the backend back to the person who made the original connection. It can only handle one connection at a time — while it’s handling one, it just queues up others, and it handles them in turn. This will, of course, change later.

I’m posting this in the hope that it might help people who know Python, and some basic C, but want to learn more about how the OS-level networking stuff works. I’m also vaguely hoping that any readers who code in C day to day might take a look and tell me what I’m doing wrong :-)

The code that I’ll be describing is hosted on GitHub as a project called rsp, for “Really Simple Proxy”. It’s MIT licensed, and the version of it I’ll be walking through in this blog post is as of commit f214f5a. I’ll copy and paste the code that I’m describing into this post anyway, so if you’re following along there’s no need to do any kind of complicated checkout.

The repository has this structure:

  +- README.md
  +- LICENSE.md
  +- setup-env.sh
  +- run_integration_tests
  +- promote_to_live
  +- handle_integration_error
  +- .gitignore
  +- fts
     \--- test_can_proxy_http_request_to_backend.py
  +- src
     +--- Makefile
     \--- rsp.c

README.md, LICENSE.md and .gitignore are pretty self-explanatory.

setup-env.sh contains a few shell commands that, when run on a fresh Ubuntu machine, will install all of the dependencies for compiling rsp.

fts, short for Functional Tests, contains one Python script; this creates a simple Python web server on a specific port on localhost, then starts rsp configured so that all requests that come in on its own port get forwarded to that backend. It then sends a request to rsp and checks that the response that comes back is the one from the backend, then does another to make sure it can handle multiple requests. This is the most trivial test I could think of for a first cut at rsp, and this blog post contains an explanation of how the minimal C code to make that test pass works. Over time, I’ll add more test scripts to the repository to check each incremental improvement in rsp’s functionality. Naturally, I’m writing all of this test-first (though I’m too lazy to write unit tests right now — this may well come back to bite me later).

src contains a Makefile that knows how to build rsp, and the code for the proxy itself, rsp.c. As you might expect, most of the rest of this post will focus on the latter.

Finally, there’s run_integration_tests (which runs all of the Python tests in fts), promote_to_live, which pushes the repository it’s in to origin/master, and handle_integration_error. This is boilerplate code for the not-quite-continuous-integration system I use for my own projects, leibniz — basically just a git repository setup and a hook that makes it impossible for me to push stuff that doesn’t pass functional tests to GitHub. You can probably ignore all of those, though if you do check out the repository then run_integration_tests is a convenient way to run the FTs in one go.

So, that’s the structure. I won’t go into a description of how the Python functional test works, as I expect most readers here will understand it pretty well. And I’m assuming that you have at least a nodding acquaintance with Makefiles, so I won’t explain that bit. So, on to the C code!

rsp.c contains code for an incredibly basic proxy. It’s actually best understood by working from the top down, so let’s go. First, some pretty standard header includes:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <netdb.h>
#include <unistd.h>

A couple of constants for later use:

#define MAX_LISTEN_BACKLOG 1
#define BUFFER_SIZE 4096

And on to our first function — one that, given a connection to a client (ie. browser) that’s hit the proxy, connects to a backend, sends the client’s request to it, then gets the response from the backend and sends it back to the client. This is simpler than the code that handles the process of listening for incoming connections, so it’s worth running through it first.

void handle_client_connection(int client_socket_fd, 
                              char *backend_host, 
                              char *backend_port_str)
{

The first parameter is an integer file descriptor for the client socket connection (which has already been established by the time we get here). A file descriptor is the low-level C way of describing an open file, or a file-like object like a socket connection. The kernel has some kind of big array, where each item in the array describes all of the inner details about an open file-like thing, and the integer file descriptor is ultimately an index into that array.
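To make the "file-like" point a bit more concrete, here is a minimal sketch showing that the same read and write calls we'll use on sockets work on any file descriptor. It uses a pipe rather than a socket to keep it short, and echo_through_pipe is a made-up name for illustration, not something from rsp.

```c
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper (not part of rsp): pushes `msg` through a pipe and
 * reads it back into `out`. Returns the number of bytes read, or -1. */
ssize_t echo_through_pipe(const char *msg, char *out, size_t out_size)
{
    int fds[2];             /* fds[0] is the read end, fds[1] the write end */
    ssize_t bytes_read;

    if (pipe(fds) == -1) {
        return -1;
    }

    /* write() and read() only care that they were given a descriptor;
     * it could equally be a file or a socket. */
    if (write(fds[1], msg, strlen(msg)) == -1) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    close(fds[1]);

    bytes_read = read(fds[0], out, out_size);
    close(fds[0]);
    return bytes_read;
}
```

The two descriptors that pipe hands back are just small integers, exactly like the client_socket_fd that gets passed into our function.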

We also take a string specifying the address of the backend we’re going to connect to, and another specifying the port on the backend. We accept a string for the port (rather than an integer) because one of the library calls we’re going to use accepts service names as strings rather than port numbers; the most immediate advantage of this is that we can just specify "http" and let it work out that this means port 80. Hardly a huge win, but neat enough.
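Here is a rough sketch of what that service-to-port translation looks like in isolation. lookup_port is a hypothetical helper of my own, and I've restricted it to IPv4 purely so that the cast to struct sockaddr_in is safe.

```c
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <netdb.h>

/* Hypothetical helper (not part of rsp): returns the TCP port number that
 * getaddrinfo picks for `service` on `host`, or -1 if the lookup fails. */
int lookup_port(const char *host, const char *service)
{
    struct addrinfo hints;
    struct addrinfo *addrs;
    int port;

    memset(&hints, 0, sizeof(hints));
    hints.ai_family = AF_INET;       /* IPv4 only, so the cast below is valid */
    hints.ai_socktype = SOCK_STREAM;

    if (getaddrinfo(host, service, &hints, &addrs) != 0) {
        return -1;
    }

    /* The port is stored in network byte order, so convert it first. */
    port = ntohs(((struct sockaddr_in *) addrs->ai_addr)->sin_port);

    freeaddrinfo(addrs);
    return port;
}
```

With this, lookup_port("127.0.0.1", "8080") gives back 8080, and passing "http" instead should give 80 on any machine whose services database (usually /etc/services) has the standard entry for it.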

Right, next our local variable definitions — these need to go at the start of the function — an annoying requirement in C [Update: turns out that it hasn’t been a requirement since the 1999 C standard, so later posts update this] — but we’ll backtrack to talk about what each one’s used for as we use them.

    struct addrinfo hints;
    struct addrinfo *addrs;
    struct addrinfo *addrs_iter;
    int getaddrinfo_error;

    int backend_socket_fd;

    char buffer[BUFFER_SIZE];
    int bytes_read;

So now it’s time to actually do something. Our first step is to convert the hostname/service descriptor strings we have into something that we can use to make a network connection. Historically one might have used the gethostbyname library function, but that’s apparently frowned upon these days; it’s non-reentrant and makes it hard to support both IPv4 and IPv6. The hip way to get host information is by using getaddrinfo, so that’s what we’ll do.

getaddrinfo needs three things: the hostname and service we’re connecting to, and some hints telling it what kind of thing we’re interested in hearing about. For example, in our case we only want to know about addresses of machines that can handle streaming sockets, which we can read from and write to like files, rather than datagrams, where we send lumps of data back and forth, one lump at a time.

We already have the hostname and service passed in as arguments, so our first step is to set up a structure to represent these hints. We have the local variable that was defined earlier as:

    struct addrinfo hints;

So we need to set some values on it. In C code, a struct that’s allocated on the stack as a local variable like that has completely undefined contents, which means that we need to clear it out by setting everything in it to zeros using memset, like this:

    memset(&hints, 0, sizeof(struct addrinfo));

…and once that’s done, we can fill in the things we’re interested in:

    hints.ai_family = AF_UNSPEC;
    hints.ai_socktype = SOCK_STREAM;

AF_UNSPEC means that we’re happy with either IPv4 or IPv6 results, and SOCK_STREAM means that we want something that supports streaming sockets.

Now we’ve set up our hints structure, we can call getaddrinfo. It returns zero if it succeeds, or an error code if it doesn’t, and the real address information results are returned in a parameter — we pass a pointer to a pointer to a struct addrinfo in, and it puts a pointer to the first in a list of results into the pointer that the pointer points to. Lovely, pointers to pointers and we’re only a dozen lines in…

    getaddrinfo_error = getaddrinfo(backend_host, backend_port_str, &hints, &addrs);
    if (getaddrinfo_error != 0) {
        fprintf(stderr, "Couldn't find backend: %s\n", gai_strerror(getaddrinfo_error));
        exit(1);
    }

So now we have, in our variable addrs, a pointer to the first item in a linked list of possible addresses that we can connect to. Each item in the list has the associated kinds of family (IPv4 or IPv6, basically), socket type (restricted to streaming sockets because that’s what we asked for in the hints), and protocol. We want to find one that we can connect to, so we loop through them:
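If you'd like to see what's actually in that linked list, something like the following sketch will print it out. print_addresses is a hypothetical helper rather than part of rsp; it uses getnameinfo with the NI_NUMERICHOST flag to turn each result back into a printable address.

```c
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netdb.h>

/* Hypothetical helper (not part of rsp): prints every address getaddrinfo
 * returns for host/service. Returns how many results there were, or -1 if
 * the lookup itself failed. */
int print_addresses(const char *host, const char *service)
{
    struct addrinfo hints;
    struct addrinfo *addrs;
    struct addrinfo *iter;
    char text[128];   /* comfortably bigger than any numeric IPv4/IPv6 address */
    int count = 0;

    memset(&hints, 0, sizeof(hints));
    hints.ai_family = AF_UNSPEC;
    hints.ai_socktype = SOCK_STREAM;

    if (getaddrinfo(host, service, &hints, &addrs) != 0) {
        return -1;
    }

    for (iter = addrs; iter != NULL; iter = iter->ai_next) {
        /* NI_NUMERICHOST asks for the address as digits, not a hostname. */
        if (getnameinfo(iter->ai_addr, iter->ai_addrlen,
                        text, sizeof(text), NULL, 0, NI_NUMERICHOST) == 0) {
            printf("%s address: %s\n",
                   iter->ai_family == AF_INET6 ? "IPv6" : "IPv4", text);
        }
        count++;
    }

    freeaddrinfo(addrs);
    return count;
}
```

Calling it with a hostname that has both IPv4 and IPv6 records should show one line for each, which is why the proxy has to be prepared to try more than one.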

    for (addrs_iter = addrs; 
         addrs_iter != NULL; 
         addrs_iter = addrs_iter->ai_next) 
    {

For each one, we try to create a socket using the system socket call, passing in the details of the address that we’re trying, and if that fails we move on to the next one:

        backend_socket_fd = socket(addrs_iter->ai_family, 
                                   addrs_iter->ai_socktype,
                                   addrs_iter->ai_protocol);
        if (backend_socket_fd == -1) {
            continue;
        }

If it succeeded, we try to connect to the address using that socket, and if that succeeds we break out of the loop:

        if (connect(backend_socket_fd, 
                    addrs_iter->ai_addr, 
                    addrs_iter->ai_addrlen) != -1) { 
            break;
        }

If, on the other hand, the connect failed, we close the socket (to tidy up) and move on to the next one in the loop:

        close(backend_socket_fd);
    }

Once we’re out of the loop, we need to check whether we ever managed to do a successful socket creation and connect; if we didn’t, we bomb out.

    if (addrs_iter == NULL) {
        fprintf(stderr, "Couldn't connect to backend\n");
        exit(1);
    }

Otherwise, we free the list of addresses that we got back from getaddrinfo (ah, the joys of manual memory management…)

    freeaddrinfo(addrs);
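For reference, here is roughly how the lookup-and-connect steps above might look if they were pulled out into a function of their own. This is a sketch under my own assumptions, not the code in rsp: connect_to_backend is a hypothetical name, and it returns -1 on failure instead of calling exit, so that the address list gets freed on every path.

```c
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <netdb.h>

/* Hypothetical helper (not part of rsp): returns a socket connected to
 * host/port_str, or -1 if none of the candidate addresses worked. */
int connect_to_backend(const char *host, const char *port_str)
{
    struct addrinfo hints;
    struct addrinfo *addrs;
    struct addrinfo *iter;
    int error;
    int fd = -1;

    memset(&hints, 0, sizeof(hints));
    hints.ai_family = AF_UNSPEC;
    hints.ai_socktype = SOCK_STREAM;

    error = getaddrinfo(host, port_str, &hints, &addrs);
    if (error != 0) {
        fprintf(stderr, "Couldn't find backend: %s\n", gai_strerror(error));
        return -1;
    }

    for (iter = addrs; iter != NULL; iter = iter->ai_next) {
        fd = socket(iter->ai_family, iter->ai_socktype, iter->ai_protocol);
        if (fd == -1) {
            continue;
        }
        if (connect(fd, iter->ai_addr, iter->ai_addrlen) == 0) {
            break;          /* connected: keep this descriptor */
        }
        close(fd);          /* this address failed: tidy up, try the next */
        fd = -1;
    }

    freeaddrinfo(addrs);    /* runs whether or not we managed to connect */
    return fd;
}
```

The caller then only has to check for -1, and can decide for itself whether a missing backend is worth killing the whole proxy over.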

…and finally we do a really simple bit of code to actually do the proxying. For this first cut, I’ve assumed that a single read on the file descriptor that is connected to the client is enough to pull down all of the client’s headers, and that we never want to send anything from the client to the backend after those headers. So we just do one read from the client, and send everything we get from that read down to the backend:

    bytes_read = read(client_socket_fd, buffer, BUFFER_SIZE);
    write(backend_socket_fd, buffer, bytes_read);

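…then we just go into a loop that reads everything it can from the backend and writes everything that it reads down the socket to the client. The loop stops when the read call returns zero bytes (which means end-of-file) or -1 (which means an error); without the check for errors, a failed read would leave us passing a length of -1 to write, and potentially going round the loop forever.

    while ((bytes_read = read(backend_socket_fd, buffer, BUFFER_SIZE)) > 0) {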
        write(client_socket_fd, buffer, bytes_read);
    }
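One thing the code above glosses over is that write is allowed to send fewer bytes than it was asked to, and that both read and write can fail. A more defensive version of the copying loop might look something like this sketch; write_all and relay_until_eof are hypothetical helpers of my own, not functions in rsp.

```c
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper (not part of rsp): keeps calling write() until all
 * `count` bytes have gone out. Returns 0 on success, -1 on error. */
int write_all(int fd, const char *data, size_t count)
{
    while (count > 0) {
        ssize_t written = write(fd, data, count);
        if (written == -1) {
            if (errno == EINTR) {
                continue;   /* interrupted by a signal: just try again */
            }
            return -1;
        }
        data += written;    /* skip past the part that was sent */
        count -= (size_t) written;
    }
    return 0;
}

/* Hypothetical helper (not part of rsp): copies everything from one
 * descriptor to another until end-of-file. Returns 0 on a clean
 * end-of-file, -1 on error. */
int relay_until_eof(int from_fd, int to_fd)
{
    char buffer[4096];
    ssize_t bytes_read;

    while ((bytes_read = read(from_fd, buffer, sizeof(buffer))) != 0) {
        if (bytes_read == -1) {
            if (errno == EINTR) {
                continue;
            }
            return -1;
        }
        if (write_all(to_fd, buffer, (size_t) bytes_read) == -1) {
            return -1;
        }
    }
    return 0;
}
```

In the proxy, the backend-to-client leg would then be a single call along the lines of relay_until_eof(backend_socket_fd, client_socket_fd).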

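Then we close both sockets (the backend one as well as the client one, as otherwise we’d leak a file descriptor for every request we handled), and that’s the total of our client-handling code.

    close(backend_socket_fd);
    close(client_socket_fd);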
}

The code we use to create a socket to listen for incoming client connections and pass them off to the function we’ve just gone through is actually pretty similar, but with a few interesting twists. It lives (as you might expect) in the program’s main function:

int main(int argc, char *argv[]) {

…which we start off with our local variables again:

    char *server_port_str;
    char *backend_addr;
    char *backend_port_str;

    struct addrinfo hints;
    struct addrinfo *addrs;
    struct addrinfo *addr_iter;
    int getaddrinfo_error;

    int server_socket_fd;
    int client_socket_fd;

    int so_reuseaddr;

The first step is just to check that we have the right number of command-line arguments and to put them into some meaningfully-named variables:

    if (argc != 4) {
        fprintf(stderr,
                "Usage: %s <server_port> <backend_addr> <backend_port>\n",
                argv[0]);
        exit(1);
    }
    server_port_str = argv[1];
    backend_addr = argv[2];
    backend_port_str = argv[3];

Now, the next step is to get an address on this machine that we can listen on. We do that with the same kind of getaddrinfo call that we did on the client-connection handling side, but this time we add one extra value to the hints, and pass in NULL as the first parameter to the call:

    memset(&hints, 0, sizeof(struct addrinfo));
    hints.ai_family = AF_UNSPEC;
    hints.ai_socktype = SOCK_STREAM;
    hints.ai_flags = AI_PASSIVE;

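    getaddrinfo_error = getaddrinfo(NULL, server_port_str, &hints, &addrs);
    if (getaddrinfo_error != 0) {
        fprintf(stderr, "Couldn't find local address: %s\n", gai_strerror(getaddrinfo_error));
        exit(1);
    }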

The ai_flags structure member being set to AI_PASSIVE, combined with the NULL first parameter, tells getaddrinfo that we want addresses suitable for a server socket. What comes back is the “wildcard” address, which covers all of this machine’s network interfaces, so that we can listen for incoming connections, accept them, and handle them.

Once we’ve got the list of appropriate addresses, we iterate through them again, and for each one we create a socket like we did before, but now instead of trying to connect to them to make an outgoing connection, we try to bind so that we can accept incoming connections:

    for (addr_iter = addrs; addr_iter != NULL; addr_iter = addr_iter->ai_next) {
        server_socket_fd = socket(addr_iter->ai_family,
                                  addr_iter->ai_socktype,
                                  addr_iter->ai_protocol);
        if (server_socket_fd == -1) {
            continue;
        }

        so_reuseaddr = 1;
        setsockopt(server_socket_fd, SOL_SOCKET, SO_REUSEADDR, &so_reuseaddr, sizeof(so_reuseaddr));

        if (bind(server_socket_fd, 
                 addr_iter->ai_addr, 
                 addr_iter->ai_addrlen) == 0) 
        {
            break;
        }

        close(server_socket_fd);
    }

Binding associates the socket with a local address and port; it basically says “this port is mine, and I’m going to listen for incoming connections on it”.

There’s also a second little tweak in that code: the call to setsockopt. This is useful when you’re working on something like this. The main loop for rsp never exits, so of course you need to use control-C or kill to quit it. The problem is that this means we never get a chance to shut our connections down cleanly, and the operating system can keep connections that used the port hanging around for a while after the program has gone away (in a state called TIME_WAIT). Until they time out, the OS treats the port as still being in use. That can take a few minutes, so if you’re debugging and starting and stopping the server frequently, you can wind up with errors trying to bind when you start it. The SO_REUSEADDR flag that we’re associating with the socket is just a way of saying “let me bind to this port even if there are old connections to it still hanging around”, which mitigates this problem.
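The setsockopt call is easy to get subtly wrong (the option value is passed by pointer, along with its size), so here is a small sketch that sets the flag and then uses getsockopt to ask the kernel whether it took effect. enable_reuseaddr is a hypothetical helper for illustration only.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper (not part of rsp): turns SO_REUSEADDR on for `fd`,
 * then reads the option back. Returns 1 if it is now enabled, 0 if not,
 * and -1 if either call failed. */
int enable_reuseaddr(int fd)
{
    int value = 1;                       /* non-zero means "switch it on" */
    socklen_t value_len = sizeof(value);

    if (setsockopt(fd, SOL_SOCKET, SO_REUSEADDR, &value, sizeof(value)) == -1) {
        return -1;
    }

    value = 0;
    if (getsockopt(fd, SOL_SOCKET, SO_REUSEADDR, &value, &value_len) == -1) {
        return -1;
    }
    return value != 0;
}
```

Note that the flag has to be set before the call to bind, which is why the code above does it straight after creating the socket.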

Anyway, once we’ve bound (or if we were unable to bind) then we handle errors and tidy up just as we did before:

    if (addr_iter == NULL) {
        fprintf(stderr, "Couldn't bind\n");
        exit(1);
    }

    freeaddrinfo(addrs);

Finally, we need to mark the socket so that it’s “passive” — that is, it’s one that will listen for incoming connections instead of making outgoing connections. This is done using the slightly-confusingly-named listen call, which doesn’t actually listen for anything but simply marks the socket appropriately:

    listen(server_socket_fd, MAX_LISTEN_BACKLOG);

The second parameter says that we’re going to allow a certain number of incoming connections to build up while we’re handling stuff.
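Putting the server-side steps together, a self-contained version might look like the sketch below. It is not the code from rsp: create_listening_socket is a hypothetical name, it returns -1 rather than exiting, and unlike the code above it checks the return value of listen.

```c
#include <string.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netdb.h>

/* Hypothetical helper (not part of rsp): returns a socket that is bound to
 * port_str on all local interfaces and marked as listening, or -1. */
int create_listening_socket(const char *port_str, int backlog)
{
    struct addrinfo hints;
    struct addrinfo *addrs;
    struct addrinfo *iter;
    int reuse = 1;
    int fd = -1;

    memset(&hints, 0, sizeof(hints));
    hints.ai_family = AF_UNSPEC;
    hints.ai_socktype = SOCK_STREAM;
    hints.ai_flags = AI_PASSIVE;    /* with a NULL host: the wildcard address */

    if (getaddrinfo(NULL, port_str, &hints, &addrs) != 0) {
        return -1;
    }

    for (iter = addrs; iter != NULL; iter = iter->ai_next) {
        fd = socket(iter->ai_family, iter->ai_socktype, iter->ai_protocol);
        if (fd == -1) {
            continue;
        }
        setsockopt(fd, SOL_SOCKET, SO_REUSEADDR, &reuse, sizeof(reuse));
        if (bind(fd, iter->ai_addr, iter->ai_addrlen) == 0 &&
            listen(fd, backlog) == 0) {
            break;          /* bound and listening: keep this descriptor */
        }
        close(fd);          /* this address failed: tidy up, try the next */
        fd = -1;
    }

    freeaddrinfo(addrs);
    return fd;
}
```

A handy trick when experimenting is to pass "0" as the port, which asks the OS to pick any free port for you; getsockname will then tell you which one it chose.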

Now we’ve got our server socket ready, and the next code is the endless loop that actually does the proxying.

    while (1) {

In it, we need to wait for incoming connections, using accept, which blocks until someone connects to us.

        client_socket_fd = accept(server_socket_fd, NULL, NULL);
        if (client_socket_fd == -1) {
            perror("Could not accept");
            exit(1);
        }

accept takes three parameters; the server socket’s file descriptor (which you’d expect given that it needs to know what to work on) and also some pointers into which it can put information about the incoming client connection. We’re likely to need something like that later, but right now we don’t need it so we won’t worry about it — passing in NULL is the appropriate way to tell accept that we don’t care.

After the accept has been done, we have a file descriptor that describes the client connection, so we can hand off to the function that we described earlier:

        handle_client_connection(client_socket_fd, backend_addr, backend_port_str);

And off we go, around the loop again:

    }
}

Phew. So that was a bit harder than it would have been in Python. But not too scary. Hopefully it was all reasonably clear — and if it wasn’t, please let me know in the comments. And if any C experts have been reading — thank you for putting up with the slow pace, and if you have any suggestions then I’d love to hear them!

The next step, I think, is to make this a more useful proxy by making it no longer single-shot, and instead accept multiple simultaneous client connections and proxy them back to the backend. We can then add multiple backends, and start looking at selecting which one to proxy to based on the Host HTTP header. And, as I’m aiming to produce a cut-down version of OpenResty, then adding some Lua scripting would help too.

But multiple connections first. Here’s how I handle them.

Some acknowledgements: obviously the Linux man pages at linux.die.net were invaluable in putting this together. An earlier version of this proxy used code from this socket server example at tutorialspoint and its associated socket client example, but the code there (on examination) turned out to use quite a few deprecated functions, so in fact most of it wound up getting rewritten using the man page for getaddrinfo.