Writing a reverse proxy/loadbalancer from the ground up in C, part 2: handling multiple connections with epoll

7 September 2013

This is the second step along my road to building a simple C-based reverse proxy/loadbalancer so that I can understand how nginx/OpenResty works — more background here. Here’s a link to the first part, where I showed the basic networking code required to write a proxy that could handle one incoming connection at a time and connect it with a single backend.

This (rather long) post describes a version that uses Linux’s epoll API to handle multiple simultaneous connections — but it still just sends all of them down to the same backend server. I’ve tested it using ab, the Apache server benchmarking tool, and over a million requests with 100 running concurrently, it adds about 0.1ms to the average request time as compared to a direct connection to the web server, which is pretty good going at this early stage. It also doesn’t appear to leak memory, which is doubly good going for someone who hasn’t coded in C since the late 90s. I’m pretty sure it’s not totally stupid code, though obviously comments and corrections would be much appreciated!

[UPDATE: there’s definitely one bug in this version — it doesn’t gracefully handle cases where we can’t send data to the client as fast as we’re receiving it from the backend. More info here.]
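For anyone who’d like to see the shape of the thing before diving into the C, here’s a minimal sketch of the same pattern in Python, whose `select.epoll` wraps the very Linux calls (`epoll_create`, `epoll_ctl`, `epoll_wait`) that the C version uses. The socketpair and handler table are stand-ins for real client/backend connections, not code from rsp itself:

```python
import select
import socket

# A socketpair stands in for a real client and backend connection.
left, right = socket.socketpair()
epoll = select.epoll()
epoll.register(right.fileno(), select.EPOLLIN)

# One handler per registered fd; this one simply reads whatever arrived.
handlers = {right.fileno(): lambda: right.recv(4096)}

left.sendall(b"GET / HTTP/1.0\r\n\r\n")

# One pass round the event loop: block until some fd is ready, then
# dispatch to the handler registered for it.
received = []
for fd, events in epoll.poll(timeout=1):
    if events & select.EPOLLIN:
        received.append(handlers[fd]())

epoll.close()
print(received)
```

The real proxy registers every client and backend socket with the one epoll instance and loops over `epoll_wait` forever; this just runs a single iteration to show the dispatch.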
Continue reading

Writing a reverse proxy/loadbalancer from the ground up in C, part 1: a trivial single-threaded proxy

12 August 2013

This is the first step along my road to building a simple C-based reverse proxy/loadbalancer so that I can understand how nginx/OpenResty works — more explanation here. It’s called rsp, for Really Simple Proxy. This version listens for connections on a particular port, specified on the command line; when one is made it sends the request down to a backend — another server with an associated port, also specified on the command line — and sends whatever comes back from the backend back to the person who made the original connection. It can only handle one connection at a time — while it’s handling one, it just queues up others, and it handles them in turn. This will, of course, change later.
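To give a feel for the structure without any C, here’s a rough Python sketch of the same one-connection-at-a-time design. The ports, the fake backend and the demo client are all invented for this illustration; nothing here is taken from the rsp source:

```python
import socket
import threading
import time

def handle_connection(client_sock, backend_addr):
    # Connect to the backend, forward the request, stream the response back.
    backend = socket.create_connection(backend_addr)
    backend.sendall(client_sock.recv(4096))
    while True:
        chunk = backend.recv(4096)
        if not chunk:
            break
        client_sock.sendall(chunk)
    backend.close()
    client_sock.close()

def serve(listen_port, backend_addr, max_connections):
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    listener.bind(("127.0.0.1", listen_port))
    listener.listen(5)   # the OS queues clients we're too busy to accept
    for _ in range(max_connections):
        client_sock, _addr = listener.accept()
        handle_connection(client_sock, backend_addr)
    listener.close()

# Demo: a fake one-shot backend, the proxy in front of it, and one client.
def fake_backend(port):
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    s.bind(("127.0.0.1", port))
    s.listen(1)
    conn, _addr = s.accept()
    conn.recv(4096)
    conn.sendall(b"hello from the backend")
    conn.close()
    s.close()

threading.Thread(target=fake_backend, args=(8081,), daemon=True).start()
threading.Thread(target=serve, args=(8080, ("127.0.0.1", 8081), 1),
                 daemon=True).start()
time.sleep(0.5)   # let both sockets start listening

client = socket.create_connection(("127.0.0.1", 8080))
client.sendall(b"GET / HTTP/1.0\r\n\r\n")
response = b""
while True:
    chunk = client.recv(4096)
    if not chunk:
        break
    response += chunk
client.close()
print(response)
```

While `handle_connection` is busy, any other clients simply sit in the listen queue — exactly the limitation the epoll version in part 2 removes.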

I’m posting this in the hope that it might help people who know Python, and some basic C, but want to learn more about how the OS-level networking stuff works. I’m also vaguely hoping that any readers who code in C day to day might take a look and tell me what I’m doing wrong :-)
Continue reading

Writing a reverse proxy/loadbalancer from the ground up in C, part 0: introduction

8 August 2013

We’re spending a lot of time on nginx configuration at PythonAnywhere. We’re a platform-as-a-service, and a lot of people host their websites with us, so it’s important that we have a reliable load-balancer to receive all of the incoming web traffic and appropriately distribute it around backend web-server nodes.

nginx is a fantastic, possibly unbeatable tool for this. It’s fast, reliable, and lightweight in terms of CPU resources. We’re using the OpenResty variant of it, which adds a number of useful modules — most importantly for us, one for Lua scripting, which means that we can dynamically work out where to send traffic as the hits come in.

It’s also quite simple to configure at a basic level. You want all incoming requests for site X to go to backend Y? Just write something like this:

    server {
        server_name X;
        listen 80;

        location / {
            proxy_set_header Host $host;
            proxy_pass Y;
        }
    }

Simple enough. Lua scripting is pretty easy to add — you just put an extra directive before the proxy_pass that provides some Lua code to run, and then variables you set in the code can be accessed from the proxy_pass.
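As a hedged illustration of what that looks like (the backend address and the Lua body here are made up — real code would do something like a redis lookup rather than return a hard-coded string), the shape is roughly:

```nginx
server {
    server_name X;
    listen 80;

    location / {
        # run some Lua to pick a backend dynamically; the hard-coded
        # return value is just a stand-in for real routing logic
        set_by_lua $backend 'return "http://Y"';

        proxy_set_header Host $host;
        proxy_pass $backend;
    }
}
```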

But there are many more complicated options. worker_connections, tcp_nopush, sendfile, types_hash_max_size… Some are reasonably easy to understand with a certain amount of reading, some are harder.

I’m a big believer that the best way to understand something complex is to try to build your own simple version of it. So, in my copious free time, I’m going to start putting together a simple loadbalancer in C. The aim isn’t to rewrite nginx or OpenResty; it’s to write enough equivalent functionality that I can better understand what they are really doing under the hood, in the same way as writing a compiler for a toy language gives you a better understanding of how proper compilers work. I’ll get a good grasp on some underlying OS concepts that I have only a vague appreciation of now. It’s also going to be quite fun coding in C again. I’ve not really written any since 1997.

Anyway, I’ll document the steps I take here on this blog; partly because there’s a faint chance that it might be interesting to other experienced Python programmers whose C is rusty or nonexistent and want to get a view under the hood, but mostly because the best way to be sure you really understand it is to try to explain it to other people.

I hope it’ll be interesting!

Here’s a link to the first post in the series: Writing a reverse proxy/loadbalancer from the ground up in C, part 1: a trivial one-shot proxy

SNI-based reverse proxying with Go(lang)

18 July 2013

Short version for readers who know all about this kind of stuff: we built a simple reverse-proxy server in Go that load-balances HTTP requests using the Host header and HTTPS requests using the SNI from the client handshake. Backends are selected per-host from sets stored in a redis database. It works pretty well, but we won’t be using it because it can’t send the originating client IP to the backends when it’s handling HTTPS. Code here.

We’ve been looking at options to load-balance our users’ web applications at PythonAnywhere; this post is about something we considered but eventually abandoned; I’m posting it because the code might turn out to be useful to other people.

A bit of background first; if you already know what a reverse proxy is and how load-balancing and virtual hosting work, you can skip forward a bit.

Imagine an old-fashioned shared hosting environment. You’re able to run a web application on a machine that’s being used by lots of other people, and you’re given that machine’s IP address. You set up your DNS configuration so that your domain points to that IP address, and it all works. When a connection comes in from a browser to access your site, the web server on the machine needs to work out which person’s web app it should route it to. It does this by looking at the HTTP request and finding a Host header in it. So, by using the Host header, the shared hosting provider can keep costs down by sharing an IP address and a machine between multiple clients. This is called virtual hosting.

Now consider the opposite case — a high-traffic website, where one machine isn’t enough to handle all of the traffic. Processing a request for a page on a website takes a certain amount of machine resources — database lookups, generating dynamic pages from templates, and so on. So a single web server might not be enough to cope with lots of traffic. In this case, people use what’s called a reverse proxy, or load-balancer. In the simplest case, this is just a machine running on a single IP. When a request comes in, it selects a backend — that is, one of a number of web servers, each of which is running the full website’s code — sends the request down to it, and copies all data that comes back from that backend up to the browser that made the request. Because just copying data around from backend to browser and vice versa is much easier work than processing the actual request, a single load-balancer can handle many more requests than any of the backend web servers could, and if it’s configured to select backends appropriately it can spread the load smoothly across them. Additionally, this kind of setup can handle outages gracefully — if one backend stops responding, the load-balancer can stop routing to it and use the others as backups.

Now let’s combine those two ideas. Imagine a platform-as-a-service, where each outward-facing IP might be responsible for handling large numbers of websites. But for reliability and performance, it might make sense to have each website backed by multiple backends. So, for example, a PaaS might have a thousand websites backed by one hundred different webservers, where website one is handled by backends one, two and three, website two by backends two, three and four, and so on. This means that the PaaS can keep costs down (running thirty web apps per backend server) and reliability and performance up (each website having three independent backends).
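That overlapping allocation is easy to picture with a toy model. The round-robin scheme below is purely illustrative — it is not how PythonAnywhere actually assigns backends:

```python
# Toy version of the many-websites-to-many-backends mapping: 1,000
# websites, 100 backends, each website served by 3 backends chosen by
# simple round-robin offset.
NUM_BACKENDS = 100
BACKENDS_PER_SITE = 3

def backends_for(website_id):
    """Return the list of backend numbers serving the given website."""
    return [(website_id + offset) % NUM_BACKENDS
            for offset in range(BACKENDS_PER_SITE)]

print(backends_for(1))   # website one is handled by backends 1, 2 and 3
print(backends_for(2))   # website two by backends 2, 3 and 4
```

With 1,000 sites times 3 backends each spread over 100 servers, each backend ends up hosting 30 of the web apps, while every site survives the loss of any two of its three backends.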

So, that’s the basics. There are a number of great tools which can be used to operate as super-efficient proxies that can handle this kind of many-hostnames-to-many-backends mapping. nginx is the most popular, but there are also haproxy and hipache. We are planning to choose one of these for PythonAnywhere (more about that later), but we did identify one slight problem with all of them. The code I’m shortly going to show was our attempt at working around that problem.

The description above of how virtual hosting works is fine when we’re talking about HTTP. But increasingly, people want to use HTTPS for secure connections.

When an HTTPS connection comes in, the server has a problem. Before it can decode what’s in the request and get the Host header, it needs to establish a secure link. Its first step to establish that link is to send a certificate to the client to prove it is who it says it is. But each of the different virtual hosts on the machine will need a different certificate, because they’re all on different domains. So there’s a chicken-and-egg problem; it needs to know which host it is meant to be in order to send the right certificate, but it needs to have sent the certificate in order to establish a secure connection to find out which host it is meant to be. This was a serious problem until relatively recently; basically, it meant that every HTTPS-secured site had to have its own dedicated IP address, so that the server could tell which certificate to serve when a client connected by looking at the IP address the connection came in on.

This problem was solved by an extension to the TLS protocol (TLS being the latest protocol to underlie HTTPS) called “Server Name Indication”. Basically, it takes the idea of the HTTP Host header and moves it down the stack a bit. The initial handshake message from a client connecting to a server used to just say “here I am and here’s the kind of SSL protocol I can handle — now what’s your certificate?” With SNI, the handshake also says “here’s the hostname I expect you to have”.

So with SNI, a browser connects to a server, and the server looks at the handshake to find out which certificate to use. The browser and server establish a secure link, and then the browser sends the normal HTTP request, whose Host header the server uses to route the request to the appropriate web app.

Let’s get back to the proxy server that’s handling incoming requests for lots of different websites and routing them to lots of different backends. With all of the proxies mentioned above — nginx, hipache and haproxy — a browser makes a connection, the proxy does all of the SNI stuff to pick the right certificate, it decodes the data from the client, works out which backend to send it to using the Host header in the decoded data, and then forwards everything on.

There’s an obvious inefficiency here. The proxy shouldn’t have to decode the secure connection to get the Host header — after all, it already knows that from the information in the SNI. And it gets worse. Decoding the secure connection uses up CPU cycles on the proxy. And either the connection between the proxy and the backends is non-secure, which could be an issue if a hacker got onto the network, or it’s secure, in which case the proxy is decoding and then encoding everything that goes through it — even more CPU load. Finally, all of the certificates for every site that the proxy’s handling — and their associated private keys — have to be available to the proxy. Which is another security risk if it gets hacked.

So, probably like many people before us, we thought “why not just route HTTPS based on the SNI? It can’t be that hard!” And actually, it isn’t. Here’s a GitHub project with a simple Go application that routes HTTP requests using the Host header, and HTTPS using the SNI. It never needs to know anything about the certificates for the sites it’s proxying for, and all data is passed through without any decryption.

So why didn’t we decide to use it? Access logs and spam filters. The thing is, people who are running websites like to know who’s been looking at their stuff — for their website metrics, for filtering out spammy people using tools like Akismet, and so on. If you’re using a proxy, then the backend sees every request as coming from the proxy’s IP, which isn’t all that useful. So normally a proxy will add an extra header to HTTP requests it passes through — X-Forwarded-For is the usual one.

And the problem with an SNI proxy is the same as its biggest advantage. Because it’s not decoding the secure stream from the browser, it can’t change it, so it can’t insert any extra headers. So all HTTPS requests going over any kind of SNI-based reverse proxy will appear to come from the proxy itself. Which breaks things.

So we’re not going to use this. And TBH it’s not really production-level code — it was a spike and is also the first Go code I’ve ever written, so it’s probably full of warts (comments very much welcomed!). Luckily we realised the problem with the backends not knowing about the client’s IP before we started work on rewriting it test-first.

On the other hand, it might be interesting for anyone who wants to do stuff like this. The interesting stuff is mostly in handleHTTPSConnection, which decodes the TLS handshake sent by the client to extract the SNI.
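The Go code is linked above, but the core idea is language-independent. Here’s a rough Python translation of the kind of parsing `handleHTTPSConnection` does — walking the raw bytes of a TLS ClientHello to pull out the SNI hostname (the server_name extension from RFC 6066) without doing any decryption. The synthetic ClientHello builder exists only to exercise the parser, and a production version would need sturdier bounds checking throughout:

```python
import struct

def extract_sni(data):
    """Return the SNI hostname from a raw TLS ClientHello, or None."""
    if len(data) < 5 or data[0] != 0x16:        # TLS handshake record?
        return None
    pos = 5                                     # skip record header
    if data[pos] != 0x01:                       # ClientHello?
        return None
    pos += 4                                    # handshake type + length
    pos += 2 + 32                               # client_version + random
    pos += 1 + data[pos]                        # session_id
    pos += 2 + struct.unpack_from("!H", data, pos)[0]   # cipher_suites
    pos += 1 + data[pos]                        # compression_methods
    if pos + 2 > len(data):
        return None                             # no extensions present
    ext_end = pos + 2 + struct.unpack_from("!H", data, pos)[0]
    pos += 2
    while pos + 4 <= ext_end:
        ext_type, ext_len = struct.unpack_from("!HH", data, pos)
        pos += 4
        if ext_type == 0x0000:                  # server_name extension
            # skip list length (2) and name_type (1); read name length (2)
            name_len = struct.unpack_from("!H", data, pos + 3)[0]
            return data[pos + 5:pos + 5 + name_len].decode("ascii")
        pos += ext_len
    return None

def client_hello_with_sni(hostname):
    """Build a minimal synthetic ClientHello carrying an SNI extension."""
    host = hostname.encode("ascii")
    sni = struct.pack("!HHHBH", 0x0000, len(host) + 5, len(host) + 3,
                      0, len(host)) + host
    body = (struct.pack("!H", 0x0303) + b"\x00" * 32   # version + random
            + b"\x00"                                  # empty session_id
            + struct.pack("!H", 2) + b"\x00\x2f"       # one cipher suite
            + b"\x01\x00"                              # null compression
            + struct.pack("!H", len(sni)) + sni)
    handshake = b"\x01" + len(body).to_bytes(3, "big") + body
    return b"\x16\x03\x01" + struct.pack("!H", len(handshake)) + handshake

print(extract_sni(client_hello_with_sni("example.com")))
```

Because the hostname sits in the cleartext handshake, the proxy can route on it and then shovel bytes back and forth without ever holding a certificate.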

I did a bit of very non-scientific testing just to make sure it all works. I started three backend servers with simple Flask apps that did a sleep on every request to simulate processing:

from flask import Flask
import time
from socket import gethostname

app = Flask(__name__)

@app.route("/")
def index():
    time.sleep(0.05)
    return "Hello from " + gethostname()

if __name__ == "__main__":
    app.run("0.0.0.0", 80, processes=4)

Then ran the Apache ab tool to see what the performance characteristics were for one of them:

root@abclient:~# ab -n1000 -c100 http://198.199.83.71/
This is ApacheBench, Version 2.3 <$Revision: 655654 $>
Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Licensed to The Apache Software Foundation, http://www.apache.org/

Benchmarking 198.199.83.71 (be patient)
Completed 100 requests
Completed 200 requests
Completed 300 requests
Completed 400 requests
Completed 500 requests
Completed 600 requests
Completed 700 requests
Completed 800 requests
Completed 900 requests
Completed 1000 requests
Finished 1000 requests


Server Software:        Werkzeug/0.9.2
Server Hostname:        198.199.83.71
Server Port:            80

Document Path:          /
Document Length:        19 bytes

Concurrency Level:      100
Time taken for tests:   21.229 seconds
Complete requests:      1000
Failed requests:        0
Write errors:           0
Total transferred:      172000 bytes
HTML transferred:       19000 bytes
Requests per second:    47.10 [#/sec] (mean)
Time per request:       2122.938 [ms] (mean)
Time per request:       21.229 [ms] (mean, across all concurrent requests)
Transfer rate:          7.91 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    3   7.4      0      37
Processing:    73 2025 368.7   2129    2387
Waiting:       73 2023 368.4   2128    2386
Total:        103 2028 363.7   2133    2387

Percentage of the requests served within a certain time (ms)
  50%   2133
  66%   2202
  75%   2232
  80%   2244
  90%   2286
  95%   2317
  98%   2344
  99%   2361
 100%   2387 (longest request)
root@abclient:~# 

Then, after adding records to the proxy’s redis instance to tell it to route requests with the hostname proxy to any of the backends, and hacking the hosts file on the ab client machine to make the hostname proxy point to it:

root@abclient:~# ab -n1000 -c100 http://proxy/
This is ApacheBench, Version 2.3 <$Revision: 655654 $>
Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Licensed to The Apache Software Foundation, http://www.apache.org/

Benchmarking proxy (be patient)
Completed 100 requests
Completed 200 requests
Completed 300 requests
Completed 400 requests
Completed 500 requests
Completed 600 requests
Completed 700 requests
Completed 800 requests
Completed 900 requests
Completed 1000 requests
Finished 1000 requests


Server Software:        Werkzeug/0.9.2
Server Hostname:        proxy
Server Port:            80

Document Path:          /
Document Length:        19 bytes

Concurrency Level:      100
Time taken for tests:   7.668 seconds
Complete requests:      1000
Failed requests:        0
Write errors:           0
Total transferred:      172000 bytes
HTML transferred:       19000 bytes
Requests per second:    130.41 [#/sec] (mean)
Time per request:       766.803 [ms] (mean)
Time per request:       7.668 [ms] (mean, across all concurrent requests)
Transfer rate:          21.91 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    1   1.7      0       9
Processing:    93  695 275.4    617    1228
Waiting:       93  693 275.4    614    1227
Total:         99  696 274.9    618    1228

Percentage of the requests served within a certain time (ms)
  50%    618
  66%    799
  75%    948
  80%    995
  90%   1116
  95%   1162
  98%   1185
  99%   1204
 100%   1228 (longest request)
root@abclient:~#

So, it works. I’ve not done ab testing with the HTTPS side of things, but I have hacked my own hosts file and spent a day accessing Google and PythonAnywhere itself via the proxy. It works :-)

As to what we’re actually going to use for load-balancing PythonAnywhere:

  • nginx is great but stores its routing config in files, which doesn’t easily scale to large numbers of hosts/backends. It’s doable, but it’s just a nightmare to manage, especially if things go wrong.
  • haproxy is the same — worse, it needs to be fully restarted (interrupting ongoing connections) if you change the config.
  • hipache stores data in redis (which is what inspired me to do something similar for this proxy) so it can gracefully handle rapidly-changing routing setups. But it’s written in Node.js, so while it’s pretty damn fast, it’s not as fast as nginx.

But… as the dotcloud people who wrote hipache recently pointed out (bottom of the post), nginx’s built-in Lua scripting support is now at a level where you can store your routing config in redis — so with a bit of work, you can get the speed of nginx with the ease of configuration of hipache. So that’s where we’re heading. We’ll just have to make sure the proxy and its certificates are super-secure, and live with the extra CPU load.

How many Python programmers are there in the world?

24 June 2013

We’ve been talking to some people recently who really wanted to know what the potential market size was for PythonAnywhere, our Python Platform-as-a-Service and cloud-based IDE.

There are a bunch of different ways to look at that, but the most obvious starting point is, “how many people are coding Python?” This blog post is an attempt to get some kind of order-of-magnitude number for that.

First things first: Wikipedia has an estimate of 10 million Java developers (though I couldn’t find the numbers to back that up on the cited pages) but nothing for Python — or, indeed, any of the other languages I checked. So nothing there.

A bit of Googling around gets one interesting hit; in this Stack Overflow answer, “Tall Jeff” says that the 2007 version of Learning Python estimated that there were 1 million Python programmers in the world. Using Amazon’s “Look inside” feature on the current edition, I can see they still quote the same number, now applied to the present day — but let’s assume that they were right originally and that the number has grown since then. Now, according to the Python wiki, there were 586 people at the 2007 PyCon. According to the front page at PyCon.org, there were 2,500 people at PyCon 2013. So if we take that as a proxy for the growth of the language, we get one guess of the number of Python developers: 4.3 million.

Let’s try another metric. Python.org’s web statistics are public. Looking at the first five months of this year, and adding up the total downloads, we get:

Jan: 2,584,754
Feb: 2,539,177
Mar: 3,182,946
Apr: 3,199,012
May: 2,855,033

Averaging that over a year gives us 34,466,213 downloads per year. It’s worth noting that these are overwhelmingly Windows downloads — most Linux users are going to be using the versions packaged as part of their distro, and (I think, but correct me if I’m wrong) the same is largely going to be the case on the Mac.

So, 34.5 million downloads. There were ten versions of Python released over the last year, so let’s assume that each developer downloaded each version once and once only; that gives us 3.5 million Python programmers on Windows.

What other data points are there? This job site aggregator’s blog post suggests using searches for resumes/CVs as a way of getting numbers. Their suggested search for Python would be

(intitle:resume OR inurl:resume) Python -intitle:jobs -resumes -apply

Being in the UK, where we use “CV” more than we use “resume”, I tried this:

(intitle:resume OR inurl:resume OR intitle:cv OR inurl:cv) Python -intitle:jobs -resumes -apply

The results were unfortunately completely useless. 338,000 hits but the only actual CV/resume on the first page was Guido van Rossum’s — everything else was about the OpenCV computer vision library, or about resuming things.

So let’s scrap that. What else can we do? Well, taking inspiration (and some raw data) from this excellent blog post about estimating the number of Java programmers in the world, we can do this calculation:

  • Programmers in the world: 43,000,000 (see the link above for the calculation)
  • Python developers as per the latest TIOBE ranking: 4.183%, which gives 1,798,690
  • Python developers as per the latest LangPop.com ranking: 7% (taken by an approximate ratio of the Python score to the sum of the scores of all languages), which gives 2,841,410

OK, so there I’m multiplying one very approximate number of programmers by a “percentage” rating that doesn’t claim to be a percentage of programmers using a given language. But this ain’t rocket science, I can mix and match units if I want.

The good news is, we’re in the same order of magnitude; we’ve got numbers of 1.8 million, 2.8 million, 3.5 million, and 4.3 million. So, based on some super-unscientific guesswork, I think I can happily say that the number of Python programmers in the world is in the low millions.
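For the record, here are all of the back-of-envelope sums in one place. Every input is a figure quoted above; the LangPop number is taken as given, since it came from an approximate score ratio rather than a clean percentage:

```python
# Estimate 1: Learning Python's 2007 figure, scaled by PyCon attendance growth.
pycon_based = 1_000_000 * 2500 / 586

# Estimate 2: python.org downloads, Jan-May averaged out to a full year,
# divided by the ten releases each developer is assumed to download once.
downloads_per_year = (2_584_754 + 2_539_177 + 3_182_946
                      + 3_199_012 + 2_855_033) / 5 * 12
windows_devs = downloads_per_year / 10

# Estimates 3 and 4: 43 million programmers worldwide, multiplied by the
# TIOBE and LangPop "percentages".
tiobe_based = 43_000_000 * 0.04183
langpop_based = 2_841_410

for name, estimate in [("TIOBE", tiobe_based), ("LangPop", langpop_based),
                       ("downloads", windows_devs), ("PyCon", pycon_based)]:
    print(name, round(estimate))
```

All four land between one and five million, which is as much agreement as this kind of guesswork can hope for.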

What do you think? Are there other ways of working this out that I’ve missed? Does anyone have (gasp!) hard numbers?

A super-simple chat app with AngularJS, SockJS and node.js

12 February 2013

We’re planning to move to a more advanced JavaScript library at PythonAnywhere. jQuery has been good for us, but we’re rapidly reaching a stage where it’s just not enough.

There are a whole bunch of JavaScript MVC frameworks out there that look tempting — see TodoMVC for an implementation of a simple app in a bunch of them. We’re asking the people we know and trust which ones are best, but in the meantime I had a look at AngularJS and knocked up a quick chat app to see how easy it would be. The answer was “very”.

Here’s the client-side code:

<html ng-app>
<head>
<script src="http://cdn.sockjs.org/sockjs-0.3.min.js"></script>
<script src="http://ajax.googleapis.com/ajax/libs/angularjs/1.0.4/angular.min.js"></script>

<script>
    var sock = new SockJS('http://192.168.0.74:9999/chat');
    function ChatCtrl($scope) {
        $scope.messages = [];
        $scope.sendMessage = function() {
            sock.send($scope.messageText);
            $scope.messageText = "";
        };

        sock.onmessage = function(e) {
            $scope.messages.push(e.data);
            $scope.$apply();
        };
    }
</script>

</head>

<body>

<div ng-controller="ChatCtrl">
    <ul>
        <li ng-repeat="message in messages">{{message}}</li>
    </ul>

    <form ng-submit="sendMessage()">
        <input type="text" ng-model="messageText" placeholder="Type your message here" />
        <input type="submit" value="Send" />
    </form>
</div>

</body>
</html>

Then on the server side I wrote this server in node.js (not because I’ve moved to Shoreditch and grown ironic facial hair, but because it was easy to copy, paste and hack from the SockJS docs; I’d use Tornado if this was running on PythonAnywhere):

var http = require('http');
var sockjs = require('sockjs');

var connections = [];

var chat = sockjs.createServer();
chat.on('connection', function(conn) {
    connections.push(conn);
    var number = connections.length;
    conn.write("Welcome, User " + number);
    conn.on('data', function(message) {
        for (var ii=0; ii < connections.length; ii++) {
            connections[ii].write("User " + number + " says: " + message);
        }
    });
    conn.on('close', function() {
        for (var ii=0; ii < connections.length; ii++) {
            connections[ii].write("User " + number + " has disconnected");
        }
    });
});

var server = http.createServer();
chat.installHandlers(server, {prefix:'/chat'});
server.listen(9999, '0.0.0.0');

And that's it! It basically does everything you need from a simple chat app. Definitely quite impressed with AngularJS. I'll try it in some of the other frameworks we evaluate and post more here.

Reverse proxying HTTP and WebSockets with virtual hosts using nginx and tcp_proxy_module

5 October 2012

I spent today trying to work out how we could get PythonAnywhere to support WebSockets in our users’ web applications. This is a brief summary of what I found, I’ll put it in a proper post on the PythonAnywhere blog sometime soon…

We use nginx, and it can happily route HTTP requests through to uwsgi applications (which is the way we use it) and can even more happily route them through to other socket-based servers running on specific ports (which we don’t use but will in the future so that we can support Twisted, Tornado, and so on — once we’ve got network namespacing sorted).

But by default, nginx does not support reverse proxying WebSockets requests. There are various solutions to this posted around the net, but they don’t explain how to get it working with virtual hosts. I think that this is because they’re all a bit old, because it’s actually quite easy once you know how.

(It’s worth mentioning that there are lots of cool non-nginx solutions using excellent stuff like haproxy and hipache. I’d really like to upgrade our infrastructure to use one of those two. But not now, we all too recently moved from Apache to nginx and I’m scared of big infrastructure changes in the short term. Lots of small ones, that’s the way forward…)

Anyway, let’s cut to the chase. This excellent blog post by Johnathan Leppert explains how to configure nginx to do TCP proxying. TCP proxying is enough to get WebSockets working if you don’t care about virtual hosts — but because arbitrary TCP connections don’t necessarily have a Host: header, it can’t work if you do care about them.

However, since the post was written, the nginx plugin module Johnathan uses has been improved so that it now supports WebSocket proxying with virtual hosts.

To get nginx to successfully reverse-proxy WebSockets with virtual host support, compile Nginx with tcp_proxy_module as per Johnathan’s instructions (I’ve bumped the version to the latest stable as of today):

export NGINX_VERSION=1.2.4
curl -O http://nginx.org/download/nginx-$NGINX_VERSION.tar.gz
git clone https://github.com/yaoweibin/nginx_tcp_proxy_module.git
tar -xvzf nginx-$NGINX_VERSION.tar.gz
cd nginx-$NGINX_VERSION
patch -p1 < ../nginx_tcp_proxy_module/tcp.patch
./configure --add-module=../nginx_tcp_proxy_module/
make && sudo make install

Then, to use the new WebSockets support in tcp_proxy_module, put something like this in your nginx config:

worker_processes  1;

events {
    worker_connections  1024;
}

tcp {
    upstream site1 {
        server 127.0.0.1:1001;

        check interval=3000 rise=2 fall=5 timeout=1000;
    }

    upstream site2 {
        server 127.0.0.1:1002;

        check interval=3000 rise=2 fall=5 timeout=1000;
    }

    server {
        listen 0.0.0.0:80;
        server_name site1.com;

        tcp_nodelay on;
        websocket_pass site1;
    }

    server {
        listen 0.0.0.0:80;
        server_name site2.com;

        tcp_nodelay on;
        websocket_pass site2;
    }
}

Hopefully that's enough to help a few people googling around for help like I was this morning. Leave a comment if you have any questions!

Raspberry Pi setup notes part 1: getting the display to work!

20 June 2012

I received my Raspberry Pi yesterday, and today got it working well enough to display a text-based console on a DVI monitor using Arch Linux. There were a few hiccups along the way, so here are the complete notes so that anyone googling for the same errors as the ones I saw can benefit from my experience.

tl;dr: the file /boot/config.txt sets up various things before the OS is loaded, including HDMI settings. The one in the Arch Linux SD card image didn’t work with my machine setup. The system default, which you can get just by removing that file completely, worked just fine.

Here are the details showing how I got to that…

  • I started with a Stock Arch Linux SD image
  • The monitor was a basic Acer DVI one, 1280×1024. It was attached to the RPi’s HDMI port using a DVI -> HDMI adapter (labelled DVI-D -> HDMI) from Nikkai via Maplin. NB there are lots of different kinds of DVI plugs/sockets: here are some diagrams on Wikipedia. The socket on the adapter had holes for DVI-I dual link, and the plug from the monitor’s cable was DVI-D single link. Nonetheless, as I eventually got it working, I guess that they are intercompatible to some degree.
  • As well as the monitor, the RPi was plugged into a USB keyboard, and an ethernet switch.
  • When I plugged my MicroUSB mains adapter into the RPi, the display remained blank — it said there was no signal.
  • I checked out the logs of my DHCP server and saw that something called alarmpi had just got a lease, so I sshed to the IP and was able to log in using the default username/password for the Arch Linux distro (root/root). It was obviously working :-) :-) :-) (BTW the best guess around the office is that alarmpi stands for Arch Linux ARM Pi.)
  • A bit of poking around with Google found various forum posts, many of which suggested that the key was the settings in /boot/config.txt
  • So I decided to look at that file in my ssh session. On the image I was using, it contained this:
    hdmi_mode=19
    #arm_freq=800
    disable_overscan=1
    
  • A forum post referenced this page, which appears to be a script to build a Debian wheezy image for the RPi; specifically, it contains the complete /boot/config.txt for that image, which has detailed comments explaining what a bunch of the switches do. So I edited my own config, merging that file in with the original settings (which I flagged with ### Original OOB setting):
    # uncomment if you get no picture on HDMI for a default "safe" mode
    #hdmi_safe=1
    
    # uncomment this if your display has a black border of unused pixels visible
    # and your display can output without overscan
    #disable_overscan=1
    ### Original OOB setting:
    disable_overscan=1
    
    # uncomment the following to adjust overscan. Use positive numbers if console
    # goes off screen, and negative if there is too much border
    #overscan_left=16
    #overscan_right=16
    #overscan_top=16
    #overscan_bottom=16
    
    # uncomment to force a console size. By default it will be display's size minus
    # overscan.
    #framebuffer_width=1280
    #framebuffer_height=720
    
    # uncomment if hdmi display is not detected and composite is being output
    #hdmi_force_hotplug=1
    
    # uncomment to force a specific HDMI mode (this will force VGA)
    #hdmi_group=1
    #hdmi_mode=1
    ### Original OOB setting:
    hdmi_mode=19
    
    # uncomment to force a HDMI mode rather than DVI. This can make audio work in
    # DMT (computer monitor) modes
    #hdmi_drive=2
    
    # uncomment to increase signal to HDMI, if you have interference, blanking, or
    # no display
    #config_hdmi_boost=4
    
    # uncomment for composite PAL
    #sdtv_mode=2
    
    #uncomment to overclock the arm. 700 MHz is the default.
    #arm_freq=800
    
    # for more options see http://elinux.org/RPi_config.txt
    
  • I noticed that saving the file was slow. My best guess is that the boot stuff is mounted from a separate tiny partition at the start of the SD card, FAT-formatted, which the RPi’s chipset knows how to read to start the whole bootstrap process before there’s even an operating system loaded; writing to that partition is just harder work than writing to a normal one. Perhaps I’m completely off-base about the reason for the slowdown, although I’m pretty sure about the bootstrap/partition thing.
  • Anyway, the hdmi_safe setting looked interesting, so I tried uncommenting it and rebooting. That didn’t help.
  • Another likely one was suggested in this forum post: setting hdmi_force_hotplug=1. But that didn’t help either.
  • I’d also noticed that people were suggesting running /opt/vc/bin/tvservice as a diagnostic tool; I ran it, but it failed:
    [root@alarmpi ~]# /opt/vc/bin/tvservice
    /opt/vc/bin/tvservice: error while loading shared libraries: libvcos.so: cannot open shared object file: No such file or directory
    
  • Clearly something important was either missing or just not on the search path. Check:
    [root@alarmpi ~]# ldd /opt/vc/bin/tvservice
            libvcos.so => not found
            libpthread.so.0 => /lib/libpthread.so.0 (0x400f9000)
            libdl.so.2 => /lib/libdl.so.2 (0x40074000)
            librt.so.1 => /lib/librt.so.1 (0x4007f000)
            libc.so.6 => /lib/libc.so.6 (0x4013f000)
            /lib/ld-linux.so.3 (0x4004c000)
    
  • Right. This forum post made it clear where it was, so:
    [root@alarmpi ~]# ls /opt/vc/lib/
    libbcm_host.so    libEGL.so        libGLESv2_static.a  libkhrn_static.a  libmmal.so       libvcfiled_check.a  libvcos.so
    libcontainers.so  libEGL_static.a  libilclient.a       libluammal.so     libopenmaxil.so  libvchiq_arm.a      libvmcs_rpc_client.a
    libdebug_sym.so   libGLESv2.so     libkhrn_client.a    liblua.so         libOpenVG.so     libvchostif.a       libWFC.so
    [root@alarmpi ~]# export LD_LIBRARY_PATH=/opt/vc/lib/:$LD_LIBRARY_PATH
    [root@alarmpi ~]# ldd /opt/vc/bin/tvservice
            libvcos.so => /opt/vc/lib/libvcos.so (0x400c2000)
            libpthread.so.0 => /lib/libpthread.so.0 (0x4012d000)
            libdl.so.2 => /lib/libdl.so.2 (0x40038000)
            librt.so.1 => /lib/librt.so.1 (0x400ff000)
            libc.so.6 => /lib/libc.so.6 (0x4014d000)
            /lib/ld-linux.so.3 (0x40010000)
    
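  • (Aside: exporting LD_LIBRARY_PATH like that only fixes the current shell session. A more permanent option, which I’ve not actually tried, would be the standard glibc mechanism: drop the directory into the dynamic linker’s config and re-run ldconfig, like so:)
    [root@alarmpi ~]# echo /opt/vc/lib > /etc/ld.so.conf.d/vc.conf
    [root@alarmpi ~]# ldconfig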
  • Good. Now let’s get some diagnostics:
    [root@alarmpi ~]# /opt/vc/bin/tvservice -s
    state 0x40002, 720x480 @ 60Hz, interlaced
    [root@alarmpi ~]# /opt/vc/bin/tvservice -m CEA
    Group CEA has 0 modes:
    [root@alarmpi ~]# 
    
  • Hmmm. Not terribly helpful. But maybe there are more options? This forum post suggested one:
    [root@alarmpi ~]# /opt/vc/bin/tvservice -m DMT
    Group DMT has 13 modes:
               mode 4: 640x480 @ 60Hz, progressive
               mode 5: 640x480 @ 72Hz, progressive
               mode 6: 640x480 @ 75Hz, progressive
               mode 8: 800x600 @ 56Hz, progressive
               mode 9: 800x600 @ 60Hz, progressive
               mode 10: 800x600 @ 72Hz, progressive
               mode 11: 800x600 @ 75Hz, progressive
               mode 16: 1024x768 @ 60Hz, progressive
               mode 17: 1024x768 @ 70Hz, progressive
               mode 18: 1024x768 @ 75Hz, progressive
               mode 21: 1152x864 @ 75Hz, progressive
               mode 35: 1280x1024 @ 60Hz, progressive
               mode 36: 1280x1024 @ 75Hz, progressive
    
  • ooookay… mode 35 looks the most relevant for my monitor…
  • But reading the forum post further, it also sounds like removing the config.txt and rebooting is a good thing for debugging. Let’s do that first.
    [root@alarmpi ~]# mv /boot/config.txt .
    [root@alarmpi ~]# ls /boot
    arm128_start.elf  bootcode.bin      kernel_emergency.img  start.elf
    arm192_start.elf  cmdline.txt       kernel.img
    arm224_start.elf  kernel_debug.img  loader.bin
    [root@alarmpi ~]# ls .
    config.txt
    
  • Rebooted, and… well, bugger me, it worked!
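  • (For reference: had the blank config.txt not worked, the next thing I’d have tried, given that tvservice -m DMT listing, was forcing the monitor’s native mode in /boot/config.txt. Untested by me, since I never needed it, but it would look like this:)
    # group 2 = DMT (computer monitor) modes; mode 35 = 1280x1024 @ 60Hz
    hdmi_group=2
    hdmi_mode=35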

Running Django unit tests on PythonAnywhere

21 May 2012

I was working on a side project today, a Django app hosted at PythonAnywhere. While writing some initial unit tests, I discovered a confusing bug. When you try to run the tests for your app, you get an error creating the test database (for the avoidance of doubt, USERNAME was my PA username):

18:57 ~/somewhere (master)$ ./manage.py test
Creating test database for alias 'default'...
Got an error creating the test database: (1044, "Access denied for user 'USERNAME'@'%' to database 'test_USERNAME$default'")
Type 'yes' if you would like to try deleting the test database 'test_USERNAME$default', or 'no' to cancel: no
Tests cancelled.

The problem is that PythonAnywhere users don’t have the privileges to create the database test_USERNAME$default (whose name Django’s unit testing framework has auto-generated from USERNAME$default, the DB name in settings.py). PA only allows you to create new databases from its web interface, and only ones whose names are prefixed with your-username$.
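Incidentally, the test_ prefix in that database name isn’t PA-specific: Django builds the default test database name by prepending test_ to the NAME from your settings. A trivial illustration:

```python
# Django's default test database name is just 'test_' + the configured NAME,
# which is how USERNAME$default becomes test_USERNAME$default in the error
# above.
TEST_DATABASE_PREFIX = 'test_'

def default_test_db_name(name):
    return TEST_DATABASE_PREFIX + name

print(default_test_db_name('USERNAME$default'))  # prints test_USERNAME$default
```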

After a bit of thought, I realised that you can work around the problem by setting TEST_NAME in settings.py to point to a specific new database (say, USERNAME$unittest) and then creating a DB of that name from the MySQL tab. Once you’ve done that, you run the tests again; you get an error like this:

19:02 ~/somewhere (master)$ ./manage.py test
Creating test database for alias 'default'...
Got an error creating the test database: (1007, "Can't create database 'USERNAME$unittest'; database exists")
Type 'yes' if you would like to try deleting the test database 'USERNAME$unittest', or 'no' to cancel:

You just enter “yes”, and it drops and then recreates the database. This works because when you created the database from the MySQL page, the privileges were set up so that you can drop and recreate it yourself in the future. From then on, tests run just fine, with no DB errors.
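Put concretely, the settings change is just one extra key. A sketch of the relevant bit of settings.py (Django 1.x-era TEST_NAME key; the USER, PASSWORD and HOST values here are placeholders for your own, not real PA settings):

```python
# settings.py sketch: TEST_NAME points Django's test runner at a database
# you pre-created from PythonAnywhere's MySQL tab, instead of letting it
# try (and fail) to create test_USERNAME$default itself.
DATABASES = {
    'default': {
        'ENGINE': 'django.db.backends.mysql',
        'NAME': 'USERNAME$default',        # your normal database
        'TEST_NAME': 'USERNAME$unittest',  # pre-created from the MySQL tab
        'USER': 'USERNAME',                # placeholder
        'PASSWORD': 'your-db-password',    # placeholder
        'HOST': 'your-mysql-host',         # placeholder
    }
}
```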

Obviously we’ll be fixing this behaviour in the future (though I can’t offhand see how…). But there’s the workaround, anyway.

New business idea

10 April 2012

So, here’s the plan. We write an iPhone app. iPhone-only, no Android. It’s a simple social network, adding friends and chatting and sharing photos and all that crap. The cool thing is, it monitors your location. If you ever spend more than 50% of one week outside Shoreditch in London, the East Village in New York, or SoMa in San Francisco, it kicks you out — you can never log in again. Once a week, it asks you a question about post-1900 conceptual art or artisan food vendors in your area. If you get it wrong, it kicks you out. Every day you have to take a photo of yourself, and other users get to vote on your outfit/fixed-gear bike/ironic facial hair. If you get less than a 50% approval rating, it kicks you out. Finally, the app comes with a guarantee that if the company’s ever bought by Facebook, 10% of the purchase price goes to its few remaining members.

Who’s with me? What should we call it?