Bare Git repositories

We started a new project at Resolver today — I’m pretty excited about it, and will be blogging about it soon. However, in the meantime, here’s something that’s half a note-to-self and half something to help people googling for help with Git problems.

We’ve previously been using Subversion as our main source code control system, but for more recent projects we’ve moved to Mercurial. When we started the new one today, we decided to try out Git for a change; I use GitHub for my personal stuff, but hadn’t used it for anything involving multiple developers — and various people had been telling us that it wasn’t subject to some of the problems we’d had with Mercurial.

So we created a new Git repo on a shared directory, by creating a directory and then running git init in it. We then cloned it into a working directory on my machine, and started work. After a while, we had our first checkin ready, so we added the files, committed them, and then decided to push to the central repo to make sure everything worked OK. We got this error message:

remote: error: refusing to update checked out branch: refs/heads/master
remote: error: By default, updating the current branch in a non-bare repository
remote: error: is denied, because it will make the index and work tree inconsistent
remote: error: with what you pushed, and will require 'git reset --hard' to match
remote: error: the work tree to HEAD.
remote: error:
remote: error: You can set 'receive.denyCurrentBranch' configuration variable to
remote: error: 'ignore' or 'warn' in the remote repository to allow pushing into
remote: error: its current branch; however, this is not recommended unless you
remote: error: arranged to update its work tree to match what you pushed in some
remote: error: other way.
remote: error:
remote: error: To squelch this message and still keep the default behaviour, set
remote: error: 'receive.denyCurrentBranch' configuration variable to 'refuse'.

It took us a while to work out precisely what this meant, because we’d never heard of “bare” repositories before. It turns out that there are two kinds of repository in Git: bare and non-bare. A non-bare repository is the same as the ones we were used to in Mercurial; it has a bunch of working files, and a directory containing the version control information. A bare repository, by contrast, just contains the version control information — no working files.
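To make the distinction concrete, here’s a rough Python sketch that guesses which kind of repository a directory holds by looking at its layout. The function name is made up for illustration, and the check is only a heuristic based on the HEAD file and the objects and refs directories that Git keeps; asking Git itself with git rev-parse --is-bare-repository is the reliable way.

```python
import os


def repository_kind(path):
    """Guess whether path holds a bare repo, a non-bare repo, or neither.

    Heuristic only: it looks for the HEAD file plus the objects and refs
    directories, either directly in path (bare) or inside path/.git (non-bare).
    """
    def has_git_layout(directory):
        return (os.path.isfile(os.path.join(directory, "HEAD"))
                and os.path.isdir(os.path.join(directory, "objects"))
                and os.path.isdir(os.path.join(directory, "refs")))

    if has_git_layout(os.path.join(path, ".git")):
        return "non-bare"
    if has_git_layout(path):
        return "bare"
    return "not a repository"
```

Run against the shared directory we’d created with a plain git init, something like this would have reported "non-bare", which is exactly what the push was complaining about.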

Now, you can (in theory) push and pull between repositories regardless of whether they are bare or not. But if you were to push to a non-bare repository, it would cause problems. Part of the SCC data that Git keeps is an index, which basically tells it what the head of the current branch looks like. Now, if you push to a non-bare repository, Git will look at the working files, compare them to the index, and see that they differ — so it will think that the working files have changed! For example, if your push added a new file, it would notice that the working directory didn’t include that file, and would conclude that it had been deleted. There’s a step-by-step example here.

You can see how that could be confusing. So bare repositories exist as a way of having central repositories that a number of people can push to. If you want to transfer changes from one non-bare repository to another, the correct way is to pull from the destination rather than push from the source, which makes some kind of sense when you think about it. In general, any repository that someone is working on is not something that should be receiving changes without their approval… on the other hand, we’ve not encountered problems with pushing to regular repositories with Mercurial.

Anyway, this was our first checkin, so we had no history to lose. We fixed the problem by creating a new central repository using git --bare init in a new directory on the shared drive, cloning it to a new working repo, copying our files over from the old working repo to the new one, committing, and pushing back to the bare repository. It worked just fine. If we’d done multiple checkins before we tried our first push, we could have saved things by hand-editing the central repository. It had no working files (because we’d only just created it), so we could have moved the contents of the .git directory up to the repository’s root, deleted .git, and set core.bare to true in its config file; this would have “bared” it so that we could have pushed from our working repo. That would have been a bit scary, though.

Running Resolver One on Mono for Windows

Mono is an open source version of the .NET framework; it allows you to run .NET applications not just on Windows but on Linux and the Mac. I’ve spent quite some time over the last week getting our Python spreadsheet, Resolver One, to run on the Windows version, and thought it would be worth sharing some experiences.

Some background first: one of our long-term goals at Resolver Systems is to get our currently Windows-only products working on other platforms. Everything’s coded in .NET, so in an ideal world we’d just be able to run it all under Mono. However, there are two problems:

  1. Some of the third-party components we use are “.NET” in the sense that they offer us a .NET interface, but under the hood they call down to lower-level Windows functions using P/Invoke. Because they’re using Windows-specific stuff, they won’t run on non-Windows operating systems, even with Mono.
  2. As always with these things, while there is a formal specification for what .NET implementations like Microsoft’s .NET or Mono are meant to do (called the CLI), implementations differ due to bugs, things that are awaiting implementation, or ambiguities in the spec.

Obviously, we need to handle both of these kinds of problem to successfully port our software. We’re handling the first kind by moving over to newer, “pure” .NET components, for example by writing our own grid. But we didn’t want to finish all that work and only then discover all of the problems caused by the second kind of incompatibility. Now, Mono does support P/Invoke, so while the first kind of problem clearly prevents us from running on Mono right now on, say, a Mac, it does not prevent us from running on Mono on Windows. So we decided to do that simpler port in parallel with our development of the new components, so that we could find out any nasty issues of the second kind as early as possible.

First things first: it really was surprisingly easy! All kudos to the Mono developers, this really is an example of an open source project that works really well. The problems below are really low-level details, and most of them are unlikely to hit anyone apart from us. However, it’s worth enumerating them, at least for posterity’s sake — and perhaps they’ll be helpful for people Googling for solutions to similar obscure problems.

Problem 1: A change to the process binary

The first problem we hit was our code to load up our DLLs. Resolver One comprises quite a few libraries, and we need to be careful to load the specific ones that it’s shipped with rather than others that might be on the user’s path. We do this by finding our binary’s location, using Path.GetDirectoryName(Process.GetCurrentProcess().MainModule.FileName), and then using Assembly.LoadFile(filename) to load the DLLs explicitly rather than the more normal clr.AddReference, which uses the path.

The problem is, when you run a .NET application under Mono, the current process’s binary is not (say) Resolver-One.exe, but instead mono.exe. So Resolver One was trying to find its libraries in the Mono install directory, which obviously didn’t work. In the short term, we were able to work around this by hard-wiring the DLL path. In the longer term, we’ll have to do something more clever…

Problem 2: ImeModeBase

Once we’d fixed the first problem we got a new error: Could not find "System.Windows.Forms.Control.get_ImeModeBase". A bit of investigation made it clear that the current version of Mono doesn’t support this method (though when it does, the target of that link will show it). It looks like the method was introduced in the (relatively recent) .NET 2.0 SP2, and presumably it will be implemented sometime, but it’s not there right now.

The question was, could Resolver One run without this method? The answer seemed likely to be “yes”, as we were able to run on earlier versions of .NET 2.0. We took a look at our codebase and tried to find out what it was that was referencing the method. It turned out to be our “precompiled subclass” DLLs. These are something we introduced a while back to improve our startup performance; simplifying a bit, what happens is that when we package up Resolver One for distribution, we run a compile process on all of the classes in our code that are subclasses of .NET components. Doing this process once before we release the software means that it’s not done every time Resolver One starts up, with obvious performance benefits. The downside is that the compiled subclasses have explicit references to the members of their .NET superclasses, whether they use them or not. And because we run the compilation process under .NET 2.0 SP2, our compiled subclasses wind up with explicit references to stuff that doesn’t exist in earlier versions of .NET, or (as it turns out) in Mono.

The good news is that if you’re willing to take a performance hit, the subclass compilation can be made optional. This isn’t something we put in the production version of Resolver One right now, as it’s Windows only (and so there’s little point in having a “start up more slowly for no useful reason” command-line option), but it was easy to add in. Using non-precompiled subclasses got us past this problem. Perhaps our future Linux/Mac-compatible version can use subclasses that are precompiled against Mono.

Problem 3: System.Reflection:MonoGenericMethod

This one was the easiest one to find out how to work around, but took the longest time. Once we’d got past the precompiled assembly problem, trying to start Resolver One gave us a dialog saying “** ERROR **: Can’t handle methods of type System.Reflection:MonoGenericMethod aborting…”. Mono then bombed out with a Windows “application has stopped working” dialog.

A bit of Googling followed, and we were delighted to discover that a bug that triggered this exact error message had been fixed just ten days previously on the Mono trunk and the 2.6 branch. There’s luck for you.

Unfortunately we also discovered that the Mono team don’t release nightly builds of their Windows installer, so the only ways to get this fix would be either to wait until the 2.6.5 release, or to build it ourselves. Being impatient by nature, we decided to do the latter, and this took quite a while. I’ll post separately about what we had to do, as it may be useful for others; there’s a lot of excellent documentation on building Mono for Windows, but some of it’s a bit out of date. Luckily, the people on the Mono email list are friendly and helpful, so we were able to get there in the end.

So, after a bit of work we had a working version of Mono built from the tip of the 2.6 branch.

Problem 4: Logfile locations

When we started up Resolver One with the new Mono, we got the error SystemError: Access to the path "C:\ResolverOne.log" is denied. This was interesting, because our default logfile location is determined using Path.GetTempPath(). I’m not sure where Mono picks up the value for that, but presumably it was returning an empty string. Perhaps we were missing something in our environment variables? Either way, we decided to work around it by using Resolver One’s --logfile command-line option.

Problem 5: Madness in our methods

When told to log to an appropriate directory, Resolver One started up, and things started looking pretty good! The splash screen appeared, the “starting” progress bar moved and then… it crashed. The log file had this in it:

Exception: Method DLRCachedCode:FormulaLanguage.parsetab$16 (IronPython.Runtime.CodeContext,IronPython.Runtime.FunctionCode) is too complex.
CLS Exception: System.InvalidProgramException: Method DLRCachedCode:FormulaLanguage.parsetab$16 (IronPython.Runtime.CodeContext,IronPython.Runtime.FunctionCode) is too complex.
  at IronPython.Compiler.OnDiskScriptCode.Run () [0x00000] in :0
....

A bit of Googling led us to two pages that suggested that “too complex” means that the method in question was too long. The module FormulaLanguage.parsetab is, as you might expect, related to the code we use to parse the formula language (that is, the language in which you write formulae in cells). This language is specified in our code in BNF with associated handler code, and compiled down into Python by the excellent PLY. The parsetab module is the generated code, and as you might expect it has some pretty unreadable stuff in it; there’s one dictionary literal that is on one 81,000-character line.

The easy fix to work around this problem was to split parsetab.py up into multiple modules. There were three variables that were being initialised with oversized literal expressions, _lt_action_items, _lt_goto_items, and _lt_productions. We created a separate module for each, which simply contained the initialisation code for the specific variables: lt_action_items_file.py, lt_goto_items_file.py, and lt_productions_file.py. Finally, we replaced the code in parsetab.py that had been moved to the new files with appropriate import statements: for example, from lt_action_items_file import _lt_action_items.

This fixed the problem, and allowed us to move onto the next one! It’s not obvious how we’ll fix this in the production release, though — the file is auto-generated, so either we’ll have to patch PLY or post-process it. Something for us to think about.
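The post-processing option could look something like the sketch below. It isn’t what we actually did (we split the file by hand), and the function name is hypothetical; it assumes Python 3.8 or later for end_lineno, that each oversized assignment sits on its own lines, and that the literals don’t refer to anything else in the module.

```python
import ast


def split_oversized_assignments(source, names):
    """Move selected top-level assignments out of a generated module.

    Returns (new_source, extracted), where extracted maps a new module
    name to the source for that module. Each moved assignment is replaced
    in the main module by an import of the same variable.
    """
    lines = source.splitlines()
    extracted = {}
    replacements = []
    for node in ast.parse(source).body:
        if (isinstance(node, ast.Assign)
                and len(node.targets) == 1
                and isinstance(node.targets[0], ast.Name)
                and node.targets[0].id in names):
            name = node.targets[0].id
            module = name.lstrip("_") + "_file"
            start, end = node.lineno - 1, node.end_lineno
            extracted[module] = "\n".join(lines[start:end]) + "\n"
            replacements.append(
                (start, end, "from %s import %s" % (module, name)))
    # Work backwards so earlier line numbers stay valid while we edit.
    for start, end, import_line in reversed(replacements):
        lines[start:end] = [import_line]
    return "\n".join(lines) + "\n", extracted
```

Given the three variable names above, this would produce the same lt_action_items_file, lt_goto_items_file and lt_productions_file modules, ready to be written out next to parsetab.py.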

Problem 6: Extra vars

The error we got after fixing the parsetab problem was a bit obscure:

Exception: Finalize
CLS Exception: System.Collections.Generic.KeyNotFoundException: Finalize
at IronPython.Runtime.PythonDictionary.GetItem (object) 

The actual location of the error took quite a long time to track down, due to the vagaries of the way we import modules and its effects on stack traces. We eventually wound up having to do a binary chop with log statements in our startup code until we managed to narrow it down to a single line!

It turned out that we have some code that needed to create wrapper functions around all of the functions in a different module. It did this by looping over the values in the dictionary returned by vars(other_module). However, it didn’t want to wrap functions that were internal to .NET, so it had a list of function names to exclude; specifically, MemberwiseClone and Equals. Clearly these were two function names that had been found by experiment to belong to IronPython modules when running under .NET. The error we were getting was ultimately being caused by IronPython under Mono having just one extra function visible on the module: Finalize. Adding that to the list of exclusions got us past this error, and on to:

Problem 7: Um… that’s it!

…on to nothing else! Once we’d fixed the Finalize problem Resolver One started up and ran reasonably happily under Mono on Windows. There were glitches, of course; our code editor component, in particular, didn’t like being resized. But the software worked well enough to test, which is all we need for now.
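For the record, the exclusion-list fix from Problem 6 boils down to something like the following. This is a simplified sketch rather than our real code: the logged wrapper and the function names are invented for illustration, and only the three excluded names come from the story above.

```python
import functools

# Names found by experiment to be .NET internals rather than real functions.
EXCLUDED_NAMES = {"MemberwiseClone", "Equals", "Finalize"}


def logged(function, log):
    """Hypothetical wrapper: record the call, then delegate to the original."""
    @functools.wraps(function)
    def wrapper(*args, **kwargs):
        log.append(function.__name__)
        return function(*args, **kwargs)
    return wrapper


def wrap_module_functions(module, log, excluded=EXCLUDED_NAMES):
    """Wrap every callable in a module's namespace, skipping excluded names."""
    wrapped = {}
    for name, value in vars(module).items():
        if name in excluded or not callable(value):
            continue
        wrapped[name] = logged(value, log)
    return wrapped
```

A list of names discovered by experiment is fragile, as we found out; an allow-list of the functions we do want to wrap might have been the safer design.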

There’s obviously a lot to be done before we can get Resolver One running on Macs and Linux machines; the creation of our grid component is going well, but takes time, and we need to do something about the code editor. But the good news is that we’ve identified the incompatibilities between Mono and Microsoft .NET that will hit us beyond the obvious operating system issues, and there’s nothing we can’t work around, given a bit of ingenuity. It took a while, but at the end of the day, it was surprisingly easy :-)

An odd crontab problem

This took a little while to work out, so it’s worth sharing here just in case anyone else has the same problems and is googling for solutions. We had a problem on one of our web servers at Resolver which manifested itself in some (but not all) cron jobs being run twice, which was causing all kinds of problems. Here’s how we tracked it down and solved it.

The main symptom of the problem was that something was going wrong with Apache logfile rotation. The files appeared to be being rotated twice, so each week we’d get a properly rotated one and then a zero-length one created immediately after:

-rw-r--r-- 1 root root  861 2010-05-16 06:23 access.log.2.gz
-rw-r----- 1 root adm     0 2010-05-16 06:25 access.log.1
-rw-r----- 1 root adm  5590 2010-05-18 11:20 access.log

This was annoying, and it was making it unnecessarily difficult to measure some of our website metrics. It also made us worry that data was being lost; there were occasionally gaps in the logfiles, where it looked like a week’s worth of data had been lost while rotating.

Our first thought was that because something seemed to be going wrong with the log rotate script, it was odd that we weren’t receiving any email about it. Normally if a cron job writes output, it gets emailed to root. A bit of investigation revealed a problem with the mail setup (which I won’t go into now), and fixing it led to some interesting information. Once we’d fixed the email problem, we started getting messages like this at 17 minutes past every hour (when the hourly jobs were scheduled to run):

Subject: Cron  root	test -x /usr/sbin/anacron || ( cd / && run-parts --report /etc/cron.daily )

/bin/sh: root: command not found

It appeared that it was prepending the name of the user account that a cron job should be run as onto the job’s command. Now, the hourly jobs are triggered by a line in /etc/crontab that looks like this:

17 *    * * *   root    cd / && run-parts --report /etc/cron.hourly

So our first guess was that it was some kind of whitespace thing; the format is

MM HH DD MM WW username command

So perhaps there were spaces separating the day-of-week (WW) field from the username, when there should have been a single space or a tab? On first examination, this looked like it might be it: every other line in the crontab used a tab to separate the two fields, but the hourly cron job line used a number of spaces. We fixed that, and waited for the next 17 minutes past the hour.

But it didn’t work — we got the same error. By this time, it was getting quite late in the evening, so we left it to run overnight to see if we got any more useful information.

The next morning, we found that (as you’d expect) we’d got an error message at 17 minutes past each hour. However, more usefully, we got this pair of emails:

Time: 06:25
Subject: Cron  root	test -x /usr/sbin/anacron || ( cd / && run-parts --report /etc/cron.daily )

/bin/sh: root: command not found
/etc/cron.daily/logrotate:
error: Failed to remove old log /var/log/mysql.log.8.gz: No such file or directory
run-parts: /etc/cron.daily/logrotate exited with return code 1
/etc/cron.daily/sysklogd:
mv: cannot stat `/var/log/syslog.new': No such file or directory

Time: 06:25
Subject: Cron  test -x /usr/sbin/anacron || ( cd / && run-parts --report /etc/cron.daily )

/etc/cron.daily/standard:
mv: cannot stat `.//dpkg.status.5.gz': No such file or directory
/etc/cron.daily/sysklogd:
gzip: /var/log//syslog.0: No such file or directory
mv: cannot stat `/var/log//syslog.0.gz': No such file or directory

So, the problem seemed clear. Each line in the crontab was being run twice; once with the username being mistakenly taken as part of the command, and once without. This was what was causing the double log-rotation — and perhaps other problems besides.

It didn’t take long to work out what had happened. The format of the file /etc/crontab is unusual in having a username field between the timing information and the command to run; there are also separate crontabs for each user, which omit that field. You set up your own per-user crontab by running crontab filename; this installs the specified file as your personal crontab. However, there’s no need to do that with /etc/crontab — it’s always run, and there’s no need to install it.

Clearly, someone had been unaware of that last point, and after modifying /etc/crontab, had run crontab /etc/crontab as root to install it. We checked this by running crontab -l as root, which outputs the installed crontab for the current user; as we suspected, it displayed an out-of-date version of the contents of /etc/crontab. Running crontab -r to remove it fixed the problem; jobs were being run from /etc/crontab only, and things started working again.
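If you want to check for the same mistake on your own machines, a small Python sketch like this one can flag lines in a per-user crontab that look as if they came from /etc/crontab. The function name and the default list of usernames are assumptions for illustration; feed it the output of crontab -l.

```python
def find_system_style_lines(crontab_text, usernames=("root",)):
    """Return per-user crontab lines whose sixth field is a username.

    A per-user crontab has five time fields followed by the command, so a
    username in the sixth position suggests the line was copied from
    /etc/crontab, which has an extra username field.
    """
    suspicious = []
    for line in crontab_text.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue
        # Five time fields, a candidate username, then the rest of the command.
        fields = stripped.split(None, 6)
        if len(fields) == 7 and fields[5] in usernames:
            suspicious.append(line)
    return suspicious
```

It ignores comments and environment settings such as SHELL=/bin/sh, and it doesn’t try to handle shortcuts like @reboot.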

Generating political news using NLTK

It’s election week here in the UK; on Thursday, we’ll be going to the polls to choose our next government. At Resolver Systems, thanks to the energy and inventiveness of our PR guys over at Chameleon, we’ve been doing a bunch of things related to this, including some analysis for the New Statesman that required us to index vast quantities of tweets and newspaper articles.

Last week I was looking at the results of this indexing, and was reminded of the fun I had playing with NLTK back in February. NLTK is the Python Natural Language Toolkit; as you’d expect, it has a lot of clever stuff for parsing and interpreting text. More unexpectedly (at least for me), it has the ability to take some input text, analyse it, and then generate more text in the same style. Here’s something based on the Book of Genesis:

In the selfsame day entered Noah , and asses , flocks , and Maachah . And Joseph
said unto him , Abrah and he asses , and told all these things are against me . And
Jacob told Rachel that he hearkened not unto you . And Sarah said , I had seen the
face of the air ; for he hath broken my covenant between God and every thing that
creepeth upon the man : And Eber lived after he begat Salah four hundred and thirty
years , and took of every sort shalt thou be come thither .

It was the work of a moment to knock together some code that would read in all of the newspaper articles that we’d tagged as being about a particular subject, run them through a Beautiful Soup-based parser to pull out the article text, and feed that into NLTK, then to dump the results into a Wordpress blog (after a little manual polishing for readability).

The result? REABot, the Resolver Electoral Analysis Robot. Here’s a sample of what I think is its finest post, which was based on articles about Nick Clegg:

They’re interested in local government, free TV licences, pension credits and child
trust fund, Carrousel Capital, run by local Liberal Democrats. TV Exclusive Trouser
Clegg Nick Clegg, but clashed on how the vexing issue of honesty, principles and
policies of electric shock. It is easy to do. "Louis Vuitton advertising used to pay back
your debts", he declared that he has delivered his strongest warning yet on the party
first place and still obsessed with outdated class structures. Inspired by Barack
Obama’s repertoire, they advise you to send a message to voters at home. "You
haven’t want to try to counter the threat of it yet," he says.

So, what does the code look like? It’s actually trivially simple. Let’s say that we’ve downloaded all of the contents of the newspaper articles (I shan’t waste your time with HTML-munging code here) and put them into objects with content fields. Here’s what REABot does:

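import nltk

# Split the text into runs of word characters or runs of punctuation.
tokenizer = nltk.tokenize.RegexpTokenizer(r'\w+|[^\w\s]+')

# articles: the downloaded articles, each with a content field.
content_text = ' '.join(article.content for article in articles)
tokenized_content = tokenizer.tokenize(content_text)
# Build a trigram model over the token stream.
content_model = nltk.NgramModel(3, tokenized_content)

# Generate 100 throwaway tokens and keep the last two as a starting point.
starting_words = content_model.generate(100)[-2:]
# words_to_generate: the desired length of the output, in tokens.
content = content_model.generate(words_to_generate, starting_words)
print ' '.join(content)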

It’s a bit of a hack — I’m sure an NLTK expert could write something much more elegant — but it works :-) What this does is generate a single string, which is formed of the text of all of our relevant articles, and runs it through a tokeniser, which splits it up into words and punctuation symbols, so that (for example) the string "I spent some time this afternoon playing with NLTK, the Python Natural Language Toolkit; the book is highly recommended." would be turned into the list ['I', 'spent', 'some', 'time', 'this', 'afternoon', 'playing', 'with', 'NLTK', ',', 'the', 'Python', 'Natural', 'Language', 'Toolkit', ';', 'the', 'book', 'is', 'highly', 'recommended', '.']
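If you don’t have NLTK to hand, the tokenising step can be approximated with the standard library. This sketch assumes the tokeniser simply returns every non-overlapping match of the pattern, which is how it behaves for this example; the tokenize function is my own stand-in, not part of NLTK.

```python
import re

# The same pattern as above: a run of word characters, or a run of
# characters that are neither word characters nor whitespace.
TOKEN_PATTERN = re.compile(r"\w+|[^\w\s]+")


def tokenize(text):
    """Split text into word tokens and punctuation tokens."""
    return TOKEN_PATTERN.findall(text)
```

Note that a run of adjacent punctuation marks comes out as a single token, which is why the generated text sometimes contains odd clusters of symbols.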

This is then fed into an NgramModel. This is nothing to do with Scientology; Ngram is a word created by extension from “bigram” and “trigram” to refer to collections of n tokens. What we’re doing with the expression nltk.NgramModel(3, tokenized_content) is creating an NLTK object that, in effect, knows about every three-token sequence (trigram) that occurs in the tokenised text (['I', 'spent', 'some'], ['spent', 'some', 'time'], ['some', 'time', 'this'], and so on), and knows how frequently each one occurs.

Once we’ve got the set of all possible trigrams and their respective frequencies, it’s pretty easy to see how we can generate some text given two starting words and a simple Markov-chain algorithm:

  • Let’s say that we start off with ['The', 'tabloid'].
  • Our analysis might tell us that there are three trigrams starting with those two tokens, ['The', 'tabloid', 'headlines'] 50% of the time, ['The', 'tabloid', 'newspapers'] 10% of the time, and ['The', 'tabloid', 'titles'] 40% of the time.
  • We generate a random number, and if it’s less than 0.5, we emit “headlines”, if it’s between 0.5 and 0.6, we emit “newspapers”, and if it’s between 0.6 and 1.0, we emit “titles”. Let’s say it was 0.7, so we now have ['The', 'tabloid', 'titles'].
  • The next step is to look at the trigrams starting ['tabloid', 'titles']; we work out the probabilities, roll the dice again, and get (say) ['tabloid', 'titles', 'have']
  • Repeat a bunch of times, and we can generate any arbitrarily long text.
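Here’s a stripped-down sketch of those steps in plain Python. It’s my own toy version with made-up function names, using raw trigram counts and nothing else; it isn’t NLTK’s code.

```python
import random
from collections import Counter, defaultdict


def build_trigram_counts(tokens):
    """Map each pair of adjacent tokens to a count of the tokens following it."""
    counts = defaultdict(Counter)
    for first, second, third in zip(tokens, tokens[1:], tokens[2:]):
        counts[(first, second)][third] += 1
    return counts


def generate(counts, start, length, rng=random):
    """Extend the two starting tokens by weighted random choice, one at a time."""
    output = list(start)
    while len(output) < length:
        followers = counts.get(tuple(output[-2:]))
        if not followers:
            break  # Dead end: this pair only occurs at the very end of the text.
        words = list(followers)
        weights = [followers[word] for word in words]
        # Roll the dice: commoner followers are proportionally more likely.
        output.append(rng.choices(words, weights=weights)[0])
    return output
```

With a tiny corpus where every pair has only one follower, the output is just the original text; the fun starts when the corpus is big enough for pairs to have several possible continuations.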

This is pretty much what the NgramModel’s generate method does. Of course, the question is, how do we get two words to start with? By default, the method will always use the first two tokens in the input text, which means that every article we generate based on the same corpus starts with the same words. (Those who know the Bible will now know why the bit from Genesis started with the words “In the”.)

I worked around this by telling it to first generate a 100-token stream of text and pick out the last two:

starting_words = content_model.generate(100)[-2:]

…and then to generate the real output using those two as the starting point:

content = content_model.generate(words_to_generate, starting_words)

It’s kind of (but not really ;-) like seeding your random number generator.

And that’s it! Once the text has been generated, I just copy and paste it into a Wordpress blog, do a bit of prettification (for example, remove the spaces from before punctuation and — perhaps this is cheating a little — balance brackets and quotation marks), add appropriate tags, and hit publish. It takes about 5 minutes to generate an article, and to be honest I think the end result is better than a lot of the political blogs out there…
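I did the tidying by hand, but the first part of it (removing the spaces before punctuation) would be easy enough to automate. A minimal sketch, with a function name and a punctuation list that are my own choices:

```python
import re


def tidy_punctuation(text):
    """Remove whitespace that sits immediately before common punctuation."""
    return re.sub(r"\s+([,.;:!?])", r"\1", text)
```

Balancing brackets and quotation marks needs more judgment, so that part probably stays manual.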

[An aside to UK readers: does anyone know if the business news in The Day Today was generated by something like this?]

Regular expressions and Resolver One column-level formulae

Recently at Resolver we’ve been doing a bit of analysis of the way people, parties and topics are mentioned on Twitter and in the traditional media in the run-up to the UK’s next national election, on behalf of the New Statesman.

We’ve been collecting data, including millions of tweets and indexes to newspaper articles, in a MySQL database, using Django as an ORM; sometime in the future I’ll describe the system in a little more depth. However, from our perspective the most interesting thing about it is how we’re doing the analysis: in Resolver One, of course.

Here’s one little trick I’ve picked up: using regular expressions in column-level formulae as a way of parsing the output of MySQL queries.

Let’s take a simple example. Imagine you have queried the database for the number of tweets per day about the Digital Economy Bill (or Act). It might look like this:

+------------+----------+
| Date       | count(*) |
+------------+----------+
| 2010-03-30 |       99 |
| 2010-03-31 |       30 |
| 2010-04-01 |       19 |
| 2010-04-02 |       12 |
| 2010-04-03 |        2 |
| 2010-04-04 |       13 |
| 2010-04-05 |       30 |
| 2010-04-06 |      958 |
| 2010-04-07 |     1629 |
| 2010-04-08 |     1961 |
| 2010-04-09 |     4038 |
| 2010-04-10 |     2584 |
| 2010-04-11 |     1940 |
| 2010-04-12 |     3333 |
| 2010-04-13 |     2421 |
| 2010-04-14 |     1319 |
| 2010-04-15 |     1387 |
| 2010-04-16 |     3194 |
| 2010-04-17 |      860 |
| 2010-04-18 |      551 |
| 2010-04-19 |      859 |
| 2010-04-20 |      685 |
| 2010-04-21 |      528 |
| 2010-04-22 |      631 |
| 2010-04-23 |      591 |
| 2010-04-24 |      320 |
| 2010-04-25 |      363 |
| 2010-04-26 |      232 |
+------------+----------+

Now, imagine you want to get these numbers into Resolver One, and because it’s a one-off job, you don’t want to go to all the hassle of getting an ODBC connection working all the way to the DB server. So, first step: copy from your PuTTY window, and second step, paste it into Resolver One:

Right. Now, the top three rows are obviously useless, so let’s get rid of them:

Now we need to pick apart things like | 2010-03-30 | 99 | and turn them into separate columns. The first step is to import the Python regular expression library:

…and the next, to use it in a column-level formula in column B:

Now that we’ve parsed the data, we can use it in further column-level formulae to get the dates:

…and the numbers:

Finally, let’s pick out the top 5 dates for tweets on this subject; we create a list

…sort it by the number of tweets in each day…

…reverse it to get the ones with the largest numbers of tweets…

…and then use the “Unpack” command (control-shift-enter) to put the first five elements into separate cells.

Now, once we’ve done this once, it’s easy to use for other data; for example, we might want to find the five days when Nick Clegg was mentioned most on Twitter. We just copy the same kind of numbers from MySQL, paste them into column A, and the list will automatically update:

So, a nice simple technique to create a reusable spreadsheet that parses tabular data.
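Outside the spreadsheet, the same parse-then-rank steps might look like this in plain Python. The regular expression and the function names here are assumptions made for this sketch, not the formulae used in the workbook.

```python
import re
from datetime import datetime

# One data row of MySQL's tabular output: | 2010-03-30 |       99 |
ROW_PATTERN = re.compile(r"\|\s*(\d{4}-\d{2}-\d{2})\s*\|\s*(\d+)\s*\|")


def parse_rows(pasted_text):
    """Return (date, count) pairs, skipping border and header lines."""
    rows = []
    for line in pasted_text.splitlines():
        match = ROW_PATTERN.match(line.strip())
        if match:
            date = datetime.strptime(match.group(1), "%Y-%m-%d").date()
            rows.append((date, int(match.group(2))))
    return rows


def top_days(rows, n=5):
    """Return the n rows with the largest counts, biggest first."""
    return sorted(rows, key=lambda row: row[1], reverse=True)[:n]
```

Because lines that don’t match are skipped, there’s no need to delete the top three rows by hand first.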

An aside: SEO for restaurants

The other day, we got an ad through our letterbox for a new Thai restaurant. We’d become fed up with the other neighbourhood Thais, so decided to try this one this evening. We could remember the name, “Cafe de Thai”, and the street, All Saints Road, but no more, but hey, no problem: let’s Google it!

The results were odd; I won’t link to them because they’ll change rapidly enough, but what we found was that the front page results had two links to aggregators of celebrity Twitter accounts (because someone who is apparently semi-famous tweeted about the place), but everything else was about other places on the same street, or with vaguely similar names. By contrast, a search for their competitors came up with a bunch of random London restaurant listing sites, many of which I’d never heard of — but all of which had the information I was looking for, to wit the telephone number and the precise address.

What’s interesting to me is that (a) neither restaurant’s own web page was on the first page of the listings, and (b) this didn’t matter. All that mattered was that the contact details were at the front of the list; the more established place had loads of listings sites giving contact details for them, but the newer place was nowhere to be found. So perhaps, while software companies spend money to make as sure as possible that their own website is at the top of the search results for their name and industry segment, SEO for restaurants is much more nuanced: you don’t need your own website to come first, just that of a decent listings site. Ideally, one would assume, a listings site where you get a good rating…

Anyway, just in case anyone has wound up on this page looking for details of the restaurant:

Cafe de Thai
29 All Saints Road
London
020 7243 3001

I recommend the scallops and the weeping tiger; Lola liked her dim sum and red curry with prawns. Alan Carr recommends the green curry, apparently…

OpenCL: .NET, C# and Resolver One integration — the very beginnings

Today I wrote the code required to call part of the OpenCL API from Resolver One; just one function so far, and all it does is get some information about your hardware setup, but it was great to get it working. There are already .NET bindings for OpenCL, but I felt that it was worthwhile reinventing the wheel — largely as a way of making sure I understood every spoke, but also because I wanted the simplest possible API, with no extra code to make it more .NETty. It should also work as a great example of how you can integrate a C library into a .NET/IronPython application like Resolver One.

I’ll be documenting the whole thing when it’s a bit more finished, but if you want to try out the work in progress, and are willing to build the interop code, here’s how:

  • Make sure you have OpenCL installed: here’s the NVIDIA OpenCL download page, and here’s the OpenCL page for ATI. I’ve only tested this with NVIDIA so far, so I’m keen to hear of any incompatibilities.
  • Clone the dot-net-opencl project from Resolver Systems’ GitHub account.
  • Load up the DotNetOpenCL.sln project file in the root of the project using Visual C# 2008 (here’s the free “Express” version if you don’t have it already).
  • Build the project
  • To try it out from IronPython, run ipy test_clGetPlatformIDs.py
  • To try it in Resolver One, load test_clGetPlatformIDs.rsl

That should be it! If you want to look at the code, the only important bit is in DotNetOpenCL.cs — and it’s simply an external method definition… the tricky bit was in working out which OpenCL function to write an external definition for, and what that definition should look like.
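
To give a flavour of what an external definition like that involves, here is a minimal sketch of the same idea using Python’s ctypes module rather than C#. This is not the code in DotNetOpenCL.cs; the function name count_opencl_platforms is made up for illustration. It declares the C signature of clGetPlatformIDs by hand and asks only for the number of platforms, returning None if no OpenCL library can be found or the call fails.

```python
import ctypes
import ctypes.util
import sys


def count_opencl_platforms():
    """Return the number of OpenCL platforms, or None if it can't be determined."""
    path = ctypes.util.find_library("OpenCL")
    if path is None:
        return None  # no OpenCL library found on this machine
    try:
        # The OpenCL headers use the stdcall convention on Windows, hence WinDLL there
        loader = ctypes.WinDLL if sys.platform == "win32" else ctypes.CDLL
        lib = loader(path)
        fn = lib.clGetPlatformIDs
    except (OSError, AttributeError):
        return None
    # cl_int clGetPlatformIDs(cl_uint num_entries,
    #                         cl_platform_id *platforms,
    #                         cl_uint *num_platforms)
    fn.argtypes = [
        ctypes.c_uint32,
        ctypes.POINTER(ctypes.c_void_p),
        ctypes.POINTER(ctypes.c_uint32),
    ]
    fn.restype = ctypes.c_int32
    num_platforms = ctypes.c_uint32(0)
    # Passing NULL for the platforms array asks only for the count
    status = fn(0, None, ctypes.byref(num_platforms))
    if status != 0:  # CL_SUCCESS is 0
        return None
    return num_platforms.value


print(count_opencl_platforms())
```

Almost all of the sketch is the declaration itself: the argument types, the return type and the calling convention.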

I’ve put a slightly tidied version of the notes I kept as I implemented this below, for posterity’s sake; if you’re interested in finding out how the implementation went, read on…

OpenCL: first investigations with an NVIDIA card

I’m taking a look at OpenCL at the moment, with the vague intention of hooking it up to Resolver One. OpenCL, in case you’ve not heard about it, is a language that allows you to do non-graphical computing on your graphics card (GPU). Because GPUs have more raw computational power than even modern CPUs, in the form of a large number of relatively slow stream processors, this can speed up certain kinds of calculations — in particular, those that are amenable to massive parallelisation.
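
As a rough illustration of what "amenable to massive parallelisation" means, here is a plain Python sketch of a matrix-vector multiplication. It is not OpenCL code and runs sequentially; the function names are made up. The point is only that each output element depends on one row and the vector and on nothing else, so in principle every row could be handed to a separate processor.

```python
def row_dot(row, vector):
    """One independent unit of work: a single matrix row times the vector."""
    return sum(a * b for a, b in zip(row, vector))


def mat_vec_mul(matrix, vector):
    # No row's result depends on any other row's, so these calls could all run at once
    return [row_dot(row, vector) for row in matrix]


print(mat_vec_mul([[1, 2], [3, 4], [5, 6]], [10, 1]))  # prints [12, 34, 56]
```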

Until recently, the two main graphics card manufacturers had their own languages for this kind of general-purpose GPU computing; NVIDIA had CUDA, and ATI had their Stream technology. OpenCL was created as a way of having one language that would work on all graphics cards, so although the tools for developing using it are not currently as good as those for CUDA (which has been around for a long time and has great support), as a longer-term investment OpenCL looks to me like the best one to be looking at.

It took a little bit of work to get something up and running on my machine here at work, so it’s probably worth documenting to help others who are trying the same.

I’m doing this using a machine with an NVIDIA GeForce 8600 GT graphics card; it’s a bit old, but it can run CUDA, and (as you might expect) NVIDIA’s OpenCL implementation is built on top of CUDA. So this description will probably only help people trying to get stuff working using NVIDIA cards. I have a laptop with an ATI card at home, and I’ll try installing it all there some other time and write that up too.

Here’s what I did, including mis-steps and error messages:

  • Firstly, I obviously needed to download the appropriate drivers and libraries from NVIDIA. Here is their OpenCL download page. From there, I followed the “Click here to download OpenCL” link, gave them my details when asked, and then on the resulting page chose the “NVIDIA Drivers for WinVista and Win7 (190.89)” 32-bit version.
  • I installed the drivers. Windows warned that it couldn’t verify they’d work, but I went ahead anyway. While installing, it did odd stuff including blanking the display and switching resolution a few times (unsurprisingly given that it’s basically a new graphics driver) but seemed to succeed. It wanted to reboot, so I let it do so.
  • When the machine came back, I found some PhysX demos on the start menu. PhysX is a separate but related NVIDIA product that allows games developers to use the graphics card to handle parts of their calculations — for example, simulating realistic cloth. This looked like a good way to check the install had worked, so I tried running it. Unfortunately when I tried, it told me that I needed DirectX 9.0c and only had 9.0 installed. I checked the machine (which used to be used by someone else) and discovered that it hadn’t been updated for a long time — it didn’t even have Vista Service Pack 1, which is two years old!
  • So, I let Windows Update install everything it wanted to install (which took a few hours) and tried again. Unfortunately, I got the same error.
  • A bit of Googling found Microsoft’s page for downloading the latest versions of DirectX, so I ran that and tried again. This time it worked, and I was able to look at the PhysX demos; here’s a video from someone else showing what they look like.
  • Right, time for some real OpenCL. I downloaded “GPU Computing SDK code samples and more” from the NVIDIA site where I originally got the drivers, and installed it.
  • It put an icon titled “NVIDIA GPU Computing SDK Browser” on the desktop, so I double-clicked it. A dialog came up saying “The application has failed to start because its side-by-side configuration is incorrect. Please see the application event log for more detail.” I decided not to worry about this; errors like that can be a pain to track down (it’s usually a missing or misplaced DLL) and given that the app in question is just a browser for demos, and the stuff that was just installed was mostly the source code for those demos, it looked like a good plan to go straight to the source code.
  • In C:\ProgramData\NVIDIA Corporation\NVIDIA GPU Computing SDK\OpenCL\src\, there were a number of directories, each including what appeared to be a Visual Studio project demonstrating some aspect of OpenCL. I took a closer look at the oclMatVecMul subdirectory, and saw that it was a C++ project. I didn’t have a C++ compiler installed on the machine, but…
  • Microsoft, in their infinite kindness, allow you to use the “Express” version of Visual C++ for free, so I downloaded it from here. For some reason it failed to install the first time I tried, but when I tried again (not doing anything in the meantime) it worked just fine. Hmm.
  • Once it was installed, I opened the oclMatVecMul_vc9.sln with it. From the Build menu, I chose Build Solution.
  • Then, from the Debug menu, I chose the eccentrically-located Start Without Debugging option.
  • The application failed, with a number of dialog boxes describing the problem. Once I’d quit it, I could see a log window which had all of the text that had been in the dialogs, all of which is listed below:
    'oclMatVecMul.exe': Loaded 'C:\ProgramData\NVIDIA Corporation\NVIDIA GPU Computing SDK\OpenCL\bin\Win32\Debug\oclMatVecMul.exe', Symbols loaded.
    'oclMatVecMul.exe': Loaded 'C:\Windows\System32\ntdll.dll'
    'oclMatVecMul.exe': Loaded 'C:\Windows\System32\kernel32.dll'
    'oclMatVecMul.exe': Loaded 'C:\Windows\System32\OpenCL.dll', Binary was not built with debug information.
    'oclMatVecMul.exe': Loaded 'C:\Windows\System32\advapi32.dll'
    'oclMatVecMul.exe': Loaded 'C:\Windows\System32\rpcrt4.dll'
    'oclMatVecMul.exe': Loaded 'C:\Windows\System32\nvcuda.dll', Binary was not built with debug information.
    'oclMatVecMul.exe': Loaded 'C:\Windows\System32\user32.dll'
    'oclMatVecMul.exe': Loaded 'C:\Windows\System32\gdi32.dll'
    'oclMatVecMul.exe': Loaded 'C:\Windows\System32\imm32.dll'
    'oclMatVecMul.exe': Loaded 'C:\Windows\System32\msctf.dll'
    'oclMatVecMul.exe': Loaded 'C:\Windows\System32\msvcrt.dll'
    'oclMatVecMul.exe': Loaded 'C:\Windows\System32\lpk.dll'
    'oclMatVecMul.exe': Loaded 'C:\Windows\System32\usp10.dll'
    'oclMatVecMul.exe': Loaded 'C:\Windows\System32\avgrsstx.dll'
    'oclMatVecMul.exe': Loaded 'C:\Windows\System32\dwmapi.dll'
    'oclMatVecMul.exe': Unloaded 'C:\Windows\System32\dwmapi.dll'
    'oclMatVecMul.exe': Loaded 'C:\Windows\System32\nvapi.dll'
    'oclMatVecMul.exe': Loaded 'C:\Windows\System32\ole32.dll'
    'oclMatVecMul.exe': Loaded 'C:\Windows\System32\oleaut32.dll'
    'oclMatVecMul.exe': Loaded 'C:\Windows\System32\shlwapi.dll'
    'oclMatVecMul.exe': Loaded 'C:\Windows\System32\shell32.dll'
    'oclMatVecMul.exe': Loaded 'C:\Windows\System32\setupapi.dll'
    'oclMatVecMul.exe': Loaded 'C:\Windows\System32\version.dll'
    'oclMatVecMul.exe': Loaded 'C:\Windows\winsxs\x86_microsoft.windows.common-controls_6595b64144ccf1df_6.0.6002.18005_none_5cb72f96088b0de0\comctl32.dll'
    Run-Time Check Failure #0 - The value of ESP was not properly saved across a function call.  This is usually a result of calling a function declared with one calling convention with a function pointer declared with a different calling convention.
    
    Run-Time Check Failure #0 - The value of ESP was not properly saved across a function call.  This is usually a result of calling a function declared with one calling convention with a function pointer declared with a different calling convention.
    
    First-chance exception at 0xcccccccc in oclMatVecMul.exe: 0xC0000005: Access violation reading location 0xcccccccc.
    Unhandled exception at 0xcccccccc in oclMatVecMul.exe: 0xC0000005: Access violation reading location 0xcccccccc.
    First-chance exception at 0xcccccccc in oclMatVecMul.exe: 0xC0000005: Access violation reading location 0xcccccccc.
    Unhandled exception at 0xcccccccc in oclMatVecMul.exe: 0xC0000005: Access violation reading location 0xcccccccc.
    First-chance exception at 0xcccccccc in oclMatVecMul.exe: 0xC0000005: Access violation reading location 0xcccccccc.
    Unhandled exception at 0xcccccccc in oclMatVecMul.exe: 0xC0000005: Access violation reading location 0xcccccccc.
    First-chance exception at 0xcccccccc in oclMatVecMul.exe: 0xC0000005: Access violation reading location 0xcccccccc.
    Unhandled exception at 0xcccccccc in oclMatVecMul.exe: 0xC0000005: Access violation reading location 0xcccccccc.
    First-chance exception at 0xcccccccc in oclMatVecMul.exe: 0xC0000005: Access violation reading location 0xcccccccc.
    Unhandled exception at 0xcccccccc in oclMatVecMul.exe: 0xC0000005: Access violation reading location 0xcccccccc.
    The program '[3984] oclMatVecMul.exe: Native' has exited with code 0 (0x0).
    
  • Slightly dispirited, I tried a different sample, oclDeviceQuery, and got the same errors.
  • Looking at the errors more closely, I figured that it looked like the C++ code and its associated header files were incompatible with the libraries that were being linked in at runtime; the “This is usually a result of calling a function declared with one calling convention with a function pointer declared with a different calling convention” in particular seemed to point in that direction. Given that I’d done a large-scale Windows Update shortly after installing the OpenCL drivers right at the start of this process, it seemed plausible that they might have been overwritten by older, non-compatible drivers.
  • The best way of testing this hypothesis seemed to be to reinstall the drivers, so I went back to the NVIDIA download page and once again downloaded the “NVIDIA Drivers for WinVista and Win7 (190.89)” 32-bit version, and reinstalled (exiting Visual Studio first).
  • Once that was all done, I restarted Visual Studio, reloaded the oclDeviceQuery demo, rebuilt it, and ran it again…
  • …and it worked! The matrix multiplication one and the n-body physics simulation ones also worked, so I think everything’s sorted. Here’s a video of the latter:

    If it looks slow to you, that’s probably because it is a bit slow… but remember, it’s doing a lot of calculations, and I’m using a pretty old and crappy graphics card.

So, that’s how I got it all installed and running. Next time I’ll write about something a little more interesting, like how the programs are structured, or even how to call OpenCL from IronPython applications like Resolver One.

Playing with NLTK

I spent some time this afternoon playing with NLTK, the Python Natural Language Toolkit; the book is highly recommended. I couldn’t resist sharing the results of the first time I tried its generate function, which takes a particular text and tries to automatically generate more text in the same style.

First, from the Book of Genesis:

In the selfsame day entered Noah , and asses , flocks , and Maachah .
And Joseph said unto him , Abrah and he asses , and told all these
things are against me . And Jacob told Rachel that he hearkened not
unto you . And Sarah said , I had seen the face of the air ; for he
hath broken my covenant between God and every thing that creepeth upon
the man : And Eber lived after he begat Salah four hundred and thirty
years , and took of every sort shalt thou be come thither .

Next, Jane Austen:

[ Sense and Sensibility by Jane Austen 1811 ] CHAPTER 1 The family of
his relations , I know Marianne ‘ s indisposition , beyond giving time
for the world to seeing YOU .– Very far from being weary of her
feeling how more than commonly joyful . He was four years . Every
thing was disregarded . I had called in Berkeley Street , January . ”
Mrs . Jennings , sagaciously smiling , ” For my part , not being
Willoughby , where much of his fate with Miss Morton , with two
pointers playing round him ,

And finally, from a corpus of IRC conversations:

now im lost i agree i saw the whole thing JOIN PART google ’s start
page .. lol yeah , he said this morning he is hoping they win tonight
im gay ……. im happy as hell …… jus aint #### lol U42 , how are
you NICK : U23 what a combo hehehe JOIN . ACTION pictures the blues
brothers behind that chicken wire screen . (((((((((( U34
))))))))))))) Hi U7 ……… how are ya ll gotta watch my manners or
she wo n’t you play another somebody done somebody wrong song ? JOIN .
ACTION wonders if U16

Scarily accurate :-)
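
For a rough idea of how this kind of generation can work, here is a minimal sketch using only the standard library. It is a simple bigram Markov chain, a stand-in for illustration rather than NLTK’s own implementation: it records which words follow which in the source text, then repeatedly picks a random successor. The names build_bigram_table and generate_text, and the tiny sample text, are made up for the example.

```python
import random
from collections import defaultdict


def build_bigram_table(words):
    """Map each word to the list of words that follow it in the text."""
    table = defaultdict(list)
    for current, following in zip(words, words[1:]):
        table[current].append(following)
    return table


def generate_text(words, length=20, seed=0):
    """Generate up to `length` words by repeatedly picking a random successor."""
    rng = random.Random(seed)  # seeded so that runs are repeatable
    table = build_bigram_table(words)
    word = words[0]
    output = [word]
    while len(output) < length:
        followers = table.get(word)
        if not followers:
            break  # dead end: nothing ever follows this word in the text
        word = rng.choice(followers)
        output.append(word)
    return output


text = "the cat sat on the mat and the dog sat on the cat".split()
print(" ".join(generate_text(text, length=12)))
```

Because words that follow a given word more often appear more often in its list, common pairings are picked more frequently, which is roughly why the output keeps the flavour of the source while making no sense.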

London Financial Python Users’ Group

I clearly need to post more stuff here so that it doesn’t just turn into a blog announcing the LFPUG’s meetings :-)

However, in the meantime, here are the details of the next one: it’ll be on 11 March 2010, and is hosted this time by Man Investments Ltd at Sugar Quay, Lower Thames Street, London EC3R 6DU. As before, all are welcome, but for security reasons you need to register in advance; just drop an email to Didrik Pinte.

Guest of honour this time around is Travis Oliphant, the creator of SciPy and the architect of NumPy. He’ll be talking about NumPy memory maps and structured data-types, and Didrik will also give a talk about integrating C/C++ libraries using Cython. More suggestions for talks (or even better, offers to give talks!) are very welcome — once again, just email Didrik, or post something in the LinkedIn group.