And the same to you too, Google!

Posted on 23 December 2010 in Funny

While we're on the subject of rude words... I hope that Google aren't showing this kind of CAPTCHA to French people...

Google enculecr CAPTCHA

Long S Google Books searches

Posted on 17 December 2010

This page shows a large (but not exhaustive) list of English words that could be mistaken for other words if they were written in the old-fashioned style, where the letter "s" was often written with a glyph that looks more like a modern "f". Each word pair links to a Google Ngram Viewer graph showing the two words' respective popularities over time.

See here for the background

The list was generated with a bit of help from the Ispell dictionary list from here and the Python script that follows the word pairs:

after vs aster afters vs asters buffing vs bussing cafe vs case cafes vs cases chafe vs chase chafer vs chaser chafing vs chasing chefs vs chess confider vs consider cuffed vs cussed fable vs sable fables vs sables fag vs sag fags vs sags fail vs sail failed vs sailed failing vs sailing fails vs sails faint vs saint fainted vs sainted faintly vs saintly faints vs saints fake vs sake faker vs saker fakes vs sakes falter vs salter falters vs salters fame vs same fang vs sang fat vs sat fate vs sate fated vs sated fates vs sates fating vs sating fear vs sear feared vs seared fearing vs searing fears vs sears feat vs seat feating vs seating feats vs seats fee vs see feed vs seed feeder vs seeder feeders vs seeders feeding vs seeding feedings vs seedings feeds vs seeds fees vs sees fell vs sell feller vs seller fellers vs sellers felling vs selling fells vs sells fetter vs setter fetters vs setters fever vs sever fevered vs severed fevering vs severing fevers vs severs few vs sew fewer vs sewer fews vs sews fickle vs sickle fight vs sight fighter vs sighter fighting vs sighting fights vs sights fill vs sill fills vs sills fin vs sin fin's vs sin's fine vs sine fines vs sines finger vs singer fingers vs singers fining vs sining fins vs sins fir vs sir fire vs sire fired vs sired fires vs sires firing vs siring fit vs sit fits vs sits fitter vs sitter fitter's vs sitter's fitters vs sitters fitting vs sitting fittings vs sittings fix vs six fixes vs sixes flab vs slab flap vs slap flapping vs slapping flaps vs slaps flash vs slash flashed vs slashed flasher vs slasher flashes vs slashes flashing vs slashing flat vs slat flats vs slats fled vs sled fleet vs sleet flew vs slew flick vs slick flicker vs slicker flicks vs slicks flier vs slier flight vs slight flights vs slights fling vs sling flinger vs slinger flinging vs slinging flings vs slings flip vs slip flips vs slips flit vs slit flits vs slits flop vs slop floppier vs sloppier floppiness vs sloppiness floppy vs 
sloppy flops vs slops flow vs slow flowed vs slowed flower vs slower flowing vs slowing flows vs slows flung vs slung fly vs sly foil vs soil foiled vs soiled foiling vs soiling foils vs soils fold vs sold folder vs solder folders vs solders foot vs soot fore vs sore forest vs sorest fort vs sort forts vs sorts fought vs sought foul vs soul fouled vs souled fouls vs souls found vs sound founded vs sounded founder vs sounder founding vs sounding founds vs sounds four vs sour fours vs sours fun vs sun funnier vs sunnier funniness vs sunniness funny vs sunny future vs suture futures vs sutures infect vs insect infects vs insects leafed vs leased leafing vs leasing left vs lest lift vs list lifted vs listed lifter vs lister lifters vs listers lifting vs listing lifts vs lists loft vs lost miffed vs missed miffing vs missing rafter vs raster rafters vs rasters refelling vs reselling refined vs resined refining vs resining resifted vs resisted sifter vs sister unfounded vs unsounded wafter vs waster wife vs wise wifely vs wisely

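def frepls(word):
    """Return every spelling of word with any non-final 's' optionally swapped for 'f'."""
    if len(word) == 0:
        return ['']
    # All variants of the rest of the word, computed recursively.
    kids = frepls(word[1:])
    # Variants that keep the first letter as it is.
    result = [word[0] + w for w in kids]
    # The long S never ended a word, so only a non-final 's' can be misread as 'f'.
    if len(word) > 1 and word.startswith('s'):
        result += ['f' + w for w in kids]
    return result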
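
As a rough sketch of how the script above might be combined with a word list to produce the pairs: the `long_s_pairs` helper and the tiny sample vocabulary below are assumptions made for illustration, not the original code, and the real run used the Ispell dictionary rather than a hard-coded list.

```python
def frepls(word):
    """Return every spelling of word with any non-final 's' optionally swapped for 'f'."""
    if len(word) == 0:
        return ['']
    kids = frepls(word[1:])
    result = [word[0] + w for w in kids]
    if len(word) > 1 and word.startswith('s'):
        result += ['f' + w for w in kids]
    return result


def long_s_pairs(words):
    """Hypothetical helper: pair each word with its 'f' variants that are also real words."""
    vocabulary = set(words)
    pairs = []
    for word in vocabulary:
        for variant in frepls(word):
            # Skip the unchanged spelling; keep variants found in the word list.
            if variant != word and variant in vocabulary:
                pairs.append((variant, word))
    return sorted(pairs)


# Stand-in for the dictionary file (assumption: one lowercase word per entry).
sample = ["case", "cafe", "same", "fame", "sister", "sifter", "cats"]
for f_word, s_word in long_s_pairs(sample):
    print(f"{f_word} vs {s_word}")
```

Note that "cats" produces no pair, because a final "s" is never replaced.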

Long S is long

Posted on 17 December 2010 in Uncategorized

A bunch of people have been posting interesting searches on Google Labs' Books Ngram Viewer. I heard about it from this tweet by @njrabit, and the tantalising link (don't follow it if you don't like swearing) at the bottom of this blog post by S. Weasel showed up something interesting. Why is it that, of four swearwords, the one starting with 'F' is incredibly popular from 1750 to 1820, then drops out of fashion for 140 years, only appearing again in the 1960s?

Your first thought might be that it has to do with the replacement of robust 18th-century English (the language of Jack Aubrey) with pusillanimous, lily-livered Victorian bowdlerism. But the answer is actually much simpler. Check out this set of uses of that f-word from between 1750 and 1755. In every case where it was used, the word was clearly meant to be "suck". The problem is the old-fashioned "long S". It's a myth that our ancestors used "f" where we would use "s". Instead, they used two different glyphs for the letter "s". At the end of a word, they used a glyph that looked just like the one we use now, but at the start or in the middle of a word they used a letter that looked pretty much like an "f", except without the horizontal stroke in the middle.

But to an OCR program like the one Google presumably used to scan their corpus, this "long S" is just an F. Which, um, sucks. Easy to make an afs of yourself...
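
Here's a small sketch of the rule and the OCR confusion described above. It assumes the simplified rule given in this post (long S everywhere except at the end of a word) and ignores whatever further exceptions real typesetters followed. The `naive_ocr` function is a made-up stand-in for an OCR engine, not anything Google is known to use.

```python
LONG_S = "\u017f"  # Unicode LATIN SMALL LETTER LONG S


def to_long_s(word):
    """Swap every non-final lowercase 's' for the long S (simplified rule)."""
    if not word:
        return word
    return word[:-1].replace("s", LONG_S) + word[-1]


def naive_ocr(text):
    """Hypothetical OCR stand-in that reads every long S as an 'f'."""
    return text.replace(LONG_S, "f")


for word in ["same", "sister", "case", "glass"]:
    printed = to_long_s(word)
    print(f"{word} -> {printed} -> {naive_ocr(printed)}")
```

Under these assumptions "same" is printed with a long S and read back as "fame", while the final "s" of "glass" survives untouched.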

[UPDATE] With a bit of Python and a large dictionary file, I've generated a fun set of Google Books long S comparisons. I particularly like "cafe vs case" and "fame vs same".

[UPDATE] There's some scholarly discussion of related issues at Language Log, including the excellent funk vs sunk comparison. There's also an article on the rules for the use of the long S in various languages, updated with Google Books data, and some research triggered by the strange fact that people only seem to have become interested in "pleasure" after about 1800.

Fun with the Audio Data API

Posted on 6 December 2010 in Music, Programming

The latest beta version of Firefox 4 has an API for reading and writing audio data — right down to the sample level, right from JavaScript. JavaScript is, of course, totally the wrong language to write DSP-style code in, so that's what I decided to do :-)

If you fancy downloading FF4 beta and trying out some of the demos, here they are. There are lots of (much better) demos by other people here.

And if you try out the musical temperament example and have any thoughts on which chords sounded nicest, leave a comment below!