Long S is long

Posted on 17 December 2010 in Oddities

A bunch of people have been posting interesting searches on Google Labs' Books Ngram viewer. I heard about it from this tweet by @njrabit, but it was the tantalising link (don't follow it if you don't like swearing) at the bottom of this blog post by S. Weasel that showed up something interesting. Why is it that, of four swearwords, the one starting with 'F' was incredibly popular from 1750 to 1820, then dropped out of fashion for 140 years, only appearing again in the 1960s?

Your first thought might be that it has something to do with the replacement of robust 18th-century English (the language of Jack Aubrey) with pusillanimous lily-livered Victorian bowdlerism. But the answer is actually much simpler. Check out this set of uses of that f-word from between 1750 and 1755. In every case, the word was clearly meant to be "suck". The problem is the old-fashioned "long S". It's a myth that our ancestors used "f" where we would use "s". Instead, they used two different glyphs for the letter "s". At the end of a word, they used a glyph that looked just like the one we use now, but at the start or in the middle of a word they used a letter that looked pretty much like an "f", except that the horizontal stroke was missing or reduced to a small nub on the left.

But to an OCR program like the one Google presumably used to scan their corpus, this "long S" is just an F. Which, um, sucks. Easy to make an afs of yourself...
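To make the mechanism concrete, here is a minimal Python sketch of the simplified rule described above, plus a deliberately naive "OCR" step that reads every long S as an "f". Both function names are made up for illustration, the real typographic rules had more exceptions than this, and I have no idea how Google's OCR actually works, so treat it as a toy model only.

```python
import re

LONG_S = "\u017f"  # the long S character

def to_long_s(text):
    """Apply the simplified rule: a lowercase 's' followed by another
    lowercase letter becomes a long S; a word-final 's' stays round."""
    return re.sub(r"s(?=[a-z])", LONG_S, text)

def naive_ocr(text):
    """Toy model (an assumption, not a real OCR engine): every long S
    is misread as an 'f'."""
    return text.replace(LONG_S, "f")

print(naive_ocr(to_long_s("same case")))  # fame cafe
```

Run it on "ass" and you get "afs", which is exactly the joke above.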

[UPDATE] With a bit of Python and a large dictionary file, I've generated a fun set of Google Books long S comparisons. I particularly like "cafe vs case" and "fame vs same".
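For the curious, something along these lines would find such pairs. This is a reconstruction of the general idea rather than the actual script: the function name is hypothetical, and the dictionary path in the usage comment is just an assumption about where a word list might live on your machine.

```python
def long_s_pairs(words):
    """Return (f_word, s_word) pairs where misreading every non-final
    's' in s_word as 'f' produces another word in the list."""
    vocab = {w.strip().lower() for w in words if w.strip()}
    pairs = []
    for s_word in sorted(vocab):
        # A final 's' was printed as a round s, so leave it alone.
        body, last = s_word[:-1], s_word[-1]
        f_word = body.replace("s", "f") + last
        if f_word != s_word and f_word in vocab:
            pairs.append((f_word, s_word))
    return pairs

# Hypothetical usage with a system word list (the path is an assumption):
#   with open("/usr/share/dict/words") as fh:
#       for f_word, s_word in long_s_pairs(fh):
#           print(f_word, "vs", s_word)
```

Each pair can then be fed to the Ngram viewer as a comparison; the sketch itself only does the dictionary lookup.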

[UPDATE] There's some scholarly discussion of related issues at Language Log, including the excellent funk vs sunk comparison. There's also an article on the rules for the use of the long S in various languages, updated with Google Books data, and some research triggered by the strange fact that people only seem to have become interested in "pleasure" after about 1800.