JAX backends and devices
There's nothing like writing your own code with a framework to clarify how things
fit together! Continuing with my port of my PyTorch LLM code to
JAX, I wanted to load up a large dataset:
the 10,248,871,837 16-bit unsigned integers in the train split of
gpjt/fineweb-gpt2-tokens.
That's just over 19GiB of data.
from safetensors.flax import load_file
...
full_dataset = load_file(dataset_dir / "train.safetensors")["tokens"]
When I ran that, I got a CUDA out-of-memory error:
jax.errors.JaxRuntimeError: RESOURCE_EXHAUSTED: Out of memory while trying to allocate 19.09GiB.
That makes sense! The allocation it was trying to do is exactly the size of the data I was trying to load. I have an RTX 3090 with 24 GiB, but some is already used up by the OS, various apps, and a model that the code creates earlier on.
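As a quick sanity check on that figure (a back-of-the-envelope sketch; the variable names are purely illustrative), multiplying the token count by two bytes per 16-bit token and converting to GiB gives the same 19.09 that appears in the error message:

```python
# Token count from the dataset description; each token is a 16-bit unsigned int.
num_tokens = 10_248_871_837
bytes_per_token = 2  # 16 bits

size_bytes = num_tokens * bytes_per_token
size_gib = size_bytes / 2**30  # 1 GiB = 2**30 bytes

print(f"{size_gib:.2f} GiB")
```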
But in PyTorch land, I was used to things being loaded into RAM by default, and only moved over to the GPU when I asked it to do that. JAX was clearly loading to the GPU by default. How could I stop it from doing that for this case? The load into the GPU was happening inside Safetensors, in code I couldn't directly control.
Understanding how to do it helped me understand a little bit more about JAX.
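As a rough sketch of the general idea, and not necessarily the approach the full post settles on: JAX provides a `jax.default_device` context manager that changes where newly created arrays are placed. The tiny NumPy array below is a made-up stand-in for the real dataset, and it is an assumption that the same trick carries over to loads that happen inside a library like Safetensors.

```python
import jax
import jax.numpy as jnp
import numpy as np

# The CPU backend is normally available even when a GPU is the default device.
cpu = jax.devices("cpu")[0]

# Hypothetical stand-in for the real token data.
host_tokens = np.arange(8, dtype=np.uint16)

# Arrays created inside this block are placed on the CPU device
# rather than on the default accelerator.
with jax.default_device(cpu):
    tokens = jnp.asarray(host_tokens)

# Show which device(s) the array actually lives on.
print(tokens.devices())
```

Batches can then be moved to the accelerator explicitly, for example with `jax.device_put`, only when they are needed.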
Using Safetensors with Flax
I'm porting my PyTorch LLM code to JAX, using Flax as the neural network layer. For various reasons I wanted to use Safetensors to store checkpoints of the model. It took a little while to get it working; here's the trick I learned.
10Gb/s Ethernet: using mini-heatsinks with a 10GBASE-T SFP+ module
In my last post I showed the somewhat-scary
temperatures I was getting on the MikroTik 10GBASE-T SFP+ module I have plugged
into nigel, the 10Gb/s switch I have in my study.
As I mentioned then, the plan was to try using some of the mini-heatsinks that
people use on Raspberry Pis, to see if that would help.
Here's how it went.
10Gb/s Ethernet: what I actually did to get it working in my home
Having learned enough about 10Gb/s Ethernet to be comfortable about setting it up in my house, it was time to bite the bullet: order it from the ISP, buy some kit, and get started.
I already had 2.5Gb/s working. The apartment has structured cabling -- each room has one or more RJ45 sockets in the wall, and there's a patch panel downstairs by our front door that has a matching patch socket for each wall socket. So when we moved in, I simply set things up so that there was a 2.5Gb/s switch down by the patch panel, and wired everything together there. Most of our stuff works over WiFi, of course, but I needed a wired backbone to connect the excessive number of computers in my study both to each other, and to the outside world.
What did I need to do?
10Gb/s Ethernet: what I had to (re)learn
My ISP recently started offering a 10Gb option, and my "shiny new thing!" Pavlovian response immediately kicked in. So of course, I had to upgrade the wired networking in my home -- which meant I had to learn a few things to get it all working, and relearn a bunch of stuff I'd forgotten over the years.
Wired networking for home and small offices hasn't really moved forward that much in the last 20-odd years. Back in 2006, gigabit Ethernet was standard for businesses, and most home users moved to it not long after. Perhaps due to the rise of WiFi for most "last few metres" connections, it's pretty much stagnated there, apart from a bit of a push towards 2.5Gb/s more recently.
But with faster ISP connections arriving, I think things are starting to become a bit more interesting. Even the fastest WiFi 7 connections are only able to get up to around 6Gb/s to a single device -- and that's in an ideal "super-fast machine sitting right next to the AP in a shielded lab" setup.
Here's what I had to drag up from my memory, and the new stuff I had to learn, in order to get this all working. I'll write about the background in this post, and then tomorrow I'll post about what I actually put in place.
Getting MathML to render properly in Chrome, Chromium and Brave
The other day I posted about adding mathematical typesetting to this blog using markdown2, LaTeX and MathML. One problem that remained at the end of that was that it looked a bit rubbish; in particular, the brackets surrounding matrices were just one line high, albeit centred, like this:

...rather than stretched to the height of the matrix, like this example from KaTeX:

After posting that, I discovered that the problem only existed in Chromium-based browsers. I saw it in Chromium, Chrome and Brave on Android and Linux, but in Firefox on Linux, and on Safari on an iPhone, it rendered perfectly well.
Guided by the answers to this inexplicably-quiet Stack Overflow question,
I discovered that the problem is the math fonts available on Chromium-based browsers.
Mathematical notation, understandably, needs specialised fonts. Firefox and Safari
either have these pre-installed, or do something clever to adapt the fonts you
are using (I suspect the former, but Firefox developer tools told me that it was
using my default body text font for <math> elements). Chromium-based browsers
do not, so you need to provide one in your CSS.
Using Frédéric Wang's MathML font test page,
I decided I wanted to use the STIX font. It was a bit tricky to find a downloadable
OTF file (you specifically need the "math" variant of the font -- in the same way
as you might find -italic and -bold files to download, you can find -math
ones) but I eventually found a link on this MDN page.
I put the .otf file in my font assets directory, then added the appropriate stuff
to my CSS -- a font face definition:
@font-face {
    font-family: 'STIX-Two-Math';
    src: url('/fonts/STIXTwoMath-Regular.otf') format('opentype');
}
...and a clause saying it should be used for <math> tags:
math {
    font-family: STIX-Two-Math;
    font-size: larger;
}
The larger font size is because by default it was rendering about one third of
the height of my body text -- not completely happy about that, as it feels like an
ad-hoc hack, but it will do for now.
Anyway, mathematical stuff now renders pretty well! Here's the matrix from above, using my new styling:
I hope that's useful for anyone else hitting the same problem.
[Update: because RSS readers don't load the CSS, the bad rendering still shows up in NewsBlur's Android app, which I imagine must be using Chrome under the hood for its rendering. Other RSS readers are probably the same :-(]
Adding mathematical typesetting to the blog
I've spent a little time over the weekend adding the ability to post stuff in mathematical notation on this blog. For example:
It should render OK in any browser released after early 2023; I suspect that many RSS readers won't be able to handle it right now, but that will hopefully change over time. [Update: my own favourite, NewsBlur, handles it perfectly!]
Here's why I wanted to do that, and how I did it.
Installing the unifi controller on Arch
This is more of a note-to-self than a proper blog post. I recently got a new Ubiquiti access point, and needed to reinstall the unifi controller on my Arch machine in order to run it.
There's no formal package for unifi, so you have to install it from the AUR. I use yaourt for that, and if you do a simple
yaourt -S unifi
...then it will try to install MongoDB from source. According to the Arch Wiki, this requires "180GB+ free disk space, and may take several hours to build (i.e. 6.5 hours on Intel i7, 1 hour on 32 Xeon cores with high-end NVMe.)". So not ideal.
The trick is to install MongoDB from binary first:
yaourt -S mongodb-bin
And only after that:
yaourt -S unifi
Finally, activate the service:
sudo systemctl enable unifi
sudo systemctl start unifi
...and then go to https://localhost:8443/, accept the self-signed cert, and you're all set.
Creating a time series from existing data in pandas
pandas is a high-performance library for data analysis in Python. It's generally excellent, but if you're a beginner or you use it rarely, it can be tricky to find out how to do quite simple things -- the code to do what you want is likely to be very clear once you work it out, but working it out can be relatively hard.
A case in point, which I'm posting here largely so that I can find it again next
time I need to do the same thing... I had a list start_times of dictionaries,
each of which had (amongst other properties) a timestamp and a value. I wanted
to create a pandas time series object to represent those values.
The code to do that is this:
import pandas as pd
series = pd.Series(
    [cs["value"] for cs in start_times],
    index=pd.DatetimeIndex([cs["timestamp"] for cs in start_times])
)
Perfectly clear once you see it, but it did take upwards of 40 Google searches and help from two colleagues with a reasonable amount of pandas experience to work out what it should be.
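To see the pattern end to end, here's a small self-contained sketch. The contents of `start_times` are invented sample data (the real list isn't shown in this post), and the daily resample at the end is just one illustration of what having a proper `DatetimeIndex` makes possible:

```python
from datetime import datetime

import pandas as pd

# Hypothetical sample data: dictionaries with a timestamp, a value,
# and some other property that the series ignores.
start_times = [
    {"timestamp": datetime(2024, 1, 1, 9, 0), "value": 1.5, "label": "a"},
    {"timestamp": datetime(2024, 1, 1, 17, 0), "value": 2.5, "label": "b"},
    {"timestamp": datetime(2024, 1, 2, 9, 0), "value": 4.0, "label": "c"},
]

# Same construction as above: values as the data, timestamps as the index.
series = pd.Series(
    [cs["value"] for cs in start_times],
    index=pd.DatetimeIndex([cs["timestamp"] for cs in start_times]),
)

# With a DatetimeIndex in place, time-based operations become available,
# for example averaging the values per calendar day.
daily_mean = series.resample("D").mean()
print(daily_mean)
```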
Parsing website SSL certificates in Python
A kindly PythonAnywhere user dropped us a line today to point out that StartCom and WoSign's SSL certificates are no longer going to be supported in Chrome, Firefox and Safari. I wanted to email all of our customers who were using certificates provided by those organisations.
We have all of the domains we host stored in a database, and it was surprisingly hard to find out how I could take a PEM-formatted certificate (the normal base-64 encoded stuff surrounded by "BEGIN CERTIFICATE" and "END CERTIFICATE") in a string and find out who issued it.
After much googling, I finally found the right search terms to get to this Stack Overflow post by mhawke, so here's my adaptation of the code:
from OpenSSL import crypto
for domain in domains:
    cert = crypto.load_certificate(crypto.FILETYPE_PEM, domain.cert)
    issuer = cert.get_issuer().CN
    if issuer is None:
        # This happened with a Cloudflare-issued cert
        continue
    if "startcom" in issuer.lower() or "wosign" in issuer.lower():
        # send the user an email
        pass