Acquired!

Posted on 28 September 2022 in PythonAnywhere, Business of Software

As those of you who know me (and probably a fair few that don't) will already know, PythonAnywhere was acquired by Anaconda, Inc back in June of this year. We're still the same team, and I'm still leading it, but now we're part of a larger company.

It's been quite a ride. Due diligence and negotiation in the months up to the close was just as tough as I'd always been told it would be (and that's despite the fact that according to our lawyers it was a pretty smooth one as these things go). And now I have to get used to having a boss again, which is weird... but is helped by the fact that said boss is a great guy, and is aligned with us (you can tell from the lingo that I work for a larger company now, right?) on keeping the platform up and running as it was, while investing into it so that it can get better and grow faster.

So, all good news :-)

I've been vaguely considering putting together a few blog posts outlining what happens during an acquisition -- just a general discussion of the steps and what they involve. I wouldn't be putting anything in about this particular deal, of course -- there are strict non-disclosures about the terms and so on -- but just a description of what happens might be useful for other people in the position I was in earlier on this year. I had to learn a lot of stuff very quickly, and while our lawyers were awesome and explained things brilliantly, it would have been useful to have some kind of layman's background information.

What do you think -- worth posting?

A somewhat indirect way of reporting stolen cards to the bank

Posted on 6 February 2022 in PythonAnywhere, Business of Software

One of the interesting things about having a business that accepts cards on the Internet is seeing what odd things people do when trying to use your site. A case in point is someone we've noticed over the last few months, who appears to be using our site as a rather indirect way to report stolen cards.

The behaviour that we see is that they run some kind of script that signs up for a bunch of accounts, with randomly-generated usernames, and then try to upgrade them all using stolen card numbers.

Naturally, our fraud-prevention systems pick that up pretty much immediately, and we run our own script that identifies every account that they've created, finds the card details used for them, and reports every transaction and attempted transaction as fraudulent. This means that our payment processor, Stripe, can flag the card numbers as stolen, so that they can't be used elsewhere without triggering fraud alerts to the other merchants. And, if a charge actually goes through (most of the cards tend to be pre-paid with no money on them, so most charges fail), then we refund it as fraudulent, which not only notifies Stripe, but I believe notifies the bank that the card number is circulating amongst card fraudsters.

Now, the fact that we do this should be obvious to them. Every time they run their scripts, it causes a minor inconvenience to us (the scripts that we have to handle the problem are getting ever-simpler to use), and it means that every card that they tried on our site is now significantly less valuable as an asset to them. They're essentially paying money for lists of stolen card numbers, and then burning it up.

Given that we're doing this, and they must know that we're doing it, the only explanation I can think of is that they're actually running some kind of strange public service where they buy lists of stolen card details and then get them blocked. It does seem a very roundabout way to do it, though. Surely it would be easier to just tell the banks directly?

But perhaps there's something I'm missing.

Or perhaps they really are dim enough to be using us to check stolen cards for validity, and haven't yet noticed that doing so against a site that reports every fraudulent transaction to the card processor is not a terribly good idea...

COVID-19 breakthrough / re-infection: a personal tale

Posted on 30 November 2021 in Personal

I'm just recovering from (PCR-confirmed) covid after (I believe) having had it in 2020, and having been double-jabbed with AstraZeneca over the course of the last year. I'm completely fine, and listening to people moaning about their health is rather dull, so I won't bore you by posting at length here. But a number of people I know were really surprised to hear about it, thinking that re-infection and breakthrough infections were rare. Given that I, my partner Sara, and a close friend have all had it again (PCR tested in each case) over the last month, it seems that it might be more common than generally suspected -- so I figured that a first-person account might be of some interest.

[ Read more ]

Fun with network namespaces, part 1

Posted on 13 March 2021 in Linux, Programming

Linux has some amazing kernel features to enable containerization. Tools like Docker are built on top of them, and at PythonAnywhere we have built our own virtualization system using them.

One part of these systems that I've not spent much time poking into is network namespaces. Namespaces are a general abstraction that allows you to separate out system resources; for example, if a process is in a mount namespace, then it has its own set of mounted disks that is separate from those seen by the other processes on a machine -- or if it's in a process namespace, then it has its own cordoned-off set of processes visible to it (so, say, ps auxwf will just show the ones in its namespace).

As you might expect from that, if you put a process into a network namespace, it will have its own restricted view of what the networking environment looks like -- it won't see the machine's main network interface,

This provides certain advantages when it comes to security, but one that I thought was interesting is that because two processes inside different namespaces would have different networking environments, they could both bind to the same port -- and then could be accessed from outside via port forwarding.

To put that in more concrete terms: my goal was to be able to start up two Flask servers on the same machine, both bound to port 8080 inside their own namespace. I wanted to be able to access one of them from outside by hitting port 6000 on the machine, and the other by hitting port 6001.

Here is a run through how I got that working; it's a lightly-edited set of my "lab notes".

[ Read more ]

Comments are back!

Posted on 22 February 2021 in Blogging, Meta, Website design

Comments are now back up and running. They were interesting to put together; as a concept they don't play well with a static site, as they are by their very nature dynamic.

I was considering using Disqus, but I do want to try to keep my data to myself with this blog. I wound up putting together a separate site, comments.gilesthomas.com, which is non-static, and handles all of the comments -- some simple JavaScript injects them into each post page. It uses Akismet -- the one external dependency I feel I can allow myself -- to filter spam.

Should be interesting to see how it works! I'll give the new system a few days to bed in, and for a spot of code-tidying, then I'll post on the design of the new blog as a whole. I feel that I have Things To Say.

A new beginning

Posted on 16 February 2021 in Blogging, Meta, Website design

If you're reading this, you're seeing my new and shiny blog :-)

Blogging has been quite light here over the last few years; as PythonAnywhere has taken off, life has become ever-busier, so, less time to post.

But I also feel like one of the reasons that I've not been posting has been that I was using a Wordpress blog. Not that there's anything wrong with Wordpress, mind, but every time I logged on to it there were a pile of security updates to download and install, which was very demotivating. So often I'd think, "oh, I should post about that" but just never get round to it.

(There's also the faint embarrassment factor of running one of the most popular Python hosting platforms, and having a blog based on PHP...)

For a long time I'd been vaguely planning to switch over to some kind of static site generator like Hugo or Sphinx. They are both well-regarded, but our experience in porting the PythonAnywhere blog over to the former gave me some pause; while Hugo was really configurable, it always seemed to be really hard to configure it the specific way we wanted.

And then I thought, wait a minute. I'm meant to be a programmer. How hard can it be to write a simple static site generator?

That's the kind of sentence that feels like it should be followed by, "it was actually really hard". But it wasn't, because all of the pieces have been coded by generous people already and it was just a case of plugging them together.

With the help of wpparser to parse an export of my old blog (which I fed into a little script that spat out the articles in a Hugo-like format) and then markdown2 to format markdown-based posts, Pygments to highlight my code blocks, and then Jinja2 to let me bung the results in some templates, and feedgen to write out an RSS file, it was pretty easy to put together something that replicated the URL structure of the old blog.

To be honest, I've spent significantly more time fiddling with the CSS to make it all look pretty. I doubt that bit shows.

Anyway, now I have something where I can knock together a quick post in markdown, run a command, and have it published. Welcome to my new blog!

I'll be scanning through the old posts over the coming days and fixing any formatting issues I find.

The next step will be to work out some way of bringing the comments over, as they (of course) don't really fit in with the whole "static site" side of things. I have some ideas, though... But if you'd like to leave a comment in the meantime, @ me on Twitter.

(Update 2021-02-22: comments are back!)

Installing the unifi controller on Arch

Posted on 20 August 2019 in Linux

This is more of a note-to-self than a proper blog post. I recently got a new Ubiquiti access point, and needed to reinstall the unifi controller on my Arch machine in order to run it.

There's no formal package for unifi, so you have to install the AUR. I use yaourt for that, and if you do a simple

yaourt -S unifi

...then it will try to install MongoDB from source. According to the Arch Wiki, this requires "180GB+ free disk space, and may take several hours to build (i.e. 6.5 hours on Intel i7, 1 hour on 32 Xeon cores with high-end NVMe.)". So not ideal.

The trick is to install MongoDB from binary first:

yaourt -S mongodb-bin

And only after that:

yaourt -S unifi

Finally, activate the service:

sudo systemctl enable unifi
sudo systemctl start unifi

...and then go to https://localhost:8443/, accept the self-signed cert, and you're all set.

Python code to generate Let's Encrypt certificates

Posted on 16 November 2018 in Python

I spent today writing some Python code to request certificates from Let's Encrypt. I couldn't find much in the way of simple sample code out there, so I thought it would be worth sharing some. It uses the acme Python package, which is part of the certbot client script.

It's worth noting that none of this is useful stuff if you just want to get a Let's Encrypt certificate for your website; scripts like certbot and dehydrated are what you need for that. This code and the explanation below are for people who are building their own systems to manage Let's Encrypt certs (perhaps for a number of websites) or who want a reasonably simple example showing a little more of what happens under the hood.

[ Read more ]

Creating a time series from existing data in pandas

Posted on 9 May 2017 in Programming, Python

pandas is a high-performance library for data analysis in Python. It's generally excellent, but if you're a beginner or you use it rarely, it can be tricky to find out how to do quite simple things -- the code to do what you want is likely to be very clear once you work it out, but working it out can be relatively hard.

A case in point, which I'm posting here largely so that I can find it again next time I need to do the same thing... I had a list start_times of dictionaries, each of which had (amongst other properties) a timestamp and a value. I wanted to create a pandas time series object to represent those values.

The code to do that is this:

import pandas as pd
series = pd.Series(
    [cs["value"] for cs in start_times],
    index=pd.DatetimeIndex([cs["timestamp"] for cs in start_times])
)

Perfectly clear once you see it, but it did take upwards of 40 Google searches and help from two colleagues with a reasonable amount of pandas experience to work out what it should be.

Parsing website SSL certificates in Python

Posted on 9 December 2016 in Programming, Python, PythonAnywhere

A kindly PythonAnywhere user dropped us a line today to point out that StartCom and WoSign's SSL certificates are no longer going to be supported in Chrome, Firefox and Safari. I wanted to email all of our customers who were using certificates provided by those organisations.

We have all of the domains we host stored in a database, and it was surprisingly hard to find out how I could take a PEM-formatted certificate (the normal base-64 encoded stuff surrounded by "BEGIN CERTIFICATE" and "END CERTIFICATE") in a string and find out who issued it.

After much googling, I finally found the right search terms to get to this Stack Overflow post by mhawke, so here's my adaptation of the code:

from OpenSSL import crypto

for domain in domains:
    cert = crypto.load_certificate(crypto.FILETYPE_PEM, domain.cert)
    issuer = cert.get_issuer().CN
    if issuer is None:
        # This happened with a Cloudflare-issued cert
        continue
    if "startcom" in issuer.lower() or "wosign" in issuer.lower():
        # send the user an email