ACCBot’s recent breakage

A couple of weeks ago, the IRC bot that we use over at Wikipedia’s Account Creation Assistance project decided it would stop giving notifications to the IRC channel. Previously, it used UDP as the transport between the web interface and the IRC bot. However, for some unknown reason, this stopped working.

After seemingly endless messing around with PHP, netcat, more PHP and a bit of telnet, I came to the conclusion that it was fucked, and there was no way to recover it with any ease.

Previously, the code looked something like this:

For receiving notifications over UDP, that was all we needed – it worked and was semi-secure. However, when it stopped working, I took a more radical approach.
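For illustration only (this is not the bot's actual code), here is a minimal Python sketch of what receiving a notification over UDP can look like. The function names, the loopback address and the timeout are all assumptions, and it makes no attempt at the "semi-secure" part.

```python
import socket


def open_listener(host="127.0.0.1", port=0, timeout=1.0):
    """Bind a UDP socket; port 0 lets the OS pick a free port (assumption)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((host, port))
    sock.settimeout(timeout)
    return sock


def receive_notification(sock, max_bytes=4096):
    """Return one datagram as text, or None if nothing arrived in time."""
    try:
        data, _sender = sock.recvfrom(max_bytes)
    except socket.timeout:
        return None
    return data.decode("utf-8", errors="replace")
```

UDP gives no delivery guarantee, so a receiver like this cannot tell "nothing was sent" apart from "the datagram was dropped on the way".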

You’ve probably heard of Amazon Web Services (AWS) by now; if you haven’t, I recommend you take a look.

One of the AWS services is something called the Simple Notification Service, which seems to be exactly what I want – a notification system. However, the only notification endpoints are HTTP pings, e-mail, or an SQS endpoint.

SQS is what I chose eventually – it’s another of Amazon’s services, the Simple Queue Service. This has the “advantage” of queuing all the notifications, so if you’re offline you will still get them all once you’re back. That isn’t ideal for our case, but the bot isn’t usually down for long when it does go down. So, I decided to go for SNS->SQS over HTTPS as the transport for the notifications, rather than UDP.
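When SNS delivers to an SQS queue without raw message delivery enabled, the body of each SQS message is a JSON envelope, and the text that was published sits in its `Message` field. A minimal Python sketch of unwrapping it follows; the function name `extract_notification` is hypothetical and this is not the bot's own code.

```python
import json


def extract_notification(sqs_body):
    """Pull the published text out of an SNS envelope stored in an SQS body.

    Returns None for anything that is not a plain notification
    (for example a subscription confirmation).
    """
    envelope = json.loads(sqs_body)
    if envelope.get("Type") != "Notification":
        return None
    return envelope.get("Message")
```

The envelope carries other fields too, such as `TopicArn` and `Timestamp`, which this sketch ignores.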

Of course, code needed changing – at first I thought drastically, but it turned out to be a much smaller change than I anticipated:

It looks small, just another explicit check to see if we actually received anything. That’s until you realise that I wrote another function to take some of the work off to one side.
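As a rough idea of the shape of such a helper (a sketch under assumptions, not the function I actually wrote), here is a Python version that polls the queue, explicitly checks whether anything came back, and only then hands each message on. It assumes `client` behaves like a boto3 SQS client, which provides `receive_message` and `delete_message`; `drain_queue`, `handle` and the `wait_seconds` default are made up for the example.

```python
def drain_queue(client, queue_url, handle, wait_seconds=5):
    """Fetch one batch of messages and pass each body to `handle`.

    Returns the number of messages processed (0 if the queue was empty).
    """
    response = client.receive_message(
        QueueUrl=queue_url,
        MaxNumberOfMessages=10,        # SQS allows at most 10 per call
        WaitTimeSeconds=wait_seconds,  # long polling; arbitrary default
    )
    # The "Messages" key is absent when nothing was received, hence
    # the explicit check before touching the batch.
    messages = response.get("Messages", [])
    if not messages:
        return 0
    for message in messages:
        handle(message["Body"])
        # Delete after handling so the message is not delivered again.
        client.delete_message(
            QueueUrl=queue_url,
            ReceiptHandle=message["ReceiptHandle"],
        )
    return len(messages)
```

Calling this in a loop from the bot's main thread would be one way to wire it in; the time between polls is where most of the extra delay comes from.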

There’s not much that’s changed, but it was an interesting technical challenge :P The only thing that has noticeably changed is the lag from notification generation to display on IRC – it can be anywhere up to about 5s if you’re unlucky!