Ranty thoughts about MySQL, with added Oracle

I regularly use MySQL for many of my personal projects (except for that one that uses SQLite). As such, I have a well-used MySQL server on my main cluster in London, and another well-used MySQL server on my local cluster in my flat, not counting the numerous development MySQL servers sat on pretty much every device I own (including my phone…).

At work, we’re developing and maintaining an application which is backed by an Oracle database, and much of our application logic is contained within this database. As such, I get a lot of exposure to Oracle as well, and have even gone as far as setting up an Oracle XE instance at home, just to play with.

I’m not a cranky person, I don’t often complain about things, but recently as I’ve been getting more and more exposure to Oracle, and building more and more complex applications in my personal time on top of MySQL, I’ll try and do something that I know will work perfectly in Oracle, but for some unknown reason it doesn’t work – normally it is a very old bug or design decision I disagree with in MySQL.

To be fair, I have a fair share of grievances with Oracle too, which I’ll try and mention at the bottom, but mostly it’s MySQL’s issues that drive me up the wall.

Most of these stem from the differences between the two systems, and the intended audience of each. Oracle is much more enterprise-focused, but some of the things I mention below would not only be useful to MySQL users, they would allow developers to better leverage the database’s power, rather than being forced to do lots of data manipulation within the application layer.

Continue reading

Elite: Dangerous stats on WordPress blog posts

One of the things that you may have noticed recently on the Elite: Dangerous posts on this site is the little statistics panels at the bottom of posts – this post is about those:

They are generated by a custom extension to the WordPress theme that I’m currently using, backed by a MySQL database. The database schema I’m using can be found here for those interested.
Continue reading

Arrival at Zurich – Wikimedia Hackathon 2014

I’ve arrived!

After some hiccups with API, and my shoes setting off the metal detector causing me to have a full body scan et al. by the security guards, the flight here was relatively uneventful thankfully. SwissAir provided free drinks and lemon cake inflight, which was unexpected, but welcome.

The sight of the Alps as the plane was coming in to land in Zurich reminded me how much I love being in the mountains. Although Zurich is not really in the Alps, it’s not too far away, and just the overall feel of Switzerland (France too, but I’m not in France) is really relaxing and calming.

I’m quite glad to be back on the continent again, it’s just a completely different atmosphere. Unfortunately, I’m not going to be able to get out and explore Zurich much while I’m here, as I have a lot of coding to do for the Wikimedia Hackathon!

It’s good to see faces again I’ve not seen for a year – and also to put faces to names I didn’t meet last year in Amsterdam. While the Hackathon hasn’t officially started yet, a fair few people are already getting down to work, including myself. I’ve been making sure that the code I’m going to be working on is in a fit state to have new features added to it, so making sure I’m not building new features on top of existing bugs. I’m also attempting to replicate my working copy of the code that’s sat on my desktop machine onto my laptop – I thought I had all the pieces I needed, then I remembered how long it has been since I last did software development on my laptop!

Power initially proved to be problematic too, while I thought I had a Swiss power adaptor, all of the adaptors I had were in fact only European ones. A mad dash around the shops before I left proved fruitless, so I had to buy one in the airport. UK and EU plugs being larger than Swiss plugs, and with the elegance of the Swiss sockets allowing three plugs to be inserted in a space only slightly larger than the UK and EU plugs means that when an adaptor is plugged directly into a Swiss socket, the ability to use the other two Swiss sockets is severely limited. Thankfully, a fair few people (myself included) have brought extension leads (mainly European), which others can plug into without problems.

The main aims for my time here are to get a framework for supporting OAuth within the account creation tool, allowing us to replace a few key pieces of functionality for which the implementation is not ideal. Firstly, I want to replace our confirmation of on-wiki username, then work towards implementing the welcomer bot to use the account creator’s actual account, rather than faking their identity with a bot. Lastly, if we can allow the account creation tool to actually create the accounts itself too, so we don’t need to redirect to the creation web form, it would be awesome.

I’m hoping to get through as much of that list as possible, with assistance from the development team of the extension, but overall I’m aiming to get far enough with a working framework so that I can complete the rest without assistance. If I can also complete some of the outstanding issues that the extension has too, and learn more about MediaWiki in the process, it will be even better.

Wikimedia Hackathon 2014 – Zurich, Switzerland

In a few weeks time, I’m going to be heading to the Wikimedia Hackathon in Zurich, Switzerland.

I’ve been granted a scholarship by Wikimedia UK to cover my travel costs and accommodation, so I can spend a long weekend hacking on code for Wikipedia-related things in the company of a hundred or so other developers from around the world.

I’m heading there to work on integrating the Account Creation tool with the Wikimedia login system so we can replace our broken welcome bot with posts from the creator an account. In the past we have faked it by making the bot sign as a different user, but we’re hoping to allow the bot to edit as other users using the new OAuth tools.

I’m also hoping to learn a lot more about the internals of MediaWiki, the software which powers sites like Wikipedia, in the hope that I can get a lot more involved in the development of this remarkable piece of software, both the core application itself, and extensions. I’ve already done a bit of work on extensions, but I’ve hardly done anything with core. I’d love to learn more to be able to get much more involved with it.

One of the other things I’m looking forward to is the OpenPGP/GnuPG key signing party that’s been suggested, where people can get together and verify each other’s identities, then go away and sign their keys as being valid. It’ll be the first time that I’ve ever been to something like this, and it will be good to get a few more signatures on my key!

It’s going to be really good to meet people, learn, and importantly hack on code to try and get something worthwhile done!

Internet censorship doesn’t work. At all.

When David Cameron announced that he was planning to force all the ISPs to implement automatic filtering of porn on the internet, a lot of people said it was a good idea in principal. And a lot of people who know how the internet and/or filtering tech works said it’s never going to work.

Let me clear something up. I don’t think kids should be looking at porn. I also don’t think there’s a damn thing (on a technical level) that can be done to stop them.

Filters generally work with blacklists and whitelists, and heuristic patterns. Basically speaking, some sites may never be blocked, others will always be blocked, and the heuristics will likely work with keyword lists, so if a page contains a word like XXX or a phrase like “hot hard-core action”, the filter will probably block that site.

Enter the problem: now this site contains those phrases and isn’t one of the huge well-known sites, it could be blocked by those heuristics. Other sites like LGBT sites, rape support sites, and even teen puberty help sites could find themselves blocked, and not all porn sites will contain those phrases so some will inevitably slip through. What’s more, ISPs could find themselves breaking the law by following the law. The LibDem LGBT site was one of those caught in the crossfire, and blocking the website of a political party around election time could be seen as something like electoral fraud.

Apparently parents can override these filters, but what’s to say that parents don’t let their more knowledgeable kids manage the net connection? Or maybe don’t want to disable them because they’re oblivious to the issues? Or even worse, what if the kid is trying to get support regarding parental sexual abuse?

There is so much that could go (and has gone wrong already) with this, and there’s so much damage that could be done to vulnerable people who are trying to get support. It’s a real kick in the teeth for charitable organisations who are doing their best to help people, and then the government comes along and pushes this through.

This is one of many examples of why I feel so strongly against Governmental intervention in technical matters such as internet governance. If you don’t understand the technology, don’t try and legislate for it. Learn the technology, how it works on a basic level, for example with filtering learn how filtering works, and what sort of things get filtered, advantages and disadvantages, and problems. Don’t assume the industry will work out the problems Mr Cameron.

DNS Decompression

As part of my analysis of the DNS protocol (I’m writing a DNS server from the ground up), I’ve learnt some cool things that the DNS protocol does to reduce the number of bytes transmitted over the wire.

Let’s say you do a DNS lookup for:

stwalkerster.co.uk.            IN      AAAA

The actual DNS data received as a response is something like this:

Transaction ID: 0x707c
Flags: 0x0100 Standard query response, No error
  1... .... .... .... = Response: Message is a response
  .000 0... .... .... = Opcode: Standard query (0)
  .... .0.. .... .... = Authoritative: Server is not an authority for domain
  .... ..0. .... .... = Truncated: Message is not truncated
  .... ...1 .... .... = Recursion desired: Do query recursively
  .... .... 1... .... = Recursion available: Server can do recursive queries
  .... .... .0.. .... = Z: reserved (0)
  .... .... ..0. .... = Answer authenticated: Answer/authority portion was not authenticated
  .... .... ...0 .... = Non-authenticated data: Unacceptable
  .... .... .... 0000 = Reply code: No error (0)
Questions: 1
Answer RRs: 1
Authority RRs: 0
Additional RRs: 0
Queries
  stwalkerster.co.uk: type AAAA, class IN
    Name: stwalkerster.co.uk
	Type: AAAA (IPv6 address)
	Class: IN (0x0001)
Answers
  stwalkerster.co.uk: type AAAA, class IN, addr 2a01:7e00::f03c:91ff:feae:5fd9
    Name: stwalkerster.co.uk
	Type: AAAA (IPv6 address)
	Class: IN (0x0001)
	Time to live: 1 hour
	Data length: 16
	Addr: 2a01:7e00::f03c:91ff:feae:5fd9

… or at least, that’s a marked-up representation the actual byte stream. The actual byte stream looks like this (Ethernet/IP/UDP headers hidden for clarity):

0000   .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. ..  ................
0010   .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. ..  ................
0020   .. .. .. .. .. .. .. .. .. .. 70 7c 81 80 00 01  ..........p|....
0030   00 01 00 00 00 00 0c 73 74 77 61 6c 6b 65 72 73  .......stwalkers
0040   74 65 72 02 63 6f 02 75 6b 00 00 1c 00 01 c0 0c  ter.co.uk.......
0050   00 1c 00 01 00 00 0e 10 00 10 2a 01 7e 00 00 00  ..........*.~...
0060   00 00 f0 3c 91 ff fe ae 5f d9                    ...<...._.

The interesting part is that you can see the looked up name appearing twice in the human-readable form, but only once in the actual byte stream. To describe why, we need to look at how labels (or names) are stored within the protocol.

Let's start at a point in the byte stream which is the location of a label. The position I'm gonna look at is byte 0x37, which has the value 0x0c (0d12). This tells us to look for another 12 bytes, which forms this segment of the label. If we read this in, we get "stwalkerster". We now read in the next length byte, which is 0x02 (0d2). This gives us a segment of "co". The next length is also 0x02, which gives us "uk". Finally, we read in the next length (we don't know this is the end yet!), which gives us a length of 0x00, a zero-length label. This is the end of the label.

Thus, the final result is something like this:

dns

The next instance of this label is at 0x4e. We notice that the first byte has the initial two bits set, so we read the next byte too, to give a value of: 0b1100 0000 0000 1100. The first two bits indicate that this is a pointer, with the rest of the bits indicating the offset from the start of the DNS byte stream the occurrence of the label.

dns2

The DNS packet is therefore compressed in such a way that the duplication of the parts of the label already mentioned is eliminated. This is smart, so a packet containing labels for stwalkerster.co.uk, www.stwalkerster.co.uk, test.www.stwalkerster.co.uk, and simonwalker.me.uk will only mention the suffix of a name once:

dns3 (1)

This pointer-based approach saves a fair amount of bandwidth in large DNS packets where a name may be repeated many times for a round-robin type record.

There are some issues with specific implementations of this though, namely the restrictions on where a pointer may point. Consider this:

dns4

What is the label interpreted by this? simonwalker.me.simonwalker.me.simonwalker.me.simonwalker.me.....

In some bad implementations, this may cause a stack overflow (if done recursively, like mine), or just hang (stuck in an infinite loop, possibly eating RAM). Voila, Denial of Service attack. This can take several forms though, as there could be several pointers involved:

dns5

dns6

dns7

All of the above are technically valid labels. They will just almost never appear as a legitimate request to a DNS server. Any attempt to do something like that should be interpreted as a bad request, or a malicious request and dropped. It shouldn't result in a crash of the DNS server - but I do wonder how many Internet DNS servers are vulnerable to this.

Samsung Galaxy SIII

As my phone contract with Orange was nearing it’s completion, I decided that I would upgrade to the Samsung Galaxy SIII.

It’s a pretty good phone, running Android 4.0.4 Ice Cream Sandwich, 16GB internal memory, Samsung Exynos 4 Quad SoC based on a 1.4 GHz quad-core ARM Cortex-A9 CPU, so it’s not slow! The UI (TouchWiz) is pretty responsive, but a bit different to what I’m used to with HTC Sense – such as the customisable number of home screens. Admittedly, I do miss some of the things that the HTC Desire had – such as a widget to enable the hotspot mode, but there are so many really nice things with this phone – including the availability of Google Chrome :)

The S3 is also my first device which includes an NFC (Near Field Communication) reader, and I’ve found that I can read a number of different things somewhat unexpectedly, including both my bus pass and my passport – I’m hoping I’ll have some chance to play with that technology in the future, especially since I have some NFC tags on order :)

There has also been a bit of a problem with these phones – it was discovered by some researchers that on opening a URL using the “tel” scheme will automatically open the dialler and pre-fill in the number. For example, a link such as the example below will open the dialler and dial 012345:

<a href="tel:012345">click me!</a>

However, some special numbers (called USSD codes) will automatically detect as these special numbers and automatically start something – most phones can be made to show their IMEI number by dialling *#06#. If, by clicking a link such as this your phone shows it’s IMEI, it’s vulnerable to attack. The Galaxy SIII has a special code in it which can be used to run a factory reset without confirmation and without possibility of cancelling it. This could be started simply by clicking a link in a web page. Thankfully, my phone doesn’t appear to be affected by this, but a large number of Galaxy SIIIs out there are vulnerable.

Overall, it’s a brilliant phone, and I’m really glad I got it :)