2012-11-30

Parsing US Driving Licence Barcodes

Most states in the US and some Canadian provinces include a PDF417 barcode on the back of their driving licences. The barcode contains a host of information about its owner, such as names, address, height, weight, eye colour, date of birth, etc. There are currently 7 different versions of the standard, which you can download here (click on the “Documentation” tab):

Unfortunately, the standards are full of breathtakingly stupid mistakes. Dates are currently my favourite.

This is the date format used in version 1:

yyyymmdd

That’s one of the ISO-8601 standards for representing a date.

In version 2 they switched to this:

mmddyyyy

That’s the standard US way of representing dates (I like to think of them as “lumpy” dates, because the format goes “large-small-large”, whereas ISO dates are big-endian). I have no idea why they did this. I presume they got a lot of complaints from Americans who were stumped by the unusual date format whilst decoding the PDF417 barcodes with nothing more sophisticated than their eyes. Any automated parser would naturally re-format the date into the local standard, so they must have been doing it manually. An impressive skill.

In version 3 the Canadians decided to get in on the barcode action. Canadians use the big-endian date format, so the spec now states that date fields can store the dates in one of two ways:

yyyymmdd
mmddyyyy

Any parsers need to check the licence’s country code before they can parse dates. Not only does this version of the spec introduce a new standard but it contains multiple standards within a single field.

Wow.