You would think, wouldn’t you, that in these days of the internet and technology and a chicken on every table there would be some reasonable, logical and standardised way of describing where you are. Well, my friend, you would be sadly mistaken!
Say, for example, that you were implementing a geotagging system - in other words, a simple series of hierarchical steps that, once you get to the bottom of the chain, hopefully describes roughly where you are. You’d think that there was some globally accepted standard for doing this, perhaps from ISO or the IETF .. a simple, clean, well-documented and logically sound method for implementing this kind of thing.
Hell no. What there is is an absolute pig’s breakfast of incompatible, illogical, ever-changing and apparently totally arbitrary conflicting closed semi-standards. There’s no one obvious winner, not even a kind-of winner, and there isn’t even one that’s likeable. They’re all a complete joke.
One first begins to grasp the abject state of things when, embarking on this journey of enlightenment, one realises that no-one can even agree on how many continents there are. You’d think it was pretty obvious - I’ve always accepted that it was 7, which seemed to be a fairly logical and geographically sound system of division. Well, apparently not! There are enough people arguing that it’s fewer than that, or more than that, or arguing where the lines are, that there isn’t even a standard numbering system for continents.
But don’t worry, it doesn’t end there! Next is countries! Actually, for this level of division there is a fairly reasonable system - it’s called ISO 3166-1, and it basically assigns every country a number. It’s free, fairly consistent, and except for the odd idiocy (for example, the UK’s 2-letter code is .. wait for it .. GB, not UK. Why? What was wrong with UK? Was everyone getting it mixed up with all those other UKs?) it’s OK. No idea how they derived the numbers, there seems to be no logic to it at all, but at least we end up with a list of country names and numbers.
However, if you thought this reasonableness would extend to the next product from ISO I examined, ISO 3166-2, think again! Firstly, 3166-2 isn’t even a free standard. ISO expect you to buy it from them - a $500 PDF detailing what they think things should be called! Forget that! Luckily, free (albeit somewhat outdated) versions are floating around the internet .. or you could just piece together the jigsaw from the million or so Wikipedia articles about it. However, don’t waste your time.
The $500 might almost be justifiable if the standard made any sense. There is no sense to 3166-2 at all. Even basic naming is ridiculously inconsistent - a subcountry area might be referred to by a number, a number with a 0 in front of it (problematic when you remember this is a varchar), a one-letter code, a two-letter code, or hell, why not a 3-letter code! Never mind that the maximum number of entries for any one country is 50 for America.
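To make that mess a bit more concrete, here is a minimal Python sketch. The handful of codes are only illustrative examples of the country-prefix-plus-suffix shape (check them against a current list before trusting them), and `classify` is a made-up helper name, not part of any standard or library.

```python
# Illustrative subdivision codes in the "CC-XXX" shape. Treat the specific
# values as examples only, not as an authoritative copy of the standard.
codes = ["US-CA", "AU-NSW", "ES-M", "JP-13", "FR-01"]


def classify(code):
    """Describe the shape of the part after the country prefix."""
    suffix = code.split("-", 1)[1]
    if suffix.isdigit():
        kind = "zero-padded number" if suffix.startswith("0") else "number"
    else:
        kind = f"{len(suffix)}-letter code"
    return suffix, kind


# Map each code to the shape of its suffix.
shapes = {code: classify(code)[1] for code in codes}

# The leading zero only survives while the value stays a string;
# a trip through an integer column silently drops it.
round_tripped = str(int("01"))
```

Five codes, five different shapes, and the zero-padded one quietly turns into something else the moment anything treats it as a number.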
This is totally inappropriate for my needs - which I thought to be pretty simple. I just want a basic, hierarchical system of continents, countries, states and cities, tied together by some kind of rational system, in a form I can implement in a data structure. What do I get? A bunch of unsorted, unstructured, inconsistent plaintext with no rationality to it at all. And that’s the killer, really - if I liked the system, even if it was difficult to implement, I might still use it. But the ISO system is the opposite of likeable, it’s detestable - and don’t even get me started on the competing FIPS American system.
So what can I do? I need numbers. I need unique numbers for every place of interest to me, and I need a strong hierarchy. So I invented my own system, based on a conjunction of the useful parts of the ISO efforts and Google’s AdWords subcountry system, itself an ISO derivation. So I’ve made a list of the top 535 subcountries in the world, given them numbers, and set it all up. It’s totally non-standard, totally un-inclusive, heavily biased towards those countries I actually give a fuck about, and that’s all great. OK, solved, kinda, down to the subcountry (i.e. state/province) level.
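For the curious, the general shape of a numbers-plus-hierarchy scheme like this can be sketched in a few lines of Python. This is only a rough illustration of the idea, not the actual list: the ids, the rows and the names `Place`, `PLACES` and `path` are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Place:
    id: int                   # unique number for the place
    name: str
    level: str                # "continent", "country" or "subcountry"
    parent_id: Optional[int]  # None at the top of the hierarchy


# Hypothetical ids and rows, purely for illustration.
PLACES: Dict[int, Place] = {
    p.id: p
    for p in [
        Place(1, "Oceania", "continent", None),
        Place(10, "Australia", "country", 1),
        Place(100, "New South Wales", "subcountry", 10),
        Place(101, "Victoria", "subcountry", 10),
    ]
}


def path(place_id: int, places: Dict[int, Place] = PLACES) -> List[str]:
    """Walk parent links upward and return the names from the top down."""
    chain = []
    current: Optional[int] = place_id
    while current is not None:
        place = places[current]
        chain.append(place.name)
        current = place.parent_id
    return list(reversed(chain))


nsw_path = path(100)
```

Every place gets exactly one number and exactly one parent, so getting from a subcountry back up to its continent is just a matter of following the links.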
Now cities. Don’t remind me of cities, please… I’ve got 3558 unsorted cities here, waiting for some database transformations to categorise them, and that’s just with a population over 100,000 … :(
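The categorising itself is not conceptually hard, just tedious. A minimal sketch in Python, assuming each city row already carries the number of its subcountry: the city names, ids and populations here are placeholders, and `categorise` is a hypothetical helper rather than anything that exists yet.

```python
from collections import defaultdict

# Placeholder rows: (city name, subcountry id, population).
cities = [
    ("Exampleville", 100, 250_000),
    ("Sampleton", 101, 120_000),
    ("Tinyburg", 100, 8_000),
    ("Placeholder City", 100, 1_400_000),
]

MIN_POPULATION = 100_000  # the cut-off mentioned above


def categorise(rows, min_population=MIN_POPULATION):
    """Group cities over the population cut-off by their subcountry id."""
    grouped = defaultdict(list)
    for name, subcountry_id, population in rows:
        if population > min_population:
            grouped[subcountry_id].append(name)
    # Sort ids and names so the result is stable and easy to eyeball.
    return {key: sorted(names) for key, names in sorted(grouped.items())}


by_subcountry = categorise(cities)
```

The hard part, of course, is getting a trustworthy subcountry number onto each of those 3558 rows in the first place.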