Apparently, according to ‘how to blog’ blogs, I need to work on my post titles so they not only convey what the post is about but also throw some keywords in there too. According to the ‘suggestions’ on www.hittail.com I should be considering the keywords “cat tracker gps” to improve my search engine hits. Hope you’re looking forward to that post really soon.
Other people say you should make sure the first paragraph has all the key elements for the post, so as not to lose readers in their RSS feed or whatever. Which probably explains why I’ve never got the hang of the serious blogging lark. Myself I like the pretty pictures, so here’s one …
I’m just messing with visualizing where and how many photos are geotagged on Flickr. The data is aggregated from the Flickr database, keyed on location voodoo magic and photo counts. That data is then spat out into a CSV file, along with place names, values and bounding boxes. Because I’m a complete hack I then used JavaScript of all things to convert that into a KML file … because sometimes it’s actually easier to do it that way than mess around with real programming languages.
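For anyone wanting to try the CSV-to-KML step themselves, here is a minimal sketch in Python (not the JavaScript used for the real thing). The column names (`name`, `count`, `west`, `south`, `east`, `north`), the `metres_per_photo` scale and both function names are assumptions made up for illustration; the actual file layout isn’t described here. Each row becomes a rectangle from its bounding box, lifted to a height proportional to its photo count.

```python
import csv
import io
from xml.sax.saxutils import escape


def bbox_to_placemark(name, count, west, south, east, north, metres_per_photo=1.0):
    """Build one KML Placemark: the bounding box as an extruded rectangle."""
    # Height is simply proportional to the photo count (an assumed scaling).
    height = count * metres_per_photo
    # KML wants lon,lat,alt triples, and the ring must close on its first point.
    corners = [(west, south), (east, south), (east, north), (west, north), (west, south)]
    coords = " ".join(f"{lon},{lat},{height}" for lon, lat in corners)
    return (
        "<Placemark>"
        f"<name>{escape(name)}</name>"
        "<Polygon><extrude>1</extrude>"
        "<altitudeMode>relativeToGround</altitudeMode>"
        "<outerBoundaryIs><LinearRing>"
        f"<coordinates>{coords}</coordinates>"
        "</LinearRing></outerBoundaryIs>"
        "</Polygon></Placemark>"
    )


def csv_to_kml(csv_text, metres_per_photo=1.0):
    """Turn CSV text (hypothetical columns, see above) into a KML document string."""
    placemarks = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        placemarks.append(
            bbox_to_placemark(
                row["name"],
                float(row["count"]),
                float(row["west"]),
                float(row["south"]),
                float(row["east"]),
                float(row["north"]),
                metres_per_photo,
            )
        )
    return (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        '<kml xmlns="http://www.opengis.net/kml/2.2"><Document>\n'
        + "\n".join(placemarks)
        + "\n</Document></kml>\n"
    )
```

Feed it the contents of the CSV file and save the result with a `.kml` extension to open it in Google Earth. With raw counts the heights get silly very quickly, so `metres_per_photo` will need some fiddling.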
Things that I’ve learnt …
- See if you can guess what ‘place’ in the whole world has the most geotagged photos … go on, think a little longer … tell you in a second … ok, then, the USA has the most, followed by the UK, then California, with England 4th and so on down. Me, before I looked at the data I was thinking “Oh it’ll be San Francisco, followed by New York or London, it’ll look pretty, etc. etc.”. Raw data doesn’t discriminate or care and is rarely what you expect it to be.
- When the USA is the largest, two things become apparent. The first is that the total in the US is the sum of all the states, and each state is a sum of all the places within it. This means the US is way way bigger than any one state and so on. Log, log, log. The second is that when you’re being lazy (like I was) and just plotting the bounding box rather than the whole outline, the curvature of the earth means you have to give it a pretty big elevation to even peek out of the top of the world, and then everything else looks crazy … see …
- To get things to look ‘right’ I had to tweak a few things: ignore areas over a certain size, and dampen down the difference between the major players (which normally contained the smaller players) and the smaller players. The distribution looks like this … /action: waves hands around, points at something that looks like a long tail graph. See! And obviously getting it to look ‘right’ and pretty is way more important than the real numbers :)
- I’m going to do it again, but using volume this time, rather than just height. The USA will contain a lot of photos, but spread out over a large area. A smaller area with a large number of photos will therefore be much higher and may give a better indication of where’s densely geotagged and where’s not. Well that’s the theory anyway, we’ll have to wait and see what the data says.
- Plotting things in 3D or flat 2D (as opposed to, you know, other types of 2D) each have their own strengths and weaknesses.
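The tweaks in that list (drop the giant areas, squash the big counts with a log, and next time use volume so that height means density) can be sketched roughly as below. This is an illustration under assumptions, not the actual numbers or code used: the function names, the `scale` factors and the `max_area_km2` cutoff are all made up, and the area formula treats the earth as a sphere and ignores boxes that cross the antimeridian.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean radius, spherical approximation


def bbox_area_km2(west, south, east, north):
    """Approximate area of a lat/lon bounding box on a sphere, in square km."""
    lat_s, lat_n = math.radians(south), math.radians(north)
    dlon = math.radians(east - west)
    return EARTH_RADIUS_KM ** 2 * abs(math.sin(lat_n) - math.sin(lat_s)) * abs(dlon)


def keep_place(area_km2, max_area_km2=1_000_000.0):
    """Ignore areas over a certain size (the cutoff here is an arbitrary guess)."""
    return area_km2 <= max_area_km2


def damped_height(count, scale=1.0):
    """Log-damped height: a place with 1000x the photos is only ~2x as tall."""
    return scale * math.log10(1 + count)


def density_height(count, area_km2, scale=1.0):
    """Height for the 'volume' version: photos per square km, so that
    height times footprint area is proportional to the photo count."""
    return scale * count / area_km2
```

With `density_height`, a small box stuffed with photos towers over a huge box holding the same number, which is the whole point of the volume idea. Whether it looks any good is another matter.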
Right, so that’s that, and kind of fun as a first-pass gentle prod at the data. At some point, hopefully soonish, this stuff will escape out of the theoretical and into practical stuff, Ob:maybe-no-time-promises-though. The other thing is that density data will probably stay roughly proportional over time; once you’ve seen it once and gone “Ah!” or “Oh!” it’s not going to change much. However we have ways of working out where’s “HOT!!!” and I’ll have a pass at that too shortly to see how that looks.
Finally I hope to get the data out, probably as a CSV file (oh ok, and KMZ) so other people can have a quick mess around. But I need to check that that’s ok first, not so much on the photo counts total, but rather the bounding box data that’ll be spat out.
Oh and sorry, no kittens.


(\ /)
(( “-” ))
\ @ @ /
( “Y” )
“.^.” `-.
_/( _) ( )
(((.\(((./( (
/.(((”\ )
(((” ”
Kitten!
Thanks for the mention, and I’ll take a look at your stuff. By the way, I love Snow Crash! What type of mapping features do you think would be best in HitTail?
What, no kittens?
[...] I admire their thoroughness in doing the whole world (check the site for country by country breakdowns), and their multi-megabyte eye candy movies. It’s a shame it’s all based on a GDP-like measure, which isn’t the most intuitive or easy to visualise thing itself. I’m reading their papers now to see what the story is. Lastly, I’m really pleased Dan Catt over at Flickr/Geobloggers can’t resist plotting his interestingness heatmaps in 3D inside Google Earth. When the sky goes pink you know it’s because Yahoo’s Dubai office decided to build it for real. [...]