Posts tagged heathrowtower
Quick update on Heathrow Tower
Feb 13th
There haven’t been any visible changes to my Heathrow Tower project in the past couple of weeks, beyond my throwing in a few greetings in other languages to break things up a bit. I’ve put some of the statistical plans on hold, as the snow last week prevented any data gathered from being anywhere close to representative. In the meantime, I’ve gradually been building up the database behind the scenes so I can start to do some of the more intricate things I’d like to do.
The key data I wanted was the airport codes for the various flights, and geographic data for those airports.
Firstly I found FlightAware.com, which will provide all sorts of data from a flight code. Unfortunately, the first time I tried making a request to their site using HTTP Client, I spotted a comment in the HTML referring visitors to http://flightaware.com/about/termsofuse.rvt which states:
You will only access the FlightAware web site with an interactive web browser and not with any program, collection agent, or “robot” for the purpose of automated retrieval of content.
So I started looking at airline sites. United have a relatively straightforward URL scheme that responds to a GET and returns data that can be scraped, e.g.:
http://www.ua2go.com/flifo/FlightSummary.do?date=20090201&fltNbr=959
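Judging from that example, the URL seems to take a date in YYYYMMDD form and a bare flight number. A rough sketch of building one in Ruby follows; the helper name is made up for illustration, the date format is inferred from the single example above, and the site may well have changed since.

```ruby
require "uri"
require "date"

# Base path copied from the example URL above.
FLIFO_BASE = "http://www.ua2go.com/flifo/FlightSummary.do"

# Hypothetical helper: builds the flight status URL for a flight number
# on a given date. Assumes the date parameter is formatted as YYYYMMDD.
def united_status_url(date, flight_number)
  query = URI.encode_www_form(
    "date"   => date.strftime("%Y%m%d"),
    "fltNbr" => flight_number
  )
  "#{FLIFO_BASE}?#{query}"
end

puts united_status_url(Date.new(2009, 2, 1), 959)
```

Fetching the page and scraping the result is a separate step, and depends on whatever markup the site returns.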
BA and Virgin, on the other hand, both require cookies to be enabled in order to get results from their flight trackers and don’t advertise any other URLs for flight data. Once I’d realised that only one of those three carriers was going to be helpful, I decided not to keep checking airlines.
So, a little frustrated, I tried just typing flight codes into Google. And lo and behold… most of them give useful results. It doesn’t get them all, of course, but it’s enough that out of the 838 flight codes in the database, 695 are fully identified. Of the 143 not identified, a number seem to be flights that were diverted to Heathrow but don’t normally go there.
So with some sense of the airports served, I also want to know about the airports themselves. Wikipedia’s pretty good there, with geo data for them all in an easy to capture form. Some, like Heathrow itself, are very easy to find:
http://en.wikipedia.org/wiki/LHR
while others are a bit trickier. But with some manual intervention I was able to get all of that data. The manually produced mappings and the code for pulling in the data can both be found on github.
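Once each airport has a latitude and longitude, the length of a route can be estimated as a great-circle distance using the haversine formula. This is only a minimal sketch, not the code in the repository: the `Airport` struct is invented for illustration, the coordinates are approximate, and JFK is just a convenient example destination.

```ruby
# Mean Earth radius in kilometres.
EARTH_RADIUS_KM = 6371.0

# Hypothetical record: an airport code with latitude/longitude in degrees.
Airport = Struct.new(:code, :lat, :lon)

# Great-circle distance between two airports via the haversine formula.
def great_circle_km(a, b)
  to_rad = Math::PI / 180
  dlat = (b.lat - a.lat) * to_rad
  dlon = (b.lon - a.lon) * to_rad
  h = Math.sin(dlat / 2)**2 +
      Math.cos(a.lat * to_rad) * Math.cos(b.lat * to_rad) * Math.sin(dlon / 2)**2
  2 * EARTH_RADIUS_KM * Math.asin(Math.sqrt(h))
end

# Approximate coordinates, for illustration only.
lhr = Airport.new("LHR", 51.4700, -0.4543)
jfk = Airport.new("JFK", 40.6413, -73.7781)

puts great_circle_km(lhr, jfk).round
```

For Heathrow to JFK this comes out at a little over 5,500 km. Real flights don’t follow the great circle exactly, so it is best treated as a lower bound on distance flown.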
More updates as time allows…
Tracking Heathrow with twitter
Jan 29th
A few months back, while we were discussing the number of talking objects appearing on twitter, Jenny pointed out to me that all Heathrow airport arrivals and departures data is online. That set my mind racing: if you know all the flights leaving that currently controversial airport, there are all manner of things you could begin to do, such as working out miles travelled and carbon emitted, spotting delays, and so on. But at the time it all came down to a quick note in Things to some day set aside time to explore.
That day arrived this week. The data turned out to be pretty simple to scrape, with a quick wrapper around hpricot, and to throw into an SQLite database using datamapper to give me a little abstraction and a place to throw a variety of methods to make my code simpler. And then it was a small matter of employing John Nunemaker’s twitter gem to set up regular tweets letting followers in on how many flights in and out of Heathrow there have been lately.
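The summarising step in the middle of that pipeline is simple enough to sketch without any of the gems. What follows is not the bot’s actual code: the `Flight` struct, the `hourly_summary` name, the placeholder flight codes and the wording of the message are all assumptions made for illustration, and scraping, storage and posting are left out.

```ruby
# Hypothetical record for one scraped flight; the real schema may differ.
Flight = Struct.new(:code, :direction, :time)

# Counts arrivals and departures in the hour up to `now` and
# returns a short sentence suitable for posting as a tweet.
def hourly_summary(flights, now)
  recent = flights.select { |f| f.time > now - 3600 && f.time <= now }
  arrivals   = recent.count { |f| f.direction == :arrival }
  departures = recent.count { |f| f.direction == :departure }
  "Past hour at Heathrow: #{arrivals} arrived, #{departures} departed."
end

now = Time.utc(2009, 1, 29, 12, 0, 0)
flights = [
  Flight.new("XX100", :arrival,   now - 600),
  Flight.new("XX200", :departure, now - 1800),
  Flight.new("XX300", :arrival,   now - 7200) # older than an hour, so ignored
]

puts hourly_summary(flights, now)
```

Keeping this as a plain function of the data and the current time makes it easy to test separately from the scraping and the twitter calls.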
The result is a rather pleasing hourly summary that adds a little rhythm and background awareness into my day. You can follow it at http://twitter.com/heathrowtower.
Perhaps the biggest frustration with the data is that all destinations/origins are given as city names. City names are hardly unique, and even if they were, a given city may have several airports connecting with Heathrow, so that makes it a bit trickier to do some of the more sophisticated calculations. My hope is that the flight codes (which are given) can soon be transformed into a list of airport codes, which can then open up a route to more useful and interesting data. (If anyone knows of an existing database that does that mapping, please let me know!)
I’m looking forward to that, but I’m also anticipating the ambient awareness that having the bot running will create. Will the hourly ritual of seeing a sentence or two about Heathrow activity reveal any patterns? If it does, maybe I’ll update the code to make more of those. We’ll see.
For now, please do follow the tower on twitter, tell people about it, send it messages if you spot anything interesting, and feel free to take a look at the code over on github.