WebTech: Archive
Entry for 14th May 2006 at 11:06 GMT
Flawed BBC Reporting on ‘Google Trends’
The BBC Newsonline article “Birmingham tops web porn search” uses Google Trends to make claims about what end-users in specific cities search for most often, but those claims rest on a flawed premise.
Web surfers in Birmingham are more likely than those in any other city to search for pornography, according to internet search engine Google.
Google Trends analyses the type of searches made from different cities around the globe.
Manchester came second, followed by Brisbane, Perth, Sydney and Melbourne. Brentford, in Middlesex, was seventh.
The author adds a caveat at the end of the article:
Google uses IP address information to make a "best guess" about where the queries originated.
However, this caveat is insufficient, since geolocation is only as accurate as ISPs allow it to be. The article should state what “best guess” really means in this context, and that it is usually only reasonable to label traffic as coming from a geographical region, not a specific city.
Geolocation data tends to point to the ISP's routing centre, where a block of its customers' traffic leaves the ISP's own network and joins the major routing nodes in the country. As far as geolocation is concerned, an equivalent effect is produced by the ‘transparent’ caches that ISPs use to route port 80 HTTP traffic out onto the national network via a small handful of IP addresses.
An example of the effect is that, as far as Google Trends is concerned, my personal search traffic comes from London, although I'm over 100 miles away in Somerset in the west of England. Geolocation techniques generally point to major cities, as this is where the big routing nodes tend to be located, but that's not always true. Spend a short while searching for UK-themed search terms with Google Trends and you might see “Bletchley, UK” appear rather frequently. BT has a large routing centre near Milton Keynes, and the geolocation data that Google uses can only trace BT customers in the surrounding region as far as that particular routing centre. The result is that Google Trends reports an awful lot of searches originating from the suburbs of Milton Keynes.
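The effect described above can be sketched in a few lines of Python. This is only an illustration of block-based lookup in general, not Google's actual method: the table, the `locate` function, the address blocks (reserved documentation ranges) and the subscriber names are all hypothetical.

```python
import ipaddress

# Hypothetical geolocation table: each address block is mapped to the place
# where its traffic joins the wider network, not to where customers live.
# The prefixes are reserved documentation ranges, not real ISP allocations.
GEO_TABLE = [
    (ipaddress.ip_network("192.0.2.0/24"), "London, UK"),
    (ipaddress.ip_network("198.51.100.0/24"), "Bletchley, UK"),
]


def locate(ip):
    """Return the location recorded for the most specific block containing ip."""
    addr = ipaddress.ip_address(ip)
    matches = [(net, place) for net, place in GEO_TABLE if addr in net]
    if not matches:
        return "unknown"
    # Prefer the longest (most specific) matching prefix.
    _, place = max(matches, key=lambda match: match[0].prefixlen)
    return place


# Hypothetical subscribers: where they really are, and the address they use.
subscribers = {
    "Somerset subscriber": "192.0.2.17",
    "London subscriber": "192.0.2.200",
    "subscriber near Milton Keynes": "198.51.100.5",
}

for who, ip in subscribers.items():
    print(f"{who} -> {locate(ip)}")
```

The Somerset and London subscribers are indistinguishable here because both addresses fall in the same block, which is all the lookup has to go on.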
The article needs a stronger caveat: the reported location of a search may typically be many tens, if not hundreds, of miles out, so the listed cities should be read only as an indication of what people in certain geographical regions are searching for. For some end-users, geolocation may only be accurate to within a few thousand miles.
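To put a rough number on that kind of error, the great-circle distance between a real location and the reported one can be estimated with the haversine formula. The coordinates below are assumptions chosen for illustration: Taunton stands in for “somewhere in Somerset” and central London for the reported location.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius


def great_circle_miles(lat1, lon1, lat2, lon2):
    """Haversine distance in miles between two latitude/longitude points."""
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    a = (sin((lat2 - lat1) / 2) ** 2
         + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * asin(sqrt(a))


# Assumed coordinates, for illustration only.
ACTUAL = (51.015, -3.106)    # Taunton, Somerset (stand-in for the real location)
REPORTED = (51.507, -0.128)  # central London (the location geolocation reports)

error_miles = great_circle_miles(*ACTUAL, *REPORTED)
print(f"Geolocation error: about {error_miles:.0f} miles")
```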