WebTech: Replies

The Original Entry:

Odd Referrer Spam-Like Behaviour From Microsoft

Wed 29th Aug 2007 20:01 GMT by Andrew from Frome, U.K.

For a few weeks at least, possibly longer, I've been noticing hits purportedly from search results on Microsoft's LiveSearch website, e.g.: "http://search.live.com/results.aspx?q=SEARCHPHRASE&mrt=en-us&FORM=LIVSOP". Whilst it looks like legitimate traffic, a little investigation raises some questions, from which you could deduce that the traffic is referrer spam unless there is a technical explanation.

The referring LiveSearch web address contains the search term that would ordinarily have been used to generate the page of search results doing the referring. This information is typically extracted by statistical packages such as Awstats that process information about website visits from the webserver's log files. If the hits were referrer spam they'd increase the apparent importance of LiveSearch as a search engine in website statistical packages. Whether the traffic is legitimate is tricky to judge, but a couple of things suggest that it could be fake. Firstly, the IP addresses of all of these hits are owned by Microsoft (not those of end-users). Secondly, the search terms are common single words that wouldn't return search results where my site was on the first page of results displayed in LiveSearch (regardless of how many items are displayed per page): entire search phrases such as "search", "slightly", "pause", etc.
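The two warning signs described above can be checked mechanically. What follows is only a rough sketch in Python, not how Awstats or my own log viewer does it. The names `SUSPECT_NETWORKS`, `search_phrase` and `looks_suspicious` are made up for illustration, and the network listed is a reserved documentation range standing in for whatever Microsoft-owned blocks a WHOIS lookup would actually turn up.

```python
from ipaddress import ip_address, ip_network
from urllib.parse import urlsplit, parse_qs

# ASSUMPTION: placeholder block (a reserved documentation range), not a real
# Microsoft allocation. Real ranges would have to come from a WHOIS lookup.
SUSPECT_NETWORKS = [ip_network("192.0.2.0/24")]


def search_phrase(referrer):
    """Return the 'q' search phrase from a LiveSearch referrer, or None."""
    parts = urlsplit(referrer)
    if parts.hostname != "search.live.com":
        return None
    values = parse_qs(parts.query).get("q")
    return values[0] if values else None


def looks_suspicious(client_ip, referrer):
    """Flag hits that match both warning signs: the visitor's IP sits in a
    suspect block AND the referring search phrase is a single word."""
    phrase = search_phrase(referrer)
    if phrase is None:
        return False
    address = ip_address(client_ip)
    from_suspect_block = any(address in net for net in SUSPECT_NETWORKS)
    single_word = len(phrase.split()) == 1
    return from_suspect_block and single_word
```

Requiring both signs together is a deliberate choice here: a single-word search from an ordinary end-user address is unremarkable on its own, as is a multi-word search from a corporate address.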

Initially I thought the traffic was bona fide referrer spam, but I now suspect that it is legitimate traffic. The reason is that, for users with JavaScript enabled, LiveSearch overrides the click action of search result URIs, redirecting them to a "GLinkPing.aspx" page which may be constructing its own HTTP_REFERER after tracking which link the LiveSearch user has used (which would explain the Microsoft IP block). One reason the referrer search term is a single word could be that the generation of the pass-through referrer has a bug that only catches one word of the search phrase and loses the rest, so I get a referrer that doesn't represent the user's original search, which would explain why I don't appear as a link on the referred page of results.

I have to come to this conclusion because the alternative seems unlikely (Microsoft having a farm of robots referrer spamming fake hits for their search engine for PR purposes) and I'm mindful of Hanlon's Razor!

Update

IIS web server logfile of LiveSearch hits from Microsoft servers
An example of odd LiveSearch hits from my server logfile viewer showing hits from Microsoft IPs with Live Search referrers. The "Search: " part is an automatically prettified way of writing "http://search.live.com/results.aspx?q=slightly&mrt=en-us&FORM=LIVSOP" that pulls out the referring search-engine search terms.
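For the curious, the prettifying step amounts to something like the following Python sketch. This is an illustration of the idea rather than the viewer's actual code, and the function name `prettify_referrer` is hypothetical.

```python
from urllib.parse import urlsplit, parse_qs


def prettify_referrer(referrer):
    """Show a LiveSearch results referrer as 'Search: <terms>'.

    Any other referrer is returned unchanged.
    """
    parts = urlsplit(referrer)
    terms = parse_qs(parts.query).get("q")
    if parts.hostname == "search.live.com" and terms:
        return "Search: " + terms[0]
    return referrer


print(prettify_referrer(
    "http://search.live.com/results.aspx?q=slightly&mrt=en-us&FORM=LIVSOP"
))
```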

Update
Note: Since the time of writing the number of these hits has increased, or at least they have become more noticeable in my logs, though the inanity of the referrer search terms has not changed. This leads me to the conclusion that it is not an innocent mistake. So much for "never attribute to malice that which is adequately explained by stupidity".

[ modified 17th November 2007 12:00 GMT ]

Replies

1. Posted by Michael VanDeMar [ web page link ]

Wed 17th Oct 2007 03:20 GMT
Actually, no, it is referrer spam from Microsoft as you originally suspected. An employee from Microsoft has in fact confirmed that it is indeed them doing it, although for some as yet unexplained reason they billed it as some form of "quality check", which given the surrounding circumstances doesn't make sense.

You can read more about the discussion, which has been going on for a while now, at WMW:

webmasterworld.com/msn_microsoft_search/3424476.htm

The Microsoft employee has the username of "msndude".
