Annoying cyborgs attack, distort analytics [UPDATED][SOLVED-ish]
Posted: March 2, 2012
Over the last couple of weeks, I’ve been dealing with a strange phenomenon: a substantial (but not crippling) amount of traffic suddenly came our way. The characteristics of this traffic are:
- it’s direct (i.e. — no referrer and not search traffic)
- it’s all from IE browsers
- it’s nearly all to the homepage
- it’s widely distributed in terms of geography, network etc.
- it’s of very poor quality — low time on site, very high bounce, very low engagement
- it’s real, confirmed in multiple analytics packages
- it flies under DDoS radar because it is less intense than a DDoS burst and largely indistinguishable from real traffic.
This traffic simply started one day and has fluctuated only slightly since. Here’s what I’ve been able to conclude:
- It’s likely not human either — the pattern is too uniform and the quality universally crappy.
This traffic has characteristics consistent with both bot and human behavior, so I think we should call it cyborg traffic! The pattern is consistent with a voluntary browser-net of some sort (people whoring out their OSes to a central service; see Roger Dooley’s proposition below) or some kind of malware that is involuntarily opening windows in users’ browsers (less likely). If this behavior did not seem to include older IE browsers, I’d also speculate that it could be related to prerendering, but that seems unlikely given the facts.
Others have noticed it too, some positing causes:
- This thread on WebmasterWorld contains lots of people reporting and reflecting on the problem
- Roger Dooley (the fellow who started that thread) has proposed with some good evidence that the whole thing is due to a shady entity called Gomez from a company called Compuware. Roger currently seems to be waiting to hear back from these guys — I hope he does soon, and posts the results of any conversations.
- A post appeared on the Google Analytics product forums reporting the same behavior
- A response to the WebmasterWorld thread by @incredibill seems to indicate that he’s found a way, via the request headers, to distinguish this sort of traffic from human traffic. Any chance you could share, Bill?
For updates on this situation, see Roger’s Post, or check back here — I’ll update when more info comes to light.
[UPDATE March 5th, 2012]
More consensus that this is a botnet, but little specific additional clarity about the nature of the traffic involved. Good additional discussion appears here.
[UPDATE March 7, 2012]
Here’s the first potentially reasonable mitigation I’ve come across (from the Google product group thread, above):
We have been getting the same kind of traffic to our homepage now for 17 days. Slow enough that it doesn’t do anything but ruin our analytics and advertising impressions.
One way that we started filtering things out was…
1) If it is an internet explorer user
2) It has no referrer (direct traffic)
Even if a blacklist were not used, one could conditionally load analytics packages in this way … I think.
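To make that idea concrete, here is a minimal server-side sketch of the two quoted conditions (IE user, no referrer) used to decide whether to emit the analytics snippet at all. The function names are hypothetical, matching "MSIE" in the User-Agent string is a rough heuristic for IE of this era, and the homepage-only restriction is my own assumption based on the traffic pattern described above rather than part of the quoted rule. This is not what the poster actually ran, just one way it might look.

```python
from typing import Optional


def is_suspect_hit(user_agent: str, referrer: Optional[str], path: str = "/") -> bool:
    """Return True when a request matches the cyborg-traffic profile.

    Conditions from the quoted post: Internet Explorer, no referrer.
    Assumption (not in the quote): only homepage hits are considered.
    """
    is_ie = "MSIE" in user_agent              # heuristic IE detection (assumption)
    is_direct = not (referrer or "").strip()  # missing or blank Referer header
    is_homepage = path == "/"                 # assumption: restrict to the homepage
    return is_ie and is_direct and is_homepage


def should_load_analytics(user_agent: str, referrer: Optional[str], path: str = "/") -> bool:
    """Decide whether to include the analytics snippet in the page."""
    return not is_suspect_hit(user_agent, referrer, path)
```

The obvious cost is that real IE users who type the homepage URL directly also disappear from the reports, so this trades some undercounting for cleaner numbers.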
Additional update: Google seems to be investigating. A Google staffer posted:
We’re still investigating this issue and I’ll keep you posted when there are further updates. We appreciate your patience.
[UPDATE April 27, 2012] We’ve found a workable way to exclude this stuff from Analytics. Check it out here.