What you get when you move to the cloud
Posted: May 31, 2012 Filed under: Whatever
Grist’s former web hardware arrived at the office today.
Annoying Robots: A Solution for Google Analytics
Posted: April 27, 2012 Filed under: code, web
Last month I posted about a surge of illegitimate traffic we’ve experienced on Grist. Because these visits did things like load JavaScript, the impressions were difficult to distinguish from real traffic, except that they all came from IE and were all of very low quality.
A large number of people who run websites are experiencing the same problem. It matters mainly because it can massively distort analytics (Google Analytics, for example) and skew AdSense to a destructive degree. While many affected folks have simply removed AdSense from the affected pages, I’ve seen no report until now of anyone excluding the traffic from Google Analytics.
We’ve just begun testing a solution that does this, and I’d like to post about it sooner rather than later so that others may both try it out and potentially benefit from it.
The premise of this solution came from Darrin Ward, who suggested in this thread:
1) For IE users only, serve the page with everything loaded in a JS variable and do a document.write of it only when some mouse cursor movement takes place (GA wouldn’t execute until the doc.write).
2) Use the same principle, but only load the GA code when a mouse movement takes place.
While we didn’t exactly do either of these things, we did take the idea of using DOM events that are indicative of a real human (mouse movement, keystroke) to differentiate the zombie traffic from the real. The good news is that this seems — largely — to work. Here’s how to do it:
1. First of all, you must be using Google Analytics’s current (i.e., asynchronous) method for this to make any sense. If you’re not, you probably should be anyway, so it’s a good time to quickly switch. Your page loads will improve if you do.
2. We recommend as a first step that you implement some Google Analytics Events to differentiate good traffic from bad. This will continue tracking impressions on all page loads, but will also fire off a special event that marks the good traffic. Later, once you are happy that the exclusion is happening properly, you can actually exclude impression tracking (see below).
3. To do so, insert this code in your site header after the code that loads Google Analytics:
//Evil Robot Detection
var category = 'trafficQuality';
var dimension = 'botDetection';
var human_events = ['onkeydown','onmousemove'];
if ( navigator.appName == 'Microsoft Internet Explorer' && !document.referrer) {
    // Suspect profile (IE, no referrer): wait for evidence of a human.
    for(var i = 0; i < human_events.length; i++){
        document.attachEvent(human_events[i], ourEventPushOnce);
    }
}else{
    // Not the suspect profile: record the event right away.
    _gaq.push( [ '_trackEvent', category, dimension, 'botExcluded', 1, true ] );
}
// Fires on the first human event, records it, then removes both listeners.
function ourEventPushOnce(ev) {
    _gaq.push( [ '_trackEvent', category, dimension, 'on' + ev.type, 1, true ] );
    for(var i = 0; i < human_events.length; i++){
        document.detachEvent(human_events[i], ourEventPushOnce);
    }
} // end ourEventPushOnce()
//End Evil Robot Detection
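If you want to sanity-check the branching in the snippet above without a browser, the decision and the event payload can be pulled out into two small functions. This is only a sketch: isSuspectVisit and buildQualityEvent are names made up here for illustration, not part of Google Analytics or of the code we actually deployed.

```javascript
// Sketch only: isSuspectVisit and buildQualityEvent are hypothetical helper
// names, not part of Google Analytics or of the snippet above.

// True when a visit matches the zombie-traffic profile described in the post:
// Internet Explorer with no referrer.
function isSuspectVisit(appName, referrer) {
  return appName === 'Microsoft Internet Explorer' && !referrer;
}

// Build the _trackEvent command that would be pushed onto _gaq.
// The trailing `true` marks the event as non-interaction, so it does not
// distort bounce rates.
function buildQualityEvent(label) {
  return ['_trackEvent', 'trafficQuality', 'botDetection', label, 1, true];
}
```

In the page itself you would call these as isSuspectVisit(navigator.appName, document.referrer) and _gaq.push(buildQualityEvent('onmousemove')).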
4. So how does this help you? Well, now in Google Analytics you’ll be able to tell the good traffic from the bad. The good will have an event. The bad won’t. The easiest way to check this in Google Analytics is to check content -> events -> events overview. Within a few hours of pushing the above code you should see events begin to accumulate there.
5. To restore more sanity to your Google Analytics, you could also define a goal. Under admin, go to goals and define a new goal like this:

Once you implement this goal, Google Analytics will know which traffic has achieved the goal and which hasn’t; in effect, you’ve defined a conversion. This means that on any report in Google Analytics, you can restrict the view of the report to only those visits that converted. This is done in the advanced segments menu:

6. Note that this affects only new data that enters Google Analytics — it does not scrub old data unfortunately. In our case, it’s restored Google Analytics to its normal self after a couple of months of frustration.
7. Eventually, you may want to stop Google Analytics from even recording an impression in the case of bad traffic. To do that, just remove the
_gaq.push( [ '_trackEvent', ...
lines above and replace them with
_gaq.push(['_trackPageview']);
Of course, don’t forget to remove the call to _trackPageview from its normal place outside the conditional.
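Put together, the step above might look roughly like the following. Treat it as a sketch rather than the exact code we run: installPageviewGate is a made-up name, and nav, doc and gaq are stand-ins for navigator, document and _gaq, passed in as arguments so the logic can be exercised with fake objects.

```javascript
// Sketch only: installPageviewGate is a hypothetical name. `nav`, `doc` and
// `gaq` stand in for the browser's navigator, document and the _gaq queue.
function installPageviewGate(nav, doc, gaq) {
  var humanEvents = ['onkeydown', 'onmousemove'];

  // Record the pageview once, then stop listening.
  function pushPageviewOnce() {
    gaq.push(['_trackPageview']);
    for (var i = 0; i < humanEvents.length; i++) {
      doc.detachEvent(humanEvents[i], pushPageviewOnce);
    }
  }

  if (nav.appName === 'Microsoft Internet Explorer' && !doc.referrer) {
    // Suspect profile: wait for a keystroke or mouse movement.
    for (var i = 0; i < humanEvents.length; i++) {
      doc.attachEvent(humanEvents[i], pushPageviewOnce);
    }
  } else {
    // Everyone else is counted immediately.
    gaq.push(['_trackPageview']);
  }
}
```

On a real page this would be called once as installPageviewGate(navigator, document, _gaq), after the code that loads Google Analytics. Like the original snippet, it relies on IE's attachEvent and detachEvent, which is fine here because only the IE branch uses them.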
I’d love to hear any ideas for improving this. We don’t use AdSense, but if you do, you could use the same technique to make the insertion of ad code into the DOM conditional.
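For the ad case, one way to do that would be to hold back the ad script until the first human event. We have not tried this ourselves, so take it as an untested idea: insertScriptOnHumanEvent is a made-up name, the URL you pass in is whatever your ad provider gives you, and whether a given ad tag tolerates being inserted late is something you would need to check.

```javascript
// Sketch only: insertScriptOnHumanEvent is a hypothetical name, and the
// script URL is supplied by the caller. `doc` stands in for document.
// Uses IE's attachEvent/detachEvent, so call it only inside the IE branch.
function insertScriptOnHumanEvent(doc, scriptUrl) {
  var humanEvents = ['onkeydown', 'onmousemove'];

  // Runs on the first keystroke or mouse movement, then never again.
  function insertOnce() {
    for (var i = 0; i < humanEvents.length; i++) {
      doc.detachEvent(humanEvents[i], insertOnce);
    }
    var script = doc.createElement('script');
    script.src = scriptUrl;
    doc.getElementsByTagName('head')[0].appendChild(script);
  }

  for (var i = 0; i < humanEvents.length; i++) {
    doc.attachEvent(humanEvents[i], insertOnce);
  }
}
```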
Good luck bot killers!
[UPDATE May 8, 2012] Added the final argument to _trackEvent to prevent distortion of bounce rates. Thanks Chase!
Style Tiles
Posted: March 29, 2012 Filed under: Whatever
Catchy name for a good idea. “Style Guide” always seemed too vague and formal.
How to get up and running on Amazon EC2 quickly (for OSX people)
Posted: March 28, 2012 Filed under: Whatever
So I needed to set up my OSX rig to access AWS, spin up and configure an Ubuntu instance, install Apache, PHP, MongoDB and do various other tasks. Good thing I found these two great resources:
First, here’s Robert Sosinski with a great guide on how to get set up with the EC2 command line tools on Mac OSX. Really clear and well done.
Next, here’s a quick guide from RSM on how to turn that brand new instance into a full LAMP (that’s Linux, Apache, Mongo, PHP) stack … though really you could install whatever packages you need.
Antique Boca Juniors Beer Cans
Posted: March 20, 2012 Filed under: Whatever
What you drink if you are a Boca fan.
CanopyEngine — Grist’s Knight News Challenge entry
Posted: March 20, 2012 Filed under: Whatever
Check out this project I’m involved with at Grist — in fact I’m starting serious work on it next week while I’m in Buenos Aires. It’s all about building an open source platform for realtime/algorithmic news. If you want to be really really nice you could even “like” the project over here.
temporary hemispheric switcheroo
Posted: March 10, 2012 Filed under: Whatever
Headed here for a few weeks:
Photos will be here for those who care.
I’ll be trying to add to my paltry Spanish, playing soccer and inventing something.
