How to Remove Google Analytics Referral Spam
By: SmallBizClub
Referral spam (also referrer or log spam) is bogus information that can skew analytics data and the reports that come from it. The image below is an example from a low-traffic website where the spikes are easy to see.
A legitimate referral is when someone arrives at your site via a link from another site. Legitimate referrals have certain characteristics. For example, let’s say someone has this link on their site: “Get a daily website traffic snapshot from Scormi.” The HTML for that link might be <a href="http://scormi.net/snapshot.html">traffic snapshot</a>
You can see that this link goes to the hostname “scormi.net” and you can further see that it’s going to the page “snapshot.html.” With a valid referral link, your hostname and a target page are present, and Google Analytics will record that information. All is good.
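To make the two pieces concrete, here is a minimal Python sketch that splits the example link into its hostname and page. The function name split_link is my own, and the “(not set)” fallback only mimics the placeholder that shows up in reports; it is not how Google Analytics itself processes hits.

```python
from urllib.parse import urlparse


def split_link(url: str) -> tuple[str, str]:
    """Return (hostname, page path) for a link target.

    Falls back to "(not set)" when no hostname can be parsed. That
    fallback is an illustrative assumption, not Google Analytics behavior.
    """
    parts = urlparse(url)
    hostname = parts.hostname or "(not set)"
    page = parts.path or "/"
    return hostname, page


hostname, page = split_link("http://scormi.net/snapshot.html")
print(hostname)  # scormi.net
print(page)      # /snapshot.html
```

A hit that carries neither piece of information is the kind of record the rest of this post is about.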
Enter the bots.
A spammer triggering hits to your website with a script doesn’t come via a referral link—or even come at all—so the record of such a “visit” will not include your hostname or a landing page. This will result in landing pages where the URL appears as (not set).
Why do spammers create these fake referral links? The purpose is to drive traffic to a particular site by getting you, the curious webmaster, to click on the referring URL in your analytics report or log file.
Some of the most common referral spam offenders are these or variations of them:
- semalt.com
- buttons-for-website.com
- see-your-website-here.com
- 4webmasters.org
If you think about it, referral spam is the same tactic used by email spammers except that instead of using a curiosity-invoking email subject line, you get a curiosity-invoking referral link in your analytics report. So a hat tip to the dark side for being clever and pivoting off a tried-and-true technique.
Scormi creates its report from Google Analytics data via the API, so any faulty data in Google Analytics ends up in the Scormi report. It’s important to keep your Google Analytics data clean to ensure your report is accurate.
To filter out referrer spam I use a Google Analytics hostname filter, which has eliminated the problem for the sites I look after.
Before explaining how to set it up let’s review some techniques that won’t work.
Google Analytics Exclude Filters
It’s possible to block the spamming domains in the filter section of your property by making a list of offenders. But the spammers continually shift their domains and IP addresses, requiring ever more filters to keep up with them. Adding new filters indefinitely is not a feasible long-term solution.
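The weakness of a list of offenders is easy to see in a small Python sketch. The blocklist below uses the domains named earlier in this post; the is_blocked helper and the new-spam-domain.example name are hypothetical, and this is not how Google Analytics evaluates its filters internally.

```python
import re

# Blocklist built from the offenders named in this post.
BLOCKED_DOMAINS = [
    "semalt.com",
    "buttons-for-website.com",
    "see-your-website-here.com",
    "4webmasters.org",
]

# Match the domain itself or any subdomain of it.
BLOCKED_RE = re.compile(
    r"(^|\.)(" + "|".join(re.escape(d) for d in BLOCKED_DOMAINS) + r")$"
)


def is_blocked(referrer_domain: str) -> bool:
    """Return True if the referrer is on the blocklist."""
    return BLOCKED_RE.search(referrer_domain.lower()) is not None


print(is_blocked("semalt.com"))               # True
print(is_blocked("www.semalt.com"))           # True
print(is_blocked("new-spam-domain.example"))  # False: a new domain slips through
```

Every new spam domain needs a new entry, which is exactly the maintenance burden described above.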
Htaccess File
This file acts as a gatekeeper for what can and cannot come onto your site. Whereas the Google Analytics filters can hide bogus info from your reports, the htaccess file blocks the bad stuff at the front door before it gets in.
Unfortunately, this approach suffers from the same problem as the exclude filters; the file must be continuously updated as spammers shift source domains and IP addresses.
It gets worse. Both of these methods fail when spammers trigger a Google Analytics hit without ever visiting your website. How is that possible? The Google Analytics tracking code placed on your website includes a unique identifier similar to UA-000000-01. Spammers run a script that automatically increments the numbers and cycles through them by the thousands. They do it from their own computers, with no need to visit any website.
So scratch the exclude filters and htaccess file as viable solutions for countering referrer spam.
The Include Hostname Filter
An “include hostname” filter, on the other hand, will eliminate Google Analytics hits that seem to be referrals but have no “hostname” associated with them (recall the referral link example at the top) or have hostnames other than the ones you want.
Adding this filter to my sites eliminated referral spam and the annoying (not set) landing page.
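The idea behind the include filter can be sketched in a few lines of Python. The hit records, field names, and example domains below are invented for illustration; Google Analytics applies the filter on its side, and this only mirrors the logic of keeping hits whose hostname is one of yours.

```python
# Placeholder hostnames; substitute your own site names.
ALLOWED_HOSTNAMES = {"yoursite.com", "www.yoursite.com"}


def include_by_hostname(hits, allowed=ALLOWED_HOSTNAMES):
    """Keep only hits whose hostname is in the allowed set."""
    return [hit for hit in hits if hit.get("hostname") in allowed]


# Invented sample hits: one real visit and two spam records.
hits = [
    {"hostname": "www.yoursite.com", "referrer": "news.example"},
    {"hostname": "(not set)", "referrer": "semalt.com"},
    {"hostname": "spam-host.example", "referrer": "4webmasters.org"},
]

kept = include_by_hostname(hits)
print(len(kept))            # 1
print(kept[0]["hostname"])  # www.yoursite.com
```

Note that the referrer is never inspected. The filter does not care which spam domain sent the hit, only whether the hit names one of your hostnames, so there is no list of offenders to maintain.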
Here’s how to add the filter to a Google Analytics property and test it before going live.
Step 1. Sign in to your Google Analytics property.
Step 2. Click on the Admin tab at the top then click Filters on the right.
Step 3. Click the red +New Filter button. (Aside: You can see I already have a few exclude filters set up to screen out employee/partner traffic.)
Step 4. Select “Create new Filter” (A) then name the filter (B). I named it “allowed hostnames.”
Step 5. In the image above you see that Exclude is selected by default. Change that to the “Include” option (A, below) then choose “Hostname” from the drop-down box.
Step 6. A. Now add your allowed hostnames. Referral links on other sites refer TO YOU, so you’ll want to add your own hostname. I would enter “scormi.net” and “www.scormi.net”; you will enter your own site names.
If you recently did a 301 redirect from an old domain, say for a rebranding, then add that one too. Separate the names with a vertical bar “|”, with no spaces between the names.
Others have pointed out that visitors sometimes arrive from Google Translate or from the cached version of your web pages in Google search, so we will add those too. You might end up with something like this in the Filter Pattern box, entered as one continuous line. (I have added line breaks here so this post formats correctly on mobile devices.)
yoursite.com|www.yoursite.com
|translate.googleusercontent.com
|webcache.googleusercontent.com
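If you keep your allowed hostnames in a list, a short Python sketch can assemble the single-line pattern and let you try it locally before pasting it into the filter. I am assuming the pattern is treated as a regular expression that may match anywhere in the hostname; the passes_filter helper is hypothetical and only approximates what the Verify step does.

```python
import re

# Placeholder hostnames from the example above; use your own.
hostnames = [
    "yoursite.com",
    "www.yoursite.com",
    "translate.googleusercontent.com",
    "webcache.googleusercontent.com",
]

# Join with a vertical bar: one continuous line, no spaces.
pattern = "|".join(hostnames)
print(pattern)

matcher = re.compile(pattern)


def passes_filter(hostname: str) -> bool:
    """Return True if the hostname matches the include pattern."""
    return matcher.search(hostname) is not None


print(passes_filter("www.yoursite.com"))  # True
print(passes_filter("(not set)"))         # False
print(passes_filter("semalt.com"))        # False
```

In a regular expression an unescaped dot matches any character, so this pattern is slightly looser than a literal hostname match. Writing each dot as \. would tighten it, if you want that.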
Step 6. B. Click the Verify button to preview the results. In the example below, the filter will screen out more than a few (not set) hits: 8.5% of the total traffic to that site! We also see that the “after filter” results contain only the hostnames we chose to include in Step 6A.
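The before-and-after comparison is simple arithmetic, sketched here in Python. The hit counts are made up and chosen only so that the share removed comes out to the 8.5% mentioned above; verify_preview is a hypothetical helper, not part of any Google Analytics tool.

```python
from collections import Counter


def verify_preview(hostname_counts, allowed):
    """Return (counts after filtering, percent of hits removed)."""
    before = sum(hostname_counts.values())
    after_counts = Counter(
        {host: n for host, n in hostname_counts.items() if host in allowed}
    )
    after = sum(after_counts.values())
    removed_pct = 100.0 * (before - after) / before if before else 0.0
    return after_counts, round(removed_pct, 1)


# Invented sample: 1,000 hits, 85 of them with no hostname.
sample = Counter({"www.yoursite.com": 900, "yoursite.com": 15, "(not set)": 85})
allowed = {"yoursite.com", "www.yoursite.com"}

after_counts, removed_pct = verify_preview(sample, allowed)
print(removed_pct)                # 8.5
print(sorted(after_counts))       # ['www.yoursite.com', 'yoursite.com']
```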
Step 7. Save the filter if you are satisfied with the test result. Watch your results after instituting this filter. If you notice anything undesirable then edit it accordingly or delete it altogether.
Given that Google Analytics is an enterprise-class product, I expect that Google will solve this problem just as it has with Gmail and web spam.
This article was originally published by Scormi
Author: Dave Goodwin is co-founder of Scormi, a digital marketing analytics service, and the operator of PlaneViz, a popular aviation website. Connect with Dave on Twitter @dgoodwn
Published: July 9, 2015
3532 Views