Blocking referrer spam on IIS with ISAPI Rewrite
Referrer spam is a relatively new nasty that has come about largely because of the wave of new blogs launched over the last few years. As with trackback and comment spam, the aim of referrer spam is to place links back to the offending website all around the web to drive traffic and raise search engine ranking. Trackback and comment spam bots can to a large extent be dealt with using human verification technologies such as captcha and kittenauth, as well as spam lookup services such as Akismet.
Referrer spam is a different beast in that its aim is not to post links on the target website directly. Instead it relies on bloggers and other webmasters who publish statistics on their websites about where their visitors are coming from, i.e. referrers. The referral spam bots simply hit your site with a bogus referrer leading back to their website, and all of a sudden webmasters displaying live referrer statistics have links back to those same undesirable sites that have plagued trackback and comment systems. This is done over and over again, and at worst, if left unchecked, these referrer spam bots can consume enough processing and bandwidth resources on your system to cause a denial of service (as I experienced). At the very least they will fill your referrer logs with bogus information so that you have no idea where your real visitors are coming from. In this article I’ll show you how to use ISAPI Rewrite for IIS to stop those referrers dead in their tracks.
My denial of service
It all started about 3 months ago for me. I often look through my Drupal referral logs to see where visitors are coming from, and often when I don’t recognise a referrer I’ll click on it to see what it is all about. My first mistake was to do this for a spam referrer using a fairly innocent looking address in my logs, and all of a sudden I found myself redirected to a flesh site. My biggest mistake was to do this more than once, and as these sites obviously monitor where their traffic is coming from I then started getting hit hard. Over the next few months my referral spam grew so much that by last week upwards of 90% of referrers in my log were spam. It didn’t stop there though, as a few nights ago I tried to access my site to find that it wasn’t there. Like many who use the Drupal CMS, I use the Throttle module to scale back dynamic content when the site is under load. When I got my site back online one of the last Drupal log entries I found was from the Throttle module initialising itself as there were 102 guests online, which at present is well above my real concurrent usage statistics.
How to tell if you’re getting referral spam
On the positive side, referrer spam is usually pretty easy to spot once you know what you’re looking for. The first thing to look at is the referring address: if it looks suspect, it usually is. The other thing to look at is how many hits you get from that referrer, as most bots will hit your site repeatedly, either to try to get into your top referrers list or because the bot hasn’t got the smarts to know which sites it has already hit. I’ve found that most of the time bots will use referrer addresses that point to commonly used free webhosting sites which then redirect to the spam site.
[Screenshot: referral spam in my Drupal referral logs. The spam stops dead about halfway up, which marks the point where I applied my ISAPI Rewrite rules.]
[Screenshot: referral spam on my AWStats referrer page.]
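If you would rather not eyeball the logs, the hit-count check described above can be roughly sketched in Python. This is only an illustration and not something taken from Drupal or AWStats: the function name top_referrer_hosts and the sample referrers are made up, and it assumes you have already pulled the referrer URLs out of your log into a list of strings.

```python
from collections import Counter
from urllib.parse import urlsplit


def top_referrer_hosts(referrers, limit=10):
    """Count hits per referring host and return the most frequent ones."""
    hosts = Counter()
    for ref in referrers:
        host = urlsplit(ref.strip()).hostname
        if host:  # skip empty or malformed referrers such as "-"
            hosts[host] += 1
    return hosts.most_common(limit)


# Hypothetical sample data; real values would come from your own logs.
sample = [
    "http://site1.example.com/page",
    "http://site1.example.com/other",
    "http://site1.example.com/",
    "http://www.example.org/links",
    "-",
]

print(top_referrer_hosts(sample))
```

A host you don’t recognise sitting at the top of that list with far more hits than anything else is a good candidate for a closer look.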
Using ISAPI Rewrite to block referral spam
For all the grief these referral spam bots can cause, it is quite surprising that you can stop them with just two lines near the top of your ISAPI Rewrite httpd.ini file. The following rules do a case-insensitive check of the referrer of every incoming request against your list of known referral spam keywords, and if a match is found then no further processing is done and a 404 (page not found) status code is sent.
#Block referral SPAM
#Add keywords between the () below and separate with |
RewriteCond Referer: .*(?:keywords|go|here).*
RewriteRule (.*) $1 [I,F]
All you have to do is fill in the parentheses with keywords taken from the referrer strings of the spam bots, as shown above, and separate them with the pipe symbol. For example, if you are getting hit with referrer spam pointing to site1.mortgage.com and site2.mortgage.com, you’d simply enter the keyword mortgage, and those sites and any future ones with the word mortgage in their referral string will be blocked. A strong word of warning though: ISAPI Rewrite does not know a good referrer from a bad one, and any referrer matching the keywords you’ve set will be blocked, including legitimate traffic!
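Because of that warning, it can be worth checking a keyword list against referrers you know are good before putting it live. The following is a minimal Python sketch of that idea, not part of ISAPI Rewrite itself. It assumes Python’s re module behaves closely enough to ISAPI Rewrite’s regular expressions for plain keywords, it treats each keyword as literal text rather than as a regex fragment, and the names build_pattern, blocked and known_good, along with the sample URLs, are made up for illustration.

```python
import re


def build_pattern(keywords):
    """Build a case-insensitive pattern shaped like the RewriteCond above."""
    # Keywords are escaped, so they are treated as plain text (a simplification).
    alternation = "|".join(re.escape(k) for k in keywords)
    return re.compile(r".*(?:" + alternation + r").*", re.IGNORECASE)


def blocked(referrers, keywords):
    """Return the referrers that the keyword list would block."""
    pattern = build_pattern(keywords)
    return [r for r in referrers if pattern.fullmatch(r)]


# Hypothetical referrers that you would NOT want to lose.
known_good = [
    "http://www.example.org/links.html",
    "http://news.example.com/2007/mortgage-rates-explained",
    "http://www.example.net/search?q=isapi+rewrite",
]

print(blocked(known_good, ["mortgage", "blogspot"]))
```

With this sample data the second referrer is caught by the mortgage keyword even though it is a perfectly legitimate link, which is exactly the kind of collateral damage to watch for when choosing keywords.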
The bots will be back
One thing you have to keep in mind is that spam bots are always evolving, and your spam filter rules will need to grow with them. You will always need to keep an eye on your referral logs to make sure that any new referral spam sites are added to your list. For instance I developed a set of rules to keep my current referral spam out a few days ago, and for a few days I had nice clear logs. Then last night they were back and I was hit with a hundred or more bogus blog referrals from blogspot. I added blogspot to my keywords this morning and have not seen them since, although I have little doubt I will see them again.
Once you have your rules in place you will see an immediate change in your referral logs, and if you are getting hit really hard by them you’ll also see a noticeable reduction in system resource usage (particularly bandwidth). Using ISAPI Rewrite to stomp out referral spam might not be the most elegant solution, but it sure is effective.