Most webmasters and website owners know the damage adversaries can do to a website. An adversary is any visitor who misuses your site in some way. Spammers, scrapers, email harvesters, and malicious bots are examples of this type of adversary.
Web servers let us mitigate these bad user agents by testing each request’s user-agent string against a predefined blacklist of unwanted visitors. Any visitor whose user agent matches an entry on the blacklist is denied access immediately. Although it can be argued that this isn’t the most effective method of securing your site against malicious behavior, it adds an extra layer of protection.
Before making any changes, here are a few points to keep in mind:
- It is easy to change a user-agent name, so spammers can rename their bots to get around the list. Blocking is therefore an ongoing process, and the list of bad agents needs to be kept up to date.
- Performance is another consideration. As the blacklist grows, the server has to check every request against all of the listed bots, which can slow the site down. Performance is important, but security should not be compromised for it.
We will look at how to do this with Apache; a similar technique can be used to block bad bots on other web servers. With an .htaccess file you can do it easily, without even restarting the server.
The prerequisite is that Apache’s mod_rewrite module must be enabled for this change to work. In your .htaccess file, you then write one rewrite condition for each bad bot.
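As a minimal sketch of what such rules might look like, the fragment below denies two user agents. The names `BadBot` and `EvilScraper` are hypothetical placeholders, not entries from any real blacklist; substitute the agents you actually want to block. It assumes mod_rewrite is loaded and that the server allows .htaccess overrides for this directory.

```apache
# Sketch only: BadBot and EvilScraper are placeholder names, not real agents.
<IfModule mod_rewrite.c>
    RewriteEngine On

    # One condition per unwanted agent.
    # [NC] makes the match case-insensitive; [OR] chains it to the next condition.
    RewriteCond %{HTTP_USER_AGENT} ^BadBot [NC,OR]
    # The last condition in the chain must not carry [OR].
    RewriteCond %{HTTP_USER_AGENT} EvilScraper [NC]

    # Leave the URL unchanged (-) and answer with 403 Forbidden ([F]).
    RewriteRule .* - [F,L]
</IfModule>
```

Each new bad bot gets its own `RewriteCond` line above the final one. Keeping `[OR]` on every condition except the last means a request is denied when it matches any single entry, rather than only when it matches all of them.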
Perishable Press released the Ultimate User-Agent Blacklist, which can be found at https://perishablepress.com/wp/wp-content/images/2009/agent-blacklist/user-agent-blacklist.gif