How to block bad bots using htaccess

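You can keep bad bots away in three ways: with .htaccess rules, with the robots.txt file, or with Cloudflare's Bot Fight Mode. Keep in mind that robots.txt only asks crawlers to stay away, so it helps only with bots that choose to honor it, while .htaccess rules and Cloudflare refuse the requests outright. Blocking unwanted bots keeps them from overloading your server and slowing down your website.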

To block bad bots using .htaccess, you can follow these steps:

  1. Identify Bad Bots: Before blocking any bots, it's essential to identify which bots you want to block. You can find lists of known bad bots online or analyze your server logs to see which bots are causing issues or consuming resources unnecessarily.

  2. Create or Edit .htaccess File: Access your website's root directory and locate the .htaccess file. If you don't have one, you can create a new text file and name it ".htaccess". Ensure that you have the necessary permissions to edit or create files in your server directory.

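  3. Block Bots by User-Agent: Use the following code snippet to block specific bots based on their User-Agent string. Replace "badbot" with the name of the bot you want to block. The pattern is a regular expression: the leading ^ means the User-Agent must start with "badbot", so remove it to match the name anywhere in the string. You can add multiple SetEnvIfNoCase lines to block multiple bots. Note that Deny is Apache 2.2 syntax; on Apache 2.4 it works only if mod_access_compat is loaded.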

# Block Bad Bots by User-Agent
SetEnvIfNoCase User-Agent "^badbot" bad_bot
Deny from env=bad_bot
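
To block several bots at once, one option is to combine their names into a single alternation pattern instead of repeating the directive. This is a minimal sketch: badbot, evilscraper and spamcrawler are made-up placeholder names, not real User-Agent strings.

```apache
# Flag any request whose User-Agent contains one of these (hypothetical) names.
# There is no leading ^, so the name may appear anywhere in the string.
SetEnvIfNoCase User-Agent "(badbot|evilscraper|spamcrawler)" bad_bot

# Refuse every request that was flagged above
Deny from env=bad_bot
```

Keeping all names in one pattern gives you a single place to edit when your list of bots changes.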

  4. Block Bots by IP Address: If you notice abusive behavior from certain IP addresses, you can block those addresses directly. Use the following code snippet:

# Block Bad Bots by IP Address
Deny from 192.0.2.10
Deny from 198.51.100.25

Replace the IP addresses with the actual IP addresses you want to block. You can add multiple lines to block multiple IP addresses.
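
If the abusive traffic comes from a whole network rather than a single machine, Deny from also accepts CIDR ranges and IPv6 addresses. A short sketch, using reserved documentation ranges as stand-ins for real addresses:

```apache
# Block an entire IPv4 range (256 addresses) with CIDR notation
Deny from 198.51.100.0/24

# Block an IPv6 range
Deny from 2001:db8::/32
```

Block ranges with care: a wide range can easily include legitimate visitors who share the network with the bot.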

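  5. Block Bots by Referrer: Some bots do not identify themselves accurately in the User-Agent string, but they can still be recognized by the referrer they send. You can block such bots by checking the Referer field in the HTTP header (the header name really is spelled with a single "r"). Like the User-Agent, this field is supplied by the client and can be faked, so treat it as one signal among several. Use the following code snippet:

# Block Bad Bots by Referrer
SetEnvIfNoCase Referer "^https://www\.badbot\.com/" bad_referrer
Deny from env=bad_referrer

Replace "www.badbot.com" with the actual referring domain you want to block, escaping each dot with a backslash so that it matches a literal dot. You can add multiple SetEnvIfNoCase lines to block multiple referrers.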
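
The same referrer check can also be written with mod_rewrite, which does not depend on the older Deny directive. This is a sketch under the assumption that mod_rewrite is enabled on your server; badbot.com and spamdomain.net remain placeholder domains.

```apache
RewriteEngine On

# Match the Referer header case-insensitively, with or without "www."
RewriteCond %{HTTP_REFERER} ^https?://(www\.)?badbot\.com [NC,OR]
RewriteCond %{HTTP_REFERER} ^https?://(www\.)?spamdomain\.net [NC]

# Answer matching requests with 403 Forbidden
RewriteRule ^ - [F]
```

If your .htaccess file already contains rewrite rules, for example from a CMS, these lines most likely need to come before them so that the block is evaluated first.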

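  6. Block Bots by Hostname: Similar to blocking by IP address, you can also block bots by hostname if they are consistently causing issues. To evaluate such a rule, Apache has to perform reverse DNS lookups on the visitor's IP address, which can slow down requests, so prefer IP-based rules where you can. Use the following code snippet:
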
# Block Bad Bots by Hostname
Deny from .badbot.com
Deny from .spamdomain.net

Replace ".badbot.com" and ".spamdomain.net" with the actual hostnames you want to block. You can add multiple lines to block multiple hostnames.
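
On Apache 2.4, Deny is kept only for backward compatibility, and Require is the recommended replacement. The following sketch shows how the User-Agent, IP address and hostname rules above might look in that syntax; the address and domain are placeholders, as before.

```apache
# Flag unwanted bots by User-Agent, as in the User-Agent step
SetEnvIfNoCase User-Agent "^badbot" bad_bot

# Allow everyone except requests matching one of the "not" rules
<RequireAll>
    Require all granted
    Require not env bad_bot
    Require not ip 192.0.2.10
    Require not host badbot.com
</RequireAll>
```

Keep all Require not lines inside a single RequireAll block. Separate top-level blocks are treated as alternatives, so a request rejected by one block could still be admitted by another. It is also best not to mix this syntax with Deny from lines in the same file.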

  7. Custom Error Page for Blocked Bots: You can show a custom message to blocked bots instead of the server's default error page. Create an HTML file with your custom message and save it in your website directory. Then, use the following code snippet in your .htaccess file. Apache does not redirect the bot; it returns your page as the body of the 403 Forbidden response:

# Custom Error Page for Blocked Bots
ErrorDocument 403 /custom_error_page.html

Replace "/custom_error_page.html" with the actual path to your custom error page.
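
One detail to watch: the custom error page is itself a file on your site, so your deny rules may apply to it as well, and the blocked bot then receives Apache's default message instead of yours. A possible workaround, assuming the page is named custom_error_page.html as above and you are using the Deny-style rules from the earlier steps, is to exempt that one file:

```apache
ErrorDocument 403 /custom_error_page.html

# Let every client, including blocked ones, fetch the error page itself
<Files "custom_error_page.html">
    Order Allow,Deny
    Allow from all
</Files>
```

ErrorDocument also accepts a short quoted message in place of a path, which avoids the extra file altogether.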

  8. Testing: After implementing the changes in your .htaccess file, it's crucial to test whether the blocking rules are working as expected. You can use online tools, a user-agent switcher extension in your browser, or a command-line client such as curl (its -A option sets the User-Agent) to imitate a bot and verify that the request is answered with 403 Forbidden.

  9. Regular Maintenance: Keep an eye on your server logs and monitor any new bots or suspicious activities. Update your blocking rules accordingly to ensure that your website remains protected from unwanted bots.

Remember to always keep a backup of your .htaccess file before making any changes, and be cautious when blocking bots to avoid accidentally blocking legitimate traffic.