Robots.txt Ban Help

Robots.txt is pretty straightforward: bots are supposed to load http://yoursite.com/robots.txt to see which parts of your site they are allowed to crawl.

Below is the usual setup for robots.txt. The idea is you first allow everything, then disallow specific user agents:

User-agent: *
Allow: /

User-agent: BadBot1
Disallow: /

User-agent: BadBot2
Disallow: /

The good bots follow this convention. Bad bots don’t.
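To see what a compliant bot does with those rules, here is a minimal sketch using Python's standard urllib.robotparser; the bot names are just the placeholders from the example above:

```python
from urllib.robotparser import RobotFileParser

# The example robots.txt from above, as a string.
rules = """\
User-agent: *
Allow: /

User-agent: BadBot1
Disallow: /
"""

rp = RobotFileParser()
rp.parse(rules.splitlines())

# A bot not named in any specific group falls through to the * rules.
print(rp.can_fetch("GoodBot", "/page.html"))  # True

# BadBot1 matches its own group and is disallowed everywhere.
print(rp.can_fetch("BadBot1", "/page.html"))  # False
```

Of course, this check is voluntary: the parser tells the bot what it may fetch, but nothing stops a bot from ignoring the answer, which is exactly why bad bots need server-side blocking.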

To ban bad bots, you have to hope they have a unique IP address, User-Agent string, request method, or some other identifier that you can match and block in .htaccess with a RewriteRule.
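For example, a blanket User-Agent block with mod_rewrite might look like the sketch below (BadBot1 and BadBot2 are the placeholder names from the robots.txt example; substitute the actual strings you see in your access logs):

```apache
# Turn on mod_rewrite (requires the module to be enabled on the server).
RewriteEngine On

# Match the User-Agent header case-insensitively ([NC]) against either name.
RewriteCond %{HTTP_USER_AGENT} (BadBot1|BadBot2) [NC]

# "-" means no substitution; [F] returns 403 Forbidden, [L] stops processing.
RewriteRule .* - [F,L]
```

Unlike robots.txt, this is enforced by the server, so it works whether or not the bot cooperates.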
