Robots.txt Ban Help

Robots.txt is pretty straightforward. Well-behaved bots are supposed to load /robots.txt before crawling to see which parts of your site they are allowed to access.

Below is the usual setup for robots.txt. The idea is to first allow everything, then deny specific user agents:

User-agent: *
Allow: /

User-agent: BadBot1
Disallow: /

User-agent: BadBot2
Disallow: /

The good bots follow this convention. Bad bots don’t.

To ban bad bots, hopefully they have a unique IP address, User-Agent string, request method, or some other identifier that you can block in .htaccess using a RewriteRule.
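As a minimal sketch, assuming the same placeholder bot names used above and a server with mod_rewrite enabled, .htaccess rules blocking a bot by its User-Agent string might look like this:

```apache
# Hypothetical .htaccess rules; BadBot1 and BadBot2 are placeholder names
RewriteEngine On

# Match the offending User-Agent strings (case-insensitively)
RewriteCond %{HTTP_USER_AGENT} BadBot1 [NC,OR]
RewriteCond %{HTTP_USER_AGENT} BadBot2 [NC]

# Return 403 Forbidden for any request from those agents
RewriteRule .* - [F,L]
```

To block by IP address instead, swap the condition for something like `RewriteCond %{REMOTE_ADDR} ^192\.0\.2\.10$` (that address is a documentation placeholder). Note that determined bad bots can spoof both headers and IPs, so this is mitigation rather than a guarantee.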
