Banning webcrawlers


#1

Does anybody have a URL handy for a quick ‘how to ban webcrawlers’ howto?

(I was hoping to be lazy myself… but will read a bigger work if I have to.)

Thanks
Byron


#2

Nothing “handy”, but if you’re PHP-aware you can set up your own PHP compilation and use the “browscap” .ini setting in conjunction with the “get_browser” function in PHP. Provided you have an up-to-date browscap file, it’s quite efficient at redirecting the appropriate bots to the appropriate place (it also helps in adjusting for the inadequacies in MSIE’s CSS 2 implementation). Feel free to send me a private message if you’d like a copy of the browscap file I’m using.
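For anyone not on PHP, the same idea (match the User-Agent header against a list of wildcard patterns, then route accordingly) can be roughly sketched in Python. This is not the browscap file format or the get_browser API, just an illustration of the technique. The pattern list, the function names and the two paths are all made up for the example, and a real browscap file carries far more entries.

```python
from fnmatch import fnmatchcase

# Hypothetical wildcard patterns for this example only; a real
# browscap file is far larger and is updated regularly.
CRAWLER_PATTERNS = [
    "*googlebot*",
    "*bingbot*",
    "*crawler*",
    "*spider*",
]


def is_crawler(user_agent: str) -> bool:
    """Return True if the User-Agent matches any crawler pattern."""
    ua = user_agent.lower()  # compare case-insensitively
    return any(fnmatchcase(ua, pattern) for pattern in CRAWLER_PATTERNS)


def route_for(user_agent: str) -> str:
    """Pick a destination path (both paths are placeholders)."""
    if is_crawler(user_agent):
        return "/bots-not-welcome.html"
    return "/index.html"


if __name__ == "__main__":
    bot = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
    human = "Mozilla/5.0 (Windows NT 10.0; Win64; x64) Firefox/120.0"
    print(route_for(bot))
    print(route_for(human))
```

Keep in mind that the User-Agent header is supplied by the client, so this only catches bots that identify themselves honestly.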

Another way to do this is using IP/DNS lookups to blacklist specific IPs and ranges of IPs, but that’s a much more complex method.
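The IP-range half of that approach might look something like the Python sketch below, using the standard ipaddress module. The networks listed are reserved documentation ranges standing in for real offenders, and the names BLOCKED_NETWORKS and is_blocked are invented for the example. The DNS lookup side is left out entirely.

```python
import ipaddress

# Placeholder ranges (reserved for documentation), not real bot
# addresses. Substitute whatever your own logs tell you to block.
BLOCKED_NETWORKS = [
    ipaddress.ip_network(net)
    for net in ("192.0.2.0/24", "198.51.100.17/32", "2001:db8::/32")
]


def is_blocked(addr: str) -> bool:
    """Return True if addr falls inside any blacklisted network."""
    ip = ipaddress.ip_address(addr)
    return any(
        ip.version == net.version and ip in net
        for net in BLOCKED_NETWORKS
    )


if __name__ == "__main__":
    for candidate in ("192.0.2.55", "203.0.113.9", "2001:db8::1"):
        print(candidate, is_blocked(candidate))
```

A linear scan like this is fine for a handful of ranges; a long blacklist would probably want a smarter lookup structure, which is part of why this method gets complex.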

Note that many bots will honor a “Robots” meta tag, but not all.
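To make the meta tag concrete, here is a small Python sketch of what a well-behaved crawler might do with it: parse the page, find the robots meta tag, and read its directives. The sample page and the names RobotsMetaParser and robots_directives are made up for illustration. A bot that ignores the tag simply never runs a check like this, which is why the tag alone can't ban anything.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        if (attr.get("name") or "").lower() != "robots":
            return
        content = attr.get("content") or ""
        # Directives are comma-separated, e.g. "noindex, nofollow".
        for directive in content.split(","):
            directive = directive.strip().lower()
            if directive:
                self.directives.add(directive)


def robots_directives(html: str) -> set:
    """Return the set of robots directives found in an HTML document."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives


# Sample page asking compliant bots not to index it or follow its links.
PAGE = (
    '<html><head><meta name="robots" content="noindex, nofollow">'
    "</head><body>hi</body></html>"
)

if __name__ == "__main__":
    print(sorted(robots_directives(PAGE)))
```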


#3

I had this bookmarked but haven’t actually bothered to try it myself, so I have no idea how effective it is: Blocking unwanted bots


#4

Thanks for the reference.

That’s the info I was looking for.
Byron