Htaccess and blocking crawlers


#1

Hi!

I want to block some specific user agents from accessing my site. For example, anyone who has “Macintosh” in HTTP_USER_AGENT.

I created a .htaccess file in the root of my website, and it works (I have some tag in the beginning).

Now I add the following:

RewriteEngine On
RewriteCond %{HTTP_USER_AGENT} ^Macintosh
RewriteRule ^.* - [F,L]

And this doesn't work. What's wrong, anyone?


#2

Have you seen the article in the wiki? http://wiki.dreamhost.com/index.php/Htaccess



#3

Yes, but there are no answers to my question in this wiki article.


#4

It sounds like you need some help with regex. Personally, I suck at regex and can only suggest that you try appending a “.*$” (…that’s dot-asterisk-dollarsign) at the end of your search string (…for example, “^macintosh.*$”), but that might not work so well either.


#5

Well, actually I took the example from some other file, and it works for many people (google for “close to perfect htaccess part 4”).


#6

Looks to me like that’s going to match only useragents that start with Macintosh, with that exact capitalization. A search of my site’s logs showed zero useragents that start with Macintosh, but plenty that contain it. I suggest using this, which should match any useragent that contains Macintosh with any form of capitalization:

RewriteCond %{HTTP_USER_AGENT} Macintosh [NC]

If that doesn’t work, you can try something a little more complicated:
RewriteCond %{HTTP_USER_AGENT} ^.*Macintosh.*$ [NC]
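
For reference, here is how the unanchored condition might look when combined with the rule from the first post. This is only a sketch: it assumes mod_rewrite is loaded and that .htaccess overrides are allowed for the directory, neither of which has been confirmed in this thread.

```apache
RewriteEngine On

# No leading ^ anchor, so "Macintosh" may appear anywhere in the User-Agent.
# [NC] makes the comparison case-insensitive.
RewriteCond %{HTTP_USER_AGENT} Macintosh [NC]

# Leave the URL unchanged (-) and answer with 403 Forbidden (F).
# F stops rule processing on its own, so L is not needed.
RewriteRule ^ - [F]
```

A typical Mac browser string begins with something like "Mozilla/5.0 (Macintosh; ...", which is why the original ^Macintosh pattern finds nothing to match.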


#7

Looks like mod_rewrite is not compiled on this particular server @ coke :) That must have been the reason. I’ve contacted support.
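
In case it helps anyone else without mod_rewrite, the same kind of block can probably be done with mod_setenvif instead. This is an untested sketch that assumes Apache 2.2-style access directives, that mod_setenvif is available, and that the host allows these directives in .htaccess. The variable name block_ua is just a made-up label.

```apache
# Flag any request whose User-Agent contains "Macintosh" (case-insensitive).
# block_ua is a hypothetical environment variable name.
SetEnvIfNoCase User-Agent "Macintosh" block_ua

# Apache 2.2-style access control: allow everyone, then deny flagged requests.
# With "Order Allow,Deny", a request matching both Allow and Deny is denied.
Order Allow,Deny
Allow from all
Deny from env=block_ua
```

On Apache 2.4 the Order/Allow/Deny lines are deprecated in favor of Require directives, so this would need adjusting there.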