Blocking a spider


#1

I have a Chinese spider going through my site. Its IP address ranges from a low of 124.115.0.-- to a high of 124.115.0.--. How can I block this spider?

Thanks


#2

Check the wiki:

http://wiki.dreamhost.com/KB_/Unix/_.htaccess_files#Deny.2FAllow_Certain_IP_Addresses



#3

Do you think it obeys robots.txt? If so, put the following in your robots.txt file:

User-agent: {fill in the name of the user agent for the spider here with no surrounding braces}
Disallow: /

If not, use the following in your .htaccess file:

RewriteCond %{REMOTE_ADDR} ^124\.115\.0\.
RewriteRule .*$ - [F]


#4

I don’t know the name of the spider. I just see a lot of different IP addresses in my FireStats coming from China, so I guess

RewriteCond %{REMOTE_ADDR} ^124\.115\.0\.
RewriteRule .*$ - [F]

in my .htaccess file would be the best option?


#5

Yes. Don’t forget to turn the rewrite engine on if it’s not already turned on:

RewriteEngine On

Also note that rather than using the mod_rewrite engine as I’ve suggested, the wiki espouses the use of the mod_access engine:

order allow,deny
deny from 124.115.0
allow from all

Either will work. I think you can get finer granularity with a rewrite directive because you have more regular expression capabilities and can also combine conditions involving the address, the hostname, the name of the agent, etc.
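As a rough sketch of what combining conditions could look like, the rules below refuse a request if it comes from a 124.115.0.x address or if its user agent string contains a given name. ExampleBot is a made-up placeholder, not the real name of this spider, so substitute whatever shows up in your own logs.

```apache
RewriteEngine On
# Match any client address that starts with 124.115.0.
RewriteCond %{REMOTE_ADDR} ^124\.115\.0\. [OR]
# ExampleBot is a hypothetical agent name; [NC] makes the match case-insensitive
RewriteCond %{HTTP_USER_AGENT} ExampleBot [NC]
# Answer matching requests with 403 Forbidden
RewriteRule .* - [F]
```

Without the [OR] flag the two conditions are ANDed, which would only block requests that match both the address and the agent name.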

Blocking the right set of bots is a whole little hobby in and of itself, much like keeping insects out of your home. Good luck and please feel free to come back for additional advice!



#6

If you want to block it completely, try blocking the entire range, which is what I did. All of my sites have been copping a hammering from this bot, and on each visit it comes in on a new IP address from within the range I’ve specified below. If you block only 124.115., it will just return using IPs lower in the range allocated to China Telecom.

This kills it entirely. :)

order allow,deny
deny from 124.114.0.0/15
allow from all
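For anyone who prefers the mod_rewrite approach from earlier in the thread, a rough equivalent of the /15 block might look like the sketch below. It relies on 124.114.0.0/15 spanning 124.114.0.0 through 124.115.255.255, that is, every address whose second octet is 114 or 115.

```apache
RewriteEngine On
# 124.114.0.0/15 covers all addresses beginning with 124.114. or 124.115.
RewriteCond %{REMOTE_ADDR} ^124\.11[45]\.
# Answer matching requests with 403 Forbidden
RewriteRule .* - [F]
```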