Blocking bad bots with htaccess


#1

i’ve been getting files leeched by dizzler.com, and found a way to block their spider and others by editing .htaccess:

http://www.evolt.org/article/Using_Apache_to_stop_bad_robots/18/15126/

from here:

i tried using the bad_bot environment variable, and also tried just using dizzler’s IP directly, but both caused 500 errors on all pages in subdirectories of the directory containing my .htaccess file. i think it’s because i’m not correctly specifying paths in my Directory directives.

i wrote this:
<Directory "/mymusicdirectory/">
Order Allow,Deny
Allow from all
Deny from 66.232.150.219

and put that in an .htaccess file in the parent directory of mymusicdirectory. i tried the same with the environment variable method as well; both cause 500s. can someone please clarify how to specify paths in this context? the directory i put this .htaccess file in is /mysite.com/.

my ftp client (Transmit) doesn’t display paths above the parent of mysite.com, so i’m not sure if i’m missing something like /home/users/ or anything.

in case you hadn’t noticed, i’m a noob with server config stuff, so please go easy :slight_smile:

thanks!


#2

Just put this .htaccess inside mymusicdirectory:

Order Allow,Deny
Allow from all
Deny from 66.232.150.219
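
For the environment variable method mentioned in the first post, a minimal sketch of the same .htaccess might look like the following. It assumes Apache 2.2-style access control with mod_setenvif available, and the "dizzler" User-Agent substring is a guess rather than something confirmed in this thread, so check your access logs for the string the spider really sends.

```apache
# .htaccess placed inside mymusicdirectory (no <Directory> wrapper needed)

# Flag any request whose User-Agent contains "dizzler", case-insensitively.
# ASSUMPTION: "dizzler" is a placeholder; use the string from your own logs.
SetEnvIfNoCase User-Agent "dizzler" bad_bot

# Allow rules are evaluated first, then Deny rules; a matching Deny wins.
Order Allow,Deny
Allow from all

# Refuse flagged bots, plus the single IP from the original post.
Deny from env=bad_bot
Deny from 66.232.150.219
```

Blocked requests should get a 403 instead of the file. On Apache 2.4 the Order/Allow/Deny directives are only kept for compatibility, so the syntax there would differ.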

Maximum Cash Discount on any plan with MAXCASH

How To Install PHP.INI / ionCube on DreamHost


#3

Dude,
You’re lucky that you have only one deny line “so far”.
I wish you the best of luck at staying spam free.



#4

thanks sXi. i can do this, but it would be nice to be able to specify multiple paths from one .htaccess…any tips on pathing on dreamhost? that’s why i posted on this forum in particular, figure we all have somewhat similar paths…


#5

The Apache HTTP server comes with documentation, and if you read it, it will tell you that the Directory directive is not allowed in .htaccess files; it is only valid in the main server config and virtual host sections, which is why using it in .htaccess produces a 500 error. “RTFM” would also tell you that .htaccess files apply to the directory they are in as well as to its subdirectories. And if you have multiple directories that don’t share a root, just use a script/cron job that replicates the .htaccess files for you.
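
When the directories do share a parent, one possible way to cover several of them from a single .htaccess is mod_rewrite, since its patterns in an .htaccess file are matched against the path relative to the directory the file sits in. This is only a sketch: it assumes mod_rewrite is enabled and allowed in .htaccess on the host, and "anotherdirectory" is a made-up name for illustration.

```apache
# .htaccess in the shared parent directory (the site root in the original post)

RewriteEngine On

# Only act on requests coming from the offending IP (dots escaped in the regex).
RewriteCond %{REMOTE_ADDR} ^66\.232\.150\.219$

# In .htaccess context the pattern has no leading slash.
# "-" means no substitution; [F] answers with 403 Forbidden.
# ASSUMPTION: "anotherdirectory" is a hypothetical second directory.
RewriteRule ^(mymusicdirectory|anotherdirectory)/ - [F]
```

Because the paths are relative to the .htaccess file, no /home/... style absolute path is needed at all.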

:cool: openvein.org -//-


#6

What do y’all think about this blocklist:

http://perishablepress.com/press/2007/10/15/ultimate-htaccess-blacklist-2-compressed-version/

I like the “condensedness” of it, and it seems pretty well thought / worked out to my untrained eye.

One thing I’m wondering as well: would it still work if one were to remove the RewriteBase / line from it? I fear that line might conflict with some subdirectories that have PHP apps running in them.
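
On the RewriteBase question, a rough sketch: per the Apache documentation, RewriteBase only matters for per-directory rules that substitute a relative path. Rules that simply refuse the request with "- [F]" have no substitution, so a list built that way should behave the same without the line. The user-agent names below are invented placeholders, not copied from the linked blacklist, and this assumes its rules follow the same pattern.

```apache
# .htaccess at the site root, with no RewriteBase line

RewriteEngine On

# ASSUMPTION: "badbot" and "evilscraper" are hypothetical user-agent substrings.
# [NC] makes the match case-insensitive.
RewriteCond %{HTTP_USER_AGENT} (badbot|evilscraper) [NC]

# No substitution ("-"), so RewriteBase would have nothing to act on.
# [F] returns 403 Forbidden.
RewriteRule .* - [F]
```

One caveat: a subdirectory whose own .htaccess contains rewrite directives does not inherit the parent's rules by default, so apps there may bypass the list unless RewriteOptions Inherit is set in that subdirectory.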


#7

as i mentioned, atropos7, i’m new to this area (server config). i RTFM’d as much as i could, but didn’t notice that the Directory directive didn’t apply to .htaccess, and found the Directory tip in a thread asking for advice on using .htaccess to limit bots that don’t follow robots.txt rules.

the directories i’m trying to block bad bots from do all share a root. just looking for tips on how to correctly specify the paths to those directories in a standard dreamhost install. either absolute or relative paths would help.

thanks…


#8

Never use someone else’s blocklist on your own site.

If they’re bad, you don’t want them. Block them at root.
