Seeking some advice on regular expressions

#1

I am trying to put together a regular expression but don’t really know what I am doing, and I am hoping that someone can give me a little help. Multiple unrecognized clients are frequently hitting my server trying some sort of exploit. Most likely it’s the simple “script named as image” stuff, and fortunately DH’s Apache configuration is recognizing this and logging 503 errors in my error.log.

Here are a couple of example entries from my error log:

[quote][Mon Jan 7 21:46:27 2008] [error] [client 84.19.178.201] mod_security: Access denied with code 503. Pattern match “\\.ph(p(3|4)?).*path=(http|https|ftp)\\:/” at REQUEST_URI [severity “EMERGENCY”] [hostname “benconley.net”] [uri “/index.php?option=com_autolinks&Itemid=&mosConfig_absolute_path=http://www.humanesociety.com/media/phpbo.do???”]

[Mon Jan 7 21:49:37 2008] [error] [client 66.249.13.109] mod_security: Access denied with code 503. Pattern match “\\.ph(p(3|4)?).*path=(http|https|ftp)\\:/” at REQUEST_URI [severity “EMERGENCY”] [hostname “benconley.net”] [uri “/index.php?option=com_anjel&Itemid=&mosConfig_absolute_path=http://www.humanesociety.com/media/phpbo.do???”][/quote]
What I want to do is grep through the error log looking for “[client” and read from there up to the first occurrence of “]”, or perhaps “] mod_security: Access denied with code 503”, in order to locate just this type of error. Some of the IP addresses appear quite frequently and I would like to get a list of the unique IPs. Then I can count them to identify the ones that show up most frequently and possibly block them in my .htaccess file.

Keep in mind that I am a regex novice and this will probably be exceedingly simple for some of you.

I started with this: "grep '\[client ' error.log", which worked, but it returned the entire line. Next I went on to "grep '\[client \]*' error.log", which also returned the entire line, and "grep '\[client*\]' error.log", which failed to return anything at all.

Am I headed in a useful direction or am I just completely off base here? Any input would be greatly appreciated.


#2

I’m no command-line ninja, but this is relatively easy to do with Perl, PHP, or the like.

Perl:

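[code]#!/usr/local/bin/perl
use strict;
use warnings;

my %statistics;

# Read the log file(s) named on the command line, one line at a time.
while (my $line = <>) {
    # Only count the mod_security 503 denials.
    next if index($line, 'mod_security: Access denied with code 503.') < 0;

    # Capture the timestamp, log level, and client IP from the bracketed fields.
    my ($timestamp, $category, $ip_address) =
        $line =~ /\[(.+?)\] \[(.+?)\] \[client (.+?)\]/;
    next unless defined $ip_address;
    $statistics{$ip_address}++;
}

# Print each IP with its hit count, most frequent first.
foreach my $ip_address (sort { $statistics{$b} <=> $statistics{$a} } keys %statistics) {
    print "$ip_address $statistics{$ip_address}\n";
}
[/code]
Save it to a file, for example /home/username/path/script.pl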
Make the permissions 755 (chmod 755 /home/username/path/script.pl).
Then from the shell you can execute:
/home/username/path/script.pl error.log

or

~/path/script.pl error.log
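
For anyone who would rather stay on the command line, the same tally can probably be done with a grep/sed pipeline. This is only a minimal sketch: it assumes the log lines look like the two quoted in the first post, and count_blocked_ips is a made-up helper name, not part of any tool. Note that the literal "[" has to be escaped as "\[" in the pattern, otherwise it starts a bracket expression.

```shell
# Hypothetical helper: tally mod_security 503 denials per client IP.
# Usage: count_blocked_ips error.log
count_blocked_ips() {
    # 1. Keep only the mod_security 503 lines (-F = fixed string, no regex).
    # 2. Print just the text between "[client " and the next "]".
    # 3. Count duplicate IPs and list the most frequent first.
    grep -F 'mod_security: Access denied with code 503' "$1" |
        sed -n 's/.*\[client \([^]]*\)].*/\1/p' |
        sort | uniq -c | sort -rn
}
```

Each output line is a hit count followed by the IP address, so the most frequent offenders end up at the top of the list.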



#3

You rock. Thanks a lot for the feedback. I can’t wait to get home and try it.