Regex help - matching IP ranges


#1

I am trying to add an exclusion to a bot trap script in order to avoid banning certain IPs/IP ranges (mostly WAP requests). My exclusion code (shown below) does not seem to correctly match the IP ranges. I would be most grateful if someone more familiar with regular expressions could tell me what I have done wrong and how to fix it.

IP ranges to exclude:
216.239.32.0 - 216.239.63.255
64.157.224.0 - 64.157.227.255

Code: [code]if ($visitor_ip =~ /^216.239.[32-63].[0-255]$|^64.157.22[4-7].[0-255]$/)
{
print "Content-type: text/html\n\n";
print "\n";
print "\n";
print "WAP devices - forward on\n

Error!
Please go back and skip the first link in order to continue.

\n"; exit; }[/code]Many thanks in anticipation, marsbar

#2

You can either rewrite those expressions to do what you want, now that you know how a character class works, or you can use a different method.
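To make the character-class point concrete, here is a minimal sketch of what the first half of the expression from post #1 actually matches. A class such as [32-63] lists single characters ('3', the range '2' to '6', and '3' again), not the numeric range 32 to 63. The sample addresses are only illustrations.

```perl
use strict;
use warnings;

# First alternative of the expression from post #1, unchanged.
# [32-63]  is one character out of: 2 3 4 5 6
# [0-255]  is one character out of: 0 1 2 5
my $original = qr/^216.239.[32-63].[0-255]$/;

for my $ip ('216.239.3.1', '216.239.32.0', '216.239.63.255') {
    printf "%-16s %s\n", $ip, ($ip =~ $original ? 'matches' : 'no match');
}
```

Only the first address matches, even though it lies outside the intended range, while both ends of the intended range are rejected.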

See the [url=http://]NetAddr::IP[/url] module, specifically the 'contains' or 'within' methods:

[code]use NetAddr::IP;

# Address ranges to look for, written as network/mask.
my @addresses = (
    NetAddr::IP->new('216.239.32.0/255.255.224.0'),   # 216.239.32.0 - 216.239.63.255
    NetAddr::IP->new('64.157.224.0/255.255.252.0'),   # 64.157.224.0 - 64.157.227.255
);

my $banned = 0;
my $visitor_address = NetAddr::IP->new($visitor_ip);
foreach my $banned_address (@addresses) {
    # within() takes the enclosing network as its argument.
    if ($visitor_address->within($banned_address)) {
        $banned = 1;
        last;
    }
}[/code]
This module might make it easier to manage if you have to update the address ranges you are looking for.

:cool:  Perl / MySQL / HTML+CSS



#3

Many thanks for coming to my aid again, Atropos7!

Oh dear… it is no wonder the script is still banning WAP requests.

Earlier, after doing some more reading on regular expressions, I re-wrote the expression. Would you mind checking if it now correctly matches the IP ranges (216.239.32.0 - 216.239.63.255 and 64.157.224.0 - 64.157.227.255), please?

[quote][code]use NetAddr::IP;
my @addresses = (
    NetAddr::IP->new('216.239.32.0/255.255.224.0'),
    NetAddr::IP->new('64.157.224.0/255.255.252.0'),
);

my $banned = 0;
my $visitor_address = NetAddr::IP->new($visitor_ip);
foreach my $banned_address (@addresses) {
    if ($visitor_address->within($banned_address)) {
        $banned = 1;
        last;
    }
}[/code][/quote]
Thank you for suggesting an alternative, a potential time-saver.

I have tried to understand the CPAN documentation for NetAddr::IP, but it is too advanced for a beginner. I have trouble fully understanding your code snippet and would be most obliged if you could provide me with some explanatory notes and/or instructions.

With thanks,
marsbar


#4

Well, an IP address is a 32-bit number, and the address space is split into networks and subnetworks, so you need a ‘mask’ to group sets of addresses. An example in CIDR format is
192.168.0.1/19

where the /19 is the mask and means that the 19 most significant bits determine the network. It might be easier for some to split that mask into 4 bytes (8-bit values), hence
192.168.0.1/255.255.224.0

where 255 = all 8 bits set and 224 = the 3 most significant bits set (128 + 64 + 32 = 224).
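The mask arithmetic can be sketched in plain Perl as follows. This is only an illustration of the idea, not how NetAddr::IP is implemented; the helper names prefix_to_mask and in_network are made up for this example, and the core Socket module is used only to convert between dotted quads and packed addresses.

```perl
use strict;
use warnings;
use Socket qw(inet_aton inet_ntoa);

# Turn a prefix length (e.g. 19) into a dotted mask (e.g. 255.255.224.0).
sub prefix_to_mask {
    my ($prefix) = @_;
    my $mask = $prefix == 0 ? 0 : (0xFFFFFFFF << (32 - $prefix)) & 0xFFFFFFFF;
    return inet_ntoa(pack('N', $mask));
}

# True if $ip falls inside the network $net/$prefix:
# both addresses must be equal once the host bits are masked off.
sub in_network {
    my ($ip, $net, $prefix) = @_;
    my $mask = unpack('N', inet_aton(prefix_to_mask($prefix)));
    return (unpack('N', inet_aton($ip))  & $mask)
        == (unpack('N', inet_aton($net)) & $mask);
}

print prefix_to_mask(19), "\n";   # 255.255.224.0
print in_network('216.239.40.7', '216.239.32.0', 19) ? "inside\n" : "outside\n";
```

With a /19, the third byte of the address is ANDed with 224, so every value from 32 to 63 collapses to 32, which is why the whole 216.239.32.0 - 216.239.63.255 range counts as one network.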

[code]my @addresses = (
    NetAddr::IP->new('216.239.32.0/255.255.224.0'),
    NetAddr::IP->new('64.157.224.0/255.255.252.0'),
);[/code]
We’re creating an array and initializing it with two elements. The elements are objects, created by calling the new() method of the class NetAddr::IP and passing it the address and mask as an argument.

[code]my $banned = 0;
my $visitor_ip = $ENV{REMOTE_ADDR}; # example: '127.0.0.1'
my $visitor_address = NetAddr::IP->new($visitor_ip);[/code]
$banned is a boolean. We are setting it to “false” here.
$visitor_address is an object representing the IP address of the visitor.

[code]foreach my $banned_address (@addresses) {
    if ($visitor_address->within($banned_address)) {
        $banned = 1;
        last;
    }
}[/code]
This iterates over our list of banned addresses; $banned_address is the iterator.
On each pass, we check whether the visitor is from a banned address space. To do this we call the within() method on the visitor address object, passing the banned address object as an argument, i.e. “is the visitor within a banned address range?” If this method returns “true”, we set our boolean $banned to “true”, then break out of the loop. When the loop is done, if $banned is still “false”, then the visitor is not banned.



#5

Thanks awfully, Atropos7, for the simple, plain English explanation. :slight_smile:

Just going back to regex, would you mind checking if I have got the IP ranges correctly matched this time, please?

IP ranges to match 216.239.32.0 - 216.239.63.255 and 64.157.224.0 - 64.157.227.255

Code:

[code]($visitor_ip =~ /^216\.239\.(3[2-9]|4[0-9]|5[0-9]|6[0-3])\.(\d|[1-9]\d|1\d\d|2[0-4]\d|25[0-5])$|^64\.157\.22[4-7]\.(\d|[1-9]\d|1\d\d|2[0-4]\d|25[0-5])$/)[/code]
Many thanks, again, for your time and expert help,
marsbar


#6

Net::CIDR might be of help too if you can do this within a Perl script. Doing this with regexps should be a last resort. But speaking of this (and by way of an example), check out the ugly pcre regexps I wrote to match our IP ranges:

66.33.(19[2-9]|2([01][0-9]|2[0-3])).\d{1,3}
205.196.2(0[89]|1[0-9]|2[0-3]).\d{1,3}

(for 66.33.192.0/19 and 205.196.208.0/20)

This might match some stuff outside of those ranges, but I don’t think it should match any /valid/ IP address outside of those ranges.
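One way to check that claim for the third octet is a small sketch like the one below. It lifts only the third-octet alternations out of the two expressions and adds ^ and $ anchors (an assumption made for this test, since the expressions as posted are not anchored), then lists which values from 0 to 255 each one accepts.

```perl
use strict;
use warnings;

# Third-octet alternations taken from the two expressions above,
# anchored here so that each one is tested against a whole octet.
my %octet_re = (
    '66.33.192.0/19'   => qr/^(19[2-9]|2([01][0-9]|2[0-3]))$/,
    '205.196.208.0/20' => qr/^2(0[89]|1[0-9]|2[0-3])$/,
);

for my $range (sort keys %octet_re) {
    my @accepted = grep { $_ =~ $octet_re{$range} } 0 .. 255;
    printf "%-17s third octet %d-%d (%d values)\n",
        $range, $accepted[0], $accepted[-1], scalar @accepted;
}
```

The accepted values come out as 192-223 for the /19 and 208-223 for the /20, which agrees with the intended ranges. The unescaped dots and the \d{1,3} for the last octet are what allow the extra, non-address matches mentioned above.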


#7

Well, it looks OK, but if I were you, I’d try writing a test case by looping through the last two bytes of each address range and making sure the regexp matches where it is supposed to. Out of 65,536 iterations, the first range should have 8,192 matches and the second should have 1,024 matches.
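Such a test case might look like the sketch below. It uses the expression from post #5 with every literal dot escaped; the helper name count_matches is made up for this example.

```perl
use strict;
use warnings;

# The rewritten expression from post #5, with every literal dot escaped.
my $re = qr/^216\.239\.(3[2-9]|4[0-9]|5[0-9]|6[0-3])\.(\d|[1-9]\d|1\d\d|2[0-4]\d|25[0-5])$|^64\.157\.22[4-7]\.(\d|[1-9]\d|1\d\d|2[0-4]\d|25[0-5])$/;

# Count matches over every value of the last two bytes (65,536 addresses)
# for a given pair of leading bytes such as '216.239'.
sub count_matches {
    my ($first_two) = @_;
    my $count = 0;
    for my $third (0 .. 255) {
        for my $fourth (0 .. 255) {
            $count++ if "$first_two.$third.$fourth" =~ $re;
        }
    }
    return $count;
}

printf "216.239.x.x: %d matches\n", count_matches('216.239');   # expect 8192
printf "64.157.x.x:  %d matches\n", count_matches('64.157');    # expect 1024
```

If either count differs from the expected figure, the expression is matching too much or too little somewhere in that range.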

But really, you should use something other than regexp; it will be quite costly if you need to match more than a couple of addresses, not to mention harder to maintain.



#8

A /20 would be 4096 matches.
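The figures quoted in this thread follow from the prefix length: a /n leaves 32 - n host bits, so it covers 2^(32 - n) addresses. A tiny sketch, with a helper name chosen only for this example:

```perl
use strict;
use warnings;

# Number of addresses covered by an IPv4 prefix of the given length.
sub addresses_in_prefix {
    my ($prefix) = @_;
    return 2 ** (32 - $prefix);
}

printf "/%d covers %d addresses\n", $_, addresses_in_prefix($_) for (19, 20, 22);
```

That gives 8192 for a /19, 4096 for a /20, and 1024 for a /22.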

Using regexp is the only option here really… I use it to do checks within Postfix for hosts sending HELO “from” our IPs. I suppose once we get Postfix 2 running it might be possible to write a policy daemon, but this works pretty well for the purpose…