PHP pattern matching


#1

OK, I got past my problem with file_get_contents (http://discussion.dreamhost.com/showthreaded.pl?Cat=&Board=forum_programming&Number=62841) by using curl instead. But I can’t figure out pattern matching with eregi. I’d just like to pull the dates and flows into arrays to plot with my data, but in this test script nothing I’ve tried for $pattern grabs the flows. Is the problem eregi or something else?

Prosser Flow <?php

$theurl="http://www.usbr.gov/pn-bin/yak/arc3.pl"
."?station=YRPW&year=2006&month=4&day=1&year=2006&month=7&day=31&pcode=QD";

// if (!($contents = file_get_contents($theurl)))
//{
// echo 'Could not open URL';
// exit;
// }
$ch = curl_init();
$timeout = 5; // set to zero for no timeout
curl_setopt ($ch, CURLOPT_URL, $theurl);
curl_setopt ($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt ($ch, CURLOPT_CONNECTTIMEOUT, $timeout);
$contents = curl_exec($ch);
curl_close($ch);

// display file
echo $contents;

// find the part of the page we want and output it
//$pattern = '([0-9]+.[0-9]+)';
// $pattern = '(2122.08)';
//$pattern = '(^[0-9]+.[0-9]+$)';
//$pattern = '[[:digit:]]{4}.[[:digit:]]{2}';
$pattern = '^[[:digit:]]{4}.[[:digit:]]{2}$';
if (eregi($pattern, $content, $flow))
{
echo "

$flow is: ";
echo $flow[1];
echo '

';
}
else
{
echo '

Nothing matched

';
};

?>

This signature line intentionally blank.


#2

I see a typo:
if (eregi($pattern, $content, $flow))
should be:
if (eregi($pattern, $contents, $flow))
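A couple of other things may also be getting in the way, as far as I can tell. With eregi, ^ and $ anchor to the whole string rather than to each line, so the anchored pattern can only match if the entire page is a single number. Also, $flow[1] is only filled in when the pattern has a parenthesized group ($flow[0] holds the full match), and an unescaped . matches any character. Note too that the ereg family was deprecated in PHP 5.3 and removed in PHP 7. Below is a minimal sketch that swaps in preg_match_all instead. The sample text is hypothetical: it assumes each data line is an MM/DD/YYYY date followed by spaces and a decimal flow, which may not be exactly what the real feed returns.

```php
<?php
// Hypothetical sample standing in for the curl response ($contents);
// the real feed layout may differ.
$contents = <<<TXT
BEGIN DATA
DATE       ,     QD
04/01/2006     2122.08
04/02/2006     2250.51
04/03/2006     998.40
END DATA
TXT;

// One capture group per field. The "m" modifier makes ^ and $ match at
// every line boundary instead of only at the ends of the whole string.
$pattern = '/^(\d{2}\/\d{2}\/\d{4})[ \t]+(\d+\.\d+)[ \t]*\r?$/m';

$dates = [];
$flows = [];
if (preg_match_all($pattern, $contents, $matches, PREG_SET_ORDER)) {
    foreach ($matches as $match) {
        $dates[] = $match[1];          // first group: the date string
        $flows[] = (float) $match[2];  // second group: the flow as a number
    }
}

print_r($dates);
print_r($flows);
```

The header and marker lines simply fail to match, so they never reach the arrays. If the real lines carry extra columns or flags, the pattern would need adjusting.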
Silk



#3

I think that maybe you’re approaching the task from too complicated a direction. Regular Expressions are very powerful and vital when working with huge data sets. They can also be notoriously problematic when working with data whose formatting isn’t 100% guaranteed. In my experience there is almost ALWAYS chaff in my data wheat. I wrote this script that worked just fine with this particular data sample. Maybe you’ll find it useful.

Yeah, I might have pegged the needle on the dorkness meter when I chose to do this for some sort of obscure entertainment, albeit entertainment that is useful for strengthening a skill set.

[code]<?php
$theurl="http://www.usbr.gov/pn-bin/yak/arc3.pl"
."?station=YRPW&year=2006&month=4&day=1&year=2006&month=7&day=31&pcode=QD";

$ch = curl_init();
$timeout = 5; // set to zero for no timeout
curl_setopt ($ch, CURLOPT_URL, $theurl);
curl_setopt ($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt ($ch, CURLOPT_CONNECTTIMEOUT, $timeout);
$contents = curl_exec($ch);
curl_close($ch);

/******************************************
// I chose to save the content off to a file while working with it, rather than keep hitting the server

file_put_contents("/var/www/curl/sess.txt", $contents);
$contents = file_get_contents("/var/www/curl/sess.txt");

*******************************************/

// The feed appears to have useful parsing point identifiers.
// I just used them to get everything between them (including flags)
$cStartStr = "BEGIN DATA";
$cEndStr = "END DATA";
$cPageTail = stristr($contents, $cStartStr);
$nUsefulDataEndPos = strpos($cPageTail, $cEndStr);
$cUsefulData = substr($cPageTail, 0, $nUsefulDataEndPos);

// explode the content using newlines as delimiters
$aContents = explode(chr(10), $cUsefulData);

// I'll be putting the line items into an array. Two array types are used, choose one according to your preference
$aDateQD1 = array();
$aDateQD2 = array();

// skip the leading and trailing junk
// Prolly don’t want to do all these assignments in the loop. They’re just used for readability
for ($i=3; $i<count($aContents)-1; $i++) {

// Dates are formatted as 10 characters
$cDateStr = substr($aContents[$i],0,10);

// QD is everything in the trimmed value after the last space
// (trim once and reuse it, so the offset and the substring refer to the same string)
$cLine = trim($aContents[$i]);
$nQDVal = substr($cLine, strrpos($cLine, chr(32))+1);

// put the QD values into an array keyed with the date string
$aDateQD1[ $cDateStr ] = $nQDVal;

//put each date/QD combination into their own individual array elements
$aDateQD2[] = array($cDateStr, $nQDVal);
}

//peep scene
echo('<pre>');
print_r($aDateQD1);
print_r($aDateQD2);
echo('</pre>');

?>
[/code]
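For anyone who wants to try the marker-slicing approach without hitting the server, here is a rough sketch of the same idea wrapped in a function and run against an inline sample. parseFlowBlock() is a hypothetical helper name, and the sample text (including its three leading lines of marker and header junk) is made up for illustration; the real feed may look different. It also returns an empty array when either marker is missing, which the script above does not check for.

```php
<?php
/**
 * Hypothetical helper: slice out the text between two markers, then split
 * each data line into a date (first 10 characters) and a flow value
 * (everything after the last space).
 */
function parseFlowBlock($contents, $startMarker = 'BEGIN DATA', $endMarker = 'END DATA', $skipLines = 3)
{
    $tail = stristr($contents, $startMarker);
    if ($tail === false) {
        return array();                 // start marker not found
    }
    $endPos = strpos($tail, $endMarker);
    if ($endPos === false) {
        return array();                 // end marker not found
    }

    $lines  = explode("\n", substr($tail, 0, $endPos));
    $result = array();
    foreach (array_slice($lines, $skipLines) as $line) {
        $line = trim($line);            // also drops a trailing "\r"
        if ($line === '') {
            continue;                   // skip blank lines
        }
        $lastSpace = strrpos($line, ' ');
        if ($lastSpace === false) {
            continue;                   // no separator, not a data line
        }
        $result[substr($line, 0, 10)] = substr($line, $lastSpace + 1);
    }
    return $result;
}

// Made-up sample; the marker line plus two header lines are skipped.
$sample = <<<TXT
<PRE>
BEGIN DATA
DATE       ,     QD
---------- , --------
04/01/2006     2122.08
04/02/2006     2250.51
END DATA
</PRE>
TXT;

$parsed = parseFlowBlock($sample);
print_r($parsed);
```

The values come back as strings keyed by date, like the first array in the script above; cast them with (float) before plotting if numbers are needed.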


#4

Hey, “dorkness” be damned, that was a good and useful exercise, and no doubt very helpful to the original poster. Good job! :)

–rlparker


#5

Some people, when confronted with a problem, think “I know, I’ll use regular expressions.” Now they have two problems.
jwz


If you want useful replies, ask smart questions.