

PHP Server Side Scripting Forum

    
Preg match puzzler
Elric99




msg:4232348
 1:08 pm on Nov 19, 2010 (gmt 0)

Hi, I've got a string of text like:

"A generous 4GB Memory ensures multitasking isn't a problem while the 500GB Hard Drive provides you with vast amounts of storage space for all your multimedia files..."

And I want to extract the hard drive size with preg_match in PHP.

if(preg_match("/( [0-9]GB)(.+?)GB Hard Drive/", $description,$hdd)) {
print_r($hdd);
}

I can't figure out how to do it, despite reading multiple regexp tutorials. Does anyone know how to code this?

Thanks

Tom
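One way to see why the attempt above is puzzling is to run it against the sample sentence and look at each capture group. The sketch below is only an illustration: it reuses the $description and $hdd names from the post and assumes the sentence contains no line breaks. The pattern does match, but the first group grabs the memory size and the lazy (.+?) swallows everything up to the drive's "GB".

```php
<?php
// Sample sentence from the question.
$description = "A generous 4GB Memory ensures multitasking isn't a problem while the 500GB Hard Drive provides you with vast amounts of storage space for all your multimedia files...";

// The pattern from the question, unchanged.
$matched = preg_match("/( [0-9]GB)(.+?)GB Hard Drive/", $description, $hdd);

// Group 1 is " 4GB" (space, one digit, "GB"): the memory, not the drive.
// Group 2 is everything between that and "GB Hard Drive", so it only
// ends with the drive size instead of isolating it.
var_dump($matched);
var_dump($hdd[1]);
var_dump($hdd[2]);
```

So the drive size is in there, but only as the tail end of the second group, which is why print_r($hdd) looks wrong.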

 

Matthew1980




msg:4232508
 8:25 pm on Nov 19, 2010 (gmt 0)

Hi there Elric99,

I had a go at this. Regex isn't my strong suit, but as far as I can see this does the trick. The only catch is that it only handles 3-digit values; I haven't managed to work out how to handle terabyte versions.

Here's what I came up with!

$string = "A generous Memory 4GB ensures multitasking isn't a problem while the 300GB Hard Drive provides you with vast amounts of storage space for all your multimedia files...";

if (preg_match("/([0-9]{3}GB Hard Drive)/", $string, $result)) {
    echo "Matched<br />";
    echo $result[1];
} else {
    echo "Not Matched";
}


Hope that helps. I would be interested to see any other approaches that are offered, as my regex knowledge is currently limited.

Cheers,
MRb

bedlam




msg:4232534
 9:25 pm on Nov 19, 2010 (gmt 0)

No doubt somebody will come along and correct me, but I don't see a lot of room for improvement in Matthew's version. However, I think the following is very slightly better if what's wanted is just the numeric value (and it anticipates the possibility of a space between the value and the units, and the possibility of the units being lower or mixed case):

/([0-9]{3})\s*GB/i

This says 'match any string of three digits followed by zero or more spaces and the letters "GB" no matter their case'.

-- b
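As a rough check of the expression above, here is a minimal sketch that runs it against the 300GB sample string from Matthew1980's post. The $pattern and $size names are just for illustration. Because the pattern insists on exactly three digits, "4GB" is skipped and the first hit is the drive.

```php
<?php
// Sample string from Matthew1980's post (300GB drive, 4GB memory).
$string = "A generous Memory 4GB ensures multitasking isn't a problem while the 300GB Hard Drive provides you with vast amounts of storage space for all your multimedia files...";

// Exactly three digits, optional whitespace, then "GB" in any letter case.
$pattern = '/([0-9]{3})\s*GB/i';

$size = null;
if (preg_match($pattern, $string, $result)) {
    $size = $result[1]; // group 1 holds only the digits, without the unit
}

echo $size, "\n"; // prints 300
```

Note that nothing here ties the number to the words "Hard Drive", so a three-digit memory size appearing earlier in the text would be returned instead.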

Readie




msg:4232701
 1:07 pm on Nov 20, 2010 (gmt 0)

The only alteration I would make to Matt's is to add multi-line and case insensitive flags, a variable number length for the hard drive size, and to allow for TB hard drives, like so:

if(preg_match('/([0-9]{1,3}[GT]B Hard Drive)/im', $string, $result)){


As "GB" is fairly commonly used, I suspect it could be picked up outside the context of hard drives, so I would say the words "Hard Drive" are required.
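A small sketch of how that pattern behaves, using made-up descriptions (the three strings below are assumptions for the example, not from the original post). As far as the flags go, i makes "hard drive" match in any case, while m only changes how ^ and $ behave, so it is harmless but has no effect on this particular pattern.

```php
<?php
// 1 to 3 digits, "GB" or "TB", then the literal words "Hard Drive".
$pattern = '/([0-9]{1,3}[GT]B Hard Drive)/im';

// Hypothetical product descriptions, invented for this sketch.
$descriptions = [
    "4GB Memory and a 500GB Hard Drive",
    "512GB Memory and a 2TB hard drive",
    "8GB Memory only",
];

$found = [];
foreach ($descriptions as $text) {
    // Store the captured text, or null when nothing matched.
    $found[] = preg_match($pattern, $text, $result) ? $result[1] : null;
}

print_r($found);
```

The memory sizes are ignored because they are not followed by "Hard Drive", and the last description yields null.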

bedlam




msg:4232749
 5:42 pm on Nov 20, 2010 (gmt 0)

Those are very good refinements. But about this:

As "GB" is fairly commonly used, I suspect it could be picked up outside the context of hard drives, so I would say the words "Hard Drive" are required.

My expression does include "Hard Drive", but Elric99 wanted to match "the hard drive size," so I'd say your sub-expression is still too inclusive.

Originally, I assumed the unit was known, but as you correctly pointed out, it could be GB or TB these days, so I think the unit may be required in the match. If we're dealing with a source whose form may change, I'd also say the optional space character between the numbers and the unit is required.

All that said, I'd only make the following very small modifications to your regex:

/([0-9]{1,3}\s*[GT]B) Hard Drive/im

Given the above expression and the string "lorem ipsum 300 GB Hard Drive dolor sit amet", $result[0] will still contain "300 GB Hard Drive", but $result[1] will contain "300 GB".

-- b
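The difference between the whole match and the capture group can be sketched like this, using the expression and test string from the post above (the variable names are only for illustration):

```php
<?php
// Only the number and unit sit inside the parentheses;
// " Hard Drive" must be present but is left out of the capture.
$pattern = '/([0-9]{1,3}\s*[GT]B) Hard Drive/im';
$string  = "lorem ipsum 300 GB Hard Drive dolor sit amet";

$result = [];
preg_match($pattern, $string, $result);

echo $result[0], "\n"; // whole match: 300 GB Hard Drive
echo $result[1], "\n"; // capture group 1: 300 GB
```

Moving the closing parenthesis is the whole change: the context is still required, but it no longer ends up in the extracted value.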

Matthew1980




msg:4233095
 7:19 pm on Nov 21, 2010 (gmt 0)

Hi all,

Well, I was hoping that my method would be improved a little. From my *very* limited knowledge of this syntax, I knew it could be improved. I had also thought about including TB. I left off multiline because the OP specified a single string, but I agree with case insensitivity.

I was just wanting to know how to be more 'flexible' when checking the digits, I was wanting to state '1 or more', but had no idea of the syntax.

I think Elric99 has something to go on now!

Cheers,
MRb
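On the "1 or more" syntax: in PCRE the + quantifier means "one or more", and {1,} is an equivalent spelling. The sketch below is one possible way to use it, not a definitive answer; extractDriveSize() is a hypothetical helper name, and the test strings are made up. It also shows why an upper limit such as {1,3} can quietly give a wrong answer on a four-digit size.

```php
<?php
// Hypothetical helper: returns [number, unit], or null if no drive size is found.
function extractDriveSize(string $text): ?array
{
    // [0-9]+ = one or more digits; the unit is captured separately.
    if (preg_match('/([0-9]+)\s*([GT]B) Hard Drive/i', $text, $m)) {
        return [(int) $m[1], strtoupper($m[2])];
    }
    return null;
}

$a = extractDriveSize("the 500GB Hard Drive provides");
$b = extractDriveSize("a 1500 gb hard drive");
$c = extractDriveSize("4GB Memory");

// For comparison: with {1,3} the engine cannot consume all four digits,
// so it matches starting at the second digit and reports "500GB".
preg_match('/([0-9]{1,3}\s*[GT]B) Hard Drive/i', "a 1500GB Hard Drive", $limited);

var_dump($a, $b, $c, $limited[1]);
```

With + the match starts at the first digit and takes them all, so the truncation cannot happen.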

rocknbil




msg:4234083
 4:35 pm on Nov 23, 2010 (gmt 0)

[0-9] is a range; you don't need to write out a character class. If you want digits, use \d. For old-timers' sake, throw M in there too for megabytes. :-)

80 GB Hard Drive
80GB H.D.
80 GB Disk
80 GB Storage
80 GB SCSI
80 GB Striped Raid

The list goes on, including the incorrect but possible:

80 G.B. Hard Drive

/(\d{1,3}\s*[GTM]\.?B\.?)\s+[\w\.]+/im
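To check that expression against the variants listed above, a minimal sketch follows (the variable names are assumptions for the example). It also runs the pattern on the sentence from the original question, which shows the trade-off: since any word is now accepted after the unit, the first hit there is the memory size.

```php
<?php
// 1 to 3 digits, optional whitespace, G/T/M, optional dots around "B",
// then whitespace and any run of word characters or dots.
$pattern = '/(\d{1,3}\s*[GTM]\.?B\.?)\s+[\w\.]+/im';

$variants = [
    "80 GB Hard Drive",
    "80GB H.D.",
    "80 GB Disk",
    "80 GB Storage",
    "80 GB SCSI",
    "80 GB Striped Raid",
    "80 G.B. Hard Drive",
];

$sizes = [];
foreach ($variants as $text) {
    // Keep the captured size for each variant, or null if it did not match.
    $sizes[$text] = preg_match($pattern, $text, $m) ? $m[1] : null;
}
print_r($sizes);

// Sentence from the original question: the pattern stops at the first size it sees.
$original = "A generous 4GB Memory ensures multitasking isn't a problem while the 500GB Hard Drive provides you with vast amounts of storage space";
preg_match($pattern, $original, $m);
$firstHit = $m[1];
echo $firstHit, "\n"; // prints 4GB
```

If only the drive is wanted from mixed text, one option is to loop over preg_match_all results and filter on the trailing word, or to keep a list of accepted words in the pattern instead of [\w\.]+.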

WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved