Forum Moderators: coopster

Regex or StriStr? Extract values from URL

         

bgordon

7:27 pm on Dec 21, 2009 (gmt 0)

10+ Year Member



http://10.10.10.10/media/image.asp?id=61169&img=77_9_59_38-PMsomeimagename.jpg&w=640&h=480&client_id=989

Ok gurus. I figured I should be using Regex with a preg_match to pull out the image url but cannot find the right solution.

I need to get the "77_9_59_38-PMsomeimagename.jpg" out of the URL, essentially parsing everything between the "img=" and the "&w=". I have a bunch of these in an XML file I am parsing. I can download the file OK, but unless I can grab the filename out of the URL, I have to make up my own filename, and I cannot always assume it is a JPG. By pulling the filename value and extension out, I can download the file and save it using the original filename.

It would be extra special if I could grab the filename and extension separately in the regex, but I can always explode it and grab the values from the array.

What is the best way to do this? The image name could contain any combination of numbers, letters, underscores and dashes in any case...

Thanks for any help you can be!

IanKelley

7:51 pm on Dec 21, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



This expression is untested but should work:

&img=([^.]+)\.([^&]+)

This assumes that there will always be a file extension for the image.
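If the extension cannot be guaranteed, or a filename might contain more than one period, a non-regex route is also possible. The following is only a sketch, not something tested against the real feed: `splitImgParam` is a hypothetical helper name, and it assumes PHP 7.1 or later and that the parameter is always called `img`. It uses the built-in `parse_url()`, `parse_str()` and `pathinfo()` functions.

```php
<?php
// Hypothetical helper (not from the thread): split the img parameter
// of a URL into a base name and an extension without a regex.
function splitImgParam(string $url): ?array
{
    // parse_url() with PHP_URL_QUERY returns the query string,
    // or null when the URL has none (false if the URL is malformed).
    $query = parse_url($url, PHP_URL_QUERY);
    if (!is_string($query)) {
        return null;
    }

    // parse_str() decodes the query string into an associative array.
    parse_str($query, $params);
    if (!isset($params['img']) || !is_string($params['img'])) {
        return null;
    }

    // pathinfo() splits at the last period, so any earlier periods
    // stay in the base name. 'extension' is absent when there is none.
    $info = pathinfo($params['img']);

    return [
        'name'      => $info['filename'],
        'extension' => $info['extension'] ?? '',
    ];
}

$url = 'http://10.10.10.10/media/image.asp?id=61169&img=77_9_59_38-PMsomeimagename.jpg&w=640&h=480&client_id=989';
print_r(splitImgParam($url));
```

Unlike the pattern above, this does not depend on the order of the parameters, and a name without an extension comes back with an empty extension instead of failing to match.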

bgordon

8:07 pm on Dec 21, 2009 (gmt 0)

10+ Year Member



Dang... that is way shorter and simpler than I had envisioned! It works, but it also grabs the &img= as part of the capture... how do I exclude that while using it as the start point?

IanKelley

8:30 pm on Dec 21, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



You're probably accessing the first element of the output array? Instead, look at $output[1] and $output[2], which should contain the filename and the extension respectively.

IanKelley

8:34 pm on Dec 21, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Here, take a look at this example:
preg_match ('~&img=([^.]+)\.([^&]+)~','http://10.10.10.10/media/image.asp?id=61169&img=77_9_59_38-PMsomeimagename.jpg&w=640&h=480&client_id=989',$output);
print '<pre>'; print_r($output); print '</pre>';

bgordon

9:25 pm on Dec 21, 2009 (gmt 0)

10+ Year Member



Yep... you are right... they are in [1] and [2]. For my benefit and that of others reading this, how does this bit of regex capture this into three components? Could I trouble you to explain each part of that little jewel (not the PHP)?

IanKelley

10:30 pm on Dec 21, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



In a regex, a pattern surrounded by parentheses is captured as a subpattern. In preg_match, the first element of the output array holds the text matched by the complete pattern; subsequent elements hold the subpattern matches. That is why the &img= shows up in element [0] but not in [1] or [2].

So, for instance, the pattern [^.]+ matches one or more characters that are not a period. When a period is reached, the matching stops. Putting it in parentheses, ([^.]+), causes it to be saved as a subpattern match.
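To make the pieces easier to see, here is the same pattern spread over several lines with the x (extended) modifier, which lets whitespace and # comments sit inside the pattern. This is just an illustrative sketch: the group names `name` and `extension` are my own additions, not part of the original expression, and the numbered elements [1] and [2] are still filled in alongside them.

```php
<?php
$url = 'http://10.10.10.10/media/image.asp?id=61169&img=77_9_59_38-PMsomeimagename.jpg&w=640&h=480&client_id=989';

// Same expression as in the thread, annotated part by part.
$pattern = '~
    &img=                # literal text marking where the match starts
    (?<name>[^.]+)       # group 1: one or more characters that are not a period
    \.                   # a literal period (a bare period would match any character)
    (?<extension>[^&]+)  # group 2: one or more characters that are not an ampersand
~x';

if (preg_match($pattern, $url, $m) === 1) {
    echo $m[0], "\n";           // &img=77_9_59_38-PMsomeimagename.jpg
    echo $m['name'], "\n";      // 77_9_59_38-PMsomeimagename
    echo $m['extension'], "\n"; // jpg
}
```

Element [0] is the whole match, including the literal &img=, while each pair of parentheses adds one more element after it, in the order the opening parentheses appear.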