Regex help please!
|
Matthew1980

msg:4273807 | 11:08 am on Feb 28, 2011 (gmt 0) | Hello all,

src="http://subsub.sub.sub.adomain.com/path/to/123456.text.gif"

Given an attribute like the one above, I would like to know what regex would capture the filename, whatever it happens to be, and also retrieve the file type, so that I have a file name to work with. I can then feed that into another function and run a cron job to periodically 'check' this file.

If anyone has a better suggestion, please let me know. Hope that makes sense.

Cheers,
MRb
|
g1smd

msg:4274122 | 8:36 pm on Feb 28, 2011 (gmt 0) | The filename will always be after the final slash. The extension will always be after the final period. There is no need for any (.*) patterns at all; something like (([^/]+/)*) will consume the folder structure. Once you have reached the filename, (([^.]+\.)+)([^./]+) will split it into the name part and the extension.
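For illustration, the approach above could be sketched in PHP roughly as follows. This is only a sketch under a few assumptions: the function name split_filename() is hypothetical, parse_url() is used first so that the "//" after the scheme does not get in the way of the folder pattern, and the name pattern is tightened slightly to [^./]+ so it cannot run across a slash.

```php
<?php
// Hypothetical helper (not from the thread): splits the final path segment
// of a URL into file name and extension using negated character classes.
function split_filename(string $url): ?array
{
    // Work on the path only, e.g. "/path/to/123456.text.gif".
    $path = parse_url($url, PHP_URL_PATH);
    if (!is_string($path)) {
        return null;
    }

    // ^/(?:[^/]+/)*      skips every folder up to the final slash
    // ((?:[^./]+\.)+)    captures the name parts, each ending in a period
    // ([^./]+)$          captures the extension after the final period
    $pattern = '#^/(?:[^/]+/)*((?:[^./]+\.)+)([^./]+)$#';
    if (preg_match($pattern, $path, $m) !== 1) {
        return null;
    }

    return [
        'filename'  => $m[1] . $m[2],       // full name, e.g. 123456.text.gif
        'basename'  => rtrim($m[1], '.'),   // name without the extension
        'extension' => $m[2],
    ];
}

$parts = split_filename('http://subsub.sub.sub.adomain.com/path/to/123456.text.gif');
```

If a regex is not strictly required, PHP's built-in pathinfo() applied to the same path should give comparable results with less effort.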
|
timster

msg:4274280 | 11:48 pm on Feb 28, 2011 (gmt 0) | Just to be sure we are on the same page, I am assuming you want to match the full filename and also the file extension from an image tag that points to a specific directory on a specific domain. Also note, this solution will only match the first image tag found in the supplied string.
preg_match( '/src\=\"http\:\/\/subsub\.sub\.sub\.adomain\.com\/path\/to\/([\w\.]+\.(\w+))\"/', $string, $matches );
$filename = $matches[1]; # Contents of the first backreference
$filetype = $matches[2]; # Contents of the second backreference
Note the parentheses:
([\w\.]+\.(\w+))
These define capture groups (often loosely called “backreferences”), whose contents are returned in the $matches array starting at index 1; index 0 holds the entire matched text.
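As a rough, runnable sketch of how the capture groups end up in the matches array, and of one way around the "first image tag only" limitation, something like the following could be tried. The sample input (including the second file name, 654321.jpg) is invented for illustration, and the pattern is the one from the post rewritten with "#" as the delimiter so the slashes need no escaping.

```php
<?php
// Sample input invented for illustration; the second file name is hypothetical.
$string = '<img src="http://subsub.sub.sub.adomain.com/path/to/123456.text.gif">'
        . '<img src="http://subsub.sub.sub.adomain.com/path/to/654321.jpg">';

// Same logic as the pattern in the post, using "#" as the delimiter.
$pattern = '#src="http://subsub\.sub\.sub\.adomain\.com/path/to/([\w.]+\.(\w+))"#';

// preg_match stops at the first tag: $first[0] is the whole match,
// $first[1] the full file name, $first[2] the extension.
$first = [];
preg_match($pattern, $string, $first);

// preg_match_all with PREG_SET_ORDER returns one such array per matching tag.
$all = [];
preg_match_all($pattern, $string, $all, PREG_SET_ORDER);

$files = [];
foreach ($all as $match) {
    $files[] = ['filename' => $match[1], 'filetype' => $match[2]];
}
```

Each entry of $files could then be handed to whatever function the cron job calls to check the file.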