I've been trying to write some code to (1) remove a chunk of HTML from a string and (2) to grab all the URLs from a bunch of markup whilst distinguishing between links and images. I'm totally new to regex patterns and I really have no idea what I'm doing!
The regex I'm using to find the markup to be removed is as follows:
<p class="submission">\\s*<a href="http://www\\.example\\.com/deviation/\\d*/">
<span class="shadow"><img src="http://thumbs\\.example\\.com/\\w*/\\w*\\.example\\.com/
[a-z]*/\\d+/\\d+/\\d/[a-z]*/\\w*\\.\\w*"
width="\\d*" height="\\d*" alt="[a-z0-9]*" /></span></a>\\s*</p>
I have also tried using this regex pattern:
<p class="submission">\s*<a href="http://www\.example\.com/deviation/\d*/">
<span class="shadow"><img src="http://thumbs\.example\.com/\w*/\w*\.example\.com/
[a-z]*/\d+/\d+/\d/[a-z]*/\w*\.\w*"
width="\d*" height="\d*" alt="[a-z0-9]*" /></span></a>\s*</p>
I've tried using both preg_replace() and eregi_replace() but neither works. Is there anything obviously wrong with the patterns? What's more, I'd prefer to match the second URL using only the [thumbs.example.com...] part, so that whatever follows it in the path doesn't matter.
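A sketch of one way the removal could be written, offered as a guess at the cause rather than a diagnosis. preg_replace() expects the pattern to be wrapped in delimiters, and a pattern that begins with < is likely to be read as using <...> as its delimiters, which cuts it off at the first >. eregi_replace() uses POSIX syntax, which does not understand \d, \s or \w. The sample markup, its path segments and the variable names below are made up for illustration.

```php
<?php
// Hypothetical sample markup; the URL path and attribute values are invented.
$html = <<<HTML
<div>
<p class="submission">
<a href="http://www.example.com/deviation/12345/"><span class="shadow"><img src="http://thumbs.example.com/i/2004/234/5/abc/photo.jpg" width="150" height="100" alt="photo" /></span></a>
</p>
<p>Keep this paragraph.</p>
</div>
HTML;

// '#' is the delimiter, so the slashes in the URLs need no escaping.
// [^"]* accepts anything after thumbs.example.com/ up to the closing quote,
// and [^>]* skips the remaining img attributes (width, height, alt).
// \s* and \s+ tolerate spaces or line breaks between the tags.
$pattern = '#<p\s+class="submission">\s*'
         . '<a\s+href="http://www\.example\.com/deviation/\d+/">\s*'
         . '<span\s+class="shadow">\s*'
         . '<img\s+src="http://thumbs\.example\.com/[^"]*"[^>]*>\s*'
         . '</span>\s*</a>\s*</p>#i';

// Replace every matching block with an empty string.
$cleaned = preg_replace($pattern, '', $html);

echo $cleaned;
```

The pattern is built from single-quoted strings, so each backslash is written once. Inside a double-quoted PHP string the doubled form (\\d, \\s) would be the safer choice.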
As I mentioned earlier, I'm also trying to extract the URLs in a bunch of HTML. I've managed to get a regex pattern to work for links [using preg_match_all() to get the data] but I also seem to be saving the rest of the link tag to the array, which I don't want to do. I can't seem to get the image one to work quite right but in contrast to the links, I'd like to save any height and width properties in the HTML tag to the array so I can grab it and use it. Here are my patterns for this:
LINKS:
/(href[= = ])(.*?)(>)(.*?)(<\/a>+)/i
IMAGES:
/(img[= = ])(.*?)(>)(.*?)(</>+|>)/i
As you can probably tell from the images pattern, I have no idea. If anyone could give me a hand I'd really appreciate it. And sorry this has been so long-winded ;).
Cheers,
AlanC
<edit>Fixed sidescroll</edit>
[edited by: coopster at 3:52 pm (utc) on Aug. 21, 2004]
[edit reason] generalized urls [/edit]
Regexes can be daunting but hang in there. Some previous threads that are relevant to your question may get you started.
Regex for matching links [webmasterworld.com]
Regular Expressions [webmasterworld.com]
Also, there are a couple of regular expression tutorials in Learning PHP - Books, Tutorials and Online Resources [webmasterworld.com].
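For the URL extraction part of the question, here is a rough sketch of one possible approach, not the only one. It captures just the quoted href value for links, and for images it first grabs each whole img tag and then pulls src, width and height out of it one at a time, so the order of the attributes doesn't matter. The sample markup and the variable names ($links, $images) are made up for illustration.

```php
<?php
// Hypothetical sample markup for illustration only.
$html = '<p><a href="http://www.example.com/page.html" title="x">A page</a> '
      . '<img src="http://www.example.com/pic.jpg" width="120" height="80" alt="pic" /> '
      . '<a href=\'/relative/link\'>Another</a></p>';

// Links: group 1 is the opening quote, group 2 is the URL itself.
// \1 requires the same quote character to close the value.
preg_match_all('#<a\s[^>]*?href\s*=\s*(["\'])(.*?)\1#is', $html, $m);
$links = $m[2];

// Images: find each complete <img ...> tag first.
$images = [];
preg_match_all('#<img\s[^>]*>#i', $html, $tags);
foreach ($tags[0] as $tag) {
    // Attributes that are missing from a tag stay null.
    $image = ['src' => null, 'width' => null, 'height' => null];
    foreach (array_keys($image) as $attr) {
        if (preg_match('#\s' . $attr . '\s*=\s*(["\'])(.*?)\1#is', $tag, $a)) {
            $image[$attr] = $a[2];
        }
    }
    $images[] = $image;
}

print_r($links);
print_r($images);
```

Only quoted attribute values are handled here; unquoted values such as width=120 would need an extra alternative in the pattern. Regex-based HTML parsing tends to be brittle on untidy markup, so PHP's DOM extension may be worth a look if the input is not under your control.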