regex: extract email embedded images

     
10:06 pm on Oct 16, 2013 (gmt 0)

Junior Member


Hi, I need some help.

How do I extract the src values using preg_match_all() from a string like this?

d asda dsa as <img border=0 width=1024 height=291 id="_x0000_i1025" src="cid:image001.jpg@01CECA91.641305C0">d asda dsa das<br>d asd adsa a <img border=0 width=1024 height=727 id="_x0000_i1026" src="cid:image002.jpg@01CECA91.641305C0">d asdada dasd asd sad d ads ada


Thanks in advance for your answer. :)
10:32 am on Oct 17, 2013 (gmt 0)

lucy24 (Senior Member, US)


src ?= ?"([^"]+)"

The src value is the part captured by the parentheses.

How consistent is the overall pattern? I'm seeing

<img border=0 width=\d+ height=\d+ id="\w+" src="cid:\w+\.jpg@\w+\.\w+">

but I don't know how much of that would remain accurate in other cases. Is it always @01CECA91.641305C0?
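
For anyone wanting to try the short pattern above with preg_match_all(), here is a minimal sketch. The function name extractSrcValues() and the "/" delimiters are my own choices, not something from the posts; it assumes the src value is always in double quotes, as in the sample string.

```php
<?php
// Hypothetical helper name, chosen for illustration only.
function extractSrcValues(string $html): array
{
    // The suggested pattern, wrapped in "/" delimiters as PCRE requires.
    // Group 1 captures everything between the double quotes.
    $pattern = '/src ?= ?"([^"]+)"/';

    preg_match_all($pattern, $html, $matches);

    // $matches[0] holds the full matches, $matches[1] the captured values.
    return $matches[1];
}

// The sample string from the question, shortened around the two <img> tags.
$html = 'd asda <img border=0 width=1024 height=291 id="_x0000_i1025" '
      . 'src="cid:image001.jpg@01CECA91.641305C0">d asda<br>'
      . 'd asd <img border=0 width=1024 height=727 id="_x0000_i1026" '
      . 'src="cid:image002.jpg@01CECA91.641305C0">d asdada';

print_r(extractSrcValues($html));
```

Note that this version matches any src attribute, not only those inside an <img> tag, and it allows at most one space on either side of the "=".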
2:45 pm on Oct 17, 2013 (gmt 0)

swa66 (Senior Member)


You probably need to allow for multiple spaces and/or newlines between the "src" and the "=" (and likewise after the "="), and you probably also need to keep it from picking up <script src="javahere.js">, so make sure the match is inside an <img> tag.
It's all essentially the same technique: you specify what's allowed to match, and the part(s) between parentheses determine what you get as "output".
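
One possible way to cover both points (flexible whitespace around the "=", and matching only inside <img> tags) is sketched below. The function name extractImgSrc() is hypothetical, and the pattern is just one reasonable choice: it assumes double-quoted values and no ">" character inside any attribute value.

```php
<?php
// Hypothetical helper name, chosen for illustration only.
function extractImgSrc(string $html): array
{
    // <img\b      start of an img tag (the i flag makes it case-insensitive)
    // [^>]*?      any other attributes, without leaving the tag
    // \ssrc       the src attribute, preceded by whitespace so that
    //             something like data-src is not picked up
    // \s*=\s*     "=" with any amount of spaces or newlines on either side
    // "([^"]+)"   the double-quoted value, captured as group 1
    $pattern = '/<img\b[^>]*?\ssrc\s*=\s*"([^"]+)"/i';

    preg_match_all($pattern, $html, $matches);

    return $matches[1];
}

// A script tag that should be ignored, then an img tag with odd spacing.
$html = "<script src=\"javahere.js\"></script>\n"
      . "<img border=0 width=1024\n  src  =\n  \"cid:image001.jpg@01CECA91.641305C0\">";

print_r(extractImgSrc($html));
```

Single-quoted or unquoted src values would need an extended pattern; this sketch leaves them out to stay close to the sample in the question.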