html pictures do not appear in email

         

jackvull

4:25 pm on Feb 2, 2010 (gmt 0)

10+ Year Member



I use curl to download some files and email them to myself. However, none of the images in the HTML can be viewed in the email. Maybe this is because of relative links in the HTML file? How can I change all the relative links through curl?

#! /usr/bin/php
<?php

//echo date("G");
if (date("G") >= 21 || date("G") < 13) {exit();}

$fh = fopen("/usr/local/sbin/myscripts/ITMS_PTWL.html", 'w') or die("can't open file fh");

//INIT CURL
$ch = curl_init();

// SET URL FOR THE POST FORM LOGIN
curl_setopt($ch, CURLOPT_URL,
'https://www.example.com/login.php');

// ENABLE HTTP POST
curl_setopt ($ch, CURLOPT_POST, 1);

// SET POST PARAMETERS : FORM VALUES FOR EACH FIELD
curl_setopt ($ch, CURLOPT_POSTFIELDS,
'_username=myuser&password=mypassword');

// IMITATE CLASSIC BROWSER'S BEHAVIOUR : HANDLE COOKIES
curl_setopt ($ch, CURLOPT_COOKIEJAR, '/usr/local/sbin/myscripts/cookie.txt');

# Setting CURLOPT_RETURNTRANSFER variable to 1 will force cURL
# not to print out the results of its query.
# Instead, it will return the results as a string return value
# from curl_exec() instead of the usual true/false.
curl_setopt ($ch, CURLOPT_RETURNTRANSFER, 1);

// EXECUTE 1st REQUEST (FORM LOGIN)
$store = curl_exec ($ch);

// SET FILE TO DOWNLOAD
curl_setopt($ch, CURLOPT_URL,
'https://www.example.com/pro_trader_watch_list_prem.php');
// EXECUTE 2nd REQUEST (FILE DOWNLOAD)
$content = curl_exec ($ch);
fwrite($fh, $content);
fclose($fh);

// CLOSE CURL
curl_close ($ch);

$to = 'aaa@example.net';
$subject = 'Pro Trader Watchlist';
$random_hash = md5(date('r', time()));
$headers = "From: webmaster@example.co.uk\r\nReply-To: webmaster@example.co.uk";
$headers .= "\r\nContent-Type: multipart/alternative; boundary=\"PHP-alt-".$random_hash."\"";
ob_start(); //Turn on output buffering
?>
--PHP-alt-<?php echo $random_hash; ?>
Content-Type: text/html; charset="iso-8859-1"
Content-Transfer-Encoding: 7bit
<?php
$page = file_get_contents('/usr/local/sbin/myscripts/ITMS_PTWL.html');
echo $page;
?>
--PHP-alt-<?php echo $random_hash; ?>--
<?php
$message = ob_get_clean();
$mail_sent = @mail( $to, $subject, $message, $headers );

?>


CyBerAliEn

6:38 pm on Feb 2, 2010 (gmt 0)

10+ Year Member



Could you provide a little more information on your process?

If you are sending yourself an email with HTML in it, and it contains images... you must be sure the "SRC" attribute contains the full URL. If it is just a relative value, it will not work.

A quick solution to this specific issue would be something like:
$html = '<img src="/something.jpg">'; /*the html source you are sending*/
$urlbase = 'http://www.yoursite.com';
$html_fix = str_replace('src="',"src=\"{$urlbase}",$html);


This will change all the SRC attributes from relative values to absolute URLs.

Note however that this would break any links that are already absolute. To resolve this, you would likely need to do some work with regular expressions. But since you state that "none of the html pictures can be viewed", it seems safe to assume none of your SRC values are absolute; so this is unlikely to be a problem for you.

And this may not address your specific problem... but given what little I know of your situation, it seems an appropriate solution to the likely problem.
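If some SRC values might already be absolute, one possible way to skip them is a negative lookahead. This is only a sketch under assumptions: the function name absolutize_relative_urls is made up for illustration, it handles double-quoted values only (like the str_replace above), and it treats every relative path as relative to the site root.

```php
<?php
// Hypothetical helper: prefix only relative, double-quoted src/href values.
function absolutize_relative_urls(string $html, string $urlbase): string
{
    $urlbase = rtrim($urlbase, '/');
    // The lookahead skips values that already carry a scheme (http:, https:,
    // mailto:, data:), protocol-relative //host values and bare #fragments.
    // /? drops an optional leading slash so the base can supply exactly one.
    $pattern = '~\b(src|href)="(?![a-z][a-z0-9+.\-]*:|//|#)/?([^"]*)"~i';
    return preg_replace($pattern, '$1="' . $urlbase . '/$2"', $html);
}

echo absolutize_relative_urls('<img src="/something.jpg">', 'http://www.yoursite.com'), PHP_EOL;
// <img src="http://www.yoursite.com/something.jpg">
```

Because the base is pasted into the replacement string, this assumes it contains no $ or backslash characters.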

jackvull

6:52 pm on Feb 2, 2010 (gmt 0)

10+ Year Member



Thanks.
I log in and grab all the html into the $content variable.
Can I just use that expression on the $content variable or do I need to do something else?
I tried this but it didn't work:
$str=preg_replace('#(href|src)="([^:"]*)("|(?:(?:%20|\s|\+)[^"]*"))#','$1="http://example.com/$2$3',$str);


CyBerAliEn

7:08 pm on Feb 2, 2010 (gmt 0)

10+ Year Member



I would work with the 'page' variable near the end of the code. Right after you grab 'page', just before you echo it... run the code to modify it (such as your preg_replace call).

I have verified that your regular expression does not work. I am not sure why (regular expressions are not my strong point). I'll see if I can look into it, but other users here with more regular expression experience may be better able to help you if I cannot figure it out.

rocknbil

9:20 pm on Feb 2, 2010 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



$str=preg_replace('#(href|src)="([^:"]*)("|(?:(?:%20|\s|\+)[^"]*"))#','$1="http://example.com/$2$3',$str);


First note that the single quotes are not what breaks this. $1, $2 and $3 are resolved by preg_replace itself, not by PHP string interpolation, so this

'$1="http://example.com/$2$3'

and this behave the same. :-)

"$1=\"http://example.com/$2$3"

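A quick way to check how the quoting affects the replacement string is to run both forms side by side. This is only a sketch with a made-up one-image input and a simplified pattern; the point is that $1-style references reach preg_replace unchanged from either kind of PHP string.

```php
<?php
$html    = '<img src="logo.png">';
$pattern = '#(src)="([^"]*)"#';

// Single-quoted replacement: PHP leaves $1 and $2 alone, preg_replace resolves them.
$single = preg_replace($pattern, '$1="http://example.com/$2"', $html);

// Double-quoted replacement: $1 is not a valid PHP variable name, so it also
// reaches preg_replace unchanged; only the inner quotes need escaping.
$double = preg_replace($pattern, "$1=\"http://example.com/$2\"", $html);

var_dump($single === $double); // bool(true)
echo $single, PHP_EOL;         // <img src="http://example.com/logo.png">
```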
As said, you don't want to break existing full links. If you expect any of this data to have full links, a one-liner won't work. It may require parsing the file line by line with some if's if http.... is found.

I prefer / as a regex delimiter, but it doesn't matter. In PHP # is every bit as much a comment as // is.

Untested, but my version of your regex. This will **only** work if you don't expect a full URL.

$str=preg_replace('/(href|src)\s*=\s*"*\'*\/*([^"\'\s\>]+)["\'\_a-z0-9\s]*(\>)/i',"$1=\"http://example.com/$2\"$3",$str);

An examination,

'/ = PHP string delimiter ', regex delimiter /

(href|src) = store href or src in $1

\s*=\s*"*\'*\/* = this complicated part is to support the following variations and combinations:

src=some-image.jpg
src = some-image.jpg
src = "some-image.jpg"
src = 'some-image.jpg'
src="/some-image.jpg"

In all cases, note \' is escaped within the regex, it's my string delimiter.

([^"\'\s\>]+) = one or more of anything not a ", ', space, or >, which should manage malformed attributes, per the above. A space should only occur if there's a title or target attribute . . . store this in $2.

["\'\_a-z0-9\s]* = Zero or more of ", ', _, letters, numbers, spaces. See above.

(\>) = Closing angle bracket, end of pattern; store in $3

/i' = closing regex delim, case inSenSiTivE modifier, closing PHP string delimiter

"$1=\"http://example.com/$2\"$3" Replace with this. Note ending " for attribute.

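To see the breakdown in action, here is a small sketch that runs the same pattern and replacement over the attribute variations listed above. The surrounding img tags are my own assumption, added so that each input ends in the > the pattern requires.

```php
<?php
// The pattern and replacement from the post, unchanged.
$pattern     = '/(href|src)\s*=\s*"*\'*\/*([^"\'\s\>]+)["\'\_a-z0-9\s]*(\>)/i';
$replacement = "$1=\"http://example.com/$2\"$3";

$variants = [
    '<img src=some-image.jpg>',
    '<img src = some-image.jpg>',
    '<img src = "some-image.jpg">',
    "<img src = 'some-image.jpg'>",
    '<img src="/some-image.jpg">',
];

$results = [];
foreach ($variants as $tag) {
    $results[] = preg_replace($pattern, $replacement, $tag);
}

// Every variant is normalised to the same tag:
// <img src="http://example.com/some-image.jpg">
print_r(array_unique($results));
```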

jackvull

9:49 pm on Feb 2, 2010 (gmt 0)

10+ Year Member



Seemed to partially work thanks.
However, it is not converting files when the HTML looks like this:
<div style="text-align: center;"><img width="604" height="351" src="/userfiles/image/XOM02_01_10.jpg" alt="" /><br />

Presumably that's because it's a div rather than a href?

I just reran it using img|src in the preg_replace but it didn't pick it up in the file.

rocknbil

3:17 am on Feb 3, 2010 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



No it's probably because of the width and height coming first. I think. Will take a closer look tomorrow - or maybe someone can tweak it to get it working . . .

jackvull

2:56 pm on Feb 3, 2010 (gmt 0)

10+ Year Member



Shouldn't that preg_replace only look to match when it finds src?

rocknbil

8:17 pm on Feb 3, 2010 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Correct. :-P I think it was mostly related to the alt attribute that follows the src and to the XML-style closing. Here are 4 working examples to test.


<?php
header("content-type:text/html");
//
$regex = '(href|src)\s*=\s*"*\'*\/*([^"\'\s\>]+)(.*?)\s*\/*\s*\>';
//
$str='<div style="text-align: center;"><img width="604" height="351"
src="/userfiles/image/XOM02_01_10.jpg" alt="" /><br />';
echo htmlentities($str) . '<br><br>';
echo (preg_match("/$regex/i",$str))?"MATCH<br><br>":"NO MATCH<br><br>";
$str=preg_replace("/$regex/i","$1=\"http://example.com/$2\"$3>",$str);
echo htmlentities($str) . '<br><br>';
//
$str='<img width="604" height="351" alt="" src="userfiles/blah/bleah.jpg"><br>';
echo htmlentities($str) . '<br><br>';
echo (preg_match("/$regex/i",$str))?"MATCH<br><br>":"NO MATCH<br><br>";
$str=preg_replace("/$regex/i","$1=\"http://example.com/$2\"$3>",$str);
echo htmlentities($str) . '<br><br>';
//
$str='<img width=604 height=351 src=userfiles/blah/bleah.jpg><br>';
echo htmlentities($str) . '<br><br>';
echo (preg_match("/$regex/i",$str))?"MATCH<br><br>":"NO MATCH<br><br>";
$str=preg_replace("/$regex/i","$1=\"http://example.com/$2\"$3>",$str);
echo htmlentities($str) . '<br><br>';
//
$str="<img width='604' height='351' src='userfiles/blah/bleah.jpg'><br>";
echo htmlentities($str) . '<br><br>';
echo (preg_match("/$regex/i",$str))?"MATCH<br><br>":"NO MATCH<br><br>";
$str=preg_replace("/$regex/i","$1=\"http://example.com/$2\"$3>",$str);
echo htmlentities($str) . '<br><br>';
?>

jackvull

8:43 pm on Feb 3, 2010 (gmt 0)

10+ Year Member



Here's the problem.
Somehow it's getting a doubled quote at the end of some of the src values:

1
<div style="text-align: center;"><img width="604" height="351" src="/userfiles/image/XOM02_01_10.jpg" alt="" /><br />

MATCH

<div style="text-align: center;"><img width="604" height="351" src="http://example.com/userfiles/image/XOM02_01_10.jpg"" alt=""><br />

2
<img width="604" height="351" alt="" src="/userfiles/image/XOM02_01_10.jpg"><br>

MATCH

<img width="604" height="351" alt="" src="http://example.com/userfiles/image/XOM02_01_10.jpg""><br>

3
<img width=604 height=351 src=/userfiles/image/XOM02_01_10.jp><br>

MATCH

<img width=604 height=351 src="http://example.com/userfiles/image/XOM02_01_10.jp"><br>

4
<img width='604' height='351' src='/userfiles/image/XOM02_01_10.jpg'><br>

MATCH

<img width='604' height='351' src="http://example.com/userfiles/image/XOM02_01_10.jpg"'><br>

jackvull

8:52 pm on Feb 3, 2010 (gmt 0)

10+ Year Member



I changed it to this taking out 1 quotation mark:
$str=preg_replace("/$regex/i","$1=\"http://example.com/$2\$3>",$str);

but cannot get 3 and 4 to work

rocknbil

1:43 am on Feb 4, 2010 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



LOL . . . hey I'm not a proofreader. Sorry.

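No, leave that part alone. The backslash in \$3 does nothing useful here: in a double-quoted PHP string \$ is just a literal $, so preg_replace still sees $3 as the third group. What your change really did was drop the closing quote after $2, which is why 3 and 4 (whose $3 does not start with a double quote) come out with an unterminated value.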

Use the original demo I gave you, change only the $regex:

$regex = '(href|src)\s*=\s*"*\'*\/*([^"\'\s\>]+)"*\'*(.*?)\s*\/*\s*\>';

What's different is I added a zero or more on the quotes between $2 and $3:

...]+)"*\'*(.....

I looked a little closer at the output and am not seeing anything wrong with it at the moment, but then I'm off the clock. :-P
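
For anyone landing here later, one way to sidestep the quote juggling is to let a callback decide per attribute. What follows is only a sketch under assumptions: the function name absolutize_attr_urls is made up for illustration, it swaps in preg_replace_callback for the plain preg_replace discussed above, it treats every relative path as relative to the site root, and a regex is still not a real HTML parser.

```php
<?php
// Hypothetical helper, not from the thread.
function absolutize_attr_urls(string $html, string $base): string
{
    $base = rtrim($base, '/');

    // Group 1 is the attribute name. The branch-reset group (?|...) puts the
    // value in group 2 whether it was double-quoted, single-quoted or unquoted.
    $pattern = '~\b(src|href)\s*=\s*(?|"([^"]*)"|\'([^\']*)\'|([^\s"\'>]+))~i';

    return preg_replace_callback($pattern, function (array $m) use ($base): string {
        $url = $m[2];
        // Leave empty values, scheme: URLs, //host URLs and #fragments untouched.
        if ($url === '' || preg_match('~^(?:[a-z][a-z0-9+.\-]*:|//|#)~i', $url)) {
            return $m[0];
        }
        // Always emit a double-quoted value; encode any stray " from a single-quoted one.
        $abs = $base . '/' . ltrim($url, '/');
        return $m[1] . '="' . str_replace('"', '&quot;', $abs) . '"';
    }, $html);
}

echo absolutize_attr_urls("<img width='604' src='userfiles/blah/bleah.jpg'><br>", 'http://example.com'), PHP_EOL;
// <img width='604' src="http://example.com/userfiles/blah/bleah.jpg"><br>
```

Since only the attribute itself is matched, whatever follows it in the tag (alt, the XML-style closing) is left exactly as it was.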