Line feed in mailto body via header problem

IE 6.0 says server unreachable

         

Notawiz

1:13 pm on Jun 24, 2008 (gmt 0)

10+ Year Member



In my current project, the client doesn't want an online message form, but a call to the visitor's email client instead.
He also wants the message subject and body fields to be pre-filled with text that changes according to the page the link is triggered from.

To avoid email harvesting, the link triggers a standalone mailto.php script, which selects the correct subject and body through a switch {} construction and then sends the complete mailto link in a header ("Location: mailto:to@domain?subject=the_subject&body=the_body");

So far so good.
When new lines are generated with \n and/or \r, Firefox handles them well, but IE 6.0 ignores all the content after the first CR/LF.

If the CR/LF is percent-encoded (%0a and/or %0d), IE 6.0 responds "server not reachable" when the mailto link arrives through a header("Location:...."); construction.
When pasted into the address bar of the same IE 6.0 browser, things go fine.
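
For reference, a minimal sketch of how the percent-encoded form mentioned above can be produced. The script in this thread is PHP, where rawurlencode() performs the same kind of encoding; JavaScript is used here only for illustration, and buildMailto and its arguments are hypothetical names. It shows the encoding only and does not change how IE 6.0 treats the redirect.

```javascript
// Hypothetical helper: assembles a mailto: URL with a percent-encoded
// subject and body. Line breaks are normalized to CRLF first, so they
// come out as %0D%0A in the final URL.
function buildMailto(to, subject, body) {
  const crlfBody = body.replace(/\r\n|\r|\n/g, "\r\n");
  return "mailto:" + to +
    "?subject=" + encodeURIComponent(subject) +
    "&body=" + encodeURIComponent(crlfBody);
}

const link = buildMailto("to@example.com", "the subject", "line one\nline two");
console.log(link);
```

Normalizing to CRLF before encoding keeps the result the same whether the source text used \n, \r, or both.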

Even the "ancient" trick from Microsoft support, using %250a instead of %0a, does not work properly: a new email message window opens and the body is populated, but the line feed is ignored.
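
A small sketch of what the %250a form amounts to: %25 is the encoded percent sign, so the trick only helps if the receiving side decodes the string twice. Whether IE 6.0 does that on this redirect path is not established here; the variable names are illustrative only.

```javascript
// Encoding an already-encoded line feed escapes its percent sign:
// "%0a" becomes "%250a". A single decoding pass restores "%0a",
// and only a second pass yields the actual line feed character.
const once = "%0a";
const twice = encodeURIComponent(once);
const decodedOnce = decodeURIComponent(twice);
const decodedTwice = decodeURIComponent(decodedOnce);
console.log(twice, decodedOnce, JSON.stringify(decodedTwice));
```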

I've even tried to prepend some other headers like:


header('Content-Type: text/plain; charset=iso-8859-15');
or
header('Content-Type: text/plain; charset=utf-8');

before the header with the mailto, but to no avail.

Another funny thing is that any other character can be percent-encoded and IE 6.0 handles it correctly, e.g. %20 for spaces.
But as soon as a %0a or a %0d is present and the mailto is sent back to the IE 6.0 browser by the PHP script, it goes berserk...

Question: what did I forget, or will it NEVER work with that (still widespread) browser?

cameraman

6:55 pm on Jun 24, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I duplicated exactly what you described. I'm not understanding how this prevents harvesting, however, unless you're using javascript.
This works in both FF2 and IE6:
<a href="mailto:admin@example.com?subject=well%20well%20well&body=I%20don%27t%20want%20a%20pickle.%0A Just%20want%20to%20ride%20on%20my%20motorsickle.">mail</a>

What if you were to use javascript to insert them into the document?

Notawiz

7:26 am on Jun 25, 2008 (gmt 0)

10+ Year Member



Hello Cameraman,
As I said, when an encoded string is pasted into the address bar of IE 6.0, it is handled correctly (even the %0a line feed).
But when the same string is sent to the browser by a script in a
header ("Location: ...the mailto string");

construction, IE 6.0 is unable to handle it, and responds "server not reachable".

The only working solution I found, though not an elegant one, is to replace the line feeds with an HTML line break <br />.

As for the harvesting of email addresses: in the webpage, the link calls a PHP script with a $_GET variable, so it does not contain an email address. Using JavaScript would not prevent a robot from reading the address, while this technique will.

[edited by: Notawiz at 7:27 am (utc) on June 25, 2008]

cameraman

5:17 pm on Jun 25, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I understand what you're saying about the problem because, like I said, I was able to duplicate it.

But if you have a link to your script, what prevents the spambot from following the link?

It's my understanding that [most] spambots don't execute javascript, so if the link is inserted into the document via javascript, the spambot wouldn't 'see it':
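<script type="text/javascript">
// Assumes a hypothetical element with id "contact" whose content is the address.
window.onload = function () {
  var target = document.getElementById("contact");
  var address = target.innerHTML;
  target.innerHTML = '<a href="' + 'mail' + 'to:' + address + '">' + address + '</a>';
};
</script>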

I'm not trying to be contentious; I've actually got the same issue that you're trying to prevent, so I'm interested in possible solutions. In my case, I have a customer support email address listed on a page. That address is piped to a script that creates a user account if necessary, creates a customer support ticket, then informs about 3 email addresses that the ticket has been created.

What I came up with yesterday was a simple text-based captcha (what's hotter, fire or ice, etc) before revealing the address, then in the email script I allow only the subjects that are prepopulated from the contact page (so I didn't even wind up using javascript ;) ).
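
A minimal sketch of the two checks described above, a text captcha and a subject whitelist. The question, answer, and subject strings are made-up placeholders, the function names are hypothetical, and the real script in this post runs on the server when the email arrives, so this only illustrates the logic.

```javascript
// Hypothetical data: the question/answer pair and the subjects are examples only.
const captchaQuestions = [
  { question: "What's hotter, fire or ice?", answer: "fire" },
];
const allowedSubjects = new Set(["Billing question", "Technical support"]);

// Compares the visitor's reply case-insensitively, ignoring surrounding spaces.
function captchaPassed(entry, reply) {
  return reply.trim().toLowerCase() === entry.answer;
}

// Only subjects prepopulated by the contact page are accepted.
function subjectAllowed(subject) {
  return allowedSubjects.has(subject);
}

console.log(captchaPassed(captchaQuestions[0], " Fire "));
console.log(subjectAllowed("Cheap pills"));
```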

Notawiz

5:52 pm on Jun 25, 2008 (gmt 0)

10+ Year Member



Oh well, maybe I misunderstood it, but I was told that bots actually didn't "click" on links.
The link on the page is something like
<a href="mailto.php?to=variable" target="_blank"><img src="email_us.gif" /></a>

According to the value of $_GET['to'], the script mailto.php creates a complete mailto:.... link, and passes that to a header("Location: the mailto_link"); construction.
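
The lookup described above could look roughly like this. It is a sketch only: the actual script is PHP and uses a switch, whereas this illustration is JavaScript with a lookup table, and the keys, addresses, subjects, and the name locationFor are all hypothetical.

```javascript
// Hypothetical lookup table: keys stand in for the values of the "to"
// query parameter; addresses, subjects, and bodies are placeholders.
const recipients = {
  sales: { to: "sales@example.com", subject: "Sales enquiry", body: "Hello," },
  support: { to: "support@example.com", subject: "Support request", body: "Hello," },
};

// Returns the value for a Location header, or null for an unknown key,
// so the address never appears in the page that carries the link.
function locationFor(key) {
  if (!Object.prototype.hasOwnProperty.call(recipients, key)) return null;
  const r = recipients[key];
  return "mailto:" + r.to +
    "?subject=" + encodeURIComponent(r.subject) +
    "&body=" + encodeURIComponent(r.body);
}

console.log(locationFor("sales"));
console.log(locationFor("unknown"));
```

Rejecting unknown keys keeps the script from being used to build links to arbitrary addresses.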

That's sent to a blank browser window, which in turn opens a new email compose window...

My belief was that a harvesting bot would be neither able nor willing to do that to extract the email address from there?

Humans, meanwhile, can still read the email address on the image, and will not be surprised, since the behavior is the same as with a "normal" embedded mailto: link.

Did I overlook something?

cameraman

6:47 pm on Jun 25, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Functionally it looks great, although I'd change the name of mailto.php to something more cryptic.

I googled spambot. They definitely do follow links. Three of the articles I read said they don't execute javascript, and one said they likely never will. That one was written 5 years ago but it makes sense - I can't imagine the spammers would want to spend that much processing effort since javascript is so widely used to fly stuff around pages and animate menus, etc.

One drawback to javascript is what to do about people who have it turned off. That's what led me down the road I finally took - I started to think up a solution for both, then came up with one that didn't matter. I had figured I'd use captcha for non-javascript users and use the late insertion for those who had it turned on. As I was going about writing the thing I decided I may as well just use captcha for everyone.

Notawiz

7:24 am on Jun 26, 2008 (gmt 0)

10+ Year Member



In fact, the real production script has an obscure name; I renamed it to mailto.php just for the purposes of this post.
And I came to this approach because of the fact that some users may have JavaScript turned off.
As for the spambots following links, so far none seems to have been able to harvest this kind of email address.
One funny thing to mention is that on another project, the PDF version of a Word document was put online. The original Word file was a letter, with letterhead and all.
Google of course turned the PDF into HTML, and bingo, there was the email address from the letterhead in plain text.
Spammed that very same day!
The cure was quite easy: copy the letterhead in the Word document, paste it back with Paste > Special > Image, and then re-create the PDF.
But the client had to create a new email account.