Forum Moderators: coopster
To avoid email harvesting, the link triggers a standalone mailto.php script, which populates the correct subject and body fields through a switch {} construction and then sends the complete mailto link with header("Location: mailto:to@domain?subject=the_subject&body=the_body");
So far so good.
When new lines are generated with \n and/or \r, Firefox handles them well, but IE 6.0 ignores all the content after the CRLF.
If the CR/LF is percent-encoded (%0a and/or %0d), IE 6.0 responds "server not reachable" when the mailto link comes from a header("Location:...."); construction.
When pasted into the address bar of the same IE 6.0 browser, things go fine.
Even the "ancient" trick from Microsoft support, using %250a instead of %0a, does not work properly: a new email message window opens and the body is populated, but the line feed is ignored.
I've even tried to prepend some other headers like:
header('Content-Type: text/plain; charset=iso-8859-15');
or
header('Content-Type: text/plain; charset=utf-8');
Another funny thing is that any other character can be percent-encoded and will be handled correctly by IE 6.0, e.g. %20 for spaces.
But once there is a %0a or a %0d around, and the mailto is sent back to the IE 6.0 browser by the PHP script, it goes berserk...
Question: what did I forget, or will it NEVER work with that (still widespread) browser?
What if you were to use javascript to insert them into the document?
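One way to read this suggestion, as a rough sketch: build the mailto link in the browser and percent-encode the subject and body there, so that line breaks travel as %0D%0A. The function name buildMailto, the element id "mail-link", and the sample values are made up for illustration, and this has not been checked against IE 6.0.

```javascript
// Hypothetical helper: builds a mailto: URL with percent-encoded fields.
function buildMailto(to, subject, body) {
  // Normalise every kind of line break to CRLF before encoding.
  var normalised = body.replace(/\r\n|\r|\n/g, "\r\n");
  return "mailto:" + to +
    "?subject=" + encodeURIComponent(subject) +
    "&body=" + encodeURIComponent(normalised);
}

// Only touch the page when running in a browser.
if (typeof document !== "undefined") {
  var link = document.getElementById("mail-link"); // assumed element id
  if (link) {
    link.href = buildMailto("to@domain", "the_subject", "first line\nsecond line");
  }
}
```

The address itself is left unencoded here on the assumption that it is a plain address; only the subject and body pass through encodeURIComponent.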
header("Location: ...the mailto string");
The only working solution I found, though not an elegant one, is to replace the line feeds with an HTML line break <br />.
As for the harvesting of email addresses, the link in the webpage calls a PHP script with a $_GET variable, so it contains no email address. Using JavaScript would not stop a robot from reading the email address, whereas this technique will.
[edited by: Notawiz at 7:27 am (utc) on June 25, 2008]
But if you have a link to your script, what prevents the spambot from following the link?
It's my understanding that [most] spambots don't execute javascript, so if the link is inserted into the document via javascript, the spambot wouldn't 'see it':
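<script>
window.onload = function () {
  // "target" is assumed to be the id of the element holding the address text
  var target = document.getElementById("target");
  var address = target.innerHTML;
  target.innerHTML = '<a href="' + "mail" + "to:" + address + '">' + address + "</a>";
};
</script>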
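A slightly fuller sketch of the same late-insertion idea, assuming the address is stored in pieces so that it never appears whole in the markup. The data-user and data-domain attributes, the "target" id, and the helper name are all hypothetical.

```javascript
// Hypothetical helper: joins the pieces and returns the link markup.
function mailtoLinkHtml(user, domain) {
  var address = user + "@" + domain;
  // "mail" + "to:" is split so the literal scheme never appears in the source.
  return '<a href="' + "mail" + "to:" + address + '">' + address + "</a>";
}

// Only wire up the page when running in a browser.
if (typeof window !== "undefined") {
  window.onload = function () {
    var target = document.getElementById("target"); // assumed element id
    if (target) {
      target.innerHTML = mailtoLinkHtml(
        target.getAttribute("data-user"),
        target.getAttribute("data-domain")
      );
    }
  };
}
```

Visitors with JavaScript turned off would see no link at all with this approach, so it would need some fallback.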
I'm not trying to be contentious; I've actually got the same problem that you're trying to prevent, so I'm interested in possible solutions. In my case, I have a customer support email address listed on a page. That address is piped to a script that creates a user account if necessary, creates a customer support ticket, then notifies about three email addresses that the ticket has been created.
What I came up with yesterday was a simple text-based captcha (what's hotter, fire or ice, etc) before revealing the address, then in the email script I allow only the subjects that are prepopulated from the contact page (so I didn't even wind up using javascript ;) ).
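The two checks described above could look roughly like this. It is only a sketch: the accepted answer and the subject list are invented for illustration, and the real email script may well be written in another language.

```javascript
// Hypothetical values; the real question and subjects are not given in the thread.
var CAPTCHA_ANSWER = "fire"; // answer to "What's hotter, fire or ice?"
var ALLOWED_SUBJECTS = ["Billing question", "Technical support"];

// True when the visitor's answer matches, ignoring case and surrounding spaces.
function captchaPassed(answer) {
  return String(answer).trim().toLowerCase() === CAPTCHA_ANSWER;
}

// True only for subjects that the contact page itself prepopulates.
function subjectAllowed(subject) {
  return ALLOWED_SUBJECTS.indexOf(subject) !== -1;
}
```

Mail with any other subject would simply be dropped by the email script, so a harvested address is of little use without the prepopulated subjects.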
<a href="mailto.php?to=variable" target="_blank"><img src="email_us.gif" /></a>
According to the value of $_GET['to'], the script mailto.php creates a complete mailto:.... link, and passes that to a header("Location: the mailto_link"); construction.
That's sent to a blank browser window, which in turn opens a new email compose window...
My belief was that a harvesting bot would be neither able nor willing to go through all that to extract the email address.
Humans, meanwhile, can still read the email address on the image, and will not be surprised, since the behavior is the same as with a "normal" embedded mailto: link.
Did I overlook something?
I googled spambot. They definitely do follow links. Three of the articles I read said they don't execute javascript, and one said they likely never will. That one was written 5 years ago but it makes sense - I can't imagine the spammers would want to spend that much processing effort since javascript is so widely used to fly stuff around pages and animate menus, etc.
One drawback to javascript is what to do about people who have it turned off. That's what led me down the road I finally took - I started to think up a solution for both, then came up with one that didn't matter. I had figured I'd use captcha for non-javascript users and use the late insertion for those who had it turned on. As I was going about writing the thing I decided I may as well just use captcha for everyone.