Forum Moderators: phranque
I was just wondering how effective this is. I guess it relies on the idea that email-harvesting spiders can follow links but can't press form buttons. Is this true?
Perhaps this is a good solution. I just thought it was unlikely that spiders would be stopped in their tracks by a simple button.
<style type="text/css">
A.email:link {
}
A.email:active {
BACKGROUND : url('mailto:info@yourdomain.com?subject=whatever');
}
</style>
I tried it and it does work. I guess if you disallow your CSS file in robots.txt, you're safe.
*Probably best to use your email address in case it does not work. At least the person can read it.
Marshall
If you have PHP on your server you can use a standard script - no experience required.
Encode the email address as Unicode character entities.
The harvesters can't read Unicode.
But the mailto link will still work.
Ex:
joe@widgets.com
<a href="&#109;&#97;&#105;&#108;&#116;
&#111;&#58;&#106;&#111;&#101;&#64;&#119;&#105;&#100;&#103;
&#101;&#116;&#115;&#46;&#99;&#111;&#109;">Ema
il Me</a>
Also, if you have a separate contact page, use a robots.txt file to keep it from being crawled.
Ex:
User-agent: *
Disallow: /contact.html
pixelpusher256
[edited by: tedster at 3:56 am (utc) on July 4, 2007]
[edit reason] line breaks added to prevent side-scroll [/edit]
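For anyone who wants to try the entity-encoding idea above without typing the codes by hand, here is a minimal sketch. The function names `toEntities` and `obfuscatedMailto` are made up for illustration, and it only produces the markup; it says nothing about whether a given harvester will decode it.

```javascript
// Hypothetical helper (not from the thread): encode every character of a
// string as a decimal HTML numeric character reference, e.g. "m" -> "&#109;".
function toEntities(text) {
  return Array.from(text)
    .map(function (ch) { return "&#" + ch.codePointAt(0) + ";"; })
    .join("");
}

// Build a link whose href contains no literal "mailto:" or "@".
// The label is inserted as-is, so pass plain text only.
function obfuscatedMailto(address, label) {
  return '<a href="' + toEntities("mailto:" + address) + '">' + label + "</a>";
}

// Example using the address from the post above.
var link = obfuscatedMailto("joe@widgets.com", "Email Me");
```

Paste the resulting string into the page source. Browsers decode the references before following the link, which is also why any harvester that decodes entities will see the address.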
The harvesters can't read Unicode.
You're safest with a JavaScript obfuscation these days if you have to put your address out there.
If you can get your customers to use contact forms that's safest of all.
Kaled.
The only way to keep an email address from getting spammed is to not post it in the first place and use a contact form instead. Unfortunately, there are automated scripts out there hunting down contact form pages and spamming them, so don't forget to include a CAPTCHA on the contact form page to block those spammers as well.
Spam spam spam spam...
This enables you to shut off addresses that receive spam.
There are commercial services that do this, which has the advantage of directing the email away from your domain. When you "shut off" an address, it's THEIR server that gets hit with the overhead of rejecting the mail.
I don't know how practical the commercial services would be for this, as they have various service levels depending on the number of disposable addresses you need. It may not be practical for many thousands of disposable addresses.
I place a text field on the form that is actually hidden from real users. On submission of the form if anything is entered into the textbox then I know it's spam.
This method has worked great so far.
My normal procedure is to use a form for all communications; it works like a charm. And now that I'm working with ASP.NET forms a little more, my programmers tell me there are built-in measures to prevent bots from spamming the forms. I'm learning more about them now.
I place a text field on the form that is actually hidden from real users. On submission of the form if anything is entered into the textbox then I know it's spam.
Using this idea, could you not combine it with a script that refuses to submit the form if anything is in the field?
<script type="text/javascript">
var submitcount=0;
// Clear the hidden field.
function reset() {
  document.emailform.validate.value="";
}
// Field validation - blocks submission if the hidden field has been filled in.
function checkFields() {
  if (document.emailform.validate.value!="") {
    alert("Sorry, form does not validate.");
    return false;
  }
  return true;
}
</script>
<form type=..... onSubmit="return checkFields()">
<div style="display:none">
<input type="text" name="validate" />
</div>
</form>
Just an idea.
Marshall
me [at] example.com
which should stop the spambots AND force those who are lazy and want to waste your time NOT to contact you, either because they aren't smart enough to figure that out, or they are too lazy to copy/paste and fix it right in their e-mail client.
For forms we have set up a 3-way robot detector to reject form-injection spam. Basically it means we don't need to use irritating CAPTCHAs:
1.) Use the CSS-hidden text field (mentioned above): if it is filled in, reject the form.
2.) If any of the following appear in inappropriate fields, reject the form: 'http', 'www', '[', '@' (e.g. '@' is allowed in the email field but no other).
3.) A hidden field with a simple code (e.g. xyz). If the field does not contain this value: reject the form.
So far 100% effective :)
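The three checks above can be sketched roughly as follows. This is only an illustration: the field names (`website`, `token`, `email`) and the constant names are placeholders, not taken from any real form, and it assumes the banned strings other than '@' are unwanted in every visible field.

```javascript
// Sketch only: all field names and the code "xyz" are placeholders.
var HONEYPOT_FIELD = "website";  // hidden with CSS; real users leave it empty
var TOKEN_FIELD = "token";       // hidden field that must carry a fixed code
var EXPECTED_TOKEN = "xyz";
var EMAIL_FIELD = "email";       // the only field where "@" is allowed
var BANNED = ["http", "www", "[", "@"];

// Returns true if the submitted fields look like form-injection spam.
function isSpam(fields) {
  // Check 1: the CSS-hidden text field must be empty.
  if ((fields[HONEYPOT_FIELD] || "") !== "") return true;

  // Check 3: the hidden code field must contain the expected value.
  if (fields[TOKEN_FIELD] !== EXPECTED_TOKEN) return true;

  // Check 2: banned strings must not appear in inappropriate fields.
  for (var name in fields) {
    if (!Object.prototype.hasOwnProperty.call(fields, name)) continue;
    if (name === HONEYPOT_FIELD || name === TOKEN_FIELD) continue;
    var value = String(fields[name]).toLowerCase();
    for (var i = 0; i < BANNED.length; i++) {
      if (BANNED[i] === "@" && name === EMAIL_FIELD) continue;
      if (value.indexOf(BANNED[i]) !== -1) return true;
    }
  }
  return false;
}
```

The check has to run on the server side (or wherever the form is actually processed), since a bot posting directly to the form handler never runs anything in the page.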
Instead of captchas I started using a method I found on the web that I had not seen before. I place a text field on the form that is actually hidden from real users. On submission of the form if anything is entered into the textbox then I know it's spam.
and...
For forms we have set up a 3-way robot detector to reject form-injection spam. Basically it means we don't need to use irritating Captchas:
Time to do my civic duty with a PSA about CAPTCHAs and dispel a few myths just posted.
The word CAPTCHA stands for "Completely Automated Public Turing Test to Tell Computers and Humans Apart", so any TEST you create (and tests can come in many forms) that humans can pass but that blocks a computer unable to interpret the problem is a form of CAPTCHA.
So you're all using CAPTCHAs, enjoy! :)
And humans can get their browser to execute javascript -- so it doesn't matter how complex the code is.
I'm not directly concerned about human harvesters -- my material is far enough down the popularity scale (I'd feel differently if I were trying to protect a website like, say, the ODP or Wikipedia.) But as it is, my only concern is the spiders. (This isn't what you'd do if you're trying to hide your email address from humans, which is what some of the other proposals address.)
Now, assuming spiders can execute Javascript but can't click buttons, all you need is a button that executes a (trivial) Javascript function that scans your own page for links to a fixed address (say, nospam.htm) and replaces those links by appropriate "mailto" links.
If you keep that JavaScript function in a separate file, and you have just enough obfuscation that obvious regular expressions won't work (that is, no occurrences of "email" or ".com" or "@*.com" in strings), then no conceivable bot is going to be analyzing your script, because there are no automatically detectable traces of e-mail links. (There ARE no e-mail links until the button is clicked, and there is nothing suggesting that the "button" function should be executed.)
As for hiding "visible" links from robots but making them easily readable by humans, I'd suggest putting the different parts of the e-mail address in different cells of a table; and using a subscripted middot instead of a period. No robot is going to parse out table cells with "rowspan" and "colspan" attributes to see which un-email-looking fragments happen to line up visually on the screen.
Here's my external JS file:
---------------------------------
// This is simply an address spelt backwards.
var revmail = 'moc.elpmax' + "e@sser" + 'dda';
// This is simply any portion of a link on an html page.
var replaceme = "nospamhere.html";
// This is simply a reverse-string function.
function revert(a) {
  var z = "";
  for (var i = a.length - 1; i >= 0; i--) z = z + a.charAt(i);
  return z;
}
// This is the "button" function which finds the links to be replaced
// with actual references to the m.a.i.l_t.o protocol.
function alertml() { // figure this out, spambot!
  for (var i = document.links.length - 1; 0 <= i; i--) {
    if (document.links[i].href.indexOf(replaceme) >= 0) {
      document.links[i].href = revert(revmail + ":otliam");
    }
  }
}
------------------------------------------------------
We will continue to get spam until service providers are persuaded to stop it at source. If every server owner adopted a no-tolerance policy this would come about very quickly.
BTW - first post, new reader, looking forward to pubcon.
-Mark
[edited by: tedster at 1:06 am (utc) on July 7, 2007]
[edit reason] make link live [/edit]