how to prevent spambots grabbing email addresses

a simple script to hide email addresses

Frank_Rizzo

11:02 am on Jul 5, 2002 (gmt 0)

I was worried about spambots crawling for email addresses. I found the following script, which works well:

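<P>hey buddy, mail
<SCRIPT type="text/javascript">
// Keep the address in pieces so it never appears whole in the page source.
var m1 = "frank_rizzo";
var m2 = "mysite";
var m3 = ".com";
// Assemble the mailto: link only when the page is rendered.
document.write('<a href="mailto:' + m1 + '@' + m2 + m3 + '">');
document.write('frank rizzo<\/a>');
</SCRIPT>
but don't expect a reply
</P>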

Because the script builds the mailto: link on the fly, spambots that only read the page source never get to see the real email address.

This is also useful against viruses such as Sircam. Sircam scans the pages in your cache files for email addresses but won't find them. All it will find is mailto:' + m1 etc.

Torben Lundsgaard

11:27 am on Jul 5, 2002 (gmt 0)

I have tried a script like yours and I also found that it works.

Another way to keep the spambots away is to Unicode-encode the email address.

Before: mailto:frank_rizzo@mysite.com

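After: mailto:&#102;&#114;&#97;&#110;&#107;&#95;&#114;&#105;&#122;&#122;&#111;&#64;&#109;&#121;&#115;&#105;&#116;&#101;&#46;&#99;&#111;&#109;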

Provided the browser supports HTML 3.2, you should only see the clear text on your screen.

The Unicode encoding works regardless of whether JavaScript is enabled or not.

luma

12:44 pm on Jul 5, 2002 (gmt 0)

Some suggestions I read were:

Using a GIF image
Using a JavaScript open() link for the contacts page
Using some invalid tags, e.g., frank_rizzo<spam>@<spam>mysite.com. Browsers will normally just drop the <spam> tag and display the email address, but IIRC the code will not validate.

Sinner_G

12:46 pm on Jul 5, 2002 (gmt 0)

Using a GIF? Could you please explain this one?

vitaplease

12:48 pm on Jul 5, 2002 (gmt 0)

Torben,

how do I unicode an email address?

thanks.

coconutz

1:20 pm on Jul 5, 2002 (gmt 0)

Try this e-mail link code generator [willmaster.com]

rogerd

1:26 pm on Jul 5, 2002 (gmt 0)

How effective is the unicode approach? I've just assumed that spambots would have figured this out by now...

DrOliver

1:28 pm on Jul 5, 2002 (gmt 0)

This is a great encoder:

http://www.hivelogic.com/safeaddress.php [hivelogic.com]

DrOliver

1:30 pm on Jul 5, 2002 (gmt 0)

The Hivelogic encoder (see the URL in my post above) combines JavaScript with Unicode. Maybe the spambots will figure it out sometime, but you're free to change the JavaScript a bit, or use a mix of Unicode and clear text. Win the Spam Arms Race!
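The "mix of Unicode and clear text" idea might look something like the sketch below. This is not the Hivelogic encoder's actual code; mixedEncode is a made-up name, and the rule it uses (always hide "@" and ".", and hide every other remaining character) is just one arbitrary choice you could vary.

```javascript
// Hypothetical helper: encode some characters as decimal numeric
// character references (e.g. "@" becomes "&#64;") and leave the rest
// in clear text.
function mixedEncode(address) {
  return Array.from(address)
    .map(function (ch, i) {
      // Always hide the characters a harvester is most likely to key on.
      var mustHide = ch === "@" || ch === ".";
      // Hide every character at an even position as well.
      if (mustHide || i % 2 === 0) {
        return "&#" + ch.codePointAt(0) + ";";
      }
      return ch;
    })
    .join("");
}

console.log("mailto:" + mixedEncode("frank_rizzo@mysite.com"));
```

Changing which positions get encoded gives a different-looking string for the same address, which is the point of the "change it a bit" advice.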

Torben Lundsgaard

1:31 pm on Jul 5, 2002 (gmt 0)

In order to encode an email address in Unicode, you simply use the following table:

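Unicode table
@ &#64;
. &#46;
- &#45;
A &#65;
B &#66;
C &#67;
D &#68;
E &#69;
F &#70;
G &#71;
H &#72;
I &#73;
J &#74;
K &#75;
L &#76;
M &#77;
N &#78;
O &#79;
P &#80;
Q &#81;
R &#82;
S &#83;
T &#84;
U &#85;
V &#86;
W &#87;
X &#88;
Y &#89;
Z &#90;
a &#97;
b &#98;
c &#99;
d &#100;
e &#101;
f &#102;
g &#103;
h &#104;
i &#105;
j &#106;
k &#107;
l &#108;
m &#109;
n &#110;
o &#111;
p &#112;
q &#113;
r &#114;
s &#115;
t &#116;
u &#117;
v &#118;
w &#119;
x &#120;
y &#121;
z &#122;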

Torben
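Rather than looking each character up by hand, the table lookup can be done with a few lines of script. This is only a minimal sketch: encodeAsEntities and buildMailtoLink are names invented for this example, and the link text is assumed to be plain text that needs no escaping.

```javascript
// Hypothetical helper: turn every character into a decimal numeric
// character reference, e.g. "@" becomes "&#64;".
function encodeAsEntities(text) {
  return Array.from(text)
    .map(function (ch) {
      return "&#" + ch.codePointAt(0) + ";";
    })
    .join("");
}

// Build an anchor whose address never appears in clear text in the
// page source. The label is assumed to be safe plain text.
function buildMailtoLink(address, label) {
  return '<a href="mailto:' + encodeAsEntities(address) + '">' + label + "</a>";
}

console.log(buildMailtoLink("frank_rizzo@mysite.com", "frank rizzo"));
```

You would run this once and paste the output into your page; the browser decodes the references when it renders the link.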

vitaplease

1:32 pm on Jul 5, 2002 (gmt 0)

Great info!

Can anyone name a spambot, or tell me what to look for so that I can recognise one in my stats?

How often have you folks been hit by these awful animals?

fathom

1:59 pm on Jul 5, 2002 (gmt 0)

Changing just the @ to its Unicode form (&#64;) would be as effective as encoding the whole email address.

Also, older browsers do not support Unicode, especially on Mac OS 9 and earlier.

[unicode.org...]

rogerd

2:03 pm on Jul 5, 2002 (gmt 0)

Vitaplease, EmailSiphon and WebSnake are a couple of e-mail harvesters. How often you are hit will probably depend on how well-linked your site is and how well it does in common search results. The easier it is to find the site, the more often it will get hit.

Torben Lundsgaard

2:19 pm on Jul 5, 2002 (gmt 0)

I haven't spent much time on tracking spam bots myself because it requires a lot of time and effort if you want to keep them out effectively.

Anyway, here's a list of some of the user agents to look for:
EmailCollector
WebEMailExtractor
EmailSiphon
ExtractorPro
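If you want to check your logs against these names, a rough sketch could look like this. The function name isKnownHarvester is made up for the example, and the list contains only the four agents named above, so it is nowhere near complete.

```javascript
// The four user agent names listed in this post. Not a complete list.
var HARVESTER_AGENTS = [
  "EmailCollector",
  "WebEMailExtractor",
  "EmailSiphon",
  "ExtractorPro"
];

// Hypothetical helper: true if the User-Agent string contains one of
// the names above (case-insensitive substring match).
function isKnownHarvester(userAgent) {
  var ua = String(userAgent || "").toLowerCase();
  return HARVESTER_AGENTS.some(function (name) {
    return ua.includes(name.toLowerCase());
  });
}

console.log(isKnownHarvester("EmailSiphon"));
console.log(isKnownHarvester("Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)"));
```

A bot that sends a browser-like User-Agent string passes this check untouched, so treat it as a first filter only.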

You can use all the tricks suggested above, but if you come across a smart spam bot that is able to parse JavaScript, Unicode, etc., they are worthless. Spam bots also usually ignore the robots.txt file.

Advanced spam bots identify themselves as a normal MS browser, which makes it harder to detect them. So the only way to detect such a spam bot is by analysing its activity. Does the visitor act like a robot? Real users normally don't visit 100 pages per second ;)
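To make the "does the visitor act like a robot" test concrete, here is a rough sketch that flags any client making too many requests inside a short time window. Everything in it is an assumption for illustration: the name findSuspectClients, the hit format ({ ip, time } with time in milliseconds), and the thresholds you pass in.

```javascript
// Hypothetical helper: return the IPs that made more than maxHits
// requests within any window of windowMs milliseconds.
function findSuspectClients(hits, maxHits, windowMs) {
  // Group request times by client IP.
  var byClient = new Map();
  hits.forEach(function (hit) {
    if (!byClient.has(hit.ip)) byClient.set(hit.ip, []);
    byClient.get(hit.ip).push(hit.time);
  });

  var suspects = [];
  byClient.forEach(function (times, ip) {
    times.sort(function (a, b) { return a - b; });
    var start = 0;
    for (var end = 0; end < times.length; end++) {
      // Slide the window start forward until it spans less than windowMs.
      while (times[end] - times[start] >= windowMs) start++;
      if (end - start + 1 > maxHits) {
        suspects.push(ip);
        break;
      }
    }
  });
  return suspects;
}

var sampleHits = [
  { ip: "1.1.1.1", time: 0 },
  { ip: "1.1.1.1", time: 100 },
  { ip: "1.1.1.1", time: 200 },
  { ip: "1.1.1.1", time: 300 },
  { ip: "2.2.2.2", time: 0 },
  { ip: "2.2.2.2", time: 5000 }
];
console.log(findSuspectClients(sampleHits, 3, 1000));
```

Legitimate search engine crawlers can also trip a rate check like this, so the thresholds need tuning against your own logs.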

To prevent "attacks" from spam bots you really need advanced IP blocking/delivery. This is actually quite similar to detecting and maintaining a list of SE bots when dealing with cloaking. However, it takes a lot of time and effort to maintain such a list, so I recommend that you subscribe to a spam bot list.

I'm sure that Brett has got a lot of experience keeping spam bots out of WmW.

Nick_W

2:25 pm on Jul 5, 2002 (gmt 0)

Anyone tried this: Close to perfect .htaccess ban list [webmasterworld.com]

Nick

g1smd

2:38 pm on Jul 5, 2002 (gmt 0)

I have implemented a longer script on several pages written for other people, and seen zero spam. These techniques do work (at present).

As for the GIF image route, this is simply an image that shows your email address. It isn't clickable, can't be scanned, simply requires a human to read it with eyeballs and use fingers to type the address into the To field of the mail program. Nothing automated at all, but inconvenient to lazy web surfers.

There is another way: have a link with a 'dummy address' (like root@localhost) in it, then use a piece of JavaScript to have an OnClick event that inserts the correct address when a user clicks it. When combined with the other techniques already mentioned above, it should be quite hard for a robot to crack.
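A minimal sketch of that OnClick idea, assuming a helper called revealAddress (a name made up here) and the frank_rizzo@mysite.com address from the first post:

```javascript
// Hypothetical helper: swap the dummy address for the real one at the
// moment of the click. The address is passed in pieces so it never
// appears whole in the page source.
function revealAddress(link, user, domain) {
  link.href = "mailto:" + user + "@" + domain;
  return true; // let the browser follow the updated link
}

// In the page, the link would carry the dummy address, for example:
//   <a href="mailto:root@localhost"
//      onclick="return revealAddress(this, 'frank_rizzo', 'mysite.com')">mail me</a>

// Stand-in object so the sketch can be tried outside a browser.
var fakeLink = { href: "mailto:root@localhost" };
revealAddress(fakeLink, "frank_rizzo", "mysite.com");
console.log(fakeLink.href);
```

Visitors with JavaScript switched off would end up mailing the dummy address, so this is probably best combined with one of the other techniques.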

incywincy

2:47 pm on Jul 5, 2002 (gmt 0)

yes, i've tried .htaccess and so far so good. i still get the occasional one-man crawler who sets his UA to circumvent .htaccess traps but this is minimal. it's great, i redirect unwanted UAs to a gay sex directory, they deserve it.

it's easy to test too. just use a crawler where you can configure the User Agent and check it out.

Torben Lundsgaard

2:50 pm on Jul 5, 2002 (gmt 0)

g1smd,

I agree that the GIF image solution is very effective. However, the inconvenience of having to type in the email address manually may be an important usability issue.

I like your OnClick solution. Good thinking.

Torben

rewboss

5:45 pm on Jul 5, 2002 (gmt 0)

The single best way to stop spam bots from finding your address is to use a contact form.

I've also seen people spell out their e-mail addresses, and actually write "joe dot smith at mydomain dot com".
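If you have a lot of addresses to spell out, something like this would do it. spellOutAddress is just a name picked for the example.

```javascript
// Hypothetical helper: replace "@" and "." with the words "at" and "dot".
function spellOutAddress(address) {
  return address.replace(/@/g, " at ").replace(/\./g, " dot ");
}

console.log(spellOutAddress("joe.smith@mydomain.com"));
// prints: joe dot smith at mydomain dot com
```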

If you can't get rid of spam, one tactic is to have one e-mail address that you use for signing up to bulletin boards etc., and another e-mail address for putting up on websites. Both will attract spam, of course, most probably the former more than the latter. You only need to check the former when you actually need to retrieve confirmation e-mails etc. The other you need to check regularly.

You would then have another e-mail address you never publish anywhere for normal correspondence.

If you don't mind looking a bit cheap, you can set up a Hotmail account for a contact e-mail addy for websites. Hotmail has a very effective spam filter which successfully diverts most unwanted mail into a Junk Mail folder. Occasionally other mail gets in there too, but the web interface has a "This is not junk mail" button to deal with that. It's surprisingly effective.

Yet another technique is to write your address like this: joe.smithKILLSPAM@mydomain.com and ask people to remove everything in capital letters from the address before sending the mail.