


EmailSiphon spider

EmailSiphon/193.251.190.3

     

stanleytk

1:10 pm on May 2, 2001 (gmt 0)

10+ Year Member



Is anyone familiar with this spider?
It sounds like a spider that grabs email addresses off web documents. Is this true?
Should I be banning this spider from my site? I resolved the IP address and found that it originates from www.wanadoo.fr, but that is all I know.

theperlyking

1:15 pm on May 2, 2001 (gmt 0)

10+ Year Member



Yes, it grabs email addresses, and yes, you should (in my opinion) ban it. It's a tool used by individuals, so it could originate from any IP.

stanleytk

1:17 pm on May 2, 2001 (gmt 0)

10+ Year Member



Thank you, I appreciate the help.

designfreak

1:22 pm on May 2, 2001 (gmt 0)

10+ Year Member



How would one ban it? I've seen solutions that only work on Unix boxes but not on NT. Any solutions?

stanleytk

1:30 pm on May 2, 2001 (gmt 0)

10+ Year Member



This is a good point.
Generally, I would place the following text
User-agent: EmailSiphon
Disallow: /
in the robots.txt file.
But I went back to my logs and found that this UA never requested robots.txt!
Does anyone know of any other way to keep him from revisiting my site?
I have modified my robots.txt to disallow him if he ever requests it, but I don't think he ever will.
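
For anyone who wants to check what that rule actually says, here is a minimal Python sketch using the standard library's urllib.robotparser. It only shows how a crawler that chooses to honor robots.txt would read the two lines above. It cannot stop a spider that never requests the file, which is what the logs suggest is happening here. The Googlebot name and the /index.html path are just illustrative.

```python
from urllib.robotparser import RobotFileParser

# The two rules quoted above, exactly as they would appear in robots.txt.
ROBOTS_TXT = """\
User-agent: EmailSiphon
Disallow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# How a *compliant* crawler would interpret the rules.
siphon_allowed = parser.can_fetch("EmailSiphon", "/index.html")
other_allowed = parser.can_fetch("Googlebot", "/index.html")

print("EmailSiphon allowed:", siphon_allowed)
print("Other crawler allowed:", other_allowed)
```

The file is purely advisory: nothing on the server enforces it, so a harvester that skips robots.txt is unaffected.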

Eric_Lander

1:26 pm on May 8, 2001 (gmt 0)

10+ Year Member



I too have noticed this spider running rampant on a number of my clients' sites... and have recently noticed a barrage of email traffic bogging down our internal POP3 server... Not exactly the situation you want to be in, you know?

Any luck for anyone banning this? Any solutions?
Any updates would be much appreciated.

Thanks!

stanleytk

1:29 pm on May 8, 2001 (gmt 0)

10+ Year Member



I had placed that disallow statement in my robots.txt, but I haven't seen EmailSiphon back at my site since to know if he will obey the robots.txt or not.

toolman

2:52 pm on May 8, 2001 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



It may or may not request robots.txt, but one sure-fire way to beat anybody or anything that tries to get at your email addresses is to remove them from the HTML. Use GIFs with your email address in them to display to visitors, and encase your email address in a formmail CGI or PHP script so no one...not even a snooping human...can get to it. Scour the server-side scripting forum for a hack.
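
A rough Python sketch of the formmail idea, for illustration only (the thread itself mentions CGI or PHP). The form submits a short recipient key, and the real address lives only in the script on the server, so it never appears in the HTML. The RECIPIENTS table, the example.com addresses, and the function name are all made up for this example, and actually sending the message is left out.

```python
from email.message import EmailMessage

# Hypothetical lookup table: the real addresses exist only on the server.
RECIPIENTS = {
    "webmaster": "webmaster@example.com",
    "sales": "sales@example.com",
}

# Hypothetical fixed sender address for mail generated by the form.
FORM_SENDER = "form@example.com"


def build_message(recipient_key, visitor_address, body):
    """Build a message for the address behind recipient_key.

    Returns None when the key is unknown, so the form cannot be used
    to mail arbitrary addresses.
    """
    address = RECIPIENTS.get(recipient_key)
    if address is None:
        return None
    msg = EmailMessage()
    msg["To"] = address
    msg["From"] = FORM_SENDER
    msg["Reply-To"] = visitor_address
    msg["Subject"] = "Message from the contact form"
    msg.set_content(body)
    return msg


message = build_message("webmaster", "visitor@example.org", "Hello")
unknown = build_message("nobody", "visitor@example.org", "Hello")
```

The HTML form would then carry only something like a hidden field with the value "webmaster", which gives a harvester nothing to collect.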

Eric_Lander

4:45 pm on May 8, 2001 (gmt 0)

10+ Year Member



That seems like an unlikely way to protect an email address from such abuse, though.

How, in your mind, could this be done through scripts residing on the server?

scott

6:12 pm on May 8, 2001 (gmt 0)

10+ Year Member



I really HATE that thing! I tried to ban it using robots.txt and apparently it "ignored" it. I got a JS program to camouflage my email addresses here:

[assmaker.50megs.com...]

Jury is still out, though. Siphon has come and gone since I installed the script, but can't really gauge if the spam I get now is from new crawls, or just leftover from before.

Now if I can only figure out how to stop them from spamming me at the e-mail address listed on my resume at monster.com....


mivox

8:26 pm on May 8, 2001 (gmt 0)

WebmasterWorld Senior Member mivox is a WebmasterWorld Top Contributor of All Time 10+ Year Member



Does anyone know if emailsiphon can 'read' unicode? If not, you can replace all the "@" signs in your HTML code with &-#-6-4-; (remove the hyphens first ;)... I had to break up the string to prevent it from being 'deciphered' by the forum script).

Browsers will read the unicode string and display "@" in its place, and email links will work like normal, but perhaps emailsiphon won't recognize it as an email address?
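
A small Python sketch of that replacement, in case it helps to automate it across pages. It swaps every "@" for the numeric character reference and then uses html.unescape to show that decoding gives back the original markup, which is roughly what a browser does. Whether EmailSiphon decodes the reference is still the open question; the function name and the example.com address are only illustrative.

```python
import html


def obfuscate_at_signs(markup):
    """Replace each "@" with its numeric character reference."""
    return markup.replace("@", "&#64;")


link = '<a href="mailto:webmaster@example.com">webmaster@example.com</a>'
encoded = obfuscate_at_signs(link)

# Decoding the references recovers the original markup.
decoded = html.unescape(encoded)

print(encoded)
```

A harvester that decodes character references before scanning would still find the address, so this is at best a speed bump.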

toadhall

5:21 am on May 9, 2001 (gmt 0)

10+ Year Member



You can scan the USER_AGENT with a script, then compare it to a list of unwanted visitors, and redirect them to a page with no email addresses, but that means all your pages will have to be wrapped in a script. Better to force the redirects within the server if you have that kind of access. See Charles Brabec's site at [mosa.unity.ncsu.edu...] for more information and a list of harvesters' id strings. If you're interested in a PHP solution, say so, and I'll post what I've written so far.
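
Until then, here is a rough Python sketch of the user-agent check, written as WSGI middleware so the pages themselves do not each need wrapping. Note the swap: it answers with 403 Forbidden rather than redirecting to an address-free page. BLOCKED_AGENTS holds only the one string from this thread and would need extending from a real harvester list; all names here are made up for the example. A harvester that fakes its user-agent string will get straight through.

```python
# Assumption: lowercase substrings of unwanted user-agent strings.
BLOCKED_AGENTS = ("emailsiphon",)


def is_harvester(user_agent):
    """Return True if the user-agent contains a blocked substring."""
    ua = (user_agent or "").lower()
    return any(token in ua for token in BLOCKED_AGENTS)


def block_harvesters(app):
    """Wrap a WSGI app so blocked user agents get a 403 response."""
    def wrapper(environ, start_response):
        if is_harvester(environ.get("HTTP_USER_AGENT")):
            start_response("403 Forbidden", [("Content-Type", "text/plain")])
            return [b"Forbidden\n"]
        return app(environ, start_response)
    return wrapper


def site(environ, start_response):
    """Stand-in for the real site."""
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"Contact page\n"]


protected_site = block_harvesters(site)
```

With server access, the same comparison can be done in the web server's own configuration instead, as suggested above.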
 
