
Forum Moderators: Ocean10000 & incrediBILL


EmailSiphon spider

EmailSiphon/193.251.190.3

     

stanleytk

1:10 pm on May 2, 2001 (gmt 0)

Inactive Member
Account Expired

 
 


Is anyone familiar with this spider?
It sounds like a spider that grabs email addresses off web documents. Is this true?
Should I be banning this spider from my site? I resolved the IP address and found that it originates from www.wanadoo.fr, but that is all I know.
1:15 pm on May 2, 2001 (gmt 0)

Preferred Member

10+ Year Member

joined:Feb 21, 2001
posts:419
votes: 0


Yes, it grabs email addresses, and yes, you should (in my opinion) ban it. It's a tool used by individuals, so it could originate from any IP.

stanleytk

1:17 pm on May 2, 2001 (gmt 0)

Inactive Member
Account Expired

 
 


Thank you, I appreciate the help.

designfreak

1:22 pm on May 2, 2001 (gmt 0)

Inactive Member
Account Expired

 
 


How would one ban it? I've seen solutions that only work on Unix boxes but not NT - any solutions?

stanleytk

1:30 pm on May 2, 2001 (gmt 0)

Inactive Member
Account Expired

 
 


This is a good point.
Generally, I would place the following text:
User-agent: EmailSiphon
Disallow: /
in the robots.txt file.
But I went back to my logs and found that this UA did not request robots.txt!
Does anyone know of any other way to keep it from revisiting my site?
I have modified my robots.txt to disallow it if it ever requests the file, but I don't think it ever will.
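
For anyone who wants to confirm that the rule itself is written correctly, here is a minimal sketch using Python's standard urllib.robotparser. The example.com URL and the is_allowed helper name are assumptions made for illustration. It only shows what a well-behaved crawler would conclude from the file; it does nothing against a spider that never requests robots.txt.

```python
from urllib.robotparser import RobotFileParser

# The two-line rule quoted in the post above.
ROBOTS_TXT = """\
User-agent: EmailSiphon
Disallow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if a crawler that honors robots.txt may fetch `url`.

    `is_allowed` is a hypothetical helper name, not part of any library.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    page = "http://example.com/contact.html"  # placeholder URL
    print(is_allowed(ROBOTS_TXT, "EmailSiphon", page))
    print(is_allowed(ROBOTS_TXT, "Mozilla/4.0", page))
```

A compliant EmailSiphon would be refused every path, while agents not named in the file are unaffected.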
1:26 pm on May 8, 2001 (gmt 0)

Preferred Member

10+ Year Member

joined:Mar 29, 2001
posts:406
votes: 0


I too have noticed this spider running rampant on a number of my clients' sites... and have recently noticed a barrage of email traffic bogging down our internal POP3... Not exactly the situation you want to be in, you know?

Any luck for anyone banning this? Any solutions?
Any updates would be much appreciated.

Thanks!

stanleytk

1:29 pm on May 8, 2001 (gmt 0)

Inactive Member
Account Expired

 
 


I placed that disallow statement in my robots.txt, but I haven't seen EmailSiphon back at my site since, so I don't know whether it will obey robots.txt or not.
2:52 pm on May 8, 2001 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:Nov 25, 2000
posts:1786
votes: 0


It may or may not request robots.txt, but one surefire way to beat anybody or anything that tries to get at your email addresses is to remove them from the HTML. Use GIFs with your email address in them to display to visitors, and encase your email address in a formmail CGI or PHP script so no one... not even a snooping human... can get to it. Scour the server-side scripting forum for a hack.
4:45 pm on May 8, 2001 (gmt 0)

Preferred Member

10+ Year Member

joined:Mar 29, 2001
posts:406
votes: 0


That seems like an unlikely way to protect an email address from such abuse, though.

How, in your mind, could this be done through scripts residing on the server?

6:12 pm on May 8, 2001 (gmt 0)

New User

10+ Year Member

joined:July 27, 2004
posts:2
votes: 0


I really HATE that thing! I tried to ban it using robots.txt and apparently it "ignored" it. I got a JS program to camouflage my email addresses here:

[assmaker.50megs.com...]

Jury is still out, though. Siphon has come and gone since I installed the script, but I can't really gauge whether the spam I get now is from new crawls or just left over from before.

Now if I can only figure out how to stop them from spamming me at my e-mail address listed on my resume at monster.com....

Edited by: scott

8:26 pm on May 8, 2001 (gmt 0)

Senior Member

WebmasterWorld Senior Member mivox is a WebmasterWorld Top Contributor of All Time 10+ Year Member

joined:Dec 6, 2000
posts:3928
votes: 0


Does anyone know if emailsiphon can 'read' HTML character entities? If not, you can replace all the "@" signs in your HTML code with &-#-6-4-; (remove the hyphens first ;)... I had to break up the string to prevent it from being 'deciphered' by the forum script).

Browsers will read the entity and display "@" in its place, and email links will work like normal, but perhaps emailsiphon won't recognize it as an email address?
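
A minimal sketch of that replacement in Python, for anyone who wants to run it over a batch of pages. The obfuscate_at name and the example.com address are assumptions made for illustration. Whether EmailSiphon decodes entities is still the open question, so treat this as a speed bump and not a guarantee.

```python
import html


def obfuscate_at(markup: str) -> str:
    """Replace every literal "@" with its decimal character reference.

    `obfuscate_at` is a hypothetical helper name. Browsers decode the
    reference back to "@", so visible text and mailto: links still work.
    """
    return markup.replace("@", "&#64;")


if __name__ == "__main__":
    # Placeholder address, not a real mailbox.
    sample = '<a href="mailto:webmaster@example.com">webmaster@example.com</a>'
    encoded = obfuscate_at(sample)
    print(encoded)
    # html.unescape mimics what a browser (or an entity-aware harvester) sees.
    print(html.unescape(encoded) == sample)
```

The last line is the catch: anything that decodes entities the way a browser does recovers the original address.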

5:21 am on May 9, 2001 (gmt 0)

Preferred Member

10+ Year Member

joined:May 9, 2001
posts:416
votes: 0


You can scan the USER_AGENT with a script, then compare it to a list of unwanted visitors, and redirect them to a page with no email addresses, but that means all your pages will have to be wrapped in a script. Better to force the redirects within the server if you have that kind of access. See Charles Brabec's site at [mosa.unity.ncsu.edu...] for more information and a list of harvesters' id strings. If you're interested in a PHP solution, say so, and I'll post what I've written so far.
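
In the meantime, here is a rough sketch of the scan-and-redirect idea, written in Python as a small WSGI app and not in PHP. The blocked list contains only the one agent named in this thread, and the /no-email.html path, the is_harvester helper, and the app function are all assumptions made for illustration. A harvester that fakes a browser user-agent string will sail straight past this.

```python
# Lowercase substrings to look for in the User-Agent header.
# Only EmailSiphon comes from this thread; extend the tuple from your own logs.
BLOCKED_AGENTS = ("emailsiphon",)

# Hypothetical address-free page that unwanted visitors are sent to.
NO_EMAIL_PAGE = "/no-email.html"


def is_harvester(user_agent: str) -> bool:
    """Return True if the user-agent contains any blocked substring."""
    ua = user_agent.lower()
    return any(token in ua for token in BLOCKED_AGENTS)


def app(environ, start_response):
    """Minimal WSGI app: redirect blocked agents, serve everyone else."""
    user_agent = environ.get("HTTP_USER_AGENT", "")
    path = environ.get("PATH_INFO", "/")
    # Skip the redirect on the target page itself to avoid a loop.
    if is_harvester(user_agent) and path != NO_EMAIL_PAGE:
        start_response("302 Found", [("Location", NO_EMAIL_PAGE)])
        return [b""]
    start_response("200 OK", [("Content-Type", "text/html")])
    return [b"<p>Page content goes here.</p>"]


if __name__ == "__main__":
    from wsgiref.simple_server import make_server

    # Local test server on an arbitrary port.
    make_server("localhost", 8000, app).serve_forever()
```

As noted above, wrapping every page in a script is the cost of this approach; a server-level rule does the same job without touching the pages.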