Forum Library, Charter, Moderators: Ocean10000 & incrediBILL

Search Engine Spider and User Agent Identification Forum

    
EmailSiphon spider
EmailSiphon/193.251.190.3
stanleytk

10+ Year Member



 
Msg#: 615 posted 1:10 pm on May 2, 2001 (gmt 0)

Is anyone familiar with this spider?
It sounds like spider that grabs email addresses off web documents. Is this true?
Should I be banning this spider from my site? I resolved the IP address and found that it originates from www.wanadoo.fr, but that is all I know.

 

theperlyking

10+ Year Member



 
Msg#: 615 posted 1:15 pm on May 2, 2001 (gmt 0)

Yes, it grabs email addresses, and yes, you should (in my opinion) ban it. It's a tool used by individuals, so it could originate from any IP.

stanleytk

10+ Year Member



 
Msg#: 615 posted 1:17 pm on May 2, 2001 (gmt 0)

Thank you, I appreciate the help.

designfreak

10+ Year Member



 
Msg#: 615 posted 1:22 pm on May 2, 2001 (gmt 0)

How would one ban it? I've seen solutions that only work on Unix boxes but not NT. Any solutions?

stanleytk

10+ Year Member



 
Msg#: 615 posted 1:30 pm on May 2, 2001 (gmt 0)

This is a good point.
Generally, I would place the text:
User-agent: EmailSiphon
Disallow: /
in the robots.txt file.
But, I went back to my logs and found that this UA did not request the robots.txt!
Does anyone know of any other way to keep him from revisiting my site?
I have modified my robots.txt to disallow him if he ever requests it, but I don't think he ever will.
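As an illustration of what those two robots.txt lines mean to a crawler that does honor them, here is a minimal Python sketch using the standard library's robots.txt parser. The page path and the second user-agent name are made up for the example. It only shows how a well-behaved client would interpret the rule; it says nothing about whether EmailSiphon ever performs this check.

```python
from urllib.robotparser import RobotFileParser

# The rule quoted in the post above.
ROBOTS_TXT = """\
User-agent: EmailSiphon
Disallow: /
"""


def is_allowed(user_agent: str, path: str) -> bool:
    """Return True if robots.txt permits user_agent to fetch path."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, path)


# "/contact.html" and "SomeOtherBot" are hypothetical names.
print(is_allowed("EmailSiphon", "/contact.html"))   # False
print(is_allowed("SomeOtherBot", "/contact.html"))  # True
```

The check runs on the crawler's side, so a client that never requests robots.txt is not affected by the rule at all.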

Eric_Lander

10+ Year Member



 
Msg#: 615 posted 1:26 pm on May 8, 2001 (gmt 0)

I too have noticed this spider running rampant on a number of my client's sites... And have recently noticed a barrage of email traffic bogging down our internal POP3... Not exactly the best situation that you want to be in, you know?

Any luck for anyone banning this? Any solutions?
Any updates would be much appreciated.

Thanks!

stanleytk

10+ Year Member



 
Msg#: 615 posted 1:29 pm on May 8, 2001 (gmt 0)

I had placed that disallow statement in my robots.txt, but I haven't seen EmailSiphon back at my site since to know if he will obey the robots.txt or not.

toolman

WebmasterWorld Senior Member 10+ Year Member



 
Msg#: 615 posted 2:52 pm on May 8, 2001 (gmt 0)

It may or may not request robots.txt, but one sure-fire way to beat anybody or anything that tries to get at your email addresses is to remove them from the HTML. Use GIFs with your email in them to display to visitors, and encase your email address in a formmail CGI or PHP script so no one...not even a snooping human...can get to it. Scour the server-side scripting forum for a hack.
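To make the formmail idea concrete, here is a rough Python sketch of the server-side half. It is not the CGI or PHP script mentioned above, only an illustration of the principle: the recipient address is a constant that lives in the script and never appears in any page sent to the browser. The address, the function names, and the form field names are all assumptions, and actually sending the message (for example with smtplib) is left out.

```python
from email.message import EmailMessage

# Hypothetical address; it exists only on the server, never in the HTML.
CONTACT_ADDRESS = "webmaster@example.com"


def _one_line(value: str) -> str:
    """Collapse whitespace so form input cannot inject extra mail headers."""
    return " ".join(value.split())


def build_contact_message(sender: str, subject: str, body: str) -> EmailMessage:
    """Build the mail a contact form would send to the hidden address."""
    msg = EmailMessage()
    msg["To"] = CONTACT_ADDRESS
    msg["From"] = CONTACT_ADDRESS
    msg["Reply-To"] = _one_line(sender)
    msg["Subject"] = _one_line(subject)
    msg.set_content(body)
    return msg


message = build_contact_message(
    "visitor@example.org", "Question about your site", "Hello there."
)
print(message["To"])
```

The page itself would carry only a form that posts to this script, so a harvester reading the HTML finds no address to collect.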

Eric_Lander

10+ Year Member



 
Msg#: 615 posted 4:45 pm on May 8, 2001 (gmt 0)

It seems though that this is an unlikely alternative to protecting an email address from such abuse.

How, in your mind, could this be done through scripts residing on the server?

scott

10+ Year Member



 
Msg#: 615 posted 6:12 pm on May 8, 2001 (gmt 0)

I really HATE that thing! I tried to ban it using robots.txt and apparently it "ignored" it. I got a JS program to camouflage my email addresses here:

[assmaker.50megs.com...]

Jury is still out, though. Siphon has come and gone since I installed the script, but can't really gauge if the spam I get now is from new crawls, or just leftover from before.

Now if I can only figure out how to stop them from spamming me at my e-mail address listed on my resume at monster.com....

Edited by: scott

mivox

WebmasterWorld Senior Member mivox is a WebmasterWorld Top Contributor of All Time 10+ Year Member



 
Msg#: 615 posted 8:26 pm on May 8, 2001 (gmt 0)

Does anyone know if emailsiphon can 'read' unicode? If not, you can replace all the "@" signs in your HTML code with &-#-6-4-; (remove the hyphens first ;)... I had to break up the string to prevent it from being 'deciphered' by the forum script).

Browsers will read the unicode string and display "@" in its place, and email links will work like normal, but perhaps emailsiphon won't recognize it as an email address?
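The substitution described above can be sketched in a few lines of Python. The function name and the sample address are assumptions for the example. The second print shows that decoding the entity recovers the original address, which is also why this only helps against harvesters that do not decode entities; whether EmailSiphon does is the open question in the post.

```python
import html


def obfuscate_at_sign(address: str) -> str:
    """Replace "@" with its decimal HTML character reference."""
    return address.replace("@", "&#64;")


encoded = obfuscate_at_sign("webmaster@example.com")
print(encoded)                 # webmaster&#64;example.com
print(html.unescape(encoded))  # webmaster@example.com
```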

toadhall

10+ Year Member



 
Msg#: 615 posted 5:21 am on May 9, 2001 (gmt 0)

You can scan the USER_AGENT with a script, then compare it to a list of unwanted visitors, and redirect them to a page with no email addresses, but that means all your pages will have to be wrapped in a script. Better to force the redirects within the server if you have that kind of access. See Charles Brabec's site at [mosa.unity.ncsu.edu...] for more information and a list of harvesters' id strings. If you're interested in a PHP solution, say so, and I'll post what I've written so far.
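The script-wrapper approach might look roughly like the following. This is a Python WSGI sketch, not the PHP code offered above, and the blocklist contents, the redirect target, and all function names are assumptions. It compares the User-Agent header against a list of unwanted strings and redirects matches to a page that contains no email addresses.

```python
# Hypothetical blocklist; a real one would come from a maintained list
# of harvester id strings. Entries are lowercase substrings.
BLOCKED_AGENT_SUBSTRINGS = ("emailsiphon",)

# Hypothetical page that contains no email addresses.
SAFE_PAGE = "/no-email.html"


def is_blocked(user_agent: str) -> bool:
    """Case-insensitive substring match against the blocklist."""
    ua = user_agent.lower()
    return any(marker in ua for marker in BLOCKED_AGENT_SUBSTRINGS)


def block_harvesters(app):
    """Wrap a WSGI app so blocked agents are redirected to SAFE_PAGE."""
    def middleware(environ, start_response):
        blocked = is_blocked(environ.get("HTTP_USER_AGENT", ""))
        # Let requests for SAFE_PAGE through to avoid a redirect loop.
        if blocked and environ.get("PATH_INFO") != SAFE_PAGE:
            start_response(
                "302 Found",
                [("Location", SAFE_PAGE), ("Content-Length", "0")],
            )
            return [b""]
        return app(environ, start_response)
    return middleware


def site(environ, start_response):
    """Stand-in for the real site: a page that exposes an address."""
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"contact: webmaster@example.com"]


application = block_harvesters(site)
```

Because the check relies on the User-Agent header, it only catches harvesters that announce themselves; a tool that sends a browser-like string passes straight through.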

All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved