
Forum Moderators: Ocean10000 & incrediBILL


Spiders I Hate

I really HATE to see this in my logs!

     
1:46 pm on Jan 9, 2001 (gmt 0)

10+ Year Member



40bc0ab8.dsl.flashcom.net.......64.188.10.184......EmailSiphon

I really would love it if someone could explain/show me how to poison these things with bogus addresses, or at least protect my own. I just despise SPAM!

4:54 pm on Jan 9, 2001 (gmt 0)

10+ Year Member



You don't usually want to poison them with bogus addresses. From what I've heard, this usually backfires or, at the least, doesn't work.

What you can do is look at Littleman's profile and copy the cloaking script he posted. Then, if the user agent matches an email siphon or some other address-stealing agent, give it a blank page, or send it to Yahoo or something... I'm sure that would be interesting.
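To make the idea concrete, here is a minimal sketch of the user-agent check. It is in Python rather than Perl and is not Littleman's script. The token list and the `respond_to` helper are assumptions for illustration only; fill the list from what you actually see in your own logs.

```python
# Hypothetical list of substrings that mark a harvester's User-Agent.
# "emailsiphon" comes from the log line above; anything else is a guess.
HARVESTER_TOKENS = ("emailsiphon", "cherrypicker")


def is_harvester(user_agent):
    """Return True if the User-Agent contains a known harvester token."""
    ua = (user_agent or "").lower()
    return any(token in ua for token in HARVESTER_TOKENS)


def respond_to(user_agent, redirect_to=None):
    """Pick a response for harvesters; return None for everyone else.

    The result is a (status, headers, body) tuple that the surrounding
    script would send back instead of the real page.
    """
    if not is_harvester(user_agent):
        return None  # serve the normal page
    if redirect_to:
        # Send the harvester somewhere else entirely.
        return (302, {"Location": redirect_to}, "")
    # Otherwise hand it an empty page with nothing to collect.
    return (200, {"Content-Type": "text/html"}, "")
```

Calling `respond_to("EmailSiphon")` gives the blank-page tuple, while a normal browser string gives `None` and gets the real page.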

If you have any questions about Perl, the language that script is written in, there are plenty of moderators here who would trip over themselves answering them. And there's a forum for questions just like that!

Hope this helps,

Cheers,
Han Solo

7:54 pm on Jan 9, 2001 (gmt 0)

10+ Year Member



Some people give email snatchers a page with the email addresses of antispam organizations and agencies.
Pretty evil thing to do ;)
9:36 pm on Jan 9, 2001 (gmt 0)

WebmasterWorld Senior Member mivox is a WebmasterWorld Top Contributor of All Time 10+ Year Member



That's a great idea! I have 9 visitors listed by WebLog as "E-mail Harvester" for user/agent...

I'm going to check my raw logs, and see if they all have something in common I can use to send them to some anti-spam groups...

Heh, heh, heh...

10:03 pm on Jan 9, 2001 (gmt 0)

WebmasterWorld Senior Member nffc is a WebmasterWorld Top Contributor of All Time 10+ Year Member



"Sugarplum is an automated spam-poisoner. Its purpose is to feed realistic and enticing, but totally useless or hazardous data to wandering address harvesters such as EmailSiphon, Cherry Picker, etc"

[devin.com]

Don't that sound great, automated spam-poisoner!

10:37 pm on Jan 9, 2001 (gmt 0)

WebmasterWorld Senior Member mivox is a WebmasterWorld Top Contributor of All Time 10+ Year Member



OK... here's the info on the visitors WebLog flagged as Email Harvesters:

ip-173-161.nyc-apt.primenet.com
- Crescent Internet ToolPak HTTP OLE Control v.1.0

cx970082-a.dnpt1.occa.home.com
- Crescent Internet ToolPak HTTP OLE Control v.1.0

mail.pcguru.com
- Crescent Internet ToolPak HTTP OLE Control v.1.0

mic-gws.hood.edu
- webbandit/4.35.0

aph-aug-101-1-1-246.abo.wanadoo.fr
- Mozilla/3.Mozilla/2.01 (Win95; I)

modem249.gtepacifica.net
- Mozilla/4.0 (compatible; BullsEye; Windows 95)

as1-6-159.peaknet.net
- Mozilla/3.Mozilla/2.01 (Win95; I)

212.234.180.5
- EmailSiphon

63.210.161.34 (did a full crawl of site)
- Microsoft URL Control - 6.00.8169

64.182.209.125
- Mozilla/3.Mozilla/2.01 (Win95; I)

The EmailSiphon visit is pretty self-explanatory, but what the heck is "Crescent Internet ToolPak HTTP OLE Control v.1.0"???

Any info on any of them?

11:13 pm on Jan 9, 2001 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



On my site, since I only need one contact email, I use a "contact us" Perl script that uses a form. The visitor just types in the subject, their addy, and the body text, then hits submit. The address never appears in the page source.

This really cut down on my delete keystrokes.

Definitely not a solution for all sites (especially with many addys, or if you want to spawn the email client), but may work for you.
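For anyone wondering what the server side of such a form looks like, here is a rough sketch in Python (not the Perl script described above). `CONTACT_ADDRESS` and `build_contact_message` are made-up names, and actually handing the message to a mail server is left out. The point is only that the address lives in the script, never in the HTML.

```python
from email.message import EmailMessage

# Hypothetical address; it exists only on the server, so a harvester
# reading the page source never sees it.
CONTACT_ADDRESS = "webmaster@example.com"


def build_contact_message(subject, sender, body):
    """Turn the three form fields into a message addressed to the site owner."""
    # Refuse line breaks in header fields so the form cannot be abused
    # to smuggle in extra headers or recipients.
    for value in (subject, sender):
        if "\r" in value or "\n" in value:
            raise ValueError("line breaks are not allowed in header fields")

    message = EmailMessage()
    message["To"] = CONTACT_ADDRESS
    message["From"] = CONTACT_ADDRESS   # the site sends the mail itself
    message["Reply-To"] = sender        # replies go back to the visitor
    message["Subject"] = subject
    message.set_content(body)
    return message
```

The form page only needs the three input fields and a submit button, so there is no mailto: link for a harvester to pick up.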

12:27 am on Jan 10, 2001 (gmt 0)

10+ Year Member




If you are running a cloaking script you can enter UAs you want to ban or redirect, or even IPs if you have unwanted spidering by an individual or company.

In the past I have banned or served up different pages to people running email siphoning software, WebZip, Teleport Pro, etc. My script can simply ban the user agent or ban it AND add the user IP to a blocked IP blacklist. It sends them a page telling them they have been blocked and if they want access to the site to contact the site administrator etc.
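Roughly, the "ban the user agent AND remember the IP" behaviour could look like the sketch below. This is an illustration in Python, not the script described above; `AgentFilter` and its parameters are hypothetical names, and the blocklist lives in memory only.

```python
class AgentFilter:
    """Block requests by User-Agent and, optionally, remember the IP."""

    def __init__(self, banned_tokens, remember_ip=True):
        # Compare case-insensitively against substrings of the User-Agent.
        self.banned_tokens = tuple(t.lower() for t in banned_tokens)
        self.remember_ip = remember_ip
        self.blocked_ips = set()

    def is_blocked(self, ip, user_agent):
        """Return True if this request should get the 'you are blocked' page."""
        if ip in self.blocked_ips:
            return True  # already blacklisted, whatever the UA says now
        ua = (user_agent or "").lower()
        if any(token in ua for token in self.banned_tokens):
            if self.remember_ip:
                self.blocked_ips.add(ip)
            return True
        return False


# Usage: the second request fakes a browser UA but comes from a listed IP.
agents = AgentFilter(["EmailSiphon", "Teleport Pro", "WebZip"])
first = agents.is_blocked("64.188.10.184", "EmailSiphon")
second = agents.is_blocked("64.188.10.184", "Mozilla/4.0 (compatible)")
```

Remembering the IP means a later change of user agent does not help the visitor, but dial-up and shared addresses get reassigned, so a list like this probably wants pruning now and then.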

In my experience most people running this software have little knowledge of TCP/IP and don't know how to fake the User Agent. Teleport Pro and others do allow this in the config, but the users probably don't realize this is how we are finding them, or you would presume they would change it ;)

mnw

12:37 am on Jan 10, 2001 (gmt 0)

10+ Year Member



I find it interesting that one of these is running from an .edu (mic-gws.hood.edu - webbandit/4.35.0). I wonder how easy it would be to shut that one down with a "frank" discussion with the President of Hood College about the evils of e-mail harvesting and what the bad publicity would do to the college if it ever became "a news topic". How active is this bad boy?
12:42 am on Jan 10, 2001 (gmt 0)

WebmasterWorld Senior Member mivox is a WebmasterWorld Top Contributor of All Time 10+ Year Member



The thing is, I don't know if they're all actually email harvesters or not... that's the way WebLog identified them, but I don't know how accurate WebLog is about figuring these things out...

Most of them only hit my root directory (Including the EmailSiphon one), so they're not hunting too hard...

That's why I was hoping someone had heard of any of them. I really don't know if WebLog knows what it's talking about...

8:19 pm on Jan 10, 2001 (gmt 0)

10+ Year Member



Just an idea: do they follow the robots exclusion standard?

If so, it's easy:

User-agent: EmailSiphon
Disallow: /

in robots.txt

problem fixed.
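One catch: this only works if the name in the rule matches the name the spider reports (EmailSiphon in the logs above), and only if the spider reads robots.txt at all. A quick way to check what a well-behaved robot would make of a rule is Python's standard `urllib.robotparser`; the `allowed` helper below is just a made-up convenience wrapper.

```python
from urllib.robotparser import RobotFileParser


def allowed(rules, user_agent, path="/"):
    """Return True if a robot that obeys robots.txt may fetch the path."""
    parser = RobotFileParser()
    parser.parse(rules.splitlines())
    return parser.can_fetch(user_agent, path)


# The rule spelled the way the spider identifies itself in the logs.
GOOD_RULES = "User-agent: EmailSiphon\nDisallow: /\n"

# The same rule with a misspelled name matches nothing.
TYPO_RULES = "User-agent: EmailSyphon\nDisallow: /\n"

blocked = not allowed(GOOD_RULES, "EmailSiphon")        # rule applies
others_ok = allowed(GOOD_RULES, "Googlebot")            # other robots unaffected
typo_misses = allowed(TYPO_RULES, "EmailSiphon")        # misspelling lets it through
```

This only tells you how a polite parser reads the file. It says nothing about whether a given harvester bothers to ask.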

Skirril

9:26 pm on Jan 10, 2001 (gmt 0)

WebmasterWorld Senior Member mivox is a WebmasterWorld Top Contributor of All Time 10+ Year Member



None of the visitors on that list logged any hits to robots.txt... I've noticed many spiders are unlikely to bother with manners unless they're a big player OR an .edu research spider.
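If you want to run the same check on your own raw logs, a small script can list the hosts that never asked for robots.txt. The sketch below assumes Apache-style common/combined log lines; the regular expression and the `hosts_skipping_robots` name are my own, and the sample lines in the usage are invented for illustration.

```python
import re

# Matches the start of a common/combined log line:
# host ident user [time] "METHOD path protocol" status size
LOG_LINE = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[[^\]]*\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" \d{3} \S+'
)


def hosts_skipping_robots(lines):
    """Return the set of hosts that made requests but never fetched /robots.txt."""
    seen, polite = set(), set()
    for line in lines:
        match = LOG_LINE.match(line)
        if not match:
            continue  # ignore lines in some other format
        host = match.group("host")
        seen.add(host)
        # Drop any query string before comparing the path.
        if match.group("path").split("?", 1)[0] == "/robots.txt":
            polite.add(host)
    return seen - polite


# Invented sample lines; real timestamps and sizes will differ.
sample = [
    '212.234.180.5 - - [09/Jan/2001:13:46:00 +0000] "GET / HTTP/1.0" 200 1024',
    'crawler.example.edu - - [09/Jan/2001:14:00:00 +0000] "GET /robots.txt HTTP/1.0" 200 64',
    'crawler.example.edu - - [09/Jan/2001:14:00:05 +0000] "GET /index.html HTTP/1.0" 200 2048',
]
rude = hosts_skipping_robots(sample)
```

Here `rude` contains only `212.234.180.5`, since the other host fetched robots.txt before crawling.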
8:56 pm on Jan 14, 2001 (gmt 0)

10+ Year Member



Mivox wrote:
"what the heck is "Crescent Internet ToolPak HTTP OLE Control v.1.0"???"

That is an ActiveX control which programmers use to write their own browsers or crawlers. Another ID like that is "Microsoft URL Control". Since hundreds or thousands of programmers might use it, each for a different program, some of those programs might be harvesters, some might be site grabbers for offline perusal, and some might even be do-it-yourself browsers.

 
