
Forum Moderators: Ocean10000 & incrediBILL


Mercator spider

     

\Lizard!

3:34 pm on Sep 25, 1999 (gmt 0)



Hi,

This little spider just hit me today (Have not seen it before):

Agent name: MERCATOR-1.0
IP: 204.123.13.65

Says she's from:

www.research.digital.com/SRC/mercator/

I'm especially interested in this (from the page above):

"One important interface allows new modules to be written to fetch documents using different network protocols, such as HTTP, FTP, and Gopher"

And this:

"Although the web contains a finite number of static documents (i.e., documents that are not generated on-the-fly), there are an infinite number of retrievable URLs. Three frequent causes of this inflation are URL aliases, session IDs embedded in URLs, and crawler traps. We have developed techniques to overcome some of those problems, but more innovation will be required, especially to recognize and avoid intentional crawler traps."

Anybody getting hits for dynamic pages?
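
For anyone wondering what "overcoming" URL aliases and session IDs might look like on the crawler side, here is a minimal Python sketch. It is not Mercator's actual technique, which the quoted page does not describe. The `canonicalize` function and the `SESSION_PARAMS` list are assumptions made up for illustration, and dropping everything after the first ";" in the path is a deliberate simplification.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Hypothetical list of query parameters that usually carry session IDs.
SESSION_PARAMS = {"sid", "sessionid", "phpsessid", "jsessionid"}

# Default ports that can be dropped without changing the resource.
DEFAULT_PORTS = {"http": 80, "https": 443}


def canonicalize(url):
    """Collapse common URL aliases and drop session-ID parameters."""
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    host = (parts.hostname or "").lower()

    # Keep the port only when it is not the default for the scheme.
    if parts.port and parts.port != DEFAULT_PORTS.get(scheme):
        host = f"{host}:{parts.port}"

    # An empty path and "/" name the same page; also strip
    # ";jsessionid=..." style path parameters (a simplification).
    path = (parts.path or "/").split(";", 1)[0]

    # Drop session parameters and sort the rest so that parameter
    # order does not produce a "new" URL.
    query = sorted(
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in SESSION_PARAMS
    )

    # The fragment is discarded because it never reaches the server.
    return urlunsplit((scheme, host, path, urlencode(query), ""))
```

A crawler that keeps a set of canonical URLs it has already fetched would then treat `HTTP://Example.COM:80/a/page.asp?sid=123&b=2&a=1` and `http://example.com/a/page.asp?a=1&b=2` as the same page. This does nothing against intentional crawler traps, which the quote itself says need more work.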

Brett_Tabke

5:25 pm on Sep 25, 1999 (gmt 0)

WebmasterWorld Administrator brett_tabke is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month



About 6 months ago, they hit one of our sites completely. It was on the old ISP and I didn't have access to robots.txt there. So, I whipped up a Perl script to "toy" with them (letting stuff time out, sending them junk - that sort of thing). They came back a week later and hit everything in sight with a slightly different user agent (including CGIs). I whipped up another script that was a dynamically created loop of .htm files (create a file, pull a file, create a file...). It sat there pulling that page loop for a full 20 hours. They've kept coming back and back. Evidently they think of me as a test site now. They routinely walk down the tree of all those dynamic graphics pages, and most are form-style posted URLs once you get two pages deep.
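
To illustrate why a loop like that can hold a spider for hours, here is a rough Python sketch (not the Perl script described above, which is not shown in this thread). The `/loop/N.htm` URL scheme and the `trap_page` and `crawl` names are invented for the example, and the "fetch" is simulated in-process rather than done over HTTP.

```python
import re


def trap_page(n):
    """Return HTML for page n, whose only link points at page n + 1."""
    return (
        f"<html><body><p>Page {n}</p>"
        f'<a href="/loop/{n + 1}.htm">next</a></body></html>'
    )


def crawl(start, max_pages):
    """Follow 'next' links from page start, stopping after max_pages fetches."""
    visited = []
    n = start
    while len(visited) < max_pages:
        html = trap_page(n)  # stands in for an HTTP fetch of /loop/n.htm
        visited.append(f"/loop/{n}.htm")

        # Every page links to a URL the crawler has never seen before,
        # so a "have I fetched this URL?" check never stops the walk.
        match = re.search(r'href="/loop/(\d+)\.htm"', html)
        if match is None:
            break
        n = int(match.group(1))
    return visited
```

Since each generated page has a fresh URL, only some kind of budget (`max_pages` here, or a depth or time limit) ends the crawl. A spider without one just keeps pulling pages.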

\Lizard!

9:32 pm on Sep 25, 1999 (gmt 0)



The pages it's hitting for me are also dynamic - I had been toying around with the index.asp page at my site, letting it write out the date, referer, user_agent and stuff in a comment tag; maybe that's a thing that makes Mercator curious....
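
Roughly what that comment-tag trick amounts to, sketched in Python rather than ASP (the original page code is not shown here, so the `request_comment` function and its output layout are assumptions for illustration only):

```python
import re
from datetime import datetime, timezone


def request_comment(referer, user_agent, now=None):
    """Build an HTML comment recording when and by whom a page was requested."""
    if now is None:
        now = datetime.now(timezone.utc)

    def clean(value):
        # Collapse runs of hyphens so a header value containing "-->"
        # cannot close the comment early; missing headers become "none".
        return re.sub(r"-{2,}", "-", value) if value else "none"

    return (
        f"<!-- date: {now:%Y-%m-%d %H:%M:%S} | "
        f"referer: {clean(referer)} | "
        f"user_agent: {clean(user_agent)} -->"
    )
```

Because the date changes on every request, the page never looks the same twice to a spider that compares content, which may be part of why it keeps coming back.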

I have been hit by a ton of new spiders today, dunno why but here's the bunch:

User-Agent: WWW-Collector-E/0.10970
Via: 1.0 ccuproxy1.ccu.edu.tw:3128 (Squid/2.2.STABLE4), 1.0 cache.ccu.edu.tw:3128 (Squid/2.2.STABLE4)
X-Forwarded-For: 127.0.0.1, 140.123.5.16
Cache-Control: max-age=259200
IP: 163.28.96.11

----

User agent: KANSMEN
No referer
IP: 38.170.72.194
Don
----

LWP-TRIVIAL/1.27
No referer
192.41.61.81
resolves to: virtualpromote.com
(Now there's a funny one, eh... good ole Jim stopping by,
maybe his logs made him curious ;)
----

No user agent
No Referer
IP: 210.155.98.76
Resolves to: nat7.aitai.ne.jp
----
(this one is an old frequent visitor)
LIBWWW-PERL/5.44
No Referer
IP: 209.67.119.9
Resolves to: oracle.vcommunities.com

\Lizard!

12:42 pm on Sep 27, 1999 (gmt 0)



Found out what this one was:

No user agent
No Referer
IP: 210.155.98.76
Resolves to: nat7.aitai.ne.jp

Got spammed today. Good attempt at forging headers, incl. refs to
MSN and Hotmail, but some detective work got me aitai.ne.jp

So this appears to be an email harvester...

Brett_Tabke

3:57 pm on Sep 27, 1999 (gmt 0)




I have just about given up on email harvesters. By the time you figure out who is who, it's too late. Except email siphon; it's fun to *play* with them.

Where can I find an up-to-date list of known spammer email addresses? There used to be a list circulated on Usenet, but I don't see it anymore. I like to push spammer emails at email siphon.

\Lizard!

8:08 pm on Sep 27, 1999 (gmt 0)



Hi Brett,

Don't know where to find it either.

Maybe you can find something at CAUCE somewhere; they had
a bunch of links to anti-spam sites last time I visited.

[cauce.com...]

 
