
Forum Moderators: Ocean10000 & incrediBILL


Marvin/1.0 - arthur4.sda.t-online.de

     
7:48 pm on Feb 5, 2001 (gmt 0)

10+ Year Member



I used to know who this was.

62 hits in one minute. It looks like it did not fetch robots.txt.

Ideas?
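
For anyone who wants to check their own logs for this kind of burst, here is a rough sketch that counts requests per user agent per minute. It assumes the access log is in Apache combined format; the names `LOG_LINE` and `hits_per_minute` are made up for this example.

```python
import re
from collections import Counter

# Combined Log Format (assumed):
# host ident user [time] "request" status bytes "referer" "user-agent"
LOG_LINE = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"[^"]*" \d{3} \S+ "[^"]*" "(?P<agent>[^"]*)"'
)


def hits_per_minute(lines):
    """Count requests per (user agent, minute) pair."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if not match:
            continue  # skip lines that are not in combined format
        # Timestamps look like 05/Feb/2001:19:47:03 +0000.
        # The first 17 characters cover the date, hour and minute.
        minute = match.group("time")[:17]
        counts[(match.group("agent"), minute)] += 1
    return counts
```

Feeding it the lines of an access log (for example `hits_per_minute(open("access_log"))`, where the file name is only a placeholder) and sorting the result with `counts.most_common(10)` should show the busiest agent/minute pairs first.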

10:44 pm on Feb 5, 2001 (gmt 0)

10+ Year Member



www.sda.t-online.de is the portal site of T-Online in Germany, which uses Infoseek Germany (www.infoseek.de) for search.

What is still a mystery to me is why Marvin (the Paranoid Android :) ) isn't programmed to 'ingest' sites a little less 'vigorously', i.e. to wait a little between requests.

I mailed them about their robot's behaviour. No response yet.
(no referrals either)
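
To show what "wait a little between requests" could look like on the robot's side, here is a minimal sketch of a per-host delay. It is only an illustration, not how Marvin or any real spider is built; the class name `PoliteScheduler` and the default delay are assumptions.

```python
import time
from urllib.parse import urlsplit


class PoliteScheduler:
    """Enforce a minimum pause between requests to the same host."""

    def __init__(self, min_delay=1.0, clock=time.monotonic, sleep=time.sleep):
        self.min_delay = min_delay  # seconds between hits on one host (assumed value)
        self._clock = clock
        self._sleep = sleep
        self._last_hit = {}  # host -> time of the most recent request

    def wait_turn(self, url):
        """Sleep if the host was hit too recently; return the seconds waited."""
        host = urlsplit(url).hostname
        waited = 0.0
        last = self._last_hit.get(host)
        if last is not None:
            remaining = self.min_delay - (self._clock() - last)
            if remaining > 0:
                self._sleep(remaining)
                waited = remaining
        self._last_hit[host] = self._clock()
        return waited
```

A crawler would call `wait_turn(url)` just before each fetch. Because the delay is tracked per host name, sites with many domains or subdomains on one machine would still see bursts unless the key were the IP address instead.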

6:06 am on Feb 6, 2001 (gmt 0)

WebmasterWorld Administrator brett_tabke is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month



62 is aggressive. There has been a lot of backroom talk about what to do about Google's and Fast's spiders. If you have multiple domains (hundreds), it isn't uncommon for Google to hit 500-1000 pages a minute. Fast used to be worse, but they started some randomization routine on the URL addition order that eliminates most of the problems. Google is still a pretty big problem.

If you have ted-bill.com, tedbill, teds-bill, teds-bills-adventure, and so on, up to even a few dozen domains, Google can unleash a torrent of requests in a short time.

Thanks for info on Marvin.

5:22 pm on Feb 6, 2001 (gmt 0)

10+ Year Member



Yep, Google can take down any server in no time. They go by hosts, so whether you have a large number of domains or one domain with a large number of subdomains, they will pound like there is no tomorrow. A partial solution is to limit MaxClients: httpd will still be busy, but the whole server won't crash.

6:27 pm on Feb 6, 2001 (gmt 0)

WebmasterWorld Senior Member littleman is a WebmasterWorld Top Contributor of All Time 10+ Year Member



>partial solution is to limit MaxClients
Yeah, that is exactly what I did.

1:34 pm on Mar 28, 2001 (gmt 0)

10+ Year Member



The problem is:
The German provider T-Online owns 25% of Infoseek.de, and IS is the search engine on the T-Online homepage. Okay, but I don't think this automatically means it's a spider from IS.de. It might be that a new spider is around with a connection via T-Online.
(Really, I hope that it's a spider from IS, because theirs hasn't been around for months.)

The data for Infoseek Sidewinder are:
#UA Infoseek Sidewinder/0.9
idefix.sda.t-online.de
195.145.119.24
#UA Infoseek Sidewinder/0.9
miraculix.sda.t-online.de
195.145.119.25

and the data for Marvin/1.0 are quite different (as far as I could figure out):
#UA Marvin/1.0
212.184.44.10
212.184.44.13
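
If you want to check log entries against these, a small lookup table is enough. This is just a sketch built from the addresses listed above; the names `KNOWN_SPIDERS` and `identify` are made up, and no host names are filled in for the Marvin addresses because I only have the IPs.

```python
# IP -> (user agent, host name) as listed in this post.
# None means no host name was recorded for that address.
KNOWN_SPIDERS = {
    "195.145.119.24": ("Infoseek Sidewinder/0.9", "idefix.sda.t-online.de"),
    "195.145.119.25": ("Infoseek Sidewinder/0.9", "miraculix.sda.t-online.de"),
    "212.184.44.10": ("Marvin/1.0", None),
    "212.184.44.13": ("Marvin/1.0", None),
}


def identify(ip, user_agent):
    """Return the spider name if the IP and user agent both match the table."""
    entry = KNOWN_SPIDERS.get(ip)
    if entry is None:
        return "unknown address"
    expected_agent, _host = entry
    if user_agent != expected_agent:
        return "known address, unexpected user agent"
    return expected_agent
```

Matching on both the address and the user agent string helps a little against visitors that merely borrow a spider's name.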

3:33 pm on Mar 28, 2001 (gmt 0)

10+ Year Member



I checked my logfiles, and I'd like to say:
yes, I am almost sure that it is an IS.de spider!
 
