WebmasterWorld

Alternative Search Engines Forum

    
What is excite up to now?
slamming my robots.txt all weekend...
mivox




msg:463782
 9:10 pm on May 29, 2001 (gmt 0)

Three excite spiders hammered my robots.txt file all weekend...
Never requesting anything else:

jung.excite.com.......57 requests for robots.txt
daal.excite.com.......46 requests for robots.txt
tympani.excite.com....29 requests for robots.txt

Anyone else see anything like this? Any ideas what excite might stand to gain by requesting a single robots.txt file over 100 times in the course of a weekend?
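
For anyone wanting to produce the same kind of per-host tally from their own logs, here is a minimal sketch. It assumes an access log in Common Log Format with hostnames already resolved in the first field; the regular expression, the function name `robots_hits_by_host`, and the sample lines (including their timestamps) are made up for illustration and are not taken from the poster's server.

```python
import re
from collections import Counter

# Assumed layout: Common Log Format, hostname in the first field.
LOG_LINE = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[[^\]]*\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" (?P<status>\d{3})'
)

def robots_hits_by_host(lines):
    """Count requests for /robots.txt per remote host."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if not match:
            continue  # skip lines that do not fit the assumed format
        # Ignore any query string when comparing the path.
        path = match.group("path").split("?", 1)[0]
        if path == "/robots.txt":
            counts[match.group("host")] += 1
    return counts

# Hypothetical sample lines, invented for this sketch.
SAMPLE_LINES = [
    'jung.excite.com - - [26/May/2001:10:00:00 +0000] "GET /robots.txt HTTP/1.0" 200 120',
    'jung.excite.com - - [26/May/2001:10:00:05 +0000] "GET /robots.txt HTTP/1.0" 200 120',
    'daal.excite.com - - [26/May/2001:10:01:00 +0000] "GET /robots.txt HTTP/1.0" 200 120',
    'visitor.example.com - - [26/May/2001:10:02:00 +0000] "GET /index.html HTTP/1.0" 200 5120',
]

if __name__ == "__main__":
    for host, hits in robots_hits_by_host(SAMPLE_LINES).most_common():
        print(f"{host:.<30}{hits} requests for robots.txt")
```

To run it against a real file, pass an open file object instead of `SAMPLE_LINES`; the function only needs an iterable of lines.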

 

msgraph




msg:463783
 12:30 pm on May 30, 2001 (gmt 0)

Saw the same thing yesterday with their spider and was wondering what was happening myself.

Hit about 10 of my sites going after just the robots.txt file and the root html page. A few times it would request the robots file 10x in a row. I think the total number of requests was in the 70-80 range per site, just on those two files alone.

Mike_Mackin




msg:463784
 12:59 pm on May 30, 2001 (gmt 0)

>Three excite spiders

I think that the 3 excite spiders are just trying to remember how to index. Given time, they may figure it out.

mivox




msg:463785
 6:20 pm on May 30, 2001 (gmt 0)

>just trying to remember how to index

LOL... true. They are a bit out of practice, aren't they?

Nice to know I wasn't the only one whose site was being thusly molested though... Would have seemed awfully sinister if they were leaving everyone else alone.

erikt




msg:463786
 8:25 am on May 31, 2001 (gmt 0)

I have this problem as well. This is an overview of the access attempts on robots.txt on May 30 ONLY:

...4221 marcuse.excite.com
...4191 pascal.excite.com
...4163 pierce.excite.com
....243 jung.excite.com
....227 daal.excite.com
....126 tympani.excite.com
.....88 rorty.excite.com
.....81 dosa.excite.com
.....41 triangle.excite.com
.....43 (14 others under 10)

which adds up to 13424 requests on a single day, 50 times more than any other file on the site. This number has increased every day since May 22. I've tried to find an e-mail address at excite.com where I can notify them about the problem but have been unsuccessful. Help would be appreciated.
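
As a quick sanity check on the figures in this post, the following sketch sums the listed per-host counts and works out how much of the load comes from the three busiest hosts. The numbers are copied from the list above; the variable names and the "others" label for the 14 hosts under 10 requests each are just illustrative choices.

```python
# Per-host robots.txt request counts for May 30, as listed in the post.
counts = {
    "marcuse.excite.com": 4221,
    "pascal.excite.com": 4191,
    "pierce.excite.com": 4163,
    "jung.excite.com": 243,
    "daal.excite.com": 227,
    "tympani.excite.com": 126,
    "rorty.excite.com": 88,
    "dosa.excite.com": 81,
    "triangle.excite.com": 41,
    "others": 43,  # 14 other hosts with fewer than 10 requests each
}

total = sum(counts.values())

# Share of the load coming from the three busiest hosts.
top_three = sum(sorted(counts.values(), reverse=True)[:3])
top_three_share = top_three / total

# Average spacing between requests if spread evenly over 24 hours.
seconds_per_request = 24 * 60 * 60 / total

if __name__ == "__main__":
    print(f"total requests: {total}")
    print(f"top three hosts: {top_three} ({top_three_share:.1%})")
    print(f"one request every {seconds_per_request:.1f} s on average")
```

The total agrees with the 13424 stated in the post, and the three busiest hosts alone account for well over nine tenths of it.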

mivox




msg:463787
 6:14 pm on May 31, 2001 (gmt 0)

They eased off after their first assault on my site, so I let it slide... You could always try firing off a letter to a likely "generic" email address like "support@excite.com" or something similar...

dogboy




msg:463788
 6:53 pm on Jun 1, 2001 (gmt 0)

"just trying to remember how to index"... rofl

yeah, been visited, but not hard. they were looking for the robots.txt but also grabbed the index pages

littleman




msg:463789
 2:00 am on Jun 10, 2001 (gmt 0)

Mivox, did you ever get in?

jeremy goodrich




msg:463790
 12:51 pm on Jun 10, 2001 (gmt 0)

Add me to the list of questors, wanting to know if mivox got into excite.

Anxiously awaiting the results, just in case I see Architext knocking at my robots.txt...

mivox




msg:463791
 7:54 am on Jun 11, 2001 (gmt 0)

A couple months ago Excite started listing the same pages from our site that all the Inktomi sites are showing... for a while it was just our index page, and then (last week or so?) the exact same four out-of-date pages (all either 404 or redirects now) that were apparently dredged up out of Ink's dustbin files appeared in Excite...

Excite hasn't a) spidered my site itself, b) sent me any traffic, or c) shown any pages differing in any way from Ink results for our company name. I've pretty much written them off, but the sudden assault on my robots.txt definitely threw me off.

gekoviola




msg:463792
 5:39 am on Jun 13, 2001 (gmt 0)

Thousands of rapid requests for nonexistent URLs in my logs too. It seems like the mindless spider confused my sites with others, or the requests are semi-randomly generated, or maybe this is a super-intelligent multi-level scan, eh...
I remember this Excite spider behavior also happening 2 years ago.

All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved