
Forum Moderators: mademetop


what's the point of a spider...

...if all it does is hit your root directory?

     
2:36 am on Jan 20, 2001 (gmt 0)

Senior Member

WebmasterWorld Senior Member mivox is a WebmasterWorld Top Contributor of All Time 10+ Year Member

joined:Dec 6, 2000
posts:3928
votes: 0


<rant>
I see tons of spider visits in my logs where all the little bugger does is request robots.txt, or log a hit on my root directory, or maybe (if it's feeling REALLY inquisitive) on BOTH robots.txt and the root directory, but it never requests a SINGLE html file...

I can see hitting the root once in a while to check for dead links, but today I had FIVE different inktomi spiders hit my site, each requesting either the robots.txt, the root directory or both (one spider sent TWO requests EACH for robots.txt & /)... and NOTHING ELSE! I can't imagine they need to verify a site's existence 10 times a day...

WHAT'S THE POINT????
</rant>

Please, someone enlighten me...
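
For anyone wanting to check their own logs for the same pattern, here is a rough Python sketch. It assumes the server writes the Apache "combined" log format (with the user agent as the last quoted field), and the names `paths_by_agent` and `shallow_agents` are made up for illustration. It lists every user agent whose requests never went beyond robots.txt and the root directory.

```python
import re
from collections import defaultdict

# Assumed layout (Apache "combined" format):
# host ident user [time] "METHOD path protocol" status bytes "referer" "agent"
LOG_LINE = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" \d{3} \S+ '
    r'"[^"]*" "(?P<agent>[^"]*)"'
)

# The only two paths the spiders in question ever ask for.
SHALLOW_PATHS = {"/", "/robots.txt"}


def paths_by_agent(lines):
    """Map each user agent string to the set of paths it requested."""
    seen = defaultdict(set)
    for line in lines:
        match = LOG_LINE.match(line)
        if match:  # silently skip lines that do not fit the assumed format
            seen[match.group("agent")].add(match.group("path"))
    return seen


def shallow_agents(lines):
    """Return, sorted, the agents that requested nothing but / and /robots.txt."""
    return sorted(
        agent
        for agent, paths in paths_by_agent(lines).items()
        if paths <= SHALLOW_PATHS
    )
```

Feeding it the lines of an access log (for example `shallow_agents(open("access_log"))`, where the file name is only a placeholder) gives the list of agents to look at more closely.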

5:11 am on Jan 20, 2001 (gmt 0)

Senior Member

WebmasterWorld Senior Member littleman is a WebmasterWorld Top Contributor of All Time 10+ Year Member

joined:June 17, 2000
posts:2924
votes: 0


I understand your sentiment when it comes to Ink. They are a tremendous bandwidth suck. I really don't know how they can afford all that spidering. As incredible as it may seem, I've had over 45 thousand requests from inktomi on one server in a 24 hour period. Seriously!

Now that would be cool if it only happened once every month or so, but they come in every day, several thousand times a day, day after day.

Small suggestion, try changing your index page and see if it hits any of your sub-pages.
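
To put a number on the daily request volume from one spider, something like the following Python sketch could be run over an access log. It assumes the Apache "combined" log format, and the function name `daily_hits` and the agent substring passed to it are illustrative only.

```python
import re
from collections import Counter

# Pulls the date part of the timestamp and the trailing user-agent field
# out of a line in the assumed Apache "combined" format.
ENTRY = re.compile(
    r'\[(?P<day>[^:\]]+):[^\]]*\] "[^"]*" \d{3} \S+ "[^"]*" "(?P<agent>[^"]*)"'
)


def daily_hits(lines, agent_substring):
    """Count requests per day from agents whose name contains agent_substring."""
    needle = agent_substring.lower()
    counts = Counter()
    for line in lines:
        match = ENTRY.search(line)
        if match and needle in match.group("agent").lower():
            counts[match.group("day")] += 1
    return counts
```

The result maps each date string (as it appears in the log) to a request count, which makes it easy to see whether the volume is a one-off or a daily pattern.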

Air

5:16 am on Jan 20, 2001 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:Feb 10, 2000
posts:1253
votes: 0


Method to their madness? Maybe. They could be looking for bait and switch techniques, or dynamic pages that change on each spider request, or pages that change their "last update" marker or page size every time they are requested ... or maybe the spiders are just canvassing for a donation :)

4:36 pm on Jan 20, 2001 (gmt 0)

Full Member

10+ Year Member

joined:July 22, 2000
posts:329
votes: 0


Little, I have the same thing: 40K a day, and never anything that I have submitted in the last month.

I have to agree with Air. Maybe it's the bait and switch.

They could be fishing..

5:47 pm on Jan 21, 2001 (gmt 0)

Administrator from US 

WebmasterWorld Administrator brett_tabke is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Sept 21, 1999
posts:38251
votes: 111


It wouldn't be so bad if they were sending out that many referrals on a regular basis, but there are days when they out-spider their own referrals. Google is almost as bad at times. Fast used to be the worst, but they have finally got a clue.

The repetitious spidering of Ink is baffling to me. Day in, day out, with a different agent name. It just makes me wonder what they are really doing with that data. Like repackaging data for someone else privately (govt? has always been my guess with Ink - they have the highest world coverage of any SE).

8:30 am on Jan 22, 2001 (gmt 0)

Senior Member

WebmasterWorld Senior Member littleman is a WebmasterWorld Top Contributor of All Time 10+ Year Member

joined:June 17, 2000
posts:2924
votes: 0


I am sure there is _some_ reason for this madness. You would think they'd go into conservation mode with the dot-com economy tightening its belt so much, but instead they are doing the opposite.

They have actually overloaded my server a couple of times and have made me reconfigure Apache to tighten down with 'MaxClients' restrictions.

6:57 pm on Jan 22, 2001 (gmt 0)

Senior Member

WebmasterWorld Senior Member mivox is a WebmasterWorld Top Contributor of All Time 10+ Year Member

joined:Dec 6, 2000
posts:3928
votes: 0


> Small suggestion, try changing your index page and see if it hits any of your sub-pages

Actually, the day I posted that message, I had changed my index page... added new text links to my main sub-pages in addition to my navigation buttons. Still nothing but a hit on my root & index page.

The bait & switch theory makes the most sense to me, but it does seem like it would be a terrific waste of their company resources to spend so much time doing it.