what's the point of a spider...

...if all it does is hit your root directory?

         

mivox

2:36 am on Jan 20, 2001 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



<rant>
I see tons of spider visits in my logs where all the little bugger does is request robots.txt, or log a hit on my root directory, or maybe (if it's feeling REALLY inquisitive) on BOTH robots.txt and the root directory, but it never requests a SINGLE html file...

I can see hitting the root once in a while to check for dead links, but today I had FIVE different Inktomi spiders hit my site, each requesting either the robots.txt, the root directory or both (one spider sent TWO requests EACH for robots.txt & /)... and NOTHING ELSE! I can't imagine they need to verify a site's existence 10 times a day...

WHAT'S THE POINT????
</rant>

Please, someone enlighten me...
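
For anyone who wants to check their own logs for the same pattern, here is a rough Python sketch. It assumes the server writes Apache "combined" log format (an assumption; adjust the regex if your format differs), and the function name shallow_agents is just made up for illustration. It groups requests by user agent and reports the agents that never asked for anything except / and /robots.txt.

```python
import re
from collections import defaultdict

# Assumed layout (Apache combined log format):
# host ident user [time] "METHOD path proto" status bytes "referer" "agent"
LOG_LINE = re.compile(
    r'^\S+ \S+ \S+ \[[^\]]+\] "(?:\S+) (?P<path>\S+)[^"]*" '
    r'\d{3} \S+ "[^"]*" "(?P<agent>[^"]*)"'
)

# The only two paths the "shallow" spiders in this thread ever request.
SHALLOW_PATHS = {"/", "/robots.txt"}


def shallow_agents(lines):
    """Return {agent: request_count} for agents that only hit / or /robots.txt."""
    paths_by_agent = defaultdict(list)
    for line in lines:
        match = LOG_LINE.match(line)
        if match:  # silently skip lines that do not fit the assumed format
            paths_by_agent[match.group("agent")].append(match.group("path"))
    return {
        agent: len(paths)
        for agent, paths in paths_by_agent.items()
        if set(paths) <= SHALLOW_PATHS
    }
```

To try it, pass any iterable of log lines, for example an open file handle on your access log (the file name and location depend on your setup). The result only tells you which agents stayed at the front door, not why.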

littleman

5:11 am on Jan 20, 2001 (gmt 0)



I understand your sentiment when it comes to Ink. They are a tremendous bandwidth suck.. I really don't know how they can afford all that spidering. As incredible as it may seem, I've had over 45 thousand requests from Inktomi on one server in a 24-hour period. Seriously!

Now that would be cool if it only happened once every month or so, but they come in every day, several thousands of times day after day.

Small suggestion, try changing your index page and see if it hits any of your sub-pages.

Air

5:16 am on Jan 20, 2001 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Method to their madness? Maybe. They could be looking for bait and switch techniques, or dynamic pages that change on each spider request, or pages that change their "last update" marker or page size every time they are requested ... or maybe the spiders are just canvassing for a donation :)
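
To make the "pages that change on each request" idea concrete, here is a minimal sketch of how repeated fetches of the same URL could be compared. This is pure guesswork about the general technique, not a description of what Inktomi or any other engine actually does, and the function names are invented for illustration. It fingerprints each fetched body by size and SHA-256 digest and flags the page if the fetches do not all match.

```python
import hashlib


def fingerprint(body: bytes) -> tuple:
    """Summarize one fetched page body as (size in bytes, SHA-256 hex digest)."""
    return (len(body), hashlib.sha256(body).hexdigest())


def changed_between_fetches(bodies) -> bool:
    """Return True if the fetched bodies are not all identical.

    `bodies` is an iterable of bytes objects, one per fetch of the same URL.
    """
    fingerprints = {fingerprint(body) for body in bodies}
    return len(fingerprints) > 1
```

A page with a rotating banner or a timestamp would trip this just as easily as a real bait and switch, so a check like this could only ever be a first-pass filter.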

drbill

4:36 pm on Jan 20, 2001 (gmt 0)

10+ Year Member



Little, I have the same thing.. 40K a day, and never anything that I have submitted in the last month.

I have to agree with Air. Maybe it's the bait and switch.

They could be fishing..

Brett_Tabke

5:47 pm on Jan 21, 2001 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



It wouldn't be so bad if they were sending out that many referrals on a regular basis, but there are days when they out-spider their own referrals. Google is almost as bad at times. Fast used to be the worst, but they have finally got a clue.

The repetitious spidering of Ink is baffling to me. Day-in day-out with a different agent name. It just makes me wonder what they are really doing with that data. Like repackaging data for someone else privately (govt? has always been my guess with Ink - they have the highest world coverage of any SE).

littleman

8:30 am on Jan 22, 2001 (gmt 0)



I am sure there is _some_ reason for this madness. But, you would think they'd go into conservation mode with the dot-com economy tightening its belt so much, but instead they are doing the opposite.

They have actually overloaded my server a couple of times and have made me reconfigure Apache to tighten down with 'MaxClients' restrictions.

mivox

6:57 pm on Jan 22, 2001 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



> Small suggestion, try changing your index page and see if it hits any of your sub-pages

Actually, the day I posted that message, I had changed my index page... I added new text links to my main sub-pages in addition to my navigation buttons. Still nothing but a hit on my root & index page.

The bait & switch theory makes the most sense to me, but it does seem like it would be a terrific waste of their company resources to spend so much time doing it.