Should a spider take more than a few pages at a time?

Whole site not being indexed

         

nickc001

9:10 am on Jul 5, 2002 (gmt 0)

10+ Year Member



Hi,

I have just set up an ASP script that logs the 'user agent', 'IP', 'time visited' and 'file requested' into a database whenever the user agent is NOT Mozilla.

I figure this should catch all spidering of my site, so I can see which search engines are indexing me.
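
For illustration, the filter described above could be sketched roughly like this. It is written in Python rather than ASP, and the function names and record fields are hypothetical stand-ins, not the actual script; the database write itself is left out.

```python
from datetime import datetime, timezone


def is_probable_spider(user_agent):
    """Return True when the user agent does NOT start with 'Mozilla'.

    This mirrors the simple rule from the post; it is an assumption that
    every browser of interest identifies itself with a 'Mozilla' prefix.
    """
    return not user_agent.strip().lower().startswith("mozilla")


def make_log_record(user_agent, ip, path, when=None):
    """Build the row to store, or return None when the visit looks like a browser."""
    if not is_probable_spider(user_agent):
        return None
    when = when or datetime.now(timezone.utc)
    return {
        "user_agent": user_agent,
        "ip": ip,
        "time_visited": when.isoformat(),
        "file_requested": path,
    }
```

A caller would insert the returned dictionary into the log table and skip the request whenever the result is None.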

Since implementing it yesterday I have had the Lycos, Google and Scooter (AltaVista) spiders visit my site, but they only took a few pages at a time (10 pages over 6 hours by Scooter and 12 pages over 6 hours by Google).

Is this normal? The files they have spidered seem to have no obvious linking hierarchy.

My site uses JavaScript pop-up menus, but each <a href> tag that calls a menu also links to an index page containing the links. The spiders seem to find these index pages OK but are not fetching all the pages linked from them. Is this normal?

thanks,

Nick

Brett_Tabke

11:14 pm on Jul 6, 2002 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



> Is this normal?

Yes. They could be doing a slow, routine update of URLs already in their database. They revisit those URLs slowly, looking for fresh content and/or dead content (404s).

Sooner or later, you'll see a burst of activity when a full crawl hits.

misosoph

6:44 am on Jul 7, 2002 (gmt 0)

10+ Year Member



Re: "... when the user agent is NOT Mozilla ... should catch all spidering of my site"

Remember that it won't catch Inktomi this way. For example:

j3410.inktomisearch.com - - "Mozilla/5.0 (Slurp/cat; slurp@inktomi.com; [inktomi.com...]
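
One way to cover that case is to also look for crawler names inside user agents that start with "Mozilla". The following is only a sketch in Python: the token list is an assumption based on the spiders mentioned in this thread, not a complete or authoritative list, and it would need extending as new crawlers show up in the log.

```python
# Hypothetical list of substrings that mark a crawler; extend as needed.
KNOWN_SPIDER_TOKENS = ("slurp", "googlebot", "scooter", "lycos")


def is_spider(user_agent):
    """Treat a visit as a spider if it is non-Mozilla OR names a known crawler."""
    ua = user_agent.strip().lower()
    if not ua.startswith("mozilla"):
        # Original rule: anything that does not claim to be Mozilla is logged.
        return True
    # Extra rule: a Mozilla-prefixed agent that mentions a known crawler token.
    return any(token in ua for token in KNOWN_SPIDER_TOKENS)
```

With this check, an agent beginning "Mozilla/5.0 (Slurp/cat; ..." is logged, while an ordinary browser string is still skipped.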

nickc001

8:10 am on Jul 8, 2002 (gmt 0)

10+ Year Member



Thanks guys, will change the script to account for Inktomi and then be patient, I suppose.