Number of spiders accessing robots.txt

Amazed at number of spiders

         

tstalker

9:11 pm on Aug 15, 2001 (gmt 0)



I just finished doing a search and sort of accesses to my robots.txt file. It turns out I've had a little over 600 spiders accessing my site at different times over the past year. Is this possible? And how can I determine which ones are good for my site? How can I determine which ones are bad?

Also, is it possible that some of these agents and IPs are not spiders and how can I tell the difference?

Help, please.

Tim
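For anyone wanting to repeat the "search and sort" described above, here is a minimal sketch. It assumes the server writes Apache-style Combined Log Format lines; the regular expression, the function name `robots_txt_agents`, and the sample log lines are illustrative assumptions, not taken from Tim's actual setup.

```python
import re
from collections import Counter

# Assumed layout (Combined Log Format):
# host ident user [time] "request" status bytes "referer" "user-agent"
LOG_LINE = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" \d{3} \S+ '
    r'"[^"]*" "(?P<agent>[^"]*)"'
)


def robots_txt_agents(lines):
    """Count robots.txt requests per user-agent string."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if not match:
            continue  # skip lines that do not fit the assumed format
        # Ignore any query string when comparing the path.
        if match.group("path").split("?")[0] == "/robots.txt":
            counts[match.group("agent")] += 1
    return counts


# Made-up sample lines for illustration only.
sample_log = [
    '192.0.2.10 - - [15/Aug/2001:21:11:00 +0000] "GET /robots.txt HTTP/1.0" 200 68 "-" "Googlebot/2.1"',
    '192.0.2.10 - - [15/Aug/2001:21:11:02 +0000] "GET /index.html HTTP/1.0" 200 5120 "-" "Googlebot/2.1"',
    '192.0.2.44 - - [15/Aug/2001:21:12:30 +0000] "GET /robots.txt HTTP/1.0" 200 68 "-" "ExampleOfflineBrowser/1.0"',
    '192.0.2.10 - - [16/Aug/2001:03:00:00 +0000] "GET /robots.txt HTTP/1.0" 200 68 "-" "Googlebot/2.1"',
]

for agent, hits in robots_txt_agents(sample_log).most_common():
    print(f"{hits:5d}  {agent}")
```

The number of distinct keys in the returned counter is the "number of spiders" figure. Counting by user-agent string rather than by IP is a design choice here; grouping by host instead would give a different total.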

Bolotomus

11:50 pm on Aug 15, 2001 (gmt 0)

10+ Year Member



The "bad spiders" are the ones that didn't ask for your robots.txt.

Some of these requests for robots.txt are just Joe Blow user sitting at home with his Mindspring account, using some program like an offline browser.
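The "didn't ask for your robots.txt" test can be sketched roughly as follows. The function name `hosts_without_robots_fetch`, the `(host, path)` input shape, and the `min_requests` threshold are assumptions made for illustration. Ordinary browsers do not request robots.txt either, so the threshold is there to keep low-volume human visitors out of the result.

```python
from collections import Counter


def hosts_without_robots_fetch(requests, min_requests=1):
    """Return hosts that fetched pages but never asked for /robots.txt.

    requests: iterable of (host, path) pairs, e.g. parsed from an access log.
    min_requests: hypothetical cutoff; hosts with fewer page hits are ignored.
    """
    page_hits = Counter()
    asked_for_robots = set()
    for host, path in requests:
        if path == "/robots.txt":
            asked_for_robots.add(host)
        else:
            page_hits[host] += 1
    return {
        host
        for host, hits in page_hits.items()
        if hits >= min_requests and host not in asked_for_robots
    }


# Made-up sample data for illustration only.
sample_requests = [
    ("192.0.2.10", "/robots.txt"),
    ("192.0.2.10", "/index.html"),
    ("192.0.2.77", "/index.html"),
    ("192.0.2.77", "/about.html"),
    ("192.0.2.77", "/contact.html"),
    ("192.0.2.99", "/index.html"),
]

print(hosts_without_robots_fetch(sample_requests, min_requests=3))
```

A host on this list is only a candidate for closer inspection, not proof of a bad spider.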

Brett_Tabke

7:54 pm on Aug 17, 2001 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



That many spiders is very much possible. My most important job around here is not promoting the site or participating in it, but dealing with the rogue spiders. 600 sounds kinda low. I have a couple of sites where you can add a zero to that figure.

Bolotomus

3:57 am on Aug 20, 2001 (gmt 0)

10+ Year Member



Brett,

What do you mean by "dealing with the rogue spiders"? What's to deal with? Do they really consume that many resources? Seems to me each one is just like one more person using the box.

Bolot

Air

4:12 am on Aug 20, 2001 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Consider this: very few people can go through every page on your site at the rate of multiple pages per second (and some spiders seem to go through a number of times on the same visit for good measure). Not only does it slow response times for real users, it is also a terrible waste of bandwidth.
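One rough way to spot clients crawling at "multiple pages per second" is a sliding window over request timestamps. This is only a sketch: the function name `fast_clients`, the `(timestamp, host)` input shape, and the default thresholds are assumptions, and sensible limits depend on the site.

```python
from collections import defaultdict, deque


def fast_clients(events, max_requests=5, window_seconds=1.0):
    """Return hosts that made more than max_requests within any window.

    events: iterable of (timestamp_in_seconds, host), sorted by timestamp.
    """
    recent = defaultdict(deque)  # host -> timestamps inside the current window
    flagged = set()
    for timestamp, host in events:
        window = recent[host]
        window.append(timestamp)
        # Drop timestamps that have fallen out of the window.
        while timestamp - window[0] >= window_seconds:
            window.popleft()
        if len(window) > max_requests:
            flagged.add(host)
    return flagged


# Made-up sample data: one host bursts, the other browses slowly.
sample_events = [
    (0.0, "192.0.2.77"),
    (0.0, "192.0.2.10"),
    (0.1, "192.0.2.77"),
    (0.2, "192.0.2.77"),
    (0.3, "192.0.2.77"),
    (10.0, "192.0.2.10"),
    (20.0, "192.0.2.10"),
]

print(fast_clients(sample_events, max_requests=3))
```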

Tstalker,

IMO the only good ones are from places that will bring you traffic. Look for Googlebot, Scooter (some will laugh at this one), Slurp, and FAST-WebCrawler, to name a few. The rest are either Joe Blow's spider or some research-gathering effort, not to mention the various analyzers out there, as well as e-mail harvesters.
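A minimal sketch of sorting agents by the names listed above, assuming a simple case-insensitive substring match is good enough. The tuple `KNOWN_SEARCH_SPIDERS`, the function `is_search_engine_spider`, and the sample agent strings are illustrative. User-agent strings can be faked, so a match is a hint rather than proof.

```python
# Names taken from the list above; extend to taste.
KNOWN_SEARCH_SPIDERS = ("googlebot", "scooter", "slurp", "fast-webcrawler")


def is_search_engine_spider(user_agent):
    """Return True if the user-agent string mentions a known search spider."""
    lowered = user_agent.lower()
    return any(name in lowered for name in KNOWN_SEARCH_SPIDERS)


# Made-up sample agent strings for illustration only.
sample_agents = [
    "Googlebot/2.1",
    "FAST-WebCrawler/3.3",
    "ExampleHarvester/1.0",
]

for agent in sample_agents:
    label = "search engine" if is_search_engine_spider(agent) else "other"
    print(f"{label:14s} {agent}")
```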

BTW I see this is your first post, so welcome to WmW.