
Forum Moderators: Ocean10000 & incrediBILL


Spider Identification



1:22 am on Feb 23, 2000 (gmt 0)



I understand that Mozilla is a spider. Which one? At least two dozen spiders have visited my site. I can identify some as Netscape and others as Microsoft browsers, but beyond that I'm lost.

Can you identify an agent in a robots.txt file by its IP address rather than its name?

If I disallow some files to all agents (for example, cgi-bin), will this cause all agents to ignore all files?
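For reference, this is roughly what such an entry looks like (a minimal illustrative sketch, not taken from my actual file):

```
# Applies to every agent; only the paths listed are off-limits
User-agent: *
Disallow: /cgi-bin/
```

As I understand it, a Disallow line blocks only the path it names, so everything outside /cgi-bin/ should remain crawlable, but confirmation would be welcome.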

I have seen a number of agents that only visit the home page. I have recently heard that, in addition to collecting change information, some agents now spider only the home page, collect all the links, and then re-visit at a later date to spider the other pages. Has anyone got any insight into this?

If I have a number of doorway pages tailored to the major engines, what can I do to identify the best doorway for the minor engines? I really don't want them to spider all the doorways, but without a lot of analysis, which maybe isn't worth it, I don't know how to choose what they should spider.

Given that some spiders cover multiple engines (such as Slurp/3.0-AU (slurp@inktomi.com; [inktomi.com...])), and that each engine has different ranking criteria and thus needs different doorways, how do I control the spidering? An example robots entry would help.
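Something along these lines is what I have in mind (the file names are hypothetical, and I'm assuming robots.txt matches on the agent's name token, not its IP):

```
# Hypothetical per-agent sections: each spider obeys the
# first section whose User-agent matches its name
User-agent: Slurp
Disallow: /doorway-for-others.html

User-agent: *
Disallow: /cgi-bin/
```

My understanding is that robots.txt can only exclude pages per agent; it can't point a spider toward a preferred doorway, only keep it away from the wrong ones.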

Does every doorway that you want spidered need a robots meta tag?
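That is, would each doorway page need something like this in its head section (an illustrative sketch of the tag I mean, assuming the standard index/noindex values)?

```html
<!-- On a doorway you want indexed and followed -->
<meta name="robots" content="index,follow">
<!-- On a doorway you want a spider to skip -->
<meta name="robots" content="noindex,nofollow">
```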

Thanks in advance.