I understand that Mozilla is a spider. Which one? There have to be at least two dozen that have visited my site. I can identify some as Netscape and others as Microsoft browsers, but beyond that I'm lost.
Can you identify an agent in a robots.txt file by its IP address as opposed to its name?
If I disallow some files to all agents (for example, cgi-bin), will this cause all agents to ignore all files?
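To make that last question concrete, here is a minimal sketch that checks such a rule with Python's standard urllib.robotparser module. The site URL, the agent name "AnySpider", and the file names are made up for illustration, and this only shows how that one parser reads the rule. Real spiders may behave differently, which is why I am asking.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: one record for all agents, blocking only cgi-bin.
ROBOTS_TXT = """\
User-agent: *
Disallow: /cgi-bin/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def allowed(agent, path):
    """Return True if this parser would let `agent` fetch `path`."""
    # www.example.com is a placeholder host, not my real site.
    return parser.can_fetch(agent, "http://www.example.com" + path)


blocked_script = allowed("AnySpider", "/cgi-bin/form.pl")
home_page = allowed("AnySpider", "/index.html")
print("cgi-bin script allowed:", blocked_script)
print("home page allowed:", home_page)
```

If I read it correctly, only paths under /cgi-bin/ are refused and everything else stays open, but I would like confirmation that actual spiders interpret it the same way.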
I have seen a number of agents that only visit the home page. I have recently heard that, in addition to collecting change information, some agents now spider only the home page and then, after collecting all the links, revisit at a later date to spider the other pages. Has anyone got any insight into this?
If I have a number of doorway pages tailored to the major engines, what can I do to help identify the best doorway for minor engines? I really don't want them to spider all the doorways, but without a lot of analysis that may not be worth the effort, I don't know how to choose what they should spider.
Given that some spiders cover multiple engines (such as Slurp/3.0-AU (firstname.lastname@example.org; [inktomi.com...])) and that each engine has different ranking criteria and thus needs different doorways, how do I control the spidering? An example robots.txt entry would help.
Does every doorway that you want spidered need a robots meta tag?
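To show the kind of entry I have in mind, here is my own guess, checked with Python's standard urllib.robotparser module. The /doorways/ paths, the page names, the host, and the agent "SomeMinorBot" are all hypothetical; only the Slurp agent name comes from my logs. It is a sketch under those assumptions, not something I know to be correct.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical layout: one doorway page per engine under /doorways/.
# Slurp is kept away from the doorways written for other engines;
# every other agent is kept out of the doorway directory entirely.
ROBOTS_TXT = """\
User-agent: Slurp
Disallow: /doorways/altavista.html
Disallow: /doorways/excite.html

User-agent: *
Disallow: /doorways/
"""

rules = RobotFileParser()
rules.parse(ROBOTS_TXT.splitlines())

BASE = "http://www.example.com"  # placeholder host


def may_fetch(agent, path):
    """Return True if this parser would let `agent` fetch `path`."""
    return rules.can_fetch(agent, BASE + path)


for agent in ("Slurp/3.0-AU", "SomeMinorBot"):
    for path in ("/doorways/inktomi.html", "/doorways/altavista.html", "/index.html"):
        print(agent, path, may_fetch(agent, path))
```

The weakness I can see is that a record is selected by agent name only, so a single Slurp record cannot tell apart the different engines that Slurp feeds. That is the part I don't know how to handle.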
Thanks in advance.