We have never used the FrontPage extensions, but a previous web host installed them on their server and somehow added them to everyone's account. At that time, we had open sub-directories (no index page), and I assume that Inktomi picked up the FrontPage extensions by spidering those open directories and finding the FrontPage directories inside our regular directories. When this previous host ran a backup, they backed up ALL of our current files and added all of our files and sub-directories to their FrontPage extension directories. This created a spidering nightmare...
About one week ago, I wrote a robots.txt file asking ALL robots to stop trying to find files in those directories. All search engines have stopped trying to spider pages within these FrontPage directories except Inktomi.
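For anyone who wants to check this kind of robots.txt before relying on it, here is a minimal sketch using Python's standard urllib.robotparser. The file contents are an assumption, not my actual file: the _vti_* names are the usual FrontPage extension directories, and /products/ is a made-up sub-directory. "Slurp" is the name Inktomi's crawler uses.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt. The _vti_* names are the usual FrontPage
# extension directories; /products/ is an invented sub-directory.
ROBOTS_TXT = """\
User-agent: *
Disallow: /_vti_bin/
Disallow: /_vti_cnf/
Disallow: /_vti_pvt/
Disallow: /_vti_log/
Disallow: /_vti_txt/
Disallow: /products/_vti_cnf/
"""


def is_allowed(path, agent="Slurp"):
    """Return True if `agent` may fetch `path` under ROBOTS_TXT."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(agent, path)


if __name__ == "__main__":
    for path in ("/index.html",
                 "/_vti_cnf/index.html",
                 "/products/_vti_cnf/page.htm",
                 "/services/_vti_cnf/page.htm"):
        print(path, "allowed" if is_allowed(path) else "blocked")
```

Disallow rules match by path prefix, so a root-level rule such as /_vti_cnf/ does not cover a copy of that directory nested inside another directory. In this sketch /services/_vti_cnf/ is still allowed because it has no rule of its own; each nested copy would need its own Disallow line.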
I went through my server logs, and it appears that each IP address Inktomi uses gets its own copy of a website's robots.txt file. Several of their IPs have requested and received the robots.txt file and have stopped trying to spider those directories. However, some Inktomi IPs haven't checked for my robots.txt file in several days, and they continue to try to spider those non-existent directories.
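The log check described above can be sketched roughly as follows. This assumes the server writes Common or Combined Log Format and that the unwanted requests contain "/_vti_" somewhere in the path; the sample lines use made-up documentation IP addresses and are not taken from my real logs.

```python
import re
from datetime import datetime

# Matches the start of a Common/Combined Log Format line:
# client IP, timestamp in brackets, then the request method and path.
LOG_LINE = re.compile(r'^(\S+) \S+ \S+ \[([^\]]+)\] "(?:GET|HEAD) (\S+)')


def summarize(lines, marker="/_vti_"):
    """Per IP: time of last /robots.txt fetch and last blocked-path request."""
    summary = {}
    for line in lines:
        match = LOG_LINE.match(line)
        if not match:
            continue  # skip lines that are not in the expected format
        ip, stamp, path = match.groups()
        when = datetime.strptime(stamp, "%d/%b/%Y:%H:%M:%S %z")
        entry = summary.setdefault(ip, {"robots": None, "blocked": None})
        if path == "/robots.txt":
            key = "robots"
        elif marker in path:
            key = "blocked"
        else:
            continue
        if entry[key] is None or when > entry[key]:
            entry[key] = when
    return summary


def still_crawling(summary):
    """IPs whose latest blocked-path request is newer than their latest
    robots.txt fetch, or that never fetched robots.txt at all."""
    return sorted(
        ip for ip, e in summary.items()
        if e["blocked"] and (e["robots"] is None or e["blocked"] > e["robots"])
    )


# Invented sample lines for illustration only.
SAMPLE = [
    '192.0.2.10 - - [20/Mar/2002:04:12:01 +0000] "GET /robots.txt HTTP/1.0" 200 120',
    '192.0.2.10 - - [20/Mar/2002:04:12:05 +0000] "GET /index.html HTTP/1.0" 200 5120',
    '192.0.2.11 - - [25/Mar/2002:09:30:44 +0000] "GET /products/_vti_cnf/page.htm HTTP/1.0" 404 310',
]

if __name__ == "__main__":
    print(still_crawling(summarize(SAMPLE)))
```

Running it over the real access log (for example, passing an open file object to summarize) would list the IPs that keep requesting the FrontPage directories without a fresh robots.txt fetch. Comparing the "robots" timestamps per IP would also give a rough idea of how often each one re-requests the file.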
Does anyone out there know the default number of days an Inktomi IP waits before it requests another copy of a robots.txt file? It seems Google, AltaVista, Ask Jeeves and most of the others check for the robots.txt file at least once per day or before they start spidering. However, Inktomi doesn't seem to follow that same pattern. If anyone has useful information, please post a reply and let me know the Inktomi schedule for updating their robots.txt information. Thank you...
(edited by: MarkHutch at 6:54 pm (utc) on Mar. 26, 2002)