WarmGlow

msg:1525657 | 1:39 am on May 18, 2003 (gmt 0) |
| How often (in hours, days, bursts or whatever measurement) do you feel it would be appropriate for the robots.txt to be checked? |
| I would be very happy with a robots.txt refresh in 24 hours or less.
|
SinclairUser

msg:1525658 | 1:56 am on May 18, 2003 (gmt 0) |
Jeez, I wish I had your problems. I can't get Googlebot to visit no matter what. Even when it comes, it crawls some obscure stuff I don't want crawled. Send in the bots! - NOW!
|
jdMorgan

msg:1525659 | 1:56 am on May 18, 2003 (gmt 0) |
jrobbio, If you want to make the robots check more often, set the server Expires header for robots.txt to a shorter time. I had mine set too short last year, and wondered why the robots checked it before each and every file they requested! Jim
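
To illustrate why a too-short Expires time can cause a re-check before every request, here is a minimal sketch of the freshness test a well-behaved robot might apply to its cached copy of robots.txt. The function names and the 24-hour fallback are assumptions for illustration only; real robots may ignore the header entirely or apply their own limits.

```python
from datetime import datetime, timedelta, timezone
from email.utils import parsedate_to_datetime

# Assumption: fallback lifetime when the server sends no usable Expires header.
DEFAULT_TTL = timedelta(hours=24)


def robots_expiry(fetched_at, expires_header=None):
    """Return the moment the cached robots.txt should be considered stale."""
    if expires_header:
        try:
            expires = parsedate_to_datetime(expires_header)
        except (TypeError, ValueError):
            expires = None  # unparseable header: fall back to the default
        if expires is not None:
            if expires.tzinfo is None:
                # HTTP dates are GMT; treat a zone-less value as UTC.
                expires = expires.replace(tzinfo=timezone.utc)
            return expires
    return fetched_at + DEFAULT_TTL


def needs_refetch(now, fetched_at, expires_header=None):
    """True when the cached copy has expired and should be requested again."""
    return now >= robots_expiry(fetched_at, expires_header)
```

With an Expires value that is already in the past (or only seconds ahead), needs_refetch returns True on every call, which matches the "checked it before each and every file" behavior described above.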
|
SinclairUser

msg:1525660 | 1:58 am on May 18, 2003 (gmt 0) |
JD, What happens if you have no robots.txt? Chris.
|
jrobbio

msg:1525661 | 2:03 am on May 18, 2003 (gmt 0) |
Thanks, jd. I didn't know you could do that. However, I would appreciate your input on the question at hand.
|
jdMorgan

msg:1525662 | 2:07 am on May 18, 2003 (gmt 0) |
SinclairUser, You get a lot of 404 errors from 'bots trying to find it, cluttering up your error logs and hiding real errors! Other than that, the lack of a robots.txt file is interpreted by robots to mean, "request anything you like." A good default robots.txt file which allows unlimited access but prevents all those 404s is:
User-agent: *
Disallow:

Follow the "Disallow:" line with one blank line - some obscure old robots require it. Jim
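
As a quick check that an empty "Disallow:" really means "request anything you like", the file above can be fed to Python's standard urllib.robotparser module. This is only a sketch: the helper name allows and the sample path are made up for illustration, and it shows how Python's parser reads the file, not how any particular search engine's robot does.

```python
from urllib.robotparser import RobotFileParser

# The default file from the post above, including the trailing blank line.
DEFAULT_ROBOTS_TXT = "User-agent: *\nDisallow:\n\n"


def allows(robots_txt, user_agent, path):
    """Return True if robots_txt permits user_agent to fetch path."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, path)


if __name__ == "__main__":
    # Hypothetical path, used only to exercise the rules.
    print(allows(DEFAULT_ROBOTS_TXT, "Googlebot", "/any/page.html"))
    # For contrast, "Disallow: /" blocks the same request.
    print(allows("User-agent: *\nDisallow: /\n", "Googlebot", "/any/page.html"))
```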
|
SinclairUser

msg:1525663 | 2:25 am on May 18, 2003 (gmt 0) |
JD, RE: no robots.txt. Googlebot can crawl into the darkest recesses of my site - just so long as it crawls everything! How long does it take just to get one decent crawl? Paid inclusion and PPC look pretty good from here - in comparison to waiting forever to get indexed! Chris.
|