Googlebot will crawl most days this month as well. At the moment it is finishing up the monthly update. Once that is done, the Google "Freshbot" will be out and crawling... the sooner the better IMO ;)
The Google Freshbot is still the new kid on the block, and she doesn't work every day, especially during the "dance" period. However, we all love her, and like you I can't wait to see her again :)
crawl13.googlebot.com - - [01/Jan/2003:08:35:28 -0500] "GET /robots.txt HTTP/1.0" 200 48 "-" "Googlebot-Image/1.0 (+http://www.googlebot.com/bot.html)"
crawl13.googlebot.com - - [01/Jan/2003:08:35:28 -0500] "GET /images/index_r1_c01.jpg HTTP/1.0" 200 10732 "-" "Googlebot-Image/1.0 (+http://www.googlebot.com/bot.html)"
crawl16.googlebot.com - - [01/Jan/2003:14:42:54 -0500] "GET /images/index_r1_c08.jpg HTTP/1.0" 200 4981 "-" "Googlebot-Image/1.0 (+http://www.googlebot.com/bot.html)"
crawl6.googlebot.com - - [03/Jan/2003:20:25:00 -0500] "GET /robots.txt HTTP/1.0" 200 25 "-" "Googlebot/2.1 (+http://www.googlebot.com/bot.html)"
crawl6.googlebot.com - - [03/Jan/2003:20:25:00 -0500] "GET / HTTP/1.0" 200 6579 "-" "Googlebot/2.1 (+http://www.googlebot.com/bot.html)"
crawl7.googlebot.com - - [05/Jan/2003:07:53:36 -0500] "GET /robots.txt HTTP/1.0" 404 275 "-" "Googlebot/2.1 (+http://www.googlebot.com/bot.html)"
crawl7.googlebot.com - - [05/Jan/2003:07:53:38 -0500] "GET /dialup.phtml HTTP/1.0" 200 6896 "-" "Googlebot/2.1 (+http://www.googlebot.com/bot.html)"
It seems it took Google a bit of time, and a few pulls of the / page, before it started to go deeper than the / path.
The one other thing I had done since it last visited me on the 3rd was delete my robots.txt file.
Every reference I see to robots.txt is about excluding content, not actually adding it in. I had found one reference on how to actually "ALLOW" the bot via robots.txt.
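For anyone who wants to check what a given robots.txt would permit, here is a rough sketch using Python's standard urllib.robotparser. The file contents below are made up for illustration (the actual robots.txt from this site was never posted), and the helper name can_fetch is my own. Python's parser may not match Googlebot's own rules exactly, so treat the results as a sanity check only.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt contents; the real file is not shown in this thread.
ROBOTS_TXT = """\
User-agent: Googlebot
Allow: /
"""


def can_fetch(robots_txt: str, agent: str, path: str) -> bool:
    """Return True if `agent` may fetch `path` under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, path)


# An explicit Allow rule for Googlebot.
print(can_fetch(ROBOTS_TXT, "Googlebot", "/dialup.phtml"))
# An empty file, i.e. no rules at all.
print(can_fetch("", "Googlebot", "/dialup.phtml"))
# A blanket Disallow, for contrast.
print(can_fetch("User-agent: *\nDisallow: /", "Googlebot", "/dialup.phtml"))
```

Under this parser, a file with no rules permits the same paths as one with an explicit "Allow: /", so the Allow line by itself should not be needed just to let a bot in.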
What I find interesting is this:
crawl7.googlebot.com - - [05/Jan/2003:07:53:36 -0500] "GET /robots.txt HTTP/1.0" 404 275 "-" "Googlebot/2.1 (+http://www.googlebot.com/bot.html)"
crawl7.googlebot.com - - [05/Jan/2003:07:53:38 -0500] "GET /dialup.phtml HTTP/1.0" 200 6896 "-" "Googlebot/2.1 (+http://www.googlebot.com/bot.html)"
Notice the 404 on robots.txt and how it then downloaded another page? It didn't do that the other two times, when I served a robots.txt with the Allow directive.
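If you want to line up robots.txt responses against what the bot fetched afterwards, a small script can do it. This is only a sketch: it assumes the Apache combined log format seen in the lines above, and the names LOG_PATTERN, parse_line and summarize_visits are my own. It groups Googlebot requests by crawler host and date, then records the robots.txt status and the other paths requested in that visit.

```python
import re

# Assumed: Apache "combined" log format, as in the lines quoted in this thread.
LOG_PATTERN = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) (?P<proto>[^"]+)" '
    r'(?P<status>\d{3}) (?P<size>\S+) "(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)


def parse_line(line):
    """Parse one log line into a dict, or return None if it does not match."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None
    entry = match.groupdict()
    entry["status"] = int(entry["status"])
    return entry


def summarize_visits(lines):
    """Group Googlebot requests by (host, date); note robots.txt status and pages."""
    visits = {}
    for line in lines:
        entry = parse_line(line)
        if entry is None or "Googlebot" not in entry["agent"]:
            continue
        date = entry["time"].split(":", 1)[0]  # e.g. "05/Jan/2003"
        visit = visits.setdefault(
            (entry["host"], date), {"robots_status": None, "pages": []}
        )
        if entry["path"] == "/robots.txt":
            visit["robots_status"] = entry["status"]
        else:
            visit["pages"].append(entry["path"])
    return visits


SAMPLE = [
    'crawl6.googlebot.com - - [03/Jan/2003:20:25:00 -0500] "GET /robots.txt HTTP/1.0" 200 25 "-" "Googlebot/2.1 (+http://www.googlebot.com/bot.html)"',
    'crawl6.googlebot.com - - [03/Jan/2003:20:25:00 -0500] "GET / HTTP/1.0" 200 6579 "-" "Googlebot/2.1 (+http://www.googlebot.com/bot.html)"',
    'crawl7.googlebot.com - - [05/Jan/2003:07:53:36 -0500] "GET /robots.txt HTTP/1.0" 404 275 "-" "Googlebot/2.1 (+http://www.googlebot.com/bot.html)"',
    'crawl7.googlebot.com - - [05/Jan/2003:07:53:38 -0500] "GET /dialup.phtml HTTP/1.0" 200 6896 "-" "Googlebot/2.1 (+http://www.googlebot.com/bot.html)"',
]

for key, visit in summarize_visits(SAMPLE).items():
    print(key, visit)
```

With only a couple of visits in the log, this shows what happened on each visit but cannot tell you whether the missing robots.txt is what made the bot go deeper.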