msnbot only requesting robots.txt

         

eRAZOR

12:56 pm on Nov 19, 2007 (gmt 0)

10+ Year Member



msnbot has visited my relatively new site a couple of times now, but all it does is this:

domain.com 65.55.208.135 - - [18/Nov/2007:15:32:27 +0100] "GET /robots.txt HTTP/1.0" 200 35 "-" "msnbot/1.0 (+http://search.msn.com/msnbot.htm)"

domain.com 65.55.208.135 - - [18/Nov/2007:15:32:28 +0100] "GET / HTTP/1.0" 301 1 "-" "msnbot/1.0 (+http://search.msn.com/msnbot.htm)"

Interestingly, according to the log only 1 byte is transferred for the 301 response. Google is crawling the site fine. Yahoo requests far fewer pages than Google, but it is crawling as well.
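
For anyone who wants to pull the same numbers out of their own log, here is a minimal sketch in Python. It assumes the log layout matches the two lines quoted above (virtual host first, then the usual combined format); the names LOG_RE and bot_requests are made up for this illustration, and a different LogFormat would need a different pattern.

```python
import re

# Pattern for the layout shown in the quoted log lines:
# vhost, client IP, two ident fields, [timestamp], "request", status, bytes, "referer", "agent"
LOG_RE = re.compile(
    r'(?P<vhost>\S+) (?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) (?P<proto>[^"]+)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-) "(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)

def bot_requests(lines, agent_substring="msnbot"):
    """Return (path, status, bytes_sent) for every request whose user agent matches."""
    hits = []
    for line in lines:
        match = LOG_RE.match(line.strip())
        if not match:
            continue  # skip lines that do not follow the assumed layout
        if agent_substring.lower() not in match.group("agent").lower():
            continue
        size = match.group("size")
        hits.append((
            match.group("path"),
            int(match.group("status")),
            0 if size == "-" else int(size),  # "-" means no body was sent
        ))
    return hits
```

Fed the two lines above, this lists the /robots.txt fetch with its 200 and the request for / with its 301 and 1 byte, which makes it easy to see whether the bot ever requests anything else.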

[edited by: eRAZOR at 12:59 pm (utc) on Nov. 19, 2007]

jbinbpt

1:01 pm on Nov 19, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



How about posting the contents of the robots.txt file here?

eRAZOR

1:03 pm on Nov 19, 2007 (gmt 0)

10+ Year Member



I did verify the robots.txt with the Live Webmaster Tools. It says the file is fine:

User-agent: *
Disallow: /cgi-bin/
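
As a cross-check that does not depend on the Live validator, the same two rules can be run through Python's standard urllib.robotparser. This is only a sketch: it shows how Python's parser reads the file, not how msnbot itself does, and the allowed helper and the /cgi-bin/test.cgi path are invented for the example.

```python
from urllib.robotparser import RobotFileParser

# The robots.txt content exactly as posted above.
ROBOTS_TXT = """\
User-agent: *
Disallow: /cgi-bin/
"""

def allowed(user_agent, path, robots_txt=ROBOTS_TXT):
    """Return True if the given user agent may fetch the path under these rules."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, path)

if __name__ == "__main__":
    for path in ("/", "/cgi-bin/test.cgi"):  # second path is a hypothetical example
        print(path, allowed("msnbot", path))
```

With these rules the root URL comes back as allowed and anything under /cgi-bin/ as blocked, so the file itself should not be what stops a crawler at the home page.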

walrus

3:24 pm on Nov 19, 2007 (gmt 0)

10+ Year Member



<It says the file is fine:>

They have also been reporting valid robots.txt as not fine. :)
A few of us have reported weirdness with the robots.txt validator they use.
They are definitely trying to improve, so I imagine MSNdude will post in one of these threads soon reporting they have been working to correct a few issues with this.

vincevincevince

3:32 pm on Nov 19, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Where does the 301 response go? If you are doing a 301 from / to / (for example) then any bot is going to give up.

Try using a server header checking tool or typing the request yourself in telnet. See what comes back.

Do you have some mod_rewrite rules to add trailing slashes to directories perhaps? Or, are you redirecting the visitor straight into the /cgi-bin/?
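
If telnet is not handy, the header check suggested above can be sketched in Python with the standard http.client module, which does not follow redirects on its own. Assumptions to note: the function names are invented for this example, example.com is a placeholder host, and http.client sends HTTP/1.1 rather than the HTTP/1.0 that msnbot used, so a server that treats the two differently could answer differently.

```python
import http.client
from urllib.parse import urljoin, urlsplit

MSNBOT_UA = "msnbot/1.0 (+http://search.msn.com/msnbot.htm)"

def fetch_status(host, path="/", user_agent=MSNBOT_UA):
    """Request the path once, without following redirects; return (status, Location)."""
    conn = http.client.HTTPConnection(host, timeout=10)
    try:
        conn.request("GET", path, headers={"User-Agent": user_agent})
        resp = conn.getresponse()
        return resp.status, resp.getheader("Location")
    finally:
        conn.close()

def _normalize(url):
    """Reduce a URL to the parts that matter when comparing redirect targets."""
    parts = urlsplit(url)
    return (parts.scheme.lower(), parts.netloc.lower(), parts.path or "/", parts.query)

def is_self_redirect(request_url, status, location):
    """True if the response redirects the requested URL back to itself."""
    if status not in (301, 302, 303, 307, 308) or not location:
        return False
    target = urljoin(request_url, location)  # resolve relative Location values
    return _normalize(target) == _normalize(request_url)

if __name__ == "__main__":
    host = "example.com"  # placeholder; substitute the site being checked
    status, location = fetch_status(host)
    print(status, location)
    print("redirects to itself:", is_self_redirect("http://%s/" % host, status, location))
```

A 301 from / back to / is flagged as a loop, while a 301 to a different path or query string is not. Requesting the Location target the same way shows whether the chain ever ends in a 200.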

eRAZOR

5:10 pm on Nov 19, 2007 (gmt 0)

10+ Year Member



Like I said, Google and Yahoo are doing fine. The 301 is caused by MediaWiki, which redirects to something like /index.php?title=Main_Page if you request the site with an empty path ("") or with "/".

[edited by: tedster at 7:15 pm (utc) on Nov. 19, 2007]
[edit reason] no personal urls [/edit]