Forum Moderators: goodroi

Incorrect Robots.txt URL in server Logs

   
4:59 pm on Sep 17, 2012 (gmt 0)

Hey Guys,

I'm seeing some seriously strange stuff in our log files.

After receiving a warning in GWT that robots.txt was inaccessible, we checked the server logs and are seeing the following:

66.249.73.200 www.example.com - [16/Sep/2012:12:21:54 -0400] "GET /exampleproducts/product-2012.htmlrobots.txt HTTP/1.1" 301 - "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)" "-"

Any idea why Google would request incorrect URLs like this? Anyone seeing anything similar?

Thanks,

-t
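
For anyone wanting to check their own logs for the same pattern, here is a minimal sketch in Python. It assumes the log layout shown in the line above (client IP, host, ident, timestamp, request, status); the function name `malformed_robots_requests` is made up for illustration, and the regex would need adjusting for other log formats.

```python
import re

# Matches the log layout quoted above; other layouts need a different pattern.
LOG_RE = re.compile(
    r'^(?P<ip>\S+) (?P<host>\S+) \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) (?P<proto>[^"]+)" (?P<status>\d{3}) '
)

def malformed_robots_requests(lines):
    """Yield (ip, path, status) for requests ending in robots.txt that are not /robots.txt."""
    for line in lines:
        m = LOG_RE.match(line)
        if not m:
            continue  # skip lines that do not fit the assumed layout
        path = m.group("path")
        if path.endswith("robots.txt") and path != "/robots.txt":
            yield m.group("ip"), path, int(m.group("status"))

# The log line from the post, used as sample input.
sample = (
    '66.249.73.200 www.example.com - [16/Sep/2012:12:21:54 -0400] '
    '"GET /exampleproducts/product-2012.htmlrobots.txt HTTP/1.1" 301 - "-" '
    '"Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)" "-"'
)
hits = list(malformed_robots_requests([sample]))
```

Running it over a whole log file would just mean passing the open file object instead of the one-element list.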
5:06 pm on Sep 17, 2012 (gmt 0)

g1smd (WebmasterWorld Senior Member)



That request is being redirected.

You should check where to. That could be an even bigger problem.
5:21 pm on Sep 17, 2012 (gmt 0)

It gets redirected to the products page, which is why we get the errors. But why would Google request a bogus URL like this?
6:42 pm on Sep 17, 2012 (gmt 0)

Another clue: all of the affected URLs seem to have vanity TLD URLs redirecting to them. Is Google trying to access the robots.txt of these vanity URLs and, following the redirect, requesting it from the deep page instead? Seems like a rather dumb idea for such a smart algorithm.

examplevanityurl.com -> 301 -> example.com/deepURL.html
examplevanityurl.com/robots.txt -> example.com/deepURL.html/robots.txt

Seems silly, no?
7:57 pm on Sep 17, 2012 (gmt 0)

phranque (WebmasterWorld Administrator)



If examplevanityurl.com is yours, I would look into why that server is doing what is essentially a sitewide redirect to a subdirectory of example.com, and fix it so that a request for robots.txt is redirected to the root instead.
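
To illustrate the mechanism, here is a rough Python sketch. The vanity server's real configuration is not shown anywhere in this thread, so both functions are assumptions: `naive_redirect` models a prefix-style rule that appends the rest of the requested path to the target (Apache's `Redirect` directive behaves this way when the target is a single page), and `fixed_redirect` models the suggested special case for robots.txt. The target URL is taken from the log line earlier in the thread.

```python
from urllib.parse import urlsplit

# Deep page that the vanity domain redirects to (path taken from the logged request).
TARGET = "http://www.example.com/exampleproducts/product-2012.html"

def naive_redirect(request_path):
    """Assumed faulty behavior: append everything after the leading slash to the target."""
    return TARGET + request_path.lstrip("/")

def fixed_redirect(request_path):
    """Assumed fix: robots.txt goes to the target site's root, everything else to the deep page."""
    if request_path == "/robots.txt":
        parts = urlsplit(TARGET)
        return f"{parts.scheme}://{parts.netloc}/robots.txt"
    return TARGET

bad = naive_redirect("/robots.txt")
good = fixed_redirect("/robots.txt")
```

With the naive rule, a request for /robots.txt on the vanity domain lands on the same malformed `...product-2012.htmlrobots.txt` path seen in the log, which would explain why Googlebot appears to request it.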