Forum Moderators: goodroi

Incorrect Robots.txt URL in server Logs

     
4:59 pm on Sep 17, 2012 (gmt 0)

Junior Member

5+ Year Member

joined:Apr 8, 2008
posts:107
votes: 0


Hey Guys,

I'm seeing some seriously strange stuff in our log files.

After receiving a warning in Google Webmaster Tools (GWT) that robots.txt was inaccessible, we checked the server logs and are seeing the following:

66.249.73.200 www.example.com - [16/Sep/2012:12:21:54 -0400] "GET /exampleproducts/product-2012.htmlrobots.txt HTTP/1.1" 301 - "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)" "-"

Any idea why Google would request incorrect URLs like this? Anyone seeing anything similar?

Thanks,

-t
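
To see how widespread requests like this are, a minimal Python sketch along these lines could scan the log for requests whose path ends in robots.txt but is not the root /robots.txt. It assumes every line follows the layout of the one quoted above (quoted request, then the status code); the function name and regex are illustrative and not taken from any log tool.

```python
import re

# Matches the quoted request and the status code that follows it,
# assuming the layout of the log line quoted in the post.
REQUEST_RE = re.compile(
    r'"(?P<method>[A-Z]+) (?P<path>\S+) HTTP/[\d.]+" (?P<status>\d{3})'
)


def find_malformed_robots_requests(lines):
    """Return (path, status) pairs for requests that end in robots.txt
    but are not a request for the root /robots.txt."""
    hits = []
    for line in lines:
        match = REQUEST_RE.search(line)
        if not match:
            continue  # line does not look like the assumed format
        path = match.group("path")
        if path.endswith("robots.txt") and path != "/robots.txt":
            hits.append((path, int(match.group("status"))))
    return hits


sample = (
    '66.249.73.200 www.example.com - [16/Sep/2012:12:21:54 -0400] '
    '"GET /exampleproducts/product-2012.htmlrobots.txt HTTP/1.1" 301 - "-" '
    '"Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)" "-"'
)
print(find_malformed_robots_requests([sample]))
```

In practice the list of lines would come from iterating over the open log file rather than a single sample string.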
5:06 pm on Sept 17, 2012 (gmt 0)

Senior Member

WebmasterWorld Senior Member g1smd is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:July 3, 2002
posts:18903
votes: 0


That request is being redirected.

You should check where to. That could be an even bigger problem.
5:21 pm on Sept 17, 2012 (gmt 0)

Junior Member

5+ Year Member

joined:Apr 8, 2008
posts:107
votes: 0


It gets redirected to the products page, which is why we get the errors. But why would Google request a bogus URL like this?
6:42 pm on Sept 17, 2012 (gmt 0)

Junior Member

5+ Year Member

joined:Apr 8, 2008
posts:107
votes: 0


Another clue: all the affected URLs have vanity domains redirecting to them. Is Google trying to fetch the robots.txt of those vanity domains and ending up requesting it from the deep page instead? Seems like a rather dumb idea for such a smart algorithm.

examplevanityurl.com -> 301 -> example.com/deepURL.html
examplevanityurl.com/robots.txt -> example.com/deepURL.html/robots.txt

Seems silly, no?
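
One mechanism that would produce this pattern is a prefix-style redirect on the vanity domain, where whatever follows the matched prefix is appended to the target (Apache's mod_alias Redirect directive behaves this way). That is a guess about the setup, not something confirmed in the thread. A rough Python sketch of the behaviour, with a hypothetical function name and rule:

```python
def prefix_redirect(request_path, prefix, target):
    """Mimic a prefix-match redirect: the part of the path after the
    matched prefix is appended to the target unchanged."""
    if not request_path.startswith(prefix):
        return None  # rule does not apply to this request
    return target + request_path[len(prefix):]


# Hypothetical rule on the vanity domain: everything under "/" is sent
# to one deep page on the main site.
target = "http://www.example.com/exampleproducts/product-2012.html"

print(prefix_redirect("/", "/", target))
print(prefix_redirect("/robots.txt", "/", target))
```

Under that assumption the homepage request lands on the product page as intended, while a request for /robots.txt lands on product-2012.htmlrobots.txt, with no slash in between, which is the same shape as the URL in the log line. The crawler would then simply be following the redirect it was given.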
7:57 pm on Sept 17, 2012 (gmt 0)

Administrator

WebmasterWorld Administrator phranque is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Aug 10, 2004
posts:10543
votes: 8


If examplevanityurl.com is yours, I would look into why that server is doing what is essentially a sitewide redirect to a subdirectory of example.com, and fix it so that a request for robots.txt is redirected specifically to the root.
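
As a rough illustration of that fix, here is a Python sketch of the redirect decision the vanity-domain server could make: robots.txt requests go to the root of the destination host, everything else goes to the deep page. The function name and the deep-page URL are placeholders, and the real change would be made in whatever redirect configuration that server uses.

```python
from urllib.parse import urlsplit, urlunsplit

# Placeholder for the deep page the vanity domain currently points at.
DEEP_PAGE = "http://www.example.com/exampleproducts/product-2012.html"


def redirect_target(request_path, deep_page=DEEP_PAGE):
    """Choose the redirect location for a request to the vanity domain."""
    if request_path == "/robots.txt":
        # Send robots.txt requests to the root of the destination host
        # instead of appending them to the deep page URL.
        parts = urlsplit(deep_page)
        return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))
    return deep_page


print(redirect_target("/robots.txt"))
print(redirect_target("/"))
```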
 
