[edited by: brotherhood_of_LAN at 12:48 am (utc) on Jan 2, 2014]
[edit reason] Using example.com [/edit]
i prefer not to set the preferred domain in GWT and prefer to fix the non-canonical hostname request problem with a 301 server redirect.
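For illustration, here is a minimal Python sketch of the decision a canonical-hostname 301 makes. It is not a server configuration, only the logic. It assumes www.example.com is the preferred hostname; the names `CANONICAL_HOST` and `redirect_target` are hypothetical, and the sketch ignores ports.

```python
from urllib.parse import urlsplit, urlunsplit

# Assumption: the www hostname is the one you want indexed.
CANONICAL_HOST = "www.example.com"


def redirect_target(url):
    """Return the Location for a 301 redirect, or None if the host is already canonical."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    if host == CANONICAL_HOST:
        return None  # already on the canonical hostname, serve the page normally
    # Keep scheme, path and query; swap only the hostname (any port is dropped).
    return urlunsplit((parts.scheme, CANONICAL_HOST, parts.path or "/", parts.query, ""))


if __name__ == "__main__":
    for requested in ("http://example.com/page?x=1", "http://www.example.com/page"):
        print(requested, "->", redirect_target(requested))
```

On a real server the same rule would normally live in the web server's own redirect configuration rather than in application code.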
So why not take the belt-and-suspenders approach?
[edited by: aakk9999 at 4:07 pm (utc) on Jan 3, 2014]
[edit reason] Unlinked URLs (and replaced with example.com) [/edit]
Or should I tell Google Webmaster Tools to display my http://www.example.com site with the www and my http://example.com site without it?
[edited by: Robert_Charlton at 5:35 am (utc) on Jan 11, 2014]
[edit reason] fixed typo in url to delink it [/edit]
The settings for example.com are "Display URLs as example.com". The settings for www.example.com are "Display URLs as www.example.com".
[edited by: onlinesource at 12:47 am (utc) on Jan 7, 2014]
I'm in the process of checking the access logs now.
[Wed Dec 25 15:37:32 2013] [error] [client 188.8.131.52] File does not exist: /usr/local/apache/htdocs/robots.txt
[Wed Dec 25 23:12:42 2013] [error] [client 184.108.40.206] File does not exist: /usr/local/apache/htdocs/robots.txt
[Fri Dec 27 07:05:30 2013] [error] [client 220.127.116.11] File does not exist: /usr/local/apache/htdocs/robots.txt
[Fri Jan 03 07:12:28 2014] [error] [client 18.104.22.168] File does not exist: /usr/local/apache/htdocs/robots.txt
It should point to the document root for my domain, "/home/diplomac/public_html/robots.txt". I'm not sure whether something about the "robots.txt" file is causing this or whether it's something else.
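To see at a glance which paths the error log is complaining about, something like the following Python sketch could tally the "File does not exist" entries. It assumes the log lines have the same layout as the ones quoted above; `LINE_RE` and `missing_files` are hypothetical names, and the sample uses two of the quoted lines.

```python
import re
from collections import Counter

# Assumption: lines look like the Apache error log entries quoted above.
LINE_RE = re.compile(
    r"^\[(?P<time>[^\]]+)\] \[(?P<level>\w+)\] \[client (?P<client>[^\]]+)\] "
    r"File does not exist: (?P<path>\S+)$"
)


def missing_files(lines):
    """Count 'File does not exist' errors per filesystem path."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match:
            counts[match.group("path")] += 1
    return counts


sample = [
    "[Wed Dec 25 15:37:32 2013] [error] [client 188.8.131.52] "
    "File does not exist: /usr/local/apache/htdocs/robots.txt",
    "[Fri Jan 03 07:12:28 2014] [error] [client 18.104.22.168] "
    "File does not exist: /usr/local/apache/htdocs/robots.txt",
]

if __name__ == "__main__":
    for path, count in missing_files(sample).items():
        print(count, path)
```

If every miss sits under a directory other than your own document root, that suggests those requests were answered from a different document root than the one you expected, which is worth checking against the access logs.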
BUT both domains (with or without www) in Site Settings are set to "let Google decide" how to display. If I switch either one to display with or without www, it just defaults back to "let Google decide".
technically speaking, you don't really "submit" a robots.txt to GWT - you just put it in the document root directory and googlebot (and other compliant crawlers) will find it.
while you may want to serve a robots.txt from the non-www hostname to avoid a lot of 404s, you probably don't want to have any exclusions in the http://example.com/robots.txt file because you don't want to block crawlers from requesting a url to get the canonical hostname redirect.
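As a rough check of that point, Python's standard `urllib.robotparser` can show the difference between a robots.txt with no exclusions and one that blocks everything. This is only a sketch: the two robots.txt bodies and the `allows` helper are made up for illustration, and Googlebot's own parser may differ in details from the standard-library one.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt bodies for the non-www hostname.
OPEN_ROBOTS = "User-agent: *\nDisallow:\n"       # no exclusions
BLOCKING_ROBOTS = "User-agent: *\nDisallow: /\n"  # blocks every URL


def allows(robots_txt, url, agent="Googlebot"):
    """Return True if the given robots.txt text lets `agent` request `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    url = "http://example.com/some-page"
    print("open:", allows(OPEN_ROBOTS, url))
    print("blocking:", allows(BLOCKING_ROBOTS, url))
```

With the blocking version, a compliant crawler would never request the non-www URL, so it would never see the redirect to the canonical hostname.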
if the non-www gets redirected to www.example.com, you should submit the disavow file for the www.example.com hostname.