
Google SEO News and Discussion Forum

    
Submit www and non-www site to GWT?
onlinesource




msg:4634469
 12:26 am on Jan 2, 2014 (gmt 0)

I've heard mixed things on this and maybe I'm confused.

So, let's say my official website appears as http://example.com. I have my site listed in Google Webmaster Tools as example.com. I can go into SETTINGS > Preferred domain and set "Don't set preferred domain", which to me would mean GWT would recognize the site with or without www.

Although, I have been told by some SEO experts that I actually need to add another site to GWT for www.example.com on top of example.com. Then, www.example.com has to have no preferred domain set for it.

To me this makes no sense: Google is set to read example.com with or without www. However, some SEO experts are telling me that if you have backlinks coming from www.example.com, GWT will not spot them without a www.example.com profile.

Can somebody please clarify this before I add another site to my list and have more sites to manage with GWT?

[edited by: brotherhood_of_LAN at 12:48 am (utc) on Jan 2, 2014]
[edit reason] Using example.com [/edit]

 

lucy24




msg:4634480
 2:36 am on Jan 2, 2014 (gmt 0)

If you want to set a preferred domain form, you MUST register both versions with wmt.

Although www. is usually treated as an optional addon, it is technically a subdomain just like any other. So it is theoretically possible for example.com and www.example.com to be under the control of different people.

Adding the alternative version should take thirty seconds, tops. If both forms live on the same server and you've got a redirect in place, you don't even need to upload a fresh verification file; they'll reuse the one that's already there. After those thirty seconds, you never need to look at the non-standard site again. But you may want to glance at it occasionally to make sure it isn't picking up links in its own right. All other information will be the same.

phranque




msg:4634573
 12:28 pm on Jan 2, 2014 (gmt 0)

i always recommend registering both www and non-www hostnames in GWT.
the similarities and differences in what are reported are informative and often indicate technical problems that should be addressed.
i prefer not to set the preferred domain in GWT and prefer to fix the non-canonical hostname request problem with a 301 server redirect.

lucy24




msg:4634671
 8:07 pm on Jan 2, 2014 (gmt 0)

"i prefer not to set the preferred domain in GWT and prefer to fix the non-canonical hostname request problem with a 301 server redirect."

They're not mutually exclusive. If you don't set a preference, and if someone links to the wrong form of your name, then that wrong form can potentially show up in searches.

phranque




msg:4634678
 8:46 pm on Jan 2, 2014 (gmt 0)

That would only be true if you used robots.txt to exclude googlebot from crawling your 301 response.

lucy24




msg:4634699
 10:42 pm on Jan 2, 2014 (gmt 0)

If an index listing is based on someone else's linking text, will google list the URL as given in the link, or the URL you 301 to?

phranque




msg:4634707
 11:25 pm on Jan 2, 2014 (gmt 0)

how often have you seen an index listing based on someone else's linking text that didn't also involve a robots.txt exclusion?

lucy24




msg:4634719
 12:26 am on Jan 3, 2014 (gmt 0)

Probably never :) I'm just talking hypothetically. There's no harm in telling g### your preferred form in addition to the 301s you'd be issuing anyway. So why not take the belt-and-suspenders approach. Belt-and-braces, for you Brits.

phranque




msg:4634728
 1:34 am on Jan 3, 2014 (gmt 0)

"So why not take the belt-and-suspenders approach."


because it might hide a solvable technical deficiency in your configuration.
to me it is equivalent to using a link rel canonical element when you will never intentionally serve a non-canonical url.

onlinesource




msg:4634757
 6:12 am on Jan 3, 2014 (gmt 0)

So I just added www.example.com. I noticed that, under the same GWT account, it clones the same sitemap.xml file but shows different "Links to your Site" and a new list of crawl errors to take care of.

I am interested in scanning some of these new links for a separate disavow.

phranque




msg:4634767
 8:08 am on Jan 3, 2014 (gmt 0)

what would googlebot see when requesting http://www.example.com/robots.txt or http://www.example.com/sitemap.xml and have you checked your server access log to see if googlebot requested either url?

onlinesource




msg:4634860
 3:56 pm on Jan 3, 2014 (gmt 0)

When I view my site, by typing example.com, it goes to http://example.com and all of my canonical urls are pointed to, for example, http://example.com/page.html.

In my shopping cart, the config file has the domain set as example.com and not www.example.com.

I'm in the process of checking the access logs now. Still, what I don't get is, in Google Webmaster Tools, for the http://example.com settings, I have http://www.example.com set to display URLs as example.com and http://example.com display URLs as example.com too. Should this not make Google see my site as http://example.com? Or should I tell Google Webmaster Tools to view my http://www.example.com site with a www and for http://example.com to view the site without a www? I didn't know if that would allow me to find problems exclusive to http://www.example.com and http://example.com?

Right now, in Google Webmaster Tools, I am seeing 9 links to my site to http://www.example.com and roughly 4,811 to http://example.com. Obviously Google sees these two differently, but if they are both set to display with or without a www, this difference in links is surprising.

[edited by: aakk9999 at 4:07 pm (utc) on Jan 3, 2014]
[edit reason] Unlinked URLs (and replaced with example.com) [/edit]

aakk9999




msg:4634863
 4:24 pm on Jan 3, 2014 (gmt 0)

"Or should I tell Google Webmaster Tools to view my http://www.example.com site with a www and for http://example.com to view the site without a www?"


The idea is that you go to Google Webmaster Tools and add a new site as www.example.com to the SAME WMT account you have for example.com.

You then go to settings and select "Choose example.com as my preferred domain".

What this will allow you (as phranque said above) is to separately look for issues on www.example.com and example.com (without www) in your Google Webmaster Tools by selecting one or the other domain from the initial screen shown when you log in to the WMT account.

[edited by: Robert_Charlton at 5:35 am (utc) on Jan 11, 2014]
[edit reason] fixed typo in url to delink it [/edit]

onlinesource




msg:4634872
 5:32 pm on Jan 3, 2014 (gmt 0)

Sorry for making this so difficult. :) So right now, under my Google Webmaster Tools account, where it shows my sites and has a button that says ADD A SITE, I have example.com as one site and www.example.com as another. The settings for example.com are "Display URLs as example.com". The settings for www.example.com are "Display URLs as www.example.com".

This may all relate to my issue with the robots.txt file. Apparently, my robots.txt is not configured properly. While browsing the access log, I see it referencing the file "/usr/local/apache/htdocs/robots.txt", which is the wrong one.
======
[Wed Dec 25 15:37:32 2013] [error] [client 66.249.76.209] File does not exist: /usr/local/apache/htdocs/robots.txt
[Wed Dec 25 23:12:42 2013] [error] [client 141.101.80.254] File does not exist: /usr/local/apache/htdocs/robots.txt
[Fri Dec 27 07:05:30 2013] [error] [client 173.245.56.28] File does not exist: /usr/local/apache/htdocs/robots.txt
[Fri Jan 03 07:12:28 2014] [error] [client 141.101.80.254] File does not exist: /usr/local/apache/htdocs/robots.txt
======

It should point to my document root for my domain "/home/diplomac/public_html/robots.txt". Not sure what about the "robots.txt" file is causing this or if something else is?

onlinesource




msg:4634892
 8:10 pm on Jan 3, 2014 (gmt 0)

This may be worthy of a new thread, not sure. So if googlebot tries to load "www.example.com" it shows an error for me. I just ran checks right now and that is what's coming up. So, I guess I need to change my redirects rule or change settings in googlebot accordingly to resolve this issue?


Any clue on how that's done?

lucy24




msg:4634901
 8:42 pm on Jan 3, 2014 (gmt 0)

"The settings for example.com are 'Display URLs as example.com'. The settings for www.example.com are 'Display URLs as www.example.com'."

NOOO. The point is to give the same preference for both sites. Otherwise you're saying "A is B but B is not necessarily A."

onlinesource




msg:4635631
 11:46 pm on Jan 6, 2014 (gmt 0)

Right now, I have both domains with and without www listed. This brings up a second problem... Google Analytics. So, I can assign 1 GA property to each domain but can't use the same property for both sites.

So, if Analytics is set up to track example.com and the example.com GWT profile has the GA example.com property assigned to it, that is it. Do I need to create a second GA property for www.example.com and then assign the www.example.com GWT site to that property? The other problem is, each site in GA can only have one tracking code.

[edited by: onlinesource at 12:47 am (utc) on Jan 7, 2014]

lucy24




msg:4635657
 1:22 am on Jan 7, 2014 (gmt 0)

:: backtracking ::

"I'm in the process of checking the access logs now."

If you're talking about your raw site logs, this is one of the few things the logs will not tell you (assuming shared hosting). Logs only list the part after the domain name. If logs say something like

12.34.56.67 [time-stamp-here] "GET / HTTP/1.1" 301 et cetera

it probably means that the request was redirected between "with" and "without" forms. It won't say explicitly.

[Wed Dec 25 15:37:32 2013] [error] [client 66.249.76.209] File does not exist: /usr/local/apache/htdocs/robots.txt
[Wed Dec 25 23:12:42 2013] [error] [client 141.101.80.254] File does not exist: /usr/local/apache/htdocs/robots.txt
[Fri Dec 27 07:05:30 2013] [error] [client 173.245.56.28] File does not exist: /usr/local/apache/htdocs/robots.txt
[Fri Jan 03 07:12:28 2014] [error] [client 141.101.80.254] File does not exist: /usr/local/apache/htdocs/robots.txt
======

"It should point to my document root for my domain '/home/diplomac/public_html/robots.txt'. Not sure what about the 'robots.txt' file is causing this or if something else is?"

You can't blame robots.txt by itself, unless you've gone and put in a weird redirect and/or rewrite, or you've said something like

RewriteRule robots\.txt - [R=404]

which frankly does not seem likely. You can look in your logs-- the access logs, not the error logs-- and see if those 404'd requests are immediately preceded by something else from the same IP.

Do the quoted lines look different from an ordinary 404 response? The kind you'd get if you just make up an URL and request it with a browser. What do you see-- in logs and onscreen-- if you manually request /robots.txt at your site?
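
To check whether a 404 on /robots.txt is immediately preceded by another request from the same IP, the access log can be scanned with a short script. The sketch below assumes the log is in Apache's Common Log Format; the sample lines and the IP address are made up for illustration, and `parse` and `preceding_requests` are hypothetical helper names. Note that none of the parsed fields is a hostname, which is why a log in this format normally cannot say whether a request was for the www or the non-www form.

```python
import re

# Common Log Format: ip ident user [time] "METHOD path PROTOCOL" status size
LINE_RE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) \S+'
)


def parse(line):
    """Parse one access-log line; return None if it does not match the format."""
    m = LINE_RE.match(line)
    if not m:
        return None
    return {"ip": m["ip"], "path": m["path"], "status": int(m["status"])}


def preceding_requests(lines, path="/robots.txt", status=404):
    """For each matching request, pair it with the previous entry from the same IP."""
    last_by_ip = {}
    pairs = []
    for line in lines:
        entry = parse(line)
        if entry is None:
            continue
        if entry["path"] == path and entry["status"] == status:
            pairs.append((last_by_ip.get(entry["ip"]), entry))
        last_by_ip[entry["ip"]] = entry
    return pairs


# Hypothetical sample lines (192.0.2.1 is a documentation-only address).
SAMPLE = [
    '192.0.2.1 - - [25/Dec/2013:15:37:31 +0000] "GET / HTTP/1.1" 301 235',
    '192.0.2.1 - - [25/Dec/2013:15:37:32 +0000] "GET /robots.txt HTTP/1.1" 404 210',
]

for previous, hit in preceding_requests(SAMPLE):
    print(previous, hit)
```

If the previous entry is a 301 from the same IP, that hints the robots.txt request followed a redirect; if there is no previous entry, the 404 stands on its own.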

netmeg




msg:4635663
 1:38 am on Jan 7, 2014 (gmt 0)

Ok first you pick which one you prefer - sounds like that's http://example.com.

Then you add BOTH example.com and www.example.com to your GWT, and specify that example.com is your preferred domain (in both profiles)

Then you make sure you 301 redirect any www.example.com URL to its example.com counterpart. If you do this right, then www never shows up in an address bar again.

And with that, the issue of the GA tracking code goes away, because you only have ONE set of valid URLs - the example.com.
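
As a rough illustration of the redirect step above, here is a minimal Python sketch of the mapping a hostname-canonicalizing 301 performs. It is not server configuration: the function name `redirect_target` and the choice of example.com as the preferred host are assumptions made for illustration, and the real redirect would normally be issued by the web server itself.

```python
from urllib.parse import urlsplit, urlunsplit

CANONICAL_HOST = "example.com"  # assumption: the non-www form is the preferred one


def redirect_target(url, canonical_host=CANONICAL_HOST):
    """Return the URL a 301 should point to, or None if the host is already canonical."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    if host == canonical_host:
        return None
    # Keep scheme, path, query and fragment; swap only the hostname.
    # (Any explicit port or userinfo is dropped in this sketch.)
    return urlunsplit(
        (parts.scheme, canonical_host, parts.path, parts.query, parts.fragment)
    )


print(redirect_target("http://www.example.com/page.html"))
print(redirect_target("http://example.com/page.html"))
```

Passing `canonical_host="www.example.com"` gives the opposite direction, for sites that prefer the www form.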

onlinesource




msg:4635667
 1:57 am on Jan 7, 2014 (gmt 0)

Thanks @netmeg

"Then you add BOTH example.com and www.example.com to your GWT, and specify that example.com is your preferred domain (in both profiles) "

OK. I did. BUT both domains (with or without www) in Site Settings are set to "let Google decide" how to display. If I switch either to display as www or not, it just defaults them back to "let Google decide".

"And with that, the issue of the GA tracking code goes away, because you only have ONE set of valid URLs - the example.com. "

Since I technically do not have a default listed, does this still hold true?

aakk9999




msg:4635668
 2:03 am on Jan 7, 2014 (gmt 0)

Overlapped with netmeg's post and onlinesource's reply above

Google Webmaster Tools

Firstly, let's get back to Google Webmaster Tools. I believe you have added www.example.com to the same Google Webmaster Tools account where you already have example.com (without www).

After you log in to WMT, firstly click on the example.com site, then click on the Settings icon on the top right and select "Site settings". Make sure the Preferred Domain says "Display URLs as example.com".

Now click on the site drop-down (to the left of the Help button) and select www.example.com. Now make sure the Preferred Domain also says "Display URLs as example.com".

Google Analytics

Before we do anything about this, we need to check if example.com and www.example.com are displaying the same content for the same URL.

Open a tab in the browser and enter example.com in the address bar.
Open a second tab in the browser and enter www.example.com in the address bar.

Compare the page content of these two tabs. We need answers to a few questions:

a) is the content of the tabs the same?
b) in the second tab, where you have entered www.example.com, has the www remained in the address bar or has it disappeared?

Firstly, if the content of both tabs is the same and the www has disappeared from the address bar of the second tab, then your www is redirecting to non-www and only non-www Analytics account is used, so you do not need to worry.

If the content of both tabs is the same and www has NOT disappeared from the address bar of the second tab, then your www domain is not redirecting to the non-www version (ideally, it should). However, as you are seeing the same content, it is likely that both domains have exactly the same HTML, in which case both domains are sending their data to the SAME Google Analytics account. To verify this, right-click on each tab and select "View Page Source". Scroll to where your GA code is and check that the GA code is the same on the pages in both tabs.

Provided it is the same GA code, what you should know is that even though Google Analytics will show you only one domain after you log in (presumably non-www), the data inside will be the cumulative total of both example.com and www.example.com.

I hope my explanation makes sense.
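
The page-source comparison above could also be scripted. The sketch below assumes the pages use the classic `UA-` style property IDs; the HTML snippets and the ID `UA-0000000-1` are invented placeholders, and `tracking_ids` is a hypothetical helper name. In practice the two strings would be the saved "View Page Source" output of each tab.

```python
import re

# Matches classic Google Analytics property IDs such as UA-0000000-1.
UA_ID_RE = re.compile(r"\bUA-\d+-\d+\b")


def tracking_ids(html):
    """Return the set of UA-style property IDs found in a page's source."""
    return set(UA_ID_RE.findall(html))


# Placeholder page sources standing in for the non-www and www tabs.
non_www_source = "<script>_gaq.push(['_setAccount', 'UA-0000000-1']);</script>"
www_source = "<script>_gaq.push(['_setAccount', 'UA-0000000-1']);</script>"

same_property = tracking_ids(non_www_source) == tracking_ids(www_source)
print(same_property)
```

If the two sets differ, the hostnames are reporting to different properties and the data is not being combined.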

aakk9999




msg:4635669
 2:07 am on Jan 7, 2014 (gmt 0)

"BUT both domains (with or without www) in Site Settings are set to 'let Google decide' how to display. If I switch either to display as www or not, it just defaults them back to 'let Google decide'."

This happens after you click on [SAVE] once you change the Preferred domain option? If so, do you get any error message below the Preferred domain option settings?

onlinesource




msg:4635675
 2:19 am on Jan 7, 2014 (gmt 0)

OK. Thanks for the update.

My main site is now www. So if you try to type in my site without a www, it directs you to www.example.com. In GWT, I have both domains - example.com and www.example.com - set to display as www.example.com.

As far as analytics, the settings for my domain are set to www.example.com as the DEFAULT domain.

So, I guess everything is correct for me?

aakk9999




msg:4635677
 2:23 am on Jan 7, 2014 (gmt 0)

Yes, from what you said, all seems fine :)

onlinesource




msg:4635680
 2:39 am on Jan 7, 2014 (gmt 0)

Great.

onlinesource




msg:4636059
 1:02 am on Jan 9, 2014 (gmt 0)

Ok, so one more question. If my main site is www.example.com, in GWT, do I need to submit a robots.txt file under both the www and non-www site listing or just the www site listing, because that is how I am officially listed?

Same question with disavow? Should I submit the disavow list to the www or create a list of toxic links and domains from the www and non-www versions of my domain and upload them separately?

phranque




msg:4636063
 1:20 am on Jan 9, 2014 (gmt 0)

technically speaking, you don't really "submit" a robots.txt to GWT - you just put it in the document root directory and googlebot (and other compliant crawlers) will find it.


while you may want to serve a robots.txt from the non-www hostname to avoid a lot of 404s, you probably don't want to have any exclusions in the http://example.com/robots.txt file because you don't want to block crawlers from requesting a url to get the canonical hostname redirect.


if the non-www gets redirected to www.example.com you should submit the disavow for the www.example.com hostname.

onlinesource




msg:4636064
 1:41 am on Jan 9, 2014 (gmt 0)

"technically speaking, you don't really 'submit' a robots.txt to GWT - you just put it in the document root directory and googlebot (and other compliant crawlers) will find it."


I didn't know the technical term for providing a sitemap to GWT, but that being said I will leave the listed sitemap.xml details with both the www and non-www site.

"while you may want to serve a robots.txt from the non-www hostname to avoid a lot of 404s, you probably don't want to have any exclusions in the http://example.com/robots.txt file because you don't want to block crawlers from requesting a url to get the canonical hostname redirect."


I don't believe I have any exclusions. The only thing I do have in the robots.txt file would be "sitemap: http://www.example.com/sitemap.xml". This would pull up whether somebody accessed http://example.com/robots.txt or http://www.example.com/robots.txt.

"if the non-www gets redirected to www.example.com you should submit the disavow for the www.example.com hostname."


Thanks. I will keep this in mind.
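
One way to confirm that a robots.txt containing only a sitemap line excludes nothing is Python's standard `urllib.robotparser`. This is only a sketch, assuming Python 3.8 or later (for `site_maps()`) and that the file holds just the single line described above; the page URL being tested is a hypothetical example.

```python
from urllib.robotparser import RobotFileParser

# Assumed file contents: only the sitemap line, no Disallow rules.
ROBOTS_TXT = "Sitemap: http://www.example.com/sitemap.xml\n"

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# With no exclusions, a crawler may request any URL on the non-www host
# and so can receive the redirect to the canonical hostname.
allowed = parser.can_fetch("Googlebot", "http://example.com/page.html")
sitemaps = parser.site_maps()
print(allowed, sitemaps)
```

Adding a `Disallow` rule to the assumed contents and rerunning shows how an exclusion would stop the crawler from ever requesting the URL that redirects.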

lucy24




msg:4636070
 2:13 am on Jan 9, 2014 (gmt 0)

If you have a full-spectrum with/without www redirect in place (on my own site it's the only universal .* conditionless RewriteRule) the same robots.txt, sitemap, favicon and so on will do for both. In fact I have to assume the search engine checks to see whether a request for robots.txt or sitemap.xml under the "wrong" hostname will get redirected along with all other requests.


All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved