I submitted my new site to Google for indexing. After a month or so, I came across a weird issue with how my pages are being indexed:
If I ask Google to show me pages for www.mydomainname.com, it returns only a few results;
If I ask instead for mydomain.com, it returns many more indexed pages.
I also have a subdomain: if I search for www.sub.domain.com it returns no results, while if I search for sub.domain.com it returns almost all of the site's pages!
What does it mean?!
I gave my partners the www addresses to get inbound links from them. Why does Google find me without the www prefix?
Is it a new algo issue?
Should I change the URLs used for linking?
I was out of the SEO world for a while; maybe I missed something?
Thanks for any useful feedback.
Sincerely
We've discussed this a lot here, and the main thread I'd refer you to is: Why Does Google Treat "www" & "no-www" As Different? [webmasterworld.com]. That thread is part of our "Hot Topics" area, which is always pinned to the top of this forum's index page.
The "canonical fix" is to choose one or the other version of your urls (with or without the www) and 301 redirect every url of one type to the other type. In addition, Google Webmaster Tools also allows you to select your preferred version, and this can also help. But the 301 redirect fix should be the first line of defense against what is really a duplicate url problem -- two urls resolving to the same content. There are two great threads in Hot Topics about Duplicate Content - those should have all the information you need.
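To make the idea concrete, here is a minimal Python sketch of the decision behind such a redirect. It is not server configuration, only an illustration of the logic. The hostname www.example.com and the function name canonical_redirect are assumptions made up for this example; swap in whichever version of your hostname you picked.

```python
from urllib.parse import urlsplit, urlunsplit

# Assumed preferred hostname; use the non-www form here if that is your choice.
CANONICAL_HOST = "www.example.com"


def canonical_redirect(url, canonical_host=CANONICAL_HOST):
    """Return (301, target_url) if url is on a non-canonical host, else None."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()  # hostname only, port dropped
    if host == canonical_host:
        return None  # already canonical, serve the page normally
    # Keep the path and query string so every page maps to its twin.
    target = urlunsplit(
        (parts.scheme, canonical_host, parts.path or "/", parts.query, "")
    )
    return 301, target


print(canonical_redirect("http://example.com/page.html?x=1"))
print(canonical_redirect("http://www.example.com/page.html"))
```

The point of the sketch is that the redirect is one rule applied to every request, not a page-by-page mapping: only the hostname changes, and the rest of the URL is carried over.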
Now onward. You asked about www.sub.example.com -- if you have not set up a subdomain with the exact name www.sub on your server, then such a url does not resolve. As I said earlier, the "www" is not a mere prefix. Setting up sub does not set up www.sub.
And finally, know that the site: operator can give you buggy results, as can all Google's reporting functions. The search results are Google's primary focus, and those can be buggy too -- so their secondary reporting is naturally susceptible to more frequent issues. Trust your server logs and your own traffic data more than what you get from Google.
"Setting up sub does not set up www.sub."
It does if default templates are snafu on the DNS server.
I suspect this is very common on sites that run their own DNS and use certain widely used server control systems.
YMMV but I've seen it on cPanel/WHM setups along with other wonderful things.
The fault is with defaults ;-).
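One rough way to check whether your own DNS behaves like this is to try resolving a few hostname variants, including one you never set up. The Python sketch below assumes nothing beyond the standard library; the function name resolving_variants and the example hostnames are made up for illustration.

```python
import socket


def resolving_variants(hostnames, resolve=socket.gethostbyname):
    """Map each hostname to its IP address, or None if it does not resolve."""
    results = {}
    for name in hostnames:
        try:
            results[name] = resolve(name)
        except OSError:  # socket.gaierror is a subclass of OSError
            results[name] = None
    return results


# Example call (needs network access); replace with your own hostnames:
# resolving_variants([
#     "sub.example.com",
#     "www.sub.example.com",
#     "made-up-name-12345.example.com",
# ])
```

If the made-up name comes back with an IP address, a wildcard entry is probably in play, and www.sub (or anything else) will resolve whether you intended it or not.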
As Tedster said, the fix is to redirect all non-canonical domain variants to the canonical domain. In other words, pick one and use it consistently for all linking and citations.
Jim
I find your replies very helpful.
Only a couple of things:
1) You said that site: operator results and/or SERP results could be buggy. So how can I know which of the two versions (with or without "www") to trust?
Google seems to prefer the "non-www" version, but is that true?
Besides, I submitted my site to Google and to my partners for linking as the www version, so why does Google recognize it as non-www instead?
2) Does redirecting all the non-preferred URLs to the preferred ones mean redirecting every single page of the site? If so, how do I do that? What do you mean by 301? Is it an .htaccess command?
For an add-on virtual domain that I recently dealt with, I see that the add-on domain could be resolved many ways:
123.123.123.123/foldername/
hostdomain.com/foldername/
www.hostdomain.com/foldername/
anythingyouwantrandom.hostdomain.com/foldername/
www.anythingyouwantrandom.hostdomain.com/foldername/
foldername.hostdomain.com/
www.foldername.hostdomain.com/
mainsite.com/
www.mainsite.com/ <=== Canonical!
main-site.com/
www.main-site.com/
In the above example:
- 123.123.123.123 is the IP address.
- hostdomain is the main domain name that the hosting is registered under. Other domains are added-on by hosting them in a folder on this site.
- foldername is the FTP folder that the add-on domain is hosted in, one folder for each different site.
- anythingyouwantrandom is anything that you want to type, anything at all (e.g. "aaaaaa" "xyzabc123" or "a.b.c.e.f.g.h.i.j").
- mainsite is the domain name that we want the site to be known as.
- main-site is a common typo of the well known name.
The "anythingyouwantrandom" entry is the most worrying. It allowed the site to resolve at an infinite number of URLs. A wildcard 301 redirect fixed that.
Now, only URLs at www.mainsite.com are listed and indexed.
The fix was fewer than 20 lines of code in the .htaccess file, redirecting every request to www.mainsite.com. All of the originally requested folder and file names are preserved in the redirect.
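This is not the .htaccess code itself, but the effect of a catch-all redirect like that can be sketched in a few lines of Python. The host and folder names are the placeholders from the list above. Stripping the add-on folder from the path, and always using http, are assumptions made for this illustration.

```python
CANONICAL_HOST = "www.mainsite.com"  # the canonical hostname from the list
ADDON_FOLDER = "/foldername"         # folder the add-on domain is hosted in


def redirect_target(host, path):
    """Return the URL to 301 to, or None if the request is already canonical."""
    host = host.split(":")[0].lower()  # drop any port, ignore case
    if host == CANONICAL_HOST:
        return None
    # Assumption: when the site is reached through the hosting account,
    # the folder name shows up in the path and should not survive the redirect.
    if path == ADDON_FOLDER or path.startswith(ADDON_FOLDER + "/"):
        path = path[len(ADDON_FOLDER):]
    return "http://" + CANONICAL_HOST + (path or "/")


for host, path in [
    ("123.123.123.123", "/foldername/page.html"),
    ("xyzabc123.hostdomain.com", "/foldername/"),
    ("foldername.hostdomain.com", "/dir/page.html"),
    ("main-site.com", "/"),
    ("www.mainsite.com", "/"),
]:
    print(host + path, "->", redirect_target(host, path))
```

Every hostname other than the canonical one gets the same answer, which is why a single wildcard rule can cover the "anythingyouwantrandom" case without listing each variant.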
Whatever the cause, the site-wide 301 redirect will fix that in a few weeks or so. Be aware that once the redirect is applied, many of your existing non-www listings will turn Supplemental and then hang around for a year or so (some will drop out soon after the redirect is applied, and reappear a few weeks later as Supplemental too). That is NOT a problem. Supplemental Results are often used for URLs that used to have content, but now just redirect. Google "remembers" and shows them for up to a year before discarding them.
Your measure of success is in seeing how many of your www URLs are listed as normal results, not in seeing how long the non-www URLs take to drop out of the index, nor in obsessing with Supplemental URLs.
Each URL I submitted/distributed was in the form [mydomain.com...] instead of www.mydomain.com, so the first form is probably being read as "non-www"... could that be the reason?
Anyway:
Is there any particular reason to prefer the "www" version over the "non-www" version?
Most of my indexed pages are currently "non-www", so I could keep using the "non-www" one as the canonical domain...
The highest number of indexed pages is for the "non-www" domain (especially the subdomain).
The links, as I already said, point to [mydomain.com...] and [sub.mydomain.com...]