Strange indexing issue - with and without "www" and subdomains

specter

10:34 pm on Jul 1, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Hello,

I submitted my new site to Google for indexing. After a month or so, I came across a weird issue with how my pages are being indexed:

If I ask Google to show me pages for www.mydomain.com, it returns only a few results;
if I ask instead for mydomain.com, it returns many more indexed pages.

I also have a subdomain: if I search for www.sub.mydomain.com it returns no results, while if I search for sub.mydomain.com it returns almost all of the site's pages!

What does it mean!?!?

I gave my partners the www addresses to get inbound links from them. Why does Google find me without the www prefix?

Is it a new algo issue?
Should I modify the URL addresses used for linking?

I was out of the SEO world for a while; maybe I missed something?

Thanks for any useful feedback.

Sincerely

tedster

11:40 pm on Jul 1, 2007 (gmt 0)

The first issue to be clear about is that the commonly seen "www" is not just a "prefix", but technically it is a subdomain. Google has named ranking issues around this a "canonical problem" or "canonical root problem."

We've discussed this a lot here, and the main thread I'd refer you to is: Why Does Google Treat "www" & "no-www" As Different? [webmasterworld.com]. That thread is part of our "Hot Topics" area, which is always pinned to the top of this forum's index page.

The "canonical fix" is to choose one or the other version of your urls (with or without the www) and 301 redirect every url of one type to the other type. In addition, Google Webmaster Tools also allows you to select your preferred version, and this can also help. But the 301 redirect fix should be the first line of defense against what is really a duplicate url problem -- two urls resolving to the same content. There are two great threads in Hot Topics about Duplicate Content - those should have all the information you need.

Now onward. You asked about www.sub.example.com -- if you have not set up a subdomain with the exact name www.sub on your server, then such a url does not resolve. As I said earlier, the "www" is not a mere prefix. Setting up sub does not set up www.sub.

And finally, know that the site: operator can give you buggy results, as can all Google's reporting functions. The search results are Google's primary focus, and those can be buggy too -- so their secondary reporting is naturally susceptible to more frequent issues. Trust your server logs and your own traffic data more than what you get from Google.

theBear

11:54 pm on Jul 1, 2007 (gmt 0)

Err Tedster,

"Setting up sub does not set up www.sub."

It does if default templates are snafu on the DNS server.

I suspect this is very common on sites doing their own DNS and using certain well used server control systems.

YMMV but I've seen it on cPanel/WHM setups along with other wonderful things.

The fault is with defaults ;-).

jdMorgan

12:17 am on Jul 2, 2007 (gmt 0)

Also, if wild-card subdomains are pointed to the server in DNS, and the server is set up to accept them and point them all to DocumentRoot, then all subdomains like sub.example.com, and all sub-subdomains like www.sub.example.com or sub.www.example.com will resolve.
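
To make that concrete, here is a rough sketch of the kind of server set-up being described, assuming Apache name-based virtual hosting. The hostnames and the path are made up for illustration and are not taken from anyone's actual configuration:

```apache
# Hypothetical httpd.conf fragment; hostnames and the path are placeholders.
# It presumes a matching wildcard record (*.example.com) exists in DNS.
<VirtualHost *:80>
    ServerName www.example.com
    # The wildcard alias lets this one virtual host answer for any hostname under example.com
    ServerAlias example.com *.example.com
    # Every accepted hostname is served from the same set of files
    DocumentRoot /var/www/example
</VirtualHost>
```

With something like this in place, the server never rejects an unexpected hostname, so every variant returns the same pages unless a redirect steps in.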

As Tedster said, the fix is to redirect all non-canonical domain variants to the canonical domain. In other words, pick one and use it consistently for all linking and citations.

Jim

specter

10:59 am on Jul 2, 2007 (gmt 0)

Thanks guys!

I find your replies very helpful.

Only a couple of things:

1) You said that site: operator results and/or SERP results could be buggy. So how can I know which of the two versions (with or without "www") to trust?
Google apparently prefers the "non-www" version, but is that true?
Besides, I submitted my site to Google and to my partners for linking as the www version, so why does Google recognize it as non-www instead?

2) Does redirecting all the non-preferred URLs to the preferred ones mean redirecting every single page of the site? If so, how do I achieve that? What do you mean by 301? An .htaccess command?

g1smd

12:59 pm on Jul 2, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



The redirect is a 301 redirect in .htaccess and needs just a couple of lines of code to cater for every URL on the site.

The code is posted every couple of weeks in some thread or other in the forum (mostly in the Apache forum).

I always use www URLs for the website.
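
For illustration, a minimal sketch of what those couple of lines typically look like. It assumes Apache with mod_rewrite enabled and .htaccess overrides allowed, and example.com is only a placeholder for your own domain (none of this is copied from a specific thread):

```apache
# Sketch only: assumes Apache with mod_rewrite available.
# example.com is a placeholder; substitute your own domain.
RewriteEngine On

# Match requests whose Host header is the bare (non-www) domain, case-insensitively
RewriteCond %{HTTP_HOST} ^example\.com$ [NC]

# Permanently (301) redirect to the same path on the www hostname;
# the query string, if any, is carried over automatically
RewriteRule ^(.*)$ http://www.example.com/$1 [R=301,L]
```

Because the condition matches only the bare domain, a separately named subdomain such as sub.example.com is left alone. To prefer the non-www version instead, swap the two hostnames.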

g1smd

1:13 pm on Jul 2, 2007 (gmt 0)

There are many variations of URL that can lead to the same content.

For an add-on virtual domain that I recently dealt with, I see that the add-on domain could be resolved many ways:

123.123.123.123/foldername/
hostdomain.com/foldername/
www.hostdomain.com/foldername/
anythingyouwantrandom.hostdomain.com/foldername/
www.anythingyouwantrandom.hostdomain.com/foldername/
foldername.hostdomain.com/
www.foldername.hostdomain.com/
mainsite.com/
www.mainsite.com/ <=== Canonical!
main-site.com/
www.main-site.com/

In the above example:

- 123.123.123.123 is the IP address.
- hostdomain is the main domain name that the hosting is registered under. Other domains are added-on by hosting them in a folder on this site.
- foldername is the FTP folder that the add-on domain is hosted in, one folder for each different site.
- anythingyouwantrandom is anything that you want to type, anything at all (e.g. "aaaaaa" "xyzabc123" or "a.b.c.e.f.g.h.i.j").
- mainsite is the domain name that we want the site to be known as.
- main-site is a common typo of the well known name.

The "anythingyouwantrandom" entry is the most worrying. It allowed the site to resolve at an infinite number of URLs. A wildcard 301 redirect fixed that.

Now, only URLs at www.mainsite.com are listed and indexed.

The fix was fewer than 20 lines of code in the .htaccess file, redirecting everything to www.mainsite.com each time. All of the originally requested folder and file names are preserved in the redirect.
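
The hostname part of such a catch-all redirect can be sketched as follows. This is not the actual code from that site, just an illustration assuming mod_rewrite and using www.mainsite.com from the example list as the target; the real file presumably needed further rules for the folder-based variants:

```apache
# Sketch only, not the code used on the site described above.
RewriteEngine On

# Skip requests that carry no Host header at all (old HTTP/1.0 clients),
# which would otherwise be redirected over and over
RewriteCond %{HTTP_HOST} .
# Anything not requested on the canonical hostname...
RewriteCond %{HTTP_HOST} !^www\.mainsite\.com$ [NC]
# ...is 301-redirected there, keeping the requested folder and file names
RewriteRule ^(.*)$ http://www.mainsite.com/$1 [R=301,L]
```

The negated condition is what handles the "anythingyouwantrandom" case: instead of listing bad hostnames, it redirects every hostname except the one you want.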

specter

4:46 pm on Jul 2, 2007 (gmt 0)

Thanks.

I'll ask about that in the proper forum.

What I don't understand is why G prefers my non-www pages when I distributed my URL to everyone as www...

Google mystery...

g1smd

4:56 pm on Jul 2, 2007 (gmt 0)

Maybe one of your internal links points at the non-www version, or the highest PR external site that links to your site (or maybe the first link that Google ever found) pointed at the non-www version.

Whatever the cause, the site-wide 301 redirect will fix that in a few weeks or so. Be aware that once the redirect is applied, many of your existing non-www listings will turn Supplemental and then hang around for a year or so (some will drop out soon after the redirect is applied, and re-appear a few weeks later as Supplemental too). That is NOT a problem. Supplemental Results are often used for URLs that used to have content, but now just redirect. Google "remembers" and shows them for up to a year before discarding them.

Your measure of success is in seeing how many of your www URLs are listed as normal results, not in seeing how long the non-www URLs take to drop out of the index, nor in obsessing with Supplemental URLs.

specter

7:22 pm on Jul 2, 2007 (gmt 0)

Maybe I see why G indexed my "non-www" URLs:

Every URL I submitted/distributed was in the form [mydomain.com...] instead of www.mydomain.com, so the first form is probably read as "non-www"... could that be the reason?

Anyway:
Is there any particular reason to prefer the "www" version over the "non-www" one?
Most of the indexed pages are currently "non-www", so I could keep the "non-www" one as the canonical domain...

g1smd

8:07 pm on Jul 2, 2007 (gmt 0)

Which version has the highest PR?

It might not be the same as the one with the most listings.

Which version already has the most incoming links?

specter

8:17 pm on Jul 2, 2007 (gmt 0)

Well,

The highest number of indexed pages is for the "non-www" domain (especially the subdomain).

Links, as I already said, are for [mydomain.com...] and [sub.mydomain.com...]

g1smd

9:24 pm on Jul 2, 2007 (gmt 0)

Which one has the highest PR, www or non-www?

Two out of three gets the vote.