Forum Moderators: Robert Charlton & goodroi

Google's SITE functionality w/ or w/o WWW

Why is Google only Indexing my non-www sites?

     
4:20 pm on May 16, 2005 (gmt 0)

New User

10+ Year Member

joined:Apr 8, 2005
posts:20
votes: 0


When I do a Google "site" search for my domain, with the "www"... I get no pages indexed... ie:

site: www.mydomain.com
results: 0 indexed

However, when I do the same search but without the "www" prefix... I get a lot of pages... ie:

site: mydomain.com
results: 10,000 indexed

I tried this on some of the big players (amazon..etc), and they have results in "both" searches.
What am I doing wrong?... or does this even matter?

Thanks!

4:35 pm on May 16, 2005 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:Apr 6, 2003
posts:2523
votes: 0


You should probably consider adding a .htaccess file to force all requests for non-WWW to WWW (or vice versa), because having both *can* cause problems with duplicate content.

Is your navigation relative or absolute? Changing your navigation to absolute will force the spiders to crawl the rest of the pages of your site with the WWW, even if a site links to you without the WWW.

Possibly, a high PR site has linked to you without the WWW, and Google has decided that non-www is the authoritative version.

4:43 pm on May 16, 2005 (gmt 0)

New User

10+ Year Member

joined:Apr 8, 2005
posts:20
votes: 0


Thanks Patrick!

I do have an .htaccess file doing a bunch of rewrite stuff already... and my navigation is relative.

I'm assuming that I'd simply add in an entry to point all non-WWW to WWW pages... and that should take care of it?

Also, could Google think that one page with different prefixes (one WWW, one not) is actually two different pages... resulting in a duplicate content ding?

5:16 pm on May 16, 2005 (gmt 0)

Junior Member

10+ Year Member

joined:Apr 20, 2003
posts:167
votes: 0


And the redirect that you do should be a 301 permanent redirect.

Otherwise, you may mess things up even more.

6:01 pm on May 16, 2005 (gmt 0)

Senior Member

WebmasterWorld Senior Member steveb is a WebmasterWorld Top Contributor of All Time 10+ Year Member

joined:June 20, 2002
posts:4652
votes: 0


"What am I doing wrong?..."

Nothing. That is how you want it. TRYING to get both www and non-www pages crawled is suicidal.

8:20 pm on May 16, 2005 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:July 12, 2002
posts:697
votes: 2


And the redirect that you do should be 301 permanent redirect.

What sets the 301 redirect within the .htaccess file?

Sorry, I'm not that techie minded! :)

10:06 pm on May 16, 2005 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:Mar 19, 2003
posts:804
votes: 0


Solution for the split site problem:

Search your server software documentation for canonical hostnames:

For the Apache web server, what follows applies.

There are other considerations, such as whether mod_rewrite is installed and active, and the direction in which you want the redirect to go.

Oh, and there are whitespace issues: there must be a space before each ! and between the ) and http: in what follows.

Canonical Hostnames

Description:
The goal of this rule is to force the use of a particular hostname, in preference to other hostnames which may be used to reach the same site. For example, if you wish to force the use of www.example.com instead of example.com, you might use a variant of the following recipe.
Solution:

# For sites running on a port other than 80
RewriteCond %{HTTP_HOST} !^fully\.qualified\.domain\.name [NC]
RewriteCond %{HTTP_HOST} !^$
RewriteCond %{SERVER_PORT} !^80$
RewriteRule ^/(.*) [fully.qualified.domain.name:%{SERVER_PORT}...] [L,R=301]

# And for a site running on port 80
RewriteCond %{HTTP_HOST} !^fully\.qualified\.domain\.name [NC]
RewriteCond %{HTTP_HOST} !^$
RewriteRule ^/(.*) [fully.qualified.domain.name...] [L,R=301]
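
Since the forum shortens links, here is a minimal sketch of the port 80 case with the target written out in full. It assumes the rules live in httpd.conf or a <VirtualHost> block, and www.example.com is only a placeholder for whichever hostname you prefer.

```apache
# Sketch for httpd.conf or a <VirtualHost> block (not .htaccess).
# www.example.com is a placeholder hostname.
RewriteEngine On

# The request named some host other than the canonical one...
RewriteCond %{HTTP_HOST} !^www\.example\.com [NC]
# ...and a Host header was actually sent
RewriteCond %{HTTP_HOST} !^$
# In server context the path being matched starts with a slash
RewriteRule ^/(.*) http://www.example.com/$1 [L,R=301]
```

In a .htaccess file the leading slash is stripped before the pattern is tested, so ^/(.*) would need to become ^(.*)$ there.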

2:07 am on May 17, 2005 (gmt 0)

Full Member

10+ Year Member

joined:June 29, 2004
posts:306
votes: 0


Erm, this is the simplest version if someone is confused.

RewriteEngine on
RewriteCond %{HTTP_HOST} ^mydomain\.com
RewriteRule ^(.*) [mydomain.com...] [L,R=301]
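
With the target spelled out, a rule of this shape might look like the sketch below. It assumes a .htaccess file in the document root; example.com and www.example.com are placeholders, not anyone's real domain.

```apache
# Sketch for a .htaccess file in the document root.
# example.com / www.example.com are placeholder hostnames.
RewriteEngine On

# Match only requests that arrive on the bare (non-www) hostname
RewriteCond %{HTTP_HOST} ^example\.com [NC]
# Send them to the same path on the www hostname with a permanent redirect
RewriteRule ^(.*)$ http://www.example.com/$1 [L,R=301]
```

Because the condition matches one specific hostname, any other alias that reaches the site is left alone.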

2:25 am on May 17, 2005 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:Mar 19, 2003
posts:804
votes: 0



Erm, this is the simplest version if someone is confused.

RewriteEngine on
RewriteCond %{HTTP_HOST} ^mydomain\.com
RewriteRule ^(.*) [mydomain.com...] [L,R=301]

And it can fail to cover all the cases that cause problems, if hostnames other than domain and www.domain are valid.

For example 10.0.5.7 (yes, I know this isn't allowed on the exposed internet) could be a valid server alias that would not be covered.

2:46 am on May 17, 2005 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:Mar 19, 2003
posts:804
votes: 0


An example that might highlight this is to look at a Google site search:

[google.com...]

which is the IP address of dmoz.org.

This is about 47,000 pages that probably shouldn't be in the index.

5:38 am on May 17, 2005 (gmt 0)

New User

10+ Year Member

joined:Apr 8, 2005
posts:20
votes: 0


Thank you, especially theBear, for the excellent rewrite examples.

steveb quoted my "what am I doing wrong" with an answer of "nothing"... but obviously if I'm just some mom-and-pop-shop online and "didn't" have this entry in .htaccess... I would be dinged?

It is sounding like this is a necessary step in preventing duplicate content from being indexed... no?

3:45 pm on May 17, 2005 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:Mar 19, 2003
posts:804
votes: 0


dylans,

The other alternative WAS to remove all of the unused server aliases from the httpd.conf (talking Apache here) when the site was FIRST set up and BEFORE the bots were allowed in.

But once the site is live, as in the current situation, it's a catch-22, and the rewrite rules are the best way.
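
As a rough sketch of what a clean setup could look like in httpd.conf: the real site answers only to its canonical name (no ServerAlias lines), and anything else lands on a host that does nothing but redirect. The redirect-only host goes a step beyond simply removing the aliases and is an assumption here, as are the hostnames and the DocumentRoot path.

```apache
# Apache 2.2 and older also need: NameVirtualHost *:80

# Listed first, so it is the default for any name (or bare IP)
# that no other host claims. It only redirects.
<VirtualHost *:80>
    ServerName example.com
    Redirect permanent / http://www.example.com/
</VirtualHost>

# The real site, with no ServerAlias lines (path is hypothetical)
<VirtualHost *:80>
    ServerName www.example.com
    DocumentRoot /var/www/example
</VirtualHost>
```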

4:02 pm on May 17, 2005 (gmt 0)

Senior Member

WebmasterWorld Senior Member g1smd is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:July 3, 2002
posts:18903
votes: 0


Thanks for the note about the ODP IPs showing.

I'll pass that back through the system. Don't know if anyone was aware of that.

7:35 pm on May 17, 2005 (gmt 0)

Full Member

10+ Year Member

joined:June 29, 2004
posts:306
votes: 0


# For sites running on a port other than 80
RewriteCond %{HTTP_HOST} !^fully\.qualified\.domain\.name [NC]
RewriteCond %{HTTP_HOST} !^$
RewriteCond %{SERVER_PORT} !^80$
RewriteRule ^/(.*) [fully.qualified.domain.name:%{SERVER_PORT}...] [L,R=301]

# And for a site running on port 80
RewriteCond %{HTTP_HOST} !^fully\.qualified\.domain\.name [NC]
RewriteCond %{HTTP_HOST} !^$
RewriteRule ^/(.*) [fully.qualified.domain.name...] [L,R=301]

How do I check whether my site is running on port 80?

7:37 pm on May 17, 2005 (gmt 0)

Senior Member

WebmasterWorld Senior Member g1smd is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:July 3, 2002
posts:18903
votes: 0


Everything runs on Port 80 unless you specifically told it in the configuration files not to do so.
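
If you have access to httpd.conf, the thing to look for is the Listen directive. A sketch of what a default setup usually contains (the 8080 line is only an illustration of a non-default port):

```apache
# Default: Apache accepts requests on port 80
Listen 80

# A non-default port would look like this instead
# Listen 8080
```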

7:54 pm on May 17, 2005 (gmt 0)

New User

10+ Year Member

joined:Apr 8, 2005
posts:20
votes: 0


theBear, well put. That makes perfect sense. Thank you!

12:16 am on May 18, 2005 (gmt 0)

Junior Member

10+ Year Member

joined:Oct 31, 2004
posts:43
votes: 0


PatrickDeese:

You should probably considering adding a .htaccess file in order to force all requests for non-WWW to www

Question: how many of the folks following this thread have had success with the permanent 301 to rid non-www urls? I saw that g1smd reported success with it, anyone else?

I mentioned this previously in another thread, but my 301 redirect has been up 6+ months with no results, ie no dropping of non-www urls.

Thank you to both PatrickDeese for the advice and theBear for the 301 explanation. theBear - do you want a space: {HTTP_HOST}!^fully between the } and the ! ?

Thanks.

1:49 am on May 18, 2005 (gmt 0)

Senior Member

joined:Oct 27, 2001
posts:10210
votes: 0


Question: how many of the folks following this thread have had success with the permanent 301 to rid non-www urls? I saw that g1smd reported success with it, anyone else?

I mentioned this previously in another thread, but my 301 redirect has been up 6+ months with no results, ie no dropping of non-www urls.

My 301 redirect of www to non-www URLs (my default) was added to .htaccess toward the end of March, after a 75% drop in Google referrals starting March 23. The number of unwanted www URLs in a site:www.sitename.com search actually increased from 759 on April 16, when I started keeping notes, to a high of 1,540 on April 20. The trend since April 20 has been a very gradual decline to 1,130 today.

I wouldn't call this a success, but over the past four weeks the numbers do seem to be headed (though at a snail's pace) in the right direction. Of course, there's no way of knowing what kind of surprise--good or bad--tomorrow may bring.

2:28 am on May 18, 2005 (gmt 0)

Junior Member

10+ Year Member

joined:Apr 24, 2005
posts:71
votes: 0


Erm, this is the simplest version if someone is confused.

RewriteEngine on
RewriteCond %{HTTP_HOST} ^mydomain\.com
RewriteRule ^(.*) [mydomain.com...] [L,R=301]

Hello,

I checked my .htaccess.

It looks like this:

RewriteEngine On
RewriteCond %{HTTP_HOST} !^www\.domain\.com
RewriteRule (.*) [domain.com...] [R=301,L,NS]

Only 2 differences. Is it wrong? What does the NS at the end mean?

Thanks

3:35 am on May 18, 2005 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:Mar 19, 2003
posts:804
votes: 0


europeforvisitors,

Average daily hits

Nov 04   498671
Dec 04   128620   Boom
Jan 05   136779
Feb 05   101225   Boom again
Mar 05   117890   301's installed 3/12
Apr 05   178188
May 05   185000

The 301 went from non-www to www and included a vanity domain.

We had from 2 to 5 copies (including ip address) of 750 pages.

This split also affected PR being passed downstream.

alexo,

Your rewrite rule looks like it would work fine. According to the Apache documentation, the NS flag is:

'nosubreq|NS' (used only if no internal sub-request)
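
In other words, NS tells mod_rewrite to skip the rule when the request is an internal sub-request rather than one that came in from a browser or bot. A minimal sketch of a rule of that shape with the target written out, assuming a .htaccess file in the document root and www.example.com as a placeholder hostname:

```apache
# Sketch for a .htaccess file; www.example.com is a placeholder.
RewriteEngine On

# Any host other than the canonical one (bare domain, IP, other aliases)
RewriteCond %{HTTP_HOST} !^www\.example\.com [NC]
# Skip requests that sent no Host header at all
RewriteCond %{HTTP_HOST} !^$
# R=301: permanent redirect, L: stop processing, NS: ignore sub-requests
RewriteRule (.*) http://www.example.com/$1 [R=301,L,NS]
```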

For a detailed discussion of rewrite visit the Apache website at:

[httpd.apache.org...]

Use a header checker to test it.

7:18 pm on May 18, 2005 (gmt 0)

Senior Member

WebmasterWorld Senior Member g1smd is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:July 3, 2002
posts:18903
votes: 0


The site that I set up the redirects on in mid-March had, at that time, only just over half of its pages indexed, and most of those were duplicated as both www and non-www. Many of the entries in the SERPs were URL-only too.

Within days of setting up the redirect to non-www (the opposite of what I normally do), all the non-www pages were indexed, all with title and description, and the number of www pages started declining.

After about a week the number of www pages rapidly increased to back over 100. At that point, I set up a sitemap page pointing to all the URLs that we did not want listed. We installed that sitemap page on another site, and the number of www URLs listed then declined at a rate of 6 to 8 URLs every 3 days or so.

The number of www URLs in the SERPs became zero a few weeks back, but then, after a few days, half a dozen URL-only www pages reappeared in the index. This included a page that has not existed for over 18 months, as well as several pages that were removed from the site at the same time that the 301 redirect was installed.

Apart from those minor problems, all the real pages of the site are listed, and all have a title and description, and all are presented as the non-www version of the URL.

At the time that the redirect was set up (actually it was a few days later), all of the links on the site were changed so that they all included a trailing / on the URL. Every page of the site is an index page in a folder. The index filename is not included in the link.

The addition of the trailing / to each link was needed because the owner of the site wanted to promote the non-www URL, but the server configuration had the www version as the default site name. This meant that a link that pointed to domain.com/folder was redirected by the default server configuration to www.domain.com/folder/ before the 301 redirect kicked in and changed it to domain.com/folder/ instead.

When running Xenu the intermediate step showed up as 112 extra "pages" all with the title "301 Moved Permanently". Changing all internal links to point directly to domain.com/folder/ including the trailing / on the URL, easily fixed this problem. It then took a few weeks for all those www URLs to drop out of the index. A few needed an inbound link to the "wrong version" in order to help Google "see" the redirect.

11:42 pm on May 18, 2005 (gmt 0)

Preferred Member

10+ Year Member

joined:Oct 30, 2002
posts:404
votes: 0


Before going the .htaccess route, you might want to check your server configuration to verify that your host has the ServerName in the config file set to the www.domain.com version and not domain.com. A lot of hosts out there get this wrong in their configs, and for people who use relative addressing on their sites it leads Google to interpret the redirect that Apache does internally as a different page.

For example, say you have a directory link:

href="/newdir"

Apache sees that it is a directory and issues a 301 redirect using the ServerName. Google then finds the other domain (yes, www.domain.com is a different domain from domain.com) and splits things up for you (using their famous dupe page algo).

Just a thought, since this has been going on since July 2004 and has affected many people with larger sites whose server config was set up wrong.
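
A minimal sketch of the settings in question, for httpd.conf or the site's <VirtualHost> block. www.example.com is a placeholder for your own preferred hostname.

```apache
# The name Apache treats as the canonical one for this site
ServerName www.example.com

# On:  self-referential redirects (such as /newdir to /newdir/)
#      are built from ServerName
# Off: they are built from the hostname the client sent
UseCanonicalName On
```

With ServerName set to the version you want indexed, the automatic trailing-slash redirect points at the same hostname as the rest of your links.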

12:53 am on May 19, 2005 (gmt 0)

Junior Member

10+ Year Member

joined:Apr 24, 2005
posts:71
votes: 0


Hello,

Can I ask a question related to the 301 redirect?

As I said before in this forum, one month ago I had huge problems with my host and, as a result, lost all my KY, and my traffic from Google decreased.

That was 30-40 days ago.
After that I put a 301 on all my non-www pages.

All this time, when I search Google for www.domain.com, I get only 1 page (a 3-4 year old homepage, like this: domain.com/news/).

During these 40 days Google has visited my site and done a little indexing (only 600 pages), but hasn't touched my homepage [domain.com/index.php].

Searching Google for domain.com gives me 1500 results about my site and, at the end, some results from my domain.

site:domain.com gives, as I said, 500-600 results.

By the way, 30 days ago I installed a 301 redirect on this page too.

domain.com/news/ now redirects to domain.com/

Now the question.

For 30 days I got this result [domain.com/news/] when I searched for www.domain.com/

Today I searched and got "Sorry, no information is available for the URL"

Is it a ban or something serious?

Over these 30 days Google has visited my site 7-10 times and indexed a lot of pages.

By the way, here are the results from yourcashe.com for:

www.domain.com

12 may 560 - 570
13 may 555 - 560 pages
17 may 555 - 10300 pages (6 datacenters)
19 may 563 - 565

domain.com

12 may 943 - 953
13 may 838 - 840 pages
17 may 555 - 588 pages
19 may 632 - 634

Can you explain what all this means?

1. Why has even my old homepage disappeared from Google's database?
2. Why, when my site has been redirecting to the www version for 30 days, are there more pages from the non-www version?
3. Is it maybe a ban or something else (like a penalty)?

Thank you

P.S. Sorry for my English :( (it isn't my native language)

12:57 am on May 19, 2005 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:Mar 19, 2003
posts:804
votes: 0


Marval,

See [webmasterworld.com...] message #12