Forum Moderators: Robert Charlton & goodroi
Oh. My. God. [72.14.207.99...]
Changed again since this morning. Still get very weird results for this DC, but when I add &filter=0 all (and I do mean "ALL") of the old supplemental results (for 404 pages and expired domains) are thrown away.
Normal Google search: 1 to 15 of 15.
This DC with ordinary search: 1 to 12 of 40 000 (all weird results - you CANNOT get beyond 12).
This DC with &filter=0 applied: 1 to 1 of 1 - the CORRECT result for this search term.
[edited by: g1smd at 3:42 pm (utc) on Mar. 30, 2006]
[edited by: tedster at 9:56 pm (utc) on Mar. 30, 2006]
I'm disappointed about the drop but ecstatic about the move into the main index.
Most of these pages still have themed, relevant IBLs so hopefully they will rebuild their PR & rankings over time.
I've just checked my site that got hit with a [duplicate content/manual] penalty last February.
I checked it a week ago and at long last it was showing about 900 pages, which is approximately correct. Previously it was showing between 9,000 and 14,000 pages, which it never had.
I just checked again today, Google are back to showing 9k pages. Stepping through the results, many of the links are dead, even ones not in supplemental.
So what happens now? Has there been a rollback? Is there a deep crawl going on? What?
It seems an utter mess.
I think most of us have now given up on a fix to the bugs that are in the index. I read your other thread - it seems that you probably have been hit by a split PR penalty like a lot of sites that have disappeared from the serps.
(I am guessing at that - but a lot of the time when innocent sites go missing that is the reason)
Check domain.com and www.domain.com in a PR data checking tool - they are probably different, a sure sign of a split PR penalty. (It has to be called a penalty, I suppose, rather than a bug, as Google is penalizing sites for it; they know it is happening but don't do anything about it.)
1] My AdWords budget was 50% utilised prior to then, but it is now at 100%
2] Many keywords that show supplemental on site: search are not showing as supplemental when doing an individual keyword search in the SERPs
3] Excluding the AdWords traffic, general traffic has gone up by almost 100%.
Conclusions:
1] Certain filters have been put in place to change the overall situation.
2] The site is a year old this month and perhaps been given more recognition.
3] Robots are deep crawling each page, and have been doing so for the last 4 days.
The good link [ just happens to be ours :) ] is at the bottom of the list - however, to be fair we have only just been indexed.
However, if we didn't exist - this is bad Google content. I don't believe it's isolated.
Google has to fix this urgently if its reputation is to keep the interest of its users.
Google Canonical Problem results in two Sites
It is now clear to me that the Google Canonical problem effectively results in two sites on the same domain. Non-www and www.
Obviously this causes loads of confusion for Google and results in sites losing positions.
Now - Previously I thought that it was just the pages which were listed that had problems but it appears that Google treats pages differently on the non-www and the www even if they are not listed (as long as Google knows the sub-domain exists).
So pages within the non-www that are not in the serps are just pages that have not been crawled from the non-www site! These are not pages that have been correctly canonicalised - just pages that Google has not crawled from the shadow site.
If you can find any page anywhere on your site that has different PR on the non-www and the www (even if it is not listed in the serps), then you have a split-site problem.
If you do a cache: lookup on any page without the www and it does not show the www page, you have a split-site problem.
The pages that are not even listed are split from the original www pages and given their own PR and Cache.
It is like you have a Ghost site that is pulling your main site down.
And the killer is that even if you fix the issue with a 301 then it does not fix the site. It may mean that when you query the page in the serps it shows the www - but the split/shadow/ghost site problems are still there.
Please Google fix this problem.
Since there are three URLs that can be recognized and linked to your site by others... your Google PR can actually be split amongst all three URLs, thereby degrading the overall value of your website.
However the directive to fix this is not a plain redirect directive - it is a rewrite condition in .htaccess, or ISAPI on windoze.
But the whole problem is that Google has not fixed the issue: it does not merely devalue a site, it destroys it in their rankings.
They have known about it for over a year and still have not fixed it.
Linking to / or index.html/htm etc does not create two sites in Google.
That is what the situation is - look at MC site for an example.
mattcutts.com has pages listed under the non-www and they have been allocated PR and their own cache - the other pages on the non-www have been given a PR0 and no cache as they have not been crawled yet.
The mattcutts.com is acting like a different site to the www.mattcutts.com - it has different PR on the pages, it has different cache on the pages. It is acting like a totally independent site - but normally (except Matt's, so far) it destroys the ranking and crawling of the www site.
It is the three URL forms that can be used to link to your site.
Each can be treated as its own site. Also internal linking can affect PR as well.
The three links
[domain.com...]
[domain.com...]
www.domain.com
This is the problem.
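To make the three-forms point concrete, here is a minimal sketch of the kind of URL normalization a 301 rule enforces server-side. The function and the index-filename list are my own invention for illustration, not anything Google or Apache provides:

```python
from urllib.parse import urlsplit, urlunsplit

# Default index filenames that resolve to the folder root (assumed list).
INDEX_NAMES = {"index.html", "index.htm", "index.php"}

def canonicalize(url: str) -> str:
    """Map domain.com, domain.com/index.html and www.domain.com
    variants onto one canonical form (here: the www version)."""
    scheme, host, path, query, frag = urlsplit(url)
    host = host.lower()
    if not host.startswith("www."):
        host = "www." + host
    # Strip a trailing index filename so /index.html and / collapse.
    parts = path.rsplit("/", 1)
    if parts[-1] in INDEX_NAMES:
        path = parts[0] + "/"
    if not path:
        path = "/"
    return urlunsplit((scheme, host, path, query, frag))

# All three link forms collapse to the same canonical URL:
print(canonicalize("http://domain.com"))             # http://www.domain.com/
print(canonicalize("http://domain.com/index.html"))  # http://www.domain.com/
print(canonicalize("http://www.domain.com/"))        # http://www.domain.com/
```

The point of the 301 is exactly this collapse: one URL per page, so link equity stops being spread across variants.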
However I don't see Google having had an issue with it till now, and I imagine they will have it corrected soon.
They may be trying to rectify URLs internally so nobody needs rewrites.
My thoughts
Over the last 18 months or so, though, it has become so widespread that with every single update more and more sites get hit by the bug and the threads turn into:-
"Why can't I find my site with its own unique name?"
"My site has dropped hundreds of pages for unique phrases?"
"Rubbish directories and spam are above me for my own content?"
But if you talk to these people who post (and I have), they all have the same symptom of Google indexing sites under the non-www and www.
The Bug Destroys sites.
What is more, this set of Big Daddy posts started with the supplemental problem of only homepages being in the main index, and every single one of those sites had canonical URL issues. It is/was not a supplemental problem; it is a canonical URL problem that has still not been fixed.
The test of doing site:domain.com -www will not even show a fix now - as long as Google allocates different PR and cache to non-www pages compared to www within the site, the problem remains.
They both already know about it. A fix was first mentioned about a year ago... sometimes I wonder if they know what damage it has done to sites (and the serps), but then they can not really let on.
I have plenty of clients who have had this same issue over the last three years...who have had their rewrites done... and do not suffer any loss of rankings ever.
They bumped around during BD update but are back in positions held for years.
Perhaps the issue is the server that the sites are on, or the hosting company - it has been shown that many web hosts will hurt a site more than Google will.
The problem when blaming Google is that many, many sites are not having the issues that a few are shouting about.
So if, let's say, 1 in 100 is having an issue, it would seem to be a site/server problem and not Google.
You are right though that Google.. and its employees cannot always tell the truth, and actually will use deceit at times.
When were the rewrites done?
At launch? - more than 18 months ago?
If the answer to either of those is yes then the site(s) may well not have got hit. If you have examples of sites that have been indexed under the non-www and www and have retained rankings, then please sticky me.
Using Matt Cutts site as some sort of example of the problem is incorrect as he has not fixed this (there is no redirect).
Just thought the obvious should be pointed out...
Then the advice has been to do the 301 redirect to fix the issue - in a lot of cases the 301 does not result in a fix unless you catch it very early on in the problem.
Which is often why the more recent people who have been hit come back quicker - they come to WebmasterWorld - someone here knows the problem in an instant and the 301 is applied very soon after the problem originally arises while Google is still actively crawling the site.
If the problem is left though then it seems to become more ingrained and less likely to be fixed when the 301 is applied.
So, what lines did you add to your .htaccess file? Is that the real solution for this problem? Because my site went supplemental on March 29, and at least I would like to know if I have to wait for Google to fix the problem or make some changes to my site.
Btw, I never did any 301 redirections. If I do site:mysite.com the root page never shows in 1st place... I can't even find it... it only shows 2 inner pages.
Thanks everyone for all the updates.
Options +FollowSymLinks
Options -Indexes
RewriteEngine On
RewriteCond %{HTTP_HOST} ^sitename\.com$ [NC]
RewriteRule ^(.*)$ http://www.sitename.com/$1 [R=301,L]
Some older than 18 months some under. Some at start up others not.
I will say I have little experience with this on the windoze side as most of my clients are on Unix servers.
I believe I took this from Matt Cutts blog but I may be wrong
--------------------------------------------------------
There's been much talk lately of canonical issues and search engines. This is where both the www and non-www versions of your pages are listed in a search engine. This is said to possibly trigger a duplicate content penalty.
FYI, these all apply to *nix server environments. So if your site is hosted on an IIS box completely disregard it all and let us know that.
First, and best IMO, is via BIND/Local DNS. If you have the ability to edit your local DNS records you could simply add a record to that to handle it seamlessly. If you don't have access to your local DNS you might ask your hosting company to set it up for you. Most will. Most do as a default these days. If you can edit the local DNS for the domain itself, simply add a CNAME record with the non-www version pointing to the www version.
Or use the following, but be aware that you may suffer a further loss of traffic while the engines sort out what's what. This example is where you wish to direct all non-www traffic to www:
Options +FollowSymLinks
RewriteEngine on
RewriteCond %{HTTP_HOST} ^yourwebsite\.com$ [NC]
RewriteRule ^(.*)$ http://www.yourwebsite.com/$1 [L,R=301]
Ensure that all your links to folders always end in a trailing / if there is no filename after that link.
Note: test, test and test again after making changes. Test *immediately* after implementing 301 redirects. If you find anything wrong, remove the redirect immediately. Use a server header checker to check that you're getting a correct 301 response when using the old URL.
-------------------------------------------------------
Hope this helps.
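As a sanity check on the rule above: a header checker should report a 301 with a Location pointing at the www URL when you request the non-www host, and a plain 200 on the www host. A tiny simulation of that rewrite logic (no real HTTP; "yourwebsite.com" is the placeholder host from the snippet above):

```python
def rewrite_response(host: str, path: str):
    """Simulate the .htaccess rule: the bare non-www host gets a 301
    to the www equivalent; the www host is served normally (200)."""
    if host.lower() == "yourwebsite.com":  # matches RewriteCond %{HTTP_HOST}
        return 301, f"http://www.yourwebsite.com{path}"  # the R=301 target
    return 200, None

# What a server header checker should report after the fix:
print(rewrite_response("yourwebsite.com", "/page.html"))
# (301, 'http://www.yourwebsite.com/page.html')
print(rewrite_response("www.yourwebsite.com", "/page.html"))
# (200, None)
```

If the checker shows a 302, a 200 on the old URL, or a redirect chain, the rule is not doing what you want and should come out until it does.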
So if lets say 1 in 100 is having an issue..it would seem to be a site/server problem and not Google.
MSN doesn't have this problem with my site. If I do "-wewd site:domain.com" it lists the correct number of pages and all the results are rooted as www.domain.com.
The index page is shown as www.domain.com/ in all folders where an index exists, and it's listed only one time.
Microsoft has no special magic; it's lack of will, not lack of ability, on Google's part.
I know how to do a 301 redirect in .htaccess and I have tested tested and tested.
Yahoo, MSN, Gigablast, Ask, Search Hippo, Mirago, Wisenut do not have a problem - it is just Google.
Besides - Google does follow the redirect - but this does not improve the problems brought about by the split site - you still end up with split PR and cache within the site even after Google has seemingly picked up the redirect.
panlus
So I guessed right, without even seeing your site, that you had canonical problems! Common theme - but why have Google left this so long without a fix? And it is still ongoing.
Perhaps a TBPR update will improve/consolidate things (if those <rk> figures are to be believed) - when the site went down I can't remember if the TBPR split first, so I don't know if they have to get TBPR consolidated between the non-www and www before an improvement is likely to be seen.
First, and best IMO, is via BIND/Local DNS. If you have the ability to edit your local DNS records you could simply add a record to that to handle it seamlessly. If you don't have access to your local DNS you might ask your hosting company to set it up for you. Most will. Most do as a default these days. If you can edit the local DNS for the domain itself, simply add a CNAME record with the non-www version pointing to the www version.
This has nothing to do with DNS. Adding a cname will accomplish the same thing as pointing the www and the non-www to the same A record.
So if you have:
www.domain.com.     IN  A      127.0.0.1
test.domain.com.    IN  CNAME  www.domain.com.
That does not mean visiting test.domain.com will be redirected to www.domain.com. Apache will still see you connecting to the server test.domain.com, and the host will be considered test.domain.com. All you are doing is connecting to the IP address assigned to the A record for www. If I am wrong on this, please educate me, as I am then missing something and the tests I just did to show this were wrong.
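To make that concrete, here is a toy resolution sketch (zone data invented for illustration, matching the two records above): a CNAME changes what the name resolves to, not which Host header the client sends.

```python
# Toy DNS zone: an A record plus a CNAME, as in the example above.
ZONE = {
    ("www.domain.com.", "A"): "127.0.0.1",
    ("test.domain.com.", "CNAME"): "www.domain.com.",
}

def resolve(name: str) -> str:
    """Follow CNAME records until an A record is found, return the IP."""
    while (name, "CNAME") in ZONE:
        name = ZONE[(name, "CNAME")]
    return ZONE[(name, "A")]

# Both names resolve to the same IP address...
print(resolve("www.domain.com."))   # 127.0.0.1
print(resolve("test.domain.com."))  # 127.0.0.1
# ...but the HTTP Host header the browser then sends is whatever the
# user typed, so Apache still sees "test.domain.com" - no redirect,
# no 301, and nothing for a search engine to consolidate on.
```

That is why the DNS-only approach cannot merge the two listings: only an HTTP-level 301 tells the crawler which hostname is canonical.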
Also this does not take advantage of the 301 redirects that will update the results in the serps so that they use the correct url. Yes this works as I have tested it.
The problem lies, as Dayo_UK said, in the fact that this method does not combine the split PR that was spread between the two "supposed" different sites.
Btw, since my canonical issues, I have 301ed all my non-www pages and they are pretty much completely dropped from the index.
[webmasterworld.com...]