Forum Moderators: Robert Charlton & goodroi
My results are different on just about every DC - but that has been the way for my sites for 9-10 months.
My theory has developed into this: Google cannot determine the root page of some sites - this may be caused by 302 redirects, supplementals or the canonical misery.
Unfortunately, there does not seem to be any indication that Google are getting any closer to a fix.
The site ordering seems to lend support to that theory - I mean, is the root supposed to be top? If so, then Google are missing the target a lot.
And again - searching for "www.domain.com" and the first page being an internal page rather than the root again backs up the theory that Google may be having problems determining the root page for a site.
Regarding the root page issue, I have had to redirect my index.php and every reference to it on the site to /.
The problem is now cured for me since I have done that (though it was never a problem prior to the past few months!)
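For anyone wanting to try the same fix, here is a minimal sketch of the idea. In practice this would be a 301 rule in your server config; the function name and return values below are purely my own illustration of the logic, not any real server API:

```python
# Hypothetical sketch of the fix described above: every request for
# index.php is answered with a permanent 301 redirect to the site root,
# so only one URL for the homepage ever gets crawled and indexed.
def canonical_redirect(path):
    """Return (status, Location) for URLs that need redirecting, else None."""
    if path == "/index.php":
        return (301, "/")  # permanent redirect to the root
    return None            # serve the page as normal
```

The important detail is that the redirect is a 301 (permanent), not a 302, so the spider folds everything onto the one root URL instead of keeping two.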
>>The site ordering seems to lend support to that theory - I mean, is the root supposed to be top? If so, then Google are missing the target a lot.<<
I read somewhere that Google is making great efforts to help webmasters with canonical issues. Part of that effort is that Google itself is going to decide in future which page to consider as the homepage. And depending on the algos, the homepage might change over time, of course.
Google call it democracy among webpages :-)
Where did you read this? Sticky me is probably best, as a link is probably against the TOS unless it's on a trusted news source etc.
>>>>Part of that effort is that Google itself is going to decide in future which page to consider as the homepage.
Surely this happens now? I was thinking that perhaps Sitemaps might be a good place where you could set the root.
I don't. :(
Google don't seem to be able to get to grips with the situation, and as time goes on more and more sites get hit.
Not good.
It is technically wrong to do this unless Google are 100% sure that content on the non-www mirrors the www.
Technically it is possible to have different content on:-
domain.com
www.domain.com
shop.domain.com
fish.domain.com
etc
There is no way that Google would credit all links to the www by default.
The problem arises when Google cannot correctly identify that the non-www matches the content on the www.
Petehall,
Yes, obviously it is an important part of the algo that the page specific to the search outranks the page linking to it (eg the homepage) - however, for me it is the other way around: the homepage cannot even rank for its own name, a "www.domain.com" search etc... - it is beaten by internal pages.
I am not going to call it a canonical URL problem anymore - I am going to call it a problem where Google cannot determine the root URL, as AFAIK that is why rankings drop. I have a feeling, though, that the problem of determining the root page is very much connected to the canonical problem, the 302 hijack and maybe supplementals.
My local default is 66.102.9.104 today, and I was happy to see this -- until I checked our SERPs...
What happened last night? We're now out of the first 50 pages for all but one of our key phrases. I monitor ten 2-3 word phrases that have all been steadily progressing back to their pre-September top 10 positions, and they have been listed every day since their re-emergence on October 21st. As of yesterday, all phrases were in the top 30 results.
The site can't be under penalty, as one phrase is still #8 for a very competitive term (63 million results). How can this just be flux after 4 weeks of constant listings for us?
Someone please tell me not to worry and that you're seeing the same. Reseller, any chance of a steaming hot cup of cappuccino to calm my nerves? :))
What I meant to write in the previous post, in fact, was: make www the default canonical if "www and non-www are identical and both return 200 OK status", which would solve the problem in more than 90% of cases IMO. In cases where the content is different (my suspicion is that it is a very small %), then leave them as is, with each canonical powered by its own links. I guess it would be a very simple solution to a large problem (unless I am missing some other tech possibility here).
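To make the proposal concrete, here is a rough sketch of that heuristic, assuming we already have each host's status code and body in hand. The function and its return values are my own illustration, not anything Google has published:

```python
# Sketch of the proposed rule: prefer www as the single canonical only
# when both hosts are live (200 OK) and serve identical content;
# otherwise treat them as genuinely separate sites.
def choose_canonical(www_status, www_body, non_www_status, non_www_body):
    """Return 'www' to fold everything onto the www host, else 'both'."""
    if www_status == 200 and non_www_status == 200 and www_body == non_www_body:
        return "www"   # credit all links to the www version by default
    return "both"      # different content: index each canonical separately
```

Even as a sketch, it shows where the edge cases live: any mismatch in status or content drops you into the "both" branch.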
But Googlebot does not visit the pages at the same time - eg the non-www might be visited on the 1st of November and the www on the 15th of November.
Lots of sites change their homepage daily, weekly, hourly - or even on every visit (different ads run from a database or something).
Sooo - unfortunately that is not the ideal solution.
And as far as we know, Google do get it right 90% of the time now - so there would always be sites missed?
>>>with each canonical powered by its own links.
So whole sites get indexed twice - hmmm - as long as results from domain.com/www.domain.com are restricted to two, and it does not result in an onsite duplicate content penalty, then I wonder if that would work.
Not ideal - but better than the mess at the moment?
Zikos - I am very much hoping that once J3 has been fully implemented we will see a big crawl based on the J3 infrastructure.
[edited by: Dayo_UK at 1:47 pm (utc) on Nov. 21, 2005]
But Googlebot does not visit the page at the same time
Nor does the algo work at the same time the bot makes a visit.
Obviously I don't have the stats, but I am fairly sure the number of sites having different content on www and non-www is far smaller than the number of sites suffering from this canonical mess now.