Part 3 Update Jagger

Forum Moderators: Robert Charlton & goodroi

Message Too Old, No Replies

Part 3 Update Jagger

soapystar

4:10 pm on Nov 5, 2005 (gmt 0)

Continued from
[webmasterworld.com...]

if it rains they will need a replay!

Yippee

8:20 pm on Nov 6, 2005 (gmt 0)

From MC's comment (per Steveb), I would guess the future SERPs will lean more towards 66.102.7.104. Perhaps not 100%, but definitely not 50%.

Dayo_UK

8:22 pm on Nov 6, 2005 (gmt 0)

Well wow - in that case they have managed to fix one canonical problem in 10000000000000.

Sorry - annoyed. There must be more to come.

TammyJo

8:22 pm on Nov 6, 2005 (gmt 0)

ww-watcher

A site that tells you what colors of paint look good with a certin flowery wallpaper, and what window dressing & flooring go good together, and then tells you how to remove the old, and install the new, and then has a link to buy these products is not informational?

Yes...it is informational

I think you might be misunderstanding what I meant.

Ecommerce = Target stop & shop type site.
Informational - HGTV(they don't physically sell a tangable product...they just tell and refer.)But they have advertisers to support their content.

cleanup

8:28 pm on Nov 6, 2005 (gmt 0)

I see signs that they are working in that "second order change" now..:))

I am watching with the mcdar tool.

Still not at the "9" yet though.

King of all Sales

8:40 pm on Nov 6, 2005 (gmt 0)

Eazygoin

You're right on the mark. I love Adwords because once you get it figured out, you can dramatically lower your cost per click while attracting more traffic and the traffic converts much better than organic listings. And I don't worry about traffic from orgaic listings. If we get it, then it's gravy. I don't think Google owes me a thing.

But from Google's standpoint, advertisers like me pay the bills and you are right, we wouldn't even be having this dicussion if it weren't for the advertisers. So if the users who are looking to spend money - corporate purchasing agents for example - get sick of plowing through the weeds, they won't hesitate to bolt. And my advertising spend will follow the buyers. Period.

Things may change, and they usually do, but from the way it looks right now, Yahoo and MSN have a much better handle on that concept.

Jimmers

8:40 pm on Nov 6, 2005 (gmt 0)

First post after utilizing these forums for years because of the sheer degree of information regarding many areas of webdesign and optimization. But, with this update, as well as many others, it all appears the same.

The very first piece of advice I saw time and again was first and foremost building your site for your visitors. This was stated by many of the webmasters who are scurrying to make changes with each update. I understand everyone wants to take advantage of the traffic the big G has at it's discretion to provide but, doesnt anyone else believe that a large majority of webmasters have lost sight of the 'building for the visitors.'

People are running to make changes to the tags and descriptions, create 301 redirects, checking and counting link numbers on their pages, holding back on their updates simply because they may contain too many pages at one time for G and they will trip a penalty.

I respect what G has created for the most part and yes, my sites have been affected on several updates but, it's come to feel like someone dictating how I should build my sites if I want any traffic from them.

I simply decided to build for humans after realizing how many hours were spent trying to chase down problems that may not exist to begin with. The traffic always seems to come back sooner or later and in the meantime the actual visitors can enjoy what has been produced.

idolw

8:41 pm on Nov 6, 2005 (gmt 0)

From MC's comment (per Steveb), I would guess the future SERPs will lean more towards 66.102.7.104. Perhaps not 100%, but definitely not 50%.

I would wish them that. These results are really clean and more comprehensive.

anttiv

8:44 pm on Nov 6, 2005 (gmt 0)

MC's answer to someone who had his site drop nowhere overnight:

"Kelly Jones, that probably means that you don�t have any manual penalty. But the algorithms can still change and that can affect the ranking of your site. A reinclusion request doesn�t do much in such a case, because it�s the scoring that is causing the site to rank differently, not any sort of manual penalty."

Who programmed an algorithm that suddenly drops a normal clean site several hundred positions down?

Everyone here with good rankings should remember that all it takes is one (minor) algo change and the next morning you check your stats you're nowhere. Get more income sources and build more websites to different domains before it's too late.

Leosghost

8:52 pm on Nov 6, 2005 (gmt 0)

King of all sales

First off welcome to WebmasterWorld

Second ...thank whatever gods one beleives in for you bringing some intelligent input ..there is some but post moderation would drop these threads to less than 50 posts ..bleating on bridges doesn't even begin to describe most of the posts

Thirdly ..a good friend sent me a link the other day when we were discussing this thread off WebmasterWorld which was inspite of the best efforts of some to elevate it from "how you doin ..i'm doin"...not succeeding in changing the tone ( so mostly unmitigated rubbish and zero analysis ..presume the posters would be the same who would like avatars on here )..

He sent me this ...seems to be about what your son [mindset.research.yahoo.com...] was thinking of ..but better ..

Most of the rest of you can get back to your pyjama party and eat canonical cake ..like "G"'s PR dept wants you to ..

Oh yeah and last off ..phase 3 started last Saturday 31.10.2005..

while you were all busy looking the other way ..like the guys at the plex wanted you to ..:)

[edited by: Leosghost at 9:06 pm (utc) on Nov. 6, 2005]

g1smd

8:56 pm on Nov 6, 2005 (gmt 0)

Dayo: Just what do you mean by canonical page(s)?

For me, that has always meant Google trying to figure out whether to list the whole site as www or as non-www, and in the absense of any redirects to force the issue, they would score things (incoming links, internal absolute links, etc) to try to work out which was best.

A few years ago, they were able to combine the backlink results and the PR results for both www and non-www and showed the same result in the SERPs for both queries. It took a few months for the combination to happen on a new site. GoogleGuy even posted (several years ago) that this process only ran a few times per year. In the meantime if you added the correct 301 redirect to your site then you didn't have to worry about such things, it just worked.

Then, a while back, things changed again so that the www/non-www scoring was now being done on a page-per-page basis, with simple "duplicate content filtering" taking out of the SERPs (or at least making it URL-only) the "other URL" for the same page. What now happened was that some parts of the site were being listed as www and others were listed as non-www instead.

This started to make a mess of things. One problem was that PR might NOT be being passed around the site to full advantage, but the other problem was much more incidious.

After having one URL filtered out, if the page was ever modified again, the "other" URL for the page would then be "UN-duplicated" and would re-appear as a supplemental result in the SERPs, with a cache from years ago, and the page would start to rank for content that may no longer exist on the real site.

Google had long ago stopped running their "equalise www and non-www" process over their database, and any sites without a 301 redirect in place now started contributing huge amounts of ancient pages to the supplemental index, and massive amounts of "almost duplicate content" to the SERPs.

Even with the redirect put in place at a later date, data in the supplemental index still wasn't touched by that, and continues to show up in the SERPs. With this update, Google is ranking those supplemental pages lower, but Google has never cleaned the supplemental index of cached data that no longer represents real life. That was something that I expected to see from this "update".

To me "canonical pages" means that I need to set up a redirect to tell Google that only the www version of the site needs to be indexed (or the non-www if that is preferred). Having done that, I expect only that version to appear in the index, and there to be no public data for the other version. (If I don't set a redirect to say which one I want, then Google will decide for me.)

For sites where they have already indexed stuff from the "other version" before the redirect was added, I expect them to run a process over their supplemental database, now and again, that looks at the status of every URL that is indexed there and then takes following actions:

- supplemental URL returns 301 and a new URL that leads to a page that has content with status 200: Dump all cache data, title, and snippet data, from the supplemental index for the "old" URL. You have the new URL for the page - go index it.

- supplemental URL returns 404 and has done so for many months: dump the cache, title, and snippet, 6 months after the page went 404. Page has gone. Dump the data.

- supplemental result returns real content and status 200: Update the cache and compare that to other URLs to see if duplicate, if duplicate dump the cache, title, and snippet -- if no duplicate found, leave it in the supplemental index.

Finally, run a process over the supplemental title and snippet database (and for the search queries for which these are returned) (yes, that IS a separate database to wherever the supplemental cache data is kept {I can show you many examples of that as proof}) to remove out of date titles and descriptions from the supplemental index, for pages that in normal search results are shown under the same URL but with a totally different title and snippet. Remove the "ancient history" title and snippet from the supplemental results, and stop the URL being returned as a match for content that hasn't been on the page for several years.

The supplemental index has gradually filled up with junk over the past few years. It really needs a good clean up.

Another example of something mid-bogglingly stupid is when a page goes 404 and still appears in the index; you can use Google Remove to seemingly remove it, but in fact all that happens is that you hide it for 3 or 6 months. After that time, even if the page is still 404 it automatically reappears in the index.

Why?

If the page is 404 and the webmaster asked for it to be removed, then it should stay removed until such time that it ever comes back with a 200 status.

The same is true for pages where the domain no longer exists. You can use Google Remove to get rid of it, but again after 90 or 180 days the page re-appears in the supplemental index even though it still no longer exists. Why?

The supplemental index is stuffed full of millions of pages of such junk. I don't need to see an algo shuffle as an update; I would like to see the actual data in the database cleaned up: spider every URL on the web and act on the status code that is returned. Currently this never happens with supplemental results.

These are the sorts of issues that I thought "fixes to canonicalisation and supplemental problems" would address. For those issues, nothing has changed. Nothing at all.

The searches I monitor haven't changed one bit in the last 4 months. There are two very different sets of supplemental results out there: old and very ancient. In fact Google added a load of duff supplemental data back into the SERPs in August - so instantly UN-fixing the canonical problems that the many 301 redirects added back in March had finally fixed in June.

Is there any hope that Google is going to re-spider all the URLs and actually update things rather than piling yet more ancient history into the supplemental index?

Some people seem to be referring to "canonical page" as somehow referring to Google identifying which page is the homepage of the site. That isn't what I see that term referring to at all.

What I see that referring to is where the same content can be accessed by multiple URLs:

- directly at: www.domain.com/somepage.html
- and also via: domain.com/somepage.html
- via a 301 redirect from otherdomain.com/somepage.html
- and by a 302 redirect from competitor.com/redirect.php

that Google picks the "correct version" to index and show in the SERPs and then dumps or hides the rest.

[edited by: g1smd at 9:11 pm (utc) on Nov. 6, 2005]

Eazygoin

8:57 pm on Nov 6, 2005 (gmt 0)

King of all Sales >>
I stated it on a much earlier post, and I'm going to mention it again. Google provides 30% of my leads, and that is excellent. But I diversify, using various other marketing opportunities, to create a broad spread of customers. In this way, I don't rely on one search engine to provide my users.
I believe Google does an incredible job, but I have to also do my part, by finding other sources. Optimising a site for Google, but ALSO, and primarily for customers is one thing. Finding business from elsewhere is another, and very important thing.

petehall

9:11 pm on Nov 6, 2005 (gmt 0)

You all might as well just accept that the results are going to look more like 66.102.9.104 than anything else.

The storm is almost over...

jcmiras

9:11 pm on Nov 6, 2005 (gmt 0)

Have you guys notice a change in 66.102.7.104? it seems like the SERP in 66.102.7.104 = 66.102.7.104(x) + 66.102.9.104(y), where x>>y (x is very much greater than y). In short 66.102.7.104 dominates 66.102.9.104.

idolw

9:14 pm on Nov 6, 2005 (gmt 0)

jcmiras,
yes. and the "7" DC shows new cache (Nov 5 for my site) while the "9" Dc shows cache from Oct 21

Dayo_UK

9:16 pm on Nov 6, 2005 (gmt 0)

g1smd

Wife is really starting to nag that I am on PC so will be quick.

Canonical url effects only 1 page of the site. A site can only have one canonical url - not hundreds - eg 1,000s of pages can be listed under the non-www but this does not nescarilly mean that you have a Canonical url problem - if you have the homepage listed under the non-www then yes a canonical url problem is more likely.

Basically how I see it is the Canonical URL is the Main/Homepage of the site. Eg where the site begins and all the rankings come from - now if you have a non-www homepage Google can get confused as this is the root of most basic part of the domain as you were. EG - I think sometime ago Google thought that it was a Good idea to say that all Canonical pages are domain.com rather than www.domain.com - or something like that.

If you read back over GoogleGuys older posts you can see that you can have a Canonical url problem without having any non-www indexed. Eg. If, every single interal page of your site linked to www.example.com/intro/homepage.com Google will look to picking that page as the Canonical over the homepage.

Example:-

[webmasterworld.com...]

Note that this has nothing what so ever to do with non-www.

Another example of the problem when it does not involve the non-www:-

[webmasterworld.com...]

etc etc

Basically - Google have not fixed the problem yet - it has something to do with non-www indexing (eg if you have a non-www homepage Google seems to prefer this as the Canonical) - but just because you may have non-www pages showing does not mean you have a canonical url problem.

Sorry - this was very rushed - so may not make sense.

jcmiras

9:19 pm on Nov 6, 2005 (gmt 0)

"yes. and the "7" DC shows new cache (Nov 5 for my site) while the "9" Dc shows cache from Oct 21"

Yup. And since the objective of J3 is just something related to indexing, I think the DC,66.102.7.104, that they (G) had just updated will be the final SERP.

Go team 7!

King of all Sales

9:19 pm on Nov 6, 2005 (gmt 0)

Leosghost-

Thanks for the link. You are right - Mindset is better than what my son had in mind. Yipee mentioned Mindset, but I couldn't find it.

I have to say that it is facinating and works like a charm. From a users standpoint, it's easy to plug in one search term and "rummage" through Yahoo's index for that term. I can tell you from experience that users really love a degree of interaction as long as it doesn't get too complicated. This seems to fit the bill.

The really interesting thing is that Yahoo has been able to accomplish that without shaking all the fruit off the tree every three months. Thanks again!

stinky

9:19 pm on Nov 6, 2005 (gmt 0)

�can you see your site when doing a
site:www.yoursitename.ext?

If yes, check next for _site:yoursitename.ext -www _Do you see results?�

site:www.yoursitename.com
gives me back 144 results and my home page is not first, it is on the second page of results. Also 5 of my site results show without no title or discription, just the URL. Those pages with just the URL are fairly new pages 2-3 month old. Also when I try searching for parts of text from those pages in G, I do not get anything.

site:yoursitename.com �www
give me back 12 results, all without the www. I did not notice my home page in the results.

Another thing I noticed, when I do searches for www.mysite.com and mysite.com they both give me back result that have www, www.mysite.com

I do not have any 301 redirect in my site, should I place one in asap? Is the below correct?
_____
Options +FollowSymLinks RewriteEngine on RewriteCond %{HTTP_HOST} ^domain.com [NC] RewriteRule ^(.*)$ [domain.com...] [L,R=301]
______

Should this 301 be placed in between the <head> <head/> tags?

About my site: It is clean, I write good helpful content, and it has never had any problems through any of the past year updates.
Half of the results under my keywords show sites that are directories of sorts. A lot of their pages come up but not their home pages, example: DirectorySite.com/helpful-product.html . Also there are some .uk sites which do not ship/sell the product to the usa.
Any help or advice?
-Thanks

donelson

9:24 pm on Nov 6, 2005 (gmt 0)

yes. and the "7" DC shows new cache (Nov 5 for my site) while the "9" Dc shows cache from Oct 21

For our main site, both of those show the current page as cached on 4 Nov. For the second site, one shows a cache date and the other does not.

petehall

9:26 pm on Nov 6, 2005 (gmt 0)

Yup. And since the objective of J3 is just something related to indexing, I think the DC,66.102.7.104, that they (G) had just updated will be the final SERP.
Go team 7!

Why on earth would you think that when it's been stated 9 will be used as first order.

We'll see in the next day or so anyway...

tonyss

9:27 pm on Nov 6, 2005 (gmt 0)

DC 9 and 11 also are showing new cache dates for my sites.

steveb

9:35 pm on Nov 6, 2005 (gmt 0)

"Basically how I see it is the Canonical URL is the Main/Homepage of the site"

No, that really isn't the point in talking about canonicals.

site.com/page1.html
www.site.com/page1.html
www.site.com/copyofpage1.html
anothersite.com/page25.html
All of these could have the exact same content on them. Google's mission is to find, and rank, the canonical content, the correct,(usually) original, best content. Obviously Google should not rank four, or hundreds, of copies of the exat same page. They try to find the canonical page.

The most common way this comes up is duplicate content on the same domain. Here the challenge is simple to state: Google tries to choose the most appropriate URL from all the copies. This could be
www.site.com/page1.html
www.site.com/copyofpage1.html
or relate to www non-www issues.

The canonical page of a domain is just one challenge Google faces in dealing with canonical URLs or canonical content or canonical pages. "canonical issues" include all these.

[edited by: steveb at 9:36 pm (utc) on Nov. 6, 2005]

Patrick Taylor

9:35 pm on Nov 6, 2005 (gmt 0)

g1smd: msg #:400

Excellent post. Thanks for sharing.

g1smd

9:47 pm on Nov 6, 2005 (gmt 0)

Dayo: you have the wrong definition for canonical. Sorry.

The canonical URL for a page is the single URL that best represents that one piece of content; where that content can be accessed by multiple direct-access URLs (multiple domains, and www and non-www for each domain), and via one or more 301 or 302 redirect(s) from other parts of the same site or from another site.

Google needs to choose one URL to show in the SERPs for each piece of content: that URL is the canonical URL for that page. There is one canonical URL (and multiple duplicate content URLs) for every piece of content on the web.

The "302 hijack" problem earlier in the year was a canonical URL problem, where, for your page at www.yourdomain.com/somepage.html Google chose to list your content as belonging to the www.competitor.com/redirect.php?page=www.yourdomain.com/somepage.html URL that pointed a 302 redirect at your page.

[edited by: g1smd at 9:49 pm (utc) on Nov. 6, 2005]

joeduck

9:47 pm on Nov 6, 2005 (gmt 0)

G1 - superb post #400. Everyone in this thread asking about canonicals should read that as a point of reference on the topic.

It's my understanding from talking to Google that many of the "canonical" concerns and problems arose from the 302 hijacking/misdirection/redirections that were a huge problem last fall.

I think many problems we are all seeing now are from decisions about sites that were made *in fall of 2004*. Faulty decisions that incorporated many canonical and 302 problems. Will Jagger 3 fix most of these? I'm losing hope for our site.

Patrick Taylor

9:53 pm on Nov 6, 2005 (gmt 0)

And I think the hard thing for many webmasters... me anyway - even if one has read and understood g1smd's post #400 (and put the 301 in place) - is to grasp what, if any, effect a 'canonical' or 'supplemental' issue might be having on their rankings.

[edited by: Patrick_Taylor at 9:58 pm (utc) on Nov. 6, 2005]

g1smd

9:56 pm on Nov 6, 2005 (gmt 0)

stinky: This goes in the .htaccess file (plain ASCII text) in the root folder of your site (you do use "Apache" I assume?):

Options +FollowSymLinks
RewriteEngine on
RewriteCond %{HTTP_HOST} ^domain\.com [NC]
RewriteRule ^(.*)$ http://www.domain.com/$1 [L,R=301]

All: Post #400 has been something that I have wanted to post for more than a year. I know that multiple emails have been sent to Google Support by dozens of people that have this problem. They have all been fobbed off with cut and paste "Google will update the data next time they spider the web" answers (no they do not, it just accumulates in the supplemental index). The replies show that Google has totally failed to understand this problem; and this "non-update" just reinforces that.

GoogleGuy totally avoided my posts at #25 and #37 of this thread. I really hope he reads those and #400 and copies them to the backroom boys. The frontline people at Google don't seem to understand the problem.

[edited by: g1smd at 10:11 pm (utc) on Nov. 6, 2005]

Patrick Taylor

10:03 pm on Nov 6, 2005 (gmt 0)

Technical question: is the no need for a backslash in front of .com?

RewriteCond %{HTTP_HOST} ^domain\.com

zikos

10:05 pm on Nov 6, 2005 (gmt 0)

"Have you guys notice a change in 66.102.7.104? it seems like the SERP in 66.102.7.104 = 66.102.7.104(x) + 66.102.9.104(y), where x>>y (x is very much greater than y). In short 66.102.7.104 dominates 66.102.9.104. "
for all of you that have dreams about 66.102.7.104 let me tell you that 66.102.7.104 has 800 pages of my site including all pages deleted the last 2 years ,while 66.102.9.104 gives 250 all the pages of my site without any supl/ls.So stop hurraing about 66.102.7.104 because your sites have a good ranking there ,that DC has problems with canonicals ans suplem/ls.

g1smd

10:05 pm on Nov 6, 2005 (gmt 0)

Added it to the example. Not sure where it originally disappeared to.

This 516 message thread spans 18 pages: 516