Forum Moderators: Robert Charlton & goodroi
This is my conjecture, but what if Google is now working with domain-wide semantic factors a bit more strongly than in the past?
Some are seeing a strange trend toward SERPs that rank higher level pages that are actually one click away from the real "meat" - now what's that all about?
Also, I've looked at enough troubled rankings recently to realize that some of the domains involved have developed into a kind of crazy-quilt of topics. As long as the scoring was very strong on just the individual URL, these sites were doing great. But just maybe the disconnectedness of their "theme" is now being detected and seen as a negative.
I'm talking here about sites that throw up lots of varied pages to catch different kinds of keyword traffic, you know? They usually have "real" content, not scraped, but it's either MFA or (dare I coin another acronym?) MFO, made for organic. What Google says they want is MFV, made for the visitor.
Now obviously general news sites are also a crazy quilt of a kind, so it shouldn't just be any wide ranging diversity of topics that is problematic - that's not precise enough. But Google probably knows that their end user is often happier when the SERP sends them to a domain filled with relevant information, and not just a one-off page or even a small section.
Something about this feels like it's lurking in the back of my brain somewhere trying to break through. I am thinking more about domain-wide positive relevance signals here, rather than penalties.
Have my babblings triggered anyone's brain cells?
I have a 100 page site with lots of original content that I wrote myself. It has lots of inbound links many of which come from trusted sites and it has been near the top of the rankings for five years. It is still number one on .co.uk but lately it has slipped down to the second page on .com in favour of sites that are less relevant.
I would happily take my chances with a site rank as opposed to a page rank. In my situation it would mean that my site would be recognised as the resource that it is instead of being outranked by lesser sites perhaps with cleverer SEO on a single page.
Suppose a site gets a relevant, high quality referral with its key theme to one of its pages. Homepage, subpage, any page. This URL will have scores for a word or phrase that it's supposed to be relevant for.
If the internal navigation is meant to combine or further develop this relevancy, then no matter what page the links and visitors land on, both the users and the algo will gladly accept that on a widgets page ( page, not site ) the internal links promote other pages as being of value for the keyword "widgets", even if the anchor text of the internal navigation doesn't mention widgets all over again.
Given that the source and the target are relevant, and use phrases that are interpreted as carrying on the theme, a site will thus score higher.
The landing page for widgets can pass on further relevance with any words that are recognized by Google as a valid combination / derivation.
And so, the algo works its way through the site page by page, but since the navigation is coherent, the entire site will benefit from it.
With new-age "stemming" just making its way into the most-wanted factors, pages will sometimes be found via derivations as well, even if they don't have a single inbound link or navigation featuring those in the anchor OR the content. ( One of these is necessary though. )
For inbound links and navigation, relevancy scores are not just stabbing holes into the scorekeeping card. It's more like pellets, with a big hole in the middle, and some supplemental scores all around for anything that's related to the theme... some you may already target, and some you may not even know of. And relevancy isn't just semantics anymore; it's data gathered or hand-typed into a database of related phrases.
So regarding your question:
I'd say the algo is examining pages, not sites. But a well-kept, well-researched navigation will turn the tables and, in the end, result in the effect you mentioned.
Include the disclaimer here that while part of this is experience and test results, part of this remains a theory which may be proven false or obsolete at a later time.
But just maybe the disconnectedness of their "theme" is now being detected and seen as a negative.
Then why are Wikipedia pages creeping up in the SERPs? I don't mind when a good article does well, but pages marked as stubs are beating out sites with pages of information. Google must be considering the whole site heavily, even though it's mostly unrelated.
trend toward SERPs that rank higher level pages that are actually one click away from the real "meat".
In some cases this is happening because the page with the real "meat" has been 950ed.
I would love to see Google consider the whole domain more, including how related its pages are, but I haven't seen that trend yet. I hope it's on its way.
I think the Wikipedia argument supports this; the occasional 'greater dud' is held up by the earned power of its neighbors. In general, a page's PR often adheres to a pattern - shrinking by the number of clicks from the main page. But there are many, many sites where individual pages break this - and often it can be demonstrated that they do so because of their own link strength, based on powerful content.
If MFO and MFV aren't virtually identical, it's only because the webmaster has misunderstood the term 'organic' :)
[edited by: Quadrille at 4:07 pm (utc) on Mar. 25, 2007]
G will start looking at all the data acquired from the G Desktop & Toolbar stats to determine the quality of a "site", and place some sort of unknown, unseen score, like PR, but based on how often users bounce back off of it, bookmark it, time spent cruising the site, the number of pages surfed, or some combination of all of the above. This would then be used somehow in the algo to boost the trust rank, floating the quality to the top, & the crap that no-one wants to see to the bottom.
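For fun, that combination of bounce rate, bookmarks, time on site and page depth could be sketched as a simple weighted score. Everything here (the metric names, the weights, the normalization caps) is invented purely for illustration; nobody outside Google knows what signals they actually combine, or how:

```python
def behavioral_quality_score(bounce_rate, bookmark_rate,
                             avg_time_on_site, pages_per_visit):
    """Toy illustration of blending toolbar-style engagement metrics
    into one site-quality number. All weights are invented."""
    # Normalize time (seconds) and page depth into rough 0-1 ranges.
    time_signal = min(avg_time_on_site / 300.0, 1.0)   # cap at 5 minutes
    depth_signal = min(pages_per_visit / 10.0, 1.0)    # cap at 10 pages
    return (0.4 * (1.0 - bounce_rate)   # low bounce rate is good
            + 0.2 * bookmark_rate       # bookmarking suggests value
            + 0.2 * time_signal
            + 0.2 * depth_signal)

score = behavioral_quality_score(bounce_rate=0.3, bookmark_rate=0.05,
                                 avg_time_on_site=180, pages_per_visit=4)
```

The point isn't the specific weights; it's that a handful of user-behavior signals collapses into a single score very easily, which is why the idea is so plausible.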
Pure Conjecture at this point, but you asked for "Brainstorm"
Back to Lurking
WW_Watcher
Edited to Add.
This would be more of a quality score created by surfers, than by inbound links that have been manipulated by SEO.
[edited by: WW_Watcher at 5:24 pm (utc) on Mar. 25, 2007]
Some are seeing a strange trend toward SERPs that rank higher level pages that are actually one click away from the real "meat" - now what's that all about?
On this I'd have to go with Miamacs and Quadrille: it's all in the linkage. In cases like this it might simply be that the link-text dial was turned up just a tad, enough to give the less specific, but now much more powerful page, a bit more oomph.
I've always been a big believer in internal pages supporting each other. My usual structure is pyramid/silo, but I deliberately address different aspects of a widget in different silos just for the opportunity of supporting cross links. I can't say that this structure is bulletproof, but it does appear to escape any drastic across the board ups and downs.
Piney:
Good observation. This is another aspect of the "new" Google commented on by Matt Cutts that has been totally overlooked, and that really addresses Tedster's "How might Google measure the site, and not just a page?"
A few years ago Google would simply return results for [MyFirstName] based simply on ranking. Google today has the ability to return results based on query type, i.e., navigational, informational, transactional.
For a transactional [buy MyFirstName] query, it will return mostly e-commerce sites. An informational [history of MyFirstName] query might see .edu, .org and other sites identified as informational in the SERPs. For a very general [MyFirstName] search, Google, not knowing exactly what you're looking for, will return a mix of types of sites.
What you described fits in perfectly with Matt's comments. Changes like this explain many of the seemingly abnormal results cited by many folks. Mixed in with other factors, well, it sometimes makes my head hurt trying to figure it all out.
<added>
G will start looking at all their data acquired from the G Desktop & Toolbar stats...
I think a general consensus has been building for the past year, year and a half or so that G is using this data to at least some extent. How much is anybody's guess, but I'd put bucks on it being used somewhere, somehow.
</added>
I usually take pains in separating fact and what I think, always stressing that much of SEO is opinion and gut feeling, with very few actual facts on which to base opinion. This time what I thought was fact turned out to be my opinion. It was initially based on a comment Matt made on a Danny Sullivan interview with Jimmy Wales. He basically said that Google has the ability to change the mix of results returned if it thinks users prefer a different mix. He did not say anything about different query types. His comment stimulated my thinking to go down a certain path, but the conclusion was mine, based on a broad swath of reading, other folks' comments around the boards and observations.
It's strange though that Piney's observation fits quite nicely -- though maybe not exactly -- with my hypothesis. Google does have the ability to categorize types of sites (as does Yahoo). I'm still using my hypothesis as the basis for my approach; anybody else please take it with a boulder-sized grain of salt.
Color me embarrassed in Manhattan. :o
He basically said that Google has the ability to change the mix of results returned if it thinks users prefer a different mix. He did not say anything about different query types. His comment stimulated my thinking to go down a certain path, but the conclusion was mine, based on a broad swath of reading, other folks' comments around the boards and observations
Looking at the whole site may not be too far off from reality, since Google's been looking at "whole sites" for a long time - for clustering purposes. I don't think it's too far a stretch to think they may be looking at whole sites overall to determine contextual relevancy with enhanced accuracy - which may be a significant factor in that phrase-based patent group.
Would there be a sector by sector factoring? For a sector like news, lots of fresh content would be vital. For another sector, it could be detrimental (does a normal business selling normal products add 1000 products to its inventory in one month?).
The driving factor is that the site content is fairly weak, MFA/MFO (I like it!), and if I were a visitor I would generally go elsewhere, often via an AdSense click.
The site has a fairly high bounce rate, low page views per visitor, and is distinctly average compared to some (not all) of the other high rankers. The one thing we have is a high number of quality established links.
If Google's using the toolbar data and looking at how many times repeat searches were made after hitting this site, compared to other sites in the sector, then it may well indicate it needs downgrading.
No fancy optimization (more a la "Brett", with some updates)
But I always make a point of helping them, and/or having a pro in their field help create really relevant content from page A to Z
What we are (I suppose) seeing is that there is less room for "pro-optimization heavy-duty manipulators"
I'm talking here about sites that throw up lots of varied pages to catch different kinds of keyword traffic, you know? They usually have "real" content, not scraped, but it's either MFA or (dare I coin another acronym?) MFO, made for organic. What Google says they want is MFV, made for the visitor.
Does that mean that if I have a site about widgets and want to make it more comprehensive and write about widget groups, Google won't like me anymore?
If I change my site about Toyota cars into an all-about-cars site, might it hurt?
IMHO:
Lasnik said here the other day: we ban the whole domain in order to prevent you guys from running nasty experiments on a good domain.
We know: if your domain is trusted any page will rank.
How do you reach the conclusions that you do about "others"?
When I say "others" fracture that: When you seek "wise counsel" in matters of peer issues? When you seek comfort? When you seek expert advice on matters of specialized knowledge? When you have a mind to "do business" with a firm or company?
What if search began moving datastreams into specialized algos? Maybe one size doesn't fit all. Maybe the organism . . err . . the algo . . is evolving . . because it must to survive.
In fact, I see less incidence of a page from a very high-ranking site like, say, the BBC ranking highly in a search for, say,
widgets for sale
Good for me, cos it was quite embarrassing to be outranked by the only page on the BBC site, about a TV program on selling widgets, when my whole site was about widgets and actually sold widgets.
This trend has kept alive my interest in sub domains
What if search began moving datastreams into specialized algos? Maybe one size doesn't fit all. Maybe the organism . . err . . the algo . . is evolving . . because it must to survive....
The "algo" is a living breathing entity within the confines of software/programming...
In regards to Google...we are seeing some "hormone" issues here...(some changes and new feelings from the algo...as it grows and learns... ;-)
With the direction of this thread...let me add .. that some of what I am looking at recently hints that Google is looking at the full "stemming" values across not only the local domain ... but the inbound / outbound link market structure for the root domain AND interior pages...
Now tie this in with the recent interface testing Google is doing for localization / Google Maps / and possible Google phone services... and the fact that Google is now running 700? IP numbers related to their distributed data-center network topology... and you have a serious move towards localization... and pushing a new advertising model through Google Maps... ..?
ALL Of this is conjecture on my part...as I am not sitting in Google's main "data war room" looking at large sets of stats and usability data for all that Google has going on these days...
The knowledge and information life-cycle, discernment of knowledge, dilution of knowledge, discernment of static (historical, subject to interpretation) vs. evolving knowledge, information per se (look it up ~ state capitals) vs. creative knowledge vs. analysis vs. opinion with and without authority, and context of information should excite algo evolution for the next decade.
It may not be an operational tip but it's something to consider: Context likely matters. Discern signals of context of content. "Signals of quality" may be a close cousin but not an identical twin.
What contextual signals does a website send? Language may express context. What's included and what's missing in the language, images, design - the "character" of the website? How or why do those signals matter to consumers of information, knowledge or 'other'?
Just a few clouds in the brainstorm.
[edited by: Webwork at 5:05 pm (utc) on Mar. 26, 2007]
I do think that site-wide ranking does occur and I agree with the earlier evidence that Wiki can pretty much put up any page and it's going to make it into the top 5 results. In fact, I also think the knob on site-wide has been turned up because I see Wiki ranking for all kinds of one word terms. History, medical, biographies... Wiki has a very strong presence everywhere. And because we've seen unfinished pages ranking in the top three spots, I'm leaning more towards site ranking than page ranking.
Then on the other side, I have two sites where no matter what I put on them, it starts ranking highly and immediately. One of them has a domain that's been around since 1994 or so - I can kind of understand that. The other one is my partner's punk rock band site. No matter what we put up there, it seems to fly to the top of the results, for no apparent reason - a single mention of one of his guitars got that page ranking higher than the manufacturer! I sure wish I could figure out what was right on that site, and wrong on some of the others. But it's not apparent to my naked eyeball or brain.
(No, I'm not taking offers for product placement)
I do think that site-wide ranking does occur and I agree with the earlier evidence that Wiki can pretty much put up any page and it's going to make it into the top 5 results. In fact, I also think the knob on site-wide has been turned up because I see Wiki ranking for all kinds of one word terms. History, medical, biographies... Wiki has a very strong presence everywhere. And because we've seen unfinished pages ranking in the top three spots, I'm leaning more towards site ranking than page ranking.
"Authority" - not in currently used form meaning quality of information, which is largely subjective - but in the classic IR sense - ala Jon Kleinberg's Hubs & Authorities (circa 1998), has been in evidence for a long time, and it's by site. Think back to the Florida update, when all the "big directories," Ebay and Spamazon started turning up for everything but the kitchen sink. It's stil happening, at an accelerated rate.
That's site wide, and another way that can be used sitewide is:
(second order co-occurrence) + (word sense disambiguation) = phrase-based indexing
That can be done on a much more granular level than on an entire data index, and IMHO that's what we've been seeing for a while - not LSI.
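For what it's worth, the "second order co-occurrence" idea is easy to illustrate: two terms count as related when they keep company with the same neighbours, even if they never appear in the same document. Here's a toy sketch with a tiny invented corpus and cosine similarity over raw co-occurrence counts; it's not a claim about how Google actually computes this:

```python
from collections import Counter
from math import sqrt

# Invented three-document corpus. Note "widget" and "gadget"
# never appear together in any document.
docs = [
    "widget repair guide",
    "widget maintenance guide",
    "gadget repair manual",
]

def cooccurrence_vector(term, docs):
    """First-order co-occurrence: which words appear alongside `term`."""
    vec = Counter()
    for doc in docs:
        words = doc.split()
        if term in words:
            vec.update(w for w in words if w != term)
    return vec

def second_order_similarity(a, b, docs):
    """Cosine similarity of two terms' co-occurrence vectors: high when
    the terms share neighbours, even with zero direct co-occurrence."""
    va, vb = cooccurrence_vector(a, docs), cooccurrence_vector(b, docs)
    dot = sum(va[w] * vb[w] for w in va)
    norm = (sqrt(sum(v * v for v in va.values()))
            * sqrt(sum(v * v for v in vb.values())))
    return dot / norm if norm else 0.0

sim = second_order_similarity("widget", "gadget", docs)
```

"widget" and "gadget" come out related because both co-occur with "repair", despite never sharing a document; that's the second-order effect in miniature.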
There's also reason to believe that might come heavily into play with clustering, and some of the +950 stuff may be more due to clustering than actual penalties in some cases. It all seems to fit, right along with the changes in crawling & indexing patterns since Big Daddy.
Personal theories, but arrived at after reading a bunch of background stuff and getting that gut-level suspicion about what's going on.
[edited by: Marcia at 5:57 pm (utc) on Mar. 26, 2007]
Block Rank [webmasterworld.com]
In a nutshell, it describes sites by "domain", "host" and "subdirectories", and suggests crawling a whole domain in order to use the aggregate local PR for a site to speed up computation of PageRank on a global level.
Added: FWIW, the BlockRank paper at Stanford:
[stanford.edu...]
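To make the paper's idea concrete, here's a toy sketch of the BlockRank scheme it describes: compute local PageRank inside each host "block", compute PageRank over the block graph, and multiply the two together as a starting vector for the global computation. The graph, the host assignments and the damping factor are all invented for illustration:

```python
import numpy as np

def pagerank(adj, d=0.85, iters=100):
    """Power-iteration PageRank over an adjacency list."""
    n = len(adj)
    M = np.zeros((n, n))  # column-stochastic transition matrix
    for j, outs in enumerate(adj):
        if outs:
            for i in outs:
                M[i, j] += 1.0 / len(outs)
        else:
            M[:, j] = 1.0 / n  # sink pages treated as linking everywhere
    r = np.full(n, 1.0 / n)
    for _ in range(iters):
        r = (1 - d) / n + d * (M @ r)
    return r / r.sum()

# Four pages: 0,1 on host "A", 2,3 on host "B" (invented link graph).
blocks = {"A": [0, 1], "B": [2, 3]}
links = {0: [1, 2], 1: [0], 2: [3], 3: [2]}

# Step 1: local PageRank inside each block, intra-block links only.
local = {}
for name, pages in blocks.items():
    idx = {p: i for i, p in enumerate(pages)}
    adj = [[idx[t] for t in links[p] if t in idx] for p in pages]
    local[name] = pagerank(adj)

# Step 2: PageRank over the block graph, inter-block links only.
names = list(blocks)
bidx = {n: i for i, n in enumerate(names)}
block_of = {p: n for n, ps in blocks.items() for p in ps}
badj = [[] for _ in names]
for p, outs in links.items():
    for t in outs:
        if block_of[t] != block_of[p]:
            badj[bidx[block_of[p]]].append(bidx[block_of[t]])
block_rank = pagerank(badj)

# Step 3: each page's local rank, scaled by its block's rank, gives
# the BlockRank starting vector for the (much cheaper) global iteration.
start = {p: local[n][i] * block_rank[bidx[n]]
         for n, ps in blocks.items() for i, p in enumerate(ps)}
```

The payoff, per the paper, is speed: the local computations are small and independent, so the expensive global iteration starts from a vector that's already close to the answer. It also means a per-host aggregate score exists as a natural by-product, which is the interesting part for this thread.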
Daft theories, but maybe that's what brainstorming is.
[edited by: Marcia at 6:10 pm (utc) on Mar. 26, 2007]
I think you're on the right track. You see a lot of these -950 penalty sites that have a lot of cut-and-pasted content on them.
Overall, Google probably is looking at duplicated content.
If the site is 80% cut and pasted from other sites, or just duplicated content, and 20% unique, then it will probably not do too well.
Wiki, by contrast, is almost 100% unique, and it does very well.
The more unique your site, the better it will rank.
Google could take the overall % of duplication and easily work it into the algos. That's just a matter of math.
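It really is just a matter of math: a crude duplication percentage can be computed by shingling (overlapping word n-grams) and checking how many of a page's shingles appear anywhere else. A toy sketch, with an invented 4-word shingle size; real duplicate detection is far more involved:

```python
def shingles(text, k=4):
    """The set of k-word shingles (overlapping word n-grams) in a text."""
    words = text.lower().split()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}

def duplication_pct(page_text, corpus_texts, k=4):
    """Rough percentage of a page's shingles that also appear
    somewhere in the rest of the corpus."""
    page = shingles(page_text, k)
    if not page:
        return 0.0
    seen = set().union(*(shingles(t, k) for t in corpus_texts))
    return 100.0 * len(page & seen) / len(page)

other = "the quick brown fox jumps over the lazy dog"
page = ("the quick brown fox jumps over the lazy dog "
        "and some totally original words here")
pct = duplication_pct(page, [other])
```

Here `page` copies its first nine words from `other`, so half its 4-word shingles are duplicates and `pct` comes out at 50.0. A threshold on a score like this is the kind of "80% pasted / 20% unique" test described above.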