Forum Moderators: Robert Charlton & goodroi
This is my conjecture, but what if Google is now working with domain-wide semantic factors a bit more strongly than in the past?
Some are seeing a strange trend toward SERPs that rank higher level pages that are actually one click away from the real "meat" - now what's that all about?
Also, I've looked at enough troubled rankings recently to realize that some of the domains involved have developed into a kind of crazy-quilt of topics. As long as the scoring was very strong on just the individual URL, these sites were doing great. But just maybe the disconnectedness of their "theme" is now being detected and seen as a negative.
I'm talking here about sites that throw up lots of varied pages to catch different kinds of keyword traffic, you know? They usually have "real" content, not scraped, but it's either MFA or (dare I coin another acronym?) MFO, made for organic. What Google says they want is MFV, made for the visitor.
Now obviously general news sites are also a crazy quilt of a kind, so it shouldn't just be any wide ranging diversity of topics that is problematic - that's not precise enough. But Google probably knows that their end user is often happier when the SERP sends them to a domain filled with relevant information, and not just a one-off page or even a small section.
Something about this feels like it's lurking in the back of my brain somewhere trying to break through. I am thinking more about domain-wide positive relevance signals here, rather than penalties.
Have my babblings triggered anyone's brain cells?
I have a 100 page site with lots of original content that I wrote myself. It has lots of inbound links many of which come from trusted sites and it has been near the top of the rankings for five years. It is still number one on .co.uk but lately it has slipped down to the second page on .com in favour of sites that are less relevant.
I would happily take my chances with a site rank as opposed to a page rank. In my situation it would mean that my site would be recognised as the resource that it is instead of being outranked by lesser sites perhaps with cleverer SEO on a single page.
Suppose a site gets a relevant, high quality referral with its key theme to one of its pages. Homepage, subpage, any page. This URL will have scores for a word or phrase that it's supposed to be relevant for.
If the internal navigation is meant to combine or further develop this relevancy, then no matter what page the links and visitors land on, both the users and the algo will gladly accept that on a widgets page ( page, not site ) the internal links promote other pages as being of value for the keyword "widgets", even if the anchor text of the internal navigation doesn't mention widgets all over again.
Given that the source and the target are relevant, and use phrases that are interpreted as carrying on the theme, a site will thus score higher.
The landing page for widgets can pass on further relevance with any words that are recognized by Google as a valid combination / derivation.
And so, the algo works its way through the site page by page, but since the navigation is coherent, the entire site will benefit from it.
With new-age "stemming" just making its way into the most-wanted factors, pages will sometimes be found via derivations as well, even if they don't have a single inbound link or navigation featuring those in the anchor OR the content. ( One of these is necessary though. )
For inbound links and navigation, relevancy scores are not just stabbing holes into the scorekeeping card. It's more like pellets, with a big hole in the middle, and some supplemental scores all around for anything that's related to the theme... some you may already target, and some you may not even know of. And relevancy isn't just semantics anymore; it's data gathered or hand-typed into a database of related phrases.
So regarding your question:
I'd say the algo is examining pages, not sites. But a well-kept, well-researched navigation will turn the tables and, in the end, result in the effect you mentioned.
Include the disclaimer here that while part of this is experience and test results, part of this remains a theory which may be proven false or obsolete at a later time.
But just maybe the disconnectedness of their "theme" is now being detected and seen as a negative.
Then why are Wikipedia pages creeping up in the SERPs? I don't mind when a good article does well, but pages marked as stubs are beating out sites with pages of information. Google must be considering the whole site heavily, even though it's mostly unrelated.
trend toward SERPs that rank higher level pages that are actually one click away from the real "meat".
In some cases this is happening because the page with the real "meat" has been 950ed.
I would love to see Google consider the whole domain more, including how related its pages are, but I haven't seen that trend yet. I hope it's on its way.
I think the Wikipedia argument supports this; the occasional 'greater dud' is held up by the earned power of its neighbors. In general, a page's PR often adheres to a pattern - shrinking by the number of clicks from the main page. But there are many, many sites where individual pages break this - and often it can be demonstrated that they do so because of their own link strength, based on powerful content.
If MFO and MFV aren't virtually identical, it's only because the webmaster has misunderstood the term 'organic' :)
[edited by: Quadrille at 4:07 pm (utc) on Mar. 25, 2007]
G will start looking at all the data acquired from the G Desktop & Toolbar stats to determine the quality of a "site", and place some sort of unknown, unseen score, like PR, but based on how often users bounce back off of it, bookmark it, time spent cruising the site, the number of pages surfed, or some combination of all of the above. This would then be used somehow in the algo to boost the trust rank, floating the quality to the top, & the crap that no-one wants to see to the bottom.
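For fun, that combination of bounce rate, bookmarks, time on site and page depth could be sketched as a simple weighted score. Everything here (the metric names, the weights, the normalization caps) is invented purely for illustration; nobody outside Google knows what signals they actually combine, or how:

```python
def behavioral_quality_score(bounce_rate, bookmark_rate,
                             avg_time_on_site, pages_per_visit):
    """Toy illustration of blending toolbar-style engagement metrics
    into one site-quality number. All weights are invented."""
    # Normalize time (seconds) and page depth into rough 0-1 ranges.
    time_signal = min(avg_time_on_site / 300.0, 1.0)   # cap at 5 minutes
    depth_signal = min(pages_per_visit / 10.0, 1.0)    # cap at 10 pages
    return (0.4 * (1.0 - bounce_rate)   # low bounce rate is good
            + 0.2 * bookmark_rate       # bookmarking suggests value
            + 0.2 * time_signal
            + 0.2 * depth_signal)

score = behavioral_quality_score(bounce_rate=0.3, bookmark_rate=0.05,
                                 avg_time_on_site=180, pages_per_visit=4)
```

The point isn't the specific weights; it's that a handful of user-behavior signals collapses into a single score very easily, which is why the idea is so plausible.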
Pure Conjecture at this point, but you asked for "Brainstorm"
Back to Lurking
WW_Watcher
Edited to Add.
This would be more of a quality score created by surfers, than by inbound links that have been manipulated by SEO.
[edited by: WW_Watcher at 5:24 pm (utc) on Mar. 25, 2007]
Some are seeing a strange trend toward SERPs that rank higher level pages that are actually one click away from the real "meat" - now what's that all about?
On this I'd have to go with Miamacs and Quadrille: it's all in the linkage. In cases like this it might simply be that the link-text dial was turned up just a tad, enough to give the less specific, but now much more powerful page, a bit more oomph.
I've always been a big believer in internal pages supporting each other. My usual structure is pyramid/silo, but I deliberately address different aspects of a widget in different silos just for the opportunity of supporting cross links. I can't say that this structure is bulletproof, but it does appear to escape any drastic across the board ups and downs.
Piney:
Good observation. This is another aspect of the "new" Google commented on by Matt Cutts that has been totally overlooked, and that really addresses Tedster's "How might Google measure the site, and not just a page?"
A few years ago Google would simply return results for [MyFirstName] based simply on ranking. Google today has the ability to return results based on query type, i.e., navigational, informational, transactional.
For a transactional [buy MyFirstName] query, it will return mostly e-commerce sites. An informational [history of MyFirstName] query might see .edu, .org and other sites identified as informational in the SERPs. For a very general [MyFirstName] search, Google, not knowing exactly what you're looking for, will return a mix of types of sites.
What you described fits in perfectly with Matt's comments. Changes like this explain many of the seemingly abnormal results cited by many folks. Mixed in with other factors, well, it sometimes makes my head hurt trying to figure it all out.
<added>
G will start looking at all their data acquired from the G Desktop & Toolbar stats...
I think a general consensus has been building for the past year, year and a half or so that G is using this data to at least some extent. How much is anybody's guess, but I'd put bucks on it being used somewhere, somehow.
</added>
I usually take pains in separating fact and what I think, always stressing that much of SEO is opinion and gut feeling, with very few actual facts on which to base opinion. This time what I thought was fact turned out to be my opinion. It was initially based on a comment Matt made on a Danny Sullivan interview with Jimmy Wales. He basically said that Google has the ability to change the mix of results returned if it thinks users prefer a different mix. He did not say anything about different query types. His comment stimulated my thinking to go down a certain path, but the conclusion was mine, based on a broad swath of reading, other folks' comments around the boards and observations.
It's strange though that Piney's observation fits quite nicely -- though maybe not exactly -- with my hypothesis. Google does have the ability to categorize types of sites (as does Yahoo). I'm still using my hypothesis as the basis for my approach; anybody else please take it with a boulder-sized grain of salt.
Color me embarrassed in Manhattan. :o
He basically said that Google has the ability to change the mix of results returned if it thinks users prefer a different mix. He did not say anything about different query types. His comment stimulated my thinking to go down a certain path, but the conclusion was mine, based on a broad swath of reading, other folks' comments around the boards and observations
Looking at the whole site may not be too far off from reality, since Google's been looking at "whole sites" for a long time - for clustering purposes. I don't think it's too far a stretch to think they may be looking at whole sites overall to determine contextual relevancy with enhanced accuracy - which may be a significant factor in that phrase-based patent group.
Would there be a sector by sector factoring? For a sector like news, lots of fresh content would be vital. For another sector, it could be detrimental (does a normal business selling normal products add 1000 products to its inventory in one month?).
The driving factor is that the site content is fairly weak, MFA/MFO (I like it!), and if I were a visitor I would generally go elsewhere, often via an AdSense click.
The site has a fairly high bounce rate, low page views per visitor, and is distinctly average compared to some (not all) of the other high rankers. The one thing we have is a high number of quality established links.
If Google's using the toolbar data and looking at how many times repeat searches were made after hitting this site, compared to other sites in the sector, then it may well indicate it needs downgrading.
No fancy optimization (more a la "Brett", with some updates)
But I always make a point of helping them, and/or having a pro in their field help create really relevant content from page A to Z
What we are (I suppose) seeing is that there is less room for "pro-optimization heavy-duty manipulators"
I'm talking here about sites that throw up lots of varied pages to catch different kinds of keyword traffic, you know? They usually have "real" content, not scraped, but it's either MFA or (dare I coin another acronym?) MFO, made for organic. What Google says they want is MFV, made for the visitor.
Does that mean that if I have a site about widgets and want to make it more comprehensive and write about widget groups, Google won't like me anymore?
If I change my site about Toyota cars into an all-about-cars site, might it hurt?
IMHO:
Lasnik said here the other day: we ban the whole domain in order to prevent you guys from running nasty experiments on a good domain.
We know: if your domain is trusted any page will rank.
How do you reach the conclusions that you do about "others"?
When I say "others" fracture that: When you seek "wise counsel" in matters of peer issues? When you seek comfort? When you seek expert advice on matters of specialized knowledge? When you have a mind to "do business" with a firm or company?
What if search began moving datastreams into specialized algos? Maybe one size doesn't fit all. Maybe the organism . . err . . the algo . . is evolving . . because it must to survive.
In fact, I see less incidence of a page from a very high-ranking site like, say, the BBC ranking highly in a search for, say,
widgets for sale
Good for me, cos it was quite embarrassing to be outranked by the only page on the BBC site, about a TV program on selling widgets, when my whole site was about widgets and actually sold widgets.
This trend has kept alive my interest in sub domains
What if search began moving datastreams into specialized algos? Maybe one size doesn't fit all. Maybe the organism . . err . . the algo . . is evolving . . because it must to survive....
The "algo" is a living breathing entity within the confines of software/programming...
In regards to Google...we are seeing some "hormone" issues here...(some changes and new feelings from the algo...as it grows and learns... ;-)
With the direction of this thread...let me add .. that some of what I am looking at recently hints that Google is looking at the full "stemming" values across not only the local domain ... but the inbound / outbound link market structure for the root domain AND interior pages...
Now tie this in with the recent interface testing Google is doing for localization / Google Maps / and possible Google phone services... and the fact that Google is now running 700? IP numbers related to their distributed data-center network topology... and you have a serious move towards localization... and pushing a new advertising model through Google Maps... ..?
ALL Of this is conjecture on my part...as I am not sitting in Google's main "data war room" looking at large sets of stats and usability data for all that Google has going on these days...
The knowledge and information life-cycle, discernment of knowledge, dilution of knowledge, discernment of static (historical, subject to interpretation) vs. evolving knowledge, information per se (look it up ~ state capitals) vs. creative knowledge vs. analysis vs. opinion with and without authority, and context of information should excite algo evolution for the next decade.
It may not be an operational tip but it's something to consider: Context likely matters. Discern signals of context of content. "Signals of quality" may be a close cousin but not an identical twin.
What contextual signals does a website send? Language may express context. What's included and what's missing in the language, images, design - the "character" of the website? How or why do those signals matter to consumers of information, knowledge or 'other'?
Just a few clouds in the brainstorm.
[edited by: Webwork at 5:05 pm (utc) on Mar. 26, 2007]
I do think that site-wide ranking does occur and I agree with the earlier evidence that Wiki can pretty much put up any page and it's going to make it into the top 5 results. In fact, I also think the knob on site-wide has been turned up because I see Wiki ranking for all kinds of one word terms. History, medical, biographies... Wiki has a very strong presence everywhere. And because we've seen unfinished pages ranking in the top three spots, I'm leaning more towards site ranking than page ranking.
Then on the other side, I have two sites where no matter what I put on them, it starts ranking highly and immediately. One of them has a domain that's been around since 1994 or so - I can kind of understand that. The other one is my partner's punk rock band site. No matter what we put up there, it seems to fly to the top of the results, for no apparent reason - a single mention of one of his guitars got that page ranking higher than the manufacturer! I sure wish I could figure out what was right on that site, and wrong on some of the others. But it's not apparent to my naked eyeball or brain.
(No, I'm not taking offers for product placement)
I do think that site-wide ranking does occur and I agree with the earlier evidence that Wiki can pretty much put up any page and it's going to make it into the top 5 results. In fact, I also think the knob on site-wide has been turned up because I see Wiki ranking for all kinds of one word terms. History, medical, biographies... Wiki has a very strong presence everywhere. And because we've seen unfinished pages ranking in the top three spots, I'm leaning more towards site ranking than page ranking.
"Authority" - not in currently used form meaning quality of information, which is largely subjective - but in the classic IR sense - ala Jon Kleinberg's Hubs & Authorities (circa 1998), has been in evidence for a long time, and it's by site. Think back to the Florida update, when all the "big directories," Ebay and Spamazon started turning up for everything but the kitchen sink. It's stil happening, at an accelerated rate.
That's site wide, and another way that can be used sitewide is:
(second order co-occurrence) + (word sense disambiguation) = phrase-based indexing
That can be done on a much more granular level than on an entire data index, and IMHO that's what we've been seeing for a while - not LSI.
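For what it's worth, the "second order co-occurrence" idea is easy to illustrate: two terms count as related when they keep company with the same neighbours, even if they never appear in the same document. Here's a toy sketch with a tiny invented corpus and cosine similarity over raw co-occurrence counts; it's not a claim about how Google actually computes this:

```python
from collections import Counter
from math import sqrt

# Invented three-document corpus. Note "widget" and "gadget"
# never appear together in any document.
docs = [
    "widget repair guide",
    "widget maintenance guide",
    "gadget repair manual",
]

def cooccurrence_vector(term, docs):
    """First-order co-occurrence: which words appear alongside `term`."""
    vec = Counter()
    for doc in docs:
        words = doc.split()
        if term in words:
            vec.update(w for w in words if w != term)
    return vec

def second_order_similarity(a, b, docs):
    """Cosine similarity of two terms' co-occurrence vectors: high when
    the terms share neighbours, even with zero direct co-occurrence."""
    va, vb = cooccurrence_vector(a, docs), cooccurrence_vector(b, docs)
    dot = sum(va[w] * vb[w] for w in va)
    norm = (sqrt(sum(v * v for v in va.values()))
            * sqrt(sum(v * v for v in vb.values())))
    return dot / norm if norm else 0.0

sim = second_order_similarity("widget", "gadget", docs)
```

"widget" and "gadget" come out related because both co-occur with "repair", despite never sharing a document; that's the second-order effect in miniature.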
There's also reason to believe that might come heavily into play with clustering, and some of the +950 stuff may be more due to clustering than actual penalties in some cases. It all seems to fit, right along with the changes in crawling & indexing patterns since Big Daddy.
Personal theories, but arrived at after reading a bunch of background stuff and getting that gut-level suspicion about what's going on.
[edited by: Marcia at 5:57 pm (utc) on Mar. 26, 2007]
Block Rank [webmasterworld.com]
In a nutshell, it describes sites by "domain", "host" and "subdirectories", and suggests crawling a whole domain in order to use the aggregate local PR for a site to speed up computation of PageRank on a global level.
Added: FWIW, the BlockRank paper at Stanford:
[stanford.edu...]
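To make the paper's idea concrete, here's a toy sketch of the BlockRank scheme it describes: compute local PageRank inside each host "block", compute PageRank over the block graph, and multiply the two together as a starting vector for the global computation. The graph, the host assignments and the damping factor are all invented for illustration:

```python
import numpy as np

def pagerank(adj, d=0.85, iters=100):
    """Power-iteration PageRank over an adjacency list."""
    n = len(adj)
    M = np.zeros((n, n))  # column-stochastic transition matrix
    for j, outs in enumerate(adj):
        if outs:
            for i in outs:
                M[i, j] += 1.0 / len(outs)
        else:
            M[:, j] = 1.0 / n  # sink pages treated as linking everywhere
    r = np.full(n, 1.0 / n)
    for _ in range(iters):
        r = (1 - d) / n + d * (M @ r)
    return r / r.sum()

# Four pages: 0,1 on host "A", 2,3 on host "B" (invented link graph).
blocks = {"A": [0, 1], "B": [2, 3]}
links = {0: [1, 2], 1: [0], 2: [3], 3: [2]}

# Step 1: local PageRank inside each block, intra-block links only.
local = {}
for name, pages in blocks.items():
    idx = {p: i for i, p in enumerate(pages)}
    adj = [[idx[t] for t in links[p] if t in idx] for p in pages]
    local[name] = pagerank(adj)

# Step 2: PageRank over the block graph, inter-block links only.
names = list(blocks)
bidx = {n: i for i, n in enumerate(names)}
block_of = {p: n for n, ps in blocks.items() for p in ps}
badj = [[] for _ in names]
for p, outs in links.items():
    for t in outs:
        if block_of[t] != block_of[p]:
            badj[bidx[block_of[p]]].append(bidx[block_of[t]])
block_rank = pagerank(badj)

# Step 3: each page's local rank, scaled by its block's rank, gives
# the BlockRank starting vector for the (much cheaper) global iteration.
start = {p: local[n][i] * block_rank[bidx[n]]
         for n, ps in blocks.items() for i, p in enumerate(ps)}
```

The payoff, per the paper, is speed: the local computations are small and independent, so the expensive global iteration starts from a vector that's already close to the answer. It also means a per-host aggregate score exists as a natural by-product, which is the interesting part for this thread.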
Daft theories, but maybe that's what brainstorming is.
[edited by: Marcia at 6:10 pm (utc) on Mar. 26, 2007]
I think you're on the right track. You see a lot of these -950 penalty sites that have a lot of cut-and-pasted content on them.
Overall, Google probably is looking at duplicated content.
If the site is 80% cut and pasted from other sites, or just duplicated content, and 20% unique, then it will probably not do too well.
Wiki, by contrast, is almost 100% unique, and it does very well.
The more unique your site, the better it will rank.
Google could take the overall % of duplication and easily work it into the algos. That's just a matter of math.
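It really is just a matter of math: a crude duplication percentage can be computed by shingling (overlapping word n-grams) and checking how many of a page's shingles appear anywhere else. A toy sketch, with an invented 4-word shingle size; real duplicate detection is far more involved:

```python
def shingles(text, k=4):
    """The set of k-word shingles (overlapping word n-grams) in a text."""
    words = text.lower().split()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}

def duplication_pct(page_text, corpus_texts, k=4):
    """Rough percentage of a page's shingles that also appear
    somewhere in the rest of the corpus."""
    page = shingles(page_text, k)
    if not page:
        return 0.0
    seen = set().union(*(shingles(t, k) for t in corpus_texts))
    return 100.0 * len(page & seen) / len(page)

other = "the quick brown fox jumps over the lazy dog"
page = ("the quick brown fox jumps over the lazy dog "
        "and some totally original words here")
pct = duplication_pct(page, [other])
```

Here `page` copies its first nine words from `other`, so half its 4-word shingles are duplicates and `pct` comes out at 50.0. A threshold on a score like this is the kind of "80% pasted / 20% unique" test described above.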