Google Analytics is showing that traffic to most of the categories on my site has dropped 25%-60% across the board, except for the 'main' or most popular category on my site, which has increased traffic. The net effect is about the same overall traffic, but it's much less narrowly focused than before (I compared September '08 to April '08). In my case the shift happened very quickly.
It strikes me that Google has associated my site with one main topic and is driving traffic based on that topic. Is Google now forcing us to pick one horse and ride it? Is this the end of the broadly-focused website?
Is Google Classifying 'Types' of Websites and Search Terms? [webmasterworld.com]
There's no doubt at this point that search terms get classified. With this recently suspected "traffic throttling", it is beginning to look like we have something real going on with websites, too - although the details are not yet clear.
If this is what's going on, it certainly doesn't impact every site - not the giants like Wikipedia, at any rate.
One very good indicator of Google's work in classification is their Labs project, Google Sets:
Try typing in your brand name or site name and see what other words are reflected back. If what's reflected back includes names of your competitors, you're likely correctly categorized. If not, you might want to alter your site's keyword content so that you get pigeon-holed better.
I'm not sure that Google needs to add any dedicated process beyond that for determining a website taxonomy.
Now taking things a bit further, is there some added switch that says "rank this group of co-occurring terms today, and this other group tomorrow?" In some cases at least, the data are quite suggestive.
Not to say that this IS the cause of what you're seeing, just that it is a factor that introduces some ambiguity into the analysis. Some of the other threads in this forum recently show similar reports - even to this degree:
One day we do well for green widget, the next day for red widget. But the overall user count stays the same.
Is there a threshold for google traffic? [webmasterworld.com]
So whatever the current situation for Google, let's look at your question, "is this the end of the broadly-focused website?"
I'd say no - but it might be challenging to run a broadly-based commercial website without enough PageRank. The "topical limitations" do not seem to be affecting everyone, and from what I see one difference is whether there is enough backlink strength to support the diversity that the broad-based site offers.
So now a related question comes up for me: does Google sometimes give graybar PR for the parts of a site that are "off-topic", as it currently understands the topic? Or maybe discount internal link juice if it comes from another topical part of the site?
The Sitelinks example involves a general umbrella-term keyword... call it "gadgets". The site for which they appear isn't actually optimized all that well for "gadgets". What it's optimized for (and ranks well on) are phrases that fit into subcategories of "gadgets": "widgets", "gizmos", and "doodads". Yet the "gadget" phrase for which the site has Sitelinks is a very competitive phrase, and I was quite surprised when I first saw Sitelinks on this search.
More likely, it's due to the implicit website taxonomy that tedster posits, plus the natural occurrence of the term "gadgets" in inbound links describing the site as a whole, along with several appearances of "gadgets" on the page.
The categories are likely to be effectively self-generating, to be somewhat fuzzy, maybe always shifting, and to come out of statistical analysis of the natural use of language (and perhaps of traffic patterns as well).
[edited by: Robert_Charlton at 8:44 pm (utc) on Oct. 4, 2008]
What made me sit up and take notice in the case I've referred to (which incidentally, no longer has sitelinks since a change or two was made) was the PR distribution, which even though the site's PR is now down a notch, still retains the same pages showing TBPR or greybar - and they're topical.
Nothing has changed for ages with regard to inlinks, but there are periodic changes in outlinks; and there's a uniform internal linking with no concentration on any sub-topic. Figure something like this:
Main overall subject area: baked goods
Topic A: cake
Topic B: cookies
Topic C: pie
There may be primarily inbound links with anchor text for topics A & B, yet if there's a heavier on-site concentration of text, outlinks (and topics of the sites linked to), keyword-based site navigation, and page titles on pumpkin, apple, mince, cherry, etc., then guess what? Those primarily co-occur with pie, not cake or cookies. Therefore the pie topic far outweighs cake and cookies, is of more benefit for users looking for pie, and sitelinks are more relevant for that subject.
To be clear, in the case mentioned, the "pie" category pages show PR and non-pie category pages do not. They used to all show PR until just a while back. However, the homepage ranks for all three phrases, with "pie" being the least competitive, ranking a bit higher, and having had sitelinks for a while.
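The co-occurrence reasoning above can be sketched as a toy scoring routine. To be clear, the term lists and the scoring rule here are entirely hypothetical illustrations of the idea, not Google's actual data or algorithm:

```python
from collections import Counter

# Hypothetical co-occurrence map: which on-site terms tend to appear
# alongside each candidate topic. Invented for illustration only.
TOPIC_COOCCURRENCE = {
    "cake": {"frosting", "sponge", "layer"},
    "cookies": {"chocolate-chip", "oatmeal", "dough"},
    "pie": {"pumpkin", "apple", "mince", "cherry"},
}

def dominant_topic(site_terms):
    """Score each topic by how many of its co-occurring terms appear
    in the site's text, navigation, and page titles; return the winner."""
    scores = Counter()
    for topic, related in TOPIC_COOCCURRENCE.items():
        scores[topic] = sum(1 for t in site_terms if t in related)
    return scores.most_common(1)[0][0]

# A site heavy on pie varieties scores highest on "pie",
# even if its inbound anchor text mentions cake and cookies.
site_terms = ["pumpkin", "apple", "mince", "cherry", "frosting"]
print(dominant_topic(site_terms))  # → pie
```

On this guess, the heavy concentration of pie-variety terms outweighs a few cake-related signals, which would match the observed PR distribution and sitelinks behavior.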
I doubt there's fully topical PageRank, but between this case and clues I've seen on other sites, it looks like biased PageRank might be coming into play in recent times.
Now taking things a bit further, is there some added switch that says "rank this group of co-occurring terms today, and this other group tomorrow?"
[edited by: Marcia at 9:50 pm (utc) on Oct. 4, 2008]
Reading the patent, it seems to be a kind of "powered PhraseRank," with sites clustered according to their topics. The patent also explains how personalization of SERPs works according to user preferences and topics, and finally how the new algorithms can be used to combat spam resources more incisively.
Google should see IBLs to different category pages, as opposed to the home page only/mostly, as a counterpoint to possible spam with regard to broad-topic sites.
Any new topical algo is bound to clash with the old 950/phrase-based spam algo and it could take some time for them to complete full integration with minimal collateral damage.
Google should judge the breadth of a site largely based on rate of site development. It is natural for a site to grow gradually over time and add new categories.
Unfortunately site age wasn't properly considered when the 950 penalty was instituted.
uol.com.br = Web Portals
terra.com.br = Lyrics & Tabs
globo.com = News
ig.com.br = File Sharing
In Brazil it's quite common for web portal partners to redirect their domains to the portal's subdomains. Ex: www.widgets.com.br redirects to widgets.uol.com.br
Then I dug a little further and found that the category seems to be based on the number of pages indexed. Even though all the domains above should be listed under "Web Portals", they have different categories because a good portion of their indexed pages are from their partners.
terra.com.br = a lot of pages from letras . terra.com.br
ig.com.br = a lot of pages from baixaki . ig.com.br
globo.com = a lot of pages from g1 . globo.com
uol.com.br = good distribution of large subdomains (ex: vagalume.uol.com.br, superdownloads.com.br)
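If the page-count theory above is right, the classification reduces to a simple majority vote. This sketch assumes invented page counts and a made-up heuristic, just to make the observation concrete:

```python
from collections import Counter

def portal_category(indexed_pages):
    """Assign a portal's category from whichever category contributes
    the most indexed pages across its subdomains (a guess at the
    heuristic, based on the AdPlanner observations above)."""
    totals = Counter()
    for category, page_count in indexed_pages:
        totals[category] += page_count
    return totals.most_common(1)[0][0]

# Hypothetical counts echoing the observation that the lyrics
# subdomain dominates terra.com.br's index.
terra_pages = [
    ("Lyrics & Tabs", 900_000),   # letras subdomain
    ("Web Portals", 150_000),     # main portal pages
    ("News", 50_000),
]
print(portal_category(terra_pages))  # → Lyrics & Tabs
```

Under this model, a partner subdomain with enough indexed pages would relabel the whole domain, which is consistent with what the list above shows.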
But... Here is my question...
terra.com.br ranks really well for music related search terms
ig.com.br ranks really well for software related terms
[edited by: tedster at 4:16 pm (utc) on Oct. 13, 2008]
[edit reason] moved from another location [/edit]
That's an excellent set of observations. Normally we do not discuss specific domains or keywords here, but in this case they are major portals so we'll make an exception.
As you can see from the earlier messages in this thread, other members are also suspecting that there is an automatic classification of some kind happening at Google - but we have no definitive understanding at this point.
It's always a challenge to compete with a high PR website, whether it's a portal or just a stand-alone domain. You may never be able to rank above them unless you become as strong as they are - but in many cases you can get on the first page, too. It depends on the total picture of all competition on those keywords.
Interesting that the subdomains are affecting the categories of the higher level domain in AdPlanner. Yes, I think it's automated - the data is easily available through Google's phrase-based indexing [webmasterworld.com] algorithms.
Now the question becomes whether a site in one category can rank for a competitive word that's not part of that category. I am beginning to feel that it has become difficult for a site to break into a new class of keywords, but it's not impossible. It seems to take a significant amount of content plus strong backlinks showing up to gain the new rankings - stronger in both these areas than in the past, perhaps, but it's still within possibility.