1) I added a few hundred pages of content to try to monetize my site. Each page included an affiliate link with a rel='nofollow' tag. Perhaps Google recognized the affiliate links and took away my trust rank? Perhaps I added too many pages? My site went from ~300 pages to ~900 pages.
2) All new pages were added in a sub-folder. Perhaps Google thought the theme of my site had changed and penalized me? (Although my index page stayed relatively the same.)
3) I moved the site to a dedicated server about 2 months earlier.
Any thoughts?
I've been preaching a concept since the early days of the so-called sandbox that still seems to be missed by a lot of webmasters, especially newer ones: many algo elements are co-dependent.
What does that mean?
It means that the same precise change on one site may tank the site, while on another site it actually helps. Example: if you have a 10 million page site and you add 500K pages overnight, you will probably start benefiting in short order, assuming that the new pages add value and cover keywords previously not effectively covered. If you have a 300 page site and you add 500K pages, you'll do well to avoid being tanked for some period of time, probably quite a long period. So the way G reacts to the addition of 500K new pages is dependent upon other site factors.
What are the implications?
It means that the more powerful a site is, the more it can get away with. It means that the more consistent a change is with prior changes on the site, the more likely it is that the change won't hurt the site. It means that webmasters need to be thoughtful about how they implement large-scale changes, in the context of both the site's history and the general nature of sites in that category.
As pertains to this thread:
High Risk
- Adding a large number of new pages, relative to the existing number of pages.
- Adding a large percent of new pages, relative to prior growth rates of the site.
- Adding a large number of thin affiliate pages to the site (threshold for problems lower than for either of the two points above, IMO).
- Adding feeds.
Low Risk
- Adding new pages at a rate generally consistent with the site's history.
- Adding monetization vehicles to existing pages.
Non-Issue
- Fixing multiple link formats internally to the homepage by making all internal links consistent.
- Consolidating the non-canonical homepage URLs (e.g., "index.htm") into the selected canonical version of the homepage (e.g., "/") via 301 redirects (with g1smd's caveat from this thread [webmasterworld.com] that you not create redirect chains).
I say non-issue to those items noted immediately above, not because I'm certain or know what G knows, but because I've consolidated homepage URLs repeatedly for client sites over the last two years, including sites with 100K+ pages, and have only ever seen positive effects. Nor have I ever heard of an issue in this regard being proven.
That said, from time to time bugs crop up in G's system, and I suppose it's always possible that this is one of those times. I sure doubt it though, and if it is a bug, MC et al. would want to get it fixed fast. There is no good reason to filter or penalize sites for choosing a canonical version of the homepage and sticking to it. MC has often laid out the process for doing so: pick a canonical version of the homepage, make all internal links consistent, and 301 redirect non-canonical versions to the canonical version. Not a bad idea to try to get backlinks changed too, but that's part of the reason for the 301s.
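For anyone who wants to sanity-check their own consolidation, here is a minimal sketch of a checker, assuming Python with the requests library; the example.com URLs and the 5-hop cap are placeholders. It verifies that each non-canonical homepage URL answers with exactly one 301 that lands on the canonical version, per g1smd's no-chains caveat.

```python
# Hypothetical checker: each non-canonical homepage URL should return
# a single 301 straight to the canonical homepage (no redirect chains).
from urllib.parse import urljoin
import requests

CANONICAL = "http://www.example.com/"                 # placeholder canonical
VARIANTS = [                                          # placeholder variants
    "http://example.com/",
    "http://www.example.com/index.htm",
]

for url in VARIANTS:
    hops = []
    resp = requests.get(url, allow_redirects=False, timeout=10)
    while resp.is_redirect and len(hops) < 5:         # cap runaway chains
        hops.append(resp.status_code)
        next_url = urljoin(resp.url, resp.headers["Location"])
        resp = requests.get(next_url, allow_redirects=False, timeout=10)
    ok = hops == [301] and resp.url == CANONICAL
    print(f"{url}: {hops or 'no redirect'} -> {resp.url} "
          f"({'OK' if ok else 'needs attention'})")
```

Anything other than a single 301 per variant is worth fixing before suspecting filters or penalties.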
Would you say it makes sense to mark many of those new affiliate pages as "noindex" for a while, and then slowly turn each one to "index, follow"? I wouldn't really be doing that for the user's sake, though, would I?
To be clear though, I'm speaking of new pages that, in our view, have value. I would not add thin affiliate pages to a site in today's environment, gradually or otherwise. Even if they are blocked from the bots, they are still visible during a hand check. If you care about that sort of thing.
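Mechanically, the phased approach asked about above is easy to script. Here is a rough sketch, assuming a site of static .html files that carry a robots meta tag; the path, batch size, and exact tag text are all invented for illustration. Each run flips one batch from noindex to index, so running it from cron every few days phases the pages in gradually.

```python
# Hypothetical phased rollout: flip one batch of static pages per run
# from "noindex, follow" to "index, follow". Paths and sizes are invented.
import pathlib

SITE_ROOT = pathlib.Path("/var/www/site/affiliate")   # placeholder path
BATCH = 250                                            # pages per run

NOINDEX = '<meta name="robots" content="noindex, follow">'
INDEX = '<meta name="robots" content="index, follow">'

flipped = 0
for page in sorted(SITE_ROOT.rglob("*.html")):
    if flipped >= BATCH:
        break
    html = page.read_text(encoding="utf-8")
    if NOINDEX in html:                                # still held back?
        page.write_text(html.replace(NOINDEX, INDEX), encoding="utf-8")
        flipped += 1
print(f"Flipped {flipped} pages to index,follow this run.")
```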
Before the update, most pages on my site ranked on the first page of the Google SERPs for their targeted keywords. However, now I only rank for the site name.
TrustRank (as patented, at least; who knows what is actually happening) is an extra step that limits PageRank "voting" to sites that Google considers authoritative and on-topic for your URLs. Sort of an extra layer of calculation that is limited to your domain's neighborhood, instead of looking at the entire web.
So there would be no actual TrustRank number -- just an effect in the final ranking.
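For reference, here is a toy sketch of TrustRank as published (Gyongyi et al.), which may or may not resemble anything Google actually runs: it is ordinary PageRank except that the random jump always lands on a hand-picked trusted seed set, so scores decay with link distance from the seeds. The link graph and seed set below are invented.

```python
# Toy biased-PageRank ("TrustRank") over an invented link graph: the
# teleport probability mass goes only to the trusted seed set.
links = {                      # page -> pages it links out to
    "seed":  ["a", "b"],
    "a":     ["b", "spam"],
    "b":     ["a"],
    "spam":  ["spam2"],
    "spam2": ["spam"],
}
pages = list(links)
seeds = {"seed"}               # hand-picked trusted pages
alpha = 0.85                   # damping factor, as in PageRank

bias = {p: (1.0 / len(seeds) if p in seeds else 0.0) for p in pages}
trust = dict(bias)
for _ in range(50):            # power iteration to convergence
    new = {p: (1 - alpha) * bias[p] for p in pages}
    for p, outs in links.items():
        for q in outs:         # each page splits its trust among out-links
            new[q] += alpha * trust[p] / len(outs)
    trust = new

for p in sorted(pages, key=trust.get, reverse=True):
    print(f"{p:6s} {trust[p]:.3f}")
```

The output is just a relative ordering, which squares with the point above: no exposed TrustRank number, only an effect folded into the final ranking.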
Let's put out a big THANK YOU to all the morons that build those massive garbage sites for causing all these problems.
Let's not confuse the general notion of what a "trusted site" might be in G's eyes, and TrustRank. The two are not the same at all. I would loosely connect notions of "trusted sites" and "sandboxing," insofar as if a site is trusted, it is not sandboxed, and if a site is sandboxed, it is not (yet) trusted.
As tedster points out, TrustRank is closely tied to PageRank. TrustRank is a way of refining PageRank by deducing presence of "quality" from a seed set of backlinks, and proximity of sites to seed sites. More or less.
The entire line of discussion in this thread, AFAIK anyway, has to do with issues that may be (and probably are) associated with notions of trusted sites. Sites can do lots of things that make them appear "not trusted." Some of those things can cause them to fall out of the SERPs for a while, until they are either hand-checked following a red flag, or until they "prove" themselves in other ways. Timing for reappearance can be short or long, and reappearance can be algo- or manually driven.
If this site's issue is TrustRank, then we should be talking about backlink profiles. Indications from the OP do not so far lead us in that direction.
From my experience, adding a large number of pages is not a problem. On many occasions I've added many pages to new and old sites. In one day (through sitemaps) one site went from 20 pages to 173,000; another from 20,000 to 1,600,000; yet another from 300 to 110,000; and no site was blocked or "filtered" by Google. It took about 1-2 months for the new pages to be downloaded and 4-5 months for the PR to be recalculated. During that time the sites stayed as they were before adding the pages, and afterwards did even better...
However, the difference is that all the pages I added were very related to the topic of the site. In your case it's totally UNRELATED. That, I think, is the problem.
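For anyone unfamiliar with the mechanics the poster mentions, here is a minimal sketch of generating sitemap files for a large page addition; the URLs and filenames are placeholders. The 50,000-URLs-per-file cap is part of the sitemaps protocol, so bigger drops get split across multiple files.

```python
# Hypothetical sitemap generation for a bulk page addition; URL list
# and output filenames are placeholders. Files are capped at 50,000
# URLs each per the sitemaps protocol.
import xml.etree.ElementTree as ET

urls = [f"http://www.example.com/widgets/page-{i}.html"
        for i in range(1, 1001)]
NS = "http://www.sitemaps.org/schemas/sitemap/0.9"

for n, start in enumerate(range(0, len(urls), 50000), 1):
    urlset = ET.Element("urlset", xmlns=NS)
    for u in urls[start:start + 50000]:
        ET.SubElement(ET.SubElement(urlset, "url"), "loc").text = u
    ET.ElementTree(urlset).write(f"sitemap-{n}.xml",
                                 encoding="utf-8", xml_declaration=True)
print("Wrote sitemap file(s) for submission to Google Sitemaps.")
```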
Questionable advice:
What a true black-hat would do is update those pages to be empty and put a redirect on them to your main page... Google seems to pick up "redirects" faster than "deletes"... Once it has picked them up, you may delete them.
We have several sites that have been changed from "/", and Google appears to be treating the amendment as if they were new sites. They all kept the same theme.
The adding of the pages is another problem that triggered this "review" filter. We also re-introduced a high proportion of pages when we changed our URLs from underscore [ _ ] to hyphen [ - ].
Such wholesale "fixes" can be viewed as "new" pages.
Matt has said that if you add lots of pages in one go, there will be a "flag".
Somewhere I read "Pageoneresults" say that he puts sites up, leaves them, and everything is OK. I can understand where he was coming from now; unfortunately not all of us can be in this idealistic position :)
IMO it's the turning of an informational site into a "commercial" one that caused the problem. Informational sites get a leeway in their promotional methods that commercial sites do not, IMO...
That could be another factor, as it's possibly another signal of spam when a site with a ton of pages suddenly monetizes them. Hard to say without more empirical evidence.
My site, which just got another 30K-page injection, is, and always has been, monetized, so perhaps it's just the status quo of a 10-year-old site letting me get away with it.
I have a 9- and an 8-year-old site, and both get bitten by the bug periodically. When they're not, they go straight to the top, though. One was "informational" and I made enough on it, so I shouldn't complain. It was funny when someone offered me $200 for it because it matched his name.
I increased the number of pages on my web site from about 5,000 to 15,000 early this summer. My Google traffic has now fallen to almost nil.
The web site has been going for ten years, and the traffic has always been so good that I have never worried about search engine optimisation. Now I am being forced to look into these things for the first time.
The irony is that the reason I redesigned the site was to increase Adsense earnings. But now of course Adsense earnings and new customers for my main business have all but disappeared.
The content of the site didn't change, but the same information was divided into 15,000 pages instead of 5,000. There was no duplication of content.
Last week I returned the site to how it was before, and am now hoping for some improvement ...
Last month I added 7,000 pages with really similar content.
The site went from 870 to 8,000 pages. Google fetched the pages but didn't show them in the SERPs.
Every week it would show 1,000 of them, and then they would disappear again, and so on.
Then Google showed 200, and they disappeared again, then 500, then 1,000, and so on.
Now all 8,000 pages are indexed and ranking well.
I put them up 200 to 500 at a time, every 2 or 3 days.
My site is only a PR4.
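That kind of drip-feed is simple to automate. A rough sketch follows, with invented filenames and the 200-500 batch range taken from the post: each run moves one batch of queued URLs into the list a sitemap generator reads from, and cron runs it every 2 or 3 days.

```python
# Hypothetical drip-feed: release 200-500 queued URLs per run into the
# file the sitemap is built from. Filenames are invented; schedule the
# script with cron every 2-3 days to match the poster's pacing.
import pathlib
import random

QUEUE = pathlib.Path("pending-urls.txt")      # URLs not yet released
RELEASED = pathlib.Path("released-urls.txt")  # feeds the sitemap build

pending = QUEUE.read_text().splitlines()
batch_size = random.randint(200, 500)         # vary each drop a little
batch, remaining = pending[:batch_size], pending[batch_size:]

with RELEASED.open("a") as out:
    out.writelines(line + "\n" for line in batch)
QUEUE.write_text("\n".join(remaining) + ("\n" if remaining else ""))
print(f"Released {len(batch)} URLs; {len(remaining)} still queued.")
```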
1) In addition to the many sites we own, we monitor others that we regard as bellwethers in various respects. I have never seen a healthy, clean, pre-1999 site that is highly respected pop in and out of the SERPs (with one exception that was a site issue). There are no certainties without knowing the algos ourselves, but IMO, any site that has been around 7+ years and pops in and out of the SERPs has one or more issues that age is helping it overcome. We see a lot of those situations. Either there are one or more problems with the site that are not bad enough to permanently kill it, or the site is pushing the SEO frontier on lots of fronts.
2) Because so many things are co-dependent, and because age is such an important factor, there will always be reports of sites that do something and get whacked, and others that do similar things and don't. But age is not the only factor. I know a site that had less than 1,000 pages that added another roughly 6,000 pages earlier this year, and the site is humming along. But before adding those pages, that site was already very old, very well regarded, #1 or #2 across all the major SEs for as long as we've watched it, with amazingly good citations across multiple categories.
OTOH, I have, and/or know of, plenty of other sites that, at the 1,000-page size, would probably suffer significantly if 6,000 new pages were added overnight. In fact, as noted in other posts, we've proved it to ourselves. :P
All of which does not even take into account the nature, contents, and quality of the pages being added, which IMHO is another huge factor that draws far less discussion.
3) I believe that there is a window of time in which, if a site adds a bunch of pages and gets hurt, it can remove them. I've seen both circumstances, i.e., removed and rankings returned, and removed but rankings did not return. Not sure how that works, exactly, but there seems to be a period of time in which G allows for mistakes and does not permanently penalize for them. This may take into account the need to protect sites from inadvertent errors, but it has the effect of letting some sites that add lots of useless content reconsider that decision.
One thing we're fairly sure of: IF lots of pages are added and they are thin/useless, and the site is hit in the SERPs, take those pages down. OTOH, if the pages added have good value, leave them up.
We are running a journalist blog network. When we add a new weblog, it is added to all the other blogs, so users can see the other subjects. All with fresh new content every day. Last week we added 5 new blogs, each with their own subdomain, like the current ones. As many of the blogs have PR 4/5/6, normally they get into the index really fast.
Now, nothing. We see the links on the other blogs, so Google is indexing those. But it looks like it's skipping the new blogs, as until one month ago they would be added within 2-3 days. "Old" blogs get indexed like they would in the past...
All new blogs get decent links immediately, from all kinds of sources, from low- to high-PR sites. Are we doing something wrong, or is this proof of a filter working incorrectly?
Maybe those new "blogs" got caught in the "sandbox filter" or is it the "lack of trust filter".
How many links did they aquire and in what time span did it take Google to see the links?
Did these new "blogs" also use any affiliate programs or in anyway link to areas of the web that were actually off theme from the exsisting "blogs" on the domain?
Do the new "blogs" while on theme to the journalism bent of the domain cover topics that Google might frown upon for a general audiance (adult matters, g!mbling, boo!e, terbak!, p!lls, weap!ns).
As caveman says, it is a co-dependent world out on the world wide wobbly.
The main domain is six years old. I don't know exactly how TrustRank works, but I think this shouldn't be a problem.
"How many links did they aquire and in what time span did it take Google to see the links?"
Took them a few days and i see the name of the blog in the searchengine. And also in the cache of the site that links.
"Did these new "blogs" also use any affiliate programs or in anyway link to areas of the web that were actually off theme from the exsisting "blogs" on the domain?"
No, all the blogs have natural links. No affiliate, no bought links, all natural. And yes they also have links to external sources.
"Do the new "blogs" while on theme to the journalism bent of the domain cover topics that Google might frown upon for a general audiance (adult matters, g!mbling, boo!e, terbak!, p!lls, weap!ns)."
No, more general then this topics are impossible. Topics like dance music, illness and videoclips for example..
"As caveman says, it is a co-dependent world out on the world wide wobbly."
It all depends on what the spammers are doing, unfortunately...
Ah, and by the way, Google News is picking up some of the items, so Googlebot is stopping by... A filter working incorrectly? I think so.
Last week we added 5 new blogs, each with their own subdomain, like the current ones.
It seems to me that, by using subdomains, you're telling Google that these are new sites, not new content within your main site (which stands to reason--they are new sites).
I wonder if that's wise; subs have a bad rep.
Only if you abuse them. The heaviest users of subdomains are some of the most trustworthy and authoritative sites: Google, Yahoo, the USDA, just about every college and university. Then there are all the blog services like Blogger and LiveJournal that operate with subdomains.