Google SEO News and Discussion Forum

Matt Cutts: Adding Too Many URLs Triggers A Flag!
Shouldn't We Be More Careful When Adding New Content?
reseller · msg:3070794 · 10:19 pm on Sep 3, 2006 (gmt 0)

Hi Folks

I recall discussing last year whether adding too many pages suddenly might trigger a flag or some kind of "Sandboxing". And we were guessing at that time.

However, our kind fellow member Matt Cutts recently posted a very interesting remark on his blog [mattcutts.com] that might confirm what we were guessing:

"We saw so many urls suddenly showing up on spaces.live.com that it triggered a flag in our system which requires more trust in individual urls in order for them to rank (this is despite the crawl guys trying to increase our hostload thresholds and taking similar measures to make the migration go smoothly for Spaces). We cleared that flag, and things look much better now."

So it seems we should be very careful in the future about adding too many pages at the same time; otherwise, sandboxing of our new pages becomes a real possibility!

Thoughts?

 

crobb305 · msg:3070876 · 1:57 am on Sep 4, 2006 (gmt 0)

more trust in individual urls in order for them to rank

Could he be talking more about subdomains than subdirectories? Just my guess. Though a single site adding hundreds or thousands of pages at once might certainly trigger a flag.

Bewenched · msg:3070888 · 2:18 am on Sep 4, 2006 (gmt 0)

This is very bad news for ecommerce sites. We do new part data loads about every three months or so, removing old, outdated parts and adding new ones as they are released. This would be the case for any major ecommerce site out there.

Stefan · msg:3070896 · 2:28 am on Sep 4, 2006 (gmt 0)

So it seems we should be very careful in the future about adding too many pages at the same time; otherwise, sandboxing of our new pages becomes a real possibility!

Part of the reason that some websites have many thousands of pages is because it was a good way to do well with G in the past. You take a minimal amount of content and turn it into a sh*t-load of individual pages, all carefully inter-linked, thereby pumping it up. I can see there being collateral damage for sites that genuinely need all those pages, but it's easy to see why G would flag that sort of thing.

<edit>Discretion</edit>

[edited by: Stefan at 2:31 am (utc) on Sep. 4, 2006]

europeforvisitors · msg:3070922 · 3:04 am on Sep 4, 2006 (gmt 0)

Part of the reason that some websites have many thousands of pages is because it was a good way to do well with G in the past. You take a minimal amount of content and turn it into a sh*t-load of individual pages, all carefully inter-linked, thereby pumping it up.

I think an even bigger reason is the trend toward template-based, keyword-driven, computer-generated sites that spew out hundreds of thousands or even millions of pages with little or no real content. It would make a lot of sense for Google to crack down on that kind of "index spam" (or "index clutter," if you prefer a kinder term).

theBear · msg:3070961 · 4:16 am on Sep 4, 2006 (gmt 0)

"I think an even bigger reason is the trend toward template-based, keyword-driven, computer-generated sites that spew out hundreds of thousands or even millions of pages with little or no real content."

Well, some of the index clutter is caused by the feeding and storing frenzy of the bot army out there.

Some is caused by site generators (aka CMSes); some of the generators are keyword-driven, some aren't.

A good-sized chunk of the clutter is the result of unintended side effects (multiple ways of accessing forums, for example).

Some more of the clutter is caused by flubbed-up server configurations; cPanel defaults leap to mind as but one.

Then we have the world's gift to webserving (an MSFT product) that resolves HaX3r.html and hAX3r.html to the same "page".

Now, some of us programming types say, well, these are 'puters, not dead-tree publications. Now where did I put my travel site generator ;) ...

I think that in addition to raw numbers of pages, you might also wish to consider any large "number" of changes. If you read the patents, they hint at such.

decaff · msg:3071012 · 5:58 am on Sep 4, 2006 (gmt 0)

I think you have to consider this from the perspective that Google stores very specific historical markers for each page (and site) it indexes. If a site has historically been stable, with a base set of pages and new pages added over time at a reasonable rate, and then, out of the blue, Google sees a huge number of new URLs showing up for the site, that could trigger some automated suppression filters, and even a manual review.

The context of Cutts' statement is that a certain social-community-type site suddenly saw a huge influx of new URLs, and this caused filters to trip.

Regarding the member with the ecommerce site that changes out parts every three months: does this actually mean you are adding a brand-new "volume" of pages, above and beyond your standard page count, or does your page count remain fairly constant?

I don't see a problem with changing out old parts for new ones on some predictable, sector-driven pattern. (Google does map out sectors for trends as well, and when a site goes bonkers in a sector with a well-established set of development variables, filters get tripped.)

reseller · msg:3071025 · 6:42 am on Sep 4, 2006 (gmt 0)

On different occasions, Matt has been promoting a specific theme: natural growth of a site's content and backlinks. At the time, I understood it as a kind of advice or friendly recommendation.

However, after reading Matt's remarks which I mentioned in my first post, I see that he means business :-)

IMO, what Matt is telling us is:

If we see any sign of unnatural growth on your site, we gonna sandbox ya ;-)

BeeDeeDubbleU · msg:3071036 · 7:14 am on Sep 4, 2006 (gmt 0)

It would make a lot of sense for Google to crack down on that kind of "index spam"

Absolutely ... and that is why they are doing it.

UK_Web_Guy · msg:3071081 · 9:09 am on Sep 4, 2006 (gmt 0)

I also think he is referring more to subdomains with this comment.

Adding loads of subdomains is a known trick and one which they can obviously flag up.

I'm not saying adding tons of pages won't also flag something up at Google's end, but I think the context of that comment was in relation to loads of subdomains appearing off of live.com.

Matt did mention something about adding loads of pages; I'm not sure whether it was in one of his recent videos.

But I remember him basically saying don't add hundreds of thousands of pages in one go, as it might look suspect; rather, add a couple of thousand at a time, or something like that.

Basically, if what you are doing is legitimate, don't be afraid to develop your site.

That's my view on this issue anyway.

GaryTheScubaGuy · msg:3071086 · 9:14 am on Sep 4, 2006 (gmt 0)

New domains are one thing; adding many pages to an existing website with rank will just pass PR.

Aforum · msg:3071225 · 1:19 pm on Sep 4, 2006 (gmt 0)

Matt stated in his videos that starting a site with thousands of pages wouldn't be a problem, but starting one with millions is another story. It's in the "Some SEO Myths" video.

I would hate to think of many people getting penalized for introducing thousands of pages for Google to crawl, but that's exactly what many people have done in the last six months to correct their sites. Mod rewrite, submit sitemap, get flagged?

I hope not.

[edited by: Aforum at 1:19 pm (utc) on Sep. 4, 2006]

crobb305 · msg:3071260 · 2:14 pm on Sep 4, 2006 (gmt 0)

Again, I think he is likely talking about subdomains. There has been such a big/growing problem with subdomains ranking well because of the pagerank carried to them from the parent domain. There needs to be a process of filtering out the junk from the subdomains that should actually rank based on individual merits.

theBear · msg:3071276 · 2:41 pm on Sep 4, 2006 (gmt 0)

"There has been such a big/growing problem with subdomains ranking well because of the pagerank carried to them from the parent domain."

For what it is worth IIRC PR passes by links. So a page is a page.

crobb305 · msg:3071360 · 4:22 pm on Sep 4, 2006 (gmt 0)

So a page is a page.

I am not sure that is 100% accurate. But then again, we have no way of knowing the exact mathematics involved. It certainly seems that there has been an overall preference for subdomains over subdirectories in ranking and in the initial "sandboxing" of web pages. Why would this be if "a page is a page"?

bird · msg:3071369 · 4:29 pm on Sep 4, 2006 (gmt 0)

we have no way of knowing the exact mathematics involved.

The fundamental math behind PageRank is published and well known.
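For reference, here is the formula as published in the original Brin and Page paper (d is the damping factor, usually quoted as 0.85); whether the live system still uses it unchanged is, of course, another question:

PR(A) = (1 - d) + d \left( \frac{PR(T_1)}{C(T_1)} + \cdots + \frac{PR(T_n)}{C(T_n)} \right)

where T_1 ... T_n are the pages linking to A and C(T_i) is the count of outbound links on page T_i.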

crobb305 · msg:3071372 · 4:32 pm on Sep 4, 2006 (gmt 0)

The fundamental math behind PageRank is published and well known.

We do not know if there have been any changes to "fundamental" PageRank since it was originally published. Nor do we know exactly how toolbar PageRank relates to "behind the scenes" PageRank.

Nothing is exact and completely "known", certainly not in science. These discussions would be moot if everyone knew exactly what was going on. Further, I was originally suggesting that Matt Cutts may have been referring to the filtering of large numbers of subdomain pages.

We are talking about thousands of junk subdomain pages being created off of TRUSTED domains by thousands of different users creating free mortgage or viagra blogs on b*spot or elsewhere. This has got to create a major ranking problem if, indeed, PageRank is "passed" to subdomains exactly the same as to subdirectories.

[edited by: crobb305 at 4:56 pm (utc) on Sep. 4, 2006]

trinorthlighting · msg:3071400 · 4:50 pm on Sep 4, 2006 (gmt 0)

Instead of adding thousands of pages at once with an ecommerce site, add them a bit more slowly. Try to add and take away pages daily instead of quarterly.

Remember, dumping a bunch of pages into the system all at once can set off flags at Yahoo and MSN as well.
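Just to illustrate the idea (the batch size and URL pattern below are invented for illustration, not known safe numbers), "drip-feeding" a quarterly parts load could be as simple as:

# Hypothetical sketch only: spread a quarterly data load over daily batches
# instead of publishing everything at once.
from typing import Iterator, List

def daily_batches(new_urls: List[str], per_day: int = 200) -> Iterator[List[str]]:
    """Yield one day's worth of URLs at a time instead of the whole load."""
    for start in range(0, len(new_urls), per_day):
        yield new_urls[start:start + per_day]

# Example: a 3,000-part quarterly load spread over 15 days.
quarterly_load = [f"/parts/{sku}" for sku in range(1, 3001)]
for day, batch in enumerate(daily_batches(quarterly_load), start=1):
    print(f"day {day}: publish {len(batch)} URLs")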

rustybrick · msg:3071402 · 4:55 pm on Sep 4, 2006 (gmt 0)

I have a feeling this link will be allowed...

Matt Cutts commented about this here [seroundtable.com].

Remember that spaces.msn.com or spaces.live.com is up to tens of millions of urls, depending on which search engine you ask. It would be akin to if geocities.com suddenly moved to geocities.somenewdomain.com. So this is not something that a typical site owner needs to think about or worry about if they're not adding hundreds of thousands or millions of URLs very quickly.

So that may help put things in perspective.

crobb305 · msg:3071408 · 4:59 pm on Sep 4, 2006 (gmt 0)

Instead of adding thousands of pages at once with an ecommerce site, add them a bit more slowly. Try to add and take away pages daily instead of quarterly.

This works great for a single webmaster working on his own website. But, you can't limit how quickly bloggers create accounts/pages/junk on trusted domains.

crobb305 · msg:3071422 · 5:15 pm on Sep 4, 2006 (gmt 0)

RustyBrick,

Thanks for that link. I think it makes my point.

texasville · msg:3071494 · 6:14 pm on Sep 4, 2006 (gmt 0)

>>>>We cleared that flag, and things look much better now." <<<<

Now, to me, that was the most interesting statement in a very long time. It says certain things trip filters, filters raise flags, and flags GET EYEBALLS!

reseller · msg:3071946 · 6:34 am on Sep 5, 2006 (gmt 0)

rustybrick

"Matt Cutts commented about this here.

>>Remember that spaces.msn.com or spaces.live.com is up to tens of millions of urls, depending on which search engine you ask. It would be akin to if geocities.com suddenly moved to geocities.somenewdomain.com. So this is not something that a typical site owner needs to think about or worry about if they're not adding hundreds of thousands or millions of URLs very quickly.<<"

With all due respect to our good friend Matt, I don't think he is telling the whole story here. And that is very understandable; I wouldn't expect Matt to reveal the details of filters and flags ;-)

For example, I would expect that a site of, let's say, 500 pages would trigger a flag if it suddenly added 1,000 pages.

Or a site of 10,000 pages might trigger it by suddenly adding another 10,000 pages. That is, it might depend on the relative proportion of pages added at the same time.
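Purely as a guess at what such a relative check might look like (every threshold and name below is invented for illustration; nothing here comes from Google):

# Hypothetical sketch only: a relative-growth "flag" of the kind guessed at above.
def growth_flag(existing_pages: int, new_pages: int,
                relative_threshold: float = 1.0,
                absolute_floor: int = 1000) -> bool:
    """Flag a site whose new-URL count is large relative to its history."""
    if new_pages < absolute_floor:
        return False  # ignore small additions regardless of site size
    ratio = new_pages / max(existing_pages, 1)
    return ratio >= relative_threshold

print(growth_flag(500, 1_000))       # True: the site tripled overnight
print(growth_flag(10_000, 10_000))   # True: doubled in one go
print(growth_flag(100_000, 1_000))   # False: barely a ripple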

Alex70 · msg:3071972 · 7:32 am on Sep 5, 2006 (gmt 0)

One of my websites was available only in English. I decided to translate it and add another language to the site. The site was about 10,000 pages, and I added the second language all at once. The result is that my site has been sandboxed since last October. Be very careful when adding content; don't get onto the "radar".

reseller · msg:3071982 · 7:57 am on Sep 5, 2006 (gmt 0)

Alex70

Thanks for the "confirmation" feedback. Adding to what you mentioned, I wish to recall a thread started by caveman around two years ago. Very interesting and relevant reading indeed!

Can a Load of New Pages Hurt an Existing Site? [webmasterworld.com]

Enjoy ;-)

Vampster · msg:3072092 · 10:33 am on Sep 5, 2006 (gmt 0)

Matt Cutts: We cleared that flag, and things look much better now.

Funny sentence, don't you think? I could swear someone has been telling us that changes to Google's results were completely automated...

[edited by: Vampster at 10:33 am (utc) on Sep. 5, 2006]

Chico_Loco · msg:3072159 · 12:19 pm on Sep 5, 2006 (gmt 0)

Obviously, this "abnormal growth rate" detection system has been argued about many times around here. However, I'm more interested in the fact that this post appears to confirm that larger websites truly do get special attention:

"I know I got over 12 emails from GregP over at Microsoft about the progress of the migration from spaces.msn.com to spaces.live.com, and I checked with the indexing folks at Google..."

I wonder, if for some reason (perhaps a merger) I needed to change domains, would I be able to email Matty and have him ask the Google engineers to do something about it?!

Anyway. Not adding more than X pages per day has been a rule for me for some time.

walkman · msg:3072168 · 12:44 pm on Sep 5, 2006 (gmt 0)

>> We saw so many urls
>> So this is not something that a typical site owner needs to think about or worry about if they're not adding hundreds of thousands or millions of URLs very quickly.

The second statement seems to define "many" as 100k+ pages, but we can't be sure. To further confuse things, Live.com has a bazillion incoming links, whereas a normal mom-pop.com is lucky to have a dozen or so.

Aforum · msg:3072169 · 12:45 pm on Sep 5, 2006 (gmt 0)

This is funny. It has gotten to the point where the issue has been specifically covered by Matt, and people still don't believe it.

I would hate to think Google penalizes its own sitemap program.

walkman · msg:3072237 · 1:40 pm on Sep 5, 2006 (gmt 0)

I can't edit, so here it is: I "got away" with adding 1,000+ pages to a site that had fewer than 30 others (about two years ago). I really, really don't think Google will penalize for thousands of pages. Imagine if you added an "Add," "Rate," "Comment," or "Send" feature and your pages quadrupled? I hope Google has seen this before and is aware...
