| 8:20 pm on Aug 23, 2010 (gmt 0)|
From what I see, I'd say 50,000 at once should be fine - at least no problem that would bring about major penalties or anything like that. But make sure you've got any major canonical issues handled before you launch. With canonical fuzziness 50,000 URLs can turn into millions of URLs, and for a new site that can kill you.
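To make the canonical point concrete, here is a minimal sketch of collapsing URL variants to one form before launch. The host `www.example.com`, the forced `https` scheme, and the `IGNORED_PARAMS` list are all assumptions for illustration; the right rules depend on how your own site builds its URLs.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Hypothetical parameters that do not change what the page shows.
IGNORED_PARAMS = {"sessionid", "utm_source", "utm_medium", "utm_campaign", "sort"}


def canonical_url(url: str, host: str = "www.example.com", scheme: str = "https") -> str:
    """Map the many variants of a URL onto a single canonical form."""
    parts = urlsplit(url)

    # Drop ignored parameters, then sort the rest so that parameter
    # order alone cannot create a "new" URL.
    query = sorted(
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in IGNORED_PARAMS
    )

    # Treat /widgets/ and /widgets as the same page; keep "/" for the root.
    path = parts.path or "/"
    if len(path) > 1 and path.endswith("/"):
        path = path.rstrip("/") or "/"

    # Force one scheme and one host, and discard any #fragment.
    return urlunsplit((scheme, host, path, urlencode(query), ""))
```

The result could feed a `rel="canonical"` link element or a 301 redirect rule, so that each page ends up indexed under one URL only.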
| 8:54 pm on Aug 23, 2010 (gmt 0)|
OK thanks for the reply Tedster. Just out of curiosity though, what if the number is higher - say 1 million pages?
| 9:00 pm on Aug 23, 2010 (gmt 0)|
I would stage that launch, most definitely. The timing would depend on how fast and how strong the backlink profile grew. You need a good chunk of real PR to support crawling that deep.
| 11:51 pm on Aug 23, 2010 (gmt 0)|
Yeah those are my thoughts exactly...but HOW do you stage the launch? Just block out a certain % of the website with robots.txt and then gradually unblock it? Assuming it's directory based, do you block it out by categories?
Everyone seems to be aware that best practice for large enough sites is to launch them in stages, but I've never heard anyone say what specifically these best practices are. How do you stage the launch of a massive website in a sensible way (using the example of a directory-style website with 1 million pages)?
| 5:51 pm on Aug 24, 2010 (gmt 0)|
"Just block out a certain % of the website with robots.txt and then gradually unblock it? Assuming it's directory based, do you block it out by categories?"
That is not a good idea!
In a recent video, Matt Cutts said a "big no" to robots.txt optimization.
Watch this video [youtube.com]; I hope it gives you some ideas!
| 6:34 pm on Aug 24, 2010 (gmt 0)|
I think this video is about trying to use robots.txt to optimize the crawl in general (i.e. temporarily blocking certain parts of a site that is already up), which I agree is a bad idea. But I'm talking specifically about a site launch, which Matt Cutts has said is better done gradually than all at once if the site is big enough.
Regardless, the question still remains: how DO you roll out a large site gradually?
| 2:09 am on Aug 25, 2010 (gmt 0)|
I would assume your database is composed of pages that are built around a link or category ID. So what you would do is empty part of the data from the table, and then add a little back in every few days.
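A rough sketch of that kind of drip release, with one swap: instead of deleting rows and re-adding them, it keeps every row and flips a flag, which has the same visible effect as long as the site only renders and links to flagged pages. The `pages` table, its `category_id` and `live` columns, and the batch sizes are all hypothetical.

```python
import sqlite3


def make_demo_db(page_count: int) -> sqlite3.Connection:
    """Build an in-memory stand-in for the site's page table (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE pages ("
        " id INTEGER PRIMARY KEY,"
        " category_id INTEGER NOT NULL,"
        " live INTEGER NOT NULL DEFAULT 0)"
    )
    conn.executemany(
        "INSERT INTO pages (id, category_id) VALUES (?, ?)",
        [(i, i % 3) for i in range(1, page_count + 1)],
    )
    conn.commit()
    return conn


def release_next_batch(conn: sqlite3.Connection, batch_size: int) -> int:
    """Mark the next batch of hidden pages as live; return how many were released."""
    cur = conn.execute(
        "UPDATE pages SET live = 1 WHERE id IN ("
        " SELECT id FROM pages WHERE live = 0"
        " ORDER BY category_id, id LIMIT ?)",
        (batch_size,),
    )
    conn.commit()
    return cur.rowcount
```

Run `release_next_batch` from a scheduled job every few days. Ordering by `category_id` releases one category at a time, so each category is complete before the next one starts to appear.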
My concern for a site launch like this would be uniqueness. If your site has 1 million pages of content never seen on the web before, then I think you are doing great. I would look out for pages that are mostly empty, or that just recreate data already found on other sites. If neither of these is true, I would just launch it.