Best way to avoid sandbox with huge site launch?
2 Million pages before even any real content is added

JeremyL
7:40 pm on Jul 18, 2005 (gmt 0)

I am looking into launching a review site of sorts, and I assume I will be sandboxed from the start like almost everyone, but I want to keep the damage as low as possible. From what I have seen, I do subscribe to the idea that a site's growth rate is a factor in sandboxing, along with many other factors.

I have almost nailed down the site structure, and it will be something to this effect: domain.com/brand/state/city/, with each directory of course having its own index page targeted towards the brand and its local outlets.

Based on the numbers I have run with all cities in the US, just the directory structure alone will create close to 2 million pages of navigation content. Even if I cut the cities down by three quarters to cover only semi-decent-sized cities, it will still have half a million pages, or maybe 100K pages if I made REALLY deep cuts. This is before even adding the reviews into the mix.
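To put a rough illustration on that figure (the brand count here is just a placeholder, and the city count is roughly the number of incorporated US places):

    20,000 cities x 100 brands = roughly 2,000,000 brand/state/city index pages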

I could launch tomorrow with that many pages, but I just don't know. I have thought about setting up the system so it only shows links to brands, cities, and states when an actual listing has been added to the database. I bought a list from a data source, but the listings need to be scrubbed before each goes live. Scrubbing the listings into the database manually is going to take a long time anyway. I figure 100 new listings a day, depending on how many hours I put in.

Going in this direction, I could work on scrubbing a single city at a time so as not to add extra directory pages until that city is done. 100 pages a day, I would assume, would look a lot more natural than 100K-2 million from the start.

So what are others' opinions on this? Is all this worry for nothing?

 

BeeDeeDubbleU
8:39 pm on Jul 19, 2005 (gmt 0)

You will have to design and write for people instead of the engines.

Jeremy, check what you wrote a couple of weeks ago: [webmasterworld.com...]

I take it you have changed your mind?

JuniorOptimizer
10:39 pm on Jul 19, 2005 (gmt 0)

Your worry is very real. I think attempting to launch 2 million pages at one time is suicide.

JeremyL
2:14 am on Jul 20, 2005 (gmt 0)

BDD,

The question I was answering was

how to optimize if the results will be always different from user to user?

The big word being "if". I do not believe the SERPs are at the point where the only person you should be writing for is the user. They are still predictable based on certain principles. Give it a few years and that may be true, but for now, both what the user and the search engine expect have to be taken into consideration.

Nuttakorn
6:09 pm on Jul 20, 2005 (gmt 0)

Google might still take a long time to index 2 million pages. Your content might not show in the listings within a couple of weeks.

ken_b
6:54 pm on Jul 20, 2005 (gmt 0)

JeremyL, I have no experience with a site this size, but I still have a couple of questions/comments.

I could launch tomorrow with that many pages

How many pages, 100,000 or 2,000,000?

I have thought about setting up the system so it only shows links to brands, cities, and states when an actual listing has been added

This sounds like a good idea, especially if you start with the 100,000 and add a bunch of pages each day.

Can you make an obvious notice that new content is added daily, to encourage folks to return and see what's new?

I'm assuming that on a site this size you'll have an internal search. If so, can you coordinate the new content with the unsuccessful search results? If so, can you automate a message on the results page thanking the person for the inquiry and telling them the content they are looking for is coming soon, or something similar?
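Something as simple as this in the search results script would do it (just a sketch; the variable name is made up):

    <?php
    // When a query comes back empty, thank the visitor and point out that
    // listings are added daily ($results is a placeholder for whatever
    // array of matches the search returns).
    if (count($results) === 0) {
        echo '<p>Thanks for your search! We do not have listings for that area yet, '
           . 'but new listings are added every day - please check back soon.</p>';
    }
    ?>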

Rollo
7:42 pm on Jul 20, 2005 (gmt 0)

I launched a large site about two months ago with 10,000 pages (which, until I read your post, I thought was a good number of pages). At any rate, all pages were indexed almost immediately by Google, but they are still sandboxed. MSN indexed a few hundred pages within about three weeks and they went right to the top of the SERPs. Yahoo has managed the laborious task of indexing the homepage. SERPs are still pretty anemic about two months out. PR debuted at 6 a few days ago. Links are growing at a healthy clip.

I think it was a mistake for me to wait to launch until the site was "ready". I feel I should have launched with a bare minimum and built slowly. I'd be in much better shape at the moment, I'm sure. With 2 million pages, you really should have launched a long time ago, I think.

Google likes to see action, while Yahoo moves at the speed of molasses. I won't be waiting to launch again until everything is perfect; after all, these are websites, not the space shuttle.

JeremyL
8:21 pm on Jul 20, 2005 (gmt 0)

If so, can you automate a message on the results page thanking the person for the inquiry and telling them the content they are looking for is coming soon, or something similar?

Very good idea. It actually just gave me another idea. The main way people will find locations is via a ZIP code (maybe city) search. I can load 100% of the data into the search function and write a script to release 100 listings a day to the directory pages. The directory structure is actually .htaccess-created, so the SEs can see domain.com/dir/dir, but people doing the search will land on domain.com/listing.php?var=2452, which I can deny in the robots file.
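Roughly what I have in mind, as a sketch only (assuming Apache with mod_rewrite; the parameter names are placeholders):

    # .htaccess - map the crawlable directory-style URLs onto the real script
    RewriteEngine On
    RewriteRule ^([a-z0-9-]+)/([a-z]{2})/([a-z0-9-]+)/?$ listing.php?brand=$1&state=$2&city=$3 [L,QSA]

    # robots.txt - keep the query-string version of the listings out of the index
    User-agent: *
    Disallow: /listing.php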

4string
3:15 pm on Jul 21, 2005 (gmt 0)

What about launching the site as you have it, but blocking Google from most directories and pages with robots.txt? Then, maybe once a week, unblock a directory in robots.txt. Would that work? I don't know from experience, but it sounds like a decent idea. That way, you could let other engines like Yahoo get the full site and not scare off Google too badly.
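Something along these lines, where the directory names are just placeholders (delete one Disallow line each week as you open sections up):

    # robots.txt at launch - Googlebot only sees a slice of the site
    User-agent: Googlebot
    Disallow: /brand-a/
    Disallow: /brand-b/
    Disallow: /brand-c/

    # everyone else (Yahoo, MSN, etc.) gets the whole site
    User-agent: *
    Disallow: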

I'd love to know if this works because I might be in a similar position soon.

JeremyL
5:29 pm on Jul 21, 2005 (gmt 0)

I've always been fearful of doing that for pages I know I will want to rank well later. That, and it would be one huge robots file.

Gorilla
5:35 pm on Jul 21, 2005 (gmt 0)

The directory structure is actually .htaccess-created, so the SEs can see domain.com/dir/dir, but people doing the search will land on domain.com/listing.php?var=2452, which I can deny in the robots file.

This is a mistake. People will make links to the URLs they see, and since you disallow spiders access to those pages, you will not get ranking benefit from those links. Pages should only be visible under one address.

JeremyL
6:18 pm on Jul 21, 2005 (gmt 0)

Yeah, I didn't really describe exactly how it will be. The search results will be domain.com/search.php?search=blah, but the result list will actually link to domain.com/location/listing.htm. There is no way to get around the search results looking dynamic, and I don't want them to be. But the actual listing and listing reviews will be the same as if they were found through the directory.

europeforvisitors
9:07 pm on Jul 21, 2005 (gmt 0)

Your worry is very real. I think attempting to launch 2 million pages at one time is suicide.

Especially if they're just empty vessels waiting to be filled by users.

eyezshine
9:53 pm on Jul 21, 2005 (gmt 0)

I would just put a meta noindex tag on the pages that don't have content yet and set it up so the noindex tag goes away when the page is updated with content.

Google will index the pages as soon as the noindex tag comes off. I did this to a link directory with many categories but few links, and it worked great. As soon as links were added to a category, the noindex tag was changed to index, and Google indexed the page two weeks later.
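A minimal sketch of that switch, assuming a PHP template where the listing count comes from the database (the variable name is made up):

    <?php
    // In the <head> of the category/listing template: keep empty pages out
    // of the index until they actually have content ($listing_count is a
    // placeholder for the number of live listings on this page).
    if ($listing_count > 0) {
        echo '<meta name="robots" content="index,follow">';
    } else {
        echo '<meta name="robots" content="noindex,follow">';
    }
    ?>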

BeeDeeDubbleU
6:35 am on Jul 22, 2005 (gmt 0)

Especially if they're just empty vessels waiting to be filled by users.

Yeah! This sounds like another of those really interesting sites :(

phantombookman
6:48 am on Jul 22, 2005 (gmt 0)

I've had several sites that have been sandboxed and some that have not.

The biggest factor in being sandboxed is the theme/subject (or the theme as perceived by G). If it triggers the sandbox, then you're in; links, size of site, etc. will not come into it.

GlynMusica
2:45 pm on Jul 22, 2005 (gmt 0)

If the site is any good, people will link to it. 2 million pages is a lot of site; it will become important and Google will note that.

Why don't you start with 10 of the biggest cities, see how those pages work out, and then roll it out accordingly? That way you can see how effective the site is in terms of converting customers.

How much better would the web be for users without databases?
