
Forum Moderators: anallawalla & bakedjake


Starting a site with millions of pages (built on APIs)

     
4:46 pm on Feb 7, 2012 (gmt 0)

New User

5+ Year Member

joined:Apr 15, 2009
posts: 14
votes: 0


Hey guys,

I'm developing a site that will probably have millions of pages, generated from APIs that expose millions of entities. About 99% of the content (companies, services, products, places) will already be published somewhere else on the web.
Will I run into massive duplicate content issues, even if I combine content from different sources so that no page is an exact 1:1 copy?

Looking only at the business listings, and from a search engine's point of view, the site drills down like this:
    homepage->state->city->category->business
    homepage->category->business (tens or hundreds of thousands of businesses, with huge pagination in this case)

and of course there's a search form, too.

I assume I should use noindex,follow for the second path?
Should I list all of a business's categories on its listing page? If so, should those entries link back to the category pages? I'm not sure how that affects link juice.

[edited by: tedster at 6:06 pm (utc) on Feb 7, 2012]
[edit reason] moved from another location [/edit]

7:07 pm on Feb 7, 2012 (gmt 0)

Moderator This Forum from AU 

WebmasterWorld Administrator anallawalla is a WebmasterWorld Top Contributor of All Time 10+ Year Member

joined:Mar 3, 2003
posts: 3744
votes: 13


I have worked with some very large national directories and can offer this: Google will take some time to index it all (how long depends on how many millions of pages there are), so a good sitemap.xml index method will help to get the more important pages indexed first.
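To make the sitemap index idea concrete, here is a minimal sketch, not a definitive implementation. It splits URLs into sitemap files of at most 50,000 entries (the per-file limit in the sitemaps.org protocol) and lists those files in one index, most important group first. The function names, file naming scheme, and example.com URLs are all assumptions for illustration, and listing order in the index is only a hint: nothing guarantees that a search engine crawls the files in that order.

```python
from xml.sax.saxutils import escape

MAX_URLS_PER_SITEMAP = 50_000  # per-file limit from the sitemaps.org protocol
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def chunk(urls, size):
    """Yield consecutive slices of `urls`, each at most `size` long."""
    for start in range(0, len(urls), size):
        yield urls[start:start + size]


def build_sitemap(urls):
    """Return the XML text of one sitemap file listing `urls`."""
    entries = "".join(f"  <url><loc>{escape(u)}</loc></url>\n" for u in urls)
    return (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        f'<urlset xmlns="{SITEMAP_NS}">\n'
        f"{entries}</urlset>\n"
    )


def build_sitemap_index(sitemap_urls):
    """Return the XML text of a sitemap index pointing at `sitemap_urls`."""
    entries = "".join(
        f"  <sitemap><loc>{escape(u)}</loc></sitemap>\n" for u in sitemap_urls
    )
    return (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        f'<sitemapindex xmlns="{SITEMAP_NS}">\n'
        f"{entries}</sitemapindex>\n"
    )


def plan_sitemaps(groups, base_url, size=MAX_URLS_PER_SITEMAP):
    """Build sitemap files plus an index.

    `groups` is a list of (label, urls) pairs, most important first
    (hypothetical labels such as "categories" or "businesses").
    Returns (index_xml, {file_name: sitemap_xml}).
    """
    files = {}
    for label, urls in groups:
        for number, part in enumerate(chunk(urls, size), start=1):
            files[f"sitemap-{label}-{number}.xml"] = build_sitemap(part)
    index = build_sitemap_index([f"{base_url}/{name}" for name in files])
    return index, files
```

Splitting the files by page type has a side benefit: webmaster tools report indexed counts per sitemap file, so you can see which kind of page is lagging.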

Search engine visitors will come via category-based phrases, so you don't want to block that path. I don't see a need for two hierarchies. The eventual business profile page should have only one instance, that is, one URL no matter which path led to it.
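One way to keep a single instance of each profile page is to build its URL from the business alone, never from the browse path, and to declare that URL with a rel="canonical" link element. The sketch below assumes a hypothetical URL scheme, base URL, and function names; none of them come from the site being discussed.

```python
from html import escape

BASE_URL = "https://www.example.com"  # hypothetical site root


def business_url(slug, business_id):
    """Return the one URL for a business profile.

    The URL depends only on the business, so a link from
    state->city->category and a link from category alone are identical.
    """
    return f"{BASE_URL}/business/{slug}-{business_id}"


def canonical_link_tag(url):
    """Return the <link rel="canonical"> element for the page <head>."""
    return f'<link rel="canonical" href="{escape(url, quote=True)}">'


# Both browse paths link to the same profile URL.
link_from_city_page = business_url("joes-pizza", 42)
link_from_category_page = business_url("joes-pizza", 42)
print(canonical_link_tag(link_from_city_page))
```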

You may have omitted one step: a "businesses" list, which leads to the individual "business" listings.

Directory users (direct visits) will use the internal search primarily and might browse geographically. You may want to consider how and where to use noindex,follow.
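As a starting point for deciding where noindex,follow goes, the rule can live in one small function that every template calls. The policy below is only an assumption for illustration, not advice from this thread: internal search results and deep pagination get noindex,follow, and everything else stays indexable. The page type names are hypothetical.

```python
# Assumed policy: page types that should stay out of the index while
# still letting crawlers follow their links. Adjust to your own site.
NOINDEX_PAGE_TYPES = {"search_results"}


def robots_meta(page_type, page_number=1):
    """Return the robots meta tag for a page.

    Pages of an assumed noindex type, and any paginated page after the
    first, get noindex,follow; all other pages get index,follow.
    """
    if page_type in NOINDEX_PAGE_TYPES or page_number > 1:
        content = "noindex,follow"
    else:
        content = "index,follow"
    return f'<meta name="robots" content="{content}">'


print(robots_meta("category"))
print(robots_meta("category", page_number=3))
print(robots_meta("search_results"))
```

Keeping the rule in one place makes it easy to change the policy later without hunting through templates.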
 
