
Spidering large websites of 50000 pages or more

How long does it take Google to spider them?


thewyliecoyote

6:51 pm on Mar 5, 2004 (gmt 0)

10+ Year Member



We have recently added 50,000 pages to an existing site, which Googlebot 2.1 already spiders roughly once a week.

I do not know how long it will take Google to spider the extra pages. Does anyone have any experience of this? The pages are linked effectively and correctly set up, so in theory Google should "love" them - but you never know.

If anyone has experience of getting large sites spidered by Google, or indeed by other search engines, please advise.
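For anyone wanting to answer this empirically, the simplest measure is your own access logs. Below is a minimal sketch, assuming Apache combined-format logs; the access.log path and the plain "Googlebot" user-agent match are illustrative assumptions, not a definitive implementation:

    import re
    from collections import defaultdict

    # Matches Apache "combined" log format; named groups pull out the
    # date, the requested URL and the user-agent string.
    LOG_LINE = re.compile(
        r'^\S+ \S+ \S+ \[(?P<day>[^:]+):[^\]]+\] "\S+ (?P<url>\S+) [^"]*" '
        r'\d{3} \S+ "[^"]*" "(?P<agent>[^"]*)"'
    )

    hits_per_day = defaultdict(int)
    urls_seen = set()

    with open("access.log") as log:  # hypothetical path to your server log
        for line in log:
            m = LOG_LINE.match(line)
            if m and "Googlebot" in m.group("agent"):
                hits_per_day[m.group("day")] += 1
                urls_seen.add(m.group("url"))

    for day in sorted(hits_per_day):
        print(day, hits_per_day[day])
    print("unique URLs crawled:", len(urls_seen))

Tracking the unique-URL count over a few weeks tells you how fast the new 50,000 pages are actually being fetched, independent of what shows up in the index.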

Many thanks

Rob Jones

claus

7:06 pm on Mar 6, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



*bump*

I'd like to hear as well - I had a fairly large project coming up when Update Florida happened. It's mothballed at the moment.

Here's a related thread from October. At that time I was more or less convinced that a 200K-page site would never get fully indexed (I mean indexed as in "shown in SERPs", not as in "crawled"):

[webmasterworld.com...]

Also, when starting from scratch, what are your chances? If you do not have high-PR backlinks to start with, is it possible to get beyond level-one indexing at all? And what is the likely time frame for indexing a few dozen thousand newly created pages?

I tend to think that the amount of linking power you would get from such a large number of pages should trigger some kind of latency in the indexing process - if anybody has facts they want to share about this, I'd like to know.

Sharper

8:29 pm on Mar 6, 2004 (gmt 0)

10+ Year Member



It takes forever.... forever... forever....

Really, it will seem like it. 200K pages isn't too bad, but based on past experience it'll take about 3-5 months if you have good incoming links (say, a resulting PR 6). Even then, you'll find that many (if not most) of the pages are included in the index but without a cache, title, etc., indicating that Google has seen them but hasn't actually stored them yet.

My suggestions?

1. A good set of sitemap pages. You can fit a lot of links on a 100 KB page, but sometimes you just have to have multiple pages (see the sketch after this list).

2. Additional deep links from elsewhere, where possible.
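For suggestion 1, here is a rough sketch of the splitting step: turning a flat list of URLs into a chain of small sitemap pages. The urls.txt input, the sitemapN.html file names and the 200-links-per-page chunk size are all assumptions for illustration, not anything Google mandates:

    # Split a flat URL list into linked sitemap pages of 200 links each.
    # (Assumes the URLs need no HTML escaping.)
    CHUNK = 200

    with open("urls.txt") as f:  # one URL per line
        urls = [line.strip() for line in f if line.strip()]

    chunks = [urls[i:i + CHUNK] for i in range(0, len(urls), CHUNK)]

    for n, chunk in enumerate(chunks, start=1):
        items = "\n".join(f'<li><a href="{u}">{u}</a></li>' for u in chunk)
        # Each page links forward to the next, so a spider that finds
        # sitemap1.html can walk the whole chain.
        nav = (f'<p><a href="sitemap{n + 1}.html">next page</a></p>'
               if n < len(chunks) else "")
        with open(f"sitemap{n}.html", "w") as out:
            out.write(f"<html><head><title>Site map, page {n}</title></head>\n"
                      f"<body>\n<ul>\n{items}\n</ul>\n{nav}</body></html>\n")

Keeping each page small and chaining them means no single sitemap page bumps into size limits, while every deep page stays within a couple of clicks of the home page.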

Schneewittchen

2:54 pm on Mar 7, 2004 (gmt 0)

10+ Year Member



I think he means the number of pages, not the page size. I've seen sites with over 500,000 pages in Google's cache. Some of my own websites have up to 21,000 pages and I see no problem: all pages get spidered and cached by Googlebot. It's ONLY a question of time.

percentages

3:10 pm on Mar 7, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Two years ago I would have said 4-5 weeks. Right now, unless you have a lot of deep links for spider food, expect 7-8 weeks to get all 50K pages spidered.
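As a back-of-the-envelope check on those figures (my arithmetic, not anything stated in the thread), the implied daily crawl rate is easy to compute and compare against your logs:

    # Daily crawl rate implied by the estimates above, for 50,000 pages.
    pages = 50_000
    for weeks in (4.5, 7.5):  # midpoints of the "4/5" and "7/8" week guesses
        print(f"about {pages / (weeks * 7):.0f} pages/day over {weeks} weeks")

That works out to very roughly 1,600 pages/day under the old estimate and 950 pages/day under the current one, which gives you a concrete target to check your access logs against.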