Spidering Huge Sites

Indexing huge sites beyond 100k docs.

         

adfree

1:55 pm on Jan 13, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



My 2¢:

Startup Situation:
Navigation crawl-friendly.
Text links only, 1-2 word anchors.
Avg. doc size 30k (never bigger than 45k).
CSS used.
Dynamic content with rewritten URLs using IIS rewrite.
Never more than 100 links per page (incl. sitemaps).
Overall size 170k docs in up to 5 levels.
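
The per-page limits listed above (at most 100 links, at most 45k per doc) can be checked with a small script. This is only a rough sketch: the names `LinkCounter` and `audit_page` are made up for illustration, "45k" is assumed to mean 45 * 1024 bytes, and only `<a>` tags with an `href` are counted as links.

```python
from html.parser import HTMLParser

MAX_LINKS = 100        # limit quoted in the post
MAX_BYTES = 45 * 1024  # "never bigger than 45k", assuming k = 1024 bytes


class LinkCounter(HTMLParser):
    """Counts <a> tags that carry an href attribute."""

    def __init__(self):
        super().__init__()
        self.links = 0

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; anchors without href are skipped
        if tag == "a" and any(name == "href" for name, _ in attrs):
            self.links += 1


def audit_page(html: str) -> dict:
    """Return link count and byte size of one page, plus pass/fail flags."""
    counter = LinkCounter()
    counter.feed(html)
    size = len(html.encode("utf-8"))
    return {
        "links": counter.links,
        "bytes": size,
        "links_ok": counter.links <= MAX_LINKS,
        "size_ok": size <= MAX_BYTES,
    }


if __name__ == "__main__":
    sample = '<p><a href="/a">Blue widgets</a> <a href="/b">Red widgets</a></p>'
    print(audit_page(sample))
```

Run against saved copies of the sitemap pages, something like this would flag any page that drifts past the limits as the site grows.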

Progress:
Site found (third party link), spidered 3 times with 20 hits after 3 days.
GBot hovering across files within top level for two weeks.
Visit frequency doubling every two to four weeks.
Pounding deep after 8 weeks, hundreds of visits a day, between 3k and 5k hits.
Three months after start, 10% of entire site indexed.
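
For a sense of scale, here is a back-of-the-envelope calculation using the figures above. It is a sketch under an optimistic assumption that is not from the crawl data: every hit fetches a different document that has not been fetched before. Real crawls revisit pages, so actual times would be longer. The function names are made up for illustration.

```python
import math

TOTAL_DOCS = 170_000           # from the post
INDEXED_PERCENT = 10           # from the post, three months in
HITS_PER_DAY = (3_000, 5_000)  # from the post, low and high end


def indexed_docs(total_docs: int, percent: int) -> int:
    """Number of docs indexed, given a whole-number percentage."""
    return total_docs * percent // 100


def days_for_full_pass(docs: int, hits_per_day: int) -> int:
    """Days to fetch every doc once, assuming no repeat fetches (optimistic)."""
    return math.ceil(docs / hits_per_day)


if __name__ == "__main__":
    done = indexed_docs(TOTAL_DOCS, INDEXED_PERCENT)
    print("indexed so far:", done)
    print("still to index:", TOTAL_DOCS - done)
    for rate in HITS_PER_DAY:
        print(rate, "hits/day ->", days_for_full_pass(TOTAL_DOCS, rate), "days per full pass")
```

Under that assumption, one full pass over all 170k docs takes roughly 34 to 57 days at the current hit rate, and 10% indexed works out to about 17,000 docs.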

What to expect next? How can I help it progress further? What's your experience?

Thanks for any contribution, cheers, Jens

adfree

10:22 pm on Jan 13, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Forgot to mention this:

Inbound links come from more than one site, each with PR higher than 3, and point to the homepage only.

Cheers, Jens

ThomasB

11:17 pm on Jan 13, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I'd wait until more pages are indexed. But more good links should speed up the process.