Thanks TJ, I'll have to read up on Block Rank. I'm not subscribed to WebmasterWorld so I can't read the thread.
Thinking about the term though I can see where it's leading.
And yes, the chunk of stuff it took does need PageRank calculated, as it's all been renamed etc.
Block Rank bedtime reading: [dbpubs.stanford.edu:8090...]
Just checked and I'm seeing that a ton of my pages have made it out of the supplemental index.
That has nothing to do with a Google change though.
I solved (as I said before) my page size problem and a duplicate page problem.
"my page size problem"
what do you mean?
How did you solve the page duplication problem and get your pages out of the supplemental results?
walkman: Maybe he used to have pages bigger than 100K.
Many of my pages were well over 100k. It's a forum/CMS based site that was using a heavy template. I edited the template to reduce the amount of redundant code, and almost all of the pages displayed now are under 40k. Which, by no coincidence, is what most of the WebmasterWorld page sizes are, and it's well indexed by Google.
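For anyone wanting to check their own templates against the sizes mentioned above, here is a minimal sketch. The 100k threshold is just the figure quoted in this thread, not a documented Google limit, and the function name `page_size_report` is made up for illustration.

```python
def page_size_report(html: str, limit_bytes: int = 100 * 1024) -> dict:
    """Report the size of a rendered page and whether it exceeds a threshold.

    limit_bytes defaults to 100k, the figure discussed in this thread
    (an assumption, not an official limit).
    """
    size = len(html.encode("utf-8"))  # size on the wire, before compression
    return {
        "bytes": size,
        "kilobytes": round(size / 1024, 1),
        "over_limit": size > limit_bytes,
    }


if __name__ == "__main__":
    heavy = "<div>" + "x" * (120 * 1024) + "</div>"   # stand-in for a heavy template
    light = "<div>" + "x" * (30 * 1024) + "</div>"    # stand-in for the trimmed one
    print(page_size_report(heavy))
    print(page_size_report(light))
```

Feed it the HTML your server actually sends for a few typical pages, since the template overhead is what matters here.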
My page duplication problem was a result of mod_rewrite: I had many pages that were available via several different URLs. To fix the issue I added the regular dynamic-looking page URLs to my robots.txt file. I also went through and double-checked my forum/CMS software to make sure that it was outputting the new static-looking URLs.
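A rough sketch of how one might verify that kind of robots.txt fix, using Python's standard `urllib.robotparser`. The paths `/viewtopic.php` and `/topic-123.html` and the host `example.com` are hypothetical stand-ins; the poster's real URLs are not given in the thread.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: block the old dynamic-looking URLs so only
# the rewritten static-looking URLs get crawled.
ROBOTS_TXT = """\
User-agent: *
Disallow: /viewtopic.php
"""


def is_crawlable(url: str, user_agent: str = "Googlebot") -> bool:
    """Return True if the given robots.txt rules allow user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # The dynamic duplicate should be blocked, the static version allowed.
    print(is_crawlable("http://example.com/viewtopic.php?t=123"))
    print(is_crawlable("http://example.com/topic-123.html"))
```

Running the same check over a list of both URL forms is a quick way to confirm that each page is reachable by exactly one of them.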
Google ain't crawling like November :)
In November I peaked at 170 pages per second from Googlebot (yes, dynamic database-generated pages, thank goodness for C :) and saw 100+ pages per second for three to four minutes at a time. This crawl I see 20 to 30 pages per second max, without anywhere near the sustain time of November's crawl.
Yahoo, on the other hand, is crawling more aggressively than I've ever seen, but is still using (boneheaded) Inktomi code. One beautiful thing that Inktomi's (boneheaded) code will do is ask for a non-directory when given a directory link. That is to say, if Inktomi finds a link like "/a/b/c/" it'll try to crawl "/a/b/c" and will generate an extra hit as the server redirects to the proper URI with a 301 redirect. Beautiful.
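To illustrate the extra hit described above, here is a toy model of how a server typically answers a directory request that is missing its trailing slash. It is only a sketch: the function `respond` and the directory set are invented for the example and do not reflect any particular server's code.

```python
from typing import Set, Tuple


def respond(path: str, directories: Set[str]) -> Tuple[int, str]:
    """Toy model: return (status, location) for a requested path.

    If the path lacks a trailing slash but names a known directory,
    answer with a 301 to the slash-terminated URI; otherwise serve it.
    """
    if not path.endswith("/") and path + "/" in directories:
        return 301, path + "/"
    return 200, path


if __name__ == "__main__":
    dirs = {"/a/b/c/"}
    # A crawler that strips the slash pays for two requests instead of one.
    print(respond("/a/b/c", dirs))    # first hit: redirect
    print(respond("/a/b/c/", dirs))   # second hit: the actual page
```

Counting the 301 responses per user agent in the access log shows how many wasted hits this costs.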
Just bringing this thread back for a quick question.
Has anyone seen the pages from this crawl make the index? I have some pages added, but definitely not the amount that were crawled.
Just wondering what others' experiences are on this?
I'm seeing pages from two days ago in Google's Index -- site is a forum like (but completely unlike, if you see what I mean) this one.
Yes, I am getting pages added, but this crawl went very deep on some sites that have not been crawled well recently. Those pages have not appeared in the index, but more recently crawled pages have.