Forum Moderators: Robert Charlton & goodroi
Today Googlebot has pulled 4,700 pages (so far), requesting 3 pages a second.
I verified it's a real bot. Is this normal? My server can easily handle the load, but it just seems a little frightening and exciting at the same time.
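For anyone else who wants to check: the verification is a reverse-DNS lookup on the crawler's IP, followed by a forward lookup confirming that hostname resolves back to the same address. A minimal Python sketch of that check (the sample IP at the end is just an illustration from a published Googlebot range):

    import socket

    def is_real_googlebot(ip):
        """Reverse-DNS check: genuine Googlebot IPs resolve to a
        *.googlebot.com or *.google.com hostname, and that hostname
        must resolve back to the same IP (forward confirmation)."""
        try:
            host = socket.gethostbyaddr(ip)[0]
        except OSError:
            return False
        if not host.endswith((".googlebot.com", ".google.com")):
            return False
        try:
            return ip in socket.gethostbyname_ex(host)[2]
        except OSError:
            return False

    print(is_real_googlebot("66.249.66.1"))  # example address from a Googlebot range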
Consider BlockRank.
Thanks TJ, I'll have to read up on BlockRank; I'm not subscribed to WebmasterWorld so I can't read the thread.
Thinking about the term, though, I can see where it's leading.
And yes, the chunk of pages it took does need PageRank recalculated, as it's all been renamed etc.
BlockRank bedtime reading: [dbpubs.stanford.edu:8090...]
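The gist of the paper, as I read it: compute PageRank locally within each host "block", compute a PageRank over the graph of blocks themselves, and use the product of the two as the starting vector for the full PageRank iteration, which then converges much faster. A rough Python sketch of that idea (simplified from the paper: the block graph here is weighted by raw link counts rather than the paper's local-PageRank weighting):

    import numpy as np

    def pagerank(out_links, n, x0=None, d=0.85, tol=1e-10, max_iter=200):
        """Plain power-iteration PageRank. out_links: {node: [targets]},
        nodes numbered 0..n-1; rank from dangling nodes is spread uniformly."""
        x = np.full(n, 1.0 / n) if x0 is None else np.asarray(x0, float) / np.sum(x0)
        for _ in range(max_iter):
            nxt = np.zeros(n)
            dangling = 0.0
            for src in range(n):
                dsts = out_links.get(src, [])
                if dsts:
                    for dst in dsts:
                        nxt[dst] += x[src] / len(dsts)
                else:
                    dangling += x[src]
            nxt = d * (nxt + dangling / n) + (1.0 - d) / n
            if np.abs(nxt - x).sum() < tol:
                break
            x = nxt
        return nxt

    def blockrank_seed(out_links, blocks, n):
        """BlockRank-style start vector: local PageRank inside each block,
        scaled by the PageRank of the block-level graph."""
        node_block = {v: i for i, b in enumerate(blocks) for v in b}
        seed = np.zeros(n)
        for b in blocks:
            idx = {v: k for k, v in enumerate(b)}
            # local PageRank over intra-block links only
            intra = {idx[v]: [idx[w] for w in out_links.get(v, []) if w in idx]
                     for v in b}
            local = pagerank(intra, len(b))
            for v, k in idx.items():
                seed[v] = local[k]
        # block graph: one edge per inter-block page link
        block_links = {}
        for v, dsts in out_links.items():
            bi = node_block[v]
            targets = [node_block[w] for w in dsts if node_block[w] != bi]
            block_links.setdefault(bi, []).extend(targets)
        brank = pagerank(block_links, len(blocks))
        for v in range(n):
            seed[v] *= brank[node_block[v]]
        return seed

    # feed the seed into the global iteration:
    # x = pagerank(out_links, n, x0=blockrank_seed(out_links, blocks, n))

The win is purely in convergence speed; the fixed point is the same PageRank vector you'd get from a uniform start.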
That has nothing to do with a Google change, though.
I solved (as I said before) my page-size problem and a duplicate-page problem.
What do you mean?
My page duplication problem was a result of mod_rewrite: I had many pages that were available via several different URLs. To fix the issue, I added the regular dynamic-looking page URLs to my robots.txt file. I also went through and double-checked my forum/CMS software to make sure it was outputting the new static-looking URLs.
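For example, the robots.txt addition looked roughly like this (the paths are placeholders for whatever your own software emits, and note that the * wildcard is a Googlebot extension, not part of the original robots.txt standard):

    User-agent: *
    # keep crawlers off the old dynamic URLs so only the rewritten
    # static-looking ones get crawled (paths below are placeholders)
    Disallow: /forum/viewtopic.php
    Disallow: /*?

That way the dynamic and static versions of a page stop competing with each other as duplicates.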
In November I peaked at 170 pages per second from Googlebot (yes, dynamic database-generated pages; thank goodness for C :) and saw 100+ pages per second for three to four minutes at a time. This crawl I'm seeing 20 to 30 pages per second at most, with nowhere near the sustain time of November's crawl.
Yahoo, on the other hand, is crawling more aggressively than I've ever seen, but is still using (boneheaded) Inktomi code. One beautiful thing that Inktomi's (boneheaded) code will do is ask for a non-directory when given a directory link. That is to say, if Inktomi finds a link like "/a/b/c/", it'll try to crawl "/a/b/c", generating an extra hit as the server 301-redirects to the proper URI. Beautiful.
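If you want a rough measure of that redirect tax, you can count the slashless 301s in your access log. A quick Python sketch, assuming a common/combined-format Apache log (the log path and the regex are assumptions; adjust them to your setup, and remember not every slashless 301 will be Inktomi's doing):

    import re

    LOG = "/var/log/apache2/access.log"  # assumed path
    line_re = re.compile(r'"(?:GET|HEAD) (\S+) HTTP/[\d.]+" (\d{3})')

    extra_hits = 0
    with open(LOG) as f:
        for line in f:
            m = line_re.search(line)
            # a 301 on a path without a trailing slash is the redirect
            # generated when a crawler asks for "/a/b/c" instead of "/a/b/c/"
            if m and m.group(2) == "301" and not m.group(1).endswith("/"):
                extra_hits += 1

    print(extra_hits, "slashless requests answered with a 301")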
Has anyone seen the pages from this crawl make the index? I have some pages added, but definitely not the number that were crawled.
Just wondering what others' experiences are with this.