But I only have about 48 pages on my entire web site.
Slurp/cat seems to be staying out of the areas restricted by robots.txt and my <meta> tags, so that's not the problem (if they crawled every "add to cart" link, my server would have crashed by now...)
I'm at an absolute loss to explain why they need to hit every page an average of 10 times -- it's pulled my default home page 31 times since July 11. My logs show it hitting the *exact* same page multiple times within a couple of minutes.
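For anyone who wants to check their own logs for the same pattern, here is a rough sketch. It assumes an Apache "combined" format access log and that the crawler's user agent string contains "Slurp"; the function names and the two-minute window are my own choices for illustration, not anything Yahoo documents.

```python
import re
from collections import Counter
from datetime import datetime, timedelta

# Assumption: Apache "combined" log format. Adjust the pattern for other servers.
LOG_LINE = re.compile(
    r'\S+ \S+ \S+ \[(?P<ts>[^\]]+)\] "(?:GET|HEAD) (?P<path>\S+)[^"]*" '
    r'(?P<status>\d{3}) \S+ "[^"]*" "(?P<agent>[^"]*)"'
)

def slurp_hits(lines, agent_marker="Slurp"):
    """Yield (timestamp, path, status) for requests whose user agent contains agent_marker."""
    for line in lines:
        m = LOG_LINE.match(line)
        if m and agent_marker in m.group("agent"):
            ts = datetime.strptime(m.group("ts"), "%d/%b/%Y:%H:%M:%S %z")
            yield ts, m.group("path"), int(m.group("status"))

def hits_per_page(hits):
    """Count how many times each path was requested."""
    return Counter(path for _, path, _ in hits)

def rapid_repeats(hits, window=timedelta(minutes=2)):
    """Count requests that arrive within `window` of the previous request for the same path."""
    last_seen = {}
    repeats = Counter()
    for ts, path, _ in sorted(hits):
        prev = last_seen.get(path)
        if prev is not None and ts - prev <= window:
            repeats[path] += 1
        last_seen[path] = ts
    return repeats
```

Feed it the lines of an access log, e.g. `hits = list(slurp_hits(open("access.log")))` (the file name is a placeholder), then look at `hits_per_page(hits).most_common(10)` and `rapid_repeats(hits)`.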
I haven't added any significant new content in the last month and nearly all of my pages were already in the index.
Additionally, it appears Slurp is continuing to hit a number of pages that have had a 301 permanent redirect in place for about 6 weeks.
My guess is that Slurp is somewhat confused this month and isn't keeping track of where it's been.
For example, 1338 page requests in the last 24 hours, and 5770 page requests in the last week on one of our e-commerce sites. One thing I've noticed is that it seems to be going deeper into the site than ever before... requesting almost every page on the site, including just about every form of valid dynamic parameter that is linked from somewhere else.
About 20% of the requests over the last week are duplicates.
No signs of additional pages being added to the index. I'll continue watching this very closely.
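One way to put a number on the duplicate share, as a minimal sketch: it assumes "duplicate" means any request for a path that was already requested earlier in the same period, and `duplicate_share` is just a name made up for this example.

```python
from collections import Counter

def duplicate_share(paths):
    """Fraction of requests that repeat a path already requested in the same period."""
    counts = Counter(paths)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    # Every request beyond the first for a given path counts as a duplicate.
    duplicates = total - len(counts)
    return duplicates / total
```

Under that definition, a 20% share of 5770 requests works out to roughly 1154 repeat requests.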
It doesn't seem too smart, as it repeatedly requests the same files on the same days, thousands of them. Ugh.
I may wait a while before I block it or restrict it, though, just to see what happens later this year with Yahoo ...
Sometimes Slurp seems to think I'm just a big kidder and comes back and back and back for non-existent files. Big G's processing of 301 and 404 codes has Ink beat by miles and miles from my perspective.
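To see which redirected or missing URLs a crawler keeps coming back for, something like this could help. It is only a sketch: it assumes you have already pulled (path, status) pairs for the crawler out of your logs, and the function name and the `min_hits` threshold are hypothetical.

```python
from collections import Counter

def repeated_dead_requests(requests, statuses=(301, 404), min_hits=2):
    """Return {(path, status): count} for paths answered with a 301 or 404 at least min_hits times."""
    counts = Counter(
        (path, status) for path, status in requests if status in statuses
    )
    return {key: n for key, n in counts.items() if n >= min_hits}
```

Anything that shows up here week after week is a URL the crawler has not dropped despite the 301 or 404 response.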
Yeah, that's well over 10 hits per page.
And no, it doesn't include hits on robots.txt.
Seems like a pretty poor way to run a spider to me. At the very least, there were 470 hits that could have been directed to some other site...