Forum Moderators: Robert Charlton & goodroi
During the same time period Inktomi has "hit" 67 times and MSN 357 times
Has anybody else ever seen anything like this? Or can anyone offer any explanation of why Google would hit my site so many times like that?
David
I started a thread at [webmasterworld.com...]
Are you running phpBB by any chance? It seems like the Googlebot is stuck in my phpBB discussion area on two different sites with two different hosts. The bandwidth is starting to really clobber me.
I reluctantly emailed Google. I'm hoping they will fix the problem, but the danger is that they might just turn the Googlebot off for the sites, and then it's goodbye to being in the index. However, at the current consumption I have to do something.
Chris
And just since yesterday the number of hits from Google's bot is up to 37,364 and 1.2 Gigs of bandwidth, and that is just for this month so far! If the trend continues I will have expended over 2 Gigs BW - just for this - by the time the month is over.
I would request that if you do hear back from Google on your email that you would let me know what they say about it.
Like you I fear that the google bot is an "all or nothing" affair. It would seem that if they are legitimately crawling a site that much then they must really like it. But they have had way more time than needed to index every single page I have. Typically I add several pages at least once a week but I doubt that accounts for this. I was doing the same in previous months.
The MSN bot has hit me 357 times at 9.8 megs BW, about the same percentage as the Google hits, so they seem to be normal crawls.
I wonder how many others are seeing this. If anyone reads this and has had similar experience a post would be appreciated.
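For anyone who wants to compare numbers from raw logs rather than a stats package, here is a rough sketch that tallies hits and bytes per bot. It assumes the server writes Apache-style "combined" log lines, and the user-agent substrings in BOTS are guesses to adjust against what your own logs show; tally_bots is just an illustrative name.

```python
import re
from collections import defaultdict

# Combined Log Format (assumed):
# ip ident user [date] "request" status bytes "referer" "user-agent"
LOG_LINE = re.compile(
    r'^\S+ \S+ \S+ \[[^\]]*\] "[^"]*" \d{3} (?P<size>\d+|-) '
    r'"[^"]*" "(?P<agent>[^"]*)"'
)

# Substrings to look for in the user-agent (assumptions, not an official list).
BOTS = {"Googlebot": "googlebot", "MSN": "msnbot", "Inktomi": "slurp"}


def tally_bots(lines):
    """Return {bot_name: [hits, bytes]} for the bots listed in BOTS."""
    totals = defaultdict(lambda: [0, 0])
    for line in lines:
        match = LOG_LINE.match(line)
        if not match:
            continue  # skip lines that are not in the expected format
        agent = match.group("agent").lower()
        size = match.group("size")
        for name, needle in BOTS.items():
            if needle in agent:
                totals[name][0] += 1
                # A "-" in the size field means no body was sent.
                totals[name][1] += 0 if size == "-" else int(size)
                break
    return dict(totals)
```

Feeding it an open log file (for example `tally_bots(open("access.log"))`, where the file name is hypothetical) gives one hits/bytes pair per bot to set against the monthly figures quoted in this thread.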
The Google bot has 'hit' 50,249 times, using 1.47 G of bandwidth so far this month.
Compare to July: 64 hits, 730 k BW!
Of course everyone wants to see googlebot in their webstats. But this is getting crazy.
In another thread many seem to think this is happening because of phpBB. BUT that isn't it. It is just a coincidence that they are using phpBB. Maybe the BB does add to Google's spidering efforts, but this doesn't account for what we are seeing.
My site is only a few months old and the total pages are probably around or less than a hundred. I am not using scripts (except an occasional java) nor am I running any type of bulletin board or forum. But one month I see 0.7M and then the next looks like it will be about 70-75 G
Google's goin exponential on me over here! Ha Ha
EDIT--> on another, newer (about 2 mos old), website I am seeing 18 hits by the googlebot
I had checked the IPs with ARIN whois and they were all Google. I saw something in the news that G and Y are in a "my index is bigger than yours" race. It might be possible that G is being a little more aggressive with the bots, trying to increase index size. (Which might work out well for me in the longer run, since a whole bunch of images from one of my sites just got indexed. Went from 13 to 200+ images indexed for that site.)
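A whois lookup is one way to check who owns an IP. A different technique, which Google itself describes for verifying Googlebot, is a reverse DNS lookup followed by a forward lookup of the returned name. Here is a minimal sketch of that check; the function name is_googlebot is made up for illustration, the two lookup parameters exist only so the function can be tried without network access, and the two domain suffixes are an assumption that may not cover every Google crawler.

```python
import socket


def is_googlebot(ip,
                 reverse_lookup=socket.gethostbyaddr,
                 forward_lookup=socket.gethostbyname_ex):
    """Reverse-resolve ip, check the domain, then forward-resolve to confirm."""
    try:
        hostname = reverse_lookup(ip)[0]
    except OSError:
        return False  # no reverse DNS record, so it cannot be confirmed
    # Assumed suffixes; the leading dot prevents matches like "notgoogle.com".
    if not hostname.endswith((".googlebot.com", ".google.com")):
        return False
    try:
        addresses = forward_lookup(hostname)[2]
    except OSError:
        return False
    # The name must resolve back to the same IP, otherwise the reverse
    # record could have been faked by whoever controls that IP block.
    return ip in addresses
```

Calling `is_googlebot(some_ip)` with the defaults does real DNS lookups, so it is best run against a handful of IPs pulled from the logs rather than on every request.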
cg
I long, long ago (well maybe > 5 years!) put in place a couple of mechanisms to automatically cap bandwidth consumed by *any* single user, whoever/whatever they are, especially if their behaviour does not look like a normal human, nor even a student up all night on caffeine avoiding doing their assignment! B^>
It protects a little against mindless idiots trying to "save time" by downloading all my 10GB of free content rather than clicking on a couple of links (or just trying to steal it), as well as the GoogleBot or Y!Bot, or whatever, running amok from time-to-time.
Indeed, I first put this in place because G accidentally crushed my site (at least) once, and even though they stopped when I emailed I simply didn't want to have to worry about it again.
And I seem to stay fairly up-to-date in at least G's SERPs even though I *know* that I am holding back their bots from time-to-time because I see my software whinging about their IPs.
You can do this at several levels, from "traffic shaping" in the router in front of your Web site, to dynamic filters in your Web server.
Rgds
Damon
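Damon doesn't say how his cap is implemented, so the following is only a sketch of one way a per-client cap at the Web server level might look: a byte budget per client over a sliding time window. The class name BandwidthCap, its methods, and the limits used are all made up for illustration.

```python
import time
from collections import defaultdict, deque


class BandwidthCap:
    """Track bytes served per client over a sliding window."""

    def __init__(self, max_bytes, window_seconds, clock=time.monotonic):
        self.max_bytes = max_bytes
        self.window = window_seconds
        self.clock = clock  # injectable so the class can be tried without waiting
        self.history = defaultdict(deque)  # client -> deque of (timestamp, bytes)
        self.used = defaultdict(int)       # client -> bytes inside the window

    def _expire(self, client, now):
        # Drop responses that have aged out of the window.
        events = self.history[client]
        while events and now - events[0][0] >= self.window:
            _, size = events.popleft()
            self.used[client] -= size

    def allow(self, client):
        """True if the client is still under its budget."""
        self._expire(client, self.clock())
        return self.used[client] < self.max_bytes

    def record(self, client, size):
        """Note that size bytes were just served to the client."""
        now = self.clock()
        self._expire(client, now)
        self.history[client].append((now, size))
        self.used[client] += size
```

The client key would typically be the remote IP. When `allow()` returns False, one option is to answer with a 503 and a Retry-After header instead of the page, which asks a well-behaved bot to come back later rather than shutting it out.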
You might search WebmasterWorld for "Google and Session IDs and Infinite Loops". Session IDs are used to track visitors who don't accept cookies, and search engines can sometimes get lost in sites built on PHP and ASP scripts that add them to every URL. That might be the reason for the large bandwidth.
There are several discussions on this on WebmasterWorld. Here is one:
[webmasterworld.com...]
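To illustrate why session IDs can trap a crawler: every visit gets a fresh ID, so the same page shows up under an endless supply of different URLs. Here is a small sketch that strips such parameters so duplicate URLs can be spotted in a log. The parameter names in SESSION_PARAMS are assumptions (phpBB uses `sid`, plain PHP sessions default to `PHPSESSID`); check what your own scripts append.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Query parameters treated as session IDs (assumed names, compared in lowercase).
SESSION_PARAMS = {"sid", "phpsessid", "sessionid"}


def strip_session_ids(url):
    """Return url with any session-ID query parameters removed."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in SESSION_PARAMS
    ]
    return urlunsplit(parts._replace(query=urlencode(kept)))
```

If thousands of logged Googlebot requests collapse to a few dozen distinct URLs after this, the session-ID loop is a likely culprit; if they don't, as some posters in this thread report, the cause is probably elsewhere.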
In fact my site is small and relatively simple.
The last I checked I had over 50,000 hits from the Googlebot this month. Why would they crawl that many times for a site that has around or less than 100 pages? Very strange.
I had always thought they were supposed to be fairly demure in how frequently they access a site. It doesn't seem stuck in any loops, though, they're just being VERY thorough at crawling my forums and other script-based areas of the site.
I don't mind, I just hope I see some sort of benefit - such as even more pages in their index.