| 1:06 pm on Oct 31, 2004 (gmt 0)|
referrals? Or GoogleBot visits?
Neither is uncommon.
| 10:06 am on Nov 2, 2004 (gmt 0)|
Oh man, you're being hit badly! :>) Take it easy, it's common.
| 5:54 pm on Nov 2, 2004 (gmt 0)|
You are not alone with the enormous amounts of requests.
| 5:58 pm on Nov 2, 2004 (gmt 0)|
We get that per day here.
| 6:24 pm on Nov 2, 2004 (gmt 0)|
Yep, Google sends around 30K per day to me, and I'm expecting a *large* increase in traffic once the new links kick in.
| 7:12 pm on Nov 2, 2004 (gmt 0)|
I'm getting the same thing, this is the second in the last 30 days.
| 7:33 pm on Nov 2, 2004 (gmt 0)|
We got hit so hard by GBot today that all our stores are offline - inadvertent Denial Of Service
| 7:34 pm on Nov 2, 2004 (gmt 0)|
I have never been hit so hard. Why all of the action?
| 9:13 pm on Nov 2, 2004 (gmt 0)|
One of my sites had more Gbot hits yesterday than the whole month of October.
| 10:38 pm on Nov 2, 2004 (gmt 0)|
Googlebot crawling like crazy here. 2 gig in less than 2 days. COOL! Maybe googlebot will be able to keep up with AJ and MSN who have been outcrawling (and referring more) than Google in the past few months.
| 10:58 pm on Nov 2, 2004 (gmt 0)|
My site has been in the sandbox, or whatever you prefer to call it, since I submitted in June. G was only coming around a couple hundred times a month. My site is 1400-1500 pages. In the last couple of days, it has visited 1501 times, and has now indexed 885 pages.
Now to work on more incoming links!
| 11:18 pm on Nov 2, 2004 (gmt 0)|
This might be somewhat off topic, but I was not able to find any information while performing a preliminary search of the forums.
How does Google actually index a site? I understand there are multiple bots that perform different tasks (find links, index the page, etc.). Is there any documentation on this? What is the order of occurrence for each bot?
| 11:26 am on Nov 3, 2004 (gmt 0)|
The darn thing is doing "in" my site :-
looking through the stats I get blasts of activity
up to 20 requests a second!
This is hurting too much!
I like google...but this is silly!
I keep having to ban and unban that IP!
on IP 18.104.22.168
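For anyone who wants to measure bursts like this before banning an IP, here is a minimal sketch. It assumes the access log is in Common Log Format with one-second timestamps (an assumption, the post does not say which server or log format is in use). The function name and the sample lines are hypothetical, and the addresses are documentation-only placeholders.

```python
import re
from collections import Counter

# Matches the start of a Common Log Format line: client IP, then the [timestamp].
LOG_LINE = re.compile(r'^(\S+) \S+ \S+ \[([^\]]+)\]')

def peak_requests_per_second(lines):
    """Return {ip: highest request count seen under any single one-second timestamp}."""
    per_second = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if match:
            ip, stamp = match.groups()
            per_second[(ip, stamp)] += 1
    peaks = {}
    for (ip, _stamp), count in per_second.items():
        peaks[ip] = max(peaks.get(ip, 0), count)
    return peaks

# Hypothetical sample lines, not taken from any real log.
sample = [
    '192.0.2.1 - - [03/Nov/2004:11:20:01 +0000] "GET /a HTTP/1.0" 200 512',
    '192.0.2.1 - - [03/Nov/2004:11:20:01 +0000] "GET /b HTTP/1.0" 200 512',
    '192.0.2.1 - - [03/Nov/2004:11:20:01 +0000] "GET /c HTTP/1.0" 200 512',
    '192.0.2.1 - - [03/Nov/2004:11:20:02 +0000] "GET /d HTTP/1.0" 200 512',
    '198.51.100.7 - - [03/Nov/2004:11:20:01 +0000] "GET /a HTTP/1.0" 200 512',
]
peaks = peak_requests_per_second(sample)
```

Feeding it the lines of a real log (for example `peak_requests_per_second(open("access.log"))`, with a path of your own) gives a per-IP peak that can be compared against the "20 requests a second" figure.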
| 6:47 pm on Nov 3, 2004 (gmt 0)|
pipster2004, the crawl team is looking into it. We don't want to crawl so hard that you have to take action like that.
| 7:09 pm on Nov 3, 2004 (gmt 0)|
One thing I've noticed: when something big is in the offing, Googleguy pops up.
Is it me? Am I paranoid? Or has anyone else noticed this?
| 7:45 pm on Nov 3, 2004 (gmt 0)|
[BEGIN COMPLETE SPECULATION]
Personally, I think that Google has finally developed a system that overcomes the space limitations of their previous version, and has now begun a full crawl in earnest, using a newly developed crawler (one that attempts to evaluate the speed/capacity of a site's server on the fly for maximum indexing speed), to rebuild their entire index from the ground up.
I think in the next 3-6 weeks there will be both a MAJOR update and a release by Google saying "now searching XXX billion and/or trillion pages".
| 7:51 pm on Nov 3, 2004 (gmt 0)|
Is there an imposter Googlebot roaming around?
What's with the Mozilla user-agent?
Why isn't it requesting my "robots.txt" files, anymore?
| 7:57 pm on Nov 3, 2004 (gmt 0)|
200 pages per second crawled...
Not even breathing hard :)
| 8:12 pm on Nov 3, 2004 (gmt 0)|
3 million pages crawled...this morning. However this is spread across several servers on several IPs so it's not painful, just surprising.
| 8:49 pm on Nov 3, 2004 (gmt 0)|
I concur with webfusion's speculation. Something's up. An additional point is MSN may soon graduate its techpreview. IMO they wouldn't want to face comparative reviews showing MSN's 25 billion (or whatever) pages to Google's 4 billion. ... at least not without a way to compete in the numbers game.
| 8:54 pm on Nov 3, 2004 (gmt 0)|
I have had a new site crawled and now ranking in under two weeks now - visiting daily at the moment.
A few of my older sites have been crawled deeply as well!
| 9:08 pm on Nov 3, 2004 (gmt 0)|
Here is a quick point and question:
One of our sites was getting hit hard and fast by G. It seemed to build throughout October. Then on 11/1 it all but stopped. On 11/1 G requested about 3% of what it did on 10/31. OK, now the embarrassing part: I made a small (really small) change to the index page on 10/31 that caused it not to validate. Could the code failing validation inhibit G and the other bots?
| 9:20 pm on Nov 3, 2004 (gmt 0)|
We have 250,000 pages on one of our sites but only see 6 or so pages per second at peak from GoogleBot. I'm thinking that our database is a limiting factor and that if I put in more RAM it could handle closer to the 200 pages per second reported here. Two questions: 1.) Do you think GoogleBot figures out how fast the machine is and backs off if it can't handle it? and 2.) If the machine were faster, would GoogleBot get more pages (Google only reports between 100,000 and 125,000 with site:domain.com for the site in question)?
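To put those two rates side by side, here is a rough back-of-the-envelope sketch. It assumes the bot fetches each of the 250,000 pages exactly once at a sustained rate, which is an idealization and not something the numbers above establish; `crawl_hours` is a hypothetical helper name.

```python
def crawl_hours(pages, pages_per_second):
    """Hours needed to fetch `pages` once at a constant rate."""
    return pages / pages_per_second / 3600

slow = crawl_hours(250_000, 6)    # about 11.6 hours at the observed peak
fast = crawl_hours(250_000, 200)  # about 0.35 hours, roughly 21 minutes
```

So even at 6 pages per second the whole site could in principle be fetched in about half a day, which suggests raw speed alone may not explain the gap between 250,000 pages and the 100,000 to 125,000 reported.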
| 9:23 pm on Nov 3, 2004 (gmt 0)|
Oh, one more question. Does anyone know if the speed of the machine is factored into PageRank?
| 9:52 pm on Nov 3, 2004 (gmt 0)|
66K pages this week.
Based on Googleguy's comment, we can call this a HARD CRAWL.
| 10:02 pm on Nov 3, 2004 (gmt 0)|
24k first two days of month on one site compared to 7k all last month....
| 11:29 pm on Nov 3, 2004 (gmt 0)|
There's been some interesting speculation over the last month on why GoogleBot has so much energy. No one seems to have caught on. It's not 'panic crawling' or somesuch nonsense. Google have simply upgraded their infrastructure thanks to a cash injection.
I run a search engine, and besides your various algos, the two most important things are the size of your index and its freshness. If I were Google, that's the first thing I'd throw money at if I had spare cash and was worried about Microsoft and Yahoo on my heels. I'd upgrade my crawler farm and add massive capacity to the servers that carry my indices. Then I'd crawl as hard and as deep as possible.
And I'd make sure I have a small team lurking on the boards checking if webmasters start squeaking about bandwidth and load - as above. I may even contact the owner of the board, and call in a favor to bump the 'Google hits' discussion to the home page. ;)
| 11:38 pm on Nov 3, 2004 (gmt 0)|
I would bet that Google is already doing everything you just mentioned and much much more.
| 11:51 pm on Nov 3, 2004 (gmt 0)|
Google basically froze my server for a while. I saw up to 50 scripts running simultaneously, all hit from the same Googlebot IP. In the meantime it spidered 10s of 1000s of pages, but only a couple of 100 URLs not previously spidered. So I'm not even gonna get new pages indexed for my hassle :(
| This 96 message thread spans 4 pages |