Forum Moderators: Robert Charlton & goodroi
Today gbot has pulled 4,700 pages (so far), requesting 3 pages a second.
I verified it's a real bot. Is this normal? My server can easily handle the load, but it just seems a little frightening and exciting at the same time.
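For anyone wondering how to do that check: a genuine Googlebot IP reverse-resolves to a hostname under googlebot.com or google.com, and that hostname forward-resolves back to the same IP. A minimal sketch in Python (the function names are mine, and the IP you pass in would come from your own access logs):

```python
import socket

# Domains Google's crawlers resolve under
GOOGLE_SUFFIXES = (".googlebot.com", ".google.com")


def hostname_is_google(hostname: str) -> bool:
    """True if the reverse-DNS name belongs to a Google crawler domain."""
    return hostname.rstrip(".").endswith(GOOGLE_SUFFIXES)


def verify_googlebot(ip: str) -> bool:
    """Reverse-resolve the IP, check the domain, then forward-confirm."""
    try:
        hostname, _, _ = socket.gethostbyaddr(ip)  # reverse lookup
    except socket.herror:
        return False  # no PTR record at all
    if not hostname_is_google(hostname):
        return False  # PTR points somewhere other than Google
    try:
        # forward lookup must come back to the same IP, or the PTR is spoofed
        return socket.gethostbyname(hostname) == ip
    except socket.gaierror:
        return False
```

The suffix check alone isn't enough - anyone can name a box "googlebot.com.evil.example" - which is why the forward confirmation matters.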
I emailed Google and told them to behave or be banned.
They emailed back an apology, and all has been well since.
Their bots can be as buggy as their SERPs. If you don't tell them about the problem, they may not notice it themselves.
[webmasterworld.com...]
Maybe they are trying to build an even bigger database? I never thought I would see them digging that deep on CGI-generated pages.
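For what it's worth, if you'd rather not have any bot that deep in your CGI output, robots.txt is the standard fence. A minimal sketch, assuming your scripts live under /cgi-bin/ (adjust the path to your setup):

```
# Keep all crawlers out of dynamically generated pages
User-agent: *
Disallow: /cgi-bin/
```

Well-behaved bots (Googlebot included) honor this, though it only stops crawling, not bad behavior from rogue spiders.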
I've got a couple of pages that went URL-only on a site. One was definitely something weird on their end - it was a duplicate listing of the same URL/page in the SERPs, repeated twice, so one was normal and the other was URL-only. I suspect that's a side effect of something they're doing "differently" in some way.
The other page is my fault for sure - I accidentally put the same text on two pages, widgets-2.html and widgets-3.html, instead of the final copy of the pages. All they did was have the widgets-2.html page go URL-only; the other is fine, as is the rest of the site. Rather, it will be fine if it ever gets out of the "sandbox."
My gut feeling, which I can't shake and can't figure out, is that they're doing something different in the way they're crawling sites.
Could also be a panic crawl to sort out a problem.
I have a site where a site:www.example.com search seems completely broken - loads of URL-only listings, it only returns 600 results even with &filter=0, and it doesn't return the home page or any top-level pages even though they are indexed and ranking well in other Google searches.
they're doing something different in the way they're crawling sites.
I know exactly what you mean Marcia - I felt the same way about it. Couldn't spot any specific pattern though.
Actually, it looks like old-skool deepbot of about 2 years ago, but with some tweaks. I wonder if they dug some old code out?
Back to a monthly re-index maybe?
TJ
With the index doubling in size recently, with Yahoo's crawls as Mr Speed said, and now with the MSN and Ask Jeeves page counts I'm seeing in the server stats, Google needs to do this.
On subdomains the crawl is even crazier: multiple visits per day.
On a client site Googlebot ate up 13k pages.
Sweeeeeeeeeeettttt
Still has some way to go to beat Yahoo, which is all day, every day. In fact, I just started to put up some new stuff and Yahoo was taking it before I had even finished tweaking the pages. I'm frankly amazed it found it; it's almost like it spiders in real time - as soon as it finds a link, off it goes, rather than collecting a list of links it intends to crawl later.
Googlebot is crawling like it did in November - just before the index size increased to 8 billion.