Google News Archive Forum

Gbot running hard
ncw164x




msg:167902
 9:04 am on Sep 23, 2004 (gmt 0)

Googlebot is requesting between 2 and 5 pages a second; I haven't seen this type of spidering for a long time.

 

Kirby




msg:168022
 5:01 am on Sep 29, 2004 (gmt 0)

>Geeze - well, I have a new site I launched 2.5 weeks ago that I submitted to Google last week and I haven't seen any signs of gBot on my door step yet.

Put a new site up 30 hours ago with one PR5 link to it. Gbot crawled it and the cache showed up within 12 hours. Then a new fresh tag appeared 8 hours after that.

willybfriendly




msg:168023
 5:03 am on Sep 29, 2004 (gmt 0)

It's Gbot on Viagra,... or Levitra,... or Cialis,... or Vardenafil,... or SuperViagra or ... anyway, you get the idea. Gbot has been running hard for a long time.

If that's true, does it mean that we are getting sc****d?

WBF

jnmconsulting




msg:168024
 5:24 am on Sep 29, 2004 (gmt 0)

Running hard? It's an orgy... 7 different Google bots have been spidering one of my sites since 4 pm.

cabbie




msg:168025
 7:02 am on Sep 29, 2004 (gmt 0)

>>>If that's true,...

:)

Powdork




msg:168026
 7:10 am on Sep 29, 2004 (gmt 0)

If that's true, does it mean that we are getting sc****d?
More than likely, most of us will think so.

No, we will not get a kiss afterwards, or even a BackRub.

Staffa




msg:168027
 7:19 am on Sep 29, 2004 (gmt 0)

One of the new 66.249.xx.xx numbers is on its 28th visit since yesterday.
Not to mention the countless visits from many other numbers in that range.

In that time period it asked for robots.txt 4 times and still crawls a directory that's off limits.

Whatever this 'panic crawling' is, they had better get their act together.
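A quick way to rule out a robots.txt syntax problem before blaming the bot - this is just a sketch with a made-up domain and directory, using Python's standard urllib.robotparser:

from urllib.robotparser import RobotFileParser

rp = RobotFileParser()
rp.set_url("http://www.example.com/robots.txt")  # hypothetical domain
rp.read()  # fetch and parse the live robots.txt

# False here means the Disallow rule really does cover the directory,
# so any crawling of it is the bot ignoring robots.txt, not a syntax slip.
print(rp.can_fetch("Googlebot", "http://www.example.com/private/page.html"))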

Josk




msg:168028
 9:25 am on Sep 29, 2004 (gmt 0)

I have a site that I've been trying to get into Google for ages. Right at the moment I'm not sure why it's being spidered, but I think it might be due to a link from somewhere else.

It's being spidered hard as I speak... :)

ruserious




msg:168029
 10:27 am on Sep 29, 2004 (gmt 0)

Here's something I don't understand:

Why are they hitting everybody that hard? With billions of websites in their index, you would assume they could easily spread the load over enough websites that a single site doesn't get hit so hard at one point in time.
If they spider a site with 1 request in 2 seconds, they'd still be able to get > 2.5 Million pages over 24 hours from a single site. Why request 5-6 pages per second from the same website? Are we at the point where there are more webcrawlers than webservers?

I really hope this won't become standard behaviour. Some sites have scripts that prevent people from downloading complete sites by serving 503s if too many requests come in during a short time period...
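As a rough illustration of the kind of throttling script I mean (the thresholds and the use of Python's http.server are purely an example):

import time
from collections import defaultdict, deque
from http.server import BaseHTTPRequestHandler, HTTPServer

WINDOW_SECONDS = 10
MAX_REQUESTS = 20          # sustained crawling faster than ~2 req/s gets throttled
hits = defaultdict(deque)  # client IP -> timestamps of its recent requests

class ThrottlingHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        ip = self.client_address[0]
        now = time.time()
        recent = hits[ip]
        while recent and now - recent[0] > WINDOW_SECONDS:
            recent.popleft()          # drop requests outside the window
        recent.append(now)
        if len(recent) > MAX_REQUESTS:
            self.send_response(503)   # too many requests in the window
            self.send_header("Retry-After", "30")
            self.end_headers()
            return
        self.send_response(200)
        self.send_header("Content-Type", "text/html")
        self.end_headers()
        self.wfile.write(b"<html><body>normal page content</body></html>")

if __name__ == "__main__":
    HTTPServer(("", 8080), ThrottlingHandler).serve_forever()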

GerBot




msg:168030
 10:34 am on Sep 29, 2004 (gmt 0)

Friendly tip:
watch your bandwidth limits closely - I just discovered a site of mine (large content/low visitors) had run out of bandwidth, with most of it being a result of the Gbot.

Vork




msg:168031
 1:07 pm on Sep 29, 2004 (gmt 0)

I bet hundreds of webmasters who are watching this post closely would do anything just to get a confirmation from GoogleGuy (those were the days...) that this might herald the end of the sandbox yoke :)
C'mon Googlebot, crawl deeper - I don't even mind exceeding my bandwidth for the month as long as this torture is over :)

Critter




msg:168032
 1:09 pm on Sep 29, 2004 (gmt 0)

If they spider a site with 1 request in 2 seconds, they'd still be able to get > 2.5 Million pages over 24 hours from a single site.

Huh?

There's 86,400 seconds in a day. One page every two seconds would be 43,200 pages in a day, not 2.5 million.

You're a little off. :)
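Spelled out as a quick check (nothing here beyond the numbers already in the posts):

seconds_per_day = 24 * 60 * 60         # 86,400
pages_per_day = seconds_per_day // 2   # one request every 2 seconds
print(pages_per_day)                   # 43200 - nowhere near 2.5 million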

petehall




msg:168033
 1:27 pm on Sep 29, 2004 (gmt 0)

If they spider a site with 1 request in 2 seconds, they'd still be able to get > 2.5 Million pages over 24 hours from a single site.

Huh?

There's 86,400 seconds in a day. One page every two seconds would be 43,200 pages in a day, not 2.5 million.

You're a little off. :)

I think the keyword here is site and not page?

He said spider a site with a single request every 2 seconds. If 43,200 sites could be spidered at an average of 57 pages per site, that would equal about 2.5 million pages.

Having said that, I myself am a little confused about the statement and its true meaning :-\

I presume an entire site cannot be indexed through a single request... :-)
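For what it's worth, that reading does roughly add up (the 57 pages per site is an assumed average, not a figure from anywhere):

sites = 43200
avg_pages_per_site = 57
print(sites * avg_pages_per_site)  # 2462400 - about 2.5 million pages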

Critter




msg:168034
 1:47 pm on Sep 29, 2004 (gmt 0)

Nope, he said you could get 2.5 million pages in a 24 hour period *from a single site*.

Ain't gonna happen at that rate.

ddogg




msg:168035
 4:58 pm on Sep 29, 2004 (gmt 0)

I haven't noticed an increase in Googlebot activity on my sites. Same as usual, bummer..

MrSpeed




msg:168036
 6:14 pm on Sep 29, 2004 (gmt 0)

To add to the comment about trying to spider pages that don't exist:

Googlebot is trying to spider pages that don't exist on the domain, but do exist on another domain on the same server, and thus the same IP, I think.

66.249.65.130 Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)

a104u2nv




msg:168037
 6:22 pm on Sep 29, 2004 (gmt 0)

First, I would like to say hi to everyone; this will be my first post even though I have been reading these boards for quite some time.

Googlebot blasted my site yesterday after not visiting the site since the 14th of Sept. We were starting to get a bit worried about it but I kept coming here and reading about others that were getting the same treatment (not visiting or visiting very little then out of the blue a very deep crawl).

Just wanted to chime in and more or less introduce myself to everyone and share what Googlebot did to our site today.

ruserious




msg:168038
 6:47 pm on Sep 29, 2004 (gmt 0)

Yes, Critter, you're absolutely right. I was off by a factor of 60. *embarrassed*

But wouldn't 40,000+ pages a day per site still be enough? They've been at it for a week (?), so that would have been enough to sufficiently spider even the larger sites.

idf03




msg:168039
 7:05 pm on Sep 29, 2004 (gmt 0)


Googlebot is trying to spider pages that don't exist on the domain but do exist on a domain on the same server and thus the same IP I think.

No, definitely requesting pages which exist on sites I link to on completely different domains and servers.

I can even see this happening between my sites.

Site 1 is getting requests for documents which only exist on Site 2. Sites 1 and 2 are on completely different servers, with different ISPs/locations/IP ranges.

What I can say is that the links are of the type: link.php?url=www.abcd.com
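For context, a link.php?url= style link normally just answers with an HTTP redirect to the url parameter; this is a generic sketch of that behaviour in Python's standard http.server, not anyone's actual script. If a script like this returned the target's HTML with a 200 instead of redirecting, a crawler could easily file site 2's documents under site 1:

from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import urlparse, parse_qs

class LinkHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        query = parse_qs(urlparse(self.path).query)
        target = query.get("url", [""])[0]
        if not target:
            self.send_error(400, "missing url parameter")
            return
        if not target.startswith(("http://", "https://")):
            target = "http://" + target   # e.g. link?url=www.abcd.com
        self.send_response(301)           # hand the crawler off to site 2
        self.send_header("Location", target)
        self.end_headers()

if __name__ == "__main__":
    HTTPServer(("", 8080), LinkHandler).serve_forever()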

sblake




msg:168040
 7:45 pm on Sep 29, 2004 (gmt 0)

SERPS dancing around pretty significantly in the areas I monitor right now-- different results from search to search.

idoc




msg:168041
 9:15 pm on Sep 29, 2004 (gmt 0)

"links are of the type: link.php?url=www.abcd.com"

provided that the bots were inappropriately attributing these dynamic redirects as belonging to site 1 and not site 2 where the content actually resides on site 2... then now the bots now have to determine *if* the page is on site 1 or not. I suspect the site 1 serp will be delisted and *hopefully* site 2 will now show the serp *and* get the appropriate p.r. transfer it is rightfully due from the incoming link from site 1.

idf03




msg:168042
 9:48 pm on Sep 29, 2004 (gmt 0)


provided that the bots were inappropriately attributing these dynamic redirects as belonging to site 1 and not site 2 where the content actually resides on site 2..

Just to confirm: the link from site 1 is to site 2's root only, not to any document on site 2, but Gbot is looking on site 1 for documents which exist on site 2 but are not linked directly.

If I had a link to WebmasterWorld, I would be getting requests for the control panel, site search, glossary etc.

g1smd




msg:168043
 10:32 pm on Sep 29, 2004 (gmt 0)

#52: >> Was still wondering though ... how long till the cache of the new web page, or newly spidered webpage shows up in the google index? <<

A page modified on Sept 8th, and cached daily, showed up in the index for new search terms (from the new content on the page) within 48 hours, but was still findable for search terms no longer on the page until only 3 days ago (even though the cache reflected the new content, the snippet still contained the old content when running a search for the old content).

On the day that it was no longer findable for the old content, Googlebot had shifted its spidering one hour earlier compared with its daily time of arrival for the previous 3 weeks or more. Additionally, a fresh date was included on the day of the change (even though the fresh content had been online for 3 weeks, and had been indexed and cached daily for 3 weeks). Until then, the new content result had not included a fresh date.

cabbie




msg:168044
 1:39 am on Sep 30, 2004 (gmt 0)

Welcome, a104u2nv, and thanks for your contribution. :)

darqSHADOW




msg:168045
 3:46 am on Sep 30, 2004 (gmt 0)

GoogleBot has been tearing my site up, as well.

Over 10k pages cached, and 320MB of bandwidth used in 4 or 5 days straight of constant crawling. My entire website uses phpBB2 as a backend, and the forums reported this as of yesterday:

Most users ever online was 238 on 27 Sep 2004 06:18 pm

So it's been going at me quite hard, it seems. Hopefully the new linking of the forums has caused the crawler to get my whole site, instead of the little 35 pages it crawled before.

DS

webdude




msg:168046
 11:48 am on Sep 30, 2004 (gmt 0)

provided that the bots were inappropriately attributing these dynamic redirects as belonging to site 1 and not site 2 where the content actually resides on site 2...

This is the crux of what is happening, I believe.

So it's been going at me quite hard, it seems. Hopefully the new linking of the forums has caused the crawler to get my whole site, instead of the little 35 pages it crawled before.

I have the exact same scenario. I have a forum on one of my sites that has never been fully crawled. Recently every page in the forum was crawled. In the past, a few pages of the forum would actually make it into the SERPs, but only after a month or 2. I check the SERPs now and it seems that about 30% of the forum pages are listed. That is way up from last month. Also, some of these pages are very recent.

WebFusion




msg:168047
 5:38 pm on Sep 30, 2004 (gmt 0)

Personally, I think they may have finally solved the space limitations of their old system, and are doing a massive re-crawl of the entire web to recalculate the whole thing based on a new algo.

Stand by for the holiday cheers and jeers... if history is any indication, Google's about to shake things up in a big way ;-)

g1smd




msg:168048
 7:32 pm on Sep 30, 2004 (gmt 0)

It will be interesting to see if the old message changes anytime soon:

>> 2004 Google - Searching 4,285,199,774 web pages

darqSHADOW




msg:168049
 7:45 pm on Sep 30, 2004 (gmt 0)

Well, new pages have been added to the index from my site now. For the past few months I've only had 35 (sometimes 34) pages indexed by GoogleBot. As of today I now have 552 pages indexed, most of them from my forums (which should help our rank, since my forums are full of my primary keywords).

My site has also moved up in many of our targeted results, which is nice. Hopefully I can track this change tonight and determine exactly where I moved up, and if any results moved down, etc.

DS

jnmconsulting




msg:168050
 7:47 pm on Sep 30, 2004 (gmt 0)

Interesting - about 99 million pages have been dropped out of the index by Google since Friday of last week. I may be wrong... This is using the search term "+the" without quotes; it was 5,809,000,000 and now it's 5,710,000,000.

kashyap rajput




msg:168051
 7:37 am on Oct 1, 2004 (gmt 0)

I have noticed some deep crawling by Googlebot with some new spider. Msnbot is also crawling my website.

kashyap

cdog863




msg:168052
 9:14 pm on Oct 1, 2004 (gmt 0)

I have been getting hit hard by the Googlebot too.

Recently (it's a new site) they have sucked up over 250 pages (even my forum).

The only other thing I'm wondering now is when am I gonna get a PR... jeesh
