Google News Archive Forum

This 58 message thread spans 2 pages: page 1 of 2.
Feb Crawl Has Started
The 216's have started
peterdaly

10+ Year Member



 
Msg#: 9090 posted 1:36 am on Feb 6, 2003 (gmt 0)

While discussion has started in some other misc threads, the February crawl is underway. Here is an organized place to discuss it more.

I have one site with googlebot requests from 216.239.46.*

-Pete

 

jdMorgan

WebmasterWorld Senior Member jdmorgan is a WebmasterWorld Top Contributor of All Time 10+ Year Member



 
Msg#: 9090 posted 1:43 am on Feb 6, 2003 (gmt 0)

Yes,

Deepcrawler from 216.239.46.* on two of my sites here, too. For reference, both are currently PR5.

Jim

Krapulator

WebmasterWorld Senior Member 10+ Year Member



 
Msg#: 9090 posted 1:45 am on Feb 6, 2003 (gmt 0)

Yeah, I've just seen the little blighter! Grabbed robots.txt and a few pages from my root directory.

jimh009

10+ Year Member



 
Msg#: 9090 posted 2:10 am on Feb 6, 2003 (gmt 0)

Good! I was getting worried that I might be missed. Normally, the deep crawl on my site begins right on the first of the month.

Hopefully I'll see the bugger in my logs tomorrow.

Jesse_Smith

WebmasterWorld Senior Member 10+ Year Member



 
Msg#: 9090 posted 2:34 am on Feb 6, 2003 (gmt 0)

Does it start out slow? So far I've only seen one listing in one of my logs (two year old site), out of 10 domains (nine two month old sites). How long does the deep crawl last?

Bio4ce

10+ Year Member



 
Msg#: 9090 posted 3:48 am on Feb 6, 2003 (gmt 0)

Could it be a proxy? Just got hit with 216* and 64*.

amznVibe

WebmasterWorld Senior Member 10+ Year Member



 
Msg#: 9090 posted 3:53 am on Feb 6, 2003 (gmt 0)

crawl4.googlebot.com (216.239.46.104) is on one of my sites right now

interesting thing is that the site was down for 30 minutes just before the bot showed up... hope that doesn't hurt anything...

it also didn't ask for robots.txt, I guess it's using a cached version from freshbot, which was on the site 12 hours earlier?

troels nybo nielsen

WebmasterWorld Senior Member 10+ Year Member



 
Msg#: 9090 posted 4:11 am on Feb 6, 2003 (gmt 0)

Me too, PR5 sites.

Welcome to WebmasterWorld, Jesse. On my sites the deep crawl usually starts with rather few hits the first day and then comes back and finishes the job the next day. I have heard about websites where it may take longer than that.

eljefe3

WebmasterWorld Administrator 10+ Year Member



 
Msg#: 9090 posted 4:31 am on Feb 6, 2003 (gmt 0)

The only thing of interest that I have to add is that my flash file was grabbed for the first time. Normally it's just the pages themselves that get picked up.

Stefan

WebmasterWorld Senior Member 10+ Year Member



 
Msg#: 9090 posted 4:34 am on Feb 6, 2003 (gmt 0)

Yep, it's underway here too... not many hits yet, but the freshbot is grabbing a lot right now as well. Googlebots everywhere...

johnraphone

10+ Year Member



 
Msg#: 9090 posted 4:37 am on Feb 6, 2003 (gmt 0)

crawl7.googlebot.com (216 domain) is crawling my PR 6 web site.

GeorgeGG

10+ Year Member



 
Msg#: 9090 posted 4:49 am on Feb 6, 2003 (gmt 0)

216.239.46.*, 1 personal web site, 3 PR6 pages and 1 PR5 page.

GeorgeGG

[edited by: GeorgeGG at 5:40 am (utc) on Feb. 6, 2003]

quotations

10+ Year Member



 
Msg#: 9090 posted 5:35 am on Feb 6, 2003 (gmt 0)

Two of my very different PR6 sites got hit about an hour apart.

Host: 216.239.46.19 Url: /

Only the top page was taken and there was no interest in robots.txt here either.

Stefan

WebmasterWorld Senior Member 10+ Year Member



 
Msg#: 9090 posted 2:12 pm on Feb 6, 2003 (gmt 0)

The bots were having a party at my site overnight. Freshie took 90% of the pages the first while, with a few hits from 216.239, then 216.239 took over and has been staying very busy. Ink dropped by for a few rounds to check out the action and is still lurking about, and a bizarro-bot that disguises itself and doesn't look for a robots.txt crawled the entire place.
Welcome back deepbot, freshie will show you where the bar is. Help yourself to the pretzels.

[edited by: Stefan at 3:35 pm (utc) on Feb. 6, 2003]

uber_boy

10+ Year Member



 
Msg#: 9090 posted 2:27 pm on Feb 6, 2003 (gmt 0)

My site's a PR7 that usually has about 50-150k pages read in during the deep crawl. Several of the deep crawl bots finally started visiting last night, albeit at a rather slow pace. Whereas they usually gobble up pages at a rate measured in thousands/hour, for the past 12 hours it's been a figure in the hundreds/hour. As noted above, the pattern is for the deep crawl to start off slow and then ramp up but, if memory serves me correctly, this is a little slower than usual.
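For anyone who wants to put a number on "hundreds/hour" versus "thousands/hour", here is a minimal sketch that buckets request timestamps by clock hour. It assumes Apache-style timestamps like the ones quoted later in this thread. The function name `hits_per_hour` and the sample timestamps are made up for illustration, not taken from anyone's real logs.

```python
from collections import Counter
from datetime import datetime

def hits_per_hour(timestamps):
    """Count requests per clock hour from Apache-style timestamps."""
    buckets = Counter()
    for ts in timestamps:
        # Example input: "05/Feb/2003:18:50:39 -0800"
        when = datetime.strptime(ts, "%d/%b/%Y:%H:%M:%S %z")
        buckets[when.strftime("%Y-%m-%d %H:00")] += 1
    return buckets

# Hypothetical timestamps, for illustration only.
sample = [
    "05/Feb/2003:18:50:39 -0800",
    "05/Feb/2003:18:58:02 -0800",
    "05/Feb/2003:19:03:11 -0800",
]
rates = hits_per_hour(sample)
for hour, count in sorted(rates.items()):
    print(hour, count)
```

Feeding it only the timestamps of Googlebot requests would give an hourly crawl rate that can be compared from one month's crawl to the next.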

coolshop

10+ Year Member



 
Msg#: 9090 posted 2:31 pm on Feb 6, 2003 (gmt 0)

uber_boy, I would agree with your observation. This deep crawl seems to not be picking up steam like it usually does -- yet.

amznVibe

WebmasterWorld Senior Member 10+ Year Member



 
Msg#: 9090 posted 2:55 pm on Feb 6, 2003 (gmt 0)

perhaps they are testing some changes and seeing how the algorithms react by doing light reads only for now

taxpod

10+ Year Member



 
Msg#: 9090 posted 2:57 pm on Feb 6, 2003 (gmt 0)

Usually it starts out slow for me, then finds the right pace, then slows down again near the end. But it does seem pretty darn slow right now.

Albaba

10+ Year Member



 
Msg#: 9090 posted 3:00 pm on Feb 6, 2003 (gmt 0)

Yeahh...googlebot slow... :(
Now I doubt that she will crawl all my new site's pages (150k pages)... hope Google staff give more power to her :)

peterdaly

10+ Year Member



 
Msg#: 9090 posted 6:14 pm on Feb 6, 2003 (gmt 0)

I had 89 pages deep crawled yesterday (2/5) between 7:25PM US Eastern Time and 8:27PM.

Requested one page so far today (2/6) at 8:39am. Not a sign since.

This is a PR5 site with currently over 50k pages in the index. This makes me nervous.

-Pete

EquityMind



 
Msg#: 9090 posted 6:16 pm on Feb 6, 2003 (gmt 0)

Little bit early this month.......

amznVibe

WebmasterWorld Senior Member 10+ Year Member



 
Msg#: 9090 posted 8:14 pm on Feb 6, 2003 (gmt 0)

I'm getting really crazy "touch-and-go" action from the 216.239.* range. Seems like they are using a different crawler to read different pages on one site, sporadically all day... What's the chance that Google has switched the functions of the 216.239 and 64.68 ranges? Either that, or it's some kind of netiquette where they are spreading the bandwidth load out over the day?
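To see which range is doing what, one option is to tag each requesting IP by the range it falls in. This is only a rough sketch. The /16 networks are a literal reading of "216.239.*" and "64.68" as written in this thread, not published Google ranges. The `classify` helper and the sample addresses (other than 216.239.46.104, which was reported above) are hypothetical.

```python
import ipaddress

# Assumption: these networks mirror the ranges as posters wrote them here.
RANGES = {
    "216.239.*": ipaddress.ip_network("216.239.0.0/16"),
    "64.68.*": ipaddress.ip_network("64.68.0.0/16"),
}

def classify(ip):
    """Return the label of the first range containing ip, or 'other'."""
    addr = ipaddress.ip_address(ip)
    for label, network in RANGES.items():
        if addr in network:
            return label
    return "other"

# 216.239.46.104 was reported earlier in the thread; the other two are made up.
for ip in ["216.239.46.104", "64.68.82.1", "10.0.0.1"]:
    print(ip, classify(ip))
```

Counting the labels per hour would show whether the two ranges really have swapped roles or are just overlapping.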

SubZeroGTS

10+ Year Member



 
Msg#: 9090 posted 8:43 pm on Feb 6, 2003 (gmt 0)

if google got no response from my server earlier today, is it guaranteed to try at least once more sometime?

taxpod

10+ Year Member



 
Msg#: 9090 posted 8:47 pm on Feb 6, 2003 (gmt 0)

SubZeroGTS,

I would think so. Gbot is persistent. Give it time.

WindSun

10+ Year Member



 
Msg#: 9090 posted 10:05 pm on Feb 6, 2003 (gmt 0)

Hmm... Got a bunch of picture crawls, only a few page crawls so far.

Craig_F

10+ Year Member



 
Msg#: 9090 posted 11:31 pm on Feb 6, 2003 (gmt 0)

So, I'm not too late? I usually don't pay attention to the full crawl, but I had 20 pages mostly done, so I uploaded them. I'll tidy up over the next few days.

Stefan

WebmasterWorld Senior Member 10+ Year Member



 
Msg#: 9090 posted 12:48 am on Feb 7, 2003 (gmt 0)

My site has only about 60 pages, but it's a PR6, and deepbot has seen them all since 02:00 UTC, Feb 06. When I saw it starting, I put up some more pages fast and then it picked them up a few hours later. Freshbot was working away as well and I have Feb 5 tags showing for a lot of pages. Google rocks.

<added>Webmasterworld rocks too.</added>

Jesse_Smith

WebmasterWorld Senior Member 10+ Year Member



 
Msg#: 9090 posted 1:43 am on Feb 7, 2003 (gmt 0)

About how many times does it show up in logs? So far I've only seen this show up once on each of my domains.

216.239.46.204 - - [05/Feb/2003:18:50:39 -0800] "GET /robots.txt HTTP/1.0" 404 645 "-" "Googlebot/2.1 (+http://www.googlebot.com/bot.html)"
216.239.46.204 - - [05/Feb/2003:18:50:39 -0800] "GET / HTTP/1.0" 200 65291 "-" "Googlebot/2.1 (+http://www.googlebot.com/bot.html)"

About how long does it spend doing deep crawling?
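A quick way to answer the "how many times" question for your own logs is to count the matching lines. The sketch below assumes the Apache combined log format, which is what the two lines above look like. The regular expression and the `googlebot_hits` helper are illustrative, and matching on the user agent string alone does not prove that a request really came from Google.

```python
import re
from collections import Counter

# Apache "combined" log format, as in the two lines quoted above.
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) (?P<proto>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)

def googlebot_hits(lines):
    """Count requested paths on lines whose user agent mentions Googlebot."""
    hits = Counter()
    for line in lines:
        match = LOG_PATTERN.match(line)
        if match and "Googlebot" in match.group("agent"):
            hits[match.group("path")] += 1
    return hits

# The two log lines posted above.
sample = [
    '216.239.46.204 - - [05/Feb/2003:18:50:39 -0800] "GET /robots.txt HTTP/1.0" 404 645 "-" "Googlebot/2.1 (+http://www.googlebot.com/bot.html)"',
    '216.239.46.204 - - [05/Feb/2003:18:50:39 -0800] "GET / HTTP/1.0" 200 65291 "-" "Googlebot/2.1 (+http://www.googlebot.com/bot.html)"',
]
hits = googlebot_hits(sample)
print(sum(hits.values()), "Googlebot requests:", dict(hits))
```

To run it against a real log, pass an open file object instead of `sample`; the path to that file depends on your server setup.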

SubZeroGTS

10+ Year Member



 
Msg#: 9090 posted 2:37 am on Feb 7, 2003 (gmt 0)

GOOGLEBOT CAME BACK! AHAHHAHAFGHAHSdjasoidisdjs;'lk'dfh

thank you God.

Stefan

WebmasterWorld Senior Member 10+ Year Member



 
Msg#: 9090 posted 2:52 am on Feb 7, 2003 (gmt 0)

In early Jan it went for about 10 days compared to other crawls that were only 3 days. Who knows how long it will go this time. It's early days, and it found you, so it will probably be back to get everything else.

It will show a log entry for every page linked by the time it's done. It might crawl through a few times.


All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved