
Google News Archive Forum

    
Is Googlebot Deep Crawling?
I've not seen much activity yet
Kerrin




msg:104659
 5:25 pm on Oct 1, 2002 (gmt 0)

Anyone seeing any deep crawling from Googlebot? I want to make some major changes to my website but, just to be safe, I don't want to do it if the deep crawl has already started.

 

warmasol




msg:104660
 5:44 pm on Oct 1, 2002 (gmt 0)

No, I've seen the "fresh" crawler every day, but no "deep" crawler. I think it's running a little late.

Grumpus




msg:104661
 6:08 pm on Oct 1, 2002 (gmt 0)

Nope. Just the FreshBot here, too.

G.

MrLucky




msg:104662
 6:13 pm on Oct 1, 2002 (gmt 0)

Same here.

Kerrin




msg:104663
 6:25 pm on Oct 1, 2002 (gmt 0)

Thanks guys, I'd better get busy ;)

lipskin




msg:104664
 6:42 pm on Oct 1, 2002 (gmt 0)

Googlebot finally started deep crawling one of my PR5 sites about an hour ago, but it's the only site to get deep crawled so far.

warmasol




msg:104665
 1:37 pm on Oct 2, 2002 (gmt 0)

Now I see the "deep" crawler (216.239.46.*) on my sites. I think it's starting its monthly crawl.

Slade




msg:104666
 2:22 pm on Oct 2, 2002 (gmt 0)

Is there any consistency to these references of "FreshBot" and "DeepCrawler"?

Specifically, IP ranges?

warmasol




msg:104667
 2:29 pm on Oct 2, 2002 (gmt 0)

FreshBot:
64.68.82.*

DeepCrawler:
216.239.46.*
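
A minimal sketch of how those two ranges could be used to label hits in an access log. The prefixes are the ones quoted in this thread and are not an official list; the function name `classify_googlebot_ip` and the label strings are illustrative assumptions.

```python
# Prefixes as reported in this thread (Oct 2002); not an official list.
FRESHBOT_PREFIX = "64.68.82."
DEEPCRAWLER_PREFIX = "216.239.46."


def classify_googlebot_ip(ip: str) -> str:
    """Label an IP as 'freshbot', 'deepcrawler', or 'other' by its prefix."""
    if ip.startswith(FRESHBOT_PREFIX):
        return "freshbot"
    if ip.startswith(DEEPCRAWLER_PREFIX):
        return "deepcrawler"
    return "other"
```

Matching on the full three-octet prefix (including the trailing dot) avoids accidentally matching a neighbouring range such as 64.68.820.*-style typos or 216.239.4.*.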

Rugles




msg:104668
 2:33 pm on Oct 2, 2002 (gmt 0)

Deep crawling here for the last 12 hours.

savvy1




msg:104669
 2:33 pm on Oct 2, 2002 (gmt 0)

Seeing activity from both, though it seems to be more from the fresh bots.

MrLucky




msg:104670
 8:52 pm on Oct 2, 2002 (gmt 0)

Activity has picked up over here -- looking like a deep crawl.

Any difference in useragent between the freshbot and deepcrawler? Or only by IP?

jdMorgan




msg:104671
 9:46 pm on Oct 2, 2002 (gmt 0)

Deepcrawling started 16:40:54 (-0400) on one of mine - index.html is PR5.

I think the fresh crawler reverse resolves to "crawlernn.googlebot.com" and the deepcrawler to "crawlnn.googlebot.com", but both UAs are the same.

Jim
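
If the hostname theory above holds, a check along these lines could tell the two apart by reverse DNS rather than by IP. This is only a sketch: the "crawlernn" and "crawlnn" patterns come from the guess in this post, reading "nn" as one or more digits is an assumption, and the function names are hypothetical.

```python
import re
import socket

# Patterns follow the guess above: "crawlerNN" vs "crawlNN" under googlebot.com.
# Treating "nn" as one or more digits is an assumption.
FRESH_RE = re.compile(r"^crawler\d+\.googlebot\.com\.?$", re.IGNORECASE)
DEEP_RE = re.compile(r"^crawl\d+\.googlebot\.com\.?$", re.IGNORECASE)


def classify_hostname(hostname: str) -> str:
    """Label a reverse-DNS hostname as 'freshbot', 'deepcrawler', or 'other'."""
    if FRESH_RE.match(hostname):
        return "freshbot"
    if DEEP_RE.match(hostname):
        return "deepcrawler"
    return "other"


def classify_ip_by_reverse_dns(ip: str) -> str:
    """Reverse-resolve an IP and classify the resulting hostname."""
    try:
        hostname = socket.gethostbyaddr(ip)[0]
    except OSError:
        # No PTR record, or the lookup failed.
        return "unresolved"
    return classify_hostname(hostname)
```

Since both crawlers are said to send the same user agent, the hostname (or the IP range) is the only signal this sketch relies on.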

KakenBetaal




msg:104672
 9:16 am on Oct 3, 2002 (gmt 0)

Started on my PR5 site yesterday morning. jdMorgan, I'd tend to agree on the crawl vs crawler theory. Back in Marchish, crawler used to be the imagebot crawler, IIRC.

dukeblue219




msg:104673
 10:58 am on Oct 3, 2002 (gmt 0)

She just arrived on scene at 5 AM this morning, and proceeded to crawl about 200 pages in half an hour. Sounds like a good deep crawl to me.

Mr_Tickle




msg:104674
 12:23 pm on Oct 3, 2002 (gmt 0)

Deep crawling my site right now.

IP address is 64.68.82.*

Made my site changes just in time :)

Grumpus




msg:104675
 12:27 pm on Oct 3, 2002 (gmt 0)

Mr_Tickle - that's the Freshbot you're seeing. Check Google in about 24 hours (or less) and you'll see those pages in the index, most likely.

G.

Mr_Tickle




msg:104676
 5:10 pm on Oct 3, 2002 (gmt 0)

Cool, that means Freshbot is deepcrawling my site now :)

KakenBetaal




msg:104677
 10:23 am on Oct 4, 2002 (gmt 0)

Could be, Grumpus, but to me it looks like both ranges are being used for deep crawling:

From 64.68.x.y: 2273 pages requested

From 216.239.x.y: 4608 pages requested

They are fairly evenly mixed up throughout my logfile. I'm not so sure now about what I said in a previous message in this thread.

I only have a meagre little PR5 site with about 250 flat html pages and a phpBB forum, so getting so many pages requested by the fresh crawl seems unlikely.
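
A per-range tally like the one above could be produced with something along these lines. It is a sketch under assumptions: the log is taken to be in a Common Log Format-style layout with the client IP as the first whitespace-separated field and the user agent somewhere on the same line, and the function name and the "a.b.x.y" grouping are illustrative choices.

```python
from collections import Counter
from typing import Iterable


def count_requests_by_range(lines: Iterable[str], ua_substring: str = "Googlebot") -> Counter:
    """Count requests per first-two-octet range, e.g. '64.68.x.y'.

    Only lines containing ua_substring are counted.
    """
    counts: Counter = Counter()
    for line in lines:
        if ua_substring not in line:
            continue  # not a request from the crawler we care about
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        octets = fields[0].split(".")
        if len(octets) != 4:
            continue  # first field is not a dotted IPv4 address
        counts[f"{octets[0]}.{octets[1]}.x.y"] += 1
    return counts
```

Usage would be roughly `count_requests_by_range(open("access.log"))`, where `access.log` is a placeholder path.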

promis




msg:104678
 9:18 am on Oct 5, 2002 (gmt 0)

64.68.82.* was deep crawling my site yesterday, October 4th.
Have not seen the pages crawled yet on any of www, www2 or www3.
Any idea how long it takes for them to appear?

Slade




msg:104679
 7:02 pm on Oct 5, 2002 (gmt 0)

I have a PR 1 or 2 personal site (pretty much non-existent backlinks) that got deepcrawled starting at 5:30 on Oct 4. Still getting hits actually, even though there are only about 2 dozen pages.

martin




msg:104680
 9:25 pm on Oct 5, 2002 (gmt 0)

Started on 2nd of October, a little re-fresh on 4th, and back to crawl on the 5th.

vitaplease




msg:104681
 8:52 am on Oct 7, 2002 (gmt 0)

Just checked my log-stats,

Is it me, or was this deep-crawl one of the most intense in months?

Visit Thailand




msg:104682
 9:51 am on Oct 7, 2002 (gmt 0)

Just curious, but do you think it could damage anything if you are constantly uploading pages, old and new, to the site while Googlebot is sniffing around?

Grumpus




msg:104683
 11:40 am on Oct 7, 2002 (gmt 0)

(From higher up the thread) Cool that the freshbot looks to be going deep on some sites. That could mean a much deeper crawl this month. :)

Uploading pages during the crawl won't hurt anything. If you change a page after it's been crawled, though, Googlebot likely won't find any new links on it and the cached version after the next update won't reflect the newer content. Other than that, I wouldn't worry. If you've got something new to say, say it!

G.

thunderpaste




msg:104684
 5:15 pm on Oct 8, 2002 (gmt 0)

I know this is a bit off topic, but what log analysis software do you guys prefer for seeing what pages googlebot is crawling?

Sasquatch




msg:104685
 5:48 pm on Oct 8, 2002 (gmt 0)

I use grep to see everything Google is doing.

For my real content pages (the ones that I want crawled) I have PHP code that logs the info I want to a MySQL db when I get hit with a UA of a SE spider. Then I have a page to do some manipulation of this info on my admin page. It works really slick.
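
For anyone wanting to try the same idea, here is a rough sketch in Python rather than PHP, with SQLite standing in for MySQL so it runs without a database server. It is not the code described above: the table layout, the function names, and the list of spider markers are all assumptions made for illustration.

```python
import sqlite3
from datetime import datetime, timezone

# Substrings treated as search-engine spiders; an illustrative list only.
SPIDER_MARKERS = ("googlebot",)


def is_spider(user_agent: str) -> bool:
    """Return True if the user agent contains a known spider marker."""
    ua = user_agent.lower()
    return any(marker in ua for marker in SPIDER_MARKERS)


def init_db(conn: sqlite3.Connection) -> None:
    """Create the (hypothetical) spider_hits table if it does not exist."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS spider_hits ("
        "ts TEXT, ip TEXT, user_agent TEXT, path TEXT)"
    )


def log_spider_hit(conn: sqlite3.Connection, ip: str, user_agent: str, path: str) -> bool:
    """Insert a row if the UA looks like a spider; return whether it was logged."""
    if not is_spider(user_agent):
        return False
    conn.execute(
        "INSERT INTO spider_hits (ts, ip, user_agent, path) VALUES (?, ?, ?, ?)",
        (datetime.now(timezone.utc).isoformat(), ip, user_agent, path),
    )
    conn.commit()
    return True
```

An admin page could then run queries such as `SELECT path, COUNT(*) FROM spider_hits GROUP BY path` to see which pages were crawled and how often.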

crobb305




msg:104686
 4:53 am on Oct 9, 2002 (gmt 0)

Are you guys referring to the deep crawl for the October update? I am not sure how far in advance the deep crawls occur.

Also, what ip(s) should I be looking for in my logs?

Thanks :)

gilli




msg:104687
 8:15 am on Oct 9, 2002 (gmt 0)

Try looking here:
[searchengineworld.com...]. I don't know how up to date this is. Also try searching these forums for "google ip addresses" or similar.

WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved