
Google News Archive Forum

This 47 message thread spans 2 pages.
Bot out walking

 12:00 pm on Nov 4, 2002 (gmt 0)

Well, despite the false starts and confusion with the fresh bot, GoogleBot is out in no uncertain terms today.



 12:34 pm on Nov 4, 2002 (gmt 0)

Hi Brett

I take it that this is the main bot not the fresh bot? NET: Google Inc. RDNS: crawl5.googlebot.com


 12:48 pm on Nov 4, 2002 (gmt 0)

Correct, clickclick. That's the right one for the deep crawl. Actually, the '121' can be just about anything - they own the whole block (or close to it - I forget offhand and suck at keeping good notes).

Confirmed, though. She's walking.



 1:21 pm on Nov 4, 2002 (gmt 0)

Yes I too can report some deep crawling from the 216 IP range. It has been a longer wait this time.


 1:49 pm on Nov 4, 2002 (gmt 0)

How can I tell the difference between the fresh bot and the deep crawling bot?


 2:14 pm on Nov 4, 2002 (gmt 0)

Just wanted to confirm that I've spotted hits from:


 2:59 pm on Nov 4, 2002 (gmt 0)

I, too, have finally been visited by Googlebot this morning! Although she only grabbed my robots.txt and index page, I'm still happier than ever that she visited me at all. Is it possible she will visit again during this crawl? Since I just went live on 10/22, and this was her first visit, am I safe in assuming that I will most likely be included in the next update? I do have a few quality PR sites linking to me. And am I also safe in assuming that it should be somewhere around the end of this month? Just want to double check my conclusions, based on previous posts, so I'm not totally off track.

Sorry for all the questions, but as I'm sure you all can understand, I am new to Google and VERY excited to have seen her :)



 3:21 pm on Nov 4, 2002 (gmt 0)

Yep, me too.

Got all my pages; including the new ones in the subdir I was worried about. :)


 3:40 pm on Nov 4, 2002 (gmt 0)

Congrats! Yes, assuming you were deep crawled, at the end of the month you should magically appear.


 4:03 pm on Nov 4, 2002 (gmt 0)

Just confirming that I'm seeing lots of 216s.


 4:32 pm on Nov 4, 2002 (gmt 0)

Hi guys,
Can anyone tell me which is the best way I can tell if googlebot has visited my site? Is there a free stats tool that I can use for analyzing to see when she visits?



 4:39 pm on Nov 4, 2002 (gmt 0)

Definitely pulling down lots of pages. I've seen these fellows:

just to name the first couple. So I'd guess Brett is right about the entire block.

The thing I'm curious about is why so many IPs. Is this just the result of inbound links? What I mean is that if I go through my logs, I get a different IP for Googlebot on just about every line, whereas with the freshbot I usually get just one IP throughout the day.
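For anyone who wants to check the "different IP on every line" observation against their own logs, here is a rough Python sketch. It assumes the Apache combined log format (client IP first, date in square brackets), which may not match your server, and the function name is made up for illustration.

```python
import re
from collections import defaultdict

# Assumption: Apache combined log format, e.g.
# 192.0.2.1 - - [04/Nov/2002:12:00:01 +0000] "GET / HTTP/1.0" 200 512 "-" "Googlebot/2.1 ..."
LOG_LINE = re.compile(r'^(\S+) \S+ \S+ \[(\d{2}/\w{3}/\d{4}):')

def googlebot_ips_per_day(lines):
    """Map each day to the set of client IPs whose log line mentions Googlebot."""
    ips_by_day = defaultdict(set)
    for line in lines:
        if "googlebot" not in line.lower():
            continue  # not a Googlebot hit
        match = LOG_LINE.match(line)
        if match:
            ip, day = match.groups()
            ips_by_day[day].add(ip)
    return dict(ips_by_day)
```

A day with many distinct IPs would fit the deep crawl pattern described above, and a day with a single IP would fit the freshbot pattern, going only by what people report in this thread.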


 6:44 pm on Nov 4, 2002 (gmt 0)


You can probably find what you are looking for in the “tracking and logging” forum:


Just browse through the titles until you see some relevant ones.

However, I can never seem to get the free ones I use (analog) to do exactly what I want so I often download my log files and write little programs of my own to pull out the information I want.

To track the Googlebot, I just extract every log entry with the word “googlebot” in it.
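The "extract every log entry" approach can be as small as the sketch below. It is only an illustration in Python, not the poster's own program; the function name and the `access.log` file name in the usage comment are assumptions.

```python
def extract_googlebot_entries(lines):
    """Yield every log line that contains 'googlebot', ignoring case."""
    for line in lines:
        if "googlebot" in line.lower():
            yield line

# Usage sketch (the file name is hypothetical):
# with open("access.log", encoding="utf-8", errors="replace") as log:
#     for entry in extract_googlebot_entries(log):
#         print(entry, end="")
```

Matching on the word alone trusts the user agent string and the hostname in the log, so anything that merely calls itself Googlebot will show up too.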


 10:26 pm on Nov 4, 2002 (gmt 0)

And you doubted my word, Brett? Like I can't spot an IP in the logs starting with 216.*? :( ;) The deep crawler grabbed everything on both my sites. Not that this is a whole lot, but it insisted on having it all.


 11:21 pm on Nov 4, 2002 (gmt 0)

Does the depth of Googlebot's indexing depend on your PR? For some reason Googlebot rarely goes deeper than the first set of links on my front page. Unfortunately this only brings up content indexes and very little content itself.

Googlebot filled up quite a chunk of the log file yesterday.


 11:20 pm on Nov 4, 2002 (gmt 0)

Googlebot just grabbed 1200 pages from my site in one night. The IPs start with 216.xxx. Can anyone confirm?


 12:05 am on Nov 5, 2002 (gmt 0)

Finder, if this is the case for you, you should deeplink to good content from your main pages.

See [searchengineworld.com...]


 1:47 am on Nov 5, 2002 (gmt 0)

Thanks Slade.

I did that very thing last month after perusing WW for tips and tricks. I pulled up more links to deep content onto my front page. I think I need to do more though. I'll have to be a little creative. :)


 2:08 am on Nov 5, 2002 (gmt 0)

This is neat - I've got the deepcrawl bot AND the freshbot visiting at the same time. I guess this is why it's called a dance :)


 4:25 pm on Nov 5, 2002 (gmt 0)


You've got a ton of parameters in your URL.


is apparently just too much to get through. IMHO, shorten it up if you can or try one of the rewrite programs so that your parameters look like directories instead of parameters. I don't know anything about these but you need to do something to cut down the parameters.
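To make "parameters look like directories" concrete, here is a small Python sketch of the URL shape only. The script name, parameter values, and function name are hypothetical, and the web server would still need a matching rewrite rule to map the path back to the real script.

```python
from urllib.parse import urlsplit, parse_qsl

def params_to_path(url):
    """Turn '/page.php?a=1&b=2' into '/page/a/1/b/2' (illustrative mapping only)."""
    parts = urlsplit(url)
    base = parts.path.rsplit(".", 1)[0]  # drop the script extension, if any
    segments = [base.rstrip("/")]
    for key, value in parse_qsl(parts.query):
        segments.extend([key, value])  # each parameter becomes two path segments
    return "/".join(segments)
```

For example, a hypothetical `/catalog.php?auth=smith&id=42` would come out as `/catalog/auth/smith/id/42`, which carries the same data without a query string.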


 4:57 pm on Nov 5, 2002 (gmt 0)

Yo Ms. Googlebot, walk on over here, got some new links for you to see, cold milk and chocolate chip cookies for you to munch on, too! Don't be shy, and I know you're hungry! :) :) :)


 9:58 pm on Nov 5, 2002 (gmt 0)


That's the easiest way for me to do it. I could change "auth" to an ID number, but that's about it.


 10:40 pm on Nov 5, 2002 (gmt 0)

It might be the easiest way, but it's not going to deliver traffic...

I just completed a redesign for a dynamic site that now uses an ALA-type method to deliver static-looking URLs.

Here's a really brief outline: each page in the database has a url like "blue-widget.html" in a url field. All requests for product pages are redirected to a php script that parses the URL and looks it up in the database to retrieve the product record and all necessary data.
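The lookup step in that outline might look roughly like this. It is a Python stand-in for the PHP script, with a plain dictionary in place of the database; the table contents and function name are invented for illustration.

```python
from urllib.parse import urlsplit

# Hypothetical stand-in for the database: each record is keyed by its "url" field.
PRODUCTS = {
    "blue-widget.html": {"id": 1, "name": "Blue Widget"},
}

def lookup_product(request_url):
    """Return the record whose url field matches the last path segment, or None."""
    path = urlsplit(request_url).path   # ignore any query string
    slug = path.rsplit("/", 1)[-1]      # e.g. "blue-widget.html"
    return PRODUCTS.get(slug)
```

A request that finds no record would normally get a 404 page, so the crawler never sees an empty product page.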

Previously, this client encoded some of that data in the URL, like you do, and none of their pages were in google. As of the October update, they have 400 product pages in google, and the site is doing twice the business it did in October.

You can optimize your main pages all you want, but you're going to get far more traffic by having hundreds or thousands of pages in google than you ever will from just a dozen optimized html pages.

So, look up ways to make your URLs search engine friendly, and do it. Getting those pages in google is definitely worth it.


 10:51 pm on Nov 5, 2002 (gmt 0)

Was visited by the freshbot first yesterday, then the deep crawl. Freshbot visits nearly every day.


 12:20 am on Nov 6, 2002 (gmt 0)

Grabbin' 'em all over here as well.


 1:22 am on Nov 6, 2002 (gmt 0)


You are right. I analyzed the logs and it looks like Googlebot isn't taking any URL that has more than two variables in it. I'm working on eliminating some of the worst offenders. I'll have to wait until next month to see if it helps. I think it has all it wants from my site for this month.
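A quick way to list the URLs that cross that line is sketched below in Python. The limit of two variables is only the observation from this post, not a documented Googlebot rule, and the function names are made up.

```python
from urllib.parse import urlsplit, parse_qsl

def count_query_variables(url):
    """Count the variables in a URL's query string, including empty ones."""
    return len(parse_qsl(urlsplit(url).query, keep_blank_values=True))

def urls_over_limit(urls, limit=2):
    """Return the URLs that carry more than `limit` query variables."""
    return [url for url in urls if count_query_variables(url) > limit]
```

Running the site's own link list through `urls_over_limit` gives a worklist of pages to shorten first.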

Thanks everyone for the tips. It has helped a great deal!


 1:23 am on Nov 6, 2002 (gmt 0)

Leeched robots.txt of new sites - will be a surefire next index, I can feel it. All the sites will be up there somewhere :) And if not, time will tell where they are. But we all know we love the #1-5 spots for ranking!


 2:53 am on Nov 6, 2002 (gmt 0)

She's going nuts. When GoogleBot visits my sites, it's a great day. :)


 3:07 am on Nov 6, 2002 (gmt 0)

She seems quite hungry, to my delight. Last night she grabbed my entire site - about a week before she just came for a look. Keep on coming, Ms. Google.


 9:51 pm on Nov 6, 2002 (gmt 0)

Googlebot has a different pattern this time around, it seems. My sites are getting crawled very shallowly. Normally they get a good deep one around this time. How is everyone else doing regarding depth?

All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved