Googlebot is back!

It looks like the deep crawl is starting.

         

RoadRash

9:28 am on Mar 9, 2003 (gmt 0)

10+ Year Member



Google started to deep crawl my site in the last 15 minutes. Anyone else notice this?

LowLevel

1:29 am on Mar 11, 2003 (gmt 0)

10+ Year Member



Although it is the deepbot, it crawled just my homepage and the pages linked from the homepage. Usually this is the behaviour of freshbot.

I hope it will come back to download the entire site.

MetropolisRobot

1:53 am on Mar 11, 2003 (gmt 0)

10+ Year Member



deepcrawl is in the house. I currently have 4 deepcrawlers taking apart my site. Funny enough at the same time I also have a couple of freshbots and the Wisenutbot as well!

Anyways x-fingers for a good crawl.

wired4fun

3:39 am on Mar 11, 2003 (gmt 0)

10+ Year Member



Deepbot crawling here too. 2500 pages and going strong...

marcs

3:45 am on Mar 11, 2003 (gmt 0)

10+ Year Member



Deepbot must have gotten bored with the long update cycle; it's very busy here also.

The bot must be happy to be unleashed :)

matthias

3:55 am on Mar 11, 2003 (gmt 0)

10+ Year Member



The bot requested one file, then the same file a second later, and has been away since then. Is this normal? :-)

Jesse_Smith

4:12 am on Mar 11, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



:::Bot requested one file, the same file a second later and was away since then. Is this normal?

Yes. Within a day or two you will probably start seeing it bomb your site.

casperkor

4:16 am on Mar 11, 2003 (gmt 0)

10+ Year Member



Just was wondering what programs you guys are using to be
able to know instantly when the Googlebot is crawling your site?

I would appreciate any info on this :)

4serendipity

4:51 am on Mar 11, 2003 (gmt 0)

10+ Year Member



Although it is the deepbot, it crawled just my homepage and the pages linked from the homepage. Usually this is the behaviour of freshbot.

I've noticed the same things. Deepbot doesn't appear to be going too deep right now, although it's going a little deeper on my site than freshbot usually does.

Just was wondering what programs you guys are using to be
able to know instantly when the Googlebot is crawling your site?

I browse my raw log files, looking for googlebot.

216.xxx.xxx.xxx IPs are deepbot
64.xxx.xxx.xxx IPs are freshbot

That is, unless something has changed this month ;)
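A minimal sketch of that log check in Python, assuming an access log where the client IP is the first field on each line and the user agent appears somewhere in the line. The 216. and 64. prefixes are only the rule of thumb from this post, not an official list, and the function name and sample lines are made up for illustration.

```python
def classify_googlebot(line):
    """Classify one access-log line using the IP-prefix rule of thumb above.

    Returns 'deepbot', 'freshbot', 'googlebot (other)', or None when the
    line does not mention Googlebot at all.
    """
    if "googlebot" not in line.lower():
        return None
    fields = line.split()
    ip = fields[0] if fields else ""
    if ip.startswith("216."):   # assumption from the post: deepbot range
        return "deepbot"
    if ip.startswith("64."):    # assumption from the post: freshbot range
        return "freshbot"
    return "googlebot (other)"


# Hypothetical sample lines in a common/combined-style log format.
sample = [
    '216.239.46.55 - - [11/Mar/2003:04:51:00 +0000] "GET /a.htm HTTP/1.0" 200 512 "-" "Googlebot/2.1 (+http://www.googlebot.com/bot.html)"',
    '64.68.82.17 - - [11/Mar/2003:04:52:00 +0000] "GET /b.htm HTTP/1.0" 200 512 "-" "Googlebot/2.1 (+http://www.googlebot.com/bot.html)"',
    '10.0.0.1 - - [11/Mar/2003:04:53:00 +0000] "GET /c.htm HTTP/1.0" 200 512 "-" "Mozilla/4.0"',
]

for entry in sample:
    print(classify_googlebot(entry))
```

Run against a real log, the same function could be applied line by line while reading the file; lines from other visitors simply come back as None.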

AthlonInside

5:19 am on Mar 11, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Not that hungry, it just got a few files. :( Don't stop there, Googlebot, the food is for you!

Emma McCreary

5:21 am on Mar 11, 2003 (gmt 0)

10+ Year Member



casperkor - I embed a bit of PHP code on my pages that emails me when googlebot stops by. I'm sure I'll get bored of it eventually, but for now it's kind of fun.

I'm sure I picked up this code here on the forum, but I'm not sure where...here it is. It emails me and it adds a line to googlebot.txt, a text file on the server. Remember to chmod it for write access.

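<?php
// Emails a notice and appends a log line whenever the visitor's user agent
// contains "googlebot". Uses $_SERVER instead of register_globals variables,
// and stripos() instead of eregi(), both of which newer PHP versions removed.
$agent = isset($_SERVER['HTTP_USER_AGENT']) ? $_SERVER['HTTP_USER_AGENT'] : '';
if (stripos($agent, 'googlebot') !== false)
{
    // Rebuild the requested URL, including the query string if there is one.
    $url = "http://" . $_SERVER['SERVER_NAME'] . $_SERVER['PHP_SELF'];
    if (!empty($_SERVER['QUERY_STRING']))
    {
        $url .= '?' . $_SERVER['QUERY_STRING'];
    }
    $today = date("F j, Y, g:i a");
    $host = gethostbyaddr($_SERVER['REMOTE_ADDR']);
    mail("myemail@mydomain.org", "googlebot detected on " . $_SERVER['SERVER_NAME'], "$today - Google crawled $url \n $host");
    // Append to googlebot.txt; the file must be writable by the web server.
    $logfile = @fopen('googlebot.txt', 'a');
    if ($logfile)
    {
        fputs($logfile, "$today - Google crawled $url $host\n");
        fclose($logfile);
    }
}
?>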

JeremyL

5:22 am on Mar 11, 2003 (gmt 0)

10+ Year Member



Maybe they aren't going as deep this month so they have time to get back on cycle after the delay this month.

oLeon

8:24 am on Mar 11, 2003 (gmt 0)

10+ Year Member



yes, we were hit about 5000 times ...
so it's started -

good luck to you all, may it catch all the pages you want it to!

jilla

8:52 am on Mar 11, 2003 (gmt 0)

10+ Year Member



I see googlebot 200, it's not 216 though... is this the deepcrawler?

pardo

9:37 am on Mar 11, 2003 (gmt 0)

10+ Year Member



yep, both deepcrawler and freshbot on our website since this morning. Could be another update this month?

SubZeroGTS

1:03 pm on Mar 12, 2003 (gmt 0)

10+ Year Member



still only freshbot for me

Gibble

3:18 pm on Mar 12, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I'm getting no love, no love at all

After a few weeks of downtime my site is finally back up with a fresh look and much better code, but I need a new crawl so I can get rid of all the 404s people are getting (I have a redirect script, but it's not 100% effective).

farside847

4:19 pm on Mar 12, 2003 (gmt 0)

10+ Year Member




woof - I'm getting hit 25,000 times a day by googlebot. My server
may be complaining, but I'm not :) Can't wait till next month.

One thing I did notice, googlebot is following links that
include a session tag this month. (www.foobar.com/list.html?id=112233)
They didn't use to do this and it is creating many repetitive
hits. But, they are hitting all my other pages too. (it is
a retail site with 60k products, each with its own page...)

Anyone know of a way I can have googlebot follow only links
that do not have the session tag?

thanks!
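Nobody in the thread answers this directly. One possible approach, sketched here in Python, is to leave the session parameter out of the links the site generates when the visitor identifies itself as Googlebot. The parameter name `id` is taken from the example URL above; the function names are hypothetical, and this only illustrates the URL handling, not how a particular shop platform builds its links.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode


def strip_session_param(url, param="id"):
    """Return url with the given query parameter removed, other parts unchanged."""
    parts = urlsplit(url)
    kept = [(key, value)
            for key, value in parse_qsl(parts.query, keep_blank_values=True)
            if key != param]
    return urlunsplit((parts.scheme, parts.netloc, parts.path,
                       urlencode(kept), parts.fragment))


def link_for(url, user_agent):
    """Emit a session-free link for Googlebot, the normal link for everyone else."""
    if "googlebot" in user_agent.lower():
        return strip_session_param(url)
    return url


print(link_for("http://www.foobar.com/list.html?id=112233", "Googlebot/2.1"))
print(link_for("http://www.foobar.com/list.html?id=112233", "Mozilla/4.0"))
```

With links built this way, the crawler would only ever see one URL per product page, which should cut down on the repetitive hits described above.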

teeceo

7:31 am on Mar 13, 2003 (gmt 0)

10+ Year Member



If I have a new site that's up right now with lots of links going to it, will my site be in the index next month, provided it gets hit by deepbot now?

teeceo.

brotherhood of LAN

7:33 am on Mar 13, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



so where is the bot now? ;)

coconutz

7:47 am on Mar 13, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Bots are vacationing in Hawaii:

Googlebot:
216.239.46.55 GET /suntan.htm Googlebot/2.1 +(+http://www.googlebot.com/bot.html)

Freshbot:
64.68.82.17 GET /shades.htm Googlebot/2.1 +(+http://www.googlebot.com/bot.html)

Imagebot:
64.68.86.59 GET /mai-tai.jpg Googlebot/2.1 +(+http://www.googlebot.com/bot.html)

iam david lee

1:47 pm on Mar 13, 2003 (gmt 0)

10+ Year Member



I submitted my new site a month ago and also put a link to it on one site which is already in Google. The old site has been visited by Google, but the new one still has not. Any chance I will get in with the next update? Or do I need to do anything else to get in? Please help.

uber_boy

8:15 pm on Mar 13, 2003 (gmt 0)

10+ Year Member



It seems we're about three days into the Deep Crawl, yet I've yet to get any significant attention from the appropriate bots. I'm trying to be patient, but I'm getting worried since I usually have about 100,000 pages read in during this time. Anybody else with a reasonable history of extensive visitations feeling neglected at the moment?

PaulPaul

8:19 pm on Mar 13, 2003 (gmt 0)

10+ Year Member



I'm in the same boat as you, Uber...

I have had the bot ask for 35 pages and that's all; that was 2 days ago.

Hoping she comes back to get a couple more thousand..

Painting

9:37 pm on Mar 14, 2003 (gmt 0)

10+ Year Member



DavidT

Another thing, is Freshbot stupid? It can't seem to understand a 301 redirect from www.domain.com to domain.com. Keeps requesting pages and nothing but 301 in return.

I would like to know about this as well. Google isn't following my server redirects and keeps trying "www." (I'm trying to concentrate my PR and I just plain hate the "www." on domains ;-)
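For reference, here is a rough Python sketch of the response such a redirect setup is expected to give for each request. It does not explain why Freshbot keeps asking for the "www." host; it only spells out the rule being described. The host `domain.com` comes from the quoted post, while the function name is made up for this example.

```python
def canonical_redirect(host, path, canonical_host="domain.com"):
    """Return (301, location) when the request host is not the canonical one.

    Returns None when the host is already canonical and the page should be
    served normally. Any port in the Host header is ignored for the comparison.
    """
    host_only = host.lower().split(":")[0]
    if host_only == canonical_host:
        return None
    return 301, "http://" + canonical_host + path


print(canonical_redirect("www.domain.com", "/page.htm"))
print(canonical_redirect("domain.com", "/page.htm"))
```

A crawler that honours the 301 would be expected to request the `Location` URL next and, over time, stop asking for the "www." version.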

Oaf357

3:07 am on Mar 15, 2003 (gmt 0)

10+ Year Member



I'm having deep crawls as well. It's good stuff.

bether2

3:26 am on Mar 15, 2003 (gmt 0)

10+ Year Member



teeceo,
Yes your site should be in the next index (prob. end of March or beginning of April) if it's getting crawled now by deepbot. If your site is very large, you may not see *all* of your pages in this next update - from what I've heard.

Beth

Oaf357

4:21 am on Mar 15, 2003 (gmt 0)

10+ Year Member



To add on to what bether2 said I got a mere freshbot crawl during the last update and a few of the pages on my new site showed up in the index. Now that a deep crawl has begun I hope that everything will be indexed.

Gibble

2:13 pm on Mar 15, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



YIPEE! Googlebot finally knocked :)

Gibble

2:15 pm on Mar 15, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Oh oh, it better knock again, my robots.txt got copied from dev and was disallowing all robots...fixed now but...oh damn

Oaf357

5:17 pm on Mar 15, 2003 (gmt 0)

10+ Year Member



You only need one robots.txt file in the root of your site to disallow everything you don't want indexed.
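To catch the kind of mix-up Gibble describes, where a development robots.txt that blocks everything ends up on the live site, the rules can be checked before the crawler arrives. This is a small sketch using Python's standard `urllib.robotparser`; the two rule sets and the example.com URLs are invented for illustration.

```python
from urllib.robotparser import RobotFileParser


def googlebot_allowed(robots_txt, url, agent="Googlebot"):
    """Return True if the given robots.txt text lets the agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


# Hypothetical dev file that blocks every robot from the whole site.
dev_rules = "User-agent: *\nDisallow: /\n"
# Hypothetical live file that only blocks one directory.
live_rules = "User-agent: *\nDisallow: /cgi-bin/\n"

print(googlebot_allowed(dev_rules, "http://example.com/page.htm"))
print(googlebot_allowed(live_rules, "http://example.com/page.htm"))
print(googlebot_allowed(live_rules, "http://example.com/cgi-bin/form.cgi"))
```

Running a check like this against the file that is about to be deployed would flag a blanket `Disallow: /` before Googlebot reads it.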