Google Bots

what are the different google bots

     

vipink

5:23 pm on May 29, 2006 (gmt 0)

5+ Year Member



What are the different types of Google bot, and what is each one for? I recently heard about the Google AdWords bot (AdsBot-Google).

Is there a list of all the types of Google bots?

tedster

5:36 pm on May 29, 2006 (gmt 0)

WebmasterWorld Senior Member tedster is a WebmasterWorld Top Contributor of All Time 10+ Year Member



Well, besides the regular Googlebot, there's:

MediaBot - used to analyze AdSense pages
user agent "Mediapartners-Google"

ImageBot - crawls for Image Search
user agent "Googlebot-Image"

AdsBot - checking AdWords landing pages for quality
user agent "AdsBot-Google"

Right Reading

10:02 pm on May 29, 2006 (gmt 0)

5+ Year Member



Isn't there one for rss also?

Demaestro

10:09 pm on May 29, 2006 (gmt 0)

WebmasterWorld Senior Member demaestro is a WebmasterWorld Top Contributor of All Time 10+ Year Member



There is Feedfetcher-Google

Although this is the fetcher for the google.com/ig RSS reader, not an actual RSS bot looking for feeds. It's just the thing that requests feeds to display on google.com/ig.

trinorthlighting

3:29 pm on May 30, 2006 (gmt 0)

WebmasterWorld Senior Member 5+ Year Member



What about froogle and google base?

BillyS

4:21 pm on Jun 2, 2006 (gmt 0)

WebmasterWorld Senior Member billys is a WebmasterWorld Top Contributor of All Time 10+ Year Member



Here's another one...

Generic Mobile Phone (compatible; Googlebot-Mobile/2.1; +http://www.google.com/bot.html)

phantombookman

5:22 pm on Jun 2, 2006 (gmt 0)

10+ Year Member



What are the different types of google bot

I cannot seem to remember!
If you spot one let me know, they seem to be an endangered species nowadays :)

bumpski

2:11 pm on Jun 3, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Regarding AdWords:
This bot, "GoogleBot/2.1", has been crawling one page of my site daily. The page being crawled is the target of an AdWords campaign, so AdWords is definitely looking at target page quality. Note the upper case "Bot". "GoogleBot/2.1" is the entire user-agent string as well. It has used multiple IP addresses.
This bot also detected a change to one of my robots.txt files (more liberal for Google) and appeared to trigger, almost immediately, a complete deep crawl of the site by the conventional bot. I believe all the bots cooperate in collecting the robots.txt file.

The conventional bot has a lower case "bot"
"Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"

Google has made it a little difficult to sort out all of their bots with one search string. "/2.1" works, but it also finds some unrelated odds and ends in logs (and, of course, it misses Googlebot-Image/1.0).

User-agent strings extracted directly from my logs:


Adsense: "Mediapartners-Google/2.1"
Adwords: "GoogleBot/2.1"
Google-: "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
Image--: "Googlebot-Image/1.0"

Of course, as mentioned, there are also Froogle, Mobile, and Feedfetcher for RSS feeds.
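For anyone who wants to sort raw log entries by these strings, here is a minimal Python sketch. It only uses the user-agent tokens quoted in this thread; the labels and the function name are made up for illustration, and the bare "GoogleBot/2.1" rule reflects one observation from one site's logs, not anything Google has documented.

```python
# Tokens taken from the user-agent strings quoted in this thread.
# Order matters: the more specific tokens are tested before "Googlebot".
GOOGLE_CRAWLERS = [
    ("Mediapartners-Google", "AdSense"),
    ("AdsBot-Google", "AdWords landing pages"),
    ("Feedfetcher-Google", "feed fetcher"),
    ("Googlebot-Image", "image search"),
    ("Googlebot-Mobile", "mobile search"),
    ("Googlebot", "web search"),
]


def classify_user_agent(user_agent):
    """Return a label for a Google crawler, or None if nothing matches."""
    # Assumption from this thread: the bare string with an upper case "Bot"
    # was seen only on an AdWords landing page.
    if user_agent == "GoogleBot/2.1":
        return "AdWords (as observed in this thread)"
    lowered = user_agent.lower()
    for token, label in GOOGLE_CRAWLERS:
        if token.lower() in lowered:
            return label
    return None
```

Anything that returns None is either not a Google crawler or one that is not listed here, so the table would need extending as new bots appear.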

Also, Google now appears to be using HTTP/1.1 exclusively; until recently there was a mix of HTTP/1.1 and HTTP/1.0. One important thing to note is that Google now always requests GZIP-compressed content if your server provides it. Your website might get an "attaboy" if you serve GZIP-compressed content to the bots. It could also cut your website's bandwidth usage and boost your site's performance.
(WebmasterWorld and Google both serve GZIP-compressed content.)
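To illustrate what serving GZIP content on request means, here is a rough Python sketch of the negotiation. In practice the web server usually handles this through its own configuration; the function name is hypothetical, and the header parsing is deliberately simplified (it ignores q-values).

```python
import gzip


def encode_body(body, accept_encoding):
    """Return (payload, headers), compressing only if the client asks for gzip."""
    # Simplified parsing: any "gzip" entry in Accept-Encoding counts as acceptance.
    encodings = [part.split(";")[0].strip().lower()
                 for part in accept_encoding.split(",")]
    if "gzip" in encodings:
        payload = gzip.compress(body)
        return payload, {"Content-Encoding": "gzip", "Vary": "Accept-Encoding"}
    # Client did not advertise gzip, so send the body unchanged.
    return body, {}
```

Repetitive HTML compresses well, which is where the bandwidth saving comes from.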

larryhatch

3:00 pm on Jun 3, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Hi all: I hope this doesn't hi-jack the thread.

I look at the DC pages all the time for my keywords <using an online tool>.

DCs are subdivided into 'New', and various wild-card IP ranges like .104, .107 and so on.

What I would like to know is the geographical location of each of these IP addresses.

I presume Google puts DCs all over the place to reduce long distance bandwidth
and to balance the insane load that all those surfers place on their servers.

Does anybody have a list of where 64.233.171.99 is for example?
I mean the physical location (City, State ..) of the servers.
Are any overseas? Lots of them? Where the heck are they?

Most of them give me good SERPs positions, but a few have my site in the dumpster.
If those few are in Upper Volta or Ananaguay I don't care so very much.

Whois lookups always point back to Mountain View, California (or very nearby),
where Google HQ is, and that tells me nothing. Any help appreciated! -Larry

[edited by: tedster at 5:44 am (utc) on Dec. 20, 2006]

bumpski

4:14 pm on Jun 3, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



One means that may provide some information on physical locations can be found at:
[internettrafficreport.com...]
I believe this could be considered a link to an authority site.

You can use the "ITR client" and then "trace" the route to the IP addresses. This trace typically shows physical locations along a communication path.

Your firewall or modem firewall may block this capability to some extent.

larryhatch

10:45 am on Jun 4, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Thanks Bumpski: I couldn't make sense of the recommended page, so I pinged the DNS #.
That got there in several steps and back, but no info as to actual location.
All results indicated zero distance and 'USA' however.

I was really hoping that all this was public knowledge, and that somebody had a simple list up,
something like:

123.123.123 Dallas, TX
234.234.234 Boston, MA
121.121.111 Paris, France .. and so on.

-Larry

bumpski

11:44 am on Jun 4, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Just for reference:

[internettrafficreport.com...]
I believe this could be considered a link to an authority site.

On this page to the right you will see a "click here" link to the "ITR Client" download. This windows executable is very useful and free.

I double checked, I had to set my DSL modem firewall to OFF from a setting of LOW to fully enable the Route Tracing function of the ITR client.

davelms

8:28 pm on Jun 4, 2006 (gmt 0)

10+ Year Member



I presume this is deemed on-topic... Along with their user-agents, can anyone advise what to use in robots.txt for them? I've only ever used Mediapartners-Google* and Googlebot to date, and I disallow other robots. I was just wondering whether I happen to be excluding some of the other Google robots (e.g. Image), or does 'Googlebot' cover them all?

bumpski

11:31 am on Jun 5, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Davelms

This link may help with Googlebot Image
[google.com...]

[google.com...]
Sign up for Sitemaps above. Google provides a robots.txt analysis tool there. You don't need a sitemap.xml file to sign up, but you may have to verify that you own the site: Google asks you to place a uniquely named file on your website so they can verify you are who you say you are!

If you block all bots except Googlebot, you are probably blocking Google Images and Google Mobile too. Of course, there are also the AdSense and AdWords bots.

Allowing only specific bots is now risky because search engines keep inventing new ones.
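One way to sanity-check a whitelist-style robots.txt offline is Python's standard `urllib.robotparser`. The sketch below is only an illustration: the robots.txt content and the "SomeOtherBot" name are made up, and Python's matching rules are not guaranteed to be the same as how Google's crawlers read the file, so Google's own analysis tool remains the better check.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: each Google crawler named in this thread is listed
# explicitly, and every other robot is shut out.
ROBOTS_TXT = """\
User-agent: Googlebot
User-agent: Googlebot-Image
User-agent: Googlebot-Mobile
User-agent: Mediapartners-Google
User-agent: AdsBot-Google
Disallow:

User-agent: *
Disallow: /
"""


def allowed_agents(robots_txt, agents, path="/index.html"):
    """Map each user-agent token to True/False for fetching the given path."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {agent: parser.can_fetch(agent, path) for agent in agents}


result = allowed_agents(
    ROBOTS_TXT,
    ["Googlebot", "Googlebot-Image", "Mediapartners-Google",
     "AdsBot-Google", "SomeOtherBot"],
)
```

Listing every token explicitly avoids guessing whether "Googlebot" covers the others, but the list has to be maintained by hand whenever a new bot shows up.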

davelms

4:55 pm on Jun 5, 2006 (gmt 0)

10+ Year Member



Excellent, thank you! You know, I looked (not very hard, I admit) and never found the info before.

I do use the Sitemaps and did use the robots.txt analysis tool in the 'early days'. But since the Sitemaps page had misspelt the Mediapartners robot (*), I wasn't 100% confident in its results/info.

Plus it gave incorrect analysis results for the normal Googlebot, although that was fixed on a return visit a short while after ... yeah - I just didn't trust that page ;-)

Maybe Google have sorted it out since the early days of that analysis tool/feature, I shall give it another go.

(*) at least, if it was correct it was spelt differently to the Adsense guidelines and help pages.

tiori

5:27 pm on Jun 5, 2006 (gmt 0)

10+ Year Member



had misspelt

?

davelms

10:43 pm on Jun 5, 2006 (gmt 0)

10+ Year Member



Yeah. It's right now. It was probably corrected very early on. If I recall correctly, it was initially implemented as Google-Mediapartners instead of Mediapartners-Google (as it is now).

lupobianco

3:40 pm on Jun 13, 2006 (gmt 0)

5+ Year Member



A "sort of" google bot is gsa-crawler, used by the Google Search Appliance: [google.com...]

Halfdeck

9:28 pm on Jun 13, 2006 (gmt 0)

5+ Year Member



BTW, has anyone seen the Supplemental Bot in action?

lupobianco

1:49 pm on Jun 16, 2006 (gmt 0)

5+ Year Member



A Google proxy (a browser, not a bot) that will show up in web logs is "Google Wireless Transcoder", used by Google Mobile: [http://www.google.com/xhtml], [http://www.google.com/gwt/n]

br4inwash3r

3:44 pm on Jun 18, 2006 (gmt 0)

5+ Year Member



So how do you detect these extra bots? None of them show up in any of my stats programs (or is that a sign that I should get a better stats app?).

And regarding the AdSense and image bots: do regular optimization techniques apply to them as well, or is there a different set of tricks for handling these bots?

StickyNote

8:15 pm on Jun 18, 2006 (gmt 0)

10+ Year Member



I have been doing a daily check to see how much Mozilla bot has been spidering my site. Although 70% de-indexed or supplemental, Mozilla has been spidering over 200 'pages' a day for about 3 days.

Now for the on topic bit: I now see that it has been requesting images, not pages. Yes, this is Mozilla bot, not the image bot. This little guy: "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
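A check like this can be run against raw logs with a short script. The following is a sketch only, assuming the common Apache "combined" log format and a made-up list of image extensions; the function name is hypothetical.

```python
import re

# Matches the request, status, size, referrer and user-agent fields of a
# "combined" format log line (assumed format; adjust for other layouts).
LOG_LINE = re.compile(
    r'"(?P<method>[A-Z]+) (?P<path>\S+) [^"]*" \d{3} \S+ "[^"]*" "(?P<agent>[^"]*)"'
)
IMAGE_EXTENSIONS = (".gif", ".jpg", ".jpeg", ".png")


def googlebot_image_requests(lines):
    """Return paths of image files requested by the regular Googlebot/2.1."""
    hits = []
    for line in lines:
        match = LOG_LINE.search(line)
        if not match:
            continue
        # Ignore any query string when checking the file extension.
        path = match.group("path").split("?")[0].lower()
        # Case-sensitive on purpose: this skips "Googlebot-Image/1.0"
        # and the bare upper case "GoogleBot/2.1".
        if "Googlebot/2.1" in match.group("agent") and path.endswith(IMAGE_EXTENSIONS):
            hits.append(match.group("path"))
    return hits
```

Feeding it the lines of an access log gives the list of image URLs fetched under the regular Googlebot user agent.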

Is Mozilla bot taking over the image bot's job?

 
