Google Bots

what are the different google bots

     
5:23 pm on May 29, 2006 (gmt 0)

New User

5+ Year Member

joined:May 29, 2006
posts:34
votes: 0


What are the different types of Google bot, and what is each one for? I recently heard about the Google AdWords bot (AdsBot-Google).

Is there a list of all the types of Google bots?

5:36 pm on May 29, 2006 (gmt 0)

Senior Member

WebmasterWorld Senior Member tedster is a WebmasterWorld Top Contributor of All Time 10+ Year Member

joined:May 26, 2000
posts:37301
votes: 0


Well, besides the regular Googlebot, there's

MediaBot - used to analyze AdSense pages
user agent "Mediapartners-Google"

ImageBot - crawling for the Image Search
user agent "Googlebot-Image"

AdsBot - checking AdWords landing pages for quality
user agent "AdsBot-Google"

10:02 pm on May 29, 2006 (gmt 0)

Junior Member

5+ Year Member

joined:May 7, 2006
posts:60
votes: 0


Isn't there one for RSS also?
10:09 pm on May 29, 2006 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:Dec 15, 2003
posts:2606
votes: 0


There is Feedfetcher-Google.

That said, it is the fetcher for the google.com/ig RSS reader, not an actual RSS bot looking for feeds. It is just the thing that requests feeds to display on google.com/ig.

3:29 pm on May 30, 2006 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:Jan 5, 2006
posts:2094
votes: 2


What about Froogle and Google Base?
4:21 pm on June 2, 2006 (gmt 0)

Senior Member

WebmasterWorld Senior Member billys is a WebmasterWorld Top Contributor of All Time 10+ Year Member

joined:June 1, 2004
posts:3181
votes: 0


Here's another one...

Generic Mobile Phone (compatible; Googlebot-Mobile/2.1; +http://www.google.com/bot.html)

5:22 pm on June 2, 2006 (gmt 0)

Preferred Member

10+ Year Member

joined:Dec 30, 2003
posts:625
votes: 0


What are the different types of google bot

I cannot seem to remember!
If you spot one let me know, they seem to be an endangered species nowadays :)

2:11 pm on June 3, 2006 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:Sept 13, 2004
posts:801
votes: 2


Regarding AdWords:
This bot, "GoogleBot/2.1", has been crawling one page of my site daily. The page being crawled is the target of an AdWords campaign, so AdWords is definitely looking at target page quality. Note the upper case "Bot". "GoogleBot/2.1" is the entire user-agent string as well. It has used multiple IP addresses.
This bot ("GoogleBot/2.1") also detected a change to one of my robots.txt files (more liberal for Google), and that appeared to trigger an almost immediate, complete deep crawl of the site by the conventional bot. I believe all the bots cooperate in collecting the robots.txt file.

The conventional bot has a lower case "bot"
"Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"

Google has made it a little difficult to sort out all of their bots with one search string. Searching for "/2.1" works, but it also finds some unrelated odds and ends in the logs (and of course it misses Googlebot-Image/1.0).

User-agent strings extracted directly from my logs:


Adsense: "Mediapartners-Google/2.1"
Adwords: "GoogleBot/2.1"
Google-: "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
Image--: "Googlebot-Image/1.0"

Of course, as mentioned, there's also Froogle, Mobile, and Feed for RSS feeds.
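For anyone who wants to sort these out in a script, here is a minimal sketch that maps a user-agent string to the labels in the list above. The function name and the table are my own invention, and the labels simply follow what I saw in my logs, not anything Google has published. Matching is case-sensitive on purpose so that "GoogleBot/2.1" and "Googlebot/2.1" stay separate.

```python
# Tokens and labels follow the log extract in this post (an observation,
# not official documentation). Matching is case-sensitive, so the
# upper-case "GoogleBot/2.1" is kept apart from the conventional bot.
KNOWN_GOOGLE_AGENTS = [
    ("Mediapartners-Google", "Adsense"),
    ("Googlebot-Image", "Image"),
    ("GoogleBot/2.1", "Adwords"),
    ("Googlebot/", "Google"),
]


def classify_google_agent(user_agent):
    """Return the label for the first matching token, or None if nothing matches."""
    for token, label in KNOWN_GOOGLE_AGENTS:
        if token in user_agent:
            return label
    return None


if __name__ == "__main__":
    for ua in (
        "Mediapartners-Google/2.1",
        "GoogleBot/2.1",
        "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)",
        "Googlebot-Image/1.0",
    ):
        print(classify_google_agent(ua), "<-", ua)
```

Other agents mentioned in this thread could be added as extra rows in the table.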

Also, Google now appears to be using HTTP/1.1 exclusively; until recently there was a mix of HTTP/1.1 and HTTP/1.0. One important thing to note is that Google now always requests GZIP-compressed content if your server provides it. Your website might get an "attaboy" if you serve GZIP-compressed content to the bots. It could also cut your website's bandwidth usage and boost your site's performance.
(WebmasterWorld and Google serve GZIP-compressed content.)
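As a rough illustration of what "serving GZIP if the client asks for it" means, here is a small sketch. The function name is hypothetical, and it deliberately ignores quality values such as "gzip;q=0", so treat it as a simplification rather than a full implementation of content negotiation. In practice most web servers can do this through their own configuration.

```python
import gzip


def maybe_gzip(body, accept_encoding):
    """Return (payload, extra_headers); compress only if the client listed gzip."""
    # Accept-Encoding is a comma-separated list such as "gzip, deflate".
    # Parameters like ";q=0.5" are dropped here (a simplification).
    offered = [part.split(";")[0].strip().lower() for part in accept_encoding.split(",")]
    if "gzip" in offered:
        headers = {"Content-Encoding": "gzip", "Vary": "Accept-Encoding"}
        return gzip.compress(body), headers
    # Client did not ask for gzip: send the body unchanged.
    return body, {}


if __name__ == "__main__":
    page = b"<html>" + b"hello world " * 200 + b"</html>"
    payload, headers = maybe_gzip(page, "gzip, deflate")
    print(len(page), "bytes before,", len(payload), "bytes after", headers)
```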

3:00 pm on June 3, 2006 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:Nov 13, 2004
posts:1425
votes: 0


Hi all: I hope this doesn't hijack the thread.

I look at the DC pages all the time for my keywords <using an online tool>.

DCs are subdivided into 'New', and various wild-card DNS numbers like .104, .107 and so on.

What I would like to know is the geographical locations for these DNS numbers.

I presume Google puts DCs all over the place to reduce long distance bandwidth
and to balance the insane load that all those surfers place on their servers.

Does anybody have a list of where 64.233.171.99 is for example?
I mean the physical location (City, State ..) of the servers.
Are any overseas? Lots of them? Where the heck are they?

Most of them give me good SERP positions, but a few have my site in the dumpster.
If those few are in Upper Volta or Ananaguay I don't care so very much.

Whois lookups always point back to Mountain View, California (or very nearby),
where Google HQ is, and that tells me nothing. Any help appreciated! -Larry

[edited by: tedster at 5:44 am (utc) on Dec. 20, 2006]

4:14 pm on June 3, 2006 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:Sept 13, 2004
posts:801
votes: 2


One means that may provide some information on physical locations can be found at:
[internettrafficreport.com...]
I believe this could be considered a link to an authority site.

You can use the "ITR client" and then "trace" the route to the IP addresses. This trace typically shows physical locations along a communication path.

Your firewall or modem firewall may block this capability to some extent.

10:45 am on June 4, 2006 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:Nov 13, 2004
posts:1425
votes: 0


Thanks Bumpski: I couldn't make sense of the recommended page, so I pinged the DNS #.
That got there in several steps and back, but no info as to actual location.
All results indicated zero distance and 'USA' however.

I was really hoping that all this was public knowledge, and that somebody had a simple list up,
something like:

123.123.123 Dallas, TX
234.234.234 Boston, MA
121.121.111 Paris, France .. and so on.

-Larry

11:44 am on June 4, 2006 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:Sept 13, 2004
posts:801
votes: 2


Just for reference:

[internettrafficreport.com...]
I believe this could be considered a link to an authority site.

On the right of this page you will see a "click here" link to the "ITR Client" download. This Windows executable is very useful and free.

I double-checked: I had to change my DSL modem firewall from a setting of LOW to OFF to fully enable the Route Tracing function of the ITR client.

8:28 pm on June 4, 2006 (gmt 0)

Junior Member

10+ Year Member

joined:July 21, 2003
posts:130
votes: 0


I presume this is deemed on-topic... along with their user-agents, can anyone advise what to use in robots.txt for them? I've only ever used Mediapartners-Google* and Googlebot to date. I disallow other robots. I was just wondering whether I happen to be excluding some of the other Google robots (e.g. Image), or does 'Googlebot' cover them all?
11:31 am on June 5, 2006 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:Sept 13, 2004
posts:801
votes: 2


Davelms

This link may help with Googlebot Image
[google.com...]

[google.com...]
Sign up for Sitemaps above. There is a robots.txt analysis tool that Google provides. You don't have to have a sitemap.xml file to sign up for Sitemaps, but you may have to verify your site ownership. Google actually asks you to place a uniquely named file on your website, so they can verify you are who you say you are!

If you blocked all bots but Googlebot, you are probably blocking Google Images and Google Mobile. Of course, there are also the AdSense and AdWords bots.

Allowing only specific bots is now risky because search engines keep inventing new ones.
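To see the effect of an "allow only Googlebot" file without waiting for a crawl, here is a small sketch using Python's standard urllib.robotparser. The robots.txt content, the helper name and the test path are made up for illustration, and the result only shows how this particular parser reads the file; each real crawler decides for itself how to interpret it.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: Googlebot may fetch everything, all others nothing.
ROBOTS_TXT = """\
User-agent: Googlebot
Disallow:

User-agent: *
Disallow: /
"""


def allowed_agents(robots_txt, agents, url="/page.html"):
    """Map each agent name to True/False according to Python's robots.txt parser."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {agent: parser.can_fetch(agent, url) for agent in agents}


if __name__ == "__main__":
    agents = ["Googlebot", "Mediapartners-Google", "AdsBot-Google"]
    for agent, ok in allowed_agents(ROBOTS_TXT, agents).items():
        print(agent, "allowed" if ok else "blocked")
```

One caveat: this parser matches agent names by substring, so it would treat a name like Googlebot-Image as covered by the Googlebot record. Whether a given crawler behaves the same way is something to check with the search engine's own tools.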

4:55 pm on June 5, 2006 (gmt 0)

Junior Member

10+ Year Member

joined:July 21, 2003
posts:130
votes: 0


Excellent, thank you! You know, I looked (not very hard, I admit) and never found the info before.

I do use the Sitemaps and did use the robots.txt analysis tool in the 'early days'. But since the Sitemaps page had misspelt the Mediapartners robot (*), I wasn't 100% confident in its results/info.

Plus it gave incorrect analysis results for the normal Googlebot, although that was fixed on a return visit a short while after ... yeah - I just didn't trust that page ;-)

Maybe Google have sorted it out since the early days of that analysis tool/feature. I shall give it another go.

(*) at least, if it was correct it was spelt differently to the Adsense guidelines and help pages.

5:27 pm on June 5, 2006 (gmt 0)

Junior Member

10+ Year Member

joined:Jan 22, 2005
posts:104
votes: 0


had misspelt

?

10:43 pm on June 5, 2006 (gmt 0)

Junior Member

10+ Year Member

joined:July 21, 2003
posts:130
votes: 0


Yeah. It's right now. It was probably corrected very early on. If I recall rightly I think it was initially implemented as Google-Mediapartners instead of Mediapartners-Google (as it is now).
3:40 pm on June 13, 2006 (gmt 0)

New User

5+ Year Member

joined:Feb 25, 2006
posts:2
votes: 0


A "sort of" google bot is gsa-crawler, used by the Google Search Appliance: [google.com...]
9:28 pm on June 13, 2006 (gmt 0)

Full Member

10+ Year Member

joined:Nov 10, 2005
posts:240
votes: 0


BTW, has anyone seen the Supplemental Bot in action?
1:49 pm on June 16, 2006 (gmt 0)

New User

5+ Year Member

joined:Feb 25, 2006
posts:2
votes: 0


A google proxy (browser, not bot) which will show up in web logs is "Google Wireless Transcoder"; used by Google Mobile [http://www.google.com/xhtml], [http://www.google.com/gwt/n]
3:44 pm on June 18, 2006 (gmt 0)

New User

5+ Year Member

joined:Mar 31, 2006
posts:13
votes: 0


So how do you detect these extra bots? None of them show up in any of my stats programs (or is that a sign that I should get a better stats app?)

And regarding the AdSense and image bots: do the regular optimization techniques apply to them as well, or is there a different set of tricks for handling these bots?
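On the detection question, one rough approach is to skip the stats program and read the raw access log. The sketch below assumes the Apache "combined" log format, where the user agent is the last quoted field on each line; the function name is hypothetical, and other log formats would need a different pattern.

```python
import re
from collections import Counter

# In the "combined" log format the user agent is the last quoted field.
LAST_QUOTED_FIELD = re.compile(r'"([^"]*)"\s*$')


def count_google_agents(log_lines):
    """Count requests per user-agent string for agents that mention "google"."""
    counts = Counter()
    for line in log_lines:
        match = LAST_QUOTED_FIELD.search(line)
        # Only the user-agent field is checked, so a visitor arriving from a
        # Google search (google.com in the referrer) is not counted as a bot.
        if match and "google" in match.group(1).lower():
            counts[match.group(1)] += 1
    return counts


if __name__ == "__main__":
    import sys

    # Usage: python count_agents.py access.log
    with open(sys.argv[1], encoding="utf-8", errors="replace") as log:
        for agent, hits in count_google_agents(log).most_common():
            print(hits, agent)
```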

8:15 pm on June 18, 2006 (gmt 0)

Junior Member

10+ Year Member

joined:Jan 9, 2005
posts:75
votes: 0


I have been doing a daily check to see how much the Mozilla bot has been spidering my site. Although the site is 70% de-indexed or supplemental, the Mozilla bot has been spidering over 200 'pages' a day for about 3 days.

Now for the on-topic bit: I now see that it has been requesting images, not pages. Yes, this is the Mozilla bot, not the image bot. This little guy: "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"

Is the Mozilla bot taking over the image bot's job?