Will the Real Googlebot-Mobile Please Step Forward.

Googlebot-Mobile

         

Ocean10000

3:35 am on Oct 4, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Since I am working on updating my site to detect the newer crawlers, I thought I might shed some light on Googlebot-Mobile's multiple personalities, as I have seen them on my website over the last year.

I am posting the full headers, minus any referrer information (if any was supplied) and with the Host replaced by example.com. I am also supplying the IP ranges I have seen them coming from, which I can verify are Google-owned via whois lookup.


Spoofing DoCoMo

<Headers>
<header name="Connection" value="Keep-alive" />
<header name="Accept" value="text/plain,text/html" />
<header name="Accept-Encoding" value="gzip,deflate" />
<header name="From" value="googlebot(at)googlebot.com" />
<header name="Host" value="example.com" />
<header name="User-Agent" value="DoCoMo/2.0 N905i(c100;TB;W24H16) (compatible; Googlebot-Mobile/2.1; +http://www.google.com/bot.html)" />
</Headers>

<BotRange>
<range StartIp="66.249.64.0" EndIp="66.249.95.255" />
<range StartIp="209.85.128.0" EndIp="209.85.255.255" />
</BotRange>

<Info FirstSeen="10/10/2008 12:03:00 PM" LastVisit="9/29/2009 11:36:00 PM" />

This version fetches the same files as the regular Googlebot, and doesn't mind HTML/XHTML files at all.
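
For anyone who wants to test a visitor's address against the two ranges listed above, here is a minimal sketch in Python. The names `OBSERVED_RANGES` and `in_observed_ranges` are made up for illustration, and the ranges are only the ones observed in these logs, not an official or current list from Google.

```python
import ipaddress

# Ranges copied from the BotRange block above.
# Assumption: these are log observations only, not an official Google list.
OBSERVED_RANGES = [
    ("66.249.64.0", "66.249.95.255"),
    ("209.85.128.0", "209.85.255.255"),
]


def in_observed_ranges(ip, ranges=OBSERVED_RANGES):
    """Return True if an IPv4 address falls inside any (start, end) range, inclusive."""
    addr = ipaddress.ip_address(ip)
    if addr.version != 4:
        return False  # only IPv4 ranges were recorded in the post
    return any(
        ipaddress.IPv4Address(start) <= addr <= ipaddress.IPv4Address(end)
        for start, end in ranges
    )


print(in_observed_ranges("66.249.70.12"))  # True
print(in_observed_ranges("192.0.2.1"))     # False
```

A range match alone only says the address sits in a block seen here; it is best combined with a check of the User-Agent string.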


Spoofing Phone.com

<Headers>
<header name="Connection" value="Keep-alive" />
<header name="Accept" value="application/vnd.wap.xhtml+xml,application/xhtml+xml;q=0.9,text/vnd.wap.wml;q=0.8,text/html;q=0.7,*/*;q=0.6" />
<header name="Accept-Encoding" value="gzip,deflate" />
<header name="From" value="googlebot(at)googlebot.com" />
<header name="Host" value="example.com" />
<header name="User-Agent" value="SAMSUNG-SGH-E250/1.0 Profile/MIDP-2.0 Configuration/CLDC-1.1 UP.Browser/6.2.3.3.c.1.101 (GUI) MMP/2.0 (compatible; Googlebot-Mobile/2.1; +http://www.google.com/bot.html)" />
</Headers>

<BotRange>
<range StartIp="66.249.64.0" EndIp="66.249.95.255" />
</BotRange>

<Info FirstSeen="2/5/2009 6:02:00 AM" LastVisit="10/3/2009 9:58:00 PM" />

I see maybe one or two hits a week from this version. I don't think it likes the XHTML website it is crawling. I think this version is actually looking for content designed specifically for mobile phones.


Spoofing iPhone

<Headers>
<header name="Connection" value="Keep-alive" />
<header name="Accept" value="*/*" />
<header name="Accept-Encoding" value="gzip,deflate" />
<header name="From" value="googlebot(at)googlebot.com" />
<header name="Host" value="example.com" />
<header name="User-Agent" value="Mozilla/5.0 (iPhone; U; CPU like Mac OS X; en) AppleWebKit/420+ (KHTML, like Gecko) Version/3.0 Mobile/1A543a Safari/419.3 (compatible; Googlebot-Mobile/2.1; +http://www.google.com/bot.html)" />
</Headers>

<BotRange>
<range StartIp="66.249.64.0" EndIp="66.249.95.255" />
</BotRange>

<Info FirstSeen="5/9/2009 6:02:00 AM" LastVisit="8/7/2009 5:18:00 AM" />

This is the new kid on the block; I have only just started noticing it in my logs. I have had too few hits from this version to tell much about it.


There are more variations of Googlebot-Mobile, but I have not seen them in the last year on the websites which I monitor.

You may have noticed all the different User-Agent variations still list Googlebot-Mobile as version 2.1.
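
Since every variation carries the same "(compatible; Googlebot-Mobile/2.1; ...)" suffix, detection can key on that suffix and then look at which handset is being imitated. Below is a rough Python sketch of that idea. The function name `classify_googlebot_mobile` and the variant labels are hypothetical, and the handset tests (a `DoCoMo/` prefix, `UP.Browser/`, `iPhone`) are assumptions drawn only from the three header dumps above, so other variations would come back as "unknown".

```python
import re

# Matches the suffix shared by all three User-Agent strings shown above
# and captures the advertised Googlebot-Mobile version.
GOOGLEBOT_MOBILE_RE = re.compile(
    r"\(compatible; Googlebot-Mobile/(?P<version>[\d.]+); "
    r"\+http://www\.google\.com/bot\.html\)"
)


def classify_googlebot_mobile(user_agent):
    """Return (variant, version) for a Googlebot-Mobile UA, or None otherwise."""
    match = GOOGLEBOT_MOBILE_RE.search(user_agent)
    if match is None:
        return None
    # Assumption: these markers are taken from the three samples in this post.
    if user_agent.startswith("DoCoMo/"):
        variant = "DoCoMo"
    elif "UP.Browser/" in user_agent:
        variant = "Phone.com"
    elif "iPhone" in user_agent:
        variant = "iPhone"
    else:
        variant = "unknown"
    return variant, match.group("version")


ua = ("DoCoMo/2.0 N905i(c100;TB;W24H16) "
      "(compatible; Googlebot-Mobile/2.1; +http://www.google.com/bot.html)")
print(classify_googlebot_mobile(ua))  # ('DoCoMo', '2.1')
```

The User-Agent is trivially forged, so a match here says nothing about whether the request really came from Google; it only sorts claimed Googlebot-Mobile hits by flavor.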

Pfui

5:34 pm on Oct 4, 2009 (gmt 0)

Thanks for your info, Owen! I'm curious: Did any of Google's mobile UAs request robots.txt?

1.) If yes, did they heed it?
2.) If no, do you have "User-agent: Googlebot-Mobile" specifically allowed or disallowed?

Aside: That's the UA Google defines as one of its robots.txt testing agents in Webmaster Tools/Site Configuration/Crawler access.

(FWIW: Thus far, all of Googlebot's mobile variations read and heed robots.txt on my sites.)
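
To make question 2 concrete, here is a small Python sketch of how a robots.txt group aimed specifically at "User-agent: Googlebot-Mobile" would be evaluated. The robots.txt content, the `/desktop-only/` path, and the helper name `allowed` are all hypothetical, and the parsing is done by Python's standard `urllib.robotparser`, whose matching rules are not guaranteed to be the same as Google's own. Treat it as an illustration only.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: one group for Googlebot-Mobile, one for everyone else.
ROBOTS_TXT = """\
User-agent: Googlebot-Mobile
Disallow: /desktop-only/

User-agent: *
Disallow:
"""


def allowed(agent_token, url, robots_txt=ROBOTS_TXT):
    """Ask Python's robots.txt parser whether agent_token may fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent_token, url)


# The mobile crawler is kept out of the hypothetical desktop-only area...
print(allowed("Googlebot-Mobile", "http://example.com/desktop-only/page.html"))  # False
# ...while the regular crawler falls through to the catch-all group.
print(allowed("Googlebot", "http://example.com/desktop-only/page.html"))         # True
```

Note that the token passed in is the short robot name, not the full spoofed handset User-Agent string.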

Ocean10000

10:24 pm on Oct 4, 2009 (gmt 0)

A small problem I have with Googlebot is that more than one user-agent can come from the same IP. From my quick scan of the last couple of days of logs, it is usually the non-mobile user-agent that downloads robots.txt, not the mobile versions:
Mozilla/5.0+(compatible;+Googlebot/2.1;++http://www.google.com/bot.html)

They follow it since I give Google full access to my website.