The conventional bot's user-agent string has a lower-case "bot":
"Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
Google has made it a little difficult to sort out all of their bots with one search string. Searching the logs for "/2.1" works (except, of course, for Googlebot-Image/1.0), but it also turns up some unrelated odds and ends.
User-agent strings extracted directly from my logs:
Google-: "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
Of course, as mentioned, there are also Froogle, Mobile, and Feed (for RSS feeds).
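For anyone who wants to sort these out in their own logs, here is a minimal sketch that labels a user-agent string by Google bot type. The Googlebot/2.1 and Googlebot-Image/1.0 strings are the ones quoted above; the other tokens (Googlebot-Mobile, Mediapartners-Google, Feedfetcher-Google) and the labels are my assumptions, so check them against your own logs before relying on them.

```python
# Order matters: the more specific tokens must be tested before plain "Googlebot",
# because "Googlebot" is a substring of "Googlebot-Image" and "Googlebot-Mobile".
GOOGLE_BOT_TOKENS = [
    ("Googlebot-Image", "image"),
    ("Googlebot-Mobile", "mobile"),        # assumed token, not quoted in this thread
    ("Mediapartners-Google", "adsense"),   # assumed token for the AdSense bot
    ("Feedfetcher-Google", "feed"),        # assumed token for the feed fetcher
    ("Googlebot", "web"),
]


def classify_google_bot(user_agent):
    """Return a short label for a Google bot user-agent, or None if it is not one."""
    ua = user_agent.lower()
    for token, label in GOOGLE_BOT_TOKENS:
        if token.lower() in ua:
            return label
    return None


print(classify_google_bot(
    "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
))
print(classify_google_bot("Googlebot-Image/1.0"))
```

Matching on the bot name rather than on "/2.1" should avoid the unrelated odds and ends, at the cost of keeping the token list up to date.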
Also, Google now appears to be using HTTP/1.1 exclusively; until recently there was a mix of HTTP/1.1 and HTTP/1.0. One important thing to note is that Google now always requests GZIP-compressed content if your server provides it. Your website might get an "attaboy" if you serve GZIP-compressed content to the bots. It could also cut your website's bandwidth usage and boost your site's performance.
(Webmasterworld and Google serve GZIP compressed content)
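To get a feel for the bandwidth side of this, here is a rough sketch of what a server does with the Accept-Encoding request header: compress the body only when the client lists gzip. The function names and the sample page are made up for illustration, and the header parsing is deliberately simplified; real servers normally do this through their own compression module.

```python
import gzip


def accepts_gzip(accept_encoding):
    """Return True if an Accept-Encoding value lists gzip and does not refuse it with q=0."""
    for part in accept_encoding.split(","):
        coding, _, params = part.strip().partition(";")
        if coding.strip().lower() == "gzip":
            return params.replace(" ", "").lower() not in ("q=0", "q=0.0")
    return False


def maybe_compress(body, accept_encoding):
    """Return (payload, extra_headers), compressing only when the client asked for gzip."""
    if accepts_gzip(accept_encoding):
        return gzip.compress(body), {"Content-Encoding": "gzip"}
    return body, {}


# Hypothetical page body; repetitive HTML like this compresses very well.
page = b"<html><body>" + b"<p>Hello, Googlebot.</p>" * 200 + b"</body></html>"
payload, headers = maybe_compress(page, "gzip,deflate")
print(len(page), "bytes raw,", len(payload), "bytes sent,", headers)
```

Real pages are less repetitive than this sample, so the saving on your own site will be smaller than what this prints.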
I look at the DC pages all the time for my keywords <using an online tool>.
DCs are subdivided into 'New', and various wild-card DNS numbers like .104, .107 and so on.
What I would like to know is the geographical locations for these DNS numbers.
I presume Google puts DCs all over the place to reduce long distance bandwidth
and to balance the insane load that all those surfers place on their servers.
Does anybody have a list of where 18.104.22.168 is for example?
I mean the physical location (City, State ..) of the servers.
Are any overseas? Lots of them? Where the heck are they?
Most of them give me good SERP positions, but a few have my site in the dumpster.
If those few are in Upper Volta or Ananaguay I don't care so very much.
Whois lookups always point back to Mountain View, California (or very nearby)
where Google HQ are, and that tells me nothing. Any help appreciated! -Larry
[edited by: tedster at 5:44 am (utc) on Dec. 20, 2006]
You can use the "ITR client" and then "trace" the route to the IP addresses. This trace typically shows physical locations along a communication path.
Your firewall or modem firewall may block this capability to some extent.
I was really hoping that all this was public knowledge, and that somebody had a simple list up,
123.123.123 Dallas, TX
234.234.234 Boston, MA
121.121.111 Paris, France .. and so on.
I believe this could be considered a link to an authority site.
On the right of this page you will see a "click here" link to the "ITR Client" download. This Windows executable is very useful and free.
I double checked: I had to change my DSL modem firewall from LOW to OFF to fully enable the route tracing function of the ITR client.
This link may help with Googlebot Image
Sign up for sitemaps above. There is a robots.txt analysis tool that Google provides. You don't have to have a sitemap.xml file to sign up for sitemaps, but you may have to verify your site ownership. Google actually asks you to place a uniquely named file in your Website, so they can verify you are who you say you are!
If you blocked all bots but Googlebot, you probably are blocking Google Images, and Google Mobile. Of course there's also Adsense and Adwords bots.
Allowing only specific bots is now risky because search engines keep inventing new ones.
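Here is a small sketch of why an allow-list is risky, using Python's standard urllib.robotparser. The robots.txt content and the "SomeNewBot" name are hypothetical, and I am assuming Mediapartners-Google is the AdSense bot's token.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that lets Googlebot in and shuts everyone else out.
ROBOTS_TXT = """\
User-agent: Googlebot
Disallow:

User-agent: *
Disallow: /
"""


def allowed_agents(robots_txt, agents, path="/index.html"):
    """Map each user-agent name to True/False for fetching `path` under these rules."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {agent: parser.can_fetch(agent, path) for agent in agents}


result = allowed_agents(ROBOTS_TXT, ["Googlebot", "Mediapartners-Google", "SomeNewBot"])
for agent, ok in result.items():
    print(agent, "allowed" if ok else "blocked")
```

Under these rules the AdSense bot and any bot invented later fall into the catch-all block. Keep in mind that this only shows how Python's parser reads the file; each crawler applies its own matching rules, so the robots.txt analysis tool is still the better check for Google's bots.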
I do use the Sitemaps and did use the robots.txt analysis tool in the 'early days'. But since the Sitemaps page had misspelt the Mediapartners robot (*), I wasn't 100% confident in its results/info.
Plus it gave incorrect analysis results for the normal Googlebot, although that was fixed on a return visit a short while after ... yeah - I just didn't trust that page ;-)
Maybe Google have sorted it out since the early days of that analysis tool/feature, I shall give it another go.
(*) at least, if it was correct it was spelt differently to the Adsense guidelines and help pages.
And regarding the AdSense and image bots: do the regular optimization techniques apply to them as well, or is there a different set of tricks for handling these bots?
Now for the on-topic bit: I now see that it has been requesting images, not pages. Yes, this is the Mozilla bot, not the image bot. This little guy: "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
Is the Mozilla bot taking over the image bot's job?
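If anyone wants to check their own logs for the same thing, here is a rough sketch that lists image requests made by the regular Googlebot rather than Googlebot-Image. It assumes Apache "combined" log format; the sample lines, IP addresses, and paths are invented for illustration.

```python
import re

# Matches the request, status, size, referrer and user-agent fields of a combined-format line.
LOG_RE = re.compile(
    r'"(?P<method>[A-Z]+) (?P<path>\S+) [^"]*" \d{3} \S+ "[^"]*" "(?P<ua>[^"]*)"'
)
IMAGE_EXTS = (".gif", ".jpg", ".jpeg", ".png")


def image_hits_by_web_googlebot(lines):
    """Return the image paths requested with a 'Googlebot/' (not 'Googlebot-Image/') user-agent."""
    hits = []
    for line in lines:
        m = LOG_RE.search(line)
        if not m:
            continue
        path = m.group("path").split("?", 1)[0].lower()
        if "Googlebot/" in m.group("ua") and path.endswith(IMAGE_EXTS):
            hits.append(m.group("path"))
    return hits


# Invented sample lines (192.0.2.x is a documentation-only address range).
SAMPLE = [
    '192.0.2.1 - - [20/Dec/2006:05:44:00 +0000] "GET /img/logo.gif HTTP/1.1" 200 1234 "-" '
    '"Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
    '192.0.2.2 - - [20/Dec/2006:05:45:00 +0000] "GET /img/photo.jpg HTTP/1.1" 200 5678 "-" '
    '"Googlebot-Image/1.0"',
    '192.0.2.1 - - [20/Dec/2006:05:46:00 +0000] "GET /index.html HTTP/1.1" 200 4321 "-" '
    '"Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
]
print(image_hits_by_web_googlebot(SAMPLE))
```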