Google Toolbar as Spider

What should be returned?

         

dstiles

7:32 pm on Aug 18, 2008 (gmt 0)


Posting here since this is the spidering activity of GTB.

What should one return for this? Does it matter?

I'm getting an increasing number of hits from GTB, some of them very dumb. Can I return something equally dumb or does it really matter? For example:

GTB asks for favicon.ico: not the one whose location I list in the web page header, but always one in the root directory. I don't keep them there because I don't want icon scrapers walking off with them, and they seldom look elsewhere. (Firefox is similarly dumb in that it always looks in the root first, but it then follows the correct link.) In some instances GTB looks for the same favicon several times within a few seconds. Obviously, a missing icon returns a 404.

Sometimes (not always) GTB, having got a 404 on the missing favicon, goes for favicon.gif, which also gets a 404 since it doesn't exist.

At other times GTB goes for the home page - rarely for any other. Does it need the whole page or is a simple reassurance good enough? Certainly a simple "Hello dumbo" would save a lot of bandwidth over some of the home pages it asks for.

Since GTB has a very poor set of headers I've had to ensure I don't log the IP as a baddy.
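
For anyone wanting to experiment with the "simple reassurance" idea, here is a rough sketch in Python. It assumes GTB can be recognised by the "GoogleToolbar" token in its User-Agent (as in the UA string quoted further down this thread), and the function names is_google_toolbar and choose_response are made up for illustration. Whether GTB is actually satisfied by a tiny page is not confirmed anywhere here, so treat this as something to test rather than a recommendation.

```python
def is_google_toolbar(user_agent):
    """Return True if the User-Agent carries the GoogleToolbar token.

    Assumption: GTB always includes the literal text "GoogleToolbar".
    """
    return "GoogleToolbar" in (user_agent or "")


def choose_response(path, user_agent):
    """Pick a (status, headers, body) triple for a GTB request.

    Returns None when the request should be handled by the normal site code.
    """
    if not is_google_toolbar(user_agent):
        return None

    if path in ("/favicon.ico", "/favicon.gif"):
        # No icon is kept in the root, so answer with a bodiless 404.
        return ("404 Not Found", [("Content-Length", "0")], b"")

    if path == "/":
        # A tiny placeholder instead of the full home page (hypothetical body).
        body = b"<html><head><title>OK</title></head><body></body></html>"
        headers = [
            ("Content-Type", "text/html"),
            ("Content-Length", str(len(body))),
        ]
        return ("200 OK", headers, body)

    # Any other path: let the site respond as usual.
    return None
```

The function only decides what to send; wiring it into a particular server (and making sure the IP is not logged as a baddy) is left to whatever the site already uses.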

dstiles

5:15 pm on Aug 27, 2008 (gmt 0)


Further to this, can someone confirm or deny the following GTB UA, please? It claims to be an XP browser, but the "big" suggests it may be a 64-bit machine. Either way, the "Windows XP" should not be there if it's just the Google Toolbar ID added to a working browser UA, should it? Surely it's supposed to be "Windows NT 5.1".

Mozilla/4.0 (compatible; GoogleToolbar 4.0.1601.4978-big; Windows XP 5.1; MSIE 7.0.5730.13)

The dotted numbers are slightly variable, as one would expect with updated software.
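
Since the dotted numbers vary, matching the UA by pattern rather than by exact string seems safer. Below is a minimal sketch in Python that picks the string above apart. The pattern is inferred from this one sample only, so the field names (version, variant, platform, msie) and the assumption that the "-big" suffix is optional are guesses, not a documented GTB format.

```python
import re

# Pattern inferred from a single observed UA; other GTB builds may differ.
GTB_UA = re.compile(
    r"GoogleToolbar (?P<version>\d+(?:\.\d+)*)(?:-(?P<variant>\w+))?;"
    r" (?P<platform>[^;]+); MSIE (?P<msie>\d+(?:\.\d+)*)"
)


def parse_gtb_ua(ua):
    """Return the GTB fields as a dict, or None if the UA does not match."""
    match = GTB_UA.search(ua)
    return match.groupdict() if match else None


sample = ("Mozilla/4.0 (compatible; GoogleToolbar 4.0.1601.4978-big; "
          "Windows XP 5.1; MSIE 7.0.5730.13)")
fields = parse_gtb_ua(sample)
# fields["platform"] is "Windows XP 5.1", not the usual "Windows NT 5.1"
```

Logging the platform field for each GTB hit would show whether "Windows XP 5.1" turns up consistently or only from certain clients.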