GoogleToolbar in User Agent

Does this imply it's not reading the page?

         

dstiles

10:08 pm on Mar 3, 2010 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



After reading the GTB section of WebmasterWorld I'm still confused about this. Perhaps UA specialists can help.

Does the inclusion of the term GoogleToolBar or GTB in a browser's UA mean it's on an automated spree that the human won't see or is it always there even when browsing?

At present I record the UA in a log. The log indicates very little activity from GTB UAs, suggesting it's a sort of bookmark checker. I'm wondering if I should still return a full page or if I can return a much shorter 405.
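A minimal sketch of how the two tokens in question could be told apart when tagging log entries. It assumes, based only on UA strings reported in this thread, that the standalone form looks like `GoogleToolbar 6.3.1106.427` and the embedded form looks like `GTB6.3`. The function name and both regular expressions are illustrative, not anything documented by Google.

```python
import re

# Assumed token shapes, taken from UA strings reported in this thread.
STANDALONE = re.compile(r"GoogleToolbar[ +]([\d.]+)", re.IGNORECASE)
EMBEDDED = re.compile(r"\bGTB(\d+(?:\.\d+)?)")


def classify_toolbar_ua(user_agent):
    """Return (kind, version); kind is 'googletoolbar', 'gtb' or None."""
    # Check the spelled-out form first: it identifies the toolbar itself.
    match = STANDALONE.search(user_agent)
    if match:
        return "googletoolbar", match.group(1)
    # Otherwise look for the short token a browser appends to its own UA.
    match = EMBEDDED.search(user_agent)
    if match:
        return "gtb", match.group(1)
    return None, None
```

Recording the returned kind next to each logged hit would make it easier to compare how the two forms behave before deciding what to block.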

blend27

1:17 am on Mar 4, 2010 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



I was just wondering the exact same thing 2 days ago and meant to post it here but was waiting to gather some more evidence.

caribguy

7:24 am on Mar 4, 2010 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



From what I can see, it's not always automated, although I certainly understand what you're getting at.

One of the sites I just checked includes a JS onmousedown event that leads to a different subdomain. That subdomain visit from a Comcast user was preceded by the only possible legitimate pageview.

UA:
"Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.0; Trident/4.0; GTB6.3; SLCC1; .NET CLR 2.0.50727; Media Center PC 5.0; .NET CLR 3.5.30729; .NET CLR 3.0.30729; yie8)"


A fair number of other visits pass the 'smell test' as far as I can tell.

dstiles

12:10 am on Mar 5, 2010 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Hmm. Was that the only file taken or were there pics as well? Google can read JS and it would be installed in the browser...

I can see I'll have to look through the logs more closely...

One of my Blocked logs shows...

Mozilla/4.0 (compatible; GoogleToolbar 6.3.1106.427; Windows 6.0; MSIE 8.0.6001.18882)

then 8 off...

Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.0; Trident/4.0; GTB6.3; SLCC1; .NET CLR 2.0.50727; Media Center PC 5.0; .NET CLR 3.5.21022; OfficeLiveConnector.1.3; OfficeLivePatch.0.0; .NET CLR 3.5.30729; .NET CLR 3.0.30729)

then another googletoolbar, then a final gtb. Took 2 minutes in all, so either a person or a long auto-delay (see also below).

The googletoolbar hits had no valid header fields.

The gtb hits had two Accept types: */* and a proper Accept string (don't know exactly what).

An examination of the W3C logs shows a lot of activity on a fairly small site, with several files being accessed several times and favicon being looked for where it shouldn't have been. Images, CSS, JS etc were loaded. Took from 13:40:03 to 16:14:05 with only three breaks of about 10 and one of 30 minutes. There was then a longer break and activity resumed at 19:08:57 until 19:43:34, although only brief activity with short breaks. Most of the files after a short period returned 304's. Total pages fetched about 55, several duplicated.

The above Googletoolbar section began when a PDF was accessed. The toolbar asked twice for favicon in the same folder as the PDF (which is ALL it contained apart from a standard redirect/go-away index file). It suggests that (in this case) GoogleToolBar is automated and GTB is a real browser (or controlled by OfficeLive - not sure how that works). I suspect, though, that GTB was passing on info about the browsing session as well as googletoolbar. Ok, I'm paranoid. :)

At a few points the AOL "toolbar" ee://aol/http joined in, trying several times to fetch favicon from where it wasn't.

The whole thing looks a mess but it was probably "mostly human". I wonder if OfficeLive was actively involved in things rather than google but I don't know how OL works.

I need to analyse this more, to discover if it's "real" activity - if I can! :(

caribguy

1:23 am on Mar 5, 2010 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Haven't noticed it spelled out before, just GTB6.3; - will take a closer look later tonight.

Maybe see what this fine person/bot with GTB6; (valid?) was trying to do when it hit a 404 on
www.example.com/t =
without referrer...


HTTP_ACCEPT'*/*'
CONNECTION_TYPE'Keep-Alive'
HTTP_USER_AGENT'Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 5.1; Trident/4.0; GTB6; .NET CLR 1.1.4322; .NET CLR 2.0.50727)'
HTTP_COOKIE'__utma=123456789; __utmz=123456789.utmcsr=bing|utmccn=(organic)|utmcmd=organic|utmctr=my%20keyword; __utmb=123456789; __utmc=123456789; _ZopeId="123456789abcd"; i_blah=1'
HTTP_ACCEPT_LANGUAGE'en-us'
HTTP_ACCEPT_ENCODING'gzip, deflate'
etc...


Edit: the same fellow choked on this about 2 minutes earlier, doesn't inspire much confidence....

www.example.com/dhtmlSuite-tabVie-col><!-- login form --><DIV id=google_masthead><SCRIPT type=text/javascript><!-- google_ad_client =


Apparently it munged part of the source code for that page:

<div id="lh-col">
<!-- login form -->
<div id="google_masthead">
<script type="text/javascript"><!--
google_ad_client =

dstiles

1:26 am on Mar 6, 2010 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



The reason I'm wondering about blocking GTB/GoogleToolBar is because I've had evidence in the past of bad behaviour. Whether it's intrinsic to the toolbar or whether it's the user or perhaps other installed software I haven't worked out.

The access block I'm trying to analyze maintains three or four (temp IIS) session cookies at various points and drops cookies altogether in a few cases, including but not always favicon fetches.

I've isolated a total of four UAs for the single IP, suggesting (probably) two different computers on the same IP. Evidence suggests the first one is an auto-activity perhaps integrated in the browsing activity. The short aol UA performed icon-fetching, presumably for the aol browser. I've force-wrapped the longer UAs for clarity.

Mozilla/4.0+(compatible;+GoogleToolbar+6.3.1106.427;+Windows+6.0;+MSIE+8.0.6001.18882)

Mozilla/4.0+(compatible;+MSIE+7.0;+AOL+10.1;+AOLBuild+2.1.84.1;+brand=aol;
+Windows+NT+6.0;+Trident/4.0;
+GTB6.3;+SLCC1;+.NET+CLR+2.0.50727;
+Media+Center+PC+5.0;+.NET+CLR+3.5.21022;
+OfficeLiveConnector.1.3;+OfficeLivePatch.0.0;
+.NET+CLR+3.5.30729;+.NET+CLR+3.0.30729)

Mozilla/4.0+(compatible;+MSIE+8.0;+Windows+NT+6.0;+Trident/4.0;
+GTB6.3;+SLCC1;+.NET+CLR+2.0.50727;+Media+Center+PC+5.0;
+.NET+CLR+3.5.21022;
+OfficeLiveConnector.1.3;+OfficeLivePatch.0.0;
+.NET+CLR+3.5.30729;+.NET+CLR+3.0.30729)

ee://aol/http
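For comparing these against UAs captured elsewhere, a small sketch that rejoins a force-wrapped UA and turns the `+` separators back into spaces. It assumes every `+` in the logged value stands for a space, which would mangle any UA containing a literal plus sign; the function name is made up for illustration.

```python
def unwrap_logged_user_agent(wrapped):
    """Rejoin a force-wrapped UA and restore spaces from '+' separators."""
    # Drop the artificial line breaks added for readability.
    joined = "".join(line.strip() for line in wrapped.splitlines())
    # Assumption: every '+' was originally a space.
    return joined.replace("+", " ")
```

Applied to the third UA above, this should give back the same MSIE 8.0 / GTB6.3 string that appeared in the blocked log earlier in the thread.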

I think this particular access session is atypical in some respects in that it seems to use two browsers/computers. It is giving me a few insights, though.

I've previously blocked certain variations of the toolbar UA, especially if it had unlikely headers (usually all missing). I'm now rebuilding my trap software and wondering whether to step up or down toolbar trapping.

If ALL instances of the toolbar read pages I suppose I shall have to let it go, but if I can pinpoint "wasteful" accesses I'd rather block them as a means of discouraging this kind of activity.

caribguy

7:42 am on Mar 6, 2010 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



RE your comment from March 4: yes, the GTB6.3 user exhibited verifiable human behavior.

About GoogleToolbar: every single hit was for favicon.ico - except for one instance where favicon.gif was requested after receiving a 403 (it may have triggered a block because of headers, but I don't know for sure)

"ee://aol/http" only downloads favicon.ico

GTB6; 6.3 or 6.4 as part of the UA don't raise any specific flags - I went through a few thousand lines for each variant and things generally look ok.

dstiles

11:16 pm on Mar 6, 2010 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Thanks, Caribguy. From my own research I tend to agree. So it's safe to do anything at all to googletoolbar (and aol) but if they only hit icons there isn't much point in actually blocking them - more bytes than it's worth. :)

In the past I've seen some malformed GTB UAs but they may have been screwed up by MSIE's "Let's add everything" policy of UA updating or by some other add-in such as bsalsa.

I think I'll continue to include logging for them all for now and see what comes up.

dstiles

8:58 pm on Mar 11, 2010 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



I've found a few "googletoolbar" UAs over the past week hitting home pages. Looking at them I've still decided to block them with a 405. From what I've seen, all googletoolbar types arrive with null headers.
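A rough sketch of that rule, for illustration only: answer 405 when the UA contains "googletoolbar" and none of the usual request headers are present. Which headers count as "usual" is an assumption here (`Accept`, `Accept-Language`, `Accept-Encoding`), and the function name is hypothetical; it is not the actual trap software.

```python
# Assumption: an ordinary browser request carries at least one of these.
EXPECTED_HEADERS = ("Accept", "Accept-Language", "Accept-Encoding")


def status_for_request(headers):
    """Return 405 for a GoogleToolbar UA with none of the expected headers."""
    # Normalise header names so lookups are case-insensitive.
    lowered = {name.lower(): value for name, value in headers.items()}
    user_agent = lowered.get("user-agent", "")
    if "googletoolbar" not in user_agent.lower():
        return 200
    # A missing header and a header with an empty value are treated alike.
    has_headers = any(
        lowered.get(name.lower(), "").strip() for name in EXPECTED_HEADERS
    )
    return 200 if has_headers else 405
```

Keeping the header test separate from the UA test means a toolbar request that does send proper headers is still served normally.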

I've been looking through a few form spams the past few days and noticed some using Firefox with the string "GTBDFff GTB7.0" added to the end of the UA. I'm assuming this is a custom version of GTB. I don't think it's limited to form spammers: that's almost certainly coincidental. Some of the hits submitted empty forms; whether that's the fault of this plug-in I don't know. If anyone knows what GTBDFff means I'd be mildly interested.

GaryK

11:43 pm on Mar 31, 2010 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Sorry I haven't been around in a long while.

I'm certainly no expert when it comes to analyzing headers and such. I just collect user agents and try to identify them. Based on my understanding of this thread I think you all are wondering whether or not to ban/block any UA with a GTB or GoogleToolbar token in it.

After seeing lots of questionable activity from almost 19,000 unique UAs with one or the other token in it I decided to add it to my ban list late last year.

I make my ban list available via my browser project site in the form of an httpd.ini file for ISAPI_Rewrite.

Within a week I had over a hundred comments from people about the entry turning away legitimate users so I removed it.

dstiles

7:55 pm on Apr 1, 2010 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



I take it you removed both types of UAs at the same time?

I've been rejecting googletoolbar UAs with no complaints so far, for reasons noted above. I think GTB is the real browser and the other just pokes around on its own.

GaryK

12:07 am on Apr 2, 2010 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I thought about removing GoogleToolbar first since that's what I had the most comments about, but ultimately I did remove them both at the same time.

I don't know if this helps or not, but from 2006 to date I've seen 19,354 unique GTB UAs and 353 unique GoogleToolbar UAs.
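For anyone wanting comparable figures from their own logs, a small sketch of counting unique UA strings per token. How the numbers above were actually produced is not stated, so the matching rules here are assumptions, and the function name is hypothetical.

```python
import re
from collections import defaultdict

# Assumed matching rules: spelled-out name, or "GTB" followed by a digit.
STANDALONE = re.compile(r"GoogleToolbar", re.IGNORECASE)
EMBEDDED = re.compile(r"\bGTB\d")


def count_unique_uas(user_agents):
    """Count distinct UA strings that carry each toolbar token."""
    unique = defaultdict(set)
    for ua in user_agents:
        if STANDALONE.search(ua):
            unique["GoogleToolbar"].add(ua)
        elif EMBEDDED.search(ua):
            unique["GTB"].add(ua)
    # Sets discard repeats, so each distinct string is counted once.
    return {token: len(uas) for token, uas in unique.items()}
```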

dstiles

8:50 pm on Apr 2, 2010 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



That surprises me, Gary, that you had complaints about googletoolbar. My own checks suggest it's automated, and it does arrive with no headers worth noting, which is typical bad-bot behaviour. I'll keep an eye on it.

Staffa

9:16 pm on Apr 2, 2010 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Since the advent of Sidewiki I redirect all GTB UAs to a special page explaining why and what to do about it.
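Purely as an illustration of this kind of redirect, here is a sketch written as Python WSGI middleware. The site in question evidently runs .asp pages, so this is a stand-in, not the code in use; the notice path is borrowed from the log excerpt in this post, and the "GTB followed by a digit" test is an assumption.

```python
import re

# Assumption: the embedded token is "GTB" followed by a digit, e.g. GTB0.0.
GTB_TOKEN = re.compile(r"\bGTB\d")


def redirect_gtb(app, notice_path="/directory/gtb.asp"):
    """Wrap a WSGI app so GTB user agents are sent to an explanatory page."""
    def middleware(environ, start_response):
        user_agent = environ.get("HTTP_USER_AGENT", "")
        on_notice_page = environ.get("PATH_INFO", "") == notice_path
        # Skip the redirect on the notice page itself to avoid a loop.
        if GTB_TOKEN.search(user_agent) and not on_notice_page:
            headers = [("Location", notice_path), ("Content-Length", "0")]
            start_response("302 Found", headers)
            return [b""]
        return app(environ, start_response)
    return middleware
```

Requests without the token pass straight through to the wrapped application.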

Just today, and not for the first time, a visitor arrived and viewed a number of pages with a browser without GTB, then switched to a browser whose UA includes GTB to access an email account, clicked on the link and was redirected to the special page.

Hung around for a while, likely trying the suggestions and "when all else fails" changed browser again and continued browsing the site for quite a while longer.

This has happened on a number of occasions although there are also visitors who move on, likely because they can't be bothered or don't understand what I'm talking about. Never mind ;)

02/04/2010 03:23:38----65.126.nnn.nnn----/directory/realpage.asp ----Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 5.1; Trident/4.0; .NET CLR 2.0.50727; .NET CLR 3.0.04506.648; .NET CLR 3.5.21022; .NET CLR 1.1.4322; .NET CLR 3.0.4506.2152; .NET CLR 3.5.30729)
02/04/2010 03:24:10----65.126.nnn.nnn----/directory/gtb.asp ----Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.0; Trident/4.0; GTB0.0; SLCC1; .NET CLR 2.0.50727; Media Center PC 5.0; Tablet PC 2.0; InfoPath.2; .NET CLR 3.5.21022; .NET CLR 3.5.30729; .NET CLR 3.0.30618)
02/04/2010 03:24:10----65.126.nnn.nnn----/directory/gtb.asp ----Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.0; Trident/4.0; GTB0.0; SLCC1; .NET CLR 2.0.50727; Media Center PC 5.0; Tablet PC 2.0; InfoPath.2; .NET CLR 3.5.21022; .NET CLR 3.5.30729; .NET CLR 3.0.30618)
02/04/2010 03:24:35----65.126.nnn.nnn----/directory/gtb.asp ----Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.0; Trident/4.0; GTB0.0; SLCC1; .NET CLR 2.0.50727; Media Center PC 5.0; Tablet PC 2.0; InfoPath.2; .NET CLR 3.5.21022; .NET CLR 3.5.30729; .NET CLR 3.0.30618)
02/04/2010 03:24:35----65.126.nnn.nnn----/directory/gtb.asp ----Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.0; Trident/4.0; GTB0.0; SLCC1; .NET CLR 2.0.50727; Media Center PC 5.0; Tablet PC 2.0; InfoPath.2; .NET CLR 3.5.21022; .NET CLR 3.5.30729; .NET CLR 3.0.30618)
02/04/2010 03:25:06----65.126.nnn.nnn----/directory/gtb.asp ----Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.0; Trident/4.0; GTB0.0; SLCC1; .NET CLR 2.0.50727; Media Center PC 5.0; Tablet PC 2.0; InfoPath.2; .NET CLR 3.5.21022; .NET CLR 3.5.30729; .NET CLR 3.0.30618)
02/04/2010 03:26:04----65.126.nnn.nnn----/directory/gtb.asp ----Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.0; Trident/4.0; GTB0.0; SLCC1; .NET CLR 2.0.50727; Media Center PC 5.0; Tablet PC 2.0; InfoPath.2; .NET CLR 3.5.21022; .NET CLR 3.5.30729; .NET CLR 3.0.30618)
02/04/2010 03:27:05----65.126.nnn.nnn----/directory/realpage.asp ----Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 5.1; Trident/4.0; .NET CLR 2.0.50727; .NET CLR 3.0.04506.648; .NET CLR 3.5.21022; .NET CLR 1.1.4322; .NET CLR 3.0.4506.2152; .NET CLR 3.5.30729)
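A sketch for pulling the fields out of log lines in the format above and measuring how long a visitor lingered on one page. It assumes the fields are separated by `----` and that the date is day/month/year (the post is dated 2 April 2010); the function names are hypothetical.

```python
from datetime import datetime


def parse_log_line(line):
    """Split one '----'-delimited log line into its four fields."""
    stamp, ip, path, user_agent = (
        part.strip() for part in line.split("----", 3)
    )
    return {
        # Assumption: day/month/year ordering.
        "time": datetime.strptime(stamp, "%d/%m/%Y %H:%M:%S"),
        "ip": ip,
        "path": path,
        "user_agent": user_agent,
    }


def seconds_on_path(lines, path):
    """Seconds between the first and last logged hit on the given path."""
    times = [
        entry["time"]
        for entry in map(parse_log_line, lines)
        if entry["path"] == path
    ]
    if not times:
        return 0.0
    return (max(times) - min(times)).total_seconds()
```

For the excerpt above, the hits on `/directory/gtb.asp` run from 03:24:10 to 03:26:04, a little under two minutes on the special page.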

BTW, is Sidewiki still relevant?

GaryK

11:01 pm on Apr 2, 2010 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



"That surprises me, Gary, that you had complaints about googletoolbar."

Do you mean prior to or after blocking it?

dstiles

7:12 pm on Apr 3, 2010 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Whilst it was blocked.

GaryK

4:23 am on Apr 6, 2010 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I know very little about either GoogleToolbar or GTB, but definitely got more complaints about GoogleToolbar while it was blocked which was why I added it back again.

dstiles

11:02 pm on Apr 6, 2010 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Hmm. Looks like we've come full circle, then. Don't block anything. :(

dstiles

9:11 pm on Jun 12, 2010 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



An extra on this one.

As noted in the recent eSobi thread I've taken notice of the identifier GTB0.0 as a toolbar ID often seen with extra toolbar plug-ins (e.g. eSobi) and with high-speed browser scrapes. It probably exists without plug-ins as well.

I have just accidentally blocked a customer who has this in her browser UA and am chasing up on it.

From a couple of forum postings elsewhere it LOOKS as if it might be something installed by a google mail account, possibly when there is no other toolbar installed. This is conjecture only at this point.