Forum Moderators: martinibuster


Exabot causing AdSense impressions

This bot executes JavaScript for screenshots

         

zCat

10:22 pm on Feb 28, 2007 (gmt 0)

10+ Year Member



I've just noticed this pattern in my logs:

193.47.80.37 - - [28/Feb/2007:23:01:25 +0100] "GET /obscure/page/with-adsense-on.html HTTP/1.1" 200 14429 "-" "Mozilla/5.0 (compatible; Exabot/3.0; +http://www.exabot.com/go/robot)"
193.47.80.94 - - [28/Feb/2007:23:01:24 +0100] "GET /obscure/page/with-adsense-on.html HTTP/1.1" 200 14429 "-" "Mozilla/5.0 (compatible; Konqueror/3.4; Linux) KHTML/3.4.3 (like Gecko)"
66.249.72.5 - - [28/Feb/2007:23:01:26 +0100] "GET /obscure/page/with-adsense-on.html HTTP/1.1" 200 14415 "-" "Mediapartners-Google/2.1"

Exabot is the crawler behind the French search engine Exalead ( [exalead.com...] ), which makes heavy use of thumbnail screenshots. There is a Linux utility that uses the KDE infrastructure and the Konqueror browser to load web pages and generate screenshots, and that is evidently what is executing the AdSense JavaScript.

I shall stop serving ads to the Exalead IP range; be interesting to see what that does to my AdSense statistics. (I wonder if Google ignores impressions originating from the Exalead servers?)
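A minimal sketch of that kind of IP check, in Python. The network 193.47.80.0/24 is an assumption: the log excerpt above only shows two addresses in it (193.47.80.37 and 193.47.80.94), and the function name `should_serve_ads` is made up for illustration.

```python
import ipaddress

# Assumption: the whole 193.47.80.0/24 block belongs to Exalead.
# The logs only confirm 193.47.80.37 and 193.47.80.94.
ASSUMED_EXALEAD_NETWORKS = [ipaddress.ip_network("193.47.80.0/24")]


def should_serve_ads(remote_addr: str) -> bool:
    """Return False when the request comes from the assumed Exalead range."""
    try:
        addr = ipaddress.ip_address(remote_addr)
    except ValueError:
        # Unparseable address: fall back to the normal behaviour.
        return True
    return not any(addr in net for net in ASSUMED_EXALEAD_NETWORKS)
```

The page template would call this with the visitor's address and leave out the ad block when it returns False.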

jonaspersson75

10:41 pm on Feb 28, 2007 (gmt 0)



<del>adsense uses javascript, bots dont know javascript. what you say is not true.</del>

sorry. i just realized what kind of search engine it is.

fredw

11:31 pm on Feb 28, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Don't we suspect G is smart enough to ignore/discount clicks of this type generally?

zCat

11:43 pm on Feb 28, 2007 (gmt 0)

10+ Year Member



Don't we suspect G is smart enough to ignore/discount clicks of this type generally?

It's not causing clicks but additional impressions, which could be skewing the statistics, e.g. by reducing the CTR.

Whether or not that is causing any kind of problem, I'm not sure though.

incrediBILL

4:17 pm on Mar 1, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



The solution to your problem is the following change to robots.txt:

User-agent: exabot
Disallow: /
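For anyone who wants to check how a rule like this is interpreted, here is a small sketch using Python's standard `urllib.robotparser`. The helper name `allowed` is hypothetical. It only shows how a compliant parser reads the rule; the thread does not establish whether the Konqueror-based screenshot fetcher honours robots.txt.

```python
from urllib.robotparser import RobotFileParser

# The two-line rule from the post above.
RULES = """\
User-agent: exabot
Disallow: /
"""


def allowed(agent_token: str, path: str, robots_txt: str = RULES) -> bool:
    """Ask the standard-library parser whether agent_token may fetch path."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent_token, path)


if __name__ == "__main__":
    page = "/obscure/page/with-adsense-on.html"
    print("Exabot allowed:", allowed("Exabot", page))
    print("Googlebot allowed:", allowed("Googlebot", page))
```

Note that the parser matches on the robot's name token ("Exabot"), not on the full "Mozilla/5.0 (compatible; ...)" string from the logs.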

Jafo

4:24 pm on Mar 1, 2007 (gmt 0)

10+ Year Member



Not disputing you, just wondering how you can determine from your logs whether or not exabot is actually rendering the javascript? I would not think it COULD be logged. It shows they access the page, but not whether or not it executes the javascript.

I would imagine exabot themselves would stop rendering adsense as it would be a bit intensive on their servers.

joelgreen

4:34 pm on Mar 1, 2007 (gmt 0)

10+ Year Member



It shows they access the page

I would assume a bot executes JavaScript if it also loads the blahblah.js files.

jomaxx

4:34 pm on Mar 1, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Reread post 1. The Mediapartners spider visited 2 seconds after Exabot, which at the very least is highly suggestive. It's also stated that the spider is designed to create thumbnail images of the page, which is presumably why Javascript is executed.

[edited by: jomaxx at 4:35 pm (utc) on Mar. 1, 2007]
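The timing argument can be checked mechanically. The sketch below, in Python, scans combined-format log lines and reports Mediapartners-Google hits that follow a request from a given IP prefix to the same path within a short window. The prefix "193.47.80.", the 10-second window, and all function names are assumptions chosen for illustration; the sample lines are the three from post 1.

```python
import re
from datetime import datetime, timedelta

# Apache combined log format, as in the lines quoted in post 1.
LOG_RE = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<ts>[^\]]+)\] '
    r'"\S+ (?P<path>\S+) [^"]*" \d+ \S+ "[^"]*" "(?P<ua>[^"]*)"'
)

SAMPLE = [
    '193.47.80.37 - - [28/Feb/2007:23:01:25 +0100] "GET /obscure/page/with-adsense-on.html HTTP/1.1" 200 14429 "-" "Mozilla/5.0 (compatible; Exabot/3.0; +http://www.exabot.com/go/robot)"',
    '193.47.80.94 - - [28/Feb/2007:23:01:24 +0100] "GET /obscure/page/with-adsense-on.html HTTP/1.1" 200 14429 "-" "Mozilla/5.0 (compatible; Konqueror/3.4; Linux) KHTML/3.4.3 (like Gecko)"',
    '66.249.72.5 - - [28/Feb/2007:23:01:26 +0100] "GET /obscure/page/with-adsense-on.html HTTP/1.1" 200 14415 "-" "Mediapartners-Google/2.1"',
]


def parse_line(line):
    """Return (ip, timestamp, path, user_agent), or None if the line does not match."""
    m = LOG_RE.match(line)
    if m is None:
        return None
    ts = datetime.strptime(m.group("ts"), "%d/%b/%Y:%H:%M:%S %z")
    return m.group("ip"), ts, m.group("path"), m.group("ua")


def mediapartners_followups(lines, trigger_prefix="193.47.80.",
                            window=timedelta(seconds=10)):
    """List (path, timestamp) of Mediapartners hits shortly after a trigger hit."""
    hits = [h for h in map(parse_line, lines) if h is not None]
    triggers = [(ts, path) for ip, ts, path, _ in hits
                if ip.startswith(trigger_prefix)]
    found = []
    for _, ts, path, ua in hits:
        if "Mediapartners-Google" not in ua:
            continue
        # A follow-up is a Mediapartners hit on the same path, 0..window later.
        if any(p == path and timedelta(0) <= ts - t <= window
               for t, p in triggers):
            found.append((path, ts))
    return found
```

A match like this is only suggestive, as noted above: it shows the ad crawler arrived right after the screenshot request, not that an impression was counted.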

encyclo

4:34 pm on Mar 1, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



User-agent: exabot 
Disallow: /

Exalead supplies search results to AOL France, so if you have French-language content it is worth allowing their bot. I'm tempted to cloak for the bot to remove the AdSense block.

incrediBILL

4:56 pm on Mar 1, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Exalead supplies search results to AOL France, so if you have French-language content it is worth allowing their bot. I'm tempted to cloak for the bot to remove the AdSense block.

Most of my French traffic comes from other places, so I blocked their little spider many months ago and couldn't be happier ;)

[edited by: incrediBILL at 4:56 pm (utc) on Mar. 1, 2007]

jomaxx

5:02 pm on Mar 1, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



BTW I just took a look at Alexa out of curiosity. I don't know how they generate the thumbnails they show, but those screenshots also show that Javascript and even AdSense code were executed.

encyclo

5:22 pm on Mar 1, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Most of my french traffic comes from other places so I blocked their little spider many months ago and couldn't be happier ;)

I get a very small trickle of traffic from AOL France, so it's just about worth it for me, and as I'm already doing some simple cloaking for AdSense (logged in, logged out etc.) it's pretty easy to check for Exabot. Alexa is totally banned, though. :)

Just wait until Googlebot starts parsing with Firefox code, then we'll have to come up with a solution. ;)
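A rough Python sketch of a user-agent check for Exabot, with one caveat that the logs in post 1 make visible: the request that appears to render the page announced itself as Konqueror, not Exabot, so a user-agent test alone would probably miss it. The function name `looks_like_exabot` is hypothetical.

```python
# User-agent strings copied from the log lines in post 1.
EXABOT_UA = "Mozilla/5.0 (compatible; Exabot/3.0; +http://www.exabot.com/go/robot)"
KONQUEROR_UA = "Mozilla/5.0 (compatible; Konqueror/3.4; Linux) KHTML/3.4.3 (like Gecko)"


def looks_like_exabot(user_agent: str) -> bool:
    """Case-insensitive substring test on the User-Agent header."""
    return "exabot" in user_agent.lower()


if __name__ == "__main__":
    print(looks_like_exabot(EXABOT_UA))     # the crawler request
    print(looks_like_exabot(KONQUEROR_UA))  # the screenshot request
```

In practice this would likely need to be combined with an IP-based test to catch the Konqueror request as well.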

ConfusedButCommitted

6:32 pm on Mar 1, 2007 (gmt 0)

10+ Year Member




I've been suspecting behavior like this for a while.
Besides exabot, how could I determine the extent of these javascript executing bots? Is there a list to refer to?

joelgreen

9:41 pm on Mar 1, 2007 (gmt 0)

10+ Year Member



Alexa is totally banned, though. :)

Does Alexa generate images for subsequent pages? It seems to me it generates an image for the home page only, which may be refreshed once per day or so (thus not generating noticeable impressions). Am I wrong? Is there a reason to ban Alexa?

incrediBILL

10:20 pm on Mar 1, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Besides exabot, how could I determine the extent of these javascript executing bots?

I know of Snapbot, Exalead and Alexa for starters, but the difference is that Alexa just does a thumbshot of the home page, as opposed to Snapbot, which tried to make 40K screenshots on my site and got NADA.

zCat

1:00 am on Mar 2, 2007 (gmt 0)

10+ Year Member



The software being used here to make the screenshots is very likely "khtml2png2" ( [sourceforge.net...] ); I tried it out and it's quite nifty, and easily automatisable. I think I recall reading somewhere that Snapbot uses the same.

From my stats yesterday, Exalead accessed around 250 pages with this bot - if that translated into 250 AdSense impressions, that is (for that particular site) quite a dilution of the CTR / eCPM.
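To put a number on the dilution, here is a small Python sketch. Only the figure of 250 bot impressions comes from the stats above; the 1000 human impressions and 20 clicks in the usage lines are hypothetical values picked purely for illustration, and it assumes every bot page view is counted as an impression.

```python
def ctr(clicks: int, impressions: int) -> float:
    """Click-through rate as a fraction; 0.0 when there are no impressions."""
    return clicks / impressions if impressions else 0.0


def diluted_ctr(clicks: int, human_impressions: int,
                bot_impressions: int = 250) -> float:
    """CTR when bot page views are counted as impressions but never click."""
    return ctr(clicks, human_impressions + bot_impressions)


if __name__ == "__main__":
    # Hypothetical day: 1000 human impressions, 20 clicks.
    print(f"without bot: {ctr(20, 1000):.2%}")
    print(f"with bot:    {diluted_ctr(20, 1000):.2%}")
```

The smaller the site's real traffic, the larger the relative drop for the same 250 extra impressions.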

Looking at Exalead more closely, they have a "preview" function in which they present a cached version of the page embedded in their own site, sans AdSense code. Can't find any hits originating from their SERPs.

Bye bye Exalead.