Forum Moderators: martinibuster


Program strips Adsense

         

varya

7:29 am on Jan 9, 2004 (gmt 0)

10+ Year Member



The new "free" version of Netcaptor completely strips Adsense out of the page.

It also displays its own content-targeted text ads.

KenB

3:47 pm on Jan 12, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I ran a test to see whether the bot that provides NetCaptor with its context sensitive ads properly identifies itself in the user_agent string. It does not. It identifies itself as:
Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; .NET CLR1.0.3705; .NET CLR 1.1.4322)

This user_agent string is the exact UA string that my browser records minus the NetCaptor 7.5 entry. Either this is an odd coincidence or NetCaptor is providing the client's UA string to their ad supplier. Not only does this mean that NetCaptor is giving up personal user_agent data, but it means that the bot is not following standard practices of identifying itself correctly.

As if it weren't bad enough that the latest version of NetCaptor has bugs that cause its ad-blocking features to be turned on while it is itself displaying ads based on the content of the webpage, it is denying webmasters the ability to block the bot that helps deliver NetCaptor its ads. This is the very essence of the definition of scumware or parasite.

For those who are interested, the bot comes from the Class B IP block of 168.75.0.0 - 168.75.255.255, which according to ARIN WHOIS belongs to ClearBlue Technologies.

For those who are interested in blocking this bot, I'd recommend blocking the Class C IP address range of 168.75.65. In your .htaccess file the entry would be:

RewriteEngine On
RewriteCond %{REMOTE_ADDR} ^168\.75\.65\.
RewriteRule ^(.*)$ - [F]

I am monitoring the situation to see if NetCaptor's ad provider changes the IP address of their bot.
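
For anyone who would rather test the range in a script than in .htaccess, here is a minimal Python sketch of the same check. It assumes the "Class C" range above corresponds to the CIDR block 168.75.65.0/24; the function name and the sample addresses are made up for illustration.

```python
import ipaddress

# Assumption: the "Class C" range 168.75.65.x written as a /24 CIDR block.
BLOCKED_NETWORK = ipaddress.ip_network("168.75.65.0/24")


def is_blocked(remote_addr: str) -> bool:
    """Return True if the client address falls inside the blocked range."""
    return ipaddress.ip_address(remote_addr) in BLOCKED_NETWORK


if __name__ == "__main__":
    # Sample addresses chosen for illustration only.
    for addr in ("168.75.65.10", "168.75.66.1", "10.0.0.1"):
        print(addr, "blocked" if is_blocked(addr) else "allowed")
```

Matching on the parsed network rather than on a string prefix avoids false hits from addresses that merely start with the same characters.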

NetCaptor

3:58 pm on Jan 12, 2004 (gmt 0)

10+ Year Member



To KenB:

NetCaptor is not providing the UA of the user to our context sensitive ad provider. I'm not sure what "bot" you are referring to. NetCaptor can, at its user's preference, add "NetCaptor" to the parenthetical subsection of the UA string. The user can turn that off, as it appears you have done.

¦ the content of the webpage, it is denying
¦ webmasters the ability to block the bot that
¦ helps deliver NetCaptor its ads. This is the
¦ very essence of the definition of scumware or
¦ parasite.

Does Opera let you do this? What we're doing is not any different than what Opera is doing.

Now - it appears that some of you have configs that cause NetCaptor not to display some ads. It may be a JavaScript problem - we can't replicate the issue. Any ad blocking in the free version is entirely accidental. We're trying to determine what might be causing this, but we've been entirely unable to replicate it. Sorry,

Adam

Adam Stiles
Stilesoft Inc.

KenB

4:17 pm on Jan 12, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Google AdSense's bot identifies itself correctly and obeys the robots.txt file (User-agent: Mediapartners-Google*). At least on new, unindexed pages that do not use Google AdSense, webmasters can block Opera from being able to display ads. As for sites like mine that do use Google AdSense, we cannot stop Opera from displaying Google ads; however, Opera ALWAYS identifies itself with the keyword "Opera" in the UA string. As such, we can easily block Opera if we do not agree with its use of Google ads. In addition, even if someone hacks Opera and strips the word "Opera" out of their UA string, Opera can still be detected by checking whether the JavaScript object "window.opera" returns a true result.
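
To make the difference concrete, here is a rough Python sketch of both checks: a robots.txt rule that a self-identifying bot can be held to, and a server-side test for the "Opera" keyword. The robots.txt content, the function names and the sample UA strings are assumptions for illustration. The sketch also writes the agent name without the trailing asterisk, because Python's standard parser matches on the name itself.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that shuts out the AdSense crawler only.
ROBOTS_TXT = """\
User-agent: Mediapartners-Google
Disallow: /
"""


def may_fetch(user_agent: str, url: str) -> bool:
    """Ask the parsed robots.txt whether this user agent may fetch the URL."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


def claims_to_be_opera(user_agent: str) -> bool:
    """Server-side check for the 'Opera' keyword in the UA string."""
    return "opera" in user_agent.lower()


if __name__ == "__main__":
    page = "http://www.example.com/page.html"
    browser_like = "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)"
    print(may_fetch("Mediapartners-Google", page))
    print(may_fetch(browser_like, page))
    print(claims_to_be_opera(browser_like + " Opera 7.23 [en]"))
```

A rule keyed on a bot's name only helps when the bot sends that name. A crawler that presents a plain browser UA matches no rule in this file, so robots.txt gives the webmaster nothing to block on.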

The very fact that your product allows users to remove the keyword "NetCaptor" from the UA string, displays context sensitive ads AND blocks banner ads on websites makes it a parasite that should be held in contempt. If you don't like this designation, then you MUST remove the ability to remove the keyword "NetCaptor" from UA strings AND discontinue the use of context sensitive ads. By providing the ability to hide the word "NetCaptor" from the UA string, you are admitting that your practices are sleazy and that you know we would block your browser if we had the ability to do so.

For the record, many of us do not feel Opera's behavior is acceptable either and have taken measures to prevent it from displaying its ads in conjunction with our webpages.

crxchaos

5:31 pm on Jan 12, 2004 (gmt 0)

10+ Year Member



I've spent most of the afternoon playing around with NetCaptor since, as many others are reporting, it is not displaying AdSense ads. I think I have made some progress.

The code fetched by the iframe in the AdSense blocks is totally different when using IE/Mozilla/Opera compared to NetCaptor. The good code draws the adverts we all know and love. The bad code, as seen in NetCaptor, looks like this:

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
<HTML><HEAD>
<META http-equiv=Content-Type content="text/html; charset=windows-1252">
<META content="MSHTML 6.00.2800.1276" name=GENERATOR></HEAD>
<BODY><IMG src=""></BODY></HTML>

Obviously, the entire body of the page is missing for one reason or another.

When I load the 'good' code into NetCaptor, it actually displays fine. So I am left believing that Google is making some kind of error when generating the code.

However, Adam claims to be able to view AdSense ads. The only other clue I have is this:

<META http-equiv=Content-Type content="text/html; charset=iso-8859-1">

The above snippet is a line from the good code generated by Google when using IE6/WinXP. Perhaps some regional/language setting in conjunction with NetCaptor is causing the bug?

As a side note, why on Earth does Google generate code like this?

document.write('</ifr' + 'ame>');

NetCaptor

6:04 pm on Jan 12, 2004 (gmt 0)

10+ Year Member



Google writes the iframe JavaScript that way to keep HTML filters from blocking it. Some proxy servers block iframe code, so they split it up so that simple pattern matchers won't block those lines.
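
A quick Python sketch of the idea, assuming a filter that does nothing more than scan the raw source for the literal tag (the function names are invented for this example):

```python
# The line Google's script emits, as quoted in the post above.
SCRIPT_LINE = "document.write('</ifr' + 'ame>');"


def naive_filter_matches(source: str, pattern: str = "</iframe>") -> bool:
    """Mimic a simple pattern matcher that scans raw source text."""
    return pattern.lower() in source.lower()


def evaluated_output() -> str:
    """What the browser ends up writing once the two pieces are joined."""
    return "</ifr" + "ame>"


if __name__ == "__main__":
    print(naive_filter_matches(SCRIPT_LINE))         # scan of the script source
    print(naive_filter_matches(evaluated_output()))  # scan of the joined string
```

The literal tag never appears in the script source, only in the string the browser builds at run time, so a filter working on the source alone has nothing to match.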

Would anyone who is seeing AdSense getting blocked mind trying our 7.2.2 version? You can download it here:

download dot netcaptor dot com slash nc722 dot exe

This is an older version and does not have our context ads in it. I still don't think this problem is related to our ads, but we might be able to isolate when the "buglet" was added to our code that caused the problem some of you are seeing.

Adam

loanuniverse

6:13 pm on Jan 12, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



KenB:

Geez, I have rarely seen such a good slam without having it degenerate into an unacceptable flame.

I actually feel bad for netcaptor. Well, for a minute or so.... But, I am sure it will pass.

Jenstar

6:34 pm on Jan 12, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



NetCaptor 7.2.2 displays AdSense perfectly when using the default settings the program comes with.

<added>There are no JavaScript errors when the "suppress javascript errors" feature is turned off</added>

KenB

7:37 pm on Jan 12, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Geez, I have rarely seen such a good slam without having it degenerate into an unacceptable flame.

My intention isn't to flame. Adam has taken the admirable stance of speaking up and addressing the issue raised by this thread. I do tend to believe that the ad blocking in unregistered NetCaptor is an unintentional bug.

I am, however, trying to point out as clearly as possible that the overall problem extends well beyond just the ad-blocking issue. For instance, when I added a block for the IP address of the bot that indexes webpages so that the advertiser can provide NetCaptor with context sensitive ads, it worked for a couple of pages and then ads started to appear again. While I have to wait until tomorrow to see my logs, I believe this means that the ad supplier is using methods to circumvent efforts to prevent their bots from indexing a site. This is very sleazy and is unethical behavior for any bot. If a website owner does not want bot 'X' to index their site, they have the right to block bot 'X'.

Now granted, NetCaptor is not in control of the bot their ad provider uses; however, they have an ethical responsibility to require that their advertiser properly identify itself when it indexes pages and that the bots in question properly obey the robots.txt file.


Adam

NetCaptor is not providing the UA of the user to our context sensitive ad provider.

When NetCaptor sends an HTTP request to their advertiser to acquire text ads, the HTTP request contains the UA string in the header per normal HTTP protocol (just as normal HTTP requests do). The advertiser is then taking this UA string, stripping the NetCaptor information out, and using the result as the string their bot sends when it indexes a page to determine what ads to display.

I'm not sure what "bot" you are referring to. NetCaptor can, at its users preference, add "NetCaptor" to the parenthetical subsection of the UA string. The user can turn that off, as it appears you have done.

No, I am using NetCaptor in its default configuration. Here are the two hits to a test file I created, which I knew only NetCaptor and any bot indexing my page to provide NetCaptor with ads would hit. The file in question was deleted from my server immediately after the test was run.

{masked for security purposes}.example.com - - [11/Jan/2004:18:09:00 -0500] "GET /NetCaptor.html HTTP/1.1" 200 16182 "-" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; Q312461; NetCaptor 7.5.0 Gold; .NET CLR 1.0.3705; .NET CLR 1.1.4322)"

XXX.XXX.65.68 - - [11/Jan/2004:18:09:01 -0500] "GET /NetCaptor.html HTTP/1.1" 200 16182 "-" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; .NET CLR1.0.3705; .NET CLR 1.1.4322)"

Note these things:

  1. This file did not exist until less than a minute before my running this test.
  2. The hit from XXX.XXX.65.68 came one second after my hit.
  3. It contains my UA data minus "NetCaptor 7.5.0 Gold;"
  4. The file was deleted from my server immediately after my test.
  5. The ONLY way the second "client" could have known about this file is if NetCaptor had informed it of the existence of the file (which it did to ask for an ad).
  6. There is no way the advertiser can provide NetCaptor with context sensitive ads without indexing the page in question.
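
For anyone wanting to run the same comparison on their own logs, here is a minimal Python sketch. It assumes the combined log format shown above; the regular expression, the function names and the shortened `{masked}` host are stand-ins of mine, not anything taken from the server.

```python
import re
from datetime import datetime

# The two hits from the post; the first host is shortened to "{masked}".
HIT_BROWSER = (
    '{masked}.example.com - - [11/Jan/2004:18:09:00 -0500] '
    '"GET /NetCaptor.html HTTP/1.1" 200 16182 "-" '
    '"Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; Q312461; '
    'NetCaptor 7.5.0 Gold; .NET CLR 1.0.3705; .NET CLR 1.1.4322)"'
)
HIT_BOT = (
    'XXX.XXX.65.68 - - [11/Jan/2004:18:09:01 -0500] '
    '"GET /NetCaptor.html HTTP/1.1" 200 16182 "-" '
    '"Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; '
    '.NET CLR1.0.3705; .NET CLR 1.1.4322)"'
)

# Combined log format: host, timestamp, request, status, size, referer, UA.
LOG_PATTERN = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] "(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\S+) "(?P<referer>[^"]*)" "(?P<ua>[^"]*)"$'
)


def parse_hit(line: str) -> dict:
    """Split one log line into named fields and parse its timestamp."""
    match = LOG_PATTERN.match(line)
    if match is None:
        raise ValueError("not a combined-format log line")
    hit = match.groupdict()
    hit["time"] = datetime.strptime(hit["time"], "%d/%b/%Y:%H:%M:%S %z")
    return hit


def seconds_between(first: dict, second: dict) -> float:
    """Elapsed seconds from the first hit to the second."""
    return (second["time"] - first["time"]).total_seconds()


if __name__ == "__main__":
    browser, bot = parse_hit(HIT_BROWSER), parse_hit(HIT_BOT)
    print("same request:", browser["request"] == bot["request"])
    print("seconds apart:", seconds_between(browser, bot))
    print("NetCaptor token in second UA:", "NetCaptor" in bot["ua"])
```

Pairing hits by requested file and time gap is a crude heuristic, but on a file that nobody else could know about it is enough to tie the second request to the first.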

I don't think Adam is disingenuous, I just think he doesn't understand the mechanics of the way his browser gets its ads as well as he should.

[edited by: DaveAtIFG at 7:45 pm (utc) on Jan. 12, 2004]
[edit reason] Specifics deleted [/edit]

NetCaptor

7:51 pm on Jan 12, 2004 (gmt 0)

10+ Year Member



A clarification for KenB about the way the NetCaptor ad system works. We query our own search servers at http*//example.com/ with the URL of the active page. An ASP.NET application that I wrote takes that URL and queries ContextWeb for an XML feed of ads to display. The ASP.NET app gets that XML feed and merges it into HTML which is then displayed in a browser window under our toolbar.

Note that the client UA does get sent to example.com, but those are MY servers. Our server does not pass that UA on to ContextWeb to make the ad requests.
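
As a rough illustration of the merge step only, here is a short Python sketch (the real application is ASP.NET, and the actual ContextWeb feed format is not shown in this thread). The element names, the sample ad and the function name are all hypothetical.

```python
import html
import xml.etree.ElementTree as ET

# Hypothetical feed: the element names and the ad itself are invented.
SAMPLE_FEED = """\
<ads>
  <ad>
    <title>Warbird Models &amp; Kits</title>
    <url>http://example.com/models</url>
    <text>Scale kits for collectors.</text>
  </ad>
</ads>
"""


def render_ads(feed_xml: str) -> str:
    """Merge an XML ad feed into a small HTML fragment."""
    root = ET.fromstring(feed_xml)
    items = []
    for ad in root.findall("ad"):
        # Escape every value before placing it into the HTML output.
        title = html.escape(ad.findtext("title", default=""))
        url = html.escape(ad.findtext("url", default=""), quote=True)
        text = html.escape(ad.findtext("text", default=""))
        items.append(f'<li><a href="{url}">{title}</a> {text}</li>')
    return "<ul>\n" + "\n".join(items) + "\n</ul>"


if __name__ == "__main__":
    print(render_ads(SAMPLE_FEED))
```

In this arrangement the browser only ever talks to the intermediate server, so whatever user agent reaches the ad provider is the one the server-side code chooses to send.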

I'll check with ContextWeb about how they build their user-agent string. I will ask them to properly obey robots.txt. If they won't make the change, I won't continue to use their services.

Adam

[edited by: DaveAtIFG at 8:28 pm (utc) on Jan. 12, 2004]
[edit reason] Removed specifics [/edit]

loanuniverse

8:06 pm on Jan 12, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



KenB: Just a little clarification: I was not accusing you of flaming. I was just impressed by the way you explained your argument, clearly enough for a "non-technical" person like me to follow.

KenB

8:07 pm on Jan 12, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Very fair. Thank you Adam. I look forward to your follow up on this issue in the future. This response does put you in better stead than Opera, who refused to address concerns on this issue.

Again thank you.

Visi

11:22 pm on Jan 12, 2004 (gmt 0)

10+ Year Member



Just finished reading all of the posts... so is it fair to conclude that some bug in 7.5 is causing this, since the previous version is okay?

By the way Adam, glad to see your responses on this issue.

cubdriver

6:10 pm on Jan 13, 2004 (gmt 0)

10+ Year Member



My tuppence worth: I downloaded NetCaptor the day it was featured in the Wall Street Journal. It's the free version 7.5.0 Gold Personal Edition. The AdSense ads on my site, The Warbird's Forum, simply vanish when I view the site with NetCaptor. These are vertical ads. They have no border, and there is no blank space as such. (The material underneath them simply moves up to fill the space.)

The JavaScript I have on my pages to load images of books (linked to Amazon) works just fine. So do images & links to Amazon that are coded directly to display on the page. I have no paid ads, so I can't say about them.

Visi

12:34 pm on Jan 14, 2004 (gmt 0)

10+ Year Member



Any update on this?

varya

7:30 pm on Jan 14, 2004 (gmt 0)

10+ Year Member



Yeah, I was wondering myself. The initial response from Netcaptor was heartening, but the repeated insistence of "we can't replicate this problem" followed by silence is disappointing.

ThatAdamGuy

8:16 am on Jan 16, 2004 (gmt 0)

10+ Year Member



Hmm... this is a bit disheartening. If we don't hear anything else in this thread, I'll wait 'til Monday or so to drop the other Adam a line and see what's up.

KenB

1:29 pm on Jan 16, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



The other day I contacted ContextWeb, the company that supplies context sensitive ads to NetCaptor and similar programs, asking how to stop the delivery of their context sensitive ads in relation to my sites. They replied that they would block their bot from any domain name submitted to them. I submitted the domains I have control over, and indeed they added them to their block list. They also said that they are working to get their bots to obey the industry standard robots.txt file and to identify themselves correctly in the user_agent string.

To get your domains added to ContextWeb's list of sites their bot won't visit, go to ContextWeb's website (http://contextweb.com/contact.htm) and send them a polite email listing the domains you want them to stop indexing. In the email, politely explain why their bot needs to obey the robots.txt file and identify itself correctly in the user_agent string.

Brett_Tabke

2:20 pm on Jan 16, 2004 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



Whatever the issues are here, it is clear there was some misunderstanding and some classic foo involved. Whatever issues are left - take em up with the author off site.

Thanks
