Moderators: Ocean10000 & incrediBILL

Search Engine Spider and User Agent Identification Forum

This 219 message thread spans 8 pages: < < 219 ( 1 2 3 4 5 [6] 7 8 > >     
Register Scolds AVG For Generating Fake Traffic As Link Malware
Webmasters Complain AVG Debilitating Traffic Analytics

 8:52 pm on Jun 13, 2008 (gmt 0)

In an otherwise interesting article about AVG LinkScanner, the author spectacularly misses the point that, because it can easily be identified, it is worse than useless as a security tool.

But he does tell malware infested drive-by download sites how to fool it.





 8:08 pm on Jul 2, 2008 (gmt 0)

I don't think there was any intent to be malicious here

I would have agreed with you until a few days ago.

There have been four articles in The Register and countless posts on WebmasterWorld - as Jim said "We've raised the flag, we've rung the bell, we've shouted the alarum" - and several members (including me) have contacted AVG directly to show them the error of their ways.

The response from AVG was to try harder to deceive webmasters.

They may have been incompetent about it, but deception remains their intent.

I do not consider that anything but hostile.



 9:00 pm on Jul 2, 2008 (gmt 0)

The problem is that there is money to be made by showing the equivalent of blinking lights that do nothing on the computers of noobs. These clueless users have heard tales of getting infected online, but they got a computer for Christmas, and it's oh-so-cool to use Google or Yahoo search to find neat stuff. Wow, maps and everything!

So what does daughter or son do about grandma's new toy? You can't tell them to turn off JavaScript, and you can't tell them to forget about Flash, and you can't tell them to use common sense because they're new to computers and the web, and don't have any common sense. Chances are they'll be infected inside a month if they surf around for an hour a day. What you do is install the blinking lights and hope for the best.

This is the source of AVG's revenue stream for LinkScanner. They're going to ride out the storm and keep the revenue flowing. There's one thing they cannot do, and that is to show the blinking lights without a page prefetch. That would be deceptive advertising. If they just prefetched the headers instead of the whole page, that's deceptive advertising also. You and I know that any malware writer is having a good laugh over AVG, but the point is that AVG is making money, and that means they have to ignore the complaints from webmasters who don't appreciate the fake traffic.

AVG is between a rock and a hard place. They're making money and the revenue is legal. They're not likely to be stopped by regulators, because AVG approaches the point where it ought to attract regulatory attention but never quite gets there. At the same time, they're taking a long-term public relations hit. But still, that short-term money is very nice.

Complaints from webmasters will be ignored by AVG as long as they're making money and it's not illegal.


 9:21 pm on Jul 2, 2008 (gmt 0)

"They're making money" - and at least some of it they will need to spend on their own increased bandwidth, now that the "blinking lights" fetchers are hitting the front door and being sent back to their own home.


 9:28 pm on Jul 2, 2008 (gmt 0)

There is something else that is peculiar about AVG's legal situation. It's quite possible that their lawyers are telling them that they HAVE to be deceptive with their user-agent.

Look at it this way: If they used a unique and reliable user-agent, then their paying customers would end up on lists of IP addresses that are circulating on the web, and AVG would probably be partially liable if bad guys started using those lists. By being semi-deceptive, they at least have an argument that they tried to prevent this, and their liability is thereby reduced.

What a strange situation.


 9:48 pm on Jul 2, 2008 (gmt 0)

I've had enough of their joke system. This went live yesterday...

"You have been detected as using AVG LinkScanner on your computer. This product makes multiple repeated requests to every site listed in your Google, Yahoo, and Live searches, even those that you didn't plan on accessing yourself.

LinkScanner is supposed to protect you from bad websites, but cannot do so if the bad site you went to access has detected LinkScanner and has already sent it a fake message to fool it.

Of course LinkScanner can be detected and fooled. We have detected that YOU are using it. Think about it! Additionally, the LinkScanner access will be recorded by all of those sites in their log files, and recorded as if YOU had actually visited the site. This is true even for illegal and fake sites; you will have been recorded as a visitor.

Your usage of THIS site is hammering way too many of our limited resources. Your access to this site will be restricted when you reach the threshold we have set.

The only way to restore your access is to wait at least an hour, or to disable or uninstall the LinkScanner component from your computer.

More information about the problems that AVG is causing legitimate websites can be found at...."

[edited by: g1smd at 9:51 pm (utc) on July 2, 2008]


 9:48 pm on Jul 2, 2008 (gmt 0)

the source of AVG's revenue stream for LinkScanner

I happily pay normal bandwidth charges. I view the "LinkScanner Tax" rather differently.

When some pompous oaf in a faraway land tries to impose an unfair tax what do you do?

I believe there is a 230-year-old precedent.



 9:57 pm on Jul 2, 2008 (gmt 0)

So you're using a round-robin of IP addresses that you've detected as using LinkScanner? In other words, you have to detect them first as a prefetch, and then hope that a few seconds or minutes later they click on your link and come in on their browser. Your message is for their browser, obviously.

The round-robin of IP addresses would work nicely if your home page is served dynamically, because then you can check your list of IP addresses just before you serve every home page. You can also detect LinkScanner with such a page and keep your list of IP addresses up to date. But how do you do it if your home page is static?

It's a nice message, and if all webmasters could do this, it would be very effective in getting LinkScanner recalled by AVG.


 10:27 pm on Jul 2, 2008 (gmt 0)

You can do this for static pages if you can auto-prepend a script.

There's a setting in (I think) .htaccess that you can use to auto-run a script before the real page is loaded.

That's for Apache. I have no idea if IIS can do anything similar.
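If memory serves, the setting being remembered here is PHP's auto_prepend_file, which can be set from .htaccess where PHP runs as an Apache module. A sketch only, with a hypothetical script path:

```apache
# Run this PHP script before every PHP-parsed page is served.
# The path is a placeholder; requires mod_php and a host that
# allows php_value overrides in .htaccess.
php_value auto_prepend_file /home/example/linkscanner-check.php

# To cover "static" pages as well, .html can be handed to PHP
# (the exact handler/type name varies between PHP setups).
AddType application/x-httpd-php .html
```

Whether your host permits php_value in .htaccess is another matter entirely.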


As for my message above, the IP addresses of obvious LinkScanner users are filed for 2-3 hours and the message is shown to anyone from that IP address that returns in that time-frame.


 10:57 pm on Jul 2, 2008 (gmt 0)

IIS is a problem. MS screwed up badly on server features starting with htaccess and going on from there.

Two methods of holding IPs: in a file and in a global AppVar - like session vars but across all sites and not dependent on cookies of any sort.

Holding the list in a file is not really feasible across several sites on a server (virtual hosts). There are problems and delays in modifying the file except for simple append - at least in the time frame usually available.

Holding IPs as a list in an AppVar is more feasible but there are, in my experience, latency problems.

In either case, "who" does the periodic cleanup? If it's a designated site it could be a site that doesn't get triggered for ages. If any site can do it there are bound to be clashes with the File or Appvar possibly being corrupted.

I suppose you could use one AppVar per site, but then there is duplication on busy sites whenever several of them are hit from the same Google results page, and there can still be multiple users within a short time-frame.

I still wish I'd never listened to the guy who said, "ASP and MS servers are easy!" when I had the option, 12 years ago. :(


 5:33 am on Jul 3, 2008 (gmt 0)

I think I'm gonna try to filter the user agent below when NO referrer comes with it.

Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1)

Do you think it's gonna work?


 7:00 am on Jul 3, 2008 (gmt 0)

Cyclob, that's what I do.

I redirect the 2 versions of AVG's user agent IF they come without referrer.

Humans who happen to have Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1) as UA get to see the page like anyone else, and usually they look at more than one page, so it works.
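In .htaccess terms, that boils down to something like the following sketch - the UA string is the one quoted in this thread, the redirect target is a placeholder, and this is a reconstruction rather than anyone's actual production rules:

```apache
RewriteEngine On
# Don't match the destination page itself, or redirected
# requests would loop straight back into this rule.
RewriteCond %{REQUEST_URI} !^/blocked\.html$
# Blank referrer...
RewriteCond %{HTTP_REFERER} ^$
# ...combined with the exact UA string LinkScanner was reported to use.
RewriteCond %{HTTP_USER_AGENT} "^Mozilla/4\.0 \(compatible; MSIE 6\.0; Windows NT 5\.1; SV1\)$"
# Redirect somewhere cheap rather than 403 - earlier in this thread
# a plain 403 was reported to provoke repeated requests.
RewriteRule .* /blocked.html [R=302,L]
```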


 7:20 am on Jul 3, 2008 (gmt 0)

cool! thanks Staffa. I'll do it today.


 2:01 pm on Jul 3, 2008 (gmt 0)


You missed off the last line. The one that goes

"Security software that does not place you at additional risk can be found at <a href='..."



 3:55 pm on Jul 3, 2008 (gmt 0)

LinkScanner for Dummies

LinkScanner was originally released in 2006, the invention of Roger Thompson of the amusingly named Exploit Prevention Labs. It was easy to exploit then and it is easy to exploit now, but it didn't cause any real problems because almost nobody used it.

Grisoft bought LinkScanner in December 2007, and rebranded as AVG Technologies in advance of the release in April 2008 of version 8.0 of their hitherto popular and successful anti-virus package.

As part of the deal Roger Thompson became AVG's Chief Research Officer, and took all his eighteen staff with him - Greg Mosher became Vice-President of Engineering and Chris Weltzien became Vice-President of Business Development, according to a CNET report.

Problems with AVG LinkScanner first came to light on 31 March when WebmasterWorld contributor Umbra reported that a mystery user-agent (the infamous "1813") was blundering into his security traps. It was a while before DanA linked it to AVG on 28 April - just as users were starting to upgrade in their millions.

As "1813" was clearly a dishonest robotic tool the general response from webmasters was to leave it blocked - for many of us there was no need to do anything, as it already tripped our security traps.

But LinkScanner was so poorly designed that it reacted badly to a straight "403 Forbidden" response, making many repeated requests, and it also came to light that blocking it would mean that AVG users would be discouraged from visiting our sites.

There was a lot of confusion at first, but after a little testing - something AVG Technologies might employ in future - an effective method of dealing with LinkScanner was arrived at on 10 May which would stop it causing problems for webmasters.

Meanwhile the many other deficiencies of LinkScanner were exposed: the colossal waste of bandwidth, the destructive effect on statistical analysis, other poor design features and (most obviously) the fact that as it was so easily fooled it was a security risk for anyone who used it.

WebmasterWorld members contacted AVG and pointed them to our findings.

AVG ignored us.

The Register apparently got involved when alerted by another webmaster whose statistics were going haywire and who had no idea what the full facts were. Journalist Cade Metz was savvy enough to check out WebmasterWorld but understandably (he is not a webmaster) found all the technical debate confusing. He had a worthwhile story, though, and first published on 13 June.

In the article AVG's Roger Thompson gave short shrift to webmasters' concerns and gave The Register his immortal quote: "I don't want to sound flip about this, but if you want to make omelettes, you have to break some eggs."

In the comments to the story AVG asked for webmasters to contact them to help solve the issues. I was probably one of the first to do so and sent a cordial email to Pat Bitton suggesting that WebmasterWorld had all the information AVG needed. The response was arrogant, dismissive, and very close to offensive.

Over the next few weeks AVG happily posted anywhere but WebmasterWorld seeking help. Some websites proudly tell how they were contacted by Roger Thompson himself, and even by AVG's CEO Karel Obluk. The obvious conclusion is that the company had nobody on the payroll who had a clue about the web.

Or about security. The original article in The Register told any malware writer with reading skills exactly how to fool LinkScanner and to safely deliver a drive-by download - if they didn't already know. LinkScanner was so obvious and so easily fooled that anyone could do it.

It seems to have taken quite a while for this to sink in at AVG Technologies, though they were told about it often enough. Eventually they realised that LinkScanner was a security risk and tried to fix it. But they didn't know what they were doing and mistakenly introduced two more obvious fake user-agents before deciding to go back to the original Exploit Prevention Labs method of falsely claiming to be a genuine IE6 user.

Which, of course, was just as easy to fool.

Security companies - naturally - rarely admit to making security gaffes. So AVG's public utterances merely said they wanted to help webmasters with the analytics and bandwidth problems and that "we still enable those webmasters who want to filter our requests out of their results to do so".

But AVG would not tell webmasters how this could be done.

Possibly because they didn't know.

The entire exercise seems to have been a sham. As The Register pointed out this week, if there is any way for LinkScanner to be detected then it remains easy to fool. And The Register proved it by publishing details of one detection method for the whole world to see.

Even the "security experts" at AVG Technologies will understand that method by now, and action to prevent it should be expected in the "Service Pack 1" they have scheduled for release in mid-July.

So where does this leave us?

AVG appear to be saying they allow - by design - methods for concerned webmasters to detect LinkScanner, but will not say what they are and appear to be working hard to make any detection impossible.

Webmasters are busy preparing new methods. They don't like being forced to pay the "LinkScanner Tax" along with their normal bandwidth charges, they don't like having their statistical analysis rendered unusable, they don't like dishonest robots acting like malware to access their sites, and they don't like being insulted by clueless muppets like Roger "The Eggbreaker" Thompson and his friends at AVG.

The Register is also not finished - this sentence in the most recent article caught my eye:

"If a web master can identify AVG scans, so can a malware writer. But more on that later."

Watch this space. Or that one.



 4:21 pm on Jul 3, 2008 (gmt 0)

"an effective method of dealing with LinkScanner was arrived at on 10 May which would stop it causing problems for webmasters."

What did that effective method finally end up being? If it was the htaccess file, would someone please post the complete code and just exactly where it should be placed in the htaccess file (beginning or end)as I have other things in my htaccess and want to put the code where it needs to be.


 4:47 pm on Jul 3, 2008 (gmt 0)

What did that effective method finally end up being?

One point of my long and boring post above is that there is no "finally".

Explicit instructions posted here will have a short shelf-life in the current situation, and as AVG claim to enable webmasters to filter out LinkScanner my advice is to contact them and ask how.

Also, some webmasters use IIS servers, some do not have .htaccess privileges, and copy/paste solutions are not advisable as in some cases they will conflict with existing code and kill your site. You will have to do some reading, but everything you need is on WebmasterWorld.

I come to WebmasterWorld to learn from and (where possible) help other webmasters.

I think the best advice is to contact AVG Technologies for the official solution.



 7:46 pm on Jul 3, 2008 (gmt 0)

LinkScanner for Dummies

Samizdata, thank you so much for this summary and all your efforts!

On a related note, looks like this issue has made it to slashdot [slashdot.org]


 7:59 pm on Jul 3, 2008 (gmt 0)

*** the invention of Roger Thompson ***

That explains a few things.

There's no way he is going to back down on this, and I can certainly guess that he will not tell AVG that they bought a pile of junk from his old company.

It needs someone from the old AVG staff-base to get a clue and sort this debacle out. The ex-EPL people aren't likely to be admitting they produced a turkey.

*** slashdot ***

Hmmm. Kinda funny to see AVG being slash-dotted. I guess that the bandwidth charges at AVG Towers might be a bit elevated for the next few days.


 8:12 pm on Jul 3, 2008 (gmt 0)

I noticed that there is a tool called LinkScanner Online at explabs.com. Enter any url and "LinkScanner will tell you if it is exploit free"

Not sure yet if it differs from the LinkScanner embedded in AVG, but it does seem to use the same headers and SV1 user agent with no referer.


 9:24 pm on Jul 3, 2008 (gmt 0)

looks like this issue has made it to slashdot

The comments seem to be a repeat of the largely ill-informed ones The Register gets.

This would suggest that webmasters are much smarter than nerds and geeks.

Personal thanks to you Umbra for starting the original thread.



 9:27 pm on Jul 3, 2008 (gmt 0)

So much for LS's accuracy, I just tried one of my URLs at explabs ...
what a joke.

Result : a green star and Congratulations! LinkScanner Online did not find any exploits.

Of course they did not, they came in with UA Mozilla/4.0+(compatible;+MSIE+6.0;+Windows+NT+5.1;+SV1) (disregard the +s, IIS logs)
but from IP range 38.nnn.nnn.nnn which is completely blocked by my site so they got a 404

I guess they must consider that 404 is harmless to their customers ;o)


 11:25 pm on Jul 3, 2008 (gmt 0)

I noticed that there is a tool called LinkScanner Online at explabs.com. Enter any url and "LinkScanner will tell you if it is exploit free"

Yes, and that tool lets you experiment with hacked sites. I found a few cases where it doesn't work: it said the hacked site (which I posted before) was OK, and it still does weeks later.

I passed the info to Cade about this problem but never heard back from him so he probably didn't care that the Emperor has a wardrobe malfunction.

You can't protect against what your code doesn't alert on, and this was a simple Infected Sites 101 failure.


 11:34 pm on Jul 3, 2008 (gmt 0)

Since people keep asking, I'll cough up the real trick here for detecting AVG's LinkScanner. There are only two things you need to check to stop the link scanner regardless of their UA, so using these two conditions will always nail them to the wall, as well as some other junk out there that's improperly written.

RewriteCond %{HTTP_REFERER} ^$
RewriteCond %{HTTP:Accept-Encoding} ^$

If you don't check for the UA and only check for these 2 things you'll stop them forever unless they change this behavior as well.

Now don't say I never gave you anything ;)
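To turn those two conditions into a complete .htaccess ruleset, something along these lines should work - a sketch only, since the post above gives just the conditions and the choice of response is yours:

```apache
RewriteEngine On
# No Referer header at all...
RewriteCond %{HTTP_REFERER} ^$
# ...and no Accept-Encoding header either - real browsers send one.
RewriteCond %{HTTP:Accept-Encoding} ^$
# Refuse with 403 Forbidden; [L] stops further rule processing.
RewriteRule .* - [F,L]
```

Be aware of the trade-off: as discussed further down the thread, some firewalls reportedly strip these headers, so a small number of real visitors may be caught.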


 4:35 pm on Jul 4, 2008 (gmt 0)

Declaration of Independence

I do not submit to the LinkScanner Tax imposed by AVG Technologies.

I hold these truths to be self-evident:

AVG Technologies has no right to impose costs upon me for their own financial gain.

AVG Technologies has no right to access my server by deception.

AVG Technologies has no right to take my documents by deception.

AVG Technologies has no right to use my bandwidth by deception.

AVG Technologies has no right to abuse and usurp my property and resources.

"When a long train of abuses and usurpations, pursuing invariably the same Object evinces a design to reduce us under absolute Despotism, it is our right, it is our duty, to throw off such Government, and to provide new Guards for our future security."

The AVG LinkScanner Tax is levied without authority and imposed without consent.

Up with this I will not put.

4th July 2008



 5:05 pm on Jul 4, 2008 (gmt 0)

RewriteCond %{HTTP_REFERER} ^$
RewriteCond %{HTTP:Accept-Encoding} ^$

I understand that some firewalls will strip those headers. Won't these rules block the occasional person behind a firewall?


 5:31 pm on Jul 4, 2008 (gmt 0)

I understand that some firewalls will strip those headers. Won't these rules block the occasional person behind a firewall?

Maybe the referer, but not the combination as far as I know.

I've been blocking things that don't send the Accept for ages and very few things hit that barrier until LinkScanner showed up.


 5:58 pm on Jul 4, 2008 (gmt 0)

Maybe the referer, but not the combination as far as I know.

I've been blocking things that don't send the Accept for ages and very few things hit that barrier until LinkScanner showed up.

Are we talking about HTTP_ACCEPT or HTTP_ACCEPT_ENCODING? -- as the RewriteCond rules above show HTTP_ACCEPT_ENCODING.

I ran a short test yesterday that captured headers and found a handful of blank referer/accept-encoding requests that on the surface appear to be from residential IPs. I did a Google search and saw some anecdotal evidence that some firewalls may modify or remove the Accept-Encoding header. However, I did notice a scanning tool that gave HTTP_ACCEPT_ENCODING (but a blank HTTP_ACCEPT) and either a blank referer or a fake referer.
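For anyone wanting to collect the same evidence, an Apache access-log format that records the suspect headers side by side would look something like this (it goes in the main server config rather than .htaccess; the nickname and file name are arbitrary):

```apache
# Log client IP, Referer, Accept, Accept-Encoding and User-Agent
# so requests with blank headers can be examined before any blocking.
LogFormat "%h \"%{Referer}i\" \"%{Accept}i\" \"%{Accept-Encoding}i\" \"%{User-Agent}i\"" headercheck
CustomLog logs/headercheck.log headercheck
```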

[edited by: Umbra at 6:01 pm (utc) on July 4, 2008]


 7:07 pm on Jul 4, 2008 (gmt 0)

One interesting "feature" of LinkScanner surfaced in the comments on Slashdot.

Google, as we all know, often changes the logo on their home page to mark an event of some sort (current or historical) and hyperlinks it to a trusted primary resource on an external site - an inbound link I am sure we would all appreciate if it pointed to one of our sites.

Countless millions of people hit the Google homepage every day, and though only a small percentage are likely to follow the link the number who do will be significant enough to be noticeable.

Most people don't even notice the link is there, of course.

But LinkScanner does.



 8:02 pm on Jul 4, 2008 (gmt 0)

I understand that some firewalls will strip those headers. Won't these rules block the occasional person behind a firewall?

Followup: I found a request with this header:

HTTP________ = ----:----------<snip>
HTTP________________ = ----- -------

Could it be a firewall modifying this?

HTTP_REFERER = http://www.example.com


 11:38 pm on Jul 4, 2008 (gmt 0)

I cannot offer a definitive explanation Umbra - I have never seen what you describe and don't use the same methods anyway - but it seems to me that a request which so deliberately refuses to say what it will accept in response cannot reasonably expect to be fulfilled.



 1:05 am on Jul 5, 2008 (gmt 0)

I log submitted form content plus environment variables for anti-form-spam traces if the browser parameters look dodgy (missing referers etc). Looking through one such log I find a variety of actual dummy characters: typically underscores, hyphens, tildes and X's.

The format of the dozen or so I've just examined suggests the blanks to be HTTP_ACCEPT_ENCODING and HTTP_REFERER - the character counts fit, the latter's value is a variable length and both variables always seem to be otherwise missing.

I've always put it down to some half-baked anti-virus / firewall app that's trying to hide stuff that it deems unimportant just so it can claim to be protecting the ignorant victim - sorry, user! None of the important stuff ever seems to be missing apart from those two - which, of course, turns the user into a rejected robot by incredibill's logic.

I do know that the dummy/missing parameter sets often come from valid browsers: I can determine this from the contents of the trapped forms, which are obviously different from auto-submitted spam.


All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved