Forum Library, Charter, Moderators: Ocean10000 & incrediBILL

Search Engine Spider and User Agent Identification Forum

This 219 message thread spans 8 pages; this is page 4.
Register Scolds AVG For Generating Fake Traffic As Link Malware
Webmasters Complain AVG Debilitating Traffic Analytics
Samizdata




msg:3674412
 8:52 pm on Jun 13, 2008 (gmt 0)

In an otherwise interesting article about AVG LinkScanner the author spectacularly misses the point that because it can easily be identified it is worse than useless as a security tool.

But he does tell malware-infested drive-by download sites how to fool it.

[theregister.co.uk...]

...

 

mlduclos




msg:3683410
 7:29 pm on Jun 25, 2008 (gmt 0)

Hello

My server is being overloaded by this stupid new "feature" for AVG. I think people should continue to notify AVG about this problem, so they can fix it.

My DB is causing a huge increase in server load and I have already had 2 DDoS incidents. Can someone please send me a .htaccess code to redirect user agent

"User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1)"

to a static html page?
Thanks

jimbeetle




msg:3683439
 8:00 pm on Jun 25, 2008 (gmt 0)

I don't have anything to add to the conversation (besides the obvious "Duh!?!"). I have been following this and the other related thread since the beginning. Just want to give a hat tip to everybody who has contributed and made it so even a dunderhead like me can understand the problem.

And thanks also for the entertainment value.

wilderness




msg:3683455
 8:12 pm on Jun 25, 2008 (gmt 0)

Can someone please send me a .htaccess code to redirect user agent

[webmasterworld.com...]

Samizdata




msg:3683478
 8:46 pm on Jun 25, 2008 (gmt 0)

And thanks also for the entertainment value

They certainly seem to have hired all the wackiest clowns at AVG Technologies.

...

mlduclos




msg:3683535
 10:19 pm on Jun 25, 2008 (gmt 0)

hey wilder, tks

So no solution at all? We should pressure AVG so they revert this incredible "feature" in the next version.

About 50% of my traffic is currently pseudo-traffic from bots and AVG. The server is not handling it well.

Samizdata




msg:3683541
 10:38 pm on Jun 25, 2008 (gmt 0)

people should continue to notify AVG about this problem, so they can fix it

I would strongly encourage you to contact AVG with the exact details of your problem.

The bad news is that what you are seeing IS their attempt to fix it.

...

mlduclos




msg:3683543
 10:45 pm on Jun 25, 2008 (gmt 0)

The AVG LinkScanner is a kind of robot

But an honest robot identifies itself as a robot and follows robots.txt directives. The excuse of their users' security is fallacious; it's illegal what they are doing.

Samizdata




msg:3683550
 11:04 pm on Jun 25, 2008 (gmt 0)

it's illegal what they are doing

IANAL but I don't know of any law against being ridiculous or stupid.

--

I propose that their next user-agent should be:

LinkScanner (Incompetent; Dishonest; Wasteful; Easily Fooled; AVG 2008)

Funny thing is, that would not trip any current filters on my sites.

I feel another RewriteCond coming on...

...

mlduclos




msg:3683596
 12:20 am on Jun 26, 2008 (gmt 0)

Hey, I just discovered that AVG is flagging my site with a red "X" in Google search results, and I have no idea why...

They overload my server by sending pseudo visits and keep the real visitors away? That's not fair.

How to proceed?

Samizdata




msg:3683603
 12:46 am on Jun 26, 2008 (gmt 0)

LinkScanner is telling you (and everyone else) that there is malware on your site.

There should be a popup when you mouse over the link.

It should say "For additional information click here".

I advise you to click there and start taking screenshots.

Then check your site very carefully for any code or files you did not put there.

If you are sure your site is clean contact AVG.

While you are waiting for a reply read this thread.

[webmasterworld.com...]

...

mlduclos




msg:3683604
 12:54 am on Jun 26, 2008 (gmt 0)

Hmm, there's popunder code provided by CPX Interactive, and on other pages the code from popuptraffic.com.

Both companies are very reputable.
Could it be this?

Samizdata




msg:3683628
 1:29 am on Jun 26, 2008 (gmt 0)

The question is "did you put the code there".

If you did and have confidence in it then contact AVG.

If you didn't then you have a bigger problem.

I know nothing about either company.

And you only need to post in one thread.

...

dmje




msg:3683657
 3:05 am on Jun 26, 2008 (gmt 0)

I have added code to my htaccess file and I am still getting bombarded with requests for files from this crap....I am not exactly sure where the lines of code should be placed in the htaccess file...or just what to put there now as I understand that the user agent(s) have changed or there is now more than one being used.....

Is there any chance that by blocking the fakes that actual customers are being blocked?

I am confused as to whether it should or should not be blocked...is there a consensus on what should be done?

I would love to stop the bombardment but by the same token I do not want to take the chance of losing any customers either because of the block or the chance that the block may cause me not to get the green check mark...

This is so confusing.....

mlduclos




msg:3683662
 3:15 am on Jun 26, 2008 (gmt 0)

It's bad on the user side too.
Think about the many people who still use a slow dial-up connection.
A lot of DSL and cable plans also have tiny bandwidth limits; it's very expensive for them.

Samizdata




msg:3684168
 3:41 pm on Jun 26, 2008 (gmt 0)

I am still getting bombarded with requests for files

You cannot stop the requests - they are made automatically every time your site appears in the search results of a LinkScanner user, so some sites can expect thousands a day.

Example: if a LinkScanner user searches for "widget" and the word appears anywhere on your site there is a good chance you will see a hit, even if the kind of widget your site mentions is completely unrelated to the kind of widget that was being searched for.

Therefore almost any well-ranked site will be plagued by LinkScanner.

Is there any chance that by blocking the fakes that actual customers are being blocked?

Firstly, blocking with a straight 403 is not a good idea - LinkScanner responds with repeated requests (in one case I had 120 in 12 seconds), and will mark your site as "questionable" (which will naturally discourage people from visiting it).

What many here have been doing for the past six weeks is either serving a very small file to LinkScanner or redirecting it to AVG's site. In both cases their site is marked as clean. If done correctly I would say there is no chance that actual customers are being blocked.

Various examples of how to do it have been posted already, but as AVG recently introduced some new (and very obvious) dishonest user-agents the examples may have to be amended slightly.

This is extremely easy for those who understand .htaccess (or the Windows equivalent) but those who are new to it really need to do some reading and learning as simply copy/pasting .htaccess that you don't understand is a recipe for disaster.
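
For anyone new to this, the "serve a very small file" approach described above might look roughly like the following in .htaccess. This is a minimal sketch under stated assumptions, not a tested drop-in: it assumes mod_rewrite is available, that the .htaccess file sits in the site's document root, and that a tiny static page exists at the hypothetical path /avg-stub.html. The ";1813" pattern is the one quoted in this thread; the newer user-agents would need additional conditions.

```apache
# Sketch only: assumes mod_rewrite is enabled and this file is in the document root.
RewriteEngine On

# The ";1813)" ending is the LinkScanner marker reported in this thread.
RewriteCond %{HTTP_USER_AGENT} ;1813\)$
# /avg-stub.html is a hypothetical tiny static page. The negated pattern
# keeps the rule from rewriting the stub onto itself in a loop.
RewriteRule !^avg-stub\.html$ /avg-stub.html [L]

# Alternative (commented out): redirect instead of serving the stub.
# The target below is a placeholder, not a real AVG address.
# RewriteCond %{HTTP_USER_AGENT} ;1813\)$
# RewriteRule .* http://www.example.com/ [R=302,L]
```

Only one of the two variants should be active at a time. Whichever is used, it is worth trying it on a test copy of the site first and checking the logs to see what is actually being matched.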

This is so confusing

A reasonable comment, but if you read WebmasterWorld you should achieve enlightenment.

And you will be a lot less confused than AVG, who are apparently clueless.

Once again, I strongly encourage any webmaster who is experiencing problems due to LinkScanner to contact AVG with specific details and examples. Be polite, be patient, and do not rant.

...

Seb7




msg:3684578
 9:00 pm on Jun 26, 2008 (gmt 0)

Samizdata,

I'm finding this agent is crawling through some of the pages without any cookie control, and not able to read URLs correctly either. In my case it was ignoring the terminating quote in a URL.

Where it's become a problem on some of my websites, I've output just this to the user agent:

<html><button onclick="location=location">Enter</button></html>

Anyone think this was a bad idea?

Seb7




msg:3684581
 9:05 pm on Jun 26, 2008 (gmt 0)

also,

Any suggestion on how AVG non-subscribers go about contacting AVG about these issues?

Samizdata




msg:3684602
 9:29 pm on Jun 26, 2008 (gmt 0)

Any suggestion on how AVG non-subscribers go about contacting AVG about these issues?

You only need to type three letters into your favourite search engine to find a selection of postal addresses, phone numbers and fax numbers in a variety of countries (no email addresses though).

If you click the link in the first post in this thread and look at the comments on the article in The Register you will see the email address of AVG's head of communications under a specific request for webmasters to get in touch.

Once again, please be polite, I suspect they may be feeling rather sensitive.

...

Samizdata




msg:3684702
 11:48 pm on Jun 26, 2008 (gmt 0)

The Register publishes yet another hopeless article on LinkScanner:

[theregister.co.uk...]

Make of it what you will.

...

Seb7




msg:3686027
 5:03 pm on Jun 28, 2008 (gmt 0)

...which will allow us to continue to provide the best possible protection for our customers, without imposing too much extra bandwidth on websites

I'm sure the worldwide bandwidth meter has made a jump. ISPs everywhere are probably wondering what's going on.

incrediBILL




msg:3686081
 7:11 pm on Jun 28, 2008 (gmt 0)

My site has officially seen a major shift from the AVG use of ";1813" to "SV1" being dominant, so the 'fix' update is rolling out quickly, and now with all the HEAD/GET nonsense the requests are 2x.

Already had 10K hits by noon just from AVG and it's a slow Saturday.

Not a happy camper.

Samizdata




msg:3686107
 8:15 pm on Jun 28, 2008 (gmt 0)

Draft: LinkScanner's Many Qualities
Corrections and additions welcome

--

For Good Guys:
Dishonestly poses as a human visitor
Does not adhere to established robot protocols
Marks unexamined sites as "questionable" and discourages visitors
Wrongly blacklists some sites with shared IPs as malware sites
Does not inform sites they have been blacklisted
Cannot handle a simple server response e.g. 403
Cannot handle some JavaScript files properly
Makes numerous repeated requests unnecessarily
Makes incorrect and unnecessary HEAD requests
Prefetches unnecessarily
Fails to use caching
Wastes significant bandwidth
Renders some analytics techniques unusable
Possibly disables "first click free" techniques
Tries to deceive webmasters
Verdict: Nuisance

--

For Bad Guys:
Easily fooled by malware sites
Uses no bandwidth if redirected to AVG by malware sites
Identifies user IP to malware sites
Vulnerable to placement of identifying cookies by malware sites
Vulnerable to delivery of drive-by downloads by malware sites
Potential tool for denial of service attacks
Verdict: Bonus

--

For Users:
Slows down search results on some setups
Has been mistaken for a Google feature
Poor performance on dial-up
Crashes Firefox sometimes
Increases bloat
Gives false sense of security
Identifies user as AVG user
Allows simple exploits
Verdict: Danger

--

For AVG:
Costly embarrassment
Public relations nightmare
Source of ridicule
Verdict: Liability

--

With careful use of regular (English) expressions all this might be reduced to:

LinkScanner = Nuisance for Good Guys + Bonus for Bad Guys + Danger for Users + Liability for AVG

Submitted for peer review.

...

incrediBILL




msg:3686112
 8:36 pm on Jun 28, 2008 (gmt 0)

Does not adhere to established robot protocols

It's technically not a crawler therefore the required use of robots.txt would be questionable at best.

It's a link checker and link checkers have historically not used robots.txt.

Samizdata




msg:3686125
 9:33 pm on Jun 28, 2008 (gmt 0)

Thanks for the clarification - the list was drawn from many WebmasterWorld posts.

There has been a lot of confusion about LinkScanner, not least at AVG Technologies.

...

dstiles




msg:3686132
 9:42 pm on Jun 28, 2008 (gmt 0)

Samizdata:

Nice list but you forgot about corporate monitors checking employee site access. Poor sod of an employee searches for innocuous site, clicks on site s/he's looking for, gets logged as looking at sites s/he shouldn't be, gets reprimanded / fired / arrested depending on seriousness of site / situation.

"False sense of security" should actually translate to "No security whatsoever" since both good and bad guys serve up "good" pages - or at least serve up AVG pages. :)

I'm not sure what the effect would be of someone searching for private data. Several of my customers use the google search bar instead of the Location/Address field in browsers. This is very common in general. It causes me any amount of hassle sometimes along the lines of "I can't find my new web site / demo site / Control Panel - did you give me the correct address?!" The demo one would be particularly annoying as it would point AVG to my local server. (Same applies to Phorm on that one!)

Also, not just poor performance on dialup - a lot of people still use slow 3-year-old-plus computers (I still log Windows 98 hits) and some (eg me) have slow 256/512 broadband. Even on fast broadband and a reasonably new XP computer my brother complains of the machine slowing down on google listings - or did until I told him to switch off LinkScanner.

Samizdata




msg:3686141
 10:34 pm on Jun 28, 2008 (gmt 0)

Thanks dstiles for the additions and further clarification.

After allowing sufficient time for corrections or additions a suitably amended definitive list might be useful to many webmasters, and to our friends at AVG Technologies. Or to The Register.

...

Samizdata




msg:3686164
 12:17 am on Jun 29, 2008 (gmt 0)

Apologies for omitting Umbra's original 31 March report:

"mod_security always throws an error for this one"

I really should have remembered that, something similar happened for me.

"Blunders into Good Guys' security traps then marks their site as questionable"

...

g1smd




msg:3686480
 7:26 pm on Jun 29, 2008 (gmt 0)

*** LinkScanner = Nuisance for Good Guys + Bonus for Bad Guys + Danger for Users + Liability for AVG ***

Good executive summary there.

Samizdata




msg:3686574
 11:13 pm on Jun 29, 2008 (gmt 0)

Thanks, but I would say it was pretty self-evident.

I notice that a simple web search turns up a few sites devoted to LinkScanner.

Some proudly boast that they have had emails from AVG bigwigs Karel Obluk and Roger Thompson seeking help, but none seem to have any real understanding of what is wrong with LinkScanner.

As I told Pat Bitton, WebmasterWorld tells you all you need to know.

Additions for the list:

Reportedly fetches image files by mistake
Reportedly fetches high bandwidth PDF files deliberately
Identifies user on unvisited and potentially incriminating sites
Identifies AVG user IPs in publicly viewable records

...

Cyclob




msg:3686766
 10:44 am on Jun 30, 2008 (gmt 0)

I already filtered the ;1813 entries out of my log file a couple of weeks ago and put them in a new category which I called "AVG".

But I just noticed that the number of my 'Unidentified' referrals is getting higher again and the number in my "AVG" category is decreasing.

Some folks here said they changed their UA name from ;1813 to something else, is this true?

If so, how should I filter them out again, since I don't know what their new name is?

Please help. Thanks in advance.

Cyclob




msg:3686779
 11:08 am on Jun 30, 2008 (gmt 0)

Also, is filtering only ;1813 enough? I've read that the 'SV1' is also one of them. Should I filter them out as well?

About the .htaccess that Samizdata recommended for serving a small file based on the User Agent name, of which the first line is...

RewriteCond %{HTTP_USER_AGENT} ;1813\)$

Should I add one more and serve it to 'SV1' as well?

I'm not a programmer so I apologize if my question seems a little basic to some.

Please advise.
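
One possible shape for the extra condition asked about above, offered as a sketch rather than a confirmed fix. Two RewriteCond lines joined with the [OR] flag apply the same rule to either user-agent. The SV1 string is the one quoted earlier in this thread; /avg-stub.html is a hypothetical file name for the small page, and the example assumes mod_rewrite is enabled and the .htaccess file is in the document root.

```apache
# Sketch only: /avg-stub.html is a hypothetical tiny static page.
RewriteEngine On

# Original LinkScanner marker quoted in this thread.
RewriteCond %{HTTP_USER_AGENT} ;1813\)$ [OR]
# Newer user-agent string quoted earlier in this thread (exact match).
RewriteCond %{HTTP_USER_AGENT} "^Mozilla/4\.0 \(compatible; MSIE 6\.0; Windows NT 5\.1; SV1\)$"
# Serve the stub for either match; the negated pattern avoids a rewrite loop.
RewriteRule !^avg-stub\.html$ /avg-stub.html [L]
```

A caution on the second condition: as far as I know, that exact string can also be sent by genuine Internet Explorer 6 visitors on Windows XP, so matching on the user-agent alone may catch some real people. Anyone trying this should watch their logs closely before relying on it.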

All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved