|Speculation on Determinants of Click Fraud |
How does Google detect fraudulent clicks?
| 10:42 pm on May 27, 2004 (gmt 0)|
One of the first questions I often get when introducing a prospective client to PPC services after the "light bulb" flicks on, is..."So what keeps my competitor from clicking on my ad?"
Well...the simple answers include:
1. If your competitor is clicking on your ad they are not spending enough time worrying about the quality of their own business.
2. They could actually be helping you in some cases by improving your CTR, which is a determinant of where your advertisement will rank.
3. Google has algorithms in place to detect fraudulent clicks, and you should not worry about this.
Where I often falter is on point #3...This is the diplomatic answer given at conferences when the subject is raised by the casual viewer, and it tells me nothing.
Obviously, if G released how they detect fraud, it would undermine the effectiveness of that detection method. I am hoping we can have a discussion on the potential methods that G (or the other PPC providers) use to detect click fraud. This will help to "balance" log files against PPC statistics more effectively.
I am not very savvy on Internet Protocol, but seemingly obvious red flags may be:
Extremely high (>30%?) CTR
Multiple clicks from the same IP ranges
Higher than normal(?) visits from the same proxies
Higher than normal visits from proxies in general
Seemingly automated activity
High volume activity for normally low volume phrases
Short visit times and extremely low (relative) conversions.
Obviously, there must be thresholds of relevance for all of these variables that throw up red flags. Where would these thresholds lie? How many of them does it take to trigger a second glance?
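To make the threshold question concrete, here is a minimal Python sketch that counts how many of the red flags above a keyword's traffic trips. Every field name and cutoff in it (CampaignStats, the 30% CTR, the 10 second visit, and so on) is a guess made up for illustration. It says nothing about what G or any other PPC provider actually does.

```python
from dataclasses import dataclass


@dataclass
class CampaignStats:
    """Hypothetical per-keyword numbers pulled from log files."""
    impressions: int
    clicks: int
    conversions: int
    clicks_from_top_ip_range: int  # clicks from the single busiest IP range
    clicks_via_proxies: int
    avg_visit_seconds: float


# Guessed cutoffs, purely for illustration.
THRESHOLDS = {
    "ctr": 0.30,
    "top_ip_share": 0.20,
    "proxy_share": 0.25,
    "min_visit_seconds": 10.0,
    "min_conversion_rate": 0.005,
}


def red_flags(s: CampaignStats) -> list[str]:
    """Return the names of the red flags this traffic trips."""
    flags = []
    if s.impressions == 0 or s.clicks == 0:
        return flags  # nothing to judge yet
    if s.clicks / s.impressions > THRESHOLDS["ctr"]:
        flags.append("extremely high CTR")
    if s.clicks_from_top_ip_range / s.clicks > THRESHOLDS["top_ip_share"]:
        flags.append("clicks concentrated in one IP range")
    if s.clicks_via_proxies / s.clicks > THRESHOLDS["proxy_share"]:
        flags.append("heavy proxy traffic")
    if s.avg_visit_seconds < THRESHOLDS["min_visit_seconds"]:
        flags.append("short visits")
    if s.conversions / s.clicks < THRESHOLDS["min_conversion_rate"]:
        flags.append("very low conversions")
    return flags


def needs_second_glance(s: CampaignStats, min_flags: int = 2) -> bool:
    """A single flag is treated as noise; several together warrant a look."""
    return len(red_flags(s)) >= min_flags
```

Requiring two or more flags before looking closer is one possible answer to "how many does it take"; the right number would depend on click volume and CPC.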
The methods of detection not only affect advertisers' bottom line, but may also affect distributors' ability to keep AdSense (which many folks here rely on as a revenue model), among other things.
Please try to keep this thread from becoming a discussion of ethical concerns, but rather keep it to suggestions on methods that PPC providers may be using to detect click fraud.
I would like to see some speculation so we can draw our own conclusions on the potential effectiveness of detection. This will help with balancing log files against the statistics given by PPC providers, as well as with avoiding being booted as an AdSense publisher.
[edited by: stuntdubl at 10:54 pm (utc) on May 27, 2004]
| 10:47 pm on May 27, 2004 (gmt 0)|
Clicks on Google vs. Search/Content Partners
If 99.7% of the clicks are coming straight from the AdWords on google.com search, then there's probably some fraudulent activity going on.
There should be a normal amount of clicks coming from search/advertising partners.
| 12:09 am on May 28, 2004 (gmt 0)|
Stuntdubl has covered off many of the points that would make sense.
I think looking at how Google (or any PPC provider, for that matter) detects fraud is one thing, but what tends to matter more is what we do to raise the matter for investigation and the documentation we provide.
We tend to look at analytics a lot more, run custom reports and look at visits that last less than 10 seconds, where the originating click was from a PPC visit.
We'll look at publishing sources that are inconsistent in comparison to others. By that I mean if publishers x, y and z convert clicks to leads/sales/actions at, say, 3% and publisher a does so at 0.3%, it may warrant a look. The same keywords, title, description and landing page should get broadly similar results.
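The publisher comparison above can be sketched in a few lines of Python. This is only an illustration of the idea: the function name, the 25% ratio and the 100 click minimum are all assumptions, and the numbers in the usage note mirror the 3% vs. 0.3% example.

```python
from statistics import median


def outlier_publishers(stats, ratio=0.25, min_clicks=100):
    """Flag publishers whose conversion rate is far below that of their peers.

    stats maps a publisher name to a (clicks, conversions) tuple.
    ratio and min_clicks are guessed cutoffs, not known provider values.
    """
    # Ignore publishers with too few clicks to say anything meaningful.
    rates = {
        pub: conversions / clicks
        for pub, (clicks, conversions) in stats.items()
        if clicks >= min_clicks
    }
    flagged = []
    for pub, rate in rates.items():
        peer_rates = [r for other, r in rates.items() if other != pub]
        # Compare against the median of the *other* publishers so that one
        # bad source does not drag its own benchmark down.
        if peer_rates and rate < ratio * median(peer_rates):
            flagged.append(pub)
    return sorted(flagged)
```

With x, y and z at roughly 3% and a at 0.3%, for example `{"x": (1000, 30), "y": (1000, 31), "z": (1000, 29), "a": (1000, 3)}`, only publisher a is returned.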
Foreign country IP ranges where the product/service would not be relevant (although I blame the advertisers in that scenario for using Global as the target audience).
The keyword used would be the same more often than not.
I pay a lot of attention to what my competitors are up to.
On occasions, I will have clicked an ad because I wanted to see a strategy (deep linked vs. non/ relevancy etc.). In an ideal world I'd like to come clean and register our IP addresses as an agency and not have them count but that would be utopia. In the meantime....
I can't see the scenario above being any different to picking up a sales promotional brochure at a conference or exhibition, when I have no intention of buying but am curious about pricing, model, etc. They will have paid for the production costs of the material. That's just business.
We always try to factor in a "noise" level before we launch an investigation. Often the cost of the investigation may be more than the budget lost. The noise level will vary depending on volume of clicks, CPC, objectives, target CPA and others.
I think just using ROI as your barometer and flagging abnormalities with the relevant supporting documentation makes the refunding process easier.
If PPC providers don't take action then the advertisers will vote with their credit cards. PPC providers do take this matter seriously. We've received many refunds, often without even having to provide documentation, so it does get picked up, often days/weeks after the event.
At the very least it may get a rogue publisher, or entrepreneurial labour sweat shops flushed out of the system and will ensure that Google et al are kept true.
If a competitor is smart then there will be no obvious trail back to them. If they are dumb enough to leave a trail, and so quiet that all they can do is click your ads, then rejoice: they will not be around for long.
Dial up would be used, which would mean a lot of visits from the same ISP.
If in doubt flag your concerns.
| 4:41 am on May 28, 2004 (gmt 0)|
Personally, I don't think Google does much to detect fraudulent clicks. Google likes technical solutions, and fraudulent click detection resists a clean technical solution (mostly due to the cleverness of their adversaries).
Instead, I think Google relies on conversion rates to detect "fraudulent" clicks. If a publisher's traffic converts at a rate significantly lower than average (and only Google knows what average is), then Google flags that publisher account for fraudulent clicks. It doesn't matter if the clicks are from the publisher, his competitors, a boiler room in India, or just from legitimate but non-responsive visitors - they are fraudulent in Google's eyes.
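For what "significantly lower than average" could mean in practice, here is a hedged Python sketch using a one-sided z-test on the conversion proportion. The test, the cutoff and the function name are my own assumptions for illustration; only Google knows what it really compares against, if anything.

```python
import math


def converts_significantly_below(clicks, conversions, average_rate, z_cutoff=-2.33):
    """Return True if a publisher's conversion rate is well below average.

    Uses the normal approximation to the binomial, which is rough for
    small click counts. z_cutoff=-2.33 is roughly a 1% one-sided level;
    both it and average_rate are assumed values.
    """
    if clicks == 0 or not 0 < average_rate < 1:
        return False
    observed = conversions / clicks
    # Standard error of the rate if this publisher really were average.
    std_err = math.sqrt(average_rate * (1 - average_rate) / clicks)
    z = (observed - average_rate) / std_err
    return z < z_cutoff
```

Note that a small publisher with a poor rate is not flagged until it has enough clicks for the gap to be more than noise, which fits the idea that the source of the bad clicks does not matter, only the result.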
This is a clean technical solution to a myriad of problems, from click banks to sites with low quality traffic. And it is nearly impossible to reverse engineer how it's applied.
| 8:13 am on May 28, 2004 (gmt 0)|
I strongly suspect that on certain competitive keywords, such as "web hosting", competitors will create fraudulent clicks on everyone ranked above them.
Google has a hard time pinpointing this because, to them, the fraudulent clicks could simply look like CTR being affected by rank.
| 8:32 am on May 28, 2004 (gmt 0)|
Google does look for it and gives refunds when they do see it. The best answer is it is just part of doing business. If you make money do it.
| 10:03 am on May 28, 2004 (gmt 0)|
I think that advertisers expect their competitors to make reasonable clicks, e.g. once in a while, on their ads.
It could be interesting to create "neighbourhood fingerprints" consisting of a search term, ad group and some data from Hoovers -- you could come up with a DNA-like family of competitor neighbourhoods. User behaviour could associate some individuals with specific neighbourhoods and anything resembling an "unhealthy" obsession within a defined narrow footprint could be flagged.
While user IP addresses are likely to change, the SEs can better monitor behaviour within these neighbourhoods, e.g. a skulker who clicks through a neighbourhood - is it someone researching a product to buy or a paid clicker? If a clicker is flagged, a process would then investigate past IP ranges that visited that neighbourhood, time ranges, days of the week, etc. The silly but malicious clickers who use a fixed address will be easier to spot, others less so.
| 11:44 am on May 28, 2004 (gmt 0)|
|Personally, I don't think Google does much to detect fraudulent clicks. Google likes technical solutions, and fraudulent click detection resists a clean technical solution (mostly due to the cleverness of their adversaries). |
I disagree. It may not be a matter of a simple yes/no formula, but remember that some people working at Google have done lots of great work in the AI area (see e.g. Peter Norvig's great book).
Google has long experience with Google Search and AdWords; remember the "long-life" cookie they place? That would be great for distinguishing the behaviour of one-time anonymous users (fraudulent clickers fall under this category, too) from that of the overwhelming majority of "normal" searchers. There are lots of (AI) methods for extracting "common patterns" of user behaviour, and less common behaviour. The ideas posted above are great, but there is likely more you can get from the loads of data than from our little brainstorming here.
So when Google is detecting fraudulent clicks I am sure it is not (only) weighing the characteristics of single clicks - it is looking at the whole usage patterns of ad campaigns. When a larger than average number of clicks come from "anonymous users", that is a hint (it's not hard evidence, but it is a hint). When searches are done for the sole purpose of clicking on the ads (never clicking on search results), that's a hint. Repeated searches from the same users are a hint. Going over proxies is a hint. etc. etc.
So when you have an ad campaign whose whole usage pattern has lots of hints for lots of its clicks, there is something you can infer from that. (Fuzzy Logic, if you will). And you can be pretty sure they have invested a lot in that, otherwise they would not have revived CPC from the dead - because the major reason why it failed in the past was the inability to find an answer to the problem of fraudulent clicks.
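A rough Python sketch of that "lots of hints add up" idea: each click carries zero or more hints, each hint has a weight, and the campaign score is the average over its clicks. The hint names and weights below are invented for illustration and are not known Google signals or values.

```python
# Made-up weights: no single hint is hard evidence on its own.
HINT_WEIGHTS = {
    "no_long_life_cookie": 0.3,           # one-time anonymous user
    "never_clicks_search_results": 0.3,   # searches only to click ads
    "repeated_search_same_user": 0.2,
    "via_proxy": 0.2,
}


def click_suspicion(hints):
    """Score one click between 0.0 and 1.0 from the set of hints it shows."""
    return min(1.0, sum(HINT_WEIGHTS.get(h, 0.0) for h in hints))


def campaign_suspicion(clicks):
    """Average the per-click scores over a whole campaign.

    clicks is a list with one set of hint names per click.
    """
    if not clicks:
        return 0.0
    return sum(click_suspicion(h) for h in clicks) / len(clicks)
```

A campaign where most clicks show several hints ends up with a high average even though no individual click proves anything, which is the fuzzy part.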