How to determine if your CTR has actually changed

Hi Everyone,

DISCLAIMER: Some of the following requires an understanding of statistics. But fear not: I am going to present a very simple set of numbers that anyone can apply to their AdSense data. That info is labeled "THE BOTTOM LINE". If you aren't interested in thinking about math, that's all you need to know. For those who are interested in discussing the math involved, feel free to read and respond to the other details. That being said, here we go...

I repeatedly see posts ("My CTR is up today, is this real?", "My CTR is down today, is this real?", "My CTR is so far up/down that I am worried -- should I tell Google?", etc.) that cannot be answered informatively without first knowing whether the changes seen are statistically significant. But I never see any statistics provided. Of course, that could be because everyone fears disclosing them due to the gag order that the AdSense TOS imposes. However, I suspect it is more because most people who ask such questions have not been shown how to do basic statistics on their CTRs. So, I thought I would provide some useful "rule of thumb" figures for everyone to use.

The big concept that everyone needs to understand is that random variations ("variance" or "standard deviation" in math terms) in small samples of data tend to be larger than random variations in large data sets. Why? Because in large data sets the variations tend to cancel each other out. As a result, the more points you have in a set of data, the more accurately you can determine its average. That is the key concept here: Yesterday I saw one average, and today I see a higher/lower average, BUT do I really know that those averages are accurate enough (because I have enough data points) to conclude that the difference I am seeing is a real difference, and not just the result of random variation in one set of data versus another?
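To see the "variations cancel out" idea in action, here is a rough simulation sketch. It is only an illustration: the 2% true CTR, the 200 simulated days, and the function name are all assumptions of mine, not anything taken from real AdSense data.

```python
import random
import statistics

def observed_ctr_spread(true_ctr, impressions, trials=200, seed=0):
    """Standard deviation of the observed CTR across `trials` simulated days.

    Every impression is treated as an independent click / no-click event
    with the same underlying click probability (an idealized assumption).
    """
    rng = random.Random(seed)
    observed = []
    for _ in range(trials):
        clicks = sum(1 for _ in range(impressions) if rng.random() < true_ctr)
        observed.append(clicks / impressions)
    return statistics.pstdev(observed)

if __name__ == "__main__":
    # Hypothetical true CTR of 2%; only the number of impressions changes.
    for n in (100, 1000, 5000):
        spread = observed_ctr_spread(0.02, n)
        print(f"{n} impressions: spread of about {spread * 100:.2f} percentage points")
```

The underlying CTR never changes in this simulation, yet the day-to-day figures still move around, and they move around much more when each day has few impressions.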

***THE BOTTOM LINE*** The more data points that make up your click-through rate (CTR), the more accurately you can tell one CTR from another CTR. Here are the rule-of-thumb numbers to use:

Impressions / Minimum CTR difference that is significant (in percentage points)
10 / 4.07%
50 / 1.73%
100 / 1.22%
200 / 0.86%
500 / 0.54%
1000 / 0.38%
5000 / 0.17%
10000 / 0.11%

What does this mean? Find the row that matches the number of impressions your CTR figures are based on. If the difference between the two CTRs that you are comparing is smaller than the number in the second column, there is a good chance that you are seeing just a random fluctuation. For example, if you are basing your CTR figures on 500 impressions, and you are concerned that today's CTR is lower than yesterday's CTR, don't even THINK about being truly concerned unless the difference between the two CTRs is at least 0.54 percentage points (such as yesterday's CTR being 2.54% and today's being 2%).
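If you would rather not look the numbers up by hand, the table can be wrapped in a small lookup, sketched below. The function names are made up for illustration. Two choices in it are my own assumptions rather than part of the rule of thumb: when your impression count falls between two rows it uses the lower row (the stricter threshold), and it expects you to pass the smaller of the two impression counts when the two days differ.

```python
# Rule-of-thumb table from this post:
# (impressions, minimum significant CTR difference in percentage points)
THRESHOLDS = [
    (10, 4.07), (50, 1.73), (100, 1.22), (200, 0.86),
    (500, 0.54), (1000, 0.38), (5000, 0.17), (10000, 0.11),
]

def min_significant_difference(impressions):
    """Threshold for the largest table row not exceeding `impressions`.

    Returns None when there are fewer impressions than the smallest row.
    """
    threshold = None
    for row_impressions, difference in THRESHOLDS:
        if impressions >= row_impressions:
            threshold = difference
    return threshold

def may_be_real_change(ctr_a, ctr_b, impressions):
    """True if the CTR gap (both CTRs given in percent) reaches the threshold."""
    threshold = min_significant_difference(impressions)
    if threshold is None:
        return False  # too little data to say anything
    # Tiny tolerance so a gap sitting exactly on the threshold is not lost
    # to floating-point rounding.
    return abs(ctr_a - ctr_b) + 1e-9 >= threshold

if __name__ == "__main__":
    print(may_be_real_change(2.54, 2.0, 500))  # the example from this post
    print(may_be_real_change(2.30, 2.0, 500))
```

A result of False means "stop worrying"; a result of True only means the difference cleared the minimum bar.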

Now, let me stress something else: These are MINIMUMS, which assume that there are no other forces in the universe skewing your numbers. We all know that isn't true. Time of day, day of week, Google changing the targeting algorithm, etc., could all affect your numbers and make them truly different even though there isn't anything actually "wrong". So, don't assume that just because your CTR difference is above the threshold number there is some problem going on (or that the new banner you just tested really is better or worse than the previous one). In other words, these numbers are not very useful for determining when there truly is a difference DUE TO A GIVEN CAUSE, because there are so many complicating factors. But what they are very useful for is determining when there is NOT actually a significant difference, which allows you to just stop thinking about the matter altogether.

WARNING: ***MATH FOLLOWS***

For those of you inclined to read this, these figures were computed using a T-test at the 99% confidence level. Is a T-test really the correct choice? Maybe not for small sample sizes, because this data is really binomial, not interval scale (in other words, someone can click or not click -- they can't partially click). But the binomial distribution approaches the normal distribution at sufficiently large sample sizes. So, the numbers presented where N < 200 may not be completely accurate, but they are good enough for these purposes.
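For anyone who wants to test their own click counts directly, here is a minimal sketch of a two-proportion z-test, which leans on the same "binomial approaches normal" idea. To be clear, this is a swapped-in technique and not the T-test calculation that produced the table above, so its cutoffs should not be expected to match those figures. The function name and the sample counts are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def two_proportion_z_test(clicks_a, impressions_a, clicks_b, impressions_b):
    """Return (z, two-sided p-value) for the difference between two CTRs.

    Uses the pooled normal approximation to the binomial, so it is only
    reasonable when both samples contain a fair number of clicks.
    """
    ctr_a = clicks_a / impressions_a
    ctr_b = clicks_b / impressions_b
    pooled = (clicks_a + clicks_b) / (impressions_a + impressions_b)
    std_err = sqrt(pooled * (1 - pooled) * (1 / impressions_a + 1 / impressions_b))
    if std_err == 0:
        return 0.0, 1.0  # no clicks at all (or all clicks): nothing to compare
    z = (ctr_a - ctr_b) / std_err
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value

if __name__ == "__main__":
    # Hypothetical counts: 100 clicks vs 50 clicks, 1000 impressions each day.
    z, p = two_proportion_z_test(100, 1000, 50, 1000)
    print(f"z = {z:.2f}, p = {p:.5f}")
    print("significant at the 99% level" if p < 0.01 else "could easily be random")
```

A p-value below 0.01 corresponds to the 99% confidence level used in this post.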

Why the 99% confidence level -- do we really need that level of certainty? Probably not, but I know that plenty of people sit around and check their stats many times per day, so to some extent that extra surety represents a Bonferroni correction for multiple comparisons. I should probably really make two charts: One assuming you check your stats once, and one assuming you check your stats 20 times per day. But, that really removes some of the simplicity that I was trying to present so that these rules of thumb are as easy to apply as possible.
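As a rough illustration of the Bonferroni idea, the sketch below divides an overall error rate by the number of times you look at your stats. The 5% overall rate is my assumption for the example (the post does not state one), and the function names are made up.

```python
from statistics import NormalDist

def bonferroni_alpha(overall_alpha, comparisons):
    """Per-comparison significance level after a Bonferroni correction."""
    return overall_alpha / comparisons

def two_sided_critical_z(alpha):
    """Normal-approximation cutoff for a two-sided test at level `alpha`."""
    return NormalDist().inv_cdf(1 - alpha / 2)

if __name__ == "__main__":
    overall_alpha = 0.05  # assumed overall false-alarm rate, not from the post
    for checks_per_day in (1, 5, 20):
        alpha = bonferroni_alpha(overall_alpha, checks_per_day)
        cutoff = two_sided_critical_z(alpha)
        print(f"{checks_per_day} checks/day: alpha = {alpha:.4f}, cutoff z = {cutoff:.2f}")
```

Under that assumption, five checks per day works out to a per-check level of 0.01 (the 99% level used here), while 20 checks per day would call for 0.0025, which is stricter than the table.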

Eventually I'll build an AdSense stat cruncher that does all this stuff automatically, but time does not permit at the moment.

Hope this was helpful.

James
