Forum Moderators: martinibuster

standard deviation measurement

how STANDARD is standard?

         

dibbern2

11:08 pm on Mar 7, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I'm not very good with stat math, so I was very happy to follow 21_blue's post some time back about setting up averages and standard deviation formulas in my AS Excel sheets.

I was one of those who had a sudden crunch in earnings last weekend (since back to normal, and then some) and it has me thinking I had the wrong idea about standard deviation.

To illustrate: let's say I have a very steady $100/day average earnings pattern. And that my Excel sheet formula shows a standard deviation over the last 100 days of $10.

As I understand it, I should expect normal ups & downs between $90 and $110, with very occasional lows and highs of about $70 (ouch!) to $130 (yeah!). Is that a correct understanding?

What if I fall, like last weekend, to the $50 range? That's 5X the standard deviation, and seems alarming to me... or is it?

Are there any math brains out there who might shed some light on this? Thanks much in advance.
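For reference, here is a minimal sketch of the arithmetic behind the question, assuming (and this is only an assumption) that daily earnings are roughly normally distributed with the $100 mean and $10 standard deviation from the example. The function names are illustrative, not from any particular library.

```python
import math

def z_score(value, mean, sd):
    """How many standard deviations `value` lies from `mean`."""
    return (value - mean) / sd

def lower_tail_probability(z):
    """P(Z <= z) for a standard normal variable, via the complementary error function."""
    return 0.5 * math.erfc(-z / math.sqrt(2))

# The $50 day from the example: mean $100, SD $10.
z = z_score(50.0, 100.0, 10.0)
p = lower_tail_probability(z)

print(f"z-score: {z}")
print(f"chance of a day this low under the normal assumption: {p:.2e}")
```

Under that assumption a 5-SD drop would be far too rare to put down to ordinary day-to-day variation, so either something changed or the normal model does not fit the data.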

G_Smitty

11:36 pm on Mar 7, 2006 (gmt 0)

10+ Year Member



I have a little experience with Statistical Process Control in the automotive industry (primarily X-bar and R charts). When trying to keep certain Key Characteristics in process control you must have some type of control over the variables affecting the process. Since we really have limited control over many of the variables affecting our earnings, I would think that any SPC tracking would be futile.

jomaxx

11:53 pm on Mar 7, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I don't see much point in tracking dollar earnings because it's a function of 3 largely independent variables: site traffic, clickthrough rate, and CPC.

It's true that dollar earnings should generally speaking demonstrate a normal distribution pattern, but if there is an anomaly you still have to look at each of the underlying components to see what happened.

Also, now that we have the tools to do so, I would look at your CPC ads and site-targeted numbers separately. It looks to me like variations in the proportion of site-targeted ads that are served can mess up your statistics.

btas2

4:46 am on Mar 8, 2006 (gmt 0)

10+ Year Member



Applying statistics to Adsense earnings is like applying statistics to the stock market.

Past performance is a poor predictor of future results. If it wasn't, people wouldn't lose money.

You have no idea what the variables are since Google changes them in an unpredictable manner (smart pricing, search ranking etc.).

hunderdown

5:28 am on Mar 8, 2006 (gmt 0)



I'm no stats guru, but a drop of that amount is significant! NOT at all likely that it's caused by chance. What caused it you still need to figure out.

dibbern2

6:53 am on Mar 8, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



"You have no idea what the variables are since Google changes them in an unpredictable manner (smart pricing, search ranking etc.). "

Thank you for your suggestions btas2 and jomaxx, but I have some problems with the total unpredictability you suggest.

If what you say were true, then my $100/day (for 3 months) site might bring in $1 or $1,000 tomorrow. I can't believe that.

Secondly, it suggests that AdSense as a business is totally different from all other business ventures. For after all, all businesses are predictable to some degree even with their particular variable factors. Google's variables exist, but something like them takes place in every other commercial market; they are NOT totally unique.

40 years of being in business tells me that everything is measurable and predictable, some things less, some things more. But never NEVER.

I'm debating with you, not arguing. I do appreciate your answer to my post.

Scruffy

7:02 am on Mar 8, 2006 (gmt 0)

10+ Year Member



You've sort-of got it right except that dollars are not 'statistical events'.

(You could be tracking on pennies in which case $100 = 10,000 pennies and the standard deviation is 100 pennies. i.e. $1)

You have to go to the actual process and work out what the 'event' is. With adsense, the number of impressions is the fundamental number.

jomaxx

7:35 am on Mar 8, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I didn't say it wasn't predictable within certain parameters, I said that tracking the dollars alone is pointless because they are a side-effect of 3 other variables.

To put it another way, if your earnings dropped by half last weekend, what caused that? You haven't given us any clue. Was it a drop in AdSense impressions, CTR, and/or CPC?

Tastatura

8:39 am on Mar 8, 2006 (gmt 0)

10+ Year Member



I know a little bit about this, however I don’t consider myself an expert, so take it with a grain of salt. Also I am new to AdSense, etc and I am not an expert on how it works.

Statistics is a great tool, but like any tool it should be used properly. At a basic level, your process (or business, or AdSense part, or anything else) has inputs and outputs. Say the input parameters are page views, number of visitors, etc., and the outputs are # of converted visitors, revenue, etc. Note that I am just throwing parameters around; almost every case or model is different and the parameters might be different, so it’s very important to identify them correctly at the beginning of the ‘design’ process.

Essentially you want to identify direct and measurable parameters; those would be your Control Parameters. If you change the ‘value’ of those you should be able to see a change in the output (let me stress again, they have to be direct and measurable). Almost every process will also have some kind of ‘noise’: things that you can’t control, but that don’t have any measurable effect on the outputs. Then you will want to start changing the values of your control parameters (or evaluating them) to see what kind of impact they have on your outputs. Some will have a significant impact, some less; you want to identify which, in order to figure out what you need to do to get the desired output values. This is a very basic, nutshell description. Part of this is called pFMEA, and it is also part of the ‘lean’ and ‘six sigma’ methodologies.

So if AdSense/Google is identified as your control parameter, you can try to model it and use its outputs as your inputs. Remember that if you get those wrong, your model will be wrong.

Someone also brought up the stock market, and to me that sounds like a good comparison (at the moment, based on what I know). Yes, I have securities, but I have no influence on how the companies in question work, nor on how they make decisions (which in the end will determine whether they make money for me or not). I do have access to historical performance, SEC filings, etc., but that does not guarantee future performance (think of the bubble bursting a few years back). It’s risk analysis.

my $0.02

Wizard

9:11 am on Mar 8, 2006 (gmt 0)

10+ Year Member



The essence of the normal distribution is that small fluctuations happen more frequently, while big fluctuations are rare. So the big fluctuation you're talking about may happen sometimes; statistics says precisely what percentage of results fall within what distance of the average value. About 68% of results lie within one standard deviation of the mean, about 95% within two standard deviations, etc.
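The coverage figures for a normal distribution can be checked directly with the error function from Python's standard library. This is a small sketch; `coverage_within` is just an illustrative name.

```python
import math

def coverage_within(k):
    """Fraction of a normal distribution lying within k standard deviations of the mean."""
    return math.erf(k / math.sqrt(2))

for k in (1, 2, 3):
    print(f"within {k} SD: {coverage_within(k):.1%}")
```

These fractions only apply if the data really are normally distributed, which nobody in this thread has verified for AdSense earnings.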

As for predictability of AdSense business, one site can indeed earn $100 one day and anywhere from $1 to $1000 another day. Especially if a significant part of traffic comes from Google search results and there is an update happening. Some sites earn better each weekend, others the opposite, but some special holidays may cause especially significant fluctuations.

The only way to make AdSense business stable and predictable is to own many sites, perhaps over a hundred, in different niches, on different servers, using different SEO strategies. Just like with stocks - you can't be sure of your investment stability if you trade shares of only one company.

SeanW

2:24 pm on Mar 8, 2006 (gmt 0)

10+ Year Member



My stats are rusty, but on a normal distribution 67% of the measurements should fall within $90 and $110, and 95% within $80 and $120. It's certainly possible that you have a $1 or even $1,000,000 day. You'd expect at least one day a month to be outside of 2 std devs.

That said, do adsense earnings follow a normal distribution? Do you have enough samples such that your standard deviation is accurate? I'm pretty sure the answer to #1 is no, so #2 is moot.

Perhaps a good conversation would be, "which metrics should we be measuring?"

Earnings = Visitors * CTR * Avg CPC
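Because earnings are a product of three factors, one way to see which factor moved is to compare logarithms: the log of the earnings ratio is the sum of the logs of the per-factor ratios. The sketch below uses made-up numbers for a normal day and a bad day; none of the figures come from anyone's real account.

```python
import math

def earnings(visitors, ctr, cpc):
    """Earnings = Visitors * CTR * Avg CPC, as in the formula above."""
    return visitors * ctr * cpc

def log_change_breakdown(before, after):
    """Per-factor contribution to log(after earnings / before earnings)."""
    return {name: math.log(after[name] / before[name]) for name in before}

# Hypothetical figures, chosen only for illustration.
normal_day = {"visitors": 5000, "ctr": 0.04, "cpc": 0.50}   # about $100
bad_day = {"visitors": 4800, "ctr": 0.04, "cpc": 0.26}      # about $50

contributions = log_change_breakdown(normal_day, bad_day)
for name, value in contributions.items():
    print(f"{name}: {value:+.3f}")
print(f"total: {sum(contributions.values()):+.3f}")
```

In this invented case almost all of the drop comes from CPC, which is the kind of answer a dollars-only chart cannot give.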

Sean

Scruffy

2:47 pm on Mar 8, 2006 (gmt 0)

10+ Year Member



OK it's worth repeating. maybe a bit louder...

YOU CAN'T DO STATS ON DOLLAR AMOUNTS.

It makes no mathematical sense.
If you try to take one SD on $100 you get $10. i.e 10%

convert to rupees at (whatever?400)
The SD on 40,000 rupees is 400 rupees i.e 1%

You can only do statistical calculations on EVENTS. i.e clicks.

I'll shut up now...

Scruffy

2:49 pm on Mar 8, 2006 (gmt 0)

10+ Year Member



Yes I know, 200

nonni

2:55 pm on Mar 8, 2006 (gmt 0)

10+ Year Member



If your data is normally distributed, and the average is $100 per day, s.d. of $10, then there is less than a five percent probability that earnings would drop under $80 or rise over $120 on any given day due to random variation. It could happen, though.

Scruffy: why can't you do stats on dollar amounts? Dollar amounts are on a ratio scale - with a true zero, two is twice as much as one (unlike most temperature scales) - no different than grams. Each dollar in Adsense earnings is in fact an event as well as a thing - it represents the act of crediting one greenback to an account. The fact that there are different currencies shouldn't be a problem as long as the units are not mixed improperly.

Scruffy

3:16 pm on Mar 8, 2006 (gmt 0)

10+ Year Member



Look at the math.

100 dollars is 10,000 pennies.

The square root of 100 is 10, that is 10% - the distribution curve is broad.

The square root of 10000 is 100, that is 1% - the distribution curve is extremely narrow.

You have to take it back to the clicks that generate the dollars and only apply the conversion to dollars AFTER you do the stats.

Example:
I have 10000 clicks on an ad at $0.50 each ($5000)
The standard deviation is 100 clicks ($50)
That's a nice tight 1% SD - good stats.

On the other hand:
I have 100 clicks at $50 each ($5000 - same amount)
the standard deviation is 10 clicks ($500)
lousy stats. broad distribution
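The square-root rule used in these two examples is what you would get if the number of clicks followed a Poisson distribution, where the standard deviation equals the square root of the expected count. That is an assumption about click counts, not something established in this thread, and it says nothing about how the value per click varies. A rough sketch:

```python
import math

def poisson_sd(expected_clicks):
    """SD of a Poisson-distributed count: the square root of its mean."""
    return math.sqrt(expected_clicks)

def relative_sd(expected_clicks):
    """SD as a fraction of the expected count; shrinks as 1/sqrt(N)."""
    return poisson_sd(expected_clicks) / expected_clicks

# The two cases from the post: same $5000 total, very different click counts.
for clicks, cpc in ((10_000, 0.50), (100, 50.0)):
    sd_clicks = poisson_sd(clicks)
    print(f"{clicks} clicks at ${cpc:.2f}: SD {sd_clicks:.0f} clicks "
          f"(${sd_clicks * cpc:.0f}), relative SD {relative_sd(clicks):.0%}")
```

The rule applies to counts of events; applying the square root to a dollar total gives an answer that depends on the currency unit, which is the point being argued over.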

nonni

4:47 pm on Mar 8, 2006 (gmt 0)

10+ Year Member



But the standard deviation is not simply the square root; it is the square root of the average of the squared deviations from the mean. You could calculate a similar measure of spread (the mean absolute deviation) without all the squaring and square rooting, simply by averaging the absolute values of the deviations from the mean.

If you scale a data set with a std deviation = $10 into pennies, it would be 1000 pennies, which is the same monetary value. And if a daily observation drops by $23.17 (relative to the average), the z-score for that drop would be identical whether you calculated it in pennies or dollars. Same with the probability values, which indicate the likelihood that something occurred by chance.
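The unit-independence claim is easy to check numerically. The sketch below uses a made-up week of earnings and the population standard deviation (`statistics.pstdev`), which matches the "average of the squared deviations" definition. Excel's STDEV uses the sample version, which divides by n - 1 instead, but the scaling argument is the same.

```python
import statistics

# Hypothetical daily earnings in dollars; the mean is exactly $100.
daily_dollars = [92.0, 105.0, 98.0, 110.0, 95.0, 101.0, 99.0]
daily_pennies = [d * 100 for d in daily_dollars]

def z_score(value, data):
    """Deviation of `value` from the mean of `data`, in population-SD units."""
    return (value - statistics.mean(data)) / statistics.pstdev(data)

def mean_abs_deviation(data):
    """Average absolute deviation from the mean: related to the SD, but a different number."""
    m = statistics.mean(data)
    return sum(abs(x - m) for x in data) / len(data)

# A day that comes in $23.17 below the average, expressed in both units.
low_day_dollars = statistics.mean(daily_dollars) - 23.17
z_dollars = z_score(low_day_dollars, daily_dollars)
z_pennies = z_score(low_day_dollars * 100, daily_pennies)

print(f"SD in dollars: {statistics.pstdev(daily_dollars):.2f}")
print(f"SD in pennies: {statistics.pstdev(daily_pennies):.2f}")
print(f"z-score (dollars): {z_dollars:.4f}")
print(f"z-score (pennies): {z_pennies:.4f}")
print(f"mean absolute deviation (dollars): {mean_abs_deviation(daily_dollars):.2f}")
```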

Scruffy

5:49 pm on Mar 8, 2006 (gmt 0)

10+ Year Member



Suppose you got 1 click yesterday and you got paid $100 on it.

Would you expect to get between $90 and $110 today?

Chances are you would get none, one or two clicks today. All things being equal you get $100, $200 or nothing.
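To put rough numbers on "none, one or two clicks", here is a sketch that assumes clicks arrive as a Poisson process averaging one click per day. That model is an assumption made for illustration only.

```python
import math

def poisson_pmf(k, mean):
    """Probability of exactly k events when the expected count is `mean`."""
    return math.exp(-mean) * mean**k / math.factorial(k)

expected_clicks_per_day = 1.0
for k in range(4):
    print(f"P({k} clicks) = {poisson_pmf(k, expected_clicks_per_day):.3f}")

p_zero_to_two = sum(poisson_pmf(k, expected_clicks_per_day) for k in range(3))
print(f"P(0, 1 or 2 clicks) = {p_zero_to_two:.3f}")
```

With so few events per day the outcomes are lumpy, so a bell curve centred on $100 is a poor description of a single day even though the long-run average is still $100.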

nonni

7:31 pm on Mar 8, 2006 (gmt 0)

10+ Year Member



You're changing the question, I think. If clicks can only be worth $100 or $200, then you would need a different model.

If properly used, standard deviation would give you a good idea whether that behaviour was normal, or unusual, and whether to expect it to repeat on any day.

It depends on what the long term average and deviation are. If that one extreme click came out of the blue, then statistics (and common sense) would say it was an anomaly, not the norm. If the web site regularly got hundred dollar clicks, then the stats might predict tomorrow's earnings at $107.44 ± $36.02. It would rarely be exactly $107.44, but long term, the prediction would be close if you had a good model.

For extreme events, log-normal stats are preferred to normal distribution/bell curve.
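One simple way to apply the log-normal idea is to take logarithms of the daily figures and compute the z-score on that scale, so that halving and doubling count as equally large moves. This is a sketch under that assumption, with an invented three-day history; `log_z_score` is an illustrative name.

```python
import math
import statistics

def log_z_score(value, history):
    """z-score of `value` computed on log earnings (sample SD of the logs)."""
    logs = [math.log(x) for x in history]
    return (math.log(value) - statistics.mean(logs)) / statistics.stdev(logs)

# Hypothetical history whose geometric mean is $100.
history = [80.0, 100.0, 125.0]

print(f"$50 day:  {log_z_score(50.0, history):+.2f}")
print(f"$200 day: {log_z_score(200.0, history):+.2f}")
```

On the log scale a $50 day and a $200 day sit the same distance from a $100 centre, in opposite directions, which is often closer to how earnings multiply up and down than a symmetric dollar range.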

Scruffy

7:53 pm on Mar 8, 2006 (gmt 0)

10+ Year Member



You are making the assumption that the value of a click is in itself a 'normal distribution' around a mean value.

There is no way of knowing the distribution of monetary values that apply to your pages but I doubt very much that it is even vaguely 'normal'. As to what its SD might be? - anyone's guess.

The only statistical analysis that can be performed is an estimate of (the variation of) the daily/monthly click rate. Beyond that there simply isn't enough information to generate any meaningful conclusion.

Given that fact, the best you can do is to convert 'expected clicks' into 'earnings' by factoring in an average value for clicks. If you can even work out what that might be.

toldan

9:48 pm on Mar 8, 2006 (gmt 0)



"I'm not very good with stat math, so I was very happy to follow 21_blue's post some time back about setting up averages and standard deviation formulas in my AS Excel sheets."

Standard deviation does not mean anythinig. It's worthless when it comes to Adsense. You can go from $15/day to $1500 next day. It doesn't mean you are a cheater. It's many factors: traffic, ad positioning, sources of traffic etc.

Don't bother with standard deviations, trust me - they are worthless when it comes to Adsense.

rbacal

2:47 am on Mar 9, 2006 (gmt 0)



1) Nonni is right on the nose on all points (I also used to make my living as a research statistician).

2)

"Standard deviation does not mean anythinig. It's worthless when it comes to Adsense. You can go from $15/day to $1500 next day. It doesn't mean you are a cheater. It's many factors: traffic, ad positioning, sources of traffic etc."

Why is it that people who don't seem to know even the basics about something (i.e., SD or other statistical calculations, their use, etc.) insist on educating us with ridiculous information?

To an earthworm, quantum physics does not mean anythinig (sic). For me photons aren't really informative, but ehh...maybe for those that understand these things, they have a use.

casting off...knit perl.

dibbern2

5:57 am on Mar 9, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



<quote>You can go from $15/day to $1500 next day.</quote>

No you can't, Toldan.

Thez

2:26 pm on Mar 9, 2006 (gmt 0)

10+ Year Member



The drop in your earnings I've explained here:
[webmasterworld.com ]