|Long Tail Searches, a statistical perspective|
Let's nail some definitions
Given the recent 'MayDay' update, there has been a lot of talk about the long tail. I have never before tried to pull stats on such searches directly out of my logs.
I have in front of me monthly lists of the keyphrases that brought visitors to my site and the number of hits for each phrase. Let's imagine this is my data set.
blue widgets 1
fluffy widgets 1
blue fluffy widgets 1
super blue fluffy widgets 1
super blue fluffy acme widgets 1
I am going to reduce that to the number of words in each keyphrase and number of hits. This will give me the following.
Now suppose I arbitrarily want to say that anything with 4 or more words is a 'long tail' search. I could present that in two ways.
I could say that of the 7 unique searches used to find my site, 2 are long tail, so about 29% (I would perhaps call this LTU4).
Alternatively I could say that 2 of the 100 searches to hit my site were long tail, therefore 2% (Perhaps the LTH4)
Another stat to consider is the average length of keyphrase. Again I could use either of the above methodologies: either 18 words divided by 7 unique phrases, giving a KeyPhrase Length (KPL) of 2.57, or, counting each phrase once per hit, 111 words divided by 100 searches, giving a length of 1.11.
I'd be interested to know what sort of strategies others use and if you think the ones I'm suggesting are at all useful! At some point I will process my last few months' logs and see how things shaped up and post the results for May in the ongoing Google thread.
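For anyone who wants to try these measures on their own logs, here is a minimal Python sketch of the four figures. Note the assumptions: the listing above shows only five phrases, while the totals quoted (7 phrases, 100 hits, 18 words, 111 words across all hits) imply two more one-word phrases sharing 95 hits. The phrases "widgets" and "widget" and the 90/5 split are placeholders I have made up to match those totals, and the function name `long_tail_stats` is purely illustrative.

```python
# Hypothetical data set as (keyphrase, hits) pairs.
# The first two rows are PLACEHOLDERS: the totals in the post imply two
# one-word phrases with 95 hits between them, but the real phrases and
# the 90/5 split are assumptions.
data = [
    ("widgets", 90),  # placeholder
    ("widget", 5),    # placeholder
    ("blue widgets", 1),
    ("fluffy widgets", 1),
    ("blue fluffy widgets", 1),
    ("super blue fluffy widgets", 1),
    ("super blue fluffy acme widgets", 1),
]


def long_tail_stats(rows, min_words=4):
    """Return LTU, LTH and both keyphrase-length averages.

    A phrase counts as long tail when it has at least `min_words` words.
    """
    # Reduce each phrase to (word count, hits).
    lengths = [(len(phrase.split()), hits) for phrase, hits in rows]
    n_phrases = len(lengths)
    n_hits = sum(hits for _, hits in lengths)

    lt_phrases = sum(1 for words, _ in lengths if words >= min_words)
    lt_hits = sum(hits for words, hits in lengths if words >= min_words)

    return {
        "LTU": lt_phrases / n_phrases,  # share of unique phrases
        "LTH": lt_hits / n_hits,        # share of hits
        "KPL_unique": sum(words for words, _ in lengths) / n_phrases,
        "KPL_hits": sum(words * hits for words, hits in lengths) / n_hits,
    }


stats = long_tail_stats(data)
for name, value in stats.items():
    print(f"{name}: {value:.4f}")
```

Changing `min_words` gives LTU3, LTU5 and so on, which makes it easy to see how sensitive the figures are to the arbitrary cut-off.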
Check whether these LT keywords earn you anything (money, leads, etc.).
If yes, optimize some content for them; otherwise sit back and enjoy the traffic you get from your widgety gizmos, which account for 95% of your traffic and probably have a much higher earnings potential.
All other statistical exercises just waste time (imho).
In my opinion, long tail doesn't necessarily equate to number of words or length of query. I think it has more to do with obscurity, and the infrequency of the search.
For instance, a popular product may have a lot of searches on its name, description, and model number. These are all short tail searches, even if they are several words. However, this product may also have a manufacturer's stocking number or UPC code that is very obscure. This may only be one "word", but it is definitely long tail and likely converts better than the short tail terms, even though the number of searches is relatively low.
Others may define it differently, but that's the way I use long-tail in segmenting my search terms.
|All other statistical exercises just waste time (imho). |
Point taken, I just love stats though, so for me this is out of pure interest.
I'm not too worried about how the LT converts, and I'm not out to deliberately optimise for it. I am purely interested in spotting any shifts in how search engines handle the LT.
|I think it has more to do with obscurity, and the infrequency of the search. |
Excellent point. Quite often people will quote it as the length of a search term, but you are right: the strict definition of LT searches is all about frequency. I wonder if the ratio of single-hit search terms over a given period is a useful enough measure.
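As a rough sketch of that single-hit measure, something like the following would do. The function name `single_hit_ratios` and the sample counts are hypothetical, not taken from real logs. It reports the ratio two ways, mirroring the unique-versus-hits split used earlier.

```python
def single_hit_ratios(hit_counts):
    """Return (share of terms with exactly one hit, share of all hits
    that came from such terms) for one reporting period."""
    if not hit_counts:
        raise ValueError("need at least one search term")
    total_terms = len(hit_counts)
    total_hits = sum(hit_counts)
    singles = sum(1 for hits in hit_counts if hits == 1)
    # Each single-hit term contributes exactly one hit, so `singles`
    # is also the number of hits coming from single-hit terms.
    return singles / total_terms, singles / total_hits


# Hypothetical monthly hit counts, one entry per unique search term.
counts = [90, 5, 1, 1, 1, 1, 1]
term_share, hit_share = single_hit_ratios(counts)
print(f"single-hit terms: {term_share:.1%} of terms, {hit_share:.1%} of hits")
```

Tracking both numbers month to month would show whether a shift came from fewer distinct rare terms or just less traffic from them.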
I can share some numbers from Google Analytics that might be of interest.
"Search sent 287,613 non-paid visits via 38,977 keywords"
The keywords aren't real but the numbers are. However, the mixture of product and company name variants is true to the original.
Keyword 1 - 64,243 (coyote widgets) (67% new visits)
Keyword 2 - 35,073 (Acme) (29% new visits)
Keyword 3 - 14,192 (Acme widgets) (34% new visits)
Keyword 4 - 6,142 (coyotes widgets) (66% new visits)
Keyword 5 - 5,089 (Acme Corporation) (28% new visits)
Keyword 6 - 5,034 (coyote widgets region) (68% new visits)
Keyword 7 - 4,667 (Acme coyote widgets) (37% new visits)
Keyword 8 - 3,630 (example.com) (34% new visits)
Keyword 9 - 3,117 (coyote widget) (68% new visits)
Keyword 10 - 3,050 (roadrunner widgets) (76% new visits)
Keyword 20 - 1,278 (www.example.com) (36% new visits)
The top ten terms brought half the organic search traffic. Thousands of terms had two to five searches each, and the "onesies" (terms with a single visit) began at term 9697.
The terms with only a few searches were sometimes just weird things that happened to match, but in most cases the relevance was strong.
Note the pattern: searches which only mention a generic product name or descriptor are roughly twice as likely to bring first-time visitors as a search which includes some variation of the company name or domain name. That was consistent as far as I checked down the long tail.