|How does your content rate in the 'Language Filter'? Possible Signal?|
|..ever looked at "Search Tools" "All Results" "Reading levels" before and after successful improvement of his pages. |
Observations (Advanced Level Selection)
My site (which teaches grammar) fails the filter.
The 'blank site' (3 word content) ranks #2 for the competitive term it's associated with.
The 'language filter' works by detecting bad grammar or usage, rather than focusing on quality. (Fair enough)
If used as a signal this would mean that 'less content' would rank higher, as it has less chance of 'failing' that filter.
A grammar site 'failed' the filter (Teachers + Students = excellent + low quality language)
A blank site 'passed' the filter (no content = no bad 'hits')
Evidence in the SERPS:
[webmasterworld.com...] (is less content better?)
[webmasterworld.com...] (the blank site, partial EDM)
The almost global drop in UGC content.
How do you rank?
Try it out here: [google.com...]
(Please state the type of site you're running)
Niche directory site. Most directory pages rank as basic, some as intermediate. Pages displaying current data (update hourly) are intermediate.
For search terms where I'm always #1, the site disappears when I use the advanced filter.
These are not articles, and by their nature not structured for grammar, but contain plenty of text. Looking at the other sites that hold on for the advanced filter, it doesn't look like grammar is their strength either. Detailed complex technical terms (and plenty of them) are driving it here.
Note that it's called Reading Level, not grammar level.
@treeline We did a little, basic test on another thread:
The results (for my site / forum) show that the public SERPS appear to use a setting of 'basic + a fraction'
I.e.: when you look at the 'basic' setting, then look at the 'intermediate' setting, the public results seem to be somewhere in between (nearer to basic).
Do you see the same?
|The 'language filter' works by detecting bad grammar or usage, rather than focusing on quality. (Fair enough) |
I suspect that it has nothing to do with bad grammar. It's probably just the Flesch-Kincaid reading level formula. You can check Wikipedia for details, but it boils down to how many words there are per sentence and how many syllables per word.
You can get some pretty awful copy to show up as "Advanced" simply by creating run-on sentences with lots of long, multi-syllable words. "Advanced" does not necessarily imply "Good", and "Basic" does not imply "Bad".
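To make that concrete, here is a rough sketch of the Flesch-Kincaid grade level formula in Python. Nothing here is confirmed as what Google actually runs; it is only the guess above turned into code. The function names are made up for illustration, and the syllable counter is a crude vowel-group heuristic (an assumption, real tools use dictionaries or better rules).

```python
import re


def count_syllables(word):
    """Crude estimate (assumption): one syllable per run of vowels."""
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    # Treat a trailing 'e' as silent, except in endings like 'table'.
    if word.endswith("e") and not word.endswith("le") and count > 1:
        count -= 1
    return max(count, 1)


def flesch_kincaid_grade(text):
    """Flesch-Kincaid grade level; returns None if there is no usable text."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        return None
    syllables = sum(count_syllables(w) for w in words)
    words_per_sentence = len(words) / len(sentences)
    syllables_per_word = syllables / len(words)
    return 0.39 * words_per_sentence + 11.8 * syllables_per_word - 15.59


plain = "The cat sat on the mat. The dog ran."
bombastic = (
    "Notwithstanding considerable organizational complexity, institutional "
    "representatives systematically prioritize comprehensive methodological "
    "considerations."
)
print(flesch_kincaid_grade(plain))
print(flesch_kincaid_grade(bombastic))
```

The second sample says almost nothing, but it scores far higher than the first purely because of sentence length and syllable count. Grammar and vocabulary quality never enter into the formula.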
|The results (for my site / forum) show that the public SERPS appear to use a setting of 'basic + a fraction' |
Well....that's roughly what is at the top for many terms, for others it's the intermediates, but the advanced are on the first page for many terms.
What I think is happening is this filter isn't used at all on normal search results pages and that these happen to be the pages that are "most useful" according to Google's normal thinking.
Sort of like on advanced image search you can ask just to have small, medium, or big pictures.
|The 'language filter' works by detecting bad grammar or usage, rather than focusing on quality. |
And, apparently, no attention at all to vocabulary. Last time this came up, I found it pretty hilarious that even the most technical areas of my site came in at "basic". This time it's funnier still, because I've added stuff. According to Google, Volume II of Gairdner's edition of the Paston Letters (the earliest letters, cutting out in 1454) is Basic. One or two later volumes may cross the line to Intermediate.
I am relieved to find that they threw up their hands at my one offering from the Early English Text Society, and didn't even guess at a reading level. Could they not recognize it as English, although at least half the text is modern (i.e. no later than 1922)? On the same list, they didn't hesitate to offer one page that's 95% in a language Google doesn't know, and another that's 99% Italian.
Where's that "noidea" emoticon when you need it?
|this filter isn't used at all on normal search results pages |
It would explain a lot if it were, though, a real lot. Mainly that almost no UGC shows up after basic. As I said, I think it's at about BASIC+.3 if it is being used.
If what Rish3 supposes is true, it's a pretty terrible way to analyse content. At any rate, it does appear (as mentioned) to simply favour long bombastic phrases, and it rewards empty pages with 'advanced' (that empty page did have some Slovakian in the sidebar). Judging by the resulting data (thanks @lucy24), it is a very immature technology.
I hope you're right. Something like this should not be in a complex algorithm that affects so many people's time, effort and livelihoods.