Forum Moderators: Robert Charlton & goodroi
There is a system of points assigned to emails.
The spam threshold is usually 5.
No single attribute can reach this on its own.
Only a combination of more than one attribute can reach 5.
Sure, there are also many false positives: important emails blocked by the spam filter.
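A minimal sketch of such a point system, to make the idea concrete. The rule names and their points are invented for illustration (they are not real filter rules); only the threshold of 5 comes from the description above.

```python
# Assumed cutoff, taken from the "usually 5" figure above.
SPAM_THRESHOLD = 5.0

# Hypothetical rules and points; each one alone stays below the threshold.
RULE_POINTS = {
    "HYPOTHETICAL_RULE_A": 1.5,
    "HYPOTHETICAL_RULE_B": 2.5,
    "HYPOTHETICAL_RULE_C": 1.2,
}


def spam_score(triggered_rules):
    """Add up the points of every rule the message triggered."""
    return sum(RULE_POINTS[name] for name in triggered_rules)


def is_spam(triggered_rules):
    """A message is marked as spam once its total reaches the threshold."""
    return spam_score(triggered_rules) >= SPAM_THRESHOLD


print(is_spam(["HYPOTHETICAL_RULE_B"]))   # one rule alone
print(is_spam(list(RULE_POINTS)))         # all three rules together
```

With these made-up numbers, no single rule marks a message as spam, but all three together do.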
I think there is something similar at Google.
Everything we hear, like "My site lost 80% of Google traffic"
or "My site escaped from the filter",
is caused by
* New attributes
* Changing attributes
* Changing the values of attributes
Let's imagine domain A has these spam points and is in the filter:
R10 2
R11 2
R12 2
Let's imagine domain B has these spam points and is easy to find:
R10 2
R15 1
R16 1
Now maybe there is some sort of conference at Google.
They give the rules new values:
R10=2
R11=1
R12=1
R15=2
R16=2
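Here is a rough sketch of what such a re-weighting does to domains A and B. The rule values are the ones from the example; the cutoff of 5 is an assumption borrowed from the email analogy, since nobody outside Google knows whether such a threshold exists or where it sits.

```python
# Assumed cutoff, borrowed from the email spam analogy.
THRESHOLD = 5

# Rules each domain triggers, as in the example.
DOMAIN_RULES = {
    "A": ["R10", "R11", "R12"],
    "B": ["R10", "R15", "R16"],
}

OLD_VALUES = {"R10": 2, "R11": 2, "R12": 2, "R15": 1, "R16": 1}
NEW_VALUES = {"R10": 2, "R11": 1, "R12": 1, "R15": 2, "R16": 2}


def score(domain, values):
    """Total spam points of a domain under a given set of rule values."""
    return sum(values[rule] for rule in DOMAIN_RULES[domain])


def filtered(domain, values):
    """A domain is in the filter once its total reaches the threshold."""
    return score(domain, values) >= THRESHOLD


for domain in DOMAIN_RULES:
    print(
        domain,
        "old:", score(domain, OLD_VALUES), filtered(domain, OLD_VALUES),
        "new:", score(domain, NEW_VALUES), filtered(domain, NEW_VALUES),
    )
```

With these numbers A goes from 6 points to 4 and B from 4 points to 6, although neither webmaster changed anything on the site.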
The next day, we have a post here from webmaster A
"Great, my domain escaped from the filter"
and a reply from webmaster B
"I lost 80% of my Google traffic"
Email spam and web spam are very different critters
But the effects are the same.
It's like an email that has 4,9 spam points and comes straight into my inbox, or has 5,1 spam points and is delivered to the spam folder.
There is nothing in between. It's either marked as spam or not marked as spam. It's like being pregnant or not: there is nothing in between.
Let's assume there are 3 different domains.
A: 4,9
B: 4,5
C: 2,0 spam points
Now suddenly a new spam rule is introduced and applied to all 3 domains. It's only a minor rule worth 0,2 points. Now the rating is
A: 5,1 filtered
B: 4,7
C: 2,2
Now the webmaster tries to find all the differences between A, B and C. Why is A filtered and B and C not?
Maybe he discovers exactly the small problem that brings the 0,2 spam points, but he does not apply the fix, because all 3 domains have the same bad feature and only one of them was moved into the filter.
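The three-domain example can be worked through in a few lines. This is only a sketch: the scores are the ones from the example, the threshold of 5 is assumed from the email analogy, and Decimal is used so the 0,2 is added exactly.

```python
from decimal import Decimal

# Assumed cutoff, borrowed from the email spam analogy.
THRESHOLD = Decimal("5.0")

# The new minor rule applies to all three domains.
NEW_RULE_POINTS = Decimal("0.2")

before = {"A": Decimal("4.9"), "B": Decimal("4.5"), "C": Decimal("2.0")}
after = {name: points + NEW_RULE_POINTS for name, points in before.items()}

# Domains that were below the threshold before and are at or above it now.
newly_filtered = [
    name for name in before
    if before[name] < THRESHOLD <= after[name]
]

for name in before:
    print(name, before[name], "->", after[name])
print("newly filtered:", newly_filtered)
```

All three domains gain the same 0,2 points, yet only A crosses the line, so comparing A against B and C will not point to the new rule.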
This was always my problem with the June 27th 2006 disaster. 10 subdomains, some in the filter, some not, moving out, moving in.
And I always tried to find a feature common to all the filtered subdomains, but I could not find one.
I've worked a fair bit with SpamAssassin and I think you may be on to something. SA is a fairly complex piece of work and can be set to 'learn' from who you mail to and how you handle suspected spam.
Also, a long time ago, I took some post-master's statistics (and realized that stuff was for smart people). Anyway, one of the things we studied was multiple variables and the interaction between variables.
I think your hypothesis is totally valid, except I think Google might have an item or two in the list that will mark the site as spam regardless of other factors (that's not supported by anything on my part, just a feeling). Regardless, your theory would still hold true for most spam indicators.
If you are right, then it could mean that cause and effect get blurry.
i.e.
- Rule 15 changes something that pushes a site into low rankings.
- The site owner thinks it's related to something that falls under rule 12 and changes that, which just happens to get them back under the penalty threshold.
- The owner thinks that it was something new Google did with rule 12, but in reality it was rule 15 that pushed them over the edge.
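The three steps above can be played through with made-up numbers. Everything here is hypothetical (the rule values, the threshold of 5, and the idea that the site triggers exactly these three rules); it only shows how the fix and the real cause can end up being different rules.

```python
# Assumed cutoff, borrowed from the email spam analogy.
THRESHOLD = 5.0

# Hypothetical rule values before and after the change; only R15 changes.
values_before = {"R10": 1.5, "R12": 2.0, "R15": 1.0}
values_after = dict(values_before, R15=2.0)

# Hypothetical site that triggers all three rules.
site_rules = {"R10", "R12", "R15"}


def score(rules, values):
    """Total spam points for the rules a site triggers."""
    return sum(values[rule] for rule in rules)


score_before = score(site_rules, values_before)          # before the change
score_after = score(site_rules, values_after)            # R15 now counts more
score_fixed = score(site_rules - {"R12"}, values_after)  # owner fixes R12

# What actually changed on the filter side.
changed_rules = sorted(
    rule for rule in values_before
    if values_before[rule] != values_after[rule]
)

print(score_before, score_after, score_fixed, changed_rules)
```

The owner sees the site recover after fixing R12 and concludes that R12 was the new problem, while the only value that changed was R15.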
I'm not sure if I can post a link to SpamAssassin, but anyone who wants to can look up SA and see how the rules work and the cumulative nature of the indicators.
cg