I was wondering whether Google (or any other search engine) uses any automated techniques to detect spam.
For instance: to locate doorway pages (or parked domains), you could check the number of pages per domain. If it equals one, there is a good chance the domain is a doorway or a parked domain.
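As a rough illustration of that pages-per-domain check, here is a minimal Python sketch. It assumes you already have a list of crawled URLs; the function name `single_page_domains` and the sample URLs are made up for the example, and this is not a description of what any search engine actually runs.

```python
from collections import Counter
from urllib.parse import urlparse


def single_page_domains(urls):
    """Return the domains for which exactly one distinct page was seen."""
    pages_per_domain = Counter()
    seen_pages = set()
    for url in urls:
        parsed = urlparse(url)
        domain = parsed.netloc.lower()
        # Treat an empty path as the root page so "http://a.test" == "http://a.test/".
        page = (domain, parsed.path or "/")
        if domain and page not in seen_pages:
            seen_pages.add(page)
            pages_per_domain[domain] += 1
    return sorted(d for d, n in pages_per_domain.items() if n == 1)


# Hypothetical crawl data, for illustration only.
crawled = [
    "http://example.com/",
    "http://example.com/about",
    "http://parked.example.net/",
]
print(single_page_domains(crawled))
```

A single-page domain is only a hint, of course: plenty of legitimate sites have one page, so a real system would have to combine this with other signals.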
But as far as I can see, there is no automated spam detection for web pages (like there is for email)...
or is there?
The latest antispam tools are called Bayesian semantic filters. Basically, they are an attempt to program "natural" language traits into computers, allowing them to detect machine-generated text, duplicate content, etc.
Believe me, these spam filters are a little more sophisticated than most people believe :)
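To make the Bayesian idea a bit more concrete, here is a toy sketch of a plain naive Bayes word-count classifier, the textbook approach behind Bayesian email filters. It is an assumption-laden illustration, not what any search engine uses: the class name `NaiveBayesFilter`, the "spam"/"ham" labels and the training sentences are all invented, and real filters look at far more than single words.

```python
import math
from collections import Counter


class NaiveBayesFilter:
    """Toy naive Bayes text classifier with add-one (Laplace) smoothing."""

    def __init__(self):
        self.word_counts = {"spam": Counter(), "ham": Counter()}
        self.doc_counts = {"spam": 0, "ham": 0}

    def train(self, text, label):
        self.doc_counts[label] += 1
        self.word_counts[label].update(text.lower().split())

    def spam_probability(self, text):
        # Requires at least one training example for each label.
        vocab = set(self.word_counts["spam"]) | set(self.word_counts["ham"])
        total_docs = sum(self.doc_counts.values())
        log_score = {}
        for label in ("spam", "ham"):
            counts = self.word_counts[label]
            total_words = sum(counts.values())
            # Log prior: share of training documents with this label.
            score = math.log(self.doc_counts[label] / total_docs)
            for word in text.lower().split():
                # Smoothed likelihood, so unseen words never give log(0).
                score += math.log((counts[word] + 1) / (total_words + len(vocab)))
            log_score[label] = score
        # Turn the two log scores into P(spam | text); clamp to avoid overflow.
        diff = max(min(log_score["ham"] - log_score["spam"], 700.0), -700.0)
        return 1.0 / (1.0 + math.exp(diff))


# Invented training data, for illustration only.
spam_filter = NaiveBayesFilter()
spam_filter.train("buy cheap pills now", "spam")
spam_filter.train("cheap pills cheap offer", "spam")
spam_filter.train("meeting notes for the project", "ham")
spam_filter.train("project schedule and notes", "ham")

print(spam_filter.spam_probability("cheap pills") > 0.5)
print(spam_filter.spam_probability("project notes") < 0.5)
```

With this tiny training set, text made of words seen mostly in the spam examples scores above 0.5 and text made of "ham" words scores below it. Detecting machine-generated or duplicate content would need richer features than word counts, which is where the "more sophisticated" part comes in.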