Home / Forums Index / Marketing and Biz Dev / SEM Research Topics
Forum Library, Charter, Moderators: phranque

SEM Research Topics Forum

Spam detection techniques
What techniques does Google use to detect spam?

 7:28 pm on Nov 23, 2004 (gmt 0)


Does anyone have a clue about which algorithms, filters, or rules Google uses to detect spam pages? By what distinctive marks of a web page can Google identify spam?




 8:19 pm on Nov 28, 2004 (gmt 0)

Funny question. As there is no "this is spam" definition, there can't be a clear answer to your question.
There are hundreds of different ways of spamming.
If you just need one simple example, it's hidden text.
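
To make the hidden text example concrete, here is a minimal sketch of how a crawler could flag it. It only looks at inline style attributes and treats three things as suspicious: display:none, visibility:hidden, and a text color identical to the background color. The function names (parse_style, looks_hidden, find_hidden_tags) are made up for illustration, and nothing here is known to be what any search engine actually runs.

```python
from html.parser import HTMLParser


def parse_style(style):
    """Turn an inline style string into a {property: value} dict."""
    props = {}
    for declaration in style.split(";"):
        if ":" in declaration:
            name, value = declaration.split(":", 1)
            props[name.strip().lower()] = value.strip().lower()
    return props


def looks_hidden(props):
    """Heuristic (an assumption): True if inline styles suggest invisible text."""
    if props.get("display") == "none" or props.get("visibility") == "hidden":
        return True
    color = props.get("color")
    background = props.get("background-color")
    # Same literal value for text and background, e.g. white on white.
    return color is not None and color == background


class HiddenTextFinder(HTMLParser):
    """Collects the names of tags whose inline style matches looks_hidden()."""

    def __init__(self):
        super().__init__()
        self.suspicious = []

    def handle_starttag(self, tag, attrs):
        style = dict(attrs).get("style")
        if style and looks_hidden(parse_style(style)):
            self.suspicious.append(tag)


def find_hidden_tags(html):
    """Return the tag names in html that carry a suspicious inline style."""
    finder = HiddenTextFinder()
    finder.feed(html)
    return finder.suspicious
```

A real check would also have to resolve external stylesheets, inherited backgrounds, and near-identical colors, which this sketch ignores.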


 2:01 pm on Nov 29, 2004 (gmt 0)

I know spam when I see it, indeed... and there are a lot of techniques (one of the most obvious being invisible text).

I was wondering if there are any automated techniques that Google (or any other SE) uses.

For instance: to locate doorway pages (or parked domains), a check on the number of pages per domain can be done. If it equals one, there is a good chance it is a doorway or parked domain.
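
The pages-per-domain check could be sketched roughly like this, assuming you already have a list of crawled URLs. The function name single_page_domains is hypothetical, and a count of one is only a hint, since plenty of legitimate sites have a single page.

```python
from collections import defaultdict
from urllib.parse import urlparse


def single_page_domains(urls):
    """Return the hosts for which exactly one distinct path was crawled."""
    pages = defaultdict(set)
    for url in urls:
        parsed = urlparse(url)
        host = parsed.netloc.lower()
        if host:
            # Treat an empty path as the root page.
            pages[host].add(parsed.path or "/")
    return sorted(host for host, paths in pages.items() if len(paths) == 1)


crawled = [
    "http://a.example/",
    "http://a.example/about",
    "http://b.example/",
]
candidates = single_page_domains(crawled)  # hosts worth a closer look
```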

But as far as I can see now, there is no automated spam detection for web pages (like with email)...

or is there?


 2:12 pm on Nov 29, 2004 (gmt 0)

Actually, there is, and the basic methods are quite similar (analyzing text, hosting, domain, content, etc.).

The latest antispam tools are called Bayesian semantic filters. Basically, it's an attempt to program "natural" language traits into computers, allowing them to detect machine-generated text, duplicate content, etc.

Believe me, these spam filters are a little more sophisticated than most people believe :)
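
For a feel of the Bayesian idea, here is a minimal sketch of a plain naive Bayes text classifier with add-one smoothing, the kind used for email spam. It is a stand-in, not the semantic filter described above, and the class name, labels, and training sentences are all made up for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


class NaiveBayes:
    """Two-class naive Bayes over word counts ("spam" vs "ham")."""

    def __init__(self):
        self.word_counts = {"spam": Counter(), "ham": Counter()}
        self.doc_counts = Counter()

    def train(self, text, label):
        self.doc_counts[label] += 1
        self.word_counts[label].update(tokenize(text))

    def log_score(self, text, label):
        # Assumes at least one training document exists for each label.
        vocab = set(self.word_counts["spam"]) | set(self.word_counts["ham"])
        total = sum(self.word_counts[label].values())
        score = math.log(self.doc_counts[label] / sum(self.doc_counts.values()))
        for word in tokenize(text):
            count = self.word_counts[label][word]
            # Add-one smoothing so unseen words do not zero out the score.
            score += math.log((count + 1) / (total + len(vocab)))
        return score

    def classify(self, text):
        return max(("spam", "ham"), key=lambda label: self.log_score(text, label))


model = NaiveBayes()
model.train("cheap pills cheap pills buy now", "spam")
model.train("buy cheap pills online now", "spam")
model.train("our forum discusses search engine research", "ham")
model.train("read the library and forum charter", "ham")
```

With this toy training set, model.classify("buy cheap pills") leans toward "spam" because those words are far more frequent in the spam examples.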


 5:40 pm on Dec 8, 2004 (gmt 0)

This will help:



 8:38 am on Dec 9, 2004 (gmt 0)

The link doesn't seem to work?

All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved