
SEM Research Topics Forum

    
Spam detection techniques
What techniques does Google use to detect spam?
Rooseboom

10+ Year Member



 
Msg#: 736 posted 7:28 pm on Nov 23, 2004 (gmt 0)

Hi,

Does anyone have a clue what algos/filters/rules Google uses to detect spam pages? By what distinct marks of a web page can Google identify spam?

Thx

 

manute

5+ Year Member



 
Msg#: 736 posted 8:19 pm on Nov 28, 2004 (gmt 0)

funny question. as there is no "this is spam" definition, there can't be a clear answer to your question.
there are hundreds of different ways of spamming.
if you just need one simple example, it's hidden text.
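To make the hidden text example concrete, here is a minimal sketch of how a crawler could flag it. It is an illustration only, not a description of what any search engine actually does. It assumes well-formed HTML and looks only at inline `style` attributes (text colour equal to background colour, `display:none`, `visibility:hidden`); the names `HiddenTextFinder` and `looks_hidden` are made up for this sketch.

```python
from html.parser import HTMLParser

# Elements that never have a closing tag, so they must not go on the stack.
VOID_TAGS = {"br", "img", "hr", "input", "meta", "link"}


def parse_style(style):
    """Turn an inline style string into a {property: value} dict."""
    props = {}
    for decl in style.split(";"):
        if ":" in decl:
            name, value = decl.split(":", 1)
            props[name.strip().lower()] = value.strip().lower()
    return props


def looks_hidden(style):
    """Heuristic: element is not displayed, or text colour equals background."""
    props = parse_style(style)
    if props.get("display") == "none" or props.get("visibility") == "hidden":
        return True
    color = props.get("color")
    return color is not None and color == props.get("background-color")


class HiddenTextFinder(HTMLParser):
    """Collects text found inside elements whose inline style hides them."""

    def __init__(self):
        super().__init__()
        self.stack = []        # one bool per open element: is it hidden?
        self.hidden_text = []  # text fragments found inside hidden elements

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        style = dict(attrs).get("style") or ""
        parent_hidden = bool(self.stack) and self.stack[-1]
        # Children of a hidden element are hidden too.
        self.stack.append(parent_hidden or looks_hidden(style))

    def handle_endtag(self, tag):
        if tag in VOID_TAGS:
            return
        if self.stack:
            self.stack.pop()

    def handle_data(self, data):
        if self.stack and self.stack[-1] and data.strip():
            self.hidden_text.append(data.strip())
```

Feed a page with `finder = HiddenTextFinder(); finder.feed(html)` and inspect `finder.hidden_text`. Styles set in external stylesheets or by scripts are not seen by this check.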

Rooseboom

10+ Year Member



 
Msg#: 736 posted 2:01 pm on Nov 29, 2004 (gmt 0)

I know spam when I see it, indeed... and there are a lot of techniques (one of the most obvious being invisible text).

I was wondering if there are any automated techniques which Google (or any other SE) uses.

For instance: to locate doorway pages (or parked domains), a check on the number of pages per domain can be done. If it equals one, there is a good chance it is a doorway or parked domain.
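The pages-per-domain check could be sketched roughly like this. It assumes you already have a list of crawled URLs, counts distinct paths per hostname (not per registered domain, which would need a public suffix list), and flags hosts with exactly one page. The function name `single_page_hosts` is hypothetical, and a single page is only a hint, not proof of a doorway.

```python
from collections import defaultdict
from urllib.parse import urlsplit


def single_page_hosts(urls):
    """Return the hosts for which the crawl found exactly one distinct page."""
    pages = defaultdict(set)
    for url in urls:
        parts = urlsplit(url)
        host = parts.hostname  # lower-cased by urlsplit, None if missing
        if host:
            # Treat an empty path and "/" as the same page.
            pages[host].add(parts.path or "/")
    return sorted(host for host, paths in pages.items() if len(paths) == 1)


crawled = [
    "http://example.com/",
    "http://example.com/about",
    "http://parked.example.net/",
    "http://parked.example.net",
]
suspects = single_page_hosts(crawled)  # candidates for a closer look
```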

But as far as I can see now, there is no automated spam detection for web pages (like there is for email)...

or is there?

Sanenet

WebmasterWorld Senior Member 10+ Year Member



 
Msg#: 736 posted 2:12 pm on Nov 29, 2004 (gmt 0)

Actually, there is, and the basic methods are quite similar (textual analysis, plus checks on hosting, domain, content, etc.).

The latest anti-spam tools are called Bayesian semantic filters. Basically, they are an attempt to program "natural" language traits into computers, allowing them to detect machine-generated text, dup content, etc.
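For anyone who has not seen a Bayesian filter, below is a minimal sketch of the plain naive Bayes word classifier familiar from email spam filtering. It is not the "semantic" filter described above and not anything a search engine is known to run; the class name, the "spam"/"ham" labels and the whitespace tokenizer are assumptions made for illustration. Both labels need at least one training text before `classify` is called.

```python
import math
from collections import Counter


class NaiveBayes:
    """Multinomial naive Bayes over whitespace tokens, with add-one smoothing."""

    LABELS = ("spam", "ham")

    def __init__(self):
        self.word_counts = {label: Counter() for label in self.LABELS}
        self.doc_counts = Counter()

    def train(self, text, label):
        self.word_counts[label].update(text.lower().split())
        self.doc_counts[label] += 1

    def log_score(self, text, label):
        counts = self.word_counts[label]
        total = sum(counts.values())
        vocab = len(set(self.word_counts["spam"]) | set(self.word_counts["ham"]))
        # Log prior: share of training texts that carry this label.
        score = math.log(self.doc_counts[label] / sum(self.doc_counts.values()))
        for word in text.lower().split():
            # Add-one smoothing keeps unseen words from zeroing the score.
            score += math.log((counts[word] + 1) / (total + vocab))
        return score

    def classify(self, text):
        return max(self.LABELS, key=lambda label: self.log_score(text, label))


nb = NaiveBayes()
nb.train("cheap pills cheap casino pills", "spam")
nb.train("casino bonus cheap cheap", "spam")
nb.train("how to prune apple trees", "ham")
nb.train("apple tree pruning guide", "ham")
```

With this toy training set, `nb.classify("cheap casino bonus")` comes out as spam and `nb.classify("apple tree guide")` as ham. A real filter would need far more training text and better features than single words.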

Believe me, these spam filters are a little more sophisticated than most people believe :)

Jon_King

WebmasterWorld Senior Member 10+ Year Member



 
Msg#: 736 posted 5:40 pm on Dec 8, 2004 (gmt 0)

This will help:

[newdbpubs.stanford.edu:8090...]

Rooseboom

10+ Year Member



 
Msg#: 736 posted 8:38 am on Dec 9, 2004 (gmt 0)

link doesn't seem to work?


WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved