Moderators: martinibuster

Yahoo Search Engine and Directory Forum

HOT - Yahoo research papers on spam, VLSI and other topics!
How will yahoo judge if something is spam?

10+ Year Member

Msg#: 3283330 posted 1:22 am on Mar 16, 2007 (gmt 0)

This is pretty interesting:


Some more cool stuff here:


And here:




marcia (WebmasterWorld Senior Member, WebmasterWorld Top Contributor of All Time, 10+ Year Member)

Msg#: 3283330 posted 3:20 pm on Mar 16, 2007 (gmt 0)

I've seen, printed out and read the second paper, but not the first, and it's a real eye-opener. I agree that this is HOT because it's kind of confirming some suspicions I've been toying with.

I've wondered about hyphens for quite a while, and have even re-done some sites to eliminate some hyphenated filenames and subdirectories (though I use them for ease of maintenance). I have also been going through search after search at Yahoo recently, looking for hyphens and underscores, and I have noticed a substantial absence of pages with hyphens in the top 20-30. It may not be just one of the factors mentioned; maybe it's a combination of factors that can push a site over the edge.

What I do wonder, though, is whether in reality they detect sites with what they consider to be negatives algorithmically, or use human review.

Sometimes they really are out to get paranoid people.


coopster (WebmasterWorld Administrator, WebmasterWorld Top Contributor of All Time, 10+ Year Member)

Msg#: 3283330 posted 11:53 pm on Mar 17, 2007 (gmt 0)

Are you referring to section 5 with regard to the hyphens?

Qualitative aspects of spam hosts
Finally, we wanted to evaluate the prevalence of different spamming aspects. For this end, and as
a preliminary study, we ran a second round of evaluations by sampling at random 200 hosts that
were tagged by at least two judges as Web spam. We wanted to examine the most relevant features
found in hosts that were tagged as spam. After inspection of these hosts, we decided to tabulate
them using the following (non-exclusive) criteria:

Keywords in URL: The host contains keywords in the URLs, separated by minus, underscore
or the plus sign. This is not necessarily a spamming aspect.


I think they have just discovered a "qualitative aspect" of many hosts then, spam host or not. I do note that they discount this as "not necessarily a spamming aspect" though.
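
For anyone who wants to see how many of their own URLs would match the "Keywords in URL" criterion quoted above, here is a minimal sketch. It assumes the check only looks at the URL path and treats minus, underscore and plus as the separators; the function names and the two-token threshold are my own, not anything from the paper. As the paper itself says, a match is not necessarily a spamming aspect, so this only reports the pattern and does not judge anything.

```python
import re
from urllib.parse import urlparse

# Separators named in the quoted criterion: minus, underscore, plus sign.
SEPARATORS = re.compile(r"[-_+]")


def separated_tokens(url):
    """Return alphabetic tokens from path segments that contain a separator.

    Hypothetical helper: only the path is inspected, not the host name.
    """
    path = urlparse(url).path
    tokens = []
    for segment in path.split("/"):
        if SEPARATORS.search(segment):
            stem = segment.rsplit(".", 1)[0]  # drop a file extension, if any
            tokens.extend(t for t in SEPARATORS.split(stem) if t.isalpha())
    return tokens


def has_separated_keywords(url, min_tokens=2):
    """True if the path holds at least min_tokens separator-delimited words.

    The threshold of 2 is an arbitrary assumption for illustration.
    """
    return len(separated_tokens(url)) >= min_tokens
```

Calling `has_separated_keywords("http://example.com/cheap-blue-widgets.html")` returns True, while a plain `http://example.com/about.html` returns False, which shows how easily perfectly ordinary sites fall under this criterion.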
