We look at a number of signals that suggest possible use of URL keyword stuffing, such as:
- Site size
- Number of hosts
- Number of words in host/domain names and path
- Host/domain/path keyword co-occurrence (inc. unigrams and bigrams)
- % of the site cluster comprised of top frequency host/domain name keywords
- Host/domain names containing certain lexicons/pattern combinations (e.g. [“year”, “event | product name”])
- Site/page content quality & popularity signals
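To make a couple of these signals concrete, here is a rough Python sketch of how word counts and host/path keyword co-occurrence could be pulled out of a single URL. This is only an illustration, not Bing's actual method: the function names, the tokenizing rule (split on anything that is not a letter or digit) and the example URL are all assumptions of mine.

```python
import re
from urllib.parse import urlsplit


def tokens(text):
    # Assumed tokenizer: split on any non-alphanumeric character, drop empties.
    return [t for t in re.split(r"[^a-z0-9]+", text.lower()) if t]


def bigrams(words):
    # Adjacent word pairs, e.g. ["cheap", "flights"] -> [("cheap", "flights")].
    return list(zip(words, words[1:]))


def url_keyword_features(url):
    # Hypothetical feature extractor for one URL.
    parts = urlsplit(url)
    host_words = tokens(parts.hostname or "")
    path_words = tokens(parts.path)
    return {
        "host_word_count": len(host_words),
        "path_word_count": len(path_words),
        # Unigrams and bigrams that appear in both the host and the path.
        "shared_unigrams": sorted(set(host_words) & set(path_words)),
        "shared_bigrams": sorted(set(bigrams(host_words)) & set(bigrams(path_words))),
    }


# Made-up URL, used only to exercise the function.
features = url_keyword_features("http://cheap-flights-deals.example.com/cheap-flights/")
```

Note that this simple tokenizer also counts "example" and "com" as host words. A real system would presumably treat the registered domain and TLD separately.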
To amplify this, we try to cluster sites (by various pivots such as domain, owner, etc.) and then look for patterns of the signals listed above in the same cluster. This helps improve detection precision because spammers often create dozens or hundreds of similar-looking sites.
Bing: URL Keyword Stuffing and Detection [blogs.bing.com]
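For anyone wondering what the clustering step might look like, below is a small Python sketch that groups sites by a pivot (here an "owner" field) and reports, per cluster, the most frequent host keyword and the share of hosts containing it. It is a guess at the general idea, not Bing's implementation. The site records, the field names and the choice to ignore the last host label are all hypothetical.

```python
import re
from collections import Counter, defaultdict


def host_keywords(host):
    # Words in the host name, ignoring the final label (assumed to be the TLD).
    name = ".".join(host.lower().split(".")[:-1])
    return [w for w in re.split(r"[^a-z0-9]+", name) if w]


def top_keyword_share(sites, pivot):
    # Group hosts by the chosen pivot, e.g. "owner".
    clusters = defaultdict(list)
    for site in sites:
        clusters[site[pivot]].append(site["host"])

    result = {}
    for key, hosts in clusters.items():
        counts = Counter()
        for host in hosts:
            # Count each keyword at most once per host.
            counts.update(set(host_keywords(host)))
        if not counts:
            continue
        keyword, n = counts.most_common(1)[0]
        # Fraction of hosts in the cluster that contain the top keyword.
        result[key] = (keyword, n / len(hosts))
    return result


# Made-up data for illustration only.
sites = [
    {"owner": "a", "host": "cheap-flights-2024.com"},
    {"owner": "a", "host": "cheap-flights-deals.net"},
    {"owner": "a", "host": "best-cheap-hotels.org"},
    {"owner": "b", "host": "gardening-notes.org"},
    {"owner": "b", "host": "city-library.org"},
]
shares = top_keyword_share(sites, "owner")
```

In this toy data every host in cluster "a" contains "cheap", while no keyword repeats across cluster "b". A high share across a large cluster would be the kind of pattern the quoted post describes. Very small clusters say little either way.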