Idea for new algorithm to prevent scraper sites from outranking you

Just a little more intelligence could weed the scum out of the index

9:01 pm on May 25, 2006 (gmt 0)

joined:May 5, 2006
I have a great idea for Google’s algorithm architects. There are over 65,000 scraper sites out there that have copied content from my web site alone, and probably from yours as well, and they rank higher than you in searches for your own content.

Maybe Matt Cutts can pass this suggestion over the wall to his coworkers:


They don’t seem to do that now!

Use an automatic WHOIS lookup to see which sites have been around the longest.
If you come across two sites with duplicate content, give the ranking to the site that has been online the longest, and delete the second site.
Period. End of story.

Further enhancement: Delete any site that looks like a search engine site.
Further enhancement: Delete any site with a ton of keywords stuffed at the bottom of its pages.

Example algorithm:

Crawl Jeff’s Original bridal tips and diamond buying guide site

Crawl scammer’s scraper site

Find Duplicate Content (which scraper site stole from our site)

Perform WHOIS lookup on Both sites

Jeff’s site online: 8 years. Scammer site online: 8 days.

Result: Scammer site not in index, and URL sandboxed.

Jeff’s Site Rank=1, PR=7.

End of Algorithm
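
For anyone who wants to play with it, here is a rough Python sketch of the steps above. It is only an illustration under my own assumptions, not how Google or anyone else actually does it: the `Site` class, the `similarity` and `pick_original` functions, the 0.8 threshold, the 5-word shingle comparison, and the `.example` domains are all made up for this post. The registration date is passed in rather than fetched, since WHOIS output differs from registry to registry.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class Site:
    domain: str
    registered: date  # assumed to come from a WHOIS lookup done elsewhere
    text: str         # crawled page text


def shingles(text: str, k: int = 5) -> set:
    """Return the set of k-word sequences in the text."""
    words = text.lower().split()
    if len(words) < k:
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def similarity(a: str, b: str) -> float:
    """Jaccard similarity of the two texts' shingle sets (0.0 to 1.0)."""
    sa, sb = shingles(a), shingles(b)
    if not sa or not sb:
        return 0.0
    return len(sa & sb) / len(sa | sb)


def pick_original(site_a: Site, site_b: Site,
                  threshold: float = 0.8) -> Optional[Site]:
    """If the two sites look like duplicates, return the one registered
    first; otherwise return None (nothing to resolve)."""
    if similarity(site_a.text, site_b.text) < threshold:
        return None
    return min((site_a, site_b), key=lambda s: s.registered)


if __name__ == "__main__":
    guide = "tips for buying a diamond and planning a wedding on a budget"
    jeff = Site("jeffs-bridal-tips.example", date(1998, 5, 25), guide)
    scraper = Site("scraper.example", date(2006, 5, 17), guide)
    winner = pick_original(jeff, scraper)
    if winner is not None:
        print("keep:", winner.domain)
```

The site that loses the comparison is the one you would drop from the index; what to do when both registration dates are the same is left open here.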

12:51 am on May 28, 2006 (gmt 0)

joined:May 5, 2006
Looks like I opened a real can of worms when I started this post.

It appears each of us has found loopholes to be plugged and contributed algorithm ideas which, collectively, would pack a mean wallop against the scraper sites.

Let's all patent the algorithm together and start our own search engine!

And we'll hire a team of co-op students to act as human editors to spot-check sites and respond rapidly to spamdexing complaints.

And we'll have a better search engine than Google, just as Google was better than the search engines it killed.

Remember, when Google was formed, people asked "who the heck needs another search engine?" because we had AltaVista, Excite, and Infoseek. Now look: they are all dead dinosaurs, and Google is king of the jungle because it was a better and more nimble search engine.

But we could easily outmaneuver Google, and they too would be the dinosaur that never kept up. And we will be the king of the jungle.

....But because this was my idea, my site gets stickied at #1 ranking.......

