Forum Moderators: Robert Charlton & goodroi
@mattcutts Matt Cutts
Scrapers getting you down? Tell us about blog scrapers you see: [goo.gl...] We need datapoints for testing.
Google is testing algorithmic changes for scraper sites (especially blog scrapers). We are asking for examples, and may use data you submit to test and improve our algorithms.
This form does not perform a spam report or notice of copyright infringement. Use [google.com...] to report spam or [google.com...] to report copyright complaints.
Exact query that shows a scraping problem, such as a scraper outranking the original page:
[edited by: Brett_Tabke at 1:11 pm (utc) on Aug 27, 2011]
URL of specific scraper page: (Required)*
Tedster said:
I don't assume that. Scraping an article and then "spinning" it by rewriting each sentence is a practice that Google already tries to catch. There have been several Google engineers posting about it on their Webmaster Forums. So if you see such pages outranking the original, and you feel like reporting them, I'm pretty sure they would take in the information.
some examples of Google engineers talking about scraping etc
Wysz: The images are hosted on buzznet.com, which has an article about the same concert...
If you read both articles, while the wording may not be exactly duplicate, there are very strong similarities.
First sentence from your site: "The Prince serenaded Leighton Meester during his concert at New York City's Madison Square Garden on Tuesday night (Jan. 18)."
First sentence from Just Jared: "Leighton Meester gets serenaded by the legendary Prince during his sold-out concert at New York Ctiy’s Madison Square Garden on Tuesday night (January 18)."
Note these phrases: "serenaded," "New York City's Madison Square Garden," "Tuesday night (Jan[uary] 18)"
Beyond the first sentence, note the similar order and structure:
1. Leighton was sitting in the front row.
2. Prince invited her to the stage.
3. "I Don't Trust You Anymore" was playing
4. She was smiling and laughing/giggling
5. There were other celebrities there.
6. She’s wearing a cute sweater.
[google.com...]
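For anyone curious how the side-by-side comparison Wysz describes could be roughed out automatically, here is a minimal sketch using word-bigram "shingles" and Jaccard similarity. This is a textbook near-duplicate measure, not a description of what Google actually runs; the helper names (`words`, `shingles`, `jaccard`), the choice of k=2, and the unrelated control sentence are all assumptions made for illustration.

```python
import re

def words(text):
    """Lowercase the text and split it into runs of letters/digits."""
    return re.findall(r"[a-z0-9]+", text.lower())

def shingles(text, k=2):
    """Return the set of k-word shingles (overlapping word windows)."""
    toks = words(text)
    return {tuple(toks[i:i + k]) for i in range(len(toks) - k + 1)}

def jaccard(a, b):
    """Jaccard similarity of two sets: |A & B| / |A | B| (0.0 if both empty)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)

# The two first sentences quoted in the thread, copied verbatim
# (including the "Ctiy" typo in the second one).
site = ("The Prince serenaded Leighton Meester during his concert at "
        "New York City's Madison Square Garden on Tuesday night (Jan. 18).")
other = ("Leighton Meester gets serenaded by the legendary Prince during his "
         "sold-out concert at New York Ctiy’s Madison Square Garden on "
         "Tuesday night (January 18).")
# Hypothetical control sentence on a different topic.
unrelated = "Google is testing algorithmic changes for scraper sites."

sim_pair = jaccard(shingles(site), shingles(other))
sim_unrelated = jaccard(shingles(site), shingles(unrelated))

print(f"site vs. other:     {sim_pair:.2f}")
print(f"site vs. unrelated: {sim_unrelated:.2f}")
```

The rewritten pair shares a noticeable fraction of its word pairs while the control shares none. A score like this only captures shared phrasing; it says nothing about who published first, and two honest reporters covering the same event would also score above zero.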
2) Put a copyright notice saying that they are FREE TO USE THIS CONTENT ON THEIR SITE if they are in a related business niche. (No need for a link back, even).
If Google were to apply some kind of automated detection routine, it might end up nuking most of its news sites.
Any reporter who covered this concert, and wrote about the moment, would have identified the same facts, and probably utilized similar vocabulary and descriptive style.
But not in the same essential order and structure, with the same side details and quirks.
When it is a press release from a company promoting an event or a device, then that is exactly what would happen. Ever read a newspaper where you see AP or Reuters credited at the end of an article?
Even though the final results of the Google algo may be out in left field for some queries at any particular time, it is wise NOT to think of Google as "stupid".
I don't think of them as being "stupid". Some are smart, but that does not mean that people should have a fanboy attitude to Google's efforts to sidestep a problem because its employees just aren't quite smart enough to solve it efficiently. This is why I think that Google's attempted Socialisation of the problem is a cop-out and tantamount to an admission of defeat.
These are some very savvy folks, and if they set their minds to handle an issue, sooner or later there will be penalties.
Really smart people would come up with a solution rather than just trying to apply penalties. Applying penalties does not solve the problem, but then they probably have a patent where they claim it does.