
Dictionary.com scraping my content?

   
12:31 am on Jun 13, 2011 (gmt 0)

5+ Year Member



So, to test for scrapers, I entered a random sentence from my home page into Google in direct quotes (one that I absolutely wrote, about 12 words in length), and lo and behold, dictionary.com appears at positions 1, 2, and 3 in the Google SERPs... THEN my website... then 8 more different scrapers.

My question is this: how big of a deal is this? If dictionary.com (a titan at PageRank 8) is outranking me (PageRank 5) in the SERPs, does that mean Google not only thinks my content is duplicate and therefore useless, but might even penalize me for 'copying' the web darling dictionary.com?

The way dictionary.com appears to be doing this is by tracking(?) embedded ask.com searches from their reference section. So the three dictionary.com URLs that appeared ahead of me (for my unique sentence in quotes) looked something like this:

dictionary.reference.com/browse/keyworda+keywordb (another might be keywordc and so forth)

Dictionary.com then displayed (not in a frame) ask.com results with the header stating something like:

"You are seeing Ask web results for keywordx because there was not a match on Dictionary.com."... followed by a number of listings, including other scrapers that copied my content.

So is this a problem? Am I overreacting to the SERP order? Is this something authorship markup could fix?

As a general rule is googling your own content in unique quotes and checking your listing order a good measure of whether you're getting credit for your own content?
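
For anyone who wants to repeat the quoted-string test described above on several pages, here is a minimal sketch. It only picks a sentence of reasonable length and builds the exact-match search URL to open in a browser; it does not query Google or parse results. The helper names `pick_test_sentence` and `quoted_query_url` and the 10-word threshold are my own assumptions for illustration.

```python
import re
from urllib.parse import quote_plus


def pick_test_sentence(text, min_words=10):
    """Return the first sentence with at least min_words words, or None."""
    # Naive sentence split on ., ! or ? followed by whitespace.
    for sentence in re.split(r"(?<=[.!?])\s+", text.strip()):
        if len(sentence.split()) >= min_words:
            return sentence
    return None


def quoted_query_url(sentence):
    """Build a search URL that wraps the sentence in double quotes (exact match)."""
    return "https://www.google.com/search?q=" + quote_plus('"' + sentence + '"')


if __name__ == "__main__":
    page_text = (
        "Short intro. This is a longer sentence that I wrote myself "
        "for my own home page last year."
    )
    sentence = pick_test_sentence(page_text)
    if sentence:
        print(quoted_query_url(sentence))
```

Open the printed URL by hand and note where your own page lands relative to the copies.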
12:59 am on Jun 13, 2011 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



What does cornigashen mean? Anyone know the definition, I couldn't find it in the dictionary


cornigashen, (kôrn-e-gA-SHin), n., of or referring to flatulent squatters who eat opulent poultry nightly, from Middle English corongater, "He was one of the cornigashen, and I hated him for his odor." First used by relative of Queen Elizabeth II in 1954.
1:00 am on Jun 13, 2011 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



...let's see who scrapes that.

(and yes, I think quoted string is a good test)
1:10 am on Jun 13, 2011 (gmt 0)

WebmasterWorld Senior Member tedster is a WebmasterWorld Top Contributor of All Time 10+ Year Member



As a general rule is googling your own content in unique quotes and checking your listing order a good measure of whether you're getting credit for your own content?

It used to be - but in recent months that's all gone to hades. Your page might not even be returned for a long quote and still rank top for important searches using keywords in the quote.

I personally think and hope that this is a temporary issue and will return to sanity soon.
1:15 am on Jun 13, 2011 (gmt 0)

WebmasterWorld Senior Member g1smd is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month



You'd have thunk that Google would (at least internally) know that dictionary.com has scraped content, and therefore know that authorship attribution should be applied to some other site in the SERPs. Maybe they do?
1:28 am on Jun 13, 2011 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Goog has indexed this thread and now has the word "cornigashen" (and its definition) in its index. Now let's see who takes the bait.
1:50 am on Jun 13, 2011 (gmt 0)

5+ Year Member



LOL at what you're doing lexipixel...

In this case, though, I suspect it might be trickier to reproduce in this fashion (I'll be curious to see what happens!)

In order for 'my case' to be duplicated the following steps have to happen.

1) A scraper copies this page (strangely enough, unless I'm being totally obtuse, none of the sample forum results from webmasterworld.com have scraper results... or perhaps Google isn't giving scrapers credit for content duped from webmasterworld.com)

2) Ask.com indexes this scraper page (duplicate content of this page)

3) Ask.com gives enough credit to scraperx.com that they rank in the top ten for 'cornigashen'.

4) Then here is the big mystery part... somehow Google indexes a broken dictionary.com page with embedded ask.com SERPs (whose top ten include scraperx.com and the cornigashen text). Why Google lets a website embed SERPs, gives credit to it, and of all sites gives credit to its competitor ask.com... is a great question.

I've entered 'cornigashen' as a search phrase into dictionary.com and got a 'not found' result, but I'm not sure this is what is needed to get this working. What's strange is why Google would be crawling undefined words on dictionary.com...
2:06 am on Jun 13, 2011 (gmt 0)

5+ Year Member



To g1smd... good question. My fear is that Google isn't using modification dates and comparisons against its own past database to determine duplicate content, but rather a third 'trust' factor that is messing everything up. If Google sees content on a darling university site (high chance of PageRank 8) and that same content duplicated on Joe Schmo's blog, Google might think that because universitysitex.com is so much bigger and older and has such higher PageRank, it automatically gets the credit for the duplicate content (that's my paranoid theory anyway). It did seem that Panda really rewarded old, big corporate and big government established websites in general (like walmart.com), and because a good portion of Panda was supposed to relate to duplicate content, I do think this could be a Panda problem in particular. It would kind of make sense too, since checking for duplicate content (which in essence is a series of phrases) based on past database comparisons has to be extremely processor intensive, while simply rewarding the page with the best 'Panda trust factor' would be much more processor efficient.
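
To make the "series of phrases" idea concrete, here is a rough sketch of one common way to compare two pages phrase by phrase: word shingles plus Jaccard overlap. This is only an illustration of why such comparisons cost something at scale. Nothing here is confirmed to be what Google does, and the function names and the shingle size of 5 words are my own assumptions.

```python
import re


def shingles(text, k=5):
    """Return the set of all k-word sequences (shingles) in the text."""
    words = re.findall(r"\w+", text.lower())
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a, b):
    """Jaccard similarity of two shingle sets: |a & b| / |a | b|."""
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


if __name__ == "__main__":
    original = "I wrote this unique sentence for my own home page a while ago"
    scraped = "Intro text. I wrote this unique sentence for my own home page a while ago"
    score = jaccard(shingles(original), shingles(scraped))
    print(f"overlap: {score:.2f}")
```

Every page has to be broken into shingles and compared against everything already stored, which is the expensive part; a single per-site trust score would be far cheaper to apply.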
2:55 am on Jun 13, 2011 (gmt 0)



My apologies if I sound a bit harsh, but dictionary.com is a branded, powerful authority site, which means they can scrape your content and write lower-quality content (than most less powerful sites) with impunity.

I remember a thread about a person who put his company name on a Twitter page, and that page appeared before his main company site...
4:39 am on Jun 13, 2011 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



...so far it's only spread to the meta SEs (Metacrawler, Ask, Mamma, Dogpile, etc.), all fed from Google.
 
