Forum Moderators: Robert Charlton & goodroi
If the Yahoo Answers page doesn't at least link to you, you should demand that much. If this page's ranking on your own domain matters to you, then it sounds like a DMCA filing might be appropriate.
I should also add that my page has 20 external links pointing to it, including some from Yahoo Answers pages on other topics. The Yahoo page replacing mine for a string-of-text search has no external links, which the page about the patent says is the most important factor.
So I really do not understand what bit of logic in the code results in this. I wonder if it's indeed a mistake, as they had recently with the position 6 thing. But it takes enough people shouting about it for it to get looked at.
Oh, and I should also say the page still ranks for keywords, just not for text-string searches.
The quickest way to stop it is to hope they have AdSense on the page and fax a DMCA notice to AdSense. After that you have to wait until the cache updates, which can take 3+ months. Yahoo has fax numbers. Also look for hidden links.
The biggest problem you can encounter in this BB/scrapbook/social-bookmark area, though, is that many of these sites scream First Amendment free speech even though their own TOS prohibit copying. They can also hide your content in private areas. You're very lucky it's on Yahoo. It can get really rotten.
Has the text been pasted into Yahoo! Answers in the United States? Yahoo US has removed content copied from my site when I emailed them following the procedure outlined here [info.yahoo.com...]
Unfortunately, I have had some difficulties when trying to contact Yahoo! Copyright Agents in other countries.
OK, somewhat simplified, but you get the picture. The original page predates Yahoo Answers by YEARS and has 20 external links, while the Yahoo Answers page has none, and Google will not return the quoted phrase at all unless you click for omitted results. It then returns Yahoo Answers and two other sites carrying a selected copied chunk of text. So what bit of code could legitimately result in this? Surely this must be a coding error within the algo.
For a look at just part of this long list of algorithm factors, check out our thread on the Historical and Age Data patent [webmasterworld.com].
Also take into account that Google's first job is keeping their end users happy, and not keeping webmasters happy. They do care about webmasters - with clearly more communication for webmasters than any other search engine - but that will never be the most important goal of the Google algorithm.
I doubt that search can ever be free of errors - it's a MACHINE intelligence run over an immense data set the size of which you and I will most likely never need to manage. Google uses almost a million networked servers and they have hundreds of people writing and tweaking code all day.
So just do what you need to do to make it right and don't wait for Google. There's good advice up above. Another interesting experiment would be to get one new backlink to your url.
If you were hoping to get Google to change something in the algo based on your post here - well that's not very likely. It's also not the purpose of this forum, as we mention in the Google Forum Charter [webmasterworld.com].
You can get more direct access to the Google algo team by first doing the problematic search, and then using the link toward the bottom of the page that says "Dissatisfied? Help us improve."
Well, I'm currently back at #1 for many search terms and phrases, but often the forum page will be right under me at #2. For some phrases neither my page nor the forum page turns up in the results - even when there is no other page on the entire web that comes anywhere near providing that exact match. Overall, I've lost a lot of traffic from very specific search queries.
I'm currently in the process of rewriting and restructuring these pages to make the tutorial easier to understand and follow. Hopefully, I'll kill two birds with one stone in the process!
I'd say that the ability to match exact strings of text - especially to the original source - is also not high on the average user's list of priorities. Google has clearly been willing to trade off in that area, whether we like it or not (and I don't.)
I'm glad to hear ChicagoFan67's report of the effect of that link. It's the effect I would have expected, but it's good to hear confirmation. It's even possible that just one new direct backlink can pop a page out of the Supplemental index partition.
What I'm seeing is something very specific to long strings of text. I was interested to see if anyone could explain how this could actually come about. I really don't see how it would happen.
This is how Google has been reacting to duped content for several years now. I first observed it when a page of ours that gets scraped incessantly had temporarily been knocked out for a competitive two-word search. I immediately checked by searching for a whole sentence in quotes, and only one scraper page was ranking. It took adding &filter=0 to the SERPs page URL to bring our page back.
What was curious, though, was that our page was still ranking on a desirable three-word phrase. My thought is that we'd (temporarily) gotten classified as the dupe, probably because of a bunch of links pointing to the scraper. We had enough inbounds to overcome the filter for the three-word phrase, but not enough to overcome it for searches on the two-word phrase. This, even though the scraper was not ranking in our place for the two-word search, at least not on Google.
But we didn't have the kind of links that would rank us for the whole string I'd searched, which didn't include any words we were optimized for. So we had no link text boost or other sort of algo boost to overcome the dupe filter. That's my reasoning on it anyway... and I've observed this behavior a number of times over the past several years when we've gotten scraped.
Google has generally adjusted this kind of dupe problem pretty quickly... within a week or so... and there was a period of almost a year when I didn't see it happening very often. But in the past couple of months, several of our pages have apparently been reacting to scrapers again, and taking much longer to recover than they used to. Adding &filter=0 to the Google search is restoring them, and I can find the duping pages, generally with Copyscape if not with exact text searches.
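For anyone who wants to script the check described above, here's a minimal sketch of building the two search URLs - the normal one and the one with &filter=0 appended to show results Google would otherwise omit as duplicates. The function name and defaults are my own; only the q and filter=0 parameters come from the posts above.

```python
from urllib.parse import urlencode

def google_serp_url(query, unfiltered=False):
    """Build a Google search URL for a quoted-phrase check.

    With unfiltered=True, filter=0 is appended, which asks Google to
    include results normally omitted by the duplicate filter.
    """
    params = {"q": query}
    if unfiltered:
        params["filter"] = "0"  # show the omitted (filtered) results
    return "https://www.google.com/search?" + urlencode(params)

# Compare the filtered and unfiltered result pages for an exact sentence:
print(google_serp_url('"some exact sentence from my page"'))
print(google_serp_url('"some exact sentence from my page"', unfiltered=True))
```

If your page shows up in the second URL's results but not the first, that's the dupe-filter behavior being discussed in this thread.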
I should mention with regard to scrapers that long exact quotes won't always find them, because scrapers are often breaking up the text they scrape, borrowing parts of sentences instead of whole ones.
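Since scrapers break sentences apart, one workaround is to search on shorter overlapping word runs instead of whole sentences. Here's a rough sketch of generating those quoted fragments from your page text; the fragment size and overlap are arbitrary choices of mine, not anything Google-specific.

```python
import re

def phrase_fragments(text, size=6):
    """Split page text into overlapping runs of `size` words, each
    wrapped in quotes for an exact-phrase search. Overlapping the
    fragments covers text that a scraper split mid-sentence."""
    words = re.findall(r"\w[\w'-]*", text)
    step = max(1, size // 2)  # half-fragment overlap
    return ['"' + " ".join(words[i:i + size]) + '"'
            for i in range(0, max(1, len(words) - size + 1), step)]

for q in phrase_fragments("The quick brown fox jumps over the lazy dog near the bank"):
    print(q)
```

Running each fragment through a quoted search (or the &filter=0 variant mentioned earlier in the thread) will catch scrapers who only borrowed part of a sentence.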
Yes, it's as if Google gets bored. Google likes results with differentiation. On legitimately duped content, like the Declaration of Independence, Google will display many more results on a longish but occasionally-quoted passage (eg, "deriving their just powers from the consent of the governed") than on a longer and little-quoted passage (eg, "for the sole purpose of fatiguing them into compliance with his measures").