Here's a snapshot of the problem:
-This is a site that organizes info on 4 topics by state. Think of it as 50 directories with content like *Alabama Rutabaga Farms* where the dir is named Alabama Rutabaga Farms and the title/content on the index page in each dir is Alabama Rutabaga Farms.
-All content is about 4-5 years old, gets updated about once a year, and has normally ranked between 3-12 for the main 3-word term, depending on the competition for the term. Secondary pages in each dir have ranked very well for less competitive terms.
-A search today for *Alabama Rutabaga Farms* would not find my Alabama index page at all, until I select "repeat the search with the omitted results included". After that I'll find my secondary pages scattered throughout the serps, with one or two around position 20, several more in the 70s, and a bunch somewhere south of 200. The dir index page will still not show up.
-A site: command returns all pages, usually with the index page listed first. My GWT account is showing no major issues, except a big fall-off in visits by the G-bot.
Any ideas as to what direction I might look into? Is it a penalty? Thanks in advance for the excellent help here.
In order to show you the most relevant results, we have omitted some entries very similar to the [number] already displayed.
If you like, you can repeat the search with the omitted results included.
If your site duplicates content available elsewhere, then your pages have probably gone supplemental. This might be because your site has been scraped and duplicated elsewhere, or because it's been made available through a site-proxying "service" -- likely with someone else's ads plastered all over it. Or, if you paid someone else to write the content, they may have copied it from somewhere else (not very likely, from your description of your site as being fairly mature).
You'd do well to carefully check the sites that appear for your search terms, and see if any of them have duplicated your content. Also, try searching for long phrases from your pages, enclose them in quotes, and see what turns up. This helps reveal problems where the scraper sites don't rank very highly, but there are several of them.
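If it helps to automate the phrase-picking, here's a rough Python sketch along those lines: it pulls runs of 8+ consecutive words out of a locally saved copy of a page (page.html is just a placeholder filename) and prints them quoted, ready to paste into a search. The 8-word threshold is an arbitrary assumption, not anything Google-specific.

```python
# Minimal sketch: pull candidate "long phrases" out of a saved copy of one of
# your pages so you can paste them into a search engine as quoted searches.
# Assumptions: the page is saved locally as page.html (hypothetical filename),
# and a "long phrase" is any run of 8+ consecutive words in the visible text.
import re
from html.parser import HTMLParser

class TextExtractor(HTMLParser):
    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip = 0  # depth inside <script>/<style>

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        if not self._skip and data.strip():
            self.chunks.append(data.strip())

def long_phrases(html, min_words=8):
    parser = TextExtractor()
    parser.feed(html)
    phrases = []
    for chunk in parser.chunks:
        words = re.findall(r"[A-Za-z0-9'-]+", chunk)
        if len(words) >= min_words:
            phrases.append(" ".join(words[:min_words]))
    return phrases

if __name__ == "__main__":
    with open("page.html", encoding="utf-8") as f:
        for phrase in long_phrases(f.read()):
            print(f'"{phrase}"')  # paste each quoted phrase into a search
```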
Jim
If you really think I have gone supplemental, I'll go back to that lib thread. But it seems kind of hard these days to tell what's supplemental.
However, I have always wondered what might happen when a site that has been copied updates its content, but not enough to escape being a duplicate of the copies, and now the copies are all "older" in Google's eyes. I could see it all going pear-shaped for the site that was copied.
However, I have always wondered what might happen when a site that has been copied updates its content, but not enough to escape being a duplicate of the copies, and now the copies are all "older" in Google's eyes.
I've wondered about this myself. It's a worry that kept me for a while from fixing a particular -950 page that was also a page that was scraped to death. Ultimately, we didn't disappear when I changed it, but we also had higher PR than the scrapers.
dibbern2 - Generally, if it is scraping and a page does disappear, Google manages to figure it out in a few days to a week. Google is much better at sorting out dupes, in my experience, than the other engines... but there's a lurking worry that they are not perfect at it.
I recommend you also check your pages with Copyscape, which will detect the kind of scraping that uses bits and chunks of your page. I've noticed some ranking drops in pages that have experienced this kind of scraping, but I haven't established cause and effect. The drops could be completely coincidental.
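For the "bits and chunks" case, one generic way to compare your page against a suspected copy is shingle overlap -- this is not Copyscape's actual method, just a rough sketch that counts how many overlapping 6-word sequences from your text show up in the other page. The filenames and the 6-word shingle size are assumptions for illustration.

```python
# Minimal sketch of a "bits and chunks" duplication check: compare two pages'
# visible text as overlapping 6-word shingles and report the fraction of your
# shingles that also appear in the suspect page. Generic technique only;
# mypage.txt and suspect.txt are hypothetical plain-text dumps of each page.
import re

def shingles(text, size=6):
    words = re.findall(r"[a-z0-9']+", text.lower())
    return {" ".join(words[i:i + size]) for i in range(len(words) - size + 1)}

def overlap(text_a, text_b):
    a, b = shingles(text_a), shingles(text_b)
    if not a:
        return 0.0
    return len(a & b) / len(a)  # share of your shingles found in the other page

if __name__ == "__main__":
    mine = open("mypage.txt", encoding="utf-8").read()
    theirs = open("suspect.txt", encoding="utf-8").read()
    print(f"{overlap(mine, theirs):.0%} of your 6-word shingles appear in the suspect page")
```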
We also had some threads here about fighting off site-proxies and scrapers.
A basic discussion on proxies is available in our Hot Topics area [webmasterworld.com], which is always pinned to the top of the Google Search forum's index page. Note the Defending Your Rankings section, and this thread...
Proxy Server URLs Can Hijack Your Google Ranking - how to defend?
[webmasterworld.com...]
There's also a discussion on scraped and stolen content in that section.
My sites have returned, often in the Goog top 5 for 2- and 3-word searches.
I don't know if this is simply a tweaking of the algo dials, but I doubt it. In all respects my problems seem to have been a penalty.
I undertook a few steps in internal linking that affected each page, and perhaps that was the answer. If anyone is interested, I'd be glad to share what I think was the problem.
My site deals with health care. Among other content, there are three large directories, each devoted to a specific issue of healthcare. As these grew in content, I thought 'why not link them to each other, since they all appeal to my audience?' To illustrate with some meaningless subject, think of (a) colds, (b) flu, (c) sinus problems. It's obvious how I thought my users would find each topic of equal interest.
So during July and August I started adding links to and from each page in each directory to its partners in the others; i.e., cold symptoms <<>> flu symptoms <<>> sinus problem symptoms, and so on, in a round-robin.
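In template terms, the pattern described might look roughly like this sketch -- hypothetical topic and slug names, not the real site -- where every page in one directory links out to its counterpart page in each of the other directories:

```python
# Minimal sketch of the round-robin cross-linking described above, using the
# hypothetical topics from the post. For every page slug shared across the
# three directories, each page gets links to its two counterparts.
# Directory and slug names here are illustrative, not the poster's real site.
TOPICS = ["colds", "flu", "sinus-problems"]
SLUGS = ["symptoms", "treatment", "prevention"]

def cross_links(topic, slug):
    """Return the URLs a /<topic>/<slug>.html page would link out to."""
    return [f"/{other}/{slug}.html" for other in TOPICS if other != topic]

for topic in TOPICS:
    for slug in SLUGS:
        print(f"/{topic}/{slug}.html ->", ", ".join(cross_links(topic, slug)))
```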
On Sept. 17 I went into penalty territory. My standard measure, a three-word search term (example: flu treatment in Arizona), went from about #8 to nowhere, except when I repeated the search with the omitted results included, and then I'd get some obscure page listed at about #300.
My Yahoo rankings actually went up. Long tails at G kept my site alive; I kept about 30% of my old G traffic.
On October 10, I started to remove all those cross links and got about 50% finished. And that brings me to today.
I did NOT: modify titles, metas, content (except links) or design factors. Each page update was submitted via GWT.
Don't know if this helps anyone, but I hope it does. Watch out for cross linking, even when it makes sense for your subject matter.
There was only one subset of cross links for each locale, but there were several hundred locales.
Also: standard upward links to home for each topic. I did not change these.
I too have wondered many a time about what jdmorgan and g1smd have said.
Social sites can hide copied content in thumbnails and lengthy pages that don't show up in phrase searches. I haven't seen any scrapers affect rankings, but I did file DMCAs against two social sites and had a site soar in the rankings. You stumble on the copiers just by accident.
I'm having issues with low ranked pages not getting indexed.
I have lots of search result pages on our general listing site, broken up by city and then by category and sub-category. On those category and sub-category search result pages there are 25 results with links to individual listing pages, the anchor text of which is unique and derived from unique page titles. But underneath each of those unique anchor text links we've also included breadcrumbs for each of the 25 results. So as an example:
TOTALLY UNIQUE PAGE TITLE ANCHOR TEXT LINK
whatever city the posting is from anchor text link >> whatever category the posting is from anchor text link >> whatever sub-category the posting is from anchor text link
TOTALLY UNIQUE PAGE TITLE ANCHOR TEXT LINK
whatever city the posting is from anchor text link >> whatever category the posting is from anchor text link >> whatever sub-category the posting is from anchor text link
TOTALLY UNIQUE PAGE TITLE ANCHOR TEXT LINK
whatever city the posting is from anchor text link >> whatever category the posting is from anchor text link >> whatever sub-category the posting is from anchor text link
This repeats 25 times down the page and the pattern repeats itself around the site. Make sense?
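As a rough sketch of that structure (all names and URL paths invented for illustration), each result renders one unique title link plus three breadcrumb links, so a 25-result page carries 25 unique-anchor links and 75 breadcrumb links repeating the same three targets:

```python
# Minimal sketch of the result-page pattern described above: 25 results, each
# with a unique title link followed by city >> category >> sub-category
# breadcrumb links. All names and URL paths here are made up for illustration.
def result_block(title, slug, city, category, subcategory):
    return "\n".join([
        f'<a href="/listing/{slug}">{title}</a>',
        " &gt;&gt; ".join([
            f'<a href="/{city}">{city}</a>',
            f'<a href="/{city}/{category}">{category}</a>',
            f'<a href="/{city}/{category}/{subcategory}">{subcategory}</a>',
        ]),
    ])

# 25 of these per page: 25 unique title links plus 75 breadcrumb links,
# with the same three breadcrumb targets repeated on every result.
page = "\n\n".join(
    result_block(f"Totally unique title {i}", f"post-{i}",
                 "springfield", "services", "plumbing")
    for i in range(1, 26)
)
print(page)
```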
Thoughts?
Essentially, that means that many pages begin to show similar relevance signals instead of each page being focused. It's another good reason to limit total links on any page.
OK so if I understand you correctly.. just having them could really screw things up for the whole site.
However, I do need to have the result pages with links to each individual post.. otherwise how else can people browse and search? What I've now done, as I mentioned above, is to convert the links underneath each unique post link from links to plain text.. these:
"whatever city the posting is from anchor text link >> whatever category the posting is from anchor text link >> whatever sub-category the posting is from anchor text link "
In your opinion, would that satisfy you or would you do something else? Feel free to suggest even random unfounded ideas.. the discussion is helpful either way.