This 71 message thread spans 3 pages.
Theory on Panda/Penguin false positives
There are always claims of false positives when Google does something with the algo. A lot of these claims are incorrect, the result of webmasters not looking at their sites harshly enough to see what the algo is seeing. I've never been convinced there were any false positives, until Panda hit people I know, whose sites were extremely unique and creative, nothing like "content farms." But they were, we agreed, perhaps a bit "over-optimized" - targeting hot keyphrases and so on, which we somehow thought might be part of Panda.
But supposedly that's not part of Panda, because now we know it's part of Penguin. Right? Or is that right?
When Panda came out, and we heard it was about thin content, I knew one of my sites was vulnerable. I was working slowly to improve the content, but my focus was on other sites so I was just waiting for Panda to catch up and hurt me. I had never been any good at link building, so all my links on all my sites are natural. On this particular site, my backlink profile was really weak because in that niche, people are very stingy about giving out free, natural links. I thought that might get it in trouble with Panda, too.
To my shock, Panda left my site alone, but Penguin got me - on my least optimized site. Yes, I realize I'm self-reporting and you have the right to be skeptical.
One possibility: my Penguined site is over 6 years old, so I have the sort of unsolicited spam links everyone accumulates over time, like updowner. I only have a handful of quality links, because of my lack of skills at link building of the white or black hat variety. Could Penguin be mistaking weak profiles on older sites for spammy profiles?
Or are Penguin and Panda intersecting and overlapping in ways we still don't understand? Why did people with unique and interesting content get Panda slapped, and why did my least optimized site get Penguin?
[edited by: tedster at 9:35 pm (utc) on Jul 3, 2012]
1. The site doesn't have quality links as compared to its competitors.
2. Time on site and page/visit went down since the new design and other layout changes.
3. Mass deleting the wrong pages, or too many pages.
4. Thickening the thin content - sometimes thin content is what is expected.
5. 301 to a new domain - to the search engines it is a new site, with all the consequences.
Sometimes an 'improvement' goes in the opposite direction and makes things worse.
Bewenched, none of my sites have had left-hand navigation in several years, so I don't know.
Zivush, great points. My response:
1. I think that is the case for this site, BECAUSE I didn't link build. That's why I think it's so ironic the site got hit by Penguin, which was supposed to punish link builders.
2. User metrics have stayed the same or improved.
3. This is entirely possible, because I was going strictly by user metrics and my own judgment about narrowing the site's focus. Metrics suggest the users are liking the change, but with Google, who knows.
4. Yep, it's quite possible that during the pre-Penguin "thin content" scare I fattened up some pages I shouldn't have. Since Penguin, I've focused on giving each page precisely what I feel (as a reader) its topic deserves... but even so, G may have a different take than I do.
5. Not quite clear on this one? I've 301'd several sites in the past and it's my feeling the SEs do not really treat it like a new site. I find you can expect a small traffic drop, maybe 10% for a few months, but not the 90% I've experienced.
If the content of a site is on another site and the other site has a plain-text url to indicate the source of the content rather than a link that can be clicked on (this is on several pages of the other site), would the content from the source be given credit as the source of the information?
Or since the links are plain-text, is it possible that the site where the information is from is not recognized as the source and the other site is? Does the site where the content is originally from then have to deal with duplicate content issues?
Are you asking whether the search engine can tell the difference? If so, Google has stated they can't always tell which is the original source. But assuming your page has already been indexed, it shouldn't be beyond the bot to notice that the two pages are identical other than the unlinked link, and since the unlinked link refers to your page, yours has to be the original.
On a related side note, it's been noted around here that Bing is slower to index pages, but better at just never indexing scrapers, compared to Google. It's also been suggested that Google lets in scrapers on purpose because most of them run Adsense, but another possibility is that Google is prioritizing fast indexing and that doesn't leave time to do the checks Bing is doing. Whatever the case, it's always a 50/50 chance whether or not Google will detect scraped content.
I would definitely issue a DMCA notice, though. When I've done that, I've *never* had Google fail to determine that mine was the original content. And after doing that, I don't often see another scraped version of that particular page make it into Google - at least not in the top 10 pages or however many I search. I do believe this method helps.
Over the past few months, I've examined my site extremely closely and found a few things that are spammy out of ignorance or carelessness, not an attempt to game Google.
--A near duplicate page. This is such a head-slapper, but I had written an article about a personal experience one year, and a couple of years later forgotten I wrote the first one and thought "Oh, that would make a great article."
--A link out (from my former top performing page) to a nice page from a site that I realize in hindsight was heavily targeting a particular lucrative keyphrase. Looking back in my stats, I realize now I was ranking a tiny bit on that keyphrase myself, so that would totally look like spam to Google. (And Bing, apparently - they later dropped the page too.) This was all done totally innocently on my part - I never wanted to rank for that keyphrase. This page was totally missing from the index before I changed it 20 days ago, and already it's back - hundreds of positions down, but still, it's back.
--I've found a few pages where I repeated keywords too frequently. This again wasn't deliberate keyword-stuffing, a practice I always abhorred even when white hats urged me to do it - it was simply lazy writing from a time period when I was preoccupied by a personal issue. I've replaced/removed most of the repetitions of the keywords.
BTW, if you can't understand why Google Penguin-slapped your site, start by examining any pages that Bing, too, has dropped significantly. Most of my pages do great in Bing, so when I see one that BOTH engines are rejecting, I ASSUME there's something wrong with it and keep studying until I find it. That's how I worked out a lot of this.
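For anyone who wants to do that cross-check from exported rank data instead of by eye, here is a minimal sketch. Everything in it is an assumption for illustration: the page paths, the positions, the `MISSING` stand-in for a page that has vanished from the results, and the 50-place cutoff for a "significant" drop are all made up, and how you export positions from your own tools is up to you.

```python
# Hypothetical sketch: find pages that dropped sharply in BOTH engines.
# All paths, positions, and thresholds below are invented example data.

MISSING = 1000  # stand-in position for a page no longer found in the results


def dropped_pages(before, after, min_drop=50):
    """Return the set of pages whose position worsened by at least min_drop places."""
    return {
        page
        for page, old_pos in before.items()
        if after.get(page, MISSING) - old_pos >= min_drop
    }


# page -> ranking position (lower is better), before and after the update
google_before = {"/a": 3, "/b": 8, "/c": 12, "/d": 5}
google_after = {"/a": 240, "/b": 310, "/c": 15}  # "/d" is gone entirely
bing_before = {"/a": 4, "/b": 6, "/c": 10, "/d": 7}
bing_after = {"/a": 5, "/b": 180, "/c": 9, "/d": 400}

google_dropped = dropped_pages(google_before, google_after)
bing_dropped = dropped_pages(bing_before, bing_after)

# Pages both engines rejected are the ones worth studying first.
suspect = sorted(google_dropped & bing_dropped)
print(suspect)
```

In this made-up data "/a" fell only in Google, so it stays off the list; "/b" and "/d" fell in both engines and are the pages to examine first.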
I've also learned a lot about reading my stats, and that enabled me to determine the following:
--I was initially hit by Penguin 1.0.
--Then the Panda refresh on 8/20. (Can you be hit by a refresh without having been hit by Panda initially? I see no indications I was ever hit by Panda before this.)
--The 9/27 Panda update hurt me further.
--I'm up 17% since the 11/6 Panda update.
So the steps I've taken seem to have done at least some good.
I am convinced you can now "spam" without knowing it, based on that outbound link I'm convinced hurt me. This was a really subtle thing, and it wasn't impacting user experience - the site I linked to was perfectly useful and nice to visit. It was only spam in the SEs' eyes.
So be forewarned. I was pretty ignorant of SEO before Penguin, and thought that would protect me from ever being seen as a spammer. Instead, ironically, learning SEO is what's enabling me not to be seen as a spammer. The advice so frequently given about "just give users what they want" is insufficient, if well-meant; nowadays you can "just give users what they want" AND accidentally send Google a negative signal, because the negative signals are much broader than they used to be.
First, I haven't made it all the way through the thread, but it's great so far and hopefully I'll have time to make it through later on...
One of the things that 'jumps out at me' is it seems like I'm reading: 'I changed [blah]', 'then I did [blah blah]', 'then I tried [blah blah blah]' ... Only an SEO would do that.
Normal people don't change things, then quickly change them again when the changes don't have the immediate desired effect on rankings. One of the things stressed wrt rankings over the years is time/longevity of a site (IOW Patience), which makes me wonder if reacting too fast is part of the issue, because changing and changing back or changing again when rankings change (for the negative) is something only someone who's watching their rankings and trying to control (manipulate) them would do.
### Added Below ###
Let me put what I'm thinking another way with a couple of questions...
How many times do Wikipedia, eBay, Apple, Amazon, eHow, CNN and all the Mom & Pop site builders who don't know SEO change the title of their page(s)?
How many times have you?
The 'big boys' (from what I've seen) 'get it right the first time' and leave it alone. (Yes, some may change occasionally, but go with the point on this one.) Mom and Pop who don't know SEO set it and forget it ... An SEO is generally the only one who makes changes to something like the title of a page.
"which makes me wonder if reacting too fast is part of the issue, because changing and changing back or changing again when rankings change (for the negative) is something only someone who's watching their rankings and trying to control (manipulate) them would do."
Yes, I believe you're right about this (that's what the rank detection patent seems to be about) and it's definitely part of my problem NOW. But I didn't make these types of changes before Penguin, so I don't believe it was what caused Penguin to hit me. I don't know if you read my post directly above yours, but I stated there what I believe caused it.
(Before Penguin, I did do some things that might have let Google know I read at least the big highlights from SEO news - when Panda came along, I beefed up my "thin content", not just to avoid Panda but because I realized I had a lot of pages that weren't of much value to users, and why not either refurbish, replace or delete those so that my site was jam-packed with visitor-pleasing stuff? I don't think that is the sort of thing Penguin was meant to penalize. I think Google probably believes I sold that link I refer to in my post above yours here, for example, which is a much graver "offense" from their perspective.)
Yeah, I've only made it about 2/3 of the way through the first page @ 40 posts per page, so I'm a long way from your post above mine ... I just posted when I thought I saw a 'theme' from some of the posts (not all yours) jumping out at me and I don't really have time to finish reading the rest of the thread yet, so it might be a bit before I get to where I actually posted.
I'll definitely try to get through the rest of the thread though.
No rush, TMS, but I look forward to your input! You are certainly right about the theme in general.
Alright, almost done with the first page and the next thing 'jumping out' is it seems like people are looking for 'the one thing' wrong...
Sorry everyone, but there's not one thing wrong any more these days.
With the complex implementations G is using (Panda, Penguin) my guess is there will not be 'one thing' (accurately, definitively) found to be causing a rankings drop with any affected site (there may be one that 'tips the scale' more than others, but I think to say it's 'the thing' would be inaccurate), but rather a number of things which when put into the 'complex #*$!tail', as tedster very aptly calls it, contribute to the ranking loss (IOW: when combined they 'add up to a problem').
(I'm going to stick with basics and this is just an example off the top of my head to show 'things adding up', I'm not saying these are the metrics used.)
Your titles are changed with a frequency of N/TimePeriod and the average title change frequency is .8N/TimePeriod.
Your pages have a backlink profile where N% of your backlinks are exactly the current or previous title of the page, where the average backlink profile of sites having exactly the title of the page is only .9N%.
You have repetitive keyphrases (or even highly similar co-occurring phrases) within your Title, H1, H2 and Link text at a rate of N/.6N/.4N/1.5N and the average is N/.5N/.3N/N.
You have N(Total) co-occurring phrases per N2 paragraphs which gives you N/Paragraph where the normal usage within your niche on the same topic (<-- that's important) is only .7N/Paragraph.
And when the preceding is 'all mixed together' your site appears to be 'spammy' (for lack of a better phrase) even though you're not 'way out of line' in any single area and some of the areas you may or may not even have control over.
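To make the 'adding up' concrete, here is a minimal sketch using the numbers from the example above. It is purely illustrative: the signal names, the way the ratios are combined (summing how far each one sits above the average), and both thresholds are assumptions invented for this sketch, not anything Google is known to use.

```python
# Hypothetical illustration only: the signals, averages, combining rule, and
# thresholds are invented to mirror the example above, not real Google metrics.

N = 1.0  # arbitrary unit; only the ratio to the average matters

# signal name -> (your value, average value), taken from the example above
signals = {
    "title_change_frequency": (1.0 * N, 0.8 * N),
    "exact_title_backlinks": (1.0 * N, 0.9 * N),
    "keyphrase_in_title": (1.0 * N, 1.0 * N),
    "keyphrase_in_h1": (0.6 * N, 0.5 * N),
    "keyphrase_in_h2": (0.4 * N, 0.3 * N),
    "keyphrase_in_link_text": (1.5 * N, 1.0 * N),
    "cooccurring_phrases_per_paragraph": (1.0 * N, 0.7 * N),
}

PER_SIGNAL_LIMIT = 1.6  # assumed: a single ratio above this looks bad on its own
COMBINED_LIMIT = 1.0    # assumed: total excess above this looks 'spammy'


def ratios(sigs):
    """Ratio of your value to the average for every signal."""
    return {name: yours / avg for name, (yours, avg) in sigs.items()}


def combined_excess(signal_ratios):
    """Sum of how far each ratio sits above the average (1.0); never negative."""
    return sum(max(0.0, r - 1.0) for r in signal_ratios.values())


signal_ratios = ratios(signals)
flagged_individually = [n for n, r in signal_ratios.items() if r > PER_SIGNAL_LIMIT]
total = combined_excess(signal_ratios)

for name, r in signal_ratios.items():
    print(f"{name}: {r:.2f}x average")
print("flagged on their own:", flagged_individually)
print(f"combined excess: {total:.2f} (assumed limit {COMBINED_LIMIT})")
```

With these made-up limits no single signal is flagged by itself (the worst is link text at 1.5x the average), yet the combined excess comes to roughly 1.82, well over the assumed combined limit. That is the 'not way out of line anywhere, but spammy overall' situation described above.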
Hey! How come tedster can say #*$! and I can't? Grrr ... I guess administration has its privileges lol
Just an update here... I mentioned before that I had had a link out to a site that looked perfectly nice, didn't have dozens of spammy links in the footer or Adsense blended in or anything, but I realized in hindsight that site was targeting a keyphrase very heavily.
Well, in second hindsight, I realized I linked to the site with that keyphrase in my link, so Google probably thought I sold them that link. Would that explain the Penguin slap? Paid links are a definite no-no, but not necessarily an SEO thing (people who know nothing about SEO can be asked to sell links, and do so for the money, not because they understand how pagerank works or that Google even has an opinion on the topic). But I call this one a "maybe."
I removed the link in November, and I've begun to get some VERY long tail traffic to that page, which wasn't happening before.
Today I filed for reinclusion (I've done it before on this site, so it can't hurt to do it again) and explained precisely why I thought they had believed the site was violating guidelines, and what I had done to make it comply.