One more for the basket:
- missing positive reviews from the usual review sites count as a minus
| Thin pages cause substantially bigger problems for a domain |
Also the definition of "thin pages" has changed. Thin pages used to mean content pages with very little content.
Apparently this has now also been applied to directory pages, index pages, and user generated content. Google has no ability to differentiate between crappy, thin content and legitimate pages designed for organization, user profiles, etc.
It just counts them all as the same thing and applies a big penalty.
From what I read, the algorithm changes are aimed at content farms. So the affected sites would either be content farms or sites that have some of the characteristics of content farms.
|Too many external named links "widget keyword" instead of "more..." (eg) cause penalties |
haven't heard about this one. Do you mean that if there are a lot of external links to a widget page using the widget name as anchor text, Google will penalize it? please clarify, thanks.
I'm not sure that the dial change on backlink anchor text is actually part of Panda. I think it may be an independent change to the older parts of the algorithm.
|From what I read, the algorithm changes are aimed at content farms. So the affected sites would either be content farms or sites that have some of the characteristics of content farms. |
I think we're moving away from the word "farm" as that was never used by Google. Some small sites have been hit as well (and "farm" would imply breeding for large quantity). I have seen 100 to 200-page sites hit. "Shallow content" might be a better phrase to use as Tedster suggested in another thread. Also, I am looking very closely at the phrase-based scoring [seobythesea.com...] as (for me) this penalty seems to be very phrase specific. I am trying to connect the "shallow" pages of my site with this phrase detection (perhaps via anchor text in my internal links?)
We've got a decent amount of discussion about Phrase-Based Indexing [webmasterworld.com] in this forum, too. And I agree with crobb305. Panda is designed to measure "quality" so it seems clear that some kind of semantic processing is involved, even if there's been little discussion about that possibility so far.
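For readers unfamiliar with phrase-based indexing, the core idea can be sketched in a few lines. This is a rough simplification of the concept (score documents by how many "related phrases" co-occur with the query phrase), not Google's implementation; the related-phrase list below is invented:

```python
# Simplified sketch of phrase-based scoring: documents containing many
# phrases statistically related to the query phrase are presumed to cover
# the topic well. The RELATED table here is invented for illustration.

RELATED = {
    "australian shepherd": ["blue merle", "herding dog", "agility training"],
}

def phrase_score(query, text):
    """Fraction of the query's related phrases that appear in the text."""
    text = text.lower()
    related = RELATED.get(query, [])
    hits = [p for p in related if p in text]
    return len(hits) / len(related) if related else 0.0

good = "The Australian Shepherd, often a blue merle, is a herding dog that excels at agility training."
thin = "Buy australian shepherd stuff here. Cheap australian shepherd deals!"

print(phrase_score("australian shepherd", good))  # 1.0
print(phrase_score("australian shepherd", thin))  # 0.0
```

On this view, a "shallow" page would simply fail to trigger enough related phrases, which could also explain why the effect seems phrase-specific.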
@tedster: agreed. overSEO'ing might not be connected to Panda - but I want to be sure about it!
I still wonder how I would dial the "quality" aspect into my ranking equation. Yet I wanted to start a thread with minimal discussion, only short headlines of what you guys think... I have one more:
- outgoing links (no matter if follow or nofollow) to content that contains large parts of your copy could tie you to a "less important" evaluation ("penalty" would be too strong a word), a content-farm characteristic.
Another key area that many people feel is involved: pages that load with little to no content above the fold, no matter how much content is included after that.
- user satisfaction on the site (site is useful or not)
I think AdWords has a good algorithm to detect what happens with clicks and what the user does after that. Maybe they've implemented something related to this.
Website trust/credibility factors. I believe that Google asked users to evaluate a sample of websites during Panda's development. This sounds similar to what Microsoft did in their recent website credibility study [webmasterworld.com...]
I am pondering anything that can be programmed into the algorithm to be an indicator of quality, in particular, "Reputed Credibility" (i.e., 3rd-party certifications and awards). Among these MIGHT be SSL Certification, privacy verification, hacker-free certifications, respected/recognized awards, etc. These may or may not be playing a role right now.
On a different note, I wonder if external links that are blocked by robots.txt at the destination or somewhere en route during multiple redirects (i.e., affiliate links that are blocked at the destination or at an intermediate redirect -- similar to what happens with CJ links before they are redirected to the final destination) could be a quality indicator.
Another item I am pondering: author names. I wonder if profiles could be created for author names, similar to any other expected query/phrase. If an author tends to be published on hubs before/after they have contributed works to your site, maybe this is a negative (particularly if those hubs have been hit by Panda)? I don't see many author names or "staff writers" listed on articles of top ranking sites in my niche. This could just be my overthinking, but I have seen my contributors publish (unduplicated) works on hubs (a reasonable expectation if they are freelance), so I fear their name on my site, in and of itself, may be bad.
I'm sorry for being a bit off topic, but why would a blog, for instance, need an SSL certificate? Sorry if I sound ignorant, but I would really like to know what the reason could be for getting one of those for a blog. They're not cheap...
IMO it has less to do with the weight of the links changing and more to do with a 'reverse scoring' (for lack of a better phrase), meaning I think a page with links pointing to a thin page may have its quality scored lower when the pages the links point to are determined to be lower quality.
IOW: If Page A links to Page B and Page B's quality score is low, the overall quality score for Page A is lowered by linking to Page B.
We know link text counts forward (toward the page the link is pointing to). I think part of what Panda does is reverse the scoring, so the quality score of the linked page counts backward (toward the page doing the linking).
Keep in mind this is 'speculation only' atm, but I really think people are looking in the wrong place when they're simply looking at link based scoring 'the old way' ... Simple link weight based scoring is soooo 2000, imo.
[edited by: TheMadScientist at 11:44 pm (utc) on Mar 30, 2011]
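The "reverse scoring" speculation above could be sketched like this. Purely illustrative of the idea being speculated about, not Google's actual algorithm; the scores, graph, and damping weight are all made up:

```python
# Illustrative sketch only: the speculation that a page's quality score
# gets dragged down by the quality of the pages it links TO (the reverse
# of classic forward link scoring). All numbers are invented.

def reverse_quality(base_scores, outlinks, weight=0.3):
    """Blend each page's score toward the average score of its link targets."""
    adjusted = {}
    for page, score in base_scores.items():
        targets = outlinks.get(page, [])
        if not targets:
            adjusted[page] = score  # no outbound links, score unchanged
            continue
        target_avg = sum(base_scores[t] for t in targets) / len(targets)
        # A page linking to thin/low-quality pages loses some of its score.
        adjusted[page] = (1 - weight) * score + weight * target_avg
    return adjusted

base = {"A": 0.9, "B": 0.2}   # Page B is a "thin" page
links = {"A": ["B"]}          # Page A links to Page B

print(reverse_quality(base, links))  # Page A drops from 0.9 toward 0.2
```

Under this toy model, pruning links to low-scored pages (or improving those pages) would lift the linking page, which matches what posters here report trying.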
|but why would a blog, for instance, need an SSL certificate? Sorry if I sound ignorant |
Zerillos, bear in mind that I was just thinking out loud....trying to identify all the possible signs of trust that could be used. If you glance through Microsoft's credibility study, you will see one of their factors is "Reputed Credibility". They use the words "certifications" and "awards".
An SSL cert is only one type of certification (useless for non-SSL pages and non-commerce sites, so for a blog it may not even matter). Having said that, I do see some well-ranking e-commerce sites that apparently have SSL installed but utilize only http urls. They went out and got an SSL Certificate and placed the seals on all of their unsecured pages. It is a certification, so it *could* count as a credibility factor (especially to the end user who may not know the difference -- they just see the seal and feel more comfortable).
This type of certification might benefit a site pegged as an e-commerce site the most. I happen to force all https requests to http in my htaccess, but I do have SSL installed. So I could generate the code and get an SSL certificate for all of my pages. I honestly doubt it would help me given that Googlebot can detect that the pages are not https, but I can't confirm or refute that.
I'm really not sure how much all of this plays into Panda, but given that they surveyed users, I must say that I feel like *some* sort of credibility indicators that may have been common among all of the best-reviewed sites could be sought by the algorithm.
[edited by: crobb305 at 11:54 pm (utc) on Mar 30, 2011]
In trying to wrap our minds around Panda - or even the Google algorithm as a whole - I suddenly realized today that many people are still thinking of it like it's a checklist, where a total score for each item gets ADDED UP. I'm sure it's not that.
All these people saying "oh no, now you're telling me I've got to do ________ (fill in the blank)" are not understanding the complexity involved, or the interrelated decisions that are made within the algorithm's calculation.
Remember that Biswanath Panda's specialty is machine-learned decision trees. These build up complex branching structures that try to approximate a human evaluator's judgments. Human judgments do not work in a linear, additive manner, and neither does Google's algo.
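The difference between a checklist and a decision tree can be shown with a toy example. The features and thresholds below are invented purely to illustrate that factors interact rather than being summed up:

```python
# Toy decision tree with invented features/thresholds, only to show that
# in a tree the SAME signal (low word count) can be judged differently
# depending on other features - unlike an additive checklist score.

def quality_verdict(page):
    if page["word_count"] < 150:
        if page["is_user_profile"]:
            return "ok"            # thin, but a legitimate structural page
        return "low quality"       # thin content page
    if page["duplicate_ratio"] > 0.5:
        return "low quality"       # long, but mostly copied
    return "ok"

thin_article = {"word_count": 90, "is_user_profile": False, "duplicate_ratio": 0.0}
profile_page = {"word_count": 90, "is_user_profile": True, "duplicate_ratio": 0.0}

print(quality_verdict(thin_article))  # low quality
print(quality_verdict(profile_page))  # ok
```

Same word count, opposite verdicts: that interaction is exactly what a "fix one item on the checklist" mindset misses.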
Google works in mysterious ways :) This certificate issue is one thing i never took into consideration.
One more for the topic: I know for a fact that G has really serious tech in the image analysis department. They could use that in the quality assessment they do with Panda.
ok, thx for the thoughts - back to the headline style:
- Internal links devalued, only external count really
- Thin pages cause substantially bigger problems for a domain
- Duplicate content snippets on your page cause substantially bigger problems
- Too many external named links "widget keyword" instead of "more..." (eg) cause penalties
- missing positive reviews from the usual review sites count as a minus
- the (low) quality of a link destination could backfire on the quality score of the link source
- missing certificates/page seals from organizations (BBB maybe?) could register as a missing trust signal
- user behaviour (satisfaction) on-page (measured by plug-ins or analytics you have on the pages) could give a quality signal
- Content above the fold: implying that G renders the page and estimates the content quality early shown
some stuff I could believe in, some not. Yet let's just gather without judging...
@tedster: neural paths... well noted, but before I build MY tree I want to have all possible nodes listed. how to put them in order is another process :-)
I completely agree tedster ... people are looking for a single, simple answer, like those of the past have been, and that just plain doesn't seem to be possible any longer from what I'm looking at...
It's a very difficult challenge, especially if uniqueness or 'originality' is part of the score, because Site A may do something and Site B's owner may think 'Oh, this is the answer' and replicate what Site A does, and it could totally backfire, because by replicating Site A (in someway) they lower the 'originality' of Site B ... It's really difficult to explain what I mean with originality and uniqueness, because IMO it would have to be 'niche' specific on a very granular level.
If Site A in Niche 1 does [blah here] and Site B in Niche 2 finds a way to incorporate [blah here] then Site B may be the originator in Niche 2, which could be a benefit, while at the same time, Site B in Niche 2 replicating what Site A in Niche 2 does could be a detriment ... I hope this makes a bit of sense to someone, and I know currently they have a scraped duplicate content issue, but imo that's something separate Panda 'walked over' during the processing and will be fixed, independently.
Google's really turned into a complicated issue to explain, because for me it's more of a 'concept' than a set answer, and explaining a concept with the complexity of G is very difficult for me...
I was glad to see immediately after the Panda update, my keywords were mostly unaffected.
But then, about a week ago, the rankings I'd held solid for several months for a particularly important keyword dropped from 3rd place to 17th, which puts me practically off the map now.
Is it possible this is a delayed Panda effect? Or did everything that was supposed to happen... happen back when the update was first done?
I'm flip-floppin' (dancing) all over the place now... but for the most part I'm fluctuating between 10th and 17th place. And I've made almost no changes.
My guess is they're continuing to nitpick around with algorithms... and it's driving me insane...
[edited by: tedster at 10:00 pm (utc) on Apr 29, 2011]
[edit reason] maintenance [/edit]
Add to your list: spelling and grammar. I mentioned early on that I found a spelling error in a header tag <h3> on my hardest-hit page (-400 positions). I corrected the mistake and within days, WMT reported that the average position of that page had rebounded to #2. This could have been a coincidence, but I do think spelling (particularly in critical places, like title/desc/headers) could be important. That page was unduplicated/unscraped, thick, and had no ads or affiliate links. The page did have a short/duplicate title and description (flagged in WMT's HTML suggestions), so all three could have caused the page to suffer. Remember that Google told us that with Panda, just a few bad pages can negatively affect the entire site.
[edited by: crobb305 at 12:14 am (utc) on Mar 31, 2011]
I'm sure that Google continues to "nit pick around" with their algorithms. If last year was roughly 400 tweaks, I wouldn't expect that to stop any time soon.
That's an interesting, if depressing, observation. Please consider adding it to the monthly SERP Changes discussion [webmasterworld.com]. It doesn't sound like pure-panda to me.
Don't forget that DATED CONTENT may also be a factor; many of us turned off our dates when dated evergreen pages went south for the winter.
nope. we're talking dated content, like 1/1/99, it went south for the winter.
|outgoing links (no matter if follow or not) to content that contains large parts of your copy could connect you to a "less important" evaluation (penalty would be too much) farm aspect. |
A while back I read about how the big sites don't link to the small guys. I think it was Danny Sullivan who said he was referenced by a NY Times article, but they wrote "One report suggested" (or something like that). No links.
The bottom line is: they were worried that sites would be afraid to link to anyone and that would basically kill the Internet. What would the net be without links?
Mad Scientist - I think you are right - linking to a low grade site can drag down your site (I think that is what you were saying).
Here is some speculation: let's say you take a paragraph from a source and then link to it. Now the SE may view you as less important than the site you are linking to.
When I put up articles on other sites (I rarely did that) linking to my site, I would also include a link to a government site as a source. I doubt Google would downgrade a site for linking to a .gov site.
[edited by: Dan01 at 1:23 am (utc) on Mar 31, 2011]
|I doubt Google would downgrade a site for linking to a .gov site. |
I have been considering the possibility that G would downgrade a site for linking to the same source too frequently or, perhaps, to .gov (or the same family of .gov pages) too frequently. My site has tended to do this and following my hunch, I have cut down on some of these.
Also, I am considering the possibility that placement of outbound links on the page (particularly if they are followable) could play a role. I tend to create a "Resources" section at the bottom of my pages, with some useful outbound links. I have started interspersing links throughout the body text a little more.
|nope. we're talking dated content, like 1/1/99, it went south for the winter. |
So I should remove that? All my pages have it. Most are updated weekly but others are not. Does Google like old or new content? I could scam them; it's one PHP line :)
I remember a while back we were discussing JohnMu's words, and one thing he had said about uploading a "fresh copy of the code..." I wonder if they are also looking at the server date for last upload.
Don't know if dated content should be removed yet, but it appears to be ONE signal Google is using. We suspect it's targeted at REVIEW sites, and I'm certainly not running a review site, but bugs are bugs.