Forum Library, Charter, Moderators: Robert Charlton & aakk9999 & brotherhood of lan & goodroi

Google SEO News and Discussion Forum

This 337 message thread spans 12 pages; this is page 10.
Analyze Panda Losers That Don't Fit The Mold
Shatner




msg:4297725
 5:55 pm on Apr 14, 2011 (gmt 0)

So we've had two iterations of Panda now, and with each iteration has come a published list of the biggest losers. We all know, if we're honest, that a lot of the losers on those lists deserved to lose and lost for obvious reasons.

The point of this thread is to pick out the sites from those lists which DO NOT fit that mold, sites which it's not obvious why they lost, and figure out why they were hit.

In doing so, maybe we'll understand why Panda has hit so many here who don't seem to deserve it either. Here's the list of sites to discuss. I suggest we simply go down the list one at a time and each list the reasons we think a site might have been Pandalized. Once we think we've come up with an explanation for that site, we check it off and move on to the next one:

prnewswire.com
blogcritics.org
cinemablend.com
digitaltrends.com
technorati.com
daniweb.com
popcrunch.com
techradar.com
reghardware.com
pcadvisor.co.uk
techwatch.co.uk
just-food.com
computerweekly.com

 

Shatner




msg:4301527
 10:33 pm on Apr 20, 2011 (gmt 0)

@wheel I don't really see how that viewpoint is relevant to this discussion or any discussion on Panda. It's just confusing the issue.

If you don't care then stop posting in Panda threads. It's not like someone has a gun to your head.

wheel




msg:4301541
 10:45 pm on Apr 20, 2011 (gmt 0)

It's confusing nothing. You're just not liking what I'm saying.

People are complaining that panda is causing scrapers to outrank the original. There's pages being written about it. You yourself are tilting at it like it's a windmill.

It's very relevant that people move on from thinking that their income is being ruined because some scraper is outranking them. That is not the problem, and there's no fix for that in the near future that involves Google figuring out who wrote something first.

johnhh




msg:4301551
 11:13 pm on Apr 20, 2011 (gmt 0)

if full members and senior members start arguing it must be bad.

That means no-one knows.

Leosghost




msg:4301556
 11:17 pm on Apr 20, 2011 (gmt 0)

wonderhowto.com isn't "the original content source" ..the various people whose work he "curates" / scrapes are.

And a lot ( but not all ) of sites on the original list are just like wonderhowto.com ..themselves with no original content ..just repackaging other peoples content..

And they are now being out ranked by yet another non original source ..

People are definitely not happy with scrapers out ranking original sources ..but G are going to be refining this series of algos all year long ..it will probably "shake out" and the innocent will come back for their own original content ..if they can signal "original" clear enough for the algo to think they are..

But what is ironic, is that so many ( but again not all ) of those complaining about being downgraded by panda, turn out to be scraping from / or using others original content without permission anyway .. [webmasterworld.com...] ..there are others here that some of us older members know of who also have run sites for years based on using content and images which are not theirs ..

They are complaining here at WebmasterWorld as well ..not everyone posting that they were hit is like that ..but some are ..and have been scraping and using content, text and or images or files that were not theirs to use, and running sites with ads around them for years ..and are now complaining that Google has done them wrong and Google is broken / or has "sold out"..no one is going to do any "outing" ..but they know who they are ( and they are not all "big" sites )..and as netmeg said a while back ..the shoes will be falling for a while yet ..hopefully all the scrapers and content thieves will suffer and disappear ..and hopefully the innocent truly original content sources will be able to make their signals ..meanwhile looking at real data will continue ..

But it isn't a "mold" ..it is more like a meal or a cake ..very very many ways to make it healthy and wholesome and nourishing and attractive to the palate and to the eye ..and to the critics ( our users )..But using stolen ingredients and copied recipes is not a way for cooks or restaurants to prosper and get good things said about them .long term ..nor is sticking jumping circus posters on the doors to the restaurant the best way for a really quality restaurant to attract a good clientèle, good reviews from the critics, or the good food guide to put them on the front page.

edited to allow for ..
It's very relevant that people move on from thinking that their income is being ruined because some scraper is outranking them. That is not the problem, and there's no fix for that in the near future that involves Google figuring out who wrote something first.


exactly :)..something you learn when you are brought up in the country on a farm ..when it is pouring with rain ..no use shouting your disgust at the sky ..get to shelter , either it will pass or you'll have to make sure you can go out into the world without getting wet ..build your own umbrella ..or make yourself waterproof ..or find a way that it won't matter to you ..what the sky( net ) does :)

wheel is confusing no one or maybe only a very few ..he is being totally realistic ..the sky doesn't care if you get wet ..and helping to shout at it won't make it stop raining ..the sky is not fair..nor is the world ..nor is Google ..and none of those things are likely to change in the near future ..

[edited by: Leosghost at 11:32 pm (utc) on Apr 20, 2011]

Swanson




msg:4301558
 11:29 pm on Apr 20, 2011 (gmt 0)

There are two questions being asked here:

1) Why did the Panda losers lose?
2) Why are scrapers out-ranking the original article?

I think they are both linked.

I also think that looking at individual sites that have been demoted since Panda isn't going to give the absolute answer that the OP is looking for.

There are many new factors in play now, that is fundamentally clear - and even from this thread we can guess what the broad issues could be.

From my own experience I am now in a position to look at what my sites that were penalised have in common with the other losers from Panda.

1) Big site lots of pages
2) Too many pages of "thin" or profile type pages
3) Too many ads above the fold
4) Not enough content in the "heatmap" area
5) Too many internal links on content pages
6) Too many link based pages used to get deep pages indexed
7) Not enough authority links

I feel what you are seeing with the original content ranking below scrapers is a sitewide penalty on the domain rather than a boost to the scrapers.

I also feel that smaller sites have an easier time now and that is why wordpress scrapers can do better than the original content on big sites.

I believe that this new algo looks at percentages of "low quality content" versus high - which means big sites suffer much more as they have loads of problems with tag pages, category pages, link pages etc.

Smaller scraper blogs will show a higher percentage of content to link pages - just by their nature.
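
The percentage idea above can be sketched in a few lines of Python. This is only an illustration of the guess being made in this post, not anything Google has confirmed: the `Page` fields, the `is_thin` rule, and both thresholds are hypothetical choices made up for the example.

```python
from dataclasses import dataclass


@dataclass
class Page:
    url: str
    word_count: int
    link_count: int


def is_thin(page, min_words=150, max_links_per_100_words=20.0):
    """Hypothetical rule: a page is 'thin' if it has few words or is mostly links."""
    if page.word_count < min_words:
        return True
    links_per_100 = page.link_count / page.word_count * 100
    return links_per_100 > max_links_per_100_words


def thin_ratio(pages):
    """Share of a site's pages that the hypothetical rule flags as thin."""
    if not pages:
        return 0.0
    return sum(is_thin(p) for p in pages) / len(pages)


# A "big" site: two real articles plus eight tag/link pages.
big_site = [Page("/article/1", 900, 12), Page("/article/2", 1200, 15)]
big_site += [Page(f"/tag/{i}", 40, 60) for i in range(8)]

# A small blog: nine posts and a single archive page.
small_blog = [Page(f"/post/{i}", 500, 8) for i in range(9)]
small_blog += [Page("/archive", 30, 40)]

print(thin_ratio(big_site))    # share of thin pages on the big site
print(thin_ratio(small_blog))  # share of thin pages on the small blog
```

With these made-up numbers the big site ends up mostly "thin" even though its articles are longer, which is the effect described above: the ratio, not the absolute amount of good content, drives the score.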

That is what I am seeing - all my smaller content sites (that are purely duplicate datafeed sites) are doing just fine. All my "big" sites have been affected in some way even though they have thousands of pages of unique content - but they also have a lot of pages that are "thin" which are designed to get lower pages indexed.

It is as if in this new algo my unique content has been ignored as the percentage to the total pages is small - and the ads above the fold are the icing on the cake.

To me it is not valid to look for the one killer factor - it is based on a range of identifiers, all of which I think have been mentioned in this and other threads. They may have differing degrees of "penalty" but they all contribute to a lower scoring. The reason big brands win is that their backlinks are so much better than the rest of us.

Swanson




msg:4301561
 11:37 pm on Apr 20, 2011 (gmt 0)

I honestly think if you have been hit hard by Panda and you are a small site then start again.

If you are a big site then I would consider starting again as well.

I started again a few thousand times as I got tired of messing about with every update.

Ironically the sites that I put all the effort in with thousands of unique handwritten articles (as I wanted "brand sites") have all been affected - all the pure datafeed keyword rich sites with no content are doing just fine.

Surely that is not what Panda was meant to fix. But a plan B feels good right now.

walkman




msg:4301562
 11:43 pm on Apr 20, 2011 (gmt 0)

Swanson,
the problem is that 2 months later no one has come back. This is what really is p*ssing me off. Google changes their algo completely, going even against the advice they themselves offered, and doesn't take any changes into consideration for at least 2 months.

exactly :)..something you learn when you are brought up in the country on a farm ..when it is pouring with rain ..no use shouting your disgust at the sky ..get to shelter , either it will pass or you'll have to make sure you can go out into the world without getting wet ..build your own umbrella ..or make yourself waterproof ..or find a way that it won't matter to you ..what the sky( net ) does :)

As soon as I read your post I called a farmer. He told me that the first thing you learn is to differentiate between rain (an act of God) and someone spraying water with a hose towards you. In the second case, he said, you try to figure out why, how to stop it, what it will take for him to stop etc. And farmers, he said, hate false analogies.

Swanson




msg:4301572
 11:54 pm on Apr 20, 2011 (gmt 0)

walkman, I totally understand.

But I think in a different way - I am not thinking about pleasing what Google say, that is fool's gold.

The tide has changed - there is no longer "white hat" or "black hat" attitude to Google rankings.

Everyone has to get their heads around that it is survival of the fittest - the days of dissecting a Google update and then waiting for a roll back are gone.

If you run a site now - expect that it will be gone from Google every day that you get traffic from them.

Use that mentality and then you can start to think of getting a long term strategy to keep Google traffic coming in.

And that might mean breaking some of those "I love Google and they have told me to do XYZ and that will be good" attitudes.

With every update Google pushes us away - and the winners will be the guys that always have a plan B.

Leosghost




msg:4301586
 12:32 am on Apr 21, 2011 (gmt 0)

When I was a kid in Ireland we had no hoses on our farm ..nor did we have running water or electricity..

I can still read and predict weather better than the folks on TV ..and I no longer live in the same country ..and live on the coast in what is considered to be an unpredictable area of North West France ..and still beat the TV weather forecasts here..even the 3 and 6 month long range ones..so do others here.

Can do the same with search engines ..and crops ..the good ones take longer than 2 months to show results ..if you really knew a farmer ..and had really called one ..he would have told you that I spoke the truth and to have patience..unless of course he was an industrial type farmer ..used to hoses and artificial irrigation ( involves taking the water from elsewhere to use it to grow your own crops )..which of course is exactly the kind of thing that Google were trying to stop ..when we were all calling it farmer before we all began calling it panda.

I'll take what I know from growing up as a son of a real farmer over the supposed advice of your fictitious farmer.

You think shouting at the sky when it's raining will get it to stop ..you are just going to waste your energy and that of the people around you that you convince to shout with you ..and you'll all get wet next time it rains ..again..and again ..

I've been saying for weeks now..if not months .as have many others here ..adapt, evolve, survive or die ..that is how it is ..none of us like it ..but ranting on every thread there is here about panda or Google will change nothing ( but if you find it is cathartic ..fine :).but it makes them "noisy" as shaddows commented elsewhere here today ) ..and doesn't help any who were not hit to formulate what Swanson calls a plan B ..or C or ..( being "pissed" is not a plan ..nor is hoping their share price tanks or that they do a roll back ) a way not to depend on Google for your living ..which again many of us here have been saying is the best way to cope with algos ..for years ..some were too busy counting the adsense or abrite or whatever ad supplier money to make the effort ..or thought it would last forever ..the web evolves ..as does life ..

netmeg




msg:4301588
 12:43 am on Apr 21, 2011 (gmt 0)

If it were I, I would not be looking at my site, I would be looking at my business model.

But maybe that's just me.

walkman




msg:4301591
 12:46 am on Apr 21, 2011 (gmt 0)

"I've been saying for weeks now..if not months .as have many others here ..adapt, evolve, survive or die ..that is how it is ..none of us like it ..but ranting on every thread there is here about panda or Google will change nothing"

You assume that we don't have 10 Firefox windows open...but anyway, thanks for the advice.

onepointone




msg:4301592
 12:48 am on Apr 21, 2011 (gmt 0)

Having a Plan A,B,C,D,E,...Z is best.IMO

Webmasters kind of grew up with a 'techie-driven', 'wants everybody to love us' g.

Now seems to have evolved into a 'money only-driven', paranoid g. Catches some people off guard.

Read about how they treat customers in adwords threads. They sure won't treat a 'free-ride' webmaster any better!

maximillianos




msg:4301594
 12:53 am on Apr 21, 2011 (gmt 0)

My two cents. This is not a permanent penalty unless you don't fix what they penalized your site for. It can be remedied. Google acknowledged there would be some false positives. It is statistically impossible to release such a big change without a few false positives.

I think there are many factors. If you are good enough (lucky?) to figure enough of them out you will then have to wait for G to agree your site is fixed. Then, at that point you have to wait it out for x weeks or months for G to approve the changes.

So even a site that may have fixed their issue within days may still be stuck for a few months in a queue. For big sites, it may take a few weeks or months for G to even re-crawl and confirm. Then a few more months of waiting to see if the changes stick.

At least that is how I would program it. ;-)

So moral of the story. Don't listen to anyone who says throw away your 12 year old domain and start over. The domain is not the problem.

Shatner




msg:4301619
 1:48 am on Apr 21, 2011 (gmt 0)

>>>The tide has changed - there is no longer "white hat" or "black hat" attitude to Google rankings.

Everyone has to get their heads around that it is survival of the fittest - the days of dissecting a Google update and then waiting for a roll back are gone.

If you run a site now - expect that it will be gone from Google every day that you get traffic from them.

Use that mentality and then you can start to think of getting a long term strategy to keep Google traffic coming in.

-------------

This. The problem with walkman and people like him is they are unable to wrap their head around this. They want to believe that the people who have been Pandalized somehow "deserved it" because it makes them feel more secure in their own position. If everyone who was Pandalized didn't "deserve it" as they like to put it, then they might be next, and that scares them.

Their fear is sidetracking a legitimate discussion of what's going on with Google.

It boils down to some people are so afraid they don't even want to discuss Panda and they're trying to stop the discussion.

crobb305




msg:4301620
 1:54 am on Apr 21, 2011 (gmt 0)

From my own experience I am now in a position to look at what my sites that were penalised have in common with the other losers from Panda.

1) Big site lots of pages
2) Too many pages of "thin" or profile type pages
3) Too many ads above the fold
4) Not enough content in the "heatmap" area
5) Too many internal links on content pages
6) Too many link based pages used to get deep pages indexed
7) Not enough authority links


Good summary. I think with my Pandalized site, I am wasting too much space above the fold on a stupid date script (that displays today's date) and it resides in an otherwise empty cell that has way too much whitespace. The date stamp was something I thought was neat about 10 years ago when I first got started, but I can't remember why I left so much white space (probably just to divide the page up and make it easy to navigate visually). Anyway, I think it fits with "not enough content in the heatmap area" (or even insufficient content above the fold -- whether it's banner ads, whitespace, etc). Now, I'm just trying to decide what to put there without triggering another filter. lol

Also, it's good to hear some data from someone with multiple sites that they can compare (some Pandalized, others unPandalized). I have an unpandalized site that has improved 40%, so I can see the similarities you mention when comparing my sites.

Leosghost




msg:4301638
 2:40 am on Apr 21, 2011 (gmt 0)

Something that seems to come up even here, as irritating most of us ..and I know from watching users that it irritates and confuses the hell out of them as they try to close them, or get around them to the actual site.

Pop up ads on entry to site..or those that you can make wait and ambush the visitors..you know 3000 milliseconds and they slide slowly over the viewport and have to be clicked on very precisely in a hard to find, or catch, part on their frame to get them to go away ..

Or sites which auto size to what the webmaster thinks is ideal..I don't mean minimum size ..but pages that onload to max screen or very near it..

I used to have one of those onload autosize scripts on a site ( yes crobb305 ;-)..everyone has their "datestamps" and their wooshing email gifs in their closet..we all thought that it was coool ..years ago, like our hairstyles ;-) ..I took it out when widescreen monitors went mainstream ..but I'd have probably forgotten to do it if a friend hadn't said to me that he didn't know how to get the thing ( browser window) smaller.

And it used to trigger the "this site is trying to run scripts"..do you want to etc etc on IE ..scared people ..actually had someone say to me "my computor says your site has a virus!" IE messages or AV messages worry your average non geek user ..makes them feel less trusting about your site..

We think things are cool ..or some things are not intrusive or difficult or don't get in the way, or are normal for a site, like we tend to pay less attention to ads than the average user ..but watch your average user ( you can actually just spend your own money to test "user friendliness" ..you don't need "an institute"..don't use family or friends ..they may well say what you want to hear or what they think you want to hear )..rent a room for a day in a mall or similar ..give people ( choose average looking folks, don't go for the geeky looking ones ..ask to make sure they are not ) $10.00 a time to "try" a site or two sites ( ten dollars = ten minutes ..and don't quibble with them about the time or anything at all )..don't guide them ..ask them ..or better yet just watch them ..

That's a small version of how ad agencies test "response" prior to "rollout"..or how manufacturers test packaging changes..or even how toy companies test potential winners ..

Cost you a few hundred dollars maximum ..can save you thousands or tens of thousands or more ..your "pride and joy" may not be as cute and friendly as you think ..

Test this way for enough variants of age etc ..and you get to the point where you no longer need to ..you can make a site for your visitor, whoever they may be and whatever the subject might be ..and know that it won't put the average target off as soon as they try it, or as soon as the page opens..

Tangible proof that anyone can do and see with their own site ..or someone else's..or with an idea they may have for a site..

Nielsen only brought to the net what some of us had been doing for decades and others had been doing for decades before us off line.

tedster




msg:4301679
 4:22 am on Apr 21, 2011 (gmt 0)

It's true - the game has changed, very seriously changed. I've been communicating with all kinds of experienced SEOs and no one has managed to bring a site back from this, or see a predictable pattern.

Some philosophy:

It's both frustrating and painful when your livelihood is on the line. But let's not get crazy on each other because of something Google is doing. All of us frustrated webmasters should be natural allies. If we lose our sense of community, then we've really let Panda do some serious damage.

I think this forum's Charter [webmasterworld.com] says it best:

SEO is a fluid profession. Techniques and tactics are always changing, and absolute rules are in short supply. There are only educated opinions - and it's common for opinions to run counter to one another. Tolerance helps to clear up the discussion much more than conflict.


Sermon over, let's return to the regularly scheduled program.

semseoanalyst




msg:4301723
 6:37 am on Apr 21, 2011 (gmt 0)

Shatner makes this thread valuable, which has people like me spending hours reading the analysis ...but it is demoralizing when the thread gets distracted from its subject that way....

incrediBILL




msg:4301724
 6:43 am on Apr 21, 2011 (gmt 0)

It boils down to some people are so afraid they don't even want to discuss Panda and they're trying to stop the discussion.


Or more annoyingly, keep driving the discussion into debates about quality content or not. The algo can't tell, that discussion is such a waste of time yet some just keep chanting it and other meaningless platitudes like it has anything to do with Panda.

bluntforce




msg:4301742
 7:43 am on Apr 21, 2011 (gmt 0)

I didn't get hit by Panda, but if other people did, I work on the expectation that it will happen to me also.

I always revert back to Brett's 26 steps.
UGC always provides new pages, I try to fill those out with more information, links if appropriate, but always those pages will provide the user what they needed/wanted.

Links? One a day isn't always practical, sometimes I'll do five a day, other times there will be gaps. Always "on theme", I have no interest in out of area or off topic links.

I have quality content, my job is to make sure that content is accessible to internet users, that's where my focus lies.

onepointone




msg:4301751
 8:07 am on Apr 21, 2011 (gmt 0)

I've looked through the list of losers a few times...

Wondering, have many or have any "niche" sites reported big "pandalization"?

Personally, I don't consider big sites based around "how to", "shopping", "articles", "movies", "tech" to be niche. They cover tons of ground. Niche means niche.

And I was thinking the "big general site" aspect alone wouldn't cause a penalty, but maybe the way internal links were distributed, possibly along with "unnatural" site growth? Maybe that is a google definition of content farm? Just speculating!

[edited by: onepointone at 8:32 am (utc) on Apr 21, 2011]

walkman




msg:4301752
 8:11 am on Apr 21, 2011 (gmt 0)

This. The problem with walkman and people like him is they are unable to wrap their head around this. They want to believe that the people who have been Pandalized somehow "deserved it" because it makes them feel more secure in their own position. If everyone who was Pandalized didn't "deserve it" as they like to put it, then they might be next, and that scares them.


HUH? When did I say that

Shaddows




msg:4301769
 9:01 am on Apr 21, 2011 (gmt 0)

HUH? When did I say that

Pretty sure he meant "wheel"

Shaddows




msg:4301791
 10:42 am on Apr 21, 2011 (gmt 0)

I've been thinking about a couple of mental traps that some people find themselves in.

1) Panda is a "penalty"
2) Pandalised sites fit a mold

BEFORE YOU JUST SLATE ME, PLEASE JUST READ ON.

Panda is not a penalty, it is a new scoring strand- like PR is a scoring strand, or TrustRank.

Similarly, Pandalised sites do not "fit a mold" and therefore cannot fail to fit a mold. Sure, there are commonalities- the big one being non-unique content (regardless of who published first).

In order to maybe see this issue from another perspective, consider PageRank.

Imagine PageRank became a factor today. Yesterday, everything was onpage factors, with maybe some seed sites giving some weighting, and some semantic relationships being analysed. Actually, many important factors use PR methodology, but try to ignore that for this thought experiment

Suddenly, new sites shoot to the top of SERPs. Studying them for commonalities doesn't seem to lead anywhere.

Some perfectly good content suddenly drops out the top 10. No one knows why. None of the old tricks work- semantic siloing, changing page titles, even Hx schemes.

Funnily enough, sites with a lot of traffic pop to the top. Google must be punishing the little guy. Google only wants people with high traffic. It's all about brands. It's so unfair.

Reports start coming in. Some sites with low-ish traffic have been rewarded. It seems like links are giving value in themselves. There are sceptics:
"Why would just getting someone to link to you give you benefit? My site is MUCH better looking, and uses funky markup. I never needed links before."

Counter examples start being put forward
"No, I comment on lots of blogs and have a link to my pet site, and that has no effect"
"Yeah, and I have a personal blog, with links to all my sites on every page, and that doesn't help"

Eventually, consensus emerges. It takes months. It turns out that links are the key. Positioning, repetition, PR of linking page, quantity of links on the page, templates- so many factors. Strategies change, the world moves on.

Years later, people say: "Remember the Larry update. Before then, you just needed some good directory listings and onpage optimisation. I can't believe people think those things still work- I can get a site ranking with just a few high-PR links"

My point? PR emerges from the system, it's not a tick-list of factors. Losing out to sites is not a penalty either. And while we might not yet have a methodology to exploit the new system, it doesn't mean it's random, or unfair. New techniques might be needed, and it might take a LONG TIME to overcome the differentials inherent when the new score came into play- getting 1,000 scoring links isn't easy.
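
To make "PR emerges from the system" concrete, here is a minimal sketch of the textbook PageRank power iteration in Python. It is the published simplified form, not whatever Google actually runs; the tiny link graph, the damping value of 0.85, and the iteration count are assumptions chosen only for the example.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Simplified PageRank by power iteration.

    links maps each page to the list of pages it links to.
    """
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Every page gets a small baseline share regardless of links.
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its damped score evenly across its outlinks.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outlinks spreads its score over all pages.
                share = damping * rank[page] / n
                for p in pages:
                    new_rank[p] += share
        rank = new_rank
    return rank


# Hypothetical four-page web: everyone links to "hub", nobody links to "c".
graph = {"hub": ["a", "b"], "a": ["hub"], "b": ["hub"], "c": ["hub"]}
scores = pagerank(graph)
for page, score in sorted(scores.items(), key=lambda item: -item[1]):
    print(page, round(score, 3))
```

No page here is scored against a checklist. "hub" comes out on top and "c" at the bottom purely because of who links to whom, which is why studying any one page in isolation tells you so little.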

jecasc




msg:4301792
 10:52 am on Apr 21, 2011 (gmt 0)

Or more annoyingly, keep driving the discussion into debates about quality content or not. The algo can't tell, that discussion is such a waste of time yet some just keep chanting it and other meaningless platitudes like it has anything to do with Panda.

But it is all about content quality. If you take a look at the sites in the original posts the pattern is obvious:
- rewritten manufacturer brochures and manuals disguised as product "reviews"
- reporting press releases as news
- reporting news from other news sites as news. "the guardian reports that the new york times reported a report from a news source somewhere on the internet".

It is the difference between actually watching a movie and writing about it or rewriting studio press material. It is the difference between testing a product - holding it in your hands and then writing an article or rephrasing the brochure or manual. The difference between being at an event and reporting about it or reading about an event and reporting about it.

You may think an algorithm can't measure this? Why not? You can measure publishing times, you can even look for keywords like "as reported by". You can measure how often other sources are mentioned to check if it is original. You can check how many other websites have a similar topic on the same day. You could even check the images next to the article and look if they are from an archive or are new. Or from a manufacturer. Analyze the meta data. Check geotags. Look for news agency names in the image url. There are no limits, use your imagination. You could even measure pronouns like "I" and "we" and create an "original personal experience index". "I liked the book because" as opposed to "The book is regarded by critics as".
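
Two of the signals above are easy to sketch in Python. This is a toy under stated assumptions, not a claim about what Google measures: the phrase list, the pronoun set, and the function name `originality_signals` are all invented for the illustration.

```python
import re

# Hypothetical phrase list hinting that a text is reporting someone else's report.
ATTRIBUTION_PHRASES = ("as reported by", "according to", "reports that")
# Hypothetical pronoun set for the "original personal experience index".
FIRST_PERSON = {"i", "we", "my", "our", "me", "us"}


def originality_signals(text):
    """Count attribution phrases and the share of first-person words in a text."""
    lowered = text.lower()
    words = re.findall(r"[a-z']+", lowered)
    attribution_hits = sum(lowered.count(phrase) for phrase in ATTRIBUTION_PHRASES)
    first_person_hits = sum(1 for word in words if word in FIRST_PERSON)
    total = len(words) or 1  # avoid dividing by zero on empty text
    return {
        "attribution_hits": attribution_hits,
        "first_person_share": first_person_hits / total,
    }


firsthand = "I liked the book because we read it aloud and my kids laughed."
secondhand = "The book is regarded by critics as a classic, as reported by the Guardian."

print(originality_signals(firsthand))
print(originality_signals(secondhand))
```

The first text scores on first-person words and has no attribution phrases; the second is the reverse. Real text is far messier than this, but it shows the kind of cheap signal that needs no understanding of the content.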

What would you measure?

And if you have original content, don't blast it out across the internet in real time. How is Google going to find out that you are the source when, the instant you click "Publish", different snippets from your articles appear on hundreds of websites? Why should Google even care whether the original source ranks and not one of the websites you sold your content to?

I don't think for a moment this is about background colors, or moving the ads 20px to the left, or using two instead of three keywords in link text.

They want to believe that the people who have been Pandalized somehow "deserved it"


When you look at some of the posts of recent days, some clearly do. Some here are practicing SEO like "painting by colors", trying to sell it as art and then crying: But all the colors I used were of the finest quality!

Maybe. But your painting just looks like all the others: average and artificial. If you are trying to solve your Panda problem by moving ads from left to right and the navigation from top to bottom you are like a painter that paints a human face like this:
- Coordinates of eyes: X/Y color blue
- Smile: corner of the mouth 20° up.

Then he wonders why he doesn't stick out of the crowd and tries the "appropriate" changes:
- saturation of hair: +10%
- smile: 22° up

crobb305




msg:4301966
 4:20 pm on Apr 21, 2011 (gmt 0)

Panda is not a penalty, it is a new scoring strand- like PR is a scoring strand, or TrustRank.


Good analogy. It would be funny if Google trademarked PandaRank. haha.

Or more annoyingly, keep driving the discussion into debates about quality content or not. The algo can't tell, that discussion is such a waste of time yet some just keep chanting it and other meaningless platitudes like it has anything to do with Panda.


I agree with jecasc, I think they can model and detect quality (based on what they define "quality" to be -- very subjective as each of us has a different definition of quality). What they can't do (yet) is fact check. So if quality stems from offsite and onsite signals, such as an author profile, inbound links from specific/trusted sources, slow load times, intrusive ads that popup or scroll the screen down, layout, rendering/browser/mobile compatibility, 3rd-party reviews/certifications (mood detection in the reviews), trust gained from established branding or association with other trusted sources, etc., they can do it. If the "quality" is about the factual accuracy of the information, they can't do it.

londrum




msg:4302001
 5:01 pm on Apr 21, 2011 (gmt 0)

one thing that worries me is this: i've heard people suggesting that if you repeat the same snippet of text in multiple places on your site then it will get you branded as "low quality".

but that is a completely normal piece of web design. you cant alter it.

take an events site as an example... if an event ran from the 1st january to the 20th january, then you can imagine that the snippet will appear in multiple places -- on the 1st jan page, on the 2nd jan page, the 3rd jan page etc etc. if the events site allows people to search by date, then the event could crop up on 20 different pages.

we cant just noindex those pages. we need them. they are perfectly usable pages. and there is no point re-writing the same snippet 20 times for each individual page. and how many times can you rewrite a snippet anyway? and how would that improve the "quality" of the site? it doesnt.

so what are we supposed to do in a situation like that?

crobb305




msg:4302004
 5:05 pm on Apr 21, 2011 (gmt 0)

i've heard people suggesting that if you repeat the same snippet of text in multiple places on your site then it will get you branded as "low quality".

but that is a completely normal piece of web design. you cant alter it.


I have thought about this also. What about a slogan that is repeated sitewide? Navigational text repeated sitewide? I think there is some flexibility, but I do feel like I am walking on eggshells with respect to the use of certain phrases (i.e., competitive/money phrases).

jecasc




msg:4302009
 5:08 pm on Apr 21, 2011 (gmt 0)

If the "quality" is about the factual accuracy of the information, they can't do it.


If you have a text about:

"How to be liked by everybody."

and the text reads something like this:

"Having friends is important and everybody wants to be liked. To get new friends and be liked by everybody buy a baseball bat, go out on the street and if you meet someone you would like to be friends with, hit him over the head with the bat, drag him home and lock him up in the cellar. People say having friends is important, so go out and start making friends today."

Now of course this is total garbage. And algorithms are too stupid to understand the text and determine that.

But there are plenty of other ways. I don't say they do it like this but here is what I would do:

- look for keywords that could indicate shallow content.
"everybody", "one", "people say"

You can easily recognize shallow articles by the sources they use. The sources usually are "people, the news, common sense, everybody knows".

It should not be too difficult to put that into an algorithm.
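
To make the idea concrete, here is a minimal sketch of such a check in Python. It only counts vague-source phrases per 100 words. The phrase list is taken from the examples in this post, the function name is made up for illustration, and nothing here is known to be how Google actually does it.

```python
import re

# Hypothetical marker phrases, taken from the examples in the post.
# "one" is left out because it is too common as an ordinary word.
VAGUE_SOURCES = ("everybody", "people say", "common sense")


def vague_source_rate(text: str) -> float:
    """Return vague-source phrase hits per 100 words (0.0 for empty text)."""
    lowered = text.lower()
    words = re.findall(r"[a-z']+", lowered)
    if not words:
        return 0.0
    hits = 0
    for phrase in VAGUE_SOURCES:
        # Whole-phrase match only, so "everybody" does not match inside other words.
        hits += len(re.findall(r"\b" + re.escape(phrase) + r"\b", lowered))
    return 100.0 * hits / len(words)
```

A high rate on its own proves nothing. It would only be one weak signal to combine with the others listed below.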

There are other patterns: sentence structure, repetition, variations. You can analyze the surroundings - is an article accompanied by a unique image, or by archive material?

You can analyze whole websites. Most of the content farms are using guidelines for their content writers.

- Minimum number of words
- Maximum number of words
- Number of keywords in headlines
- Number of keywords in image names
- Number of keywords in URL
- Number of keyword repetition in text.

Over and over the same content ideas, like top 10 lists, how to do this, how to do that, reporting what was in the news, blah blah.

If you are creating patterns - and especially only one pattern (do you really use a single guideline for all your writers?) - why are you surprised this pattern can be recognized?
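
As a rough illustration of how a single writer guideline can show up in the numbers, the sketch below measures how much page length varies across a site. The function name and the choice of word count as the only feature are assumptions made for the example. A real system would presumably look at many features at once.

```python
from statistics import mean, pstdev


def word_count_spread(pages: list[str]) -> float:
    """Coefficient of variation of word counts across pages.

    Values near 0 mean every page is about the same length, which is what
    a strict minimum/maximum word guideline tends to produce.
    """
    if len(pages) < 2:
        raise ValueError("need at least two pages to measure spread")
    counts = [len(page.split()) for page in pages]
    avg = mean(counts)
    if avg == 0:
        return 0.0
    # Population standard deviation divided by the mean.
    return pstdev(counts) / avg
```

The same measurement could be repeated for keyword counts in headlines, image names or URLs. A site where every one of those spreads is close to zero looks machine-made, whatever the quality of the individual texts.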

It is all becoming more and more complex. It is no longer about how you place your text and how you link to it. It is about how the texts are written and about people's actual reactions. It is about measuring signals that indicate people trust the source. Have you ever wanted to watch a movie and wanted to know if it is worth watching and then googled for "movie title imdb" instead of just "movie title"? The simplest measure of trust for Google is whether someone is actually searching for your website. Can you imagine there are women out there searching for "shoe name zappos"? That's what being a brand means. If you want to survive in the long term you have to be a brand IMHO. Brands are one of the oldest means of accumulating and binding trust. And they make you distinguishable. That does not mean you have to be a big brand. But you have to be one, have to be distinguishable and your brand has to be trusted.

incrediBILL




msg:4302127
 7:50 pm on Apr 21, 2011 (gmt 0)

But it is all about content quality.


Then explain all the collateral damage of quality sites thrown under the bus.

The algo is deaf, dumb and stupid. It just contains a definition, a fingerprint, of what a quality Panda site should be, and it appears from the results that the fingerprint has smudges.

For instance, when I'm link checking I use a "fingerprint" to determine when I've hit soft 404-like pages. Say a site just goes away and leaves a graphic "We'll be back soon" and nothing else, or you get something from the server like "nothing here", and the status in both cases is "200 OK". The fingerprint is that the page has no links, no frames, no meta redirects, no image maps, no flash, no way to navigate off the page and virtually no content: a dead-end page, mostly 404-like in behavior. Sure, there will be a few pages linked up around the web that incorrectly match that fingerprint, just like what happens to Panda sites.
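
A minimal sketch of that kind of fingerprint, using only Python's standard library, might look like the following. It is not the actual link checker. The tag list, the 200-character threshold and the names are assumptions chosen to mirror the description above.

```python
from html.parser import HTMLParser


class _PageFeatures(HTMLParser):
    """Collects the few features the fingerprint needs."""

    # Tags that give the visitor some way off the page: links, frames,
    # image maps and flash/plugin content.
    EXIT_TAGS = {"a", "frame", "iframe", "area", "map", "object", "embed"}

    def __init__(self):
        super().__init__()
        self.has_exit = False
        self.text_chars = 0
        self._skip_depth = 0  # inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1
        if tag in self.EXIT_TAGS:
            self.has_exit = True
        if tag == "meta":
            http_equiv = dict(attrs).get("http-equiv") or ""
            if http_equiv.lower() == "refresh":
                self.has_exit = True  # meta redirect

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.text_chars += len(data.strip())


def looks_like_soft_404(status: int, html: str, max_text_chars: int = 200) -> bool:
    """True for a "200 OK" dead-end page with virtually no content."""
    if status != 200:
        return False  # a real error status is not a *soft* 404
    features = _PageFeatures()
    features.feed(html)
    return not features.has_exit and features.text_chars <= max_text_chars
```

As with any fingerprint, a legitimate page with one short paragraph and no links would match too, which is exactly the kind of smudge described above.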

The problem is that Panda is far more complicated than soft 404-like behavior, so the level of complexity involved and the inability to easily narrow down such factors probably make accurately implementing Panda next to impossible.

The only thing we can do is figure out the basic Panda fingerprint and attempt to make sure we match as many points as we can to stay out of Panda purgatory.

crobb305




msg:4302135
 8:04 pm on Apr 21, 2011 (gmt 0)

jecasc, it sounds like you and I are saying the same thing with respect to subjective quality definitions/guidelines. You referenced my statement about search engines' inability to fact check, so I'm not sure if you're disputing that or not; but your points are in general agreement with mine about quality detection by an algorithm.

Search engines can't actually fact check a document to determine whether or not the information is factually accurate. For example, if two scientists post information with opposing viewpoints on a topic, the algorithms have no way of knowing which is correct. Site A says global warming is real, Site B says it's a scam, Site C says it's anthropogenic, Site D says it's real but occurs in cycles, etc. That was my point. I agree with everything you've said in terms of defining quality for the purpose of detection by a computer program. Those definitions/signals can always change with time. Meanwhile, like incrediBILL said, we've seen otherwise quality sites get thrown under the bus because of the unknown (but highly speculated) "signals of quality" that Google used to devalue them.

All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved