|1) Is there a ratio of low-quality pages to high-quality pages? In other words, what if we delete 8,000 pages but leave several low-quality pages on the site - would we still be hit by Panda? How does the algo work? |
We don't know.
|2) Is it better to delete from the oldest to the newest or does it matter? I am not sure it matters, but I was thinking that if Google crawls from newest to oldest, the SE will notice the difference quicker. |
Again, we don't know. If we did some might keep that secret, some might reveal it. So far none have said either way.
|3) My wife and I see things a little differently. I delete articles even if they got only one or two visits over the last nine months. She won't - if it had one visit, she keeps it. Who is right? I don't want to have to go through this process again if it isn't fixed the first time around. |
Married 29 years to second wife, 5 years to first wife (outlived both) you ALWAYS listen to the wife. Marital accord is more important than Google. Put that in your pipe and smoke it. Serious!
Meanwhile, present your site to the USERS/VISITORS. I have pages over 12 years old that get only 1 or so hits per year. Not popular, of course, but exact for those seeking that info. Pick and choose what you keep, and pick and choose if you are info or commerce, the latter being the biggie if Adsense or Adwords is involved.
Wow Tangor. Thanks. Those are mysteries to me too (1 and 2).
As per my second question - I don't think Google crawls the whole site in one crawl. I would assume that they crawl part of it, take a note where they left off, and then come back and crawl some more the next day (or later depending how often content is updated).
Also, as per #3 - you are right. I have been less likely to delete an article if it pertains to a high-paying keyword.
Again, your answers were great.
You should noindex them instead, same results with search engines, but it's easier to resurrect if you were wrong.
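For anyone unfamiliar with the tag, a page-level noindex is a single line in each page's `<head>`; a minimal sketch (nothing else on the page needs to change):

```html
<!-- Tells search engines not to index this page while leaving it live
     for visitors; remove the tag later to "resurrect" the page -->
<meta name="robots" content="noindex">
```

Unlike a deletion, visitors with bookmarks or inbound links still reach the page while it is out of the index.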
The problem I see here is that you're equating visitor numbers to Panda's assessment of quality. But Panda is all about sending fewer visitors to lower quality pages, so what Google sent traffic to pre-Panda doesn't tell you all you need to know. A better source of information might be which articles, and which types of pages, people have linked to without having been asked to by you.
Good point Rosalind. I'll see if I can find that data. I saw Webmaster Tools has it. It looked like their data only goes up to 1,000 pages though.
We might do that after this first round of deletes.
As per the no-index Koan - I am going to think about that. I have done that with other sites for various reasons.
We have been talking about deleting some of the dead-wood on our website. It might even improve performance.
We have seen drop-offs in traffic before. Over the years I have begun to catalog the SERPS. All of a sudden I noticed we weren't showing up in the SERPS - for those keywords I monitored.
Thanks for your responses.
I just went through 100 articles and deleted fewer than five. At that pace we may end up deleting only 1,000 articles or fewer. I doubt it would have much of an effect.
I told the wife about your comments. Perhaps we will look into the inbound links, but we need to find that data.
I would sign up for a duplicate content service and run a test on your entire content to see how much of it really is rubbish, then take all the rubbish down and either rewrite it or leave it off.
Thanks Nippi, I will discuss that with the wife.
Do you think it would be a problem to put 8,000 or 10,000 lines in the robots.txt file? It would be a challenge to put the tag in each of the pages.
Also, Google says that if noindex is used, the page will not show up in the search results. But I have not yet read anywhere that they will not still hold that page against you with Panda.
About a year ago I used noindex for a folder in one of my websites. A couple of months later I noticed Google quit indexing the whole site. I am not sure what happened, but it worked out OK in that instance because it turned out I didn't want that site indexed anyway. In this case, I want everything indexed.
Dan01, I had better luck putting the noindex tag in pages when I tried to remove a large section of my site from Google's index. When I first used the robots.txt file, it was taking forever. I understand that doing it manually with each page individually may not be viable though. Mine used a simple template page for all of them, so it was easy.
Did it help with Panda? Not one bit, so far.
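When editing thousands of pages isn't practical and robots.txt is too slow, another option is the X-Robots-Tag HTTP header, which applies noindex at the server level. A sketch for Apache (assumes mod_headers is enabled; the /newsletters/ path is just a placeholder):

```apache
# Send "X-Robots-Tag: noindex" for every URL under /newsletters/
# without touching the pages themselves (requires mod_headers)
<LocationMatch "^/newsletters/">
    Header set X-Robots-Tag "noindex"
</LocationMatch>
```

Google treats the header the same as the meta tag, and it works for non-HTML files (PDFs, etc.) that can't carry a meta tag at all.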
I am currently dealing with the same problem with one of my clients; they have about 25K pages of newsletters. A couple of months ago they were on page 1 for a high-traffic finance keyword, then they dropped to page 5 and got stuck there (after a Panda rollout).
One of the solutions we are trying is to leave all the posts (newsletter pages) in place but remove all of the links to them from the main pages. The only posts that will have links are the latest ones (two weeks old). The reason for this is that it is exactly what the #1 site, which has exactly the same format, is doing. We'll see how it goes. But I will be following this thread for other possible solutions.
Dan01, had you been untouched by Panda up until now? That is scary; it's worrying that some of our "safe" sites could still be hit.
I was hit slightly. I noticed a couple SERPS fell, but most of the SERPS remained. Sometime last month all of the SERPS I watched disappeared. We are still getting some traffic, but like I said, it is cut in half.
I have other sites that are still untouched, but this was our biggest.
Over the past year we have been looking into other businesses besides this (I have had my own businesses since 1985 and haven't worked for anyone since then). I hate to put all of my eggs in one basket.
It seems like every year or every other year we have had something like this happen. In 2008 our revenue dropped off like this, but it came back. Since then I have been more analytical about SERPS, etc. I noticed some shakeups in 2009 too. 2010 seemed pretty steady. I think this one in 2011 will require more work.
whatson, there was the first international Panda in August, affecting non-English sites for the first time.
We incorporated a list of our pages that have links pointing to them. Webmaster Tools truncates the list at 1,000. Is there a better way to download a list of OUR PAGES that have links to them?
Majestic SEO offers full backlink data. You do need to register, but if you put a verification file on your server you don't need to pay for the report.
Thanks Ted. I downloaded a report and noticed it only gave me about 500 pages. But I see other tools now to add to my tool box. Cool.
I was hit by Panda in February of this year. From what I've been reading... the more you cut the better. There is a school of thought out there that it takes 50% to get hit by Panda and then 75% over-correction to get back from it.
I think that people who have been testing Panda solutions would agree to deleting/no-indexing as much of the site as possible... or even starting over.
When we were first hit we no-indexed about 2,500 pages of 17,000. Most recently (two weeks ago) we bit the bullet and no-indexed an additional 9,000 pages. So far the immediate effect is that our already halved traffic has fallen off about 20%... but we'll see what happens the next time Panda runs.
Thanks Lenny. We have been slowly but surely deleting articles.
I was of the opinion - the more the better. This morning I was looking in the log files and saw some page-not-found errors on pages that we had already deleted. Since then I have combined the list of our linked pages from the site Ted gave me, all of the Webmaster Tools list, and all of the pages that have had visitors since my wife first created the list.
I am sure there are duplicates in the list - for instance, some were on the list I got from Majestic SEO and Webmaster tools etc., but if we are just doing a find command with a ton of data, it doesn't really matter how long the list is.
The site is too valuable to just forget about it or start over, and it still generates some income despite being hit by Panda. Plus, I have been working on it since 2004 while we parked the domains we bought back in the 90s. This is our baby now.
I am going to go with the more conservative approach and delete a little at a time now, and try to be more selective (careful).
Thanks for your comment. At least I am not the only one in this boat.
Beware the conservative approach. I took that route, starting back in April (UK site). Didn't want to throw the baby out with the bath water. But over time I have realised more drastic action was required, as it became apparent that tweaks really weren't the answer. Certainly that's the case if you contrast the advice of those who have recovered with the discussions of those who haven't. The upshot is I have been wasting time bleeding to death slowly. Now I am pretty much bankrupt and still no major recovery. Don't waste the opportunity (time) you have!
Hey there, suggy:
I am sorry to hear about your predicament. I do hope things turn around for you and you get back on track.
|Certainly, that's the case if you contrast the advice of those who have recovered, with the discussions of those who haven't. |
I have to ask this, but is there somewhere you have seen advice from people who have actually recovered from Panda?
If it is here in webmasterworld forums, I don't seem to have found it. Maybe there are some people who have recovered, but I don't see much in terms of what was successfully done (except for a few mentions by tedster).
|I think that people who have been testing Panda solutions would agree to deleting/no-indexing as much of the site as possible... or even starting over. |
There are many types of sites that have had a Panda problem - so I am not a fan of this kind of "one size fits all" advice. For example, I've helped one site recover without doing any deleting or noindexing.
If you know you have shallow or repetitive content, sure go after removing that. A common problem I've seen that does require deleting is hosting many very similar pages just to target small keyword variations. But don't start deleting or noindexing just because someone else did it and had success. Make sure you have a good idea about how and why your content became a Panda target first.
For example, maybe your site lost its authority status with Google and those who syndicate your best content are now ranking for it. In a case like that, you need to recover your site's trust and authority first. Otherwise you can delete content all you want and it will get you nowhere.
Like tedster said, you don't know how many filters you might end up with (it might not be Panda), so noindex is preferable to deleting pages. The lack of authority links can be a key issue sometimes. I have seen an issue with ranking for new, more powerful keywords, which leads to drops, probably based on re-evaluation.
With large websites you need a lot of authority to float deep pages in the results with any regularity. If Google changes the way it scores your inbound links, then traffic and rankings will drop away. My point is that it's not always "Panda" - there are plenty of ways Google can hurt you these days; they have plenty of weapons.
Dan01, I would never just delete a page; I would 301 it to another page on the site that has the same or very similar content. Do you really want 404s for the 1,000 (or however many) pages you delete? There is a long-standing rule: never just delete a page. We call this link rot, and it's not something I would do.
You can, as suggested, noindex them, but if the pages are duplicate in nature I prefer to 301 them myself.
I like Ted's advice: "find out why Panda targeted YOU and fix that..." I have to say that in my own case, Dan01, I did find that I in fact only have about 5,000 useful pages on the site... the rest of the 15,000+ pages are all basically search functions and alternative ways of "seeing" the site. So my decision to de-index the 10,000+ pages was based on my realization that the content was basically duplicate and not necessary for the search engines. This "cleaning house" is also why I am willing to lose an additional 20% of our Google traffic... (no matter what happens, I know my site is more sound for making the change.)
On the other hand, losing authority status could also be a major issue... In my case I assume that Google still "likes" my site because NEW pages get indexed fairly quickly (within a day) and fairly well (on the first page for just a mention on the site - less competitive key phrases of course.)
I'm not trying to hijack your thread, Dan01... but would it be helpful to get some insight into how other webmasters test the authority of their websites? I know it would help me! It sounds like it would be a more productive approach for you too... since it seems like your site is primarily all very good content!
|Dan01 I would never just delete a page I would 301 the page to another page on the site that has the same or very like content. Do you want 1000 or how many pages you delete 404's. There was a rule a long time ago never just delete a page we call this linkrot, and not something I would do. |
|You can as suggested noindex them but if the pages are duplicate in nature I prefer to 301 them myself |
OK, there were a lot of good responses here.
The latest iteration of Panda, according to what I read here, had to do with scraper sites. We were hit at that time.
Up until 2009 we accepted guest posts. There were a lot of them. I am sure those posts appeared on other sites. Perhaps we were viewed as a scraper site? That is what I think is happening now.
Amazingly, we still rank for some pages - but we have a bunch of junk pages too.
NOW HERE ARE MY QUESTIONS:
1) Will Google penalize us for getting rid of (deleting) those junk pages? It is so easy and quick to see if a page got any traffic and checkbox it for deletion, rather than 301 or noindex it.
2) I am still looking for a good list of the pages on OUR site that have links to them. The site tedster suggested truncated the list at 500, and Webmaster Tools truncated it at 1,000.
> 1) Will Google penalize us for getting rid of (deleting) those junk pages.
Google already advised Pandalized webmasters to get rid of junk (noindex/remove/move to another site).
> 2) If you can't find the info from a tool, use Google Analytics for Landing Pages. If nobody landed in the last year, the links can't be worth much.
Google finds very large sites highly suspicious, and subdirectories with many pages, too.
> I think that people who have been testing Panda solutions would agree to deleting/no-indexing as much of the site as possible... or even starting over.
Everybody has to find the right balance between remodeling the building and tearing it down. If you kill 90% of your site's traffic all at once and then rebuild, you could lose all that traffic while you rebuild.
How long is that going to take?
Slow and steady building and rebuilding is most natural and least suspicious to Google.
Wow Potenialgeek. That is exactly what I was thinking, but I wanted to see if anyone else agreed. Thanks.
We are doing this rather slowly. We check each page before we delete it. We started several months ago and hope to be done by February or so.