|Massive De-indexing – Deleted Posts = Thin Content? |
| 7:37 pm on Jun 19, 2014 (gmt 0)|
Like many other sites, one of my client’s websites was devastated by Panda 4.0. Since the update the site’s traffic has decreased by over 60%, which has thrown my client into a panicked frenzy. The site allows users to upload pictures, and each picture gets its own page with a unique URL. While ideally all of these picture pages would generate traffic as landing pages, even before the update the main page and the category pages were responsible for the overwhelming majority of organic traffic.
The issue of duplicate or thin content was never a concern of his prior to the update, but a Google site search reveals a serious problem. While he has approximately 25,000 pages, Google has over 60,000 in the index. It appears the problem is that users often upload the same image multiple times, or upload images that contain nudity, and the client subsequently deletes these posts. Depending on the timing of the deletion, Google may have already crawled these pages and added them to the index. The question I have is whether anyone has a suggestion to expedite the de-indexing process. The deleted posts are not linked to from anywhere on the website and no longer appear in the sitemap. Consequently, it will be quite a while before Google re-crawls these nearly 40,000 pages. Given the number of pages, submitting individual removal requests through Webmaster Tools is not a feasible option. I was thinking I could create a sitemap that includes these deleted posts, which now include a meta NOINDEX tag, but outside of clicking on every single result in the site search, I do not know a way to obtain all the deleted post URLs.
I was also contemplating de-indexing the entire site apart from the main and category pages, as the individual picture pages are created with a random URL, few have titles, and apart from the picture, the page contains no text description. Furthermore, as the pictures are uploaded by users, most do not have descriptive names or alt text. The client has gone through thousands of these pages and added descriptive title tags, but even these pages would likely be considered thin content, correct?
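The sitemap idea from the post above could be sketched roughly as follows, assuming the deleted post URLs have already been collected somehow. The function name `build_sitemap` and the sample URL are illustrative only; the 50,000-URL cap comes from the sitemaps.org protocol. Whether resubmitting such a sitemap actually speeds up re-crawling is not guaranteed.

```python
from xml.sax.saxutils import escape

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
MAX_URLS_PER_SITEMAP = 50000  # per-file limit in the sitemaps.org protocol


def build_sitemap(urls):
    """Return sitemap XML (as a string) listing the given URLs."""
    urls = list(urls)
    if len(urls) > MAX_URLS_PER_SITEMAP:
        raise ValueError("split the list: a sitemap may hold at most 50,000 URLs")
    lines = [
        '<?xml version="1.0" encoding="UTF-8"?>',
        f'<urlset xmlns="{SITEMAP_NS}">',
    ]
    for url in urls:
        # escape() turns &, < and > into XML entities so the file stays valid
        lines.append(f"  <url><loc>{escape(url)}</loc></url>")
    lines.append("</urlset>")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical deleted-post URL, following the example.com pattern in this thread
    deleted = ["http://example.com/category1/post1"]
    print(build_sitemap(deleted))
```

The output would be saved as a separate file (for example a dedicated sitemap for removed posts) and submitted alongside the regular sitemap, then withdrawn once the pages have dropped out of the index.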
| 9:09 pm on Jun 19, 2014 (gmt 0)|
If there is a directory structure for these problem pages, then you can submit a request to purge the entire directory.
If the structure of the site is as follows:
then you can easily purge the directory by submitting the following request:
and choose the remove directory option from the drop down list in WMT.
| 9:27 pm on Jun 19, 2014 (gmt 0)|
Panoramic, you said this was a client, but you're just now finding out about this issue? This should have been addressed a long time ago.
The pages in question are worthless IMO. An image is an image, and without supporting text these pages will only continue to be a drag on the site.
I think part of the problem is that a person uploads an image and links to that image, so the link profile of this site is probably a high number of links to interior pages that have no real value to anyone other than the person who uploaded them.
| 9:37 pm on Jun 19, 2014 (gmt 0)|
Unfortunately the website is not structured in that fashion. The problem pages (i.e., the pages that are now deleted and need to be de-indexed) have the same URL structure as the pages that we want to remain in the index.
example.com/category1/post1 [Deleted - Need post to be de-indexed]
example.com/category1/post2 [Good- Need post to remain in index]
If only there was an easy way to locate the URLs of all the deleted posts quickly.
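One possible way to separate the two kinds of URL, since they share a structure, is to subtract the live URLs in the current sitemap from a larger list of candidate URLs. This is only a sketch: it assumes a candidate list exists (for instance from server logs or an older sitemap, neither of which is confirmed in this thread), and the function names are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Namespace used by standard sitemap files
NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def sitemap_urls(xml_text):
    """Return the set of <loc> URLs found in a sitemap document."""
    root = ET.fromstring(xml_text)
    return {
        loc.text.strip()
        for loc in root.findall("sm:url/sm:loc", NS)
        if loc.text
    }


def deleted_urls(candidate_urls, live_sitemap_xml):
    """Candidates that are absent from the live sitemap, sorted."""
    live = sitemap_urls(live_sitemap_xml)
    return sorted(set(candidate_urls) - live)


if __name__ == "__main__":
    # Sample data only, mirroring the post1/post2 example above
    live_xml = (
        '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">'
        "<url><loc>http://example.com/category1/post2</loc></url>"
        "</urlset>"
    )
    candidates = [
        "http://example.com/category1/post1",
        "http://example.com/category1/post2",
    ]
    for url in deleted_urls(candidates, live_xml):
        print(url)
```

Anything this reports should still be spot-checked, because a URL missing from the sitemap is not necessarily a deleted post.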
| 9:42 pm on Jun 19, 2014 (gmt 0)|
BwnBwn, you are entirely correct! However, this is a new client of mine who only sought out my services after incurring the wrath of Google.
I would agree that for the most part these image pages are worthless, but what about the ones that contain custom descriptive title tags? A handful of these pages generate solid traffic due to the title keywords. Would it be better in your opinion to de-index these nearly 25,000 pages to recover from the penalty and simply focus on the category and main pages?
| 9:54 pm on Jun 19, 2014 (gmt 0)|
Panoramic, that makes sense then: a new client after the fact.
How many pages are there on this site?
| 10:32 pm on Jun 19, 2014 (gmt 0)|
BwnBwn, the site has 16 pages, which include several category pages in which the pictures are displayed in gallery format, and a handful of other pages. In addition to these pages, are the individual posts, which are the image pages and total over 24,000. The Google site search count fluctuates, but at this moment it is 66,600.
| 11:16 pm on Jun 19, 2014 (gmt 0)|
|In addition to these pages, are the individual posts |
You mentioned "pages" and "posts" - is this site a WordPress site?
| 12:05 am on Jun 20, 2014 (gmt 0)|
getcooking, yes WordPress is the content management system. Does that have any significance?
| 12:20 am on Jun 20, 2014 (gmt 0)|
|Does that have any significance? |
| 12:20 am on Jun 20, 2014 (gmt 0)|
Even just images can survive without on-page text: use <noscript>Meta data here</noscript>. That, together with the title element, is more than enough to fix this problem. That is what it is there for.
Nab a few expired domains, redirect them to the primary categories, and you have cured this completely.
| 12:25 am on Jun 20, 2014 (gmt 0)|
For WordPress, something like the 404 Redirected plugin might help to at least identify the URLs that have been deleted. It logs 404 not found errors. The intention of the plugin is to set up redirects for missing files, but I imagine just the logging itself could be helpful in a case like this since you said
|If only there was an easy way to locate the URLs of all the deleted posts quickly. |
You'd obviously still have to figure out what to do with the logged files... whether it's manually removing them from GWT or something else.
(Side note: there may be other WordPress plugins that do the same thing or do it better, but this is the only one I've used and am familiar with.)
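If the raw server access log is available, a similar list could be pulled from it without a plugin. The sketch below is an alternative to the plugin, not a description of how the plugin works. It assumes the common Apache/Nginx "combined" log format and that deleted posts answer with 404 or 410; if they return a normal page carrying a noindex tag, they will not show up here. The sample log lines are invented.

```python
import re
from collections import Counter

# Matches the start of a "combined" format line up to the status code
LOG_LINE = re.compile(
    r'^\S+ \S+ \S+ \[[^\]]+\] '
    r'"(?P<method>[A-Z]+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) '
)


def missing_paths(log_lines, statuses=("404", "410")):
    """Count requests per path whose response status is in `statuses`."""
    counts = Counter()
    for line in log_lines:
        match = LOG_LINE.match(line)
        if match and match.group("status") in statuses:
            counts[match.group("path")] += 1
    return counts


if __name__ == "__main__":
    # Invented sample lines for illustration only
    sample = [
        '203.0.113.5 - - [19/Jun/2014:19:37:00 +0000] '
        '"GET /category1/post1 HTTP/1.1" 404 512 "-" "ExampleBot/1.0"',
        '203.0.113.5 - - [19/Jun/2014:19:38:00 +0000] '
        '"GET /category1/post2 HTTP/1.1" 200 2048 "-" "ExampleBot/1.0"',
    ]
    for path, hits in missing_paths(sample).most_common():
        print(hits, path)
```

In practice the function would be fed an open log file (which iterates line by line), and the resulting paths prefixed with the site's domain before further use.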
| 12:28 am on Jun 20, 2014 (gmt 0)|
RedBar & getcooking, can you elaborate on this? The website is not hosted on WordPress.com, but uses WordPress as a CMS.
| 12:29 am on Jun 20, 2014 (gmt 0)|
Getcooking, Thanks! Having a file that logs the 404 redirects will certainly be useful.