Moderators: Robert Charlton & aakk9999 & brotherhood of lan & goodroi

Google SEO News and Discussion Forum

    
Removing Thin Content - 410 vs robots.txt
DodgeThis




msg:4530596
 10:23 am on Dec 23, 2012 (gmt 0)

We have several folders on our site that at one time contained thin content. Everything within has since been set to return a 410, but due to the high volume of pages involved, we are looking for a better way to tell Google everything has been deleted rather than wait for it to recrawl each page.

Google advise using robots.txt to deny access to certain paths, but we have also heard that this can look suspicious, as if trying to hide thin content from them.

Most of the pages were 410ed last year after a Panda strike, the rest earlier this year. WMT is still discovering them. Recent events have caused us to investigate the possibility that Panda’s tolerance is decreasing and we are falling back into its clutches due to the perception of our site, not the reality.

We would be most grateful for any guidance from others who have experienced similar. Thank you.

 

not2easy




msg:4530609
 1:46 pm on Dec 23, 2012 (gmt 0)

It might be a good idea to verify that pages in those folders do actually return a 410 response. If they do, you do need to wait for Google to crawl them. The only way they offer to remove content requires that you do not block those pages or folders in robots.txt.
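One way to do that verification in bulk is a small script. The following is a minimal Python sketch, not a definitive tool: it sends a HEAD request per URL without following redirects and reports anything that is not a 410. The function names are made up for illustration, and it assumes the server answers HEAD the same way it answers GET.

```python
import http.client
from urllib.parse import urlsplit

REDIRECT_CODES = (301, 302, 303, 307, 308)


def fetch_status(url, timeout=10):
    """Send one HEAD request and return (status, Location header).

    http.client never follows redirects, so the status is the one the
    originally requested URL really returned.
    """
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.netloc, timeout=timeout)
    else:
        conn = http.client.HTTPConnection(parts.netloc, timeout=timeout)
    try:
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        conn.request("HEAD", path)
        resp = conn.getresponse()
        return resp.status, resp.getheader("Location")
    finally:
        conn.close()


def report(urls, fetch=fetch_status):
    """Map each URL to a short verdict string."""
    results = {}
    for url in urls:
        status, location = fetch(url)
        if status == 410:
            results[url] = "ok: 410 Gone"
        elif status in REDIRECT_CODES:
            results[url] = f"problem: redirects ({status}) to {location}"
        else:
            results[url] = f"problem: returned {status}"
    return results
```

Feed report() a list of the removed URLs (for example, exported from your server logs) and look only at the entries that start with "problem".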

Look in your Google Webmaster Tools account for Google's instructions on how to remove pages, folders and directories from their index. If the pages still exist, make sure they don't carry an "index, follow" meta tag, and do not block them in robots.txt, or Google can't crawl them to see that the pages have noindex tags. Once the meta tags are fixed, notify Google via GWT to remove that content from their index. It does not always work on the first try; they still bug me about directories that were properly removed years ago and do not exist anywhere except in their imagination.
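To confirm that robots.txt is not hiding the removed pages from the crawler, something like the following Python sketch may help. It uses the standard library's urllib.robotparser; the function name and the example paths are hypothetical. Python's parser does not resolve Allow/Disallow conflicts exactly the way Google does, so treat the result as a rough check only.

```python
from urllib.robotparser import RobotFileParser


def blocked_urls(robots_txt, urls, user_agent="Googlebot"):
    """Return the URLs that this robots.txt text would stop user_agent from fetching."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [u for u in urls if not parser.can_fetch(user_agent, u)]


# Hypothetical robots.txt that blocks the removed folder: the crawler
# could then never see the 410 (or a noindex tag) on those pages.
robots = "User-agent: *\nDisallow: /thin/\n"
removed = ["http://www.example.com/thin/page1.html"]
print(blocked_urls(robots, removed))
```

An empty list is what you want here: every removed URL should still be crawlable.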

Edited to add: Make sure that pages you want to disappear are not showing up in your sitemaps.
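A quick way to check the sitemap is to parse it and list any entries that still fall under the removed folders. This is a rough Python sketch assuming a standard sitemaps.org XML sitemap; the function name and the "/thin/" prefix are placeholders, not anything from the site discussed here.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlsplit

# Namespace used by standard sitemaps.org sitemap files
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def removed_urls_in_sitemap(sitemap_xml, removed_prefixes):
    """Return sitemap <loc> URLs whose path starts with one of removed_prefixes."""
    root = ET.fromstring(sitemap_xml)
    hits = []
    for loc in root.findall(".//sm:loc", SITEMAP_NS):
        url = (loc.text or "").strip()
        path = urlsplit(url).path
        if any(path.startswith(prefix) for prefix in removed_prefixes):
            hits.append(url)
    return hits
```

Called as removed_urls_in_sitemap(xml_text, ["/thin/"]), it should return an empty list once the sitemap is clean.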

g1smd




msg:4530640
 6:13 pm on Dec 23, 2012 (gmt 0)

I'm always looking for the easy life: once the URL requests correctly return a 410 Gone response, I feel there's nothing more that needs doing, and I let Google reindex at its own rate.

TheMadScientist




msg:4530676
 11:26 pm on Dec 23, 2012 (gmt 0)

What g1smd said + welcome to WebmasterWorld!

Sgt_Kickaxe




msg:4530697
 2:37 am on Dec 24, 2012 (gmt 0)

410, no doubt about it. If the pages are gone, that is the only message you want to send Google. If the pages happen to be in a directory, you can use your GWT account to submit a removal request for the entire directory so that you are not causing search visitors to bounce, but at this point it's 100% up to Google to catch up with your pages.

Welcome to WW.

DodgeThis




msg:4530733
 9:43 am on Dec 24, 2012 (gmt 0)

Thank you all for taking the time to reply.

The pages are long gone, replaced by a script that sends a 410 header and redirects visitors to a Page Gone notice. The sitemap doesn’t include them, nor do any pages on the site link to them. The pages in question do not show in serps either, just WMT, so not sure if the Remove URL tool would apply.

Comforting to hear not2easy say this can take a while. Google discover a fresh batch of 410s daily so it seems all we need apply is a little more patience.

Thanks again, you’ve put my mind at ease.

Happy Holidays.

g1smd




msg:4530739
 10:05 am on Dec 24, 2012 (gmt 0)

"sends a 410 header and redirects visitors to a Page Gone notice."

If the URL shown in the browser address bar changes (is "redirected") then your implementation is broken.

Hopefully, you meant to say "sends a 410 header and shows a notice informing that the page has Gone at the originally requested URL".

Don't take this as a criticism but we are hot on using the right terminology here: a redirect is a very specific thing, one that causes the browser to make a new request for a different URL after receiving a 301, 302 or 307 response to the original request.
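To make the distinction concrete, here is a minimal sketch of the correct behaviour using Python's built-in http.server. It is only a stand-in for whatever script the site really runs; the "/thin/" prefix, the notice text and the serve() helper are assumptions for illustration. The 410 status and the notice are both sent at the originally requested URL, and no Location header is sent, so nothing is redirected.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

GONE_PREFIXES = ("/thin/",)  # hypothetical removed folders
NOTICE = b"<html><body><h1>Page Gone</h1><p>This page has been removed.</p></body></html>"


def is_gone(path):
    """True when the requested path lies inside a removed folder."""
    return path.startswith(GONE_PREFIXES)


class GoneHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if is_gone(self.path):
            # Status 410 plus the notice as the response body.
            # No Location header, so the address bar does not change.
            self.send_response(410)
            self.send_header("Content-Type", "text/html; charset=utf-8")
            self.send_header("Content-Length", str(len(NOTICE)))
            self.end_headers()
            self.wfile.write(NOTICE)
        else:
            self.send_error(404)


def serve(port=8000):
    """Run the demo server locally until interrupted."""
    HTTPServer(("127.0.0.1", port), GoneHandler).serve_forever()
```

Calling serve() and requesting /thin/anything in a browser should show the notice while the address bar keeps the requested URL.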

DodgeThis




msg:4530748
 11:03 am on Dec 24, 2012 (gmt 0)

g1smd, you are correct, I meant to say it shows a notice, not redirects to one. My wording was poor, I appreciate you seeking confirmation.

g1smd




msg:4530753
 11:22 am on Dec 24, 2012 (gmt 0)

Thanks for the clarification. There's a lot of times when people have been found to 301 redirect and then return 404 or 410 at the second URL. That's a disaster. Glad you're not affected.
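That failure mode is easy to miss in a browser, so it can be worth tracing the hops by hand. A rough Python sketch, assuming a fetch callable (hypothetical, supplied by you) that takes a URL and returns the status code and Location header for a single request without following redirects:

```python
from urllib.parse import urljoin

REDIRECT_CODES = {301, 302, 303, 307, 308}


def trace(url, fetch, max_hops=5):
    """Follow redirects by hand and return the list of (url, status) hops."""
    hops = []
    for _ in range(max_hops):
        status, location = fetch(url)
        hops.append((url, status))
        if status not in REDIRECT_CODES or not location:
            break
        # Location may be relative, so resolve it against the current URL
        url = urljoin(url, location)
    return hops


def redirects_to_error(hops):
    """True when the request was redirected and the final URL answered 404 or 410."""
    return len(hops) > 1 and hops[-1][1] in (404, 410)
```

A removed page that is set up correctly gives a single hop with status 410; two or more hops ending in 404 or 410 is the broken pattern described above.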

backdraft7




msg:4530783
 1:38 pm on Dec 24, 2012 (gmt 0)

Why remove thin content? Google loves it.

All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved