My feeling is that Google allots a limited amount of crawling and indexing resources to a site based on its PR and backlinks, and that these pages are taking up a significant portion of our allotment. So I'm hoping that by removing them from the index, we can get Google to focus on the pages we do want indexed, the ones with better chances of ranking.
These pages are already indexed and have a fair number of links pointing to them from other sites.
Is it enough to add <META NAME="ROBOTS" CONTENT="NOINDEX, NOFOLLOW"> to these pages? Will that prevent future crawling and indexing despite the numerous external links pointing to them?
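For concreteness, here's a minimal sketch of that setup (the page content is made up, and Python's standard http.server is just standing in for our real stack): everyone, bot or human, gets a normal 200 response, and the meta tag alone does the excluding.

# Hypothetical sketch: serve the page normally with HTTP 200,
# and rely on the robots meta tag to keep it out of the index.
from http.server import BaseHTTPRequestHandler, HTTPServer

PAGE = b"""<html>
<head>
<meta name="robots" content="noindex, nofollow">
<title>Page we want out of the index</title>
</head>
<body>Content that stays useful for human visitors.</body>
</html>"""

class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        self.send_response(200)  # same status for bots and humans
        self.send_header("Content-Type", "text/html")
        self.end_headers()
        self.wfile.write(PAGE)

if __name__ == "__main__":
    HTTPServer(("", 8000), Handler).serve_forever()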
What HTTP status code should these pages return when requested by Googlebot? 404? Or, when Googlebot follows the external links to these pages, should we 301 redirect it to our homepage to take advantage of the backlinks? Again, we don't want to break those backlinks, because they're useful for visitors. Is it dangerous to return a 404 or 301 to Googlebot and a 200 (OK) to humans? Does that violate Google's TOS?
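And to be clear about the alternative I'm asking about, this is the pattern (again just a sketch, not something we've deployed, and the naive User-Agent check is part of what worries me, since serving bots something different from what humans see is cloaking):

# Hypothetical sketch of the risky pattern: sniff the User-Agent and
# give Googlebot a 301 while human visitors get the normal 200 page.
from http.server import BaseHTTPRequestHandler, HTTPServer

class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        ua = self.headers.get("User-Agent", "")
        if "Googlebot" in ua:
            # Redirect the bot to the homepage, hoping to pass link value.
            self.send_response(301)
            self.send_header("Location", "/")
            self.end_headers()
        else:
            # Human visitors still get the page as usual.
            self.send_response(200)
            self.send_header("Content-Type", "text/html")
            self.end_headers()
            self.wfile.write(b"<html><body>Page for visitors.</body></html>")

if __name__ == "__main__":
    HTTPServer(("", 8000), Handler).serve_forever()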
Thanks
I guess you're suggesting we return HTTP status 200 whenever the bot follows an external link to these pages, but disallow indexing with robots exclusion.
It sure would be nice to take advantage of those external links by 301 redirecting Googlebot to the page(s) we do want indexed. Would that be a bad idea?
Harness that PR, don't waste it! Don't fuss with weird redirects or with trying to get the pages removed from the search engines; use them for strategic internal linking to support other pages on your site. Depending on how much PR the pages have, a few well-planned, well-placed links could give a really nice boost to other pages you've already done some SEO work on.
Keep it simple; you're a lot less likely to create problems for yourself.