|robots.txt to forbid an entire forum|
I want to disallow Google from indexing my forum.
There are over 2 million member pages. None of them have titles or URLs that are ever going to bring traffic to the site.
I was thinking noindex on the entire forum, but would it not be easier to just disallow the forum from being indexed?
Would there be any negative aspect to disallowing such a massive number of pages?
I'm also going to unlink members' names for guests, to prevent leaking link juice to these worthless pages.
It would also save crawl budget if you use a robots.txt Disallow rule. If you use a noindex robots meta tag, Googlebot still needs to crawl all those URLs just to see it.
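For anyone who wants to check a rule before deploying it, here is a minimal sketch using Python's standard urllib.robotparser. The robots.txt body, the example.com domain and the member URL are assumptions made up for illustration; only the /forum/ path comes from this thread. It shows how a compliant crawler would read the rule, not how Googlebot is implemented.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: block everything under /forum/ for all crawlers.
ROBOTS_TXT = """\
User-agent: *
Disallow: /forum/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def is_crawlable(url, agent="Googlebot"):
    """Return True if the parsed rules let `agent` fetch `url`."""
    return parser.can_fetch(agent, url)


# Illustrative URLs on an assumed domain.
forum_ok = is_crawlable("https://www.example.com/forum/member.php?u=12345")
home_ok = is_crawlable("https://www.example.com/")
print(forum_ok, home_ok)
```

Because the member pages live inside /forum/, the single Disallow line covers both the threads and the profiles.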
However, I'm not clear whether you're aiming to remove only the member pages or ALL the forum threads. If it's only the member pages, then there could be a downside to not allowing a crawl of them: any links on those pages will no longer circulate PageRank to the other URLs on the forum. If the rest of the URLs are well interlinked, that might not be a problem. But in a less than ideal situation, it might.
[edited by: tedster at 7:20 pm (utc) on Feb 5, 2011]
Hi Tedster, the website receives 30,000 visits per day, of which about 20 come from the forum + member pages. I do mean 20, not 20,000, haha. I basically want to act as though the entire forum + member pages don't exist. Oh, btw, the member pages are within /forum/
So to do that, would the best way be to disallow and also noindex, to save crawl allowance?
If you disallow via robots.txt, your noindex meta will not be seen since the page will not be crawled.
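A rough sketch of why the two don't combine, assuming a crawler that obeys robots.txt before fetching anything. The directives_seen helper, the sample page and the URL are hypothetical, written only to illustrate the point; real crawlers are far more involved.

```python
from html.parser import HTMLParser
from urllib.robotparser import RobotFileParser


class RobotsMetaParser(HTMLParser):
    """Collects the directives from <meta name="robots" ...> tags."""

    def __init__(self):
        super().__init__()
        self.directives = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "meta" and (attrs.get("name") or "").lower() == "robots":
            content = attrs.get("content") or ""
            self.directives += [d.strip().lower() for d in content.split(",")]


def directives_seen(robots_txt, url, html, agent="Googlebot"):
    """Return the meta robots directives a compliant crawler would read,
    or None if robots.txt stops it from fetching the page at all."""
    rules = RobotFileParser()
    rules.parse(robots_txt.splitlines())
    if not rules.can_fetch(agent, url):
        return None  # page never fetched, so the meta tag is never read
    meta = RobotsMetaParser()
    meta.feed(html)
    return meta.directives


# Hypothetical member page carrying a noindex tag.
PAGE = '<html><head><meta name="robots" content="noindex, follow"></head></html>'
URL = "https://www.example.com/forum/member.php?u=12345"

blocked = directives_seen("User-agent: *\nDisallow: /forum/\n", URL, PAGE)
crawlable = directives_seen("User-agent: *\nDisallow:\n", URL, PAGE)
print(blocked, crawlable)
```

With the Disallow rule in place the function returns None, which is the situation described above: the tag is on the page, but nothing ever reads it.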
True. After checking, I see some members have linked to their profiles from their own blogs etc., but not a great deal. My issue with using noindex, follow on these pages is the fact that it will still eat into crawl allowances. Hmm.
Uh, if you disallow, you lose all the internal links I would guess you have to the home page and other main category pages ... I'd noindex, but not disallow. Let Google and others see what you have and act accordingly. Even if it doesn't send you traffic, IMO it adds 'weight' to your site.
Put up an XML sitemap and set the priority of pages on your site to help manage crawling if you feel the need, but IMO you should not disallow.
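If it helps, here is a minimal sketch of generating such a sitemap with Python's standard xml.etree.ElementTree. The URLs and priority values are made-up examples, and build_sitemap is a hypothetical helper. The priority field is only a relative hint within your own site, so how much a search engine actually honours it is not guaranteed.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(entries):
    """Build sitemap XML from (url, priority) pairs, priority in 0.0-1.0."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for loc, priority in entries:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = loc
        ET.SubElement(url, "priority").text = f"{priority:.1f}"
    return ET.tostring(urlset, encoding="unicode")


# Illustrative entries: main pages high, forum pages low.
sitemap_xml = build_sitemap([
    ("https://www.example.com/", 1.0),
    ("https://www.example.com/forum/", 0.1),
])
print(sitemap_xml)
```

The idea is simply to rank the pages that earn traffic above the forum pages, rather than hiding the forum outright.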
Why not delete all the fake XRumer-style user profiles that never posted and are over 3 months old? Then the remaining member pages are real member pages (probably only about 25%, at least on my forum, hehe).