robots.txt to forbid an entire forum

     
3:42 am on Feb 4, 2011 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:May 9, 2007
posts:876
votes: 0


I want to disallow Google from indexing my forum.

There are over 2 million member pages. None of them have titles or URLs that are ever going to bring traffic to the site.

I was thinking noindex on the entire forum, but would it not be easier to just disallow the forum in robots.txt so it never gets indexed?

Would there be any negative aspect to disallowing such a massive number of pages?

I'm also going to unlink member names for guests, to prevent leaking link juice to these worthless pages.
4:45 am on Feb 4, 2011 (gmt 0)

Senior Member

WebmasterWorld Senior Member tedster is a WebmasterWorld Top Contributor of All Time 10+ Year Member

joined:May 26, 2000
posts:37301
votes: 0


It would also save crawl budget if you use a robots.txt Disallow rule. If you use a noindex robots meta tag, Googlebot still needs to crawl all those URLs just to see it.

However, I'm not clear whether you're aiming to remove just the member pages or ALL the forum threads. If it's only the member pages, there could be a downside to not allowing a crawl of them: any links on those pages will no longer circulate PageRank to the other URLs on the forum. If the rest of the URLs are well interlinked, that might not be a problem. But in a less than ideal situation, it might.
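For comparison, the robots.txt route is a single rule, assuming the whole forum (member pages included) sits under one path such as /forum/:

    User-agent: *
    Disallow: /forum/

That one Disallow line stops compliant bots from crawling anything under /forum/, whereas a noindex meta tag has to be fetched from every one of those two million pages before it can take effect.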

[edited by: tedster at 7:20 pm (utc) on Feb 5, 2011]

4:49 am on Feb 4, 2011 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:May 9, 2007
posts:876
votes: 0


Hi tedster, the website receives 30,000 visits per day, of which about 20 come from the forum + member pages. I do mean 20, not 20,000, haha. I basically want to act as though the entire forum + member pages don't exist. Oh, btw, the member pages are within /forum/

So to do that, the best way would be to disallow and also noindex, to save crawl allowance?
1:57 pm on Feb 4, 2011 (gmt 0)

Senior Member from GB 

WebmasterWorld Senior Member 5+ Year Member Top Contributors Of The Month

joined:Apr 30, 2008
posts:2630
votes: 191


If you disallow via robots.txt, your noindex meta will not be seen since the page will not be crawled.
2:56 am on Feb 5, 2011 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:May 9, 2007
posts:876
votes: 0


True. After checking, I see some members have linked to their profiles from their own blogs etc., but not a great deal. My issue with using noindex, follow on these pages is that it will still eat into the crawl allowance. Hmm.
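(For reference, the noindex, follow version is just a robots meta tag in the <head> of each member page template, along the lines of:

    <meta name="robots" content="noindex, follow">

and it only takes effect if Googlebot is still allowed to crawl the page, i.e. the URL is not blocked in robots.txt.)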
7:11 pm on Feb 5, 2011 (gmt 0)

Senior Member from US 

WebmasterWorld Senior Member themadscientist is a WebmasterWorld Top Contributor of All Time 5+ Year Member Top Contributors Of The Month

joined:Apr 14, 2008
posts:2910
votes: 62


Uh, if you disallow, you lose all the internal links I would guess you have to the home page and other main category pages ... I'd noindex, but not disallow. Let Google and others see what you have and act accordingly; even if it doesn't send you traffic, IMO it adds 'weight' to your site.

Put up an XML sitemap and set the priority of pages on your site to help manage crawling if you feel the need, but IMO you should not disallow.
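Something along these lines, with the forum/member URLs given a much lower priority than the pages you actually care about (example.com and the profile URL here are just placeholders):

    <?xml version="1.0" encoding="UTF-8"?>
    <urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
      <url>
        <loc>http://www.example.com/</loc>
        <priority>1.0</priority>
      </url>
      <url>
        <loc>http://www.example.com/forum/members/12345</loc>
        <priority>0.1</priority>
      </url>
    </urlset>

Keep in mind that priority is only a hint to crawlers, not a directive, so treat it as a nudge rather than a guarantee.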
7:01 pm on Feb 14, 2011 (gmt 0)

New User

10+ Year Member

joined:Feb 7, 2006
posts: 29
votes: 0


Why not delete all the fake, xrumer-ish user profiles that have never posted and are more than 3 months old? Then the remaining member pages are real member pages (probably only about 25%, at least in my forum, hehe).