1. What HTTP header should I return if not 200?
2. Would allowing Googlebot to crawl be ok? I don't think that's cloaking and I see many sites doing this. Or will it penalize me somehow?
The reason it shows up as a soft 404 is that you have a large number of URLs returning essentially the same content: the page template, with only the thread title varying between pages.
I would allow Googlebot to crawl (and follow links) and provide a noindex with the response, either as a robots meta tag in the page or as an X-Robots-Tag HTTP header.
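For anyone wondering what "a noindex with the response" could look like in practice, here is a minimal sketch using a plain WSGI app. The function name `thin_page_app` and the page body are made up for illustration, and it assumes you send the directive as an `X-Robots-Tag` header rather than as a meta tag. Your own stack will differ.

```python
def thin_page_app(environ, start_response):
    """Hypothetical WSGI app for a thin thread page.

    The page is still served with 200 so crawlers can fetch it and
    follow its links, but the X-Robots-Tag header asks search engines
    not to index it.
    """
    # Placeholder body standing in for the real page template.
    body = b"<html><head><title>Thread</title></head><body>...</body></html>"
    headers = [
        ("Content-Type", "text/html; charset=utf-8"),
        ("Content-Length", str(len(body))),
        # The noindex directive travels with the response itself.
        ("X-Robots-Tag", "noindex"),
    ]
    start_response("200 OK", headers)
    return [body]
```

To try it locally you could pass it to `wsgiref.simple_server.make_server("", 8000, thin_page_app)` and inspect the response headers with your browser's developer tools.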
I recently started serving 404 responses
it also responds with an "Invalid Product id" type message rather than a 404.
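A rough sketch of the alternative, returning a real 404 status together with the "Invalid Product id" message instead of a 200. Everything here is assumed for illustration: the `CATALOG` dict, the `id` query parameter, and the `product_app` name are not from anyone's actual site.

```python
from urllib.parse import parse_qs

# Hypothetical catalogue; a real site would look this up in its database.
CATALOG = {"1001": "Blue widget"}


def product_app(environ, start_response):
    """Hypothetical WSGI app: 200 for known products, 404 otherwise."""
    query = parse_qs(environ.get("QUERY_STRING", ""))
    product_id = query.get("id", [""])[0]
    name = CATALOG.get(product_id)

    if name is None:
        # Keep the human-readable message, but pair it with a 404 status
        # so crawlers do not treat the page as a normal 200 document.
        status = "404 Not Found"
        body = b"Invalid product id"
    else:
        status = "200 OK"
        body = name.encode("utf-8")

    headers = [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ]
    start_response(status, headers)
    return [body]
```

The visitor sees the same message either way; only the status line changes, and that status is what a crawler uses to tell a missing page from a real one.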
I also have one soft 404. If I understand the message correctly, it is apparently caused by a strange link from someone else.