Are you writing a review about the latest Acme Widget to put on your non-commercial Widgetry Fans site? That would generally be fine.
It really depends on what the content is, where you got it from, and why you are using it on your non-commercial site. Everything written is protected by copyright, regardless of whether there is a copyright notice or not, unless it is public domain or specifically says it is available for reprint/republishing.
Keep in mind that some companies don't like having things written about them or their material reproduced, even if it is just a user manual, particularly if the manuals are offered for sale. That is something else to consider. You will have to make it clear you are not associated with whatever search engine you are writing about, and even then, it can be a very grey area. In cases like these, talking with a lawyer is always a good idea.
The problem is holding the PDFs locally. If I came along to your search engine and found that a PDF file I had created was available directly from your search engine, I'd send you a cease and desist letter pretty quickly, because I never gave permission for you to have it there. It is not really any different from you going to any of my webpages, copying the source code, and putting it intact on your own domain. That is still considered copyright infringement.
Google's indexing of PDF files is still fairly new, but if they apply the duplicate content filter to PDF files, you could cause major issues for the owners of the PDFs.
To hold them locally, you would need to get permission from each author to do so.
If you were to link to external files, however, that should be fine.
The key is really whether you have permission to hold the PDF on your own server. If you don't have permission, that is asking for trouble.
You could possibly keep them locally for indexing purposes, and then return links to the original remote files to the user. Or if you plan on showing the relevant content from the PDF in the results, you could excerpt it from your local copy, but also link to the original if they want to see the result in context (instead of linking to your local copy). I don't think there would be anything wrong with that (I'm not a lawyer though!). But if you did it that way, you'd have to make sure, as Jenstar said, that Google didn't index your local copies (e.g., keep them in a separate directory that is disallowed by robots.txt).
Jordan
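For anyone who wants to try the approach above, here is a rough sketch of the two technical pieces: a robots.txt rule that keeps crawlers out of the directory holding the local copies, and a search result that links to the original remote file instead of the cached one. The directory name /pdf-cache/, the example.com URLs, and the helper names are all made up for illustration, and this only covers the crawler side of things, not the permission question.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for the search site. "/pdf-cache/" is an assumed
# name for the directory where local copies are kept for indexing only.
ROBOTS_TXT = """\
User-agent: *
Disallow: /pdf-cache/
"""


def is_crawlable(robots_txt: str, url: str, user_agent: str = "Googlebot") -> bool:
    """Return True if the given robots.txt lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


def build_result(original_url: str, excerpt: str) -> dict:
    """Build a search result that points at the original remote PDF.

    The excerpt may come from the local copy, but the link never does.
    """
    return {"link": original_url, "excerpt": excerpt}


if __name__ == "__main__":
    # The cached copy should be off limits to crawlers; normal pages should not.
    print(is_crawlable(ROBOTS_TXT, "https://example.com/pdf-cache/manual.pdf"))
    print(is_crawlable(ROBOTS_TXT, "https://example.com/search?q=widget"))
    print(build_result("https://vendor.example/manual.pdf", "...relevant text..."))
```

Keep in mind that robots.txt is only a request that well-behaved crawlers honor; it does not stop a person from opening the cached file if they know the URL, so the cache directory should probably not be publicly reachable at all.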