If content is not under copyright and is of any value, you can be pretty sure that Google has already indexed it. So, most likely, have a thousand other funky websites.
Any current material is typically copyrighted automatically for the life of the author plus 70 years. And if the author has put it into the public domain (as in given up his/her rights entirely, not merely released it under something like the GPL), it likely already exists in many, many copies on the net, in a big batch down on page 950+ of the Google search results.
Anything else is ripe for a DMCA request if it suddenly shows up on the web, causing the violating site to be eliminated from Google search, which defeats the whole idea of using the content to attract GoogleBot in the first place.
All authors (whether book- or web-based) have to do is set up a few Google Alerts for key phrases from their content, and Google will automatically alert them when that content magically shows up somewhere they did not expect. I have quite a few such alerts set up. On receipt of an alert, a few clicks on Google's DMCA complaint page follow, and the offending site (or parts of it) gets banned from Google search. At first the site's results still show up (sort of), but any attempt by users to click through simply redirects to the DMCA complaint against the site. Once the DMCA claim has been determined, the site's results are killed off entirely. Google cannot afford (politically speaking) to be in violation of copyright laws. They already tried going down that path with the initial version of the Google Books site. And they are big enough now to be under watchful political eyes in multiple countries. The last thing Google would want is to go the way of old Ma Bell by being deemed too powerful.
Not to mention what else might follow if the copyright owner decides to pursue the matter further after the initial DMCA complaint. (Unless, as you say, it is content specifically in the public domain.)