Since you are planning to "scan" rather than merely typing a sentence or two as "fair-use" quotes for commentary, your term "quote" does not apply. By all normal copyright laws, you are apparently planning theft. Otherwise, why worry about Google?
Yes, you will get nailed. Stealing content written by others, just to feed it to GoogleBot, should never be rewarded.
The fact that it is not in Google Books is likely to have no impact. Google is likely to have many more books loaded in their databases than the ones showing publicly on their Books site. The publicly viewable Google Books is merely what they have deemed legally "publishable" by them and have flipped the switch for at any given time. You can be sure they will recognize others. OCR'ing books is not a new scam. Google knows how to protect itself from that.
If content is not under copyright, and is of any value, you can be pretty sure that Google has already loaded it. So, probably, have a thousand other funky websites.
Any current material is typically automatically copyrighted for the life of the author + 70 years. And if the author has put it into the public domain (as in given up his/her rights entirely, not merely a GPL), it is likely to exist in many, many copies on the net already, in a big batch down on page 950+ in Google search.
Anything else is ripe for a DMCA request if it suddenly shows up on the web, causing the violating site to be immediately eliminated from Google search, and as such eliminating the whole idea of using it to attract GoogleBot to begin with.
All that authors (whether book- or web-based) have to do is set up a few Google Alerts for key phrases from their content, so that Google automatically alerts them when their content suddenly, magically shows up somewhere they did not expect. I have quite a few such alerts set up. On receipt of an alert, a few clicks on Google's DMCA complaint page are enough to get the offending site (or parts of it) banned from Google search. At first the site's results still show up (sort of), but any attempt by users to use them simply jumps off to the DMCA complaint against the site. After the DMCA complaint has been determined, the site's results are killed off. Google cannot afford (politically speaking) to be in violation of copyright laws. They already tried going down that path with the initial version of the Google Books site. And they are big enough now to be under watchful political eyes in multiple countries. The last thing Google would want is to go the way of old Ma Bell because they get deemed too powerful.
Not to mention what else might follow if the copyright owner decides to follow up further, after the initial DMCA complaint. (Unless as you say it might be content specifically in the public domain.)
I read this as "some text" which, IMO, should fall under Fair Use (US) or Fair Dealing (UK, Commonwealth). If properly attributed, there's no problem in doing this. That said, Fair Use covers SHORT quotes for the purpose of illustration or scholarly report, with emphasis on SHORT. What that number of words might be is rather fluid, though if the work is already short and 90% is "quoted", then Fair Use has been exceeded. Additionally, if these are true IMAGE scans of a printed page, not OCR'd to text and then to HTML, etc., that would also seem okay under Fair Use. Example:
Scan of an early 18th century chapbook page showing typesetting and illustration that is included in an article regarding the printing industry.
Thank you for the interesting comments... looks as though there is quite a bit of legislation to swot up on, although it might differ country to country...? Not sure how Google manages that for, say, a UK author... who has not yet published their work on the internet. Which I guess would mean you are OK, that is, until they complain and take you down. But by then, hopefully one would have accrued enough of a critical mass of users to self-perpetuate enough unique content so one does not have to rely on all the "quoted material".
I saw another guy mention in another thread that he was mercilessly scanning thousands of pages of book content. Which is not good. But... if I am brutally honest, I've done it once before on a much smaller scale just to give me a kick start on the ladder, and eventually it all transformed into unique content. I was blazing into the sunset before getting stung.
DeeCee... for every maverick there is a 'steady eddy'... As you can see from the thrust of this thread, it's a fine line. I'd try not to generalize negatively without context or without understanding the degree involved, and, as we mentioned, crediting the author is important. It also depends on your strategy and personal style of doing business. Zuckerberg is a prime case study in skirting around the fringes. So long as everyone as a whole is winning, that is the main goal.
Back on track with the original conversation... let's not discount Bing and Yahoo...