Google has said it has reached agreements that allow it to enter as many as 60,000 titles in its database and also presented extensive mock-ups to publishers of how book-relevant searches will look.
There’s one problem, however. The programming algorithms that rank search results involve linguistic frequency as well as popularity statistics. OCLC bibliographic records, by Google spidering standards, are very thin. At this point, Google had nothing to say on how they will handle the OCLC records to ensure a “page one” level of visibility to searchers that corresponds to the quality of the material.
Quoted from the article you linked, rcjordan.
I wonder how they'll get around the lack of popularity of such records - keeping in mind that Google's ranking concept is in large part based on document link structures.
We've seen it before in the pre-Google age, when they all went portal-empire building before perfecting their own patch.
I think G is far more robust than its predecessors, but it still makes you wonder.
No, I think more interesting are technologies like Copernic Summarizer, which summarize the actual texts for you. Or maybe "auto-generate Cliff's Notes" applications. ;)
If it's a technical question, I'm more likely to hit newsgroups/blogs/message boards.
There's the rub. Google wants you hitting Google for that. :)
Not to get too far ahead of the game, but if we're going to start indexing the text of books to extract the 'information' from them for the purpose (presumably) of helping users find the information they seek ... what's next?
Web pages were obviously the first floor of this house, and images, PDFs, Word documents, etc., were added. The news search covers newspapers, magazines, etc. Blogs have been added. Now we're adding books to the mix. What other media will follow? Will the technology be developed to index the contents of audio and video files? Think of the information to be had in interview transcripts, etc.
Very interesting development. I gotta hand it to Google for not messing around here and realizing that the Amazon/Alexa dynamic duo represents a major threat. In many ways, that threat is greater than Microsoft or IBM.
Can you elaborate, Brett? Why do you think they pose one of the biggest threats to Google?
Even more so if they add "out of print" and "out of copyright" books to it.
It could also give some interesting results for searches on the origin of, e.g., quotations, if they add publication dates and date search options.
It could create a whole new rage: an eBay-ish second-hand book AdWords marketplace.
I guess some authors who have been "borrowing" others' content a little too freely will sweat it out though.
I wonder if publishers will allow full indexing of reference books. A snippet of 6 words may be enough to trigger interest to buy, without letting loose the critical content for nothing.