Google: Rethinking Search: Making Experts out of Dilettantes
This paper envisions a unified model-based approach to building IR systems that eliminates the need for indexes as we know them today by encoding all of the knowledge for a given corpus in a model that can be used for a wide range of tasks. As the remainder of this paper shows, once everything is viewed through a model-centric lens instead of an index-centric one, many new and interesting opportunities emerge to significantly advance IR systems. If successful, IR models that synthesize elements of classical IR systems and modern large-scale NLP models have the potential to yield a transformational shift in thinking and a significant leap in capabilities across a wide range of IR tasks, such as document retrieval, question answering, summarization, classification, recommendation, etc.
If all of these research ambitions were to come to fruition, the resulting system would be a very early version of the system that we envisioned in the introduction. That is, the resulting system would be able to provide expert answers to a wide range of information needs in a way that neither modern IR systems, question answering systems, nor pre-trained LMs can do today.
Some of the key benefits of the model-based IR paradigm described herein include:
• It abstracts away the long-lived, and possibly unnecessary, distinction between “retrieval” and “scoring”.
• It results in a unified model that encodes all of the knowledge contained in a corpus, eliminating the need for traditional indexes.
• It allows for dozens of new tasks to easily be handled by the model, either via multi-task learning or via few-shot learning, with minimal amounts of labelled training data.
• It allows seamless integration of multiple modalities and languages within a unified model.
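For anyone unsure what the "retrieval" versus "scoring" distinction in the first bullet refers to, here is a rough sketch of the classical index-centric pipeline the paper wants to move away from. It is not taken from the paper: the toy corpus, the function names (build_index, retrieve, score, search) and the term-count scoring are all assumptions made up for illustration. Real systems typically use something like BM25 or a learned ranker for the second stage.

```python
from collections import defaultdict

def build_index(docs):
    """Build a toy inverted index: term -> set of ids of docs containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return index

def retrieve(index, query):
    """Stage 1 (retrieval): collect every doc sharing at least one query term."""
    candidates = set()
    for term in query.lower().split():
        candidates |= index.get(term, set())
    return candidates

def score(docs, doc_id, query):
    """Stage 2 (scoring): count query-term occurrences (illustrative only)."""
    words = docs[doc_id].lower().split()
    return sum(words.count(term) for term in query.lower().split())

def search(docs, index, query):
    """Run both stages: retrieve candidates, then rank by score (ties by id)."""
    candidates = retrieve(index, query)
    return sorted(candidates, key=lambda d: (-score(docs, d, query), d))

# Hypothetical three-document corpus, invented for this example.
docs = {
    "d1": "neural models for document retrieval",
    "d2": "classical inverted index retrieval and index compression",
    "d3": "cooking with seasonal vegetables",
}
index = build_index(docs)
results = search(docs, index, "index retrieval")
print(results)
```

The point of the sketch is only that the index lookup and the ranking function are two separate components here. In the model-based paradigm the quoted text describes, a single model would take over both roles.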