This book surveys recent advances in Conversational Information Retrieval (CIR), focusing on neural approaches that have been developed in the last few years. Progress in deep learning has brought tremendous improvements in natural language processing (NLP) and conversational AI, leading to a plethora of commercial conversational services that allow naturally spoken and typed interaction, increasing the need for more human-centric interactions in IR.
The book contains nine chapters. Chapter 1 motivates CIR research by reviewing studies on how people search, and then defines a CIR system and a reference architecture that is described in detail in the rest of the book. Chapter 2 provides a detailed discussion of techniques for evaluating a CIR system, a goal-oriented conversational AI system with a human in the loop. Chapters 3 to 7 then describe the algorithms and methods for developing the main CIR modules (or sub-systems). Chapter 3 discusses conversational document search, which can be viewed as a sub-system of the CIR system. Chapter 4 covers algorithms and methods for query-focused multi-document summarization. Chapter 5 describes various neural models for conversational machine comprehension, which generate a direct answer to a user query based on retrieved query-relevant documents, while Chapter 6 details neural approaches to conversational question answering over knowledge bases, which is fundamental to the knowledge base search module of a CIR system. Chapter 7 elaborates on various techniques and models that aim to equip a CIR system with the capability of proactively leading a human-machine conversation. Chapter 8 reviews a variety of commercial systems for CIR and related tasks. It first presents an overview of research platforms and toolkits that enable scientists and practitioners to build conversational experiences, and continues with historical highlights and recent trends in a range of application areas. Chapter 9 concludes the book with a brief discussion of research trends and areas for future work.
The primary target audience of the book is the IR and NLP research communities. However, readers with other backgrounds, such as machine learning or human-computer interaction, will also find it an accessible introduction to CIR.
About the Authors: Jianfeng Gao is a Distinguished Scientist and Vice President of Microsoft. He is the head of the Deep Learning group at Microsoft Research, leading the development of AI systems for natural language processing, Web search, vision language understanding, dialogue, and business applications. He is an affiliate professor of Computer Science & Engineering at the University of Washington, an IEEE fellow, and a Distinguished Member of ACM.
Chenyan Xiong is a Principal Researcher at Microsoft Research at Redmond. His research lies at the intersection of information retrieval, natural language processing, and deep learning. Chenyan is a co-founder of the TREC Conversational Assistance Track (CAsT) and has developed a series of neural approaches for conversational systems with award-winning publications, covering various aspects of conversational IR systems, including conversational search, system initiatives, dialog modeling, and few-shot learning.
Paul Bennett is a Partner Research Manager for the Productivity+Intelligence area in Microsoft Research. His published research has focused on a variety of topics surrounding the use of machine learning in information retrieval, including deep learning for ranking and retrieval, ensemble methods and the combination of information sources, calibration, consensus methods for noisy supervision labels, active learning and evaluation, supervised classification and ranking, crowdsourcing, behavioral modeling and analysis, and personalization. Some of his work has been recognized with awards at SIGIR, CHI, ECIR, and ACM UMAP.
Nick Craswell is a Principal Architect developing Search and related functionality in Microsoft Teams, Outlook, and SharePoint. His research is on the evaluation and optimization of information retrieval systems, particularly in Web search, and more recently on conversational interfaces and deep learning.