Efficient User Interaction for High-Recall Retrieval: Model Priming
Date
2025-08-11
Authors
Advisor
Smucker, Mark
Publisher
University of Waterloo
Abstract
High Recall Information Retrieval (HRIR) tasks, including legal e-discovery, information retrieval test collection construction, and systematic review, require finding all, or nearly all, relevant documents with the least amount of effort. Past research has shown that Technology Assisted Review (TAR) generally outperforms traditional e-discovery tools, and has established that Continuous Active Learning (CAL) performs better than other commonly used TAR tools such as Simple Active Learning and Simple Passive Learning. Prior research has also shown that adding search to a CAL-based HRIR tool can slow users down, and that restricting access to full documents can speed up the document review process. Our goal was to design a system that gives users more autonomy without affecting performance. Specifically, we wanted to investigate ways in which search can speed up the document review process. Systems like CAL often go through an initial training phase. We hypothesized that this training phase can be significantly shortened if search is used to seed the model. We also created a novel interface that combines search with CAL on a single page. To test our hypothesis and the new user interface, we conducted a user study with 40 participants to investigate five different configurations of an information retrieval system. We found that the addition of search, when preceded by proper user training, can significantly improve precision, performance, user experience, and perceived effectiveness of the system. We also found that the newly designed interface, "Integrated CAL", performs comparably to the traditional interface while providing a more familiar search-based interface for users to interact with. Our findings reinforce the importance of hybrid High Recall Information Retrieval systems that are built on both search and CAL and that provide maximum control to users.
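The idea of seeding a CAL model with search results can be illustrated with a minimal sketch. This is not the system evaluated in the thesis: the keyword-overlap `search`, the term-count `train` model, the `cal_review` loop, and the toy documents are all hypothetical stand-ins chosen only to show how judgments on search hits can serve as the model's first training examples before the usual judge-and-retrain loop continues.

```python
from collections import Counter


def tokenize(text):
    return text.lower().split()


def search(query, docs, k):
    """Toy keyword search: rank documents by number of shared query terms."""
    q = set(tokenize(query))
    ranked = sorted(docs, key=lambda d: -len(q & set(tokenize(docs[d]))))
    return ranked[:k]


def train(judged, docs):
    """Toy model (an assumption): +1 per term in relevant docs, -1 otherwise."""
    weights = Counter()
    for doc_id, relevant in judged.items():
        for term in set(tokenize(docs[doc_id])):
            weights[term] += 1 if relevant else -1
    return weights


def score(text, weights):
    return sum(weights[term] for term in set(tokenize(text)))


def cal_review(docs, judge, seed_ids, budget):
    """Judge the seed documents, then repeatedly judge the top-scored
    unjudged document and retrain, until the judgment budget is spent."""
    judged = {d: judge(d) for d in seed_ids}
    order = list(seed_ids)
    while len(judged) < min(budget, len(docs)):
        weights = train(judged, docs)
        unjudged = [d for d in docs if d not in judged]
        best = max(unjudged, key=lambda d: score(docs[d], weights))
        judged[best] = judge(best)
        order.append(best)
    return order, judged


# Hypothetical six-document collection; d1, d3, and d5 are relevant.
docs = {
    "d1": "contract breach damages claim",
    "d2": "lunch menu friday",
    "d3": "breach of contract settlement",
    "d4": "office party photos",
    "d5": "damages settlement contract terms",
    "d6": "parking lot notice",
}
relevant = {"d1", "d3", "d5"}

seed = search("contract breach", docs, k=2)  # search hits seed the model
order, judged = cal_review(docs, lambda d: d in relevant, seed, budget=3)
print(order)  # ['d1', 'd3', 'd5']
```

In this toy run, the two search hits are judged first, and the model trained on them ranks the remaining relevant document at the top of the unjudged pool. A real system would use a proper retrieval function and a trained classifier in place of these stand-ins.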
Keywords
High Recall Information Retrieval, HRIR, Information Retrieval, User study, Continuous Active Learning, CAL