Relevance Filtering for Embedding-based Retrieval

Read original: arXiv:2408.04887 - Published 8/12/2024 by Nicholas Rossi, Juexin Lin, Feng Liu, Zhen Yang, Tony Lee, Alessandro Magnani, Ciya Liao

Relevance Filtering for Embedding-based Retrieval

Overview

This paper introduces a relevance filtering approach for improving the performance of embedding-based retrieval systems.
The method aims to remove irrelevant documents from the initial ranked list produced by an embedding-based retrieval model.
Experiments on several datasets show the proposed approach can significantly boost retrieval effectiveness compared to standard embedding-based retrieval.

Plain English Explanation

Embedding-based retrieval systems use mathematical representations of text, called embeddings, to find documents that are similar to a given query. However, the initial ranked list of retrieved documents may contain many irrelevant results.

The proposed relevance filtering approach addresses this by identifying and removing the least relevant documents from the list. This is done by training a separate model to predict the relevance of each document in the initial ranking. Documents with low predicted relevance scores are then filtered out, leaving a more accurate set of top results.

The key benefit of this method is that it can significantly improve the overall quality and usefulness of the retrieved information, compared to standard embedding-based retrieval alone. By focusing on the most relevant documents, users are more likely to find what they are looking for.

Technical Explanation

The paper outlines a two-stage retrieval process:

An initial embedding-based retrieval model is used to generate a ranked list of documents for a given query. This could be any existing embedding-based system, such as a neural network or a similarity-based method.
A relevance filtering model is then applied to the initial ranked list. This model predicts a relevance score for each document, allowing the least relevant results to be removed. The authors experiment with different approaches for training the relevance filtering model, including using human-labeled relevance judgments or self-supervised learning.

The key innovations of this work are:

Introducing the concept of relevance filtering as a way to improve embedding-based retrieval systems.
Demonstrating that relevance filtering can significantly boost retrieval performance on multiple benchmark datasets.
Exploring different techniques for training the relevance filtering model, including supervised and self-supervised methods.

Critical Analysis

The paper provides a well-designed and thorough evaluation of the relevance filtering approach. The authors acknowledge some limitations, such as the need for a separate relevance filtering model that may add computational overhead.

One potential area for further research could be investigating ways to integrate the relevance filtering directly into the initial embedding-based retrieval model, rather than as a separate post-processing step. This could potentially improve efficiency and allow the two components to be jointly optimized.

Additionally, the authors focus on standard retrieval benchmarks, but it would be interesting to see how the relevance filtering approach performs on real-world applications with diverse user needs and search intents.

Overall, the work presents a compelling and practical solution for enhancing the effectiveness of embedding-based retrieval systems, which are widely used in modern information retrieval and search applications.

Conclusion

This paper introduces a relevance filtering approach that can significantly improve the performance of embedding-based retrieval systems. By identifying and removing the least relevant documents from the initial ranked list, the method produces a more accurate and useful set of search results for users.

The technical innovations and empirical evaluations demonstrate the potential of this approach to advance the state-of-the-art in information retrieval. As embedding-based models become increasingly prevalent, relevance filtering could be an important tool for ensuring these systems deliver high-quality and relevant search experiences.

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Follow @aimodelsfyi on 𝕏 →

Related Papers

Relevance Filtering for Embedding-based Retrieval

Nicholas Rossi, Juexin Lin, Feng Liu, Zhen Yang, Tony Lee, Alessandro Magnani, Ciya Liao

In embedding-based retrieval, Approximate Nearest Neighbor (ANN) search enables efficient retrieval of similar items from large-scale datasets. While maximizing recall of relevant items is usually the goal of retrieval systems, a low precision may lead to a poor search experience. Unlike lexical retrieval, which inherently limits the size of the retrieved set through keyword matching, dense retrieval via ANN search has no natural cutoff. Moreover, the cosine similarity scores of embedding vectors are often optimized via contrastive or ranking losses, which make them difficult to interpret. Consequently, relying on top-K or cosine-similarity cutoff is often insufficient to filter out irrelevant results effectively. This issue is prominent in product search, where the number of relevant products is often small. This paper introduces a novel relevance filtering component (called Cosine Adapter) for embedding-based retrieval to address this challenge. Our approach maps raw cosine similarity scores to interpretable scores using a query-dependent mapping function. We then apply a global threshold on the mapped scores to filter out irrelevant results. We are able to significantly increase the precision of the retrieved set, at the expense of a small loss of recall. The effectiveness of our approach is demonstrated through experiments on both public MS MARCO dataset and internal Walmart product search data. Furthermore, online A/B testing on the Walmart site validates the practical value of our approach in real-world e-commerce settings.

8/12/2024

Efficient Retrieval with Learned Similarities

Bailu Ding, Jiaqi Zhai

Retrieval plays a fundamental role in recommendation systems, search, and natural language processing by efficiently finding relevant items from a large corpus given a query. Dot products have been widely used as the similarity function in such retrieval tasks, thanks to Maximum Inner Product Search (MIPS) that enabled efficient retrieval based on dot products. However, state-of-the-art retrieval algorithms have migrated to learned similarities. Such algorithms vary in form; the queries can be represented with multiple embeddings, complex neural networks can be deployed, the item ids can be decoded directly from queries using beam search, and multiple approaches can be combined in hybrid solutions. Unfortunately, we lack efficient solutions for retrieval in these state-of-the-art setups. Our work investigates techniques for approximate nearest neighbor search with learned similarity functions. We first prove that Mixture-of-Logits (MoL) is a universal approximator, and can express all learned similarity functions. We next propose techniques to retrieve the approximate top K results using MoL with a tight bound. We finally compare our techniques with existing approaches, showing that MoL sets new state-of-the-art results on recommendation retrieval tasks, and our approximate top-k retrieval with learned similarities outperforms baselines by up to two orders of magnitude in latency, while achieving > .99 recall rate of exact algorithms.

8/15/2024

🏷️

Adaptive Retrieval and Scalable Indexing for k-NN Search with Cross-Encoders

Nishant Yadav, Nicholas Monath, Manzil Zaheer, Rob Fergus, Andrew McCallum

Cross-encoder (CE) models which compute similarity by jointly encoding a query-item pair perform better than embedding-based models (dual-encoders) at estimating query-item relevance. Existing approaches perform k-NN search with CE by approximating the CE similarity with a vector embedding space fit either with dual-encoders (DE) or CUR matrix factorization. DE-based retrieve-and-rerank approaches suffer from poor recall on new domains and the retrieval with DE is decoupled from the CE. While CUR-based approaches can be more accurate than the DE-based approach, they require a prohibitively large number of CE calls to compute item embeddings, thus making it impractical for deployment at scale. In this paper, we address these shortcomings with our proposed sparse-matrix factorization based method that efficiently computes latent query and item embeddings to approximate CE scores and performs k-NN search with the approximate CE similarity. We compute item embeddings offline by factorizing a sparse matrix containing query-item CE scores for a set of train queries. Our method produces a high-quality approximation while requiring only a fraction of CE calls as compared to CUR-based methods, and allows for leveraging DE to initialize the embedding space while avoiding compute- and resource-intensive finetuning of DE via distillation. At test time, the item embeddings remain fixed and retrieval occurs over rounds, alternating between a) estimating the test query embedding by minimizing error in approximating CE scores of items retrieved thus far, and b) using the updated test query embedding for retrieving more items. Our k-NN search method improves recall by up to 5% (k=1) and 54% (k=100) over DE-based approaches. Additionally, our indexing approach achieves a speedup of up to 100x over CUR-based and 5x over DE distillation methods, while matching or improving k-NN search recall over baselines.

5/7/2024

Enhancing Relevance of Embedding-based Retrieval at Walmart

Juexin Lin, Sachin Yadav, Feng Liu, Nicholas Rossi, Praveen R. Suram, Satya Chembolu, Prijith Chandran, Hrushikesh Mohapatra, Tony Lee, Alessandro Magnani, Ciya Liao

Embedding-based neural retrieval (EBR) is an effective search retrieval method in product search for tackling the vocabulary gap between customer search queries and products. The initial launch of our EBR system at Walmart yielded significant gains in relevance and add-to-cart rates [1]. However, despite EBR generally retrieving more relevant products for reranking, we have observed numerous instances of relevance degradation. Enhancing retrieval performance is crucial, as it directly influences product reranking and affects the customer shopping experience. Factors contributing to these degradations include false positives/negatives in the training data and the inability to handle query misspellings. To address these issues, we present several approaches to further strengthen the capabilities of our EBR model in terms of retrieval relevance. We introduce a Relevance Reward Model (RRM) based on human relevance feedback. We utilize RRM to remove noise from the training data and distill it into our EBR model through a multi-objective loss. In addition, we present the techniques to increase the performance of our EBR model, such as typo-aware training, and semi-positive generation. The effectiveness of our EBR is demonstrated through offline relevance evaluation, online AB tests, and successful deployments to live production. [1] Alessandro Magnani, Feng Liu, Suthee Chaidaroon, Sachin Yadav, Praveen Reddy Suram, Ajit Puthenputhussery, Sijie Chen, Min Xie, Anirudh Kashi, Tony Lee, et al. 2022. Semantic retrieval at walmart. In Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining. 3495-3503.

8/16/2024