FastFiD: Improve Inference Efficiency of Open Domain Question Answering via Sentence Selection

Read original: arXiv:2408.06333 - Published 8/13/2024 by Yufei Huang, Xu Han, Maosong Sun

Overview

FastFiD is a technique for speeding up inference in open-domain question answering (ODQA) models built on the Fusion-in-Decoder (FiD) framework.
It performs sentence selection on the encoded retrieved passages, keeping the valuable sentences and shortening the context the decoder needs to generate an answer.
On Natural Questions, TriviaQA and ASQA, the authors report 2.3X-5.7X faster inference while maintaining the model's performance.

Plain English Explanation

FastFiD is a method that can make open-domain question answering systems more efficient. Open-domain question answering is the task of answering questions by searching through a large collection of text, like Wikipedia.

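Typically, these systems first retrieve a large set of passages (around 100, for instance), encode each one separately, and then generate the answer from all of the encoded passages at once. Because the combined passages are long, generating the answer can be slow. FastFiD addresses this by selecting the most valuable sentences from the retrieved passages and generating the answer from those alone. This lets the system answer questions faster while maintaining accuracy.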

The key idea is to add a sentence selection step after the passages have been encoded, identifying the sentences most likely to help answer the question. The answer generator (the decoder) then only needs to attend to those sentences rather than to every retrieved passage in full. This shortens its context and improves the inference efficiency of the overall system.
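To make the idea concrete, here is a minimal text-level sketch of "keep only the most relevant sentences". It is an illustration only: the word-overlap score, the naive sentence splitter, and the function names are all assumptions made for this example, not the selection method used in the paper (which operates on encoded passages inside the model).

```python
import re


def split_sentences(passage: str) -> list[str]:
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", passage) if s.strip()]


def overlap_score(question: str, sentence: str) -> int:
    """Count distinct question words that also appear in the sentence.

    This lexical score is a stand-in for a learned relevance score.
    """
    q_words = set(re.findall(r"\w+", question.lower()))
    s_words = set(re.findall(r"\w+", sentence.lower()))
    return len(q_words & s_words)


def select_sentences(question: str, passages: list[str], k: int) -> list[str]:
    """Return the k highest-scoring sentences across all retrieved passages."""
    sentences = [s for p in passages for s in split_sentences(p)]
    ranked = sorted(sentences, key=lambda s: overlap_score(question, s), reverse=True)
    return ranked[:k]


# Toy "retrieved passages" (made up for this example).
passages = [
    "Paris is the capital of France. It lies on the Seine.",
    "Berlin is the capital of Germany. The city has many museums.",
]
selected = select_sentences("What is the capital of France?", passages, k=1)
print(selected)  # ['Paris is the capital of France.']
```

With `k=1`, only one of the four sentences is passed on, which is the sense in which the answer generator sees a much shorter context.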

Technical Explanation

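FastFiD works by incorporating sentence selection into the FiD pipeline, which pairs a neural retriever with an encoder-decoder reader. The retriever returns numerous passages (around 100, for instance), the encoder encodes each passage individually, and the decoder predicts the answer from the encoded passages. The selection step is meant to identify which sentences in those passages are valuable for answering the query.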

During inference, sentence selection is applied to the encoded passages. Only the selected sentences are kept as context for the decoder, so the context length required for generating the answer is much shorter than the full set of encoded passages. This leads to faster inference than decoding over every passage in full.
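The selection step can be sketched as follows, under stated assumptions. Each passage is represented by its per-token encoder outputs plus a per-token relevance score; a sentence's score is the mean of its token scores, and the token vectors of the top-k sentences form the shortened decoder context. The `EncodedPassage` structure, the mean pooling, and the global top-k rule are hypothetical choices for illustration, not details confirmed by the paper.

```python
from dataclasses import dataclass


@dataclass
class EncodedPassage:
    token_vectors: list[list[float]]       # one encoder output vector per token
    token_scores: list[float]              # hypothetical per-token relevance score
    sentence_spans: list[tuple[int, int]]  # [start, end) token indices, non-empty


def select_context(passages: list[EncodedPassage], k: int) -> list[list[float]]:
    """Keep the token vectors of the k best-scoring sentences across passages."""
    candidates = []
    for p_idx, passage in enumerate(passages):
        for start, end in passage.sentence_spans:
            # Sentence score = mean of its token scores (an assumed pooling choice).
            score = sum(passage.token_scores[start:end]) / (end - start)
            candidates.append((score, p_idx, start, end))
    # Pick the top-k sentences, then restore their original reading order.
    top = sorted(candidates, key=lambda c: c[0], reverse=True)[:k]
    top.sort(key=lambda c: (c[1], c[2]))
    context = []
    for _, p_idx, start, end in top:
        context.extend(passages[p_idx].token_vectors[start:end])
    return context


# Two tiny made-up passages with 2-dimensional "encoder outputs".
p1 = EncodedPassage(
    token_vectors=[[0.1, 0.2], [0.3, 0.4], [0.5, 0.6], [0.7, 0.8]],
    token_scores=[0.9, 0.8, 0.1, 0.2],
    sentence_spans=[(0, 2), (2, 4)],
)
p2 = EncodedPassage(
    token_vectors=[[1.0, 1.1], [1.2, 1.3], [1.4, 1.5]],
    token_scores=[0.2, 0.3, 0.7],
    sentence_spans=[(0, 2), (2, 3)],
)
full_length = len(p1.token_vectors) + len(p2.token_vectors)
context = select_context([p1, p2], k=2)
print(f"decoder context: {len(context)} of {full_length} token vectors")
```

In this toy case the decoder would attend to 3 token vectors instead of 7. The encoder work is unchanged; the saving comes from the shorter context the decoder has to attend to at every generation step.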

The authors evaluate FastFiD on three commonly used datasets (Natural Questions, TriviaQA and ASQA) and report that it increases inference speed by 2.3X-5.7X while maintaining the model's performance. An analysis of the model's attention further indicates that the selected sentences contribute substantially to the final answer.
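For reference, a speedup factor such as "2.3X" is simply the baseline inference time divided by the optimized inference time. The snippet below shows that arithmetic; the timings in it are invented for illustration and are not measurements from the paper.

```python
def speedup(baseline_seconds: float, optimized_seconds: float) -> float:
    """Return how many times faster the optimized system is than the baseline."""
    if baseline_seconds <= 0 or optimized_seconds <= 0:
        raise ValueError("timings must be positive")
    return baseline_seconds / optimized_seconds


# Hypothetical per-batch timings (illustrative numbers, not reported results).
baseline = 12.0  # seconds with the full context
fast = 4.0       # seconds with the shortened context
factor = speedup(baseline, fast)
print(f"{factor:.1f}X")  # 3.0X
```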

Critical Analysis

The FastFiD approach has some potential limitations. One is that sentence selection is an additional component that has to be trained and tuned, which adds complexity to the overall system.

Additionally, the effectiveness of FastFiD depends on the quality of the sentence selection. If the selection step fails to keep the most relevant sentences, the decoder will not have access to the information needed to answer the query correctly.
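One simple way to check this dependence is to measure how often the selected sentences still contain a gold answer. The sketch below computes that fraction with plain string matching; the record layout (`answers`, `selected_sentences`) and the matching rule are assumptions for this example, not the evaluation protocol used by the authors.

```python
def answer_recall(examples: list[dict]) -> float:
    """Fraction of examples where a selected sentence contains a gold answer."""
    if not examples:
        raise ValueError("examples must not be empty")
    hits = 0
    for ex in examples:
        text = " ".join(ex["selected_sentences"]).lower()
        # Case-insensitive substring match against any acceptable answer.
        if any(ans.lower() in text for ans in ex["answers"]):
            hits += 1
    return hits / len(examples)


# Made-up examples: the second selection dropped the sentence with the answer.
examples = [
    {"answers": ["Paris"], "selected_sentences": ["Paris is the capital of France."]},
    {"answers": ["Berlin"], "selected_sentences": ["The city has many museums."]},
]
recall = answer_recall(examples)
print(recall)  # 0.5
```

A low value here would indicate that the selection step is discarding evidence before the decoder ever sees it, so no amount of decoder quality could recover the answer.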

Further research could explore ways to integrate the sentence selection and question-answering models more tightly, or to dynamically adjust the sentence selection process based on the question-answering model's performance.

Conclusion

FastFiD is a promising approach for improving the inference efficiency of open-domain question answering systems. By generating answers from only the most valuable sentences in the retrieved passages, it provides faster answers (2.3X-5.7X in the reported experiments) while maintaining accuracy.

The key innovation is sentence selection over the encoded passages, which lets the system focus the decoder's computation on the most important parts of the retrieved text. While the approach has some limitations, it represents an important step toward making open-domain question answering more practical for real-world applications.

Related Papers

FastFiD: Improve Inference Efficiency of Open Domain Question Answering via Sentence Selection

Yufei Huang, Xu Han, Maosong Sun

Open Domain Question Answering (ODQA) has been advancing rapidly in recent times, driven by significant developments in dense passage retrieval and pretrained language models. Current models typically incorporate the FiD framework, which is composed by a neural retriever alongside an encoder-decoder neural reader. In the answer generation process, the retriever will retrieve numerous passages (around 100 for instance), each of which is then individually encoded by the encoder. Subsequently, the decoder makes predictions based on these encoded passages. Nevertheless, this framework can be relatively time-consuming, particularly due to the extensive length of the gathered passages. To address this, we introduce FastFiD in this paper, a novel approach that executes sentence selection on the encoded passages. This aids in retaining valuable sentences while reducing the context length required for generating answers. Experiments on three commonly used datasets (Natural Questions, TriviaQA and ASQA) demonstrate that our method can enhance the inference speed by 2.3X-5.7X, while simultaneously maintaining the model's performance. Moreover, an in-depth analysis of the model's attention reveals that the selected sentences indeed hold a substantial contribution towards the final answer. The codes are publicly available at https://github.com/thunlp/FastFiD.

8/13/2024

Multi-Granularity Guided Fusion-in-Decoder

Eunseong Choi, Hyeri Lee, Jongwuk Lee

In Open-domain Question Answering (ODQA), it is essential to discern relevant contexts as evidence and avoid spurious ones among retrieved results. The model architecture that uses concatenated multiple contexts in the decoding phase, i.e., Fusion-in-Decoder, demonstrates promising performance but generates incorrect outputs from seemingly plausible contexts. To address this problem, we propose the Multi-Granularity guided Fusion-in-Decoder (MGFiD), discerning evidence across multiple levels of granularity. Based on multi-task learning, MGFiD harmonizes passage re-ranking with sentence classification. It aggregates evident sentences into an anchor vector that instructs the decoder. Additionally, it improves decoding efficiency by reusing the results of passage re-ranking for passage pruning. Through our experiments, MGFiD outperforms existing models on the Natural Questions (NQ) and TriviaQA (TQA) datasets, highlighting the benefits of its multi-granularity solution.

4/4/2024

Retrieval Augmented Generation for Domain-specific Question Answering

Sanat Sharma, David Seunghyun Yoon, Franck Dernoncourt, Dewang Sultania, Karishma Bagga, Mengjiao Zhang, Trung Bui, Varun Kotte

Question answering (QA) has become an important application in the advanced development of large language models. General pre-trained large language models for question-answering are not trained to properly understand the knowledge or terminology for a specific domain, such as finance, healthcare, education, and customer service for a product. To better cater to domain-specific understanding, we build an in-house question-answering system for Adobe products. We propose a novel framework to compile a large question-answer database and develop the approach for retrieval-aware finetuning of a Large Language model. We showcase that fine-tuning the retriever leads to major improvements in the final generation. Our overall approach reduces hallucinations during generation while keeping in context the latest retrieval information for contextual grounding.

5/30/2024

Improving Health Question Answering with Reliable and Time-Aware Evidence Retrieval

Juraj Vladika, Florian Matthes

In today's digital world, seeking answers to health questions on the Internet is a common practice. However, existing question answering (QA) systems often rely on using pre-selected and annotated evidence documents, thus making them inadequate for addressing novel questions. Our study focuses on the open-domain QA setting, where the key challenge is to first uncover relevant evidence in large knowledge bases. By utilizing the common retrieve-then-read QA pipeline and PubMed as a trustworthy collection of medical research documents, we answer health questions from three diverse datasets. We modify different retrieval settings to observe their influence on the QA pipeline's performance, including the number of retrieved documents, sentence selection process, the publication year of articles, and their number of citations. Our results reveal that cutting down on the amount of retrieved documents and favoring more recent and highly cited documents can improve the final macro F1 score up to 10%. We discuss the results, highlight interesting examples, and outline challenges for future research, like managing evidence disagreement and crafting user-friendly explanations.

4/15/2024