Retrieval-Augmented Generation with Knowledge Graphs for Customer Service Question Answering

arXiv:2404.17723

Published 5/7/2024 by Zhentao Xu, Mark Jerome Cruz, Matthew Guevara, Tie Wang, Manasi Deshpande, Xiaofeng Wang, Zheng Li

cs.IR cs.AI cs.CL cs.LG

Abstract

In customer service technical support, swiftly and accurately retrieving relevant past issues is critical for efficiently resolving customer inquiries. The conventional retrieval methods in retrieval-augmented generation (RAG) for large language models (LLMs) treat a large corpus of past issue tracking tickets as plain text, ignoring the crucial intra-issue structure and inter-issue relations, which limits performance. We introduce a novel customer service question-answering method that amalgamates RAG with a knowledge graph (KG). Our method constructs a KG from historical issues for use in retrieval, retaining the intra-issue structure and inter-issue relations. During the question-answering phase, our method parses consumer queries and retrieves related sub-graphs from the KG to generate answers. This integration of a KG not only improves retrieval accuracy by preserving customer service structure information but also enhances answering quality by mitigating the effects of text segmentation. Empirical assessments on our benchmark datasets, utilizing key retrieval (MRR, Recall@K, NDCG@K) and text generation (BLEU, ROUGE, METEOR) metrics, reveal that our method outperforms the baseline by 77.6% in MRR and by 0.32 in BLEU. Our method has been deployed within LinkedIn's customer service team for approximately six months and has reduced the median per-issue resolution time by 28.6%.

Overview

This paper proposes a novel customer service question-answering method that combines retrieval-augmented generation (RAG) with a knowledge graph (KG) to efficiently and accurately retrieve relevant past issues for resolving customer inquiries.
Conventional RAG methods treat a large corpus of past issue-tracking tickets as plain text, ignoring intra-issue structure and inter-issue relations, which limits retrieval performance.
The proposed method constructs a KG from historical issues to retain this valuable structural information and uses it for improved retrieval and answer generation.

Plain English Explanation

When customers contact a company's technical support, customer service representatives need to quickly find information about similar past issues to help resolve the current problem. Retrieval-augmented generation (RAG) is a technique that allows large language models to retrieve relevant information from a large database to assist with this task.

However, the conventional RAG methods treat the database of past issues as just plain text, ignoring the inherent structure and relationships within and between the issues. This structural information can be very useful for accurately matching the current issue to similar past ones.

The researchers in this paper have developed a new method that builds a knowledge graph (KG) from the historical customer service data. A knowledge graph is a way of representing information as a network of interconnected concepts and their relationships. By using a KG instead of just plain text, the method can better preserve the context and structure of the past issues, leading to more accurate retrieval and better answers for the customer.

During the question-answering process, the method parses the customer's query and retrieves relevant sub-graphs from the KG to generate the response. This integration of a KG not only improves the retrieval accuracy but also enhances the overall quality of the answers provided to the customer.

Technical Explanation

The paper introduces a novel retrieval-augmented generation (RAG) based customer service question-answering method that leverages a knowledge graph (KG) to enhance the retrieval and answer generation process.

Conventional RAG approaches treat the large corpus of past issue tracking tickets as plain text, ignoring the crucial intra-issue structure and inter-issue relations. This limitation can hinder the performance of the system. To address this, the proposed method constructs a KG from the historical customer service data, preserving the inherent structure and relations between issues.
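
To make the idea of intra-issue structure and inter-issue relations concrete, here is a minimal sketch of building such a graph from tickets. It is an illustration only: the `Ticket` class, the section names, and the `has_section` and `cloned_from` relation labels are hypothetical, and the summary does not describe the paper's actual construction pipeline.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Ticket:
    # Hypothetical ticket shape; the real schema is not given in the summary.
    ticket_id: str
    sections: Dict[str, str]  # section name -> section text
    links: List[Tuple[str, str]] = field(default_factory=list)  # (relation, other ticket id)


def build_graph(tickets):
    """Return (nodes, edges): nodes maps node id -> text, edges are (src, relation, dst)."""
    nodes = {}
    edges = []
    for t in tickets:
        nodes[t.ticket_id] = ""  # root node for the ticket
        # Intra-issue structure: each section becomes a child node of its ticket.
        for name, text in t.sections.items():
            section_id = f"{t.ticket_id}/{name}"
            nodes[section_id] = text
            edges.append((t.ticket_id, "has_section", section_id))
        # Inter-issue relations: explicit links between ticket root nodes.
        for relation, other_id in t.links:
            edges.append((t.ticket_id, relation, other_id))
    return nodes, edges


tickets = [
    Ticket(
        "T-1",
        {"summary": "Login fails after password reset", "fix": "Clear the cached session token"},
        [("cloned_from", "T-2")],
    ),
    Ticket("T-2", {"summary": "Login fails on mobile"}),
]
nodes, edges = build_graph(tickets)
```

Keeping sections as separate nodes means a retriever can return a whole ticket with its parts intact rather than an arbitrary text chunk, which is the property the paper attributes to the KG.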

During the question-answering phase, the method first parses the customer's query and then retrieves relevant sub-graphs from the KG. This allows the system to better understand the context and structure of the current issue and match it to similar past issues, leading to more accurate retrieval. The retrieved sub-graphs are then used to generate the final answer for the customer.
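
The retrieval step can be sketched in the same spirit. This toy example swaps in simple token overlap (Jaccard similarity) for query matching, since the summary does not specify how the paper parses queries or scores similarity. The graph contents and the names `GRAPH_SECTIONS`, `GRAPH_LINKS`, and `retrieve_subgraph` are assumptions made for illustration.

```python
import re

# Hypothetical toy graph: ticket id -> section name -> text.
GRAPH_SECTIONS = {
    "T-1": {
        "summary": "login fails after password reset",
        "fix": "clear the cached session token",
    },
    "T-2": {
        "summary": "csv export times out on large reports",
        "fix": "export in smaller date ranges",
    },
}
# Hypothetical inter-issue edges: (source ticket, relation, target ticket).
GRAPH_LINKS = [("T-1", "related_to", "T-2")]


def tokenize(text):
    """Lowercase the text and return its set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a, b):
    """Jaccard similarity of two token sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def retrieve_subgraph(query, top_k=1):
    """Return the best-matching tickets with all their sections and links."""
    q = tokenize(query)
    scored = sorted(
        ((jaccard(q, tokenize(sections["summary"])), tid)
         for tid, sections in GRAPH_SECTIONS.items()),
        reverse=True,
    )
    hits = [tid for score, tid in scored[:top_k] if score > 0]
    return {
        tid: {
            "sections": GRAPH_SECTIONS[tid],
            "links": [e for e in GRAPH_LINKS if tid in (e[0], e[2])],
        }
        for tid in hits
    }


subgraph = retrieve_subgraph("user cannot login after a password reset")
```

The returned sub-graph carries the matched ticket's full section set and its links, which is what would be handed to the LLM as context for answer generation.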

The authors evaluate their method on their own benchmark datasets, using retrieval metrics (mean reciprocal rank (MRR), Recall@K, and normalized discounted cumulative gain (NDCG@K)) and text generation metrics (BLEU, ROUGE, and METEOR). The reported results show the method outperforming the baseline by 77.6% in MRR and by 0.32 in BLEU.
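
For readers unfamiliar with the retrieval metrics named above, the following sketch shows their standard definitions under binary relevance (an item is either relevant or not). The function names and the toy rankings are illustrative and are not taken from the paper's evaluation code.

```python
import math


def reciprocal_rank(ranked, relevant):
    """1 / rank of the first relevant item, or 0.0 if none is retrieved."""
    for rank, item in enumerate(ranked, start=1):
        if item in relevant:
            return 1.0 / rank
    return 0.0


def mean_reciprocal_rank(runs):
    """Average reciprocal rank over (ranked list, relevant set) pairs."""
    return sum(reciprocal_rank(ranked, rel) for ranked, rel in runs) / len(runs)


def recall_at_k(ranked, relevant, k):
    """Fraction of the relevant items that appear in the top k results."""
    if not relevant:
        return 0.0
    return len(set(ranked[:k]) & relevant) / len(relevant)


def ndcg_at_k(ranked, relevant, k):
    """DCG of the top k results divided by the DCG of an ideal ranking."""
    dcg = sum(
        1.0 / math.log2(rank + 1)
        for rank, item in enumerate(ranked[:k], start=1)
        if item in relevant
    )
    ideal = sum(
        1.0 / math.log2(rank + 1)
        for rank in range(1, min(k, len(relevant)) + 1)
    )
    return dcg / ideal if ideal else 0.0


# Two toy queries: the first relevant hit is at rank 2 and at rank 1.
runs = [(["a", "b", "c"], {"b"}), (["x", "y", "z"], {"x"})]
mrr = mean_reciprocal_rank(runs)  # (1/2 + 1/1) / 2 = 0.75
```

MRR rewards placing the first relevant ticket near the top, Recall@K measures coverage of the relevant set within the top K, and NDCG@K discounts relevant hits logarithmically by their position.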

Furthermore, the method has been deployed within LinkedIn's customer service team for approximately six months, reducing the median per-issue resolution time by 28.6%.

Critical Analysis

The paper presents a compelling approach to enhancing retrieval-augmented generation for customer service question-answering by incorporating a knowledge graph. The authors have clearly demonstrated the benefits of preserving the structural information of past issues, which can lead to significant improvements in retrieval accuracy and answer quality.

One potential limitation of the study is the reliance on a specific knowledge graph construction process. While the authors have shown the effectiveness of their approach, it would be interesting to explore the performance of alternative KG construction methods or the integration of other structured data sources, such as issue metadata or user profiles.

Additionally, the paper does not delve into the scalability and computational efficiency of the proposed method, which could be important considerations for real-world deployment in large-scale customer service environments. Further research on the system's performance under varying data volumes and query loads would be valuable.

Another area for potential exploration is the interpretability and explainability of the KG-enhanced question-answering system. Providing customers with insights into how the system arrived at its recommendations could further enhance trust and user satisfaction.

Overall, the paper presents a novel and promising approach to improving retrieval-augmented generation for customer service, and the reported results suggest significant practical benefits. Continued research in this direction, addressing the potential limitations and exploring additional applications, could lead to further advancements in the field.

Conclusion

This paper introduces a novel customer service question-answering method that combines retrieval-augmented generation (RAG) with a knowledge graph (KG) to efficiently and accurately retrieve relevant past issues for resolving customer inquiries.

The key innovation is the construction of a KG from historical customer service data, which allows the method to preserve the crucial intra-issue structure and inter-issue relations that are often lost in conventional RAG approaches. By leveraging this structured knowledge during the question-answering process, the method demonstrates significant improvements in retrieval accuracy and answer quality, as evidenced by the empirical evaluations.

The successful deployment of this method within LinkedIn's customer service team, resulting in a 28.6% reduction in median per-issue resolution time, underscores the practical benefits of this approach. As companies continue to face growing customer service demands, innovations like this that enhance the efficiency and effectiveness of technical support systems could have a substantial impact on customer satisfaction and overall business performance.

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Related Papers

KG-RAG: Bridging the Gap Between Knowledge and Creativity

Diego Sanmartin

Ensuring factual accuracy while maintaining the creative capabilities of Large Language Model Agents (LMAs) poses significant challenges in the development of intelligent agent systems. LMAs face prevalent issues such as information hallucinations, catastrophic forgetting, and limitations in processing long contexts when dealing with knowledge-intensive tasks. This paper introduces a KG-RAG (Knowledge Graph-Retrieval Augmented Generation) pipeline, a novel framework designed to enhance the knowledge capabilities of LMAs by integrating structured Knowledge Graphs (KGs) with the functionalities of LLMs, thereby significantly reducing the reliance on the latent knowledge of LLMs. The KG-RAG pipeline constructs a KG from unstructured text and then performs information retrieval over the newly created graph to perform KGQA (Knowledge Graph Question Answering). The retrieval methodology leverages a novel algorithm called Chain of Explorations (CoE) which benefits from LLMs reasoning to explore nodes and relationships within the KG sequentially. Preliminary experiments on the ComplexWebQuestions dataset demonstrate notable improvements in the reduction of hallucinated content and suggest a promising path toward developing intelligent systems adept at handling knowledge-intensive tasks.

5/21/2024

cs.AI cs.CL cs.IR

Empowering Large Language Models to Set up a Knowledge Retrieval Indexer via Self-Learning

Xun Liang, Simin Niu, Zhiyu Li, Sensen Zhang, Shichao Song, Hanyu Wang, Jiawei Yang, Feiyu Xiong, Bo Tang, Chenyang Xi

Retrieval-Augmented Generation (RAG) offers a cost-effective approach to injecting real-time knowledge into large language models (LLMs). Nevertheless, constructing and validating high-quality knowledge repositories require considerable effort. We propose a pre-retrieval framework named Pseudo-Graph Retrieval-Augmented Generation (PG-RAG), which conceptualizes LLMs as students by providing them with abundant raw reading materials and encouraging them to engage in autonomous reading to record factual information in their own words. The resulting concise, well-organized mental indices are interconnected through common topics or complementary facts to form a pseudo-graph database. During the retrieval phase, PG-RAG mimics the human behavior in flipping through notes, identifying fact paths and subsequently exploring the related contexts. Adhering to the principle of the path taken by many is the best, it integrates highly corroborated fact paths to provide a structured and refined sub-graph assisting LLMs. We validated PG-RAG on three specialized question-answering datasets. In single-document tasks, PG-RAG significantly outperformed the current best baseline, KGP-LLaMA, across all key evaluation metrics, with an average overall performance improvement of 11.6%. Specifically, its BLEU score increased by approximately 14.3%, and the QE-F1 metric improved by 23.7%. In multi-document scenarios, the average metrics of PG-RAG were at least 2.35% higher than the best baseline. Notably, the BLEU score and QE-F1 metric showed stable improvements of around 7.55% and 12.75%, respectively. Our code: https://github.com/IAAR-Shanghai/PGRAG.

5/28/2024

cs.CL cs.IR

G-Retriever: Retrieval-Augmented Generation for Textual Graph Understanding and Question Answering

Xiaoxin He, Yijun Tian, Yifei Sun, Nitesh V. Chawla, Thomas Laurent, Yann LeCun, Xavier Bresson, Bryan Hooi

Given a graph with textual attributes, we enable users to 'chat with their graph': that is, to ask questions about the graph using a conversational interface. In response to a user's questions, our method provides textual replies and highlights the relevant parts of the graph. While existing works integrate large language models (LLMs) and graph neural networks (GNNs) in various ways, they mostly focus on either conventional graph tasks (such as node, edge, and graph classification), or on answering simple graph queries on small or synthetic graphs. In contrast, we develop a flexible question-answering framework targeting real-world textual graphs, applicable to multiple applications including scene graph understanding, common sense reasoning, and knowledge graph reasoning. Toward this goal, we first develop a Graph Question Answering (GraphQA) benchmark with data collected from different tasks. Then, we propose our G-Retriever method, introducing the first retrieval-augmented generation (RAG) approach for general textual graphs, which can be fine-tuned to enhance graph understanding via soft prompting. To resist hallucination and to allow for textual graphs that greatly exceed the LLM's context window size, G-Retriever performs RAG over a graph by formulating this task as a Prize-Collecting Steiner Tree optimization problem. Empirical evaluations show that our method outperforms baselines on textual graph tasks from multiple domains, scales well with larger graph sizes, and mitigates hallucination. Our codes and datasets are available at: https://github.com/XiaoxinHe/G-Retriever

5/28/2024

cs.LG

Enhancing Question Answering for Enterprise Knowledge Bases using Large Language Models

Feihu Jiang, Chuan Qin, Kaichun Yao, Chuyu Fang, Fuzhen Zhuang, Hengshu Zhu, Hui Xiong

Efficient knowledge management plays a pivotal role in augmenting both the operational efficiency and the innovative capacity of businesses and organizations. By indexing knowledge through vectorization, a variety of knowledge retrieval methods have emerged, significantly enhancing the efficacy of knowledge management systems. Recently, the rapid advancements in generative natural language processing technologies paved the way for generating precise and coherent answers after retrieving relevant documents tailored to user queries. However, for enterprise knowledge bases, assembling extensive training data from scratch for knowledge retrieval and generation is a formidable challenge due to the privacy and security policies of private data, frequently entailing substantial costs. To address the challenge above, in this paper, we propose EKRG, a novel Retrieval-Generation framework based on large language models (LLMs), expertly designed to enable question-answering for Enterprise Knowledge bases with limited annotation costs. Specifically, for the retrieval process, we first introduce an instruction-tuning method using an LLM to generate sufficient document-question pairs for training a knowledge retriever. This method, through carefully designed instructions, efficiently generates diverse questions for enterprise knowledge bases, encompassing both fact-oriented and solution-oriented knowledge. Additionally, we develop a relevance-aware teacher-student learning strategy to further enhance the efficiency of the training process. For the generation process, we propose a novel chain of thought (CoT) based fine-tuning method to empower the LLM-based generator to adeptly respond to user questions using retrieved documents. Finally, extensive experiments on real-world datasets have demonstrated the effectiveness of our proposed framework.

4/23/2024

cs.CL cs.AI cs.IR