CRE-LLM: A Domain-Specific Chinese Relation Extraction Framework with Fine-tuned Large Language Model

2404.18085

Published 4/30/2024 by Zhengpeng Shi, Haoran Luo


Abstract

Domain-Specific Chinese Relation Extraction (DSCRE) aims to extract relations between entities from domain-specific Chinese text. Despite the rapid development of pre-trained language models (PLMs) in recent years, especially large language models (LLMs), DSCRE still faces three core challenges: complex network structure design, poor awareness, and the high cost of fine-tuning. Given the impressive performance of LLMs in natural language processing, we propose a new framework called CRE-LLM. This framework is based on fine-tuning open-source LLMs, such as Llama-2, ChatGLM2, and Baichuan2. CRE-LLM enhances the logic-awareness and generative capabilities of the model by constructing an appropriate prompt and utilizing open-source LLMs for instruction-supervised fine-tuning. It then directly extracts the relations between the given entities in the input text, which improves on existing CRE approaches. To demonstrate the effectiveness of the proposed framework, we conducted extensive experiments on two domain-specific CRE datasets, FinRE and SanWen. The experimental results show that CRE-LLM is significantly superior and robust, achieving state-of-the-art (SOTA) performance on the FinRE dataset. This paper introduces a novel approach to semantically more complex DSCRE tasks by combining LLMs with triples. Our code is publicly available.


Overview

The paper introduces a new framework called CRE-LLM for Domain-Specific Chinese Relation Extraction (DSCRE) tasks.
DSCRE aims to extract relationships between entities from domain-specific Chinese text, but faces challenges such as complex network design, poor semantic awareness, and high fine-tuning costs.
CRE-LLM leverages the impressive performance of large language models (LLMs) like Llama-2, ChatGLM2, and Baichuan2 to enhance logic-awareness and generative capabilities for improved relation extraction.
The framework was evaluated on two domain-specific CRE datasets, FinRE and SanWen, achieving state-of-the-art (SOTA) performance on FinRE.

Plain English Explanation

The paper presents a new approach to extracting relationships between entities from specialized Chinese text. Existing methods for this task, called Domain-Specific Chinese Relation Extraction (DSCRE), have faced some challenges. These include designing complex neural network architectures, lack of deep understanding of the text, and high costs for fine-tuning the models.

To address these issues, the researchers developed a framework called CRE-LLM that builds on the impressive capabilities of large language models (LLMs) like Llama-2, ChatGLM2, and Baichuan2. These powerful models can understand language context and generate relevant text.

The key idea is to fine-tune these LLMs using carefully crafted prompts that teach the models to better understand the logical relationships between entities in the text. This allows the models to directly extract the relevant relationships, improving on previous DSCRE approaches.

The researchers tested their CRE-LLM framework on two specialized Chinese text datasets, FinRE and SanWen. It performed strongly and robustly on both, and achieved state-of-the-art results on FinRE. This demonstrates the potential of using advanced language models to tackle the challenges of extracting meaning from domain-specific Chinese text.

Technical Explanation

The paper proposes CRE-LLM, a framework built on fine-tuned large language models, to address the challenges in DSCRE tasks. These challenges include complex network architecture design, poor semantic awareness, and high fine-tuning costs.

CRE-LLM leverages the impressive performance of large language models (LLMs) like Llama-2, ChatGLM2, and Baichuan2 to enhance the model's logic-awareness and generative capabilities. The framework fine-tunes these open-source LLMs using carefully designed prompts, enabling the model to directly extract relations between entities in the input text.
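The prompt-based fine-tuning described above can be sketched as follows. This is a minimal illustration, not the authors' prompt: the relation list, the example sentence, the instruction wording, and the helper names `build_example` and `parse_relation` are all assumptions made here. The instruction/input/output record layout is a common convention for instruction-tuning data, and the summary does not confirm that the paper uses it.

```python
import json

# Hypothetical label set for illustration only; the real FinRE and SanWen
# relation inventories are not listed in this summary.
RELATIONS = ["收购", "合作", "竞争", "unknown"]


def build_example(text, head, tail, relation=None):
    """Build one instruction-tuning record for a (sentence, entity pair) input.

    When `relation` is given, the record carries a supervised target and can be
    used for training; otherwise it serves as an inference-time prompt.
    """
    instruction = (
        "Given the sentence and two entities, answer with exactly one relation "
        "from this list: " + ", ".join(RELATIONS)
    )
    record = {
        "instruction": instruction,
        "input": f"Sentence: {text}\nHead entity: {head}\nTail entity: {tail}",
    }
    if relation is not None:
        record["output"] = relation
    return record


def parse_relation(generated):
    """Map free-form generated text back to a known label, else 'unknown'."""
    cleaned = generated.strip()
    for label in RELATIONS:
        if label != "unknown" and label in cleaned:
            return label
    return "unknown"


# Made-up sentence, not taken from either dataset.
example = build_example("甲公司宣布收购乙公司。", "甲公司", "乙公司", "收购")
print(json.dumps(example, ensure_ascii=False, indent=2))
```

Because the model generates the relation as text, a small parsing step like `parse_relation` is one way to turn its output back into a label from the fixed set before scoring.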

The researchers conducted extensive experiments on two domain-specific CRE datasets, FinRE and SanWen. The results show that CRE-LLM performs strongly and robustly on both, and reaches SOTA performance on the FinRE dataset. This demonstrates the effectiveness of the proposed framework in tackling the challenges of DSCRE tasks.
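This summary does not state which metric the experiments use. One common way to score relation extraction is micro-averaged F1 over predicted labels, sketched below as an assumption rather than as the paper's evaluation protocol. The function name `micro_f1`, the `unknown` no-relation label, and the toy labels are hypothetical.

```python
def micro_f1(gold, pred, ignore_label="unknown"):
    """Micro-averaged F1 over relation labels, skipping the no-relation label.

    `gold` and `pred` are equal-length lists of label strings, one per example.
    """
    if len(gold) != len(pred):
        raise ValueError("gold and pred must have the same length")

    # A true positive is a correct prediction of a real (non-ignored) relation.
    true_pos = sum(1 for g, p in zip(gold, pred) if g == p and p != ignore_label)
    pred_pos = sum(1 for p in pred if p != ignore_label)
    gold_pos = sum(1 for g in gold if g != ignore_label)

    precision = true_pos / pred_pos if pred_pos else 0.0
    recall = true_pos / gold_pos if gold_pos else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Toy labels for illustration only.
gold = ["a", "b", "unknown", "a"]
pred = ["a", "unknown", "a", "a"]
score = micro_f1(gold, pred)
print(round(score, 4))
```

In this toy case two of the three predicted relations are correct and two of the three gold relations are recovered, so precision and recall are both 2/3.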

Critical Analysis

The paper presents a promising approach to DSCRE tasks, leveraging the power of large language models to overcome some of the key challenges. However, there are a few potential limitations and areas for further research:

The authors only evaluated the framework on two domain-specific datasets, FinRE and SanWen. It would be valuable to test the generalizability of CRE-LLM on a wider range of DSCRE tasks and datasets to better understand its broader applicability.
The paper does not provide detailed insights into the specific prompt engineering techniques used to fine-tune the LLMs. Sharing more information about the prompt design process could help other researchers and practitioners replicate and build upon the proposed approach.
While the CRE-LLM framework achieves state-of-the-art performance, the paper does not discuss the computational cost or inference time of the model. Evaluating the efficiency and scalability of the approach would be important for real-world deployment.
The paper could be strengthened by a more in-depth comparison to other DSCRE methods, highlighting the unique advantages and potential limitations of the CRE-LLM approach.

Overall, the CRE-LLM framework represents an exciting step forward in leveraging large language models for domain-specific relation extraction tasks. Further research and empirical evaluation could help refine and expand the approach to unlock its full potential.

Conclusion

This paper introduces a novel framework called CRE-LLM for Domain-Specific Chinese Relation Extraction (DSCRE) tasks. By fine-tuning open-source large language models like Llama-2, ChatGLM2, and Baichuan2 using carefully designed prompts, the CRE-LLM framework is able to enhance the models' logic-awareness and generative capabilities for improved relation extraction.

The extensive experiments conducted on the FinRE and SanWen datasets demonstrate the effectiveness and robustness of the CRE-LLM approach, which achieves state-of-the-art performance on FinRE. This research showcases the potential of leveraging large language models to tackle the complex challenges in DSCRE tasks, opening up new avenues for further advancements in this field.

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Related Papers

Empirical Analysis of Dialogue Relation Extraction with Large Language Models

Guozheng Li, Zijie Xu, Ziyu Shang, Jiajun Liu, Ke Ji, Yikai Guo

Dialogue relation extraction (DRE) aims to extract relations between two arguments within a dialogue, which is more challenging than standard RE due to the higher frequency of personal pronouns and lower information density in dialogues. However, existing DRE methods still suffer from two serious issues: (1) difficulty capturing long and sparse multi-turn information, and (2) difficulty extracting golden relations from partial dialogues, which motivates us to discover more effective methods that can alleviate these issues. We notice that the rise of large language models (LLMs) has sparked considerable interest in evaluating their performance across diverse tasks. To this end, we initially investigate the capabilities of different LLMs in DRE, considering both proprietary models and open-source models. Interestingly, we discover that LLMs significantly alleviate the two issues in existing DRE methods. In general, we have the following findings: (1) scaling up model size substantially boosts the overall DRE performance and achieves exceptional results, tackling the difficulty of capturing long and sparse multi-turn information; (2) LLMs suffer a much smaller performance drop from the entire-dialogue setting to the partial-dialogue setting compared to existing methods; (3) LLMs deliver competitive or superior performance under both full-shot and few-shot settings compared to the current state of the art; (4) LLMs show modest performance on inverse relations but much stronger improvements on general relations, and they can handle dialogues of various lengths, especially longer sequences.

4/30/2024

cs.CL cs.AI

Relation Extraction with Fine-Tuned Large Language Models in Retrieval Augmented Generation Frameworks

Sefika Efeoglu, Adrian Paschke

Information Extraction (IE) is crucial for converting unstructured data into structured formats like Knowledge Graphs (KGs). A key task within IE is Relation Extraction (RE), which identifies relationships between entities in text. Various RE methods exist, including supervised, unsupervised, weakly supervised, and rule-based approaches. Recent studies leveraging pre-trained language models (PLMs) have shown significant success in this area. In the current era dominated by Large Language Models (LLMs), fine-tuning these models can overcome limitations associated with zero-shot LLM prompting-based RE methods, especially regarding domain adaptation challenges and identifying implicit relations between entities in sentences. These implicit relations, which cannot be easily extracted from a sentence's dependency tree, require logical inference for accurate identification. This work explores the performance of fine-tuned LLMs and their integration into the Retrieval Augmented-based (RAG) RE approach to address the challenges of identifying implicit relations at the sentence level, particularly when LLMs act as generators within the RAG framework. Empirical evaluations on the TACRED, TACRED-Revisited (TACREV), Re-TACRED, and SemEVAL datasets show significant performance improvements with fine-tuned LLMs, including Llama2-7B, Mistral-7B, and T5 (Large). Notably, our approach achieves substantial gains on SemEVAL, where implicit relations are common, surpassing previous results on this dataset. Additionally, our method outperforms previous works on TACRED, TACREV, and Re-TACRED, demonstrating exceptional performance across diverse evaluation scenarios.

6/26/2024

cs.CL cs.AI

How Good are LLMs at Relation Extraction under Low-Resource Scenario? Comprehensive Evaluation

Dawulie Jinensibieke, Mieradilijiang Maimaiti, Wentao Xiao, Yuanhang Zheng, Xiaobo Wang

Relation Extraction (RE) serves as a crucial technology for transforming unstructured text into structured information, especially within the framework of Knowledge Graph development. Its importance is emphasized by its essential role in various downstream tasks. Besides the conventional RE methods which are based on neural networks and pre-trained language models, large language models (LLMs) are also utilized in the research field of RE. However, on low-resource languages (LRLs), both conventional RE methods and LLM-based methods perform poorly on RE due to the data scarcity issues. To this end, this paper constructs low-resource relation extraction datasets in 10 LRLs in three regions (Central Asia, Southeast Asia and Middle East). The corpora are constructed by translating the original publicly available English RE datasets (NYT10, FewRel and CrossRE) using an effective multilingual machine translation. Then, we use the language perplexity (PPL) to filter out the low-quality data from the translated datasets. Finally, we conduct an empirical study and validate the performance of several open-source LLMs on these generated LRL RE datasets.

6/27/2024

cs.CL

Recall, Retrieve and Reason: Towards Better In-Context Relation Extraction

Guozheng Li, Peng Wang, Wenjun Ke, Yikai Guo, Ke Ji, Ziyu Shang, Jiajun Liu, Zijie Xu

Relation extraction (RE) aims to identify relations between entities mentioned in texts. Although large language models (LLMs) have demonstrated impressive in-context learning (ICL) abilities in various tasks, they still perform poorly compared to most supervised fine-tuned RE methods. Utilizing ICL for RE with LLMs encounters two challenges: (1) retrieving good demonstrations from training examples, and (2) enabling LLMs to exhibit strong ICL abilities in RE. On the one hand, retrieving good demonstrations is a non-trivial process in RE, which easily results in low relevance regarding entities and relations. On the other hand, ICL with an LLM achieves poor performance in RE when RE is different from language modeling in nature or the LLM is not large enough. In this work, we propose a novel recall-retrieve-reason RE framework that synergizes LLMs with retrieval corpora (training examples) to enable relevant retrieving and reliable in-context reasoning. Specifically, we distill consistent ontological knowledge from training datasets to let LLMs generate relevant entity pairs grounded by retrieval corpora as valid queries. These entity pairs are then used to retrieve relevant training examples from the retrieval corpora as demonstrations for LLMs to conduct better ICL via instruction tuning. Extensive experiments on different LLMs and RE datasets demonstrate that our method generates relevant and valid entity pairs and boosts ICL abilities of LLMs, achieving competitive or new state-of-the-art performance on sentence-level RE compared to previous supervised fine-tuning methods and ICL-based methods.

4/30/2024

cs.CL cs.AI