The Literature Review Network: An Explainable Artificial Intelligence for Systematic Literature Reviews, Meta-analyses, and Method Development

Read original: arXiv:2408.05239 - Published 8/13/2024 by Joshua Morriss, Tod Brindle, Jessica Bah Rosman, Daniel Reibsamen, Andreas Enz

🏷️

Overview

Systematic literature reviews are considered the highest quality of evidence in research.
However, the review process is hindered by significant resource and data constraints.
The Literature Review Network (LRN) is an explainable AI platform designed to automate the entire literature review process and adhere to PRISMA 2020 standards.

Plain English Explanation

The paper discusses Literature Review Network (LRN), which is an artificial intelligence (AI) system that can automatically conduct literature reviews. Literature reviews are important in research as they summarize the current state of knowledge on a particular topic. However, conducting these reviews manually is a time-consuming and resource-intensive process.

LRN aims to address this problem by using AI to automate the entire literature review process. The researchers evaluated LRN's performance in the domain of surgical glove practices, using search queries developed by experts. They found that LRN was able to achieve high classification accuracy without any expert training, and was able to identify key themes and insights that were nearly identical to those found by the clinical researchers. LRN was also able to complete the entire literature review process much faster than the manual review, reducing the time from 11 months to just 5 days.

The researchers conclude that explainable AI can accurately and efficiently conduct PRISMA-compliant systematic literature reviews, without the need for expert training. This could potentially revolutionize healthcare research by helping researchers quickly and accurately understand the current state of knowledge on a particular topic.

Technical Explanation

The researchers evaluated the performance of the Literature Review Network (LRN) in conducting a systematic literature review on surgical glove practices. They used three search strings developed by experts to query the PubMed database, and trained all LRN models using a non-expert.

The researchers benchmarked LRN's performance against an expert manual review, measuring concordance using the Jaccard index and confusion matrices. They also assessed LRN's explainability by linking key terms to the classification decisions.

The results show that LRN models demonstrated superior classification accuracy without expert training, achieving 84.78% and 85.71% accuracy. The highest-performing model achieved high interrater reliability (k = 0.4953) and strong explainability metrics, linking terms like 'reduce', 'accident', and 'sharp' with 'double-gloving'. Another LRN model covered 91.51% of the relevant literature despite diverging from the non-expert's judgments (k = 0.2174), with terms like 'latex', 'double' (gloves), and 'indication'.

Importantly, LRN outperformed the manual review, reducing the entire process from 19,920 minutes over 11 months to just 288.6 minutes over 5 days.

Critical Analysis

The paper provides a compelling demonstration of how explainable AI can be used to automate the systematic literature review process, without the need for expert training.

One potential limitation is that the evaluation was conducted in a single domain (surgical glove practices), and it's unclear how well the LRN system would perform in other domains. Additionally, the researchers noted that the non-expert training data may have introduced some biases or inconsistencies, which could impact the system's performance.

Further research is needed to explore the generalizability of the LRN approach, as well as to address any potential ethical or privacy concerns associated with automating the literature review process. It will also be important to continue exploring ways to enhance the explainability and transparency of the LRN system, to build trust and confidence in its outputs.

Conclusion

This study demonstrates that explainable AI can accurately and efficiently conduct PRISMA-compliant systematic literature reviews, without the need for expert training. The Literature Review Network (LRN) was able to summarize the results of surgical glove studies and identify key themes that were nearly identical to the findings of the clinical researchers, while completing the entire process much faster than a manual review.

This has the potential to revolutionize healthcare research by helping researchers quickly and accurately understand the current state of knowledge on a particular topic, allowing them to focus their efforts on advancing the field rather than being bogged down by the resource-intensive literature review process.

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Follow @aimodelsfyi on 𝕏 →

Related Papers

🏷️

The Literature Review Network: An Explainable Artificial Intelligence for Systematic Literature Reviews, Meta-analyses, and Method Development

Joshua Morriss, Tod Brindle, Jessica Bah Rosman, Daniel Reibsamen, Andreas Enz

Systematic literature reviews are the highest quality of evidence in research. However, the review process is hindered by significant resource and data constraints. The Literature Review Network (LRN) is the first of its kind explainable AI platform adhering to PRISMA 2020 standards, designed to automate the entire literature review process. LRN was evaluated in the domain of surgical glove practices using 3 search strings developed by experts to query PubMed. A non-expert trained all LRN models. Performance was benchmarked against an expert manual review. Explainability and performance metrics assessed LRN's ability to replicate the experts' review. Concordance was measured with the Jaccard index and confusion matrices. Researchers were blinded to the other's results until study completion. Overlapping studies were integrated into an LRN-generated systematic review. LRN models demonstrated superior classification accuracy without expert training, achieving 84.78% and 85.71% accuracy. The highest performance model achieved high interrater reliability (k = 0.4953) and explainability metrics, linking 'reduce', 'accident', and 'sharp' with 'double-gloving'. Another LRN model covered 91.51% of the relevant literature despite diverging from the non-expert's judgments (k = 0.2174), with the terms 'latex', 'double' (gloves), and 'indication'. LRN outperformed the manual review (19,920 minutes over 11 months), reducing the entire process to 288.6 minutes over 5 days. This study demonstrates that explainable AI does not require expert training to successfully conduct PRISMA-compliant systematic literature reviews like an expert. LRN summarized the results of surgical glove studies and identified themes that were nearly identical to the clinical researchers' findings. Explainable AI can accurately expedite our understanding of clinical practices, potentially revolutionizing healthcare research.

8/13/2024

🤖

Artificial Intelligence for Literature Reviews: Opportunities and Challenges

Francisco Bolanos, Angelo Salatino, Francesco Osborne, Enrico Motta

This manuscript presents a comprehensive review of the use of Artificial Intelligence (AI) in Systematic Literature Reviews (SLRs). A SLR is a rigorous and organised methodology that assesses and integrates previous research on a given topic. Numerous tools have been developed to assist and partially automate the SLR process. The increasing role of AI in this field shows great potential in providing more effective support for researchers, moving towards the semi-automatic creation of literature reviews. Our study focuses on how AI techniques are applied in the semi-automation of SLRs, specifically in the screening and extraction phases. We examine 21 leading SLR tools using a framework that combines 23 traditional features with 11 AI features. We also analyse 11 recent tools that leverage large language models for searching the literature and assisting academic writing. Finally, the paper discusses current trends in the field, outlines key research challenges, and suggests directions for future research.

8/7/2024

Automating Research Synthesis with Domain-Specific Large Language Model Fine-Tuning

Teo Susnjak, Peter Hwang, Napoleon H. Reyes, Andre L. C. Barczak, Timothy R. McIntosh, Surangika Ranathunga

This research pioneers the use of fine-tuned Large Language Models (LLMs) to automate Systematic Literature Reviews (SLRs), presenting a significant and novel contribution in integrating AI to enhance academic research methodologies. Our study employed the latest fine-tuning methodologies together with open-sourced LLMs, and demonstrated a practical and efficient approach to automating the final execution stages of an SLR process that involves knowledge synthesis. The results maintained high fidelity in factual accuracy in LLM responses, and were validated through the replication of an existing PRISMA-conforming SLR. Our research proposed solutions for mitigating LLM hallucination and proposed mechanisms for tracking LLM responses to their sources of information, thus demonstrating how this approach can meet the rigorous demands of scholarly research. The findings ultimately confirmed the potential of fine-tuned LLMs in streamlining various labor-intensive processes of conducting literature reviews. Given the potential of this approach and its applicability across all research domains, this foundational study also advocated for updating PRISMA reporting guidelines to incorporate AI-driven processes, ensuring methodological transparency and reliability in future SLRs. This study broadens the appeal of AI-enhanced tools across various academic and research fields, setting a new standard for conducting comprehensive and accurate literature reviews with more efficiency in the face of ever-increasing volumes of academic studies.

4/16/2024

💬

The emergence of Large Language Models (LLM) as a tool in literature reviews: an LLM automated systematic review

Dmitry Scherbakov, Nina Hubig, Vinita Jansari, Alexander Bakumenko, Leslie A. Lenert

Objective: This study aims to summarize the usage of Large Language Models (LLMs) in the process of creating a scientific review. We look at the range of stages in a review that can be automated and assess the current state-of-the-art research projects in the field. Materials and Methods: The search was conducted in June 2024 in PubMed, Scopus, Dimensions, and Google Scholar databases by human reviewers. Screening and extraction process took place in Covidence with the help of LLM add-on which uses OpenAI gpt-4o model. ChatGPT was used to clean extracted data and generate code for figures in this manuscript, ChatGPT and Scite.ai were used in drafting all components of the manuscript, except the methods and discussion sections. Results: 3,788 articles were retrieved, and 172 studies were deemed eligible for the final review. ChatGPT and GPT-based LLM emerged as the most dominant architecture for review automation (n=126, 73.2%). A significant number of review automation projects were found, but only a limited number of papers (n=26, 15.1%) were actual reviews that used LLM during their creation. Most citations focused on automation of a particular stage of review, such as Searching for publications (n=60, 34.9%), and Data extraction (n=54, 31.4%). When comparing pooled performance of GPT-based and BERT-based models, the former were better in data extraction with mean precision 83.0% (SD=10.4), and recall 86.0% (SD=9.8), while being slightly less accurate in title and abstract screening stage (Maccuracy=77.3%, SD=13.0). Discussion/Conclusion: Our LLM-assisted systematic review revealed a significant number of research projects related to review automation using LLMs. The results looked promising, and we anticipate that LLMs will change in the near future the way the scientific reviews are conducted.

9/10/2024