AuditGPT: Auditing Smart Contracts with ChatGPT

Read original: arXiv:2404.04306 - Published 4/9/2024 by Shihao Xia, Shuai Shao, Mengting He, Tingting Yu, Linhai Song, Yiying Zhang

Overview

This research paper introduces "AuditGPT", a tool that uses large language models such as ChatGPT to audit smart contracts, specifically by checking whether they follow the rules laid out in Ethereum Request for Comment (ERC) standards.
The paper explores the opportunities and challenges of using large language models for smart contract auditing, a critical task for the safety and reliability of decentralized applications built on blockchain platforms.

Plain English Explanation

The paper describes a new system called "AuditGPT" that uses a powerful language model called ChatGPT to help audit and analyze smart contracts. Smart contracts are computer programs that automatically execute the terms of a contract on the blockchain, a decentralized digital ledger. Auditing these smart contracts is important to make sure they work as intended and don't have any vulnerabilities that could be exploited.

The researchers behind AuditGPT recognized that ChatGPT, a large language model trained on a vast amount of text data, has the potential to assist in the smart contract auditing process. By prompting ChatGPT with details about a smart contract, the system can analyze the code, identify potential issues, and provide recommendations for improvements. This could save time and resources compared to manual auditing, which can be a complex and error-prone task.

The paper explores the opportunities and challenges of using a system like AuditGPT for smart contract auditing. On the positive side, it could make the auditing process more efficient and effective, especially for smaller or less complex smart contracts. However, there are also concerns about the reliability and trustworthiness of using an AI system for such a critical task. The researchers acknowledge that further research and development is needed to fully realize the potential of this approach.

Technical Explanation

The core idea behind AuditGPT is to use the language understanding capabilities of ChatGPT to audit Ethereum smart contracts written in the Solidity programming language, checking them against the rules that ERC standards specify in natural language.

The researchers describe a workflow in which AuditGPT splits the large, complex auditing process into small, manageable tasks and prompts ChatGPT separately for each one, using prompts specialized for each type of ERC rule. Each prompt supplies the contract code together with the rule to check, and ChatGPT reports whether the code violates that rule. The design is guided by an empirical study of 222 ERC rules drawn from four popular ERCs.
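To make the per-rule prompting idea concrete, here is a minimal sketch in Python. It is not the authors' implementation: the `ErcRule` type, the rule categories, the prompt wording, and the `fake_llm` stand-in are all hypothetical names introduced for illustration, and the single ERC-20 rule shown is only an example input. The sketch assumes the model can be wrapped as a function that takes a prompt string and returns a text reply.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class ErcRule:
    rule_id: str    # hypothetical identifier, not taken from the paper
    rule_type: str  # hypothetical category, e.g. "event"
    text: str       # the rule as written in the ERC document


# Hypothetical prompt templates, one per rule type (assumed wording).
PROMPT_TEMPLATES: Dict[str, str] = {
    "event": (
        "Rule: {rule}\n"
        "Check whether the Solidity code below fires the event the rule requires.\n"
        "Code:\n{code}\n"
        "Reply with VIOLATED or OK."
    ),
    "default": (
        "Rule: {rule}\n"
        "Check whether the Solidity code below follows the rule.\n"
        "Code:\n{code}\n"
        "Reply with VIOLATED or OK."
    ),
}


def build_prompt(rule: ErcRule, code: str) -> str:
    """Pick the template for the rule's type and fill in rule text and code."""
    template = PROMPT_TEMPLATES.get(rule.rule_type, PROMPT_TEMPLATES["default"])
    return template.format(rule=rule.text, code=code)


def audit(code: str, rules: List[ErcRule], ask: Callable[[str], str]) -> List[str]:
    """Ask one small question per rule and collect the ids of violated rules."""
    violated = []
    for rule in rules:
        reply = ask(build_prompt(rule, code))
        if reply.strip().upper().startswith("VIOLATED"):
            violated.append(rule.rule_id)
    return violated


def fake_llm(prompt: str) -> str:
    """Stand-in for a real model call so the sketch runs offline."""
    return "OK" if "emit Transfer" in prompt else "VIOLATED"


# Illustrative input: a transfer function that never emits Transfer.
CONTRACT = (
    "function transfer(address to, uint256 value) public returns (bool) "
    "{ balances[msg.sender] -= value; balances[to] += value; return true; }"
)
RULES = [
    ErcRule("erc20-transfer-event", "event", "transfer MUST fire the Transfer event."),
]

print(audit(CONTRACT, RULES, fake_llm))
```

In real use, `fake_llm` would be replaced by a call to an actual model; keeping the model behind a plain callable makes each per-rule check easy to test in isolation.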

To evaluate the approach, the researchers ran AuditGPT on real-world smart contracts and compared it against an auditing service provided by security experts. According to the paper's abstract, AuditGPT pinpointed 418 ERC rule violations while reporting only 18 false positives, and it outperformed the expert service in effectiveness, accuracy, and cost.
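As a rough worked example, the figures reported in the paper's abstract (418 violations pinpointed, 18 false positives) imply a precision of about 96%. The short sketch below does the arithmetic; it assumes all 418 reported violations are true positives, which is an interpretation of the abstract rather than a number the paper states as "precision", and the function name is chosen here for illustration.

```python
def precision(true_positives: int, false_positives: int) -> float:
    """Fraction of reported findings that are real violations."""
    reported = true_positives + false_positives
    if reported == 0:
        raise ValueError("no findings reported")
    return true_positives / reported


# Figures from the paper's abstract; treating 418 as true positives is an assumption.
AUDITGPT_PRECISION = precision(418, 18)
print(f"{AUDITGPT_PRECISION:.2%}")  # prints 95.87%
```

Recall cannot be computed the same way, because the abstract does not say how many violations were missed.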

The paper also discusses the challenges and limitations of using a language model like ChatGPT for smart contract auditing. For example, the researchers note that ChatGPT's understanding is based on the data it was trained on, which may not fully capture the nuances and domain-specific knowledge required for comprehensive smart contract auditing. Additionally, the researchers highlight the need for further research to ensure the reliability and trustworthiness of the AuditGPT system.

Critical Analysis

The researchers acknowledge several caveats and limitations of the AuditGPT approach. One key concern is the reliability and trustworthiness of using an AI system for smart contract auditing, a task that requires a deep understanding of blockchain technology, Solidity programming, and security best practices. While the experiments showed AuditGPT can identify a significant number of vulnerabilities, the researchers note that the system may miss certain subtle or complex issues that a human expert would be better equipped to detect.

Additionally, the paper highlights the need for further research to improve the capabilities of language models like ChatGPT in the context of smart contract auditing. For example, the researchers suggest exploring ways to enhance the model's understanding of blockchain-specific concepts and to better integrate it with existing smart contract auditing tools and workflows.

Overall, the AuditGPT concept represents an interesting and potentially valuable application of large language models in the domain of smart contract security. However, the researchers rightly emphasize that significant further work is needed to fully realize the potential of this approach and address the challenges and limitations identified in the paper.

Conclusion

The AuditGPT paper presents a novel approach that uses the ChatGPT language model to audit Ethereum-based smart contracts. The researchers demonstrate that the system can help identify ERC rule violations and the security issues they cause, which could improve the overall quality and reliability of decentralized applications built on blockchain platforms.

While the results are promising, the paper also highlights the need for further research and development to address the reliability and trustworthiness concerns associated with using an AI system for such a critical task. Nonetheless, the AuditGPT concept represents an exciting step forward in the application of large language models to the domain of smart contract security, with the potential to significantly streamline and enhance the auditing process in the future.

Related Papers

AuditGPT: Auditing Smart Contracts with ChatGPT

Shihao Xia, Shuai Shao, Mengting He, Tingting Yu, Linhai Song, Yiying Zhang

To govern smart contracts running on Ethereum, multiple Ethereum Request for Comment (ERC) standards have been developed, each containing a set of rules to guide the behaviors of smart contracts. Violating the ERC rules could cause serious security issues and financial loss, signifying the importance of verifying smart contracts follow ERCs. Today's practices of such verification are to either manually audit each single contract or use expert-developed, limited-scope program-analysis tools, both of which are far from being effective in identifying ERC rule violations. This paper presents a tool named AuditGPT that leverages large language models (LLMs) to automatically and comprehensively verify ERC rules against smart contracts. To build AuditGPT, we first conduct an empirical study on 222 ERC rules specified in four popular ERCs to understand their content, their security impacts, their specification in natural language, and their implementation in Solidity. Guided by the study, we construct AuditGPT by separating the large, complex auditing process into small, manageable tasks and design prompts specialized for each ERC rule type to enhance LLMs' auditing performance. In the evaluation, AuditGPT successfully pinpoints 418 ERC rule violations and only reports 18 false positives, showcasing its effectiveness and accuracy. Moreover, AuditGPT beats an auditing service provided by security experts in effectiveness, accuracy, and cost, demonstrating its advancement over state-of-the-art smart-contract auditing practices.

4/9/2024

GPTScan: Detecting Logic Vulnerabilities in Smart Contracts by Combining GPT with Program Analysis

Yuqiang Sun, Daoyuan Wu, Yue Xue, Han Liu, Haijun Wang, Zhengzi Xu, Xiaofei Xie, Yang Liu

Smart contracts are prone to various vulnerabilities, leading to substantial financial losses over time. Current analysis tools mainly target vulnerabilities with fixed control or data-flow patterns, such as re-entrancy and integer overflow. However, a recent study on Web3 security bugs revealed that about 80% of these bugs cannot be audited by existing tools due to the lack of domain-specific property description and checking. Given recent advances in Large Language Models (LLMs), it is worth exploring how Generative Pre-training Transformer (GPT) could aid in detecting logic vulnerabilities. In this paper, we propose GPTScan, the first tool combining GPT with static analysis for smart contract logic vulnerability detection. Instead of relying solely on GPT to identify vulnerabilities, which can lead to high false positives and is limited by GPT's pre-trained knowledge, we utilize GPT as a versatile code understanding tool. By breaking down each logic vulnerability type into scenarios and properties, GPTScan matches candidate vulnerabilities with GPT. To enhance accuracy, GPTScan further instructs GPT to intelligently recognize key variables and statements, which are then validated by static confirmation. Evaluation on diverse datasets with around 400 contract projects and 3K Solidity files shows that GPTScan achieves high precision (over 90%) for token contracts and acceptable precision (57.14%) for large projects like Web3Bugs. It effectively detects ground-truth logic vulnerabilities with a recall of over 70%, including 9 new vulnerabilities missed by human auditors. GPTScan is fast and cost-effective, taking an average of 14.39 seconds and 0.01 USD to scan per thousand lines of Solidity code. Moreover, static confirmation helps GPTScan reduce two-thirds of false positives.

5/7/2024

PropertyGPT: LLM-driven Formal Verification of Smart Contracts through Retrieval-Augmented Property Generation

Ye Liu, Yue Xue, Daoyuan Wu, Yuqiang Sun, Yi Li, Miaolei Shi, Yang Liu

With recent advances in large language models (LLMs), this paper explores the potential of leveraging state-of-the-art LLMs, such as GPT-4, to transfer existing human-written properties (e.g., those from Certora auditing reports) and automatically generate customized properties for unknown code. To this end, we embed existing properties into a vector database and retrieve a reference property for LLM-based in-context learning to generate a new property for a given code. While this basic process is relatively straightforward, ensuring that the generated properties are (i) compilable, (ii) appropriate, and (iii) runtime-verifiable presents challenges. To address (i), we use the compilation and static analysis feedback as an external oracle to guide LLMs in iteratively revising the generated properties. For (ii), we consider multiple dimensions of similarity to rank the properties and employ a weighted algorithm to identify the top-K properties as the final result. For (iii), we design a dedicated prover to formally verify the correctness of the generated properties. We have implemented these strategies into a novel system called PropertyGPT, with 623 human-written properties collected from 23 Certora projects. Our experiments show that PropertyGPT can generate comprehensive and high-quality properties, achieving an 80% recall compared to the ground truth. It successfully detected 26 CVEs/attack incidents out of 37 tested and also uncovered 12 zero-day vulnerabilities, resulting in $8,256 bug bounty rewards.

5/7/2024

Retrieval Augmented Generation Integrated Large Language Models in Smart Contract Vulnerability Detection

Jeffy Yu

The rapid growth of Decentralized Finance (DeFi) has been accompanied by substantial financial losses due to smart contract vulnerabilities, underscoring the critical need for effective security auditing. With attacks becoming more frequent, the necessity and demand for auditing services have escalated. This especially creates a financial burden for independent developers and small businesses, who often have limited available funding for these services. Our study builds upon existing frameworks by integrating Retrieval-Augmented Generation (RAG) with large language models (LLMs), specifically employing GPT-4-1106 for its 128k token context window. We construct a vector store of 830 known vulnerable contracts, leveraging Pinecone for vector storage, OpenAI's text-embedding-ada-002 for embeddings, and LangChain to construct the RAG-LLM pipeline. Prompts were designed to provide a binary answer for vulnerability detection. We first test 52 smart contracts 40 times each against a provided vulnerability type, verifying the replicability and consistency of the RAG-LLM. Encouraging results were observed, with a 62.7% success rate in guided detection of vulnerabilities. Second, we challenge the model under a blind audit setup, without the vulnerability type provided in the prompt, wherein 219 contracts undergo 40 tests each. This setup evaluates the general vulnerability detection capabilities without hinted context assistance. Under these conditions, a 60.71% success rate was observed. While the results are promising, we still emphasize the need for human auditing at this time. We provide this study as a proof of concept for a cost-effective smart contract auditing process, moving towards democratic access to security.

7/23/2024