ML Study of MaliciousTransactions in Ethereum

Read original: arXiv:2408.08749 - Published 8/19/2024 by Natan Katz

Overview

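Examines how machine learning can detect malicious smart contracts and malicious transactions on Ethereum.
Presents two contract-level detectors, one that applies GPT2 to contract opcodes and one that applies a LoRA fine-tuned CodeLlama to Solidity source, plus an XGBoost model that combines gas properties with hexadecimal signatures to flag malicious transactions.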

Plain English Explanation

Smart contracts are self-executing pieces of code that run on blockchain networks like Ethereum. They are designed to automatically enforce the terms of an agreement between parties. However, smart contracts can be vulnerable to various security issues, which can lead to financial losses or disruptions in the blockchain ecosystem.

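The research paper investigates how attackers exploit such vulnerabilities and applies machine learning to detect them. It uses two language models, GPT2 and a fine-tuned CodeLlama, to flag malicious contracts, and a separate XGBoost model that draws on gas properties to flag malicious transactions. Gas is the unit of measurement for the computational effort required to execute a transaction on the Ethereum network.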
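To make the notion of gas concrete, here is a small worked calculation of a transaction fee. It is a simplified sketch: the 30 gwei price is an illustrative assumption, and the single `gas_price_gwei` parameter stands in for whatever per-gas price the network charges at the time.

```python
WEI_PER_GWEI = 10**9
WEI_PER_ETH = 10**18

def transaction_fee_wei(gas_used: int, gas_price_gwei: int) -> int:
    """Fee in wei = gas units consumed * price paid per gas unit."""
    return gas_used * gas_price_gwei * WEI_PER_GWEI

def wei_to_eth(wei: int) -> float:
    """Convert wei to ETH for display purposes."""
    return wei / WEI_PER_ETH

# A plain ETH transfer costs a fixed 21,000 gas.
# The 30 gwei gas price is an assumed, illustrative value.
fee = transaction_fee_wei(gas_used=21_000, gas_price_gwei=30)
print(f"{fee} wei, roughly {wei_to_eth(fee):.5f} ETH")
```

A contract call that runs more code consumes more gas, so its fee grows with the work it performs.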

The paper explains how each detector works. One reads a contract's opcodes, the low-level instructions that the Ethereum Virtual Machine executes, and the other reads the contract's Solidity source code. According to the abstract, the transaction model assumes that malicious activity shows up as uncommon usage of a contract's functions and in the effort made to pursue the transaction.

Technical Explanation

The paper starts from the observation that smart contracts are a major tool in Ethereum transactions, and that attackers can plant vulnerabilities in contract source code and later use them to perform malicious transactions. The author then presents two approaches for detecting malicious contracts.

The first approach works on contract opcodes and relies on GPT2. The second works on the Solidity source and relies on CodeLlama fine-tuned with LoRA (low-rank adaptation), a technique that trains small adapter matrices instead of updating all of the model's weights.
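The opcode-based approach mentioned in the paper's abstract needs contract bytecode turned into a sequence of opcode names before a language model such as GPT2 can read it. The sketch below shows one way to do that. The `OPCODES` table is deliberately partial, the function name `disassemble` is hypothetical, and dropping PUSH operands is an assumption made for illustration, not a detail confirmed by the paper.

```python
# Partial EVM opcode table, for illustration only.
OPCODES = {
    0x00: "STOP", 0x01: "ADD", 0x02: "MUL",
    0x34: "CALLVALUE", 0x35: "CALLDATALOAD",
    0x50: "POP", 0x51: "MLOAD", 0x52: "MSTORE",
    0x54: "SLOAD", 0x55: "SSTORE",
    0x56: "JUMP", 0x57: "JUMPI", 0x5B: "JUMPDEST",
    0xF1: "CALL", 0xF3: "RETURN", 0xFD: "REVERT", 0xFF: "SELFDESTRUCT",
}

def disassemble(bytecode_hex: str) -> list[str]:
    """Turn hex-encoded EVM bytecode into a list of opcode names."""
    code = bytes.fromhex(bytecode_hex.removeprefix("0x"))
    tokens = []
    i = 0
    while i < len(code):
        op = code[i]
        if 0x60 <= op <= 0x7F:
            # PUSH1..PUSH32 carry 1..32 immediate bytes; skip over them.
            width = op - 0x5F
            tokens.append(f"PUSH{width}")
            i += 1 + width
        elif 0x80 <= op <= 0x8F:
            tokens.append(f"DUP{op - 0x7F}")
            i += 1
        elif 0x90 <= op <= 0x9F:
            tokens.append(f"SWAP{op - 0x8F}")
            i += 1
        else:
            tokens.append(OPCODES.get(op, f"UNKNOWN_0x{op:02x}"))
            i += 1
    return tokens

# The common Solidity preamble: PUSH1 0x80, PUSH1 0x40, MSTORE.
print(" ".join(disassemble("0x6080604052")))
```

The resulting space-separated string is the kind of text a tokenizer could consume. Whether operands are kept or dropped is a design choice that trades vocabulary size against detail.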

The abstract describes both contract-level approaches as successful. For transactions, the paper presents an XGBoost model that combines gas properties with hexadecimal signatures, based on the early assumption that maliciousness shows up as uncommon usage of a contract's functions and in the effort made to pursue the transaction.
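The transaction-level model in the paper's abstract combines gas properties with hexadecimal signatures. One plausible reading, assumed here, is that a signature is the 4-byte function selector at the start of a transaction's input data. The sketch below builds per-transaction feature rows under that assumption. The field names (`input`, `gas_used`, `gas_limit`, `gas_price_gwei`), the feature names, and the sample transactions are all hypothetical and are not taken from the paper.

```python
from collections import Counter

def selector(tx_input: str) -> str:
    """Return the 4-byte function selector, or "0x" for empty calldata."""
    data = tx_input.removeprefix("0x")
    return "0x" + data[:8].lower() if len(data) >= 8 else "0x"

def build_features(txs: list[dict]) -> list[dict]:
    """Build one feature row per transaction sent to a single contract."""
    counts = Counter(selector(tx["input"]) for tx in txs)
    total = len(txs)
    rows = []
    for tx in txs:
        sel = selector(tx["input"])
        rows.append({
            "selector": sel,
            # Rarely called functions get a low frequency.
            "selector_frequency": counts[sel] / total,
            # Share of the gas limit the transaction actually consumed.
            "gas_used_ratio": tx["gas_used"] / tx["gas_limit"],
            "gas_price_gwei": tx["gas_price_gwei"],
        })
    return rows

# Made-up sample data. 0xa9059cbb is the ERC-20 transfer(address,uint256) selector.
sample = [
    {"input": "0xa9059cbb" + "00" * 64, "gas_used": 50_000, "gas_limit": 100_000, "gas_price_gwei": 20},
    {"input": "0xa9059cbb" + "00" * 64, "gas_used": 52_000, "gas_limit": 100_000, "gas_price_gwei": 22},
    {"input": "0xa9059cbb" + "00" * 64, "gas_used": 51_000, "gas_limit": 100_000, "gas_price_gwei": 21},
    {"input": "0xdeadbeef", "gas_used": 900_000, "gas_limit": 1_000_000, "gas_price_gwei": 150},
]
for row in build_features(sample):
    print(row)
```

Rows like these could be passed to a gradient-boosted classifier such as XGBoost. The last sample transaction stands out on both axes: its selector is rare and it spends far more on gas.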

Critical Analysis

The paper addresses detection at two levels, contracts and transactions, and applies general-purpose language models such as GPT2 and CodeLlama to smart contract security. Pairing gas properties with hexadecimal signatures for transaction-level detection is a useful contribution to the field.

However, the abstract does not discuss the limitations of these models or their rates of false positives and false negatives. The transaction-level model also rests on early assumptions about how maliciousness manifests, which may not hold for every attack. Additionally, the abstract does not mention the broader challenges of smart contract security, such as the need for formal verification, secure coding practices, and improved developer education.
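False positives and false negatives are usually summarized with precision, recall, and the false positive rate. The sketch below computes them from a confusion matrix. The counts are invented for illustration and are not results from the paper.

```python
def detection_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Summarize a binary detector from its confusion-matrix counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    false_positive_rate = fp / (fp + tn) if fp + tn else 0.0
    return {
        "precision": precision,                      # flagged items that were truly malicious
        "recall": recall,                            # malicious items that were caught
        "false_positive_rate": false_positive_rate,  # benign items wrongly flagged
    }

# Invented counts: 80 malicious caught, 10 missed, 20 benign flagged, 890 benign passed.
metrics = detection_metrics(tp=80, fp=20, fn=10, tn=890)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Because malicious contracts are rare compared with benign ones, a detector can score high accuracy while still producing many false alarms, which is why these three figures are worth reporting separately.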

Further research could explore combining these models with other security tools and extending them to detect a wider range of vulnerabilities in smart contracts.

Conclusion

The research paper highlights the threat posed by malicious smart contracts and presents three detectors: GPT2 applied to opcodes, a LoRA fine-tuned CodeLlama applied to Solidity source, and an XGBoost model that combines gas properties with hexadecimal signatures. Better detection of malicious contracts and transactions can make the blockchain ecosystem more secure and resilient, which in turn supports trust in and adoption of the technology.

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Related Papers

ML Study of MaliciousTransactions in Ethereum

Natan Katz

Smart contracts are a major tool in Ethereum transactions. Therefore hackers can exploit them by adding code vulnerabilities to their sources and using these vulnerabilities for performing malicious transactions. This paper presents two successful approaches for detecting malicious contracts: one uses opcode and relies on GPT2 and the other uses the Solidity source and a LoRA fine-tuned CodeLlama. Finally, we present an XGBoost model that combines gas properties and hexadecimal signatures for detecting malicious transactions. This approach relies on early assumptions that maliciousness is manifested by the uncommon usage of the contracts' functions and the effort to pursue the transaction.

8/19/2024

Detection Made Easy: Potentials of Large Language Models for Solidity Vulnerabilities

Md Tauseef Alam, Raju Halder, Abyayananda Maiti

The large-scale deployment of Solidity smart contracts on the Ethereum mainnet has increasingly attracted financially-motivated attackers in recent years. A few now-infamous attacks in Ethereum's history include the DAO attack in 2016 (50 million dollars lost), the Parity Wallet hack in 2017 (146 million dollars locked), Beautychain's token BEC in 2018 (900 million dollars market value fell to 0), and the NFT gaming blockchain breach in 2022 ($600 million in Ether stolen). This paper presents a comprehensive investigation of the use of large language models (LLMs) and their capabilities in detecting OWASP Top Ten vulnerabilities in Solidity. We introduce a novel, class-balanced, structured, and labeled dataset named VulSmart, which we use to benchmark and compare the performance of open-source LLMs such as CodeLlama, Llama2, CodeT5 and Falcon, alongside closed-source models like GPT-3.5 Turbo and GPT-4o Mini. Our proposed SmartVD framework is rigorously tested against these models through extensive automated and manual evaluations, utilizing BLEU and ROUGE metrics to assess the effectiveness of vulnerability detection in smart contracts. We also explore three distinct prompting strategies (zero-shot, few-shot, and chain-of-thought) to evaluate the multi-class classification and generative capabilities of the SmartVD framework. Our findings reveal that SmartVD outperforms its open-source counterparts and even exceeds the performance of closed-source base models like GPT-3.5 and GPT-4o Mini. After fine-tuning, the closed-source models, GPT-3.5 Turbo and GPT-4o Mini, achieved remarkable performance with 99% accuracy in detecting vulnerabilities, 94% in identifying their types, and 98% in determining severity. Notably, SmartVD performs best with the 'chain-of-thought' prompting technique, whereas the fine-tuned closed-source models excel with the 'zero-shot' prompting approach.

9/18/2024

Explainable Ponzi Schemes Detection on Ethereum

Letterio Galletta, Fabio Pinelli

Blockchain technology has been successfully exploited for deploying new economic applications. However, it has started arousing the interest of malicious actors who deliver scams to deceive honest users and to gain economic advantages. Ponzi schemes are one of the most common scams. Here, we present a classifier for detecting smart Ponzi contracts on Ethereum, which can be used as the backbone for developing detection tools. First, we release a labelled data set with 4422 unique real-world smart contracts to address the problem of the unavailability of labelled data. Then, we show that our classifier outperforms the ones proposed in the literature when considering the AUC as a metric. Finally, we identify a small and effective set of features that ensures a good classification quality and investigate their impacts on the classification using eXplainable AI techniques.

4/19/2024

Soley: Identification and Automated Detection of Logic Vulnerabilities in Ethereum Smart Contracts Using Large Language Models

Majd Soud, Waltteri Nuutinen, Grischa Liebel

Modern blockchain, such as Ethereum, supports the deployment and execution of so-called smart contracts, autonomous digital programs with significant value of cryptocurrency. Executing smart contracts requires gas costs paid by users, which define the limits of the contract's execution. Logic vulnerabilities in smart contracts can lead to financial losses, and are often the root cause of high-impact cyberattacks. Our objective is threefold: (i) empirically investigate logic vulnerabilities in real-world smart contracts extracted from code changes on GitHub, (ii) introduce Soley, an automated method for detecting logic vulnerabilities in smart contracts, leveraging Large Language Models (LLMs), and (iii) examine mitigation strategies employed by smart contract developers to address these vulnerabilities in real-world scenarios. We obtained smart contracts and related code changes from GitHub. To address the first and third objectives, we qualitatively investigated available logic vulnerabilities using an open coding method. We identified these vulnerabilities and their mitigation strategies. For the second objective, we extracted various logic vulnerabilities, applied preprocessing techniques, and implemented and trained the proposed Soley model. We evaluated Soley along with the performance of various LLMs and compared the results with the state-of-the-art baseline on the task of logic vulnerability detection. From our analysis, we identified nine novel logic vulnerabilities, extending existing taxonomies with these vulnerabilities. Furthermore, we introduced several mitigation strategies extracted from observed developer modifications in real-world scenarios. Our Soley method outperforms existing methods in automatically identifying logic vulnerabilities. Interestingly, the efficacy of LLMs in this task was evident without requiring extensive feature engineering.

6/26/2024