Blockchain Paper Quick Read, CCF-A Journals: TSE 2024 (January to July). A Large-Scale Dataset of Smart Contract Weaknesses; What Is a Rug Pull?

Journal: IEEE Transactions on Software Engineering (TSE)

CCF level: CCF A

Categories: Software Engineering/System Software/Programming Languages

Time: January to July 2024

Title:

ContractCheck: Checking Ethereum Smart Contracts in Fine-Grained Level

Authors:

School of Automation Science and Engineering, South China University of Technology

Key words:

Smart contract, blockchain security, vulnerability detection, neural network

Abstract:

The blockchain has been the main computing scenario for smart contracts, and the decentralized infrastructure of the blockchain is effectively implemented in a de-trusted and executable environment. However, vulnerabilities in smart contracts are particularly vulnerable to exploitation by malicious attackers and have always been a key issue in blockchain security. Existing traditional tools are inefficient in detecting vulnerabilities and have a high rate of false positives when detecting contracts. Some neural network methods have improved the detection efficiency, but they are not competent for fine-grained (code line level) vulnerability detection. We propose the ContractCheck model for detecting contract vulnerabilities based on neural network methods. ContractCheck extracts fine-grained segments from the abstract syntax tree (AST) and function call graph of smart contract source code. Furthermore, the segments are parsed into token flow retaining semantic information as uint, which are used to generate numerical vector sequences that can be trained using neural network methods. We conducted multiple rounds of experiments using a dataset constructed from 36,885 smart contracts and identified the optimal ContractCheck model structure by employing the Fasttext embedding vector algorithm and constructing a composite model using CNN and BiGRU for training the network. Evaluation on other datasets demonstrates that ContractCheck exhibits significant improvement in contract-level detection performance compared to other methods, with an increase of 23.60% in F1 score over the best existing method. Particularly, it achieves fine-grained detection based on neural network methods. The cases provided indicate that ContractCheck can effectively assist developers in accurately locating the presence of vulnerabilities, thereby enhancing the security of Ethereum smart contracts.
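To make the "token flow to numerical vector sequence" step in the abstract more concrete, here is a minimal Python sketch. It is not the ContractCheck implementation: the tokenizer regex, the helper names (`tokenize`, `build_vocab`, `encode`), the two sample Solidity lines, and the padding length are all assumptions made for illustration. Integer ids stand in for the Fasttext embedding vectors that the paper feeds into its CNN and BiGRU model.

```python
import re

# Hypothetical tokenizer: identifiers/keywords, integer literals,
# a few two-character operators, then any other single symbol.
TOKEN_RE = re.compile(r"[A-Za-z_]\w*|\d+|==|!=|<=|>=|&&|\|\||[^\s\w]")


def tokenize(code: str) -> list[str]:
    """Split one line of Solidity-like source into a flat token sequence."""
    return TOKEN_RE.findall(code)


def build_vocab(token_lists):
    """Assign each distinct token an integer id; 0 and 1 are reserved."""
    vocab = {"<pad>": 0, "<unk>": 1}
    for tokens in token_lists:
        for tok in tokens:
            vocab.setdefault(tok, len(vocab))
    return vocab


def encode(tokens, vocab, length):
    """Map tokens to ids, then truncate or right-pad to a fixed length."""
    ids = [vocab.get(tok, vocab["<unk>"]) for tok in tokens][:length]
    return ids + [vocab["<pad>"]] * (length - len(ids))


# Assumed sample segment (two lines), not taken from the paper.
segment = [
    "uint256 amount = cnt * value;",
    "require(cnt > 0 && cnt <= 20);",
]
token_lists = [tokenize(line) for line in segment]
vocab = build_vocab(token_lists)
encoded = [encode(tokens, vocab, 12) for tokens in token_lists]

for tokens, ids in zip(token_lists, encoded):
    print(tokens)
    print(ids)
```

In a real pipeline the integer ids would be replaced by learned embedding vectors, and each fixed-length sequence would correspond to one fine-grained segment rather than one hand-picked line.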

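Figure 1. An example of an integer overflow/underflow vulnerability in a real smart contract. The vulnerability occurs at line 257, while the whole contract contains 303 lines of code. At line 257, the variables amout, cnt, and value are all declared as uint256, so the multiplication can overflow. The variable cnt is the number of transfer recipients. If a malicious attacker controls 2^255 accounts and value equals 2, then amout exceeds the maximum of the uint256 type (2^256 − 1), and the final result overflows and equals 0.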
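The arithmetic in the Figure 1 caption can be checked with a few lines of Python. This is only a sketch of wrap-around uint256 multiplication, assuming unchecked arithmetic (the default behaviour before Solidity 0.8, when no SafeMath-style check is used); the helper name `mul_uint256` is hypothetical, and the variable names follow the caption with conventional spelling.

```python
UINT256_MAX = 2**256 - 1


def mul_uint256(a: int, b: int) -> int:
    """Multiply as unchecked uint256 arithmetic would: keep only the low 256 bits."""
    return (a * b) & UINT256_MAX


cnt = 2**255    # number of recipients controlled by the attacker (from the caption)
value = 2       # amount sent to each recipient (from the caption)

true_product = cnt * value             # Python integers do not overflow
amount = mul_uint256(cnt, value)       # what a uint256 variable would hold

print(true_product > UINT256_MAX)      # True: the real product does not fit in uint256
print(amount)                          # 0: the stored result wraps around to zero
```

Because the stored total is 0, a balance check of the form "sender balance >= amount" passes trivially, which is what makes this class of bug exploitable.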

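Figure 2. Date labels of security incidents from 2020 to 2022 related to four smart contract vulnerabilities: integer overflow/underflow (IO/U), denial of service (DoS), authorization through tx.origin (TX), and time manipulation (Time).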

abstract syntax tree (AST)

AST-based Code Slices (ACSs)

AST-based Flow Units (AFUs)

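Figure 7. A real Ethereum smart contract example at address 0x2019763bd984cce011cd9b55b0e700abe42fa6c7. In (a), we show the meta-path extracted from the contract using the function call graph; combined with the abstract syntax tree structure of the source code, this yields the AFU instance in (b). The AFU involves 10 lines of code in total, only 29.41% of the original lines of code (excluding blank lines).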

Pdf link:

https://ieeexplore.ieee.org/document/10531111

Title:

DAppSCAN: Building Large-Scale Datasets for Smart Contract Weaknesses in DApp Projects

Authors:

School of Software Engineering, Sun Yat-sen University

Key words:

Empirical study, Smart contracts, SWC weakness, dataset, ethereum

Abstract:

The Smart Contract Weakness Classification Registry (SWC Registry) is a widely recognized list of smart contract weaknesses specific to the Ethereum platform. Despite the SWC Registry not being updated with new entries since 2020, the sustained development of smart contract analysis tools for detecting SWC-listed weaknesses highlights their ongoing significance in the field. However, evaluating these tools has proven challenging due to the absence of a large, unbiased, real-world dataset. To address this problem, we aim to build a large-scale SWC weakness dataset from real-world DApp projects. We recruited 22 participants and spent 44 person-months analyzing 1,199 open-source audit reports from 29 security teams. In total, we identified 9,154 weaknesses and developed two distinct datasets, i.e., DAppSCAN-Source and DAppSCAN-Bytecode. The DAppSCAN-Source dataset comprises 39,904 Solidity files, featuring 1,618 SWC weaknesses sourced from 682 real-world DApp projects. However, the Solidity files in this dataset may not be directly compilable for further analysis. To facilitate automated analysis, we developed a tool capable of automatically identifying dependency relationships within DApp projects and completing missing public libraries. Using this tool, we created the DAppSCAN-Bytecode dataset, which consists of 6,665 compiled smart contracts with 888 SWC weaknesses. Based on DAppSCAN-Bytecode, we conducted an empirical study to evaluate the performance of state-of-the-art smart contract weakness detection tools. The evaluation results revealed sub-par performance for these tools in terms of both effectiveness and success detection rate, indicating that future development should prioritize real-world datasets over simplistic toy contracts.
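The abstract mentions a tool that identifies dependency relationships inside a DApp project and completes missing public libraries. The sketch below is not that tool; it only illustrates the first half of the idea under simple assumptions: a project is a dictionary from file path to Solidity source, imports are found with a regular expression, and any import target that is not in the project is reported as missing. The names `find_imports`, `missing_dependencies`, and the sample project are hypothetical.

```python
import posixpath
import re

# Matches `import "path";` and `import ... from "path";` (a simplification
# of Solidity's import syntax; comments and strings are not handled).
IMPORT_RE = re.compile(
    r'^\s*import\s+(?:[^"\';]*?\s+from\s+)?["\']([^"\']+)["\']',
    re.MULTILINE,
)


def find_imports(source: str) -> list[str]:
    """Return the raw import paths that appear in one Solidity source file."""
    return IMPORT_RE.findall(source)


def missing_dependencies(project: dict[str, str]) -> set[str]:
    """Return import targets that cannot be found among the project's files."""
    missing = set()
    for path, source in project.items():
        for target in find_imports(source):
            if target.startswith("."):
                # Resolve relative imports against the importing file's directory.
                base = posixpath.dirname(path)
                target = posixpath.normpath(posixpath.join(base, target))
            if target not in project:
                missing.add(target)
    return missing


# Assumed toy project: one local library is present, one public library is not.
project = {
    "contracts/Token.sol": (
        "pragma solidity ^0.8.0;\n"
        'import "./lib/Math.sol";\n'
        'import {Ownable} from "@openzeppelin/contracts/access/Ownable.sol";\n'
        "contract Token {}\n"
    ),
    "contracts/lib/Math.sol": "library Math {}\n",
}

print(missing_dependencies(project))
```

A full solution would also have to fetch the right version of each missing library and pick a matching compiler version, which is where most of the practical difficulty lies.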

The authors have released the whole dataset publicly at https://github.com/InPlusLab/DAppSCAN/.

Pdf link:

https://ieeexplore.ieee.org/document/10486822

Title:

CRPWarner: Warning the Risk of Contract-Related Rug Pull in DeFi Smart Contracts

Authors:

School of Software Engineering, Sun Yat-sen University

Key words:

Smart contracts, decentralized finance, rug pull, datalog analysis

Abstract:

In recent years, Decentralized Finance (DeFi) has grown rapidly due to the development of blockchain technology and smart contracts. As of March 2023, the estimated global cryptocurrency market cap has reached approximately $949 billion. However, security incidents continue to plague the DeFi ecosystem, and one of the most notorious examples is the "Rug Pull" scam. This type of cryptocurrency scam occurs when the developer of a particular token project intentionally abandons the project and disappears with investors' funds. Despite only emerging in recent years, Rug Pull events have already caused significant financial losses. In this work, we manually collected and analyzed 103 real-world rug pull events, categorizing them based on their scam methods. Two primary categories were identified: Contract-related Rug Pull (through malicious functions in smart contracts) and Transaction-related Rug Pull (through cryptocurrency trading without utilizing malicious functions). Based on the analysis of rug pull events, we propose CRPWarner (short for Contract-related Rug Pull Risk Warner) to identify malicious functions in smart contracts and issue warnings regarding potential rug pulls. We evaluated CRPWarner on 69 open-source smart contracts related to rug pull events and achieved a 91.8% precision, 85.9% recall, and 88.7% F1-score. Additionally, when evaluating CRPWarner on 13,484 real-world token contracts on Ethereum, it successfully detected 4,168 smart contracts with malicious functions, including zero-day examples. The precision of large-scale experiments reaches 84.9%.
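As a quick sanity check on the reported metrics, the F1-score is the harmonic mean of precision and recall. The short Python sketch below recomputes it from the rounded percentages in the abstract; the helper name `f1_score` is just an illustrative choice. The result is about 88.75%, which agrees with the reported 88.7% up to rounding (the paper presumably computes F1 from raw counts rather than from rounded percentages).

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


precision = 0.918  # reported precision on the 69 rug-pull-related contracts
recall = 0.859     # reported recall on the same set

f1 = f1_score(precision, recall)
print(f"{f1:.4f}")
```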

Pdf link:

https://ieeexplore.ieee.org/document/10515209
