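Note: this is a compilation of material related to "Oxidizr | Oxidizer".
The English source text is quoted; the Chinese is machine translation and has not been proofread.
The layout has been lightly rearranged; if anything looks wrong, please consult the original articles.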
Oxidizing Ubuntu: adopting Rust utilities by default
系统革新:Ubuntu 默认采用 Rust 编写工具程序
By Joe Brockmeier
作者:乔·布罗克迈尔
March 18, 2025
If all goes according to plan, the Ubuntu project will soon be replacing many of the traditional GNU utilities with implementations written in Rust, such as those created by the uutils project, which we covered in February. Wholesale replacement of core utilities at the heart of a Linux distribution is no small matter, which is why Canonical's VP of engineering, Jon Seager, has released oxidizr. It is a command-line utility that helps users easily enable or disable the Rust-based utilities to test their suitability. Seager is calling for help with testing and for users to provide feedback with their experiences ahead of a possible switch for Ubuntu 25.10, an interim release scheduled for October 2025. So far, responses from the Ubuntu community seem positive if slightly skeptical of such a major change.
若计划顺利推进,Ubuntu 项目即将使用 Rust 编写程序替换大量传统 GNU 工具,uutils 项目产出工具便在替换范围内,本刊曾于 2 月对该项目进行相关报道。大规模替换 Linux 发行版内置基础工具属于重大调整,为此 Canonical 工程副总裁乔恩·西格尔推出命令行工具 oxidizr。该工具可便捷启停 Rust 版本程序,完成兼容性测试。Ubuntu 25.10 为阶段性版本,定于 2025 年 10 月发布,官方可能在此版本完成工具切换,西格尔号召用户参与测试并反馈使用体验。目前社区整体态度偏向接纳,同时也对此次大幅改动存在顾虑。
Next 20 years of Ubuntu
Ubuntu 未来二十年发展规划
In 2024, Ubuntu celebrated the 20th anniversary of its first release. Seager reflected on that milestone and published his vision for the next 20 years of Ubuntu in February. One of his themes for the future is modernization, calling on the project to constantly assess the foundations of the distribution against the needs of its users:
Ubuntu 于 2024 年迎来首个版本发布二十周年。西格尔回顾这一发展节点,并在 2 月发布发展构想,规划系统未来二十年发展方向。现代化升级是发展主线之一,项目团队需要结合用户需求,持续评估系统底层组件。
We should look deeply at the tools we ship with Ubuntu by default - selecting for tools that have resilience, performance and maintainability at their core. There are countless examples in the open source community of tools being re-engineered, and re-imagined using tools and practices that have only relatively recently become available. Some of my personal favourites include command-line utilities such as eza, bat, and helix, the new ghostty terminal emulator, and more foundational projects such as the uutils rewrite of coreutils in Rust. Each of these projects are at varying levels of maturity, but have demonstrated a vision for a more modern Unix-like experience that emphasizes resilience, performance and usability.
我们需要审视系统预装工具,选取稳定性、运行效率与可维护性表现优异的程序。开源社区中,大量工具依托新兴技术与开发模式完成重构迭代。个人较为认可的程序包含命令行工具 eza、bat、helix,全新终端模拟器 ghostty,以及采用 Rust 重构基础工具集的 uutils 项目。各类项目完善程度各不相同,均致力于打造稳定性、效率与易用性兼具的现代化类 Unix 使用体验。
On March 12, Seager published a follow-up to introduce his plan to start adopting some of the tools as defaults, with an eye to having them in place for the next Ubuntu long-term support (LTS) release, 26.04. The rationale for the switch is primarily "the enhanced resilience and safety that is more easily achieved with Rust ports". He cited a blog post by Rust core developer Niko Matsakis. The post, in a nutshell, is about Matsakis's vision for using Rust to write (or rewrite) foundational software; that is, "the software that underlies everything else".
3 月 12 日,西格尔发布后续文章,公布工具默认替换方案,目标在 26.04 长期支持版本完成部署。选用 Rust 重构程序,主要原因在于该语言能够显著提升程序稳定性与运行安全性。他引用 Rust 核心开发者 Niko Matsakis 的技术博文,文章核心观点为依托 Rust 编写重构系统底层支撑软件。
Those who have been following the continuing debates and discussions about using Rust will find familiar themes in Matsakis's arguments in its favor: Rust provides the performance of C/C++ without demanding perfection from developers, it provides reliability, and it makes developers more productive regardless of experience level. Its reliability makes it particularly suitable for foundational software because "when foundations fail, everything on top fails also". Given Ubuntu's widespread adoption, Seager wrote, "it behooves us to be absolutely certain we're shipping the most resilient and trustworthy software we can".
长期关注 Rust 应用讨论可发现,Matsakis 的观点具备共识性:Rust 可媲美 C/C++ 的运行效率,对开发人员编写规范无严苛要求,程序运行稳定,能够适配不同技术水平开发者,提升开发效率。底层程序故障会引发上层全部应用异常,稳定性特质让 Rust 适配底层软件开发。Ubuntu 受众基数庞大,系统预装程序必须具备极高的稳定性与可靠性。
Seager also thinks that embracing Rust will help meet another of his goals for Ubuntu, increasing the number of contributors. Not because Rust is necessarily easier to use than C, but because it provides a framework that makes it harder for contributors to commit potentially unsafe code. Presumably, though it was unsaid, that would make Rust a more attractive language for those interested in contributing but not interested in programming in C for whatever reason.
西格尔认为引入 Rust 还能扩充项目贡献者规模。该语言上手难度未必低于 C 语言,但其语法框架可规避高危代码编写行为。对于有意参与开源开发、不愿使用 C 语言的开发者而言,Rust 具备更高吸引力。
oxidizr
oxidizr 工具介绍
The abstract possibility that Rust utilities would be better, or even feasible, for Ubuntu is no substitute for hands-on experience. To that end, Seager created oxidizr as a way to quickly swap in (and out) Rust utilities in place of the traditional counterparts with relatively low risk. He released the first version, 1.0.0, on March 7. It is available under the Apache 2.0 license and, as one might expect, written in Rust.
理论层面无法判定 Rust 工具适配 Ubuntu 的实际效果,实操测试必不可少。西格尔开发 oxidizr 工具,可低风险快速切换传统工具与 Rust 版本工具。该工具首个版本 1.0.0 于 3 月 7 日发布,基于 Apache 2.0 协议开源,本体采用 Rust 编写。
The project is not yet packaged for Ubuntu, nor does Seager have a personal package archive (PPA) set up for users to install oxidizr with APT. There are binary releases on GitHub, or users can install the tool using cargo:
该工具暂无 Ubuntu 系统安装包,也未搭建个人软件源供 APT 命令安装。用户可获取 GitHub 编译成品,也可通过 cargo 执行安装操作。
$ cargo install --git https://github.com/jnsgruk/oxidizr
The binary releases may be the easiest way to get started, as oxidizr requires the exists function in the fs module, but exists was added in Rust 1.81.0 (released in September 2024) while the Rust version in Ubuntu 24.10 is still at 1.80.1. I used rustup to install the most recent stable version of Rust, and then used cargo to install oxidizr.
直接使用编译成品操作门槛更低。工具依赖文件系统模块 fs 内的 exists 函数,该函数在 2024 年 9 月推出的 Rust 1.81.0 版本新增,Ubuntu 24.10 内置 Rust 版本为 1.80.1,无法满足运行条件。可借助 rustup 升级至最新稳定版 Rust,再完成工具安装。
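As a rough pre-flight check before building from source, the version requirement described above can be sketched as follows. The helper name `version_ge` is made up for illustration, the script assumes a `sort` that supports `-V` (version sort), and 1.81.0 is the minimum version stated in the article.

```shell
# Hypothetical helper: succeeds if version $1 >= version $2.
# Relies on `sort -V` (version sort) being available.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

required="1.81.0"   # minimum Rust version cited in the article

if command -v rustc >/dev/null 2>&1; then
    # `rustc --version` prints e.g. "rustc 1.80.1 (...)"; field 2 is the version.
    installed="$(rustc --version | awk '{print $2}')"
    if version_ge "$installed" "$required"; then
        echo "rustc $installed is new enough to build oxidizr"
    else
        echo "rustc $installed is older than $required; update with rustup or use a binary release"
    fi
else
    echo "rustc not found; install Rust with rustup or use a binary release"
fi
```

If the check reports an older compiler, the binary releases mentioned above avoid the build step entirely.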
The oxidizr utility calls a set of utilities that can be independently replaced an "experiment". Experiments are Rust modules that define the packages to be installed (or removed) and handle renaming of the utilities to enable or disable use of the Rust versions. The current set of experiments includes replacing GNU coreutils, findutils, or diffutils with the uutils coreutils, findutils, or diffutils, as well as replacing traditional sudo with the Rust-based sudo-rs.
可单独替换的工具组被定义为测试项,测试模块基于 Rust 编写,用于管控程序安装卸载、程序别名修改,以此切换工具版本。现有测试项目包含:使用 uutils 工具集替换 GNU coreutils、findutils、diffutils,以及采用 Rust 编写的 sudo-rs 替换原生 sudo。
For instance, to try out sudo-rs a user would run this command:
启用 sudo-rs 可执行下述指令:
# oxidizr enable --experiments sudo-rs
That will install the sudo-rs package from the Ubuntu package repository, back up the sudo binary, and create a /usr/bin/sudo symbolic link that targets the Rust binary (/usr/lib/cargo/bin/sudo). To enable all experiments, a user would use the all target instead:
指令会从官方软件源安装程序,备份原生 sudo 可执行文件,并创建软链接 /usr/bin/sudo 指向 Rust 版本程序 /usr/lib/cargo/bin/sudo。执行下述指令可开启全部替换项目:
# oxidizr enable --all
Finally, to revert the system to the traditional utilities and remove the replacement packages from the system:
执行下述指令即可恢复默认工具,并卸载替换程序:
# oxidizr disable --all
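Conceptually, the enable/disable cycle described above for sudo-rs comes down to a backup, a symbolic link, and a restore. The following is a sandboxed sketch of that idea and not oxidizr's actual code: it works entirely inside a temporary directory with stand-in scripts, and the `.bak` backup name is an assumption made for illustration.

```shell
# Everything happens in a throwaway directory; no real system files are touched.
sandbox="$(mktemp -d)"
mkdir -p "$sandbox/usr/bin" "$sandbox/usr/lib/cargo/bin"

# Stand-ins for the traditional binary and its Rust replacement.
printf '#!/bin/sh\necho "original sudo"\n' > "$sandbox/usr/bin/sudo"
printf '#!/bin/sh\necho "sudo-rs"\n' > "$sandbox/usr/lib/cargo/bin/sudo"
chmod +x "$sandbox/usr/bin/sudo" "$sandbox/usr/lib/cargo/bin/sudo"

# "Enable": back up the original (hypothetical backup name),
# then point the usual path at the Rust binary with a symlink.
mv "$sandbox/usr/bin/sudo" "$sandbox/usr/bin/sudo.bak"
ln -s "$sandbox/usr/lib/cargo/bin/sudo" "$sandbox/usr/bin/sudo"
enabled_output="$("$sandbox/usr/bin/sudo")"

# "Disable": remove the symlink and restore the backup.
rm "$sandbox/usr/bin/sudo"
mv "$sandbox/usr/bin/sudo.bak" "$sandbox/usr/bin/sudo"
disabled_output="$("$sandbox/usr/bin/sudo")"

echo "while enabled: $enabled_output"
echo "after disable: $disabled_output"
rm -rf "$sandbox"
```

The real tool additionally installs and removes packages, which this sketch deliberately leaves out.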
According to Seager, oxidizr works on all versions of Ubuntu after 24.04 LTS, but the uutils diffutils experiment is only supported on Ubuntu 24.10 or later. He did urge users to start testing on a virtual machine or other machine that is not their production workstation or server for safety's sake. Seager reported that he hasn't had many problems, but he has run into one incompatibility: the uutils cp, mv, and ls replacements don't support the -Z flag yet, which is used to set the SELinux context of a file or (in the case of ls) print a file's security context.
该工具适配 24.04 长期支持版及后续 Ubuntu 系统,diffutils 替换功能仅支持 24.10 及更高版本。为保障业务设备稳定,官方建议用户在虚拟机或非生产设备中开展测试。西格尔自身测试过程整体平稳,仅发现一处兼容缺陷:uutils 版本的 cp、mv、ls 暂未兼容 -Z 参数,该参数用于配置文件 SELinux 安全上下文,也可查询文件安全权限信息。
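One way to spot this kind of gap before it breaks a script is to probe whether a utility accepts a given flag. The sketch below is only illustrative: the function name `accepts_flag` is invented here, and it simply treats a non-zero exit status as "not supported", which can also happen for unrelated reasons.

```shell
# Hypothetical helper: run "<tool> <flag> <args...>" quietly and report
# through the exit status whether the call succeeded.
accepts_flag() {
    tool="$1"; flag="$2"; shift 2
    "$tool" "$flag" "$@" >/dev/null 2>&1
}

# Probe the flag mentioned in the article against whichever ls is active.
if accepts_flag ls -Z /; then
    echo "ls accepts -Z"
else
    echo "ls rejected -Z (or the call failed for another reason)"
fi
```

Running the same probe before and after enabling an experiment gives a quick, if coarse, compatibility signal.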
In my brief testing, I did not run into any problems with the uutils versions of the utilities or the changes oxidizr made to the system in order to swap them in. However, I did note that oxidizr does not make any changes to the system's man pages. Even when the GNU utilities have been replaced with the uutils versions, the GNU man pages are left in place, so "man cp" still displays the GNU version. It would be good to switch the man pages too in order to expose users to any gaps in the uutils documentation as well as the utilities themselves.
短期测试中,工具替换流程与程序运行均未出现异常。该工具不会同步修改系统帮助手册,即便程序完成替换,执行 man cp 依旧调取 GNU 版本说明文档。同步更新手册,能够直观发现程序功能与文档描述存在的偏差。
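Because the man pages keep describing the GNU tools, it can help to confirm which implementation is actually answering a command. A minimal sketch follows; `classify_banner` is a hypothetical helper, and the "uutils" match assumes the Rust tools mention "uutils" in the first line of their `--version` output, which may not hold for every release.

```shell
# Hypothetical helper: classify a utility by the first line of its --version output.
classify_banner() {
    case "$1" in
        *GNU*)    echo "gnu" ;;
        *uutils*) echo "uutils" ;;   # assumption: the banner mentions "uutils"
        *)        echo "unknown" ;;
    esac
}

for tool in cp mv ls; do
    # An empty banner (for example, if --version is unsupported) is reported as "unknown".
    banner="$("$tool" --version 2>/dev/null | head -n 1)"
    echo "$tool: $(classify_banner "$banner") ($banner)"
done
```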
Reactions
社区各方反馈
Fern Dziadulewicz asked if the move toward uutils meant that "Ubuntu is actually kind of heading towards GNUlessness", as with some other Linux distributions that shy away from GNU components. Seager responded that people should not read too much into the change:
弗恩·贾杜莱维奇提出疑问,此次替换是否意味着 Ubuntu 逐步脱离 GNU 组件,效仿部分规避 GNU 程序的发行版。西格尔作出回应,无需过度解读本次调整。
This is not symbolic of any pointed move away from GNU components - it's literally just about replacing coreutils with a more modern equivalent.
本次调整不存在摒弃 GNU 组件的倾向,仅以现代化程序替换基础工具集。
Sure, the license is different, and it's a consideration, but it's by no means a driver in the decision making.
两类程序开源协议存在差异,该因素仅作为参考条件,并非调整的主导原因。
That response did not satisfy Joseph Erdosy, who wrote that he would migrate to Fedora or Rocky Linux if Ubuntu goes through with the change. He said that he liked Rust and the idea of better, memory-safe alternatives, but that he was unhappy that the biggest "oxidized" project was an MIT-licensed rewrite of GPL-licensed code.
该答复未能打消约瑟夫·埃尔多西的异议,他发文表示,若项目落实替换方案,自己将切换至 Fedora 或 Rocky Linux 系统。其认可 Rust 语言的内存安全优势,不满核心重构项目将 GPL 协议代码改写为 MIT 协议程序。
This decision seems to align with a broader trend of companies deprecating GPL software in favour of more permissively licensed alternatives, often under the guise of "modernization." However, the real-world impact is clear: free software is increasingly co-opted into proprietary ecosystems, weakening the principles that made Linux successful.
该举措契合行业趋势,企业常以现代化升级为由,舍弃 GPL 协议程序,选用约束性更低的开源程序。此类行为会让自由软件逐步纳入闭源体系,背离 Linux 发展初衷。
A few other users quickly agreed with Erdosy, then Ian Weisser announced that he was putting the topic into "Slow Mode" to "prevent piling-on until the developers have a chance to respond and keep this topic constructive". Shortly after, Seager responded that he did not agree that this potential move posed a threat to Ubuntu, or its community. He reiterated that it was not indicative of a political agenda or wider move away from GPL'ed software, and said that most of Canonical's own software is and would continue to be GPL'ed.
多名用户认同该观点,伊恩·韦瑟随即设置话题慢审模式,避免舆论偏激发酵,等待开发团队正式回应。西格尔后续再次表态,此次调整不会损害系统与社区发展,不存在排斥 GPL 协议软件的规划,Canonical 自研程序大多仍将沿用 GPL 协议。
Ubuntu is a collection of software that we curate to build a distribution. It's a project dedicated to shipping the latest, and best open source we can find. There is no evidence of foul play, bad practice or poor intentions from the uutils maintainers - they're a thoughtful, dedicated community who are building their own software, and even contributing back to GNU coreutils in some cases. They are achieving things I think we should aspire to with Ubuntu in the coming years, and I remain committed to giving this a chance at success - noting that we and others will need to work closely with them to resolve issues with locales, selinux support and other issues.
Ubuntu 整合各类开源程序搭建发行版,择优收录优质开源软件。uutils 开发团队合规开发,态度严谨专注,部分开发成果还回馈至 GNU 基础工具项目。该项目的研发方向具备参考价值,Ubuntu 将持续推进合作,协同解决区域语言、SELinux 适配等兼容问题。
If the current situation changes and we believe that the interests of the uutils project are no longer aligned with those of Ubuntu, we can change the coreutils package we choose to ship with Ubuntu.
若后续项目发展方向与 Ubuntu 发展诉求相悖,官方可重新选定预装基础工具程序。
Sergey Davidoff wondered why the Debian alternatives system, which is used to designate default applications when multiple programs with the same function are installed, was not sufficient for experimenting with Rust utilities. Julian Andres Klode replied that the alternatives system would not be suitable because the existing package would need to cooperate. He also responded to another user, "rain", who had floated the idea of allowing users to switch out individual commands. Klode said that it was a bad idea to allow users to select between Rust and non-Rust implementations on a per-command level, as it would make the resulting systems hard to support.
谢尔盖·达维多夫提出疑问,系统自带Debian 备选程序机制可切换同功能默认程序,为何无法满足 Rust 工具测试需求。朱利安·安德烈斯·克洛德作出解答,该机制运行依赖原有程序适配配合,不适用于本次测试。另有用户提议支持单条命令自由切换程序版本,该提议遭到否决,逐命令切换会大幅提升系统运维难度。
Liam Proven asked about support on versions of Ubuntu for architectures other than x86_64 and Arm, such as s390 and ppc64le, since "the LLVM Rust toolchain is still a little immature and code generation for other architectures is lacking". Uutils project founder and Ubuntu developer Sylvestre Ledru asked if Proven had any bug reports to share, since Firefox had been using the LLVM Rust toolchain to ship Firefox on those architectures for years. He pointed out that uutils had been successfully building on Debian and Ubuntu with those architectures as targets for a few years as well.
利亚姆·普罗文询问,LLVM 版 Rust 编译工具尚未成熟,跨架构编译能力不足,s390、ppc64le 等架构能否适配新工具。uutils 项目创始人、Ubuntu 开发人员西尔韦斯特·勒德吕回应,火狐浏览器多年来依托该编译链适配对应架构,uutils 项目也已稳定在多款异构架构的 Debian、Ubuntu 系统完成编译运行,暂未发现相关故障。
Next steps
后续推进计划
Seager said that he had met with Ledru to discuss the idea of making uutils coreutils the default in Ubuntu 25.10, and Ledru felt that the project was ready for that level of exposure. Now it is just a matter of specifics, he said, and the Ubuntu Foundations team is already working up a plan to implement this in the next release cycle. He did acknowledge that there was a need for caution and was open to the possibility that he would need to "scale back on the ambition" if making the switch meant compromising stability or reliability in an Ubuntu LTS release. If the switch doesn't work out, it should be easy enough to revert in time for next year's LTS release.
西格尔表示已与勒德吕洽谈,商议在 Ubuntu 25.10 将 uutils 基础工具设为默认程序,对方判定项目具备上线条件。目前团队着手敲定落地细则,底层开发组已制定迭代部署方案。官方秉持审慎态度,若切换操作会影响长期支持版本稳定性,将缩减替换范围。若适配效果未达预期,可在次年长期版本发布前恢复原有工具配置。
To date, Ubuntu seems to be the first major Linux distribution that has seriously considered a switch to uutils. If Ubuntu 25.10 ships with uutils coreutils, it will be a significant win for the uutils project that grants exposure to a much larger user base than it has enjoyed so far. The "oxidize Ubuntu" experiment has the potential to accelerate Rust's adoption and inspire further attempts to replace C-based utilities with Rust, or it might have a chilling effect if Ubuntu runs into serious problems. Either way, the project should be instructive for the larger community.
Ubuntu 是首个正式评估引入 uutils 工具集的主流 Linux 发行版。若 25.10 版本完成预装,该项目用户规模将大幅增长。本次系统改造能够推动 Rust 语言普及,带动更多 C 语言工具重构开发;若出现严重运行故障,也会放缓同类替换项目推进节奏。无论最终结果如何,本次实践都可为整个开源行业提供参考经验。
Comment highlights
Licensing, performance, compatibility, the GNU question, real-world bugs
一、许可证保护力度争议(Weakened license protection)
- I'm also one of the people who're concerned about the weakened protection that comes with the MIT license over a GPL/copyleft variant, which over time erodes the shared commons and is much more open to exploitation.
  我也担心 MIT 许可证相比 GPL/著佐权许可证保护力度减弱,长期会侵蚀开源公共资源,更容易被商业方利用。
- I kinda doubt a company will take these utils, close source them, and resell them without redistributing sources. It would bring only marginal benefit.
  我怀疑企业不会拿这些工具闭源再转售而不开放源码,收益很小。
- I am much more worried about new, innovative implementations with a higher degree of complexity. For instance rsync, or the new ripgrep implementation are much more sophisticated and would be more worrisome without copyleft.
  我更担心复杂度高、创新性强的工具(如 rsync、ripgrep),没有著佐权保护风险更大。
- Those who value copyleft should realize that C isn't the state of the art anymore... Algebraic types are more expressive, dependent typing allows specifying useful invariants, automatic memory management makes a lot of sense now that systems have 64+ gigabytes of RAM.
  重视著佐权的人该意识到 C 已非最先进技术:代数数据类型、依赖类型、自动内存管理在大内存时代更有优势。
- Sorry, why can't GNU use Rust?
  为什么 GNU 不能用 Rust?
- I obviously don't speak for the GNU project, but I would assume they prefer code they can compile with gcc, and the gcc-based Rust compiler isn't quite there yet.
  GNU 倾向用 GCC 编译;基于 GCC 的 Rust 编译器(gccrs)尚未成熟。
- The gcc-based Rust compiler is still a long way ahead of the gcc-based compiler for a hypothetical GNU language that hasn't been invented yet... If there's an urgent need to defend copyleft, you can't afford to pause and build a whole new language first.
  与其造新语言,不如先完善 gccrs;保护著佐权刻不容缓,不能等新语言。
- What if somebody manages to patent a borrow checker?
  会不会有人给 borrow checker(借用检查器)申请专利?
- Obviously patent trolls are a huge problem for small and independent projects... Rust is pretty clearly prior art, but it's not even the first language to use a borrow checker. Cyclone is older.
  专利流氓是小项目大威胁,但 Rust 是现有技术(prior art),Cyclone 更早用借用检查器,专利风险低。
- Lots of information here... The license issue is addressed during the talk.
  许可证决策背景可参考 FOSDEM 2025 演讲。
- Real bummed to learn the uutils project is MIT licenced and not GPL. Anyone have any background on that decision?
  很失望 uutils 用 MIT 而非 GPL,想知道决策原因。
二、性能争议(Performance concerns when heavily used in scripts)
- That's 4.6 times slower. Same for other tools, wc is 3 times slower.
  慢 4.6 倍,wc 慢 3 倍。
- Loading (and dynamically linking) a 23 MB executable thousands of times per second in scripts is going to cost a lot.
  多调用二进制(23MB)频繁加载+动态链接,脚本开销巨大。
- There is room for performance optimization in Rust side.
  Rust 端有性能优化空间。
- It's good to have all binaries first, and optimize later.
  先把功能做全,再优化性能。
- If every call to a uutil incurs that kind of overhead existing scripts will be noticeably slower... find, sort may end up being faster, depending on the size of their working set.
  频繁调用小工具会拖慢脚本,但 find/sort 等复杂工具可能更快。
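A rough way to see the per-invocation cost the commenters describe is to time many runs of a small utility, once with the GNU tools active and once with the Rust tools. The sketch below is an assumption-laden illustration, not the commenters' benchmark: `time_invocations` is a made-up helper, and it assumes a `date` that supports `%N` (nanoseconds), as GNU date does.

```shell
# Hypothetical helper: run a command N times and print the elapsed milliseconds.
# Assumes `date +%s%N` yields nanoseconds since the epoch.
time_invocations() {
    n="$1"; shift
    start="$(date +%s%N)"
    i=0
    while [ "$i" -lt "$n" ]; do
        "$@" >/dev/null 2>&1
        i=$((i + 1))
    done
    end="$(date +%s%N)"
    echo $(( (end - start) / 1000000 ))
}

# basename is a tiny external program, so most of the time measured is process startup.
elapsed_ms="$(time_invocations 500 basename /tmp/example.txt)"
echo "500 invocations of basename took about ${elapsed_ms} ms"
```

The absolute numbers depend heavily on the machine; only the comparison between the two implementations on the same system is meaningful.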
三、系统替换与兼容性(More robust oxidizr behavior?)
- /usr/bin is off limits and should only be written to by distro package managers like dpkg.
  /usr/bin 属于包管理器管控,不应被第三方工具随意修改。
- Code that calls various binaries by their absolute path does exist... such code is brittle by definition.
  存在硬编码绝对路径的脚本,但这种设计脆弱易坏。
- dpkg-divert does not require cooperation... it is a powerful tool that lets you move arbitrary packaged files out of the way permanently.
  dpkg-divert 可强制迁移系统文件,无需原包配合。
- The Filesystem Hierarchy Standard (FHS) requires some basic utilities to be in /bin.
  FHS 标准要求基础工具放在 /bin。
- POSIX does not guarantee /bin/sh is fixed path.
  POSIX 不保证 /bin/sh 路径固定。
四、手册页问题(Manpages are important)
- As long as they do update the man pages I'll be happy.
  只要更新手册页(man pages)就可以接受。
- The GNU manpages feature an "AUTHORS" and "REPORTING BUGS" section. That should at least be removed.
  复用 GNU 手册需删除作者、漏洞反馈等专属段落。
五、实际故障(od tool broken in uutils)
- xfstests fail with current Ubuntu due to the regression in switching to uutils vs od which doesn't handle case management of output properly.
  uutils 的 od 工具大小写输出错误,导致 xfstests 失败。
- There are also missing tools not just broken tools.
  不仅工具损坏,还存在工具缺失。
- sudo apt-get remove coreutils-from-uutils --allow-remove-essential fixes the regression.
  卸载 coreutils-from-uutils、回退 GNU coreutils 可修复该回归问题。
Copyright © 2025, Eklektix, Inc.
版权所有 © 2025 埃克莱蒂克斯公司
This article may be redistributed under the terms of the Creative Commons CC BY-SA 4.0 license
本文依据知识共享署名-相同方式共享 4.0协议允许二次分发
Comments and public postings are copyrighted by their creators.
评论及公开内容版权归发布者所有
Linux is a registered trademark of Linus Torvalds
Linux 为莱纳斯·托瓦尔兹注册商标
Oxidizr: Try Modern Rust Alternatives To Legacy Unix Tools In Ubuntu
Oxidizr:在 Ubuntu 系统中体验旧式 Unix 工具的现代 Rust 替代程序
Modernize Ubuntu Tools with Oxidizr and Rust.
借助 Oxidizr 与 Rust 实现 Ubuntu 工具现代化升级
Written by Sk. Published: May 9, 2025
作者:Sk 发布时间:2025 年 5 月 9 日
Looking to try modern Rust-based replacements for classic Unix tools on your Ubuntu system? Say hello to oxidizr, a lightweight command-line utility designed to help you experiment with safer, memory-safe Rust alternatives to traditional Unix utilities.
想要在 Ubuntu 系统中体验经典 Unix 工具的现代 Rust 替代程序?不妨了解 oxidizr,这款轻量级命令行工具可协助用户试用安全性更高、具备内存安全特性的传统 Unix 工具 Rust 替代版本。
Many core Unix tools in Ubuntu are written in C, a language known for its speed but also for its potential security risks.
Ubuntu 系统内多数 Unix 核心工具均采用 C 语言编写,该语言运行速度出众,但存在潜在安全隐患。
That's why Ubuntu is gradually replacing some of these tools with versions written in Rust, a programming language built from the ground up for safety and reliability.
正因如此,Ubuntu 逐步将部分工具替换为 Rust 语言版本,这门编程语言从底层架构出发,兼顾运行安全与程序稳定性。
Oxidizr makes this transition easy. It lets you toggle between legacy tools and their Rust-based counterparts without making permanent changes to your system.
Oxidizr 简化了工具切换流程,用户可在旧式工具与 Rust 重构版本之间自由切换,不会对系统产生永久性改动。
Introduction
简介
Ubuntu is a popular Linux operating system used by millions around the world, from everyday computers to large servers.
Ubuntu 是应用广泛的 Linux 操作系统,全球海量个人计算机与大型服务器均采用该系统。
The tools that make up its core, like the sudo command you use to run things as an administrator, are incredibly important for keeping everything safe and running smoothly.
系统内置各类核心工具,例如用于管理员权限操作的 sudo 指令,是保障系统稳定安全运行的重要组成部分。
But even the most trusted tools can sometimes have weaknesses. Many of these core tools were written a long time ago in the C programming language.
即便是常用可靠工具也存在缺陷,这类核心工具大多早年基于 C 语言开发完成。
While C is powerful, it can sometimes allow certain types of errors that can be exploited by malicious actors to gain access or disrupt systems.
C 语言功能强大,但会产生部分程序漏洞,不法分子可利用漏洞入侵系统、破坏设备运行状态。
This is where Rust comes in. Rust is a more modern programming language known for its strong focus on safety, especially when it comes to handling computer memory.
Rust 就此发挥作用,这门现代化编程语言高度侧重运行安全,内存处理方面的安全特性尤为突出。
By design, Rust helps prevent many of the common mistakes that can lead to security problems in C code.
Rust 的语言特性可规避 C 语言代码中多数易引发安全故障的常规程序错误。
Ubuntu is now taking steps to replace some of its foundational tools with newer versions written in Rust.
Ubuntu 目前逐步将系统基础工具更替为 Rust 编写的全新版本。
This effort is part of a plan called "Carefully But Purposefully Oxidising Ubuntu". The goal is to make Ubuntu even more resilient and trustworthy.
该替换工作隶属于 "审慎稳步推进 Ubuntu Rust 化改造" 规划,以此提升系统抗风险能力与使用可信度。
Meet Oxidizr: Test Rust-Based Tools on Ubuntu
认识 Oxidizr:在 Ubuntu 中测试 Rust 重构工具
To help developers and curious users explore these new Rust-based tools, a special utility called oxidizr has been created.
为方便开发者与爱好者体验全新 Rust 工具,官方推出专用工具 oxidizr。
What is oxidizr?
Oxidizr 是什么?
Simply put, oxidizr is a command-line tool for Ubuntu systems. It lets you experimentally swap out some traditional Unix tools with their modern Rust-based replacements.
简单来说,oxidizr 是适配 Ubuntu 系统的命令行程序,支持试验性地将传统 Unix 工具替换为现代 Rust 重构版本。
Think of it like having two versions of a tool available, and oxidizr helps you switch between them to see how the new one works.
可将其理解为同一工具拥有两个运行版本,借助 oxidizr 即可切换版本,测试新版工具运行效果。
This is useful for testing and seeing how ready these new tools are for everyday use.
该工具可用于功能测试,检验新版工具是否满足日常使用标准。
Which Rust-based Tools Can oxidizr Replace?
Oxidizr 可替换哪些 Rust 版本工具?
As of version 1.0.0, oxidizr supports swapping out the following sets of tools with Rust versions from the uutils and sudo-rs projects:
截至 1.0.0 版本,oxidizr 支持将工具替换为 uutils 与 sudo-rs 项目提供的 Rust 程序,涵盖类别如下:
- uutils coreutils: This replaces many of the basic commands you use constantly, like ls (list files) or cp (copy files). coreutils and sudo-rs are the most complete experiments and are enabled by default in oxidizr.
  uutils 核心工具集:替换 ls(文件罗列)、cp(文件复制)等高频基础指令。coreutils 与 sudo-rs 成熟度最高,工具默认启用这两项试验功能。
- uutils findutils: Tools for finding files.
  uutils 文件查找工具集:用于检索系统文件。
- uutils diffutils: Tools for comparing files and text.
  uutils 差异对比工具集:比对文件与文本内容。
- sudo-rs: A new implementation of the critical sudo command.
  sudo-rs:核心指令 sudo 的全新重构程序。
sudo-rs is being developed by the Trifecta Tech Foundation (TTF), a non-profit focused on building secure, open-source infrastructure; the uutils tools come from the separate uutils project.
sudo-rs 由 Trifecta Tech Foundation(TTF)开发,该非营利机构专注研发安全开源系统基础程序;uutils 系列工具则出自独立的 uutils 项目。
TTF also develops other important projects related to network time, data compression, and smart grids.
基金会同时研发网络时间同步、数据压缩、智能电网相关各类重要程序项目。
Why Use Rust for Tools like sudo and coreutils?
sudo、核心工具集为何选用 Rust 语言开发?
The main reason for rewriting tools like sudo in Rust is enhanced safety and security.
将 sudo 这类工具以 Rust 语言重构,目的在于提升程序运行防护等级。
The original sudo tool, while very important, has been the target of many security issues over the years due to memory-related vulnerabilities.
原版 sudo 工具作用关键,但历年多次因内存漏洞引发各类安全问题。
Rust's design provides strong guarantees against these kinds of memory safety issues.
Rust 语言架构可有效规避此类内存安全故障。
By using Rust for tools that handle important tasks like allowing users to run commands as the administrator (sudo), Ubuntu aims to make the system's security more robust.
管理员权限执行等核心操作工具采用 Rust 开发,能够强化 Ubuntu 系统整体安全防护能力。
For most people, using sudo-rs instead of the original sudo should feel exactly the same in day-to-day use. It's designed to be a near drop-in replacement.
普通用户日常操作中,sudo-rs 与原版 sudo 使用体验基本无差别,可直接替代原版程序运行。
However, the developers are taking a "less is more" approach, meaning some less common or outdated features of the original sudo might not be included.
开发团队秉持精简设计理念,原版 sudo 中部分小众、老旧功能不会纳入新版程序。
For example, distributing the sudoers file using something called LDAP is not planned for sudo-rs.
例如 sudo-rs 不会搭载基于轻型目录访问协议分发 sudoers 配置文件的相关功能。
Ubuntu plans to make sudo-rs the default sudo in the upcoming Ubuntu 25.10 release.
Ubuntu 计划在 Ubuntu 25.10 正式版本中,将 sudo-rs 设为默认权限指令程序。
This release is being used as a testing ground to gather feedback before possibly including sudo-rs by default in the next long-term support release, Ubuntu 26.04 LTS.
该版本作为测试版本收集使用反馈,后续考量将 sudo-rs 纳入 Ubuntu 26.04 长期支持版默认程序。
For more details, please refer to our article linked below:
更多相关详情可参考下方文章链接:
- Ubuntu 25.10 Replaces Sudo with Rust-Based sudo-rs
- Ubuntu 25.10 将采用 Rust 编写的 sudo-rs 替换原有 sudo 程序
How to Try Rust Alternatives to Unix Tools with Oxidizr
借助 Oxidizr 体验 Unix 工具 Rust 替代版本
If you are an advanced user or developer and want to test these new tools, oxidizr is available.
资深用户与开发人员可使用 oxidizr 测试全新工具程序。
BUT, and this is very important: oxidizr is an experimental tool! Using it MAY cause problems with your system, including losing data or preventing your computer from starting up.
重要提醒:oxidizr 属于试验性工具,使用该程序可能引发系统异常,出现数据丢失、设备无法开机等问题。
It is strongly recommended to use oxidizr only on a test machine or a virtual machine.
建议仅在测试设备或虚拟机环境中运行 oxidizr。
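As a small safeguard in the spirit of this recommendation, a wrapper script could check for virtualization before going any further. This is a sketch under assumptions: `looks_virtualized` is a hypothetical helper, and it relies on `systemd-detect-virt`, which prints the detected technology or `none`; on systems without that command the check falls back to "not detected".

```shell
# Hypothetical guard: decide from the text printed by `systemd-detect-virt`
# whether we appear to be inside a VM or container.
looks_virtualized() {
    case "$1" in
        ""|none) return 1 ;;   # no output, or "none": treat as bare metal
        *)       return 0 ;;
    esac
}

virt="$(systemd-detect-virt 2>/dev/null || true)"
if looks_virtualized "$virt"; then
    echo "virtualization detected ($virt); a reasonable place to experiment"
else
    echo "no virtualization detected; consider using a VM or a spare machine first"
fi
```

A check like this cannot tell a disposable VM from a production one, so it complements rather than replaces judgment.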
If you understand the risks and want to proceed, you can install oxidizr by downloading its binary releases from GitHub or using the cargo tool if you have Rust installed. It works on Ubuntu versions after 24.04 LTS.
确认知晓风险后,可从 GitHub 获取二进制安装包,或借助已部署的 Rust 环境内 cargo 工具完成安装,程序适配 24.04 长期支持版及后续 Ubuntu 系统。
Install oxidizr in Ubuntu
在 Ubuntu 系统中安装 Oxidizr
The oxidizr tool itself should work on all versions of Ubuntu after 24.04 LTS. But, the availability of the experiments might vary; for example, the diffutils experiment is only available from Ubuntu 24.10 onward.
Oxidizr 程序可运行于 24.04 长期支持版之后所有 Ubuntu 版本,但各类试验功能适配范围存在差异,例如 diffutils 试验仅支持 24.10 及以上系统版本。
You need root privileges to run oxidizr. There are a couple of ways to get oxidizr on your system:
运行程序需要管理员权限,系统可通过两种方式安装 Oxidizr:
Using cargo (if you have Rust's package manager installed):
使用 cargo 工具(已安装 Rust 包管理器即可操作):
cargo install --git https://github.com/jnsgruk/oxidizr
Downloading and installing binary releases from GitHub:
从 Github 下载二进制程序安装包
Another way to install oxidizr is to use curl and tar to download the latest release binary for your architecture and extract it to /usr/bin/oxidizr.
也可通过 curl 与 tar 指令下载适配设备架构的最新程序包,将程序放置至 /usr/bin/oxidizr 目录。
Get the latest release:
获取最新程序版本号:
latest="$(curl -s "https://api.github.com/repos/jnsgruk/oxidizr/releases/latest" | jq -r '.name')"
Download and install to /usr/bin/oxidizr:
下载程序并安装至指定目录:
curl -sL "https://github.com/jnsgruk/oxidizr/releases/download/$latest/oxidizr_Linux_$(uname -m).tar.gz" | sudo tar -xvzf - -C /usr/bin oxidizr
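Before piping a download straight into /usr/bin, it may be worth checking the inputs the two commands above depend on. The following sketch only prints information and downloads nothing; `release_url` and `valid_tag` are hypothetical helpers, the URL pattern is copied from the command above, and `v1.0.0` is an example tag whose exact form has not been verified against the repository.

```shell
# Hypothetical helper: build the download URL for a release tag and architecture,
# following the URL pattern used in the install command.
release_url() {
    printf 'https://github.com/jnsgruk/oxidizr/releases/download/%s/oxidizr_Linux_%s.tar.gz\n' "$1" "$2"
}

# Hypothetical helper: reject an empty tag or the literal "null" that
# `jq -r` prints when the requested field is missing.
valid_tag() {
    [ -n "$1" ] && [ "$1" != "null" ]
}

# Report any tool the install commands rely on that is not installed.
for dep in curl jq tar; do
    command -v "$dep" >/dev/null 2>&1 || echo "missing dependency: $dep"
done

example_tag="v1.0.0"   # example value only; use the tag returned by the API call
if valid_tag "$example_tag"; then
    release_url "$example_tag" "$(uname -m)"
fi
```

Printing the URL first makes it easy to confirm that a release asset exists for the machine's architecture before extracting anything as root.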
Switch to Rust Tools on Ubuntu Using oxidizr
依托 Oxidizr 切换 Ubuntu Rust 工具版本
Once installed, you invoke oxidizr as root. It supports two main commands: enable and disable.
安装完成后以管理员身份调用程序,核心操作指令包含 enable 启用、disable 停用两类。
Supported Experiments:
支持的试验工具列表
As of version 1.0.0, oxidizr supports experiments for the following Rust-based replacements:
1.0.0 版本可启用下述 Rust 替代工具试验功能:
- uutils coreutils
- uutils findutils
- uutils diffutils
- sudo-rs
By default, the coreutils and sudo-rs experiments are enabled because they are considered the most complete and stable experiments.
coreutils 与 sudo-rs 试验功能完整性与稳定性更佳,程序默认自动开启。
Enabling Experiments:
启用试验功能
To enable the default experiments (which are rust-coreutils and sudo-rs in v1.0.0), you can use:
启用 1.0.0 版本默认的 rust-coreutils 与 sudo-rs 试验功能,执行下述指令:
sudo oxidizr enable
You will be given a warning before enabling the experiments. Type y and hit ENTER to continue.
启用功能前会弹出风险提示,输入 y 按下回车键即可继续操作。
? Continue? (y/N) y
[⚠️ oxidizr can cause harm to your system! ⚠️
Depending on your configuration and workload, oxidizr's
experiments could cause your machine to fail to boot, or
your workloads to fail. Use with caution.]
This will install and enable rust-coreutils and sudo-rs.
执行后自动安装并开启 rust-coreutils 与 sudo-rs 程序。
2025-05-09T08:18:47.595755Z INFO Updating apt package cache
2025-05-09T08:18:55.534721Z INFO Installing and configuring rust-coreutils
2025-05-09T08:19:01.599105Z INFO Installing and configuring sudo-rs

To enable specific experiments, you can use the --experiments flag:
指定开启单项或多项试验功能,添加 --experiments 参数执行指令:
sudo oxidizr enable --experiments coreutils findutils
To enable all known experiments (without prompting), you can use --all --yes:
无弹窗确认、一键开启全部试验功能,执行下述指令:
sudo oxidizr enable --all --yes
Disabling Experiments:
停用试验功能
Note: The process for disabling generally involves checking for and restoring backed-up original binaries and uninstalling the new package.
备注:停用流程会校验备份文件,恢复原版程序并卸载新版工具包。
To disable the default experiments, you use:
关闭默认启用的试验功能,执行指令:
sudo oxidizr disable
To disable all experiments (without prompting), you can use --all --yes:
无弹窗确认、一键关闭全部试验功能,执行指令:
sudo oxidizr disable --all --yes
Display help:
查看程序帮助文档:
oxidizr --help
Experiment with Rust-based Unix Tools
体验 Rust 重构版 Unix 工具
Once you've enabled the experiments, try commands such as cp, mv, and ls, along with the others each package provides, and see how they perform.
开启试验功能后,运行 cp、mv、ls 等各类工具指令,测试程序运行表现。
The coreutils package includes the most fundamental shell commands:
coreutils 工具集涵盖绝大多数基础终端操作指令:
ls, cp, mv, rm, cat, touch, mkdir, rmdir, echo, pwd, whoami, df, du, head, tail, sort, uniq, cut, tr, tee, yes, basename, dirname, date, env, sleep, stat, test, true, false, timeout, uptime, etc.
diffutils includes tools for comparing files:
diffutils 工具集提供文件内容比对相关程序:
diff, cmp, sdiff, diff3
findutils includes tools for finding files and executing actions:
findutils 工具集用于文件检索与配套操作执行:
find, xargs
To see which commands or binaries are provided by a package like coreutils, diffutils, findutils, and sudo-rs, use one of the following commands:
查询 coreutils、diffutils、findutils、sudo-rs 工具包含的指令与程序文件,执行下述命令:
dpkg -L coreutils | grep '/bin/'
dpkg -L diffutils | grep '/bin/'
dpkg -L findutils | grep '/bin/'
dpkg -L sudo-rs | grep '/bin/'
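Package listings show what is installed, not what a command name currently resolves to. A complementary sketch is to follow each command to the file that actually runs; `resolve_cmd` is a hypothetical helper built on `command -v` and `readlink -f`, and the tool list is just a sample.

```shell
# Hypothetical helper: print the real file behind a command name,
# following any symlinks; fails if the command is not found.
resolve_cmd() {
    path="$(command -v "$1")" || return 1
    readlink -f "$path"
}

for tool in ls cp find diff sudo; do
    if target="$(resolve_cmd "$tool")"; then
        echo "$tool -> $target"
    else
        echo "$tool: not found"
    fi
done
```

With an experiment enabled, the targets would be expected to point at the Rust binaries rather than the original files, since the swap is done with symbolic links.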
Conclusion
总结
oxidizr is a tool that lets you get a glimpse into the future of Ubuntu. By helping test Rust-based replacements for essential tools like sudo and coreutils, it supports Ubuntu's mission to build a safer, more resilient operating system for everyone.
oxidizr 可直观展现 Ubuntu 系统后续发展方向,该程序助力 sudo、核心工具集等基础程序的 Rust 版本测试,推动系统打造安全稳定的运行环境。
Remember, use oxidizr with caution on a test system, but know that the work behind it aims to make your Ubuntu experience more secure in the long run.
请务必在测试环境谨慎使用该工具,相关开发优化工作将持续提升 Ubuntu 系统长期使用安全性。
Resource:
参考资源
Oxidizer: Toward Concise and High-fidelity Rust Decompilation
Oxidizer:迈向简洁高保真的 Rust 反编译
Yibo Liu*, Zion Leonahenahe Basque*, Arvind S Raj*, Chavin Udomwongsa*, Chang Zhu*, Jie Hu*, Changyu Zhao†, Fangzhou Dong*, Adam Doupé*, Tiffany Bao*, Yan Shoshitaishvili*, Ruoyu Wang*
*Arizona State University
{yiboliu, zbasque, arvindsraj, cudomwon, czhu62, jiehu12, bonniedong, doupe, tbao, yans, fishw}@asu.edu
†Stanford University
* 亚利桑那州立大学
† 斯坦福大学
Abstract
Rust is an increasingly popular language that has gained traction among developers. As a memory-safe language, Rust reduces the burden for developers to create reliable and fast software. However, the same features can also hinder reverse engineering tasks. For instance, malware developers have also picked up on the trend of Rust, using it to make their malware more reliable and difficult to analyze.
摘要:Rust 是一门日益流行的编程语言,在开发者中认可度不断提升。作为内存安全型语言,Rust 降低了开发者编写可靠、高性能软件的负担。但这些特性同样会给逆向工程任务带来阻碍。例如,恶意软件开发者也紧跟 Rust 趋势,用它提升恶意程序的可靠性并增加分析难度。
Reverse engineering tasks often rely on decompilers to recover the source code from these binaries. However, analysts find it difficult to analyze Rust binaries using modern C decompilers. Modern C decompilers fail on Rust binaries because they fail to recover high-level Rust abstractions from low-level implementations. As a result, the decompiled output is often verbose and inaccurate. Therefore, we believe that to achieve high-quality Rust decompilation, a decompiler must bridge the gap between high-level Rust abstractions and low-level implementations.
逆向工程任务通常依赖反编译器从二进制文件中还原源代码。但安全分析人员发现,用现有 C 语言反编译器分析 Rust 二进制文件十分困难。现有 C 反编译器无法从底层实现中还原 Rust 高层抽象,导致在处理 Rust 二进制时失效,反编译结果往往冗长且不准确。因此我们认为,要实现高质量的 Rust 反编译,反编译器必须弥合 Rust 高层抽象与底层实现之间的鸿沟。
In this paper, we study how C decompilers fail at decompiling Rust binaries. We identify a comprehensive list of decompilation failures, find the root causes of these failures, and develop a novel decompiler, OXIDIZER, for decompiling Rust binaries to Rust pseudocode. We evaluate OXIDIZER on 28 popular Rust projects across multiple optimization levels and compiler versions, comparing it against angr, Hex-Rays, Ghidra, and Binary Ninja. OXIDIZER outperforms all baselines on most conciseness and fidelity metrics, and is the only tool capable of recovering Rust enums and macros. A human study further shows that participants using OXIDIZER achieved 28% higher accuracy and completed tasks 20% faster than those using Hex-Rays.
本文研究了 C 反编译器在反编译 Rust 二进制文件时的失效场景,整理了完整的反编译失效清单并定位根本原因,进而研发了新型反编译器 OXIDIZER,可将 Rust 二进制反编译为 Rust 伪代码。我们在 28 个主流 Rust 项目、多种优化级别与编译器版本下对 OXIDIZER 进行评估,并与 angr、Hex-Rays、Ghidra、Binary Ninja 对比。结果显示,OXIDIZER 在简洁性与保真度的多数指标上优于所有基准工具,且是唯一能还原 Rust 枚举(enum)与宏的工具。人工评估进一步表明,使用 OXIDIZER 的受试者准确率提升 28%,任务完成速度加快 20%。
1. Introduction
引言
The Rust programming language [1] has grown in popularity and is used for critical systems such as embedded firmware [2] and the Linux kernel [3]. This adoption has largely been driven by Rust's memory-safety guarantees, reducing the burden of building reliable and performant software [4]. On the other hand, malware authors have also adopted Rust to build malware for its high execution speed, cross-platform capabilities, and resistance to reverse engineering [5].
Rust 编程语言[1] 愈发流行,已用于嵌入式固件[2]、Linux 内核[3] 等关键系统。其广泛应用主要得益于 Rust 的内存安全保障,降低了开发可靠、高性能软件的成本[4]。另一方面,恶意软件作者也选用 Rust 开发恶意程序,因其执行速度快、跨平台且难以逆向分析[5]。
Recent advancements in binary decompilation [6--8] have facilitated its application in security tasks such as malware analysis [9]. However, most modern decompilers are primarily designed for C/C++ binaries and do not account for Rust's characteristics [6--8, 10, 11]. Given the growing prevalence of Rust-based malware, this design gap makes existing decompilers ineffective for analyzing Rust binaries (as shown in Section 2). One trivial workaround is to apply post-decompilation fixes on C-oriented decompilers' output, using methods such as case-specific scripting [12, 13] and manual rewriting [13]. Even though there is no public work on using LLMs to decompile Rust binaries, we found that recent research has explored LLM-assisted C decompilation [14--17]. We experimented with trivial post-decompilation fixes by prompting LLMs, but found that there are some fundamental issues with LLM-based post-decompilation fixes, hallucination and information loss, which we will explain in Section 2. Therefore, a dedicated Rust decompiler is still needed.
二进制反编译技术的最新进展[6--8] 推动了其在恶意软件分析等安全任务中的应用[9]。但现有主流反编译器主要面向 C/C++ 二进制设计,未适配 Rust 语言特性[6--8, 10, 11]。随着 Rust 编写的恶意软件日益增多,这一设计缺陷导致现有工具无法有效分析 Rust 二进制(见第 2 节)。一种简单的变通方案是对面向 C 的反编译器输出做后处理修复,例如针对性脚本[12, 13] 与手动改写[13]。尽管尚无公开工作研究用大语言模型(LLM)反编译 Rust 二进制,但近期已有研究探索 LLM 辅助 C 反编译[14--17]。我们尝试用 LLM 做简单的反编译后修复,却发现这类方法存在根本性问题:幻觉与信息丢失,详见第 2 节。因此,仍需研发专用的 Rust 反编译器。
LLM-based post-decompilation fixes tend to hallucinate, producing code that appears coherent but lacks semantic grounding, while scripting and manual rewriting can only address trivial issues such as correcting Rust string literals or demangling function names. More complex challenges, such as reconstructing Rust enums or recovering Rust-specific control-flow constructs, remain largely unsolved.
基于 LLM 的反编译后修复易产生幻觉,生成看似连贯却无语义依据的代码;而脚本与手动改写仅能解决 Rust 字符串字面量修正、函数名去修饰等简单问题。重构 Rust 枚举、还原 Rust 特有控制流结构等复杂挑战仍未得到解决。
In this paper, we first examine how C-oriented decompilers perform on Rust binaries. We evaluate four popular C decompilers on 28 popular Rust projects on GitHub. Based on the evaluation, we find that decompiling Rust binaries into C pseudocode often yields verbose code with reduced abstraction, loss of semantic information, and potential inaccuracies. We compile a comprehensive list of fidelity issues observed in these decompilation outputs and trace their root causes to two Rust-specific challenges: (1) Rust introduces high-level abstractions that have no direct counterparts in C (e.g., enums, pattern matching); (2) Even for similar abstractions, the Rust and C compilers generate different low-level implementations (e.g., string handling and library linkage).
本文首先测试面向 C 的反编译器在 Rust 二进制上的表现,在 GitHub 上 28 个主流 Rust 项目中评估四款热门 C 反编译器。结果表明,将 Rust 二进制反编译为 C 伪代码,往往得到冗长、抽象度降低、语义信息丢失且可能不准确的代码。我们整理了反编译输出中观测到的完整保真度问题清单,并将根源归结为两项 Rust 特有挑战:(1)Rust 引入了 C 中无直接对应的高层抽象(如枚举、模式匹配);(2)即便对相似抽象,Rust 与 C 编译器生成的底层实现也不同(如字符串处理、库链接)。
We then develop novel techniques to address the fidelity issues in Rust decompilation and implement them in OXIDIZER, a prototype Rust decompiler built on top of angr [18]. OXIDIZER performs binary-level analyses to overcome challenges introduced by the Rust compiler, enabling Rust standard library function identification and the application of known struct and function types. OXIDIZER also addresses identified fidelity issues by designing the three core components of decompilation, control-flow recovery, type recovery, and structuring, specifically for Rust. With these Rust-oriented designs, OXIDIZER is able to recover Rust high-level abstractions such as macros, enums, and pattern matching. OXIDIZER takes a first step toward high-quality Rust decompilation by generating concise, faithful, and human-readable code that enables more effective reverse engineering of real-world Rust binaries.
随后我们提出解决 Rust 反编译保真度问题的新技术,并在基于 angr[18] 构建的 Rust 反编译器原型 OXIDIZER 中实现。OXIDIZER 执行二进制级分析以克服 Rust 编译器带来的挑战,实现 Rust 标准库函数识别与已知结构体、函数类型的应用。同时,OXIDIZER 针对 Rust 重新设计反编译三大核心模块:控制流还原、类型还原与结构化,解决已发现的保真度问题。凭借这些面向 Rust 的设计,OXIDIZER 可还原宏、枚举、模式匹配等 Rust 高层抽象。它生成简洁、保真、易读的代码,为真实 Rust 二进制的高效逆向分析奠定基础,迈出了高质量 Rust 反编译的第一步。
Finally, we evaluate OXIDIZER on our dataset consisting of 27 projects from the top 50 Rust projects on GitHub (ranked by stars) and the Rust reimplementation of GNU Coreutils [19], with various optimization levels and two different Rust compiler versions. OXIDIZER generates output that is significantly more concise and faithful to the original program compared to existing binary decompilers. It produces 15% fewer lines of code and 7% lower cyclomatic complexity than the best-performing C decompiler on average. Moreover, OXIDIZER effectively reduces extraneous function calls and achieves the highest matched macro recovery rate among all evaluated decompilers. For type inference, OXIDIZER outperforms other decompilers on recovering struct and enum types, which are the most prevalent types in our dataset. In our user study with 37 participants of different reverse engineering expertise, participants using OXIDIZER achieved 28% higher task scores and completed tasks 20% faster than with Hex-Rays, demonstrating improved efficiency and accuracy. Participants also rated OXIDIZER more favorably, with an average score of 4.49 out of 5, compared to 2.61 for Hex-Rays.
最后,我们在自建数据集上评估 OXIDIZER:包含 GitHub Star 数前 50 的 Rust 项目中的 27 个,以及 GNU Coreutils 的 Rust 重实现版[19],覆盖多种优化级别与两个 Rust 编译器版本。与现有二进制反编译器相比,OXIDIZER 的输出更简洁、更贴近原始程序。平均而言,其代码行数比性能最优的 C 反编译器少 15%,圈复杂度低 7%。此外,OXIDIZER 有效减少冗余函数调用,宏还原匹配率在所有参评工具中最高。类型推断方面,OXIDIZER 在结构体与枚举类型(数据集中最常见类型)的还原上优于其他工具。在 37 名不同逆向经验水平受试者的人工评估中,使用 OXIDIZER 的受试者任务得分提升 28%,完成速度加快 20%,效率与准确性显著提升。受试者对 OXIDIZER 的评分也更高,满分 5 分下平均 4.49 分,而 Hex-Rays 仅 2.61 分。
Contributions. We make the following contributions:
贡献:本文主要贡献如下:
- We empirically study the performance of state-of-the-art C decompilers on Rust binaries, identifying key fidelity issues and Rust-specific challenges that hinder effective decompilation.
1)通过实证研究现有最优 C 反编译器在 Rust 二进制上的表现,定位阻碍有效反编译的关键保真度问题与 Rust 特有挑战。
- We propose the first Rust decompilation pipeline to address these challenges and implement it in our open-source prototype, OXIDIZER, providing a foundation for future research in Rust decompilation.
2)提出首个针对这些问题的 Rust 反编译流程,并在开源原型 OXIDIZER 中实现,为后续 Rust 反编译研究奠定基础。
- We introduce the first systematic methodology for evaluating Rust decompilation in terms of conciseness and fidelity, and use it to benchmark OXIDIZER against state-of-the-art C decompilers, showing its superior performance. We further demonstrate OXIDIZER's practical benefits through case studies on real-world malware samples and a human study involving reverse engineering tasks.
3)提出首个从简洁性与保真度系统评估 Rust 反编译的方法,并用该方法将 OXIDIZER 与现有最优 C 反编译器基准测试,证实其性能优势。通过真实恶意软件样本案例与逆向任务人工评估,进一步验证 OXIDIZER 的实用价值。
To further open science, we release OXIDIZER and all our evaluation artifacts at https://github.com/sefcom/oxidizer and integrate OXIDIZER into angr.
为推动开放科学,我们在 https://github.com/sefcom/oxidizer 开源 OXIDIZER 与全部评估资源,并将其集成到 angr 中。
2. Motivation
研究动机
To understand the challenges of Rust decompilation from a technical perspective, we reviewed malware analysis reports in rust-malware-gallery [13] and collected analysts' complaints. Most complaints simply note that analyzing Rust malware is difficult, without explaining the specific challenges. A few more detailed complaints point to issues such as mangled function names, extraneous code inserted during compilation for security checks or resource management, non-null-terminated strings, and inlined Rust standard library functions. While some issues, like non-null-terminated strings, can potentially be addressed with post-decompilation fixes, others require fundamental changes to decompiler internals to resolve.
为从技术角度理解 Rust 反编译的难点,我们梳理了 rust-malware-gallery[13] 中的恶意软件分析报告,收集分析人员反馈。多数反馈仅指出 Rust 恶意软件分析困难,未说明具体问题;少数详细反馈指向函数名修饰、编译时插入的安全检查与资源管理冗余代码、非空终止字符串、Rust 标准库函数内联等问题。部分问题(如非空终止字符串)可通过反编译后修复解决,其余问题则需对反编译器内核做根本性改造。
To illustrate these issues concretely, we present a motivating example from a real-world Rust project [20]. Listing 2 shows (a) the original Rust source code of function encrypt_file, alongside (d) the decompiled C pseudocode generated by the state-of-the-art commercial decompiler Hex-Rays [10], and (b) part of the Rust pseudocode generated by Binary Ninja [21]. For comparison, we also include (c) the result of a post-decompilation fix using ChatGPT [22] based on Hex-Rays' output.
为直观展示这些问题,我们以真实 Rust 项目[20] 为例。清单 2 展示:(a)encrypt_file 函数的原始 Rust 源码;(d)商用最优反编译器 Hex-Rays[10] 生成的 C 伪代码;(b)Binary Ninja[21] 生成的部分 Rust 伪代码。作为对比,还给出(c)基于 Hex-Rays 输出用 ChatGPT[22] 做反编译后修复的结果。

Listing 2: The original Rust source code (23 lines, a), part of Binary Ninja's decompilation (12 lines of Rust pseudocode, b), Hex-Rays' decompilation (81 lines of C pseudocode, d) of the same function, and ChatGPT-reconstructed code based on Hex-Rays decompilation (21 lines, c). We omitted variable declarations from the decompiled code for brevity.
清单 2:同一函数的原始 Rust 源码(23 行,a)、Binary Ninja 反编译结果的片段(12 行 Rust 伪代码,b)、Hex-Rays 反编译结果(81 行 C 伪代码,d),以及基于 Hex-Rays 反编译结果由 ChatGPT 重构的代码(21 行,c)。为简洁起见,反编译代码中省略了变量声明。
Compared to the source code, Hex-Rays' decompiled C pseudocode exhibits several critical issues (highlighted in red boxes): (1) Inlined Vec::new function call; (2) Expanded eprintln macro call; (3) Missing format string; (4) Redundant code caused by the inclusion of resource release function calls (i.e., drop_in_place). Green boxes highlight the corresponding elements in the original source code. Though Binary Ninja produces Rust pseudocode, it suffers from the same issues as Hex-Rays.
与源码对比,Hex-Rays 反编译的 C 伪代码存在多处关键问题(红框标注):(1)Vec::new 函数调用内联;(2)eprintln 宏调用展开;(3)格式化字符串缺失;(4)资源释放函数调用(如 drop_in_place)导致的冗余代码。绿框标注源码中对应元素。尽管 Binary Ninja 生成 Rust 伪代码,却存在与 Hex-Rays 相同的问题。
In recent years, researchers have explored using LLMs to improve C decompilation quality [14--17]. However, there is no public work on using LLMs to decompile Rust binaries. We experimented with trivial post-decompilation fixes by prompting LLMs, but found the output had several issues. As shown in Listing 2 (c), using ChatGPT to recover Rust code from Hex-Rays' output introduces fake function signatures, fake function calls, and fake strings (highlighted in purple boxes) that do not exist in the original source, leading to significant fidelity issues. Other LLMs we experimented with have similar issues. Moreover, LLM-based post-decompilation fixes face fundamental limitations. The reason is twofold: 1) the decompilation process inevitably involves information loss when lifting low-level code to higher-level representations, and 2) all LLMs are susceptible to hallucinations to some degree [23].
近年来,研究者探索用 LLM 提升 C 反编译质量[14--17],但尚无公开工作研究用 LLM 反编译 Rust 二进制。我们尝试用 LLM 做简单反编译后修复,却发现输出存在多处问题。如清单 2(c)所示,用 ChatGPT 从 Hex-Rays 输出还原 Rust 代码,会引入源码中不存在的伪造函数签名、函数调用与字符串(紫框标注),导致严重保真度问题。我们测试的其他 LLM 也存在类似问题。此外,基于 LLM 的反编译后修复存在根本性局限,原因有二:1)底层代码向高层表示提升的反编译过程不可避免存在信息丢失;2)所有 LLM 均存在一定程度的幻觉问题[23]。
These observations highlight the need for a decompiler that is explicitly designed to handle Rust-specific compilation patterns and abstractions, rather than adapting C-oriented decompilers or relying on post-decompilation fixes. To that end, we empirically study the fidelity issues that arise during the decompilation of Rust binaries, identify their root causes, and propose solutions to address them. We implement our solutions in OXIDIZER, a Rust decompiler.
上述现象表明,亟需专门适配 Rust 编译模式与抽象的反编译器,而非改造面向 C 的反编译器或依赖后处理修复。为此,我们实证研究 Rust 二进制反编译中的保真度问题,定位根源并提出解决方案,最终在 Rust 反编译器 OXIDIZER 中实现。
Listing 1 shows OXIDIZER's decompilation of the same function. Compared with Hex-Rays, Binary Ninja, and ChatGPT-based fixes, OXIDIZER produces results that most closely resemble the original source, successfully recovering Rust-specific high-level abstractions such as type information, control-flow constructs, and correct semantics. We detail the techniques implemented in OXIDIZER in Section 5.
清单 1 展示 OXIDIZER 对同一函数的反编译结果。与 Hex-Rays、Binary Ninja 及基于 ChatGPT 的修复相比,OXIDIZER 输出最贴近原始源码,成功还原类型信息、控制流结构、正确语义等 Rust 特有高层抽象。第 5 节详细介绍 OXIDIZER 采用的技术。
rust
fn FakeCrypt::fileops::encrypt_file(a0: &Path, a1: i64, a2: i64) {
    let v0: I32NotAllOnes; // [bp-0x818]
    let v2: Result<struct4, struct8>; // [bp-0x810]
    let v3: Vec<u8, alloc::alloc::Global>; // [bp-0x800]
    let v4: u64; // [bp-0x7f0]
    let v5: File; // [bp-0x7d8]
    let v6: Result<File, Error>; // [bp-0x7b8]
    let v7: File; // [bp-0x7b4]
    let v8: struct960; // [bp-0x3f8]
    let v11: u64; // rax
    let v12: u64; // rdx
    let v13: Result<(), &BOT>; // rax
    let v14: Result<usize, Error>; // rax:rdx
    v2 = std::fs::File::open(a0);
    match v2 {
        Err(_) => {
            return;
        },
        Ok(v0) => {
            v3 = Vec::new();
            v14 = <std::fs::File as std::io::Read>::read_to_end(&v0, &v3);
            if let Err(_) = v14 {
                return;
            }
            v3 = alloc::vec::Vec<T,A>::resize((9223372036854775792 & v4) + 16, 0);
            v6 = <aes::autodetect::Aes256 as crypto_common::KeyInit>::new(a1);
            v8 = <cbc::decrypt::Decryptor<C> as crypto_common::InnerIvInit>::inner_iv_init(&v6, a2);
            v11 = cipher::block::BlockEncryptMut::encrypt_padded_mut(&v8, 1, v4, v4);
            if v11 {
                v6 = std::fs::File::create(a0);
                if let Ok(v5) = v6 {
                    v13 = std::io::Write::write_all(&v5 as u64, v11, v12);
                }
            } else {
                eprintln!("[!] Encryption failed for {}: {}", &a0, &v1);
            }
            return;
        }
    }
}
Listing 1: OXIDIZER decompilation of the motivating example (41 lines of Rust pseudocode).
清单 1:OXIDIZER 对示例函数的反编译结果(41 行 Rust 伪代码)
3. Background and Research Scope
背景与研究范围
Rust and Rust Malware. Rust ensures safety through compile-time lifetime checks, runtime bounds checks, and an ownership model that prevents data races. These safety features and a rich and powerful type and trait system make Rust an attractive choice for writing safe yet expressive programs [24].
Rust 与 Rust 恶意软件:Rust 通过编译期生命周期检查、运行时边界检查与避免数据竞争的所有权模型保障安全。这些安全特性与强大的类型、trait 系统,让 Rust 成为编写安全且富有表达力的程序的优选[24]。
Malware authors have also begun adopting Rust (instead of traditional choices such as C, C++, or Delphi) to develop malware [13]. In addition to the various language features, easy cross-platform support and the lack of tools to efficiently analyze Rust binaries have made Rust an attractive choice for malware [12]. Prevalent malware, such as the HIVE ransomware and the WildCard ransomware, have been rewritten in Rust in recent years [25, 26].
恶意软件作者也开始改用 Rust(替代 C、C++、Delphi 等传统选择)开发恶意程序[13]。除语言特性外,便捷的跨平台支持与缺乏高效分析 Rust 二进制的工具,让 Rust 成为恶意软件开发的优选[12]。近年来,HIVE 勒索软件、WildCard 勒索软件等主流恶意程序均已改用 Rust 重写[25, 26]。
C Decompilers on Rust Binaries. The Rust compiler rustc uses LLVM as its backend [27], producing native machine code that existing C decompilers can disassemble and lift without Rust-specific support. However, these decompilers lack Rust ABI knowledge and misapply C-centric heuristics, misinterpreting Rust-specific abstractions (e.g., enum layouts, pattern matching) that have no direct C equivalent. Even for shared abstractions like strings and function calls, rustc and C compilers generate different low-level implementations, further degrading decompilation quality.
C 反编译器处理 Rust 二进制:Rust 编译器 rustc 以 LLVM 为后端[27],生成原生机器码,现有 C 反编译器可在无 Rust 专用支持下反汇编与提升。但这类反编译器缺乏 Rust ABI 知识,错误应用面向 C 的启发式规则,误解析无 C 直接对应物的 Rust 特有抽象(如枚举布局、模式匹配)。即便对字符串、函数调用等共享抽象,rustc 与 C 编译器生成的底层实现也不同,进一步降低反编译质量。
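One of the shared abstractions mentioned above, strings, can be sketched in a few lines. This is a minimal illustration rather than anything taken from the paper: it assumes the fat-pointer layout that current compilers use for `&str`, and the helper `ends_with_nul` is a hypothetical name introduced here. It shows that a Rust string slice carries an explicit length and no terminating NUL byte, while a C-style string is NUL-terminated.

```rust
use std::ffi::CString;
use std::mem::size_of;

/// Hypothetical helper: true if the byte slice ends with a NUL terminator.
fn ends_with_nul(bytes: &[u8]) -> bool {
    bytes.last() == Some(&0)
}

fn main() {
    let rust_str: &str = "hello";

    // On current compilers a &str is a "fat pointer": data pointer plus length.
    assert_eq!(size_of::<&str>(), 2 * size_of::<usize>());

    // The length is stored alongside the pointer; no NUL byte marks the end.
    assert_eq!(rust_str.len(), 5);
    assert!(!ends_with_nul(rust_str.as_bytes()));

    // A C-style string stores one extra byte: the terminating NUL.
    let c_string = CString::new("hello").expect("input has no interior NUL");
    assert_eq!(c_string.as_bytes_with_nul().len(), 6);
    assert!(ends_with_nul(c_string.as_bytes_with_nul()));

    println!("string layout checks passed");
}
```

A decompiler that scans for NUL terminators to find string boundaries would therefore be likely to misjudge where a Rust string literal ends.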
Rust Support in Modern Decompilers. To the best of our knowledge, there is no specialized Rust decompiler publicly available. Modern decompilers have primarily focused on optimizing each of these stages for C- and C++-specific output [28]. Analysts often use state-of-the-art C decompilers, including angr, Hex-Rays, Ghidra, and Binary Ninja, to analyze Rust binaries. These decompilers only have limited support for Rust decompilation. They all support Rust-style function name demangling. Binary Ninja offers a PseudoRust representation since 4.2 [29], which displays decompiled code in Rust-like syntax.
现代反编译器的 Rust 支持:据我们所知,尚无公开可用的专用 Rust 反编译器。现代反编译器主要针对 C/C++ 输出优化各阶段流程[28]。分析人员通常使用 angr、Hex-Rays、Ghidra、Binary Ninja 等最优 C 反编译器分析 Rust 二进制,这些工具仅提供有限 Rust 支持,均支持 Rust 风格函数名去修饰。Binary Ninja 自 4.2 版本起提供伪 Rust 表示[29],以类 Rust 语法展示反编译代码。
Analysts attempted various post-decompilation fixes to generate Rust pseudocode from C decompilation output. Ghidrust [30] is a Rust binary analysis extension for Ghidra, which provides functionalities including Rust binary detection, Rust standard library function detection, and emitting Rust pseudocode by parsing decompiled C code. However, as we will show later, these solutions are insufficient for generating meaningful Rust pseudocode for reverse engineering.
分析人员尝试多种反编译后修复方法,从 C 反编译输出生成 Rust 伪代码。Ghidrust[30] 是 Ghidra 的 Rust 二进制分析扩展,提供 Rust 二进制检测、Rust 标准库函数识别、解析反编译 C 代码生成 Rust 伪代码等功能。但后续会证明,这些方案不足以生成逆向工程可用的有效 Rust 伪代码。
Research Scope. We design OXIDIZER to assist with reverse engineering tasks on Rust binaries, including, but not limited to, Rust malware analysis. While designing OXIDIZER, we had to make trade-offs in the metrics we prioritized for our decompilation. For instance, improving the conciseness of our decompiler, such as removing redundant dealloc calls, can also potentially remove security-relevant information. Our goal with OXIDIZER is to generate decompilation that is readable and intended for human use. As such, we prioritize closeness to source code (Rust), which previous work has associated with readability [31]. Improving tasks such as recompilability and robustness to obfuscation is outside the scope of our design, and we leave them to future researchers.
研究范围:我们设计 OXIDIZER 用于辅助 Rust 二进制逆向工程任务,包括但不限于 Rust 恶意软件分析。设计过程中,我们在反编译优先指标上做了权衡。例如,提升反编译器简洁性(如移除冗余内存释放调用)可能同时丢失安全相关信息。OXIDIZER 的目标是生成人类可读的反编译结果,因此优先保证与 Rust 源码的贴近度,已有研究将此与可读性关联[31]。提升可重编译性、抗混淆能力等不在本文设计范围内,留待后续研究。
Modern C decompilers (e.g., Hex-Rays) do not attempt to generate recompilable or fully correct C code. This is a feature for malware analysis; as a participant in a human study on malware analysis put it, "It is useful to get a snippet of code from malware to use externally. For that, function correctness is important, but whole code correctness is not" [9]. Likewise, we do not aim to generate recompilable Rust code.
现代 C 反编译器(如 Hex-Rays)并不追求生成可重编译或完全正确的 C 代码,这一特性适配恶意软件分析需求。某恶意软件分析人工评估参与者表示:"从恶意软件中提取代码片段外部使用很有用,对此函数正确性重要,但完整代码正确性不重要"[9]。同理,我们不追求生成可重编译的 Rust 代码。
We expect as input an unaltered executable generated by the Rust compiler. When facing adversarial binaries, such as those generated by packing [32--34] or obfuscation [35--39], analysts must first unpack or deobfuscate them before decompiling, just as they would when using state-of-the-art C decompilers. Deobfuscation or unpacking is out of scope. OXIDIZER supports decompiling stripped Rust binaries.
我们期望输入为 Rust 编译器生成的未修改可执行文件。面对加壳[32--34]、混淆[35--39]等对抗性二进制时,分析人员需先脱壳、去混淆再反编译,与使用最优 C 反编译器流程一致。去混淆与脱壳不在本文范围内。OXIDIZER 支持反编译去符号的 Rust 二进制。
4. Decompilation Fidelity Issues
反编译保真度问题
To study the fidelity issues in Rust binary decompilation, we evaluate four popular decompilers (Hex-Rays, Ghidra, Binary Ninja, and angr) on 28 popular Rust projects from GitHub, compiled under various optimization levels and compiler versions. To enable one-to-one function-level matching between source code and binaries, we disabled inlining during compilation¹.
为研究 Rust 二进制反编译的保真度问题,我们在 GitHub 28 个主流 Rust 项目、多种优化级别与编译器版本下,评估四款热门反编译器(Hex-Rays、Ghidra、Binary Ninja、angr)。为实现源码与二进制的函数级一一对应,编译时禁用了内联¹。
¹ Function inlining cannot be completely disabled in Rust.
¹ Rust 中无法完全禁用函数内联。
Researchers have discussed fidelity issues in C decompilers [31], and we adopt a similar concept in this paper for Rust decompilation, where fidelity includes both readability and correctness issues that prevent the decompiled output from faithfully reflecting the original program. While traditional C decompilation has its own challenges, in this paper, we focus on Rust-specific fidelity issues.
研究者已讨论过 C 反编译器的保真度问题[31],本文将类似概念用于 Rust 反编译:保真度包含可读性与正确性问题,这些问题导致反编译输出无法真实反映原始程序。传统 C 反编译自有其挑战,本文聚焦 Rust 特有保真度问题。
Overall, existing decompilers perform insufficiently on Rust binaries because they fail to faithfully recover high-level Rust abstractions by design. Most modern decompilers are built for C/C++ binaries and thus are optimized to reconstruct C/C++-specific abstractions. While Rust and C/C++ share some common constructs, they also introduce unique abstractions without direct counterparts in the other. Moreover, Rust and C/C++ compilers generate different low-level implementations for similar high-level concepts that existing C/C++ decompilers are not designed to handle.
总体而言,现有反编译器在 Rust 二进制上表现不足,根源是设计上无法忠实还原 Rust 高层抽象。多数现代反编译器为 C/C++ 二进制构建,优化目标是重建 C/C++ 特有抽象。Rust 与 C/C++ 虽有部分共用结构,但也引入彼此无直接对应的独有抽象。此外,对相似高层概念,Rust 与 C/C++ 编译器生成不同底层实现,而现有 C/C++ 反编译器未对此设计。
Fidelity Issue 1: Enumeration and Pattern Matching
保真度问题 1:枚举与模式匹配
In Rust, enumeration (enum for short) is a sum type that can hold one of several named variants with distinct names and fields. A common operation associated with Rust enums is pattern matching, which is implemented using the match control-flow construct. This construct branches execution based on the specific variant an enum holds. In particular, Rust uses two built-in enums, Option<T> and Result<T, E>, to handle recoverable errors. Instead of relying on exceptions, Rust encourages explicit error handling using pattern matching.
在 Rust 中,枚举(enum)是可持有多个具名变体之一的和类型,各变体有独立名称与字段。Rust 枚举的常见操作是模式匹配,由 match 控制流结构实现,根据枚举持有的具体变体分支执行。Rust 尤其使用两个内置枚举 Option<T> 与 Result<T, E> 处理可恢复错误,不依赖异常,而是鼓励用模式匹配显式处理错误。
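For readers less familiar with these constructs, the following is a minimal sketch of an enum, a match over its variants, and Result-based error handling. The names `Shape`, `area`, and `parse_side` are hypothetical and introduced only for illustration; they do not come from the paper or from any project it evaluates.

```rust
/// A sum type: every value holds exactly one of the named variants.
#[derive(Debug, PartialEq)]
enum Shape {
    Circle { radius: f64 },
    Rect { width: f64, height: f64 },
    Empty,
}

/// Pattern matching branches on the variant and binds its fields.
fn area(shape: &Shape) -> f64 {
    match shape {
        Shape::Circle { radius } => std::f64::consts::PI * radius * radius,
        Shape::Rect { width, height } => width * height,
        Shape::Empty => 0.0,
    }
}

/// Recoverable errors are returned as a Result instead of being thrown.
fn parse_side(text: &str) -> Result<f64, String> {
    match text.trim().parse::<f64>() {
        Ok(value) if value >= 0.0 => Ok(value),
        Ok(_) => Err(String::from("side must be non-negative")),
        Err(err) => Err(format!("not a number: {err}")),
    }
}

fn main() {
    let unit_circle = Shape::Circle { radius: 1.0 };
    println!("circle area: {}", area(&unit_circle));

    // The caller must handle both outcomes explicitly.
    match parse_side("2.5") {
        Ok(side) => println!("square area: {}", area(&Shape::Rect { width: side, height: side })),
        Err(message) => println!("error: {message}"),
    }
    println!("empty area: {}", area(&Shape::Empty));
}
```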
Issue 1.1: Memory Layout Abuse. The difference between C's switch-case and Rust's pattern matching is that Rust's pattern matching checks whether the instance matches a specific variant rather than a specific value. Each Rust enum instance consists of a discriminant, an integer that indicates which variant it currently holds, and shared space for variants' fields [40]. In decompiled C pseudocode, enum initialization and variant field access typically appear as a sequence of raw memory writes. While this reflects low-level implementations, it loses high-level semantics such as the specific variant being constructed. Likewise, pattern matching on enums is often compiled into a series of if-else checks on the discriminant accessed through raw memory reads.
问题 1.1:内存布局滥用。C 的 switch-case 与 Rust 模式匹配的区别在于:Rust 模式匹配检查实例是否匹配特定变体,而非数值。每个 Rust 枚举实例包含判别值(标识当前持有的变体的整数)与变体字段共享空间[40]。在反编译 C 伪代码中,枚举初始化与变体字段访问通常表现为一连串原始内存写入,虽反映底层实现,却丢失正在构造的具体变体等高层语义。同理,枚举的模式匹配常被编译为对通过原始内存读取获取的判别值的一系列 if-else 判断。
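The discriminant described above can be observed directly. The sketch below is an illustration under stated assumptions, not the paper's method: the enum `Command` and the method `raw_tag` are hypothetical, and `#[repr(u8)]` is added so that the tag is guaranteed to be a `u8` at offset 0 (default-layout enums make no such promise). The raw read imitates the kind of memory access a C-oriented decompiler would display.

```rust
use std::mem::discriminant;

#[allow(dead_code)]
#[repr(u8)] // assumption for this sketch: pins the tag to a u8 stored first
enum Command {
    Stop,
    Move { dx: i32, dy: i32 },
    Say(u8),
}

impl Command {
    /// Reads the tag as a raw byte, the way low-level code sees it.
    fn raw_tag(&self) -> u8 {
        // SAFETY: with #[repr(u8)] the enum starts with its u8 discriminant,
        // so reading one byte through the enum's address is well defined.
        unsafe { *(self as *const Command).cast::<u8>() }
    }
}

fn main() {
    let stop = Command::Stop;
    let step = Command::Move { dx: 1, dy: -1 };
    let say = Command::Say(7);

    // Variants are numbered in declaration order, starting at zero.
    assert_eq!(stop.raw_tag(), 0);
    assert_eq!(step.raw_tag(), 1);
    assert_eq!(say.raw_tag(), 2);

    // The safe API compares variants without exposing the integer.
    assert_eq!(
        discriminant(&step),
        discriminant(&Command::Move { dx: 0, dy: 0 })
    );
    assert_ne!(discriminant(&step), discriminant(&say));

    println!("discriminant checks passed");
}
```

The integer alone says nothing about variant names or field types, which is the high-level information the paragraph above describes as lost.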
Issue 1.2: Complex Control Flow. Existing C decompilers recover Rust pattern matches as if-else structures, which are functionally equivalent but not idiomatic. The if-else structures lack high-level semantics, including variant names and associated fields, making the recovered code less readable and harder to understand. Listing 2 shows an example of complex control flow caused by pattern matching.
问题 1.2:复杂控制流。现有 C 反编译器将 Rust 模式匹配还原为 if-else 结构,功能等价但不符合 Rust 习惯。if-else 结构缺少变体名称、关联字段等高层语义,导致还原代码可读性差、难以理解。清单 2 展示模式匹配导致的复杂控制流示例。
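The gap between the two forms can be sketched by writing both by hand. Everything here is hypothetical: `Event`, `RawEvent`, `flatten`, and the two `describe` functions are illustrative names, and `RawEvent` is only a rough stand-in for the tag-plus-anonymous-slots view that decompiled C pseudocode tends to present. The two functions compute the same result, but only the first one keeps variant names and field names.

```rust
/// Source-level view: variants carry names and typed fields.
enum Event {
    Key(u8),
    Click { x: u16, y: u16 },
    Quit,
}

/// Hypothetical flattened view: an integer tag plus anonymous payload slots.
struct RawEvent {
    tag: u8,
    slot0: u16,
    slot1: u16,
}

/// Converts the typed value into the flattened stand-in.
fn flatten(event: &Event) -> RawEvent {
    match event {
        Event::Key(code) => RawEvent { tag: 0, slot0: u16::from(*code), slot1: 0 },
        Event::Click { x, y } => RawEvent { tag: 1, slot0: *x, slot1: *y },
        Event::Quit => RawEvent { tag: 2, slot0: 0, slot1: 0 },
    }
}

/// Idiomatic form: one match, with variant and field names visible.
fn describe(event: &Event) -> String {
    match event {
        Event::Key(code) => format!("key {code}"),
        Event::Click { x, y } => format!("click {x},{y}"),
        Event::Quit => String::from("quit"),
    }
}

/// Functionally equivalent if-else chain over the tag; the names are gone.
fn describe_flat(raw: &RawEvent) -> String {
    if raw.tag == 0 {
        format!("key {}", raw.slot0)
    } else if raw.tag == 1 {
        format!("click {},{}", raw.slot0, raw.slot1)
    } else {
        String::from("quit")
    }
}

fn main() {
    let events = [Event::Key(65), Event::Click { x: 3, y: 4 }, Event::Quit];
    for event in &events {
        assert_eq!(describe(event), describe_flat(&flatten(event)));
    }
    println!("both forms agree");
}
```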
Fidelity Issue 2: Error Propagation
保真度问题 2:错误传播
Rust provides a convenient syntax for error handling using the question mark operator (?). This operator is appended to an expression of Result<T, E> type. It simplifies error propagation by implicitly handling Result values: if the result is an Err, it returns early from the function; otherwise, it unwraps the Ok value.
Rust 提供便捷的错误处理语法:问号运算符 ?。该运算符附加在 Result<T, E> 类型表达式后,隐式处理 Result 值以简化错误传播:若结果为 Err,则从函数提前返回;否则解包 Ok 值。
During compilation, the ? syntactic sugar is desugared into more verbose control flow, typically a pattern matching construct that returns early on the Err variant. This pattern matching construct is further compiled into low-level implementations, as described in Fidelity Issue 1.
编译时,? 语法糖被解语法糖为更冗长的控制流,通常是对 Err 变体提前返回的模式匹配结构。该结构进一步被编译为底层实现,详见保真度问题 1。
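The desugaring can be approximated at the source level as follows. This is a simplified sketch: the compiler's real expansion goes through the `Try` trait machinery, and the explicit match with `From::from` shown here is the commonly documented approximation. Both function names are hypothetical.

```rust
use std::num::ParseIntError;

/// Concise form using the question mark operator.
fn double_with_question_mark(text: &str) -> Result<i64, ParseIntError> {
    let value = text.trim().parse::<i64>()?;
    Ok(value * 2)
}

/// Approximate hand-written equivalent: match, then return early on Err.
fn double_desugared(text: &str) -> Result<i64, ParseIntError> {
    let value = match text.trim().parse::<i64>() {
        Ok(value) => value,
        // The error is passed through From::from so it can be converted
        // into the function's declared error type.
        Err(err) => return Err(From::from(err)),
    };
    Ok(value * 2)
}

fn main() {
    for input in ["21", " 4 ", "not a number"] {
        let concise = double_with_question_mark(input);
        let explicit = double_desugared(input);
        assert_eq!(concise, explicit);
        println!("{input:?} -> {concise:?}");
    }
}
```

A single `?` character in the source thus corresponds to a branch and an early return in the binary, which a decompiler has to fold back to recover the original shape.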
Fidelity Issue 3: Macros
保真度问题 3:宏
Rust enables powerful metaprogramming through three types of macros: function-like, attribute-like, and derive macros. The most commonly used are function-like macros, such as println! and panic!. Function-like macros enable pattern-based code generation, which is similar to C-style macros but more expressive. These macros match the input token stream against defined patterns and expand the input into corresponding Rust code during compilation. The effect of macros on decompilation is similar to function inlining: a concise macro call in the source code can expand into a larger, less readable chunk of code during compilation.
Rust 通过三类宏支持强大的元编程:类函数宏、类属性宏与派生宏。最常用的是类函数宏,如 println! 与 panic!。类函数宏支持基于模式的代码生成,类似 C 风格宏但表达能力更强。这类宏将输入令牌流与预定义模式匹配,编译时将输入展开为对应 Rust 代码。宏对反编译的影响与函数内联类似:源码中简洁的宏调用,编译后会展开为更大、可读性更低的代码块。
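A small declarative macro makes the expansion effect concrete. The macro `square_sum!` and the function `expanded_by_hand` are hypothetical examples written for this note; standard macros such as println! expand into considerably more code than this.

```rust
/// Matches a comma-separated list of expressions and expands to
/// `0 + a * a + b * b + ...` at compile time.
macro_rules! square_sum {
    ($($value:expr),* $(,)?) => {
        0 $(+ $value * $value)*
    };
}

/// What the compiler effectively sees for `square_sum!(1, 2, 3)`.
fn expanded_by_hand() -> i32 {
    0 + 1 * 1 + 2 * 2 + 3 * 3
}

fn main() {
    // One short call in the source...
    let from_macro: i32 = square_sum!(1, 2, 3);

    // ...is indistinguishable, after expansion, from the longer expression.
    assert_eq!(from_macro, expanded_by_hand());
    println!("sum of squares: {from_macro}");
}
```

Only the expanded form reaches the binary, so recovering the macro call means recognizing the expansion pattern after the fact.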
Fidelity Issue 4: Deref Coercion
保真度问题 4:解引用强制转换
Deref coercion is a Rust feature that enables implicit reference type conversion for types that implement the Deref trait. Specifically, if a type T implements Deref<Target = U>, a reference of type &T can be automatically converted to a reference of type &U. This allows values of type T to access methods or fields defined on U without explicit conversion.
解引用强制转换是 Rust 特性,允许实现 Deref trait 的类型隐式进行引用类型转换。具体而言,若类型 T 实现 Deref<Target = U>,则 &T 类型引用可自动转换为 &U 类型引用,让 T 类型值无需显式转换即可访问 U 上定义的方法或字段。
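A minimal sketch of the rule, using a hypothetical wrapper type `Username` and a hypothetical function `shout`: the first call relies on the implicit coercion, and the second spells out the `Deref::deref` call that the coercion stands for.

```rust
use std::ops::Deref;

/// Hypothetical wrapper type that dereferences to str.
struct Username(String);

impl Deref for Username {
    type Target = str;

    fn deref(&self) -> &str {
        &self.0
    }
}

/// Accepts only a string slice.
fn shout(text: &str) -> String {
    text.to_uppercase()
}

fn main() {
    let user = Username(String::from("ferris"));

    // Implicit: &Username is coerced to &str at the call site.
    let implicit = shout(&user);

    // Explicit: the same call with the deref written out.
    let explicit = shout(Deref::deref(&user));
    assert_eq!(implicit, explicit);

    // Methods defined on str are reachable on the wrapper as well.
    assert!(user.starts_with("fer"));
    println!("{implicit}");
}
```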
Issue 4.1: Unaligned Code. To support this feature, the Rust compiler inserts calls to the appropriate Deref::deref implementation at compile time. Although implicit in the source code, these deref calls appear explicitly in the compiled binary. Listing 3 shows an example of complex control flow caused by deref coercion.
问题 4.1:代码不规整。为支持该特性,Rust 编译器在编译期插入对应 Deref::deref 实现调用。尽管在源码中是隐式的,这些解引用调用在编译后二进制中显式出现。清单 3 展示解引用强制转换导致的复杂控制流示例。
(a) Rust source code.
Rust 源码
rust
s = String::from_utf8_lossy(&stderr).to_string();
(b) C pseudocode as decompiled by Hex-Rays.
Hex-Rays 反编译生成的 C 伪代码
c
String::from_utf8_lossy(&v17, v26, v27);
v11 = <Cow<B> as Deref>::deref(&v17);
<T as ConvertVec>::to_vec(&v23, v11);
Listing 3: The from_utf8_lossy function in the source returns a Cow<str> instance, while the to_string method (inlined to to_vec function in Listing 3b) takes a &str argument. The &Cow<str> instance is converted to a &str implicitly by deref coercion. However, the deref function call appears explicitly in Hex-Rays decompilation output.
清单 3:源码中的 from_utf8_lossy 函数返回一个 Cow<str> 实例,而 to_string 方法(在清单 3b 中被内联为 to_vec 函数)接收 &str 类型参数。&Cow<str> 实例通过解引用强制转换隐式转换为 &str。但在 Hex-Rays 的反编译输出中,deref 函数调用会被显式呈现。
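💡 This example shows how Rust language features affect decompilation output:
- Deref coercion: Cow<str> implements Deref<Target = str>, so a &Cow<str> can be used directly as a &str without explicit conversion.
- Compiler inlining: the to_string() method is inlined at compile time into the underlying to_vec function, so the decompiled output loses the high-level semantics.
- Decompiler limitations: Hex-Rays, as a C-oriented decompiler, cannot recognize Rust's implicit conversion and can only show the deref call explicitly, which makes the code redundant and less readable.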
Issue 4.2: Memory Layout Abuse. When deref coercion is compiled into explicit deref function calls and subsequently inlined, they might be compiled into raw memory accesses. Listing 4 shows an example of implicit conversion between &String and &str, and what it looks like in Hex-Rays' C pseudocode.
问题 4.2:内存布局滥用。当解引用强制转换被编译为显式解引用函数调用并随后内联时,可能被编译为原始内存访问。清单 4 展示 &String 与 &str 之间的隐式转换,及其在 Hex-Rays C 伪代码中的表现。
(a) Rust source code
Rust 源码
rust
let suffix = format!("{suffix_from_template}");
if suffix.contains(MAIN_SEPARATOR) {
    ...
}
(b) C pseudocode as decompiled by Hex-Rays
Hex-Rays 反编译生成的 C 伪代码
c
...
core::option::Option<T>::map_or_else(&v69, v47);
if ( !char::is_contained_in(*(&v69 + 1), v70) )
{
    ...
}
Listing 4: The contains method is defined on &str type, while the suffix variable has the type String. This conversion appears explicitly in Hex-Rays' decompilation output (Listing 4b), by dereferencing the ptr and len fields (i.e., *(&v69 + 1) and v70) in the String instance (i.e., v69) to construct the equivalent string slice.
清单 4: contains 方法定义在 &str 类型上,而 suffix 变量的类型为 String。在 Hex-Rays 的反编译输出(清单 4b)中,这一转换被显式呈现:通过解引用 String 实例(即 v69)中的 ptr 和 len 字段(即 *(&v69 + 1) 和 v70),来构造等效的字符串切片。
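💡 This example shows how Rust string type conversion affects decompilation output:
- The nature of the implicit conversion: when contains is called on a String, the value is automatically converted to &str (a string slice) through Deref coercion, which is completely transparent in the source code.
- The decompiler's "unboxing": Hex-Rays cannot recognize Rust's implicit conversion and can only recover its low-level implementation, namely direct operations on the String struct's internal fields (the data pointer ptr and the length len) followed by a call to the low-level character check function.
- Loss of readability: the concise suffix.contains(...) in the source becomes direct memory-offset accesses and function calls after decompilation, which is both verbose and stripped of high-level semantics.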
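The conversion in Listing 4 can be reproduced by hand at the source level. This is a sketch rather than the compiler's actual output: the function `slice_from_parts` is a hypothetical name, and it rebuilds the string slice from the String's pointer and length in roughly the way the inlined code does.

```rust
/// Rebuilds the &str view of a String from its raw pointer and length.
fn slice_from_parts(owned: &String) -> &str {
    let ptr: *const u8 = owned.as_ptr();
    let len: usize = owned.len();
    // SAFETY: ptr and len come from a live String, so they describe `len`
    // initialized bytes of valid UTF-8 that stay borrowed through `owned`.
    unsafe { std::str::from_utf8_unchecked(std::slice::from_raw_parts(ptr, len)) }
}

fn main() {
    let suffix = String::from("tmp/file");

    // Source-level form: deref coercion turns &String into &str silently.
    let implicit = suffix.contains('/');

    // Low-level form: the slice is assembled from ptr and len first.
    let explicit = slice_from_parts(&suffix).contains('/');

    assert_eq!(implicit, explicit);
    println!("contains separator: {implicit}");
}
```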
Fidelity Issue 5: Statically-linked Standard Library Functions
保真度问题 5:静态链接标准库函数
Most C compilers by default link the standard library dynamically into the binary [41, 42]. By contrast, in Rust, the standard library and dependencies are statically linked into the final binary by default [43]. Dynamically linking the Rust standard library is also uncommon and discouraged, due to the unstable ABI across Rust versions, deployment complexity, and community consensus [44]. Because the Rust standard library is typically statically linked into the binary, its function symbols can be stripped during compilation or post-processing. When stripped, standard library functions become indistinguishable from user-defined ones, forcing reverse engineers to identify and recover their original functionalities. Additionally, statically linked standard library and third-party library functions enable the Rust compiler to inline them. An inlined function body is harder for reverse engineers to understand than a call to a named function, whose name summarizes its functionality [31].
多数 C 编译器默认将标准库动态链接到二进制[41, 42]。与之相反,Rust 中标准库与依赖默认静态链接到最终二进制[43]。由于 Rust 版本间 ABI 不稳定、部署复杂与社区共识[44],动态链接 Rust 标准库并不常见也不被鼓励。由于 Rust 标准库通常静态链接到二进制,其函数符号可在编译或后处理时被去除。去符号后,标准库函数与用户自定义函数无法区分,迫使逆向工程师识别并还原其原始功能。此外,静态链接的标准库与第三方库函数允许 Rust 编译器内联,对逆向工程师而言,内联函数定义比概括功能的函数名更难理解上下文[31]。
Fidelity Issue 6: Resource Management
保真度问题 6:资源管理
In C, programmers must explicitly create and release resources such as heap memory, file descriptors, and network sockets [45]. In contrast, Rust uses the ownership model with a set of rules to manage resources safely and automatically [46]. One key rule of ownership is that when a variable goes out of scope, the resources it holds are automatically released. Rust programmers do not need to explicitly release memory or other resources. Instead, the compiler inserts appropriate low-level code, such as calls to destructors, at compile time. The compiler-inserted resource cleanup code does not align with any code in the original source. During reverse engineering, we found that such code frequently appears in the decompilation.
在 C 中,程序员必须显式创建与释放堆内存、文件描述符、网络套接字等资源[45]。与之相反,Rust 使用带一套规则的所有权模型安全、自动管理资源[46]。所有权的关键规则之一是:变量离开作用域时,其持有的资源自动释放。Rust 程序员无需显式释放内存或其他资源,编译器在编译期插入合适的底层代码(如析构函数调用)。编译器插入的资源清理代码与原始源码中的任何代码都不对应,逆向工程中发现这类代码在反编译结果中频繁出现。
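The scope-based release rule can be observed directly. The sketch below uses a hypothetical Guard type that records its own destruction in a shared log; nothing in run_scope releases the guards explicitly, yet the log shows that the compiler-inserted drop calls ran at the end of the inner scope.

```rust
use std::cell::RefCell;
use std::rc::Rc;

/// A hypothetical resource that records its own release in a shared log.
struct Guard {
    name: &'static str,
    log: Rc<RefCell<Vec<&'static str>>>,
}

impl Drop for Guard {
    fn drop(&mut self) {
        self.log.borrow_mut().push(self.name);
    }
}

/// Creates two guards and never releases them explicitly; the compiler
/// inserts the drop calls at the end of the scope, in reverse order.
fn run_scope() -> Vec<&'static str> {
    let log = Rc::new(RefCell::new(Vec::new()));
    {
        let _first = Guard { name: "first", log: Rc::clone(&log) };
        let _second = Guard { name: "second", log: Rc::clone(&log) };
        log.borrow_mut().push("body");
    } // `_second` is dropped here, then `_first`
    let events = log.borrow().clone();
    events
}
```

In a binary, those two drop calls appear as ordinary function calls at the closing brace, which is the code with no source counterpart that the paragraph above refers to.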
Fidelity Issue 7: Security Checks
保真度问题 7:安全检查
Besides resource management, there are also security properties that are difficult to check at compile time (e.g., indexing out of bounds and dividing by zero). The Rust compiler inserts code that checks these properties at runtime. Similar to resource management, these implicit security checks introduce extraneous code that does not exist in the source.
除资源管理外,还有部分安全属性难以在编译期检查(如越界索引、除零)。Rust 编译器插入安全检查代码,在运行时检查这些安全属性。与资源管理类似,隐式安全检查也会引入源码中不存在的冗余代码。
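As a rough illustration, the sketch below pairs two ordinary operations with hand-written versions of the checks the compiler adds. The function names are hypothetical; the explicit variants return None where the implicit ones would panic, so they approximate the inserted branches rather than reproduce them exactly.

```rust
/// Indexing compiles to a bounds check plus a panic path that has no
/// counterpart in the source text.
fn pick(values: &[u32], i: usize) -> u32 {
    values[i]
}

/// Integer division compiles to a zero check plus a panic path.
fn ratio(a: u32, b: u32) -> u32 {
    a / b
}

/// The bounds check written out by hand, roughly as a decompiler shows it.
fn pick_explicit(values: &[u32], i: usize) -> Option<u32> {
    if i < values.len() { Some(values[i]) } else { None }
}

/// The zero check written out by hand via the standard checked operation.
fn ratio_explicit(a: u32, b: u32) -> Option<u32> {
    a.checked_div(b)
}
```

A decompiler that does not filter these checks shows both the comparison and the call into the panic machinery for every index and division.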
Fidelity Issue 8: Rust String Literals
保真度问题 8:Rust 字符串字面量
In C, ANSI strings are null-terminated, with the terminating byte indicating the end of the string [47]. A Rust string literal consists of a pointer to the character sequence, typically not null-terminated, and an integer representing its length [48]. Most existing decompilers do not support Rust string literals, which results in Rust string literals being represented as regular global variables in decompiled code. Interestingly, when used as macro arguments in Rust, string literals are not simply expanded as they are in C macros. Instead, they are treated as part of a token stream, which can be manipulated during expansion (e.g., a format string is divided into several pieces stored as an array). Failing to recover macro calls may also lead to this fidelity issue.
在 C 中,ANSI 字符串以空字符终止,终止字节标识字符串结束[47]。Rust 字符串字面量包含指向字符序列的指针(通常非空终止)与表示长度的整数[48]。现有多数反编译器不支持 Rust 字符串字面量,导致其在反编译代码中被表示为普通全局变量。有趣的是,在 Rust 中作为宏参数使用时,字符串字面量不会像 C 宏中那样简单展开,而是作为令牌流使用,可在展开时操作(如格式化字符串被拆分为数组片段)。无法还原宏调用也会导致该保真度问题。
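The pointer-plus-length representation can be checked from safe and unsafe Rust alike. This is a minimal sketch with hypothetical helper names (raw_parts, from_parts, str_ref_is_two_words); it assumes the usual two-word layout of &str, which holds on current compilers.

```rust
use std::mem::size_of;

/// Splits a string slice into the two machine words it is made of.
fn raw_parts(s: &str) -> (*const u8, usize) {
    (s.as_ptr(), s.len())
}

/// Reads `len` bytes starting at `ptr`; no terminator byte is consulted.
///
/// # Safety
/// `ptr` must point to `len` bytes of valid UTF-8 that outlive the result.
unsafe fn from_parts<'a>(ptr: *const u8, len: usize) -> &'a str {
    unsafe { std::str::from_utf8_unchecked(std::slice::from_raw_parts(ptr, len)) }
}

/// Returns true if `&str` occupies exactly two pointer-sized words.
fn str_ref_is_two_words() -> bool {
    size_of::<&str>() == 2 * size_of::<usize>()
}
```

Because the length travels with the pointer, a literal may contain a NUL byte in the middle, and a decompiler that scans for a terminator will cut the string short or run past its end.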
Fidelity Issue 9: Reordered Struct Layout
保真度问题 9:结构体布局重排
In C, within a struct object, addresses of its fields increase in the order in which the fields were defined [49]. However, the memory layout of a Rust struct, that is, the order of its fields, is not guaranteed to match the order in the source code [50]. While field reordering itself does not directly cause fidelity issues, it can hinder decompilers from using known Rust standard library structs and recovering Rust high-level abstractions. For example, recovering a struct instance relies on matching field types to memory offsets to group field assignments into a single, unified initialization statement. If field types and offsets are mismatched due to incorrect assumptions about layout, the recovered struct will be incorrect. Listing 5 shows an example of this issue.
在 C 中,结构体对象内字段地址按定义顺序递增[49]。但 Rust 结构体的内存布局(即字段顺序)不保证与源码一致[50]。字段重排本身不直接导致保真度问题,但会阻碍反编译器使用已知 Rust 标准库结构体与还原 Rust 高层抽象。例如,还原结构体实例依赖将字段类型与内存偏移匹配,将字段赋值归并为单一统一初始化语句。若因布局假设错误导致字段类型与偏移不匹配,还原的结构体将不正确。清单 5 展示该问题示例。
(a) Decompilation with correct struct layout
结构体布局正确时的反编译结果
```rust
v5 = Arguments {
    pieces: ["Source file does not exist: "],
    args: [v0],
    fmt: None
};
v3 = alloc::fmt::format::format_inner(&v5);
```
(b) Decompilation with incorrect struct layout
结构体布局错误时的反编译结果
```rust
v6 = Arguments {
    pieces: ["Source file does not exist: "],
    fmt: &v0
    args: &[Argument] {
        ptr: 0
        len: <UNKNOWN>
    }
};
v4 = alloc::fmt::format::format_inner(&v6);
```
Listing 5: In this case, the correct field order of Arguments is pieces, args, and fmt, which does not align with the field order in the source. OXIDIZER is not able to recover the correct definition of Arguments with the wrong field order, as Listing 5b shows.
清单 5:在本例中,Arguments 结构体的正确字段顺序为 pieces、args、fmt,这与源码中的字段顺序并不一致。如清单 5b 所示,当字段顺序错误时,OXIDIZER 无法还原 Arguments 的正确定义。
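💡 This example shows the challenge that Rust struct layout optimization poses for decompilation:
- Field reordering in Rust: to reduce padding and the resulting struct size, the Rust compiler may reorder struct fields, so the field order in memory can differ from the order in the source.
- The decompiler's dependency: OXIDIZER relies on a correct struct layout definition to recover high-level semantics. With the wrong field order it assigns data to the wrong fields, and type recovery fails (for example, the fmt field is wrongly assigned &v0 and the contents of the args field are lost).
- Knock-on effects: a wrong struct definition skews the analysis of later function calls (such as format_inner), which reduces the readability and accuracy of the decompiled output.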
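The effect of reordering is visible in struct sizes. The sketch below declares the same three fields twice, once with the default layout and once with repr(C); the struct names are hypothetical. On common targets the repr(C) version needs 12 bytes because of padding, while the default layout is free to place the two u8 fields together and typically needs 8, although that number is not guaranteed.

```rust
use std::mem::size_of;

/// Default (`repr(Rust)`) layout: the compiler may reorder the fields.
#[allow(dead_code)]
struct Reorderable {
    a: u8,
    b: u32,
    c: u8,
}

/// `repr(C)` layout: fields stay in declaration order, as in C.
#[allow(dead_code)]
#[repr(C)]
struct DeclarationOrder {
    a: u8,
    b: u32,
    c: u8,
}

/// Returns (size with default layout, size with `repr(C)` layout).
fn sizes() -> (usize, usize) {
    (size_of::<Reorderable>(), size_of::<DeclarationOrder>())
}
```

A decompiler that assumes declaration order for the first struct would attach the wrong type to at least one offset, which is the failure mode Listing 5b illustrates.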
Fidelity Issue 10: Unstable Enum Variant Discriminants
保真度问题 10:不稳定的枚举变体判别值
Rust allows custom discriminant values to be assigned to enum variants. However, unless explicitly specified, these values are not guaranteed to be stable across different compiler versions [40]. By default, the compiler typically assigns discriminant values starting from zero and incrementing by one, but this is not guaranteed, because the compiler may choose other values for optimization. This instability can pose challenges to decompilers, as correct discriminants are fundamental to recovering pattern matching constructs.
Rust 允许为枚举变体分配自定义判别值。但除非显式指定,这些值不保证在不同编译器版本间稳定[40]。默认情况下,编译器通常从 0 开始递增分配判别值,但为优化目的,该行为不被保证。这种不稳定性会给反编译器带来挑战,因为正确的判别值是还原模式匹配结构的基础。
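For contrast, the following sketch shows an enum whose discriminants are pinned in the source next to a data-carrying enum whose in-memory tag is left to the compiler. The types Level and Shape are made up for illustration. For the second enum, source code can only ask whether two values share a variant, not which number the binary stores.

```rust
/// Explicit discriminants with a fixed integer representation: the tag
/// values are part of the source.
#[allow(dead_code)]
#[repr(u8)]
#[derive(Clone, Copy)]
enum Level {
    Low = 1,
    High = 10,
}

/// Data-carrying enum with the default representation: the tag values in
/// memory are chosen by the compiler.
#[allow(dead_code)]
enum Shape {
    Circle(f32),
    Square(f32),
}

/// Reads the declared discriminant of a fieldless enum.
fn level_tag(l: Level) -> u8 {
    l as u8
}

/// Compares variants without exposing the numeric tag.
fn same_variant(a: &Shape, b: &Shape) -> bool {
    std::mem::discriminant(a) == std::mem::discriminant(b)
}
```

A decompiler has the opposite problem: it sees the numeric tag in the binary and must map it back to a variant name, which requires knowing the compiler's assignment.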
5. Rust Decompilation
Rust 反编译
We design a Rust decompiler, OXIDIZER, as shown in Figure 1. Given a Rust binary, OXIDIZER begins with control-flow graph (CFG) generation and Rust-specific Binary-level Analysis, identifying the Rust compiler version and locating standard library functions; it then loads the corresponding struct and function type database. For each function, OXIDIZER first generates the function-level CFG (fCFG), and performs fCFG Simplification without Types to remove resource-release and safety-check code. It then conducts Rust Type Inference, recovering variable locations, function prototypes, and local variable types, including structs and enums. With these types, OXIDIZER applies fCFG Simplification with Types to further refine the control flow. In the Structuring phase, OXIDIZER converts the fCFG into high-level control-flow constructs such as if-else and loops, as well as Rust-specific constructs such as pattern matching and error propagation. Finally, in the Rust Pseudocode Generation stage, it outputs structured Rust-like pseudocode.
我们设计 Rust 反编译器 OXIDIZER,如图 1 所示。输入 Rust 二进制后,OXIDIZER 首先生成控制流图(CFG)并执行 Rust 专用二进制级分析,识别 Rust 编译器版本与定位标准库函数;随后加载对应结构体与函数类型数据库。对每个函数,OXIDIZER 先生成函数级 CFG(fCFG),执行无类型 fCFG 简化以移除资源释放与安全检查代码;接着进行 Rust 类型推断,还原变量位置、函数原型与局部变量类型(含结构体与枚举);基于这些类型,执行带类型 fCFG 简化进一步优化控制流。在结构化阶段,OXIDIZER 将 fCFG 转换为 if-else、循环等高层控制流结构,以及模式匹配、错误传播等 Rust 特有结构。最后在 Rust 伪代码生成阶段,输出结构化类 Rust 伪代码。

In detail, OXIDIZER employs angr for CFG and fCFG construction. After the two steps, OXIDIZER diverges from angr decompilation by its novel Rust-specific simplification, type inference, and control-flow structuring tailored to Rust semantics, which enables the recovery of Rust-only abstractions absent in traditional C-oriented decompilers.
具体而言,OXIDIZER 基于 angr 构建 CFG 与 fCFG。完成这两步后,OXIDIZER 脱离 angr 反编译流程,采用全新的 Rust 专用简化、类型推断与适配 Rust 语义的控制流结构化,实现传统面向 C 反编译器不具备的 Rust 独有抽象还原。
5.1. Binary-level Analysis
二进制级分析
Rust Compiler Version Identification. During binary-level analysis, OXIDIZER first identifies the Rust compiler version used for the target binary. We achieve this goal using FLIRT [51], which recovers function symbols by converting the bytes of the function into a signature and matching it against a database of known signatures. We build FLIRT signatures for each Rust compiler version from 1.39.0 to 1.93.0, and match the Rust binary with every FLIRT signature file. FLIRT produces the most matches when the Rust binary and the FLIRT signature file are produced from the same version of the Rust compiler. OXIDIZER uses this feature to identify the Rust compiler version.
Rust 编译器版本识别。二进制级分析阶段,OXIDIZER 首先识别目标二进制使用的 Rust 编译器版本。我们使用 FLIRT[51] 实现该功能:将函数字节转换为签名,与已知签名库匹配以还原函数符号。我们为 1.39.0 到 1.93.0 的每个 Rust 编译器版本构建 FLIRT 签名,将 Rust 二进制与所有 FLIRT 签名文件匹配。当 Rust 二进制与 FLIRT 签名文件来自同一 Rust 编译器版本时,FLIRT 匹配数最多,OXIDIZER 利用该特性识别编译器版本。
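The selection step amounts to picking the signature file with the most hits. The sketch below is a simplified stand-in, not OXIDIZER's implementation: it assumes each function has already been reduced to a u64 signature, and the SignatureFile type and identify_version function are hypothetical. Real FLIRT matching works on masked byte patterns rather than set membership.

```rust
use std::collections::HashSet;

/// Hypothetical stand-in for one signature file: a compiler version plus
/// the signatures built from that version's standard library.
struct SignatureFile {
    version: &'static str,
    signatures: HashSet<u64>,
}

/// Returns the version whose signature file matches the most functions,
/// or `None` if no file matches anything.
fn identify_version(binary_functions: &[u64], files: &[SignatureFile]) -> Option<&'static str> {
    files
        .iter()
        .map(|f| {
            let hits = binary_functions
                .iter()
                .filter(|sig| f.signatures.contains(*sig))
                .count();
            (hits, f.version)
        })
        .filter(|(hits, _)| *hits > 0)
        .max_by_key(|(hits, _)| *hits)
        .map(|(_, version)| version)
}
```

With one file per release, the cost is one pass over the binary's functions per candidate version.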
Rust Standard Library Function Identification. If symbols are not present in the binary, OXIDIZER applies the FLIRT signature file with the highest match rate in the prior stage to identify Rust standard library functions.
Rust 标准库函数识别。若二进制中无符号,OXIDIZER 使用前一阶段匹配率最高的 FLIRT 签名文件识别 Rust 标准库函数。
FLIRT Signature Propagation. However, FLIRT has two limitations in identifying Rust standard library functions. First, because FLIRT identifies known functions using masked function prologue bytes (e.g., function call addresses are masked), it is unable to distinguish simple functions that consist only of one or several function calls. Second, FLIRT is not able to identify unknown monomorphized generic functions. For example, the function core::ptr::drop_in_place<T> may have a different implementation for each generic type T.
FLIRT 签名传播。但 FLIRT 在识别 Rust 标准库函数时有两点局限:第一,FLIRT 使用掩码处理的函数序言字节识别已知函数(如函数调用地址会被掩码),无法区分仅包含一个或少量函数调用的简单函数;第二,FLIRT 无法识别未知的单态化泛型函数,例如 core::ptr::drop_in_place<T> 对每个泛型类型 T 可能有不同实现。
FLIRT fails to identify a large amount of standard library functions, especially resource-release functions, due to these limitations. To address this issue, OXIDIZER propagates FLIRT-recognized function symbols to their caller functions if their caller functions only contain a single call. Additionally, if an unmatched function contains calls only to resource-release functions (identified by FLIRT), we mark that function as a resource-release function as well. This process is recursively applied until a fixed point is reached.
这些局限导致 FLIRT 无法识别大量标准库函数,尤其是资源释放函数。为解决该问题,OXIDIZER 将 FLIRT 识别的函数符号传播到仅包含单个调用的调用函数。此外,若未匹配函数仅调用 FLIRT 识别的资源释放函数,则将其标记为资源释放函数。该过程递归执行直至达到不动点。
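The second propagation rule, marking functions that call only resource-release functions, is a standard fixed-point computation over the call graph. The following sketch covers only that rule and assumes a simplified call graph keyed by integer function ids; the function propagate_release and its data layout are hypothetical, not OXIDIZER's code.

```rust
use std::collections::{HashMap, HashSet};

/// Marks every function that calls only known resource-release functions
/// as a resource-release function itself, repeating until nothing changes.
/// `calls` maps a function id to the ids of the functions it calls;
/// `seeds` holds the functions already identified by signature matching.
fn propagate_release(calls: &HashMap<u32, Vec<u32>>, seeds: &HashSet<u32>) -> HashSet<u32> {
    let mut release = seeds.clone();
    loop {
        let mut changed = false;
        for (func, callees) in calls {
            let only_release =
                !callees.is_empty() && callees.iter().all(|c| release.contains(c));
            // `insert` returns true only when the function was not yet marked.
            if only_release && release.insert(*func) {
                changed = true;
            }
        }
        if !changed {
            return release;
        }
    }
}
```

The loop terminates because the marked set only grows and is bounded by the number of functions in the graph.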
Rust Standard Library Type Application. C decompilers often apply known C library function prototypes and struct definitions to improve decompilation quality. For example, angr parses and embeds thousands of function prototypes from glibc, Win32 APIs, and so on. Likewise, we parse struct definitions and function prototypes from different versions of Rust standard libraries, and embed them into OXIDIZER.
Rust 标准库类型应用。C 反编译器通常应用已知 C 库函数原型与结构体定义提升反编译质量,例如 angr 解析并嵌入数千个来自 glibc、Win32 API 等的函数原型。同理,我们解析不同版本 Rust 标准库的结构体定义与函数原型,嵌入到 OXIDIZER 中。
There are two additional challenges to apply struct and function types to Rust binaries: the layout of structs and the discriminants of enum variants are not guaranteed to be the same across different Rust binaries (Section 4, Fidelity Issues 9 & 10). We looked into the Rust compiler source code and found that while Rust does not guarantee the layout of structs and the discriminants of enum variants to be consistent across Rust compiler versions, the logic of struct layout reordering and discriminant assignment is deterministic within the same Rust compiler version, unrelated to what program it is compiling. As such, we build a Rust standard library type database for each Rust compiler version. OXIDIZER loads and applies the corresponding Rust standard library type database for the identified Rust compiler version.
将结构体与函数类型应用到 Rust 二进制还有两项额外挑战:结构体布局与枚举变体判别值不保证在不同 Rust 二进制间一致(第 4 节,保真度问题 9、10)。我们查阅 Rust 编译器源码发现:尽管 Rust 不保证结构体布局与枚举变体判别值在不同编译器版本间一致,但同一编译器版本内,结构体布局重排与判别值分配逻辑是确定的,与编译的程序无关。因此,我们为每个 Rust 编译器版本构建标准库类型数据库,OXIDIZER 为识别到的编译器版本加载并应用对应数据库。
5.2. fCFG-level Simplification without Types
无类型 fCFG 级简化
Resource Release Simplification. Rust's automatic resource management feature introduces extraneous complex code in decompilation (Section 4, Fidelity Issue 6). To tackle this issue, OXIDIZER removes Rust-only resource-release function calls and their related code, such as drop_in_place, drop, and __rust_dealloc.
资源释放简化。Rust 的自动资源管理特性在反编译中引入冗余复杂代码(第 4 节,保真度问题 6)。为解决该问题,OXIDIZER 移除 Rust 独有资源释放函数调用及其相关代码,如 drop_in_place、drop、__rust_dealloc。
Security Check Simplification. OXIDIZER removes function calls that are related to security checks (e.g., index out of bounds) as well as related code.
安全检查简化。OXIDIZER 移除与安全检查相关的函数调用(如越界索引)及其相关代码。
5.3. Rust Type Inference
Rust 类型推断
We found that Rust codebases have a high density of function calls (8.19 calls per function with 22.28 LoC on average in our dataset), and struct and enum types are prevalent (there are 16,650 struct types and 7,586 enum types among all 39,868 types in our dataset). However, traditional constraint-based type inference algorithms are mostly intra-procedural, which miss the type information from callee functions. They are also very conservative at inferring struct types, and are unable to infer enum types. As such, we implement inter-procedural function prototype inference to mitigate these limitations.
我们发现 Rust 代码库函数调用密度高(数据集中平均每个函数 22.28 行代码,含 8.19 次调用),结构体与枚举类型普遍(数据集 39 868 个类型中,有 16 650 个结构体类型、7 586 个枚举类型)。但传统基于约束的类型推断算法多为过程内算法,丢失被调用函数的类型信息,在推断结构体类型时非常保守,且无法推断枚举类型。因此,我们实现过程间函数原型推断以缓解这些局限。
Function Prototype Inference. OXIDIZER infers (1) struct and enum function return types and (2) struct function argument types.
函数原型推断。OXIDIZER 推断:(1)结构体与枚举函数返回类型;(2)结构体函数参数类型。
Struct and Enum Return Types. OXIDIZER analyzes both the callee function and its call sites to infer the return type of a callee function. When a Rust function returns a large struct (typically larger than 16 bytes), it writes the struct content to a caller-allocated buffer to which the first function argument points. OXIDIZER traverses all non-looping paths leading to every return block and collects memory writes to the buffer on each path. When writes to multiple offsets in the buffer or writes of multiple sizes exist, it merges the writes on all paths into one to infer the size of the returned struct (or enum, as we will explain next). For structs of 16 bytes or fewer, the Rust compiler may instead return them via registers (e.g., rax and rdx on x86-64). OXIDIZER handles this case by detecting multi-register returns through calling convention analysis.
结构体与枚举返回类型。OXIDIZER 同时分析被调用函数及其调用点,推断被调用函数返回类型。当 Rust 函数返回大结构体(通常大于 16 字节)时,会将结构体内容写入调用方分配的缓冲区,该缓冲区由函数第一个参数指向。OXIDIZER 遍历所有通向返回块的非循环路径,收集每条路径对缓冲区的内存写入。当存在对缓冲区多个偏移的写入或多尺寸写入时,合并所有路径写入以推断返回结构体(或枚举,后续说明)的大小。对 16 字节及以下的结构体,Rust 编译器可能通过寄存器返回(如 x86-64 上的 rax、rdx),OXIDIZER 通过调用约定分析检测多寄存器返回以处理该情况。
In most cases, the sizes of the inferred struct types (memory writes or register writes) along different paths are the same because Rust functions must return a fixed-size type. An exception is enum types where different enum variants may hold associated fields of different sizes. In this case, OXIDIZER infers the returned type as an enum type.
多数情况下,不同路径推断的结构体类型大小(内存写入或寄存器写入)相同,因为 Rust 函数必须返回固定大小类型。例外是枚举类型:不同枚举变体可持有不同大小的关联字段,此时 OXIDIZER 推断返回类型为枚举类型。
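A heavily simplified version of this inference can be written as a comparison of per-path write extents. The sketch below is an assumption-laden illustration: the Write and Inferred types and the rule "equal extents mean struct, different extents mean enum" are a reduction of the description above, and the real analysis also merges field offsets and handles register returns.

```rust
/// One memory write to the caller-allocated return buffer.
#[derive(Clone, Copy)]
struct Write {
    offset: usize,
    size: usize,
}

/// Hypothetical result of the size-based inference.
#[derive(Debug, PartialEq)]
enum Inferred {
    Struct { size: usize },
    Enum { max_size: usize },
    Unknown,
}

/// Number of buffer bytes covered by the writes on one path.
fn extent(path: &[Write]) -> usize {
    path.iter().map(|w| w.offset + w.size).max().unwrap_or(0)
}

/// Compares the extents of all paths that write to the buffer.
fn infer_return(paths: &[Vec<Write>]) -> Inferred {
    let extents: Vec<usize> = paths
        .iter()
        .map(|p| extent(p))
        .filter(|e| *e > 0)
        .collect();
    match (extents.iter().min(), extents.iter().max()) {
        (Some(lo), Some(hi)) if lo == hi => Inferred::Struct { size: *hi },
        (Some(_), Some(hi)) => Inferred::Enum { max_size: *hi },
        _ => Inferred::Unknown,
    }
}
```

Two paths that both fill 24 bytes yield a 24-byte struct, while a path that writes only an 8-byte tag next to one that writes 24 bytes suggests an enum with variants of different sizes.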
Error Handling Enum Return Types. OXIDIZER also infers error handling enum return types using heuristics. Result<T, E> and Option<T> are two common enum return types for recoverable error handling. If the inferred type is an enum type with two variants, OXIDIZER checks if the return type could be an error handling enum type using the following heuristics. As Section 4 explains, Rust binaries distinguish enum variants using the discriminant stored at the beginning of the enum value.
错误处理枚举返回类型。OXIDIZER 还使用启发式规则推断错误处理枚举返回类型。Result<T, E> 与 Option<T> 是处理可恢复错误的两种常见枚举返回类型。若推断类型为双变体枚举,OXIDIZER 使用以下启发式规则检查是否为错误处理枚举类型。如第 4 节所述,Rust 二进制使用存储在枚举值开头的判别值区分枚举变体。
The None variant of Option<T> has no associated fields. When an inferred enum type has two variants and one of them has a discriminant but no associated fields, OXIDIZER infers it as Option<T>, treating that variant as None and the other (with one associated field) as Some. The same heuristics also apply to Result<T, E>. To distinguish Ok from Err, OXIDIZER uses Rust's declaration order: the variant with the lower discriminant value is Ok. When niche-value optimization eliminates one discriminant (i.e., only one variant has a discriminant), OXIDIZER distinguishes variants based on context. For Option<T>, the missing discriminant indicates the Some variant. For Result<T, E>, if the matched variant leads to an early return after the call site, OXIDIZER assigns the other variant as Ok; otherwise, the larger variant is empirically assigned as Ok. Fully niche-optimized types such as Option<&T>, where the enum is represented as a single pointer with null indicating None, are indistinguishable from raw pointers and integers in binary code and are a limitation of OXIDIZER.
Option<T> 的 None 变体无关联字段。当推断枚举类型为双变体,且其中一个变体有判别值但无关联字段时,OXIDIZER 推断为 Option<T>,将该变体视为 None,另一个带单个关联字段的变体视为 Some。相同启发式规则适用于 Result<T, E>。为区分 Ok 与 Err,OXIDIZER 使用 Rust 声明顺序:判别值较小的变体为 Ok。当空位值优化消除一个判别值(即仅一个变体有判别值)时,OXIDIZER 基于上下文区分变体。对 Option<T>,缺失判别值表示 Some 变体;对 Result<T, E>,若匹配变体在调用点后导致提前返回,则将另一个变体设为 Ok,否则经验上将较大变体设为 Ok。完全空位优化类型(如 Option<&T>,枚举表示为单指针,空值表示 None)在二进制代码中与原始指针、整数无法区分,这是 OXIDIZER 的局限。
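The niche optimization mentioned above is easy to confirm with size_of. In this short sketch (helper names are hypothetical), Option<&u64> is exactly as large as a reference because the null pointer encodes None, while Option<u64> needs extra space for a tag because every bit pattern of u64 is a valid value.

```rust
use std::mem::size_of;

/// `Option<&T>` uses the null pointer as `None`, so no tag is stored.
fn option_ref_is_pointer_sized() -> bool {
    size_of::<Option<&u64>>() == size_of::<&u64>()
}

/// `Option<u64>` has no spare bit pattern, so a separate tag is needed.
fn option_u64_has_tag() -> bool {
    size_of::<Option<u64>>() > size_of::<u64>()
}
```

In the first case the binary contains only a pointer comparison against zero, which is why such values cannot be told apart from plain nullable pointers without further context.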
Struct Argument Types. When struct arguments are constructed at call sites or accessed in the callee function, OXIDIZER can infer struct argument types by collecting memory writes and reads.
结构体参数类型。当结构体参数在调用点构造或在被调用函数中访问时,OXIDIZER 可通过收集内存读写推断结构体参数类型。
Constraint-based Type Inference. We implement the constraint-based type inference algorithm Retypd [52] for OXIDIZER and enhance the type system and the solver with enum types. OXIDIZER performs function prototype inference for non-standard-library callee functions and the analyzed function itself. The recovered function argument types and return types will be translated to type constraints, which will significantly contribute to the type inference result.
基于约束的类型推断。我们为 OXIDIZER 实现基于约束的类型推断算法 Retypd[52],并用枚举类型增强类型系统与求解器。OXIDIZER 对非标准库被调用函数与待分析函数自身执行函数原型推断,还原的函数参数类型与返回类型会转换为类型约束,显著提升类型推断结果。
5.4. fCFG-level Simplification with Types
带类型 fCFG 级简化
Struct & Enum Initialization Recovery. OXIDIZER recovers the initialization of struct and enum variables, using the type and variable information recovered by the prior type inference stage.
结构体与枚举初始化还原。OXIDIZER 使用前一阶段类型推断还原的类型与变量信息,还原结构体与枚举变量初始化。
Outlining Macros. OXIDIZER outlines frequently used standard macros, including println, format, write, panic, and so on. Rather than relying on superficial syntactic pattern matching, OXIDIZER performs inter-procedural control-flow and data-flow analysis on the simplified fCFG, combined with recognized Rust standard library function symbols and inferred variable types, to identify macro expansion patterns across different compiler versions and optimization levels.
宏外联。OXIDIZER 对常用标准宏外联,包括 println、format、write、panic 等。不依赖表面语法模式匹配,OXIDIZER 对简化后的 fCFG 执行过程间控制流与数据流分析,结合已识别的 Rust 标准库函数符号与推断的变量类型,识别跨编译器版本与优化级别的宏展开模式。
Deref Coercion Simplification. OXIDIZER identifies explicit struct type conversions that are handled by Deref::deref function calls and reverts them to the original implicit form as in Rust. These function calls are sometimes inlined in Rust binaries. Because deref coercion is implemented differently for different type pairs, the simplification for inlined deref coercion in OXIDIZER must be specified for each type pair. OXIDIZER implements inlined deref coercion simplification for the &String→&str and &Vec<T>→slice pairs.
解引用强制转换简化。OXIDIZER 识别由 Deref::deref 函数调用处理的显式结构体类型转换,并将其还原为 Rust 中的原始隐式形式。这些函数调用有时在 Rust 二进制中内联。由于不同类型对的解引用强制转换实现不同,OXIDIZER 中内联解引用强制转换简化必须为每个类型对指定。OXIDIZER 实现 &String→&str 与 &Vec→切片 类型对的内联解引用强制转换简化。
5.5. Structuring
结构化
Pattern Matching Recovery. This component aims to identify potential pattern matching constructs. OXIDIZER traverses all condition nodes and checks if the condition is a comparison between the discriminant of an enum typed value and the expected discriminant. When eligible, OXIDIZER replaces the node with a match or if let node.
模式匹配还原。该模块旨在识别潜在模式匹配结构。OXIDIZER 遍历所有条件节点,检查条件是否为枚举类型值的判别值与预期判别值的比较。符合条件时,OXIDIZER 将节点替换为 match 或 if let 节点。
Error Propagation Simplification. After pattern matching recovery, OXIDIZER attempts to find all recovered pattern matching structures that have an early return branch. Eligible matches are simplified to the ? error propagation operator.
错误传播简化。模式匹配还原后,OXIDIZER 尝试查找所有带提前返回分支的已还原模式匹配结构,符合条件的匹配将简化为 ? 错误传播运算符。
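The two recovery steps correspond to a well-known source-level equivalence. The sketch below uses hypothetical function names: the first function has the shape that pattern matching recovery produces, a match whose Err arm returns early, and the second is the same logic after the match has been folded into the ? operator.

```rust
use std::num::ParseIntError;

/// A match whose `Err` arm returns early: the shape left after pattern
/// matching recovery.
fn double_with_match(s: &str) -> Result<i32, ParseIntError> {
    let n = match s.parse::<i32>() {
        Ok(v) => v,
        Err(e) => return Err(e),
    };
    Ok(n * 2)
}

/// The same function after the match is simplified to `?`.
fn double_with_question_mark(s: &str) -> Result<i32, ParseIntError> {
    let n = s.parse::<i32>()?;
    Ok(n * 2)
}
```

The two functions return identical results for every input here because the error type needs no conversion; when the ? operator converts the error through From, the explicit match would need that call as well.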
6. Decompilation Evaluation Metrics
反编译评估指标
Existing work evaluates decompilation quality primarily along two dimensions: conciseness [6, 7, 53] and source-code fidelity [8, 31]. Conciseness measures code complexity (e.g., Cyclomatic Complexity [54]), while fidelity reflects how closely the output matches the original source [31]; these two dimensions have been shown to be unaligned [8]. Existing fidelity metrics are either qualitative and manually assessed [31] or tailored specifically to C [8].
现有工作主要从两个维度评估反编译质量:简洁性[6, 7, 53] 与源码保真度[8, 31]。简洁性衡量代码复杂度(如圈复杂度[54]),保真度反映输出与原始源码的贴近程度[31];已有研究表明这两个维度并不完全一致[8]。现有保真度指标要么是定性且人工评估[31],要么专门针对 C 设计[8]。
Although prior work evaluates decompiled C code, evaluating complexity becomes harder when decompiling non-C languages using C-oriented tools. Some work applies machine-learning metrics such as string edit distance [55], but we find these insufficient for capturing structural aspects such as control flow and types. Therefore, we design new metrics and extend existing ones to assess the conciseness and fidelity for decompiling Rust binaries and especially malware [9, 28, 56].
尽管已有工作评估反编译 C 代码,但用面向 C 的工具反编译非 C 语言时,复杂度评估更困难。部分工作应用字符串编辑距离等机器学习指标[55],但我们发现这些指标不足以捕捉控制流、类型等结构特征。因此,我们设计新指标并扩展现有指标,评估 Rust 二进制(尤其是恶意软件[9, 28, 56])反编译的简洁性与保真度。
Conciseness Metrics
简洁性指标
We motivate these metrics using previous work by Enders et al. that evaluates the quality of decompilers on malware using human studies [56]. Their work identifies metrics that analysts found important during malware reverse engineering with various decompilers.
我们基于 Enders 等人的前期工作[56]设定这些指标,该工作通过人工评估评估反编译器在恶意软件上的质量,识别出分析人员在使用不同反编译器进行恶意软件逆向时认为重要的指标。
Cyclomatic Complexity. Cyclomatic Complexity (CC) [54] is a software engineering metric that quantitatively estimates code complexity and is used in previous decompiler evaluations [53]. CC measures the number of independent paths through a program, which estimates the impact of conditions on control flow. This metric rewards decompilers for reducing the total number of explicit conditions, like booleans in if-statements, which addresses a common complaint of decompiler users reported in previous work [56].
圈复杂度。圈复杂度(CC)[54] 是软件工程指标,定量评估代码复杂度,用于前期反编译器评估[53]。CC 衡量程序中独立路径数量,评估条件对控制流的影响。该指标奖励反编译器减少显式条件总数(如 if 语句中的布尔值),解决前期工作中反编译器用户反馈的常见问题[56]。
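For a single function, cyclomatic complexity is commonly computed from the control-flow graph as E - N + 2, where E is the number of edges and N the number of nodes. The sketch below applies that formula to an edge list; it assumes a connected graph with at least one edge, and the function name is hypothetical. How the paper's tooling counts CC on decompiled text is not specified here.

```rust
use std::collections::HashSet;

/// Cyclomatic complexity of one connected control-flow graph, computed as
/// E - N + 2 from its edge list. Nodes are identified by integer ids.
fn cyclomatic_complexity(edges: &[(u32, u32)]) -> usize {
    let nodes: HashSet<u32> = edges.iter().flat_map(|&(a, b)| [a, b]).collect();
    // Saturating subtraction guards against disconnected inputs, which the
    // formula does not cover.
    (edges.len() + 2).saturating_sub(nodes.len())
}
```

A straight-line function has complexity 1, and each if-else or loop adds one, so removing compiler-inserted check branches lowers the score directly.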
Lines of Code (LoC). LoC is the most common way to evaluate the conciseness of code [6, 7]. It gives an overview of general complexity in a program [57].
代码行数(LoC)。代码行数是评估代码简洁性最常用的方式[6, 7],可直观反映程序的整体复杂度[57]。
Number of Operators (NofOp). To compensate for lines of code full of chained expressions, we also count the total number of operators present. These operators include arithmetic, assignment, comparison, and logical operations. Similarly, previous work uses logical operators to estimate the complexity of decompilation [8]. This metric rewards decompilers that simplify data flow and conditions.
运算符数量(NofOp)。为弥补链式表达式导致的行数失真,我们统计所有运算符总数,包括算术、赋值、比较与逻辑运算。与前期工作一致[8],该指标奖励简化数据流与条件的反编译器。
Number of Variables (NofVar). Another reported complaint of decompiler users in previous work [56] is that decompilation often introduces "unnecessary variables." As such, we also evaluate the number of variables present in decompilation, which can estimate a decompiler's ability to coalesce and simplify data access.
变量数量(NofVar)。前期工作[56] 中用户反馈的另一问题是反编译常引入"不必要变量"。因此我们评估反编译结果中的变量数量,衡量反编译器合并与简化数据访问的能力。
Fidelity Metrics
保真度指标
Previous work shows that decompilations closer to their source are more readable [31], and fidelity also serves as a proxy for correctness. To assess the fidelity of OXIDIZER on Rust code, we design metrics informed by the challenges in Section 4 as follows.
前期工作表明,更贴近源码的反编译结果可读性更高[31],保真度也可作为正确性的替代指标。为评估 OXIDIZER 对 Rust 代码的保真度,我们基于第 4 节的挑战设计如下指标。
Gotos. Many previous works [6–8, 53, 58] use the number of goto statements in decompilation as a conciseness metric. However, in Rust source code gotos are invalid control-flow structures. To align with source, Rust decompilers should emit zero gotos to be correct.
Goto 语句数。多项前期工作[6–8, 53, 58] 将反编译结果中的 goto 语句数量作为简洁性指标。但在 Rust 源码中,goto 是非法控制流结构。为与源码一致,合格的 Rust 反编译器应输出 0 个 goto。
Matched String Literals. String literals are key in software reverse engineering [56]. A good decompiler should preserve the strings found in the original source. Rust binaries notoriously make it harder to display string literals in decompilation. This metric measures the fidelity of data displayed in decompilation.
匹配字符串字面量。字符串字面量是软件逆向的关键信息[56],优秀的反编译器应保留源码中的字符串。Rust 二进制向来难以在反编译中正确显示字符串,该指标衡量反编译数据的保真度。
Matched Source Function Calls. Since Rust binaries are subject to extensive inlining and other transformations, an ideal Rust decompiler must emit each call the same number of times as it appears in the source. This metric measures control-flow fidelity as used in prior work [8].
匹配源码函数调用。由于 Rust 二进制包含大量内联与变换,理想的 Rust 反编译器应还原出与源码数量一致的各类调用。该指标沿用前期工作[8] 衡量控制流保真度。
Extraneous Function Calls. Rust compilers introduce extraneous function calls that do not align with any function calls in the source. An ideal Rust decompiler should remove these extraneous function calls. This metric measures control-flow fidelity similar to the previous metric.
冗余函数调用。Rust 编译器会引入源码中不存在的冗余函数调用,理想的 Rust 反编译器应移除这些调用。该指标与上一指标同理,衡量控制流保真度。
Matched Source Macro Calls. Like function calls, macro calls also indicate how good a decompiler is at recovering the original source control flow. In Rust, macros like println! are often used. High-fidelity Rust decompilation should match as many of these macros as possible.
匹配源码宏调用。与函数调用类似,宏调用反映反编译器还原原始源码控制流的能力。println! 等宏在 Rust 中广泛使用,高保真 Rust 反编译应尽可能还原这些宏。
Type Inference Accuracy. Following prior work [59], we evaluate type inference accuracy using precision, recall, and F1 score for local variables, function argument variables, and function return types.
类型推断准确率。沿用前期工作[59],我们用精确率、召回率与 F1 分数评估局部变量、函数参数变量与函数返回类型的推断效果。
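The three scores follow from counts of true positives, false positives, and false negatives. The sketch below shows the standard formulas; the function name prf1 is hypothetical, and what counts as a correct type match is defined by the evaluation, not by this code.

```rust
/// Precision, recall, and F1 score from true-positive, false-positive,
/// and false-negative counts. Ratios with a zero denominator are
/// reported as 0.0.
fn prf1(tp: u32, fp: u32, fn_: u32) -> (f64, f64, f64) {
    let ratio = |num: u32, den: u32| {
        if den == 0 { 0.0 } else { f64::from(num) / f64::from(den) }
    };
    let precision = ratio(tp, tp + fp);
    let recall = ratio(tp, tp + fn_);
    let f1 = if precision + recall == 0.0 {
        0.0
    } else {
        2.0 * precision * recall / (precision + recall)
    };
    (precision, recall, f1)
}
```

For example, 8 correct types, 2 wrong types, and 8 variables left untyped give a precision of 0.8 and a recall of 0.5.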
7. Evaluation
评估
Our evaluation consists of four parts: a comparative evaluation, an ablation study, a human evaluation, and a malware case study. In the comparative study, we evaluate OXIDIZER against existing C and Rust decompilation techniques regarding conciseness and fidelity losses (Section 7.1). We then study the contribution of each component in OXIDIZER through an ablation study (Section 7.2). We also conduct a user study to measure if and how much OXIDIZER improves humans' performance in Rust reverse engineering tasks (Section 7.3). Finally, we include several case studies to highlight the advantages of OXIDIZER in more concrete scenarios (Section 7.4).
本文评估包含四部分:对比评估、消融实验、人工评估与恶意软件案例研究。对比研究中,我们将 OXIDIZER 与现有 C/Rust 反编译技术在简洁性与保真度上对比(7.1 节);随后通过消融实验研究 OXIDIZER 各组件的贡献(7.2 节);通过用户研究衡量 OXIDIZER 对人类 Rust 逆向任务性能的提升(7.3 节);最后通过多个案例研究展示 OXIDIZER 在实际场景中的优势(7.4 节)。
7.1. Comparative Evaluation
对比评估
Datasets. We include 27 projects from the top 50 Rust repositories on GitHub (ranked by stars), excluding the Rust compiler repository, tutorials, non-Rust projects (e.g., those embedding Rust libraries), and projects that we could not build for non-trivial reasons. We also include the Rust reimplementation of GNU Coreutils [19], as GNU Coreutils is widely used in prior work on decompilation [6–8]. The complete list of selected Rust projects can be found in Appendix A.
数据集。我们选取 GitHub Star 排名前 50 的 Rust 仓库中的 27 个项目,排除 Rust 编译器仓库、教程、非 Rust 项目(如嵌入 Rust 库的项目)与无法正常构建的项目。同时加入 GNU Coreutils 的 Rust 重实现版[19],因其在前期反编译研究中广泛使用[6–8]。所选 Rust 项目完整列表见附录 A。
We compiled these projects on Ubuntu 22.04 LTS with several optimization levels (O0, O1, O2, O3, Os, and Oz) and two different versions of Rust compilers (nightly-2025-05-22 and nightly-2023-05-22)². Additionally, we disable link-time optimization (LTO) and LLVM inlining to minimize the effects of function inlining. With inlining enabled, source-level baselines become unreliable for certain metrics (e.g., matched strings and matched functions), because inlined functions no longer correspond one-to-one with source functions. We provide evaluation results on inlining-enabled binaries (nightly-2025-05-22, O3) in Appendix D to demonstrate OXIDIZER's effectiveness under different settings. All binaries are stripped.
我们在 Ubuntu 22.04 LTS 上编译这些项目,使用多种优化级别(O0、O1、O2、O3、Os、Oz)与两个 Rust 编译器版本(nightly-2025-05-22 与 nightly-2023-05-22)²。同时禁用链接时优化(LTO)与 LLVM 内联,最小化函数内联影响。启用内联时,内联函数与源码函数不再一一对应,字符串匹配、函数匹配等指标的源码基准不可靠。我们在附录 D 提供启用内联二进制(nightly-2025-05-22,O3)的评估结果,证明 OXIDIZER 在不同配置下的有效性。所有二进制均已去符号。
² We use nightly Rust compiler versions so that we can use the unstable option -Z inline-llvm=false to disable LLVM inlining.
² 我们使用 nightly 版 Rust 编译器,以便使用不稳定选项 -Z inline-llvm=false 禁用 LLVM 内联。
Method. For conciseness and fidelity, we evaluate OXIDIZER using the metrics described in Section 6 and compare against existing decompilers including angr, Hex-Rays, Ghidra, Binary Ninja (with Pseudo C representation), and Binary Ninja (with Pseudo Rust representation). Although Ghidrust is another potential baseline, we exclude it from our experiments due to its high failure rate. In particular, when tested on the 30,151 functions that Ghidra successfully decompiled, Ghidrust failed to transpile 80% of them. We only include functions that are successfully decompiled by all decompilers.
方法。针对简洁性与保真度,我们用第 6 节所述指标评估 OXIDIZER,并与现有反编译器对比:angr、Hex-Rays、Ghidra、Binary Ninja(伪 C 表示)、Binary Ninja(伪 Rust 表示)。尽管 Ghidrust 是潜在基准,但因其高失败率被排除。具体而言,在 Ghidra 成功反编译的 30 151 个函数中,Ghidrust 80% 无法转译。我们仅纳入所有反编译器均成功反编译的函数。
Results. We present results for one configuration (nightly-2025-05-22 with O3). OXIDIZER similarly outperforms other decompilers under the other two representative configurations (2025-O0 and 2023-O3), which are presented in Appendix D. All decompilers successfully decompiled 12,472 functions in the 2025-O3 dataset.
结果。我们展示一组配置(nightly-2025-05-22,O3)的结果,OXIDIZER 在另外两组代表性配置(2025-O0 与 2023-O3)下同样优于其他反编译器,结果见附录 D。所有反编译器在 2025-O3 数据集中均成功反编译 12 472 个函数。
OXIDIZER outperforms all other decompilers on all metrics except for matched function calls (marginally behind Binary Ninja). This is primarily because unmatched function calls are mostly caused by function inlining, but OXIDIZER does not implement function outlining because it is out of scope.
OXIDIZER 在除匹配函数调用外的所有指标上均优于其他反编译器(仅略低于 Binary Ninja),主要原因是不匹配的函数调用多由内联导致,而函数外联不在 OXIDIZER 实现范围内。
A notable result is that the average number of variables produced by OXIDIZER (10.65) remains significantly higher than in the original source code (0.70). We investigated this gap and found that it is largely due to Rust's functional programming style, which encourages the use of chained expressions. As a result, Rust code often avoids introducing intermediate variables. However, none of the evaluated decompilers, including OXIDIZER, actively reconstructs such chained expressions. It is also interesting that Binary Ninja (Pseudo Rust) produces 21% fewer operators than Binary Ninja (Pseudo C). That is because Binary Ninja (Pseudo Rust) converts algorithmic memory access expressions to macro-like annotations, which are counted as neither function calls nor operators.
值得注意的是,OXIDIZER 生成的平均变量数(10.65)仍远高于原始源码(0.70)。我们调查发现,这主要源于 Rust 函数式编程风格鼓励链式表达式,源码通常避免引入中间变量,而包括 OXIDIZER 在内的所有参评反编译器均未主动重建这类链式表达式。另一个有趣的结果是,Binary Ninja(伪 Rust)比 Binary Ninja(伪 C)少生成 21% 的运算符,原因是伪 Rust 模式将算法化内存访问表达式转换为类宏注解,既不算函数调用也不算运算符。
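The gap in variable counts can be illustrated with a small sketch. Both functions below are hypothetical (they are not taken from the evaluated binaries): the first is written the way Rust source often is, as one chained expression with no named locals, and the second performs the same computation with every intermediate value bound to a variable, which is closer to what decompiler output tends to look like.

```rust
/// Source-style version: one chained expression, no named locals.
pub fn sum_even_squares_chained(values: &[u32]) -> u32 {
    values.iter().filter(|v| **v % 2 == 0).map(|v| v * v).sum()
}

/// Decompiler-style version: the same computation, with each
/// intermediate value stored in its own variable.
pub fn sum_even_squares_expanded(values: &[u32]) -> u32 {
    let mut total = 0;
    let mut index = 0;
    let len = values.len();
    while index < len {
        let item = values[index];
        let remainder = item % 2;
        if remainder == 0 {
            let square = item * item;
            total += square;
        }
        index += 1;
    }
    total
}
```

The chained version declares no local variables, while the expanded version declares six, even though both return the same result for every input.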
OXIDIZER is also the only decompiler that can recover macro calls. It correctly recovers 9.8% of the macro calls from the source. OXIDIZER fails to recover the remaining targeted macros mainly because compiler optimizations introduce unexpected control flow and data flow, or because the macros are user-defined or standard macros not covered by OXIDIZER.
OXIDIZER 也是唯一能还原宏调用的反编译器,正确还原 9.8% 的源码宏调用。未能还原剩余目标宏的主要原因是编译器优化引入意外控制流与数据流,或宏为用户自定义、或不在 OXIDIZER 覆盖的标准宏范围内。

TABLE 1: Evaluation results on conciseness and fidelity metrics for the nightly-2025-05-22-O3 dataset. 12,472 functions are evaluated across 157 binaries. We present average and median values per function for each metric. We also show percentage relative to the source code in parentheses for average values. The best average values per metric are highlighted in bold.
表 1:nightly-2025-05-22-O3 数据集上的简洁性与保真度指标评估结果。本次评估涵盖 157 个二进制文件中的 12,472 个函数。表中呈现各指标下每个函数的平均值与中位数;平均值同时以括号形式给出相对于源码的百分比。每个指标的最佳平均值以粗体标注。

| Metric 指标 | Source Avg. 平均值 | Source Med. 中位数 | OXIDIZER Avg. 平均值 (相对源码占比) | OXIDIZER Med. 中位数 | angr Avg. 平均值 (相对源码占比) | angr Med. 中位数 | Hex-Rays Avg. 平均值 (相对源码占比) | Hex-Rays Med. 中位数 | Ghidra Avg. 平均值 (相对源码占比) | Ghidra Med. 中位数 | Binary Ninja (Pseudo C) Avg. 平均值 (相对源码占比) | Binary Ninja (Pseudo C) Med. 中位数 | Binary Ninja (Pseudo Rust) Avg. 平均值 (相对源码占比) | Binary Ninja (Pseudo Rust) Med. 中位数 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| CC 圈复杂度 | 2.99 | 2 | 3.20 (107.3%) | 1 | 3.90 (130.7%) | 2 | 4.14 (138.8%) | 2 | 4.07 (136.4%) | 2 | 4.23 (141.7%) | 2 | 4.23 (141.6%) | 2 |
| LoC 代码行数 | 21.71 | 11 | 41.06 (189.1%) | 21 | 49.58 (228.4%) | 25 | 47.88 (220.6%) | 23 | 61.21 (281.9%) | 29 | 53.75 (247.6%) | 25 | 56.07 (258.3%) | 26 |
| NofVar 变量数量 | 0.70 | 0 | 10.65 (1,530.3%) | 5 | 12.20 (1,752.8%) | 6 | 13.74 (1,974.2%) | 8 | 16.16 (2,321.7%) | 7 | 15.39 (2,211.6%) | 8 | 15.39 (2,211.6%) | 8 |
| NofOps 运算符数量 | 3.01 | 1 | 21.39 (711.0%) | 9 | 31.27 (1,039.3%) | 13 | 30.53 (1,014.8%) | 11 | 31.55 (1,048.6%) | 14 | 29.38 (976.4%) | 14 | 23.09 (767.4%) | 10 |
| Gotos Goto 语句数 | 0.00 | 0 | 0.28 | 0 | 0.51 | 0 | 0.79 | 0 | 0.74 | 0 | 0.50 | 0 | 0.50 | 0 |
| Matched Strings 匹配字符串数 | 1.38 | 0 | 0.62 (45.2%) | 0 | 0.10 (7.0%) | 0 | 0.00 (0.3%) | 0 | 0.39 (28.5%) | 0 | 0.04 (3.1%) | 0 | 0.04 (3.1%) | 0 |
| Matched Calls 匹配函数调用数 | 7.78 | 4 | 2.53 (32.5%) | 1 | 2.57 (33.0%) | 1 | 2.88 (37.1%) | 1 | 2.00 (25.7%) | 1 | 2.92 (37.6%) | 1 | 2.92 (37.6%) | 1 |
| Extraneous Calls 冗余函数调用数 | 0.00 | 0 | 2.79 | 1 | 3.67 | 1 | 3.88 | 1 | 3.08 | 1 | 4.10 | 1 | 4.10 | 1 |
| Matched Macros 匹配宏调用数 | 0.67 | 0 | 0.07 (9.8%) | 0 | 0.00 (0.0%) | 0 | 0.00 (0.0%) | 0 | 0.00 (0.0%) | 0 | 0.00 (0.0%) | 0 | 0.00 (0.0%) | 0 |

TABLE 2: Type inference evaluation results across decompilers for the nightly-2025-05-22-O3 dataset. The precision, recall, and F1 score are reported per type category, and the best scores are highlighted in bold. The total number of types per category is also provided. For struct and enum types, OXIDIZER outperforms other decompilers on all scores.
表 2:nightly-2025-05-22-O3 数据集下各反编译器的类型推断评估结果。表中按类型类别分别报告精确率、召回率与 F1 分数,最佳分数以粗体标注;同时提供每类别的总类型数量。在结构体与枚举类型上,OXIDIZER 的所有指标均优于其他反编译器。

| Category 类型类别 | OXIDIZER Precision 精确率 | OXIDIZER Recall 召回率 | OXIDIZER F1 | angr Precision 精确率 | angr Recall 召回率 | angr F1 | Hex-Rays Precision 精确率 | Hex-Rays Recall 召回率 | Hex-Rays F1 | Ghidra Precision 精确率 | Ghidra Recall 召回率 | Ghidra F1 | Binary Ninja Precision 精确率 | Binary Ninja Recall 召回率 | Binary Ninja F1 | Total 总数 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Overall 总体 | 4.57% | 15.40% | 7.05% | 0.84% | 3.63% | 1.36% | 1.15% | 5.20% | 1.88% | 0.79% | 4.18% | 1.32% | 0.75% | 4.27% | 1.27% | 53,113 |
| Primitive 基础类型 | 1.87% | 30.34% | 3.53% | 1.07% | 35.05% | 2.07% | 1.41% | 52.02% | 2.74% | 1.01% | 42.00% | 1.97% | 1.22% | 41.57% | 2.38% | 4,771 |
| Reference 引用类型 | 3.16% | 2.40% | 2.73% | 0.38% | 1.05% | 0.55% | 0.45% | 0.90% | 0.60% | 0.27% | 0.86% | 0.42% | 0.19% | 1.16% | 0.33% | 11,885 |
| Array 数组类型 | 0.00% | 0.00% | 0.00% | 0.02% | 0.79% | 0.03% | 0.11% | 11.81% | 0.22% | 0.09% | 6.30% | 0.17% | 0.00% | 0.00% | 0.00% | 127 |
| Struct 结构体类型 | 8.08% | 29.89% | 12.72% | 0.00% | 0.00% | 0.00% | 2.53% | 0.27% | 0.48% | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% | 16,681 |
| `Option<T>` 选项类型 | 3.28% | 17.72% | 5.53% | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% | 2,049 |
| Result<T, E> 结果类型 | 6.21% | 18.67% | 9.32% | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% | 3,536 |
| Other Enum 其他枚举类型 | 35.25% | 7.02% | 11.71% | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% | 2,180 |

Table 2 presents the precision, recall, and F1 scores for type inference achieved by OXIDIZER and existing decompilers. Struct types form the largest category of variables in our dataset (16,681 out of 53,113), and on them OXIDIZER achieves a precision of 8.08% and a recall of 29.89%, outperforming all other decompilers. Notably, OXIDIZER is the only decompiler capable of recovering enum types. We report `Option<T>` and `Result<T, E>` separately from other enum types because of their prevalence among all enum types. As discussed in Section 6, all evaluated decompilers tend to produce a large number of variables that do not exist in the original source code. We consider any inferred type associated with such non-existent variables as a false positive, which contributes to the low precision across many type categories and decompilers.
表 2 展示 OXIDIZER 与现有反编译器的类型推断精确率、召回率与 F1 分数。结构体类型是数据集中数量最多的变量类别(53 113 个中的 16 681 个),OXIDIZER 在该类型上精确率 8.08%、召回率 29.89%,优于所有其他反编译器。值得注意的是,OXIDIZER 是唯一能还原枚举类型的反编译器。我们将 Option<T>、Result<T, E> 与其他枚举类型分开,因其在所有枚举类型中最常见。如第 6 节所述,所有参评反编译器均会生成大量源码中不存在的变量,我们将与这些不存在变量关联的推断类型视为假正例,导致许多类型类别与反编译器的精确率偏低。
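The relationship between the three scores in Table 2 can be made concrete with a minimal sketch. It uses the standard definitions of precision, recall, and F1; the struct and function names are illustrative and are not taken from OXIDIZER's evaluation scripts. Following the paragraph above, a type attached to a variable that does not exist in the source would be counted under `false_positives`.

```rust
/// Per-category counts (field names are illustrative).
pub struct TypeCounts {
    /// Inferred types that match the source.
    pub true_positives: u32,
    /// Wrong types, plus types on variables absent from the source.
    pub false_positives: u32,
    /// Source variables whose type was not recovered.
    pub false_negatives: u32,
}

/// Divide two counts, returning 0.0 for an empty denominator.
fn ratio(numerator: u32, denominator: u32) -> f64 {
    if denominator == 0 {
        0.0
    } else {
        f64::from(numerator) / f64::from(denominator)
    }
}

pub fn precision(c: &TypeCounts) -> f64 {
    ratio(c.true_positives, c.true_positives + c.false_positives)
}

pub fn recall(c: &TypeCounts) -> f64 {
    ratio(c.true_positives, c.true_positives + c.false_negatives)
}

/// Harmonic mean of precision and recall.
pub fn f1(precision: f64, recall: f64) -> f64 {
    if precision + recall == 0.0 {
        0.0
    } else {
        2.0 * precision * recall / (precision + recall)
    }
}
```

Applied to the reported struct row, `f1(0.0808, 0.2989)` evaluates to roughly 0.1272, which agrees with the 12.72% listed in Table 2.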
7.2. Ablation Study
消融实验
We perform a cascade analysis by disabling each of the following three components in OXIDIZER individually and examining how the removal of each component affects both its direct output and downstream stages.
我们通过分别禁用 OXIDIZER 以下三个组件,进行级联分析,观察每个组件移除对直接输出与下游阶段的影响。
Rust Standard Library Function Identification. Disabling Rust standard library function identification means OXIDIZER cannot use any function symbols. In this case, the macro collapsing component fails because Rust standard library function calls are part of macro expansion patterns. The number of extraneous calls increased from 2.79 to 3.52 because OXIDIZER cannot identify resource-release functions to remove. These results highlight the critical role of symbol information in enabling effective Rust decompilation.
Rust 标准库函数识别。禁用该组件后,OXIDIZER 无法使用任何函数符号,宏收拢模块失效,因为 Rust 标准库函数调用是宏展开模式的一部分。冗余调用数从 2.79 升至 3.52,因为 OXIDIZER 无法识别并移除资源释放函数。这些结果凸显符号信息对高效 Rust 反编译的关键作用。
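To show where resource-release calls come from, here is a minimal sketch with a hypothetical `Tracked` type whose destructor increments a counter. The source of `use_two_values` contains no explicit release call, yet the compiler runs a destructor for each local at scope exit; in a compiled binary such calls appear as additional function calls that have no counterpart in the source.

```rust
use std::cell::Cell;

/// A value whose destructor records that it ran (illustrative type).
pub struct Tracked<'a> {
    drops: &'a Cell<u32>,
}

impl Drop for Tracked<'_> {
    fn drop(&mut self) {
        self.drops.set(self.drops.get() + 1);
    }
}

/// No release call is written here; the compiler inserts one per
/// local when the function returns.
pub fn use_two_values(drops: &Cell<u32>) -> u32 {
    let _first = Tracked { drops };
    let _second = Tracked { drops };
    // Both values are still alive, so nothing has been dropped yet.
    drops.get()
}
```

Starting from a counter of 0, the function returns 0, and the counter reads 2 after the call because both destructors have run.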
FLIRT Signature Propagation. The direct impact of disabling this component is a drop in standard library function identification from 12% to 8%. The impact is more pronounced for resource-release function identification, which drops from 23% to 3%, as most such functions are identified through propagated signatures. As a consequence, # Extraneous Calls increases from 2.79 to 3.34, as OXIDIZER can no longer remove calls to unrecognized resource-release functions. This also highlights the improvements of OXIDIZER's Rust symbol recovery over FLIRT.
FLIRT 签名传播。禁用该组件的直接影响是标准库函数识别率从 12% 降至 8%;资源释放函数识别率受影响更显著,从 23% 降至 3%,因这类函数多通过传播签名识别。结果冗余调用数从 2.79 升至 3.34,OXIDIZER 无法移除未识别资源释放函数的调用。这也凸显 OXIDIZER 的 Rust 符号还原相比 FLIRT 的改进。
Function Prototype Inference. In the nightly-2025-O3 dataset, 66% of functions are user-defined. Disabling function prototype inference means OXIDIZER can no longer infer parameter and return types for user-defined functions and the unrecognized Rust standard library functions. With this component disabled, struct recovery recall significantly drops to around 6% and enum recovery recall drops to nearly zero. This is because without function prototype inference, constraint-based type inference can only infer struct and enum types from known Rust standard library functions. Furthermore, the number of recovered match and if let expressions drops by 77% and 85% respectively (from 3,138 to 721 and 3,926 to 579), as accurate function prototypes are essential for identifying enum discriminant checks and reconstructing pattern matching logic.
函数原型推断。在 nightly-2025-O3 数据集中,66% 函数为用户自定义函数。禁用函数原型推断后,OXIDIZER 无法推断用户自定义函数与未识别 Rust 标准库函数的参数与返回类型。结构体还原召回率大幅降至约 6%,枚举还原召回率几乎为 0,因为无函数原型推断时,基于约束的类型推断仅能从已知 Rust 标准库函数推断结构体与枚举类型。此外,还原的 match 与 if let 表达式数量分别下降 77% 与 85%(从 3 138 降至 721、从 3 926 降至 579),因为准确的函数原型对识别枚举判别值检查、重建模式匹配逻辑至关重要。
7.3. Human Evaluation
人工评估
Method. To measure the impact of OXIDIZER on humans' performance in Rust reverse engineering tasks, we conduct an A/B test by dividing participants into two groups: an OXIDIZER group and a Hex-Rays group. Each participant is assigned four reverse engineering challenges, each targeting a distinct capability: environment checking, file encryption, reverse shell generation, and command-line parsing. Participants self-reported their reverse engineering expertise level (expert, intermediate, or beginner) prior to the study.
方法。为衡量 OXIDIZER 对人类 Rust 逆向任务性能的影响,我们开展 A/B 测试,将受试者分为两组:OXIDIZER 组与 Hex-Rays 组。每位受试者完成四项逆向挑战,分别对应不同能力:环境检查、文件加密、反弹 shell 生成、命令行解析。受试者在实验前自评逆向经验水平(专家、中级、初学者)。
Results. We collected three types of results from the human study: the scores participants achieved in the study, the time they spent to finish the study, and their perception of the decompilation quality.
结果。我们从人工评估中收集三类结果:受试者任务得分、完成耗时、对反编译质量的主观评价。

TABLE 3: Average task scores and completion times across self-reported reverse engineering expertise levels. The participant split between the OXIDIZER and Hex-Rays groups is shown for each expertise level.
表 3:不同自我报告逆向工程经验水平下的平均任务得分与完成时间。表中同时展示了各经验水平下 OXIDIZER 组与 Hex-Rays 组的受试者分配情况。

| Expertise 经验水平 | Participants 受试者总数 | Split 分组分配 | OXIDIZER Score (%) 得分(%) | OXIDIZER Time (sec.) 耗时(秒) | Hex-Rays Score (%) 得分(%) | Hex-Rays Time (sec.) 耗时(秒) |
|---|---|---|---|---|---|---|
| All 全部 | 37 | 18 / 19 | 86.87 | 1,951 | 67.94 | 2,449 |
| Expert 专家级 | 5 | 3 / 2 | 93.94 | 2,179 | 81.82 | 3,368 |
| Intermediate 中级 | 22 | 10 / 12 | 87.27 | 1,907 | 70.45 | 2,195 |
| Beginner 初学者 | 10 | 5 / 5 | 81.82 | 1,901 | 56.36 | 2,693 |

Table 3 presents the average scores and completion times of participants across different expertise levels. The results indicate that participants generally achieved higher scores and completed the tasks more quickly when using OXIDIZER compared to Hex-Rays, which suggests that OXIDIZER produces decompiled output that is easier to understand.
表 3 展示不同经验水平受试者的平均得分与完成时间。结果表明,使用 OXIDIZER 的受试者相比 Hex-Rays 组得分更高、完成更快,说明 OXIDIZER 的反编译输出更易理解。
In addition to objective task scores, user ratings further highlight a clear preference for OXIDIZER. Across all tasks, participants consistently rated OXIDIZER more favorably, with an overall average rating of 4.49 out of 5, compared to 2.61 for Hex-Rays. User comments consistently favored OXIDIZER's output for its readability and Rust-like structure, while Hex-Rays was often seen as more verbose and harder to follow. In summary, users not only performed better with OXIDIZER, but also perceived its decompilation output as significantly more understandable and helpful.
除客观任务得分外,用户评分进一步显示对 OXIDIZER 的明显偏好。所有任务中,受试者对 OXIDIZER 评分始终更高,总分 5 分下平均 4.49 分,Hex-Rays 仅 2.61 分。用户评论普遍认可 OXIDIZER 输出的可读性与类 Rust 结构,而 Hex-Rays 常被认为冗长难懂。总之,用户不仅用 OXIDIZER 表现更好,也认为其反编译输出显著更易懂、更有用。
7.4. Case Study
案例研究
Motivating Example. Compared to the three decompilation or reconstruction outputs shown in Listing 2, OXIDIZER produces the result closest to the original Rust source code (as Listing 1 shows).
示例案例。与清单 2 中的三组反编译或重构输出相比,OXIDIZER 生成的结果最贴近原始 Rust 源码(见清单 1)。
First, OXIDIZER removes extraneous resource-release code, making the control flow much clearer. Second, OXIDIZER recovers the Rust error handling constructs match and if let. These structures are essential for understanding control flow and error handling in Rust. In contrast, Hex-Rays decompiles these into C-style control structures, losing the original high-level semantics, while ChatGPT's reconstruction wrongly presents error propagation operators, which do not align with the source.
首先,OXIDIZER 移除冗余资源释放代码,让控制流更清晰;其次,还原 Rust 错误处理结构 match 与 if let,这些结构对理解 Rust 控制流与错误处理至关重要。相比之下,Hex-Rays 将其反编译为 C 风格控制结构,丢失原始高层语义;而 ChatGPT 重构错误呈现错误传播运算符,与源码不符。
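The difference between recovered pattern matching and C-style control structures can be sketched as follows. The `Mode` enum and the tag values 0, 1, and 2 are hypothetical and chosen only for illustration; the actual discriminant values and layout are decided by the compiler.

```rust
/// A hypothetical enum with two payload-carrying variants.
pub enum Mode {
    Encrypt(String),
    Decrypt(String),
    Help,
}

/// Rust-style: `match` names each variant and binds its payload.
pub fn describe(mode: &Mode) -> String {
    match mode {
        Mode::Encrypt(path) => format!("encrypt {}", path),
        Mode::Decrypt(path) => format!("decrypt {}", path),
        Mode::Help => String::from("help"),
    }
}

/// Rust-style: `if let` tests for one variant only.
pub fn encrypt_target(mode: &Mode) -> Option<&str> {
    if let Mode::Encrypt(path) = mode {
        Some(path.as_str())
    } else {
        None
    }
}

/// C-style view of `describe`: an integer tag compared against
/// constants (assumed here to be 0 and 1), with the payload passed
/// separately. The variant names are no longer visible.
pub fn describe_flat(tag: u8, payload: &str) -> String {
    if tag == 0 {
        format!("encrypt {}", payload)
    } else if tag == 1 {
        format!("decrypt {}", payload)
    } else {
        String::from("help")
    }
}
```

`describe` and `describe_flat` compute the same strings, but only the first tells the reader which cases exist and what each payload means.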
Despite these advantages, OXIDIZER still has limitations. It fails to recover the types of function arguments a1 and a2 as &[u8].
尽管有这些优势,OXIDIZER 仍存在局限:未能将函数参数 a1、a2 还原为 &[u8] 类型。

TABLE 4: Evaluation results on conciseness and the # Gotos metric for stripped malware samples. 1,180 functions are evaluated. OXIDIZER outperforms Hex-Rays on all conciseness metrics, and produces fewer gotos.
表 4:去符号恶意软件样本的简洁性与 Goto 语句数指标评估结果。本次评估涵盖 1,180 个函数。OXIDIZER 在所有简洁性指标上均优于 Hex-Rays,且生成的 Goto 语句数量更少。

| Metric 指标 | OXIDIZER Avg. 平均值 | OXIDIZER Med. 中位数 | Hex-Rays Avg. 平均值 | Hex-Rays Med. 中位数 |
|---|---|---|---|---|
| CC 圈复杂度 | 12.63 | 7 | 20.25 | 10 |
| LoC 代码行数 | 103.52 | 57 | 149.47 | 77 |
| NofVar 变量数量 | 23.80 | 13 | 29.43 | 17 |
| NofOps 运算符数量 | 76.53 | 36 | 96.77 | 45 |
| Gotos Goto 语句数 | 1.99 | 0 | 5.75 | 1 |

Malware Analysis. We conduct an evaluation on eight stripped Rust malware samples that we collected from [13]. The complete list of selected Rust projects can be found in Appendix B. Table 4 shows that OXIDIZER outperforms Hex-Rays on real-world samples in terms of conciseness. We do not include fidelity metrics since we do not have the source code of these malware samples to compare with.
恶意软件分析。我们在从[13]收集的 8 个去符号 Rust 恶意软件样本上开展评估,所选样本完整列表见附录 B。表 4 显示,OXIDIZER 在真实样本的简洁性上优于 Hex-Rays。因无这些恶意软件样本的源码对比,未纳入保真度指标。
Listing 6: The mode selection logic in Luna as decompiled by OXIDIZER and Hex-Rays.
清单 6:Luna 中模式选择逻辑的 OXIDIZER 与 Hex-Rays 反编译结果对比。
(a) OXIDIZER 反编译代码
```rust
if std::path::Path::is_file(*(v14 as &i64), *((v14 + 16) as &i64)) {
    println!("{} {} {} encrypting", v12, v14 - 24, v14);
    ...
}
```
(b) Hex-Rays 反编译代码
```c
std::path::Path::is_file();
if (v8) {
    if (v1 <= v5) {
        core::panicking::panic_bounds_check();
    }
    if (v1 <= v4) {
        core::panicking::panic_bounds_check();
    }
    *(_QWORD *)&v23 = v0;
    *((_QWORD *)&v23 + 1) = ::fmt;
    *(_QWORD *)&v24 = v2 - 24;
    *((_QWORD *)&v24 + 1) = ::fmt;
    v25 = (__int128 *)v2;
    v26 = ::fmt;
    *(_QWORD *)&v27 = &off_74480;
    *((_QWORD *)&v27 + 1) = 4LL;
    *(_QWORD *)&v28 = 0LL;
    v29 = &v23;
    v30 = 3LL;
    std::io::stdio::_print();
    ...
}
```
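- Readability: OXIDIZER's output keeps Rust-level constructs and shows the file check and the `println!` macro call directly, so the semantics are clear. The Hex-Rays output is dominated by bounds-check panics, low-level pointer operations, and hand-expanded formatting logic, which obscures the high-level intent.
- Effect of extraneous code: Hex-Rays does not remove the bounds-check code generated by the Rust compiler, which lengthens the output and distracts from the core logic.
- Value of macro recovery: OXIDIZER recovers the `println!` macro, whereas Hex-Rays shows only the memory writes of its expanded implementation and cannot identify the macro call.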
We also conduct a case study to demonstrate how OXIDIZER and Hex-Rays perform differently on real-world malware samples. Listing 6 contains the mode selection code in Luna [60] as decompiled by OXIDIZER and Hex-Rays, respectively. OXIDIZER's Rust-specific simplifications result in decompiled code that is much easier to understand:
我们还开展了案例研究,以展示 OXIDIZER 与 Hex-Rays 在真实恶意软件样本上的表现差异。清单 6 展示了 Luna [60] 中模式选择逻辑的代码,分别由 OXIDIZER 与 Hex-Rays 反编译生成。OXIDIZER 针对 Rust 的专用简化优化,使反编译结果更易理解:
(1) The extraneous code removal simplifications eliminate bounds checking code and memory deallocation code;
冗余代码消除优化移除了边界检查代码与内存释放代码;
(2) OXIDIZER recovers the expanded println! macro calls, with the string pieces and arguments.
OXIDIZER 还原了展开后的 println! 宏调用,包括字符串片段与参数。
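What "string pieces and arguments" means can be sketched with a simplified model. The `render` function below is hypothetical and is not the real `core::fmt::Arguments` layout, which is internal to the standard library; it only shows how a format string splits into literal pieces that are interleaved with the formatted arguments. Under this model the format string in Listing 6(a) has four pieces and three arguments, which is consistent with the constants 4 and 3 in Listing 6(b), although that reading of the Hex-Rays output is an assumption.

```rust
/// Interleave literal pieces with already-formatted arguments.
/// Simplified model for illustration, not the standard library's
/// internal representation.
pub fn render(pieces: &[&str], args: &[String]) -> String {
    let mut out = String::new();
    for (i, piece) in pieces.iter().enumerate() {
        out.push_str(piece);
        // Each piece except possibly the last is followed by one argument.
        if let Some(arg) = args.get(i) {
            out.push_str(arg);
        }
    }
    out
}

/// The same message produced by the formatting macro itself.
pub fn message(a: &str, b: &str, c: &str) -> String {
    format!("{} {} {} encrypting", a, b, c)
}
```

Collapsing the expansion back into a macro call amounts to recognising the pieces array and the argument array and printing them as a single format string again.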
Listing 7: The VM listing logic in BlackCat (Sphynx) as decompiled by OXIDIZER and Hex-Rays. In OXIDIZER's decompilation (Listing 7a), we can clearly see that the ESXi command vim-cmd vmsvc/getallvms is executed, and the result is propagated using Rust's ? operator to handle errors concisely.
清单 7:BlackCat(Sphynx)勒索软件中虚拟机列表逻辑的 OXIDIZER 与 Hex-Rays 反编译结果对比。在 OXIDIZER 的反编译结果(清单 7a)中,可以清晰地看到代码执行了 ESXi 命令 vim-cmd vmsvc/getallvms,并通过 Rust 的 ? 运算符简洁地传递错误结果。
(a) OXIDIZER 反编译代码
```rust
fn compat_core::esxi::vm::remove_snapshots() -> Result<(), Error> {
    ...
    v25 = compat_core::esxi::utils::esxi_run_command_with_output("vim-cmd vmsvc/getallvms")?;
    ...
}
```
(b) Hex-Rays 反编译代码
```c
__int64 compat_core::esxi::vm::remove_snapshots(__int64 *a1)
{
    ...
    compat_core::esxi::utils::esxi_run_command_with_output(&v46, aVimCmdVmsvcGet, 23LL);
    v2 = v48;
    if (v46 == (char **)((char *)&dword_0 + 1)) {
        *a1 = v47;
        a1[1] = v2;
        return v1;
    }
    ...
}
```
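Differences in recovering high-level semantics

- OXIDIZER: recovers idiomatic Rust syntax, including the function signature, the `Result` return type, and the `?` error-propagation operator, so it is immediately clear that the function runs a command and handles errors.
- Hex-Rays: shows only C-style pointer operations, conditional branches, and address comparisons. The Rust error-handling semantics are lost, which makes the logic hard to follow.

Practical value for malware analysis

- OXIDIZER's output makes it clear that the malware runs `vim-cmd vmsvc/getallvms` to enumerate virtual machines, exposing how BlackCat targets virtualized environments.
- The Hex-Rays output does not convey this intent. An analyst has to work through many low-level operations to reconstruct the command execution and error handling, which raises the cost of analysis.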
Listing 7 shows a comparison between OXIDIZER and Hex-Rays for a snippet from the remove_snapshots function, which lists virtual machines by executing a command. This function internally executes the ESXi command vim-cmd vmsvc/getallvms, and the result is propagated using Rust's ? operator to handle errors concisely.
清单 7 对比了 remove_snapshots 函数的一段代码,该函数通过执行命令列出虚拟机。函数内部会执行 ESXi 命令 vim-cmd vmsvc/getallvms,并使用 Rust 的 ? 运算符简洁地传递错误结果。
Based on the recovered types, OXIDIZER successfully identifies the error-propagation construct and simplifies the decompilation into idiomatic Rust syntax using the ? operator. This makes the high-level semantics of the original code much easier to understand. By contrast, the Hex-Rays output shows low-level pointer comparisons and conditional branches, making it harder to identify the intent of error propagation.
基于还原出的类型信息,OXIDIZER 成功识别了错误传递结构,并将反编译结果简化为使用 ? 运算符的地道 Rust 语法,大幅提升了原始代码高层语义的可读性。相比之下,Hex-Rays 的输出仅呈现底层指针比较与条件分支,难以识别错误传递的设计意图。
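The construct being recovered can be illustrated with a small, self-contained sketch. The two hypothetical functions below behave identically: the first uses the `?` operator, and the second spells out roughly what the operator stands for, namely a check of the `Result` followed by an early return of the converted error. A C decompiler shows the second form, usually as a comparison on the discriminant.

```rust
use std::num::ParseIntError;

/// Concise form: `?` returns early with the error if parsing fails.
pub fn parse_with_question_mark(text: &str) -> Result<i32, ParseIntError> {
    let value = text.trim().parse::<i32>()?;
    Ok(value + 1)
}

/// Expanded form: an explicit check and early return, approximating
/// what `?` does for `Result`.
pub fn parse_with_explicit_match(text: &str) -> Result<i32, ParseIntError> {
    let value = match text.trim().parse::<i32>() {
        Ok(v) => v,
        Err(e) => return Err(From::from(e)),
    };
    Ok(value + 1)
}
```

For the input `" 41 "` both functions return `Ok(42)`, and for a non-numeric input both return the same parse error.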
With these recovered Rust high-level abstractions, malware analysts can quickly focus on the normal execution flow and reason about the core functionality without being distracted by the low-level noise introduced by compilation.
凭借这些还原的 Rust 高层抽象,恶意软件分析人员可快速聚焦正常执行流,分析核心功能,不受编译引入的底层冗余信息干扰。
LLM4Decompile. We evaluate LLM4Decompile [61] (6.7B-v2) by feeding it Ghidra's decompilation output for 1,562 functions from the Rust coreutils binaries. Of these, 1,318 functions (84.4%) produced degenerate output: the model either repeatedly generated the same tokens or was truncated before completing the decompilation. For the remaining 244 functions, we randomly sampled 30 functions for manual inspection and found 28 of them (93.3%) suffered from hallucination, including fabricated struct types, missing code, incorrect constants, invented control-flow logic, and broken syntax. These results indicate that current fine-tuned LLM-based decompilers remain unreliable for practical Rust decompilation. Beyond reliability, LLM4Decompile produces only textual C output and cannot recover Rust-specific abstractions such as enums, pattern matching, or macros.
LLM4Decompile。我们用 LLM4Decompile[61](6.7B-v2)处理 Rust coreutils 二进制中 1 562 个函数的 Ghidra 反编译输出。其中 1 318 个函数(84.4%)生成退化输出:模型重复生成相同令牌,或在完成反编译前被截断。剩余 244 个函数中,随机抽样 30 个人工检查,发现 28 个(93.3%)存在幻觉,包括伪造结构体类型、代码缺失、常量错误、虚构控制流逻辑、语法错误。这些结果表明,当前微调的基于 LLM 的反编译器对实际 Rust 反编译仍不可靠。除可靠性外,LLM4Decompile 仅生成文本 C 输出,无法还原枚举、模式匹配、宏等 Rust 特有抽象。
8. Limitations
局限
In this section, we discuss some of the limitations in OXIDIZER and areas for future work in Rust decompilation.
本节讨论 OXIDIZER 的局限与 Rust 反编译的未来研究方向。
Generalizability. OXIDIZER relies on pattern-matching heuristics tailored to specific compiler behaviors, which may break when new rustc versions change code generation patterns, enum layouts, or ABI conventions. Adapting to a new compiler version requires regenerating version-specific type databases and potentially updating analysis rules. Additionally, OXIDIZER targets a fixed set of high-level constructs (structs, enums, and macros) and does not recover other Rust abstractions such as traits, generics, or closures.
泛化性。OXIDIZER 依赖适配特定编译器行为的模式匹配启发式规则,当新版本 rustc 改变代码生成模式、枚举布局或 ABI 约定时,规则可能失效。适配新版本编译器需要重新生成版本专用类型数据库,可能还需更新分析规则。此外,OXIDIZER 仅针对固定的高层结构(结构体、枚举、宏),未还原 trait、泛型、闭包等其他 Rust 抽象。
Compiler Optimization. A noticeable consequence of Rust compiler optimization is the presence of gotos in Rust binaries. Compiler optimization also hinders OXIDIZER from recovering Rust high-level abstractions during macro recovery, string recovery, and so on, by causing unexpected control flow and data flow in decompilation. Previous researchers have found that compiler optimization is the cause of extraneous gotos in C decompilation [8], and they revert compiler optimization before structuring to avoid unexpected schemas that cause extraneous gotos. The same idea of reverting compiler optimization may also apply to Rust decompilation, but we leave that to future work.
编译器优化。Rust 编译器优化的一个明显后果是 Rust 二进制中出现 goto 语句。编译器优化还会在反编译中引入意外控制流与数据流,阻碍 OXIDIZER 在宏还原、字符串还原等阶段还原 Rust 高层抽象。前期研究者发现,编译器优化是 C 反编译中冗余 goto 的原因[8],并在结构化前回退编译器优化,避免导致冗余 goto 的意外模式。回退编译器优化的思路也可用于 Rust 反编译,我们将其留作未来工作。
Lossy Compilation. Decompilation is challenging due to the loss of information inherent in the compilation process, notably in high-level types [62]. For example, a function that returns a struct may look the same as one that returns an enum with two variants in decompilation, if the two variants are the same size as the struct. It may be hard, or even impossible, for decompilers to distinguish these two cases deterministically. Recent researchers have used machine learning and LLMs to overcome the indeterminacy of decompilation [14–16, 59].
有损编译。反编译的挑战源于编译过程固有的信息丢失,尤其是高层类型[62]。例如,返回结构体的函数与返回双变体枚举的函数,在反编译中可能看起来相同(若两个变体与结构体大小一致),反编译器很难甚至无法确定性区分这两种情况。近期研究者使用机器学习与 LLM 克服反编译的不确定性[14–16, 59]。
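The struct-versus-enum ambiguity can be demonstrated with a minimal sketch. Both types below are hypothetical, and explicit `repr` attributes are used so that their layouts are defined; the default Rust layout is unspecified, so this is an illustration of the problem rather than a statement about how rustc lays out any particular type.

```rust
use std::mem::size_of;

/// A plain struct: a 32-bit field followed by a 32-bit value.
#[repr(C)]
pub struct Tagged {
    pub tag: u32,
    pub value: u32,
}

/// A two-variant enum: a 32-bit discriminant followed by a 32-bit payload.
#[repr(u32)]
pub enum Either {
    Left(u32),
    Right(u32),
}

/// Sizes in bytes of the struct and the enum.
pub fn sizes() -> (usize, usize) {
    (size_of::<Tagged>(), size_of::<Either>())
}
```

Both types occupy eight bytes with a 32-bit integer at offset 0, so a function returning either of them moves the same bytes. Without further evidence, such as how callers test the first field, a decompiler has nothing in the binary that tells the two apart.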
Obfuscation. Similar to previous work [6–8], OXIDIZER is not resilient to anti-static analysis techniques, including packing and other code obfuscation techniques. As a static decompiler, OXIDIZER also cannot handle self-modifying code (SMC) or JIT-compiled code, which may appear in Rust malware. Instead, we rely on previous work on general deobfuscation, some of which operate on the control-flow graph level [63] or binary level [64], and would be agnostic to Rust.
混淆。与前期工作[6–8]类似,OXIDIZER 无法抵抗加壳、代码混淆等抗静态分析技术。作为静态反编译器,OXIDIZER 也无法处理 Rust 恶意软件中可能出现的自修改代码(SMC)或 JIT 编译代码。我们依赖前期通用去混淆工作,部分工作在控制流图级别[63]或二进制级别[64]运行,与 Rust 无关。
Recompilation. OXIDIZER does not generate recompilable Rust code. State-of-the-art C decompilers, including angr, Hex-Rays, Binary Ninja, and Ghidra, also do not guarantee recompilability. Previous research has explored reassembly [65, 66] and recompilable IRs [67], whereas to the best of our knowledge, there is no research work on recompilable decompilation.
可重编译性。OXIDIZER 不生成可重编译的 Rust 代码。包括 angr、Hex-Rays、Binary Ninja、Ghidra 在内的最优 C 反编译器也不保证可重编译性。前期研究探索了重汇编[65, 66]与可重编译中间表示[67],但据我们所知,尚无关于可重编译反编译的研究工作。
Language Applicability. Although challenges in decompiling Rust binaries are similar to those in other high-level languages, addressing new languages, such as Swift, requires a different set of language-specific research. We inherently designed OXIDIZER to work on Rust; as a result, this limits the applicability of our techniques to other languages. However, we maintain that OXIDIZER will help broaden high-level decompilation research beyond the C/C++ family and improve program understanding for languages increasingly used in both benign and malicious software.
语言适用性。尽管 Rust 二进制反编译的挑战与其他高层语言类似,但解决 Swift 等新语言需要另一套语言专用研究。OXIDIZER 专为 Rust 设计,限制了技术在其他语言上的适用性。但我们认为,OXIDIZER 有助于将高层反编译研究拓展到 C/C++ 家族之外,提升对良性与恶意软件中日益常用的语言的程序理解能力。
9. Related Work
相关工作
Rust Support in C Decompilers. To the best of our knowledge, there is no publicly available specialized Rust decompiler. Analysts often use state-of-the-art C decompilers, including angr [18], Hex-Rays [10], Ghidra [11], and Binary Ninja [21], to analyze Rust binaries. Binary Ninja has offered a Pseudo Rust representation since version 4.2 [29], similar to OXIDIZER's Rust pseudocode generator component, which translates the structured AST into human-readable Rust pseudocode. Ghidrust [30] is a Rust binary analysis extension for Ghidra, which supports emitting Rust pseudocode by parsing decompiled C code. Ghidrust also supports identifying Rust binaries by searching for the error messages used in Rust standard library functions.
C 反编译器的 Rust 支持。据我们所知,尚无公开可用的专用 Rust 反编译器。分析人员通常使用 angr[18]、Hex-Rays[10]、Ghidra[11]、Binary Ninja[21] 等最优 C 反编译器分析 Rust 二进制。Binary Ninja 自 4.2 版本起提供伪 Rust 表示[29],与 OXIDIZER 的 Rust 伪代码生成模块类似,将结构化抽象语法树转换为人类可读的 Rust 伪代码。Ghidrust[30] 是 Ghidra 的 Rust 二进制分析扩展,支持解析反编译 C 代码生成 Rust 伪代码,还支持通过搜索 Rust 标准库函数中的错误信息识别 Rust 二进制。
Aside from Rust support in C decompilers, there is also a transpiler C2Rust [68] that migrates C99-compliant compilable C code to Rust.
除 C 反编译器的 Rust 支持外,还有转换器 C2Rust[68],可将符合 C99 标准的可编译 C 代码迁移为 Rust。
LLM-assisted Decompilation. Large language models (LLMs) have demonstrated remarkable performance in decompilation quality improvement. Notably, DeGPT [14] represents one of the earliest efforts in this direction, employing off-the-shelf LLMs as collaborative agents to refine decompiler outputs and improve readability. Other works, such as ReSym [16], employ LLMs for recovering variable names and types, while TypeForge [17] focuses specifically on reconstructing composite variable types. The most recent work, FidelityGPT [15], tackles fidelity distortion in decompiled outputs by leveraging distortion-aware prompting, retrieval-augmented generation, and variable dependency analysis to detect and correct semantic inconsistencies between decompiled code and its source, thereby enhancing its readability and accuracy. Even though they are not directly comparable with OXIDIZER as they either aim to improve decompilation readability or improve one component in the decompilation pipeline, these LLM-based techniques could potentially be adapted to improve Rust decompilation.
LLM 辅助反编译。大语言模型(LLM)在提升反编译质量上表现出色。DeGPT[14] 是该方向早期工作之一,使用现成 LLM 作为协作代理优化反编译器输出、提升可读性。ReSym[16] 等工作使用 LLM 还原变量名与类型,TypeForge[17] 专门聚焦复合变量类型重建。最新工作 FidelityGPT[15] 通过失真感知提示、检索增强生成、变量依赖分析,检测并修正反编译代码与源码的语义不一致,解决反编译输出的保真度失真,提升可读性与准确性。尽管这些工作或旨在提升反编译可读性、或仅优化反编译流程中的一个模块,与 OXIDIZER 不直接可比,但这类基于 LLM 的技术有望适配并改进 Rust 反编译。
Binary Decompilation. Since Cifuentes et al. laid the academic foundation [58, 69] for modern decompilers, many efforts have been made to improve the performance of decompilers. The scientific challenges in binary decompilation mainly include control-flow structuring, type recovery and variable name recovery. Phoenix [6] uses semantics-preservation and iterative refinement to reduce the number of gotos. DREAM [7] and rev.ng [53] further introduce algorithms to eliminate gotos. SAILR [8] reduces gotos by inverting compiler-aware transformations to maintain high structure similarity to the source code. To achieve type inference in binary decompilation, previous works have used program analysis techniques (TIE [70], Retypd [52], Osprey [71]) and machine learning techniques (DIRTY [72], TYGR [59]). High-precision variable name prediction is especially useful for making decompilation more understandable by humans. Many recent works use neural models for variable name prediction (DIRE [73], DIRTY [72], VarBERT [74], ReSym [16]). Unfortunately, these techniques target C decompilations and improve decompilation quality in general, but do not address the challenges in decompiling Rust binaries.
二进制反编译。自 Cifuentes 等人奠定现代反编译器的学术基础[58, 69]以来,大量工作致力于提升反编译器性能。二进制反编译的科学挑战主要包括控制流结构化、类型还原与变量名还原。Phoenix[6] 使用语义保留与迭代精化减少 goto 数量;DREAM[7] 与 rev.ng[53] 进一步提出消除 goto 的算法;SAILR[8] 通过反转编译器感知变换,保持与源码的高结构相似度,减少 goto。为实现二进制反编译的类型推断,前期工作使用程序分析技术(TIE[70]、Retypd[52]、Osprey[71])与机器学习技术(DIRTY[72]、TYGR[59])。高精度变量名预测对提升反编译的人类可读性尤为重要,近期多项工作使用神经网络模型预测变量名(DIRE[73]、DIRTY[72]、VarBERT[74]、ReSym[16])。遗憾的是,这些技术针对 C 反编译,仅整体提升反编译质量,未解决 Rust 二进制反编译的特有挑战。
10. Conclusion
结论
Decompilation plays a critical role in binary analysis, yet existing decompilers are designed primarily for C/C++ binaries and fall short when applied to Rust binaries directly. This is due to the fundamental differences in language design and semantics. In this work, we present the first comprehensive study of the challenges in Rust decompilation, identifying the fidelity issues and the underlying causes. Based on these insights, we propose a set of targeted techniques tailored for concise and high-fidelity Rust decompilation, presenting our prototype OXIDIZER. We conduct extensive evaluation to demonstrate the efficacy of OXIDIZER, making a significant step towards Rust binary analysis.
反编译在二进制分析中至关重要,但现有反编译器主要为 C/C++ 二进制设计,直接应用于 Rust 二进制时效果不佳,根源是语言设计与语义的根本性差异。本文首次全面研究 Rust 反编译的挑战,识别保真度问题与根本原因,基于这些发现提出一套针对性技术,实现简洁、高保真的 Rust 反编译原型 OXIDIZER。我们通过大量评估证明 OXIDIZER 的有效性,向 Rust 二进制分析迈出重要一步。
Acknowledgements
致谢
This paper has received funding from the Air Force Office of Scientific Research No. FA9550-24-1-020; the Advanced Research Projects Agency for Health (ARPA-H) No. SP4701-23-C-007; the Defense Advanced Research Projects Agency (DARPA) and Naval Information Warfare Center Pacific (NIWC Pacific) No. N66001-20-C-4020; and the National Science Foundation No. 2232915 and 214656.
本研究由以下项目资助:美国空军科学研究办公室项目 No. FA9550-24-1-020;健康高级研究计划局(ARPA-H)项目 No. SP4701-23-C-007;美国国防高级研究计划局(DARPA)与太平洋海军信息战中心(NIWC Pacific)项目 No. N66001-20-C-4020;以及美国国家科学基金会项目 No. 2232915 与 214656。
Ethics Considerations
伦理声明
This research on Rust decompilation was conducted with careful attention to ethical considerations. The goal of the research was to facilitate security analyses, such as malware analysis and binary auditing, not to facilitate software piracy, unauthorized access, or the circumvention of security mechanisms. The human study in this research was exempted by our Institutional Review Board (IRB) at Arizona State University. However, we followed the same ethics and privacy requirements that an IRB would normally enforce.
本项 Rust 反编译研究严格遵守伦理规范。研究目的是支持安全分析(如恶意软件分析、二进制审计),不用于软件盗版、未授权访问或绕过安全机制。本研究中的人体实验已获得亚利桑那州立大学机构审查委员会(IRB)豁免,但我们仍遵循 IRB 规定的全部伦理与隐私要求。
LLM Usage Considerations
大模型使用说明
LLMs were used for editorial purposes in this paper, and all outputs were inspected by the authors to ensure accuracy and originality. The methodology of this paper does not integrate LLMs or any ideas generated by LLMs. No LLMs were trained or fine-tuned in this process.
本文仅将大语言模型(LLM)用于编辑辅助,所有输出均经作者人工校验以保证准确与原创性。本文方法不集成 LLM,也不使用任何 LLM 生成的思路。本研究未训练或微调任何 LLM。
References
参考文献
\[1\] S. Klabnik and C. Nichols. The Rust programming language. No Starch Press, 2023.
\[2\] Airborne Engineering Limited. blethrs. https://github.com/airborneengineering/blethrs.
\[3\] H. Li, L. Guo, Y. Yang, S. Wang, and M. Xu. An empirical study of Rust-for-Linux: The success, dissatisfaction, and compromise. In 2024 USENIX Annual Technical Conference (USENIX ATC 24), pages 425–443, 2024.
\[4\] J. Corbet. Rust for embedded linux kernels. https://lwn.net/Articles/970216/, Apr. 2024.
\[5\] H. Evans. 5 malware variants you should know. https://www.reliaquest.com/blog/5-malware-variants-you-should-know/, Aug. 2024.
\[6\] D. Brumley, J. Lee, E. J. Schwartz, and M. Woo. Native x86 decompilation using semantics-preserving structural analysis and iterative control-flow structuring. In 22nd USENIX Security Symposium (USENIX Security 13), pages 353–368, Washington, D.C., Aug. 2013. USENIX Association.
\[7\] K. Yakdan, S. Eschweiler, E. Gerhards-Padilla, and M. Smith. No more gotos: Decompilation using pattern-independent control-flow structuring and semantics-preserving transformations. In NDSS. Citeseer, 2015.
\[8\] Z. L. Basque, A. P. Bajaj, W. Gibbs, J. O'Kain, D. Miao, T. Bao, A. Doupé, Y. Shoshitaishvili, and R. Wang. Ahoy SAILR! there is no need to DREAM of c: A Compiler-Aware structuring algorithm for binary decompilation. In 33rd USENIX Security Symposium (USENIX Security 24), Philadelphia, PA, Aug. 2024. USENIX Association.
\[9\] M. Botacin. What do malware analysts want from academia? a survey on the state-of-the-practice to guide research developments. In Proceedings of the 27th International Symposium on Research in Attacks, Intrusions and Defenses, pages 77--96, 2024.
\[10\] Hex-Rays. Disassemble, decompile and debug with IDA. 2025. https://hex-rays.com/.
\[11\] NSA. Ghidra. 2025. https://ghidra-sre.org/.
\[12\] N. Fishbein and J. A. Guerrero-Saade. Project 0xa11c deoxidizing the rust malware ecosystem. https://www.youtube.com/watch?v=3WsTYSUz-UQ, Oct. 2024.
\[13\] C. Xiao. Rust malware gallery. 2024. https://github.com/cxiao/rust-malware-gallery.
\[14\] P. Hu, R. Liang, and K. Chen. DeGPT: Optimizing decompiler output with LLM. In Proceedings 2024 Network and Distributed System Security Symposium, 2024.
\[15\] Z. Zhou, X. Li, R. Feng, Y. Zhang, Y. Li, W. Feng, Y. Wang, and Y. Li. FidelityGPT: Correcting decompilation distortions with retrieval augmented generation. In Proceedings 2026 Network and Distributed System Security Symposium (NDSS). Internet Society, 2026.
\[16\] D. Xie, Z. Zhang, N. Jiang, X. Xu, L. Tan, and X. Zhang. ReSym: Harnessing LLMs to recover variable and data structure symbols from stripped binaries. In Proceedings of the 2024 on ACM SIGSAC Conference on Computer and Communications Security, pages 4554--4568, 2024.
\[17\] Y. Wang, R. Liang, Y. Li, P. Hu, K. Chen, and B. Zhang. TypeForge: Synthesizing and selecting best-fit composite data types for stripped binaries. In 2025 IEEE Symposium on Security and Privacy (SP), pages 2847--2864. IEEE, 2025.
\[18\] Y. Shoshitaishvili, R. Wang, C. Salls, N. Stephens, M. Polino, A. Dutcher, J. Grosen, S. Feng, C. Hauser, C. Kruegel, and G. Vigna. SoK: (State of) the art of war: Offensive techniques in binary analysis. In 2016 IEEE Symposium on Security and Privacy (SP), pages 138--157, 2016.
\[19\] uutils. uutils coreutils. 2025. https://github.com/uutils/coreutils.
\[20\] Neyrian. Fakecrypt: Easy to use ransomware-like tool for linux or windows, developed in rust. GitHub repository, 2025.
\[21\] Vector35. Binary ninja. 2025. https://binary.ninja/.
\[22\] OpenAI. ChatGPT (GPT-5). https://chat.openai.com/, 2025.
\[23\] Z. Xu, S. Jain, and M. S. Kankanhalli. Hallucination is inevitable: An innate limitation of large language models. arXiv preprint arXiv:2401.11817, 2024. https://arxiv.org/abs/2401.11817.
\[24\] Microsoft Security Response Center. Why Rust for safe systems programming. https://msrc.microsoft.com/blog/2019/07/why-rust-for-safe-systems-programming/, 2019.
\[25\] Microsoft Threat Intelligence. Hive ransomware gets upgrades in Rust. https://www.microsoft.com/en-us/security/blog/2022/07/05/hive-ransomware-gets-upgrades-in-rust/, 2022.
\[26\] Intezer. WildCard: The APT Behind SysJoker Targets Critical Sectors in Israel. https://intezer.com/blog/wildcard-evolution-of-sysjoker-cyber-threat/, 2023.
\[27\] Rust Project Developers. The rustc-dev-guide. https://rustc-dev-guide.rust-lang.org/overview.html, 2024. Accessed: 2026-03-24.
\[28\] K. Yakdan, S. Dechand, E. Gerhards-Padilla, and M. Smith. Helping Johnny to analyze malware: A usability-optimized decompiler and malware analysis user study. In 2016 IEEE Symposium on Security and Privacy (SP), pages 158--177. IEEE, 2016.
\[29\] Binary Ninja Team. Binary Ninja 4.2 "Frogstar" Released. Nov. 2024. https://binary.ninja/2024/11/20/4.2-frogstar.html.
\[30\] DMaroo. GhidRust. 2024. https://github.com/DMaroo/GhidRust. Accessed: 2024-11-03.
\[31\] L. Dramko, J. Lacomis, E. J. Schwartz, B. Vasilescu, and C. L. Goues. A taxonomy of C decompiler fidelity issues. In 33rd USENIX Security Symposium (USENIX Security 24), pages 379--396, Philadelphia, PA, Aug. 2024. USENIX Association.
\[32\] M. F. Oberhumer, L. Molnár, and J. F. Reiser. UPX: The ultimate packer for executables. 2025. https://github.com/upx/upx.
\[33\] O. Technologies. Themida -- advanced windows software protection system. 2025. https://www.themida.com/.
\[34\] V. Software. VMProtect -- software protection utility. 2025. https://vmpsoft.com/vmprotect/.
\[35\] G. Iaculo. rustfuscator. GitHub repository. https://github.com/gianiac/rustfuscator.
\[36\] P. Dronavalli. Rust-obfuscator. GitHub repository. https://github.com/dronavallipranav/rust-obfuscator.
\[37\] frank2. goldberg. GitHub repository, 2025. https://github.com/frank2/goldberg.
\[38\] joaovarelas. Obfuscator-llvm-16.0. GitHub repository. https://github.com/joaovarelas/Obfuscator-LLVM-16.0.
\[39\] 0xlane. ollvm-rust. GitHub repository, 2025. https://github.com/0xlane/ollvm-rust.
\[40\] The Rust Language Project. The Rust Reference --- Enumerations. 2025. https://doc.rust-lang.org/reference/items/enumerations.html.
\[41\] F. S. Foundation. Link Options --- GNU Compiler Collection (GCC) Documentation. 2025. https://gcc.gnu.org/onlinedocs/gcc/Link-Options.html.
\[42\] T. L. Project. Clang --- The Clang C, C++, and Objective-C Compiler / Command Guide. https://clang.llvm.org/docs/CommandGuide/clang.html.
\[43\] The Rust Project. The Rust Reference --- Linkage. https://doc.rust-lang.org/reference/linkage.html.
\[44\] J. Thagen. min-sized-rust: How to minimize rust binary size. GitHub repository, 2025. https://github.com/johnthagen/min-sized-rust.
\[45\] cppreference.com contributors. malloc -- C reference. 2025. https://en.cppreference.com/w/c/memory/malloc.
\[46\] Steve Klabnik, Carol Nichols, and The Rust Project. The Rust Programming Language. No Starch Press, 2025.
\[47\] International Organization for Standardization (ISO) and International Electrotechnical Commission (IEC). Programming languages --- C. ISO/IEC, 1990.
\[48\] The Rust Project. Strings -- Rust by Example / The Big Book of Rust Interop. https://doc.rust-lang.org/rust-by-example/std/str.html and https://nrc.github.io/big-book-ffi/reference/strings.html, 2025.
\[49\] cppreference.com contributors. Struct declaration --- C reference. 2025. https://en.cppreference.com/w/c/language/struct.html.
\[50\] The Rust Language Project. The Rust Reference --- Type Layout. 2025. https://doc.rust-lang.org/reference/type-layout.html.
\[51\] Hex-Rays. Fast Library Identification and Recognition Technology (FLIRT). https://docs.hex-rays.com/user-guide/signatures/flirt.
\[52\] M. Noonan, A. Loginov, and D. Cok. Polymorphic type inference for machine code. In Proceedings of the 37th ACM SIGPLAN Conference on Programming Language Design and Implementation, pages 27--41, 2016.
\[53\] A. Gussoni, A. Di Federico, P. Fezzardi, and G. Agosta. A comb for decompiled C code. In Proceedings of the 15th ACM Asia Conference on Computer and Communications Security (ASIA CCS '20), pages 637--651, New York, NY, USA, 2020. Association for Computing Machinery.
\[54\] T. J. McCabe. A complexity measure. IEEE Transactions on Software Engineering, 4:308--320, 1976.
\[55\] I. Hosseini and B. Dolan-Gavitt. Beyond the C: Retargetable decompilation using neural machine translation. In Workshop on Binary Analysis Research (BAR), 2022.
\[56\] S. Enders, E.-M. C. Behner, N. Bergmann, M. Rybalka, E. Padilla, E. X. Hui, H. Low, and N. Sim. dewolf: Improving decompilation by leveraging user surveys. In Proceedings 2023 Workshop on Binary Analysis Research (BAR 2023). Internet Society, 2023.
\[57\] S. Bhatia and J. Malhotra. A survey on impact of lines of code on software complexity. In 2014 International Conference on Advances in Engineering \& Technology Research (ICAETR-2014), pages 1--4. IEEE, 2014.
\[58\] C. Cifuentes. Reverse compilation techniques. Queensland University of Technology, Brisbane, 1994.
\[59\] C. Zhu, Z. Li, A. Xue, A. P. Bajaj, W. Gibbs, Y. Liu, R. Alur, T. Bao, H. Dai, A. Doupé, et al. TYGR: Type inference on stripped binaries using graph neural networks. In 33rd USENIX Security Symposium (USENIX Security 24), pages 4283--4300, 2024.
\[60\] Kaspersky. Luna in Rust: new ransomware group emerges using cross-platform programming language. https://www.kaspersky.com/about/press-releases/lunain-rust-new-ransomware-group-emerges-using-crossplatform-programming-language, 2022.
\[61\] H. Tan, Q. Luo, J. Li, and Y. Zhang. LLM4Decompile: Decompiling binary code with large language models. arXiv preprint arXiv:2403.05286, 2024.
\[62\] W. K. Wong, H. Wang, Z. Li, Z. Liu, S. Wang, Q. Tang, S. Nie, and S. Wu. Refining decompiled C code with large language models. Technical report, The Hong Kong University of Science and Technology, 2023. https://arxiv.org/pdf/2310.06530v2.
\[63\] OALabs. Control-flow deobfuscation. 2022. https://research.openanalysis.net/angr/symbolic%20execution/deobfuscation/research/2022/03/26/angr_notes.html.
\[64\] S. Li, C. Jia, P. Qiu, Q. Chen, J. Ming, and D. Gao. Chosen-instruction attack against commercial code virtualization obfuscators. In Proceedings 2022 Network and Distributed System Security Symposium (NDSS), 2022.
\[65\] R. Wang, Y. Shoshitaishvili, A. Bianchi, A. Machiry, J. Grosen, P. Grosen, C. Kruegel, and G. Vigna. Ramblr: Making reassembly great again. In 24th Annual Network and Distributed System Security Symposium (NDSS 2017). The Internet Society, 2017.
\[66\] H. Kim, S. Kim, J. Lee, K. Jee, and S. K. Cha. Reassembly is hard: A reflection on challenges and strategies. In 32nd USENIX Security Symposium (USENIX Security '23), pages 1469--1486. USENIX Association, 2023.
\[67\] A. Altinay, J. Nash, T. Kroes, P. Rajasekaran, D. Zhou, A. Dabrowski, D. Gens, Y. Na, S. Volckaert, C. Giuffrida, H. Bos, and M. Franz. BinRec: Dynamic binary lifting and recompilation. In Proceedings of the 15th European Conference on Computer Systems (EuroSys '20), 2020.
\[68\] C2Rust demonstration. 2024. https://c2rust.com/. Accessed: 2024-11-03.
\[69\] C. Cifuentes and K. J. Gough. Decompilation of binary programs. Software: Practice and Experience, 25(7):811--829, 1995.
\[70\] J. Lee, T. Avgerinos, and D. Brumley. TIE: Principled reverse engineering of types in binary programs. In NDSS, 2011.
\[71\] Z. Zhang, Y. Ye, W. You, G. Tao, W.-c. Lee, Y. Kwon, Y. Aafer, and X. Zhang. OSPREY: Recovery of variable and data structure via probabilistic analysis for stripped binary. In 2021 IEEE Symposium on Security and Privacy (SP), pages 813--832. IEEE, 2021.
\[72\] Q. Chen, J. Lacomis, E. J. Schwartz, C. Le Goues, G. Neubig, and B. Vasilescu. Augmenting decompiler output with learned variable names and types. In 31st USENIX Security Symposium (USENIX Security 22), pages 4327--4343, 2022.
\[73\] J. Lacomis, P. Yin, E. Schwartz, M. Allamanis, C. Le Goues, G. Neubig, and B. Vasilescu. DIRE: A neural approach to decompiled identifier renaming. In 2019 34th IEEE/ACM International Conference on Automated Software Engineering (ASE), pages 628--639. IEEE, 2019.
\[74\] K. K. Pal, A. P. Bajaj, P. Banerjee, A. Dutcher, M. Nakamura, Z. L. Basque, H. Gupta, S. A. Sawant, U. Anantheswaran, Y. Shoshitaishvili, et al. "len or index or count, anything but v1": Predicting variable names in decompilation output with transfer learning. In 2024 IEEE Symposium on Security and Privacy (SP), pages 4069--4087. IEEE, 2024.
### Appendix A. Selected Popular Rust Projects for Evaluation
附录 A:评估所用的精选主流 Rust 项目
The 28 popular Rust projects we selected for evaluation are: rustdesk, uv, sway, alacritty, fuel-core, ripgrep, bat, meilisearch, starship, vaultwarden, typst, fuelsrs, ruff, helix, fd, lapce, nushell, swc, fish-shell, lineraprotocol, sniffnet, influxdb, firecracker, zoxide, surrealdb, turborepo, just, coreutils.
本次评估使用的 28 个主流 Rust 项目包括:rustdesk、uv、sway、alacritty、fuel-core、ripgrep、bat、meilisearch、starship、vaultwarden、typst、fuelsrs、ruff、helix、fd、lapce、nushell、swc、fish-shell、lineraprotocol、sniffnet、influxdb、firecracker、zoxide、surrealdb、turborepo、just、coreutils。
### Appendix B. Selected Real-world Malware Samples
附录 B:精选真实恶意软件样本
The 8 selected real-world Rust malware samples we used for the case study evaluation are sourced from the Rust Malware Gallery \[13\]. They include: Luna Ransomware (1cbbf1...ab51), Realst Stealer (2af0e2...13e2), CosmicRust (3315e5...29a), Convuster (947ae8...66a), RustBucket (9ca914...c747), RansomExx2 (a7ea1e...8c5c), BlackCat Sphynx (c0e70e...38cc), and RustBucket (de81e5...d500).
案例评估所用的 8 个真实 Rust 恶意软件样本来自 Rust Malware Gallery\[13\],包括:Luna 勒索软件、Realst 窃密木马、CosmicRust、Convuster、RustBucket、RansomExx2、BlackCat(Sphynx)、RustBucket。
### Appendix C. Human Study Details
附录 C:人工评估详情
#### C.1. A Sample Recruitment Email
C.1 招募邮件示例
Hi,
We're conducting a human study on OXIDIZER, a research prototype Rust decompiler developed by the SEFCOM lab at Arizona State University. The goal is to evaluate whether OXIDIZER can assist with reverse engineering Rust programs.
We're looking for 30 participants with programming experience (Rust experience not required). The study will be conducted remotely, and upon successful completion, you'll receive a $50 Amazon gift card as reward.
Thank you!
您好,
我们正在开展一项关于 OXIDIZER 的使用研究,它是亚利桑那州立大学 SEFCOM 实验室研发的 Rust 反编译器原型。研究目标是评估 OXIDIZER 是否能辅助 Rust 程序逆向分析。
我们招募 30 名有编程经验的参与者(无需 Rust 经验)。实验全程线上完成,完成后将获得 50 美元亚马逊礼品卡。
感谢参与!
#### C.2. An Example Task and Free-Text Question
C.2 任务与开放式问题示例
Q1. Describe the functionality of this function with at most three sentences.
问题 1:用最多三句话描述该函数的功能。
#### C.3. An Example Multiple-Choice Question
C.3 选择题示例
Q1. What will happen if this function fails to save the resulting file?
(a) It saves to a temporary file
(b) It logs an error and returns
(c) It panics immediately
(d) It returns without logging
(e) I don't know
问题 1:如果该函数保存文件失败,会发生什么?
(a) 保存到临时文件
(b) 打印错误并返回
(c) 直接 panic
(d) 直接返回不打印
(e) 不知道
### Appendix D. Additional Evaluation Results
附录 D:补充评估结果

### Appendix E. Meta-Review (S\&P 2026)
附录 E:程序委员会汇总评审意见
#### E.1. Summary
E.1 总结
This paper introduces Oxidizer, a novel decompiler specifically designed to handle compiled Rust binaries by recovering high-level Rust abstractions such as enums and macros. The authors systematically identify why traditional C/C++ decompilers fail on Rust executables and evaluate Oxidizer using both quantitative metrics and a human-subject study, showing impressive improvements over existing state-of-the-art C decompilers.
本文提出 Oxidizer,一款专门处理 Rust 二进制的新型反编译器,能够还原枚举、宏等 Rust 高层抽象。作者系统分析了传统 C/C++ 反编译器在 Rust 程序上失效的原因,并通过定量指标与人体实验评估 Oxidizer,结果显著优于现有最优 C 反编译器。
#### E.2. Scientific Contributions
E.2 学术贡献
* Creates a New Tool to Enable Future Science.
构建全新工具,支撑后续研究
* Provides a Valuable Step Forward in an Established Field.
在成熟领域中迈出重要一步。
#### E.3. Reasons for Acceptance
E.3 录用理由
1. This paper provides a valuable step forward in an established field. The difficulty of analyzing Rust binaries with existing C/C++ decompilers is a recognized challenge. By systematically collecting and characterizing these failures, and providing a targeted and well-engineered solution, the authors demonstrate clear practical improvements in readability.
本文在成熟领域中取得了重要进展。使用现有 C/C++ 反编译器分析 Rust 二进制文件的难度是公认的挑战。作者通过系统收集和归纳这些失败案例,并提供了针对性、设计精良的解决方案,证明了反编译结果可读性的显著提升。
2. Furthermore, the paper creates a new tool to enable future science. Oxidizer represents the first open-source, dedicated decompiler for Rust. By making this tool available and validating its real-world effectiveness through a rare and highly appreciated human-subject study, the authors have laid a strong foundation for future research in Rust binary analysis.
此外,本文开发了一款全新工具,为后续研究奠定了基础。Oxidizer 是首个开源的 Rust 专用反编译器。作者公开了该工具,并通过罕见且备受认可的人体实验验证了其实际效果,为 Rust 二进制分析领域的未来研究打下了坚实基础。
#### E.4. Noteworthy Concerns
E.4 主要顾虑
1. The proposed system relies heavily on specific pattern-matching and heuristics, which raises concerns about how brittle the tool might be against new compiler versions, aggressive optimizations, or future language features. The paper lacks a deeper discussion regarding a more principled, generalizable design approach to address the "test-of-time" survival of these techniques.
所提出的系统高度依赖特定的模式匹配与启发式规则,这引发了对工具鲁棒性的担忧:面对新编译器版本、激进优化或未来语言特性时,工具可能会失效。本文未深入讨论更具原则性、可泛化的设计方法,以确保这些技术能经受住长期使用的考验。
*** ** * ** ***
## reference
* Oxidizing Ubuntu: adopting Rust utilities by default 2025