2025 Q3 | Science of Security Virtual Organization

Leveraging Machine Learning for Binary Software Understanding

Research Team Status

Names of researchers and position
(e.g. Research Scientist, PostDoc, Student (Undergrad/Masters/PhD))
- Yan Shoshitaishvili - Lead PI, Associate Professor
- Adam Doupé - Co-I, Associate Professor
- Chitta Baral - Co-I, Professor
- Zion Basque - PhD Student
- Yibo Liu - PhD Student
- Ati Priya Bajaj - PhD Student
- Chang Zhu - PhD Student
- Divij Handa - PhD Student
- William Gibbs - PhD Student
- Michael Tompkins - PhD Student
- Jayakrishna Vadayath - PhD Student
Any new collaborations with other universities/researchers?
- None.

Project Goals

What is the current project goal?
- Task 3 (Option Year 2): Deriving Binary Software Intention
  - The goal of Task 3 is to help the human analyst answer the "why" of the binary software. A key part of answering that question is understanding which protocols the software uses to communicate, and why. An analyst needs this understanding to be able to interact with the binary software.
How does the current goal factor into the long-term goal of the project?
- Long-Term Goal: Achieving binary software understanding, in order to make identifying security issues much easier and cheaper.
- Task 3: Building on the semantically-equivalent decompilation created in Task 1 and the human descriptions of that decompilation in Task 2, in Task 3 we aim to understand the intention of the binary software. We are driven by the following intuition: much of the purpose of a program is in what it communicates. Therefore, we will focus on understanding the communication of the software, and on helping human analysts understand which protocols it uses to communicate and, more fundamentally, what the intention of that communication is.

Accomplishments

Address whether project milestones were met. If milestones were not met, explain why, and what are the next steps.
What is the contribution to foundational cybersecurity research? Was there something discovered or confirmed?
Impact of research
- Internal to the university (coursework/curriculum)
- External to the university (transition to industry/government (local/federal); patents, start-ups, software, etc.)
- Any acknowledgements, awards, or references in media?

Recompilable Decompilation:

March - July 2025: The goal of this project is to make angr's decompiled code recompilable, so that the decompiled code not only compiles successfully but the resulting binary also faithfully reproduces the behavior of the original. A key focus is verifying the correctness of the recompiled binaries' behavior. We validate this by aiming for byte equivalence between the original and the recompiled functions.

Decompiled code typically does not recompile out of the box because it does not conform to the C syntax rules expected by compilers like GCC. We have developed a preliminary pipeline that attempts to recompile the decompiled code and verify the functionality of the recompiled binary.

July - September 2025: We have been analyzing the byte equivalence of functions that recompiled successfully. Based on preliminary analysis, we found that incorrect callee function prototype recovery, incorrect arithmetic, and incorrect masking operations in angr's output prevent functions from being byte equivalent. We will continue to investigate such issues and fix them in angr as we encounter them.
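As an illustration of what a byte-equivalence check involves, the following is a minimal sketch, not the project's pipeline. It assumes the raw bytes of one function have already been extracted from the original and the recompiled binary. The names `ByteDiff` and `compare_function_bytes` are hypothetical, and the two byte strings are hand-written x86-64 examples rather than project results.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ByteDiff:
    equivalent: bool
    first_mismatch: Optional[int]  # offset of the first differing byte, if any
    length_delta: int              # len(recompiled) - len(original)


def compare_function_bytes(original: bytes, recompiled: bytes) -> ByteDiff:
    """Compare the raw bytes of one function taken from two binaries."""
    delta = len(recompiled) - len(original)
    for offset, (a, b) in enumerate(zip(original, recompiled)):
        if a != b:
            return ByteDiff(False, offset, delta)
    if delta != 0:
        # One body is a strict prefix of the other.
        return ByteDiff(False, min(len(original), len(recompiled)), delta)
    return ByteDiff(True, None, 0)


# Hand-written example (an assumption, not project data):
#   original:   mov eax, edi; and eax, 0xff;   ret
#   recompiled: mov eax, edi; and eax, 0xffff; ret
original = bytes.fromhex("89f825ff000000c3")
recompiled = bytes.fromhex("89f825ffff0000c3")
result = compare_function_bytes(original, recompiled)
```

Here the first mismatch falls inside the immediate operand of the `and` instruction, which is the kind of location an incorrect masking operation would point to. A real comparison would also need to account for relocations and address-dependent bytes, which this sketch ignores.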

Software Reconstruction and Collaborative Reverse Engineering

January - March 2025: The research investigates collaborative dynamics in software reconstruction within reverse engineering (RE), focusing on human factors in the recovery and recompilation phases. Unlike traditional RE, which is often an individual effort, this study explores reconstruction as a team-driven process, particularly in large-scale projects like video game recovery. The research analyzes the methodologies and workflows used by the video game community, a highly active and diverse group that engages in cross-platform and multi-language software reconstruction.

April - June 2025: To ground this investigation, we conducted a survey targeting experienced reverse engineers involved in collaborative reconstruction projects. The survey collected both qualitative and quantitative data on contributors’ challenges, workflows, and tools. By examining the lived experiences of practitioners, we gained valuable insights into knowledge sharing, role distribution, decision-making processes, and the use of ML-based tools and CI/CD pipelines. The responses also shed light on common obstacles in the recompilation phase, such as achieving byte-level equivalence, managing toolchain compatibility, and handling missing or undocumented components.

Focusing on the highly active video game reverse engineering community, this research uncovers not only the technical strategies but also the social infrastructure that enables distributed collaboration, ranging from onboarding newcomers to resolving conflicts and coordinating large teams.

The results contribute to a deeper understanding of RE as a social and collaborative process. Findings from the survey inform recommendations for improving tool support, promoting sustainable project practices, and fostering effective team communication. Ultimately, this work supports the broader goal of software preservation by redefining RE through the lens of collaborative reconstruction.

TYGR: TYGR was accepted to USENIX Security 2024. We presented the paper at the conference and have open-sourced the tool: https://github.com/sefcom/TYGR

July - September 2025: No updates.

Understanding Decompilation Metrics and Goals

In our recent exploratory work, we have been investigating how the various end-user goals of decompilation affect the way users evaluate decompilation quality. We are therefore first identifying the primary use cases of decompilation and then examining whether modern metrics align with those goals.

We have primarily focused on studying how cyber reasoning systems (CRS) powered by LLMs interact with decompilation that has been transformed to optimize for various metrics. The first task we have studied is vulnerability patching by LLMs on decompilation structured at varying distances from the source code. We measure distance using CFGED (control-flow graph edit distance), the metric introduced in our USENIX Security 2024 paper, SAILR.

In our preliminary results on patching bugs in the AIxCC dataset, namely on nginx, we have observed that a lower CFGED between the decompilation and the source is likely associated with better LLM reasoning for patching. Specifically, a CFGED of 0 is an indicator that an LLM will reason better about that code. This early work indicates that some of the metrics we have developed can directly predict the usefulness of our decompilation for LLMs. We intend to investigate whether other metrics better align with other tasks, such as vulnerability discovery.

We are actively exploring this area and intend to make more progress on it in the coming weeks to both establish goals and associated metrics.

Rust Decompilation

July - September 2025: Our research on Rust decompilation aims to develop a Rust decompiler on top of the angr C/C++ decompiler to generate Rust pseudocode. We have finished a prototype Rust decompiler called Oxidizer, which produces Rust pseudocode close to the original Rust source code. We have submitted our research paper to NDSS 2026. The latest decompilation of our manually crafted running example is shown below:

[Image Link: Rust Decompilation - UPDATED]

Our contributions are:

- We empirically study the Rust-specific decompilation issues that arise when decompiling Rust binaries with state-of-the-art C decompilers. We also identify which techniques a Rust decompiler should implement to solve these issues.
- We design a new decompilation pipeline and incorporate multiple techniques to address these decompilation issues. Future researchers can improve existing techniques or introduce new techniques based on our Rust decompiler prototype.
- We thoroughly study the effectiveness of Oxidizer by evaluating it on uutils coreutils, a Rust rewrite of the GNU Coreutils suite, and on real-world Rust malware samples.

We are currently creating a benchmark for evaluating Oxidizer that can also support future work on Rust decompilation.

Annotating Decompiled Code with LLM for Fuzzing

We are currently working on a fuzzing project in which LLMs read decompiled code and annotate it with feedback for the fuzzer. The annotations guide the fuzzer's exploration using the techniques described in the paper “Ijon: Exploring deep state spaces via fuzzing” by Aschermann et al.

On source code, this approach yields strong results, and we believe we can apply it to binary code as well by combining the latest results from decompilation, static binary rewriting, and LLMs.
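The idea behind annotation-guided feedback can be illustrated with a toy sketch. It is written in Python for brevity and is neither IJON's C annotation interface nor the fuzzer under development. It assumes an annotation that exposes a progress value to be maximized, in the spirit of IJON's maximization annotation; `SECRET`, `annotated_progress`, `mutations`, and `guided_search` are hypothetical names.

```python
SECRET = b"ANGR"  # hypothetical magic value checked by the toy target


def annotated_progress(data: bytes) -> int:
    """Toy target: count how many leading bytes of `data` match SECRET.

    In a real setup, this value would reach the fuzzer through an
    annotation placed in the (decompiled) code, not a return value.
    """
    matched = 0
    for got, want in zip(data, SECRET):
        if got != want:
            break
        matched += 1
    return matched


def mutations(data: bytes):
    """Yield every single-byte replacement of `data`, in a fixed order."""
    for pos in range(len(data)):
        for value in range(256):
            yield data[:pos] + bytes([value]) + data[pos + 1:]


def guided_search(length: int, max_rounds: int = 64) -> bytes:
    """Keep any mutation that increases the annotated progress value."""
    best = bytes(length)
    best_score = annotated_progress(best)
    for _ in range(max_rounds):
        improved = False
        for candidate in mutations(best):
            score = annotated_progress(candidate)
            if score > best_score:
                best, best_score, improved = candidate, score, True
                break
        if not improved:
            break
    return best


found = guided_search(len(SECRET))
```

Without the progress value, every wrong input looks the same to the search, so the full magic value would have to be guessed at once. With it, each correct byte is rewarded individually and the search converges one byte at a time.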

Publications and presentations

Add publication reference in the publications section below. An author's copy or final version should be added in the report file(s) section. This is for NSA's review only.
Optionally, upload technical presentation slides that may go into greater detail. For NSA's review only.
No new published papers since last quarterly report.

Lead PI:

Yan Shoshitaishvili

Co-PI(s):

Adam Doupé