Research Team Status

  • Names of researchers and position 
    (e.g. Research Scientist, PostDoc, Student (Undergrad/Masters/PhD))
    • Dung Thuy "Judy" Nguyen, PhD student
    • Kailani "Cai" Lemieux-Mack, PhD student


  • Any new collaborations with other universities/researchers?
    • Discussion with Noah Tobin and Jessica Inman at Georgia Tech. This was a useful discussion about selecting recent datasets appropriate for this line of research.

Project Goals

  • What is the current project goal?
     This quarter, we focused on improving our approach to malware dataset augmentation. Previously, we had focused on the BODMAS dataset; based on a conversation with researchers at Georgia Tech, we are now considering additional datasets.
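To illustrate what feature-space augmentation can look like, the sketch below interpolates pairs of same-class feature vectors (mixup-style) to generate plausible synthetic samples. The function name, parameters, and interpolation scheme are illustrative assumptions, not a description of our implemented method.

```python
import numpy as np

def augment_features(X, y, n_new, alpha=0.4, seed=0):
    """Generate synthetic feature vectors by interpolating pairs of
    real samples drawn from the same class (mixup-style augmentation).

    X : (n, d) array of malware feature vectors  -- hypothetical inputs
    y : (n,) array of class labels
    Returns (X_new, y_new) with n_new synthetic rows.
    """
    rng = np.random.default_rng(seed)
    classes = np.unique(y)
    X_new, y_new = [], []
    for _ in range(n_new):
        label = rng.choice(classes)               # pick a class at random
        idx = np.flatnonzero(y == label)          # samples of that class
        i, j = rng.choice(idx, size=2, replace=True)
        lam = rng.beta(alpha, alpha)              # interpolation weight
        X_new.append(lam * X[i] + (1.0 - lam) * X[j])
        y_new.append(label)
    return np.asarray(X_new), np.asarray(y_new)
```

Because each synthetic vector is a convex combination of two real same-class samples, it stays inside the elementwise range of that class's features, which keeps the augmented data plausible in feature space.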

    We have also begun exploring techniques for purifying malware classifiers so that they are more robust against adversarial perturbation and backdoor attacks -- attacks in which training data has been compromised by an adversary. The idea is to modify an existing, trained model so that it no longer carries a backdoor planted during training.
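As a minimal sketch of one pruning-based purification step, in the spirit of fine-pruning defenses: hidden units that rarely activate on trusted clean data are often the ones carrying backdoor behaviour, so zeroing them out (and then fine-tuning on clean data) can disable a planted trigger. The function name, layer shapes, and pruning fraction below are illustrative assumptions, not our implemented method.

```python
import numpy as np

def prune_dormant_units(W1, b1, X_clean, frac=0.3):
    """Zero out the hidden units that are least active on clean inputs.

    W1, b1  : weights (d, h) and bias (h,) of a ReLU hidden layer
    X_clean : (n, d) trusted clean samples
    frac    : fraction of hidden units to prune
    Returns pruned copies (W1p, b1p); fine-tuning on clean data
    would typically follow this step.
    """
    act = np.maximum(0.0, X_clean @ W1 + b1)   # ReLU activations, shape (n, h)
    mean_act = act.mean(axis=0)                # average activation per unit
    k = int(frac * mean_act.size)
    dormant = np.argsort(mean_act)[:k]         # indices of least-active units
    W1p, b1p = W1.copy(), b1.copy()
    W1p[:, dormant] = 0.0                      # disconnect incoming weights
    b1p[dormant] = 0.0
    return W1p, b1p
```

The design choice here is to rank units by mean clean-data activation rather than weight magnitude: a backdoor unit can have large weights yet stay dormant on clean inputs, so activation-based ranking targets it more directly.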

  • How does the current goal factor into the long-term goal of the project?
    • Both malware augmentation techniques and model purification techniques serve the long-term goal of ensuring malware classifiers are robust against adversarial perturbation and capable of adapting to novel techniques that may appear over time as malware evolves. 

Accomplishments

  • Address whether project milestones were met. If milestones were not met, explain why, and what are the next steps. 
    • We are on track to meet the milestones set for year 1. We have developed a framework for extracting feature vectors from malware samples and are using them to generate plausible novel samples in feature space, which improve model performance on unseen, out-of-distribution samples. In addition, we are developing techniques to improve model resiliency against adversarial tampering, for both image-based and feature-based representations of malware samples.
  • What is the contribution to foundational cybersecurity research? Was there something discovered or confirmed?
    • The techniques developed during this year-1 effort improve neural binary and family-based malware classifiers across several different representations. This advances the state of the art in security by increasing the margin against attackers who tamper with training data or who attempt to better conceal malware.
  • Impact of research
    • Internal to the university (coursework/curriculum)
      • None to report since last quarter. 
    • External to the university (transition to industry/government (local/federal); patents, start-ups, software, etc.)
      • Software packages and data collected as part of our malware augmentation and model purification efforts.
    • Any acknowledgements, awards, or references in media?
      • None to report at this time. 


Publications and presentations

  • Add the publication reference in the publications section below. An author's copy or final version should be added in the report file(s) section. This is for NSA's review only.
    • There are no new publications to report since last quarter.
  • Optionally, upload technical presentation slides that may go into greater detail. For NSA's review only.