Related papers: BinPRE: Enhancing Field Inference in Binary Analys…
Recovering high-level type information in binaries is a key task in reverse engineering and binary analysis. Binaries contain very little explicit type information. The structure of binary code is incredibly flexible allowing for ad-hoc…
Retrieving binary code via natural language queries is a pivotal capability for downstream tasks in the software security domain, such as vulnerability detection and malware analysis. However, it is challenging to identify binary functions…
Protocol Reverse Engineering (PRE) is used to analyze protocols by inferring their structure and behavior. However, current PRE methods mainly focus on field identification within a single protocol and neglect Protocol State Machine (PSM)…
Binary reverse engineering is a challenging task because it often necessitates reasoning using both domain-specific knowledge (e.g., understanding entrypoint idioms common to an ABI) and logical inference (e.g., reconstructing…
Knowledge of the input format of binary executables is important for finding bugs and vulnerabilities, such as generating data for fuzzing or manual reverse engineering. This paper presents an algorithm to recover the structure and semantic…
Binary code similarity detection is a core task in reverse engineering. It supports malware analysis and vulnerability discovery by identifying semantically similar code in different contexts. Modern methods have progressed from manually…
Binary code clone analysis is an important technique which has a wide range of applications in software engineering (e.g., plagiarism detection, bug detection). The main challenge of the topic lies in the semantics-equivalent code…
Protocol reverse engineering based on traffic traces infers the behavior of unknown network protocols by analyzing observable network messages. To perform correct deduction of message semantics or behavior analysis, accurate message type…
Enforcing open source licenses such as the GNU General Public License (GPL), analyzing a binary for possible vulnerabilities, and code maintenance are all situations where it is useful to be able to determine the source code provenance of a…
Clone detection is widely exploited for software vulnerability search. The approaches based on source code analysis cannot be applied to binary clone detection because the same source code can produce significantly different binaries. In…
Inferring protocol formats is critical for many security applications. However, existing format-inference techniques often miss many formats, because almost all of them are in a fashion of dynamic analysis and rely on a limited number of…
Recent work in time series forecasting has explored reformulating regression as a classification task. By discretizing the continuous target space into bins and predicting over a fixed set of classes, these approaches benefit from more…
A wide range of binary analysis applications, such as bug discovery, malware analysis and code clone detection, require recovery of contextual meanings on a binary code. Recently, binary analysis techniques based on machine learning have…
Binary code search plays a crucial role in applications like software reuse detection. Currently, existing models are typically based on either internal code semantics or a combination of function call graphs (CG) and internal code…
Sequence models for binary analysis are bottlenecked by byte-level tokenization: raw bytes waste precious context window capacity for transformers and other neural network architectures, and many existing text-oriented tokenizers fail on…
Digital twinning offers a capability of effective real-time monitoring and control, which are vital for cost-intensive experimental facilities, particularly the ones where extreme conditions exist. Sparse experimental measurements collected…
Human-Oriented Binary Reverse Engineering (HOBRE) lies at the intersection of binary and source code, aiming to lift binary code to human-readable content relevant to source code, thereby bridging the binary-source semantic gap. Recent…
Binary Code Embedding (BCE) has important applications in various reverse engineering tasks such as binary code similarity detection, type recovery, control-flow recovery and data-flow analysis. Recent studies have shown that the…
This paper studies the problem of online parameter estimation for cyber-physical systems with binary outputs that may be subject to adversarial data tampering. Existing methods are primarily offline and unsuitable for real-time learning. To…
Binary Code Similarity Analysis (BCSA) has a wide spectrum of applications, including plagiarism detection, vulnerability discovery, and malware analysis, thus drawing significant attention from the security community. However, conventional…