Related papers: Seneca: Taint-Based Call Graph Construction for Ja…
Nowadays, an increasing number of applications uses deserialization. This technique, based on rebuilding the instance of objects from serialized byte streams, can be dangerous since it can open the application to attacks such as remote code…
Untrusted deserialization exploits, where a serialised object graph is used to achieve denial-of-service or arbitrary code execution, have become so prominent that they were introduced in the 2017 OWASP Top 10. In this paper, we present a…
Java (de)serialization is prone to causing security-critical vulnerabilities that attackers can invoke existing methods (gadgets) on the application's classpath to construct a gadget chain to perform malicious behaviors. Several techniques…
Speech tokenizers are a key building block of fully discrete Speech LLMs.Existing tokenizers either prioritize semantic encoding,fuse semantic content with acoustic style inseparably,or achieve incomplete semantic-acoustic…
Static analysis plays a key role in finding bugs, including security issues. A critical step in static analysis is building accurate call graphs that model function calls in a program. However, due to hard-to-analyze language features,…
In managed languages, serialization of objects is typically done in bespoke binary formats such as Protobuf, or markup languages such as XML or JSON. The major limitation of these formats is readability. Human developers cannot read binary…
The inherent determinism of blockchain technology poses a significant challenge to generating secure random numbers within smart contracts, leading to exploitable vulnerabilities, particularly in decentralized finance (DeFi) ecosystems and…
Transformers have demonstrated success in graph learning, particularly for node-level tasks. However, existing methods encounter an information bottleneck when generating graph-level representations. The prevalent single token paradigm…
Large-scale decentralized learning frameworks such as federated learning (FL), require both communication efficiency and strong data security, motivating the study of secure aggregation (SA). While information-theoretic SA is well…
Java deserialization vulnerability is a severe threat in practice. Researchers have proposed static analysis solutions to locate candidate vulnerabilities and fuzzing solutions to generate proof-of-concept (PoC) serialized objects to…
We introduce DeSCo, a scalable neural deep subgraph counting pipeline, designed to accurately predict both the count and occurrence position of queries on target graphs post single training. Firstly, DeSCo uses a novel canonical partition…
The success of large pretrained Transformers is closely tied to tokenizers, which convert raw input into discrete symbols. Extending these models to graph-structured data remains a significant challenge. In this work, we introduce a graph…
Security vulnerabilities are among the most critical software defects in existence. As such, they require patches that are correct and quickly deployed. This motivates an automatic patch generation method that emphasizes both soundness and…
Java deserialization gadget chains are a well-researched critical software weakness. The vast majority of known gadget chains rely on gadgets from software dependencies. Furthermore, it has been shown that small code changes in dependencies…
Knowledge graphs serve as critical resources supporting intelligent systems, but they can be noisy due to imperfect automatic generation processes. Existing approaches to noise detection often rely on external facts, logical rule…
Building sound and precise static call graphs for real-world JavaScript applications poses an enormous challenge, due to many hard-to-analyze language features. Further, the relative importance of these features may vary depending on the…
In the Data-Centric Artificial Intelligence (AI) paradigm, improving data quality is essential for robust machine learning. However, many denoising methods rely on rigid statistical assumptions or require clean reference data, which limits…
Static analysis has established itself as a weapon of choice for detecting security vulnerabilities. Taint analysis in particular is a very general and powerful technique, where security policies are expressed in terms of forbidden flows,…
Interoperability of potentially heterogeneous databases has been an ongoing research issue for a number of years in the database community. With the trend towards globalization of data location and data access and the consequent requirement…
Static analysis is sound in theory, but an implementation may unsoundly fail to analyze all of a program's code. Any such omission is a serious threat to the validity of the tool's output. Our work is the first to measure the prevalence of…