Related papers: Proximity Preserving Binary Code using Signed Grap…
Persistence diagrams have been widely used to quantify the underlying features of filtered topological spaces in data visualization. In many applications, computing distances between diagrams is essential; however, computing these distances…
In this paper, we address the problem of learning compact similarity-preserving embeddings for massive high-dimensional streams of data in order to perform efficient similarity search. We present a new online method for computing binary…
Binary code similarity detection is a core task in reverse engineering. It supports malware analysis and vulnerability discovery by identifying semantically similar code in different contexts. Modern methods have progressed from manually…
The problem of cross-platform binary code similarity detection aims at detecting whether two binary functions coming from different platforms are similar or not. It has many security applications, including plagiarism detection, malware…
In this paper we consider the binary similarity problem that consists in determining if two binary functions are similar only considering their compiled form. This problem is know to be crucial in several application scenarios, such as…
Embedding image features into a binary Hamming space can improve both the speed and accuracy of large-scale query-by-example image retrieval systems. Supervised hashing aims to map the original features to compact binary codes in a manner…
Binary embedding is a nonlinear dimension reduction methodology where high dimensional data are embedded into the Hamming cube while preserving the structure of the original space. Specifically, for an arbitrary $N$ distinct points in…
Graph-structured data is central to many scientific and industrial domains, where the goal is often to optimize objectives defined over graph structures. Given the combinatorial complexity of graph spaces, such optimization problems are…
Binary Code Similarity Detection (BCSD) is not only essential for security tasks such as vulnerability identification but also for code copying detection, yet it remains challenging due to binary stripping and diverse compilation…
The use of high-dimensional features has become a normal practice in many computer vision applications. The large dimension of these features is a limiting factor upon the number of data points which may be effectively stored and processed,…
Learning compact binary codes for image retrieval problem using deep neural networks has recently attracted increasing attention. However, training deep hashing networks is challenging due to the binary constraints on the hash codes. In…
In this paper, we study a generalization of the classical minimum cut prob- lem, called Connectivity Preserving Minimum Cut (CPMC) problem, which seeks a minimum cut to separate a pair (or pairs) of source and destination nodes and…
Deep hashing has shown to be a complexity-efficient solution for the Approximate Nearest Neighbor search problem in high dimensional space. Many methods usually build the loss function from pairwise or triplet data points to capture the…
The binary similarity problem consists in determining if two functions are similar by only considering their compiled form. Advanced techniques for binary similarity recently gained momentum as they can be applied in several fields, such as…
This paper proposes a binarization scheme for vectors of high dimension based on the recent concept of anti-sparse coding, and shows its excellent performance for approximate nearest neighbor search. Unlike other binarization schemes, this…
Network embedding has attracted an increasing attention over the past few years. As an effective approach to solve graph mining problems, network embedding aims to learn a low-dimensional feature vector representation for each node of a…
Can we use machine learning to compress graph data? The absence of ordering in graphs poses a significant challenge to conventional compression algorithms, limiting their attainable gains as well as their ability to discover relevant…
This paper addresses the problem of Approximate Nearest Neighbor (ANN) search in pattern recognition where feature vectors in a database are encoded as compact codes in order to speed-up the similarity search in large-scale databases.…
Recent advances in the field of network embedding have shown that low-dimensional network representation is playing a critical role in network analysis. Most existing network embedding methods encode the local proximity of a node, such as…
Binary code similarity detection is an important problem with applications in areas such as malware analysis, vulnerability research and license violation detection. This paper proposes a novel graph neural network architecture combined…