Related papers: Fingerprint databases for theorems
Recent advances in computing have changed not only the nature of mathematical computation, but mathematical proof and inquiry itself. While artificial intelligence and formalized mathematics have been the major topics of this conversation,…
Machine learning solutions are very popular in the field of chemoinformatics, where they have numerous applications, such as novel drug discovery or molecular property prediction. Molecular fingerprints are algorithms commonly used for…
An introduction to the On-Line Encyclopedia of Integer Sequences (or OEIS, https://oeis.org) for graduate students in mathematics
Fingerprint verification and identification algorithms based on minutiae features are used in many biometric systems today (e.g., governmental e-ID programs, border control, AFIS, personal authentication for portable devices). Researchers…
Recent research found that cloud data warehouses are text-heavy. However, their capabilities for efficiently processing string columns remain limited, relying primarily on techniques like dictionary encoding and prefix-based partition…
Extracting minutiae from fingerprint images is one of the most important steps in an automatic fingerprint identification system. Because minutiae matching is certainly the most well-known and widely used method for fingerprint matching,…
Thousands of vulnerabilities are reported on a monthly basis to security repositories, such as the National Vulnerability Database. Among these vulnerabilities, software misconfiguration is one of the top 10 security risks for web…
This paper presents an effective fingerprint classification method based on a hierarchical agglomerative clustering technique. The performance of the technique was evaluated on several real-life datasets and a significant…
In this paper we propose a novel fingerprint indexing approach for speeding up the fingerprint recognition system. Which features are used for indexing and how the extracted features are employed for searching are crucial for the…
The selection of algorithms is a crucial step in designing AI services for real-world time series classification use cases. Traditional methods such as neural architecture search, automated machine learning, combined algorithm selection,…
In this work, we present scikit-fingerprints, a Python package for computation of molecular fingerprints for applications in chemoinformatics. Our library offers an industry-standard scikit-learn interface, allowing intuitive usage and easy…
We explore the possibility of using machine learning to identify interesting mathematical structures by using certain quantities that serve as fingerprints. In particular, we extract features from integer sequences using two empirical laws:…
Understanding and creating mathematics using natural mathematical language - the mixture of symbolic and natural language used by humans - is a challenging and important problem for driving progress in machine learning. As a step in this…
When sharing sensitive relational databases with other parties, a database owner aims to (i) have privacy guarantees for the database entries, (ii) have liability guarantees (via fingerprinting) in case of unauthorized sharing of its…
Obtaining a relevant dataset is central to conducting empirical studies in software engineering. However, in the context of mining software repositories, the lack of appropriate tooling for large scale mining tasks hinders the creation of…
We investigate fingerprints in pretraining datasets for large language models (LLMs) through dataset classification experiments. Building on prior work demonstrating the existence of fingerprints or biases in popular computer vision…
Structural identifiability is an important property of parametric ODE models. When conducting an experiment and inferring the parameter value from the time-series data, we want to know if the value is globally, locally, or non-identifiable.…
The behavior of LLMs does not depend solely on the model itself. Components of the inference system, such as the inference engine, attention backend, and hardware platform, subtly influence how inputs are processed. These components differ…
Fingerprint classification is one of the most common approaches to accelerate the identification in large databases of fingerprints. Fingerprints are grouped into disjoint classes, so that an input fingerprint is compared only with those…
Increasing amounts of available data have led to a heightened need for representing large-scale probabilistic knowledge bases. One approach is to use a probabilistic database, a model with strong assumptions that allow for efficiently…