Related papers: Mini-Batch Consistent Slot Set Encoder for Scalabl…
Recent work on mini-batch consistency (MBC) for set functions has brought attention to the need for sequentially processing and aggregating chunks of a partitioned set while guaranteeing the same output for all partitions. However, existing…
Block encoding of sparse matrices underpins powerful quantum algorithms such as quantum singular value transformation, Hamiltonian simulation, and quantum linear solvers, yet its efficient gate-level realization for general sparse matrices…
Model Predictive Control (MPC) is a popular control approach due to its ability to consider constraints, including input and state restrictions, while minimizing a cost function. However, in practice, these constraints can result in…
Mini-batch training is a cornerstone of modern deep learning, offering computational efficiency and scalability for training complex architectures. However, existing deep subspace clustering (DSC) methods, which typically combine an…
In online clustering problems, there is often a large amount of uncertainty over possible cluster assignments that cannot be resolved until more data are observed. This difficulty is compounded when clusters follow complex distributions, as…
We present a compact encoder for image categorization that emphasizes computation economy through content-conditioned multi-pass processing. The model employs a single lightweight core block that can be re-applied a small number of times,…
Subword tokenization is a common method for vocabulary building in Neural Machine Translation (NMT) models. However, increasingly complex tasks have revealed its disadvantages. First, a vocabulary cannot be modified once it is learned,…
Mini-batch algorithms have become increasingly popular due to the requirement for solving optimization problems based on large-scale data sets. Using an existing online expectation-maximization (EM) algorithm framework, we demonstrate…
Temporal set prediction involves forecasting the elements that will appear in the next set, given a sequence of prior sets, each containing a variable number of elements. Existing methods often rely on intricate architectures with…
Consensus clustering has been widely used in bioinformatics and other applications to improve the accuracy, stability and reliability of clustering results. This approach ensembles cluster co-occurrences from multiple clustering runs on…
Semiconstrained systems were recently suggested as a generalization of constrained systems, commonly used in communication and data-storage applications that require certain offending subsequences be avoided. In an attempt to apply…
Along with the progress of AI democratization, machine learning (ML) has been successfully applied to edge applications, such as smart phones and automated driving. Nowadays, more applications require ML on tiny devices with extremely…
Conformal prediction provides a principled framework for constructing predictive sets with finite-sample validity. While much of the focus has been on univariate response variables, existing multivariate methods either impose rigid…
We present an optimal method for encoding cluster assignments of arbitrary data sets. Our method, Random Cycle Coding (RCC), encodes data sequentially and sends assignment information as cycles of the permutation defined by the order of…
Stochastic gradient Markov Chain Monte Carlo (SG-MCMC) has been developed as a flexible family of scalable Bayesian sampling algorithms. However, there has been little theoretical analysis of the impact of minibatch size on the algorithm's…
Microbiome sample representation to input into LLMs is essential for downstream tasks such as phenotype prediction and environmental classification. While prior studies have explored embedding-based representations of each microbiome…
Sparse coding aims to model data vectors as sparse linear combinations of basis elements, but a majority of related studies are restricted to continuous data without spatial or temporal structure. A new model-based sparse coding (MSC)…
This paper focuses on controlling the absorbing set spectrum for a class of regular LDPC codes known as separable, circulant-based (SCB) codes. For a specified circulant matrix, SCB codes all share a common mother matrix, examples of which…
Batch codes, introduced by Ishai, Kushilevitz, Ostrovsky and Sahai in [1], are methods for solving the following data storage problem: n data items are to be stored in m servers in such a way that any k of the n items can be retrieved by…
To deal with very large datasets, a mini-batch version of the Monte Carlo Markov Chain Stochastic Approximation Expectation-Maximization algorithm for general latent variable models is proposed. For exponential models the algorithm is shown…