Related papers: oASIS: Adaptive Column Sampling for Kernel Matrix …

200 papers

We propose a technique called Optimal Analysis-Specific Importance Sampling (OASIS) to reduce the number of simulated events required for a high-energy experimental analysis to reach a target sensitivity. We provide recipes to obtain the…

High Energy Physics - Phenomenology · Physics 2021-02-17 Konstantin T. Matchev, Prasanth Shyamsundar

Selecting a good column (or row) subset of massive data matrices has found many applications in data analysis and machine learning. We propose a new adaptive sampling algorithm that can be used to improve any relative-error column selection…

Data Structures and Algorithms · Computer Science 2015-10-15 Saurabh Paul, Malik Magdon-Ismail, Petros Drineas

In continual instruction tuning (CIT) scenarios, where new instruction tuning data continuously arrive in an online streaming manner, training delays from large-scale data significantly hinder real-time adaptation. Data selection can…

Computer Vision and Pattern Recognition · Computer Science 2025-10-10 Minjae Lee, Minhyuk Seo, Tingyu Qu, Tinne Tuytelaars, Jonghyun Choi

We consider the related tasks of matrix completion and matrix approximation from missing data and propose adaptive sampling procedures for both problems. We show that adaptive sampling allows one to eliminate standard incoherence…

Machine Learning · Statistics 2014-07-15 Akshay Krishnamurthy, Aarti Singh

We study the problem of inferring a sparse vector from random linear combinations of its components. We propose the Accelerated Orthogonal Least-Squares (AOLS) algorithm that improves performance of the well-known Orthogonal Least-Squares…

Machine Learning · Statistics 2018-04-17 Abolfazl Hashemi, Haris Vikalo

Annealed Sequential Monte Carlo (ASMC) samplers are special cases of SMC samplers where the sequence of distributions can be embedded in a smooth path of distributions. Using this underlying path and a performance model based on the…

Computation · Statistics 2025-12-03 Saifuddin Syed, Alexandre Bouchard-Côté, Kevin Chern, Arnaud Doucet

Most kernel-based methods, such as kernel or Gaussian process regression, kernel PCA, ICA, or $k$-means clustering, do not scale to large datasets, because constructing and storing the kernel matrix $\mathbf{K}_n$ requires at least…

Machine Learning · Statistics 2018-03-28 Daniele Calandriello, Alessandro Lazaric, Michal Valko

We study the problem of exact completion for $m \times n$ sized matrix of rank $r$ with the adaptive sampling method. We introduce a relation of the exact completion problem with the sparsest vector of column and row spaces (which we call…

Machine Learning · Computer Science 2022-03-08 Ilqar Ramazanli, Barnabas Poczos

Importance sampling (IS) is a powerful Monte Carlo methodology for the approximation of intractable integrals, very often involving a target probability density function. The performance of IS heavily depends on the appropriate selection of…

Computation · Statistics 2023-06-22 Víctor Elvira, Emilie Chouzenoux, Ömer Deniz Akyildiz, Luca Martino

We study the problem of recovering an incomplete $m\times n$ matrix of rank $r$ with columns arriving online over time. This is known as the problem of life-long matrix completion, and is widely applied to recommendation system, computer…

Machine Learning · Computer Science 2016-12-04 Maria-Florina Balcan, Hongyang Zhang

Molecular dynamics (MD) simulations are useful in obtaining thermodynamic and kinetic properties of bio-molecules but are limited by the timescale barrier, i.e., we may be unable to efficiently obtain properties because we need to run…

Chemical Physics · Physics 2017-08-23 Surl-Hee Ahn, Jay W. Grate, Eric F. Darve

While high-capacity AI models have advanced state-of-the-art performance, their practical deployment is often hindered by high inference costs, environmental impact, and a "one-size-fits-all" approach that ignores varying sample complexity.…

Computer Vision and Pattern Recognition · Computer Science 2026-05-19 Turkoglu Mikael, Bary Tim, Thielens Vincent, Dausort Manon, Macq Benoît

Quantum machine learning with quantum kernels for classification problems is a growing area of research. Recently, quantum kernel alignment techniques that parameterise the kernel have been developed, allowing the kernel to be trained and…

The sequential importance sampling (SIS) algorithm has gained considerable popularity for its empirical success. One of its noted applications is to the binary contingency tables problem, an important problem in statistics, where the goal…

Statistics Theory · Mathematics 2011-06-29 Ivona Bezakova, Alistair Sinclair, Daniel Stefankovic, Eric Vigoda

Most current sampling algorithms for high-dimensional distributions are based on MCMC techniques and are approximate in the sense that they are valid only asymptotically. Rejection sampling, on the other hand, produces valid samples, but is…

Artificial Intelligence · Computer Science 2012-07-04 Marc Dymetman, Guillaume Bouchard, Simon Carter

In this work, we introduce a novel method for solving the set inversion problem by formulating it as a binary classification problem. Aiming to develop a fast algorithm that can work effectively with high-dimensional and computationally…

Machine Learning · Computer Science 2021-06-01 Binh T. Nguyen, Duy M. Nguyen, Lam Si Tung Ho, Vu Dinh

Medical image segmentation is a fundamental task in medical image analysis. Despite that deep convolutional neural networks have gained stellar performance in this challenging task, they typically rely on large labeled datasets, which have…

Computer Vision and Pattern Recognition · Computer Science 2019-12-06 Qikui Zhu, Bo Du, Pingkun Yan

The dramatic growth of big datasets presents a new challenge to data storage and analysis. Data reduction, or subsampling, that extracts useful information from datasets is a crucial step in big data analysis. We propose an orthogonal…

Methodology · Statistics 2021-06-01 Lin Wang, Jake Elmstedt, Weng Kee Wong, Hongquan Xu

Entity resolution (ER) presents unique challenges for evaluation methodology. While crowdsourcing platforms acquire ground truth, sound approaches to sampling must drive labelling efforts. In ER, extreme class imbalance between matching and…

Machine Learning · Computer Science 2017-06-27 Neil G. Marchant, Benjamin I. P. Rubinstein

The proliferation of spectroscopic data across various scientific and engineering fields necessitates automated processing. We introduce OASIS (Omni-purpose Analysis of Spectra via Intelligent Systems), a machine learning (ML) framework for…

Machine Learning · Computer Science 2025-09-16 Chris Young, Juejing Liu, Marie L. Mortensen, Yifu Feng, Elizabeth Li, Zheming Wang, Xiaofeng Guo, Kevin M. Rosso, Xin Zhang