Related papers: Multi-Condition Conformal Selection
Selecting high-quality candidates from large datasets is critical in applications such as drug discovery, precision medicine, and alignment of large language models (LLMs). While Conformal Selection (CS) provides rigorous uncertainty…
Selecting a subset of promising candidates from a large pool is crucial across various scientific and real-world applications. Conformal selection offers a distribution-free and model-agnostic framework for candidate selection with…
When selecting from a list of potential candidates, it is important to ensure not only that the selected items are of high quality, but also that they are sufficiently dissimilar so as to both avoid redundancy and to capture a broader range…
Conformal selection (CS) uses calibration data to identify test inputs whose unobserved outcomes are likely to satisfy a pre-specified minimal quality requirement, while controlling the false discovery rate (FDR). Existing methods fix the…
Model selection/optimization in conformal inference is challenging, since it may break the exchangeability between labeled and unlabeled data. We study this problem in the context of conformal selection, which uses conformal p-values to…
A supervised feature selection method selects an appropriate but concise set of features to differentiate classes, which is highly expensive for large-scale datasets. Therefore, feature selection should aim at both minimizing the number of…
Determining optimal well placements and controls are two important tasks in oil field development. These problems are computationally expensive, nonconvex, and contain multiple optima. The practical solution of these problems require…
Multivariate conformal prediction requires nonconformity scores that compress residual vectors into scalars while preserving certain implicit geometric structure of the residual distribution. We introduce a Multivariate Kernel Score (MKS)…
This paper proposes a novel modulation and coding scheme (MCS) selection framework that integrates mutual information (MI) prediction based on vector similarity search (VSS) for massive multi-user multiple-input multiple-output orthogonal…
We introduce our method, conformal highest conditional density sets (CHCDS), that forms conformal prediction sets using existing estimated conditional highest density predictive regions. We prove the validity of the method, and that…
We introduce Multistage Conditional Compositional Optimization (MCCO) as a new paradigm for decision-making under uncertainty that combines aspects of multistage stochastic programming and conditional stochastic optimization. MCCO minimizes…
In many practices, scientists are particularly interested in detecting which of the predictors are truly associated with a multivariate response. It is more accurate to model multiple responses as one vector rather than separating each…
Multimodal AI systems are evaluated by downstream task accuracy, but high accuracy does not mean the underlying data is coherent. A model can score well on Visual Question Answering (VQA) while its inputs contradict each other. We introduce…
Knowing the features of a complex system that are highly relevant to a particular target variable is of fundamental interest in many areas of science. Existing approaches are often limited to linear settings, sometimes lack guarantees, and…
A fundamental challenge in approximating an unknown density using finite Gaussian mixture models is selecting the number of mixture components, also known as order. Traditional approaches choose a single best model using information…
Many optimization problems in science and engineering are highly nonlinear, and thus require sophisticated optimization techniques to solve. Traditional techniques such as gradient-based algorithms are mostly local search methods, and often…
We study the computational complexity of Markov chain Monte Carlo (MCMC) methods for high-dimensional Bayesian linear regression under sparsity constraints. We first show that a Bayesian approach can achieve variable-selection consistency…
Heterogeneous nonmonotonic multi-context systems (MCS) permit different logics to be used in different contexts, and link them via bridge rules. We investigate the role of symmetry detection and symmetry breaking in such systems to…
Recently, Large Language Models (LLMs) have demonstrated remarkable advancements in Natural Language Processing (NLP). However, generating high-quality text that balances coherence, diversity, and relevance remains challenging. Traditional…
The unsupervised and principled diagnosis of multi-scale data is a fundamental obstacle in modern scientific problems from, for instance, weather and climate prediction, neurology, epidemiology, and turbulence. Multi-scale data is…