Related papers: Sampling Techniques in Bayesian Target Encoding

Encoding Categorical Variables with Conjugate Bayesian Models for WeWork Lead Scoring Engine

Applied Data Scientists throughout various industries are commonly faced with the challenging task of encoding high-cardinality categorical features into digestible inputs for machine learning algorithms. This paper describes a Bayesian…

Machine Learning · Computer Science 2019-05-01 Austin Slakey, Daniel Salas, Yoni Schamroth

Regularized target encoding outperforms traditional methods in supervised machine learning with high cardinality features

Since most machine learning (ML) algorithms are designed for numerical inputs, efficiently encoding categorical variables is a crucial aspect in data analysis. A common problem are high cardinality features, i.e. unordered categorical…

Machine Learning · Statistics 2022-03-07 Florian Pargent, Florian Pfisterer, Janek Thomas, Bernd Bischl

Distributional encoding for Gaussian process regression with qualitative inputs

Gaussian Process (GP) regression is a popular and sample-efficient approach for many engineering applications, where observations are expensive to acquire, and is also a central ingredient of Bayesian optimization (BO), a highly prevailing…

Machine Learning · Statistics 2025-06-06 Sébastien Da Veiga

Target encoding is an effective technique to deliver better performance for conventional machine learning methods, and recently, for deep neural networks as well. However, the existing target encoding approaches require significant increase…

Machine Learning · Computer Science 2019-10-22 Mayoore S. Jaiswal, Bumsoo Kang, Jinho Lee, Minsik Cho

BiSET: Bi-directional Selective Encoding with Template for Abstractive Summarization

The success of neural summarization models stems from the meticulous encodings of source articles. To overcome the impediments of limited and sometimes noisy training data, one promising direction is to make better use of the available…

Computation and Language · Computer Science 2019-06-13 Kai Wang, Xiaojun Quan, Rui Wang

An Application of Bayesian classification to Interval Encoded Temporal mining with prioritized items

In real life, media information has time attributes either implicitly or explicitly known as temporal data. This paper investigates the usefulness of applying Bayesian classification to an interval encoded temporal database with prioritized…

Databases · Computer Science 2009-08-10 C. Balasubramanian, K. Duraiswamy

Dealing with Categorical and Integer-valued Variables in Bayesian Optimization with Gaussian Processes

Bayesian Optimization (BO) methods are useful for optimizing functions that are expensive to evaluate, lack an analytical expression and whose evaluations can be contaminated by noise. These methods rely on a probabilistic model of the…

Machine Learning · Statistics 2020-02-04 Eduardo C. Garrido-Merchán, Daniel Hernández-Lobato

A Tutorial on Learning With Bayesian Networks

A Bayesian network is a graphical model that encodes probabilistic relationships among variables of interest. When used in conjunction with statistical techniques, the graphical model has several advantages for data analysis. One, because…

Machine Learning · Computer Science 2022-01-11 David Heckerman

Progressive Sampling-Based Bayesian Optimization for Efficient and Automatic Machine Learning Model Selection

Purpose: Machine learning is broadly used for clinical data analysis. Before training a model, a machine learning algorithm must be selected. Also, the values of one or more model parameters termed hyper-parameters must be set. Selecting…

Machine Learning · Computer Science 2018-12-10 Xueqiang Zeng, Gang Luo

Bayesian Optimisation for Machine Translation

This paper presents novel Bayesian optimisation algorithms for minimum error rate training of statistical machine translation systems. We explore two classes of algorithms for efficiently exploring the translation space, with the first…

Computation and Language · Computer Science 2014-12-24 Yishu Miao, Ziyu Wang, Phil Blunsom

Machine Learning and the Future of Bayesian Computation

Bayesian models are a powerful tool for studying complex data, allowing the analyst to encode rich hierarchical dependencies and leverage prior information. Most importantly, they facilitate a complete characterization of uncertainty…

Machine Learning · Statistics 2023-04-25 Steven Winter, Trevor Campbell, Lizhen Lin, Sanvesh Srivastava, David B. Dunson

A Less Uncertain Sampling-Based Method of Batch Bayesian Optimization

This paper presents a method called sampling-computation-optimization (SCO) to design batch Bayesian optimization. SCO does not construct new high-dimensional acquisition functions but samples from the existing one-site acquisition function…

Optimization and Control · Mathematics 2022-02-22 Kai Jia, Xiaojun Duan, Zhengming Wang, Liang Yan

Bayesian Estimation and Regularization Techniques in Categorical Data Analysis

This paper explores Bayesian estimation for categorical data, focusing on simple yet effective models that provide a foundation for applying more advanced methods accurately and reliably in real-world applications. We begin by revisiting…

Methodology · Statistics 2025-09-03 Jan Kalina

Bayesian Bi-clustering Methods with Applications in Computational Biology

Bi-clustering is a useful approach in analyzing biological data when observations come from heterogeneous groups and have a large number of features. We outline a general Bayesian approach in tackling bi-clustering problems in moderate to…

Applications · Statistics 2021-02-11 Han Yan, Jiexing Wu, Yang Li, Jun S. Liu

Optimal Encoding and Decoding for Point Process Observations: an Approximate Closed-Form Filter

The process of dynamic state estimation (filtering) based on point process observations is in general intractable. Numerical sampling techniques are often practically useful, but lead to limited conceptual insight about optimal…

Machine Learning · Statistics 2016-09-13 Yuval Harel, Ron Meir, Manfred Opper

Bayesian Computing in the Undergraduate Statistics Curriculum

Bayesian statistics has gained great momentum since the computational developments of the 1990s. Gradually, advances in Bayesian methodology and software have made Bayesian techniques much more accessible to applied statisticians and, in…

Computation · Statistics 2020-11-04 Jim Albert, Jingchen Hu

Model-based Sparse Coding beyond Gaussian Independent Model

Sparse coding aims to model data vectors as sparse linear combinations of basis elements, but a majority of related studies are restricted to continuous data without spatial or temporal structure. A new model-based sparse coding (MSC)…

Methodology · Statistics 2021-08-24 Xin Xing, Rui Xie, Wenxuan Zhong

A layered multiple importance sampling scheme for focused optimal Bayesian experimental design

We develop a new computational approach for "focused" optimal Bayesian experimental design with nonlinear models, with the goal of maximizing expected information gain in targeted subsets of model parameters. Our approach considers…

Computation · Statistics 2019-03-28 Chi Feng, Youssef M. Marzouk

Bayesian sampling using interacting particles

Bayesian sampling is an important task in statistics and machine learning. Over the past decade, many ensemble-type sampling methods have been proposed. In contrast to the classical Markov chain Monte Carlo methods, these new methods deploy…

Numerical Analysis · Mathematics 2024-05-14 Shi Chen, Zhiyan Ding, Qin Li

Language Model Embeddings Can Be Sufficient for Bayesian Optimization

Bayesian Optimization is ubiquitous in experimental design and black-box optimization for improving search efficiency. However, most existing approaches rely on regression models which are limited to fixed search spaces and structured,…

Machine Learning · Computer Science 2025-10-10 Tung Nguyen, Qiuyi Zhang, Bangding Yang, Chansoo Lee, Jorg Bornschein, Yingjie Miao, Sagi Perel, Yutian Chen, Xingyou Song