Related papers: Confidence sets for split points in decision trees

200 papers

By revisiting the end-cut preference (ECP) phenomenon associated with a single CART (Breiman et al. (1984)), we introduce MinimaxSplit decision trees, a robust alternative to CART that selects splits by minimizing the worst-case child risk…

Statistics Theory · Mathematics 2026-04-16 Zhenyuan Zhang, Hengrui Luo

Decision trees with binary splits are popularly constructed using Classification and Regression Trees (CART) methodology. For binary classification and regression models, this approach recursively divides the data into two near-homogenous…

Machine Learning · Statistics 2020-08-17 Jason M. Klusowski

Decision trees with binary splits are popularly constructed using Classification and Regression Trees (CART) methodology. For regression models, this approach recursively divides the data into two near-homogenous daughter nodes according to…

Machine Learning · Statistics 2020-11-20 Jason M. Klusowski

We study the problem of identifying the source of a diffusion spreading over a regular tree. When the degree of each node is at least three, we show that it is possible to construct confidence sets for the diffusion source with size…

Statistics Theory · Mathematics 2018-08-08 Justin Khim, Po-Ling Loh

Filamentary structures, also called ridges, generalize the concept of modes of density functions and provide low-dimensional representations of point clouds. Using kernel type plug-in estimators, we give asymptotic confidence regions for…

Statistics Theory · Mathematics 2024-05-02 Wanli Qiao

Connected acyclic graphs (trees) are data objects that hierarchically organize categories. Collections of trees arise in a diverse variety of fields, including evolutionary biology, public health, machine learning, social sciences and…

Methodology · Statistics 2025-12-01 Maria Alejandra Valdez Cabrera, Amy D Willis, Armeen Taeb

Recursive decision trees are widely used to estimate heterogeneous causal treatment effects in experimental and observational studies. These methods are typically implemented using CART-type recursive partitioning and are often viewed as…

Statistics Theory · Mathematics 2026-03-19 Matias D. Cattaneo, Jason M. Klusowski, Ruiqi Rae Yu

Decision trees partition the feature space using hard binary thresholds, assigning identical confidence to instances far from a decision boundary and to those directly on it. We introduce ternary decision trees, which augment each split…

Machine Learning · Computer Science 2026-05-22 William Smits

Decision trees are powerful machine learning algorithms, widely used in fields such as economics and medicine for their simplicity and interpretability. However, decision trees such as CART are prone to overfitting, especially when grown…

Machine Learning · Statistics 2026-01-13 Likun Zhang, Wei Ma

It is widely believed that the prediction accuracy of decision tree models is invariant under any strictly monotone transformation of the individual predictor variables. However, this statement may be false when predicting new observations…

Machine Learning · Statistics 2016-11-16 Tal Galili, Isaac Meilijson

Decision trees are important both as interpretable models amenable to high-stakes decision-making, and as building blocks of ensemble methods such as random forests and gradient boosting. Their statistical properties, however, are not well…

Machine Learning · Statistics 2021-10-20 Yan Shuo Tan, Abhineet Agarwal, Bin Yu

When using the Focused Information Criterion (FIC) for assessing and ranking candidate models with respect to how well they do for a given estimation task, it is customary to produce a so-called FIC plot. This plot has the different point…

Methodology · Statistics 2026-02-18 Céline Cunen, Nils Lid Hjort

Random survival forest and survival trees are popular models in statistics and machine learning. However, there is a lack of general understanding regarding consistency, splitting rules and influence of the censoring mechanism. In this…

Statistics Theory · Mathematics 2019-02-05 Yifan Cui, Ruoqing Zhu, Mai Zhou, Michael Kosorok

Decision tree classifiers are a widely used tool in data stream mining. The use of confidence intervals to estimate the gain associated with each split leads to very effective methods, like the popular Hoeffding tree algorithm. From a…

Machine Learning · Statistics 2016-04-13 Rocco De Rosa

In an era where artificial intelligence and machine learning algorithms increasingly impact human life, it is crucial to develop models that account for potential discrimination in their predictions. This paper tackles this problem by…

Machine Learning · Statistics 2024-10-10 Anna Gottard, Vanessa Verrina, Sabrina Giordano

Ensembles of decision trees are a useful tool for obtaining flexible estimates of regression functions. Examples of these methods include gradient boosted decision trees, random forests, and Bayesian CART. Two potential…

Methodology · Statistics 2018-09-18 Antonio Ricardo Linero, Yun Yang

Outcomes of data-driven AI models cannot be assumed to be always correct. To estimate the uncertainty in these outcomes, the uncertainty wrapper framework has been proposed, which considers uncertainties related to model fit, input quality,…

Machine Learning · Computer Science 2022-01-11 Pascal Gerber, Lisa Jöckel, Michael Kläs

The end-cut preference (ECP) problem, referring to the tendency to favor split points near the boundaries of a feature's range, is a well-known issue in CART (Breiman et al., 1984). ECP may induce highly imbalanced and biased splits,…

Machine Learning · Statistics 2025-09-24 Xiaogang Su

While clustering is ubiquitously used across science and industry, uncertainty in cluster assignments is rarely quantified with rigorous guarantees. We propose a novel conformal inference framework for clustering that returns confidence…

Methodology · Statistics 2026-04-13 YoonHaeng Hur, Anirban Nath, Genevera Allen

We propose confidence regions with asymptotically correct uniform coverage probability of parameters whose Fisher information matrix can be singular at important points of the parameter set. Our work is motivated by the need for reliable…

Statistics Theory · Mathematics 2022-09-13 Karl Oskar Ekvall, Matteo Bottai