Related papers: Regression Phalanxes

Ensembling classification models based on phalanxes of variables with applications in drug discovery

Statistical detection of a rare class of objects in a two-class classification problem can pose several challenges. Because the class of interest is rare in the training data, there is relatively little information in the known class…

Machine Learning · Statistics 2015-05-18 Jabed H. Tomal, William J. Welch, Ruben H. Zamar

Identifying a minimal class of models for high-dimensional data

Model selection consistency in the high-dimensional regression setting can be achieved only if strong assumptions are fulfilled. We therefore suggest to pursue a different goal, which we call a minimal class of models. The minimal class of…

Methodology · Statistics 2015-11-26 Daniel Nevo, Ya'acov Ritov

Building Bridges between Regression, Clustering, and Classification

Regression, the task of predicting a continuous scalar target y based on some features x is one of the most fundamental tasks in machine learning and statistics. It has been observed and theoretically analyzed that the classical approach,…

Machine Learning · Statistics 2025-02-19 Lawrence Stewart, Francis Bach, Quentin Berthet

Distributional Adaptive Soft Regression Trees

Random forests are an ensemble method relevant for many problems, such as regression or classification. They are popular due to their good predictive performance (compared to, e.g., decision trees) requiring only minimal tuning of…

Methodology · Statistics 2022-10-20 Nikolaus Umlauf, Nadja Klein

Latent class analysis by regularized spectral clustering

The latent class model is a powerful tool for identifying latent classes within populations that share common characteristics for categorical data in social, psychological, and behavioral sciences. In this article, we propose two new…

Machine Learning · Computer Science 2023-10-31 Huan Qing

Spectral Clustering with Likelihood Refinement for High-dimensional Latent Class Recovery

Latent class models are widely used for identifying unobserved subgroups from multivariate categorical data in social sciences, with binary data as a particularly popular example. However, accurately recovering individual latent class…

Methodology · Statistics 2026-02-25 Zhongyuan Lyu, Yuqi Gu

Improved variable selection with Forward-Lasso adaptive shrinkage

Recently, considerable interest has focused on variable selection methods in regression situations where the number of predictors, $p$, is large relative to the number of observations, $n$. Two commonly applied variable selection approaches…

Applications · Statistics 2011-04-19 Peter Radchenko, Gareth M. James

Polynomial Regression as a Task for Understanding In-context Learning Through Finetuning and Alignment

Simple function classes have emerged as toy problems to better understand in-context-learning in transformer-based architectures used for large language models. But previously proposed simple function classes like linear regression or…

Machine Learning · Computer Science 2024-07-30 Max Wilcoxson, Morten Svendgård, Ria Doshi, Dylan Davis, Reya Vir, Anant Sahai

Collapsing Categories for Regression with Mixed Predictors

Categorical predictors are omnipresent in everyday regression practice: in fact, most regression data involve some categorical predictors, and this tendency is increasing in modern applications with more complex structures and larger data…

Methodology · Statistics 2025-11-11 Chaegeun Song, Zhong Zheng, Bing Li, Lingzhou Xue

Fast local linear regression with anchor regularization

Regression is an important task in machine learning and data mining. It has several applications in various domains, including finance, biomedical, and computer vision. Recently, network Lasso, which estimates local models by making…

Machine Learning · Computer Science 2020-03-13 Mathis Petrovich, Makoto Yamada

Modeling panels of extremes

Extreme value applications commonly employ regression techniques to capture cross-sectional heterogeneity or time-variation in the data. Estimation of the parameters of an extreme value regression model is notoriously challenging due to the…

Methodology · Statistics 2022-05-12 Debbie J. Dupuis, Sebastian Engelke, Luca Trapin

Least Angle Regression

The purpose of model selection algorithms such as All Subsets, Forward Selection and Backward Elimination is to choose a linear model on the basis of the same set of data to which the model will be applied. Typically we have available a…

Statistics Theory · Mathematics 2007-06-13 Bradley Efron, Trevor Hastie, Iain Johnstone, Robert Tibshirani

Tensor Regression

Regression analysis is a key area of interest in the field of data analysis and machine learning which is devoted to exploring the dependencies between variables, often using vectors. The emergence of high dimensional data in technologies…

Machine Learning · Statistics 2023-08-23 Jiani Liu, Ce Zhu, Zhen Long, Yipeng Liu

Robust Functional Regression with Discretely Sampled Predictors

The functional linear model is an important extension of the classical regression model allowing for scalar responses to be modeled as functions of stochastic processes. Yet, despite the usefulness and popularity of the functional linear…

Methodology · Statistics 2025-11-27 Ioannis Kalogridis, Stanislav Nagy

A Survey Of Regression Algorithms And Connections With Deep Learning

Regression has attracted immense interest lately due to its effectiveness in tasks like predicting values. And Regression is of widespread use in multiple fields such as Economics, Finance, Business, Biology and so on. While considerable…

Machine Learning · Computer Science 2021-04-27 Yunpeng Tai

Group Lasso merger for sparse prediction with high-dimensional categorical data

Sparse prediction with categorical data is challenging even for a moderate number of variables, because one parameter is roughly needed to encode one category or level. The Group Lasso is a well known efficient algorithm for selection…

Methodology · Statistics 2021-12-22 Szymon Nowakowski, Piotr Pokarowski, Wojciech Rejchel

Targeting predictors in random forest regression

Random forest regression (RF) is an extremely popular tool for the analysis of high-dimensional data. Nonetheless, its benefits may be lessened in sparse settings due to weak predictors, and a pre-estimation dimension reduction (targeting)…

Econometrics · Economics 2020-11-09 Daniel Borup, Bent Jesper Christensen, Nicolaj Nørgaard Mühlbach, Mikkel Slot Nielsen

A Powerful Random Forest Featuring Linear Extensions (RaFFLE)

Random forests are widely used in regression. However, the decision trees used as base learners are poor approximators of linear relationships. To address this limitation we propose RaFFLE (Random Forest Featuring Linear Extensions), a…

Machine Learning · Computer Science 2025-02-17 Jakob Raymaekers, Peter J. Rousseeuw, Thomas Servotte, Tim Verdonck, Ruicong Yao

Systematic Ensemble Learning for Regression

The motivation of this work is to improve the performance of standard stacking approaches or ensembles, which are composed of simple, heterogeneous base models, through the integration of the generation and selection stages for regression…

Machine Learning · Statistics 2014-03-31 Roberto Aldave, Jean-Pierre Dussault

Prototypal Analysis and Prototypal Regression

Prototypal analysis is introduced to overcome two shortcomings of archetypal analysis: its sensitivity to outliers and its non-locality, which reduces its applicability as a learning tool. Same as archetypal analysis, prototypal analysis…

Machine Learning · Statistics 2017-08-24 Chenyue Wu, Esteban G. Tabak