Related papers: Scalable Algorithms for Learning High-Dimensional …

Machine Learning-Assisted High-Dimensional Matrix Estimation

Efficient estimation of high-dimensional matrices, including covariance and precision matrices, is a cornerstone of modern multivariate statistics. Most existing studies have focused primarily on the theoretical properties of the estimators…

Machine Learning · Computer Science 2026-03-31 Wan Tian, Hui Yang, Zhouhui Lian, Lingyue Zhang, Yijie Peng

Estimation, Confidence Intervals, and Large-Scale Hypotheses Testing for High-Dimensional Mixed Linear Regression

This paper studies the high-dimensional mixed linear regression (MLR) where the output variable comes from one of the two linear regression models with an unknown mixing proportion and an unknown covariance structure of the random…

Methodology · Statistics 2020-11-10 Linjun Zhang, Rong Ma, T. Tony Cai, Hongzhe Li

Kernel Methods and Multi-layer Perceptrons Learn Linear Models in High Dimensions

Empirical observation of high dimensional phenomena, such as the double descent behaviour, has attracted a lot of interest in understanding classical techniques such as kernel methods, and their implications to explain generalization…

Machine Learning · Statistics 2022-01-21 Mojtaba Sahraee-Ardakan, Melikasadat Emami, Parthe Pandit, Sundeep Rangan, Alyson K. Fletcher

A Survey on Large-scale Machine Learning

Machine learning can provide deep insights into data, allowing machines to make high-quality predictions, and has been widely used in real-world applications, such as text mining, visual classification, and recommender systems. However,…

Machine Learning · Computer Science 2020-08-11 Meng Wang, Weijie Fu, Xiangnan He, Shijie Hao, Xindong Wu

Learning Single Index Models in High Dimensions

Single Index Models (SIMs) are simple yet flexible semi-parametric models for classification and regression. Response variables are modeled as a nonlinear, monotonic function of a linear combination of features. Estimation in this context…

Machine Learning · Statistics 2015-07-01 Ravi Ganti, Nikhil Rao, Rebecca M. Willett, Robert Nowak

Robust Linear Mixed Models using Hierarchical Gamma-Divergence

Linear mixed models (LMMs) are a popular class of methods for analyzing longitudinal and clustered data. However, such models can be sensitive to outliers, and this can lead to biased inference on model parameters and inaccurate prediction…

Methodology · Statistics 2025-03-28 Shonosuke Sugasawa, Francis K. C. Hui, Alan H. Welsh

Efficient Penalized Generalized Linear Mixed Models for Variable Selection and Genetic Risk Prediction in High-Dimensional Data

Sparse regularized regression methods are now widely used in genome-wide association studies (GWAS) to address the multiple testing burden that limits discovery of potentially important predictors. Linear mixed models (LMMs) have become an…

Methodology · Statistics 2022-06-27 Julien St-Pierre, Karim Oualkacha, Sahir Rai Bhatnagar

Adaptive Randomized Dimension Reduction on Massive Data

The scalability of statistical estimators is of increasing importance in modern applications. One approach to implementing scalable algorithms is to compress data into a low dimensional latent space using dimension reduction methods. In…

Machine Learning · Statistics 2015-04-14 Gregory Darnell, Stoyan Georgiev, Sayan Mukherjee, Barbara E Engelhardt

An efficient ADMM algorithm for high dimensional precision matrix estimation via penalized quadratic loss

The estimation of high dimensional precision matrices has been a central topic in statistical learning. However, as the number of parameters scales quadratically with the dimension $p$, many state-of-the-art methods do not scale well to…

Computation · Statistics 2019-07-10 Cheng Wang, Binyan Jiang

Double Machine Learning for Partially Linear Mixed-Effects Models with Repeated Measurements

Traditionally, spline or kernel approaches in combination with parametric estimation are used to infer the linear coefficient (fixed effects) in a partially linear mixed-effects model for repeated measurements. Using machine learning…

Methodology · Statistics 2023-04-03 Corinne Emmenegger, Peter Bühlmann

LESA: Learnable LLM Layer Scaling-Up

Training Large Language Models (LLMs) from scratch requires immense computational resources, making it prohibitively expensive. Model scaling-up offers a promising solution by leveraging the parameters of smaller models to create larger…

Machine Learning · Computer Science 2025-02-20 Yifei Yang, Zouying Cao, Xinbei Ma, Yao Yao, Libo Qin, Zhi Chen, Hai Zhao

An Explorative Study on Distributed Computing Techniques in Training and Inference of Large Language Models

Large language models (LLMs) are advanced AI systems trained on extensive textual data, leveraging deep learning techniques to understand and generate human-like language. Today's LLMs with billions of parameters are so huge that hardly any…

Distributed, Parallel, and Cluster Computing · Computer Science 2025-10-14 Sheikh Azizul Hakim, Saem Hasan

A Spectral Algorithm for Learning Hidden Markov Models

Hidden Markov Models (HMMs) are one of the most fundamental and widely used statistical tools for modeling discrete time series. In general, learning HMMs from data is computationally hard (under cryptographic assumptions), and…

Machine Learning · Computer Science 2012-07-10 Daniel Hsu, Sham M. Kakade, Tong Zhang

Sparse Linear Isotonic Models

In machine learning and data mining, linear models have been widely used to model the response as parametric linear functions of the predictors. To relax such stringent assumptions made by parametric linear models, additive models consider…

Machine Learning · Statistics 2017-10-18 Sheng Chen, Arindam Banerjee

Efficient Computation of High-Dimensional Penalized Generalized Linear Mixed Models by Latent Factor Modeling of the Random Effects

Modern biomedical datasets are increasingly high dimensional and exhibit complex correlation structures. Generalized Linear Mixed Models (GLMMs) have long been employed to account for such dependencies. However, proper specification of the…

Methodology · Statistics 2024-04-18 Hillary M. Heiling, Naim U. Rashid, Quefeng Li, Xianlu L. Peng, Jen Jen Yeh, Joseph G. Ibrahim

On Learning High Dimensional Structured Single Index Models

Single Index Models (SIMs) are simple yet flexible semi-parametric models for machine learning, where the response variable is modeled as a monotonic function of a linear combination of features. Estimation in this context requires learning…

Machine Learning · Statistics 2016-12-01 Nikhil Rao, Ravi Ganti, Laura Balzano, Rebecca Willett, Robert Nowak

Sparse Probit Linear Mixed Model

Linear Mixed Models (LMMs) are important tools in statistical genetics. When used for feature selection, they make it possible to find a sparse set of genetic traits that best predict a continuous phenotype of interest, while simultaneously correcting…

Machine Learning · Statistics 2017-09-12 Stephan Mandt, Florian Wenzel, Shinichi Nakajima, John P. Cunningham, Christoph Lippert, Marius Kloft

Differentiable Linearized ADMM

Recently, a number of learning-based optimization methods that combine data-driven architectures with the classical optimization algorithms have been proposed and explored, showing superior empirical performance in solving various ill-posed…

Machine Learning · Computer Science 2019-05-16 Xingyu Xie, Jianlong Wu, Zhisheng Zhong, Guangcan Liu, Zhouchen Lin

Scalable Matrix-valued Kernel Learning for High-dimensional Nonlinear Multivariate Regression and Granger Causality

We propose a general matrix-valued multiple kernel learning framework for high-dimensional nonlinear multivariate regression problems. This framework allows a broad class of mixed norm regularizers, including those that induce sparsity, to…

Machine Learning · Computer Science 2014-08-12 Vikas Sindhwani, Ha Quang Minh, Aurelie Lozano

Scalable Matrix-valued Kernel Learning for High-dimensional Nonlinear Multivariate Regression and Granger Causality

We propose a general matrix-valued multiple kernel learning framework for high-dimensional nonlinear multivariate regression problems. This framework allows a broad class of mixed norm regularizers, including those that induce sparsity, to…

Machine Learning · Statistics 2013-03-11 Vikas Sindhwani, Minh Ha Quang, Aurelie C. Lozano