Related papers: Collaborative Filtering via High-Dimensional Regre…

ImplicitSLIM and How it Improves Embedding-based Collaborative Filtering

We present ImplicitSLIM, a novel unsupervised learning approach for sparse high-dimensional data, with applications to collaborative filtering. Sparse linear methods (SLIM) and their variations show outstanding performance, but they are…

Information Retrieval · Computer Science 2024-06-04 Ilya Shenbin, Sergey Nikolenko

Towards a Better Understanding of Linear Models for Recommendation

Recently, linear regression models, such as EASE and SLIM, have been shown to often produce rather competitive results against more sophisticated deep learning models. On the other hand, the (weighted) matrix factorization approaches have been…

Information Retrieval · Computer Science 2021-06-17 Ruoming Jin, Dong Li, Jing Gao, Zhi Liu, Li Chen, Yang Zhou

Factor-Adjusted Regularized Model Selection

This paper studies model selection consistency for high dimensional sparse regression when data exhibits both cross-sectional and serial dependency. Most commonly-used model selection methods fail to consistently recover the true model when…

Methodology · Statistics 2018-09-12 Jianqing Fan, Yuan Ke, Kaizheng Wang

Embarrassingly Shallow Autoencoders for Sparse Data

Combining simple elements from the literature, we define a linear model that is geared toward sparse data, in particular implicit feedback data for recommender systems. We show that its training objective has a closed-form solution, and…

Information Retrieval · Computer Science 2019-05-10 Harald Steck

High Dimensional Robust Sparse Regression

We provide a novel -- and to the best of our knowledge, the first -- algorithm for high dimensional sparse regression with constant fraction of corruptions in explanatory and/or response variables. Our algorithm recovers the true sparse…

Machine Learning · Computer Science 2019-05-31 Liu Liu, Yanyao Shen, Tianyang Li, Constantine Caramanis

High Dimensional Classification with combined Adaptive Sparse PLS and Logistic Regression

Motivation: The high dimensionality of genomic data calls for the development of specific classification methodologies, especially to prevent over-optimistic predictions. This challenge can be tackled by compression and variable selection,…

Methodology · Statistics 2021-04-10 G. Durif, L. Modolo, J. Michaelsson, J. E. Mold, S. Lambert-Lacroix, F. Picard

Large-scale Collaborative Filtering with Product Embeddings

The application of machine learning techniques to large-scale personalized recommendation problems is a challenging task. Such systems must make sense of enormous amounts of implicit feedback in order to understand user preferences across…

Information Retrieval · Computer Science 2019-01-15 Thom Lake, Sinead A. Williamson, Alexander T. Hawk, Christopher C. Johnson, Benjamin P. Wing

High-Dimensional Distributed Sparse Classification with Scalable Communication-Efficient Global Updates

As the size of datasets used in statistical learning continues to grow, distributed training of models has attracted increasing attention. These methods partition the data and exploit parallelism to reduce memory and runtime, but suffer…

Machine Learning · Computer Science 2024-07-10 Fred Lu, Ryan R. Curtin, Edward Raff, Francis Ferraro, James Holt

Sparse Additive Models

We present a new class of methods for high-dimensional nonparametric regression and classification called sparse additive models (SpAM). Our methods combine ideas from sparse linear modeling and additive nonparametric regression. We derive…

Statistics Theory · Mathematics 2008-04-09 Pradeep Ravikumar, John Lafferty, Han Liu, Larry Wasserman

Improving Group Lasso for high-dimensional categorical data

Sparse modelling or model selection with categorical data is challenging even for a moderate number of variables, because one parameter is roughly needed to encode one category or level. The Group Lasso is a well known efficient algorithm…

Methodology · Statistics 2022-11-14 Szymon Nowakowski, Piotr Pokarowski, Wojciech Rejchel, Agnieszka Sołtys

Modeling with Categorical Features via Exact Fusion and Sparsity Regularisation

We study the high-dimensional linear regression problem with categorical predictors that have many levels. We propose a new estimation approach, which performs model compression via two mechanisms by simultaneously encouraging (a)…

Methodology · Statistics 2026-03-30 Kayhan Behdin, Riade Benbaki, Peter Radchenko, Rahul Mazumder

Subspace Segmentation by Successive Approximations: A Method for Low-Rank and High-Rank Data with Missing Entries

We propose a method to reconstruct and cluster incomplete high-dimensional data lying in a union of low-dimensional subspaces. Exploring the sparse representation model, we jointly estimate the missing data while imposing the intrinsic…

Computer Vision and Pattern Recognition · Computer Science 2017-09-06 João Carvalho, Manuel Marques, João P. Costeira

A Consistent and Scalable Algorithm for Best Subset Selection in Single Index Models

Analysis of high-dimensional data has led to increased interest in both single index models (SIMs) and the best-subset selection. SIMs provide an interpretable and flexible modeling framework for high-dimensional data, while the best-subset…

Machine Learning · Statistics 2025-08-19 Borui Tang, Jin Zhu, Junxian Zhu, Xueqin Wang, Heping Zhang

High-dimensional classification by sparse logistic regression

We consider high-dimensional binary classification by sparse logistic regression. We propose a model/feature selection procedure based on penalized maximum likelihood with a complexity penalty on the model size and derive the non-asymptotic…

Statistics Theory · Mathematics 2018-11-20 Felix Abramovich, Vadim Grinshtein

Multi-Model Subset Selection

The two primary approaches for high-dimensional regression problems are sparse methods (e.g., best subset selection, which uses the L0-norm in the penalty) and ensemble methods (e.g., random forests). Although sparse methods typically yield…

Methodology · Statistics 2024-10-31 Anthony-Alexander Christidis, Stefan Van Aelst, Ruben Zamar

Greedy SLIM: A SLIM-Based Approach For Preference Elicitation

Preference elicitation is an active learning approach to tackle the cold-start problem of recommender systems. Roughly speaking, new users are asked to rate some carefully selected items in order to compute appropriate recommendations for…

Information Retrieval · Computer Science 2024-06-11 Claudius Proissl, Amel Vatic, Helmut Waldschmidt

CoSam: An Efficient Collaborative Adaptive Sampler for Recommendation

Sampling strategies have been widely applied in many recommendation systems to accelerate model learning from implicit feedback data. A typical strategy is to draw negative instances with uniform distribution, which however will severely…

Information Retrieval · Computer Science 2020-11-17 Jiawei Chen, Chengquan Jiang, Can Wang, Sheng Zhou, Yan Feng, Chun Chen, Martin Ester, Xiangnan He

Federated Sufficient Dimension Reduction Through High-Dimensional Sparse Sliced Inverse Regression

Federated learning has become a popular tool in the big data era. It trains a centralized model based on data from different clients while keeping data decentralized. In this paper, we propose a federated sparse sliced inverse…

Machine Learning · Statistics 2023-01-24 Wenquan Cui, Yue Zhao, Jianjun Xu, Haoyang Cheng

Efficient Distributed Learning with Sparsity

We propose a novel, efficient approach for distributed sparse learning in high-dimensions, where observations are randomly partitioned across machines. Computationally, at each round our method only requires the master machine to solve a…

Machine Learning · Statistics 2016-05-26 Jialei Wang, Mladen Kolar, Nathan Srebro, Tong Zhang

Sparse Group Selection Through Co-Adaptive Penalties

Recent work has focused on the problem of conducting linear regression when the number of covariates is very large, potentially greater than the sample size. To facilitate this, one useful tool is to assume that the model can be well…

Methodology · Statistics 2011-11-21 Zhou Fang