
Related papers: Adaptive Regularization for Weight Matrices

200 papers

The adaptive Metropolis (AM) algorithm of Haario, Saksman and Tamminen [Bernoulli 7 (2001) 223-242] uses the estimated covariance of the target distribution in the proposal distribution. This paper introduces a new robust adaptive…

Computation · Statistics 2011-05-30 Matti Vihola

Weight-sharing is ubiquitous in deep learning. Motivated by this, we propose a "weight-sharing regularization" penalty on the weights $w \in \mathbb{R}^d$ of a neural network, defined as $\mathcal{R}(w) = \frac{1}{d - 1}\sum_{i > j}^d |w_i…

Machine Learning · Computer Science 2024-03-12 Mehran Shakerinava, Motahareh Sohrabi, Siamak Ravanbakhsh, Simon Lacoste-Julien

Adaptive algorithms belong to an important class of algorithms used in radar target detection to overcome prior uncertainty of interference covariance. The contamination of the empirical covariance matrix by the useful signal leads to…

Signal Processing · Electrical Eng. & Systems 2021-01-01 Boris N. Oreshkin

Matrix factorization is a widely used approach for top-N recommendation and collaborative filtering. When implemented on implicit feedback data (such as clicks), a common heuristic is to upweight the observed interactions. This strategy has…

Information Retrieval · Computer Science 2025-10-14 Alex Ayoub, Samuel Robertson, Dawen Liang, Harald Steck, Nathan Kallus

This work establishes a novel link between the problem of PAC-learning high-dimensional graphical models and the task of (efficient) counting and sampling of graph structures, using an online learning framework. We observe that if we apply…

Machine Learning · Computer Science 2025-11-14 Arnab Bhattacharyya, Sutanu Gayen, Philips George John, Sayantan Sen, N. V. Vinodchandran

In this article, we present a method for increasing adaptivity of an existing robust estimation algorithm by learning two parameters to better fit the residual distribution. The analyzed method uses these two parameters to calculate weights…

Robotics · Computer Science 2023-06-27 Shounak Das, Jason Gross

Sparse regularization techniques are well-established in machine learning, yet their application in neural networks remains challenging due to the non-differentiability of penalties like the $L_1$ norm, which is incompatible with stochastic…

Machine Learning · Computer Science 2025-02-10 Chris Kolb, Tobias Weber, Bernd Bischl, David Rügamer

There is a growing body of learning problems for which it is natural to organize the parameters into a matrix, so as to appropriately regularize the parameters under some matrix norm (in order to impose some more sophisticated prior knowledge).…

Machine Learning · Computer Science 2010-10-19 Sham M. Kakade, Shai Shalev-Shwartz, Ambuj Tewari

Matrix-based optimizers have attracted growing interest for improving LLM training efficiency, with significant progress centered on orthogonalization/whitening based methods. While yielding substantial performance gains, a fundamental…

Machine Learning · Computer Science 2026-02-10 Wenbo Gong, Javier Zazo, Qijun Luo, Puqian Wang, James Hensman, Chao Ma

For many machine learning algorithms, two main assumptions are required to guarantee performance. One is that the test data are drawn from the same distribution as the training data, and the other is that the model is correctly specified.…

Machine Learning · Computer Science 2020-02-03 Kun Kuang, Ruoxuan Xiong, Peng Cui, Susan Athey, Bo Li

A common goal in statistics and machine learning is to learn models that can perform well against distributional shifts, such as latent heterogeneous subpopulations, unknown covariate shifts, or unmodeled temporal effects. We develop and…

Machine Learning · Statistics 2020-07-21 John Duchi, Hongseok Namkoong

To train machine learning models that are robust to distribution shifts in the data, distributionally robust optimization (DRO) has been proven very effective. However, the existing approaches to learning a distributionally robust model…

Machine Learning · Computer Science 2022-03-21 Farzin Haddadpour, Mohammad Mahdi Kamani, Mehrdad Mahdavi, Amin Karbasi

Empirical risk minimization (ERM) is not robust to changes in the distribution of data. When the distribution of test data is different from that of training data, the problem is known as out-of-distribution generalization. Recently, two…

Computer Vision and Pattern Recognition · Computer Science 2025-01-16 Shijian Xu

Regression evaluation has been performed for decades. Some metrics have been identified to be robust against shifting and scaling of the data but considering the different distributions of data is much more difficult to address (imbalance…

Machine Learning · Computer Science 2020-09-14 Mario Michael Krell, Bilal Wehbe

In this era of large-scale data, distributed systems built on top of clusters of commodity hardware provide cheap and reliable storage and scalable processing of massive data. Here, we review recent work on developing and implementing…

Distributed, Parallel, and Cluster Computing · Computer Science 2015-07-28 Jiyan Yang, Xiangrui Meng, Michael W. Mahoney

We develop an approach to efficiently grow neural networks, within which parameterization and optimization strategies are designed by considering their effects on the training dynamics. Unlike existing growing methods, which follow simple…

Machine Learning · Computer Science 2023-06-23 Xin Yuan, Pedro Savarese, Michael Maire

We study averaging-based distributed optimization solvers over random networks. We show a general result on the convergence of such schemes using weight matrices that are row-stochastic almost surely and column-stochastic in expectation…

Optimization and Control · Mathematics 2020-10-06 Adel Aghajan, Behrouz Touri

With the increasing penetration of machine learning applications in critical decision-making areas, calls for algorithmic fairness are more prominent. Although there have been various modalities to improve algorithmic fairness through…

Machine Learning · Computer Science 2024-05-21 Zhihao Hu, Yiran Xu, Mengnan Du, Jindong Gu, Xinmei Tian, Fengxiang He

This thesis presents a novel approach to neural network training that addresses the challenge of determining the optimal number of learning factors. The proposed Adaptive Multiple Optimal Learning Factors (AMOLF) algorithm dynamically…

Machine Learning · Computer Science 2024-06-12 Jeshwanth Challagundla

Modeling and forecasting covariance matrices of asset returns play a crucial role in finance. The availability of high frequency intraday data enables the modeling of the realized covariance matrix directly. However, most models in the…

Applications · Statistics 2015-04-15 Keren Shen, Jianfeng Yao, Wai Keung Li