Related papers: Towards Theoretically Inspired Neural Initializati…

200 papers

Innovations in neural architectures have fostered significant breakthroughs in language modeling and computer vision. Unfortunately, novel architectures often result in challenging hyper-parameter choices and training instability if the…

Machine Learning · Computer Science 2021-11-25 Chen Zhu, Renkun Ni, Zheng Xu, Kezhi Kong, W. Ronny Huang, Tom Goldstein

Research aimed at scaling up neuroscience inspired learning algorithms for neural networks is accelerating. Recently, a key research area has been the study of energy-based learning algorithms such as predictive coding, due to their…

Machine Learning · Computer Science 2026-01-30 Luca Pinchetti, Simon Frieder, Thomas Lukasiewicz, Tommaso Salvatori

It is often advantageous to train models on a subset of the available train examples, because the examples are of variable quality or because one would like to train with fewer examples, without sacrificing performance. We present Gradient…

Machine Learning · Computer Science 2024-07-30 Dante Everaert, Christopher Potts

Weight initialization plays a crucial role in the optimization behavior and convergence efficiency of neural networks. Most existing initialization methods, such as Xavier and Kaiming initializations, rely on random sampling and do not…

Machine Learning · Computer Science 2026-02-09 Shaowen Wang, Tariq Alkhalifah

The design of optimization algorithms for neural networks remains a critical challenge, with most existing methods relying on heuristic adaptations of gradient-based approaches. This paper introduces KO (Kinetics-inspired Optimizer), a…

Machine Learning · Computer Science 2025-05-22 Mingquan Feng, Yixin Huang, Yifan Fu, Shaobo Wang, Junchi Yan

The architecture and the parameters of neural networks are often optimized independently, which requires costly retraining of the parameters whenever the architecture is modified. In this work we instead focus on growing the architecture…

Machine Learning · Computer Science 2022-06-08 Utku Evci, Bart van Merriënboer, Thomas Unterthiner, Max Vladymyrov, Fabian Pedregosa

Iterative optimization is central to modern artificial intelligence (AI) and provides a crucial framework for understanding adaptive systems. This review provides a unified perspective on this subject, bridging classic theory with neural…

Machine Learning · Computer Science 2025-10-22 Jesús García Fernández, Nasir Ahmad, Marcel van Gerven

We propose a Newton-based scheme, initialized by neural operator predictions, to accelerate the parametric solution of nonlinear problems in computational solid mechanics. First, a physics informed conditional neural field is trained to…

Machine Learning · Computer Science 2025-11-11 Kianoosh Taghikhani, Yusuke Yamazaki, Jerry Paul Varghese, Markus Apel, Reza Najian Asl, Shahed Rezaei

Proper initialisation strategy is of primary importance to mitigate gradient explosion or vanishing when training neural networks. Yet, the impact of initialisation parameters still lacks a precise theoretical understanding for several…

Machine Learning · Computer Science 2026-05-12 Andrea Combette, Antoine Venaille, Nelly Pustelnik

We propose a novel network initialization method using Perlin noise for training image classification networks with a limited amount of data. Our main idea is to initialize the network parameters by solving an artificial noise…

Computer Vision and Pattern Recognition · Computer Science 2021-01-20 Nakamasa Inoue, Eisuke Yamagata, Hirokatsu Kataoka

Automatic neural architecture design has shown its potential in discovering powerful neural network architectures. Existing methods, no matter based on reinforcement learning or evolutionary algorithms (EA), conduct architecture search in a…

Machine Learning · Computer Science 2019-09-05 Renqian Luo, Fei Tian, Tao Qin, Enhong Chen, Tie-Yan Liu

We propose a new physics-informed neural network framework, IDPINN, based on the enhancement of initialization and domain decomposition to improve prediction accuracy. We train a PINN using a small dataset to obtain an initial network…

Machine Learning · Computer Science 2024-06-06 Chenhao Si, Ming Yan

Fast gradient-based optimization algorithms have become increasingly essential for the computationally efficient training of machine learning models. One technique is to multiply the gradient by a preconditioner matrix to produce a step,…

Machine Learning · Computer Science 2023-09-12 Isaac Liao, Rumen R. Dangovski, Jakob N. Foerster, Marin Soljačić

We report competitive results on object detection and instance segmentation on the COCO dataset using standard models trained from random initialization. The results are no worse than their ImageNet pre-training counterparts even when using…

Computer Vision and Pattern Recognition · Computer Science 2018-11-22 Kaiming He, Ross Girshick, Piotr Dollár

Overparameterized Neural Networks (NN) display state-of-the-art performance. However, there is a growing need for smaller, energy-efficient, neural networks to be able to use machine learning applications on devices with limited…

Machine Learning · Statistics 2021-05-21 Soufiane Hayou, Jean-Francois Ton, Arnaud Doucet, Yee Whye Teh

Despite the recent success of stochastic gradient descent in deep learning, it is often difficult to train a deep neural network with an inappropriate choice of its initial parameters. Even if training is successful, it has been known that…

Machine Learning · Computer Science 2023-02-10 Cheolhyoung Lee, Kyunghyun Cho

Deep neural networks are usually initialized with random weights, with adequately selected initial variance to ensure stable signal propagation during training. However, selecting the appropriate variance becomes challenging especially as…

Machine Learning · Computer Science 2022-11-07 Jiawei Zhao, Florian Schäfer, Anima Anandkumar

Deep neural networks achieve state-of-the-art performance for a range of classification and inference tasks. However, the use of stochastic gradient descent combined with the nonconvexity of the underlying optimization problems renders…

Machine Learning · Computer Science 2020-01-29 Ramina Ghods, Andrew S. Lan, Tom Goldstein, Christoph Studer

Network embedding has been intensively studied in the literature and widely used in various applications, such as link prediction and node classification. While previous works focus on the design of new algorithms or are tailored for various…

Social and Information Networks · Computer Science 2019-11-12 Wenqing Lin, Feng He, Faqiang Zhang, Xu Cheng, Hongyun Cai

We propose RSO (random search optimization), a gradient free Markov Chain Monte Carlo search based approach for training deep neural networks. To this end, RSO adds a perturbation to a weight in a deep neural network and tests if it reduces…

Machine Learning · Computer Science 2020-05-13 Rohun Tripathi, Bharat Singh