Related papers: New optimization algorithms for neural network tra…

Metric Learning to Accelerate Convergence of Operator Splitting Methods for Differentiable Parametric Programming

Recent work has shown a variety of ways in which machine learning can be used to accelerate the solution of constrained optimization problems. Increasing demand for real-time decision-making capabilities in applications such as artificial…

Machine Learning · Computer Science 2024-04-02 Ethan King, James Kotary, Ferdinando Fioretto, Jan Drgona

Operator-Splitting Methods for Neuromorphic Circuit Simulation

A novel splitting algorithm is proposed for the numerical simulation of neuromorphic circuits. The algorithm is grounded in the operator-theoretic concept of monotonicity, which bears both physical and algorithmic significance. The…

Systems and Control · Electrical Eng. & Systems 2025-05-29 Amir Shahhosseini, Thomas Chaffey, Rodolphe Sepulchre

Learning to Optimize Neural Nets

Learning to Optimize is a recently proposed framework for learning optimization algorithms using reinforcement learning. In this paper, we explore learning an optimization algorithm for training shallow neural nets. Such high-dimensional…

Machine Learning · Computer Science 2017-12-01 Ke Li, Jitendra Malik

DeepSplit: Scalable Verification of Deep Neural Networks via Operator Splitting

Analyzing the worst-case performance of deep neural networks against input perturbations amounts to solving a large-scale non-convex optimization problem, for which several past works have proposed convex relaxations as a promising…

Machine Learning · Computer Science 2022-07-11 Shaoru Chen, Eric Wong, J. Zico Kolter, Mahyar Fazlyab

Training Neural Networks at Any Scale

This article reviews modern optimization methods for training neural networks with an emphasis on efficiency and scale. We present state-of-the-art optimization algorithms under a unified algorithmic template that highlights the importance…

Machine Learning · Computer Science 2025-11-17 Thomas Pethick, Kimon Antonakopoulos, Antonio Silveti-Falls, Leena Chennuru Vankadara, Volkan Cevher

Learning Gradient Descent: Better Generalization and Longer Horizons

Training deep neural networks is a highly nontrivial task, involving carefully selecting appropriate training algorithms, scheduling step sizes and tuning other hyperparameters. Trying different combinations can be quite labor-intensive and…

Machine Learning · Computer Science 2017-06-13 Kaifeng Lv, Shunhua Jiang, Jian Li

Reducing the Training Time of Neural Networks by Partitioning

This paper presents a new method for pre-training neural networks that can decrease the total training time for a neural network while maintaining the final performance, which motivates its use on deep neural networks. By partitioning the…

Neural and Evolutionary Computing · Computer Science 2016-01-05 Conrado S. Miranda, Fernando J. Von Zuben

Efficient Neural Network Training via Subset Pretraining

In training neural networks, it is common practice to use partial gradients computed over batches, mostly very small subsets of the training set. This approach is motivated by the argument that such a partial gradient is close to the true…

Machine Learning · Computer Science 2024-11-25 Jan Spörer, Bernhard Bermeitinger, Tomas Hrycej, Niklas Limacher, Siegfried Handschuh

Splitting Convolutional Neural Network Structures for Efficient Inference

For convolutional neural networks (CNNs) that have a large volume of input data, memory management becomes a major concern. Memory cost reduction can be an effective way to deal with these problems that can be realized through different…

Computer Vision and Pattern Recognition · Computer Science 2020-02-11 Emad MalekHosseini, Mohsen Hajabdollahi, Nader Karimi, Shadrokh Samavi, Shahram Shirani

Adaptive Optimization Algorithms for Machine Learning

Machine learning assumes a pivotal role in our data-driven world. The increasing scale of models and datasets necessitates quick and reliable algorithms for model training. This dissertation investigates adaptivity in machine learning…

Machine Learning · Computer Science 2023-11-20 Slavomír Hanzely

A randomized operator splitting scheme inspired by stochastic optimization methods

In this paper, we combine the operator splitting methodology for abstract evolution equations with that of stochastic methods for large-scale optimization problems. The combination results in a randomized splitting scheme, which in a given…

Numerical Analysis · Mathematics 2022-10-12 Monika Eisenmann, Tony Stillfjord

A consistent operator splitting algorithm and a two-metric variant: Application to topology optimization

In this work, we explore the use of operator splitting algorithms for solving regularized structural topology optimization problems. The context is the classical structural design problems (e.g., compliance minimization and compliant…

Optimization and Control · Mathematics 2013-07-22 Cameron Talischi, Glaucio H. Paulino

Data optimization for large batch distributed training of deep neural networks

Distributed training in deep learning (DL) is common practice as data and models grow. The current practice for distributed training of deep neural networks faces the challenges of communication bottlenecks when operating at scale, and…

Machine Learning · Computer Science 2020-12-21 Shubhankar Gahlot, Junqi Yin, Mallikarjun Shankar

A Comparison of Optimization Algorithms for Deep Learning

In recent years, we have witnessed the rise of deep learning. Deep neural networks have proved their success in many areas. However, the optimization of these networks has become more difficult as neural networks go deeper and datasets…

Machine Learning · Computer Science 2020-08-05 Derya Soydaner

Optimization Networks for Integrated Machine Learning

Optimization networks are a new methodology for holistically solving interrelated problems that have been developed with combinatorial optimization problems in mind. In this contribution we revisit the core principles of optimization…

Machine Learning · Computer Science 2021-10-04 Michael Kommenda, Johannes Karder, Andreas Beham, Bogdan Burlacu, Gabriel Kronberger, Stefan Wagner, Michael Affenzeller

A New Operator Splitting Method for Euler's Elastica Model

Euler's elastica model has a wide range of applications in Image Processing and Computer Vision. However, the non-convexity, the non-smoothness and the nonlinearity of the associated energy functional make its minimization a challenging…

Numerical Analysis · Mathematics 2020-01-10 Liang-Jian Deng, Roland Glowinski, Xue-Cheng Tai

Splitting Steepest Descent for Growing Neural Architectures

We develop a progressive training approach for neural networks which adaptively grows the network structure by splitting existing neurons into multiple offspring. By leveraging a functional steepest descent idea, we derive a simple…

Machine Learning · Computer Science 2019-11-06 Qiang Liu, Lemeng Wu, Dilin Wang

Optimization and Learning in Open Multi-Agent Systems

Modern artificial intelligence relies on networks of agents that collect data, process information, and exchange it with neighbors to collaboratively solve optimization and learning problems. This article introduces a novel distributed…

Optimization and Control · Mathematics 2026-01-15 Diego Deplano, Nicola Bastianello, Mauro Franceschelli, Karl H. Johansson

Dynamic Optimization of Neural Network Structures Using Probabilistic Modeling

Deep neural networks (DNNs) are powerful machine learning models and have succeeded in various artificial intelligence tasks. Although various architectures and modules for the DNNs have been proposed, selecting and designing the…

Neural and Evolutionary Computing · Computer Science 2018-01-24 Shinichi Shirakawa, Yasushi Iwata, Youhei Akimoto

A constrained optimization approach to improve robustness of neural networks

In this paper, we present a novel nonlinear programming-based approach to fine-tune pre-trained neural networks to improve robustness against adversarial attacks while maintaining high accuracy on clean data. Our method introduces…

Machine Learning · Computer Science 2024-10-28 Shudian Zhao, Jan Kronqvist