Related papers: Differentiable Integer Linear Programming is not D…

Differentiable Combinatorial Losses through Generalized Gradients of Linear Programs

When samples have internal structure, we often see a mismatch between the objective optimized during training and the model's goal during inference. For example, in sequence-to-sequence modeling we are interested in high-quality translated…

Machine Learning · Computer Science 2020-10-05 Xi Gao, Han Zhang, Aliakbar Panahi, Tom Arodz

Learning Differentiable Surrogate Losses for Structured Prediction

Structured prediction involves learning to predict complex structures rather than simple scalar values. The main challenge arises from the non-Euclidean nature of the output space, which generally requires relaxing the problem formulation.…

Machine Learning · Statistics 2024-11-19 Junjie Yang, Matthieu Labeau, Florence d'Alché-Buc

Nonlinear Integer Programming

Research efforts of the past fifty years have led to a development of linear integer programming as a mature discipline of mathematical optimization. Such a level of maturity has not been reached when one considers nonlinear systems subject…

Optimization and Control · Mathematics 2017-01-03 Raymond Hemmecke, Matthias Köppe, Jon Lee, Robert Weismantel

Differentiable Programming of Isometric Tensor Networks

Differentiable programming is a new programming paradigm which enables large scale optimization through automatic calculation of gradients also known as auto-differentiation. This concept emerges from deep learning, and has also been…

Quantum Physics · Physics 2022-02-01 Chenhua Geng, Hong-Ye Hu, Yijian Zou

Differential Invariants

Validation is a major challenge in differentiable programming. The state of the art is based on algorithmic differentiation. Consistency of first-order tangent and adjoint programs is defined by a well-known first-order differential…

Numerical Analysis · Mathematics 2021-01-12 Uwe Naumann

Differentiable Programming Tensor Networks

Differentiable programming is a fresh programming paradigm which composes parameterized algorithmic components and trains them using automatic differentiation (AD). The concept emerges from deep learning but is not only limited to training…

Strongly Correlated Electrons · Physics 2019-09-11 Hai-Jun Liao, Jin-Guo Liu, Lei Wang, Tao Xiang

Neural Network Training and Non-Differentiable Objective Functions

Many important computer vision tasks are naturally formulated to have a non-differentiable objective. Therefore, the standard, dominant training procedure of a neural network is not applicable since back-propagation requires the gradients…

Computer Vision and Pattern Recognition · Computer Science 2023-05-04 Yash Patel

Software-based Automatic Differentiation is Flawed

Various software efforts embrace the idea that object oriented programming enables a convenient implementation of the chain rule, facilitating so-called automatic differentiation via backpropagation. Such frameworks have no mechanism for…

Machine Learning · Computer Science 2023-05-09 Daniel Johnson, Trevor Maxfield, Yongxu Jin, Ronald Fedkiw

Adaptive Learning-based Surrogate Method for Stochastic Programs with Implicitly Decision-dependent Uncertainty

We consider a class of stochastic programming problems where the implicitly decision-dependent random variable follows a nonparametric regression model with heteroscedastic error. The Clarke subdifferential and surrogate functions are not…

Optimization and Control · Mathematics 2025-05-13 Boyang Shen, Junyi Liu

Gradient Methods Never Overfit On Separable Data

A line of recent works established that when training linear predictors over separable data, using gradient methods and exponentially-tailed losses, the predictors asymptotically converge in direction to the max-margin predictor. As a…

Machine Learning · Computer Science 2020-09-11 Ohad Shamir

Infeasibility detection with primal-dual hybrid gradient for large-scale linear programming

We study the problem of detecting infeasibility of large-scale linear programming problems using the primal-dual hybrid gradient method (PDHG) of Chambolle and Pock (2011). The literature on PDHG has mostly focused on settings where the…

Optimization and Control · Mathematics 2021-02-10 David Applegate, Mateo Díaz, Haihao Lu, Miles Lubin

The Implicit Bias of Gradient Descent on Separable Data

We examine gradient descent on unregularized logistic regression problems, with homogeneous linear predictors on linearly separable datasets. We show the predictor converges to the direction of the max-margin (hard margin SVM) solution. The…

Machine Learning · Statistics 2024-10-29 Daniel Soudry, Elad Hoffer, Mor Shpigel Nacson, Suriya Gunasekar, Nathan Srebro

The Elements of Differentiable Programming

Artificial intelligence has recently experienced remarkable advances, fueled by large models, vast datasets, accelerated hardware, and, last but not least, the transformative power of differentiable programming. This new programming…

Machine Learning · Computer Science 2025-06-25 Mathieu Blondel, Vincent Roulet

Iterative Linear Quadratic Optimization for Nonlinear Control: Differentiable Programming Algorithmic Templates

Iterative optimization algorithms depend on access to information about the objective function. In a differentiable programming framework, this information, such as gradients, can be automatically derived from the computational graph. We…

Optimization and Control · Mathematics 2025-07-08 Vincent Roulet, Siddhartha Srinivasa, Maryam Fazel, Zaid Harchaoui

Stable iterative refinement algorithms for solving linear systems

Iterative refinement (IR) is a popular scheme for solving a linear system of equations based on gradually improving the accuracy of an initial approximation. Originally developed to improve upon the accuracy of Gaussian elimination,…

Numerical Analysis · Mathematics 2025-06-24 Chai Wah Wu, Mark S. Squillante, Vasileios Kalantzis, Lior Horesh

Differentiable Scripting

In Computational Science, Engineering and Finance (CSEF) scripts typically serve as the "glue" between potentially highly complex and computationally expensive external subprograms. Differentiability of the resulting programs turns out to…

Mathematical Software · Computer Science 2021-12-07 Uwe Naumann

Estimating Predictability: Redundancy and Surrogate Data Method

A method for estimating theoretical predictability of time series is presented, based on information-theoretic functionals (redundancies) and the surrogate data technique. The redundancy, designed for a chosen model and a prediction horizon,…

comp-gas · Physics 2010-01-10 M. Paluš, L. Pecen, D. Pivka

A recipe for an unpredictable random number generator

In this work we present a model for computation of random processes in digital computers which solves the problem of periodic sequences and hidden errors produced by correlations. We show that systems with non-invertible non-linearities can…

Statistical Mechanics · Physics 2007-05-23 Monica A. Garcia-Nustes, Leonardo Trujillo, Jorge A. Gonzalez

Inherent Dependency Displacement Bias of Transition-Based Algorithms

A wide variety of transition-based algorithms are currently used for dependency parsers. Empirical studies have shown that performance varies across different treebanks in such a way that one algorithm outperforms another on one treebank…

Computation and Language · Computer Science 2020-04-01 Mark Anderson, Carlos Gómez-Rodríguez

Linearizable Implementations Do Not Suffice for Randomized Distributed Computation

Linearizability is the gold standard among algorithm designers for deducing the correctness of a distributed algorithm using implemented shared objects from the correctness of the corresponding algorithm using atomic versions of the same…

Distributed, Parallel, and Cluster Computing · Computer Science 2011-09-21 Wojciech Golab, Lisa Higham, Philipp Woelfel