Related papers: Progress Extrapolating Algorithmic Learning to Arb…

Location Attention for Extrapolation to Longer Sequences

Neural networks are surprisingly good at interpolating and perform remarkably well when the training set examples resemble those in the test set. However, they are often unable to extrapolate patterns beyond the seen data, even when the…

Machine Learning · Computer Science 2020-04-23 Yann Dubois, Gautier Dagan, Dieuwke Hupkes, Elia Bruni

End-to-end Algorithm Synthesis with Recurrent Networks: Logical Extrapolation Without Overthinking

Machine learning systems perform well on pattern matching tasks, but their ability to perform algorithmic or logical reasoning is not well understood. One important reasoning capability is algorithmic extrapolation, in which models trained…

Machine Learning · Computer Science 2022-10-18 Arpit Bansal, Avi Schwarzschild, Eitan Borgnia, Zeyad Emam, Furong Huang, Micah Goldblum, Tom Goldstein

Extrapolation-based Prediction-Correction Methods for Time-varying Convex Optimization

In this paper, we focus on the solution of online optimization problems that arise often in signal processing and machine learning, in which we have access to streaming sources of data. We discuss algorithms for online optimization based on…

Optimization and Control · Mathematics 2023-05-05 Nicola Bastianello, Ruggero Carli, Andrea Simonetto

Learning for Spatial Branching: An Algorithm Selection Approach

The use of machine learning techniques to improve the performance of branch-and-bound optimization algorithms is a very active area in the context of mixed integer linear problems, but little has been done for non-linear optimization. To…

Optimization and Control · Mathematics 2022-04-25 Bissan Ghaddar, Ignacio Gómez-Casares, Julio González-Díaz, Brais González-Rodríguez, Beatriz Pateiro-López, Sofía Rodríguez-Ballesteros

Sublinear Optimization for Machine Learning

We give sublinear-time approximation algorithms for some optimization problems arising in machine learning, such as training linear classifiers and finding minimum enclosing balls. Our algorithms can be extended to some kernelized versions…

Machine Learning · Computer Science 2010-10-22 Kenneth L. Clarkson, Elad Hazan, David P. Woodruff

Proximal Algorithms and Temporal Differences for Large Linear Systems: Extrapolation, Approximation, and Simulation

We consider large linear and nonlinear fixed point problems, and solution with proximal algorithms. We show that there is a close connection between two seemingly different types of methods from distinct fields: 1) Proximal iterations for…

Numerical Analysis · Computer Science 2019-09-05 Dimitri P. Bertsekas

Extremely Low Bit Neural Network: Squeeze the Last Bit Out with ADMM

Although deep learning models are highly effective for various learning tasks, their high computational costs prohibit the deployment to scenarios where either memory or computational resources are limited. In this paper, we focus on…

Computer Vision and Pattern Recognition · Computer Science 2017-09-14 Cong Leng, Hao Li, Shenghuo Zhu, Rong Jin

Local Binary Pattern Networks

Memory and computation efficient deep learning architectures are crucial to continued proliferation of machine learning capabilities to new platforms and systems. Binarization of operations in convolutional neural networks has shown…

Computer Vision and Pattern Recognition · Computer Science 2018-03-23 Jeng-Hau Lin, Yunfan Yang, Rajesh Gupta, Zhuowen Tu

Adaptive Discretization for Model-Based Reinforcement Learning

We introduce the technique of adaptive discretization to design an efficient model-based episodic reinforcement learning algorithm in large (potentially continuous) state-action spaces. Our algorithm is based on optimistic one-step value…

Machine Learning · Computer Science 2020-10-26 Sean R. Sinclair, Tianyu Wang, Gauri Jain, Siddhartha Banerjee, Christina Lee Yu

Beyond Backprop: Online Alternating Minimization with Auxiliary Variables

Despite significant recent advances in deep neural networks, training them remains a challenge due to the highly non-convex nature of the objective function. State-of-the-art methods rely on error backpropagation, which suffers from several…

Machine Learning · Statistics 2019-06-06 Anna Choromanska, Benjamin Cowen, Sadhana Kumaravel, Ronny Luss, Mattia Rigotti, Irina Rish, Brian Kingsbury, Paolo DiAchille, Viatcheslav Gurev, Ravi Tejwani, Djallel Bouneffouf

Connecting the Dots Between MLE and RL for Sequence Prediction

Sequence prediction models can be learned from example sequences with a variety of training algorithms. Maximum likelihood learning is simple and efficient, yet can suffer from compounding error at test time. Reinforcement learning such as…

Machine Learning · Computer Science 2019-07-02 Bowen Tan, Zhiting Hu, Zichao Yang, Ruslan Salakhutdinov, Eric Xing

Learning-Augmented Algorithms with Explicit Predictors

Recent advances in algorithmic design show how to utilize predictions obtained by machine learning models from past and present data. These approaches have demonstrated an enhancement in performance when the predictions are accurate, while…

Machine Learning · Computer Science 2024-03-13 Marek Elias, Haim Kaplan, Yishay Mansour, Shay Moran

Reversing Large Language Models for Efficient Training and Fine-Tuning

Large Language Models (LLMs) are known for their expensive and time-consuming training. Thus, oftentimes, LLMs are fine-tuned to address a specific task, given the pretrained weights of a pre-trained LLM considered a foundation model. In…

Computation and Language · Computer Science 2025-12-05 Eshed Gal, Moshe Eliasof, Javier Turek, Uri Ascher, Eran Treister, Eldad Haber

Algorithmic Language Models with Neurally Compiled Libraries

Important tasks such as reasoning and planning are fundamentally algorithmic, meaning that solving them robustly requires acquiring true reasoning or planning algorithms, rather than shortcuts. Large Language Models lack true algorithmic…

Artificial Intelligence · Computer Science 2025-05-27 Lucas Saldyt, Subbarao Kambhampati

Guidelines for enhancing data locality in selected machine learning algorithms

To deal with the complexity of the new bigger and more complex generation of data, machine learning (ML) techniques are probably the first and foremost used. For ML algorithms to produce results in a reasonable amount of time, they need to…

Machine Learning · Computer Science 2020-01-10 Imen Chakroun, Tom Vander Aa, Thomas J. Ashby

Scaling Memory-Augmented Neural Networks with Sparse Reads and Writes

Neural networks augmented with external memory have the ability to learn algorithmic solutions to complex tasks. These models appear promising for applications such as language modeling and machine translation. However, they scale poorly in…

Machine Learning · Computer Science 2016-10-31 Jack W Rae, Jonathan J Hunt, Tim Harley, Ivo Danihelka, Andrew Senior, Greg Wayne, Alex Graves, Timothy P Lillicrap

An Introduction to Advanced Machine Learning: Meta Learning Algorithms, Applications and Promises

In [1, 2], we have explored the theoretical aspects of feature extraction optimization processes for solving large-scale problems and overcoming machine learning limitations. Majority of optimization algorithms that have been introduced in…

Machine Learning · Computer Science 2019-08-28 Farid Ghareh Mohammadi, M. Hadi Amini, Hamid R. Arabnia

Algorithms for Adversarially Robust Deep Learning

Given the widespread use of deep learning models in safety-critical applications, ensuring that the decisions of such models are robust against adversarial exploitation is of fundamental importance. In this thesis, we discuss recent…

Machine Learning · Computer Science 2025-09-24 Alexander Robey

Simulating extrapolated dynamics with parameterization networks

An artificial neural network architecture, parameterization networks, is proposed for simulating extrapolated dynamics beyond observed data in dynamical systems. Parameterization networks are used to ensure the long term integrity of…

Chaotic Dynamics · Physics 2019-03-21 James P. L. Tan

Beyond Memorization: Extending Reasoning Depth with Recurrence, Memory and Test-Time Compute Scaling

Reasoning is a core capability of large language models, yet how multi-step reasoning is learned and executed remains unclear. We study this question in a controlled cellular-automata (1dCA) framework that excludes memorisation by using…

Machine Learning · Computer Science 2026-05-08 Ivan Rodkin, Daniil Orel, Konstantin Smirnov, Arman Bolatov, Bilal Elbouardi, Besher Hassan, Yuri Kuratov, Aydar Bulatov, Preslav Nakov, Timothy Baldwin, Artem Shelmanov, Mikhail Burtsev