Related papers: Complexity Scaling Laws for Neural Models using Co…

A Dynamical Model of Neural Scaling Laws

On a variety of tasks, the performance of neural networks predictably improves with training time, dataset size and model size across many orders of magnitude. This phenomenon is known as a neural scaling law. Of fundamental importance is…

Machine Learning · Statistics 2024-06-25 Blake Bordelon, Alexander Atanasov, Cengiz Pehlevan

On the Invariance and Generality of Neural Scaling Laws

Neural scaling laws establish a predictable relationship between model performance and data or compute, offering crucial guidance for resource allocation in new domains and tasks. Yet such laws are most needed precisely where they are…

Machine Learning · Computer Science 2026-05-11 Xing Han, Ziyin Liu, Suchi Saria, Paul Pu Liang

Generalization of Machine Learning for Problem Reduction: A Case Study on Travelling Salesman Problems

Combinatorial optimization plays an important role in real-world problem solving. In the big data era, the dimensionality of a combinatorial optimization problem is usually very large, which poses a significant challenge to existing…

Machine Learning · Computer Science 2020-09-09 Yuan Sun, Andreas Ernst, Xiaodong Li, Jake Weiner

How to Upscale Neural Networks with Scaling Law? A Survey and Practical Guidelines

Neural scaling laws have revolutionized the design and optimization of large-scale AI models by revealing predictable relationships between model size, dataset volume, and computational resources. Early research established power-law…

Computation and Language · Computer Science 2025-05-28 Ayan Sengupta, Yash Goel, Tanmoy Chakraborty

Scaling and Universality in Continuous Length Combinatorial Optimization

We consider combinatorial optimization problems defined over random ensembles, and study how solution cost increases when the optimal solution undergoes a small perturbation delta. For the minimum spanning tree, the increase in cost scales…

Disordered Systems and Neural Networks · Physics 2009-11-10 David Aldous, Allon G. Percus

Scaling Laws Under the Microscope: Predicting Transformer Performance from Small Scale Experiments

Neural scaling laws define a predictable relationship between a model's parameter count and its performance after training in the form of a power law. However, most research to date has not explicitly investigated whether scaling laws can…

Computation and Language · Computer Science 2022-10-19 Maor Ivgi, Yair Carmon, Jonathan Berant

Scaling Laws for Neural Language Models

We study empirical scaling laws for language model performance on the cross-entropy loss. The loss scales as a power-law with model size, dataset size, and the amount of compute used for training, with some trends spanning more than seven…

Machine Learning · Computer Science 2020-01-24 Jared Kaplan, Sam McCandlish, Tom Henighan, Tom B. Brown, Benjamin Chess, Rewon Child, Scott Gray, Alec Radford, Jeffrey Wu, Dario Amodei

How to Evaluate Machine Learning Approaches for Combinatorial Optimization: Application to the Travelling Salesman Problem

Combinatorial optimization is the field devoted to the study and practice of algorithms that solve NP-hard problems. As Machine Learning (ML) and deep learning have popularized, several research groups have started to use ML to solve…

Artificial Intelligence · Computer Science 2019-10-01 Antoine François, Quentin Cappart, Louis-Martin Rousseau

An Evolutionary Strategy based on Partial Imitation for Solving Optimization Problems

In this work we introduce an evolutionary strategy to solve combinatorial optimization tasks, i.e. problems characterized by a discrete search space. In particular, we focus on the Traveling Salesman Problem (TSP), i.e. a famous problem…

Disordered Systems and Neural Networks · Physics 2016-08-05 Marco Alberto Javarone

A Solvable Model of Neural Scaling Laws

Large language models with a huge number of parameters, when trained on near internet-sized number of tokens, have been empirically shown to obey neural scaling laws: specifically, their performance behaves predictably as a power law in…

Machine Learning · Computer Science 2022-11-01 Alexander Maloney, Daniel A. Roberts, James Sully

Learning the Travelling Salesperson Problem Requires Rethinking Generalization

End-to-end training of neural network solvers for graph combinatorial optimization problems such as the Travelling Salesperson Problem (TSP) have seen a surge of interest recently, but remain intractable and inefficient beyond graphs with…

Machine Learning · Computer Science 2022-05-26 Chaitanya K. Joshi, Quentin Cappart, Louis-Martin Rousseau, Thomas Laurent

Scaling Laws for Acoustic Models

There is a recent trend in machine learning to increase model quality by growing models to sizes previously thought to be unreasonable. Recent work has shown that autoregressive generative models with cross-entropy objective functions…

Audio and Speech Processing · Electrical Eng. & Systems 2021-06-18 Jasha Droppo, Oguz Elibol

A Resource Model For Neural Scaling Law

Neural scaling laws characterize how model performance improves as the model size scales up. Inspired by empirical observations, we introduce a resource model of neural scaling. A task is usually composite hence can be decomposed into many…

Machine Learning · Computer Science 2024-05-16 Jinyeop Song, Ziming Liu, Max Tegmark, Jeff Gore

Travel the Same Path: A Novel TSP Solving Strategy

In this paper, we provide a novel strategy for solving Traveling Salesman Problem, which is a famous combinatorial optimization problem studied intensely in the TCS community. In particular, we consider the imitation learning framework,…

Machine Learning · Computer Science 2022-10-13 Pingbang Hu

Neural Combinatorial Optimization with Reinforcement Learning

This paper presents a framework to tackle combinatorial optimization problems using neural networks and reinforcement learning. We focus on the traveling salesman problem (TSP) and train a recurrent network that, given a set of city…

Artificial Intelligence · Computer Science 2017-01-16 Irwan Bello, Hieu Pham, Quoc V. Le, Mohammad Norouzi, Samy Bengio

Navigating Scaling Laws: Compute Optimality in Adaptive Model Training

In recent years, the state-of-the-art in deep learning has been dominated by very large models that have been pre-trained on vast amounts of data. The paradigm is very simple: investing more computational resources (optimally) leads to…

Machine Learning · Computer Science 2024-05-24 Sotiris Anagnostidis, Gregor Bachmann, Imanol Schlag, Thomas Hofmann

Self-Improvement for Neural Combinatorial Optimization: Sample without Replacement, but Improvement

Current methods for end-to-end constructive neural combinatorial optimization usually train a policy using behavior cloning from expert solutions or policy gradient methods from reinforcement learning. While behavior cloning is…

Machine Learning · Computer Science 2024-11-05 Jonathan Pirnay, Dominik G. Grimm

4+3 Phases of Compute-Optimal Neural Scaling Laws

We consider the solvable neural scaling model with three parameters: data complexity, target complexity, and model-parameter-count. We use this neural scaling model to derive new predictions about the compute-limited, infinite-data scaling…

Machine Learning · Statistics 2025-04-22 Elliot Paquette, Courtney Paquette, Lechao Xiao, Jeffrey Pennington

Unified Neural Network Scaling Laws and Scale-time Equivalence

As neural networks continue to grow in size but datasets might not, it is vital to understand how much performance improvement can be expected: is it more important to scale network size or data volume? Thus, neural network scaling laws,…

Machine Learning · Computer Science 2024-09-10 Akhilan Boopathy, Ila Fiete

A Comparative Review of Parallel Exact, Heuristic, Metaheuristic, and Hybrid Optimization Techniques for the Traveling Salesman Problem

The Traveling Salesman Problem (TSP) is a well-known NP-hard combinatorial optimization problem with wide-ranging applications in logistics, routing, and intelligent systems. Due to its factorial complexity, solving large-scale instances…

Distributed, Parallel, and Cluster Computing · Computer Science 2025-05-27 Rabab Alkhalifa, Fatima Alkhomayes, Boushra Almazroua, Dana Alhaidan, Maryam Alothman, Jumana Almuhaidib