Related papers: Complexity Scaling Laws for Neural Models using Co…
On a variety of tasks, the performance of neural networks predictably improves with training time, dataset size and model size across many orders of magnitude. This phenomenon is known as a neural scaling law. Of fundamental importance is…
Neural scaling laws establish a predictable relationship between model performance and data or compute, offering crucial guidance for resource allocation in new domains and tasks. Yet such laws are most needed precisely where they are…
Combinatorial optimization plays an important role in real-world problem solving. In the big data era, the dimensionality of a combinatorial optimization problem is usually very large, which poses a significant challenge to existing…
Neural scaling laws have revolutionized the design and optimization of large-scale AI models by revealing predictable relationships between model size, dataset volume, and computational resources. Early research established power-law…
We consider combinatorial optimization problems defined over random ensembles, and study how solution cost increases when the optimal solution undergoes a small perturbation delta. For the minimum spanning tree, the increase in cost scales…
Neural scaling laws define a predictable relationship between a model's parameter count and its performance after training in the form of a power law. However, most research to date has not explicitly investigated whether scaling laws can…
We study empirical scaling laws for language model performance on the cross-entropy loss. The loss scales as a power-law with model size, dataset size, and the amount of compute used for training, with some trends spanning more than seven…
Combinatorial optimization is the field devoted to the study and practice of algorithms that solve NP-hard problems. As Machine Learning (ML) and deep learning have popularized, several research groups have started to use ML to solve…
In this work we introduce an evolutionary strategy to solve combinatorial optimization tasks, i.e. problems characterized by a discrete search space. In particular, we focus on the Traveling Salesman Problem (TSP), i.e. a famous problem…
Large language models with a huge number of parameters, when trained on near internet-sized number of tokens, have been empirically shown to obey neural scaling laws: specifically, their performance behaves predictably as a power law in…
End-to-end training of neural network solvers for graph combinatorial optimization problems such as the Travelling Salesperson Problem (TSP) have seen a surge of interest recently, but remain intractable and inefficient beyond graphs with…
There is a recent trend in machine learning to increase model quality by growing models to sizes previously thought to be unreasonable. Recent work has shown that autoregressive generative models with cross-entropy objective functions…
Neural scaling laws characterize how model performance improves as the model size scales up. Inspired by empirical observations, we introduce a resource model of neural scaling. A task is usually composite hence can be decomposed into many…
In this paper, we provide a novel strategy for solving Traveling Salesman Problem, which is a famous combinatorial optimization problem studied intensely in the TCS community. In particular, we consider the imitation learning framework,…
This paper presents a framework to tackle combinatorial optimization problems using neural networks and reinforcement learning. We focus on the traveling salesman problem (TSP) and train a recurrent network that, given a set of city…
In recent years, the state-of-the-art in deep learning has been dominated by very large models that have been pre-trained on vast amounts of data. The paradigm is very simple: investing more computational resources (optimally) leads to…
Current methods for end-to-end constructive neural combinatorial optimization usually train a policy using behavior cloning from expert solutions or policy gradient methods from reinforcement learning. While behavior cloning is…
We consider the solvable neural scaling model with three parameters: data complexity, target complexity, and model-parameter-count. We use this neural scaling model to derive new predictions about the compute-limited, infinite-data scaling…
As neural networks continue to grow in size but datasets might not, it is vital to understand how much performance improvement can be expected: is it more important to scale network size or data volume? Thus, neural network scaling laws,…
The Traveling Salesman Problem (TSP) is a well-known NP-hard combinatorial optimization problem with wide-ranging applications in logistics, routing, and intelligent systems. Due to its factorial complexity, solving large-scale instances…