Related papers: Measuring Arithmetic Extrapolation Performance

200 papers

Neural networks have to capture mathematical relationships in order to learn various tasks. They approximate these relations implicitly and therefore often do not generalize well. The recently proposed Neural Arithmetic Logic Unit (NALU) is…

Neural and Evolutionary Computing · Computer Science 2020-03-18 Daniel Schlör, Markus Ring, Andreas Hotho

Neural networks can learn to represent and manipulate numerical information, but they seldom generalize well outside of the range of numerical values encountered during training. To encourage more systematic numerical extrapolation, we…

Neural and Evolutionary Computing · Computer Science 2018-08-03 Andrew Trask, Felix Hill, Scott Reed, Jack Rae, Chris Dyer, Phil Blunsom

Neural Arithmetic Logic Modules have become a growing area of interest, though remain a niche field. These modules are neural networks which aim to achieve systematic generalisation in learning arithmetic and/or logic operations such as…

Neural and Evolutionary Computing · Computer Science 2022-08-09 Bhumika Mistry, Katayoun Farrahi, Jonathon Hare

A major problem for neural network models trained to count instances is that generalization error increases whenever the test range exceeds the training range, i.e. they generalize poorly outside the training range. Consider the case…

Machine Learning · Computer Science 2020-06-16 Ashish Rana, Taranveer Singh, Harpreet Singh, Neeraj Kumar, Prashant Singh Rana

Neural networks can approximate complex functions, but they struggle to perform exact arithmetic operations over real numbers. The lack of inductive bias for arithmetic operations leaves neural networks without the underlying logic…

Neural and Evolutionary Computing · Computer Science 2020-01-16 Andreas Madsen, Alexander Rosenberg Johansen

Large Language Models (LLMs) have demonstrated remarkable capabilities in code understanding and generation. However, their effectiveness on non-code Software Engineering (SE) tasks remains underexplored. We present 'Software Engineering…

Software Engineering · Computer Science 2026-02-12 Fabian C. Peña, Steffen Herbold

Conventional Neural Networks can approximate simple arithmetic operations, but fail to generalize beyond the range of numbers that were seen during training. Neural Arithmetic Units aim to overcome this difficulty, but current arithmetic…

Machine Learning · Computer Science 2020-12-18 Niklas Heim, Tomáš Pevný, Václav Šmídl

Activation functions influence behavior and performance of DNNs. Nonlinear activation functions, like Rectified Linear Units (ReLU), Exponential Linear Units (ELU) and Scaled Exponential Linear Units (SELU), outperform the linear…

Neural and Evolutionary Computing · Computer Science 2019-02-05 Alberto Marchisio, Muhammad Abdullah Hanif, Semeen Rehman, Maurizio Martina, Muhammad Shafique

For natural language understanding (NLU) technology to be maximally useful, both practically and as a scientific object of study, it must be general: it must be able to process language in a way that is not exclusively tailored to any one…

Computation and Language · Computer Science 2019-02-26 Alex Wang, Amanpreet Singh, Julian Michael, Felix Hill, Omer Levy, Samuel R. Bowman

Selecting the most suitable activation function is a critical factor in the effectiveness of deep learning models, as it influences their learning capacity, stability, and computational efficiency. In recent years, the Gaussian Error Linear…

Machine Learning · Computer Science 2023-08-02 Minhyeok Lee

We propose the Moderate Adaptive Linear Unit (MoLU), a novel activation function for deep neural networks, defined analytically as: f(x) = x × (1 + tanh(x))/2. MoLU combines mathematical elegance with empirical effectiveness, exhibiting…

Machine Learning · Computer Science 2025-07-16 Hankyul Koh, Joon-hyuk Ko, Wonho Jhe

The Domain Mixed Unit (DMU) is a new neural arithmetic unit that learns a single parameter gate that mixes between log-space and linear-space representations while performing either addition (DMU add) or subtraction (DMU sub). Two…

Machine Learning · Computer Science 2025-09-16 Paul Curry

Automated mathematical reasoning is a challenging problem that requires an agent to learn algebraic patterns that contain long-range dependencies. Two particular tasks that test this type of reasoning are (1) mathematical equation…

Machine Learning · Computer Science 2021-04-08 Ankur Mali, Alexander Ororbia, Daniel Kifer, C. Lee Giles

Given the ubiquitous nature of numbers in text, reasoning with numbers to perform simple calculations is an important skill of AI systems. While many datasets and models have been developed to this end, state-of-the-art AI systems are…

Computation and Language · Computer Science 2022-04-13 Swaroop Mishra, Arindam Mitra, Neeraj Varshney, Bhavdeep Sachdeva, Peter Clark, Chitta Baral, Ashwin Kalyan

The activation function is crucial to the recent successes of deep neural networks. In this paper, we first propose a new activation function, Multiple Parametric Exponential Linear Units (MPELU), aiming to generalize and unify the rectified…

Computer Vision and Pattern Recognition · Computer Science 2017-01-18 Yang Li, Chunxiao Fan, Yong Li, Qiong Wu, Yue Ming

ReLU neural-networks have been in the focus of many recent theoretical works, trying to explain their empirical success. Nonetheless, there is still a gap between current theoretical results and empirical observations, even in the case of…

Machine Learning · Computer Science 2019-06-13 Jonathan Fiat, Eran Malach, Shai Shalev-Shwartz

Multilayer Perceptrons struggle to learn certain simple arithmetic tasks. Specialist neural modules for arithmetic can outperform classical architectures with gains in extrapolation, interpretability and convergence speeds, but are highly…

Machine Learning · Computer Science 2022-11-11 Bhumika Mistry, Katayoun Farrahi, Jonathon Hare

Math reasoning is an active area of Large Language Model (LLM) research because it is a hallmark of artificial intelligence and has implications in several domains, including math education. However, few works have explored how math…

Computation and Language · Computer Science 2025-06-19 Bryan R. Christ, Zack Gottesman, Jonathan Kropko, Thomas Hartvigsen

Learning general representations of text is a fundamental problem for many natural language understanding (NLU) tasks. Previously, researchers have proposed to use language model pre-training and multi-task learning to learn robust…

Computation and Language · Computer Science 2019-08-29 Zi-Yi Dou, Keyi Yu, Antonios Anastasopoulos

Natural Language Understanding (NLU) is a branch of Natural Language Processing (NLP) that uses intelligent computer software to understand texts that encode human knowledge. Recent years have witnessed notable progress across various NLU…

Computation and Language · Computer Science 2022-03-01 Xinliang Frederick Zhang