Related papers: Measuring Arithmetic Extrapolation Performance

iNALU: Improved Neural Arithmetic Logic Unit

Neural networks have to capture mathematical relationships in order to learn various tasks. They approximate these relations implicitly and therefore often do not generalize well. The recently proposed Neural Arithmetic Logic Unit (NALU) is…

Neural and Evolutionary Computing · Computer Science 2020-03-18 Daniel Schlör, Markus Ring, Andreas Hotho

Neural Arithmetic Logic Units

Neural networks can learn to represent and manipulate numerical information, but they seldom generalize well outside of the range of numerical values encountered during training. To encourage more systematic numerical extrapolation, we…

Neural and Evolutionary Computing · Computer Science 2018-08-03 Andrew Trask, Felix Hill, Scott Reed, Jack Rae, Chris Dyer, Phil Blunsom

A Primer for Neural Arithmetic Logic Modules

Neural Arithmetic Logic Modules have become a growing area of interest, though remain a niche field. These modules are neural networks which aim to achieve systematic generalisation in learning arithmetic and/or logic operations such as…

Neural and Evolutionary Computing · Computer Science 2022-08-09 Bhumika Mistry, Katayoun Farrahi, Jonathon Hare

Systematically designing better instance counting models on cell images with Neural Arithmetic Logic Units

The main problem for neural network models trained to count instances is that generalization error increases whenever the test range exceeds the training range, i.e., they do not generalize well outside the training range. Consider the case…

Machine Learning · Computer Science 2020-06-16 Ashish Rana, Taranveer Singh, Harpreet Singh, Neeraj Kumar, Prashant Singh Rana

Neural Arithmetic Units

Neural networks can approximate complex functions, but they struggle to perform exact arithmetic operations over real numbers. The lack of inductive bias for arithmetic operations leaves neural networks without the underlying logic…

Neural and Evolutionary Computing · Computer Science 2020-01-16 Andreas Madsen, Alexander Rosenberg Johansen

SELU: A Software Engineering Language Understanding Benchmark

Large Language Models (LLMs) have demonstrated remarkable capabilities in code understanding and generation. However, their effectiveness on non-code Software Engineering (SE) tasks remains underexplored. We present 'Software Engineering…

Software Engineering · Computer Science 2026-02-12 Fabian C. Peña, Steffen Herbold

Neural Power Units

Conventional Neural Networks can approximate simple arithmetic operations, but fail to generalize beyond the range of numbers that were seen during training. Neural Arithmetic Units aim to overcome this difficulty, but current arithmetic…

Machine Learning · Computer Science 2020-12-18 Niklas Heim, Tomáš Pevný, Václav Šmídl

A Methodology for Automatic Selection of Activation Functions to Design Hybrid Deep Neural Networks

Activation functions influence behavior and performance of DNNs. Nonlinear activation functions, like Rectified Linear Units (ReLU), Exponential Linear Units (ELU) and Scaled Exponential Linear Units (SELU), outperform the linear…

Neural and Evolutionary Computing · Computer Science 2019-02-05 Alberto Marchisio, Muhammad Abdullah Hanif, Semeen Rehman, Maurizio Martina, Muhammad Shafique

GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding

For natural language understanding (NLU) technology to be maximally useful, both practically and as a scientific object of study, it must be general: it must be able to process language in a way that is not exclusively tailored to any one…

Computation and Language · Computer Science 2019-02-26 Alex Wang, Amanpreet Singh, Julian Michael, Felix Hill, Omer Levy, Samuel R. Bowman

GELU Activation Function in Deep Learning: A Comprehensive Mathematical Analysis and Performance

Selecting the most suitable activation function is a critical factor in the effectiveness of deep learning models, as it influences their learning capacity, stability, and computational efficiency. In recent years, the Gaussian Error Linear…

Machine Learning · Computer Science 2023-08-02 Minhyeok Lee

Moderate Adaptive Linear Units (MoLU)

We propose the Moderate Adaptive Linear Unit (MoLU), a novel activation function for deep neural networks, defined analytically as: f(x) = x × (1 + tanh(x))/2. MoLU combines mathematical elegance with empirical effectiveness, exhibiting…

Machine Learning · Computer Science 2025-07-16 Hankyul Koh, Joon-hyuk Ko, Wonho Jhe

The Domain Mixed Unit: A New Neural Arithmetic Layer

The Domain Mixed Unit (DMU) is a new neural arithmetic unit that learns a single parameter gate that mixes between log-space and linear-space representations while performing either addition (DMU add) or subtraction (DMU sub). Two…

Machine Learning · Computer Science 2025-09-16 Paul Curry

Recognizing and Verifying Mathematical Equations using Multiplicative Differential Neural Units

Automated mathematical reasoning is a challenging problem that requires an agent to learn algebraic patterns that contain long-range dependencies. Two particular tasks that test this type of reasoning are (1) mathematical equation…

Machine Learning · Computer Science 2021-04-08 Ankur Mali, Alexander Ororbia, Daniel Kifer, C. Lee Giles

NumGLUE: A Suite of Fundamental yet Challenging Mathematical Reasoning Tasks

Given the ubiquitous nature of numbers in text, reasoning with numbers to perform simple calculations is an important skill of AI systems. While many datasets and models have been developed to this end, state-of-the-art AI systems are…

Computation and Language · Computer Science 2022-04-13 Swaroop Mishra, Arindam Mitra, Neeraj Varshney, Bhavdeep Sachdeva, Peter Clark, Chitta Baral, Ashwin Kalyan

Improving Deep Neural Network with Multiple Parametric Exponential Linear Units

Activation function is crucial to the recent successes of deep neural networks. In this paper, we first propose a new activation function, Multiple Parametric Exponential Linear Units (MPELU), aiming to generalize and unify the rectified…

Computer Vision and Pattern Recognition · Computer Science 2017-01-18 Yang Li, Chunxiao Fan, Yong Li, Qiong Wu, Yue Ming

Decoupling Gating from Linearity

ReLU neural-networks have been in the focus of many recent theoretical works, trying to explain their empirical success. Nonetheless, there is still a gap between current theoretical results and empirical observations, even in the case of…

Machine Learning · Computer Science 2019-06-13 Jonathan Fiat, Eran Malach, Shai Shalev-Shwartz

Improving the Robustness of Neural Multiplication Units with Reversible Stochasticity

Multilayer Perceptrons struggle to learn certain simple arithmetic tasks. Specialist neural modules for arithmetic can outperform classical architectures with gains in extrapolation, interpretability and convergence speeds, but are highly…

Machine Learning · Computer Science 2022-11-11 Bhumika Mistry, Katayoun Farrahi, Jonathon Hare

Math Neurosurgery: Isolating Language Models' Math Reasoning Abilities Using Only Forward Passes

Math reasoning is an active area of Large Language Model (LLM) research because it is a hallmark of artificial intelligence and has implications in several domains, including math education. However, few works have explored how math…

Computation and Language · Computer Science 2025-06-19 Bryan R. Christ, Zack Gottesman, Jonathan Kropko, Thomas Hartvigsen

Investigating Meta-Learning Algorithms for Low-Resource Natural Language Understanding Tasks

Learning general representations of text is a fundamental problem for many natural language understanding (NLU) tasks. Previously, researchers have proposed to use language model pre-training and multi-task learning to learn robust…

Computation and Language · Computer Science 2019-08-29 Zi-Yi Dou, Keyi Yu, Antonios Anastasopoulos

Towards More Robust Natural Language Understanding

Natural Language Understanding (NLU) is a branch of Natural Language Processing (NLP) that uses intelligent computer software to understand texts that encode human knowledge. Recent years have witnessed notable progress across various NLU…

Computation and Language · Computer Science 2022-03-01 Xinliang Frederick Zhang