Related papers: Parallelizable Neural Turing Machines
Neural Turing Machines (NTMs) are an instance of Memory Augmented Neural Networks, a new class of recurrent neural networks which decouple computation from memory by introducing an external memory unit. NTMs have demonstrated superior…
Recurrent neural networks (RNNs) have shown outstanding performance on processing sequence data. However, they suffer from long training time, which demands parallel implementations of the training procedure. Parallelization of the training…
Neural Turing Machines (NTM) contain memory component that simulates "working memory" in the brain to store and retrieve information to ease simple algorithms learning. So far, only linearly organized memory is proposed, and during…
Learning an algorithm from examples is a fundamental problem that has been widely studied. Recently it has been addressed using neural networks, in particular by Neural Turing Machines (NTMs). These are fully differentiable computers that…
Parallel parameterized complexity theory studies how fixed-parameter tractable (fpt) problems can be solved in parallel. Previous theoretical work focused on parallel algorithms that are very fast in principle, but did not take into account…
We introduce a neural stack architecture, including a differentiable parametrized stack operator that approximates stack push and pop operations for suitable choices of parameters that explicitly represents a stack. We prove the stability…
A new neural network architecture (PSCNN) is developed to improve performance and speed of such networks. The architecture has all the advantages of the previous models such as self-organization and possesses some other superior…
Recurrent neural networks (RNNs) have represented for years the state of the art in neural machine translation. Recently, new architectures have been proposed, which can leverage parallel computation on GPUs better than classical RNNs.…
We present a novel parallelisation scheme that simplifies the adaptation of learning algorithms to growing amounts of data as well as growing needs for accurate and confident predictions in critical applications. In contrast to other…
We extend the capabilities of neural networks by coupling them to external memory resources, which they can interact with by attentional processes. The combined system is analogous to a Turing Machine or Von Neumann architecture but is…
One of the main problems encountered so far with recurrent neural networks is that they struggle to retain long-time information dependencies in their recurrent connections. Neural Turing Machines (NTMs) attempt to mitigate this issue by…
Multiple extensions of Recurrent Neural Networks (RNNs) have been proposed recently to address the difficulty of storing information over long time periods. In this paper, we experiment with the capacity of Neural Turing Machines (NTMs) to…
We introduce the Neural Field Turing Machine (NFTM), a differentiable architecture that unifies symbolic computation, physical simulation, and perceptual inference within continuous spatial fields. NFTM combines a neural controller,…
Machine learning potentials have achieved great success in accelerating atomistic simulations. Many of them relying on atom-centered local descriptors are natural for parallelization. More recent message passing neural network (MPNN) models…
Recently, a new recurrent neural network (RNN) named the Legendre Memory Unit (LMU) was proposed and shown to achieve state-of-the-art performance on several benchmark datasets. Here we leverage the linear time-invariant (LTI) memory…
Transformers have become the dominant architecture for sequence modeling by using self-attention to enable expressive and highly parallel processing. However, the resulting quadratic time and memory costs limit efficiency in long-context…
Over the past few years, self-attention is shining in the field of deep learning, especially in the domain of natural language processing(NLP). Its impressive effectiveness, along with ubiquitous implementations, have aroused our interest…
Recurrent neural networks (RNN) have been successfully applied to various sequential decision-making tasks, natural language processing applications, and time-series predictions. Such networks are usually trained through back-propagation…
The architecture of neural Turing machines is differentiable end to end and is trainable with gradient descent methods. Due to their large unfolded depth Neural Turing Machines are hard to train and because of their linear access of…
Recurrent neural networks (RNNs) with deep test-time memorization modules, such as Titans and TTT, represent a promising, linearly-scaling paradigm distinct from Transformers. While these expressive models do not yet match the peak…