Related papers: Learning Associative Inference Using Fast Weight M…

200 papers

Associative memory using fast weights is a short-term memory mechanism that substantially improves the memory capacity and time scale of recurrent neural networks (RNNs). As recent studies introduced fast weights only to regular RNNs, it is…

Neural and Evolutionary Computing · Computer Science 2018-04-19 T. Anderson Keller, Sharath Nittur Sridhar, Xin Wang

Large language models (LLMs) excel in generating coherent text, but they often struggle with context awareness, leading to inaccuracies in tasks requiring faithful adherence to provided information. We introduce FastMem, a novel method…

Computation and Language · Computer Science 2024-10-08 Junyi Zhu, Shuochen Liu, Yu Yu, Bo Tang, Yibo Yan, Zhiyu Li, Feiyu Xiong, Tong Xu, Matthew B. Blaschko

Dynamic evaluation of language models (LMs) adapts model parameters at test time using gradient information from previous tokens and substantially improves LM performance. However, it requires over 3x more compute than standard inference.…

Computation and Language · Computer Science 2022-12-06 Kevin Clark, Kelvin Guu, Ming-Wei Chang, Panupong Pasupat, Geoffrey Hinton, Mohammad Norouzi

Relation extraction has been widely studied to extract new relational facts from open corpora. Previous relation extraction methods are faced with the problem of wrong labels and noisy data, which substantially decrease the performance of…

Information Retrieval · Computer Science 2018-05-01 Dongdong Yang, Senzhang Wang, Zhoujun Li

Large language models (LLMs) are trained for downstream tasks by updating their parameters (e.g., via RL). However, updating parameters forces them to absorb task-specific information, which can result in catastrophic forgetting and loss of…

Artificial neural networks are powerful models that have been widely applied to many aspects of machine translation, such as language modeling and translation modeling. Though notable improvements have been made in these areas, the…

Computation and Language · Computer Science 2017-09-25 Yiming Cui, Shijin Wang, Jianfeng Li

We show the formal equivalence of linearised self-attention mechanisms and fast weight controllers from the early '90s, where a "slow" neural net learns by gradient descent to program the "fast weights" of another net through sequences of…

Machine Learning · Computer Science 2021-06-10 Imanol Schlag, Kazuki Irie, Jürgen Schmidhuber

Large language models (LLMs) are central to modern natural language processing, delivering exceptional performance in various tasks. However, their substantial computational and memory requirements present challenges, especially for devices…

Although masked language models are highly performant and widely adopted by NLP practitioners, they cannot be easily used for autoregressive language modelling (next word prediction and sequence probability estimation). We present an…

Computation and Language · Computer Science 2022-08-08 Vilém Zouhar, Marius Mosbach, Dietrich Klakow

Large Language Models (LLMs) have the capacity to store and recall facts. Through experimentation with open-source models, we observe that this ability to retrieve facts can be easily manipulated by changing contexts, even without altering…

Computation and Language · Computer Science 2024-12-02 Yibo Jiang, Goutham Rajendran, Pradeep Ravikumar, Bryon Aragam

We investigate a new method to augment recurrent neural networks with extra memory without increasing the number of network parameters. The system has an associative memory based on complex-valued vectors and is closely related to…

Neural and Evolutionary Computing · Computer Science 2016-05-20 Ivo Danihelka, Greg Wayne, Benigno Uria, Nal Kalchbrenner, Alex Graves

Reasoning and inference are central to human and artificial intelligence. Modeling inference in human language is very challenging. With the availability of large annotated data (Bowman et al., 2015), it has recently become feasible to…

Computation and Language · Computer Science 2020-03-04 Qian Chen, Xiaodan Zhu, Zhenhua Ling, Si Wei, Hui Jiang, Diana Inkpen

Recent advances in artificial neural networks for machine learning, and language modeling in particular, have established a family of recurrent neural network (RNN) architectures that, unlike conventional RNNs with vector-form hidden…

Machine Learning · Computer Science 2026-03-19 Kazuki Irie, Samuel J. Gershman

Long short-term memory (LSTM) is normally used in recurrent neural networks (RNNs) as the basic recurrent unit. However, conventional LSTM assumes that the state at the current time step depends on the previous time step. This assumption constrains the…

Machine Learning · Computer Science 2017-11-01 Fei Tao, Gang Liu

Although LLM agents can leverage tools for complex tasks, they still need memory to maintain cross-turn consistency and accumulate reusable information in long-horizon interactions. However, retrieval-based external memory systems incur low…

Artificial Intelligence · Computer Science 2026-04-23 Jiaquan Zhang, Chaoning Zhang, Shuxu Chen, Zhenzhen Huang, Pengcheng Zheng, Zhicheng Wang, Ping Guo, Fan Mo, Sung-Ho Bae, Jie Zou, Jiwei Wei, Yang Yang

Associative learning--forming links between co-occurring items--is fundamental to human cognition, reshaping internal representations in complex ways. Testing hypotheses on how representational changes occur in biological systems is…

Machine Learning · Computer Science 2025-10-27 Camila Kolling, Vy Ai Vo, Mariya Toneva

Natural language inference (NLI) is a fundamentally important task in natural language processing that has many applications. The recently released Stanford Natural Language Inference (SNLI) corpus has made it possible to develop and…

Computation and Language · Computer Science 2016-11-11 Shuohang Wang, Jing Jiang

Short-term memory in standard, general-purpose, sequence-processing recurrent neural networks (RNNs) is stored as activations of nodes or "neurons." Generalising feedforward NNs to such RNNs is mathematically straightforward and natural,…

Neural and Evolutionary Computing · Computer Science 2022-11-18 Kazuki Irie, Jürgen Schmidhuber

Factorization Machines (FMs) are a supervised learning approach that enhances the linear regression model by incorporating the second-order feature interactions. Despite effectiveness, FM can be hindered by its modelling of all feature…

Machine Learning · Computer Science 2017-08-17 Jun Xiao, Hao Ye, Xiangnan He, Hanwang Zhang, Fei Wu, Tat-Seng Chua

Typical neural networks with external memory do not effectively separate capacity for episodic and working memory as is required for reasoning in humans. Applying knowledge gained from psychological studies, we designed a new model called…

Machine Learning · Computer Science 2018-10-01 T. S. Jayram, Younes Bouhadjar, Ryan L. McAvoy, Tomasz Kornuta, Alexis Asseman, Kamil Rocki, Ahmet S. Ozcan