Related papers: Pointer Networks

Describing Multimedia Content using Attention-based Encoder-Decoder Networks

Whereas deep neural networks were first mostly used for classification tasks, they are rapidly expanding in the realm of structured output problems, where the observed target is composed of multiple random variables that have a rich joint…

Neural and Evolutionary Computing · Computer Science 2016-11-15 Kyunghyun Cho, Aaron Courville, Yoshua Bengio

End-to-End Neural Sentence Ordering Using Pointer Network

Sentence ordering is one of important tasks in NLP. Previous works mainly focused on improving its performance by using pair-wise strategy. However, it is nontrivial for pair-wise models to incorporate the contextual sentence information.…

Computation and Language · Computer Science 2016-11-28 Jingjing Gong, Xinchi Chen, Xipeng Qiu, Xuanjing Huang

Pointer Networks Trained Better via Evolutionary Algorithms

Pointer Network (PtrNet) is a specific neural network for solving Combinatorial Optimization Problems (COPs). While PtrNets offer real-time feed-forward inference for complex COPs instances, its quality of the results tends to be less…

Neural and Evolutionary Computing · Computer Science 2024-03-12 Muyao Zhong, Shengcai Liu, Bingdong Li, Haobo Fu, Ke Tang, Peng Yang

Variational Structured Attention Networks for Deep Visual Representation Learning

Convolutional neural networks have enabled major progresses in addressing pixel-level prediction tasks such as semantic segmentation, depth estimation, surface normal prediction and so on, benefiting from their powerful capabilities in…

Computer Vision and Pattern Recognition · Computer Science 2021-12-16 Guanglei Yang, Paolo Rota, Xavier Alameda-Pineda, Dan Xu, Mingli Ding, Elisa Ricci

Neural Random-Access Machines

In this paper, we propose and investigate a new neural network architecture called Neural Random Access Machine. It can manipulate and dereference pointers to an external variable-size random-access memory. The model is trained from pure…

Machine Learning · Computer Science 2016-02-11 Karol Kurach, Marcin Andrychowicz, Ilya Sutskever

A Convolutional Attention Network for Extreme Summarization of Source Code

Attention mechanisms in neural networks have proved useful for problems in which the input and output do not have fixed dimension. Often there exist features that are locally translation invariant and would be valuable for directing the…

Machine Learning · Computer Science 2016-05-26 Miltiadis Allamanis, Hao Peng, Charles Sutton

Pointer Networks with Q-Learning for Combinatorial Optimization

We introduce the Pointer Q-Network (PQN), a hybrid neural architecture that integrates model-free Q-value policy approximation with Pointer Networks (Ptr-Nets) to enhance the optimality of attention-based sequence generation, focusing on…

Machine Learning · Computer Science 2024-10-25 Alessandro Barro

Latent Alignment and Variational Attention

Neural attention has become central to many state-of-the-art models in natural language processing and related domains. Attention networks are an easy-to-train and effective method for softly simulating alignment; however, the approach does…

Machine Learning · Statistics 2018-11-09 Yuntian Deng, Yoon Kim, Justin Chiu, Demi Guo, Alexander M. Rush

Pointer: Linear-Complexity Long-Range Modeling without Pre-training

We introduce Pointer, a novel architecture that achieves linear $O(NK)$ complexity for long-range sequence modeling while maintaining superior performance without requiring pre-training. Unlike standard attention mechanisms that compute…

Computation and Language · Computer Science 2025-08-05 Zixi Li

Multitask Pointer Network for Multi-Representational Parsing

We propose a transition-based approach that, by training a single model, can efficiently parse any input sentence with both constituent and dependency trees, supporting both continuous/projective and discontinuous/non-projective syntactic…

Computation and Language · Computer Science 2022-12-26 Daniel Fernández-González, Carlos Gómez-Rodríguez

Pointer Graph Networks

Graph neural networks (GNNs) are typically applied to static graphs that are assumed to be known upfront. This static input structure is often informed purely by insight of the machine learning practitioner, and might not be optimal for the…

Machine Learning · Statistics 2020-10-20 Petar Veličković, Lars Buesing, Matthew C. Overlan, Razvan Pascanu, Oriol Vinyals, Charles Blundell

Self-attention vector output similarities reveal how machines pay attention

The self-attention mechanism has significantly advanced the field of natural language processing, facilitating the development of advanced language-learning machines. Although its utility is widely acknowledged, the precise mechanisms of…

Computation and Language · Computer Science 2026-02-04 Tal Halevi, Yarden Tzach, Ronit D. Gross, Shalom Rosner, Ido Kanter

Survey of reasoning using Neural networks

Reason and inference require process as well as memory skills by humans. Neural networks are able to process tasks like image recognition (better than humans) but in memory aspects are still limited (by attention mechanism, size). Recurrent…

Machine Learning · Computer Science 2017-03-03 Amit Sahu

Provable Benefits of Task-Specific Prompts for In-context Learning

The in-context learning capabilities of modern language models have motivated a deeper mathematical understanding of sequence models. A line of recent work has shown that linear attention models can emulate projected gradient descent…

Computation and Language · Computer Science 2025-03-06 Xiangyu Chang, Yingcong Li, Muti Kara, Samet Oymak, Amit K. Roy-Chowdhury

Pointer Value Retrieval: A new benchmark for understanding the limits of neural network generalization

Central to the success of artificial neural networks is their ability to generalize. But does neural network generalization primarily rely on seeing highly similar training examples (memorization)? Or are neural networks capable of…

Machine Learning · Computer Science 2022-02-22 Chiyuan Zhang, Maithra Raghu, Jon Kleinberg, Samy Bengio

Survey on the attention based RNN model and its applications in computer vision

The recurrent neural networks (RNN) can be used to solve the sequence to sequence problem, where both the input and the output have sequential structures. Usually there are some implicit relations between the structures. However, it is hard…

Computer Vision and Pattern Recognition · Computer Science 2016-01-27 Feng Wang, David M. J. Tax

Learning Geometric Combinatorial Optimization Problems using Self-attention and Domain Knowledge

Combinatorial optimization problems (COPs) are an important research topic in various fields. In recent times, there have been many attempts to solve COPs using deep learning-based approaches. We propose a novel neural network model that…

Computational Geometry · Computer Science 2023-04-17 Jaeseung Lee, Woojin Choi, Jibum Kim

Port-Hamiltonian Approach to Neural Network Training

Neural networks are discrete entities: subdivided into discrete layers and parametrized by weights which are iteratively optimized via difference equations. Recent work proposes networks with layer outputs which are no longer quantized but…

Neural and Evolutionary Computing · Computer Science 2019-09-09 Stefano Massaroli, Michael Poli, Federico Califano, Angela Faragasso, Jinkyoo Park, Atsushi Yamashita, Hajime Asama

Deep Learning as a Mixed Convex-Combinatorial Optimization Problem

As neural networks grow deeper and wider, learning networks with hard-threshold activations is becoming increasingly important, both for network quantization, which can drastically reduce time and energy requirements, and for creating large…

Machine Learning · Computer Science 2018-04-18 Abram L. Friesen, Pedro Domingos

Physics-informed attention-based neural network for solving non-linear partial differential equations

Physics-Informed Neural Networks (PINNs) have enabled significant improvements in modelling physical processes described by partial differential equations (PDEs). PINNs are based on simple architectures, and learn the behavior of complex…

Machine Learning · Computer Science 2021-05-18 Ruben Rodriguez-Torrado, Pablo Ruiz, Luis Cueto-Felgueroso, Michael Cerny Green, Tyler Friesen, Sebastien Matringe, Julian Togelius