Related papers: FastTrees: Parallel Latent Tree-Induction for Fast…

Inducing Constituency Trees through Neural Machine Translation

Latent tree learning (LTL) methods learn to parse sentences using only indirect supervision from a downstream task. Recent advances in latent tree learning have made it possible to recover moderately high quality tree structures by training…

Computation and Language · Computer Science 2019-09-24 Phu Mon Htut, Kyunghyun Cho, Samuel R. Bowman

The Importance of the Current Input in Sequence Modeling

The latest advances in sequence modeling are mainly based on deep learning approaches. The current state of the art involves the use of variations of the standard LSTM architecture, combined with several tricks that improve the final…

Computation and Language · Computer Science 2021-12-23 Christian Oliva, Luis F. Lago-Fernández

Latent Tree Language Model

In this paper we introduce Latent Tree Language Model (LTLM), a novel approach to language modeling that encodes syntax and semantics of a given sentence as a tree of word roles. The learning phase iteratively updates the trees by moving…

Computation and Language · Computer Science 2016-09-06 Tomas Brychcin

Ordered Neurons: Integrating Tree Structures into Recurrent Neural Networks

Natural language is hierarchically structured: smaller units (e.g., phrases) are nested within larger units (e.g., clauses). When a larger constituent ends, all of the smaller constituents that are nested within it must also be closed.…

Computation and Language · Computer Science 2019-05-09 Yikang Shen, Shawn Tan, Alessandro Sordoni, Aaron Courville

TSLM: Tree-Structured Language Modeling for Divergent Thinking

Language models generate reasoning sequentially, preventing them from decoupling irrelevant exploration paths during search. We introduce Tree-Structured Language Modeling (TSLM), which uses special tokens to encode branching structure,…

Computation and Language · Computer Science 2026-02-02 Doyoung Kim, Jaehyeok Doo, Minjoon Seo

FastMTP: Accelerating LLM Inference with Enhanced Multi-Token Prediction

As large language models (LLMs) become increasingly powerful, the sequential nature of autoregressive generation creates a fundamental throughput bottleneck that limits the practical deployment. While Multi-Token Prediction (MTP) has…

Machine Learning · Computer Science 2025-09-24 Yuxuan Cai, Xiaozhuan Liang, Xinghua Wang, Jin Ma, Haijin Liang, Jinwen Luo, Xinyu Zuo, Lisheng Duan, Yuyang Yin, Xi Chen

From Nodes to Networks: Evolving Recurrent Neural Networks

Gated recurrent networks such as those composed of Long Short-Term Memory (LSTM) nodes have recently been used to improve state of the art in many sequential processing tasks such as speech recognition and machine translation. However, the…

Neural and Evolutionary Computing · Computer Science 2018-06-11 Aditya Rawal, Risto Miikkulainen

Grammar Induction with Neural Language Models: An Unusual Replication

A substantial thread of recent work on latent tree learning has attempted to develop neural network models with parse-valued latent variables and train them on non-parsing tasks, in the hope of having them discover interpretable tree…

Computation and Language · Computer Science 2018-08-31 Phu Mon Htut, Kyunghyun Cho, Samuel R. Bowman

Latent Tree Learning with Ordered Neurons: What Parses Does It Produce?

Recent latent tree learning models can learn constituency parsing without any exposure to human-annotated tree structures. One such model is ON-LSTM (Shen et al., 2019), which is trained on language modelling and has near-state-of-the-art…

Computation and Language · Computer Science 2020-10-13 Yian Zhang

Extending Memory for Language Modelling

Breakthroughs in deep learning and memory networks have made major advances in natural language understanding. Language is sequential and information carried through the sequence can be captured through memory networks. Learning the…

Computation and Language · Computer Science 2023-05-22 Anupiya Nugaliyadde

FastSeq: Make Sequence Generation Faster

Transformer-based models have made tremendous impacts in natural language generation. However, the inference speed is a bottleneck due to large model size and intensive computing involved in auto-regressive decoding process. We develop…

Computation and Language · Computer Science 2021-07-14 Yu Yan, Fei Hu, Jiusheng Chen, Nikhil Bhendawade, Ting Ye, Yeyun Gong, Nan Duan, Desheng Cui, Bingyu Chi, Ruofei Zhang

Convolutional Sequence to Sequence Learning

The prevalent approach to sequence to sequence learning maps an input sequence to a variable length output sequence via recurrent neural networks. We introduce an architecture based entirely on convolutional neural networks. Compared to…

Computation and Language · Computer Science 2017-07-26 Jonas Gehring, Michael Auli, David Grangier, Denis Yarats, Yann N. Dauphin

Enhanced LSTM for Natural Language Inference

Reasoning and inference are central to human and artificial intelligence. Modeling inference in human language is very challenging. With the availability of large annotated data (Bowman et al., 2015), it has recently become feasible to…

Computation and Language · Computer Science 2020-03-04 Qian Chen, Xiaodan Zhu, Zhenhua Ling, Si Wei, Hui Jiang, Diana Inkpen

Top-down Tree Long Short-Term Memory Networks

Long Short-Term Memory (LSTM) networks, a type of recurrent neural network with a more complex computational unit, have been successfully applied to a variety of sequence modeling tasks. In this paper we develop Tree Long Short-Term Memory…

Computation and Language · Computer Science 2016-04-05 Xingxing Zhang, Liang Lu, Mirella Lapata

Phased LSTM: Accelerating Recurrent Network Training for Long or Event-based Sequences

Recurrent Neural Networks (RNNs) have become the state-of-the-art choice for extracting patterns from temporal sequences. However, current RNN models are ill-suited to process irregularly sampled data triggered by events generated in…

Machine Learning · Computer Science 2016-11-01 Daniel Neil, Michael Pfeiffer, Shih-Chii Liu

Jointly Learning Sentence Embeddings and Syntax with Unsupervised Tree-LSTMs

We introduce a neural network that represents sentences by composing their words according to induced binary parse trees. We use Tree-LSTM as our composition function, applied along a tree structure found by a fully differentiable natural…

Computation and Language · Computer Science 2020-01-16 Jean Maillard, Stephen Clark, Dani Yogatama

Improved Semantic Representations From Tree-Structured Long Short-Term Memory Networks

Because of their superior ability to preserve sequence information over time, Long Short-Term Memory (LSTM) networks, a type of recurrent neural network with a more complex computational unit, have obtained strong results on a variety of…

Computation and Language · Computer Science 2015-06-02 Kai Sheng Tai, Richard Socher, Christopher D. Manning

Sequence-to-Sequence Learning with Latent Neural Grammars

Sequence-to-sequence learning with neural networks has become the de facto standard for sequence prediction tasks. This approach typically models the local distribution over the next word with a powerful neural network that can condition on…

Computation and Language · Computer Science 2021-11-17 Yoon Kim

Neural Lattice-to-Sequence Models for Uncertain Inputs

The input to a neural sequence-to-sequence model is often determined by an up-stream system, e.g. a word segmenter, part of speech tagger, or speech recognizer. These up-stream models are potentially error-prone. Representing inputs through…

Computation and Language · Computer Science 2017-07-24 Matthias Sperber, Graham Neubig, Jan Niehues, Alex Waibel

Long Short-Term Memory Over Tree Structures

The chain-structured long short-term memory (LSTM) has been shown to be effective in a wide range of problems such as speech recognition and machine translation. In this paper, we propose to extend it to tree structures, in which a memory cell…

Computation and Language · Computer Science 2015-03-18 Xiaodan Zhu, Parinaz Sobhani, Hongyu Guo