Related papers: Modeling Programs Hierarchically with Stack-Augmen…

Hierarchical Representation in Neural Language Models: Suppression and Recovery of Expectations

Deep learning sequence models have led to a marked increase in performance for a range of Natural Language Processing tasks, but it remains an open question whether they are able to induce proper hierarchical generalizations for…

Computation and Language · Computer Science 2019-06-11 Ethan Wilcox, Roger Levy, Richard Futrell

Revisiting the Hierarchical Multiscale LSTM

Hierarchical Multiscale LSTM (Chung et al., 2016a) is a state-of-the-art language model that learns interpretable structure from character-level input. Such models can provide fertile ground for (cognitive) computational linguistics…

Computation and Language · Computer Science 2018-07-11 Ákos Kádár, Marc-Alexandre Côté, Grzegorz Chrupała, Afra Alishahi

Multi-Model Probabilistic Programming

Probabilistic programming makes it easy to represent a probabilistic model as a program. Building an individual model, however, is only one step of probabilistic modeling. The broader challenge of probabilistic modeling is in understanding…

Programming Languages · Computer Science 2022-08-15 Ryan Bernstein

Bearing Syntactic Fruit with Stack-Augmented Neural Networks

Any finite set of training data is consistent with an infinite number of hypothetical algorithms that could have generated it. Studies have shown that when human children learn language, they consistently favor hypotheses based on…

Computation and Language · Computer Science 2025-11-06 Brian DuSell, Ryan Cotterell

Improving the Robustness to Data Inconsistency between Training and Testing for Code Completion by Hierarchical Language Model

In the field of software engineering, applying language models to the token sequence of source code is the state-of-the-art approach to build a code recommendation system. The syntax tree of source code has hierarchical structures. Ignoring the…

Software Engineering · Computer Science 2022-11-29 Yixiao Yang

Transition-Based Dependency Parsing with Stack Long Short-Term Memory

We propose a technique for learning representations of parser states in transition-based dependency parsers. Our primary innovation is a new control structure for sequence-to-sequence neural networks---the stack LSTM. Like the conventional…

Computation and Language · Computer Science 2015-06-01 Chris Dyer, Miguel Ballesteros, Wang Ling, Austin Matthews, Noah A. Smith

Cell-aware Stacked LSTMs for Modeling Sentences

We propose a method of stacking multiple long short-term memory (LSTM) layers for modeling sentences. In contrast to the conventional stacked LSTMs where only hidden states are fed as input to the next layer, the suggested architecture…

Computation and Language · Computer Science 2019-11-04 Jihun Choi, Taeuk Kim, Sang-goo Lee

Stacking Small Language Models for Generalizability

Recent advances show that large language models (LLMs) generalize strong performance across different natural language benchmarks. However, the large size of LLMs makes training and inference expensive and impractical to run in…

Computation and Language · Computer Science 2024-10-22 Laurence Liang

A Statistical Framework for Model Selection in LSTM Networks

Long Short-Term Memory (LSTM) neural network models have become the cornerstone for sequential data modeling in numerous applications, ranging from natural language processing to time series forecasting. Despite their success, the problem…

Machine Learning · Statistics 2026-05-26 Fahad Mostafa

Exploiting Syntactic Structure for Natural Language Modeling

The thesis presents an attempt at using the syntactic structure in natural language for improved language models for speech recognition. The structured language model merges techniques in automatic parsing and language modeling using an…

Computation and Language · Computer Science 2007-05-23 Ciprian Chelba

LSTM Neural Reordering Feature for Statistical Machine Translation

Artificial neural networks are powerful models, which have been widely applied into many aspects of machine translation, such as language modeling and translation modeling. Though notable improvements have been made in these areas, the…

Computation and Language · Computer Science 2017-09-25 Yiming Cui, Shijin Wang, Jianfeng Li

Long-Short Range Context Neural Networks for Language Modeling

The goal of language modeling techniques is to capture the statistical and structural properties of natural languages from training corpora. This task typically involves the learning of short range dependencies, which generally model the…

Computation and Language · Computer Science 2017-08-23 Youssef Oualil, Mittul Singh, Clayton Greenberg, Dietrich Klakow

Overestimation of Syntactic Representation in Neural Language Models

With the advent of powerful neural language models over the last few years, research attention has increasingly focused on what aspects of language they represent that make them so successful. Several testing methodologies have been…

Computation and Language · Computer Science 2023-05-26 Jordan Kodner, Nitish Gupta

Autonomous Structural Memory Manipulation for Large Language Models Using Hierarchical Embedding Augmentation

Transformative innovations in model architectures have introduced hierarchical embedding augmentation as a means to redefine the representation of tokens through multi-level semantic structures, offering enhanced adaptability to complex…

Computation and Language · Computer Science 2025-08-11 Derek Yotheringhay, Alistair Kirkland, Humphrey Kirkbride, Josiah Whitesteeple

Multi-cell LSTM Based Neural Language Model

Language models, being at the heart of many NLP problems, are always of great interest to researchers. Neural language models come with the advantage of distributed representations and long range contexts. With its particular dynamics that…

Neural and Evolutionary Computing · Computer Science 2018-11-19 Thomas Cherian, Akshay Badola, Vineet Padmanabhan

Improve Language Modelling for Code Completion through Statement Level Language Model based on Statement Embedding Generated by BiLSTM

Language models such as RNN, LSTM or other variants have been widely used as generative models in natural language processing. In the last few years, taking source code as natural languages, parsing source code into a token sequence and using a…

Software Engineering · Computer Science 2019-10-28 Yixiao Yang

HPC-Coder: Modeling Parallel Programs using Large Language Models

Parallel programs in high performance computing (HPC) continue to grow in complexity and scale in the exascale era. The diversity in hardware and parallel programming models make developing, optimizing, and maintaining parallel software…

Distributed, Parallel, and Cluster Computing · Computer Science 2024-05-15 Daniel Nichols, Aniruddha Marathe, Harshitha Menon, Todd Gamblin, Abhinav Bhatele

Enhanced LSTM for Natural Language Inference

Reasoning and inference are central to human and artificial intelligence. Modeling inference in human language is very challenging. With the availability of large annotated data (Bowman et al., 2015), it has recently become feasible to…

Computation and Language · Computer Science 2020-03-04 Qian Chen, Xiaodan Zhu, Zhenhua Ling, Si Wei, Hui Jiang, Diana Inkpen

Language Modeling through Long Term Memory Network

Recurrent Neural Networks (RNN), Long Short-Term Memory Networks (LSTM), and Memory Networks which contain memory are popularly used to learn patterns in sequential data. Sequential data has long sequences that hold relationships. RNN can…

Computation and Language · Computer Science 2019-04-22 Anupiya Nugaliyadde, Kok Wai Wong, Ferdous Sohel, Hong Xie

Recurrent Neural Networks with Mixed Hierarchical Structures for Natural Language Processing

Hierarchical structures exist in both linguistics and Natural Language Processing (NLP) tasks. How to design RNNs to learn hierarchical representations of natural languages remains a long-standing challenge. In this paper, we define two…

Computation and Language · Computer Science 2021-06-07 Zhaoxin Luo, Michael Zhu