Related papers: Behavior Gated Language Models

Language Modeling with Gated Convolutional Networks

The predominant approach to language modeling to date is based on recurrent neural networks. Their success on this task is often linked to their ability to capture unbounded context. In this paper we develop a finite context approach…

Computation and Language · Computer Science 2017-09-12 Yann N. Dauphin, Angela Fan, Michael Auli, David Grangier

When Does Metadata Conditioning (NOT) Work for Language Model Pre-Training? A Study with Context-Free Grammars

The ability to acquire latent semantics is one of the key properties that determines the performance of language models. One convenient approach to invoke this ability is to prepend metadata (e.g. URLs, domains, and styles) at the beginning…

Computation and Language · Computer Science 2025-07-29 Rei Higuchi, Ryotaro Kawata, Naoki Nishikawa, Kazusato Oko, Shoichiro Yamaguchi, Sosuke Kobayashi, Seiya Tokui, Kohei Hayashi, Daisuke Okanohara, Taiji Suzuki

Modeling Pathology-Like Behavioral Patterns in Language Models Through Behavioral Fine-Tuning

Large language models are increasingly used as computational tools for modeling human-like behavior. We introduce a behavioral induction framework that modifies model policies through fine-tuning on structured decision-making tasks: using…

Computation and Language · Computer Science 2026-05-22 Nicola Milano, Davide Marocco

Language Models as Models of Language

This chapter critically examines the potential contributions of modern language models to theoretical linguistics. Despite their focus on engineering goals, these models' ability to acquire sophisticated linguistic knowledge from mere…

Computation and Language · Computer Science 2024-08-15 Raphaël Millière

Neural Conversation Models and How to Rein Them in: A Survey of Failures and Fixes

Recent conditional language models are able to continue any kind of text source in an often seemingly fluent way. This fact encouraged research in the area of open-domain conversational systems that are based on powerful language models and…

Computation and Language · Computer Science 2023-08-14 Fabian Galetzka, Anne Beyer, David Schlangen

How Context Affects Language Models' Factual Predictions

When pre-trained on large unsupervised textual corpora, language models are able to store and retrieve factual knowledge to some extent, making it possible to use them directly for zero-shot cloze-style question answering. However, storing…

Computation and Language · Computer Science 2020-05-12 Fabio Petroni, Patrick Lewis, Aleksandra Piktus, Tim Rocktäschel, Yuxiang Wu, Alexander H. Miller, Sebastian Riedel

Speaker Information Can Guide Models to Better Inductive Biases: A Case Study On Predicting Code-Switching

Natural language processing (NLP) models trained on people-generated data can be unreliable because, without any constraints, they can learn from spurious correlations that are not relevant to the task. We hypothesize that enriching models…

Computation and Language · Computer Science 2022-03-18 Alissa Ostapenko, Shuly Wintner, Melinda Fricke, Yulia Tsvetkov

Fortunately, Discourse Markers Can Enhance Language Models for Sentiment Analysis

In recent years, pretrained language models have revolutionized the NLP world, while achieving state-of-the-art performance in various downstream tasks. However, in many cases, these models do not perform well when labeled data is scarce…

Computation and Language · Computer Science 2022-04-06 Liat Ein-Dor, Ilya Shnayderman, Artem Spector, Lena Dankin, Ranit Aharonov, Noam Slonim

What Context Features Can Transformer Language Models Use?

Transformer-based language models benefit from conditioning on contexts of hundreds to thousands of previous tokens. What aspects of these contexts contribute to accurate model prediction? We describe a series of experiments that measure…

Computation and Language · Computer Science 2021-06-17 Joe O'Connor, Jacob Andreas

A Short Survey of Pre-trained Language Models for Conversational AI-A NewAge in NLP

Building a dialogue system that can communicate naturally with humans is a challenging yet interesting problem of agent-based computing. The rapid growth in this area is usually hindered by the long-standing problem of data scarcity as…

Computation and Language · Computer Science 2021-04-23 Munazza Zaib, Quan Z. Sheng, Wei Emma Zhang

Disaggregation Reveals Hidden Training Dynamics: The Case of Agreement Attraction

Language models generally produce grammatical text, but they are more likely to make errors in certain contexts. Drawing on paradigms from psycholinguistics, we carry out a fine-grained analysis of those errors in different syntactic…

Computation and Language · Computer Science 2025-10-30 James A. Michaelov, Catherine Arnett

Dialog Context Language Modeling with Recurrent Neural Networks

In this work, we propose contextual language models that incorporate dialog level discourse information into language modeling. Previous work on contextual language models treats preceding utterances as a sequence of inputs, without…

Computation and Language · Computer Science 2017-01-17 Bing Liu, Ian Lane

Exploring BERT's Sensitivity to Lexical Cues using Tests from Semantic Priming

Models trained to estimate word probabilities in context have become ubiquitous in natural language processing. How do these models use lexical cues in context to inform their word probabilities? To answer this question, we present a case…

Computation and Language · Computer Science 2021-04-23 Kanishka Misra, Allyson Ettinger, Julia Taylor Rayz

Enhancing Target-Guided Proactive Dialogue Systems via Conversational Scenario Modeling and Intent-Keyword Bridging

A target-guided proactive dialogue system aims to steer conversations proactively toward pre-defined targets, such as designated keywords or specific topics. During guided conversations, dynamically modeling conversational scenarios and…

Computation and Language · Computer Science 2026-05-13 Maodong Li, Yancui Li, Fang Kong

Promoting Open-domain Dialogue Generation through Learning Pattern Information between Contexts and Responses

Recently, utilizing deep neural networks to build open-domain dialogue models has become a hot topic. However, the responses generated by these models suffer from many problems such as responses not being contextualized and tend to…

Computation and Language · Computer Science 2023-09-07 Mengjuan Liu, Chenyang Liu, Yunfan Yang, Jiang Liu, Mohan Jing

Tying Word Vectors and Word Classifiers: A Loss Framework for Language Modeling

Recurrent neural networks have been very successful at predicting sequences of words in tasks such as language modeling. However, all such models are based on the conventional classification framework, where the model is trained against…

Machine Learning · Computer Science 2017-03-14 Hakan Inan, Khashayar Khosravi, Richard Socher

Lost in the Middle: How Language Models Use Long Contexts

While recent language models have the ability to take long contexts as input, relatively little is known about how well they use longer context. We analyze the performance of language models on two tasks that require identifying relevant…

Computation and Language · Computer Science 2023-11-22 Nelson F. Liu, Kevin Lin, John Hewitt, Ashwin Paranjape, Michele Bevilacqua, Fabio Petroni, Percy Liang

Context Matters in Semantically Controlled Language Generation for Task-oriented Dialogue Systems

This work combines information about the dialogue history encoded by a pre-trained model with a meaning representation of the current system utterance to realize contextual language generation in task-oriented dialogues. We utilize the…

Computation and Language · Computer Science 2021-11-30 Ye Liu, Wolfgang Maier, Wolfgang Minker, Stefan Ultes

Emotional Neural Language Generation Grounded in Situational Contexts

Emotional language generation is one of the keys to human-like artificial intelligence. Humans use different types of emotions depending on the situation of the conversation. Emotions also play an important role in mediating the engagement…

Computation and Language · Computer Science 2019-11-27 Sashank Santhanam, Samira Shaikh

Language Model Priming for Cross-Lingual Event Extraction

We present a novel, language-agnostic approach to "priming" language models for the task of event extraction, providing particularly effective performance in low-resource and zero-shot cross-lingual settings. With priming, we augment the…

Computation and Language · Computer Science 2021-09-28 Steven Fincke, Shantanu Agarwal, Scott Miller, Elizabeth Boschee