Related papers: Reference-Centric Models for Grounded Collaborativ…

Frame of Reference: Addressing the Challenges of Common Ground Representation in Situational Dialogs

Common ground plays a critical role in situated spoken dialogs, where interlocutors must establish and maintain shared references to entities, events, and relations to sustain coherent interaction in a shared space and over time. With the…

Computation and Language · Computer Science · 2026-04-08 · Biswesh Mohapatra, Théo Charlot, Giovanni Duca, Mayank Palan, Laurent Romary, Justine Cassell

Learning to Speak and Act in a Fantasy Text Adventure Game

We introduce a large scale crowdsourced text adventure game as a research platform for studying grounded dialogue. In it, agents can perceive, emote, and act whilst conducting dialogue with other agents. Models and humans can both act as…

Computation and Language · Computer Science · 2019-03-08 · Jack Urbanek, Angela Fan, Siddharth Karamcheti, Saachi Jain, Samuel Humeau, Emily Dinan, Tim Rocktäschel, Douwe Kiela, Arthur Szlam, Jason Weston

Grounding Language in Multi-Perspective Referential Communication

We introduce a task and dataset for referring expression generation and comprehension in multi-agent embodied environments. In this task, two agents in a shared scene must take into account one another's visual perspective, which may be…

Computation and Language · Computer Science · 2024-10-08 · Zineng Tang, Lingjun Mao, Alane Suhr

You Truly Understand What I Need: Intellectual and Friendly Dialogue Agents grounding Knowledge and Persona

To build a conversational agent that interacts fluently with humans, previous studies blend knowledge or personal profile into the pre-trained language model. However, the model that considers knowledge and persona at the same time is still…

Computation and Language · Computer Science · 2023-01-09 · Jungwoo Lim, Myunghoon Kang, Yuna Hur, Seungwon Jung, Jinsung Kim, Yoonna Jang, Dongyub Lee, Hyesung Ji, Donghoon Shin, Seungryong Kim, Heuiseok Lim

Conversational Grounding: Annotation and Analysis of Grounding Acts and Grounding Units

Successful conversations often rest on common understanding, where all parties are on the same page about the information being shared. This process, known as conversational grounding, is crucial for building trustworthy dialog systems that…

Computation and Language · Computer Science · 2024-03-26 · Biswesh Mohapatra, Seemab Hassan, Laurent Romary, Justine Cassell

Decision-Theoretic Question Generation for Situated Reference Resolution: An Empirical Study and Computational Model

Dialogue agents that interact with humans in situated environments need to manage referential ambiguity across multiple modalities and ask for help as needed. However, it is not clear what kinds of questions such agents should ask nor how…

Computation and Language · Computer Science · 2021-10-14 · Felix Gervits, Gordon Briggs, Antonio Roque, Genki A. Kadomatsu, Dean Thurston, Matthias Scheutz, Matthew Marge

Representation Learning for Grounded Spatial Reasoning

The interpretation of spatial references is highly contextual, requiring joint inference over both language and the environment. We consider the task of spatial reasoning in a simulated environment, where an agent can act and receive…

Computation and Language · Computer Science · 2017-11-15 · Michael Janner, Karthik Narasimhan, Regina Barzilay

Grounded Agreement Games: Emphasizing Conversational Grounding in Visual Dialogue Settings

Where early work on dialogue in Computational Linguistics put much emphasis on dialogue structure and its relation to the mental states of the dialogue participants (e.g., Allen 1979, Grosz & Sidner 1986), current work mostly reduces…

Computation and Language · Computer Science · 2019-08-30 · David Schlangen

Symbolic Planning and Code Generation for Grounded Dialogue

Large language models (LLMs) excel at processing and generating both text and code. However, LLMs have had limited applicability in grounded task-oriented dialogue as they are difficult to steer toward task objectives and fail to handle…

Computation and Language · Computer Science · 2023-10-27 · Justin T. Chiu, Wenting Zhao, Derek Chen, Saujas Vaduguru, Alexander M. Rush, Daniel Fried

MM-Conv: A Multimodal Dataset and Benchmark for Context-Aware Grounding in 3D Dialogue

Grounding language in the physical world requires AI systems to interpret references that emerge dynamically during conversation. While current vision-language models (VLMs) excel at static image tasks, they struggle to resolve ambiguous…

Computer Vision and Pattern Recognition · Computer Science · 2026-05-22 · Anna Deichler, Jim O'Regan, Fethiye Irmak Dogan, Lubos Marcinek, Anna Klezovich, Iolanda Leite, Jonas Beskow

A Linguistic Analysis of Visually Grounded Dialogues Based on Spatial Expressions

Recent models achieve promising results in visually grounded dialogues. However, existing datasets often contain undesirable biases and lack sophisticated linguistic analyses, which make it difficult to understand how well current models…

Computation and Language · Computer Science · 2020-10-08 · Takuma Udagawa, Takato Yamazaki, Akiko Aizawa

A Natural Language Corpus of Common Grounding under Continuous and Partially-Observable Context

Common grounding is the process of creating, repairing and updating mutual understandings, which is a critical aspect of sophisticated human communication. However, traditional dialogue systems have limited capability of establishing common…

Computation and Language · Computer Science · 2019-07-09 · Takuma Udagawa, Akiko Aizawa

Refer, Reuse, Reduce: Generating Subsequent References in Visual and Conversational Contexts

Dialogue participants often refer to entities or situations repeatedly within a conversation, which contributes to its cohesiveness. Subsequent references exploit the common ground accumulated by the interlocutors and hence have several…

Computation and Language · Computer Science · 2020-11-10 · Ece Takmaz, Mario Giulianelli, Sandro Pezzelle, Arabella Sinclair, Raquel Fernández

Collecting Visually-Grounded Dialogue with A Game Of Sorts

An idealized, though simplistic, view of the referring expression production and grounding process in (situated) dialogue assumes that a speaker must merely appropriately specify their expression so that the target referent may be…

Computation and Language · Computer Science · 2023-09-12 · Bram Willemsen, Dmytro Kalpakchi, Gabriel Skantze

Referring to the recently seen: reference and perceptual memory in situated dialog

From theoretical linguistic and cognitive perspectives, situated dialog systems are interesting as they provide ideal test-beds for investigating the interaction between language and perception. At the same time there are a growing number…

Human-Computer Interaction · Computer Science · 2019-03-26 · John D. Kelleher, Simon Dobnik

Key-Value Retrieval Networks for Task-Oriented Dialogue

Neural task-oriented dialogue systems often struggle to smoothly interface with a knowledge base. In this work, we seek to address this problem by proposing a new neural dialogue agent that is able to effectively sustain grounded,…

Computation and Language · Computer Science · 2017-07-17 · Mihail Eric, Christopher D. Manning

Solving Dialogue Grounding Embodied Task in a Simulated Environment using Further Masked Language Modeling

Enhancing AI systems with efficient communication skills that align with human understanding is crucial for their effective assistance to human users. Proactive initiatives from the system side are needed to discern specific circumstances…

Computation and Language · Computer Science · 2023-06-22 · Weijie Jack Zhang

Beyond task success: A closer look at jointly learning to see, ask, and GuessWhat

We propose a grounded dialogue state encoder which addresses a foundational issue on how to integrate visual grounding with dialogue system components. As a test-bed, we focus on the GuessWhat?! game, a two-player game where the goal is to…

Computation and Language · Computer Science · 2019-03-18 · Ravi Shekhar, Aashish Venkatesh, Tim Baumgärtner, Elia Bruni, Barbara Plank, Raffaella Bernardi, Raquel Fernández

RMM: A Recursive Mental Model for Dialog Navigation

Language-guided robots must be able to both ask humans questions and understand answers. Much existing work focuses only on the latter. In this paper, we go beyond instruction following and introduce a two-agent task where one agent…

Computation and Language · Computer Science · 2020-10-07 · Homero Roman Roman, Yonatan Bisk, Jesse Thomason, Asli Celikyilmaz, Jianfeng Gao

Using Machine Mental Imagery for Representing Common Ground in Situated Dialogue

Situated dialogue requires speakers to maintain a reliable representation of shared context rather than reasoning only over isolated utterances. Current conversational agents often struggle with this requirement, especially when the common…

Computation and Language · Computer Science · 2026-04-24 · Biswesh Mohapatra, Giovanni Duca, Laurent Romary, Justine Cassell