Related papers: Topic-based Evaluation for Conversational Bots

Contextual Topic Modeling For Dialog Systems

Accurate prediction of conversation topics can be a valuable signal for creating coherent and engaging dialog systems. In this work, we focus on context-aware topic classification methods for identifying topics in free-form human-chatbot…

Computation and Language · Computer Science 2018-10-22 Chandra Khatri , Rahul Goel , Behnam Hedayatnia , Angeliki Metallinou , Anushree Venkatesh , Raefer Gabriel , Arindam Mandal

On Evaluating and Comparing Open Domain Dialog Systems

Conversational agents are exploding in popularity. However, much work remains in the area of non goal-oriented conversations, despite significant growth in research interest over recent years. To advance the state of the art in…

Computation and Language · Computer Science 2018-12-31 Anu Venkatesh , Chandra Khatri , Ashwin Ram , Fenfei Guo , Raefer Gabriel , Ashish Nagar , Rohit Prasad , Ming Cheng , Behnam Hedayatnia , Angeliki Metallinou , Rahul Goel , Shaohua Yang , Anirudh Raju

Towards Unified Dialogue System Evaluation: A Comprehensive Analysis of Current Evaluation Protocols

As conversational AI-based dialogue management has increasingly become a trending topic, the need for a standardized and reliable evaluation procedure grows even more pressing. The current state of affairs suggests various evaluation…

Computation and Language · Computer Science 2020-06-12 Sarah E. Finch , Jinho D. Choi

Spot The Bot: A Robust and Efficient Framework for the Evaluation of Conversational Dialogue Systems

The lack of time-efficient and reliable evaluation methods hampers the development of conversational dialogue systems (chatbots). Evaluations requiring humans to converse with chatbots are time and cost-intensive, put high cognitive demands…

Artificial Intelligence · Computer Science 2020-10-06 Jan Deriu , Don Tuggener , Pius von Däniken , Jon Ander Campos , Alvaro Rodrigo , Thiziri Belkacem , Aitor Soroa , Eneko Agirre , Mark Cieliebak

MEDAL: A Framework for Benchmarking LLMs as Multilingual Open-Domain Dialogue Evaluators

Evaluating the quality of open-domain chatbots has become increasingly reliant on LLMs acting as automatic judges. However, existing meta-evaluation benchmarks are static, outdated, and lacking in multilingual coverage, limiting their…

Computation and Language · Computer Science 2026-01-23 John Mendonça , Alon Lavie , Isabel Trancoso

Knowledge-Grounded Dialogue Flow Management for Social Robots and Conversational Agents

The article proposes a system for knowledge-based conversation designed for Social Robots and other conversational agents. The proposed system relies on an Ontology for the description of all concepts that may be relevant conversation…

Robotics · Computer Science 2022-08-23 Lucrezia Grassi , Carmine Tommaso Recchiuto , Antonio Sgorbissa

MDD-Eval: Self-Training on Augmented Data for Multi-Domain Dialogue Evaluation

Chatbots are designed to carry out human-like conversations across different domains, such as general chit-chat, knowledge exchange, and persona-grounded conversations. To measure the quality of such conversational agents, a dialogue…

Computation and Language · Computer Science 2022-01-19 Chen Zhang , Luis Fernando D'Haro , Thomas Friedrichs , Haizhou Li

Let's move on: Topic Change in Robot-Facilitated Group Discussions

Robot-moderated group discussions have the potential to facilitate engaging and productive interactions among human participants. Previous work on topic management in conversational agents has predominantly focused on human engagement and…

Robotics · Computer Science 2025-04-04 Georgios Hadjiantonis , Sarah Gillet , Marynel Vázquez , Iolanda Leite , Fethiye Irmak Dogan

Predictive Engagement: An Efficient Metric For Automatic Evaluation of Open-Domain Dialogue Systems

User engagement is a critical metric for evaluating the quality of open-domain dialogue systems. Prior work has focused on conversation-level engagement by using heuristically constructed features such as the number of turns and the total…

Computation and Language · Computer Science 2020-01-27 Sarik Ghazarian , Ralph Weischedel , Aram Galstyan , Nanyun Peng

What Went Wrong? Explaining Overall Dialogue Quality through Utterance-Level Impacts

Improving user experience of a dialogue system often requires intensive developer effort to read conversation logs, run statistical analyses, and intuit the relative importance of system shortcomings. This paper presents a novel approach to…

Computation and Language · Computer Science 2021-11-02 James D. Finch , Sarah E. Finch , Jinho D. Choi

Comprehensive Framework for Evaluating Conversational AI Chatbots

Conversational AI chatbots are transforming industries by streamlining customer service, automating transactions, and enhancing user engagement. However, evaluating these systems remains a challenge, particularly in financial services,…

Computers and Society · Computer Science 2025-02-11 Shailja Gupta , Rajesh Ranjan , Surya Narayan Singh

Advancing the State of the Art in Open Domain Dialog Systems through the Alexa Prize

Building open domain conversational systems that allow users to have engaging conversations on topics of their choice is a challenging task. Alexa Prize was launched in 2016 to tackle the problem of achieving natural, sustained, coherent…

Computation and Language · Computer Science 2018-12-31 Chandra Khatri , Behnam Hedayatnia , Anu Venkatesh , Jeff Nunn , Yi Pan , Qing Liu , Han Song , Anna Gottardi , Sanjeev Kwatra , Sanju Pancholi , Ming Cheng , Qinglang Chen , Lauren Stubel , Karthik Gopalakrishnan , Kate Bland , Raefer Gabriel , Arindam Mandal , Dilek Hakkani-Tur , Gene Hwang , Nate Michel , Eric King , Rohit Prasad

Evaluating Coherence in Dialogue Systems using Entailment

Evaluating open-domain dialogue systems is difficult due to the diversity of possible correct answers. Automatic metrics such as BLEU correlate weakly with human annotations, resulting in a significant bias across different models and…

Computation and Language · Computer Science 2020-04-02 Nouha Dziri , Ehsan Kamalloo , Kory W. Mathewson , Osmar Zaiane

Topical-Chat: Towards Knowledge-Grounded Open-Domain Conversations

Building socialbots that can have deep, engaging open-domain conversations with humans is one of the grand challenges of artificial intelligence (AI). To this end, bots need to be able to leverage world knowledge spanning several domains…

Computation and Language · Computer Science 2023-08-24 Karthik Gopalakrishnan , Behnam Hedayatnia , Qinlang Chen , Anna Gottardi , Sanjeev Kwatra , Anu Venkatesh , Raefer Gabriel , Dilek Hakkani-Tur

Psychological Metrics for Dialog System Evaluation

We present metrics for evaluating dialog systems through a psychologically-grounded "human" lens in which conversational agents express a diversity of both states (e.g., emotion) and traits (e.g., personality), just as people do. We present…

Computation and Language · Computer Science 2023-09-19 Salvatore Giorgi , Shreya Havaldar , Farhan Ahmed , Zuhaib Akhtar , Shalaka Vaidya , Gary Pan , Lyle H. Ungar , H. Andrew Schwartz , Joao Sedoc

A Comprehensive Assessment of Dialog Evaluation Metrics

Automatic evaluation metrics are a crucial component of dialog systems research. Standard language evaluation metrics are known to be ineffective for evaluating dialog. As such, recent research has proposed a number of novel,…

Computation and Language · Computer Science 2021-07-09 Yi-Ting Yeh , Maxine Eskenazi , Shikib Mehri

Speech Sentiment and Customer Satisfaction Estimation in Socialbot Conversations

For an interactive agent, such as task-oriented spoken dialog systems or chatbots, measuring and adapting to Customer Satisfaction (CSAT) is critical in order to understand user perception of an agent's behavior and increase user engagement…

Audio and Speech Processing · Electrical Eng. & Systems 2020-08-31 Yelin Kim , Joshua Levy , Yang Liu

ConCET: Entity-Aware Topic Classification for Open-Domain Conversational Agents

Identifying the topic (domain) of each user's utterance in open-domain conversational systems is a crucial step for all subsequent language understanding and response tasks. In particular, for complex domains, an utterance is often routed…

Computation and Language · Computer Science 2020-05-29 Ali Ahmadvand , Harshita Sahijwani , Jason Ingyu Choi , Eugene Agichtein

Contextual Dialogue Act Classification for Open-Domain Conversational Agents

Classifying the general intent of the user utterance in a conversation, also known as Dialogue Act (DA), e.g., open-ended question, statement of opinion, or request for an opinion, is a key step in Natural Language Understanding (NLU) for…

Computation and Language · Computer Science 2020-05-29 Ali Ahmadvand , Jason Ingyu Choi , Eugene Agichtein

Learning an Unreferenced Metric for Online Dialogue Evaluation

Evaluating the quality of a dialogue interaction between two agents is a difficult task, especially in open-domain chit-chat style dialogue. There have been recent efforts to develop automatic dialogue evaluation metrics, but most of them…

Computation and Language · Computer Science 2020-05-05 Koustuv Sinha , Prasanna Parthasarathi , Jasmine Wang , Ryan Lowe , William L. Hamilton , Joelle Pineau