Related papers: Speaker Sensitive Response Evaluation Model

How NOT To Evaluate Your Dialogue System: An Empirical Study of Unsupervised Evaluation Metrics for Dialogue Response Generation

We investigate evaluation metrics for dialogue response generation systems where supervised labels, such as task completion, are not available. Recent works in response generation have adopted metrics from machine translation to compare a…

Computation and Language · Computer Science 2017-01-04 Chia-Wei Liu, Ryan Lowe, Iulian V. Serban, Michael Noseworthy, Laurent Charlin, Joelle Pineau

Rethinking Response Evaluation from Interlocutor's Eye for Open-Domain Dialogue Systems

Open-domain dialogue systems have started to engage in continuous conversations with humans. Those dialogue systems are required to be adjusted to the human interlocutor and evaluated in terms of their perspective. However, it is…

Computation and Language · Computer Science 2024-01-05 Yuma Tsuta, Naoki Yoshinaga, Shoetsu Sato, Masashi Toyoda

Learning an Unreferenced Metric for Online Dialogue Evaluation

Evaluating the quality of a dialogue interaction between two agents is a difficult task, especially in open-domain chit-chat style dialogue. There have been recent efforts to develop automatic dialogue evaluation metrics, but most of them…

Computation and Language · Computer Science 2020-05-05 Koustuv Sinha, Prasanna Parthasarathi, Jasmine Wang, Ryan Lowe, William L. Hamilton, Joelle Pineau

What is wrong with you?: Leveraging User Sentiment for Automatic Dialog Evaluation

Accurate automatic evaluation metrics for open-domain dialogs are in high demand. Existing model-based metrics for system response evaluation are trained on human annotated data, which is cumbersome to collect. In this work, we propose to…

Computation and Language · Computer Science 2022-03-29 Sarik Ghazarian, Behnam Hedayatnia, Alexandros Papangelis, Yang Liu, Dilek Hakkani-Tur

Towards an Automatic Turing Test: Learning to Evaluate Dialogue Responses

Automatically evaluating the quality of dialogue responses for unstructured domains is a challenging problem. Unfortunately, existing automatic evaluation metrics are biased and correlate very poorly with human judgements of response…

Computation and Language · Computer Science 2018-01-18 Ryan Lowe, Michael Noseworthy, Iulian V. Serban, Nicolas Angelard-Gontier, Yoshua Bengio, Joelle Pineau

User Response and Sentiment Prediction for Automatic Dialogue Evaluation

Automatic evaluation is beneficial for open-domain dialog system development. However, standard word-overlap metrics (BLEU, ROUGE) do not correlate well with human judgements of open-domain dialog systems. In this work we propose to use the…

Computation and Language · Computer Science 2022-02-18 Sarik Ghazarian, Behnam Hedayatnia, Alexandros Papangelis, Yang Liu, Dilek Hakkani-Tur

Evaluating Dialogue Generation Systems via Response Selection

Existing automatic evaluation metrics for open-domain dialogue response generation systems correlate poorly with human evaluation. We focus on evaluating response generation systems via response selection. To evaluate systems properly via…

Computation and Language · Computer Science 2020-04-30 Shiki Sato, Reina Akama, Hiroki Ouchi, Jun Suzuki, Kentaro Inui

Measuring and Improving Semantic Diversity of Dialogue Generation

Response diversity has become an important criterion for evaluating the quality of open-domain dialogue generation models. However, current evaluation metrics for response diversity often fail to capture the semantic diversity of generated…

Computation and Language · Computer Science 2022-10-25 Seungju Han, Beomsu Kim, Buru Chang

Approximating Interactive Human Evaluation with Self-Play for Open-Domain Dialog Systems

Building an open-domain conversational agent is a challenging problem. Current evaluation methods, mostly post-hoc judgments of static conversation, do not capture conversation quality in a realistic interactive context. In this paper, we…

Computation and Language · Computer Science 2019-11-05 Asma Ghandeharioun, Judy Hanwen Shen, Natasha Jaques, Craig Ferguson, Noah Jones, Agata Lapedriza, Rosalind Picard

Designing Precise and Robust Dialogue Response Evaluators

Automatic dialogue response evaluator has been proposed as an alternative to automated metrics and human evaluation. However, existing automatic evaluators achieve only moderate correlation with human judgement and they are not robust. In…

Computation and Language · Computer Science 2020-04-27 Tianyu Zhao, Divesh Lala, Tatsuya Kawahara

Towards Empathetic Open-domain Conversation Models: a New Benchmark and Dataset

One challenge for dialogue agents is recognizing feelings in the conversation partner and replying accordingly, a key communicative skill. While it is straightforward for humans to recognize and acknowledge others' feelings in a…

Computation and Language · Computer Science 2019-08-30 Hannah Rashkin, Eric Michael Smith, Margaret Li, Y-Lan Boureau

Achieving Reliable Human Assessment of Open-Domain Dialogue Systems

Evaluation of open-domain dialogue systems is highly challenging and development of better techniques is highlighted time and again as desperately needed. Despite substantial efforts to carry out reliable live evaluation of systems in…

Computation and Language · Computer Science 2022-03-14 Tianbo Ji, Yvette Graham, Gareth J. F. Jones, Chenyang Lyu, Qun Liu

On the Use of Linguistic Features for the Evaluation of Generative Dialogue Systems

Automatically evaluating text-based, non-task-oriented dialogue systems (i.e., 'chatbots') remains an open problem. Previous approaches have suffered challenges ranging from poor correlation with human judgment to poor generalization and…

Computation and Language · Computer Science 2021-04-14 Ian Berlot-Attwell, Frank Rudzicz

Grounding in social media: An approach to building a chit-chat dialogue model

Building open-domain dialogue systems capable of rich human-like conversational ability is one of the fundamental challenges in language generation. However, even with recent advancements in the field, existing open-domain generative models…

Computation and Language · Computer Science 2022-06-14 Ritvik Choudhary, Daisuke Kawahara

An Empirical Investigation of Pre-Trained Transformer Language Models for Open-Domain Dialogue Generation

We present an empirical investigation of pre-trained Transformer-based auto-regressive language models for the task of open-domain dialogue generation. Training paradigm of pre-training and fine-tuning is employed to conduct the parameter…

Computation and Language · Computer Science 2020-03-10 Piji Li

A Systematic Evaluation of Response Selection for Open Domain Dialogue

Recent progress on neural approaches for language processing has triggered a resurgence of interest on building intelligent open-domain chatbots. However, even the state-of-the-art neural chatbots cannot produce satisfying responses for…

Computation and Language · Computer Science 2022-08-10 Behnam Hedayatnia, Di Jin, Yang Liu, Dilek Hakkani-Tur

Promoting Open-domain Dialogue Generation through Learning Pattern Information between Contexts and Responses

Recently, utilizing deep neural networks to build the open-domain dialogue models has become a hot topic. However, the responses generated by these models suffer from many problems such as responses not being contextualized and tend to…

Computation and Language · Computer Science 2023-09-07 Mengjuan Liu, Chenyang Liu, Yunfan Yang, Jiang Liu, Mohan Jing

Automatic Construction of Discourse Corpora for Dialogue Translation

In this paper, a novel approach is proposed to automatically construct parallel discourse corpus for dialogue machine translation. Firstly, the parallel subtitle data and its corresponding monolingual movie script data are crawled and…

Computation and Language · Computer Science 2016-05-24 Longyue Wang, Xiaojun Zhang, Zhaopeng Tu, Andy Way, Qun Liu

Towards a Sentiment-Aware Conversational Agent

In this paper, we propose an end-to-end sentiment-aware conversational agent based on two models: a reply sentiment prediction model, which leverages the context of the dialogue to predict an appropriate sentiment for the agent to express…

Computation and Language · Computer Science 2022-07-26 Isabel Dias, Ricardo Rei, Patrícia Pereira, Luisa Coheur

Generating Dialogue Responses from a Semantic Latent Space

Existing open-domain dialogue generation models are usually trained to mimic the gold response in the training set using cross-entropy loss on the vocabulary. However, a good response does not need to resemble the gold response, since there…

Computation and Language · Computer Science 2020-10-06 Wei-Jen Ko, Avik Ray, Yilin Shen, Hongxia Jin