Related papers: SMRT Chatbots: Improving Non-Task-Oriented Dialog …

Simulated Multiple Reference Training Improves Low-Resource Machine Translation

Many valid translations exist for a given sentence, yet machine translation (MT) is trained with a single reference translation, exacerbating data sparsity in low-resource settings. We introduce Simulated Multiple Reference Training (SMRT),…

Computation and Language · Computer Science 2021-04-23 Huda Khayrallah, Brian Thompson, Matt Post, Philipp Koehn

Improving Multi-turn Dialogue Consistency with Self-Recall Thinking

Large language model (LLM) based multi-turn dialogue systems often struggle to track dependencies across non-adjacent turns, undermining both consistency and scalability. As conversations lengthen, essential information becomes sparse and…

Computation and Language · Computer Science 2026-05-15 Renning Pang, Tian Lan, Leyuan Liu, Xiaoming Huang, Piao Tong, Xiaosong Zhang

Attention-Informed Mixed-Language Training for Zero-shot Cross-lingual Task-oriented Dialogue Systems

Recently, data-driven task-oriented dialogue systems have achieved promising performance in English. However, developing dialogue systems that support low-resource languages remains a long-standing challenge due to the absence of…

Computation and Language · Computer Science 2019-11-22 Zihan Liu, Genta Indra Winata, Zhaojiang Lin, Peng Xu, Pascale Fung

Scheduled Multi-task Learning for Neural Chat Translation

Neural Chat Translation (NCT) aims to translate conversational text into different languages. Existing methods mainly focus on modeling the bilingual dialogue characteristics (e.g., coherence) to improve chat translation via multi-task…

Computation and Language · Computer Science 2022-05-11 Yunlong Liang, Fandong Meng, Jinan Xu, Yufeng Chen, Jie Zhou

A Multi-task Multi-stage Transitional Training Framework for Neural Chat Translation

Neural chat translation (NCT) aims to translate a cross-lingual chat between speakers of different languages. Existing context-aware NMT models cannot achieve satisfactory performances due to the following inherent problems: 1) limited…

Computation and Language · Computer Science 2023-01-30 Chulun Zhou, Yunlong Liang, Fandong Meng, Jie Zhou, Jinan Xu, Hongji Wang, Min Zhang, Jinsong Su

Multi-Stage Pre-training Enhanced by ChatGPT for Multi-Scenario Multi-Domain Dialogue Summarization

Dialogue summarization involves a wide range of scenarios and domains. However, existing methods generally only apply to specific scenarios or domains. In this study, we propose a new pre-trained model specifically designed for…

Computation and Language · Computer Science 2023-10-17 Weixiao Zhou, Gengyao Li, Xianfu Cheng, Xinnian Liang, Junnan Zhu, Feifei Zhai, Zhoujun Li

Towards Multilingual Automatic Dialogue Evaluation

The main limiting factor in the development of robust multilingual dialogue evaluation metrics is the lack of multilingual data and the limited availability of open sourced multilingual dialogue systems. In this work, we propose a…

Computation and Language · Computer Science 2023-09-01 John Mendonça, Alon Lavie, Isabel Trancoso

Paralinguistics-Enhanced Large Language Modeling of Spoken Dialogue

Large Language Models (LLMs) have demonstrated superior abilities in tasks such as chatting, reasoning, and question-answering. However, standard LLMs may ignore crucial paralinguistic information, such as sentiment, emotion, and speaking…

Computation and Language · Computer Science 2024-01-18 Guan-Ting Lin, Prashanth Gurunath Shivakumar, Ankur Gandhe, Chao-Han Huck Yang, Yile Gu, Shalini Ghosh, Andreas Stolcke, Hung-yi Lee, Ivan Bulyko

How to Choose How to Choose Your Chatbot: A Massively Multi-System MultiReference Data Set for Dialog Metric Evaluation

We release MMSMR, a Massively Multi-System MultiReference dataset to enable future work on metrics and evaluation for dialog. Automatic metrics for dialogue evaluation should be robust proxies for human judgments; however, the verification…

Computation and Language · Computer Science 2024-11-20 Huda Khayrallah, Zuhaib Akhtar, Edward Cohen, Jyothir S, João Sedoc

Dialogue-oriented Pre-training

Pre-trained language models (PrLMs) have been shown to be powerful in enhancing a broad range of downstream tasks, including various dialogue-related ones. However, PrLMs are usually trained on general plain text with common language model (LM)…

Computation and Language · Computer Science 2021-08-03 Yi Xu, Hai Zhao

Structural Self-Supervised Objectives for Transformers

This thesis focuses on improving the pre-training of natural language models using unsupervised raw data to make them more efficient and aligned with downstream applications. In the first part, we introduce three alternative pre-training…

Computation and Language · Computer Science 2023-09-18 Luca Di Liello

Segment-Based Interactive Machine Translation for Pre-trained Models

Pre-trained large language models (LLMs) are starting to be widely used in many applications. In this work, we explore the use of these models in interactive machine translation (IMT) environments. In particular, we have chosen mBART…

Computation and Language · Computer Science 2024-07-10 Angel Navarro, Francisco Casacuberta

Multi-Task Learning for Speaker-Role Adaptation in Neural Conversation Models

Building a persona-based conversation agent is challenging owing to the lack of large amounts of speaker-specific conversation data for model training. This paper addresses the problem by proposing a multi-task learning approach to training…

Computation and Language · Computer Science 2017-10-23 Yi Luan, Chris Brockett, Bill Dolan, Jianfeng Gao, Michel Galley

Non-Fluent Synthetic Target-Language Data Improve Neural Machine Translation

When the number of parallel sentences available to train a neural machine translation system is scarce, a common practice is to generate new synthetic training samples from them. A number of approaches have been proposed to produce synthetic…

Computation and Language · Computer Science 2024-01-30 Víctor M. Sánchez-Cartagena, Miquel Esplà-Gomis, Juan Antonio Pérez-Ortiz, Felipe Sánchez-Martínez

MDD-Eval: Self-Training on Augmented Data for Multi-Domain Dialogue Evaluation

Chatbots are designed to carry out human-like conversations across different domains, such as general chit-chat, knowledge exchange, and persona-grounded conversations. To measure the quality of such conversational agents, a dialogue…

Computation and Language · Computer Science 2022-01-19 Chen Zhang, Luis Fernando D'Haro, Thomas Friedrichs, Haizhou Li

Exploring Unsupervised Pretraining Objectives for Machine Translation

Unsupervised cross-lingual pretraining has achieved strong results in neural machine translation (NMT), by drastically reducing the need for large parallel data. Most approaches adapt masked-language modeling (MLM) to sequence-to-sequence…

Computation and Language · Computer Science 2021-06-11 Christos Baziotis, Ivan Titov, Alexandra Birch, Barry Haddow

LLM Roleplay: Simulating Human-Chatbot Interaction

The development of chatbots requires collecting a large number of human-chatbot dialogues to reflect the breadth of users' sociodemographic backgrounds and conversational goals. However, the resource requirements to conduct the respective…

Computation and Language · Computer Science 2024-10-15 Hovhannes Tamoyan, Hendrik Schuff, Iryna Gurevych

SMART: Self-supervised Multi-task pretrAining with contRol Transformers

Self-supervised pretraining has been extensively studied in language and vision domains, where a unified model can be easily adapted to various downstream tasks by pretraining representations without explicit labels. When it comes to…

Machine Learning · Computer Science 2023-01-25 Yanchao Sun, Shuang Ma, Ratnesh Madaan, Rogerio Bonatti, Furong Huang, Ashish Kapoor

ParroT: Translating during Chat using Large Language Models tuned with Human Translation and Feedback

Large language models (LLMs) like ChatGPT have exhibited remarkable abilities on a wide range of natural language processing (NLP) tasks, including various machine translation abilities accomplished during chat. However, these models are…

Computation and Language · Computer Science 2023-11-03 Wenxiang Jiao, Jen-tse Huang, Wenxuan Wang, Zhiwei He, Tian Liang, Xing Wang, Shuming Shi, Zhaopeng Tu

SPECTRUM: Speaker-Enhanced Pre-Training for Long Dialogue Summarization

Multi-turn dialogues are characterized by their extended length and the presence of turn-taking conversations. Traditional language models often overlook the distinct features of these dialogues by treating them as regular text. In this…

Computation and Language · Computer Science 2024-02-01 Sangwoo Cho, Kaiqiang Song, Chao Zhao, Xiaoyang Wang, Dong Yu