English
Related papers

Related papers: Preference Alignment Improves Language Model-Based…

200 papers

In recent years, text-to-speech (TTS) has seen impressive advancements through large-scale language models, achieving human-level speech quality. Integrating human feedback has proven effective for enhancing robustness in these systems.…

Audio and Speech Processing · Electrical Eng. & Systems 2025-09-03 Kangxiang Xia , Xinfa Zhu , Jixun Yao , Lei Xie

Large language models (LLMs) demonstrate impressive performance but lack the flexibility to adapt to human preferences quickly without retraining. In this work, we introduce Test-time Preference Optimization (TPO), a framework that aligns…

Computation and Language · Computer Science 2025-01-23 Yafu Li , Xuyang Hu , Xiaoye Qu , Linjie Li , Yu Cheng

Our goal is to enable large language models (LLMs) to balance multiple human preference dimensions; such as helpfulness, safety, and verbosity, through principled and controllable alignment. Existing preference optimization methods,…

Machine Learning · Computer Science 2026-02-03 Mete Erdogan

Aligning text-to-speech (TTS) system outputs with human feedback through preference optimization has been shown to effectively improve the robustness and naturalness of language model-based TTS models. Current approaches primarily require…

Computation and Language · Computer Science 2026-04-28 Rikuto Kotoge , Yuichi Sasaki

Aligning the output of Large Language Models (LLMs) with human preferences (e.g., by means of reinforcement learning with human feedback, or RLHF) is essential for ensuring their effectiveness in real-world scenarios. Despite significant…

Artificial Intelligence · Computer Science 2024-10-23 Pietro Bernardelle , Gianluca Demartini

Developing high-quality text-to-speech (TTS) systems for low-resource languages is challenging due to the scarcity of paired text and speech data. In contrast, automatic speech recognition (ASR) models for such languages are often more…

The rapid advancement of large language models (LLMs) has facilitated their transformation into conversational chatbots that can grasp contextual nuances and generate pertinent sentences, closely mirroring human values through advanced…

Computation and Language · Computer Science 2024-07-19 Janghwan Lee , Seongmin Park , Sukjin Hong , Minsoo Kim , Du-Seong Chang , Jungwook Choi

Enhancing the conformity of large language models (LLMs) to human preferences remains an ongoing research challenge. Recently, offline approaches such as Direct Preference Optimization (DPO) have gained prominence as attractive options due…

Machine Learning · Computer Science 2024-09-05 Kaihui Chen , Hao Yi , Qingyang Li , Tianyu Qi , Yulan Hu , Fuzheng Zhang , Yong Liu

For aligning large language models (LLMs), prior work has leveraged reinforcement learning via human feedback (RLHF) or variations of direct preference optimization (DPO). While DPO offers a simpler framework based on maximum likelihood…

Artificial Intelligence · Computer Science 2025-05-27 Anirudhan Badrinath , Prabhat Agarwal , Jiajing Xu

Effective training of language models (LMs) for mathematical reasoning tasks demands high-quality supervised fine-tuning data. Besides obtaining annotations from human experts, a common alternative is sampling from larger and more powerful…

Computation and Language · Computer Science 2024-07-26 Tianduo Wang , Shichen Li , Wei Lu

Integrating human feedback to align text-to-speech (TTS) system outputs with human preferences has proven to be an effective approach for enhancing the robustness of language model-based TTS systems. Current approaches primarily focus on…

Audio and Speech Processing · Electrical Eng. & Systems 2025-12-29 Jixun Yao , Yuguang Yang , Yu Pan , Yuan Feng , Ziqian Ning , Jianhao Ye , Hongbin Zhou , Lei Xie

In the domain of complex reasoning tasks, such as mathematical reasoning, recent advancements have proposed the use of Direct Preference Optimization (DPO) to suppress output of dispreferred responses, thereby enhancing the long-chain…

Computation and Language · Computer Science 2025-10-27 Weibin Liao , Xu Chu , Yasha Wang

Automatic text simplification (ATS) aims to enhance language accessibility for various target groups, particularly persons with intellectual disabilities. Recent advancements in generative AI, especially large language models (LLMs), have…

Computation and Language · Computer Science 2025-07-03 Yingqiang Gao , Kaede Johnson , David Froehlich , Luisa Carrer , Sarah Ebling

Direct Preference Optimization (DPO) and its variants have become the de facto standards for aligning large language models (LLMs) with human preferences or specific goals. However, DPO requires high-quality preference data and suffers from…

Machine Learning · Computer Science 2024-11-12 Zhuotong Chen , Fang Liu , Jennifer Zhu , Wanyu Du , Yanjun Qi

Neural metrics for machine translation (MT) evaluation have become increasingly prominent due to their superior correlation with human judgments compared to traditional lexical metrics. Researchers have therefore utilized neural metrics…

Computation and Language · Computer Science 2025-11-21 Hippolyte Gisserot-Boukhlef , Ricardo Rei , Emmanuel Malherbe , Céline Hudelot , Pierre Colombo , Nuno M. Guerreiro

Preference alignment in Large Language Models (LLMs) has significantly improved their ability to adhere to human instructions and intentions. However, existing direct alignment algorithms primarily focus on relative preferences and often…

Machine Learning · Computer Science 2025-05-13 Shenao Zhang , Zhihan Liu , Boyi Liu , Yufeng Zhang , Yingxiang Yang , Yongfei Liu , Liyu Chen , Tao Sun , Zhaoran Wang

Modern zero-shot text-to-speech (TTS) systems, despite using extensive pre-training, often struggle in challenging scenarios such as tongue twisters, repeated words, code-switching, and cross-lingual synthesis, leading to intelligibility…

Sound · Computer Science 2025-06-09 Xueyao Zhang , Yuancheng Wang , Chaoren Wang , Ziniu Li , Zhuo Chen , Zhizheng Wu

The last year has witnessed the rapid progress of large language models (LLMs) across diverse domains. Among them, CodeLLMs have garnered particular attention because they can not only assist in completing various programming tasks but also…

Artificial Intelligence · Computer Science 2024-10-25 Yibo Miao , Bofei Gao , Shanghaoran Quan , Junyang Lin , Daoguang Zan , Jiaheng Liu , Jian Yang , Tianyu Liu , Zhijie Deng

Aligning large language models (LLMs) with human intent is critical for enhancing their performance across a variety of tasks. Standard alignment techniques, such as Direct Preference Optimization (DPO), often rely on the binary…

Computation and Language · Computer Science 2025-04-01 Yuxiang Guo , Lu Yin , Bo Jiang , Jiaqi Zhang

In the field of large language models (LLMs), aligning models with the diverse preferences of users is a critical challenge. Direct Preference Optimization (DPO) has played a key role in this area. It works by using pairs of preferences…

Computation and Language · Computer Science 2024-05-29 Yueqin Yin , Zhendong Wang , Yi Gu , Hai Huang , Weizhu Chen , Mingyuan Zhou
‹ Prev 1 2 3 10 Next ›