Related papers: AutoMixAlign: Adaptive Data Mixing for Multi-Task …

Dynamic Gradient Alignment for Online Data Mixing

The composition of training data mixtures is critical for effectively training large language models (LLMs), as it directly impacts their performance on downstream tasks. Our goal is to identify an optimal data mixture to specialize an LLM…

Machine Learning · Computer Science 2024-10-04 Simin Fan, David Grangier, Pierre Ablin

AMA: Adaptive Memory via Multi-Agent Collaboration

The rapid evolution of Large Language Model (LLM) agents has necessitated robust memory systems to support cohesive long-term interaction and complex reasoning. Benefiting from the strong capabilities of LLMs, recent research focus has…

Artificial Intelligence · Computer Science 2026-04-16 Weiquan Huang, Zixuan Wang, Hehai Lin, Sudong Wang, Bo Xu, Qian Li, Beier Zhu, Linyi Yang, Chengwei Qin

MergeMix: Optimizing Mid-Training Data Mixtures via Learnable Model Merging

Optimizing data mixtures is essential for unlocking the full potential of large language models (LLMs), yet identifying the optimal composition remains computationally prohibitive due to reliance on heuristic trials or expensive proxy…

Machine Learning · Computer Science 2026-01-27 Jiapeng Wang, Changxin Tian, Kunlong Chen, Ziqi Liu, Jiaxin Mao, Wayne Xin Zhao, Zhiqiang Zhang, Jun Zhou

Pareto Multi-Objective Alignment for Language Models

Large language models (LLMs) are increasingly deployed in real-world applications that require careful balancing of multiple, often conflicting, objectives, such as informativeness versus conciseness, or helpfulness versus creativity.…

Machine Learning · Computer Science 2025-08-12 Qiang He, Setareh Maghsudi

Improving Model Alignment Through Collective Intelligence of Open-Source LLMS

Building helpful and harmless large language models (LLMs) requires an effective model alignment approach based on human instructions and feedback, which necessitates high-quality human-labeled data. Constructing such datasets is often…

Computation and Language · Computer Science 2025-05-07 Junlin Wang, Roy Xie, Shang Zhu, Jue Wang, Ben Athiwaratkun, Bhuwan Dhingra, Shuaiwen Leon Song, Ce Zhang, James Zou

Mix Data or Merge Models? Balancing the Helpfulness, Honesty, and Harmlessness of Large Language Model via Model Merging

Achieving balanced alignment of large language models (LLMs) in terms of Helpfulness, Honesty, and Harmlessness (3H optimization) constitutes a cornerstone of responsible AI. Existing methods like data mixture strategies face limitations,…

Computation and Language · Computer Science 2026-02-03 Jinluan Yang, Dingnan Jin, Anke Tang, Li Shen, Didi Zhu, Zhengyu Chen, Ziyu Zhao, Daixin Wang, Qing Cui, Zhiqiang Zhang, Jun Zhou, Fei Wu, Kun Kuang

Leveraging Robust Optimization for LLM Alignment under Distribution Shifts

Preference alignment methods are increasingly critical for steering large language models (LLMs) to generate outputs consistent with human values. While recent approaches often rely on synthetic data generated by LLMs for scalability and…

Computation and Language · Computer Science 2025-10-21 Mingye Zhu, Yi Liu, Zheren Fu, Yongdong Zhang, Zhendong Mao

MIRA: A Method of Federated MultI-Task Learning for LaRge LAnguage Models

In this paper, we introduce a method for fine-tuning Large Language Models (LLMs), inspired by Multi-Task learning in a federated manner. Our approach leverages the structure of each client's model and enables a learning scheme that…

Machine Learning · Computer Science 2024-10-22 Ahmed Elbakary, Chaouki Ben Issaid, Tamer ElBatt, Karim Seddik, Mehdi Bennis

Mix Data or Merge Models? Optimizing for Diverse Multi-Task Learning

Large Language Models (LLMs) have been adopted and deployed worldwide for a broad variety of applications. However, ensuring their safe use remains a significant challenge. Preference training and safety measures often overfit to harms…

Computation and Language · Computer Science 2024-10-15 Aakanksha, Arash Ahmadian, Seraphina Goldfarb-Tarrant, Beyza Ermis, Marzieh Fadaee, Sara Hooker

DAMA: Data- and Model-aware Alignment of Multi-modal LLMs

Direct Preference Optimization (DPO) has shown effectiveness in aligning multi-modal large language models (MLLM) with human preferences. However, existing methods exhibit an imbalanced responsiveness to the data of varying hardness,…

Computer Vision and Pattern Recognition · Computer Science 2025-02-12 Jinda Lu, Junkang Wu, Jinghan Li, Xiaojun Jia, Shuo Wang, YiFan Zhang, Junfeng Fang, Xiang Wang, Xiangnan He

CM-Align: Consistency-based Multilingual Alignment for Large Language Models

Current large language models (LLMs) generally show a significant performance gap in alignment between English and other languages. To bridge this gap, existing research typically leverages the model's responses in English as a reference to…

Computation and Language · Computer Science 2025-09-16 Xue Zhang, Yunlong Liang, Fandong Meng, Songming Zhang, Yufeng Chen, Jinan Xu, Jie Zhou

Data Selection for LLM Alignment Using Fine-Grained Preferences

Large language model (LLM) alignment aims to ensure that the behavior of LLMs meets human preferences. While collecting data from multiple fine-grained, aspect-specific preferences becomes more and more feasible, existing alignment…

Machine Learning · Computer Science 2026-03-03 Jia Zhang, Yao Liu, Chen-Xi Zhang, Yi Liu, Yi-Xuan Jin, Lan-Zhe Guo, Yu-Feng Li

Improving Multi-agent Coordination by Learning to Estimate Contention

We present a multi-agent learning algorithm, ALMA-Learning, for efficient and fair allocations in large-scale systems. We circumvent the traditional pitfalls of multi-agent learning (e.g., the moving target problem, the curse of…

Multiagent Systems · Computer Science 2021-06-22 Panayiotis Danassis, Florian Wiedemair, Boi Faltings

MetaAlign: Align Large Language Models with Diverse Preferences during Inference Time

Large Language Models (LLMs) acquire extensive knowledge and remarkable abilities from extensive text corpora, making them powerful tools for various applications. To make LLMs more usable, aligning them with human preferences is essential.…

Computation and Language · Computer Science 2024-10-21 Mozhi Zhang, Pengyu Wang, Chenkun Tan, Mianqiu Huang, Dong Zhang, Yaqian Zhou, Xipeng Qiu

AdaMerging: Adaptive Model Merging for Multi-Task Learning

Multi-task learning (MTL) aims to empower a model to tackle multiple tasks simultaneously. A recent development known as task arithmetic has revealed that several models, each fine-tuned for distinct tasks, can be directly merged into a…

Machine Learning · Computer Science 2024-05-29 Enneng Yang, Zhenyi Wang, Li Shen, Shiwei Liu, Guibing Guo, Xingwei Wang, Dacheng Tao

ALPS: Attention Localization and Pruning Strategy for Efficient Alignment of Large Language Models

Aligning general-purpose large language models (LLMs) to downstream tasks often incurs significant training adjustment costs. Prior research has explored various avenues to enhance alignment efficiency, primarily through minimal-data…

Computation and Language · Computer Science 2025-06-19 Hao Chen, Haoze Li, Zhiqing Xiao, Lirong Gao, Qi Zhang, Xiaomeng Hu, Ningtao Wang, Xing Fu, Junbo Zhao

Probabilistic Token Alignment for Large Language Model Fusion

Training large language models (LLMs) from scratch can yield models with unique functionalities and strengths, but it is costly and often leads to redundant capabilities. A more cost-effective alternative is to fuse existing pre-trained…

Computation and Language · Computer Science 2025-09-23 Runjia Zeng, James Chenhao Liang, Cheng Han, Zhiwen Cao, Jiahao Liu, Xiaojun Quan, Yingjie Victor Chen, Lifu Huang, Tong Geng, Qifan Wang, Dongfang Liu

Towards Efficient and Effective Alignment of Large Language Models

Large language models (LLMs) exhibit remarkable capabilities across diverse tasks, yet aligning them efficiently and effectively with human expectations remains a critical challenge. This thesis advances LLM alignment by introducing novel…

Computation and Language · Computer Science 2025-06-12 Yuxin Jiang

MergeMix: A Unified Augmentation Paradigm for Visual and Multi-Modal Understanding

Vision-language alignment in multi-modal large language models (MLLMs) relies on supervised fine-tuning (SFT) or reinforcement learning (RL). To align multi-modal large language models (MLLMs) in the post-training stage, supervised…

Computer Vision and Pattern Recognition · Computer Science 2026-02-24 Xin Jin, Siyuan Li, Siyong Jian, Kai Yu, Huan Wang

Inference-Aware Meta-Alignment of LLMs via Non-Linear GRPO

Aligning large language models (LLMs) to diverse human preferences is fundamentally challenging since criteria can often conflict with each other. Inference-time alignment methods have recently gained popularity as they allow LLMs to be…

Machine Learning · Statistics 2026-02-03 Shokichi Takakura, Akifumi Wachi, Rei Higuchi, Kohei Miyaguchi, Taiji Suzuki