Related papers: Transfer Learning for Sequence Generation: from Si…

Attention Strategies for Multi-Source Sequence-to-Sequence Learning

Modeling attention in neural multi-source sequence-to-sequence learning remains a relatively unexplored area, despite its usefulness in tasks that incorporate multiple source languages or modalities. We propose two novel approaches to…

Computation and Language · Computer Science 2017-04-24 Jindřich Libovický, Jindřich Helcl

Learning to Transfer Prompts for Text Generation

Pretrained language models (PLMs) have made remarkable progress in text generation tasks via fine-tuning. However, it is challenging to fine-tune PLMs in a data-scarce situation. Therefore, it is non-trivial to develop a general and…

Computation and Language · Computer Science 2022-05-17 Junyi Li, Tianyi Tang, Jian-Yun Nie, Ji-Rong Wen, Wayne Xin Zhao

Multi-Source Syntactic Neural Machine Translation

We introduce a novel multi-source technique for incorporating source syntax into neural machine translation using linearized parses. This is achieved by employing separate encoders for the sequential and parsed versions of the same source…

Computation and Language · Computer Science 2018-08-31 Anna Currey, Kenneth Heafield

Exploring Versatile Generative Language Model Via Parameter-Efficient Transfer Learning

Fine-tuning pre-trained generative language models to down-stream language generation tasks has shown promising results. However, this comes with the cost of having a single, large model for each task, which is not ideal in low-memory/power…

Computation and Language · Computer Science 2020-09-22 Zhaojiang Lin, Andrea Madotto, Pascale Fung

muNet: Evolving Pretrained Deep Neural Networks into Scalable Auto-tuning Multitask Systems

Most uses of machine learning today involve training a model from scratch for a particular task, or sometimes starting with a model pretrained on a related task and then fine-tuning on a downstream task. Both approaches offer limited…

Machine Learning · Computer Science 2022-05-26 Andrea Gesmundo, Jeff Dean

Summarize and Generate to Back-translate: Unsupervised Translation of Programming Languages

Back-translation is widely known for its effectiveness in neural machine translation when there is little to no parallel data. In this approach, a source-to-target model is coupled with a target-to-source model trained in parallel. The…

Computation and Language · Computer Science 2023-02-14 Wasi Uddin Ahmad, Saikat Chakraborty, Baishakhi Ray, Kai-Wei Chang

Identification of Negative Transfers in Multitask Learning Using Surrogate Models

Multitask learning is widely used in practice to train a low-resource target task by augmenting it with multiple related source tasks. Yet, naively combining all the source tasks with a target task does not always improve the prediction…

Machine Learning · Computer Science 2023-12-29 Dongyue Li, Huy L. Nguyen, Hongyang R. Zhang

Diverse Pretrained Context Encodings Improve Document Translation

We propose a new architecture for adapting a sentence-level sequence-to-sequence transformer by incorporating multiple pretrained document context signals and assess the impact on translation performance of (1) different pretraining…

Computation and Language · Computer Science 2021-08-02 Domenic Donato, Lei Yu, Chris Dyer

Multitask Prompt Tuning Enables Parameter-Efficient Transfer Learning

Prompt tuning, in which a base pretrained model is adapted to each task via conditioning on learned prompt vectors, has emerged as a promising approach for efficiently adapting large language models to multiple downstream tasks. However,…

Computation and Language · Computer Science 2023-03-07 Zhen Wang, Rameswar Panda, Leonid Karlinsky, Rogerio Feris, Huan Sun, Yoon Kim

MSRS: Evaluating Multi-Source Retrieval-Augmented Generation

Retrieval-augmented systems are typically evaluated in settings where information required to answer the query can be found within a single source or the answer is short-form or factoid-based. However, many real-world applications demand…

Computation and Language · Computer Science 2025-08-29 Rohan Phanse, Yijie Zhou, Kejian Shi, Wencai Zhang, Yixin Liu, Yilun Zhao, Arman Cohan

Multi-Source Transfer Learning for Deep Model-Based Reinforcement Learning

A crucial challenge in reinforcement learning is to reduce the number of interactions with the environment that an agent requires to master a given task. Transfer learning proposes to address this issue by re-using knowledge from previously…

Machine Learning · Computer Science 2023-04-28 Remo Sasso, Matthia Sabatelli, Marco A. Wiering

MUST-GAN: Multi-level Statistics Transfer for Self-driven Person Image Generation

Pose-guided person image generation usually involves using paired source-target images to supervise the training, which significantly increases the data preparation effort and limits the application of the models. To deal with this problem,…

Computer Vision and Pattern Recognition · Computer Science 2021-04-12 Tianxiang Ma, Bo Peng, Wei Wang, Jing Dong

Unified Segment-to-Segment Framework for Simultaneous Sequence Generation

Simultaneous sequence generation is a pivotal task for real-time scenarios, such as streaming speech recognition, simultaneous machine translation and simultaneous speech translation, where the target sequence is generated while receiving…

Computation and Language · Computer Science 2023-12-01 Shaolei Zhang, Yang Feng

MSG: Multi-Stream Generative Policies for Sample-Efficient Robotic Manipulation

Generative robot policies such as Flow Matching offer flexible, multi-modal policy learning but are sample-inefficient. Although object-centric policies improve sample efficiency, they do not resolve this limitation. In this work, we…

Robotics · Computer Science 2026-04-01 Jan Ole von Hartz, Lukas Schweizer, Joschka Boedecker, Abhinav Valada

Pretrained Language Models for Dialogue Generation with Multiple Input Sources

Large-scale pretrained language models have achieved outstanding performance on natural language understanding tasks. However, it is still under investigation how to apply them to dialogue generation tasks, especially those with responses…

Computation and Language · Computer Science 2020-10-16 Yu Cao, Wei Bi, Meng Fang, Dacheng Tao

Multi-Stage Transfer Learning with an Application to Selection Process

In multi-stage processes, decisions happen in an ordered sequence of stages. Many of them have the structure of a dual funnel problem: as the sample size decreases from one stage to the next, the information increases. A related example is a…

Machine Learning · Computer Science 2020-06-03 Andre Mendes, Julian Togelius, Leandro dos Santos Coelho

Bridging Pre-trained Models and Downstream Tasks for Source Code Understanding

With the great success of pre-trained models, the pretrain-then-finetune paradigm has been widely adopted on downstream tasks for source code understanding. However, compared to the costly training of a large-scale model from scratch, how to…

Software Engineering · Computer Science 2022-03-16 Deze Wang, Zhouyang Jia, Shanshan Li, Yue Yu, Yun Xiong, Wei Dong, Xiangke Liao

MUSE: Parallel Multi-Scale Attention for Sequence to Sequence Learning

In sequence to sequence learning, the self-attention mechanism proves to be highly effective, and achieves significant improvements in many tasks. However, the self-attention mechanism is not without its own flaws. Although self-attention…

Computation and Language · Computer Science 2019-11-22 Guangxiang Zhao, Xu Sun, Jingjing Xu, Zhiyuan Zhang, Liangchen Luo

Generating Sequences by Learning to Self-Correct

Sequence generation applications require satisfying semantic constraints, such as ensuring that programs are correct, using certain keywords, or avoiding undesirable content. Language models, whether fine-tuned or prompted with few-shot…

Computation and Language · Computer Science 2022-11-02 Sean Welleck, Ximing Lu, Peter West, Faeze Brahman, Tianxiao Shen, Daniel Khashabi, Yejin Choi

Cross-Lingual Natural Language Generation via Pre-Training

In this work we focus on transferring supervision signals of natural language generation (NLG) tasks between multiple languages. We propose to pretrain the encoder and the decoder of a sequence-to-sequence model under both monolingual and…

Computation and Language · Computer Science 2019-11-25 Zewen Chi, Li Dong, Furu Wei, Wenhui Wang, Xian-Ling Mao, Heyan Huang