Related papers: Parallel-SFT: Improving Zero-Shot Cross-Programmin…

Cross-lingual Transfer in Programming Languages: An Extensive Empirical Study

Large language models (LLMs) have achieved state-of-the-art performance in various software engineering tasks, including error detection, clone detection, and code translation, primarily leveraging high-resource programming languages like…

Computation and Language · Computer Science 2025-06-11 Razan Baltaji, Saurabh Pujar, Louis Mandel, Martin Hirzel, Luca Buratti, Lav Varshney

Self-Augmentation Improves Zero-Shot Cross-Lingual Transfer

Zero-shot cross-lingual transfer is a central task in multilingual NLP, allowing models trained in languages with more sufficient training resources to generalize to other low-resource languages. Earlier efforts on this task use parallel…

Computation and Language · Computer Science 2023-09-21 Fei Wang, Kuan-Hao Huang, Kai-Wei Chang, Muhao Chen

Zero-shot Cross-lingual Transfer without Parallel Corpus

Recently, although pre-trained language models have achieved great success on multilingual NLP (Natural Language Processing) tasks, the lack of training data on many tasks in low-resource languages still limits their performance. One…

Computation and Language · Computer Science 2023-10-10 Yuyang Zhang, Xiaofeng Han, Baojun Wang

Multi-Source Cross-Lingual Model Transfer: Learning What to Share

Modern NLP applications have enjoyed a great boost utilizing neural networks models. Such deep neural models, however, are not applicable to most human languages due to the lack of annotated training data for various NLP tasks.…

Computation and Language · Computer Science 2019-06-06 Xilun Chen, Ahmed Hassan Awadallah, Hany Hassan, Wei Wang, Claire Cardie

Unlocking the Potential of Model Merging for Low-Resource Languages

Adapting large language models (LLMs) to new languages typically involves continual pre-training (CT) followed by supervised fine-tuning (SFT). However, this CT-then-SFT approach struggles with limited data in the context of low-resource…

Computation and Language · Computer Science 2025-02-10 Mingxu Tao, Chen Zhang, Quzhe Huang, Tianyao Ma, Songfang Huang, Dongyan Zhao, Yansong Feng

Cross-Lingual Optimization for Language Transfer in Large Language Models

Adapting large language models to other languages typically employs supervised fine-tuning (SFT) as a standard approach. However, it often suffers from an overemphasis on English performance, a phenomenon that is especially pronounced in…

Computation and Language · Computer Science 2025-05-21 Jungseob Lee, Seongtae Hong, Hyeonseok Moon, Heuiseok Lim

mPLM-Sim: Better Cross-Lingual Similarity and Transfer in Multilingual Pretrained Language Models

Recent multilingual pretrained language models (mPLMs) have been shown to encode strong language-specific signals, which are not explicitly provided during pretraining. It remains an open question whether it is feasible to employ mPLMs to…

Computation and Language · Computer Science 2024-07-08 Peiqin Lin, Chengzhi Hu, Zheyu Zhang, André F. T. Martins, Hinrich Schütze

Unlock the Correlation between Supervised Fine-Tuning and Reinforcement Learning in Training Code Large Language Models

Automatic code generation has been a longstanding research topic. With the advancement of general-purpose large language models (LLMs), the ability to code stands out as one important measure to the model's reasoning performance. Usually, a…

Software Engineering · Computer Science 2024-12-18 Jie Chen, Xintian Han, Yu Ma, Xun Zhou, Liang Xiang

On Transferability of Prompt Tuning for Natural Language Processing

Prompt tuning (PT) is a promising parameter-efficient method to utilize extremely large pre-trained language models (PLMs), which can achieve comparable performance to full-parameter fine-tuning by only tuning a few soft prompts. However,…

Computation and Language · Computer Science 2023-12-19 Yusheng Su, Xiaozhi Wang, Yujia Qin, Chi-Min Chan, Yankai Lin, Huadong Wang, Kaiyue Wen, Zhiyuan Liu, Peng Li, Juanzi Li, Lei Hou, Maosong Sun, Jie Zhou

Analyzing Zero-shot Cross-lingual Transfer in Supervised NLP Tasks

In zero-shot cross-lingual transfer, a supervised NLP task trained on a corpus in one language is directly applicable to another language without any additional training. A source of cross-lingual transfer can be as straightforward as…

Computation and Language · Computer Science 2021-01-27 Hyunjin Choi, Judong Kim, Seongho Joe, Seungjai Min, Youngjune Gwon

Quality or Quantity? On Data Scale and Diversity in Adapting Large Language Models for Low-Resource Translation

Despite the recent popularity of Large Language Models (LLMs) in Machine Translation (MT), their performance in low-resource languages (LRLs) still lags significantly behind Neural Machine Translation (NMT) models. In this work, we explore…

Computation and Language · Computer Science 2024-10-07 Vivek Iyer, Bhavitvya Malik, Pavel Stepachev, Pinzhen Chen, Barry Haddow, Alexandra Birch

Bridging the Language Gap: Enhancing Multilingual Prompt-Based Code Generation in LLMs via Zero-Shot Cross-Lingual Transfer

The use of Large Language Models (LLMs) for program code generation has gained substantial attention, but their biases and limitations with non-English prompts challenge global inclusivity. This paper investigates the complexities of…

Computation and Language · Computer Science 2025-05-13 Mingda Li, Abhijit Mishra, Utkarsh Mujumdar

Self-Translate-Train: Enhancing Cross-Lingual Transfer of Large Language Models via Inherent Capability

Zero-shot cross-lingual transfer by fine-tuning multilingual pretrained models shows promise for low-resource languages, but often suffers from misalignment of internal representations between languages. We hypothesize that even when the…

Computation and Language · Computer Science 2024-09-18 Ryokan Ri, Shun Kiyono, Sho Takase

Parallel Scaling Law: Unveiling Reasoning Generalization through A Cross-Linguistic Perspective

Recent advancements in Reinforcement Post-Training (RPT) have significantly enhanced the capabilities of Large Reasoning Models (LRMs), sparking increased interest in the generalization of RL-based reasoning. While existing work has…

Computation and Language · Computer Science 2025-10-03 Wen Yang, Junhong Wu, Chong Li, Chengqing Zong, Jiajun Zhang

Parallel Loop Transformer for Efficient Test-Time Computation Scaling

Large Language Models (LLMs) are powerful but often too slow and costly for real-world use during inference. Looped transformers save on parameters by reusing the same weights for multiple computational steps, or "loops." However, this…

Computation and Language · Computer Science 2025-10-30 Bohong Wu, Mengzhao Chen, Xiang Luo, Shen Yan, Qifan Yu, Fan Xia, Tianqi Zhang, Hongrui Zhan, Zheng Zhong, Xun Zhou, Siyuan Qiao, Xingyan Bin

PAFT: A Parallel Training Paradigm for Effective LLM Fine-Tuning

Large language models (LLMs) have shown remarkable abilities in diverse natural language processing (NLP) tasks. The LLMs generally undergo supervised fine-tuning (SFT) followed by preference alignment to be usable in downstream…

Computation and Language · Computer Science 2024-06-27 Shiva Kumar Pentyala, Zhichao Wang, Bin Bi, Kiran Ramnath, Xiang-Bo Mao, Regunathan Radhakrishnan, Sitaram Asur, Na Cheng

Zero-Shot Cross-Lingual Transfer with Meta Learning

Learning what to share between tasks has been a topic of great importance recently, as strategic sharing of knowledge has been shown to improve downstream task performance. This is particularly important for multilingual applications, as…

Computation and Language · Computer Science 2020-10-06 Farhad Nooralahzadeh, Giannis Bekoulis, Johannes Bjerva, Isabelle Augenstein

One-To-Many Multilingual End-to-end Speech Translation

Nowadays, training end-to-end neural models for spoken language translation (SLT) still has to confront with extreme data scarcity conditions. The existing SLT parallel corpora are indeed orders of magnitude smaller than those available for…

Computation and Language · Computer Science 2019-10-09 Mattia Antonino Di Gangi, Matteo Negri, Marco Turchi

Parallel-R1: Towards Parallel Thinking via Reinforcement Learning

Parallel thinking has emerged as a novel approach for enhancing the reasoning capabilities of large language models (LLMs) by exploring multiple reasoning paths concurrently. However, activating such capabilities through training remains…

Computation and Language · Computer Science 2025-09-15 Tong Zheng, Hongming Zhang, Wenhao Yu, Xiaoyang Wang, Runpeng Dai, Rui Liu, Huiwen Bao, Chengsong Huang, Heng Huang, Dong Yu

Cross-lingual Pre-training Based Transfer for Zero-shot Neural Machine Translation

Transfer learning between different language pairs has shown its effectiveness for Neural Machine Translation (NMT) in low-resource scenario. However, existing transfer methods involving a common target language are far from success in the…

Computation and Language · Computer Science 2019-12-04 Baijun Ji, Zhirui Zhang, Xiangyu Duan, Min Zhang, Boxing Chen, Weihua Luo