Related papers: Parallel-SFT: Improving Zero-Shot Cross-Programmin…

200 papers

Large language models (LLMs) have achieved state-of-the-art performance in various software engineering tasks, including error detection, clone detection, and code translation, primarily leveraging high-resource programming languages like…

Computation and Language · Computer Science 2025-06-11 Razan Baltaji, Saurabh Pujar, Louis Mandel, Martin Hirzel, Luca Buratti, Lav Varshney

Zero-shot cross-lingual transfer is a central task in multilingual NLP, allowing models trained in languages with more sufficient training resources to generalize to other low-resource languages. Earlier efforts on this task use parallel…

Computation and Language · Computer Science 2023-09-21 Fei Wang, Kuan-Hao Huang, Kai-Wei Chang, Muhao Chen

Recently, although pre-trained language models have achieved great success on multilingual NLP (Natural Language Processing) tasks, the lack of training data on many tasks in low-resource languages still limits their performance. One…

Computation and Language · Computer Science 2023-10-10 Yuyang Zhang, Xiaofeng Han, Baojun Wang

Modern NLP applications have enjoyed a great boost utilizing neural networks models. Such deep neural models, however, are not applicable to most human languages due to the lack of annotated training data for various NLP tasks.…

Computation and Language · Computer Science 2019-06-06 Xilun Chen, Ahmed Hassan Awadallah, Hany Hassan, Wei Wang, Claire Cardie

Adapting large language models (LLMs) to new languages typically involves continual pre-training (CT) followed by supervised fine-tuning (SFT). However, this CT-then-SFT approach struggles with limited data in the context of low-resource…

Computation and Language · Computer Science 2025-02-10 Mingxu Tao, Chen Zhang, Quzhe Huang, Tianyao Ma, Songfang Huang, Dongyan Zhao, Yansong Feng

Adapting large language models to other languages typically employs supervised fine-tuning (SFT) as a standard approach. However, it often suffers from an overemphasis on English performance, a phenomenon that is especially pronounced in…

Computation and Language · Computer Science 2025-05-21 Jungseob Lee, Seongtae Hong, Hyeonseok Moon, Heuiseok Lim

Recent multilingual pretrained language models (mPLMs) have been shown to encode strong language-specific signals, which are not explicitly provided during pretraining. It remains an open question whether it is feasible to employ mPLMs to…

Computation and Language · Computer Science 2024-07-08 Peiqin Lin, Chengzhi Hu, Zheyu Zhang, André F. T. Martins, Hinrich Schütze

Automatic code generation has been a longstanding research topic. With the advancement of general-purpose large language models (LLMs), the ability to code stands out as one important measure to the model's reasoning performance. Usually, a…

Software Engineering · Computer Science 2024-12-18 Jie Chen, Xintian Han, Yu Ma, Xun Zhou, Liang Xiang

Prompt tuning (PT) is a promising parameter-efficient method to utilize extremely large pre-trained language models (PLMs), which can achieve comparable performance to full-parameter fine-tuning by only tuning a few soft prompts. However,…

Computation and Language · Computer Science 2023-12-19 Yusheng Su, Xiaozhi Wang, Yujia Qin, Chi-Min Chan, Yankai Lin, Huadong Wang, Kaiyue Wen, Zhiyuan Liu, Peng Li, Juanzi Li, Lei Hou, Maosong Sun, Jie Zhou

In zero-shot cross-lingual transfer, a supervised NLP task trained on a corpus in one language is directly applicable to another language without any additional training. A source of cross-lingual transfer can be as straightforward as…

Computation and Language · Computer Science 2021-01-27 Hyunjin Choi, Judong Kim, Seongho Joe, Seungjai Min, Youngjune Gwon

Despite the recent popularity of Large Language Models (LLMs) in Machine Translation (MT), their performance in low-resource languages (LRLs) still lags significantly behind Neural Machine Translation (NMT) models. In this work, we explore…

Computation and Language · Computer Science 2024-10-07 Vivek Iyer, Bhavitvya Malik, Pavel Stepachev, Pinzhen Chen, Barry Haddow, Alexandra Birch

The use of Large Language Models (LLMs) for program code generation has gained substantial attention, but their biases and limitations with non-English prompts challenge global inclusivity. This paper investigates the complexities of…

Computation and Language · Computer Science 2025-05-13 Mingda Li, Abhijit Mishra, Utkarsh Mujumdar

Zero-shot cross-lingual transfer by fine-tuning multilingual pretrained models shows promise for low-resource languages, but often suffers from misalignment of internal representations between languages. We hypothesize that even when the…

Computation and Language · Computer Science 2024-09-18 Ryokan Ri, Shun Kiyono, Sho Takase

Recent advancements in Reinforcement Post-Training (RPT) have significantly enhanced the capabilities of Large Reasoning Models (LRMs), sparking increased interest in the generalization of RL-based reasoning. While existing work has…

Computation and Language · Computer Science 2025-10-03 Wen Yang, Junhong Wu, Chong Li, Chengqing Zong, Jiajun Zhang

Large Language Models (LLMs) are powerful but often too slow and costly for real-world use during inference. Looped transformers save on parameters by reusing the same weights for multiple computational steps, or "loops." However, this…

Computation and Language · Computer Science 2025-10-30 Bohong Wu, Mengzhao Chen, Xiang Luo, Shen Yan, Qifan Yu, Fan Xia, Tianqi Zhang, Hongrui Zhan, Zheng Zhong, Xun Zhou, Siyuan Qiao, Xingyan Bin

Large language models (LLMs) have shown remarkable abilities in diverse natural language processing (NLP) tasks. The LLMs generally undergo supervised fine-tuning (SFT) followed by preference alignment to be usable in downstream…

Computation and Language · Computer Science 2024-06-27 Shiva Kumar Pentyala, Zhichao Wang, Bin Bi, Kiran Ramnath, Xiang-Bo Mao, Regunathan Radhakrishnan, Sitaram Asur, Na Cheng

Learning what to share between tasks has been a topic of great importance recently, as strategic sharing of knowledge has been shown to improve downstream task performance. This is particularly important for multilingual applications, as…

Computation and Language · Computer Science 2020-10-06 Farhad Nooralahzadeh, Giannis Bekoulis, Johannes Bjerva, Isabelle Augenstein

Nowadays, training end-to-end neural models for spoken language translation (SLT) still has to confront with extreme data scarcity conditions. The existing SLT parallel corpora are indeed orders of magnitude smaller than those available for…

Computation and Language · Computer Science 2019-10-09 Mattia Antonino Di Gangi, Matteo Negri, Marco Turchi

Parallel thinking has emerged as a novel approach for enhancing the reasoning capabilities of large language models (LLMs) by exploring multiple reasoning paths concurrently. However, activating such capabilities through training remains…

Computation and Language · Computer Science 2025-09-15 Tong Zheng, Hongming Zhang, Wenhao Yu, Xiaoyang Wang, Runpeng Dai, Rui Liu, Huiwen Bao, Chengsong Huang, Heng Huang, Dong Yu

Transfer learning between different language pairs has shown its effectiveness for Neural Machine Translation (NMT) in low-resource scenario. However, existing transfer methods involving a common target language are far from success in the…

Computation and Language · Computer Science 2019-12-04 Baijun Ji, Zhirui Zhang, Xiangyu Duan, Min Zhang, Boxing Chen, Weihua Luo