Related papers: Augmenting Parameter-Efficient Pre-trained Languag…

200 papers

Large language models (LLMs) show best-in-class performance across a wide range of natural language processing applications. Training these models is an extremely computationally expensive task; frontier Artificial Intelligence (AI)…

Distributed, Parallel, and Cluster Computing · Computer Science 2025-10-10 Alexander Interrante-Grant, Carla Varela-Rosa, Suhaas Narayan, Chris Connelly, Albert Reuther

Language model pre-training has proven to be useful in many language understanding tasks. In this paper, we investigate whether it is still helpful to add the self-training method in the pre-training step and the fine-tuning step. Towards…

Computation and Language · Computer Science 2023-02-17 Tong Guo

Fine-tuning large language models (LLMs) with limited data poses a practical challenge in low-resource languages, specialized domains, and constrained deployment settings. While pre-trained LLMs provide strong foundations, effective…

Computation and Language · Computer Science 2025-10-29 Marton Szep, Daniel Rueckert, Rüdiger von Eisenhart-Rothe, Florian Hinterwimmer

Large Language Models (LLMs) are transforming cybersecurity by enabling intelligent, adaptive, and automated approaches to threat detection, vulnerability assessment, and incident response. With their advanced language understanding and…

Cryptography and Security · Computer Science 2025-07-21 Niveen O. Jaffal, Mohammed Alkhanafseh, David Mohaisen

Large pre-trained language models (PLMs) have demonstrated strong performance on natural language understanding (NLU) tasks through fine-tuning. However, fine-tuned models still suffer from overconfident predictions, especially in…

Computation and Language · Computer Science 2023-05-31 Guande He, Jianfei Chen, Jun Zhu

Large Language Models (LLMs) have quickly risen to prominence due to their ability to perform at or close to the state-of-the-art in a variety of fields while handling natural language. An important field of research is the application of…

Cryptography and Security · Computer Science 2024-02-28 Gabriel de Jesus Coelho da Silva, Carlos Becker Westphall

Safety for Large Language Models (LLMs) has been an ongoing research focus since their emergence and is even more relevant nowadays with the increasing capacity of those models. Currently, there are several guardrails in place for all…

Computation and Language · Computer Science 2025-12-25 Eduard Stefan Dinuta, Iustin Sirbu, Traian Rebedea

Fueled by their remarkable ability to tackle diverse tasks across multiple domains, large language models (LLMs) have grown at an unprecedented rate, with some recent models containing trillions of parameters. This growth is accompanied by…

Machine Learning · Computer Science 2025-05-30 Athanasios Glentis, Jiaxiang Li, Qiulin Shang, Andi Han, Ioannis Tsaknakis, Quan Wei, Mingyi Hong

Language models (LMs) have demonstrated remarkable capabilities in NLP, yet adapting them efficiently and robustly to specific tasks remains challenging. As their scale and complexity grow, fine-tuning LMs on labelled data often…

Computation and Language · Computer Science 2025-06-27 Zhengyan Shi

Language is essentially a complex, intricate system of human expressions governed by grammatical rules. It poses a significant challenge to develop capable AI algorithms for comprehending and grasping a language. As a major approach,…

The use of Large Language Models (LLM) by providers of cybersecurity and digital infrastructures of all kinds is an ongoing development. It is suggested and on an experimental basis used to write the code for the systems, and potentially…

Computers and Society · Computer Science 2025-12-19 Kaspar Rosager Ludvigsen

Large language models are trained on massive scrapes of the web, as required by current scaling laws. Most progress is made for English, given its abundance of high-quality pretraining data. For most other languages, however, such high…

Computation and Language · Computer Science 2025-02-07 Skyler Seto, Maartje ter Hoeve, Richard He Bai, Natalie Schluter, David Grangier

The pre-training phase of language models often begins with randomly initialized parameters. With the current trends in scaling models, training their large number of parameters can be extremely slow and costly. In contrast, small language…

Computation and Language · Computer Science 2024-09-23 Mohammad Samragh, Iman Mirzadeh, Keivan Alizadeh Vahid, Fartash Faghri, Minsik Cho, Moin Nabi, Devang Naik, Mehrdad Farajtabar

Large Language Models (LLMs) have become a milestone in the field of artificial intelligence and natural language processing. However, their large-scale deployment remains constrained by the need for significant computational resources.…

Computation and Language · Computer Science 2025-08-07 Julián Camilo Velandia Gutiérrez

Because state-of-the-art language models are expensive to train, most practitioners must make use of one of the few publicly available language models or language model APIs. This consolidation of trust increases the potency of backdoor…

Cryptography and Security · Computer Science 2023-07-28 Nikhil Kandpal, Matthew Jagielski, Florian Tramèr, Nicholas Carlini

The recent surge of generative AI has been fueled by the generative power of diffusion probabilistic models and the scalable capabilities of large language models. Despite their potential, it remains elusive whether diffusion language…

Computation and Language · Computer Science 2025-02-25 Jiasheng Ye, Zaixiang Zheng, Yu Bao, Lihua Qian, Quanquan Gu

Language modelling provides a step towards intelligent communication systems by harnessing large repositories of written human knowledge to better predict and understand the world. In this paper, we present an analysis of Transformer-based…

Fine-tuning pre-trained generative language models to down-stream language generation tasks has shown promising results. However, this comes with the cost of having a single, large model for each task, which is not ideal in low-memory/power…

Computation and Language · Computer Science 2020-09-22 Zhaojiang Lin, Andrea Madotto, Pascale Fung

Model merging has emerged as a promising technique for enhancing large language models, though its application in large-scale pre-training remains relatively unexplored. In this paper, we present a comprehensive investigation of model…

With the rapid development and widespread use of advanced network systems, software vulnerabilities pose a significant threat to secure communications and networking. Learning-based vulnerability detection systems, particularly those…

Cryptography and Security · Computer Science 2024-10-04 Weiliang Qi, Jiahao Cao, Darsh Poddar, Sophia Li, Xinda Wang