Related papers: Unsupervised Data Validation Methods for Efficient…

200 papers

Deep neural networks and huge language models are becoming omnipresent in natural language applications. As they are known for requiring large amounts of training data, there is a growing body of work to improve the performance in…

Computation and Language · Computer Science 2021-04-12 Michael A. Hedderich, Lukas Lange, Heike Adel, Jannik Strötgen, Dietrich Klakow

Recent studies show that large language models (LLMs) are powerful tools for working with natural language, bringing advances in many areas of computational linguistics. However, these models face challenges when applied to low-resource…

Computation and Language · Computer Science 2024-12-09 Zhaojun Ding, Zhengliang Liu, Hanqi Jiang, Yizhu Gao, Xiaoming Zhai, Tianming Liu, Ninghao Liu

The advent of Large Language Models (LLMs) has significantly advanced the field of automated code generation. LLMs rely on large and diverse datasets to learn syntax, semantics, and usage patterns of programming languages. For low-resource…

Software Engineering · Computer Science 2025-02-03 Alessandro Giagnorio, Alberto Martin-Lopez, Gabriele Bavota

Large language models (LLMs) demonstrate unprecedented capabilities and define the state of the art for almost all natural language processing (NLP) tasks and also for essentially all Language Technology (LT) applications. LLMs can only be…

Computation and Language · Computer Science 2025-02-19 Georg Rehm, Annika Grützner-Zahn, Fabio Barth

The performance of NLP methods for severely under-resourced languages cannot currently hope to match the state of the art in NLP methods for well resourced languages. We explore the extent to which pretrained large language models (LLMs)…

Computation and Language · Computer Science 2024-02-20 Michela Lorandi, Anya Belz

The advent of deep learning has led to a significant gain in machine translation. However, most of the studies required a large parallel dataset which is scarce and expensive to construct and even unavailable for some languages. This paper…

Computation and Language · Computer Science 2023-04-04 Viet H. Pham, Thang M. Pham, Giang Nguyen, Long Nguyen, Dien Dinh

Language models (LMs) have demonstrated remarkable capabilities in NLP, yet adapting them efficiently and robustly to specific tasks remains challenging. As their scale and complexity grow, fine-tuning LMs on labelled data often…

Computation and Language · Computer Science 2025-06-27 Zhengyan Shi

Fine-tuning large language models (LLMs) with limited data poses a practical challenge in low-resource languages, specialized domains, and constrained deployment settings. While pre-trained LLMs provide strong foundations, effective…

Computation and Language · Computer Science 2025-10-29 Marton Szep, Daniel Rueckert, Rüdiger von Eisenhart-Rothe, Florian Hinterwimmer

Post-training of Large Language Models (LLMs) is crucial for unlocking their task generalization potential and domain-specific capabilities. However, the current LLM post-training paradigm faces significant data challenges, including the…

Computation and Language · Computer Science 2025-10-31 Junyu Luo, Bohan Wu, Xiao Luo, Zhiping Xiao, Yiqiao Jin, Rong-Cheng Tu, Nan Yin, Yifan Wang, Jingyang Yuan, Wei Ju, Ming Zhang

In recent years, large language models (LLMs) have achieved remarkable success in natural language processing (NLP). LLMs require an extremely large number of parameters to attain high performance. As models grow into the trillion-parameter range,…

Computation and Language · Computer Science 2024-09-10 Zhyar Rzgar K Rostam, Sándor Szénási, Gábor Kertész

Real-world applications of natural language processing (NLP) are challenging. NLP models rely heavily on supervised machine learning and require large amounts of annotated data. These resources are often based on language data available in…

Computation and Language · Computer Science 2020-11-10 Farhad Nooralahzadeh

Large Language Models (LLMs) have become a milestone in the field of artificial intelligence and natural language processing. However, their large-scale deployment remains constrained by the need for significant computational resources.…

Computation and Language · Computer Science 2025-08-07 Julián Camilo Velandia Gutiérrez

Learned metrics such as BLEURT have in recent years become widely employed to evaluate the quality of machine translation systems. Training such metrics requires data which can be expensive and difficult to acquire, particularly for…

Computation and Language · Computer Science 2023-02-08 Amirkeivan Mohtashami, Mauro Verzetti, Paul K. Rubenstein

Text-to-Speech (TTS) synthesis using deep learning relies on voice quality. Modern TTS models are advanced, but they need large amounts of data. Given the growing computational complexity of these models and the scarcity of large,…

Sound · Computer Science 2023-10-10 Ze Liu

Recent speech technologies produce high-quality synthesised speech thanks to advances in neural Text to Speech (TTS). However, such TTS models depend on extensive amounts of data that can be costly to produce and is hardly…

Computation and Language · Computer Science 2024-09-04 Asma Amalas, Mounir Ghogho, Mohamed Chetouani, Rachid Oulad Haj Thami

Neural machine translation (NMT) approaches have improved the state of the art in many machine translation settings over the last couple of years, but they require large amounts of training data to produce sensible output. We demonstrate…

Computation and Language · Computer Science 2017-08-22 Robert Östling, Jörg Tiedemann

The increasing size and complexity of pre-trained language models have demonstrated superior performance in many applications, but they usually require large training datasets to be adequately trained. Insufficient training sets could…

Computation and Language · Computer Science 2025-02-03 Yaping Chai, Haoran Xie, Joe S. Qin

In recent years, pretrained neural language models (PNLMs) have taken the field of natural language processing by storm, achieving new benchmarks and state-of-the-art performances. These models often rely heavily on annotated data, which…

Computation and Language · Computer Science 2023-02-06 Hoang Van

Generative language modelling has surged in popularity with the emergence of services such as ChatGPT and Google Gemini. While these models have demonstrated transformative potential in productivity and communication, they overwhelmingly…

Computation and Language · Computer Science 2025-07-09 Josh McGiff, Nikola S. Nikolov

Natural language generation (NLG) is a critical component in conversational systems, owing to its role of formulating a correct and natural text response. Traditionally, NLG components have been deployed using template-based solutions.…
