Related papers: Pretrained Generative Language Models as General L…

200 papers

Fine-tuning pre-trained generative language models to down-stream language generation tasks has shown promising results. However, this comes with the cost of having a single, large model for each task, which is not ideal in low-memory/power…

Computation and Language · Computer Science 2020-09-22 Zhaojiang Lin, Andrea Madotto, Pascale Fung

Recent pretrained language models extend from millions to billions of parameters. Thus the need to fine-tune an extremely large pretrained model with a limited training corpus arises in various downstream tasks. In this paper, we propose a…

Computation and Language · Computer Science 2021-09-14 Runxin Xu, Fuli Luo, Zhiyuan Zhang, Chuanqi Tan, Baobao Chang, Songfang Huang, Fei Huang

Foundation models have received much attention due to their effectiveness across a broad range of downstream applications. Though there is a big convergence in terms of architecture, most pretrained models are typically still developed for…

Computation and Language · Computer Science 2022-06-14 Yaru Hao, Haoyu Song, Li Dong, Shaohan Huang, Zewen Chi, Wenhui Wang, Shuming Ma, Furu Wei

Pre-trained language models have recently emerged as a powerful tool for fine-tuning a variety of language tasks. Ideally, when models are pre-trained on a large amount of data, they are expected to gain implicit knowledge. In this paper, we…

Computation and Language · Computer Science 2023-06-22 Mohamad Ballout, Ulf Krumnack, Gunther Heidemann, Kai-Uwe Kühnberger

Language model (LM) pre-training is useful in many language processing tasks. But can pre-trained LMs be further leveraged for more general machine learning problems? We propose an approach for using LMs to scaffold learning and…

Sequence-to-sequence learning with neural networks has become the de facto standard for sequence prediction tasks. This approach typically models the local distribution over the next word with a powerful neural network that can condition on…

Computation and Language · Computer Science 2021-11-17 Yoon Kim

Continued pre-training of small language models offers a promising path for domain adaptation with limited computational resources. I've investigated this approach within educational domains, evaluating it as a resource-efficient…

Computation and Language · Computer Science 2025-04-15 Salman Faroz

Generative models have gained more and more attention in recent years for their remarkable success in tasks that required estimating and sampling data distribution to generate high-fidelity synthetic data. In speech, text-to-speech…

Audio and Speech Processing · Electrical Eng. & Systems 2024-03-27 Alexander H. Liu, Matt Le, Apoorv Vyas, Bowen Shi, Andros Tjandra, Wei-Ning Hsu

The use of large pretrained neural networks to create contextualized word embeddings has drastically improved performance on several natural language processing (NLP) tasks. These computationally expensive models have begun to be applied to…

Computers and Society · Computer Science 2019-12-03 Benjamin Clavié, Kobi Gal

Large language models (LLMs) are a basic infrastructure for modern natural language processing. Many commercial and open-source LLMs exist for English, e.g., ChatGPT, Llama, Falcon, and Mistral. As these models are trained on mostly English…

Computation and Language · Computer Science 2024-10-10 Domen Vreš, Martin Božič, Aljaž Potočnik, Tomaž Martinčič, Marko Robnik-Šikonja

Large pretrained language models (PLMs) are often domain- or task-adapted via fine-tuning or prompting. Fine-tuning requires modifying all of the parameters and having enough data to avoid overfitting, while prompting requires no training and…

Computation and Language · Computer Science 2022-07-11 Zejiang Hou, Julian Salazar, George Polovets

Large-scale language models (LMs) pretrained on massive corpora of text, such as GPT-2, are powerful open-domain text generators. However, as our systematic examination reveals, it is still challenging for such models to generate coherent…

Computation and Language · Computer Science 2021-04-15 Bowen Tan, Zichao Yang, Maruan Al-Shedivat, Eric P. Xing, Zhiting Hu

Aligning language models (LMs) with preferences is an important problem in natural language generation. A key challenge is that preferences are typically provided at the sequence level while LM training and generation both occur at the…

Computation and Language · Computer Science 2025-01-09 Shentao Yang, Shujian Zhang, Congying Xia, Yihao Feng, Caiming Xiong, Mingyuan Zhou

The recent surge of generative AI has been fueled by the generative power of diffusion probabilistic models and the scalable capabilities of large language models. Despite their potential, it remains elusive whether diffusion language…

Computation and Language · Computer Science 2025-02-25 Jiasheng Ye, Zaixiang Zheng, Yu Bao, Lihua Qian, Quanquan Gu

Pre-trained language models have shown remarkable success in improving various downstream NLP tasks due to their ability to capture dependencies in textual data and generate natural responses. In this paper, we leverage the power of…

Computation and Language · Computer Science 2020-06-30 Hung Le, Steven C. H. Hoi

We report a flexible language-model based deep learning strategy, applied here to solve complex forward and inverse problems in protein modeling, based on an attention neural network that integrates transformer and graph convolutional…

Biomolecules · Quantitative Biology 2023-10-20 Markus J. Buehler

Large Language Models (LLMs), trained on extensive web-scale corpora, have demonstrated remarkable abilities across diverse tasks, especially as they are scaled up. Nevertheless, even state-of-the-art models struggle in certain cases,…

Computation and Language · Computer Science 2025-01-16 Irina Bigoulaeva, Harish Tayyar Madabushi, Iryna Gurevych

Pre-trained models have achieved state-of-the-art results in various Natural Language Processing (NLP) tasks. Recent works such as T5 and GPT-3 have shown that scaling up pre-trained language models can improve their generalization…

Transformer-based language models pre-trained on large amounts of text data have proven remarkably successful in learning generic transferable linguistic representations. Here we study whether structural guidance leads to more human-like…

Computation and Language · Computer Science 2021-08-03 Peng Qian, Tahira Naseem, Roger Levy, Ramón Fernandez Astudillo

Solving symbolic mathematics has always been in the arena of human ingenuity that needs compositional reasoning and recurrence. However, recent studies have shown that large-scale language models such as transformers are universal and…

Machine Learning · Statistics 2023-03-15 Kimia Noorbakhsh, Modar Sulaiman, Mahdi Sharifi, Kallol Roy, Pooyan Jamshidi