Related papers: Do Language Models Plagiarize?

200 papers

Recent studies have raised concerns about the potential threats large language models (LLMs) pose to academic integrity and copyright protection. Yet, their investigation is predominantly focused on literal copies of original texts. Also,…

Computation and Language · Computer Science 2025-02-18 Jooyoung Lee, Toshini Agrawal, Adaku Uchendu, Thai Le, Jinghui Chen, Dongwon Lee

Large language models (LMs) have been shown to memorize parts of their training data, and when prompted appropriately, they will emit the memorized training data verbatim. This is undesirable because memorization violates privacy (exposing…

Machine Learning · Computer Science 2023-03-07 Nicholas Carlini, Daphne Ippolito, Matthew Jagielski, Katherine Lee, Florian Tramer, Chiyuan Zhang

To enhance the quality of generated stories, recent story generation models have been investigating the utilization of higher-level attributes like plots or commonsense knowledge. The application of prompt-based learning with large language…

Computation and Language · Computer Science 2023-07-25 Zhuohan Xie, Trevor Cohn, Jey Han Lau

While recent research increasingly showcases the remarkable capabilities of Large Language Models (LLMs), it is equally crucial to examine their associated risks. Among these, privacy and security vulnerabilities are particularly…

Computation and Language · Computer Science 2026-01-21 Ali Satvaty, Suzan Verberne, Fatih Turkmen

In light of recent legal allegations brought by publishers, newspapers, and other creators of copyrighted corpora against large language model developers who use their copyrighted materials for training or fine-tuning purposes, we propose a…

Computation and Language · Computer Science 2024-08-05 Devam Mondal, Carlo Lipizzi

Large Language Models (LLMs) are advancing at a remarkable pace, with myriad applications under development. Unlike most earlier machine learning models, they are no longer built for one specific application but are designed to excel in a…

Computation and Language · Computer Science 2023-10-31 Valentin Hartmann, Anshuman Suri, Vincent Bindschaedler, David Evans, Shruti Tople, Robert West

Recently, large language models such as GPT-2 have shown themselves to be extremely adept at text generation and have also been able to achieve high-quality results in many downstream NLP tasks such as text classification, sentiment…

Computation and Language · Computer Science 2019-11-22 Sam Witteveen, Martin Andrews

Large Language Models (LLMs) are claimed to be capable of Natural Language Inference (NLI), necessary for applied tasks like question answering and summarization. We present a series of behavioral studies on several LLM families (LLaMA,…

Computation and Language · Computer Science 2023-10-24 Nick McKenna, Tianyi Li, Liang Cheng, Mohammad Javad Hosseini, Mark Johnson, Mark Steedman

The rapid progress of Natural Language Processing (NLP) technologies has led to the widespread availability and effectiveness of text generation tools such as ChatGPT and Claude. While highly useful, these technologies also pose significant…

Computation and Language · Computer Science 2024-10-10 Chao Zhou, Cheng Qiu, Lizhen Liang, Daniel E. Acuna

Large language models (LLMs) have shown great capabilities in various tasks but also exhibited memorization of training data, raising tremendous privacy and copyright concerns. While prior works have studied memorization during…

Artificial Intelligence · Computer Science 2024-02-26 Shenglai Zeng, Yaxin Li, Jie Ren, Yiding Liu, Han Xu, Pengfei He, Yue Xing, Shuaiqiang Wang, Jiliang Tang, Dawei Yin

State-of-the-art language models (LMs) are notoriously susceptible to generating hallucinated information. Such inaccurate outputs not only undermine the reliability of these models but also limit their use and raise serious concerns about…

Computation and Language · Computer Science 2024-03-21 Ayush Agrawal, Mirac Suzgun, Lester Mackey, Adam Tauman Kalai

Citation practices are crucial in shaping the structure of scientific knowledge, yet they are often influenced by contemporary norms and biases. The emergence of Large Language Models (LLMs) introduces a new dynamic to these practices.…

Digital Libraries · Computer Science 2024-08-27 Andres Algaba, Carmen Mazijn, Vincent Holst, Floriano Tori, Sylvia Wenmackers, Vincent Ginis

Pre-trained Language Models (PLMs) have shown impressive results in various Natural Language Generation (NLG) tasks, such as powering chatbots and generating stories. However, an ethical concern arises due to their potential to produce…

Computation and Language · Computer Science 2024-06-04 Kaixin Lan, Tao Fang, Derek F. Wong, Yabo Xu, Lidia S. Chao, Cecilia G. Zhao

The recent success of large language models for text generation poses a severe threat to academic integrity, as plagiarists can generate realistic paraphrases indistinguishable from original work. However, the role of large autoregressive…

Computation and Language · Computer Science 2024-02-09 Jan Philip Wahle, Terry Ruas, Frederic Kirstein, Bela Gipp

Language models may memorize more than just facts, including entire chunks of texts seen during training. Fair use exemptions to copyright laws typically allow for limited use of copyrighted material without permission from the copyright…

Computation and Language · Computer Science 2023-10-24 Antonia Karamolegkou, Jiaang Li, Li Zhou, Anders Søgaard

Dominant pre-trained language models (PLMs) have demonstrated the potential risk of memorizing and outputting the training data. While this concern has been discussed mainly in English, it is also practically important to focus on…

Computation and Language · Computer Science 2024-08-16 Shotaro Ishihara, Hiromu Takahashi

Large Language Models (LLMs) have demonstrated remarkable capabilities across a wide range of tasks, yet they also exhibit memorization of their training data. This phenomenon raises critical questions about model behavior, privacy risks,…

Machine Learning · Computer Science 2025-12-15 Alexander Xiong, Xuandong Zhao, Aneesh Pappu, Dawn Song

Memorization in large language models (LLMs) is a growing concern. LLMs have been shown to easily reproduce parts of their training data, including copyrighted work. This is an important problem to solve, as it may violate existing…

Computation and Language · Computer Science 2024-11-19 Felix B Mueller, Rebekka Görge, Anna K Bernzen, Janna C Pirk, Maximilian Poretschkin

Language enables humans to share knowledge, reason about the world, and pass on strategies for survival and innovation across generations. At the heart of this process is not just the ability to communicate but also the remarkable…

Computation and Language · Computer Science 2026-02-25 Jan Philip Wahle

In recent years, Large Language Models (LLMs) have gained significant popularity due to their ability to generate human-like text and their potential applications in various fields, such as Software Engineering. LLMs for Code are commonly…

Software Engineering · Computer Science 2023-03-01 Ali Al-Kaswan, Maliheh Izadi