
Related papers: Private Text Classification

200 papers

Classification of personal text messages has many useful applications in surveillance, e-commerce, and mental health care, to name a few. Giving applications access to personal texts can easily lead to (un)intentional privacy violations. We…

Cryptography and Security · Computer Science 2021-03-15 Devin Reich, Ariel Todoki, Rafael Dowsley, Martine De Cock, Anderson C. A. Nascimento

Embeddings, which compress information in raw text into semantics-preserving low-dimensional vectors, have been widely adopted for their efficacy. However, recent research has shown that embeddings can potentially leak private information…

Computation and Language · Computer Science 2022-10-07 Garam Lee, Minsoo Kim, Jai Hyun Park, Seung-won Hwang, Jung Hee Cheon

We address the problem of how to "obfuscate" texts by removing stylistic clues which can identify authorship, whilst preserving (as much as possible) the content of the text. In this paper we combine ideas from "generalised differential…

Cryptography and Security · Computer Science 2019-02-06 Natasha Fernandes, Mark Dras, Annabelle McIver

Using language models as a remote service entails sending private information to an untrusted provider. In addition, potential eavesdroppers can intercept the messages, thereby exposing the information. In this work, we explore the…

Computation and Language · Computer Science 2024-07-04 Re'em Harel, Yair Elboher, Yuval Pinter

An important use of private data is to build machine learning classifiers. While there is a burgeoning literature on differentially private classification algorithms, we find that they are not practical in real applications due to two…

Machine Learning · Computer Science 2014-11-24 Ben Stoddard, Yan Chen, Ashwin Machanavajjhala

In this work, we investigate binary classification under the constraints of both differential privacy and fairness. We first propose an algorithm based on the decoupling technique for learning a classifier with only fairness guarantee. This…

Machine Learning · Computer Science 2024-05-21 Hrad Ghoukasian, Shahab Asoodeh

As the issues of privacy and trust are receiving increasing attention within the research community, various attempts have been made to anonymize textual data. A significant subset of these approaches incorporate differentially private…

Cryptography and Security · Computer Science 2022-05-05 Justus Mattern, Benjamin Weggenmann, Florian Kerschbaum

Written text often provides sufficient clues to identify the author, their gender, age, and other important attributes. Consequently, the authorship of training and evaluation corpora can have unforeseen impacts, including differing model…

Computation and Language · Computer Science 2018-05-17 Yitong Li, Timothy Baldwin, Trevor Cohn

Privacy is an important concern when building statistical models on data containing personal information. Differential privacy offers a strong definition of privacy and can be used to solve several privacy concerns (Dwork et al., 2014).…

Cryptography and Security · Computer Science 2021-02-03 Satyapriya Krishna, Rahul Gupta, Christophe Dupuy

With the use of personal devices connected to the Internet for tasks such as searches and shopping becoming ubiquitous, ensuring the privacy of the users of such services has become a requirement in order to build and maintain customer…

Cryptography and Security · Computer Science 2021-07-19 Ricardo Silva Carvalho, Theodore Vasiloudis, Oluwaseyi Feyisetan

Text classification has become widely used in various natural language processing applications like sentiment analysis. Current applications often use large transformer-based language models to classify input texts. However, there is a lack…

Computation and Language · Computer Science 2022-09-22 Ruisi Zhang, Seira Hidano, Farinaz Koushanfar

Recent data-extraction attacks have exposed that language models can memorize some training samples verbatim. This is a vulnerability that can compromise the privacy of the model's training data. In this work, we introduce SubMix: a…

Machine Learning · Computer Science 2022-01-05 Antonio Ginart, Laurens van der Maaten, James Zou, Chuan Guo

We consider the binary classification problem in a setup that preserves the privacy of the original sample. We provide a privacy mechanism that is locally differentially private and then construct a classifier based on the private sample…

Statistics Theory · Mathematics 2019-12-11 Thomas Berrett, Cristina Butucea

This article deals with adversarial attacks towards deep learning systems for Natural Language Processing (NLP), in the context of privacy protection. We study a specific type of attack: an attacker eavesdrops on the hidden representations…

Computation and Language · Computer Science 2018-08-29 Maximin Coavoux, Shashi Narayan, Shay B. Cohen

Organisations disclose their privacy practices by posting privacy policies on their website. Even though users often care about their digital privacy, they often don't read privacy policies since they require a significant investment in…

Information Retrieval · Computer Science 2024-04-02 Mukund Srinath, Shomir Wilson, C. Lee Giles

Texts convey sophisticated knowledge. However, texts also convey sensitive information. Despite the success of general-purpose language models and domain-specific mechanisms with differential privacy (DP), existing text sanitization…

Computation and Language · Computer Science 2021-06-03 Xiang Yue, Minxin Du, Tianhao Wang, Yaliang Li, Huan Sun, Sherman S. M. Chow

State-of-the-art important passage retrieval methods obtain very good results, but do not take into account privacy issues. In this paper, we present a privacy preserving method that relies on creating secure representations of documents.…

Hierarchical text classification consists in classifying text documents into a hierarchy of classes and sub-classes. Although artificial neural networks have proved useful to perform this task, unfortunately they can leak training data…

Cryptography and Security · Computer Science 2021-12-10 Dominik Wunderlich, Daniel Bernau, Francesco Aldà, Javier Parra-Arnau, Thorsten Strufe

The growing use of large language models has increased interest in sharing textual data in a privacy-preserving manner. One prominent line of work addresses this challenge through text rewriting under Local Differential Privacy (LDP), where…

Cryptography and Security · Computer Science 2026-03-25 Weijun Li, Arnaud Grivet Sébert, Qiongkai Xu, Annabelle McIver, Mark Dras

Contextual word representations generated by language models (LMs) learn spurious associations present in the training corpora. Recent findings reveal that adversaries can exploit these associations to reverse-engineer the private…

Computation and Language · Computer Science 2021-12-08 Geetanjali Bihani