Related papers: Private Text Classification

Privacy-Preserving Classification of Personal Text Messages with Secure Multi-Party Computation: An Application to Hate-Speech Detection

Classification of personal text messages has many useful applications in surveillance, e-commerce, and mental health care, to name a few. Giving applications access to personal texts can easily lead to (un)intentional privacy violations. We…

Cryptography and Security · Computer Science 2021-03-15 Devin Reich, Ariel Todoki, Rafael Dowsley, Martine De Cock, Anderson C. A. Nascimento

Privacy-Preserving Text Classification on BERT Embeddings with Homomorphic Encryption

Embeddings, which compress information in raw text into semantics-preserving low-dimensional vectors, have been widely adopted for their efficacy. However, recent research has shown that embeddings can potentially leak private information…

Computation and Language · Computer Science 2022-10-07 Garam Lee, Minsoo Kim, Jai Hyun Park, Seung-won Hwang, Jung Hee Cheon

Generalised Differential Privacy for Text Document Processing

We address the problem of how to "obfuscate" texts by removing stylistic clues which can identify authorship, whilst preserving (as much as possible) the content of the text. In this paper we combine ideas from "generalised differential…

Cryptography and Security · Computer Science 2019-02-06 Natasha Fernandes, Mark Dras, Annabelle McIver

Protecting Privacy in Classifiers by Token Manipulation

Using language models as a remote service entails sending private information to an untrusted provider. In addition, potential eavesdroppers can intercept the messages, thereby exposing the information. In this work, we explore the…

Computation and Language · Computer Science 2024-07-04 Re'em Harel, Yair Elboher, Yuval Pinter

Differentially Private Algorithms for Empirical Machine Learning

An important use of private data is to build machine learning classifiers. While there is a burgeoning literature on differentially private classification algorithms, we find that they are not practical in real applications due to two…

Machine Learning · Computer Science 2014-11-24 Ben Stoddard, Yan Chen, Ashwin Machanavajjhala

Differentially Private Fair Binary Classifications

In this work, we investigate binary classification under the constraints of both differential privacy and fairness. We first propose an algorithm based on the decoupling technique for learning a classifier with only fairness guarantee. This…

Machine Learning · Computer Science 2024-05-21 Hrad Ghoukasian, Shahab Asoodeh

The Limits of Word Level Differential Privacy

As the issues of privacy and trust are receiving increasing attention within the research community, various attempts have been made to anonymize textual data. A significant subset of these approaches incorporate differentially private…

Cryptography and Security · Computer Science 2022-05-05 Justus Mattern, Benjamin Weggenmann, Florian Kerschbaum

Towards Robust and Privacy-preserving Text Representations

Written text often provides sufficient clues to identify the author, their gender, age, and other important attributes. Consequently, the authorship of training and evaluation corpora can have unforeseen impacts, including differing model…

Computation and Language · Computer Science 2018-05-17 Yitong Li, Timothy Baldwin, Trevor Cohn

ADePT: Auto-encoder based Differentially Private Text Transformation

Privacy is an important concern when building statistical models on data containing personal information. Differential privacy offers a strong definition of privacy and can be used to solve several privacy concerns (Dwork et al., 2014).…

Cryptography and Security · Computer Science 2021-02-03 Satyapriya Krishna, Rahul Gupta, Christophe Dupuy

BRR: Preserving Privacy of Text Data Efficiently on Device

With the use of personal devices connected to the Internet for tasks such as searches and shopping becoming ubiquitous, ensuring the privacy of the users of such services has become a requirement in order to build and maintain customer…

Cryptography and Security · Computer Science 2021-07-19 Ricardo Silva Carvalho, Theodore Vasiloudis, Oluwaseyi Feyisetan

Text Revealer: Private Text Reconstruction via Model Inversion Attacks against Transformers

Text classification has become widely used in various natural language processing applications like sentiment analysis. Current applications often use large transformer-based language models to classify input texts. However, there is a lack…

Computation and Language · Computer Science 2022-09-22 Ruisi Zhang, Seira Hidano, Farinaz Koushanfar

SubMix: Practical Private Prediction for Large-Scale Language Models

Recent data-extraction attacks have exposed that language models can memorize some training samples verbatim. This is a vulnerability that can compromise the privacy of the model's training data. In this work, we introduce SubMix: a…

Machine Learning · Computer Science 2022-01-05 Antonio Ginart, Laurens van der Maaten, James Zou, Chuan Guo

Classification under local differential privacy

We consider the binary classification problem in a setup that preserves the privacy of the original sample. We provide a privacy mechanism that is locally differentially private and then construct a classifier based on the private sample…

Statistics Theory · Mathematics 2019-12-11 Thomas Berrett, Cristina Butucea

Privacy-preserving Neural Representations of Text

This article deals with adversarial attacks towards deep learning systems for Natural Language Processing (NLP), in the context of privacy protection. We study a specific type of attack: an attacker eavesdrops on the hidden representations…

Computation and Language · Computer Science 2018-08-29 Maximin Coavoux, Shashi Narayan, Shay B. Cohen

Privacy at Scale: Introducing the PrivaSeer Corpus of Web Privacy Policies

Organisations disclose their privacy practices by posting privacy policies on their website. Even though users often care about their digital privacy, they often don't read privacy policies since they require a significant investment in…

Information Retrieval · Computer Science 2024-04-02 Mukund Srinath, Shomir Wilson, C. Lee Giles

Differential Privacy for Text Analytics via Natural Text Sanitization

Texts convey sophisticated knowledge. However, texts also convey sensitive information. Despite the success of general-purpose language models and domain-specific mechanisms with differential privacy (DP), existing text sanitization…

Computation and Language · Computer Science 2021-06-03 Xiang Yue, Minxin Du, Tianhao Wang, Yaliang Li, Huan Sun, Sherman S. M. Chow

Privacy-Preserving Important Passage Retrieval

State-of-the-art important passage retrieval methods obtain very good results, but do not take into account privacy issues. In this paper, we present a privacy preserving method that relies on creating secure representations of documents.…

Information Retrieval · Computer Science 2016-08-10 Luis Marujo, José Portêlo, David Martins de Matos, João P. Neto, Anatole Gershman, Jaime Carbonell, Isabel Trancoso, Bhiksha Raj

On the privacy-utility trade-off in differentially private hierarchical text classification

Hierarchical text classification consists in classifying text documents into a hierarchy of classes and sub-classes. Although artificial neural networks have proved useful to perform this task, unfortunately they can leak training data…

Cryptography and Security · Computer Science 2021-12-10 Dominik Wunderlich, Daniel Bernau, Francesco Aldà, Javier Parra-Arnau, Thorsten Strufe

Beyond Theoretical Bounds: Empirical Privacy Loss Calibration for Text Rewriting Under Local Differential Privacy

The growing use of large language models has increased interest in sharing textual data in a privacy-preserving manner. One prominent line of work addresses this challenge through text rewriting under Local Differential Privacy (LDP), where…

Cryptography and Security · Computer Science 2026-03-25 Weijun Li, Arnaud Grivet Sébert, Qiongkai Xu, Annabelle McIver, Mark Dras

Interpretable Privacy Preservation of Text Representations Using Vector Steganography

Contextual word representations generated by language models (LMs) learn spurious associations present in the training corpora. Recent findings reveal that adversaries can exploit these associations to reverse-engineer the private…

Computation and Language · Computer Science 2021-12-08 Geetanjali Bihani