Related papers: Reliability Testing for Natural Language Processin…

On the Interplay between Fairness and Explainability

In order to build reliable and trustworthy NLP applications, models need to be both fair across different demographics and explainable. Usually these two objectives, fairness and explainability, are optimized and/or examined independently…

Computation and Language · Computer Science 2023-11-14 Stephanie Brandl, Emanuele Bugliarello, Ilias Chalkidis

Robust Natural Language Processing: Recent Advances, Challenges, and Future Directions

Recent natural language processing (NLP) techniques have accomplished high performance on benchmark datasets, primarily due to the significant improvement in the performance of deep learning. The advances in the research community have led…

Computation and Language · Computer Science 2022-10-24 Marwan Omar, Soohyeon Choi, DaeHun Nyang, David Mohaisen

Fairness Certification for Natural Language Processing and Large Language Models

Natural Language Processing (NLP) plays an important role in our daily lives, particularly due to the enormous progress of Large Language Models (LLM). However, NLP has many fairness-critical use cases, e.g., as an expert system in…

Computation and Language · Computer Science 2024-01-04 Vincent Freiberger, Erik Buchmann

Whispers of Doubt Amidst Echoes of Triumph in NLP Robustness

Do larger and more performant models resolve NLP's longstanding robustness issues? We investigate this question using over 20 models of different sizes spanning different architectural choices and pretraining objectives. We conduct…

Computation and Language · Computer Science 2024-04-04 Ashim Gupta, Rishanth Rajendhran, Nathan Stringham, Vivek Srikumar, Ana Marasović

Replicability Analysis for Natural Language Processing: Testing Significance with Multiple Datasets

With the ever-growing amounts of textual data from a large variety of languages, domains, and genres, it has become standard to evaluate NLP algorithms on multiple datasets in order to ensure consistent performance across heterogeneous…

Computation and Language · Computer Science 2017-09-28 Rotem Dror, Gili Baumer, Marina Bogomolov, Roi Reichart

Is Robustness Transferable across Languages in Multilingual Neural Machine Translation?

Robustness, the ability of models to maintain performance in the face of perturbations, is critical for developing reliable NLP systems. Recent studies have shown promising results in improving the robustness of models through adversarial…

Artificial Intelligence · Computer Science 2023-11-01 Leiyu Pan, Supryadi, Deyi Xiong

Advancing Fairness in Natural Language Processing: From Traditional Methods to Explainability

The burgeoning field of Natural Language Processing (NLP) stands at a critical juncture where the integration of fairness within its frameworks has become an imperative. This PhD thesis addresses the need for equity and transparency in NLP…

Computation and Language · Computer Science 2024-10-17 Fanny Jourdan

Measure and Improve Robustness in NLP Models: A Survey

As NLP models achieved state-of-the-art performances over benchmarks and gained wide applications, it has been increasingly important to ensure the safe deployment of these models in the real world, e.g., making sure the models are robust…

Computation and Language · Computer Science 2022-05-11 Xuezhi Wang, Haohan Wang, Diyi Yang

A Systematic Review of Reproducibility Research in Natural Language Processing

Against the background of what has been termed a reproducibility crisis in science, the NLP field is becoming increasingly interested in, and conscientious about, the reproducibility of its results. The past few years have seen an…

Computation and Language · Computer Science 2021-03-23 Anya Belz, Shubham Agarwal, Anastasia Shimorina, Ehud Reiter

Fair Enough: Standardizing Evaluation and Model Selection for Fairness Research in NLP

Modern NLP systems exhibit a range of biases, which a growing literature on model debiasing attempts to correct. However current progress is hampered by a plurality of definitions of bias, means of quantification, and oftentimes vague…

Computation and Language · Computer Science 2023-02-14 Xudong Han, Timothy Baldwin, Trevor Cohn

Bridging Fairness and Explainability: Can Input-Based Explanations Promote Fairness in Hate Speech Detection?

Natural language processing (NLP) models often replicate or amplify social bias from training data, raising concerns about fairness. At the same time, their black-box nature makes it difficult for users to recognize biased predictions and…

Computation and Language · Computer Science 2026-02-12 Yifan Wang, Mayank Jobanputra, Ji-Ung Lee, Soyoung Oh, Isabel Valera, Vera Demberg

Robustness in Large Language Models: A Survey of Mitigation Strategies and Evaluation Metrics

Large Language Models (LLMs) have emerged as a promising cornerstone for the development of natural language processing (NLP) and artificial intelligence (AI). However, ensuring the robustness of LLMs remains a critical challenge. To…

Computation and Language · Computer Science 2025-11-07 Pankaj Kumar, Subhankar Mishra

Establishing Trustworthiness: Rethinking Tasks and Model Evaluation

Language understanding is a multi-faceted cognitive capability, which the Natural Language Processing (NLP) community has striven to model computationally for decades. Traditionally, facets of linguistic intelligence have been…

Computation and Language · Computer Science 2023-10-24 Robert Litschko, Max Müller-Eberstein, Rob van der Goot, Leon Weber, Barbara Plank

Understanding Model Robustness to User-generated Noisy Texts

Sensitivity of deep-neural models to input noise is known to be a challenging problem. In NLP, model performance often deteriorates with naturally occurring noise, such as spelling errors. To mitigate this issue, models may leverage…

Computation and Language · Computer Science 2021-11-18 Jakub Náplava, Martin Popel, Milan Straka, Jana Straková

A Survey on Bias and Fairness in Natural Language Processing

As NLP models become more integrated with the everyday lives of people, it becomes important to examine the social effect that the usage of these systems has. While these models understand language and have increased accuracy on difficult…

Computation and Language · Computer Science 2022-04-21 Rajas Bansal

From Literature to Practice: Exploring Fairness Testing Tools for the Software Industry Adoption

In today's world, we need to ensure that AI systems are fair and unbiased. Our study looked at tools designed to test the fairness of software to see if they are practical and easy for software developers to use. We found that while some…

Software Engineering · Computer Science 2024-09-05 Thanh Nguyen, Luiz Fernando de Lima, Maria Teresa Badassarre, Ronnie de Souza Santos

Challenges in Applying Explainability Methods to Improve the Fairness of NLP Models

Motivations for methods in explainable artificial intelligence (XAI) often include detecting, quantifying and mitigating bias, and contributing to making machine learning models fairer. However, exactly how an XAI method can help in…

Computation and Language · Computer Science 2022-06-09 Esma Balkir, Svetlana Kiritchenko, Isar Nejadgholi, Kathleen C. Fraser

Predicting Performance for Natural Language Processing Tasks

Given the complexity of combinations of tasks, languages, and domains in natural language processing (NLP) research, it is computationally prohibitive to exhaustively test newly proposed models on each possible experimental setting. In this…

Computation and Language · Computer Science 2020-05-05 Mengzhou Xia, Antonios Anastasopoulos, Ruochen Xu, Yiming Yang, Graham Neubig

Accountability of Robust and Reliable AI-Enabled Systems: A Preliminary Study and Roadmap

This vision paper presents initial research on assessing the robustness and reliability of AI-enabled systems, and key factors in ensuring their safety and effectiveness in practical applications, including a focus on accountability. By…

Software Engineering · Computer Science 2025-06-23 Filippo Scaramuzza, Damian A. Tamburri, Willem-Jan van den Heuvel

Faithfulness Tests for Natural Language Explanations

Explanations of neural models aim to reveal a model's decision-making process for its predictions. However, recent work shows that current methods giving explanations such as saliency maps or counterfactuals can be misleading, as they are…

Computation and Language · Computer Science 2023-07-03 Pepa Atanasova, Oana-Maria Camburu, Christina Lioma, Thomas Lukasiewicz, Jakob Grue Simonsen, Isabelle Augenstein