Related papers: Reducing Differential Item Functioning via Process…
Differential item functioning (DIF) is a widely used statistical notion for identifying items that may disadvantage specific groups of test-takers. These groups are often defined by non-manipulable characteristics, e.g., gender,…
Ensuring fairness in instruments like survey questionnaires or educational tests is crucial. One way to address this is through a differential item functioning (DIF) analysis, which examines whether different subgroups respond differently to a…
This paper proposes a method for assessing differential item functioning (DIF) in item response theory (IRT) models. The method does not require pre-specification of anchor items, which is its main virtue. It is developed in two main steps,…
Differential item functioning (DIF) arises alongside latent population heterogeneity in many applications, and both must be accounted for when assessing measurement invariance. In many practical settings, however, the comparison groups are…
Computer-based assessments routinely generate detailed interaction logs -- commonly referred to as process data -- that record every action a respondent performs during task completion, yet systematic preprocessing guidance, integrated…
Detection of differential item functioning by use of the logistic modelling approach has a long tradition. One big advantage of the approach is that it can be used to investigate non-uniform DIF as well as uniform DIF. The classical…
In the item response theory (IRT) literature, differential test functioning (DTF) has been conceptualized in terms of how the test response function differs over groups of respondents. This paper presents an alternative approach to DTF that…
Establishing the invariance property of an instrument is a key step for establishing its measurement validity. Measurement invariance is typically assessed by differential item functioning (DIF) analysis, i.e., detecting DIF items whose…
Differential item functioning (DIF) detection is an important yet understudied problem in computerized adaptive testing (CAT). In this article, we proposed a two-level logistic model to improve DIF detection in CAT by explicitly accounting…
We fine-tuned and compared several encoder-based Transformer large language models (LLMs) to predict differential item functioning (DIF) from the item text. We then applied explainable artificial intelligence (XAI) methods to these models to…
A new method for the identification of differential item functioning (DIF) by using recursive partitioning techniques is proposed. We assume an extension of the Rasch model that allows for DIF being induced by an arbitrary number of…
Various methods to detect differential item functioning (DIF) in item response models are available. However, most of the methods assume that the responses are binary; for ordered response categories, available methods are scarce. In the…
Measurement non-invariance arises when the psychometric properties of a scale differ across subgroups, undermining the validity of group comparisons. At the item level, such non-invariance manifests as differential item functioning (DIF),…
This study introduces a novel nonparametric approach for detecting Differential Item Functioning (DIF) in binary items through direct comparison of Item Response Curves (IRCs). Building on prior work on nonparametric comparison of…
Process mining is a multi-purpose tool enabling organizations to improve their processes. One of the primary purposes of process mining is finding the root causes of performance or compliance problems in processes. The usual way of doing so…
Few health-related constructs or measures, such as self-reported health survey data, have received a critical evaluation in terms of measurement equivalence. Differential item functioning (DIF) analysis is crucial for evaluating measurement…
Recommendation fairness has recently attracted much attention. In the real world, recommendation systems are driven by user behavior, and since users with the same sensitive feature (e.g., gender and age) tend to have the same patterns,…
As Large Language Models (LLMs) have risen in prominence over the past few years, there has been concern over the potential biases in LLMs inherited from the training data. Previous studies have examined how LLMs exhibit implicit bias, such…
Most fair machine learning methods either rely heavily on the sensitive information of the training samples or require large modifications to the target models, which hinders their practical application. To address this issue, we propose a…
Machine learning models are vulnerable to biases that result in unfair treatment of individuals from different populations. Recent work that aims to test a model's fairness at the individual level either relies on domain knowledge to choose…