Related papers: Finding Statistically Significant Attribute Intera…
In this work we present the novel ASTRID method for investigating which attribute interactions classifiers exploit when making predictions. Attribute interactions in classification tasks mean that two or more attributes together provide…
Knowledge of the association information between the attributes in a data set provides insight into the underlying structure of the data and explains the relationships (independence, synergy, redundancy) between the attributes and class (if…
Recently, graphs have been widely used to represent many different kinds of real world data or observations such as social networks, protein-protein networks, road networks, and so on. In many cases, each node in a graph is associated with…
Our aim is to detect mechanistic interaction between the effects of two causal factors on a binary response, as an aid to identifying situations where the effects are mediated by a common mechanism. We propose a formalization of mechanistic…
Qualitative interactions occur when a treatment effect or measure of association varies in sign by sub-population. Of particular interest in many biomedical settings are absence/presence qualitative interactions, which occur when an effect…
The search for higher-order feature interactions that are statistically significantly associated with a class variable is of high relevance in fields such as Genetics or Healthcare, but the combinatorial explosion of the candidate space…
This work proposes and evaluates a novel approach to determine interesting categorical attributes for lists of entities. Once identified, such categories are of immense value to allow constraining (filtering) a current view of a user to…
A methodology is proposed to automatically detect significant symbol associations in genomic databases. A new statistical test is proposed to assess the significance of a group of symbols when found in several genesets of a given database.…
With the growing size of data sets, feature selection becomes increasingly important. Taking interactions of original features into consideration will lead to extremely high dimension, especially when the features are categorical and…
Classification and clustering are both important topics in statistical learning. A natural question herein is whether predefined classes are really different from one another, or whether clusters are really there. Specifically, we may be…
When faced with a new dataset, most practitioners begin by performing exploratory data analysis to discover interesting patterns and characteristics within data. Techniques such as association rule mining are commonly applied to uncover…
The analysis of the interaction matrix between two distinct sets is essential across diverse fields, from pharmacovigilance to transcriptomics. Not all interactions are equally informative: a marker gene associated with a few specific…
Effectively analyzing spatiotemporal data plays a central role in understanding real-world phenomena and informing decision-making. Capturing the interaction between spatial and temporal dimensions also helps explain the underlying…
Set classification problems arise when classification tasks are based on sets of observations as opposed to individual observations. In set classification, a classification rule is trained with $N$ sets of observations, where each set is…
Internet data has surfaced as a primary source for investigation of different aspects of human behavior. A crucial step in such studies is finding a suitable cohort (i.e., a set of users) that shares a common trait of interest to…
Discriminative patterns are association patterns that occur with disproportionate frequency in some classes versus others, and have been studied under names such as emerging patterns and contrast sets. Such patterns have demonstrated…
Given two relations containing multiple measurements - possibly with uncertainties - our objective is to find which sets of attributes from the first have a corresponding set on the second, using exclusively a sample of the data. This…
Variable selection for optimal treatment regime in a clinical trial or an observational study is getting more attention. Most existing variable selection techniques focused on selecting variables that are important for prediction, therefore…
Selecting subsets of features that differentiate between two conditions is a key task in a broad range of scientific domains. In many applications, the features of interest form clusters with similar effects on the data at hand. To recover…
In domains where transparency and trustworthiness are crucial, such as healthcare, rule-based systems are widely used and often preferred over black-box models for decision support systems due to their inherent interpretability. However, as…