Related papers: Robust subgroup discovery

Discovering outstanding subgroup lists for numeric targets using MDL

The task of subgroup discovery (SD) is to find interpretable descriptions of subsets of a dataset that stand out with respect to a target attribute. To address the problem of mining large numbers of redundant subgroups, subgroup set…

Machine Learning · Computer Science 2021-03-16 Hugo M. Proença, Peter Grünwald, Thomas Bäck, Matthijs van Leeuwen

Using Constraints to Discover Sparse and Alternative Subgroup Descriptions

Subgroup-discovery methods allow users to obtain simple descriptions of interesting regions in a dataset. Using constraints in subgroup discovery can enhance interpretability even further. In this article, we focus on two types of…

Machine Learning · Computer Science 2025-06-23 Jakob Bach

Efficiently Discovering Locally Exceptional yet Globally Representative Subgroups

Subgroup discovery is a local pattern mining technique to find interpretable descriptions of sub-populations that stand out on a given target variable. That is, these sub-populations are exceptional with regard to the global distribution.…

Databases · Computer Science 2017-09-26 Janis Kalofolias, Mario Boley, Jilles Vreeken

Subjectively Interesting Subgroup Discovery on Real-valued Targets

Deriving insights from high-dimensional data is one of the core problems in data mining. The difficulty mainly stems from the fact that there are exponentially many variable combinations to potentially consider, and there are infinitely…

Machine Learning · Statistics 2021-11-08 Jefrey Lijffijt, Bo Kang, Wouter Duivesteijn, Kai Puolamäki, Emilia Oikarinen, Tijl De Bie

A new algorithm for Subgroup Set Discovery based on Information Gain

Pattern discovery is a machine learning technique that aims to find sets of items, subsequences, or substructures that are present in a dataset with a higher frequency value than a manually set threshold. This process helps to identify…

Machine Learning · Computer Science 2023-08-01 Daniel Gómez-Bravo, Aaron García, Guillermo Vigueras, Belén Ríos, Alejandro Rodríguez-González

Robust subgroup-classifier learning and testing in change-plane regressions

Considered here are robust subgroup-classifier learning and testing in change-plane regressions with heavy-tailed errors, which can identify subgroups as a basis for making optimal recommendations for individualized treatment. A new…

Methodology · Statistics 2024-08-27 Xu Liu, Jian Huang, Yong Zhou, Xiao Zhang

SubSearch: Robust Estimation and Outlier Detection for Stochastic Block Models via Subgraph Search

Community detection is a fundamental task in graph analysis, with methods often relying on fitting models like the Stochastic Block Model (SBM) to observed networks. While many algorithms can accurately estimate SBM parameters when the…

Machine Learning · Statistics 2025-06-05 Leonardo Martins Bianco, Christine Keribin, Zacharie Naulet

Group-based Robustness: A General Framework for Customized Robustness in the Real World

Machine-learning models are known to be vulnerable to evasion attacks that perturb model inputs to induce misclassifications. In this work, we identify real-world scenarios where the true threat cannot be assessed accurately by existing…

Machine Learning · Computer Science 2024-03-12 Weiran Lin, Keane Lucas, Neo Eyal, Lujo Bauer, Michael K. Reiter, Mahmood Sharif

Robust and computationally feasible community detection in the presence of arbitrary outlier nodes

Community detection, which aims to cluster $N$ nodes in a given graph into $r$ distinct groups based on the observed undirected edges, is an important problem in network data analysis. In this paper, the popular stochastic block model (SBM)…

Statistics Theory · Mathematics 2015-06-04 T. Tony Cai, Xiaodong Li

Robust Subset Selection by Greedy and Evolutionary Pareto Optimization

Subset selection, which aims to select a subset from a ground set to maximize some objective function, arises in various applications such as influence maximization and sensor placement. In real-world scenarios, however, one often needs to…

Neural and Evolutionary Computing · Computer Science 2022-05-10 Chao Bian, Yawen Zhou, Chao Qian

Robust subset selection

The best subset selection (or "best subsets") estimator is a classic tool for sparse regression, and developments in mathematical optimization over the past decade have made it more computationally tractable than ever. Notwithstanding its…

Methodology · Statistics 2022-01-11 Ryan Thompson

A Splicing Approach to Best Subset of Groups Selection

Best subset of groups selection (BSGS) is the process of selecting a small part of non-overlapping groups to achieve the best interpretability on the response variable. It has attracted increasing attention and has far-reaching applications…

Machine Learning · Computer Science 2022-09-20 Yanhang Zhang, Junxian Zhu, Jin Zhu, Xueqin Wang

Reachable Sets of Classifiers and Regression Models: (Non-)Robustness Analysis and Robust Training

Neural networks achieve outstanding accuracy in classification and regression tasks. However, understanding their behavior still remains an open challenge that requires questions to be addressed on the robustness, explainability and…

Machine Learning · Computer Science 2021-05-13 Anna-Kathrin Kopetzki, Stephan Günnemann

Robust Multi-Model Subset Selection

Outlying observations can be challenging to handle and adversely affect subsequent analyses, especially in data with increasing dimensional complexity. Although outliers are not always undesired anomalies in the data and may possess…

Methodology · Statistics 2025-09-18 Anthony-Alexander Christidis, Gabriela Cohen-Freue

The Group Loss++: A deeper look into group loss for deep metric learning

Deep metric learning has yielded impressive results in tasks such as clustering and image retrieval by leveraging neural networks to obtain highly discriminative feature embeddings, which can be used to group samples into different classes.…

Computer Vision and Pattern Recognition · Computer Science 2022-04-05 Ismail Elezi, Jenny Seidenschwarz, Laurin Wagner, Sebastiano Vascon, Alessandro Torcinovich, Marcello Pelillo, Laura Leal-Taixe

Robust Information Selection for Hypothesis Testing with Misclassification Penalties

We study the problem of robust information selection for a Bayesian hypothesis testing / classification task, where the goal is to identify the true state of the world from a finite set of hypotheses based on observations from the selected…

Machine Learning · Statistics 2025-02-24 Jayanth Bhargav, Shreyas Sundaram, Mahsa Ghasemi

Expert-Guided Subgroup Discovery: Methodology and Application

This paper presents an approach to expert-guided subgroup discovery. The main step of the subgroup discovery process, the induction of subgroup descriptions, is performed by a heuristic beam search algorithm, using a novel parametrized…

Artificial Intelligence · Computer Science 2011-06-24 D. Gamberger, N. Lavrac

Feature Clustering for Accelerating Parallel Coordinate Descent

Large-scale L1-regularized loss minimization problems arise in high-dimensional applications such as compressed sensing and high-dimensional supervised learning, including classification and regression problems. High-performance algorithms…

Machine Learning · Statistics 2012-12-19 Chad Scherrer, Ambuj Tewari, Mahantesh Halappanavar, David Haglin

Most Probable Densest Subgraphs

Computing the densest subgraph is a primitive graph operation with critical applications in detecting communities, events, and anomalies in biological, social, Web, and financial networks. In this paper, we study the novel problem of Most…

Social and Information Networks · Computer Science 2022-12-23 Arkaprava Saha, Xiangyu Ke, Arijit Khan, Cheng Long

Robust Densest Subgraph Discovery

Dense subgraph discovery is an important primitive in graph mining, which has a wide variety of applications in diverse domains. In the densest subgraph problem, given an undirected graph $G=(V,E)$ with an edge-weight vector $w=(w_e)_{e\in…

Social and Information Networks · Computer Science 2021-10-27 Atsushi Miyauchi, Akiko Takeda