Related papers: Selective Inference Approach for Statistically Sou…

Selective inference for clustering with unknown variance

In many modern statistical problems, the limited available data must be used both to develop the hypotheses to test, and to test these hypotheses; that is, both for exploratory and confirmatory data analysis. Reusing the same dataset for…

Methodology · Statistics 2023-07-24 Youngjoo Yun, Rina Foygel Barber

A Tutorial on Statistically Sound Pattern Discovery

Statistically sound pattern discovery harnesses the rigour of statistical hypothesis testing to overcome many of the issues that have hampered standard data mining approaches to pattern discovery. Most importantly, application of…

Methodology · Statistics 2019-01-07 Wilhelmiina Hämäläinen, Geoffrey I. Webb

Inference conditional on selection: a review

In this article, we review selective inference, a set of techniques for inference when the statistical question asked is a function of the data. This setting often arises in contemporary scientific workflows, where hypotheses and parameters…

Methodology · Statistics 2026-04-14 Anna Neufeld, Ronan Perry, Daniela Witten

Constraint-based Sequential Pattern Mining with Decision Diagrams

Constrained sequential pattern mining aims at identifying frequent patterns on a sequential database of items while observing constraints defined over the item attributes. We introduce novel techniques for constraint-based sequential…

Machine Learning · Computer Science 2019-01-01 Amin Hosseininasab, Willem-Jan van Hoeve, Andre A. Cire

Black-box Selective Inference via Bootstrapping

Conditional selective inference requires an exact characterization of the selection event, which is often unavailable except for a few examples like the lasso. This work addresses this challenge by introducing a generic approach to estimate…

Methodology · Statistics 2023-08-22 Sifan Liu, Jelena Markovic-Voronov, Jonathan Taylor

Statistical Inference Under Constrained Selection Bias

Large-scale datasets are increasingly being used to inform decision making. While this effort aims to ground policy in real-world evidence, challenges have arisen as selection bias and other forms of distribution shifts often plague…

Methodology · Statistics 2023-11-07 Santiago Cortes-Gomez, Mateo Dulce, Carlos Patino, Bryan Wilder

Selective Inference via Marginal Screening for High Dimensional Classification

Post-selection inference is a statistical technique for determining salient variables after model or variable selection. Recently, selective inference, a kind of post-selection inference framework, has garnered attention in the…

Methodology · Statistics 2019-06-28 Yuta Umezu, Ichiro Takeuchi

Statistical Test for Feature Selection Pipelines by Selective Inference

A data analysis pipeline is a structured sequence of steps that transforms raw data into meaningful insights by integrating various analysis algorithms. In this paper, we propose a novel statistical test to assess the significance of data…

Machine Learning · Statistics 2024-10-15 Tomohiro Shiraishi, Tatsuya Matsukawa, Shuichi Nishino, Ichiro Takeuchi

Overcoming Selection Bias in Statistical Studies With Amortized Bayesian Inference

Selection bias arises when the probability that an observation enters a dataset depends on variables related to the quantities of interest, leading to systematic distortions in estimation and uncertainty quantification. For example, in…

Machine Learning · Statistics 2026-04-21 Jonas Arruda, Sophie Chervet, Paula Staudt, Andreas Wieser, Michael Hoelscher, Isabelle Sermet-Gaudelus, Nadine Binder, Lulla Opatowski, Jan Hasenauer

Selective inference after feature selection via multiscale bootstrap

It is common to show the confidence intervals or $p$-values of selected features, or predictor variables in regression, but they often involve selection bias. The selective inference approach solves this bias by conditioning on the…

Methodology · Statistics 2022-06-02 Yoshikazu Terada, Hidetoshi Shimodaira

Selective Inference in Propensity Score Analysis

Selective inference (post-selection inference) is a methodology that has attracted much attention in recent years in the fields of statistics and machine learning. Naive inference based on data that are also used for model selection tends…

Methodology · Statistics 2021-11-25 Yoshiyuki Ninomiya, Yuta Umezu, Ichiro Takeuchi

Safe Pattern Pruning: An Efficient Approach for Predictive Pattern Mining

In this paper we study predictive pattern mining problems where the goal is to construct a predictive model based on a subset of predictive patterns in the database. Our main contribution is to introduce a novel method called safe pattern…

Machine Learning · Statistics 2016-02-16 Kazuya Nakagawa, Shinya Suzumura, Masayuki Karasuyama, Koji Tsuda, Ichiro Takeuchi

Efficient Model Selection for Predictive Pattern Mining Model by Safe Pattern Pruning

Predictive pattern mining is an approach used to construct prediction models when the input is represented by structured data, such as sets, graphs, and sequences. The main idea behind predictive pattern mining is to build a prediction…

Machine Learning · Statistics 2023-06-26 Takumi Yoshida, Hiroyuki Hanada, Kazuya Nakagawa, Kouichi Taji, Koji Tsuda, Ichiro Takeuchi

Finding Sequential Patterns from Large Sequence Data

Data mining is the task of discovering interesting patterns from large amounts of data. There are many data mining tasks, such as classification, clustering, association rule mining, and sequential pattern mining. Sequential pattern mining…

Databases · Computer Science 2010-02-08 Mahdi Esmaeili, Fazekas Gabor

Methods of Selective Inference for Linear Mixed Models: a Review and Empirical Comparison

Selective inference aims at providing valid inference after a data-driven selection of models or hypotheses. It is essential to avoid overconfident results and replicability issues. While significant advances have been made in this area for…

Methodology · Statistics 2025-03-14 Matteo D'Alessandro, Magne Thoresen

A Constraint Programming Approach for Mining Sequential Patterns in a Sequence Database

Constraint-based pattern discovery is at the core of numerous data mining tasks. Patterns are extracted with respect to a given set of constraints (frequency, closedness, size, etc.). In the context of sequential pattern mining, a large…

Artificial Intelligence · Computer Science 2013-11-28 Jean-Philippe Métivier, Samir Loudni, Thierry Charnois

Estimating Propensities of Selection for Big Datasets via Data Integration

Big data presents potential but unresolved value as a source for analysis and inference. However, selection bias, present in many of these datasets, needs to be accounted for so that appropriate inferences can be made on the target…

Methodology · Statistics 2025-01-09 Lyndon Ang, Robert Clark, Bronwyn Loong, Anders Holmberg

Selective Inference for Latent Block Models

Model selection in latent block models has been a challenging but important task in the field of statistics. Specifically, a major challenge is encountered when constructing a test on a block structure obtained by applying a specific…

Machine Learning · Statistics 2021-06-08 Chihiro Watanabe, Taiji Suzuki

Statistically Significant Discriminative Patterns Searching

Discriminative pattern mining is an essential task of data mining. This task aims to discover patterns that occur more frequently in one class than in other classes in a class-labeled dataset. This type of pattern is valuable in various…

Machine Learning · Computer Science 2019-06-05 Hoang Son Pham, Gwendal Virlet, Dominique Lavenier, Alexandre Termier

Pattern-Based Classification: A Unifying Perspective

The use of patterns in predictive models is a topic that has received a lot of attention in recent years. Pattern mining can help to obtain models for structured domains, such as graphs and sequences, and has been proposed as a means to…

Artificial Intelligence · Computer Science 2011-11-29 Björn Bringmann, Siegfried Nijssen, Albrecht Zimmermann