Related papers: Significance Analysis for Pairwise Variable Select…
We consider joint selection of fixed and random effects in general mixed-effects models. The interpretation of estimated mixed-effects models is challenging since changing the structure of one set of effects can lead to different choices of…
Joint models have proven to be an effective approach for uncovering potentially hidden connections between various types of outcomes, mainly continuous, time-to-event, and binary. Typically, longitudinal continuous outcomes are…
Factorizable joint shift (FJS) was recently proposed as a type of dataset shift for which the complete characteristics can be estimated from feature data observations on the test dataset by a method called Joint Importance Aligning. For the…
Let X, Z be r- and s-dimensional covariates, respectively, used to model the response variable Y as Y = m(X, Z) + \sigma(X, Z)\epsilon. We develop an ANOVA-type test for the null hypothesis that Z has no influence on the regression function,…
Credible causal effect estimation requires treated subjects and controls to be otherwise similar. In observational settings, such as analysis of electronic health records, this is not guaranteed. Investigators must balance background…
Selective inference aims at providing valid inference after a data-driven selection of models or hypotheses. It is essential to avoid overconfident results and replicability issues. While significant advances have been made in this area for…
This paper is concerned with the selection and estimation of fixed and random effects in linear mixed effects models. We propose a class of nonconcave penalized profile likelihood methods for selecting and estimating important fixed…
Conjoint analysis is a popular experimental design used to measure multidimensional preferences. Researchers examine how varying a factor of interest, while controlling for other relevant factors, influences decision-making. Currently,…
Variable selection, also known as feature selection in machine learning, plays an important role in modeling high dimensional data and is key to data-driven scientific discoveries. We consider here the problem of detecting influential…
Variable selection for the optimal treatment regime in a clinical trial or an observational study is receiving growing attention. Most existing variable selection techniques focus on selecting variables that are important for prediction, therefore…
The marginal likelihood is a central tool for drawing Bayesian inference about the number of components in mixture models. It is often approximated since the exact form is unavailable. A bias in the approximation may be due to an incomplete…
We develop a pivotal test to assess the statistical significance of the feature variables in a single-layer feedforward neural network regression model. We propose a gradient-based test statistic and study its asymptotics using…
We propose a new model selection criterion for mixed effects regression models that is computable when the model is fitted with a two-step method, even when the structure and the distribution of the random effects are unknown. The criterion…
Given a set of several inputs into a system (e.g., independent variables characterizing stimuli) and a set of several stochastically non-independent outputs (e.g., random variables describing different aspects of responses), how can one…
In many data exploration tasks it is meaningful to identify groups of attribute interactions that are specific to a variable of interest. For instance, in a dataset where the attributes are medical markers and the variable of interest…
In high-throughput genetics studies, an important aim is to identify gene-environment interactions associated with the clinical outcomes. Recently, multiple marginal penalization methods have been developed and shown to be effective in…
Randomization tests are a popular method for testing causal effects in clinical trials with finite-sample validity. In the presence of heterogeneous treatment effects, it is often of interest to select a subgroup that benefits from the…
Variable selection remains a difficult problem, especially for generalized linear mixed models (GLMMs). While some frequentist approaches that simultaneously select fixed and random effects exist, primarily through the use of…
Feature selection and dimensionality reduction are essential steps in data analysis. In this work, we propose a new criterion for feature selection that is formulated as conditional information between features given the labeled…
Supervised classification of biological samples based on genetic information (e.g., gene expression profiles) is an important problem in biostatistics. To find classification rules that are both accurate and interpretable, variable selection is…