Related papers: Statistical Inference using the Morse-Smale Comple…
Morse complexes are gradient-based topological descriptors with close connections to Morse theory. They are widely applicable in scientific visualization as they serve as important abstractions for gaining insights into the topology of…
The problem of subgroups is ubiquitous in scientific research (ex. disease heterogeneity, spatial distributions in ecology...), and piecewise regression is one way to deal with this phenomenon. Morse-Smale regression offers a way to…
The model interpretation is essential in many application scenarios and to build a classification model with a ease of model interpretation may provide useful information for further studies and improvement. It is common to encounter with a…
Signal decomposition and multiscale signal analysis provide many useful tools for time-frequency analysis. We proposed a random feature method for analyzing time-series data by constructing a sparse approximation to the spectrogram. The…
The multivariate regression model basically offers the analysis of a single dataset with multiple responses. However, such a single-dataset analysis often leads to unsatisfactory results. Integrative analysis is an effective method to pool…
Functional data analysis deals with data recorded densely over time (or any other continuum) with one or more observed curves per subject. Conceptually, functional data are continuously defined, but in practice, they are usually observed at…
We present a method for estimating sparse high-dimensional inverse covariance and partial correlation matrices, which exploits the connection between the inverse covariance matrix and linear regression. The method is a two-stage estimation…
We propose new ensemble models for multivariate functional data classification as combinations of semi-metric-based weak learners. Our models extend current semi-metric-type methods from the univariate to the multivariate case, propose new…
We propose a method to reconstruct and cluster incomplete high-dimensional data lying in a union of low-dimensional subspaces. Exploring the sparse representation model, we jointly estimate the missing data while imposing the intrinsic…
Clustering techniques applied to multivariate data are a very useful tool in Statistics and have been fully studied in the literature. Nevertheless, these clustering methodologies are less well known when dealing with functional data. Our…
This paper studies high-dimensional regression models with lasso when data is sampled under multi-way clustering. First, we establish convergence rates for the lasso and post-lasso estimators. Second, we propose a novel inference method…
A new statistical model designed for regression analysis with a sparse design matrix is proposed. This new model utilizes the positions of the limited non-zero elements in the design matrix to decompose the regression model into…
Two-sample testing is a fundamental problem in statistics. Despite its long history, there has been renewed interest in this problem with the advent of high-dimensional and complex data. Specifically, in the machine learning literature,…
Though analyzing a single scalar field using Morse complexes is well studied, there are few techniques for visualizing a collection of Morse complexes. We focus on analyses that are enabled by looking at a Morse complex as an embedded…
The nonparametric formulation of density-based clustering, known as modal clustering, draws a correspondence between groups and the attraction domains of the modes of the density function underlying the data. Its probabilistic foundation…
Functional neuroimaging can measure the brain?s response to an external stimulus. It is used to perform brain mapping: identifying from these observations the brain regions involved. This problem can be cast into a linear supervised…
Sparse support vector machine (SVM) is a popular classification technique that can simultaneously learn a small set of the most interpretable features and identify the support vectors. It has achieved great successes in many real-world…
Fractional moments have been investigated by many authors to represent the density of univariate and bivariate random variables in different contexts. Fractional moments are indeed important when the density of the random variable has…
It is well known that the minimax rates of convergence of nonparametric density and regression function estimation of a random variable measured with error is much slower than the rate in the error free case. Surprisingly, we show that if…
Determining the number of clusters is a central challenge in unsupervised learning, where ground-truth labels are unavailable. The Silhouette coefficient is a widely used internal validation metric for this task, yet its standard…