Related papers: cardinalR: Generating Interesting High-Dimensional…

quollr: An R Package for Visualizing 2-D Models from Nonlinear Dimension Reductions in High-Dimensional Space

Nonlinear dimension reduction methods provide a low-dimensional representation of high-dimensional data by applying a nonlinear transformation. However, the complexity of the transformations and data structures can create wildly different…

Methodology · Statistics 2025-12-23 Jayani P. Gamage, Dianne Cook, Paul Harrison, Michael Lydeamore, Thiyanga S. Talagala

A set of efficient methods to generate high-dimensional binary data with specified correlation structures

High dimensional correlated binary data arise in many areas, such as observed genetic variations in biomedical research. Data simulation can help researchers evaluate efficiency and explore properties of different computational and…

Methodology · Statistics 2020-07-29 Wei Jiang, Shuang Song, Lin Hou, Hongyu Zhao

Simulating High-Dimensional Multivariate Data using the bigsimr R Package

It is critical to accurately simulate data when employing Monte Carlo techniques and evaluating statistical methodology. Measurements are often correlated and high dimensional in this era of big data, such as data obtained in…

Computation · Statistics 2022-11-29 A. Grant Schissler, Edward J. Bedrick, Alexander D. Knudson, Tomasz J. Kozubowski, Tin Nguyen, Anna K. Panorska, Juli Petereit, Walter W. Piegorsch, Duc Tran

DRtool: An Interactive Tool for Analyzing High-Dimensional Clusterings

When faced with new data, we often conduct a cluster analysis to obtain a better understanding of the data's structure and the archetypical samples present in the data. This process often includes visualization of the data, either as a way…

Applications · Statistics 2026-04-06 Justin Lin, Julia Fukuyama

Mathematical Computation on High-dimensional Data via Array Programming and Parallel Acceleration

While deep learning excels in natural image and language processing, its application to high-dimensional data faces computational challenges due to the dimensionality curse. Current large-scale data tools focus on business-oriented…

Machine Learning · Computer Science 2025-07-01 Chen Zhang

Inference for high dimensional repeated measure designs with the R package hdrm

Repeated-measure designs allow comparisons within a group as well as between groups, and are commonly referred to as split-plot designs. While originating in agricultural experiments, they are now widely used in medical research,…

Computation · Statistics 2025-12-22 Paavo Sattler, Nils Hichert

Some Statistical Problems with High Dimensional Financial data

For high dimensional data, some of the standard statistical techniques do not work well, so modification or further development of statistical methods is necessary. In this paper, we explore these modifications. We start with the important…

Statistical Finance · Quantitative Finance 2024-05-29 Arnab Chakrabarti, Rituparna Sen

High dimensional gaussian classification

High dimensional data analysis is known to be a challenging problem. In this article, we give a theoretical analysis of high dimensional classification of Gaussian data which relies on a geometrical analysis of the error measure. It…

Statistics Theory · Mathematics 2008-07-10 Robin Girard

High Dimensional Cluster Analysis Using Path Lengths

A hierarchical scheme for clustering data is presented which applies to spaces with a high number of dimensions ($N_D > 3$). The data set is first reduced to a smaller set of partitions (multi-dimensional bins). Multiple clustering…

Data Analysis, Statistics and Probability · Physics 2017-10-16 Kevin McIlhany, Stephen Wiggins

Contributions to Robust and Efficient Methods for Analysis of High Dimensional Data

A ubiquitous feature of data of our era is their extra-large sizes and dimensions. Analyzing such high-dimensional data poses significant challenges, since the feature dimension is often much larger than the sample size. This thesis…

Statistics Theory · Mathematics 2025-09-11 Kai Yang

Clustering small datasets in high-dimension by random projection

Datasets in high-dimension do not typically form clusters in their original space; the issue is worse when the number of points in the dataset is small. We propose a low-computation method to find statistically significant clustering…

Machine Learning · Statistics 2020-08-24 Alden Bradford, Tarun Yellamraju, Mireille Boutin

High-Dimensional Data Clustering

Clustering in high-dimensional spaces is a difficult problem which is recurrent in many domains, for example in image analysis. The difficulty is due to the fact that high-dimensional data usually live in different low-dimensional subspaces…

Statistics Theory · Mathematics 2016-08-16 Charles Bouveyron, Stéphane Girard, Cordelia Schmid

A new model for natural groupings in high-dimensional data

Clustering aims to divide a set of points into groups. The current paradigm assumes that the grouping is well-defined (unique) given the probability model from which the data is drawn. Yet, recent experiments have uncovered several…

Machine Learning · Statistics 2024-06-25 Mireille Boutin, Evzenie Coupkova

Data integration in high dimension with multiple quantiles

This article deals with the analysis of high dimensional data that come from multiple sources (experiments) and thus have different possibly correlated responses, but share the same set of predictors. The measurements of the predictors may…

Methodology · Statistics 2020-07-01 Guorong Dai, Ursula U. Müller, Raymond J. Carroll

High-Dimensional Multivariate Time Series With Additional Structure

High-dimensional multivariate time series are challenging due to the dependent and high-dimensional nature of the data, but in many applications there is additional structure that can be exploited to reduce computing time along with…

Methodology · Statistics 2020-03-13 Michael Schweinberger, Sergii Babkin, Katherine Ensor

Simulating Complex Crossectional and Longitudinal Data using the simDAG R Package

Generating artificial data is a crucial step when performing Monte-Carlo simulation studies. Depending on the planned study, complex data generation processes (DGP) containing multiple, possibly time-varying, variables with various forms of…

Methodology · Statistics 2025-06-03 Robin Denz, Nina Timmesfeld

multiDimBio: An R Package for the Design, Analysis, and Visualization of Systems Biology Experiments

The past decade has witnessed a dramatic increase in the size and scope of biological and behavioral experiments. These experiments are providing an unprecedented level of detail and depth of data. However, this increase in data presents…

Quantitative Methods · Quantitative Biology 2014-04-03 Samuel V. Scarpino, Ross Gillette, David Crews

Diagonal Discriminant Analysis with Feature Selection for High Dimensional Data

We introduce a new method of performing high dimensional discriminant analysis, which we call multiDA. We achieve this by constructing a hybrid model that seamlessly integrates a multiclass diagonal discriminant analysis model and feature…

Machine Learning · Statistics 2018-07-05 Sarah Elizabeth Romanes, John Thomas Ormerod, Jean YH Yang

Causal-StoNet: Causal Inference for High-Dimensional Complex Data

With the advancement of data science, the collection of increasingly complex datasets has become commonplace. In such datasets, the data dimension can be extremely high, and the underlying data generation process can be unknown and highly…

Machine Learning · Statistics 2024-03-29 Yaxin Fang, Faming Liang

An Analytical Survey on Recent Trends in High Dimensional Data Visualization

Data visualization is the process by which data of any size or dimensionality is processed to produce an understandable set of data in a lower dimensionality, allowing it to be manipulated and understood more easily by people. The goal of…

Graphics · Computer Science 2021-07-06 Alexander Kiefer, Md. Khaledur Rahman