Related papers: Measuring the Data

200 papers

The real-life data have a complex and non-linear structure due to their nature. These non-linearities and the large number of features can usually cause problems such as the empty-space phenomenon and the well-known curse of dimensionality.…

Machine Learning · Computer Science 2025-03-13 Kadir Özçoban, Murat Manguoğlu, Emrullah Fatih Yetkin

The ability to represent and compare machine learning models is crucial in order to quantify subtle model changes, evaluate generative models, and gather insights on neural network architectures. Existing techniques for comparing data…

Data augmentation is a widely used technique and an essential ingredient in the recent advance in self-supervised representation learning. By preserving the similarity between augmented data, the resulting data representation can improve…

Machine Learning · Statistics 2025-01-16 Shulei Wang

It is a standard assumption that datasets in high dimension have an internal structure which means that they in fact lie on, or near, subsets of a lower dimension. In many instances it is important to understand the real dimension of the…

Machine Learning · Statistics 2025-07-21 James A. D. Binnie, Paweł Dłotko, John Harvey, Jakub Malinowski, Ka Man Yim

We consider a common measurement paradigm, where an unknown subset of an affine space is measured by unknown continuous quasi-convex functions. Given the measurement data, can one determine the dimension of this space? In this paper, we…

Algebraic Topology · Mathematics 2020-07-08 Min-Chun Wu, Vladimir Itskov

We propose a new data representation method based on Quantum Cognition Machine Learning and apply it to manifold learning, specifically to the estimation of intrinsic dimension of data sets. The idea is to learn a representation of each…

The manifold hypothesis suggests that high-dimensional data often lie on or near a low-dimensional manifold. Estimating the dimension of this manifold is essential for leveraging its structure, yet existing work on dimension estimation is…

Machine Learning · Computer Science 2026-04-02 Zelong Bi, Pierre Lafaye de Micheaux

Estimating intrinsic dimensionality of data is a classic problem in pattern recognition and statistics. Principal Component Analysis (PCA) is a powerful tool in discovering dimensionality of data sets with a linear structure; it, however,…

Computer Vision and Pattern Recognition · Computer Science 2010-02-11 Mingyu Fan, Nannan Gu, Hong Qiao, Bo Zhang

Data living on manifolds commonly appear in many applications. Often this results from an inherently latent low-dimensional system being observed through higher dimensional measurements. We show that under certain conditions, it is possible…

Machine Learning · Statistics 2018-07-05 Ariel Schwartz, Ronen Talmon

Big Data are huge amounts of digital information that are automatically accrued or merged from several sources and rarely result from properly planned surveys. A Big Dataset is herein conceived of as a collection of information concerning a…

Computation · Statistics 2020-02-12 Deldossi Laura, Tommasi Chiara

Invariant measures encode the long-time behaviour of a dynamical system. In this work, we propose an optimization-based method to discover invariant measures directly from data gathered from a system. Our method does not require an explicit…

Dynamical Systems · Mathematics 2025-10-09 Jason J. Bramburger, Giovanni Fantuzzi

One develops a fast computational methodology for principal component analysis on manifolds. Instead of estimating intrinsic principal components on an object space with a Riemannian structure, one embeds the object space in a numerical…

Methodology · Statistics 2024-10-04 Ka Chun Wong, Vic Patrangenaru, Robert L. Paige, Mihaela Pricop Jeckstadt

We study non-linear data-dimension reduction. We are motivated by the classical linear framework of Principal Component Analysis. In the nonlinear case, we introduce instead a new kernel-Principal Component Analysis, manifold and feature space…

Functional Analysis · Mathematics 2022-09-09 Palle E. T. Jorgensen, Sooran Kang, Myung-Sin Song, Feng Tian

Dimensionality-reduction methods are a fundamental tool in the analysis of large data sets. These algorithms work on the assumption that the "intrinsic dimension" of the data is generally much smaller than the ambient dimension in which it…

Machine Learning · Computer Science 2018-10-30 Henry Kvinge, Elin Farnell, Michael Kirby, Chris Peterson

Data-Driven Computational Mechanics is a novel computing paradigm that enables the transition from standard data-starved approaches to modern data-rich approaches. At this early stage of development, one can distinguish two mainstream…

Numerical Analysis · Mathematics 2019-10-29 Cristian Guillermo Gebhardt, Dominik Schillinger, Marc Christian Steinbach, Raimund Rolfes

Quantifying numerical data involves addressing two key challenges: first, determining whether the data can be naturally quantified, and second, identifying the numerical intervals or ranges of values that correspond to specific value…

Data Analysis, Statistics and Probability · Physics 2025-11-21 Anton Kolonin

Information theory is an outstanding framework to measure uncertainty, dependence and relevance in data and systems. It has several desirable properties for real world applications: it naturally deals with multivariate data, it can handle…

Machine Learning · Statistics 2024-10-30 Valero Laparra, J. Emmanuel Johnson, Gustau Camps-Valls, Raul Santos-Rodríguez, Jesus Malo

Datasets such as images, text, or movies are embedded in high-dimensional spaces. However, in important cases such as images of objects, the statistical structure in the data constrains samples to a manifold of dramatically lower…

Machine Learning · Computer Science 2019-10-29 Stefano Recanatesi, Matthew Farrell, Madhu Advani, Timothy Moore, Guillaume Lajoie, Eric Shea-Brown

We introduce novel estimators for computing the curvature, tangent spaces, and dimension of data from manifolds, using tools from diffusion geometry. Although classical Riemannian geometry is a rich source of inspiration for geometric data…

Differential Geometry · Mathematics 2026-02-13 Iolo Jones

The discovery of low-dimensional manifolds in high-dimensional data is one of the main goals in manifold learning. We propose a new approach to identify the effective dimension (intrinsic dimension) of low-dimensional manifolds. The scale…

Statistics Theory · Mathematics 2008-03-17 Xiaohui Wang, J. S. Marron