Related papers: A scale-based approach to finding effective dimens…
Datasets such as images, text, or movies are embedded in high-dimensional spaces. However, in important cases such as images of objects, the statistical structure in the data constrains samples to a manifold of dramatically lower…
We present a novel view of nonlinear manifold learning using derivative-free optimization techniques. Specifically, we propose an extension of the classical multi-dimensional scaling (MDS) method, where instead of performing gradient…
Analyzing large volumes of high-dimensional data is an issue of fundamental importance in data science, molecular simulations and beyond. Several approaches work on the assumption that the important content of a dataset belongs to a…
A common belief in high-dimensional data analysis is that data are concentrated on a low-dimensional manifold. This motivates simultaneous dimension reduction and regression on manifolds. We provide an algorithm for learning gradients on…
The real-life data have a complex and non-linear structure due to their nature. These non-linearities and the large number of features can usually cause problems such as the empty-space phenomenon and the well-known curse of dimensionality.…
Datasets with significant proportions of noisy (incorrect) class labels present challenges for training accurate Deep Neural Networks (DNNs). We propose a new perspective for understanding DNN generalization for such datasets, by…
The size of datasets has been increasing rapidly both in terms of number of variables and number of events. As a result, the empty space phenomenon and the curse of dimensionality complicate the extraction of useful information. But, in…
Manifold learning is used for dimensionality reduction, with the goal of finding a projection subspace to increase and decrease the inter- and intraclass variances, respectively. However, a bottleneck for subspace learning methods often…
In order to avoid the curse of dimensionality, frequently encountered in Big Data analysis, there was a vast development in the field of linear and nonlinear dimension reduction techniques in recent years. These techniques (sometimes…
Manifold learning (ML), known also as non-linear dimension reduction, is a set of methods to find the low dimensional structure of data. Dimension reduction for large, high dimensional data is not merely a way to reduce the data; the new…
Manifold learning techniques have become increasingly valuable as data continues to grow in size. By discovering a lower-dimensional representation (embedding) of the structure of a dataset, manifold learning algorithms can substantially…
The manifold hypothesis suggests that high-dimensional data often lie on or near a low-dimensional manifold. Estimating the dimension of this manifold is essential for leveraging its structure, yet existing work on dimension estimation is…
Modern large-scale datasets are frequently said to be high-dimensional. However, their data point clouds frequently possess structures, significantly decreasing their intrinsic dimensionality (ID) due to the presence of clusters, points…
Scientists and engineers rely on accurate mathematical models to quantify the objects of their studies, which are often high-dimensional. Unfortunately, high-dimensional models are inherently difficult, i.e. when observations are sparse or…
The manifold hypothesis, which assumes that data lies on or close to an unknown manifold of low intrinsic dimension, is a staple of modern machine learning research. However, recent work has shown that real-world data exhibits distinct…
Dimension reduction (DR) aims to learn low-dimensional representations of high-dimensional data with the preservation of essential information. In the context of manifold learning, we define that the representation after…
Neural ordinary differential equations (NODE) have garnered significant attention for their design of continuous-depth neural networks and the ability to learn data/feature dynamics. However, for high-dimensional systems, estimating…
We propose a new data representation method based on Quantum Cognition Machine Learning and apply it to manifold learning, specifically to the estimation of intrinsic dimension of data sets. The idea is to learn a representation of each…
Multidimensional scaling is an important dimension reduction tool in statistics and machine learning. Yet few theoretical results characterizing its statistical performance exist, not to mention any in high dimensions. By considering a…
It is widely believed that natural image data exhibits low-dimensional structure despite the high dimensionality of conventional pixel representations. This idea underlies a common intuition for the remarkable success of deep learning in…