Related papers: From Data to the p-Adic or Ultrametric Model

The Correspondence Analysis Platform for Uncovering Deep Structure in Data and Information

We study two aspects of information semantics: (i) the collection of all relationships, (ii) tracking and spotting anomaly and change. The first is implemented by endowing all relevant information spaces with a Euclidean metric in a common…

Artificial Intelligence · Computer Science 2011-01-11 Fionn Murtagh

Ultrametric Wavelet Regression of Multivariate Time Series: Application to Colombian Conflict Analysis

We first pursue the study of how hierarchy provides a well-adapted tool for the analysis of change. Then, using a time sequence-constrained hierarchical clustering, we develop the practical aspects of a new approach to wavelet regression.…

Machine Learning · Statistics 2011-01-11 Fionn Murtagh, Michael Spagat, Jorge A. Restrepo

Ultrametric Component Analysis with Application to Analysis of Text and of Emotion

We review the theory and practice of determining what parts of a data set are ultrametric. It is assumed that the data set, to begin with, is endowed with a metric, and we include discussion of how this can be brought about if a…

Artificial Intelligence · Computer Science 2013-09-17 Fionn Murtagh

Ultrametric embedding: application to data fingerprinting and to fast data clustering

We begin with pervasive ultrametricity due to high dimensionality and/or spatial sparsity. How extent or degree of ultrametricity can be quantified leads us to the discussion of varied practical cases when ultrametricity can be partially or…

Statistics Theory · Mathematics 2011-01-11 Fionn Murtagh

Anomalous Change Point Detection Using Probabilistic Predictive Coding

Change point detection (CPD) and anomaly detection (AD) are essential techniques in various fields to identify abrupt changes or abnormal data instances. However, existing methods are often constrained to univariate data, face scalability…

Machine Learning · Statistics 2025-12-03 Roelof G. Hup, Julian P. Merkofer, Alex A. Bhogal, Ruud J. G. van Sloun, Reinder Haakma, Rik Vullings

The noncommutative replica approach

p-Adic and noncommutative analysis are applied to describe phase transitions in disordered systems. In the noncommutative replica approach we replicate the disorder instead of the system degrees of freedom. The noncommutative replica…

Disordered Systems and Neural Networks · Physics 2007-05-23 S. V. Kozyrev

Uncertainty-aware data assimilation through variational inference

Data assimilation, consisting in the combination of a dynamical model with a set of noisy and incomplete observations in order to infer the state of a system over time, involves uncertainty in most settings. Building upon an existing…

Machine Learning · Computer Science 2026-03-02 Anthony Frion, David S Greenberg

Understanding better (some) astronomical data using Bayesian methods

Current analyses of astronomical data are confronted with the daunting task of modeling the awkward features of astronomical data, among which heteroscedastic (point-dependent) errors, intrinsic scatter, non-ignorable data collection…

Instrumentation and Methods for Astrophysics · Physics 2011-12-19 S. Andreon

Visualizing probabilistic models: Intensive Principal Component Analysis

Unsupervised learning makes manifest the underlying structure of data without curated training and specific problem definitions. However, the inference of relationships between data points is frustrated by the 'curse of dimensionality' in…

Statistical Mechanics · Physics 2022-06-08 Katherine N. Quinn, Colin B. Clement, Francesco De Bernardis, Michael D. Niemack, James P. Sethna

Hyperbolic Vision Transformers: Combining Improvements in Metric Learning

Metric learning aims to learn a highly discriminative model encouraging the embeddings of similar classes to be close in the chosen metrics and pushed apart for dissimilar ones. The common recipe is to use an encoder to extract embeddings…

Computer Vision and Pattern Recognition · Computer Science 2022-03-23 Aleksandr Ermolov, Leyla Mirvakhabova, Valentin Khrulkov, Nicu Sebe, Ivan Oseledets

Flexible space-time models for extreme data

Extreme value analysis is an essential methodology in the study of rare and extreme events, which hold significant interest in various fields, particularly in the context of environmental sciences. Models that employ the exceedances of…

Methodology · Statistics 2025-07-16 Lorenzo Dell'Oro, Carlo Gaetan

Ultrametric and Generalized Ultrametric in Computational Logic and in Data Analysis

Following a review of metric, ultrametric and generalized ultrametric, we review their application in data analysis. We show how they allow us to explore both geometry and topology of information, starting with measured data. Some themes…

Logic in Computer Science · Computer Science 2010-08-24 Fionn Murtagh

Compression-Complexity with Ordinal Patterns for Robust Causal Inference in Irregularly-Sampled Time Series

Distinguishing cause from effect is a scientific challenge resisting solutions from mathematics, statistics, information theory and computer science. Compression-Complexity Causality (CCC) is a recently proposed interventional measure of…

Data Analysis, Statistics and Probability · Physics 2022-04-26 Aditi Kathpalia, Pouya Manshour, Milan Paluš

Hilbert Space Becomes Ultrametric in the High Dimensional Limit: Application to Very High Frequency Data Analysis

An ultrametric topology formalizes the notion of hierarchical structure. An ultrametric embedding, referred to here as ultrametricity, is implied by a natural hierarchical embedding. Such hierarchical structure can be global in the data…

Data Analysis, Statistics and Probability · Physics 2007-05-23 Fionn Murtagh

Anomaly and Change Detection in Graph Streams through Constant-Curvature Manifold Embeddings

Mapping complex input data into suitable lower dimensional manifolds is a common procedure in machine learning. This step is beneficial mainly for two reasons: (1) it reduces the data dimensionality and (2) it provides a new data…

Machine Learning · Computer Science 2018-11-28 Daniele Zambon, Lorenzo Livi, Cesare Alippi

The Remarkable Simplicity of Very High Dimensional Data: Application of Model-Based Clustering

An ultrametric topology formalizes the notion of hierarchical structure. An ultrametric embedding, referred to here as ultrametricity, is implied by a hierarchical embedding. Such hierarchical structure can be global in the data set, or…

Methodology · Statistics 2011-01-11 Fionn Murtagh

Estimating the State of Large Spatiotemporally Chaotic Systems

Data assimilation refers to the process of obtaining an estimate of a system's state using a model for the system's time evolution and a time series of measurements that are possibly noisy and incomplete. However, for practical reasons, the…

Chaotic Dynamics · Physics 2007-05-23 Matthew Cornick, Brian Hunt, Edward Ott, Michael F. Schatz

Uncertainty Quantification of Data Shapley via Statistical Inference

As data plays an increasingly pivotal role in decision-making, the emergence of data markets underscores the growing importance of data valuation. Within the machine learning landscape, Data Shapley stands out as a widely embraced method…

Machine Learning · Statistics 2024-07-30 Mengmeng Wu, Zhihong Liu, Xiang Li, Ruoxi Jia, Xiangyu Chang

Statistical Embeddings for Similarity, Retrieval, and Interpretable Alignment of Numeric Tabular Datasets

Numeric tabular datasets are the dominant data format in scientific practice, yet large language models lack native mechanisms for representing numeric datasets in a meaningful way across heterogeneous feature spaces. Existing approaches…

Machine Learning · Computer Science 2026-05-29 M. Ross Kunz, John Merickel, Keith Wilson

An Emergent Space for Distributed Data with Hidden Internal Order through Manifold Learning

Manifold-learning techniques are routinely used in mining complex spatiotemporal data to extract useful, parsimonious data representations/parametrizations; these are, in turn, useful in nonlinear model identification tasks. We focus here…

Data Analysis, Statistics and Probability · Physics 2018-12-07 Felix P. Kemeth, Sindre W. Haugland, Felix Dietrich, Tom Bertalan, Kevin Höhlein, Qianxiao Li, Erik M. Bollt, Ronen Talmon, Katharina Krischer, Ioannis G. Kevrekidis