Related papers: Finding Patterns in Visualized Data by Adding Redu…
Factor graphs are a ubiquitous tool for multi-source inference in robotics and multi-sensor networks. They allow for heterogeneous measurements from many sources to be concurrently represented as factors in the state posterior distribution,…
Visual patterns represent the discernible regularity in the visual world. They capture the essential nature of visual objects or scenes. Understanding and modeling visual patterns is a fundamental problem in visual recognition that has wide…
The problem of how to properly quantify redundant information is an open question that has been the subject of much recent research. Redundant information refers to information about a target variable S that is common to two or more…
Network service providers and customers are often concerned with aggregate performance measures that span multiple network paths. Unfortunately, forming such network-wide measures can be difficult, due to the issues of scale involved. In…
Colors and shapes are commonly used to encode categories in multi-class scatterplots. Designers often combine the two channels to create redundant encodings, aiming to enhance class distinctions. However, evidence for the effectiveness of…
In this paper, we define a new measure of the redundancy of information from a fault tolerance perspective. The partial information decomposition (PID) emerged last decade as a framework for decomposing the multi-source mutual information…
Many systems exhibit complex interactions between their components: some features or actions amplify each other's effects, others provide redundant information, and some contribute independently. We present a simple geometric method for…
Pattern extraction algorithms are enabling insights into the ever-growing amount of today's datasets by translating reoccurring data properties into compact representations. Yet, a practical problem arises: With increasing data volumes and…
In the data deluge context, pattern recognition or labeling in streams is becoming quite an essential and pressing task as data flows inside always bigger streams. The assessment of such tasks is not so easy when dealing with temporal data,…
Extensive monitoring systems generate data that is usually compressed for network transmission. This compressed data might then be processed in the cloud for tasks such as anomaly detection. However, compression can potentially impair the…
We present a theoretical framework that extends classical information theory to finite and structured systems by redefining redundancy as a fundamental property of information organization rather than inefficiency. In this framework,…
In weighted graphs the shortest path between two nodes is often reached through an indirect path, out of all possible connections, leading to structural redundancies which play key roles in the dynamics and evolution of complex networks. We…
Access to complete data in large-scale networks is often infeasible. Therefore, the problem of missing data is a crucial and unavoidable issue in the analysis and modeling of real-world social networks. However, most of the research on…
Given a set $X$ of "empirical" points, whose coordinates are perturbed by errors, we analyze whether it contains redundant information, that is whether some of its elements could be represented by a single equivalent point. If this is the…
We develop a continual learning method for pretrained models that \emph{requires no access to old-task data}, addressing a practical barrier in foundation model adaptation where pretraining distributions are often unavailable. Our key…
Log-linear models are typically fitted to contingency table data to describe and identify the relationship between different categorical variables. However, the data may include observed zero cell entries. The presence of zero cell entries…
Supervised deep learning models require significant amount of labeled data to achieve an acceptable performance on a specific task. However, when tested on unseen data, the models may not perform well. Therefore, the models need to be…
The input data features set for many data driven tasks is high-dimensional while the intrinsic dimension of the data is low. Data analysis methods aim to uncover the underlying low dimensional structure imposed by the low dimensional hidden…
Feature selection has attracted significant attention in data mining and machine learning in the past decades. Many existing feature selection methods eliminate redundancy by measuring pairwise inter-correlation of features, whereas the…
The Partial Information Decomposition (PID) framework has emerged as a powerful tool for analyzing high-order interdependencies in complex network systems. However, its application to dynamic processes remains challenging due to the…