Related papers: Distributed Source Coding for Parametric and Non-P…
In the context of goal-oriented communications, this paper addresses the achievable rate versus generalization error region of a learning task applied to compressed data. The study focuses on the distributed setup where a source is…
Distributed learning provides an attractive framework for scaling the learning task by sharing the computational load over multiple nodes in a network. Here, we investigate the performance of distributed learning for large-scale linear…
We study the problem of reconstructing a Gaussian field defined on [0,1] using N sensors deployed at regular intervals. The goal is to quantify the total data rate required for the reconstruction of the field with a given mean square…
Distributed learning facilitates the scaling-up of data processing by distributing the computational burden over several nodes. Despite the vast interest in distributed learning, the generalization performance of such approaches is not well…
Distributed averaging, or distributed average consensus, is a common method for computing the sample mean of the data dispersed among the nodes of a network in a decentralized manner. By iteratively exchanging messages with neighbors, the…
Consensus is a common method for computing a function of the data distributed among the nodes of a network. Of particular interest is distributed average consensus, whereby the nodes iteratively compute the sample average of the data stored…
Source coding with a side information "vending machine" is a recently proposed framework in which the statistical relationship between the side information and the source, instead of being given and fixed as in the classical Wyner-Ziv…
Consider the problem where a statistician in a two-node system receives rate-limited information from a transmitter about marginal observations of a memoryless process generated from two possible distributions. Using its own observations,…
Learning-based and data-driven techniques have recently become a subject of primary interest in the field of reconstruction and regularization of inverse problems. Besides the development of novel methods, yielding excellent results in…
Symbolic regression algorithms search a space of mathematical expressions for formulas that explain given data. Transformer-based models have emerged as a promising, scalable approach shifting the expensive combinatorial search to a…
This paper studies nonparametric regression with repeated measurements when the response in the target domain is unobservable or costly to collect. We adopt a transfer learning framework that leverages a source domain with observable…
In this work, lossy distributed compression of pairs of correlated sources is considered. Conventionally, Shannon's random coding arguments -- using randomly generated unstructured codebooks whose blocklength is taken to be asymptotically…
This paper presents generalized channel coding theorems for a time-slotted distributed communication system where a transmitter-receiver pair is communicating in parallel with other transmitters. Assume that the channel code of each…
This work considers distributed sensing and transmission of sporadic random samples. Lower bounds are derived for the reconstruction error of a single normally or uniformly distributed finite-dimensional vector imperfectly measured by a…
In the successive refinement problem, a fixed-length sequence emitted from an information source is encoded into two codewords by two encoders in order to give two reconstructions of the sequence. One of the two reconstructions is obtained by…
Increasing practical interest has been shown in regression problems where the errors, or disturbances, are centred in a way that reflects particular characteristics of the mechanism that generated the data. In economics this occurs in…
This paper presents a novel information-theoretic perspective on generalization in machine learning by framing the learning problem within the context of lossy compression and applying finite blocklength analysis. In our approach, the…
Consider a lossy compression system with $\ell$ distributed encoders and a centralized decoder. Each encoder compresses its observed source and forwards the compressed data to the decoder for joint reconstruction of the target signals under…
We study generalised linear regression and classification for a synthetically generated dataset encompassing different problems of interest, such as learning with random features, neural networks in the lazy training regime, and the hidden…
In this paper we present a series of results that allow uniform deviation inequalities for the empirical process to be extended directly from the independent to the dependent case, characterizing the additional error in terms of…