Related papers: Encoding of data sets and algorithms
The paper describes an approach to measuring convergence of an algorithm to its result in terms of an entropy-like function of partitions of its inputs of a given length. The goal is to look at the algorithmic data processing from the…
We study the computational complexity of (deterministic or randomized) algorithms based on point samples for approximating or integrating functions that can be well approximated by neural networks. Such algorithms (most prominently…
Entropy is the measure of uncertainty in any data and is adopted for maximisation of mutual information in many remote sensing operations. The availability of wide entropy variations motivated us for an investigation over the suitability…
We describe a framework in which is possible to develop and implement algorithms for the approximation of invariant measures of dynamical systems with a given bound on the error of the approximation. Our approach is based on a general…
Much algorithmic research in NLP aims to efficiently manipulate rich formal structures. An algorithm designer typically seeks to provide guarantees about their proposed algorithm -- for example, that its running time or space complexity is…
Measuring inter-dataset similarity is an important task in machine learning and data mining with various use cases and applications. Existing methods for measuring inter-dataset similarity are computationally expensive, limited, or…
With increasing use of digital control it is natural to view control inputs and outputs as stochastic processes assuming values over finite alphabets rather than in a Euclidean space. As control over networks becomes increasingly common,…
Automating algorithm configuration is growing increasingly necessary as algorithms come with more and more tunable parameters. It is common to tune parameters using machine learning, optimizing performance metrics such as runtime and…
Embeddings are a basic initial feature extraction step in many machine learning models, particularly in natural language processing. An embedding attempts to map data tokens to a low-dimensional space where similar tokens are mapped to…
We study the notion of approximate entropy within the framework of network theory. Approximate entropy is an uncertainty measure originally proposed in the context of dynamical systems and time series. We firstly define a purely structural…
Training a deep neural network (DNN) often involves stochastic optimization, which means each run will produce a different model. Several works suggest this variability is negligible when models have the same performance, which in the case…
Neural networks have dramatically increased our capacity to learn from large, high-dimensional datasets across innumerable disciplines. However, their decisions are not easily interpretable, their computational costs are high, and building…
By introducing Hilbert space and operators, we show how probabilities, approximations and entropy encoding from signal and image processing allow precise formulas and quantitative estimates. Our main results yield orthogonal bases which…
It is known that statistical model selection as well as identification of dynamical equations from available data are both very challenging tasks. Physical systems behave according to their underlying dynamical equations which, in turn, can…
Word embedding, specially with its recent developments, promises a quantification of the similarity between terms. However, it is not clear to which extent this similarity value can be genuinely meaningful and useful for subsequent tasks.…
Complex systems are characterised by a tight, nontrivial interplay of their constituents, which gives rise to a multi-scale spectrum of emergent properties. In this scenario, it is practically and conceptually difficult to identify those…
Accuracy is the most important parameter among few others which defines the effectiveness of a machine learning algorithm. Higher accuracy is always desirable. Now, there is a vast number of well established learning algorithms already…
Complex networks are ubiquitous to several Computer Science domains. Centrality measures are an important analysis mechanism to uncover vital elements of complex networks. However, these metrics have high computational costs and…
Consider a collection of competing machine learning algorithms. Given their performance on a benchmark of datasets, we would like to identify the best performing algorithm. Specifically, which algorithm is most likely to rank highest on a…
We study entropy-bounded computational geometry, that is, geometric algorithms whose running times depend on a given measure of the input entropy. Specifically, we introduce a measure that we call range-partition entropy, which unifies and…