Related papers: Inference of Flow Statistics via Packet Sampling i…
A problem which has recently attracted research attention is that of estimating the distribution of flow sizes in internet traffic. On high traffic links it is sometimes impossible to record every packet. Researchers have approached the…
In this work we study the set size distribution estimation problem, where elements are randomly sampled from a collection of non-overlapping sets and we seek to recover the original set size distribution from the samples. This problem has…
A new method of estimating some statistical characteristics of TCP flows in the Internet is developed in this paper. For this purpose, a new set of random variables (referred to as observables) is defined. When dealing with sampled traffic,…
The efficiency of flow-based networking mechanisms strongly depends on traffic characteristics and should thus be assessed using accurate flow models. For example, in the case of algorithms based on the distinction between elephant and mice…
The flow size distribution is a useful metric for traffic modeling and management. Its estimation based on sampled data, however, is problematic. Previous work has shown that flow sampling (FS) offers enormous statistical benefits over…
The high volume of packets and packet rates of traffic on some router links makes it exceedingly difficult for routers to examine every packet in order to keep detailed statistics about the traffic which is traversing the router. Sampling…
Typical event datasets such as those used in network intrusion detection comprise hundreds of thousands, sometimes millions, of discrete packet events. These datasets tend to be high dimensional, stateful, and time-series in nature, holding…
Link dimensioning is used by ISPs to properly provision the capacity of their network links. Operators have to make provisions for sudden traffic bursts and network failures to assure uninterrupted operations. In practice, traffic averages…
Often, due to prohibitively large size or to limits of data-collection APIs, it is not possible to work with a complete network dataset, and sampling is required. A type of sampling that is consistent with Twitter API restrictions is…
The substantial growth of network traffic speed and volume presents practical challenges to network data analysis. Packet thinning and flow aggregation protocols such as NetFlow reduce the size of datasets by providing structured data…
In this paper we examine rigorously the evidence for dependence among data size, transfer rate and duration in Internet flows. We emphasize two statistical approaches for studying dependence, including Pearson's correlation coefficient and…
Consider a finite renewal process in the sense that interrenewal times are positive i.i.d. variables and the total number of renewals is a random variable, independent of interrenewal times. A finite point process can be obtained by…
We propose flow-based analysis to estimate the quality of an Internet connection. Using results from queuing theory, we compare two expressions for backbone traffic that have different scopes of applicability. A curve that shows dependence…
We study the problem of disseminating a piece of information through all the nodes of a network, given that it is known originally only to a single node. In the absence of any structural knowledge about the network other than the nodes'…
The complex dynamics of physical systems can often be modeled with stochastic differential equations. However, computational constraints inhibit the estimation of dynamics from large time-series datasets. I present a method for estimating…
By studying the statistics of recurrence intervals, $\tau$, between volatilities of Internet traffic rate changes exceeding a certain threshold $q$, we find that the probability distribution functions, $P_{q}(\tau)$, for both byte and…
Network administrators want to detect TCP-level packet reordering to diagnose performance problems and attacks. However, reordering is expensive to measure, because each packet must be processed relative to the TCP sequence number of its…
This work is devoted to a certain class of probabilistic snapshots for elements of the observed data stream. We show how to control their probabilistic properties and present some potential applications. Our solution can be used to…
It has been shown recently that graph signals with small total variation can be accurately recovered from only few samples if the sampling set satisfies a certain condition, referred to as the network nullspace property. Based on this…
Rectified flow (Liu et al., 2022; Liu, 2022; Wu et al., 2023) is a method for defining a transport map between two distributions, and enjoys popularity in machine learning, although theoretical results supporting the validity of these…