Related papers: A Data Flow Analysis Framework for Data Flow Subsu…
Data-flow testing (DFT) aims to detect potential data interaction anomalies by focusing on the points at which variables receive values and the points at which these values are used. Such test objectives are referred as \emph{def-use…
Many research questions can be answered quickly and efficiently using data already collected for previous research. This practice is called secondary data analysis (SDA), and has gained popularity due to lower costs and improved research…
Sensitivity analysis plays an important role in searching for constitutive parameters (e.g. permeability) subsurface flow simulations. The mathematics behind is to solve a dynamic constrained optimization problem. Traditional methods like…
Flow-Updating (FU) is a fault-tolerant technique that has proved to be efficient in practice for the distributed computation of aggregate functions in communication networks where individual processors do not have access to global…
Data association problems are an important component of many computer vision applications, with multi-object tracking being one of the most prominent examples. A typical approach to data association involves finding a graph matching or…
Data assimilation (DA) estimates a dynamical system's state from noisy observations. Recent generative models like the ensemble score filter (EnSF) improve DA in high-dimensional nonlinear settings but are computationally expensive. We…
How much data is needed to optimally schedule distributed energy resources (DERs)? Does the distribution system operator (DSO) have to know load demands at each bus of the feeder to solve an optimal power flow (OPF)? This work exploits…
The growing interconnection between software systems increases the need for security already at design time. Security-related properties like confidentiality are often analyzed based on data flow diagrams (DFDs). However, manually analyzing…
Context- and flow-sensitive value-flow information is an important building block for many static analysis tools. Unfortunately, current approaches to compute value-flows do not scale to large codebases, due to high memory and runtime…
Traffic sampling has become an indispensable tool in network management. While there exists a plethora of sampling systems, they generally assume flow rates are stable and predictable over a sampling period. Consequently, when deployed in…
We present DataFlow, a computational framework for building, testing, and deploying high-performance machine learning systems on unbounded time-series data. Traditional data science workflows assume finite datasets and require substantial…
Data-Flow Integrity (DFI) is a well-known approach to effectively detecting a wide range of software attacks. However, its real-world application has been quite limited so far because of the prohibitive performance overhead it incurs.…
More and more distributed software systems are being developed and deployed today. Like other software, distributed software systems also need very strong quality assurance support. Distributed software is often very large/complex, has…
The distributed adaptive signal fusion (DASF) framework allows to solve spatial filtering optimization problems in a distributed and adaptive fashion over a bandwidth-constrained wireless sensor network. The DASF algorithm requires each…
Federated Retrieval (FR) routes queries across multiple external knowledge sources, to mitigate hallucinations of LLMs, when necessary external knowledge is distributed. However, existing methods struggle to retrieve high-quality and…
The flow size distribution is a useful metric for traffic modeling and management. Its estimation based on sampled data, however, is problematic. Previous work has shown that flow sampling (FS) offers enormous statistical benefits over…
Operations over data streams typically hinge on efficient mechanisms to aggregate or summarize history on a rolling basis. For high-volume data steams, it is critical to manage state in a manner that is fast and memory efficient --…
The optimal power flow (OPF) is a multi-valued, non-convex mapping from loads to dispatch setpoints. The variability of system parameters (e.g., admittances, topology) further contributes to the multiplicity of dispatch setpoints for a…
We tackle the problem of sampling from intractable high-dimensional density functions, a fundamental task that often appears in machine learning and statistics. We extend recent sampling-based approaches that leverage controlled stochastic…
Real-time control of distribution networks requires accurate information about the system state. In practice, however, such information is difficult to obtain because real-time measurements are available only at a limited number of…