Related papers: Transactional WaveCache: Towards Speculative and O…
Switching, routing, and security functions are the backbone of packet processing networks. Fast and efficient processing of packets requires maintaining the state of a large number of transient network connections. In particular, modern…
WarpFlow is a fast, interactive data querying and processing system with a focus on petabyte-scale spatiotemporal datasets and Tesseract queries. With the rapid growth in smartphones and mobile navigation services, we now have an…
This paper introduces the Wave Transactional Filesystem (WTF), a novel, transactional, POSIX-compatible filesystem based on a new file slicing API that enables efficient file transformations. WTF provides transactional access to a…
Serverless computing that runs functions with auto-scaling is a popular task execution pattern in the cloud-native era. By connecting serverless functions into workflows, tenants can achieve complex functionality. Prior researches adopt the…
Correlative computational microscopy can accelerate imaging and modeling of cellular dynamics by relaxing trade-offs inherent to dynamic imaging. Existing computational microscopy frameworks are either specialized or overly generic,…
We present DataFlow, a computational framework for building, testing, and deploying high-performance machine learning systems on unbounded time-series data. Traditional data science workflows assume finite datasets and require substantial…
Choreographic programming is a paradigm for writing distributed applications. It allows programmers to write a single program, called a choreography, that can be compiled to generate correct implementations of each process in the…
The ultimate goal of this work is a real-time processing framework for ultrasound image reconstruction augmented with machine learning. To attain this, we have implemented WaveFlow - a set of ultrasound data acquisition and processing tools…
Scalable ordered maps must ensure that range queries, which operate over many consecutive keys, provide intuitive semantics (e.g., linearizability) without degrading the performance of concurrent insertions and removals. These goals are…
Traveling waves of neural activity have been observed throughout the brain at a diversity of regions and scales; however, their precise computational role is still debated. One physically inspired hypothesis suggests that the cortical sheet…
FastFlow is a programming environment specifically targeting cache-coherent shared-memory multi-cores. FastFlow is implemented as a stack of C++ template libraries built on top of lock-free (fence-free) synchronization mechanisms. In this…
In this paper we provide a high performance solution to the problem of committing transactions while enforcing a predefined order. We provide the design and implementation of three algorithms, which deploy a specialized cooperative…
Persistent memory provides high-performance data persistence at main memory. Memory writes need to be performed in strict order to satisfy storage consistency requirements and enable correct recovery from system crashes. Unfortunately,…
In sight of fundamental thermal limits on further substantial performance improvements of modern digital computational processing units, wave-based analog computation is becoming an enticing alternative. A wave, as it propagates through a…
Software transactional memory implementations which allow transactions to work on inconsistent states of shared data, risk to cause application visible errors such as memory access violations or endless loops. Hence, many implementations…
Vision-Language-Action (VLA) models have emerged as a unified paradigm for robotic perception and control, enabling emergent generalization and long-horizon task execution. However, their deployment in dynamic, real-world environments is…
Since the introduction of the CDC 6600 in 1965 and its `scoreboarding' technique processors have not (necessarily) executed instructions in program order. Programmers of high-level code may sequence independent instructions in arbitrary…
How to model distribution of sequential data, including but not limited to speech and human motions, is an important ongoing research problem. It has been demonstrated that model capacity can be significantly enhanced by introducing…
Modern processors use branch prediction and speculative execution to maximize performance. For example, if the destination of a branch depends on a memory value that is in the process of being read, CPUs will try guess the destination and…
Many modern applications require real-time processing of large volumes of high-speed data. Such data processing needs can be modeled as a streaming computation. A streaming computation is specified as a dataflow graph that exposes multiple…