Computer Science
Modern table formats such as Apache Iceberg compute and store metadata-commit timestamps, record counts, and column-level statistics such as null counts and value bounds at write time as part of file writing. These statistics serve query…
Geo-distributed OLTP databases are widely deployed across cloud regions, yet current evaluation practices do not cover the challenges of this aspect. Existing benchmarks assume stable network conditions; they lack explicit settings for data…
This paper presents an extensible ns-3-based simulation framework for evaluating intent-based, semantics-aware control in Open RAN architectures. The framework integrates external Radio Access Network (RAN) Intelligent Controller (RIC)…
Critical networking workflows require high-fidelity packet captures (PCAPs) for testing, security analysis, and protocol validation, not just statistical flow-level summaries. Recent packet generators have demonstrated protocol-constrained…
Artificial Intelligence (AI)-powered Radio Access Network (RAN) networks have attracted significant attention from both industry and academia. Meanwhile, Digital Twins offer a safe playground for experimenting with AI/Machine Learning…
Text-to-Visualization (Text-to-Vis) translates natural language queries into visualization query languages, enabling non-expert users to perform data analysis. However, most existing methods follow a one-shot paradigm that requires users to…
Global megatrends, such as urbanization, population growth, and emerging network solutions are accelerating the development of the Connected and Autonomous Vehicles (CAVs) industry. There are many truths, some misconceptions, and even some…
As server CPUs scale to dozens and now hundreds of cores per socket, parallel query engines must rethink how they redistribute data between threads. Partitioned operators such as hash joins and aggregations require frequent data…
In cloud data platforms, developers often encounter performance regressions that occur in specific tenant datasets. However, due to confidentiality constraints, they cannot access the original data, which makes it difficult to reproduce…
Oracle Exadata consolidates thousands of tenant databases onto shared storage infrastructure deployed at hundreds of customer sites worldwide. Oracle Multitenant architecture enables this extreme density, with thousands of tenant databases…
University campuses host abundant but fragmented GPU resources whose voluntary sharing is blocked by a mismatch between revocable, autonomous ownership and migration mechanisms that assume stationary failure hazards, homogeneous…
In this paper, we present an open-source measurement platform designed to characterize the performance of commercial cellular (Verizon, a major US provider) and LEO satellite (Starlink) networks through real-world flight tests in rural…
Stacked intelligent metasurfaces (SIMs) have recently emerged as a powerful wave-domain technology that enables multi-stage manipulation of electromagnetic signals through multilayer programmable architectures. While SIMs offer…
Approximate k-Nearest Neighbor (AKNN) search is widely used in vector databases. When vectors carry additional attributes (e.g., labels or numerical values), filtered AKNN search retrieves the nearest vectors to a query vector under…
Data transformation correctness is a fundamental challenge in data engineering: how can we verify that pipelines produce correct results before executing on production data? Existing practice relies on iterative testing over materialized…
The accelerating expansion of AI workloads is colliding with an energy landscape increasingly dominated by intermittent renewable generation. While vast quantities of zero-carbon energy are routinely curtailed, today's centralized…
Workload traces from cloud data warehouse providers reveal that standard benchmarks such as TPC-H and TPC-DS fail to capture key characteristics of real-world workloads, including query repetition and string-heavy queries. In this paper, we…
M2M deployments such as drone swarms demand mission-critical communication: km-scale range, strong per-device identity and mutual authentication, and deterministic QoS for bandwidth-intensive payloads. Cellular 5G uniquely satisfies all of…
The first 6G standardization efforts are about to start, shaping the new generation of mobile networks. The IMT-2030 extends the IMT-2020 by expanding its usage scenarios to Immersive, Massive, and Hyper-Reliable and Low-Latency…
Large Language Model (LLM) serving systems must balance task performance against monetary cost. Two prominent optimization techniques have emerged independently: LLM routing, which directs each query to the most cost-effective model in a…