Related papers: Optimizing Tail Latency in Commodity Datacenters u…

Slytherin: Dynamic, Network-assisted Prioritization of Tail Packets in Datacenter Networks

Datacenter applications demand both low latency and high throughput; while interactive applications (e.g., Web Search) demand low tail latency for their short messages due to their partition-aggregate software architecture, many…

Networking and Internet Architecture · Computer Science 2018-07-09 Hamed Rezaei, Mojtaba Malekpourshahraki, Balajee Vamanan

Scalable Tail Latency Estimation for Data Center Networks

In this paper, we consider how to provide fast estimates of flow-level tail latency performance for very large scale data center networks. Network tail latency is often a crucial metric for cloud application performance that can be affected…

Networking and Internet Architecture · Computer Science 2022-10-03 Kevin Zhao, Prateesh Goyal, Mohammad Alizadeh, Thomas E. Anderson

PowerTCP: Pushing the Performance Limits of Datacenter Networks

Increasingly stringent throughput and latency requirements in datacenter networks demand fast and accurate congestion control. We observe that the reaction time and accuracy of existing datacenter congestion control schemes are inherently…

Networking and Internet Architecture · Computer Science 2021-12-30 Vamsi Addanki, Oliver Michel, Stefan Schmid

CommFuse: Hiding Tail Latency via Communication Decomposition and Fusion for Distributed LLM Training

The rapid growth in the size of large language models has necessitated the partitioning of computational workloads across accelerators such as GPUs, TPUs, and NPUs. However, these parallelization strategies incur substantial data…

Machine Learning · Computer Science 2026-05-11 Rezaul Karim, Austin Wen, Wang Zongzuo, Weiwei Zhang, Yang Liu, Walid Ahmed

SafeTail: Efficient Tail Latency Optimization in Edge Service Scheduling via Computational Redundancy Management

Optimizing tail latency while efficiently managing computational resources is crucial for delivering high-performance, latency-sensitive services in edge computing. Emerging applications, such as augmented reality, require low-latency…

Machine Learning · Computer Science 2024-10-23 Jyoti Shokhanda, Utkarsh Pal, Aman Kumar, Soumi Chattopadhyay, Arani Bhattacharya

Taming Tail Latency for Erasure-coded, Distributed Storage Systems

Distributed storage systems are known to be susceptible to long tails in response time. In modern online storage systems such as Bing, Facebook, and Amazon, the long tails of the service latency are of particular concern, with 99.9th…

Distributed, Parallel, and Cluster Computing · Computer Science 2017-04-27 Vaneet Aggarwal, Abubakr O. Al-Abbasi, Jingxian Fan, Tian Lan

Credence: Augmenting Datacenter Switch Buffer Sharing with ML Predictions

Packet buffers in datacenter switches are shared across all the switch ports in order to improve the overall throughput. The trend of shrinking buffer sizes in datacenter switches makes buffer sharing extremely challenging and a critical…

Networking and Internet Architecture · Computer Science 2024-01-08 Vamsi Addanki, Maciej Pacut, Stefan Schmid

Flow Optimization at Inter-Datacenter Networks for Application Run-time Acceleration

Today, distributed applications are commonly spread across multiple datacenters, reaching out to edge and fog computing locations. The transition away from single datacenter hosting is driven by capacity constraints in…

Networking and Internet Architecture · Computer Science 2024-06-19 Berta Serracanta, Alberto Rodriguez-Natal, Fabio Maino, Albert Cabellos

Backpressure Flow Control

Effective congestion control for data center networks is becoming increasingly challenging with a growing amount of latency sensitive traffic, much fatter links, and extremely bursty traffic. Widely deployed algorithms, such as DCTCP and…

Networking and Internet Architecture · Computer Science 2021-03-30 Prateesh Goyal, Preey Shah, Kevin Zhao, Georgios Nikolaidis, Mohammad Alizadeh, Thomas E. Anderson

AccuracyTrader: Accuracy-aware Approximate Processing for Low Tail Latency and High Result Accuracy in Cloud Online Services

Modern latency-critical online services such as search engines often process requests by consulting large input data spanning massive parallel components. Hence the tail latency of these components determines the service latency. To trade…

Distributed, Parallel, and Cluster Computing · Computer Science 2016-07-12 Rui Han, Siguang Huang, Fei Tang, Fugui Chang, Jianfeng Zhan

RepNet: Cutting Tail Latency in Data Center Networks with Flow Replication

Data center networks need to provide low latency, especially at the tail, as demanded by many interactive applications. To improve tail latency, existing approaches require modifications to switch hardware and/or end-host operating systems,…

Networking and Internet Architecture · Computer Science 2015-01-27 Shuhao Liu, Wei Bai, Hong Xu, Kai Chen, Zhiping Cai

Reducing Tail Latencies Through Environment- and Neighbour-aware Thread Management

Application tail latency is a key metric for many services, with high latencies being linked directly to loss of revenue. Modern deeply-nested micro-service architectures exacerbate tail latencies, increasing the likelihood of users…

Distributed, Parallel, and Cluster Computing · Computer Science 2024-07-17 Andrew Jeffery, Chris Jensen, Richard Mortier

FODT: Fast, Online, Distributed and Temporary Failure Recovery Approach for MEC

Mobile edge computing (MEC) can successfully reduce the latency of cloud computing. However, an edge server may fail due to hardware or software issues. When an edge server fails, the users who offload tasks to this server…

Distributed, Parallel, and Cluster Computing · Computer Science 2025-04-17 Xin Yuan, Ning Li, Jose Fernan Martinez

ExpressPass: End-to-End Credit-based Congestion Control for Datacenters

As link speeds increase in datacenter networks, existing congestion control algorithms become less effective in providing fast convergence. TCP-based algorithms that probe for bandwidth take a long time to reach the fair-share and lead to…

Networking and Internet Architecture · Computer Science 2016-10-18 Inho Cho, Dongsu Han, Keon Jang

MPTCP meets FEC: Supporting Latency-Sensitive Applications over Heterogeneous Networks

Over the past years, TCP has gone through numerous updates to provide performance enhancement under diverse network conditions. However, with respect to losses, little can be achieved with legacy TCP detection and recovery mechanisms. Both…

Networking and Internet Architecture · Computer Science 2018-07-31 Simone Ferlin, Stepan Kucera, Holger Claussen, Ozgu Alay

Exploiting the Path Propagation Time Differences in Multipath Transmission with FEC

We consider a transmission of a delay-sensitive data stream from a single source to a single destination. The reliability of this transmission may suffer from bursty packet losses - the predominant type of failures in today's Internet. An…

Networking and Internet Architecture · Computer Science 2009-01-13 Maciej Kurant

TAROT: Towards Optimization-Driven Adaptive FEC Parameter Tuning for Video Streaming

Forward Error Correction (FEC) remains essential for protecting video streaming against packet loss, yet most real deployments still rely on static, coarse-grained configurations that cannot react to rapid shifts in loss rate, goodput, or…

Multimedia · Computer Science 2026-02-11 Jashanjot Singh Sidhu, Aman Sahu, Abdelhak Bentaleb

TailBench++: Flexible Multi-Client, Multi-Server Benchmarking for Latency-Critical Workloads

Cloud systems have rapidly expanded worldwide in the last decade, shifting computational tasks to cloud servers where clients submit their requests. Among cloud workloads, latency-critical applications -- characterized by high-percentile…

Distributed, Parallel, and Cluster Computing · Computer Science 2025-05-07 Zhilin Li, Lucia Pons, Salvador Petit, Julio Sahuquillo, Julio Pons

CAFT: Congestion-Aware Fault-Tolerant Load Balancing for Three-Tier Clos Data Centers

Production data centers operate under various workload sizes ranging from latency-sensitive mice flows to long-lived elephant flows. However, the predominant load balancing scheme in data center networks, equal-cost multi-path (ECMP), is…

Networking and Internet Architecture · Computer Science 2020-10-05 Sultan Alanazi, Bechir Hamdaoui

Tail-Learning: Adaptive Learning Method for Mitigating Tail Latency in Autonomous Edge Systems

In the realm of edge computing, the increasing demand for high Quality of Service (QoS), particularly in dynamic multimedia streaming applications (e.g., Augmented Reality/Virtual Reality and online gaming), has prompted the need for…

Distributed, Parallel, and Cluster Computing · Computer Science 2023-12-29 Cheng Zhang, Yinuo Deng, Hailiang Zhao, Tianlv Chen, Shuiguang Deng