Related papers: Flexible Queueing Architectures
A parallel server system is a stochastic processing network with applications in manufacturing, supply chain, ride-hailing, call centers, etc. Heterogeneous customers arrive in the system, and only a subset of servers can serve any customer…
Gaitonde and Tardos recently studied a model of queueing networks where queues compete for servers and re-send returned packets in future rounds. They quantify the amount of additional processing power that guarantees a decentralized…
In this thesis, we propose and analyze a multi-server model that captures a performance trade-off between centralized and distributed processing. In our model, a fraction $p$ of an available resource is deployed in a centralized manner…
Consider a network of $n$ single-server queues where tasks arrive independently at each server at rate $\lambda_n$. The servers are connected by a graph that is resampled at rate $\mu_n$ in a way that is symmetric with respect to the…
We consider a service system with an infinite number of exponential servers sharing a finite service capacity. The servers are ordered according to their speed, and arriving customers join the fastest idle server. A capacity allocation is…
We consider a single server queueing system with two classes of jobs: eager jobs with small sizes that require service to begin almost immediately upon arrival, and tolerant jobs with larger sizes that can wait for service. While blocking…
We consider load balancing in large-scale heterogeneous server systems in the presence of data locality that imposes constraints on which tasks can be assigned to which servers. The constraints are naturally captured by a bipartite graph…
We consider a generalized processing system having several queues, where the available service rate combinations are fluctuating over time due to reliability and availability variations. The objective is to allocate the available resources,…
We consider the following distributed service model: jobs with unit mean, general distribution, and independent processing times arrive as a renewal process of rate $\lambda n$, with $0<\lambda<1$, and are immediately dispatched to one of…
This paper considers a population process on a dynamically evolving graph, which can be alternatively interpreted as a queueing network. The queues are of infinite-server type, entailing that at each node all customers present are served in…
We consider the problem of scheduling a queueing system in which many statistically identical servers cater to several classes of impatient customers. Service times and impatience clocks are exponential while arrival processes are renewal.…
Distributed computing systems implement redundancy to reduce the job completion time and variability. Despite a large body of work on computing redundancy, the analytical performance evaluation of redundancy techniques in queueing systems…
We study large-scale systems operating under the JSQ$(d)$ policy in the presence of stringent task-server compatibility constraints. Consider a system with $N$ identical single-server queues and $M(N)$ task types, where each server is able…
We consider stability and network capacity in discrete-time queueing systems. Relationships between four common notions of stability are described. Specifically, we consider rate stability, mean rate stability, steady-state stability, and…
A fundamental challenge in large-scale cloud networks and data centers is to achieve highly efficient server utilization and limit energy consumption, while providing excellent user-perceived performance in the presence of uncertain and…
Virtualization technology facilitates a dynamic, demand-driven allocation and migration of servers. This paper studies how the flexibility offered by network virtualization can be used to improve Quality-of-Service parameters such as…
Prolonged service times at non-dedicated servers have been observed in [1]. Motivated by such practical problems, we propose a stylized model that captures the prolonged service time at non-dedicated servers in an…
Modern processing networks often consist of heterogeneous servers with widely varying capabilities, and process job flows with complex structure and requirements. A major challenge in designing efficient scheduling policies in these…
We consider processing networks where multiple dispatchers are connected to single-server queues by a bipartite compatibility graph, modeling constraints that are common in data centers and cloud networks due to geographic reasons or data…
Problem definition: In many matching markets, some agents are fully flexible, while others only accept a subset of jobs. For example, ridesharing drivers can specify on the platform the destinations they are willing to accept. Conventional…