Related papers: Exploiting Data Skew for Improved Query Performanc…
Not all data is equally popular. Often, some portion of the data is accessed more frequently than the rest, which skews the popularity of the data items. Adapting to this skew can improve performance, and this topic has been studied…
Replicating or caching popular content in memories distributed across the network is a technique to reduce peak network loads. Conventionally, the performance gain of caching was thought to result from making part of the requested data…
The increasing need for large-scale data analytics in a number of application domains has led to a dramatic rise in the number of distributed data management systems, both parallel relational databases and systems that support alternative…
Due to the ubiquity of spatial data applications and the large amounts of spatial data that these applications generate and process, there is a pressing need for scalable spatial query processing. In this paper, we present new techniques…
Multi-criteria decision making has been made possible with the advent of skyline queries. However, processing such queries for high-dimensional datasets remains a time-consuming task. Real-time applications are thus infeasible, especially…
Ever-growing end-user data demands and simultaneous reductions in memory costs are fueling edge-caching deployments. Caching at the edge is substantially different from that at the core and needs to take into account the nature of…
We study the problem of computing a conjunctive query q in parallel, using p servers, on a large database. We consider algorithms with one round of communication, and study the complexity of the communication. We are especially…
Modern business applications and scientific databases call for inherently dynamic data storage environments. Such environments are characterized by two challenging features: (a) they have little idle system time to devote to physical…
Edge caching plays an increasingly important role in boosting user content retrieval performance while reducing redundant network traffic. The effectiveness of caching ultimately hinges on the accuracy of predicting content popularity in…
Caching search results is employed in information retrieval systems to expedite query processing and reduce back-end server workload. Motivated by the observation that queries belonging to different topics have different temporal-locality…
Rapid response, namely low latency, is fundamental in search applications; it is particularly so in interactive search sessions, such as those encountered in conversational settings. An observation with a potential to reduce latency asserts…
In this paper we develop a novel technique to analyze both isolated and interconnected caches operating under different caching strategies and realistic traffic conditions. The main strength of our approach is the ability to consider…
With the tremendous growth of data traffic over wired and wireless networks along with the increasing number of rich-media applications, caching is envisioned to play a critical role in next-generation networks. To intelligently prefetch…
Scientific collaborations increasingly rely on large volumes of data for their work, and many of them employ tiered systems to replicate the data to their worldwide user communities. Each user in the community often selects a…
One of the most common basic techniques for improving the performance of web applications is caching frequently accessed data in fast data stores, colloquially known as cache daemons. In this paper we present a cache daemon suitable for…
In recent times, the production of multidimensional data in various domains and its storage in array databases have increased sharply; this rapid growth in data volumes necessitates compression in array databases. However,…
Indexing large-scale databases in main memory is still challenging today. Learned index structures -- in which the core components of classical indexes are replaced with machine learning models -- have recently been suggested to…
When data stores and users are distributed geographically, it is essential to organize distributed data cache points at ideal locations to minimize data transfers. To answer this, we are developing an adaptive distributed data caching…
Data-intensive applications often require exploratory analysis of large datasets. If analysis is performed on distributed resources, data locality can be crucial to high throughput and performance. We propose a "data diffusion" approach…
Caching at base stations (BSs) is a promising approach for supporting the tremendous traffic growth of content delivery over future small-cell wireless networks with limited backhaul. This paper considers exploiting spatial caching…