Related papers: SeLeP: Learning Based Semantic Prefetching for Exp…
Modern prefetchers identify memory access patterns in order to predict future accesses. However, many applications exhibit irregular access patterns that do not manifest spatio-temporal locality in the memory address space. Such…
Data prefetching, i.e., the act of predicting application's future memory accesses and fetching those that are not in the on-chip caches, is a well-known and widely-used approach to hide the long latency of memory accesses. The fruitfulness…
Query understanding plays a key role in exploring users' search intents and facilitating users to locate their most desired information. However, it is inherently challenging since it needs to capture semantic information from short and…
Data Prefetching is a technique that can hide memory latency by fetching data before it is needed by a program. Prefetching relies on accurate memory access prediction, to which task machine learning based methods are increasingly applied.…
Data prefetching aims to improve access times to data storage systems by predicting data records that are likely to be accessed by subsequent requests and retrieving them into a memory cache before they are needed. In the case of Persistent…
Cache prefetcher greatly eliminates compulsory cache misses, by fetching data from slower memory to faster cache before it is actually required by processors. Sophisticated prefetchers predict next use cache line by repeating program's…
Accurate memory prefetching is paramount for processor performance, and modern processors employ various techniques to identify and prefetch different memory access patterns. While most modern prefetchers target spatio-temporal patterns by…
Internet as become the way of life in the fast growing digital life.Even with the increase in the internet speed, higher latency time is still a challenge. To reduce latency, caching and pre fetching techniques can be used. However, caching…
The exploration and understanding of Executable and Linkable Format (ELF) objects underpin various critical activities in computer systems, from debugging to reverse engineering. Traditional UNIX tooling like readelf, nm, and objdump have…
The discrepancy between processor speed and memory system performance continues to limit the performance of many workloads. To address the issue, one effective and well studied technique is cache prefetching. Many prefetching designs have…
Machine learning algorithms have shown potential to improve prefetching performance by accurately predicting future memory accesses. Existing approaches are based on the modeling of text prediction, considering prefetching as a…
Mission-critical applications often run "forever" and process large data volumes in real time while demanding low latency. To handle the large state of these applications, modern streaming engines rely on key-value stores and store state on…
Mobile workloads incur heavy frontend stalls due to increasingly large code footprints as well as long repeat cycles. Existing instruction-prefetching techniques suffer from low coverage, poor timeliness, or high cost. We provide a SW/HW…
Recent approaches for learning policies to improve caching, target just one out of the prefetching, admission and eviction processes. In contrast, we propose an end to end pipeline to learn all three policies using machine learning. We also…
With the dramatic increase in the amount of the text-based data which commonly contains misspellings and other errors, querying such data with flexible search patterns becomes more and more commonplace. Relational databases support the LIKE…
Modern high-performance architectures employ large last-level caches (LLCs). While large LLCs can reduce average memory access latency for workloads with a high degree of locality, they can also increase latency for workloads with irregular…
Data preparation is a central and time-consuming stage in data analysis workflows. Traditionally, commercial tools have relied on graphical user interfaces (GUIs) to simplify data preparation, allowing users to define transformations…
This paper presents a framework for increasing the relevancy of the web pages retrieved by the search engine. The approach introduces a Predictive Prefetching Engine (PPE) which makes use of various data mining algorithms on the log…
Large Language Models (LLMs) have become widely used for Software Engineering (SE) tasks, spanning from function-level code generation to complex repository-level workflows. However, the high latency of autoregressive inference remains a…
High load latency that results from deep cache hierarchies and relatively slow main memory is an important limiter of single-thread performance. Data prefetch helps reduce this latency by fetching data up the hierarchy before it is…