Related papers: Latency Based Tiling

Model-Driven Automatic Tiling with Cache Associativity Lattices

Traditional compiler optimization theory distinguishes three separate classes of cache miss -- Cold, Conflict and Capacity. Tiling for cache is typically guided by capacity miss counts. Models of cache function have not been effectively…

Performance · Computer Science 2015-11-19 David Adjiashvili, Utz-Uwe Haus, Adrian Tate

Breaking the Memory Barrier: Near Infinite Batch Size Scaling for Contrastive Loss

Contrastive loss is a powerful approach for representation learning, where larger batch sizes enhance performance by providing more negative samples to better distinguish between similar and dissimilar data. However, scaling batch sizes is…

Computer Vision and Pattern Recognition · Computer Science 2024-10-23 Zesen Cheng, Hang Zhang, Kehan Li, Sicong Leng, Zhiqiang Hu, Fei Wu, Deli Zhao, Xin Li, Lidong Bing

Modeling and Optimizing Latency for Delayed Hit Caching with Stochastic Miss Latency

Caching is crucial for system performance, but the delayed hit phenomenon, where requests queue during lengthy fetches after a cache miss, significantly degrades user-perceived latency in modern high-throughput systems. While prior works…

Networking and Internet Architecture · Computer Science 2025-05-22 Bowen Jiang, Chaofan Ma

Fused-Tiled Layers: Minimizing Data Movement on RISC-V SoCs with Software-Managed Caches

The success of DNNs and their high computational requirements pushed for large codesign efforts aiming at DNN acceleration. Since DNNs can be represented as static computational graphs, static memory allocation and tiling are two crucial…

Hardware Architecture · Computer Science 2025-04-08 Victor J. B. Jung, Alessio Burrello, Francesco Conti, Luca Benini

Automated Tiling of Unstructured Mesh Computations with Application to Seismological Modelling

Sparse tiling is a technique to fuse loops that access common data, thus increasing data locality. Unlike traditional loop fusion or blocking, the loops may have different iteration spaces and access shared datasets through indirect memory…

Computational Engineering, Finance, and Science · Computer Science 2019-06-20 Fabio Luporini, Michael Lange, Christian T. Jacobs, Gerard J. Gorman, J. Ramanujam, Paul H. J. Kelly

Deep Tiling: Texture Tile Synthesis Using a Deep Learning Approach

Texturing is a fundamental process in computer graphics. Texture is leveraged to enhance the visualization outcome for a 3D scene. In many cases a texture image cannot cover a large 3D model surface because of its small resolution.…

Computer Vision and Pattern Recognition · Computer Science 2021-03-16 Vasilis Toulatzis, Ioannis Fudos

Reducing DRAM Latency at Low Cost by Exploiting Heterogeneity

In modern systems, DRAM-based main memory is significantly slower than the processor. Consequently, processors spend a long time waiting to access data from main memory, making the long main memory access latency one of the most critical…

Hardware Architecture · Computer Science 2016-11-01 Donghyuk Lee

Taming Tail Latency for Erasure-coded, Distributed Storage Systems

Distributed storage systems are known to be susceptible to long tails in response time. In modern online storage systems such as Bing, Facebook, and Amazon, the long tails of the service latency are of particular concern, with 99.9th…

Distributed, Parallel, and Cluster Computing · Computer Science 2017-04-27 Vaneet Aggarwal, Abubakr O. Al-Abbasi, Jingxian Fan, Tian Lan

Spatiotemporal Tile-based Attention-guided LSTMs for Traffic Video Prediction

This extended abstract describes our solution for the Traffic4Cast Challenge 2019. The task requires modeling both fine-grained (pixel-level) and coarse (region-level) spatial structure while preserving temporal relationships across long…

Computer Vision and Pattern Recognition · Computer Science 2025-10-09 Tu Nguyen

The PRIMPing Routine -- Tiling through Proximal Alternating Linearized Minimization

Mining and exploring databases should provide users with knowledge and new insights. Tiles of data strive to unveil true underlying structure and distinguish valuable information from various kinds of noise. We propose a novel Boolean…

Artificial Intelligence · Computer Science 2019-06-25 Sibylle Hess, Katharina Morik, Nico Piatkowski

ARMS: Adaptive and Robust Memory Tiering System

Memory tiering systems seek cost-effective memory scaling by adding multiple tiers of memory. For maximum performance, frequently accessed (hot) data must be placed close to the host in faster tiers and infrequently accessed (cold) data can…

Operating Systems · Computer Science 2025-08-07 Sujay Yadalam, Konstantinos Kanellis, Michael Swift, Shivaram Venkataraman

Faster and Better LLMs via Latency-Aware Test-Time Scaling

Test-Time Scaling (TTS) has proven effective in improving the performance of Large Language Models (LLMs) during inference. However, existing research has overlooked the efficiency of TTS from a latency-sensitive perspective. Through a…

Computation and Language · Computer Science 2025-09-15 Zili Wang, Tianyu Zhang, Haoli Bai, Lu Hou, Xianzhi Yu, Wulong Liu, Shiming Xiang, Lei Zhu

Latency and Token-Aware Test-Time Compute

Inference-time scaling has emerged as a powerful way to improve large language model (LLM) performance by generating multiple candidate responses and selecting among them. However, existing work on dynamic allocation for test-time compute…

Machine Learning · Computer Science 2025-09-15 Jenny Y. Huang, Mehul Damani, Yousef El-Kurdi, Ramon Astudillo, Wei Sun

Latent Fingerprint Recognition: Role of Texture Template

We propose a texture template approach, consisting of a set of virtual minutiae, to improve the overall latent fingerprint recognition accuracy. To compensate for the lack of a sufficient number of minutiae in poor quality latent prints, we…

Computer Vision and Pattern Recognition · Computer Science 2018-04-30 Kai Cao, Anil K. Jain

Your Latent Reasoning is Secretly Policy Improvement Operator

Recently, small models with latent recursion have obtained promising results on complex reasoning tasks. These results are typically explained by the theory that such recursion increases a network's depth, allowing it to compactly emulate…

Computation and Language · Computer Science 2026-02-06 Arip Asadulaev, Rayan Banerjee, Fakhri Karray, Martin Takac

Fused Depthwise Tiling for Memory Optimization in TinyML Deep Neural Network Inference

Memory optimization for deep neural network (DNN) inference gains high relevance with the emergence of TinyML, which refers to the deployment of DNN inference tasks on tiny, low-power microcontrollers. Applications such as audio keyword…

Machine Learning · Computer Science 2023-04-03 Rafael Stahl, Daniel Mueller-Gritschneder, Ulf Schlichtmann

BackCache: Mitigating Contention-Based Cache Timing Attacks by Hiding Cache Line Evictions

Caches are used to reduce the speed differential between the CPU and memory to improve the performance of modern processors. However, attackers can use contention-based cache timing attacks to steal sensitive information from victim…

Cryptography and Security · Computer Science 2024-06-13 Quancheng Wang, Xige Zhang, Han Wang, Yuzhe Gu, Ming Tang

On the Impact of Spatial Covariance Matrix Ordering on Tile Low-Rank Estimation of Matérn Parameters

Spatial statistical modeling and prediction involve generating and manipulating an n × n symmetric positive definite covariance matrix, where n denotes the number of spatial locations. However, when n is large, processing this covariance…

Computation · Statistics 2024-02-15 Sihan Chen, Sameh Abdulah, Ying Sun, Marc G. Genton

Latency-Aware Differentiable Neural Architecture Search

Differentiable neural architecture search methods became popular in recent years, mainly due to their low search costs and flexibility in designing the search space. However, these methods suffer from difficulty in optimizing the network, so…

Computer Vision and Pattern Recognition · Computer Science 2020-03-27 Yuhui Xu, Lingxi Xie, Xiaopeng Zhang, Xin Chen, Bowen Shi, Qi Tian, Hongkai Xiong

Adaptive-Latency DRAM: Reducing DRAM Latency by Exploiting Timing Margins

This paper summarizes the idea of Adaptive-Latency DRAM (AL-DRAM), which was published in HPCA 2015, and examines the work's significance and future potential. AL-DRAM is a mechanism that optimizes DRAM latency based on the DRAM module and…

Hardware Architecture · Computer Science 2018-05-09 Donghyuk Lee, Yoongu Kim, Gennady Pekhimenko, Samira Khan, Vivek Seshadri, Kevin Chang, Onur Mutlu