Related papers: Codestitcher: Inter-Procedural Basic Block Layout …

200 papers

Basic block reordering is an important step for profile-guided binary optimization. The state-of-the-art goal for basic block reordering is to maximize the number of fall-through branches. However, we demonstrate that such orderings may…

Programming Languages · Computer Science 2020-04-14 Andy Newell, Sergey Pupyrev

Fast compilation is important when compilation occurs at runtime, such as query compilers in modern database systems and WebAssembly virtual machines in modern browsers. We present copy-and-patch, an extremely fast compilation technique…

Programming Languages · Computer Science 2021-09-16 Haoran Xu, Fredrik Kjolstad

Function layout, also referred to as function reordering or function placement, is one of the most effective profile-guided compiler optimizations. By reordering functions in a binary, compilers are able to greatly improve the performance…

Programming Languages · Computer Science 2022-11-18 Ellis Hoag, Kyungwoo Lee, Julián Mestre, Sergey Pupyrev

Block-based programming languages like Scratch have become increasingly popular as introductory languages for novices. These languages are intended to be used with a "tinkering" approach which allows learners and teachers to quickly…

Code often suffers from performance bugs. These bugs necessitate the research and practice of code optimization. Traditional rule-based methods rely on manually designing and maintaining rules for specific performance bugs (e.g., redundant…

Software Engineering · Computer Science 2025-12-30 Yue Wu, Minghao Han, Ruiyin Li, Peng Liang, Amjed Tahir, Zengyang Li, Qiong Feng, Mojtaba Shahin

Coded caching is a technique that generalizes conventional caching and promises significant reductions in traffic over caching networks. However, the basic coded caching scheme requires that each file hosted in the server be partitioned…

Information Theory · Computer Science 2018-02-20 Li Tang, Aditya Ramamoorthy

Existing methods fail to effectively steer Large Language Models (LLMs) between textual reasoning and code generation, leaving symbolic computing capabilities underutilized. We introduce CodeSteer, an effective method for guiding LLM…

Computation and Language · Computer Science 2025-05-30 Yongchao Chen, Yilun Hao, Yueying Liu, Yang Zhang, Chuchu Fan

Block editors, such as the one in Scratch, are now widely used for visual programming languages (VPLs). Although they are friendly for non-programmers, they have three main limitations when displaying block code: (1) the…

Human-Computer Interaction · Computer Science 2016-05-04 Sheng-yi Hsu, Yuan-fu Lou, Chuen-tsai Sun

Code optimization is a challenging task requiring a substantial level of expertise from developers. Nonetheless, this level of human capacity is not sufficient considering the rapid evolution of new hardware architectures and software…

Elasticity is offered by cloud service providers to exploit under-utilized computing resources. The low-cost elastic nodes can leave and join any time during the computation cycle. The possibility of elastic events occurring together with…

Distributed, Parallel, and Cluster Computing · Computer Science 2022-06-24 Shahrzad Kiani, Tharindu Adikari, Stark C. Draper

In recent years, there has been a surge in machine learning applications in industry. Many of them are based on popular AI frameworks such as TensorFlow, Torch, Caffe, or MXNet, and are empowered by accelerator platforms such as GPUs. One…

Distributed, Parallel, and Cluster Computing · Computer Science 2018-11-14 Guoping Long, Jun Yang, Kai Zhu, Wei Lin

Performance optimization is the art of continuously seeking a harmonious mapping between the application domain and hardware. Recent years have witnessed a surge of deep learning (DL) applications in industry. Conventional wisdom for…

Distributed, Parallel, and Cluster Computing · Computer Science 2019-11-27 Guoping Long, Jun Yang, Wei Lin

This study investigates the problem of learning linear block codes optimized for Belief-Propagation decoders, significantly improving performance compared to the state of the art. Our previous research is extended with an enhanced system…

Signal Processing · Electrical Eng. & Systems 2025-10-02 Louis-Adrien Dufrène, Quentin Lampin, Guillaume Larue

Performance optimization for large-scale applications has recently become more important as computation continues to move towards data centers. Data-center applications are generally very large and complex, which makes code layout an…

Programming Languages · Computer Science 2018-10-16 Maksim Panchenko, Rafael Auler, Bill Nell, Guilherme Ottoni

Even in the era of Deep Learning based methods, traditional machine learning methods with large data sets continue to attract significant attention. However, we find an apparent lack of a detailed performance characterization of these…

Performance · Computer Science 2024-12-30 Harsh Kumar, R. Govindarajan

Mining large graphs for information is becoming an increasingly important workload due to the plethora of graph structured data becoming available. An aspect of graph algorithms that has hitherto not received much interest is the effect of…

Data Structures and Algorithms · Computer Science 2012-03-27 Amitabha Roy

Modern high-performance architectures employ large last-level caches (LLCs). While large LLCs can reduce average memory access latency for workloads with a high degree of locality, they can also increase latency for workloads with irregular…

Hardware Architecture · Computer Science 2025-11-26 Hoa Nguyen, Pongstorn Maidee, Jason Lowe-Power, Alireza Kaviani

Building on the previous work of Lee et al. and Ferdinand et al. on coded computation, we propose a sequential approximation framework for solving optimization problems in a distributed manner. In a distributed computation system, latency…

Information Theory · Computer Science 2017-10-26 Jingge Zhu, Ye Pu, Vipul Gupta, Claire Tomlin, Kannan Ramchandran

We show in this work that memory intensive computations can result in severe performance problems due to off-chip memory access and CPU-GPU context switch overheads in a wide range of deep learning models. For this problem, current…

Distributed, Parallel, and Cluster Computing · Computer Science 2021-12-20 Zhen Zheng, Pengzhan Zhao, Guoping Long, Feiwen Zhu, Kai Zhu, Wenyi Zhao, Lansong Diao, Jun Yang, Wei Lin

Traditional optimizing compilers have played an important role in adapting to the growing complexity of modern software systems. The need for efficient parallel programming in current architectures requires strong optimization techniques.…

Artificial Intelligence · Computer Science 2025-04-03 Miguel Romero Rosas, Miguel Torres Sanchez, Rudolf Eigenmann