Related papers: Migrating large codebases to C++ Modules
C++ Modules come in C++20 to fix the long-standing build scalability problems in the language. They provide an io-efficient, on-disk representation capable to reduce build times and peak memory usage. ROOT employs the C++ modules technology…
ROOT is a data analysis framework broadly used in and outside of High Energy Physics (HEP). Since HEP software frameworks always strive for performance improvements, ROOT was extended with experimental support of runtime C++ Modules. C++…
The ROOT software framework is foundational for the HEP ecosystem, providing capabilities such as IO, a C++ interpreter, GUI, and math libraries. It uses object-oriented concepts and build-time components to layer between them. We believe…
CMS has worked aggressively to make use of multi-core architectures, routinely running 4- to 8-core production jobs in 2017. The primary impediment to efficiently scaling beyond 8 cores has been our ROOT-based output module, which has been…
Rewriting C code in Rust provides stronger memory safety, yet migrating large codebases such as the 32-million-line Linux kernel remains challenging. While rule-based translators (e.g., C2Rust) provide accurate yet largely unsafe Rust…
Rust is a multi-paradigm programming language developed by Mozilla that focuses on performance and safety. Rust code is arguably known best for its speed and memory safety, a property essential while developing embedded systems. Thus, it…
Memory safety has long been a critical challenge in software engineering, particularly for legacy systems written in memory-unsafe languages such as C and C++. Rust, one of the youngest modern programming languages, offers built-in…
C++ 98/03 already has a reputation for overwhelming complexity compared to other programming languages. The raft of new features in C++ 11/14 suggests that the complexity in the next generation of C++ code bases will overwhelm still…
Testing is one of the most indispensable tasks in software engineering. The role of testing in software development has grown significantly because testing is able to reveal defects in the code in an early stage of development. Many unit…
C/C++ is a prevalent programming language. Yet, it suffers from significant memory and thread-safety issues. Recent studies have explored automated translation of C/C++ to safer languages, such as Rust. However, these studies focused mostly…
Memory safety vulnerabilities remain prevalent in today's software systems and one promising solution to mitigate them is to adopt memory-safe languages such as Rust. Due to legacy code written in memory unsafe C, there is strong motivation…
Rust is a strong contender for a memory-safe alternative to C as a "systems" language, but porting the vast amount of existing C code to Rust remains daunting. In this paper, we evaluate the potential of large language models (LLMs) to…
Mathematical software has traditionally been built in the form of "packages" that build on each other. A substantial fraction of these packages is written in C++ and, as a consequence, the interface of a package is described in the form of…
We introduce CPP-UT-Bench, a benchmark dataset to measure C++ unit test generation capability of a large language model (LLM). CPP-UT-Bench aims to reflect a broad and diverse set of C++ codebases found in the real world. The dataset…
Despite the importance of sparse matrices in numerous fields of science, software implementations remain difficult to use for non-expert users, generally requiring the understanding of underlying details of the chosen sparse matrix storage…
The challenges expected for the next era of the Large Hadron Collider (LHC), both in terms of storage and computing resources, provide LHC experiments with a strong motivation for evaluating ways of rethinking their computing models at many…
Migrating existing C programs into Rust is increasingly desired, as Rust offers superior memory safety while maintaining C's high performance. However, vastly different features between C and Rust--e.g., distinct definitions and usages of…
Automating C-to-Rust migration for industrial software remains difficult because build-critical context is scattered across compile configurations, macros, external symbols, and cross-module dependencies, while reusable migration knowledge…
The implementation of persistency in the Compact Muon Solenoid (CMS) Software Framework uses the core I/O functionality of ROOT. We will discuss the current ROOT/IO implementation, its evolution from the prior Objectivity/DB implementation,…
Large language models (LLMs) show promise in code translation due to their ability to generate idiomatic code. However, a significant limitation when using LLMs for code translation is scalability: existing works have shown a drop in…