Related papers: Fine-Grained Library Customization
Modern software development is increasingly dependent on components, libraries and frameworks coming from third-party vendors or open-source suppliers and made available through a number of platforms (or forges). This way of writing…
Software complexity has increased over the years. One common way to tackle this complexity during development is to encapsulate features into a shared library. This allows developers to reuse already implemented features instead of…
The Lean mathematical library Mathlib is one of the fastest-growing libraries of formalised mathematics. We describe various strategies to manage this growth, while allowing for change and avoiding maintainer overload. This includes dealing…
Fast numerical libraries have been a cornerstone of scientific computing for decades, but this comes at a price. Programs may be tied to vendor specific software ecosystems resulting in polluted, non-portable code. As we enter an era of…
Today's software is bloated with both code and features that are not used by most users. This bloat is prevalent across the entire software stack, from operating systems and applications to containers. Containers are lightweight…
Software reuse may result in software bloat when significant portions of application dependencies are effectively unused. Several tools exist to remove unused (byte)code from an application or its dependencies, thus producing smaller…
Dynamic slicing techniques compute program dependencies to find all statements that affect the value of a variable at a program point for a specific execution. Despite their many potential uses, applicability is limited by the fact that…
Serverless computing abstracts away server management, enabling automatic scaling, efficient resource utilization, and cost-effective pricing models. However, despite these advantages, it faces the significant challenge of cold-start…
Dynamic program slicing can significantly reduce the code developers need to inspect by narrowing it down to only a subset of relevant program statements. However, despite an extensive body of research showing its usefulness, dynamic…
Mixed Integer Linear Programming (MILP) is a fundamental tool for modeling combinatorial optimization problems. Recently, a growing body of research has used machine learning to accelerate MILP solving. Despite the increasing popularity of…
Third-party Python libraries introduce dependency management overhead, supply chain risk, and deployment friction in constrained environments. A natural question is how much of this ecosystem can be replicated using only Python's standard…
Iteratively improving and repairing source code with large language models (LLMs), known as refinement, has emerged as a popular way of generating programs that would be too complex to construct in one shot. Given a bank of test cases,…
Much software, whether beneficent or malevolent, is distributed only as binaries, sans source code. Absent source code, understanding binaries' behavior can be quite challenging, especially when compiled under higher levels of compiler…
Slicing is a program analysis technique originally developed for imperative languages. It facilitates understanding of data flow and debugging. This paper discusses slicing of Constraint Logic Programs. Constraint Logic Programming (CLP) is…
We describe here a library aimed at automating the solution of partial differential equations using the finite element method. By employing novel techniques for automated code generation, the library combines a high level of expressiveness…
Static program slicing is a fundamental software engineering technique for isolating code relevant to specific variables. While recent learning-based approaches using language models (LMs) show promise in automating slice prediction, they…
Software bloat refers to code and features that are not used by software at runtime. For Machine Learning (ML) systems, bloat is a major contributor to technical debt, leading to decreased performance and resource wastage. In this…
Industrial applications heavily integrate open-source software libraries nowadays. Beyond the benefits that libraries bring, they can also pose a real threat when a library is affected by a vulnerability but its community is not active…
Code completion is a prominent application of Large Language Models (LLMs) in software engineering. Due to the near real-time response requirements of this task, base models with small to medium-sized parameters are typically employed,…
Large Language Models (LLMs) equipped with external tools have demonstrated enhanced performance on complex reasoning tasks. The widespread adoption of this tool-augmented reasoning is hindered by the scarcity of domain-specific tools. For…