Related papers: A Linear Combination-based Method to Construct Pro…
For the architecture community, reasonable simulation time is a strong requirement in addition to performance data accuracy. However, emerging big data and AI workloads are too huge at binary size level and prohibitively expensive to run on…
Different from the traditional benchmarking methodology that creates a new benchmark or proxy for every possible workload, this paper presents a scalable big data benchmarking methodology. Among a wide variety of big data analytics…
Exascale computing will get mankind closer to solving important social, scientific and engineering problems. Due to high prototyping costs, High Performance Computing (HPC) system architects make use of simulation models for design space…
Quantum information processors need to be protected against errors and faults. One of the most widely considered fault-tolerant architecture is based on surface codes. While the general principles of these codes are well understood and…
As architecture, systems, and data management communities pay greater attention to innovative big data systems and architectures, the pressure of benchmarking and evaluating these systems rises. Considering the broad use of big data…
Data teams at frontier AI companies routinely train small proxy models to make critical decisions about pretraining data recipes for full-scale training runs. However, the community has a limited understanding of whether and when…
Several data warehouse and database providers have recently introduced extensions to SQL called AI Queries, enabling users to specify functions and conditions in SQL that are evaluated by LLMs, thereby broadening significantly the kinds of…
One of the biggest bottlenecks in a machine learning workflow is waiting for models to train. Depending on the available computing resources, it can take days to weeks to train a neural network on a large dataset with many classes such as…
Symbolic execution now becomes an indispensable technique for software testing and program analysis. There are several symbolic execution tools available off-the-shelf, and we need a practical benchmark approach to learn their capabilities.…
As models become larger, ML accelerators are a scarce resource whose performance must be continually optimized to improve efficiency. Existing performance analysis tools are coarse grained, and fail to capture model performance at the…
Big data benchmark suites must include a diversity of data and workloads to be useful in fairly evaluating big data systems and architectures. However, using truly comprehensive benchmarks poses great challenges for the architecture…
The exponential increase in complex IPs within modern SoCs, driven by Moore's Law, has created a pressing need for fast and accurate hardware-software power-performance analysis. Traditional performance simulators (such as cycle accurate…
The server central processing unit (CPU) market continues to exhibit robust demand due to the rising global need for computing power. Against this backdrop, CPU benchmark performance prediction is crucial for architecture designers. It…
The past few years have seen a surge of applying Deep Learning (DL) models for a wide array of tasks such as image classification, object detection, machine translation, etc. While DL models provide an opportunity to solve otherwise…
Recent advances in probabilistic modelling have led to a large number of simulation-based inference algorithms which do not require numerical evaluation of likelihoods. However, a public benchmark with appropriate performance metrics for…
The rise of big data systems has created a need for benchmarks to measure and compare the capabilities of these systems. Big data benchmarks present unique scalability challenges. The supercomputing community has wrestled with these…
Progress in language model development is often driven by comparative decisions: which architecture to adopt, which pretraining corpus to use, or which training recipe to apply. Making these decisions well requires reliable performance…
Power consumption costs takes upto half of operational expenses of datacenters making power management a critical concern. Advances in processor technology provide fine-grained control over operating frequency and voltage of processors and…
Proxy-apps, or mini-apps, are simple self-contained benchmark codes with performance-relevant kernels extracted from real applications. Initially used to facilitate software-hardware co-design, they are a crucial ingredient for serious…
We propose a simulation-based approach for performance modeling of parallel applications on high-performance computing platforms. Our approach enables full-system performance modeling: (1) the hardware platform is represented by an abstract…