Related papers: GEVO: GPU Code Optimization using Evolutionary Com…
Achieving high performance for GPU codes requires developers to have significant knowledge in parallel programming and GPU architectures, and in-depth understanding of the application. This combination makes it challenging to find…
Parallel accelerators, such as GPUs, are key enablers for large-scale Machine Learning (ML) applications. However, ML model developers often lack detailed knowledge of the underlying system architectures, while system programmers usually do…
Evolutionary multiobjective optimization has witnessed remarkable progress during the past decades. However, existing algorithms often encounter computational challenges in large-scale scenarios, primarily attributed to the absence of…
Evolutionary multiobjective optimization (EMO) has made significant strides over the past two decades. However, as problem scales and complexities increase, traditional EMO algorithms face substantial performance limitations due to…
Evolutionary computing (EC) has proven to be effective in solving complex optimization and robotics problems. Unfortunately, typical Evolutionary Algorithms (EAs) are constrained by the computational capacity available to researchers. More…
Evolutionary Reinforcement Learning (EvoRL) has emerged as a promising approach to overcoming the limitations of traditional reinforcement learning (RL) by integrating the Evolutionary Computation (EC) paradigm with RL. However, the…
Tree-based Genetic Programming (TGP) is a widely used evolutionary algorithm for tasks such as symbolic regression, classification, and robotic control. Due to the intensive computational demands of running TGP, GPU acceleration is crucial…
GPUs are widely used to accelerate the training of machine learning workloads. As modern machine learning models become increasingly larger, they require a longer time to train, leading to higher GPU energy consumption. This paper presents…
Genetic Programming (GP), an evolutionary learning technique, has multiple applications in machine learning such as curve fitting, data modelling, feature selection, classification etc. GP has several inherent parallel steps, making it an…
Evolutionary algorithms (EAs) provide unique advantages for optimizing neural networks in complex search spaces. This paper introduces a new web platform, NeuroEvo (neuroevo.io), that allows users to interactively design and train neural…
Optimizing scientific computing algorithms for modern GPUs is a labor-intensive and iterative process involving repeated code modification, benchmarking, and tuning across complex hardware and software stacks. Recent work has explored large…
Inspired by natural evolutionary processes, Evolutionary Computation (EC) has established itself as a cornerstone of Artificial Intelligence. Recently, with the surge in data-intensive applications and large-scale complex systems, the…
Recent advances in LLM-guided evolutionary computation, particularly AlphaEvolve (Novikov et al., 2025; Georgiev et al., 2025), have demonstrated remarkable success in discovering novel mathematical constructions and solving challenging…
CUDA kernel optimization has become a critical bottleneck for AI performance, as deep learning training and inference efficiency directly depends on highly optimized GPU kernels. Despite the promise of Large Language Models (LLMs) for…
Convolution is a fundamental operation in many applications, such as computer vision, natural language processing, image processing, etc. Recent successes of convolutional neural networks in various deep learning applications put even…
GPU kernels have come to the forefront of computing due to their utility in varied fields, from high-performance computing to machine learning. A typical GPU compute kernel is invoked millions, if not billions of times in a typical…
Binary convolutional networks have lower computational load and lower memory foot-print compared to their full-precision counterparts. So, they are a feasible alternative for the deployment of computer vision applications on limited…
Recent years have witnessed phenomenal growth in the application, and capabilities of Graphical Processing Units (GPUs) due to their high parallel computation power at relatively low cost. However, writing a computationally efficient GPU…
Modern computing platforms tend to deploy multiple GPUs (2, 4, or more) on a single node to boost system performance, with each GPU having a large capacity of global memory and streaming multiprocessors (SMs). GPUs are an expensive…
Recent advances in data-driven evolutionary algorithms (EAs) have demonstrated the potential of leveraging historical data to improve optimization accuracy and adaptability. Despite these advancements, existing methods remain reliant on…