Related papers: EvolveGen: Algorithmic Level Hardware Model Checki…

EvolveCoder: Evolving Test Cases via Adversarial Verification for Code Reinforcement Learning

Reinforcement learning with verifiable rewards (RLVR) is a promising approach for improving code generation in large language models, but its effectiveness is limited by weak and static verification signals in existing coding RL datasets.…

Computation and Language · Computer Science 2026-03-16 Chi Ruan, Dongfu Jiang, Huaye Zeng, Ping Nie, Wenhu Chen

AutoINV: Automated Invariant Generation Framework for Formal Verification on High-Level Synthesis Designs

High-level synthesis (HLS) transforms an algorithmic description of hardware from a higher abstraction (e.g., C/C++) into a register-transfer level (RTL) design, offering reduced development time and greater flexibility in design space…

Hardware Architecture · Computer Science 2026-04-27 Xiaofeng Zhou, Linfeng Du, Guangyu Hu, Sharad Sinha, Hongce Zhang, Wei Zhang

EvolveTool-Bench: Evaluating the Quality of LLM-Generated Tool Libraries as Software Artifacts

Modern LLM agents increasingly create their own tools at runtime -- from Python functions to API clients -- yet existing benchmarks evaluate them almost exclusively by downstream task completion. This is analogous to judging a software…

Software Engineering · Computer Science 2026-04-02 Alibek T. Kaliyev, Artem Maryanskyy

AssertGen: Enhancement of LLM-aided Assertion Generation through Cross-Layer Signal Bridging

Assertion-based verification (ABV) serves as a crucial technique for ensuring that register-transfer level (RTL) designs adhere to their specifications. While Large Language Model (LLM) aided assertion generation approaches have recently…

Hardware Architecture · Computer Science 2025-09-30 Hongqin Lyu, Yonghao Wang, Yunlin Du, Mingyu Shi, Zhiteng Chao, Wenxing Li, Tiancheng Wang, Huawei Li

ATGen: Adversarial Reinforcement Learning for Test Case Generation

Large Language Models (LLMs) excel at code generation, yet their outputs often contain subtle bugs, for which effective test cases are a critical bottleneck. Existing test generation methods, whether based on prompting or supervised…

Software Engineering · Computer Science 2025-10-17 Qingyao Li, Xinyi Dai, Weiwen Liu, Xiangyang Li, Yasheng Wang, Ruiming Tang, Yong Yu, Weinan Zhang

VERIRL: Boosting the LLM-based Verilog Code Generation via Reinforcement Learning

Recent advancements in code generation have shown remarkable success across software domains, yet hardware description languages (HDLs) such as Verilog remain underexplored due to their concurrency semantics, syntactic rigidity, and…

Machine Learning · Computer Science 2025-08-27 Fu Teng, Miao Pan, Xuhong Zhang, Zhezhi He, Yiyao Yang, Xinyi Chai, Mengnan Qi, Liqiang Lu, Jianwei Yin

EvolVE: Evolutionary Search for LLM-based Verilog Generation and Optimization

Verilog's design cycle is inherently labor-intensive and necessitates extensive domain expertise. Although Large Language Models (LLMs) offer a promising pathway toward automation, their limited training data and intrinsic sequential…

Artificial Intelligence · Computer Science 2026-01-27 Wei-Po Hsin, Ren-Hao Deng, Yao-Ting Hsieh, En-Ming Huang, Shih-Hao Hung

ReVeal: Self-Evolving Code Agents via Reliable Self-Verification

Reinforcement learning with verifiable rewards (RLVR) has advanced the reasoning capabilities of large language models. However, existing methods rely solely on outcome rewards, without explicitly optimizing verification or leveraging…

Software Engineering · Computer Science 2025-10-22 Yiyang Jin, Kunzhao Xu, Hang Li, Xueting Han, Yanmin Zhou, Cheng Li, Jing Bai

Leveraging Procedural Generation to Benchmark Reinforcement Learning

We introduce Procgen Benchmark, a suite of 16 procedurally generated game-like environments designed to benchmark both sample efficiency and generalization in reinforcement learning. We believe that the community will benefit from increased…

Machine Learning · Computer Science 2020-07-28 Karl Cobbe, Christopher Hesse, Jacob Hilton, John Schulman

BERGEN: A Benchmarking Library for Retrieval-Augmented Generation

Retrieval-Augmented Generation allows to enhance Large Language Models with external knowledge. In response to the recent popularity of generative LLMs, many RAG approaches have been proposed, which involve an intricate number of different…

Computation and Language · Computer Science 2024-07-02 David Rau, Hervé Déjean, Nadezhda Chirkova, Thibault Formal, Shuai Wang, Vassilina Nikoulina, Stéphane Clinchant

An Evolutionary Framework for Automatic Optimization Benchmark Generation via Large Language Models

Optimization benchmarks play a fundamental role in assessing algorithm performance; however, existing artificial benchmarks often fail to capture the diversity and irregularity of real-world problem structures, while benchmarks derived from…

Neural and Evolutionary Computing · Computer Science 2026-01-26 Yuhiro Ono, Tomohiro Harada, Yukiya Miura

Benchmarking Reinforcement Learning via Stochastic Converse Optimality: Generating Systems with Known Optimal Policies

The objective comparison of Reinforcement Learning (RL) algorithms is notoriously complex as outcomes and benchmarking of performances of different RL approaches are critically sensitive to environmental design, reward structures, and…

Machine Learning · Computer Science 2026-03-19 Sinan Ibrahim, Grégoire Ouerdane, Hadi Salloum, Henni Ouerdane, Stefan Streif, Pavel Osinenko

ForgeBench: A Machine Learning Benchmark Suite and Auto-Generation Framework for Next-Generation HLS Tools

Although High-Level Synthesis (HLS) has attracted considerable interest in hardware design, it has not yet become mainstream due to two primary challenges. First, current HLS hardware design benchmarks are outdated as they do not cover…

Hardware Architecture · Computer Science 2025-04-22 Andy Wanna, Hanqiu Chen, Cong Hao

Evolutionary Discovery of Reinforcement Learning Algorithms via Large Language Models

Reinforcement learning algorithms are defined by their learning update rules, which are typically hand-designed and fixed. We present an evolutionary framework for discovering reinforcement learning algorithms by searching directly over…

Machine Learning · Computer Science 2026-03-31 Alkis Sygkounas, Amy Loutfi, Andreas Persson

ThreatLens: LLM-guided Threat Modeling and Test Plan Generation for Hardware Security Verification

Current hardware security verification processes predominantly rely on manual threat modeling and test plan generation, which are labor-intensive, error-prone, and struggle to scale with increasing design complexity and evolving attack…

Cryptography and Security · Computer Science 2025-05-13 Dipayan Saha, Hasan Al Shaikh, Shams Tarek, Farimah Farahmandi

RefEvo: Agentic Design with Co-Evolutionary Verification for Agile Reference Model Generation

As the complexity of System-on-Chip (SoC) designs grows, the shift-left paradigm necessitates the rapid development of high-fidelity reference models (typically written in SystemC) for early architecture exploration and verification. While…

Software Engineering · Computer Science 2026-04-28 Yifan Zhang, Jianmin Ye, Jiahao Yang, Xi Wang

Beyond Token-level Supervision: Unlocking the Potential of Decoding-based Regression via Reinforcement Learning

Decoding-based regression, which reformulates regression as a sequence generation task, has emerged as a promising paradigm of applying large language models for numerical prediction. However, its progress is hindered by the misalignment…

Machine Learning · Computer Science 2025-12-09 Ming Chen, Sheng Tang, Rong-Xi Tan, Ziniu Li, Jiacheng Chen, Ke Xue, Chao Qian

EvolMathEval: Towards Evolvable Benchmarks for Mathematical Reasoning via Evolutionary Testing

The rapid advancement of Large Language Models (LLMs) poses a significant challenge to existing mathematical reasoning benchmarks. However, these benchmarks tend to become easier over time as LLMs can learn from the published benchmarks.…

Artificial Intelligence · Computer Science 2025-10-07 Shengbo Wang, Mingwei Liu, Zike Li, Anji Li, Yanlin Wang, Xin Peng, Zibin Zheng

RL-ViGen: A Reinforcement Learning Benchmark for Visual Generalization

Visual Reinforcement Learning (Visual RL), coupled with high-dimensional observations, has consistently confronted the long-standing challenge of out-of-distribution generalization. Despite the focus on algorithms aimed at resolving visual…

Artificial Intelligence · Computer Science 2023-09-27 Zhecheng Yuan, Sizhe Yang, Pu Hua, Can Chang, Kaizhe Hu, Huazhe Xu

EARL: Entropy-Aware RL Alignment of LLMs for Reliable RTL Code Generation

Recent advances in large language models (LLMs) have demonstrated significant potential in hardware design automation, particularly in using natural language to synthesize Register-Transfer Level (RTL) code. Despite this progress, a gap…

Machine Learning · Computer Science 2026-02-26 Jiahe Shi, Zhengqi Gao, Ching-Yun Ko, Duane Boning