English
Related papers

Related papers: IndustryCode: A Benchmark for Industry Code Genera…

200 papers

With the rapid advancement of Multimodal Large Language Models (MLLMs), numerous evaluation benchmarks have emerged. However, comprehensive assessments of their performance across diverse industrial applications remain limited. In this…

Computation and Language · Computer Science 2025-01-29 Dongyi Yi , Guibo Zhu , Chenglin Ding , Zongshu Li , Dong Yi , Jinqiao Wang

LLM-powered coding agents are redefining how real-world software is developed. To drive the research towards better coding agents, we require challenging benchmarks that can rigorously evaluate the ability of such agents to perform various…

Large Language Models (LLMs) have demonstrated remarkable capabilities across various domains, with code generation emerging as a key area of focus. While numerous benchmarks have been proposed to evaluate their code generation abilities,…

Recent advances in Code Large Language Models (CodeLLMs) have primarily focused on open-ended code generation, often overlooking the crucial aspect of code understanding and reasoning. To bridge this gap, we introduce CodeMMLU, a…

Software Engineering · Computer Science 2025-04-10 Dung Nguyen Manh , Thang Phan Chau , Nam Le Hai , Thong T. Doan , Nam V. Nguyen , Quang Pham , Nghi D. Q. Bui

Large Language Models (LLMs) have demonstrated remarkable performance on assisting humans in programming and facilitating programming automation. However, existing benchmarks for evaluating the code understanding and generation capacities…

Computation and Language · Computer Science 2024-06-10 Weixiang Yan , Haitian Liu , Yunkun Wang , Yunzhe Li , Qian Chen , Wen Wang , Tingyu Lin , Weishan Zhao , Li Zhu , Hari Sundaram , Shuiguang Deng

Recent advances in code generation have illuminated the potential of employing large language models (LLMs) for general-purpose programming languages such as Python and C++, opening new opportunities for automating software development and…

Machine Learning · Computer Science 2025-03-06 Jiahao Gai , Hao Mark Chen , Zhican Wang , Hongyu Zhou , Wanru Zhao , Nicholas Lane , Hongxiang Fan

Large language models (LLMs) have been widely deployed in coding tasks, drawing increasing attention to the evaluation of the quality and safety of LLMs' outputs. However, research on bias in code generation remains limited. Existing…

Computation and Language · Computer Science 2025-04-03 Yongkang Du , Jen-tse Huang , Jieyu Zhao , Lu Lin

Writing competitive programming problems is exacting. Authors must: set constraints, input distributions, and edge cases that rule out shortcuts; target specific algorithms (e.g., max-flow, dynamic programming, data structures); and…

Industrial Computer-Aided Design (CAD) code generation requires models to produce executable parametric programs from visual or textual inputs. Beyond recognizing the outer shape of a part, this task involves understanding its 3D structure,…

Artificial Intelligence · Computer Science 2026-05-13 Haozhe Zhang , Kaichen Liu , Miaomiao Chen , Lei Li , Shaojie Yang , Cheng Peng , Hanjie Chen

This study presents a comprehensive empirical evaluation of six state-of-the-art large language models (LLMs) for code generation, including both general-purpose and code-specialized models. Using a dataset of 944 real-world LeetCode…

Software Engineering · Computer Science 2025-12-23 Le Zhang , Suresh Kothari

The rapid advancement of large language models (LLMs) has significantly improved their performance in code generation tasks. However, existing code benchmarks remain static, consisting of fixed datasets with predefined problems. This makes…

Computation and Language · Computer Science 2025-05-30 Wenhao Hu , Jinhao Duan , Chunchen Wei , Li Zhang , Yue Zhang , Kaidi Xu

Code generation is one of the tasks for which the use of Large Language Models is widely adopted and highly successful. Given this popularity, there are many benchmarks dedicated to code generation that can help select the best model.…

Software Engineering · Computer Science 2026-05-12 Joanna Szych , Anne Schwerk

Since language models (LMs) now outperform average humans on many challenging tasks, it has become increasingly difficult to develop challenging, high-quality, and realistic evaluations. We address this issue by examining LMs' capabilities…

Large Language Models (LLMs) have emerged as coding assistants, capable of generating source code from natural language prompts. With the increasing adoption of LLMs in software development, academic research and industry based projects are…

Code review is a critical practice in modern software engineering, helping developers detect defects early, improve code quality, and facilitate knowledge sharing. With the rapid advancement of large language models (LLMs), a growing body…

Software Engineering · Computer Science 2026-02-17 Taufiqul Islam Khan , Shaowei Wang , Haoxiang Zhang , Tse-Hsun Chen

Recently, large language models (LLMs) have become increasingly powerful and have become capable of solving a plethora of tasks through proper instructions in natural language. However, the vast majority of testing suites assume that the…

Computation and Language · Computer Science 2024-02-21 Adrian Cosma , Bogdan Iordache , Paolo Rosso

Large Language Models are essential coding assistants, yet their training is predominantly English-centric. In this study, we evaluate the performance of code language models in non-English contexts, identifying challenges in their adoption…

Large language models (LLMs) have demonstrated remarkable capabilities in code generation across various domains. However, their effectiveness in generating simulation scripts for domain-specific environments like ns-3 remains…

Networking and Internet Architecture · Computer Science 2025-07-16 Tasnim Ahmed , Mirza Mohammad Azwad , Salimur Choudhury

Large language models (LLMs) can often generate functionally correct code, but their ability to produce efficient implementations for performance-critical systems tasks remains limited. Existing code benchmarks mainly emphasize correctness…

Software Engineering · Computer Science 2026-05-18 Huihao Jing , Wenbin Hu , Haochen Shi , Hanyu Yang , Sirui Zhang , Shaojin Chen , Haoran Li , Yangqiu Song

As large language models (LLMs) become increasingly embedded in software engineering workflows, a critical capability remains underexplored: generating correct code that enables cross-programming-language (CPL) interoperability. This skill…

Software Engineering · Computer Science 2025-07-29 Zhanhang Xiong , Dongxia Wang , Yuekang Li , Xinyuan An , Wenhai Wang
‹ Prev 1 2 3 10 Next ›