Related papers: Multi-Programming Language Sandbox for LLMs

200 papers

As the capabilities of code large language models (LLMs) continue to expand, their applications across diverse code intelligence domains are rapidly increasing. However, most existing datasets only evaluate limited application domains. To…

To facilitate research on large language models (LLMs), this paper presents a comprehensive and unified library, LLMBox, to ease the development, use, and evaluation of LLMs. The library offers three main merits: (1) a…

While large language models (LLMs) are powerful assistants in programming tasks, they may also produce malicious code. Testing LLM-generated code therefore poses significant risks to assessment infrastructure tasked with executing untrusted…

Cryptography and Security · Computer Science 2025-04-02 Rafiqul Rabin, Jesse Hostetler, Sean McGregor, Brett Weir, Nick Judd

Code sandboxes have emerged as a critical infrastructure for advancing the coding capabilities of large language models, providing verifiable feedback for both RL training and evaluation. However, existing systems fail to provide accurate…

Software Engineering · Computer Science 2026-05-01 Jiasheng Zheng, Xin Zheng, Boxi Cao, Pengbo Wang, Zhengzhao Ma, Qiming Zhu, Jiazhen Jiang, Yaojie Lu, Hongyu Lin, Xianpei Han, Le Sun

Large language models (LLMs) have become an essential tool to support developers using traditional text-based programming languages, but the graphical notation of the block-based Scratch programming environment inhibits the use of LLMs. To…

Software Engineering · Computer Science 2026-02-09 Benedikt Fein, Florian Obermüller, Gordon Fraser

Recent advancements in large language models (LLMs) have sparked growing research interest in tool-assisted LLMs that solve real-world challenges, which calls for comprehensive evaluation of tool-use capabilities. While previous works focused on…

Computation and Language · Computer Science 2025-04-18 Jiarui Lu, Thomas Holleis, Yizhe Zhang, Bernhard Aumayer, Feng Nan, Felix Bai, Shuang Ma, Shen Ma, Mengyu Li, Guoli Yin, Zirui Wang, Ruoming Pang

Large Language Models (LLMs) have emerged as coding assistants, capable of generating source code from natural language prompts. With the increasing adoption of LLMs in software development, academic research and industry-based projects are…

Task automation via Python code has been greatly empowered by recent advances in Large Language Models (LLMs), with tasks ranging from software engineering development to general-purpose reasoning. While current benchmarks have…

The evaluation of large language models (LLMs) is crucial to assess their performance and mitigate potential security risks. In this paper, we introduce PromptBench, a unified library to evaluate LLMs. It consists of several key components…

Artificial Intelligence · Computer Science 2024-08-21 Kaijie Zhu, Qinlin Zhao, Hao Chen, Jindong Wang, Xing Xie

Large Language Model (LLM) alignment aims to ensure that LLM outputs match human values. Researchers have demonstrated the severity of alignment problems with a large spectrum of jailbreak techniques that can induce LLMs to produce…

Computation and Language · Computer Science 2024-02-06 Xiaolong Jin, Zhuo Zhang, Xiangyu Zhang

Large Language Models (LLMs) demonstrate remarkable performance in semantic understanding and generation, yet accurately assessing their output reliability remains a significant challenge. While numerous studies have explored calibration…

Artificial Intelligence · Computer Science 2024-12-18 Liangru Xie, Hui Liu, Jingying Zeng, Xianfeng Tang, Yan Han, Chen Luo, Jing Huang, Zhen Li, Suhang Wang, Qi He

Large language models (LLMs) can often generate functionally correct code, but their ability to produce efficient implementations for performance-critical systems tasks remains limited. Existing code benchmarks mainly emphasize correctness…

Software Engineering · Computer Science 2026-05-18 Huihao Jing, Wenbin Hu, Haochen Shi, Hanyu Yang, Sirui Zhang, Shaojin Chen, Haoran Li, Yangqiu Song

Recently, program synthesis driven by large language models (LLMs) has become increasingly popular. However, program synthesis for machine learning (ML) tasks still poses significant challenges. This paper explores a novel form of program…

Software Engineering · Computer Science 2024-09-10 Jinglue Xu, Jialong Li, Zhen Liu, Nagar Anthel Venkatesh Suryanarayanan, Guoyuan Zhou, Jia Guo, Hitoshi Iba, Kenji Tei

Large Language Model (LLM) agents show considerable promise for automating complex tasks using contextual reasoning; however, interactions involving multiple agents and the system's susceptibility to prompt injection and other forms of…

Cryptography and Security · Computer Science 2025-06-02 Kaiyuan Zhang, Zian Su, Pin-Yu Chen, Elisa Bertino, Xiangyu Zhang, Ninghui Li

Although large language models (LLMs) have demonstrated strong intelligence, their high demand for computation and storage hinders their practical application. To this end, many model compression techniques have been proposed to…

Computation and Language · Computer Science 2024-11-01 Ge Yang, Changyi He, Jinyang Guo, Jianyu Wu, Yifu Ding, Aishan Liu, Haotong Qin, Pengliang Ji, Xianglong Liu

Large Language Models for Code (or code LLMs) are increasingly gaining popularity and capabilities, offering a wide array of functionalities such as code completion, code generation, code summarization, test generation, code translation,…

Software Engineering · Computer Science 2024-10-18 Rahul Krishna, Rangeet Pan, Raju Pavuluri, Srikanth Tamilselvam, Maja Vukovic, Saurabh Sinha

Large Language Models (LLMs) are fast becoming indispensable tools for software developers, assisting or even partnering with them in crafting complex programs. The advantages are evident -- LLMs can significantly reduce development time,…

Software Engineering · Computer Science 2025-09-12 Ayelet Berzack, Guy Katz

Computing educators and researchers have used programming process data to understand how programs are constructed and what sorts of problems students struggle with. Although such data shows promise as a basis for feedback, fully automated…

Computers and Society · Computer Science 2024-11-04 John Edwards, Arto Hellas, Juho Leinonen

Multi-agent Large Language Model (LLM) systems have been leading the way in applied LLM research across a number of fields. One notable area is software development, where researchers have advanced the automation of code implementation,…

Software Engineering · Computer Science 2025-11-25 Vali Tawosi, Keshav Ramani, Salwa Alamir, Xiaomo Liu

Recent advances in code generation have illuminated the potential of employing large language models (LLMs) for general-purpose programming languages such as Python and C++, opening new opportunities for automating software development and…

Machine Learning · Computer Science 2025-03-06 Jiahao Gai, Hao Mark Chen, Zhican Wang, Hongyu Zhou, Wanru Zhao, Nicholas Lane, Hongxiang Fan