Related papers: Enhancing Deployment-Time Predictive Model Robustn…

On the Adversarial Robustness of Instruction-Tuned Large Language Models for Code

The advent of instruction-tuned Large Language Models designed for coding tasks (Code LLMs) has transformed software engineering practices. However, their robustness against various input challenges remains a critical concern. This study…

Software Engineering · Computer Science 2024-12-02 Md Imran Hossen, Xiali Hei

DePro: Understanding the Role of LLMs in Debugging Competitive Programming Code

Debugging consumes a substantial portion of the software development lifecycle, yet the effectiveness of Large Language Models (LLMs) in this task is not well understood. Competitive programming offers a rich benchmark for such evaluation,…

Software Engineering · Computer Science 2026-03-23 Nabiha Parvez, Tanvin Sarkar Pallab, Mia Mohammad Imran, Tarannum Shaila Zaman

PLUM: Improving Code LMs with Execution-Guided On-Policy Preference Learning Driven By Synthetic Test Cases

Preference learning provides a promising solution to address the limitations of supervised fine-tuning (SFT) for code language models, where the model is not explicitly trained to differentiate between correct and incorrect code. Recent…

Computation and Language · Computer Science 2024-10-15 Dylan Zhang, Shizhe Diao, Xueyan Zou, Hao Peng

PRISM: Pushing the Frontier of Deep Think via Process Reward Model-Guided Inference

DEEPTHINK methods improve reasoning by generating, refining, and aggregating populations of candidate solutions, which enables strong performance on complex mathematical and scientific tasks. However, existing frameworks often lack reliable…

Artificial Intelligence · Computer Science 2026-03-04 Rituraj Sharma, Weiyuan Chen, Noah Provenzano, Tu Vu

Large Language Model Unlearning for Source Code

While Large Language Models (LLMs) excel at code generation, their inherent tendency toward verbatim memorization of training data introduces critical risks like copyright infringement, insecure emission, and deprecated API utilization,…

Software Engineering · Computer Science 2025-11-25 Xue Jiang, Yihong Dong, Huangzhao Zhang, Tangxinyu Wang, Zheng Fang, Yingwei Ma, Rongyu Cao, Binhua Li, Zhi Jin, Wenpin Jiao, Yongbin Li, Ge Li

To Err is Machine: Vulnerability Detection Challenges LLM Reasoning

In this paper, we present a challenging code reasoning task: vulnerability detection. Large Language Models (LLMs) have shown promising results in natural-language and math reasoning, but state-of-the-art (SOTA) models reported only 54.5%…

Software Engineering · Computer Science 2025-01-09 Benjamin Steenhoek, Md Mahbubur Rahman, Monoshi Kumar Roy, Mirza Sanjida Alam, Hengbo Tong, Swarna Das, Earl T. Barr, Wei Le

DeepPERF: A Deep Learning-Based Approach For Improving Software Performance

Improving software performance is an important yet challenging part of the software development cycle. Today, the majority of performance inefficiencies are identified and patched by performance experts. Recent advancements in deep learning…

Software Engineering · Computer Science 2022-06-29 Spandan Garg, Roshanak Zilouchian Moghaddam, Colin B. Clement, Neel Sundaresan, Chen Wu

LLM-ProS: Analyzing Large Language Models' Performance in Competitive Problem Solving

The rapid advancement of large language models has opened new avenues for automating complex problem-solving tasks such as algorithmic coding and competitive programming. This paper introduces a novel evaluation technique, LLM-ProS, to…

Computation and Language · Computer Science 2026-03-03 Md Sifat Hossain, Anika Tabassum, Md. Fahim Arefin, Tarannum Shaila Zaman

PROMPT: A Fast and Extensible Memory Profiling Framework

Memory profiling captures programs' dynamic memory behavior, assisting programmers in debugging, tuning, and enabling advanced compiler optimizations like speculation-based automatic parallelization. As each use case demands its unique…

Performance · Computer Science 2023-11-07 Ziyang Xu, Yebin Chon, Yian Su, Zujun Tan, Sotiris Apostolakis, Simone Campanoni, David I. August

Probing the Unknown: Exploring Student Interactions with Probeable Problems at Scale in Introductory Programming

Introductory programming courses often rely on small code-writing exercises that have clearly specified problem statements. This limits opportunities for students to practice how to clarify ambiguous requirements -- a critical skill in…

Human-Computer Interaction · Computer Science 2025-04-17 Paul Denny, Viraj Kumar, Stephen MacNeil, James Prather, Juho Leinonen

PRISM: Prompt Reliability via Iterative Simulation and Monitoring for Enterprise Conversational AI

Deploying large language model (LLM)-driven conversational agents in enterprise settings requires prompts that are simultaneously correct at launch and resilient to the non-deterministic behavioral drift that characterizes production LLM…

Artificial Intelligence · Computer Science 2026-05-18 Keshava Chaitanya, Jahnavi Gundakaram

PRoA: A Probabilistic Robustness Assessment against Functional Perturbations

In safety-critical deep learning applications, robustness measurement is a vital pre-deployment phase. However, existing robustness verification methods are not sufficiently practical for deploying machine learning systems in the real world.…

Machine Learning · Computer Science 2022-07-06 Tianle Zhang, Wenjie Ruan, Jonathan E. Fieldsend

How to Select Pre-Trained Code Models for Reuse? A Learning Perspective

Pre-training a language model and then fine-tuning it has been shown to be an efficient and effective technique for a wide range of code intelligence tasks, such as code generation, code summarization, and vulnerability detection. However,…

Software Engineering · Computer Science 2025-01-08 Zhangqian Bi, Yao Wan, Zhaoyang Chu, Yufei Hu, Junyi Zhang, Hongyu Zhang, Guandong Xu, Hai Jin

Not All Votes Count! Programs as Verifiers Improve Self-Consistency of Language Models for Math Reasoning

Large language models (LLMs) have shown increasing competence in solving mathematical reasoning problems. However, many open-source LLMs still struggle with errors in calculation and semantic understanding during intermediate reasoning…

Computation and Language · Computer Science 2024-12-18 Vernon Y. H. Toh, Deepanway Ghosal, Soujanya Poria

PromSec: Prompt Optimization for Secure Generation of Functional Source Code with Large Language Models (LLMs)

The capability of generating high-quality source code using large language models (LLMs) reduces software development time and costs. However, they often introduce security vulnerabilities due to training on insecure open-source data. This…

Software Engineering · Computer Science 2024-09-20 Mahmoud Nazzal, Issa Khalil, Abdallah Khreishah, NhatHai Phan

PRISM: Parametrically Refactoring Inference for Speculative Sampling Draft Models

Large Language Models (LLMs), constrained by their auto-regressive nature, suffer from slow decoding. Speculative decoding methods have emerged as a promising solution to accelerate LLM decoding, attracting attention from both systems and…

Artificial Intelligence · Computer Science 2026-02-03 Xuliang Wang, Yuetao Chen, Maochan Zhen, Fang Liu, Xinzhou Zheng, Xingwu Liu, Hong Xu, Ming Li

Process Supervision-Guided Policy Optimization for Code Generation

Reinforcement learning (RL) with unit test feedback has enhanced large language models' (LLMs) code generation, but relies on sparse rewards provided only after complete code evaluation, limiting learning efficiency and incremental…

Artificial Intelligence · Computer Science 2025-02-05 Ning Dai, Zheng Wu, Renjie Zheng, Ziyun Wei, Wenlei Shi, Xing Jin, Guanlin Liu, Chen Dun, Liang Huang, Lin Yan

Challenging Machine Learning Algorithms in Predicting Vulnerable JavaScript Functions

The rapid rise of cyber-crime activities and the growing number of devices threatened by them place software security issues in the spotlight. As around 90% of all attacks exploit known types of security issues, finding vulnerable…

Cryptography and Security · Computer Science 2024-05-14 Rudolf Ferenc, Péter Hegedűs, Péter Gyimesi, Gábor Antal, Dénes Bán, Tibor Gyimóthy

RapidProM: Mine Your Processes and Not Just Your Data

The number of events recorded for operational processes is growing every year. This applies to all domains: from health care and e-government to production and maintenance. Event data are a valuable source of information for organizations…

Other Computer Science · Computer Science 2017-03-13 Wil M. P. van der Aalst, Alfredo Bolt, Sebastiaan J. van Zelst

Deep Model-Based Reinforcement Learning via Estimated Uncertainty and Conservative Policy Optimization

Model-based reinforcement learning algorithms tend to achieve higher sample efficiency than model-free methods. However, due to the inevitable errors of learned models, model-based methods struggle to achieve the same asymptotic performance…

Machine Learning · Computer Science 2019-12-02 Qi Zhou, Houqiang Li, Jie Wang