Related papers: InData: Towards Secure Multi-Step, Tool-Based Data…

200 papers

Large Language Models (LLMs) hold promise in automating data analysis tasks, yet open-source models face significant limitations in these kinds of reasoning-intensive scenarios. In this work, we investigate strategies to enhance the data…

Computation and Language · Computer Science 2025-11-14 Yuqi Zhu, Yi Zhong, Jintian Zhang, Ziheng Zhang, Shuofei Qiao, Yujie Luo, Lun Du, Da Zheng, Ningyu Zhang, Huajun Chen

The integration of large language models (LLMs) into autonomous agents has enabled complex tool use, yet in high-stakes domains, these systems must strictly adhere to regulatory standards beyond simple functional correctness. However,…

Computation and Language · Computer Science 2026-01-14 Da Song, Yuheng Huang, Boqi Chen, Tianshuo Cong, Randy Goebel, Lei Ma, Foutse Khomh

Large Language Model (LLM)-generated data is increasingly used in software analytics, but it is unclear how this data compares to human-written data, particularly when models are exposed to adversarial scenarios. Adversarial attacks can…

Software Engineering · Computer Science 2025-05-07 Md. Abdul Awal, Mrigank Rochan, Chanchal K. Roy

Adapting large language models (LLMs) to specific domains often faces a critical bottleneck: the scarcity of high-quality, human-curated data. While large volumes of unchecked data are readily available, indiscriminately using them for…

Computation and Language · Computer Science 2025-09-09 Jian Wu, Hang Yu, Bingchang Liu, Wenjie Yang, Peng Di, Jianguo Li, Yue Zhang

Thinking Large Language Models (LLMs) generate explicit intermediate reasoning traces before final answers, potentially improving transparency, interpretability, and solution accuracy for code generation. However, the quality of these…

Artificial Intelligence · Computer Science 2025-11-11 Haoran Xue, Gias Uddin, Song Wang

Multi-step processes via large language models (LLMs) have proven effective for solving complex reasoning tasks. However, the depth of exploration of the reasoning procedure can significantly affect the task performance. Existing methods to…

Artificial Intelligence · Computer Science 2025-06-19 Jinghan Zhang, Xiting Wang, Fengran Mo, Yeyang Zhou, Wanfu Gao, Kunpeng Liu

Despite the great advance of Multimodal Large Language Models (MLLMs) in both instruction dataset building and benchmarking, the independence of training and evaluation makes current MLLMs hard to further improve their capability under the…

Machine Learning · Computer Science 2023-09-12 Zhiyuan Zhao, Linke Ouyang, Bin Wang, Siyuan Huang, Pan Zhang, Xiaoyi Dong, Jiaqi Wang, Conghui He

The last two years have seen a rapid growth in concerns around the safety of large language models (LLMs). Researchers and practitioners have met these concerns by creating an abundance of datasets for evaluating and improving LLM safety.…

Computation and Language · Computer Science 2025-01-13 Paul Röttger, Fabio Pernisi, Bertie Vidgen, Dirk Hovy

As Large Language Models (LLMs) continue to exhibit remarkable performance in natural language understanding tasks, there is a crucial need to measure their ability for human-like multi-step logical reasoning. Existing logical reasoning…

Computation and Language · Computer Science 2024-10-08 Nisarg Patel, Mohith Kulkarni, Mihir Parmar, Aashna Budhiraja, Mutsumi Nakamura, Neeraj Varshney, Chitta Baral

Realistic, large-scale, and well-labeled cybersecurity datasets are essential for training and evaluating Intrusion Detection Systems (IDS). However, they remain difficult to obtain due to privacy constraints, data sensitivity, and the cost…

Cryptography and Security · Computer Science 2026-01-09 Konstantinos E. Kampourakis, Vyron Kampourakis, Efstratios Chatzoglou, Georgios Kambourakis, Stefanos Gritzalis

Data is a crucial element in large language model (LLM) alignment. Recent studies have explored using LLMs for efficient data collection. However, LLM-generated data often suffers from quality issues, with underrepresented or absent aspects…

Computation and Language · Computer Science 2024-10-08 Fei Wang, Ninareh Mehrabi, Palash Goyal, Rahul Gupta, Kai-Wei Chang, Aram Galstyan

Advancement in Large Language Models (LLMs) reasoning capabilities enables them to solve scientific problems with enhanced efficacy. Thereby, a high-quality benchmark for comprehensive and appropriate assessment holds significance, while…

Command injection vulnerabilities are a significant security threat in dynamic languages like Python, particularly in widely used open-source projects where security issues can have extensive impact. With the proven effectiveness of Large…

Software Engineering · Computer Science 2025-05-22 Yuxuan Wang, Jingshu Chen, Qingyang Wang

Training large language models (LLMs) with synthetic reasoning data has become a popular approach to enhancing their reasoning capabilities, while a key factor influencing the effectiveness of this paradigm is the quality of the generated…

Artificial Intelligence · Computer Science 2026-03-24 Zhuojie Yang, Wentao Wan, Keze Wang

System Instructions in Large Language Models (LLMs) are commonly used to enforce safety policies, define agent behavior, and protect sensitive operational context in agentic AI applications. These instructions may contain sensitive…

Cryptography and Security · Computer Science 2026-04-02 Anubhab Sahu, Diptisha Samanta, Reza Soosahabi

Advanced applied mathematics problems are underrepresented in existing Large Language Model (LLM) benchmark datasets. To address this, we introduce HARDMath, a dataset inspired by a graduate course on asymptotic methods, featuring…

In this paper, we present a challenging code reasoning task: vulnerability detection. Large Language Models (LLMs) have shown promising results in natural-language and math reasoning, but state-of-the-art (SOTA) models reported only 54.5%…

Software Engineering · Computer Science 2025-01-09 Benjamin Steenhoek, Md Mahbubur Rahman, Monoshi Kumar Roy, Mirza Sanjida Alam, Hengbo Tong, Swarna Das, Earl T. Barr, Wei Le

Analyzing Open Source Intelligence (OSINT) from large volumes of data is critical for drafting and publishing comprehensive CTI reports. This process usually follows a three-stage workflow: triage, deep search and TI drafting. While Large…

Cryptography and Security · Computer Science 2026-03-11 Xiangsen Chen, Xuan Feng, Shuo Chen, Matthieu Maitre, Sudipto Rakshit, Diana Duvieilh, Ashley Picone, Nan Tang

This paper investigates the automation of qualitative data analysis, focusing on inductive coding using large language models (LLMs). Unlike traditional approaches that rely on deductive methods with predefined labels, this research…

Computation and Language · Computer Science 2025-12-02 Angelina Parfenova, Andreas Marfurt, Alexander Denzler, Juergen Pfeffer

Large language models (LLMs) have shown impressive promise in code generation, yet their progress remains limited by the shortage of large-scale datasets that are both diverse and well-aligned with human reasoning. Most existing resources…

Machine Learning · Computer Science 2025-10-28 Amal Abed, Ivan Lukic, Jörg K. H. Franke, Frank Hutter