Related papers: DyPyBench: A Benchmark of Executable Python Softwa…

Scalpel: The Python Static Analysis Framework

Despite being the most popular programming language, Python has not yet received enough attention from the community. To the best of our knowledge, there is no general static analysis framework proposed to facilitate the implementation of…

Software Engineering · Computer Science 2022-02-25 Li Li, Jiawei Wang, Haowei Quan

SymPyBench: A Dynamic Benchmark for Scientific Reasoning with Executable Python Code

We introduce SymPyBench, a large-scale synthetic benchmark of 15,045 university-level physics problems (90/10% train/test split). Each problem is fully parameterized, supporting an effectively infinite range of input configurations, and is accompanied…

Artificial Intelligence · Computer Science 2025-12-08 Shima Imani, Seungwhan Moon, Adel Ahmadyan, Lu Zhang, Kirmani Ahmed, Babak Damavandi

pyMethods2Test: A Dataset of Python Tests Mapped to Focal Methods

Python is one of the fastest-growing programming languages and currently ranks as the top language in many lists, even recently overtaking JavaScript as the top language on GitHub. Given its importance in data science and machine learning,…

Software Engineering · Computer Science 2025-02-10 Idriss Abdelmadjid, Robert Dyer

Landscape of High-performance Python to Develop Data Science and Machine Learning Applications

Python has become the prime language for application development in the Data Science and Machine Learning domains. However, data scientists are not necessarily experienced programmers. While Python lets them quickly implement their…

Distributed, Parallel, and Cluster Computing · Computer Science 2023-08-24 Oscar Castro, Pierrick Bruneau, Jean-Sébastien Sottet, Dario Torregrossa

PyCG: Practical Call Graph Generation in Python

Call graphs play an important role in different contexts, such as profiling and vulnerability propagation analysis. Generating call graphs in an efficient manner can be a challenging task when it comes to high-level languages that are…

Programming Languages · Computer Science 2021-03-02 Vitalis Salis, Thodoris Sotiropoulos, Panos Louridas, Diomidis Spinellis, Dimitris Mitropoulos

An Empirical Study of Vulnerabilities in Python Packages and Their Detection

In the rapidly evolving software development landscape, Python stands out for its simplicity, versatility, and extensive ecosystem. Python packages, as units of organization, reusability, and distribution, have become a pressing concern,…

Software Engineering · Computer Science 2025-09-05 Haowei Quan, Junjie Wang, Xinzhe Li, Terry Yue Zhuo, Xiao Chen, Xiaoning Du

DySec: A Machine Learning-based Dynamic Analysis for Detecting Malicious Packages in PyPI Ecosystem

Malicious Python packages make software supply chains vulnerable by exploiting trust in open-source repositories like Python Package Index (PyPI). Lack of real-time behavioral monitoring makes metadata inspection and static code analysis…

Cryptography and Security · Computer Science 2025-03-04 Sk Tanzir Mehedi, Chadni Islam, Gowri Ramachandran, Raja Jurdak

A Python Benchmark Functions Framework for Numerical Optimisation Problems

This work proposes a framework of benchmark functions designed to facilitate the creation of test cases for numerical optimisation techniques. The framework, written in Python 3, is designed to be easy to install, use, and expand. The…

Numerical Analysis · Mathematics 2024-06-25 Luca Baronti, Marco Castellani

Serenity: Library Based Python Code Analysis for Code Completion and Automated Machine Learning

Dynamically typed languages such as Python have become very popular. Among other strengths, Python's dynamic nature and its straightforward linking to native code have made it the de-facto language for many research areas such as Artificial…

Programming Languages · Computer Science 2023-01-13 Wenting Zhao, Ibrahim Abdelaziz, Julian Dolby, Kavitha Srinivas, Mossad Helali, Essam Mansour

An Analysis of Python's Topics, Trends, and Technologies Through Mining Stack Overflow Discussions

Python is a popular, widely used, and general-purpose programming language. In spite of its ever-growing community, researchers have not performed much analysis on Python's topics, trends, and technologies which provides insights for…

Software Engineering · Computer Science 2020-04-15 Hamed Tahmooresi, Abbas Heydarnoori, Alireza Aghamohammadi

Dynabench: Rethinking Benchmarking in NLP

We introduce Dynabench, an open-source platform for dynamic dataset creation and model benchmarking. Dynabench runs in a web browser and supports human-and-model-in-the-loop dataset creation: annotators seek to create examples that a target…

Computation and Language · Computer Science 2021-04-30 Douwe Kiela, Max Bartolo, Yixin Nie, Divyansh Kaushik, Atticus Geiger, Zhengxuan Wu, Bertie Vidgen, Grusha Prasad, Amanpreet Singh, Pratik Ringshia, Zhiyi Ma, Tristan Thrush, Sebastian Riedel, Zeerak Waseem, Pontus Stenetorp, Robin Jia, Mohit Bansal, Christopher Potts, Adina Williams

EffiBench-X: A Multi-Language Benchmark for Measuring Efficiency of LLM-Generated Code

Existing code generation benchmarks primarily evaluate functional correctness, with limited focus on code efficiency and often restricted to a single language like Python. To address this gap, we introduce EffiBench-X, the first…

Computation and Language · Computer Science 2025-05-20 Yuhao Qing, Boyu Zhu, Mingzhe Du, Zhijiang Guo, Terry Yue Zhuo, Qianru Zhang, Jie M. Zhang, Heming Cui, Siu-Ming Yiu, Dong Huang, See-Kiong Ng, Luu Anh Tuan

AIDABench: AI Data Analytics Benchmark

As AI-driven document understanding and processing tools become increasingly prevalent in real-world applications, the need for rigorous evaluation standards has grown increasingly urgent. Existing benchmarks and evaluations often focus on…

Artificial Intelligence · Computer Science 2026-03-30 Yibo Yang, Fei Lei, Yixuan Sun, Yantao Zeng, Chengguang Lv, Jiancao Hong, Jiaojiao Tian, Tianyu Qiu, Xin Wang, Yanbing Chen, Yanjie Li, Zheng Pan, Xiaochen Zhou, Guanzhou Chen, Haoran Lv, Yuning Xu, Yue Ou, Haodong Liu, Shiqi He, Anya Jia, Yulei Xin, Huan Wu, Liang Liu, Jiaye Ge, Jianxin Dong, Dahua Lin, Wenxiu Sun

Performance Evaluation of Python Parallel Programming Models: Charm4Py and mpi4py

Python is rapidly becoming the lingua franca of machine learning and scientific computing. With the broad use of frameworks such as NumPy, SciPy, and TensorFlow, scientific computing and machine learning are seeing a productivity boost on…

Distributed, Parallel, and Cluster Computing · Computer Science 2023-09-01 Zane Fink, Simeng Liu, Jaemin Choi, Matthias Diener, Laxmikant V. Kale

Asynchronous Execution of Python Code on Task Based Runtime Systems

Despite advancements in the areas of parallel and distributed computing, the complexity of programming on High Performance Computing (HPC) resources has deterred many domain experts, especially in the areas of machine learning and…

Programming Languages · Computer Science 2019-03-08 R. Tohid, Bibek Wagle, Shahrzad Shirzad, Patrick Diehl, Adrian Serio, Alireza Kheirkhahan, Parsa Amini, Katy Williams, Kate Isaacs, Kevin Huck, Steven Brandt, Hartmut Kaiser

A systematic review of Python packages for time series analysis

This paper presents a systematic review of Python packages with a focus on time series analysis. The objective is to provide (1) an overview of the different time series analysis tasks and preprocessing methods implemented, and (2) an…

Mathematical Software · Computer Science 2021-06-23 Julien Siebert, Janek Groß, Christof Schroth

PyBench: Evaluating LLM Agent on various real-world coding tasks

The LLM Agent, equipped with a code interpreter, is capable of automatically solving real-world coding tasks, such as data analysis and image editing. However, existing benchmarks primarily focus on either simplistic tasks, such as…

Software Engineering · Computer Science 2024-08-06 Yaolun Zhang, Yinxu Pan, Yudong Wang, Jie Cai

DyVal: Dynamic Evaluation of Large Language Models for Reasoning Tasks

Large language models (LLMs) have achieved remarkable performance in various evaluation benchmarks. However, concerns are raised about potential data contamination in their considerable volume of training corpus. Moreover, the static nature…

Artificial Intelligence · Computer Science 2024-03-15 Kaijie Zhu, Jiaao Chen, Jindong Wang, Neil Zhenqiang Gong, Diyi Yang, Xing Xie

dynsight: an Open Python Platform for Simulation and Experimental Trajectory Data Analysis

The study of complex many-body systems via analysis of the trajectories of the units that dynamically move and interact within them is a non-trivial task. The workflow for extracting meaningful information from the raw trajectory data is…

Materials Science · Physics 2025-10-31 Simone Martino, Matteo Becchi, Andrew Tarzia, Daniele Rapetti, Giovanni M. Pavan

pymdp: A Python library for active inference in discrete state spaces

Active inference is an account of cognition and behavior in complex systems which brings together action, perception, and learning under the theoretical mantle of Bayesian inference. Active inference has seen growing applications in…

Artificial Intelligence · Computer Science 2022-05-06 Conor Heins, Beren Millidge, Daphne Demekas, Brennan Klein, Karl Friston, Iain Couzin, Alexander Tschantz