Related papers: ManyTypes4Py: A Benchmark Python Dataset for Machi…

200 papers

Dynamic languages, such as Python and JavaScript, trade static typing for developer flexibility and productivity. Lack of static typing can cause run-time exceptions and is a major factor for weak IDE support. To alleviate these issues, PEP…

Machine Learning · Computer Science 2022-01-20 Amir M. Mir, Evaldas Latoskinas, Sebastian Proksch, Georgios Gousios

In light of the growing interest in type inference research for Python, both researchers and practitioners require a standardized process to assess the performance of various type inference techniques. This paper introduces TypeEvalPy, a…

Software Engineering · Computer Science 2024-01-03 Ashwin Prasad Shivarpatna Venkatesh, Samkutty Sabu, Jiawei Wang, Amir M. Mir, Li Li, Eric Bodden

Type annotations in Python enhance maintainability and error detection. However, generating these annotations manually is error prone and requires extra effort. Traditional automation approaches like static analysis, machine learning, and…

Programming Languages · Computer Science 2025-08-04 Varun Bharti, Shashwat Jha, Dhruv Kumar, Pankaj Jalote

Optional type annotations allow for enriching dynamic programming languages with static typing features like better Integrated Development Environment (IDE) support, more precise program analysis, and early detection and prevention of…

Software Engineering · Computer Science 2023-07-31 Bernd Gruner, Tim Sonnekalb, Thomas S. Heinze, Clemens-Alexander Brust

Recently, dynamically typed languages, such as Python, have gained unprecedented popularity. Although these languages alleviate the need for mandatory type annotations, types still play a critical role in program understanding and…

Programming Languages · Computer Science 2022-02-08 Ibrahim Abdelaziz, Julian Dolby, Kavitha Srinivas

Machine learning (ML) has gained much attention and been incorporated into our daily lives. While there are numerous publicly available ML projects on open source platforms such as GitHub, there have been limited attempts in filtering those…

Software Engineering · Computer Science 2023-03-14 Ratnadira Widyasari, Zhou Yang, Ferdian Thung, Sheng Qin Sim, Fiona Wee, Camellia Lok, Jack Phan, Haodi Qi, Constance Tan, Qijin Tay, David Lo

Automated regression test generation has been extensively explored, yet generating high-quality tests for Python programs remains particularly challenging. Because of Python's dynamic typing features, existing approaches, ranging from…

Software Engineering · Computer Science 2025-10-23 Runlin Liu, Zhe Zhang, Yunge Hu, Yuhang Lin, Xiang Gao, Hailong Sun

Python type inference is challenging in practice. Due to its dynamic properties and extensive dependencies on third-party libraries without type annotations, the performance of traditional static analysis techniques is limited. Although…

Software Engineering · Computer Science 2021-06-29 Siwei Cui, Gang Zhao, Zeyu Dai, Luochao Wang, Ruihong Huang, Jeff Huang

Python's dynamic type system, while offering significant flexibility and expressiveness, poses substantial challenges for static analysis and automated tooling, particularly in unannotated or partially annotated codebases. Existing type…

Software Engineering · Computer Science 2026-04-08 Ali Aman, Muhammad Asaduzzaman, Shaowei Wang

Python is one of the fastest-growing programming languages and currently ranks as the top language in many lists, even recently overtaking JavaScript as the top language on GitHub. Given its importance in data science and machine learning,…

Software Engineering · Computer Science 2025-02-10 Idriss Abdelmadjid, Robert Dyer

A large amount of data is produced every second by modern information systems such as mobile devices, the World Wide Web, the Internet of Things, and social media. Analysis and mining of this massive data require a lot of advanced tools and…

Machine Learning · Computer Science 2020-01-13 Rising Odegua, Festus Ikpotokin

Large Language Models (LLMs) are increasingly being explored for their potential in software engineering, particularly in static analysis tasks. In this study, we investigate the potential of current LLMs to enhance call-graph analysis and…

Software Engineering · Computer Science 2025-07-17 Ashwin Prasad Shivarpatna Venkatesh, Rose Sunil, Samkutty Sabu, Amir M. Mir, Sofia Reis, Eric Bodden

We present our vision for developing an automated tool capable of translating visual properties observed in Machine Learning (ML) visualisations into Python assertions. The tool aims to streamline the process of manually verifying these…

Software Engineering · Computer Science 2024-01-17 Arumoy Shome, Luis Cruz, Arie van Deursen

In software engineering, different approaches and machine learning models leverage different types of data: source code, textual information, historical data. An important part of any project is its dependencies. The list of dependencies is…

Software Engineering · Computer Science 2022-09-09 Yaroslav Golubev, Egor Bogomolov, Egor Bulychev, Timofey Bryksin

Program code as a data source is gaining popularity in the data science community. Possible applications for models trained on such assets range from classification for data dimensionality reduction to automatic code generation. However,…

Software Engineering · Computer Science 2022-10-31 Anastasia Drozdova, Polina Guseva, Ekaterina Trofimova, Anna Scherbakova, Andrey Ustyuzhanin

Defect prediction has been a popular research topic where machine learning (ML) and deep learning (DL) have found numerous applications. However, these ML/DL-based defect prediction models are often limited by the quality and size of their…

Software Engineering · Computer Science 2023-07-26 Parvez Mahbub, Ohiduzzaman Shuvo, Mohammad Masudur Rahman

In this paper, we present resolvent4py, a parallel Python package for the analysis, model reduction and control of large-scale linear systems with millions or billions of degrees of freedom. This package provides the user with a friendly…

Existing class-level code generation datasets are either synthetic (ClassEval: 100 classes) or insufficient in scale for modern training needs (RealClassEval: 400 classes), hindering robust evaluation and empirical analysis. We present…

Software Engineering · Computer Science 2026-05-01 Musfiqur Rahman, SayedHassan Khatoonabadi, Emad Shihab

Imbalanced-learn is an open-source Python toolbox that aims to provide a wide range of methods for coping with the problem of imbalanced datasets, frequently encountered in machine learning and pattern recognition. The implemented…

Machine Learning · Computer Science 2016-09-22 Guillaume Lemaitre, Fernando Nogueira, Christos K. Aridas

We introduce a novel dataset tailored for code generation, aimed at aiding developers in common tasks. Our dataset provides examples that include a clarified intent, associated code snippets, and an average of three related unit tests. It…

Computation and Language · Computer Science 2024-09-26 Nathanaël Beau, Benoît Crabbé