Related papers: Algorithmic Programming Language Identification

Machine Learning Based Source Code Classification Using Syntax Oriented Features

As of today the programming language of the vast majority of the published source code is manually specified or programmatically assigned based on the sole file extension. In this paper we show that the source code programming language…

Machine Learning · Computer Science 2017-03-23 Shaul Zevin, Catherine Holzem

Natural Language-Guided Programming

In today's software world with its cornucopia of reusable software libraries, when a programmer is faced with a programming task that they suspect can be completed through the use of a library, they often look for code examples using a…

Software Engineering · Computer Science 2021-10-08 Geert Heyman, Rafael Huysegems, Pascal Justen, Tom Van Cutsem

An Empirical Study on Automatically Detecting AI-Generated Source Code: How Far Are We?

Artificial Intelligence (AI) techniques, especially Large Language Models (LLMs), have started gaining popularity among researchers and software developers for generating source code. However, LLMs have been shown to generate code with…

Software Engineering · Computer Science 2024-11-08 Hyunjae Suh, Mahan Tafreshipour, Jiawei Li, Adithya Bhattiprolu, Iftekhar Ahmed

Beryllium: Neural Search for Algorithm Implementations

In this paper, we explore the feasibility of finding algorithm implementations from code. Successfully matching code and algorithms can help understand unknown code, provide reference implementations, and automatically collect data for…

Software Engineering · Computer Science 2023-07-04 Adithya Kulkarni, Mohna Chakraborty, Yonas Sium, Sai Charishma Valluri, Wei Le, Qi Li

Is This You, LLM? Recognizing AI-written Programs with Multilingual Code Stylometry

With the increasing popularity of LLM-based code completers, like GitHub Copilot, the interest in automatically detecting AI-generated code is also increasing, in particular in contexts where the use of LLMs to program is forbidden by policy…

Software Engineering · Computer Science 2024-12-20 Andrea Gurioli, Maurizio Gabbrielli, Stefano Zacchiroli

Automatic Labeling of the Object-oriented Source Code: The Lotus Approach

Most of open-source software systems become available on the internet today. Thus, we need automatic methods to label software code. Software code can be labeled with a set of keywords. These keywords in this paper referred as software…

Software Engineering · Computer Science 2018-03-02 Ra'Fat Al-Msie'deen

Exploring Large Language Models for Analyzing and Improving Method Names in Scientific Code

Research scientists increasingly rely on implementing software to support their research. While previous research has examined the impact of identifier names on program comprehension in traditional programming environments, limited work has…

Software Engineering · Computer Science 2025-07-23 Gunnar Larsen, Carol Wong, Anthony Peruma

A Systematic Approach to Programming

We show how to systematically implement an algorithm in any imperative or functional programming language. The method is based on the premise that it is easy to write down how an algorithm proceeds on a concrete input. This…

Software Engineering · Computer Science 2020-04-28 Maurice Chandoo

A Survey of Machine Learning for Big Code and Naturalness

Research at the intersection of machine learning, programming languages, and software engineering has recently taken important steps in proposing learnable probabilistic models of source code that exploit code's abundance of patterns. In…

Software Engineering · Computer Science 2018-05-08 Miltiadis Allamanis, Earl T. Barr, Premkumar Devanbu, Charles Sutton

An Exploratory Study on the Predominant Programming Paradigms in Python Code

Python is a multi-paradigm programming language that fully supports object-oriented (OO) programming. The language allows writing code in a non-procedural imperative manner, using procedures, using classes, or in a functional style. To…

Software Engineering · Computer Science 2022-09-07 Robert Dyer, Jigyasa Chauhan

Identifying Algorithm Names in Code Comments

For recent machine-learning-based tasks like API sequence generation, comment generation, and document generation, large amount of data is needed. When software developers implement algorithms in code, we find that they often mention…

Software Engineering · Computer Science 2019-07-11 Jakapong Klainongsuang, Yusuf Sulistyo Nugroho, Hideaki Hata, Bundit Manaskasemsak, Arnon Rungsawang, Pattara Leelaprute, Kenichi Matsumoto

Automatic Programming: Large Language Models and Beyond

Automatic programming has seen increasing popularity due to the emergence of tools like GitHub Copilot which rely on Large Language Models (LLMs). At the same time, automatically generated code faces challenges during deployment due to…

Software Engineering · Computer Science 2024-05-16 Michael R. Lyu, Baishakhi Ray, Abhik Roychoudhury, Shin Hwei Tan, Patanamon Thongtanunam

Anyone Can Code: Algorithmic Thinking

As the second book in the Anyone Can Code series, Algorithmic Thinking focuses on the logic behind computer programming and software design. With a data-centred approach, it starts with simple algorithms that work on simple data items and…

Programming Languages · Computer Science 2023-11-27 Ali Arya

Authorship Attribution of Source Code: A Language-Agnostic Approach and Applicability in Software Engineering

Authorship attribution (i.e., determining who is the author of a piece of source code) is an established research topic. State-of-the-art results for the authorship attribution problem look promising for the software engineering field,…

Software Engineering · Computer Science 2021-06-22 Egor Bogomolov, Vladimir Kovalenko, Yurii Rebryk, Alberto Bacchelli, Timofey Bryksin

SCC: Automatic Classification of Code Snippets

Determining the programming language of a source code file has been considered in the research community; it has been shown that Machine Learning (ML) and Natural Language Processing (NLP) algorithms can be effective in identifying the…

Software Engineering · Computer Science 2018-09-24 Kamel Alreshedy, Dhanush Dharmaretnam, Daniel M. German, Venkatesh Srinivasan, T. Aaron Gulliver

CoDet-M4: Detecting Machine-Generated Code in Multi-Lingual, Multi-Generator and Multi-Domain Settings

Large language models (LLMs) have revolutionized code generation, automating programming with remarkable efficiency. However, these advancements challenge programming skills, ethics, and assessment integrity, making the detection of…

Computation and Language · Computer Science 2025-07-18 Daniil Orel, Dilshod Azizov, Preslav Nakov

Detection of a Source Code Plagiarism in a Student Programming Competition

The article presents a system for testing the independence of solutions to algorithmic problems sent by students as part of the student programming competition. First, the context was discussed, as well as the need to organize programming…

Software Engineering · Computer Science 2019-12-18 Zenon Gniazdowski, Maciej Boniecki

A Survey on Natural Language Processing for Programming

Natural language processing for programming aims to use NLP techniques to assist programming. It is increasingly prevalent for its effectiveness in improving productivity. Distinct from natural language, a programming language is highly…

Computation and Language · Computer Science 2023-08-08 Qingfu Zhu, Xianzhen Luo, Fang Liu, Cuiyun Gao, Wanxiang Che

Between Lines of Code: Unraveling the Distinct Patterns of Machine and Human Programmers

Large language models have catalyzed an unprecedented wave in code generation. While achieving significant advances, they blur the distinctions between machine- and human-authored source code, causing integrity and authenticity issues of…

Software Engineering · Computer Science 2024-07-31 Yuling Shi, Hongyu Zhang, Chengcheng Wan, Xiaodong Gu

Automating the Analysis of Parsing Algorithms (and other Dynamic Programs)

Much algorithmic research in NLP aims to efficiently manipulate rich formal structures. An algorithm designer typically seeks to provide guarantees about their proposed algorithm -- for example, that its running time or space complexity is…

Programming Languages · Computer Science 2025-12-30 Tim Vieira, Ryan Cotterell, Jason Eisner