Related papers: Combining Program Analysis and Statistical Languag…
Coding is an integral aspect of programming. A programmer can automatically complete a code fragment after writing a few tokens, and the process of automatic completion is known as code completion. Several research studies on code…
Automatic code generation is to generate the program code according to the given natural language description. The current mainstream approach uses neural networks to encode natural language descriptions, and output abstract syntax trees…
Formal verification provides a rigorous and systematic approach to ensure the correctness and reliability of software systems. Yet, constructing specifications for the full proof relies on domain expertise and non-trivial manpower. In view…
Code completion is widely used by software developers to provide coding suggestions given a partially written code snippet. Apart from the traditional code completion methods, which only support single token completion at minimal positions,…
Code completion is a popular software development tool integrated into all major IDEs. Many neural language models have achieved promising results in completion suggestion prediction on synthetic benchmarks. However, a recent study When…
Transformer-based pre-trained models have recently achieved great results in solving many software engineering tasks including automatic code completion which is a staple in a developer's toolkit. While many have striven to improve the…
Software testing is an essential part of the software lifecycle and requires a substantial amount of time and effort. It has been estimated that software developers spend close to 50% of their time on testing the code they write. For these…
Probabilistic programming is perfectly suited to reliable and transparent data science, as it allows the user to specify their models in a high-level language without worrying about the complexities of how to fit the models. Static analysis…
Autoformalization aims to convert informal mathematical proofs into machine-verifiable formats, bridging the gap between natural and formal languages. However, ensuring semantic alignment between the informal and formalized statements…
Background: During software maintenance and development, the comprehension of program code is key to success. High-quality comments can help us better understand programs, but they're often missing or outmoded in today's programs. Automatic…
Code completion, which aims to predict the following code token(s) according to the code context, can improve the productivity of software development. Recent work has proved that statistical language modeling with transformers can greatly…
Evaluating the correctness of code generated by AI is a challenging open problem. In this paper, we propose a fully automated method, named ACCA, to evaluate the correctness of AI-generated code for security purposes. The method uses…
Code completion is an essential feature of IDEs, yet current autocompleters are restricted to either grammar-based or NLP-based single token completions. Both approaches have significant drawbacks: grammar-based autocompletion is restricted…
Winograd Schema Challenge (WSC) was proposed as an AI-hard problem in testing computers' intelligence on common sense representation and reasoning. This paper presents the new state-of-theart on WSC, achieving an accuracy of 71.1%. We…
As an integral part of source code files, code comments help improve program readability and comprehension. However, developers sometimes do not comment on their program code adequately due to the incurred extra efforts, lack of relevant…
With the development of code generation techniques, selecting the correct code solution from multiple candidate solutions has become a crucial task. This study proposes AutoTest, a novel technique that combines automated test case…
Writing competitive programming problems is exacting. Authors must: set constraints, input distributions, and edge cases that rule out shortcuts; target specific algorithms (e.g., max-flow, dynamic programming, data structures); and…
Pre-trained language models have demonstrated impressive performance in both natural language processing and program understanding, which represent the input as a token sequence without explicitly modeling its structure. Some prior works…
Recent advancements in the field of large language models have made it possible to use language models for advanced reasoning. In this paper we leverage this ability for designing complex project plans based only on knowing the current…
Some approaches to increasing program reliability involve a disciplined use of programming languages so as to minimise the hazards introduced by error-prone features. This is realised by writing code that is constrained to a subset of the a…