Related papers: Learning to Format Coq Code Using Language Models

Deep Generation of Coq Lemma Names Using Elaborated Terms

Coding conventions for naming, spacing, and other essentially stylistic properties are necessary for developers to effectively understand, review, and modify source code in large software projects. Consistent conventions in verification…

Programming Languages · Computer Science 2020-04-23 Pengyu Nie, Karl Palmskog, Junyi Jessy Li, Milos Gligoric

Taming Differentiable Logics with Coq Formalisation

For performance and verification in machine learning, new methods have recently been proposed that optimise learning systems to satisfy formally expressed logical properties. Among these methods, differentiable logics (DLs) are used to…

Logic in Computer Science · Computer Science 2024-07-08 Reynald Affeldt, Alessandro Bruni, Ekaterina Komendantskaya, Natalia Ślusarz, Kathrin Stark

Enhancing Formal Theorem Proving: A Comprehensive Dataset for Training AI Models on Coq Code

In the realm of formal theorem proving, the Coq proof assistant stands out for its rigorous approach to verifying mathematical assertions and software correctness. Despite the advances in artificial intelligence and machine learning, the…

Artificial Intelligence · Computer Science 2024-04-03 Andreas Florath

Mechanizing Matching Logic In Coq

Matching logic is a formalism for specifying, and reasoning about, mathematical structures, using patterns and pattern matching. Growing in popularity, it has been used to define many logical systems such as separation logic with recursive…

Logic in Computer Science · Computer Science 2022-09-22 Péter Bereczky, Xiaohong Chen, Dániel Horpácsi, Lucas Peña, Jan Tušil

CoqQ: Foundational Verification of Quantum Programs

CoqQ is a framework for reasoning about quantum programs in the Coq proof assistant. Its main components are: a deeply embedded quantum programming language, in which classic quantum algorithms are easily expressed, and an expressive…

Programming Languages · Computer Science 2022-07-26 Li Zhou, Gilles Barthe, Pierre-Yves Strub, Junyi Liu, Mingsheng Ying

Dependent-Type-Preserving Memory Allocation

Dependently typed programming languages such as Coq, Agda, Idris, and F*, allow programmers to write detailed specifications of their programs and prove their programs meet these specifications. However, these specifications can be violated…

Programming Languages · Computer Science 2025-09-12 Paulette Koronkevich, William J. Bowman

KoCo: Conditioning Language Model Pre-training on Knowledge Coordinates

Standard Large Language Model (LLM) pre-training typically treats corpora as flattened token sequences, often overlooking the real-world context that humans naturally rely on to contextualize information. To bridge this gap, we introduce…

Computation and Language · Computer Science 2026-04-15 Yudong Li, Jiawei Cai, Linlin Shen

The Hidden Cost of Readability: How Code Formatting Silently Consumes Your LLM Budget

Source code is usually formatted with elements like indentation and newlines to improve readability for human developers. However, these visual aids do not seem to be beneficial for large language models (LLMs) in the same way since the…

Software Engineering · Computer Science 2025-08-21 Dangfeng Pan, Zhensu Sun, Cenyuan Zhang, David Lo, Xiaoning Du

Lost in Space: Finding the Right Tokens for Structured Output

General-purpose language models are trained to produce varied natural language outputs, but for some tasks, like annotation or classification, we need more specific output formats. LLM systems increasingly support structured output, which…

Computation and Language · Computer Science 2025-08-04 Sil Hamilton, David Mimno

While Loops in Coq

While loops are present in virtually all imperative programming languages. They are important both for practical reasons (performing a number of iterations not known in advance) and theoretical reasons (achieving Turing completeness). In…

Programming Languages · Computer Science 2023-09-26 David Nowak, Vlad Rusu

CoaCor: Code Annotation for Code Retrieval with Reinforcement Learning

To accelerate software development, much research has been performed to help people understand and reuse the huge amount of available code resources. Two important tasks have been widely studied: code retrieval, which aims to retrieve code…

Software Engineering · Computer Science 2019-04-02 Ziyu Yao, Jayavardhan Reddy Peddamail, Huan Sun

A Formalization of Operads in Coq

What provides the highest level of assurance for correctness of execution within a programming language? One answer, and our solution in particular, to this problem is to provide a formalization for, if it exists, the denotational semantics…

Category Theory · Mathematics 2023-03-17 Zachary Flores, Angelo Taranto, Eric Bond, Yakir Forman

On Formal Reasoning on the Semantics of PLC using Coq

Programmable Logic Controllers (PLC) and its programming standard IEC 61131-3 are widely used in embedded systems for the industrial automation domain. We propose a framework for the formal treatment of PLC based on the IEC 61131-3…

Software Engineering · Computer Science 2013-01-15 Jan Olaf Blech, Sidi Ould Biha

CoReQA: Uncovering Potentials of Language Models in Code Repository Question Answering

Large language models that enhance software development tasks, such as code generation, code completion, and code question answering (QA), have been extensively studied in both academia and the industry. The models are integrated into…

Software Engineering · Computer Science 2025-01-08 Jialiang Chen, Kaifa Zhao, Jie Liu, Chao Peng, Jierui Liu, Hang Zhu, Pengfei Gao, Ping Yang, Shuiguang Deng

Studying the Difference Between Natural and Programming Language Corpora

Code corpora, as observed in large software systems, are now known to be far more repetitive and predictable than natural language corpora. But why? Does the difference simply arise from the syntactic limitations of programming languages?…

Computation and Language · Computer Science 2018-06-08 Casey Casalnuovo, Kenji Sagae, Prem Devanbu

CoSQA: 20,000+ Web Queries for Code Search and Question Answering

Finding codes given natural language query is beneficial to the productivity of software developers. Future progress towards better semantic matching between query and code requires richer supervised training resources. To remedy this, we…

Computation and Language · Computer Science 2021-05-28 Junjie Huang, Duyu Tang, Linjun Shou, Ming Gong, Ke Xu, Daxin Jiang, Ming Zhou, Nan Duan

Formalizing Higher-Order Termination in Coq

We describe a formalization of higher-order rewriting theory and formally prove that an AFS is strongly normalizing if it can be interpreted in a well-founded domain. To do so, we use Coq, which is a proof assistant based on dependent type…

Logic in Computer Science · Computer Science 2021-12-14 Deivid Vale, Niels van der Weide

CoCo: Code as CoT for Text-to-Image Preview and Rare Concept Generation

Recent advancements in Unified Multimodal Models (UMMs) have significantly advanced text-to-image (T2I) generation, particularly through the integration of Chain-of-Thought (CoT) reasoning. However, existing CoT-based T2I methods largely…

Artificial Intelligence · Computer Science 2026-03-10 Haodong Li, Chunmei Qing, Huanyu Zhang, Dongzhi Jiang, Yihang Zou, Hongbo Peng, Dingming Li, Yuhong Dai, ZePeng Lin, Juanxi Tian, Yi Zhou, Siqi Dai, Jingwei Wu

Typed Closure Conversion for the Calculus of Constructions

Dependently typed languages such as Coq are used to specify and verify the full functional correctness of source programs. Type-preserving compilation can be used to preserve these specifications and proofs of correctness through…

Programming Languages · Computer Science 2018-08-14 William J. Bowman, Amal Ahmed

A Systematic Literature Review on the Impact of Formatting Elements on Code Legibility

Context: Software programs can be written in different but functionally equivalent ways. Even though previous research has compared specific formatting elements to find out which alternatives affect code legibility, seeing the bigger…

Software Engineering · Computer Science 2023-06-02 Delano Oliveira, Reydne Santos, Fernanda Madeiral, Hidehiko Masuhara, Fernando Castor