Related papers: ReSyn: A Generalized Recursive Regular Expression …

SynGuar: Guaranteeing Generalization in Programming by Example

Programming by Example (PBE) is a program synthesis paradigm in which the synthesizer creates a program that matches a set of given examples. In many applications of such synthesis (e.g., program repair or reverse engineering), we are to…

Programming Languages · Computer Science 2021-06-23 Bo Wang, Teodora Baluta, Aashish Kolluri, Prateek Saxena

A semi-supervised framework for diverse multiple hypothesis testing scenarios

Standard multiple testing procedures are designed to report a list of discoveries, or suspected false null hypotheses, given the hypotheses' p-values or test scores. Recently there has been a growing interest in enhancing such procedures by…

Methodology · Statistics 2025-10-29 Jack Freestone, William Stafford Noble, Uri Keich

reCSE: Portable Reshaping Features for Sentence Embedding in Self-supervised Contrastive Learning

We propose reCSE, a self-supervised contrastive learning sentence representation framework based on feature reshaping. This framework is different from the current advanced models that use discrete data augmentation methods, but instead…

Computation and Language · Computer Science 2024-08-27 Fufangchen Zhao, Jian Gao, Danfeng Yan

Sketch-Driven Regular Expression Generation from Natural Language and Examples

Recent systems for converting natural language descriptions into regular expressions (regexes) have achieved some success, but typically deal with short, formulaic text and can only produce simple regexes. Real-world regexes are complex,…

Computation and Language · Computer Science 2020-08-05 Xi Ye, Qiaochu Chen, Xinyu Wang, Isil Dillig, Greg Durrett

The Usability of Pragmatic Communication in Regular Expression Synthesis

Programming-by-example (PBE) systems aim to alleviate the burden of programming. However, user-specified examples are often ambiguous, leaving multiple programs to satisfy the specification. Consequently, in most prior work, users have had…

Human-Computer Interaction · Computer Science 2023-08-15 Priyan Vaithilingam, Yewen Pu, Elena L. Glassman

Neuro-Symbolic Regex Synthesis Framework via Neural Example Splitting

Due to the practical importance of regular expressions (regexes, for short), there has been a lot of research to automatically generate regexes from positive and negative string examples. We tackle the problem of learning regexes faster…

Machine Learning · Computer Science 2022-05-24 Su-Hyeon Kim, Hyunjoon Cheon, Yo-Sub Han, Sang-Ki Ko

Optimizing Regular Expressions via Rewrite-Guided Synthesis

Regular expressions are pervasive in modern systems. Many real-world regular expressions are inefficient, sometimes to the extent that they are vulnerable to complexity-based attacks, and while much research has focused on detecting…

Programming Languages · Computer Science 2022-09-30 Jedidiah McClurg, Miles Claver, Jackson Garner, Jake Vossen, Jordan Schmerge, Mehmet E. Belviranli

Resource-Guided Program Synthesis

This article presents resource-guided synthesis, a technique for synthesizing recursive programs that satisfy both a functional specification and a symbolic resource bound. The technique is type-directed and rests upon a novel type system…

Programming Languages · Computer Science 2019-04-19 Tristan Knoth, Di Wang, Nadia Polikarpova, Jan Hoffmann

Multi-modal Synthesis of Regular Expressions

In this paper, we propose a multi-modal synthesis technique for automatically constructing regular expressions (regexes) from a combination of examples and natural language. Using multiple modalities is useful in this context because…

Programming Languages · Computer Science 2020-03-24 Qiaochu Chen, Xinyu Wang, Xi Ye, Greg Durrett, Isil Dillig

Grammar Filtering For Syntax-Guided Synthesis

Programming-by-example (PBE) is a synthesis paradigm that allows users to generate functions by simply providing input-output examples. While a promising interaction paradigm, synthesis is still too slow for realtime interaction and more…

Machine Learning · Computer Science 2020-02-10 Kairo Morton, William Hallahan, Elven Shum, Ruzica Piskac, Mark Santolucito

Data Extraction via Semantic Regular Expression Synthesis

Many data extraction tasks of practical relevance require not only syntactic pattern matching but also semantic reasoning about the content of the underlying text. While regular expressions are very well suited for tasks that require only…

Programming Languages · Computer Science 2023-08-28 Qiaochu Chen, Arko Banerjee, Çağatay Demiralp, Greg Durrett, Isil Dillig

Re-evaluating Retrosynthesis Algorithms with Syntheseus

Automated Synthesis Planning has recently re-emerged as a research area at the intersection of chemistry and machine learning. Despite the appearance of steady progress, we argue that imperfect benchmarks and inconsistent comparisons mask…

Machine Learning · Computer Science 2024-09-09 Krzysztof Maziarz, Austin Tripp, Guoqing Liu, Megan Stanley, Shufang Xie, Piotr Gaiński, Philipp Seidl, Marwin Segler

Is Reuse All You Need? A Systematic Comparison of Regular Expression Composition Strategies

Composing regexes is a common but challenging engineering activity. Software engineers struggle with regex complexity, leading to defects, performance issues, and security vulnerabilities. Researchers have proposed tools to synthesize…

Software Engineering · Computer Science 2025-09-25 Berk Çakar, Charles M. Sale, Sophie Chen, Dongyoon Lee, James C. Davis

Regularization Using Synthetic Data in High-Dimensional Models

To address the challenges of reliable statistical inference in high-dimensional models, we introduce the Synthetic-data Regularized Estimator (SRE). Unlike traditional regularization methods, the SRE regularizes the complex target model via…

Statistics Theory · Mathematics 2025-03-18 Weihao Li, Dongming Huang

Deep synthesis regularization of inverse problems

Recently, a large number of efficient deep learning methods for solving inverse problems have been developed and show outstanding numerical performance. For these deep learning methods, however, a solid theoretical foundation in the form of…

Numerical Analysis · Mathematics 2020-02-04 Daniel Obmann, Johannes Schwab, Markus Haltmeier

Programming by Example Made Easy

Programming by example (PBE) is an emerging programming paradigm that automatically synthesizes programs specified by user-provided input-output examples. Despite the convenience for end-users, implementing PBE tools often requires strong…

Software Engineering · Computer Science 2023-07-25 Jiarong Wu, Lili Wei, Yanyan Jiang, Shing-Chi Cheung, Luyao Ren, Chang Xu

gMeta: Template-based Regular Expression Generation over Noisy Examples

Regular expressions (regexes) are widely used in different fields of computer science, such as programming languages, string processing, and databases. However, existing tools for synthesizing or repairing regexes always assume that the…

Software Engineering · Computer Science 2022-11-02 Shujun Wang, Yongqiang Tian, Dengcheng He

Repairing DoS Vulnerability of Real-World Regexes

There has been much work on synthesizing and repairing regular expressions (regexes for short) from examples. These programming-by-example (PBE) methods help the users write regexes by letting them reflect their intention by examples.…

Programming Languages · Computer Science 2022-08-23 Nariyoshi Chida, Tachio Terauchi

Beyond the Spectrum: Detecting Deepfakes via Re-Synthesis

The rapid advances in deep generative models over the past years have led to highly realistic media, known as deepfakes, that are commonly indistinguishable from real to human eyes. These advances make assessing the authenticity of visual…

Computer Vision and Pattern Recognition · Computer Science 2021-06-01 Yang He, Ning Yu, Margret Keuper, Mario Fritz

ReSyn: Autonomously Scaling Synthetic Environments for Reasoning Models

Reinforcement learning with verifiable rewards (RLVR) has emerged as a promising approach for training reasoning language models (RLMs) by leveraging supervision from verifiers. Although verifier implementation is easier than solution…

Artificial Intelligence · Computer Science 2026-02-24 Andre He, Nathaniel Weir, Kaj Bostrom, Allen Nie, Darion Cassel, Sam Bayless, Huzefa Rangwala