Related papers: Introduction to Searching with Regular Expressions

From Regular Expression Matching to Parsing

Given a regular expression $R$ and a string $Q$, the regular expression parsing problem is to determine if $Q$ matches $R$ and if so, determine how it matches, e.g., by a mapping of the characters of $Q$ to the characters in $R$. Regular…

Data Structures and Algorithms · Computer Science 2019-01-30 Philip Bille, Inge Li Gørtz

Developing Smart Web-Search Using RegEx

Due to the increasing storage data on Web Applications, it becomes very difficult to use only keyword-based searches to provide comprehensive search results, thus increasing the difficulty for web users to search information on the web. In…

Information Retrieval · Computer Science 2021-10-12 Ikechukwu Onyenwe, Stanley Ogbonna, Ebele Onyedimma, Onyedikachukwu Ikechukwu-Onyenwe, Chidinma Nwafor

Detecting Structural Irregularity in Electronic Dictionaries Using Language Modeling

Dictionaries are often developed using tools that save to Extensible Markup Language (XML)-based standards. These standards often allow high-level repeating elements to represent lexical entries, and utilize descendants of these repeating…

Computation and Language · Computer Science 2016-02-18 Paul Rodrigues, David Zajic, David Doermann, Michael Bloodgood, Peng Ye

Textual Features for Programming by Example

In Programming by Example, a system attempts to infer a program from input and output examples, generally by searching for a composition of certain base functions. Performing a naive brute force search is infeasible for even mildly involved…

Artificial Intelligence · Computer Science 2012-09-19 Aditya Krishna Menon, Omer Tamuz, Sumit Gulwani, Butler Lampson, Adam Tauman Kalai

Regular Expression Search on Compressed Text

We present an algorithm for searching regular expression matches in compressed text. The algorithm reports the number of matching lines in the uncompressed text in time linear in the size of its compressed version. We define efficient data…

Formal Languages and Automata Theory · Computer Science 2019-01-17 Pierre Ganty, Pedro Valero

Noun-Phrase Analysis in Unrestricted Text for Information Retrieval

Information retrieval is an important application area of natural-language processing where one encounters the genuine challenge of processing large quantities of unrestricted natural-language text. This paper reports on the application of…

cmp-lg · Computer Science 2008-02-03 David A. Evans, Chengxiang Zhai

Document Spanners for Extracting Incomplete Information: Expressiveness and Complexity

Rule-based information extraction has lately received a fair amount of attention from the database community, with several languages appearing in the last few years. Although information extraction systems are intended to deal with…

Databases · Computer Science 2018-01-01 Francisco Maturana, Cristian Riveros, Domagoj Vrgoč

Sound Regular Expression Semantics for Dynamic Symbolic Execution of JavaScript

Existing support for regular expressions in automated test generation or verification tools is lacking. Common aspects of regular expression engines found in mainstream programming languages, such as backreferences or greedy matching, are…

Programming Languages · Computer Science 2020-03-16 Blake Loring, Duncan Mitchell, Johannes Kinder

Temporal Support of Regular Expressions in Sequential Pattern Mining

Classic algorithms for sequential pattern discovery return all frequent sequences present in a database, but in general only a few are interesting for the user. Languages based on regular expressions (RE) have been proposed to…

Databases · Computer Science 2008-11-25 Leticia Gomez, Bart Kuijpers, Alejandro Vaisman

Generating Clarifying Questions for Query Refinement in Source Code Search

In source code search, a common information-seeking strategy involves providing a short initial query with a broad meaning, and then iteratively refining the query using terms gleaned from the results of subsequent searches. This strategy…

Software Engineering · Computer Science 2022-01-26 Zachary Eberhart, Collin McMillan

A Program That Simplifies Regular Expressions (Tool paper)

This paper presents the main features of a system that aims to transform regular expressions into shorter equivalent expressions. The system is also capable of computing other operations useful for simplification, such as checking the…

Symbolic Computation · Computer Science 2023-07-14 Baudouin Le Charlier

Crowd Sourced Data Analysis: Mapping of Programming Concepts to Syntactical Patterns

Since programming concepts do not match their syntactic representations, code search is a very tedious task. For instance in Java or C, array doesn't match [], so using "array" as a query, one cannot find what they are looking for. Often…

Information Retrieval · Computer Science 2019-04-01 Deepak Thukral, Darvesh Punia

Learning to Identify Regular Expressions that Describe Email Campaigns

This paper addresses the problem of inferring a regular expression from a given set of strings that resembles, as closely as possible, the regular expression that a human expert would have written to identify the language. This is motivated…

Machine Learning · Computer Science 2012-06-22 Paul Prasse, Christoph Sawade, Niels Landwehr, Tobias Scheffer

Modeling Information Need of Users in Search Sessions

Users issue queries to search engines and try to find the desired information in the results produced. They repeat this process if their information need is not met in the first place. It is crucial to identify the important words in a…

Information Retrieval · Computer Science 2020-01-06 Kishaloy Halder, Heng-Tze Cheng, Ellie Ka In Chio, Georgios Roumpos, Tao Wu, Ritesh Agarwal

Learning from Uncurated Regular Expressions

Significant work has been done on learning regular expressions from a set of data values. Depending on the domain, this approach can be very successful. However, significant time is required to learn these expressions and the resulting…

Databases · Computer Science 2024-03-18 Michael J. Mior

An index for regular expression queries: Design and implementation

The like regular expression predicate has been part of the SQL standard since at least 1989. However, despite its popularity and wide usage, database vendors provide only limited indexing support for regular expression queries which almost…

Databases · Computer Science 2011-08-16 Dominic Tsang, Sanjay Chawla

Leveraging Cognitive Search Patterns to Enhance Automated Natural Language Retrieval Performance

The search of information in large text repositories has been plagued by the so-called document-query vocabulary gap, i.e. the semantic discordance between the contents in the stored document entities on the one hand and the human query on…

Information Retrieval · Computer Science 2020-04-22 Bhawani Selvaretnam, Mohammed Belkhatir

Marrying up Regular Expressions with Neural Networks: A Case Study for Spoken Language Understanding

The success of many natural language processing (NLP) tasks is bound by the number and quality of annotated data, but there is often a shortage of such training data. In this paper, we ask the question: "Can we combine a neural network (NN)…

Computation and Language · Computer Science 2018-05-16 Bingfeng Luo, Yansong Feng, Zheng Wang, Songfang Huang, Rui Yan, Dongyan Zhao

Formal Specifications from Natural Language

We study the generalization abilities of language models when translating natural language into formal specifications with complex semantics. In particular, we fine-tune language models on three datasets consisting of English sentences and…

Software Engineering · Computer Science 2022-10-21 Christopher Hahn, Frederik Schmitt, Julia J. Tillman, Niklas Metzger, Julian Siber, Bernd Finkbeiner

Bayesian Inference of Regular Expressions from Human-Generated Example Strings

In programming by example, users "write" programs by generating a small number of input-output examples and asking the computer to synthesize consistent programs. We consider a challenging problem in this domain: learning regular…

Artificial Intelligence · Computer Science 2018-09-28 Long Ouyang