Related papers: Stream Processing using Grammars and Regular Expre…

Solving String Constraints With Regex-Dependent Functions Through Transducers With Priorities And Variables

Regular expressions are a classical concept in formal language theory. Regular expressions in programming languages (RegEx), such as JavaScript, feature non-standard semantics of operators (e.g. greedy/lazy Kleene star), as well as…

Programming Languages · Computer Science 2021-11-23 Taolue Chen, Alejandro Flores Lamas, Matthew Hague, Zhilei Han, Denghang Hu, Shuanglong Kan, Anthony Widjaja Lin, Philipp Ruemmer, Zhilin Wu

Streaming Word Embeddings with the Space-Saving Algorithm

We develop a streaming (one-pass, bounded-memory) word embedding algorithm based on the canonical skip-gram with negative sampling algorithm implemented in word2vec. We compare our streaming algorithm to word2vec empirically by measuring…

Computation and Language · Computer Science 2017-04-26 Chandler May, Kevin Duh, Benjamin Van Durme, Ashwin Lall

Data Extraction via Semantic Regular Expression Synthesis

Many data extraction tasks of practical relevance require not only syntactic pattern matching but also semantic reasoning about the content of the underlying text. While regular expressions are very well suited for tasks that require only…

Programming Languages · Computer Science 2023-08-28 Qiaochu Chen, Arko Banerjee, Çağatay Demiralp, Greg Durrett, Isil Dillig

Unique Pattern Matching in Strings

Regular expression patterns are a key feature of document processing languages like Perl and XDuce. It is in this context that the first and longest match policies have been proposed to disambiguate the pattern matching process. We formally…

Programming Languages · Computer Science 2007-05-23 Stijn Vansummeren

New Algorithms and Lower Bounds for Sequential-Access Data Compression

This thesis concerns sequential-access data compression, i.e., by algorithms that read the input one or more times from beginning to end. In one chapter we consider adaptive prefix coding, for which we must read the input character by…

Information Theory · Computer Science 2009-02-03 Travis Gagie

PaREM: A Novel Approach for Parallel Regular Expression Matching

Regular expression matching is essential for many applications, such as finding patterns in text, exploring substrings in large DNA sequences, or lexical analysis. However, sequential regular expression matching may be time-prohibitive for…

Formal Languages and Automata Theory · Computer Science 2015-06-30 Suejb Memeti, Sabri Pllana

Streaming Algorithms for Multitasking Scheduling with Shared Processing

In this paper, we design the first streaming algorithms for the problem of multitasking scheduling on parallel machines with shared processing. In one pass, our streaming approximation schemes can provide an approximate value of the optimal…

Data Structures and Algorithms · Computer Science 2022-04-06 Bin Fu, Yumei Huo, Hairong Zhao

Fast and Faster: A Comparison of Two Streamed Matrix Decomposition Algorithms

With the explosion of the size of digital dataset, the limiting factor for decomposition algorithms is the number of passes over the input, as the input is often stored out-of-core or even off-site. Moreover, we're only interested in…

Numerical Analysis · Computer Science 2016-08-14 Radim Řehůřek

Revisiting Regex Generation for Modeling Industrial Applications by Incorporating Byte Pair Encoder

Regular expression is important for many natural language processing tasks especially when used to deal with unstructured and semi-structured data. This work focuses on automatically generating regular expressions and proposes a novel…

Neural and Evolutionary Computing · Computer Science 2020-06-25 Desheng Wang, Jiawei Liu, Xiang Qi, Baolin Sun, Peng Zhang

Recognizing well-parenthesized expressions in the streaming model

Motivated by a concrete problem and with the goal of understanding the sense in which the complexity of streaming algorithms is related to the complexity of formal languages, we investigate the problem Dyck(s) of checking matching…

Data Structures and Algorithms · Computer Science 2009-11-18 F. Magniez, C. Mathieu, A. Nayak

From Regular Expression Matching to Parsing

Given a regular expression $R$ and a string $Q$, the regular expression parsing problem is to determine if $Q$ matches $R$ and if so, determine how it matches, e.g., by a mapping of the characters of $Q$ to the characters in $R$. Regular…

Data Structures and Algorithms · Computer Science 2019-01-30 Philip Bille, Inge Li Gørtz

Fast Packed String Matching for Short Patterns

Searching for all occurrences of a pattern in a text is a fundamental problem in computer science with applications in many other fields, like natural language processing, information retrieval and computational biology. In the last two…

Information Retrieval · Computer Science 2012-10-01 Simone Faro, M. Oguzhan Külekci

Stream-based State-Machine Replication

Developing state-machine replication protocols for practical use is a complex and labor-intensive process because of the myriad of essential tasks (e.g., deployment, communication, recovery) that need to be taken into account in an…

Distributed, Parallel, and Cluster Computing · Computer Science 2021-06-25 Laura Lawniczak, Tobias Distler

Semi-Streaming Algorithms for Hypergraph Matching

We propose two one-pass streaming algorithms for the $\mathcal{NP}$-hard hypergraph matching problem. The first algorithm stores a small subset of potential matching edges in a stack using dual variables to select edges. It has an…

Data Structures and Algorithms · Computer Science 2025-07-09 Henrik Reinstädtler, S M Ferdous, Alex Pothen, Bora Uçar, Christian Schulz

From Regexes to Parsing Expression Grammars

Most scripting languages nowadays use regex pattern-matching libraries. These regex libraries borrow the syntax of regular expressions, but have an informal semantics that is different from the semantics of regular expressions, removing the…

Formal Languages and Automata Theory · Computer Science 2014-02-17 Sérgio Medeiros, Fabio Mascarenhas, Roberto Ierusalimschy

Online Linearized LASSO

Sparse regression has been a popular approach to perform variable selection and enhance the prediction accuracy and interpretability of the resulting statistical model. Existing approaches focus on offline regularized regression, while the…

Machine Learning · Statistics 2023-01-03 Shuoguang Yang, Yuhao Yan, Xiuneng Zhu, Qiang Sun

Parameterized Streaming Algorithms for Vertex Cover

As graphs continue to grow in size, we seek ways to effectively process such data at scale. The model of streaming graph processing, in which a compact summary is maintained as each edge insertion/deletion is observed, is an attractive one.…

Data Structures and Algorithms · Computer Science 2014-07-25 Rajesh Chitnis, Graham Cormode, MohammadTaghi Hajiaghayi, Morteza Monemizadeh

StreamTensor: Make Tensors Stream in Dataflow Accelerators for LLMs

Efficient execution of deep learning workloads on dataflow architectures is crucial for overcoming memory bottlenecks and maximizing performance. While streaming intermediate results between computation kernels can significantly improve…

Hardware Architecture · Computer Science 2025-09-24 Hanchen Ye, Deming Chen

Maximum Matching in Semi-Streaming with Few Passes

In the semi-streaming model, an algorithm receives a stream of edges of a graph in arbitrary order and uses a memory of size $O(n \mbox{ polylog } n)$, where $n$ is the number of vertices of a graph. In this work, we present semi-streaming…

Data Structures and Algorithms · Computer Science 2014-04-11 Christian Konrad, Frédéric Magniez, Claire Mathieu

Stream: Scaling up Mechanistic Interpretability to Long Context in LLMs via Sparse Attention

As Large Language Models (LLMs) scale to million-token contexts, traditional Mechanistic Interpretability techniques for analyzing attention scale quadratically with context length, demanding terabytes of memory beyond 100,000 tokens. We…

Computation and Language · Computer Science 2026-02-03 J Rosser, José Luis Redondo García, Gustavo Penha, Konstantina Palla, Hugues Bouchard