Related papers: Stream Processing using Grammars and Regular Expre…

200 papers

Regular expressions are a classical concept in formal language theory. Regular expressions in programming languages (RegEx), such as JavaScript, feature non-standard semantics of operators (e.g. greedy/lazy Kleene star), as well as…

We develop a streaming (one-pass, bounded-memory) word embedding algorithm based on the canonical skip-gram with negative sampling algorithm implemented in word2vec. We compare our streaming algorithm to word2vec empirically by measuring…

Computation and Language · Computer Science 2017-04-26 Chandler May, Kevin Duh, Benjamin Van Durme, Ashwin Lall

Many data extraction tasks of practical relevance require not only syntactic pattern matching but also semantic reasoning about the content of the underlying text. While regular expressions are very well suited for tasks that require only…

Programming Languages · Computer Science 2023-08-28 Qiaochu Chen, Arko Banerjee, Çağatay Demiralp, Greg Durrett, Isil Dillig

Regular expression patterns are a key feature of document processing languages like Perl and XDuce. It is in this context that the first and longest match policies have been proposed to disambiguate the pattern matching process. We formally…

Programming Languages · Computer Science 2007-05-23 Stijn Vansummeren

This thesis concerns sequential-access data compression, i.e., by algorithms that read the input one or more times from beginning to end. In one chapter we consider adaptive prefix coding, for which we must read the input character by…

Information Theory · Computer Science 2009-02-03 Travis Gagie

Regular expression matching is essential for many applications, such as finding patterns in text, exploring substrings in large DNA sequences, or lexical analysis. However, sequential regular expression matching may be time-prohibitive for…

Formal Languages and Automata Theory · Computer Science 2015-06-30 Suejb Memeti, Sabri Pllana

In this paper, we design the first streaming algorithms for the problem of multitasking scheduling on parallel machines with shared processing. In one pass, our streaming approximation schemes can provide an approximate value of the optimal…

Data Structures and Algorithms · Computer Science 2022-04-06 Bin Fu, Yumei Huo, Hairong Zhao

With the explosion of the size of digital dataset, the limiting factor for decomposition algorithms is the number of passes over the input, as the input is often stored out-of-core or even off-site. Moreover, we're only interested in…

Numerical Analysis · Computer Science 2016-08-14 Radim Řehůřek

Regular expression is important for many natural language processing tasks especially when used to deal with unstructured and semi-structured data. This work focuses on automatically generating regular expressions and proposes a novel…

Neural and Evolutionary Computing · Computer Science 2020-06-25 Desheng Wang, Jiawei Liu, Xiang Qi, Baolin Sun, Peng Zhang

Motivated by a concrete problem and with the goal of understanding the sense in which the complexity of streaming algorithms is related to the complexity of formal languages, we investigate the problem Dyck(s) of checking matching…

Data Structures and Algorithms · Computer Science 2009-11-18 F. Magniez, C. Mathieu, A. Nayak

Given a regular expression $R$ and a string $Q$, the regular expression parsing problem is to determine if $Q$ matches $R$ and if so, determine how it matches, e.g., by a mapping of the characters of $Q$ to the characters in $R$. Regular…

Data Structures and Algorithms · Computer Science 2019-01-30 Philip Bille, Inge Li Gørtz

Searching for all occurrences of a pattern in a text is a fundamental problem in computer science with applications in many other fields, like natural language processing, information retrieval and computational biology. In the last two…

Information Retrieval · Computer Science 2012-10-01 Simone Faro, M. Oguzhan Külekci

Developing state-machine replication protocols for practical use is a complex and labor-intensive process because of the myriad of essential tasks (e.g., deployment, communication, recovery) that need to be taken into account in an…

Distributed, Parallel, and Cluster Computing · Computer Science 2021-06-25 Laura Lawniczak, Tobias Distler

We propose two one-pass streaming algorithms for the $\mathcal{NP}$-hard hypergraph matching problem. The first algorithm stores a small subset of potential matching edges in a stack using dual variables to select edges. It has an…

Data Structures and Algorithms · Computer Science 2025-07-09 Henrik Reinstädtler, S M Ferdous, Alex Pothen, Bora Uçar, Christian Schulz

Most scripting languages nowadays use regex pattern-matching libraries. These regex libraries borrow the syntax of regular expressions, but have an informal semantics that is different from the semantics of regular expressions, removing the…

Formal Languages and Automata Theory · Computer Science 2014-02-17 Sérgio Medeiros, Fabio Mascarenhas, Roberto Ierusalimschy

Sparse regression has been a popular approach to perform variable selection and enhance the prediction accuracy and interpretability of the resulting statistical model. Existing approaches focus on offline regularized regression, while the…

Machine Learning · Statistics 2023-01-03 Shuoguang Yang, Yuhao Yan, Xiuneng Zhu, Qiang Sun

As graphs continue to grow in size, we seek ways to effectively process such data at scale. The model of streaming graph processing, in which a compact summary is maintained as each edge insertion/deletion is observed, is an attractive one.…

Data Structures and Algorithms · Computer Science 2014-07-25 Rajesh Chitnis, Graham Cormode, MohammadTaghi Hajiaghayi, Morteza Monemizadeh

Efficient execution of deep learning workloads on dataflow architectures is crucial for overcoming memory bottlenecks and maximizing performance. While streaming intermediate results between computation kernels can significantly improve…

Hardware Architecture · Computer Science 2025-09-24 Hanchen Ye, Deming Chen

In the semi-streaming model, an algorithm receives a stream of edges of a graph in arbitrary order and uses a memory of size $O(n \mbox{ polylog } n)$, where $n$ is the number of vertices of a graph. In this work, we present semi-streaming…

Data Structures and Algorithms · Computer Science 2014-04-11 Christian Konrad, Frédéric Magniez, Claire Mathieu

As Large Language Models (LLMs) scale to million-token contexts, traditional Mechanistic Interpretability techniques for analyzing attention scale quadratically with context length, demanding terabytes of memory beyond 100,000 tokens. We…

Computation and Language · Computer Science 2026-02-03 J Rosser, José Luis Redondo García, Gustavo Penha, Konstantina Palla, Hugues Bouchard