Related papers: Fast Sequence Segmentation using Log-Linear Models

200 papers

Monotonicity is a simple yet significant qualitative characteristic. We consider the problem of segmenting a sequence into up to K segments. We want segments to be as monotonic as possible and to alternate signs. We propose a quality metric…

Databases · Computer Science 2009-09-01 Daniel Lemire, Martin Brooks, Yuhong Yan

Monotonicity is a simple yet significant qualitative characteristic. We consider the problem of segmenting an array into up to K segments. We want segments to be as monotonic as possible and to alternate signs. We propose a quality metric for…

Data Structures and Algorithms · Computer Science 2007-05-23 Daniel Lemire, Martin Brooks, Yuhong Yan

Partitioning a sequence of length $n$ into $k$ coherent segments (Seg) is one of the classic optimization problems. As long as the optimization criterion is additive, Seg can be solved exactly in $O(n^2k)$ time using a classic dynamic…

Data Structures and Algorithms · Computer Science 2019-02-06 Nikolaj Tatti

Many software analysis methods have come to rely on machine learning approaches. Code segmentation - the process of decomposing source code into meaningful blocks - can augment these methods by featurizing code, reducing noise, and limiting…

Software Engineering · Computer Science 2019-07-23 Jacob Dormuth, Ben Gelman, Jessica Moore, David Slater

The $k$-segmentation of a video stream is used to partition it into $k$ piecewise-linear segments, so that each linear segment has a meaningful interpretation. Such segmentation may be used to summarize large videos using a small set of…

Computer Vision and Pattern Recognition · Computer Science 2020-09-14 Sabarish Vadarevu, Vijay Karamcheti

Binary segmentation is the classic greedy algorithm which recursively splits a sequential data set by optimizing some loss or likelihood function. Binary segmentation is widely used for changepoint detection in data sets measured over space…

Machine Learning · Computer Science 2024-10-14 Toby Dylan Hocking

Due to the increasing complexity and interconnectedness of different components in modern automotive software systems there is a great number of interactions between these system components and their environment. These interactions result…

Applications · Statistics 2025-03-11 Bojan Lukić, Thorben Knust, Andreas Rausch

In high-dimensional generalized linear models, it is crucial to identify a sparse model that adequately accounts for response variation. Although best subset selection has been widely regarded as the Holy Grail of problems of this type,…

Machine Learning · Statistics 2023-08-02 Junxian Zhu, Jin Zhu, Borui Tang, Xuanyu Chen, Hongmei Lin, Xueqin Wang

I tackle the problem of partitioning a sequence into homogeneous segments, where homogeneity is defined by a set of Markov models. The problem is to study the likelihood that a sequence is divided into a given number of segments. Here, the…

Quantitative Methods · Quantitative Biology 2009-11-17 Laurent Guéguen

Computational methods for discovering patterns of local correlations in sequences are important in computational biology. Here we show how to determine the optimal partitioning of aligned sequences into non-overlapping segments such that…

Computational Engineering, Finance, and Science · Computer Science 2012-06-26 Joseph Bockhorst, Nebojsa Jojic

Segmental structure is a common pattern in many types of sequences such as phrases in human languages. In this paper, we present a probabilistic model for sequences via their segmentations. The probability of a segmented sequence is…

Machine Learning · Statistics 2018-07-20 Chong Wang, Yining Wang, Po-Sen Huang, Abdelrahman Mohamed, Dengyong Zhou, Li Deng

Identifying the underlying models in a set of data points contaminated by noise and outliers leads to a highly complex multi-model fitting problem. This problem can be posed as a clustering problem by the projection of higher order…

Computer Vision and Pattern Recognition · Computer Science 2018-08-01 Ruwan Tennakoon, Alireza Sadri, Reza Hoseinnezhad, Alireza Bab-Hadiashar

Kernel segmentation aims at partitioning a data sequence into several non-overlapping segments that may have nonlinear and complex structures. In general, it is formulated as a discrete optimization problem with combinatorial constraints. A…

Machine Learning · Computer Science 2022-06-23 Tung Doan, Atsuhiro Takasu

There exist several methods of calculating a similarity curve, or a sequence of similarity values, representing the lexical cohesion of successive text constituents, e.g., paragraphs. Methods for deciding the locations of fragment…

Computation and Language · Computer Science 2007-05-23 Oskari Heinonen

This paper describes a linear-time algorithm that finds the longest stretch in a sequence of real numbers ("scores") in which the sum exceeds an input parameter. The algorithm also solves the problem of finding the longest interval in…

Data Structures and Algorithms · Computer Science 2007-05-23 Miklós Csűrös

We study the fixed design segmented regression problem: Given noisy samples from a piecewise linear function $f$, we want to recover $f$ up to a desired accuracy in mean-squared error. Previous rigorous approaches for this problem rely on…

Machine Learning · Computer Science 2016-07-15 Jayadev Acharya, Ilias Diakonikolas, Jerry Li, Ludwig Schmidt

Sequence-to-Sequence models were introduced to tackle many real-life problems like machine translation, summarization, image captioning, etc. The standard optimization algorithms are mainly based on example-to-example matching like maximum…

Computation and Language · Computer Science 2018-09-05 Wenhu Chen, Guanlin Li, Shujie Liu, Zhirui Zhang, Mu Li, Ming Zhou

Mixed-integer optimisation problems can be computationally challenging. Here, we introduce and analyse two efficient algorithms with a specific sequential design that are aimed at dealing with sampled problems within this class. At each…

Optimization and Control · Mathematics 2023-03-07 Mohammadreza Chamanbaz, Roland Bouffanais

We propose a randomized method for solving linear programs with a large number of columns but a relatively small number of constraints. Since enumerating all the columns is usually unrealistic, such linear programs are commonly solved by…

Optimization and Control · Mathematics 2023-11-29 Yi-Chun Akchen, Velibor V. Mišić

Sequence classification algorithms, such as SVM, require a definition of distance (similarity) measure between two sequences. A commonly used notion of similarity is the number of matches between $k$-mers ($k$-length subsequences) in the…

Data Structures and Algorithms · Computer Science 2017-12-13 Muhammad Farhan, Juvaria Tariq, Arif Zaman, Mudassir Shabbir, Imdad Ullah Khan