Related papers: Optimal Multi-Paragraph Text Segmentation by Dynam…
In this paper we introduce a dynamic programming algorithm to perform linear text segmentation by global minimization of a segmentation cost function which consists of: (a) within-segment word similarity and (b) prior information about…
Analyzing large graph data is an essential part of many modern applications, such as social networks. Due to its large computational complexity, distributed processing is frequently employed. This requires graph data to be divided across…
Sentence splitting is a major simplification operator. Here we present a simple and efficient splitting algorithm based on an automatic semantic parser. After splitting, the text is amenable to further fine-tuned simplification operations.…
We propose an exact algorithm for solving the longest simple path problem between two given vertices in undirected weighted graphs. By using graph partitioning and dynamic programming, we obtain an algorithm that is significantly faster…
Computing an optimal chain of fragments is a classical problem in string algorithms, with important applications in computational biology. There exist two efficient dynamic programming algorithms solving this problem, based on different…
Sequence segmentation is a well-studied problem, where given a sequence of elements, an integer K, and some measure of homogeneity, the task is to split the sequence into K contiguous segments that are maximally homogeneous. A classic…
We consider least squares approximation of a function of one variable by a continuous, piecewise-linear approximand that has a small number of breakpoints. This problem was notably considered by Bellman who proposed an approximate algorithm…
Distributed computing excels at processing large scale data, but the communication cost for synchronizing the shared parameters may slow down the overall performance. Fortunately, the interactions between parameter and data in many problems…
This paper describes TextTiling, an algorithm for partitioning expository texts into coherent multi-paragraph discourse units which reflect the subtopic structure of the texts. The algorithm uses domain-independent lexical frequency and…
Computing an optimal classification tree that provably maximizes training performance within a given size limit is NP-hard, and in practice, most state-of-the-art methods do not scale beyond computing optimal trees of depth three.…
Time-evolving large graphs have received attention due to their participation in real-world applications such as social networks and PageRank calculation. It is necessary to partition a large-scale dynamic graph in a streaming manner to…
Segmental structure is a common pattern in many types of sequences such as phrases in human languages. In this paper, we present a probabilistic model for sequences via their segmentations. The probability of a segmented sequence is…
This paper introduces a new statistical approach to partitioning text automatically into coherent segments. Our approach enlists both short-range and long-range language models to help it sniff out likely sites of topic changes in text. To…
Text segmentation, the task of dividing a document into sections, is often a prerequisite for performing additional natural language processing tasks. Existing text segmentation methods have typically been developed and tested using clean,…
In the multiple changepoint setting, various search methods have been proposed which involve optimising either a constrained or penalised cost function over possible numbers and locations of changepoints using dynamic programming. Such…
We propose a new method for shape recognition and retrieval based on dynamic programming. Our approach uses the dynamic programming algorithm to compute the optimal score and to find the optimal alignment between two strings. First, each…
Easy Read text is one of the main forms of access to information for people with reading difficulties. One of the key characteristics of this type of text is the requirement to split sentences into smaller grammatical segments, to…
Edge-centric distributed computations have appeared as a recent technique to improve the shortcomings of think-like-a-vertex algorithms on large scale-free networks. In order to increase parallelism on this model, edge partitioning -…
For distributed graph processing on massive graphs, a graph is partitioned into multiple equally-sized parts which are distributed among machines in a compute cluster. In the last decade, many partitioning algorithms have been developed…
Many optimization problems can be naturally represented as (hyper) graphs, where vertices correspond to variables and edges to tasks, whose cost depends on the values of the adjacent variables. Capitalizing on the structure of the graph,…