Related papers: Bootstrapping Structure using Similarity

200 papers

This paper introduces a new type of unsupervised learning algorithm, based on the alignment of sentences and Harris's (1951) notion of interchangeability. The algorithm is applied to an untagged, unstructured corpus of natural language…

Machine Learning · Computer Science 2009-09-25 Menno van Zaanen

This paper introduces a new type of grammar learning algorithm, inspired by string edit distance (Wagner and Fischer, 1974). The algorithm takes a corpus of flat sentences as input and returns a corpus of labelled, bracketed sentences. The…

Machine Learning · Computer Science 2007-05-23 Menno van Zaanen

This thesis introduces a new unsupervised learning framework, called Alignment-Based Learning, which is based on the alignment of sentences and Harris's (1951) notion of substitutability. Instances of the framework can be applied to an…

Machine Learning · Computer Science 2007-05-23 Menno M. van Zaanen

Emphasis Selection is a newly proposed task which focuses on choosing words for emphasis in short sentences. Traditional methods only consider the sequence information of a sentence while ignoring the rich sentence structure and word…

Computation and Language · Computer Science 2021-08-31 Haoran Yang, Wai Lam

In many applications, it is necessary to determine the similarity of two strings. A widely-used notion of string similarity is the edit distance: the minimum number of insertions, deletions, and substitutions required to transform one…

cmp-lg · Computer Science 2008-02-03 Eric Sven Ristad, Peter N. Yianilos

Measuring sentence similarity is a classic topic in natural language processing. Light-weighted similarities are still of particular practical significance even when deep learning models have succeeded in many other tasks. Some…

Computation and Language · Computer Science 2020-02-04 Zihao Wang, Yong Zhang, Hao Wu

The problem of comparing two bodies of text and searching for words that differ in their usage between them arises often in digital humanities and computational social science. This is commonly approached by training word embeddings on each…

Computation and Language · Computer Science 2021-12-30 Hila Gonen, Ganesh Jawahar, Djamé Seddah, Yoav Goldberg

In this paper we describe an algorithm for aligning sentences with their translations in a bilingual corpus using lexical information of the languages. Existing efficient algorithms ignore word identities and consider only the sentence…

Computation and Language · Computer Science 2007-05-23 Akshar Bharati, V. Sriram, A. Vamshi Krishna, Rajeev Sangal, S. M. Bendre

An important task in NLP applications such as sentence simplification is the ability to take a long, complex sentence and split it into shorter sentences, rephrasing as necessary. We introduce a novel dataset and a new model for this `split…

Computation and Language · Computer Science 2021-09-13 Joongwon Kim, Mounica Maddela, Reno Kriz, Wei Xu, Chris Callison-Burch

This paper proposes a mechanism for learning pattern correspondences between two languages from a corpus of translated sentence pairs. The proposed mechanism uses analogical reasoning between two translations. Given a pair of translations,…

cmp-lg · Computer Science 2008-02-03 Ilyas Cicekli, H. Altay Guvenir

Building systems with capability of natural language understanding (NLU) has been one of the oldest areas of AI. An essential component of NLU is to detect logical succession of events contained in a text. The task of sentence ordering is…

Computation and Language · Computer Science 2021-08-30 Melika Golestani, Seyedeh Zahra Razavi, Heshaam Faili

This study is to review the approaches used for measuring sentences similarity. Measuring similarity between natural language sentences is a crucial task for many Natural Language Processing applications such as text classification,…

Computation and Language · Computer Science 2019-10-10 Mamdouh Farouk

We introduce a method for unsupervised parsing that relies on bootstrapping classifiers to identify if a node dominates a specific span in a sentence. There are two types of classifiers, an inside classifier that acts on a span, and an…

Computation and Language · Computer Science 2022-03-22 Nickil Maveli, Shay B. Cohen

We present an algorithm that takes an unannotated corpus as its input, and returns a ranked list of probable morphologically related pairs as its output. The algorithm tries to discover morphologically related pairs by looking for pairs…

Computation and Language · Computer Science 2007-05-23 Marco Baroni, Johannes Matiasek, Harald Trost

A central goal of neuroscience is to understand how activity in the nervous system is related to features of the external world, or to features of the nervous system itself. A common approach is to model neural responses as a weighted…

Machine Learning · Statistics 2015-05-14 Kristofer E. Bouchard

While cross-lingual word embeddings have been studied extensively in recent years, the qualitative differences between the different algorithms remain vague. We observe that whether or not an algorithm uses a particular feature set…

Computation and Language · Computer Science 2017-01-11 Omer Levy, Anders Søgaard, Yoav Goldberg

We propose Quootstrap, a method for extracting quotations, as well as the names of the speakers who uttered them, from large news corpora. Whereas prior work has addressed this problem primarily with supervised machine learning, our…

Social and Information Networks · Computer Science 2018-04-10 Dario Pavllo, Tiziano Piccardi, Robert West

We propose a novel model architecture and training algorithm to learn bilingual sentence embeddings from a combination of parallel and monolingual data. Our method connects autoencoding and neural machine translation to force the source and…

Computation and Language · Computer Science 2019-06-06 Yunsu Kim, Hendrik Rosendahl, Nick Rossenbach, Jan Rosendahl, Shahram Khadivi, Hermann Ney

Generic sentence embeddings provide a coarse-grained approximation of semantic textual similarity but ignore specific aspects that make texts similar. Conversely, aspect-based sentence embeddings provide similarities between texts based on…

Computation and Language · Computer Science 2023-09-26 Tim Schopf, Emanuel Gerber, Malte Ostendorff, Florian Matthes

This paper introduces STRASS: Summarization by TRAnsformation Selection and Scoring. It is an extractive text summarization method which leverages the semantic information in existing sentence embedding spaces. Our method creates an…

Computation and Language · Computer Science 2019-07-18 Léo Bouscarrat, Antoine Bonnefoy, Thomas Peel, Cécile Pereira