Related papers: Bootstrapping Structure into Language: Alignment-B…

200 papers

This paper introduces a new type of unsupervised learning algorithm, based on the alignment of sentences and Harris's (1951) notion of interchangeability. The algorithm is applied to an untagged, unstructured corpus of natural language…

Machine Learning · Computer Science 2009-09-25 Menno van Zaanen

This paper introduces a new type of grammar learning algorithm, inspired by string edit distance (Wagner and Fischer, 1974). The algorithm takes a corpus of flat sentences as input and returns a corpus of labelled, bracketed sentences. The…

Machine Learning · Computer Science 2007-05-23 Menno van Zaanen

In this paper a new similarity-based learning algorithm, inspired by string edit-distance (Wagner and Fischer, 1974), is applied to the problem of bootstrapping structure from scratch. The algorithm takes a corpus of unannotated sentences…

Machine Learning · Computer Science 2007-05-23 Menno van Zaanen

Recent research has shown that word embedding spaces learned from text corpora of different languages can be aligned without any parallel data supervision. Inspired by the success in unsupervised cross-lingual word embeddings, in this paper…

Computation and Language · Computer Science 2018-09-24 Yu-An Chung, Wei-Hung Weng, Schrasing Tong, James Glass

We address the text-to-text generation problem of sentence-level paraphrasing -- a phenomenon distinct from and more difficult than word- or phrase-level paraphrasing. Our approach applies multiple-sequence alignment to sentences gathered…

Computation and Language · Computer Science 2007-05-23 Regina Barzilay, Lillian Lee

In what ways might statistical signals in linguistic input assist with the acquisition of syntax? Here we hypothesize a mechanism called collocational bootstrapping, in which regularities in word co-occurrence patterns can provide cues to…

Computation and Language · Computer Science 2026-05-21 Claire Hobbs, R. Thomas McCoy

Though language model text embeddings have revolutionized NLP research, their ability to capture high-level semantic information, such as relations between entities in text, is limited. In this paper, we propose a novel contrastive learning…

Computation and Language · Computer Science 2023-10-10 Christos Theodoropoulos, James Henderson, Andrei C. Coman, Marie-Francine Moens

Contrastive learning has been the dominant approach to train state-of-the-art sentence embeddings. Previous studies have typically learned sentence embeddings either through the use of human-annotated natural language inference (NLI) data…

Computation and Language · Computer Science 2023-10-25 Junlei Zhang, Zhenzhong Lan, Junxian He

Protein language models often take into consideration the alignment between a protein sequence and its textual description. However, they do not take structural information into consideration. Traditional methods treat sequence and…

Machine Learning · Computer Science 2026-03-10 Aditya Ranganath, Hasin Us Sami, Kowshik Thopalli, Bhavya Kailkhura, Wesam Sakla

We introduce a method for unsupervised parsing that relies on bootstrapping classifiers to identify if a node dominates a specific span in a sentence. There are two types of classifiers, an inside classifier that acts on a span, and an…

Computation and Language · Computer Science 2022-03-22 Nickil Maveli, Shay B. Cohen

We propose a novel approach to logic-based learning which generates assumption-based argumentation (ABA) frameworks from positive and negative examples, using a given background knowledge. These ABA frameworks can be mapped onto logic…

Artificial Intelligence · Computer Science 2023-05-26 Maurizio Proietti, Francesca Toni

Naturally-occurring bracketings, such as answer fragments to natural language questions and hyperlinks on webpages, can reflect human syntactic intuition regarding phrasal boundaries. Their availability and approximate correspondence to…

Computation and Language · Computer Science 2021-04-30 Tianze Shi, Ozan İrsoy, Igor Malioutov, Lillian Lee

This paper presents SimCSE, a simple contrastive learning framework that greatly advances state-of-the-art sentence embeddings. We first describe an unsupervised approach, which takes an input sentence and predicts itself in a contrastive…

Computation and Language · Computer Science 2022-05-19 Tianyu Gao, Xingcheng Yao, Danqi Chen

In-context learning is a surprising and important phenomenon that emerged when modern language models were scaled to billions of learned parameters. Without modifying a large language model's weights, it can be tuned to perform various…

Computation and Language · Computer Science 2023-03-15 Noam Wies, Yoav Levine, Amnon Shashua

An important component of any generation system is the mapping dictionary, a lexicon of elementary semantic expressions and corresponding natural language realizations. Typically, labor-intensive knowledge-based methods are used to…

Computation and Language · Computer Science 2007-05-23 Regina Barzilay, Lillian Lee

We introduce a neural network that represents sentences by composing their words according to induced binary parse trees. We use Tree-LSTM as our composition function, applied along a tree structure found by a fully differentiable natural…

Computation and Language · Computer Science 2020-01-16 Jean Maillard, Stephen Clark, Dani Yogatama

The emergence of large language models (LLMs) has sparked significant interest in extending their remarkable language capabilities to speech. However, modality alignment between speech and text still remains an open problem. Current…

Computation and Language · Computer Science 2024-05-29 Chen Wang, Minpeng Liao, Zhongqiang Huang, Jinliang Lu, Junhong Wu, Yuchen Liu, Chengqing Zong, Jiajun Zhang

Text style transfer is an important task in controllable language generation. Supervised approaches have pushed performance improvement on style-oriented rewriting such as formality conversion. However, challenges remain due to the scarcity…

Computation and Language · Computer Science 2022-05-20 Zhengyuan Liu, Nancy F. Chen

Large Language Models produce sequences learned as statistical patterns from large corpora. In order not to reproduce corpus biases, after initial training models must be aligned with human values, preferencing certain continuations over…

Computation and Language · Computer Science 2024-01-02 Tsvetelina Hristova, Liam Magee, Karen Soldatic

Language is highly structured, with syntactic and semantic structures, to some extent, agreed upon by speakers of the same language. With implicit or explicit awareness of such structures, humans can learn and use language efficiently and…

Computation and Language · Computer Science 2024-10-23 Freda Shi