Related papers: Bootstrapping Structure into Language: Alignment-B…

Bootstrapping Syntax and Recursion using Alignment-Based Learning

This paper introduces a new type of unsupervised learning algorithm, based on the alignment of sentences and Harris's (1951) notion of interchangeability. The algorithm is applied to an untagged, unstructured corpus of natural language…

Machine Learning · Computer Science 2009-09-25 Menno van Zaanen

ABL: Alignment-Based Learning

This paper introduces a new type of grammar learning algorithm, inspired by string edit distance (Wagner and Fischer, 1974). The algorithm takes a corpus of flat sentences as input and returns a corpus of labelled, bracketed sentences. The…

Machine Learning · Computer Science 2007-05-23 Menno van Zaanen

Bootstrapping Structure using Similarity

In this paper a new similarity-based learning algorithm, inspired by string edit-distance (Wagner and Fischer, 1974), is applied to the problem of bootstrapping structure from scratch. The algorithm takes a corpus of unannotated sentences…

Machine Learning · Computer Science 2007-05-23 Menno van Zaanen

Unsupervised Cross-Modal Alignment of Speech and Text Embedding Spaces

Recent research has shown that word embedding spaces learned from text corpora of different languages can be aligned without any parallel data supervision. Inspired by the success in unsupervised cross-lingual word embeddings, in this paper…

Computation and Language · Computer Science 2018-09-24 Yu-An Chung, Wei-Hung Weng, Schrasing Tong, James Glass

Learning to Paraphrase: An Unsupervised Approach Using Multiple-Sequence Alignment

We address the text-to-text generation problem of sentence-level paraphrasing -- a phenomenon distinct from and more difficult than word- or phrase-level paraphrasing. Our approach applies multiple-sequence alignment to sentences gathered…

Computation and Language · Computer Science 2007-05-23 Regina Barzilay, Lillian Lee

Collocational bootstrapping: A hypothesis about the learning of subject-verb agreement in humans and neural networks

In what ways might statistical signals in linguistic input assist with the acquisition of syntax? Here we hypothesize a mechanism called collocational bootstrapping, in which regularities in word co-occurrence patterns can provide cues to…

Computation and Language · Computer Science 2026-05-21 Claire Hobbs, R. Thomas McCoy

Imposing Relation Structure in Language-Model Embeddings Using Contrastive Learning

Though language model text embeddings have revolutionized NLP research, their ability to capture high-level semantic information, such as relations between entities in text, is limited. In this paper, we propose a novel contrastive learning…

Computation and Language · Computer Science 2023-10-10 Christos Theodoropoulos, James Henderson, Andrei C. Coman, Marie-Francine Moens

Contrastive Learning of Sentence Embeddings from Scratch

Contrastive learning has been the dominant approach to train state-of-the-art sentence embeddings. Previous studies have typically learned sentence embeddings either through the use of human-annotated natural language inference (NLI) data…

Computation and Language · Computer Science 2023-10-25 Junlei Zhang, Zhenzhong Lan, Junxian He

ProtAlign: Contrastive learning paradigm for Sequence and structure alignment

Protein language models often take into consideration the alignment between a protein sequence and its textual description. However, they do not take structural information into consideration. Traditional methods treat sequence and…

Machine Learning · Computer Science 2026-03-10 Aditya Ranganath, Hasin Us Sami, Kowshik Thopalli, Bhavya Kailkhura, Wesam Sakla

Co-training an Unsupervised Constituency Parser with Weak Supervision

We introduce a method for unsupervised parsing that relies on bootstrapping classifiers to identify if a node dominates a specific span in a sentence. There are two types of classifiers, an inside classifier that acts on a span, and an…

Computation and Language · Computer Science 2022-03-22 Nickil Maveli, Shay B. Cohen

Learning Assumption-based Argumentation Frameworks

We propose a novel approach to logic-based learning which generates assumption-based argumentation (ABA) frameworks from positive and negative examples, using a given background knowledge. These ABA frameworks can be mapped onto logic…

Artificial Intelligence · Computer Science 2023-05-26 Maurizio Proietti, Francesca Toni

Learning Syntax from Naturally-Occurring Bracketings

Naturally-occurring bracketings, such as answer fragments to natural language questions and hyperlinks on webpages, can reflect human syntactic intuition regarding phrasal boundaries. Their availability and approximate correspondence to…

Computation and Language · Computer Science 2021-04-30 Tianze Shi, Ozan İrsoy, Igor Malioutov, Lillian Lee

SimCSE: Simple Contrastive Learning of Sentence Embeddings

This paper presents SimCSE, a simple contrastive learning framework that greatly advances state-of-the-art sentence embeddings. We first describe an unsupervised approach, which takes an input sentence and predicts itself in a contrastive…

Computation and Language · Computer Science 2022-05-19 Tianyu Gao, Xingcheng Yao, Danqi Chen

The Learnability of In-Context Learning

In-context learning is a surprising and important phenomenon that emerged when modern language models were scaled to billions of learned parameters. Without modifying a large language model's weights, it can be tuned to perform various…

Computation and Language · Computer Science 2023-03-15 Noam Wies, Yoav Levine, Amnon Shashua

Bootstrapping Lexical Choice via Multiple-Sequence Alignment

An important component of any generation system is the mapping dictionary, a lexicon of elementary semantic expressions and corresponding natural language realizations. Typically, labor-intensive knowledge-based methods are used to…

Computation and Language · Computer Science 2007-05-23 Regina Barzilay, Lillian Lee

Jointly Learning Sentence Embeddings and Syntax with Unsupervised Tree-LSTMs

We introduce a neural network that represents sentences by composing their words according to induced binary parse trees. We use Tree-LSTM as our composition function, applied along a tree structure found by a fully differentiable natural…

Computation and Language · Computer Science 2020-01-16 Jean Maillard, Stephen Clark, Dani Yogatama

BLSP: Bootstrapping Language-Speech Pre-training via Behavior Alignment of Continuation Writing

The emergence of large language models (LLMs) has sparked significant interest in extending their remarkable language capabilities to speech. However, modality alignment between speech and text still remains an open problem. Current…

Computation and Language · Computer Science 2024-05-29 Chen Wang, Minpeng Liao, Zhongqiang Huang, Jinliang Lu, Junhong Wu, Yuchen Liu, Chengqing Zong, Jiajun Zhang

Learning from Bootstrapping and Stepwise Reinforcement Reward: A Semi-Supervised Framework for Text Style Transfer

Text style transfer is an important task in controllable language generation. Supervised approaches have pushed performance improvement on style-oriented rewriting such as formality conversion. However, challenges remain due to the scarcity…

Computation and Language · Computer Science 2022-05-20 Zhengyuan Liu, Nancy F. Chen

The Problem of Alignment

Large Language Models produce sequences learned as statistical patterns from large corpora. In order not to reproduce corpus biases, after initial training models must be aligned with human values, preferencing certain continuations over…

Computation and Language · Computer Science 2024-01-02 Tsvetelina Hristova, Liam Magee, Karen Soldatic

Learning Language Structures through Grounding

Language is highly structured, with syntactic and semantic structures, to some extent, agreed upon by speakers of the same language. With implicit or explicit awareness of such structures, humans can learn and use language efficiently and…

Computation and Language · Computer Science 2024-10-23 Freda Shi