Related papers: Bootstrapping Syntax and Recursion using Alignment…

Bootstrapping Structure into Language: Alignment-Based Learning

This thesis introduces a new unsupervised learning framework, called Alignment-Based Learning, which is based on the alignment of sentences and Harris's (1951) notion of substitutability. Instances of the framework can be applied to an…

Machine Learning · Computer Science 2007-05-23 Menno M. van Zaanen

Bootstrapping Structure using Similarity

In this paper a new similarity-based learning algorithm, inspired by string edit-distance (Wagner and Fischer, 1974), is applied to the problem of bootstrapping structure from scratch. The algorithm takes a corpus of unannotated sentences…

Machine Learning · Computer Science 2007-05-23 Menno van Zaanen

ABL: Alignment-Based Learning

This paper introduces a new type of grammar learning algorithm, inspired by string edit distance (Wagner and Fischer, 1974). The algorithm takes a corpus of flat sentences as input and returns a corpus of labelled, bracketed sentences. The…

Machine Learning · Computer Science 2007-05-23 Menno van Zaanen

An Algorithm for Aligning Sentences in Bilingual Corpora Using Lexical Information

In this paper we describe an algorithm for aligning sentences with their translations in a bilingual corpus using lexical information of the languages. Existing efficient algorithms ignore word identities and consider only the sentence…

Computation and Language · Computer Science 2007-05-23 Akshar Bharati, V. Sriram, A. Vamshi Krishna, Rajeev Sangal, S. M. Bendre

Unsupervised Cross-Modal Alignment of Speech and Text Embedding Spaces

Recent research has shown that word embedding spaces learned from text corpora of different languages can be aligned without any parallel data supervision. Inspired by the success in unsupervised cross-lingual word embeddings, in this paper…

Computation and Language · Computer Science 2018-09-24 Yu-An Chung, Wei-Hung Weng, Schrasing Tong, James Glass

Bootstrapping Lexical Choice via Multiple-Sequence Alignment

An important component of any generation system is the mapping dictionary, a lexicon of elementary semantic expressions and corresponding natural language realizations. Typically, labor-intensive knowledge-based methods are used to…

Computation and Language · Computer Science 2007-05-23 Regina Barzilay, Lillian Lee

SimCSE: Simple Contrastive Learning of Sentence Embeddings

This paper presents SimCSE, a simple contrastive learning framework that greatly advances state-of-the-art sentence embeddings. We first describe an unsupervised approach, which takes an input sentence and predicts itself in a contrastive…

Computation and Language · Computer Science 2022-05-19 Tianyu Gao, Xingcheng Yao, Danqi Chen

Learning Syntax from Naturally-Occurring Bracketings

Naturally-occurring bracketings, such as answer fragments to natural language questions and hyperlinks on webpages, can reflect human syntactic intuition regarding phrasal boundaries. Their availability and approximate correspondence to…

Computation and Language · Computer Science 2021-04-30 Tianze Shi, Ozan İrsoy, Igor Malioutov, Lillian Lee

Co-training an Unsupervised Constituency Parser with Weak Supervision

We introduce a method for unsupervised parsing that relies on bootstrapping classifiers to identify if a node dominates a specific span in a sentence. There are two types of classifiers, an inside classifier that acts on a span, and an…

Computation and Language · Computer Science 2022-03-22 Nickil Maveli, Shay B. Cohen

Collocational bootstrapping: A hypothesis about the learning of subject-verb agreement in humans and neural networks

In what ways might statistical signals in linguistic input assist with the acquisition of syntax? Here we hypothesize a mechanism called collocational bootstrapping, in which regularities in word co-occurrence patterns can provide cues to…

Computation and Language · Computer Science 2026-05-21 Claire Hobbs, R. Thomas McCoy

RankCSE: Unsupervised Sentence Representations Learning via Learning to Rank

Unsupervised sentence representation learning is one of the fundamental problems in natural language processing with various downstream applications. Recently, contrastive learning has been widely adopted which derives high-quality sentence…

Computation and Language · Computer Science 2023-05-29 Jiduan Liu, Jiahao Liu, Qifan Wang, Jingang Wang, Wei Wu, Yunsen Xian, Dongyan Zhao, Kai Chen, Rui Yan

REBORN: Reinforcement-Learned Boundary Segmentation with Iterative Training for Unsupervised ASR

Unsupervised automatic speech recognition (ASR) aims to learn the mapping between the speech signal and its corresponding textual transcription without the supervision of paired speech-text data. A word/phoneme in the speech signal is…

Audio and Speech Processing · Electrical Eng. & Systems 2024-11-18 Liang-Hsuan Tseng, En-Pei Hu, Cheng-Han Chiang, Yuan Tseng, Hung-yi Lee, Lin-shan Lee, Shao-Hua Sun

Using BERT Encoding and Sentence-Level Language Model for Sentence Ordering

Discovering the logical sequence of events is one of the cornerstones in Natural Language Understanding. One approach to learn the sequence of events is to study the order of sentences in a coherent text. Sentence ordering can be applied in…

Computation and Language · Computer Science 2021-08-26 Melika Golestani, Seyedeh Zahra Razavi, Zeinab Borhanifard, Farnaz Tahmasebian, Hesham Faili

A Bootstrap Algorithm for Fast Supervised Learning

Training a neural network (NN) typically relies on some type of curve-following method, such as gradient descent (GD) (and stochastic gradient descent (SGD)), ADADELTA, ADAM or limited memory algorithms. Convergence for these algorithms…

Machine Learning · Computer Science 2023-05-08 Michael A Kouritzin, Stephen Styles, Beatrice-Helen Vritsiou

Learning from Bootstrapping and Stepwise Reinforcement Reward: A Semi-Supervised Framework for Text Style Transfer

Text style transfer is an important task in controllable language generation. Supervised approaches have pushed performance improvement on style-oriented rewriting such as formality conversion. However, challenges remain due to the scarcity…

Computation and Language · Computer Science 2022-05-20 Zhengyuan Liu, Nancy F. Chen

Towards Unsupervised Automatic Speech Recognition Trained by Unaligned Speech and Text only

Automatic speech recognition (ASR) has been widely researched with supervised approaches, while many low-resourced languages lack audio-text aligned data, and supervised methods cannot be applied on them. In this work, we propose a…

Computation and Language · Computer Science 2018-08-14 Yi-Chen Chen, Chia-Hao Shen, Sung-Feng Huang, Hung-yi Lee

Learning to Paraphrase: An Unsupervised Approach Using Multiple-Sequence Alignment

We address the text-to-text generation problem of sentence-level paraphrasing -- a phenomenon distinct from and more difficult than word- or phrase-level paraphrasing. Our approach applies multiple-sequence alignment to sentences gathered…

Computation and Language · Computer Science 2007-05-23 Regina Barzilay, Lillian Lee

Contrastive Learning of Sentence Embeddings from Scratch

Contrastive learning has been the dominant approach to train state-of-the-art sentence embeddings. Previous studies have typically learned sentence embeddings either through the use of human-annotated natural language inference (NLI) data…

Computation and Language · Computer Science 2023-10-25 Junlei Zhang, Zhenzhong Lan, Junxian He

Unsupervised Sentence Textual Similarity with Compositional Phrase Semantics

Measuring Sentence Textual Similarity (STS) is a classic task that can be applied to many downstream NLP applications such as text generation and retrieval. In this paper, we focus on unsupervised STS that works on various domains but only…

Computation and Language · Computer Science 2022-10-06 Zihao Wang, Jiaheng Dou, Yong Zhang

Rationalizing Text Matching: Learning Sparse Alignments via Optimal Transport

Selecting input features of top relevance has become a popular method for building self-explaining models. In this work, we extend this selective rationalization approach to text matching, where the goal is to jointly select and align text…

Machine Learning · Computer Science 2020-05-28 Kyle Swanson, Lili Yu, Tao Lei