Related papers: Graph Algorithms for Multiparallel Word Alignment
After a period of decrease, interest in word alignments is increasing again for their usefulness in domains such as typological research, cross-lingual annotation projection, and machine translation. Generally, alignment algorithms only use…
Word alignment over parallel corpora has a wide variety of applications, including learning translation lexicons, cross-lingual transfer of language processing tools, and automatic evaluation or analysis of translation outputs. The great…
We present a family of neural-network--inspired models for computing continuous word representations, specifically designed to exploit both monolingual and multilingual text. This framework allows us to perform unsupervised training of…
Graph matching, also known as network alignment, refers to finding a bijection between the vertex sets of two given graphs so as to maximally align their edges. This fundamental computational problem arises frequently in multiple fields…
Neural embedding approaches have become a staple in the fields of computer vision, natural language processing, and more recently, graph analytics. Given the pervasive nature of these algorithms, the natural question becomes how to exploit…
While alignment of texts on the sentential level is often seen as being too coarse, and word alignment as being too fine-grained, bi- or multilingual texts which are aligned on a level in-between are a useful resource for many purposes.…
Word alignment has proven to benefit many-to-many neural machine translation (NMT). However, high-quality ground-truth bilingual dictionaries were used for pre-editing in previous methods, which are unavailable for most language pairs.…
Word embeddings, which represent a word as a point in a vector space, have become ubiquitous to several NLP tasks. A recent line of work uses bilingual (two languages) corpora to learn a different vector for each sense of a word, by…
Sentence matching is a fundamental task of natural language processing with various applications. Most recent approaches adopt attention-based neural models to build word- or phrase-level alignment between two sentences. However, these…
Word alignments identify translational correspondences between words in a parallel sentence pair and is used, for instance, to learn bilingual dictionaries, to train statistical machine translation systems , or to perform quality…
While neural machine translation (NMT) has become the new paradigm, the parameter optimization requires large-scale parallel data which is scarce in many domains and language pairs. In this paper, we address a new translation scenario in…
In this paper we describe an algorithm for aligning sentences with their translations in a bilingual corpus using lexical information of the languages. Existing efficient algorithms ignore word identities and consider only the sentence…
Graph matching consists of aligning the vertices of two unlabeled graphs in order to maximize the shared structure across networks; when the graphs are unipartite, this is commonly formulated as minimizing their edge disagreements. In this…
In this paper, we present an adaptive bitextual alignment system called AIlign. This aligner relies on sentence embeddings to extract reliable anchor points that can guide the alignment path, even for texts whose parallelism is fragmentary…
Translations capture important information about languages that can be used as implicit supervision in learning linguistic properties and semantic representations. In an information-centric view, translated texts may be considered as…
Graph-structured data is prevalent in the real world. Recently, due to the powerful emergent capabilities, Large Language Models (LLMs) have shown promising performance in modeling graphs. The key to effectively applying LLMs on graphs is…
Conformal prediction has become increasingly popular for quantifying the uncertainty associated with machine learning models. Recent work in graph uncertainty quantification has built upon this approach for conformal graph prediction. The…
Text alignment is crucial to the accuracy of Machine Translation (MT) systems, some NLP tools or any other text processing tasks requiring bilingual data. This research proposes a language independent sentence alignment approach based on…
Large language models (LLMs) have presented significant opportunities to enhance various machine learning applications, including graph neural networks (GNNs). By leveraging the vast open-world knowledge within LLMs, we can more effectively…
Distributed representations of meaning are a natural way to encode covariance relationships between words and phrases in NLP. By overcoming data sparsity problems, as well as providing information about semantic relatedness which is not…