Related papers: On aligning trees

How compatible are our discourse annotations? Insights from mapping RST-DT and PDTB annotations

Discourse-annotated corpora are an important resource for the community, but they are often annotated according to different frameworks. This makes comparison of the annotations difficult, thereby also preventing researchers from searching…

Computation and Language · Computer Science 2018-03-16 Vera Demberg, Fatemeh Torabi Asr, Merel Scholman

A Model for Fine-Grained Alignment of Multilingual Texts

While alignment of texts on the sentential level is often seen as being too coarse, and word alignment as being too fine-grained, bi- or multilingual texts which are aligned on a level in-between are a useful resource for many purposes.…

Computation and Language · Computer Science 2007-05-23 Lea Cyrus, Hendrik Feddes

An Integrated Framework for Treebanks and Multilayer Annotations

Treebank formats and associated software tools are proliferating rapidly, with little consideration for interoperability. We survey a wide variety of treebank structures and operations, and show how they can be mapped onto the annotation…

Computation and Language · Computer Science 2007-05-23 Scott Cotton, Steven Bird

A Concise Query Language with Search and Transform Operations for Corpora with Multiple Levels of Annotation

The usefulness of annotated corpora is greatly increased if there is an associated tool that can allow various kinds of operations to be performed in a simple way. Different kinds of annotation frameworks and many query languages for them…

Computation and Language · Computer Science 2011-08-10 Anil Kumar Singh

Learning to Compose Words into Sentences with Reinforcement Learning

We use reinforcement learning to learn tree-structured neural networks for computing representations of natural language sentences. In contrast with prior work on tree-structured models in which the trees are either provided as input or…

Computation and Language · Computer Science 2016-11-29 Dani Yogatama, Phil Blunsom, Chris Dyer, Edward Grefenstette, Wang Ling

An Annotation Scheme for Free Word Order Languages

We describe an annotation scheme and a tool developed for creating linguistically annotated corpora for non-configurational languages. Since the requirements for such a formalism differ from those posited for configurational languages,…

cmp-lg · Computer Science 2008-02-03 Wojciech Skut, Brigitte Krenn, Thorsten Brants, Hans Uszkoreit

Counting, generating and sampling tree alignments

Pairwise ordered tree alignments are combinatorial objects that appear in RNA secondary structure comparison. However, the usual representation of tree alignments as supertrees is ambiguous, i.e. two distinct supertrees may induce identical…

Quantitative Methods · Quantitative Biology 2016-03-08 Cedric Chauve, Julien Courtiel, Yann Ponty

Annotating Predicate-Argument Structure for a Parallel Treebank

We report on a recently initiated project which aims at building a multi-layered parallel treebank of English and German. Particular attention is devoted to a dedicated predicate-argument layer which is used for aligning translationally…

Computation and Language · Computer Science 2007-05-23 Lea Cyrus, Hendrik Feddes, Frank Schumacher

Analyzing Text Representations under Tight Annotation Budgets: Measuring Structural Alignment

Annotating large collections of textual data can be time consuming and expensive. That is why the ability to train models with limited annotation budgets is of great importance. In this context, it has been shown that under tight annotation…

Computation and Language · Computer Science 2022-10-13 César González-Gutiérrez, Audi Primadhanty, Francesco Cazzaro, Ariadna Quattoni

Exploiting Syntactic Structure for Better Language Modeling: A Syntactic Distance Approach

It is commonly believed that knowledge of syntactic structure should improve language modeling. However, effectively and computationally efficiently incorporating syntactic structure into neural language models has been a challenging topic.…

Computation and Language · Computer Science 2020-05-13 Wenyu Du, Zhouhan Lin, Yikang Shen, Timothy J. O'Donnell, Yoshua Bengio, Yue Zhang

Structured Learning from Partial Annotations

Structured learning is appropriate when predicting structured outputs such as trees, graphs, or sequences. Most prior work requires the training set to consist of complete trees, graphs or sequences. Specifying such detailed ground truth…

Machine Learning · Computer Science 2012-07-03 Xinghua Lou, Fred Hamprecht

Inducing Alignment Structure with Gated Graph Attention Networks for Sentence Matching

Sentence matching is a fundamental task of natural language processing with various applications. Most recent approaches adopt attention-based neural models to build word- or phrase-level alignment between two sentences. However, these…

Computation and Language · Computer Science 2021-10-22 Peng Cui, Le Hu, Yuanchao Liu

Towards Unification of Discourse Annotation Frameworks

Discourse information is difficult to represent and annotate. Among the major frameworks for annotating discourse information, RST, PDTB and SDRT are widely discussed and used, each having its own theoretical foundation and focus. Corpora…

Computation and Language · Computer Science 2022-04-19 Yingxue Fu

Capturing divergence in dependency trees to improve syntactic projection

Obtaining syntactic parses is a crucial part of many NLP pipelines. However, most of the world's languages do not have large amounts of syntactically annotated corpora available for building parsers. Syntactic projection techniques attempt…

Computation and Language · Computer Science 2016-05-17 Ryan Georgi, Fei Xia, William D. Lewis

On Tree-Based Neural Sentence Modeling

Neural networks with tree-based sentence encoders have shown better results on many downstream tasks. Most existing tree-based encoders adopt syntactic parsing trees as the explicit structure prior. To study the effectiveness of…

Computation and Language · Computer Science 2018-08-30 Haoyue Shi, Hao Zhou, Jiaze Chen, Lei Li

Automatic Alignment of Discourse Relations of Different Discourse Annotation Frameworks

Existing discourse corpora are annotated based on different frameworks, which show significant dissimilarities in definitions of arguments and relations and structural constraints. Despite surface differences, these frameworks share basic…

Computation and Language · Computer Science 2024-04-09 Yingxue Fu

Generating Information Extraction Patterns from Overlapping and Variable Length Annotations using Sequence Alignment

Sequence alignments are used to capture patterns composed of elements representing multiple conceptual levels through the alignment of sequences that contain overlapping and variable length annotations. The alignments also determine the…

Computation and Language · Computer Science 2019-09-19 Frank Meng, Craig A. Morioka, Danne C. Elbers

Coordination Annotation Extension in the Penn Tree Bank

Coordination is an important and common syntactic construction which is not handled well by state-of-the-art parsers. Coordinations in the Penn Treebank are missing internal structure in many cases, do not include explicit marking of the…

Computation and Language · Computer Science 2016-06-09 Jessica Ficler, Yoav Goldberg

One model, two languages: training bilingual parsers with harmonized treebanks

We introduce an approach to train lexicalized parsers using bilingual corpora obtained by merging harmonized treebanks of different languages, producing parsers that can analyze sentences in either of the learned languages, or even…

Computation and Language · Computer Science 2016-05-20 David Vilares, Carlos Gómez-Rodríguez, Miguel A. Alonso

Counting trees: A treebank-driven exploration of syntactic variation in speech and writing across languages

This paper presents a novel treebank-driven approach to comparing syntactic structures in speech and writing using dependency-parsed corpora. Adopting a fully inductive, bottom-up method, we define syntactic structures as delexicalized…

Computation and Language · Computer Science 2026-02-24 Kaja Dobrovoljc