Related papers: Two Questions about Data-Oriented Parsing

Data-Oriented Language Processing. An Overview

During the last few years, a new approach to language processing has started to emerge, which has become known under various labels such as "data-oriented parsing", "corpus-based interpretation", and "tree-bank grammar" (cf. van den Berg et…

cmp-lg · Computer Science 2008-02-03 Rens Bod, Remko Scha

Efficient Algorithms for Parsing the DOP Model

Excellent results have been reported for Data-Oriented Parsing (DOP) of natural language texts (Bod, 1993). Unfortunately, existing algorithms are both computationally intensive and difficult to implement. Previous algorithms are expensive…

cmp-lg · Computer Science 2008-02-03 Joshua Goodman

Aspects of Pattern-Matching in Data-Oriented Parsing

Data-Oriented Parsing (DOP) ranks among the best parsing schemes, pairing state-of-the-art parsing accuracy with the psycholinguistic insight that larger chunks of syntactic structures are relevant grammatical and probabilistic units. Parsing…

Computation and Language · Computer Science 2007-05-23 Guy De Pauw

A Data-Oriented Approach to Semantic Interpretation

In Data-Oriented Parsing (DOP), an annotated language corpus is used as a stochastic grammar. The most probable analysis of a new input sentence is constructed by combining sub-analyses from the corpus in the most probable way. This…

cmp-lg · Computer Science 2008-02-03 Rens Bod, Remko Bonnema, Remko Scha

Combining semantic and syntactic structure for language modeling

Structured language models for speech recognition have been shown to remedy the weaknesses of n-gram models. All current structured language models are, however, limited in that they do not take into account dependencies between…

Computation and Language · Computer Science 2007-05-23 Rens Bod

Learning Efficient Disambiguation

This dissertation analyses the computational properties of current performance-models of natural language parsing, in particular Data Oriented Parsing (DOP), points out some of their major shortcomings and suggests suitable solutions. It…

Computation and Language · Computer Science 2007-05-23 Khalil Sima'an

ListOps: A Diagnostic Dataset for Latent Tree Learning

Latent tree learning models learn to parse a sentence without syntactic supervision, and use that parse to build the sentence representation. Existing work on such models has shown that, while they perform well on tasks like sentence…

Computation and Language · Computer Science 2018-04-18 Nikita Nangia, Samuel R. Bowman

Ending-based Strategies for Part-of-speech Tagging

Probabilistic approaches to part-of-speech tagging rely primarily on whole-word statistics about word/tag combinations as well as contextual information. But experience shows about 4 per cent of tokens encountered in test sets are unknown…

Computation and Language · Computer Science 2013-02-28 Greg Adams, Beth Millar, Eric Neufeld, Tim Philip

Learning from Uncurated Regular Expressions

Significant work has been done on learning regular expressions from a set of data values. Depending on the domain, this approach can be very successful. However, significant time is required to learn these expressions and the resulting…

Databases · Computer Science 2024-03-18 Michael J. Mior

DOP: Deep Optimistic Planning with Approximate Value Function Evaluation

Research on reinforcement learning has demonstrated promising results in manifold applications and domains. Still, efficiently learning effective robot behaviors is very difficult, due to unstructured scenarios, high uncertainties, and…

Robotics · Computer Science 2018-03-26 Francesco Riccio, Roberto Capobianco, Daniele Nardi

Learning to Paraphrase Sentences to Different Complexity Levels

While sentence simplification is an active research topic in NLP, its adjacent tasks of sentence complexification and same-level paraphrasing are not. To train models on all three tasks, we present two new unsupervised datasets. We compare…

Computation and Language · Computer Science 2023-11-22 Alison Chi, Li-Kuang Chen, Yi-Chen Chang, Shu-Hui Lee, Jason S. Chang

Big Data Small Data, In Domain Out-of Domain, Known Word Unknown Word: The Impact of Word Representation on Sequence Labelling Tasks

Word embeddings -- distributed word representations that can be learned from unlabelled data -- have been shown to have high utility in many natural language processing applications. In this paper, we perform an extrinsic evaluation of five…

Computation and Language · Computer Science 2015-05-21 Lizhen Qu, Gabriela Ferraro, Liyuan Zhou, Weiwei Hou, Nathan Schneider, Timothy Baldwin

On Tree-Based Neural Sentence Modeling

Neural networks with tree-based sentence encoders have shown better results on many downstream tasks. Most existing tree-based encoders adopt syntactic parsing trees as the explicit structure prior. To study the effectiveness of…

Computation and Language · Computer Science 2018-08-30 Haoyue Shi, Hao Zhou, Jiaze Chen, Lei Li

Assessing Data Efficiency in Task-Oriented Semantic Parsing

Data efficiency, despite being an attractive characteristic, is often challenging to measure and optimize for in task-oriented semantic parsing; unlike exact match, it can require both model- and domain-specific setups, which have,…

Computation and Language · Computer Science 2021-07-13 Shrey Desai, Akshat Shrivastava, Justin Rill, Brian Moran, Safiyyah Saleem, Alexander Zotov, Ahmed Aly

Learning Unification-Based Natural Language Grammars

When parsing unrestricted language, wide-covering grammars often undergenerate. Undergeneration can be tackled either by sentence correction, or by grammar correction. This thesis concentrates upon automatic grammar correction (or machine…

cmp-lg · Computer Science 2016-08-31 Miles Osborne

Conditional probing: measuring usable information beyond a baseline

Probing experiments investigate the extent to which neural representations make properties -- like part-of-speech -- predictable. One suggests that a representation encodes a property if probing that representation produces higher accuracy…

Computation and Language · Computer Science 2021-09-21 John Hewitt, Kawin Ethayarajh, Percy Liang, Christopher D. Manning

Parsing with the Shortest Derivation

Common wisdom has it that the bias of stochastic grammars in favor of shorter derivations of a sentence is harmful and should be redressed. We show that the common wisdom is wrong for stochastic grammars that use elementary trees instead of…

Computation and Language · Computer Science 2007-05-23 Rens Bod

The Word is Mightier than the Label: Learning without Pointillistic Labels using Data Programming

Most advanced supervised Machine Learning (ML) models rely on vast amounts of point-by-point labelled training examples. Hand-labelling vast amounts of data may be tedious, expensive, and error-prone. Recently, some studies have explored…

Machine Learning · Computer Science 2021-08-27 Chufan Gao, Mononito Goswami

Semi-Supervised Methods for Out-of-Domain Dependency Parsing

Dependency parsing is one of the important natural language processing tasks that assigns syntactic trees to texts. Due to the wider availability of dependency corpora and improved parsing and machine learning techniques, parsing accuracies…

Computation and Language · Computer Science 2018-10-05 Juntao Yu

Semantic-Oriented Unlabeled Priming for Large-Scale Language Models

Due to the high costs associated with finetuning large language models, various recent works propose to adapt them to specific tasks without any parameter updates through in-context learning. Unfortunately, for in-context learning there is…

Computation and Language · Computer Science 2022-02-15 Yanchen Liu, Timo Schick, Hinrich Schütze