Related papers: Standards for Language Resources

200 papers

The goal of this paper is two-fold: to present an abstract data model for linguistic annotations and its implementation using XML, RDF and related standards; and to outline the work of a newly formed committee of the International Standards…

Computation and Language · Computer Science 2009-11-11 Nancy Ide, Laurent Romary

This paper describes the Linguistic Annotation Framework under development within ISO TC37 SC4 WG1. The Linguistic Annotation Framework is intended to serve as a basis for harmonizing existing language resources as well as developing new…

Computation and Language · Computer Science 2007-07-24 Laurent Romary, Nancy Ide

It is widely recognized that the proliferation of annotation schemes runs counter to the need to re-use language resources, and that standards for linguistic annotation are becoming increasingly mandatory. To answer this need, we have…

Computation and Language · Computer Science 2009-09-16 Nancy Ide, Laurent Romary, Tomaz Erjavec

This paper provides an overview of the various projects carried out within ISO committee TC 37/SC 4 dealing with the management of language (digital) resources. On the basis of the technical experience gained in the committee and the wider…

Computation and Language · Computer Science 2015-10-28 Laurent Romary

'Linguistic annotation' covers any descriptive or analytic notations applied to raw language data. The basic data may be in the form of time functions -- audio, video and/or physiological recordings -- or it may be textual. The added…

Computation and Language · Computer Science 2007-05-23 Steven Bird, Mark Liberman

'Linguistic annotation' covers any descriptive or analytic notations applied to raw language data. The basic data may be in the form of time functions - audio, video and/or physiological recordings - or it may be textual. The added…

Computation and Language · Computer Science 2007-05-23 Steven Bird, Mark Liberman

Data annotation and synthesis generally refers to the labeling or generating of raw data with relevant information, which could be used for improving the efficacy of machine learning models. The process, however, is labor-intensive and…

Computation and Language · Computer Science 2024-12-04 Zhen Tan, Dawei Li, Song Wang, Alimohammad Beigi, Bohan Jiang, Amrita Bhattacharjee, Mansooreh Karami, Jundong Li, Lu Cheng, Huan Liu

Annotation graphs and annotation servers offer infrastructure to support the analysis of human language resources in the form of time-series data such as text, audio and video. This paper outlines areas of common need among empirical…

Computation and Language · Computer Science 2007-05-23 Christopher Cieri, Steven Bird

This study introduces a prescriptive annotation benchmark grounded in humanities research to ensure consistent, unbiased labeling of offensive language, particularly for casual and non-mainstream language uses. We contribute two newly…

Computation and Language · Computer Science 2024-10-18 Xinmeng Hou

This paper addresses the harmonization of metadata from diverse repositories of language resources (LRs). Leveraging linked data and RDF techniques, we integrate data from multiple sources into a unified model based on DCAT and META-SHARE…

Computation and Language · Computer Science 2025-01-13 Zixuan Liang

Many formal languages have been proposed to express or represent Ontologies, including RDF, RDFS, DAML+OIL and OWL. Most of these languages are based on XML syntax, but with various terminologies and expressiveness. Therefore, choosing a…

Artificial Intelligence · Computer Science 2010-06-24 Mohammad Mustafa Taye

Problems faced by international standardization bodies become more and more crucial as the number and the size of the standards they produce increase. Sometimes, also, the lack of coordination among the committees in charge of the…

Software Engineering · Computer Science 2018-06-19 A. F. Cutting-Decelle, A. Digeon, R. I. Young, J. L. Barraud, P. Lamboley

Human annotation of natural language facilitates standardized evaluation of natural language processing systems and supports automated feature extraction. This document consists of instructions for annotating the temporal information in…

cmp-lg · Computer Science 2016-08-31 Tom O'Hara, Janyce Wiebe, Karen Payne

The annotation of textual information is a fundamental activity in Linguistics and Computational Linguistics. This article presents various observations on annotations. It approaches the topic from several angles including Hypertext,…

Computation and Language · Computer Science 2020-04-23 Georg Rehm

Task-oriented conversational datasets often lack topic variability and linguistic diversity. However, with the advent of Large Language Models (LLMs) pretrained on extensive, multilingual and diverse text data, these limitations seem…

Low-resource languages face significant barriers in AI development due to limited linguistic resources and expertise for data labeling, rendering them rare and costly. The scarcity of data and the absence of preexisting tools exacerbate…

Computation and Language · Computer Science 2024-06-25 Nataliia Kholodna, Sahib Julka, Mohammad Khodadadi, Muhammed Nurullah Gumus, Michael Granitzer

The paper describes the ALVIS annotation format designed for the indexing of large collections of documents in topic-specific search engines. This paper is exemplified on the biological domain and on MedLine abstracts, as developing a…

Artificial Intelligence · Computer Science 2016-08-16 Adeline Nazarenko, Erick Alphonse, Julien Derivière, Thierry Hamon, Guillaume Vauvert, Davy Weissenbacher

This paper introduces a novel annotation framework for the fine-grained modeling of Noun Phrases' (NPs) genericity in natural language. The framework is designed to be simple and intuitive, making it accessible to non-expert annotators and…

Computation and Language · Computer Science 2024-04-02 Claudia Collacciani, Andrea Amelio Ravelli, Marianna Marcella Bolognesi

The International Standards Organization (ISO) is developing a new standard for Graph Query Language, with a particular focus on graph patterns with repeating paths. The Linked Database Benchmark Council (LDBC) has developed benchmarks to…

Databases · Computer Science 2024-07-16 Malcolm Crowe, Fritz Laux

SMCalFlow is a large corpus of semantically detailed annotations of task-oriented natural dialogues. The annotations use a dataflow approach, in which the annotations are programs which represent user requests. Despite the availability,…

Computation and Language · Computer Science 2022-06-29 Joram Meron