Related papers: Standards for Language Resources

Standards for Language Resources

The goal of this paper is two-fold: to present an abstract data model for linguistic annotations and its implementation using XML, RDF and related standards; and to outline the work of a newly formed committee of the International Standards…

Computation and Language · Computer Science 2009-11-11 Nancy Ide, Laurent Romary

International Standard for a Linguistic Annotation Framework

This paper describes the Linguistic Annotation Framework under development within ISO TC37 SC4 WG1. The Linguistic Annotation Framework is intended to serve as a basis for harmonizing existing language resources as well as developing new…

Computation and Language · Computer Science 2007-07-24 Laurent Romary, Nancy Ide

A Common XML-based Framework for Syntactic Annotations

It is widely recognized that the proliferation of annotation schemes runs counter to the need to re-use language resources, and that standards for linguistic annotation are becoming increasingly mandatory. To answer this need, we have…

Computation and Language · Computer Science 2009-09-16 Nancy Ide, Laurent Romary, Tomaz Erjavec

Standards for language resources in ISO -- Looking back at 13 fruitful years

This paper provides an overview of the various projects carried out within ISO committee TC 37/SC 4 dealing with the management of language (digital) resources. On the basis of the technical experience gained in the committee and the wider…

Computation and Language · Computer Science 2015-10-28 Laurent Romary

A Formal Framework for Linguistic Annotation

'Linguistic annotation' covers any descriptive or analytic notations applied to raw language data. The basic data may be in the form of time functions -- audio, video and/or physiological recordings -- or it may be textual. The added…

Computation and Language · Computer Science 2007-05-23 Steven Bird, Mark Liberman

A Formal Framework for Linguistic Annotation (revised version)

'Linguistic annotation' covers any descriptive or analytic notations applied to raw language data. The basic data may be in the form of time functions - audio, video and/or physiological recordings - or it may be textual. The added…

Computation and Language · Computer Science 2007-05-23 Steven Bird, Mark Liberman

Large Language Models for Data Annotation and Synthesis: A Survey

Data annotation and synthesis generally refers to the labeling or generating of raw data with relevant information, which could be used for improving the efficacy of machine learning models. The process, however, is labor-intensive and…

Computation and Language · Computer Science 2024-12-04 Zhen Tan, Dawei Li, Song Wang, Alimohammad Beigi, Bohan Jiang, Amrita Bhattacharjee, Mansooreh Karami, Jundong Li, Lu Cheng, Huan Liu

Annotation Graphs and Servers and Multi-Modal Resources: Infrastructure for Interdisciplinary Education, Research and Development

Annotation graphs and annotation servers offer infrastructure to support the analysis of human language resources in the form of time-series data such as text, audio and video. This paper outlines areas of common need among empirical…

Computation and Language · Computer Science 2007-05-23 Christopher Cieri, Steven Bird

Mitigating Biases to Embrace Diversity: A Comprehensive Annotation Benchmark for Toxic Language

This study introduces a prescriptive annotation benchmark grounded in humanities research to ensure consistent, unbiased labeling of offensive language, particularly for casual and non-mainstream language uses. We contribute two newly…

Computation and Language · Computer Science 2024-10-18 Xinmeng Hou

Harmonizing Metadata of Language Resources for Enhanced Querying and Accessibility

This paper addresses the harmonization of metadata from diverse repositories of language resources (LRs). Leveraging linked data and RDF techniques, we integrate data from multiple sources into a unified model based on DCAT and META-SHARE…

Computation and Language · Computer Science 2025-01-13 Zixuan Liang

The State of the Art: Ontology Web-Based Languages: XML Based

Many formal languages have been proposed to express or represent Ontologies, including RDF, RDFS, DAML+OIL and OWL. Most of these languages are based on XML syntax, but with various terminologies and expressiveness. Therefore, choosing a…

Artificial Intelligence · Computer Science 2010-06-24 Mohammad Mustafa Taye

Extraction of Technical Information from Normative Documents Using Automated Methods Based on Ontologies: Application to the ISO 15531 MANDATE Standard - Methodology and First Results

Problems faced by international standardization bodies become more and more crucial as the number and the size of the standards they produce increase. Sometimes, also, the lack of coordination among the committees in charge of the…

Software Engineering · Computer Science 2018-06-19 A. F. Cutting-Decelle, A. Digeon, R. I. Young, J. L. Barraud, P. Lamboley

Instructions for Temporal Annotation of Scheduling Dialogs

Human annotation of natural language facilitates standardized evaluation of natural language processing systems and supports automated feature extraction. This document consists of instructions for annotating the temporal information in…

cmp-lg · Computer Science 2016-08-31 Tom O'Hara, Janyce Wiebe, Karen Payne

Observations on Annotations

The annotation of textual information is a fundamental activity in Linguistics and Computational Linguistics. This article presents various observations on annotations. It approaches the topic from several angles including Hypertext,…

Computation and Language · Computer Science 2020-04-23 Georg Rehm

Dialogue Quality and Emotion Annotations for Customer Support Conversations

Task-oriented conversational datasets often lack topic variability and linguistic diversity. However, with the advent of Large Language Models (LLMs) pretrained on extensive, multilingual and diverse text data, these limitations seem…

Computation and Language · Computer Science 2023-11-27 John Mendonça, Patrícia Pereira, Miguel Menezes, Vera Cabarrão, Ana C. Farinha, Helena Moniz, João Paulo Carvalho, Alon Lavie, Isabel Trancoso

LLMs in the Loop: Leveraging Large Language Model Annotations for Active Learning in Low-Resource Languages

Low-resource languages face significant barriers in AI development due to limited linguistic resources and expertise for data labeling, rendering them rare and costly. The scarcity of data and the absence of preexisting tools exacerbate…

Computation and Language · Computer Science 2024-06-25 Nataliia Kholodna, Sahib Julka, Mohammad Khodadadi, Muhammed Nurullah Gumus, Michael Granitzer

The ALVIS Format for Linguistically Annotated Documents

The paper describes the ALVIS annotation format designed for the indexing of large collections of documents in topic-specific search engines. This paper is exemplified on the biological domain and on MedLine abstracts, as developing a…

Artificial Intelligence · Computer Science 2016-08-16 Adeline Nazarenko, Erick Alphonse, Julien Derivière, Thierry Hamon, Guillaume Vauvert, Davy Weissenbacher

Specifying Genericity through Inclusiveness and Abstractness Continuous Scales

This paper introduces a novel annotation framework for the fine-grained modeling of Noun Phrases' (NPs) genericity in natural language. The framework is designed to be simple and intuitive, making it accessible to non-expert annotators and…

Computation and Language · Computer Science 2024-04-02 Claudia Collacciani, Andrea Amelio Ravelli, Marianna Marcella Bolognesi

Implementing the draft Graph Query Language Standard

The International Standards Organization (ISO) is developing a new standard for Graph Query Language, with a particular focus on graph patterns with repeating paths. The Linked Database Benchmark Council (LDBC) has developed benchmarks to…

Databases · Computer Science 2024-07-16 Malcolm Crowe, Fritz Laux

Simplifying Semantic Annotations of SMCalFlow

SMCalFlow is a large corpus of semantically detailed annotations of task-oriented natural dialogues. The annotations use a dataflow approach, in which the annotations are programs which represent user requests. Despite the availability,…

Computation and Language · Computer Science 2022-06-29 Joram Meron