Related papers: A Straightforward Approach to Morphological Analys…

200 papers

Morpho-syntactic lexicons provide information about the morphological and syntactic roles of words in a language. Such lexicons are not available for all languages and even when available, their coverage can be limited. We present a…

Computation and Language · Computer Science · 2016-01-26 · Manaal Faruqui, Ryan McDonald, Radu Soricut

Grapheme-to-Phoneme (G2P) is an essential first step in any modern, high-quality Text-to-Speech (TTS) system. Most of the current G2P systems rely on carefully hand-crafted lexicons developed by experts. This poses a two-fold problem.…

Computation and Language · Computer Science · 2024-01-22 · Abhinav Garg, Jiyeon Kim, Sushil Khyalia, Chanwoo Kim, Dhananjaya Gowda

We propose the task of unsupervised morphological paradigm completion. Given only raw text and a lemma list, the task consists of generating the morphological paradigms, i.e., all inflected forms, of the lemmas. From a natural language…

Computation and Language · Computer Science · 2020-05-22 · Huiming Jin, Liwei Cai, Yihui Peng, Chen Xia, Arya D. McCarthy, Katharina Kann

Automatic morphological processing can aid downstream natural language processing applications, especially for low-resource languages, and assist language documentation efforts for endangered languages. Having long been multilingual, the…

Computation and Language · Computer Science · 2022-03-18 · Adam Wiemerslage, Miikka Silfverberg, Changbing Yang, Arya D. McCarthy, Garrett Nicolai, Eliana Colunga, Katharina Kann

This paper presents a morphological lexicon for English that handles more than 317000 inflected forms derived from over 90000 stems. The lexicon is available in two formats. The first can be used by an implementation of a two-level…

cmp-lg · Computer Science · 2008-02-03 · Daniel Karp, Yves Schabes, Martin Zaidel, Dania Egedi

We present an integrated architecture for word-level and sentence-level processing in a unification-based paradigm. The core of the system is a CLP implementation of a unification engine for feature structures supporting relational values.…

cmp-lg · Computer Science · 2008-02-03 · Harald Trost, Johannes Matiasek

We describe an automated method for identifying classes of morphologically related words in an on-line dictionary, and for linking individual senses in the derived form to one or more senses in the base form by means of morphological…

cmp-lg · Computer Science · 2008-02-03 · Joseph Pentheroudakis, Lucy Vanderwende, Microsoft Corporation

In this paper we present Morphy, an integrated tool for German morphology, part-of-speech tagging and context-sensitive lemmatization. Its large lexicon of more than 320,000 word forms plus its ability to process German compound nouns…

Computation and Language · Computer Science · 2007-05-23 · Wolfgang Lezius, Reinhard Rapp, Manfred Wettler

Fully data-driven, deep learning-based models are usually designed as language-independent and have been shown to be successful for many natural language processing tasks. However, when the studied language is low-resourced and the amount…

Computation and Language · Computer Science · 2022-09-21 · Şaziye Betül Özateş, Arzucan Özgür, Tunga Güngör, Balkız Öztürk

We consider construction of the suffix tree and the directed acyclic word graph (DAWG) indexing data structures for a collection $\mathcal{T}$ of texts, where a new symbol may be appended to any text in $\mathcal{T} = \{T_1, \ldots, T_K\}$,…

Data Structures and Algorithms · Computer Science · 2018-07-13 · Takuya Takagi, Shunsuke Inenaga, Hiroki Arimura, Dany Breslauer, Diptarama Hendrian

Morphological analysis involves predicting the syntactic traits of a word (e.g. {POS: Noun, Case: Acc, Gender: Fem}). Previous work in morphological tagging improves performance for low-resource languages (LRLs) through cross-lingual…

Computation and Language · Computer Science · 2018-07-12 · Chaitanya Malaviya, Matthew R. Gormley, Graham Neubig

Creating a descriptive grammar of a language is an indispensable step for language documentation and preservation. However, at the same time it is a tedious, time-consuming task. In this paper, we take steps towards automating this process…

Computation and Language · Computer Science · 2020-10-07 · Aditi Chaudhary, Antonios Anastasopoulos, Adithya Pratapa, David R. Mortensen, Zaid Sheikh, Yulia Tsvetkov, Graham Neubig

Morphological analysis and disambiguation is an important task and a crucial preprocessing step in natural language processing of morphologically rich languages. Kinyarwanda, a morphologically rich language, currently lacks tools for…

Computation and Language · Computer Science · 2022-03-18 · Antoine Nzeyimana

A model for the full treatment of Spanish inflection for verbs, nouns and adjectives is presented. This model is based on feature unification and it relies upon a lexicon of allomorphs both for stems and morphemes. Word forms are built by…

cmp-lg · Computer Science · 2016-08-15 · Antonio Moreno, José M. Goñi

Large language models (LLMs) exhibit strong semantic understanding, yet struggle when user instructions involve ambiguous or conceptually misaligned terms. We propose the Language Graph Model (LGM) to enhance conceptual clarity by…

Computation and Language · Computer Science · 2025-11-06 · Wenchang Lei, Ping Zou, Yue Wang, Feng Sun, Lei Zhao

Recent works on form understanding mostly employ multimodal transformers or large-scale pre-trained language models. These models need ample data for pre-training. In contrast, humans can usually identify key-value pairings from a form only…

Computation and Language · Computer Science · 2023-05-09 · Bhanu Prakash Voutharoja, Lizhen Qu, Fatemeh Shiri

Canonical morphological segmentation is the process of analyzing words into the standard (aka underlying) forms of their constituent morphemes. This is a core task in language documentation, and NLP systems have the potential to…

Computation and Language · Computer Science · 2024-10-16 · Enora Rice, Ali Marashian, Luke Gessler, Alexis Palmer, Katharina von der Wense

In some contexts, well-formed natural language cannot be expected as input to information or communication systems. In these contexts, the use of grammar-independent input (sequences of uninflected semantic units like e.g.…

Computation and Language · Computer Science · 2007-05-23 · Pascal Vaillant

Within the context of the mathematical formulation of Merge and the Strong Minimalist Thesis, we present a mathematical model of the morphology-syntax interface. In this setting, morphology has compositional properties responsible for word…

Computation and Language · Computer Science · 2025-07-02 · Isabella Senturia, Matilde Marcolli

We argue that grammatical analysis is a viable alternative to concept spotting for processing spoken input in a practical spoken dialogue system. We discuss the structure of the grammar, and a model for robust parsing which combines…

Computation and Language · Computer Science · 2016-08-31 · Gertjan van Noord, Gosse Bouma, Rob Koeling, Mark-Jan Nederhof