Related papers: Morphology with a Null-Interface
Computational morphology handles the language processing at the word level. It is one of the foundational tasks in the NLP pipeline for the development of higher level NLP applications. It mainly deals with the processing of words and word…
This paper defines unification based ID/LP grammars based on typed feature structures as nonterminals and proposes a variant of Earley's algorithm to decide whether a given input sentence is a member of the language generated by a…
This paper describes an abstract machine for linguistic formalisms that are based on typed feature structures, such as HPSG. The core design of the abstract machine is given in detail, including the compilation process from a high-level…
In this paper we present a lexicon-based approach to the problem of morphological processing. Full-form words, lemmas and grammatical tags are interconnected in a DAWG. Thus, the process of analysis/synthesis is reduced to a search in the…
This paper presents a joint model for performing unsupervised morphological analysis on words, and learning a character-level composition function from morphemes to word embeddings. Our model splits individual words into segments, and…
Automatic morphological processing can aid downstream natural language processing applications, especially for low-resource languages, and assist language documentation efforts for endangered languages. Having long been multilingual, the…
A model for the full treatment of Spanish inflection for verbs, nouns and adjectives is presented. This model is based on feature unification and it relies upon a lexicon of allomorphs both for stems and morphemes. Word forms are built by…
Collecting and annotating morphological data present significant challenges, requiring linguistic expertise, methodological rigour, and substantial resources. These barriers are particularly acute for low-resource languages and varieties.…
This paper presents a scalable method for integrating compositional morphological representations into a vector-based probabilistic language model. Our approach is evaluated in the context of log-bilinear language models, rendered suitably…
The Universal Morphology UniMorph project is a collaborative effort to improve how NLP handles complex morphology across the world's languages. The project releases annotated morphological data using a universal tagset, the UniMorph schema.…
This paper describes a first step towards the definition of an abstract machine for linguistic formalisms that are based on typed feature structures, such as HPSG. The core design of the abstract machine is given in detail, including the…
This paper focuses on unsupervised modeling of morphological families, collectively comprising a forest over the language vocabulary. This formulation enables us to capture edgewise properties reflecting single-step morphological…
This technical report describes a new prototype architecture designed to integrate top-down and bottom-up analysis of non-standard linguistic input, where a semantic model of the context of an utterance is used to guide the analysis of the…
This research introduces a new parsing approach, based on earlier syntactic work on context free grammar (CFG) and generalized phrase structure grammar (GPSG). The approach comprises both a new parsing algorithm and a set of syntactic rules…
Linguistic typology aims to capture structural and semantic variation across the world's languages. A large-scale typology could provide excellent guidance for multilingual Natural Language Processing (NLP), particularly for languages that…
Large language models trained predominantly on high-resource languages exhibit systematic biases toward dominant typological patterns, leading to structural non-conformance when translating into typologically divergent low-resource…
Morphological tasks use large multi-lingual datasets that organize words into inflection tables, which then serve as training and evaluation data for various tasks. However, a closer inspection of these data reveals profound…
I describe a compiler and development environment for feature-augmented two-level morphology rules integrated into a full NLP system. The compiler is optimized for a class of languages including many or most European ones, and for rapid…
We describe an automated method for identifying classes of morphologically related words in an on-line dictionary, and for linking individual senses in the derived form to one or more senses in the base form by means of morphological…
We propose the task of unsupervised morphological paradigm completion. Given only raw text and a lemma list, the task consists of generating the morphological paradigms, i.e., all inflected forms, of the lemmas. From a natural language…