Related papers: Natural Language Processing (almost) from Scratch
Natural language processing covers a wide variety of tasks predicting syntax, semantics, and information content, and usually each type of output is generated with specially designed architectures. In this paper, we provide the simple…
This paper presents a novel method that allows a machine learning algorithm following the transformation-based learning paradigm \cite{brill95:tagging} to be applied to multiple classification tasks by training jointly and simultaneously on…
Natural language processing (NLP) tasks (e.g. question-answering in English) benefit from knowledge of other tasks (e.g. named entity recognition in English) and knowledge of other languages (e.g. question-answering in Spanish). Such shared…
With a focus on natural language processing (NLP) and the role of large language models (LLMs), we explore the intersection of machine learning, deep learning, and artificial intelligence. As artificial intelligence continues to…
Natural language processing for programming aims to use NLP techniques to assist programming. It is increasingly prevalent for its effectiveness in improving productivity. Distinct from natural language, a programming language is highly…
In Natural Language Processing (NLP), one traditionally considers a single task (e.g. part-of-speech tagging) for a single language (e.g. English) at a time. However, recent work has shown that it can be beneficial to take advantage of…
Sequence labeling (SL) is a fundamental research problem encompassing a variety of tasks, e.g., part-of-speech (POS) tagging, named entity recognition (NER), text chunking, etc. Though prevalent and effective in many downstream applications…
The design of complex engineering systems is an often long and articulated process that highly relies on engineers' expertise and professional judgment. As such, the typical pitfalls of activities involving the human factor often manifest…
Sequence labeling is an important technique employed for many Natural Language Processing (NLP) tasks, such as Named Entity Recognition (NER), slot tagging for dialog systems and semantic parsing. Large-scale pre-trained language models…
Semi-supervised classification is an interesting idea where classification models are learned from both labeled and unlabeled data. It has several advantages over supervised classification in natural language processing domain. For…
Artificial Intelligence (AI) has huge impact on our daily lives with applications such as voice assistants, facial recognition, chatbots, autonomously driving cars, etc. Natural Language Processing (NLP) is a cross-discipline of AI and…
We describe an LSTM-based model which we call Byte-to-Span (BTS) that reads text as bytes and outputs span annotations of the form [start, length, label] where start positions, lengths, and labels are separate entries in our vocabulary.…
We present a deep hierarchical recurrent neural network for sequence tagging. Given a sequence of words, our model employs deep gated recurrent units on both character and word levels to encode morphology and context information, and…
Linguistic sequence labeling is a general modeling approach that encompasses a variety of problems, such as part-of-speech tagging and named entity recognition. Recent advances in neural networks (NNs) make it possible to build reliable…
Data processing is an important step in various natural language processing tasks. As the commonly used datasets in named entity recognition contain only a limited number of samples, it is important to obtain additional labeled data in an…
Many natural language understanding (NLU) tasks, such as shallow parsing (i.e., text chunking) and semantic slot filling, require the assignment of representative labels to the meaningful chunks in a sentence. Most of the current deep…
Deep neural networks excel at learning from labeled data and achieve state-of-the-art resultson a wide array of Natural Language Processing tasks. In contrast, learning from unlabeled data, especially under domain shift, remains a…
Bidirectional Long Short-Term Memory Recurrent Neural Network (BLSTM-RNN) has been shown to be very effective for modeling and predicting sequential data, e.g. speech utterances or handwritten documents. In this study, we propose to use…
Concept tagging is a type of structured learning needed for natural language understanding (NLU) systems. In this task, meaning labels from a domain ontology are assigned to word sequences. In this paper, we review the algorithms developed…
Learning to respond to voice-text input involves the subject's ability in understanding the phonetic and text based contents and his/her ability to communicate based on his/her experience. The neuro-cognitive facility of the subject has to…