Related papers: PyThaiNLP: Thai Natural Language Processing in Pyt…

200 papers

We present mahaNLP, an open-source natural language processing (NLP) library specifically built for the Marathi language. It aims to enhance the support for the low-resource Indian language Marathi in the field of NLP. It is an easy-to-use,…

Computation and Language · Computer Science 2023-11-07 Vidula Magdum, Omkar Dhekane, Sharayu Hiwarkhedkar, Saloni Mittal, Raviraj Joshi

The success of Pre-Trained Models (PTMs) has reshaped the development of Natural Language Processing (NLP). Yet, it is not easy to obtain high-performing models and deploy them online for industrial practitioners. To bridge this gap,…

Computation and Language · Computer Science 2023-03-14 Chengyu Wang, Minghui Qiu, Chen Shi, Taolin Zhang, Tingting Liu, Lei Li, Jianing Wang, Ming Wang, Jun Huang, Wei Lin

The Tajik language, written in Cyrillic script, remains severely under-resourced in terms of publicly available natural language processing (NLP) toolkits, hindering both linguistic research and applied development. This paper introduces…

Computation and Language · Computer Science 2026-05-29 Mullosharaf K. Arabov

Online data streams make training machine learning models hard because of distribution shift and new patterns emerging over time. For natural language processing (NLP) tasks that utilize a collection of features based on lexicons and rules,…

Computation and Language · Computer Science 2022-11-28 Shubhanshu Mishra, Jana Diesner

This paper presents a distributed platform for Natural Language Processing called PyPLN. PyPLN leverages a vast array of NLP and text processing open source tools, managing the distribution of the workload on a variety of configurations:…

Computation and Language · Computer Science 2013-02-20 Flávio Codeço Coelho, Renato Rocha Souza, Álvaro Justen, Flávio Amieiro, Heliana Mello

Typhoon is a series of Thai large language models (LLMs) developed specifically for the Thai language. This technical report presents challenges and insights in developing Thai LLMs, including data preparation, pretraining,…

Textless spoken language processing research aims to extend the applicability of standard NLP toolset onto spoken language and languages with few or no textual resources. In this paper, we introduce textless-lib, a PyTorch-based library…

We present VietNormalizer, an open-source, zero-dependency Python library for Vietnamese text normalization targeting Text-to-Speech (TTS) and Natural Language Processing (NLP) applications. Vietnamese text normalization is a critical yet…

Computation and Language · Computer Science 2026-03-05 Hung Vu Nguyen, Loan Do, Thanh Ngoc Nguyen, Ushik Shrestha Khwakhali, Thanh Pham, Vinh Do, Charlotte Nguyen, Hien Nguyen

Python is one of the most commonly used programming languages in industry and education. Its English keywords and built-in functions/modules allow it to come close to pseudo-code in terms of its readability and ease of writing. However,…

Computation and Language · Computer Science 2025-04-17 Joshua Otten, Antonios Anastasopoulos, Kevin Moran

In this paper, we introduce HugNLP, a unified and comprehensive library for natural language processing (NLP) with the prevalent backend of HuggingFace Transformers, which is designed for NLP researchers to easily utilize off-the-shelf…

Computation and Language · Computer Science 2023-03-01 Jianing Wang, Nuo Chen, Qiushi Sun, Wenkang Huang, Chengyu Wang, Ming Gao

The rise in usage of Large Language Models to near ubiquitousness in recent years has raised societal concern about their applications in decision-making contexts, such as organizational justice or healthcare. This, in turn, poses questions…

Computation and Language · Computer Science 2025-08-06 Arturo Pérez-Peralta, Sandra Benítez-Peña, Rosa E. Lillo

In this paper we present TweetNLP, an integrated platform for Natural Language Processing (NLP) in social media. TweetNLP supports a diverse set of NLP tasks, including generic focus areas such as sentiment analysis and named entity…

This paper describes AllenNLP, a platform for research on deep learning methods in natural language understanding. AllenNLP is designed to support researchers who want to build novel language understanding models quickly and easily. It is…

Computation and Language · Computer Science 2018-06-01 Matt Gardner, Joel Grus, Mark Neumann, Oyvind Tafjord, Pradeep Dasigi, Nelson Liu, Matthew Peters, Michael Schmitz, Luke Zettlemoyer

BNLP is an open-source language processing toolkit for the Bengali language, consisting of tokenization, word embedding, POS tagging, and NER tagging facilities. BNLP provides pre-trained models with high accuracy to do model-based tokenization,…

Computation and Language · Computer Science 2021-12-02 Sagor Sarker

Dependency parsing (DP) is a task that analyzes text for syntactic structure and relationship between words. DP is widely used to improve natural language processing (NLP) applications in many languages such as English. Previous works on DP…

Computation and Language · Computer Science 2020-05-05 Sattaya Singkul, Borirat Khampingyot, Nattasit Maharattamalai, Supawat Taerungruang, Tawunrat Chalothorn

With the increased availability of rich tactile sensors, there is an equally proportional need for open-source and integrated software capable of efficiently and effectively processing raw touch measurements into high-level signals that can…

Robotics · Computer Science 2021-05-28 Mike Lambeta, Huazhe Xu, Jingwei Xu, Po-Wei Chou, Shaoxiong Wang, Trevor Darrell, Roberto Calandra

In recent years, the extraction of opinions and information from user-generated text has attracted a lot of interest, largely due to the unprecedented volume of content in Social Media. However, social researchers face some issues in…

Natural language processing for the Turkic language family, spoken by over 200 million people across Eurasia, remains fragmented, with most languages lacking unified tooling and resources. We present TurkicNLP, an open-source Python library…

Computation and Language · Computer Science 2026-05-25 Sherzod Hakimov

Despite impressive success of machine learning algorithms in clinical natural language processing (cNLP), rule-based approaches still have a prominent role. In this paper, we introduce medspaCy, an extensible, open-source cNLP library based…

Computation and Language · Computer Science 2021-06-16 Hannah Eyre, Alec B Chapman, Kelly S Peterson, Jianlin Shi, Patrick R Alba, Makoto M Jones, Tamara L Box, Scott L DuVall, Olga V Patterson

Natural Language Processing (NLP) systems often make use of machine learning techniques that are unfamiliar to end-users who are interested in analyzing clinical records. Although NLP has been widely used in extracting information from…

Human-Computer Interaction · Computer Science 2017-07-10 Gaurav Trivedi, Phuong Pham, Wendy Chapman, Rebecca Hwa, Janyce Wiebe, Harry Hochheiser