Related papers: Progressive Sentiment Analysis for Code-Switched T…

Adapting Deep Learning for Sentiment Classification of Code-Switched Informal Short Text

Nowadays, an abundance of short text is being generated that uses nonstandard writing styles influenced by regional languages. Such informal and code-switched content is under-resourced in terms of labeled datasets and language models even…

Computation and Language · Computer Science 2020-04-07 Muhammad Haroon Shakeel, Asim Karim

Sentiment Classification of Code-Switched Text using Pre-trained Multilingual Embeddings and Segmentation

With increasing globalization and immigration, various studies have estimated that about half of the world population is bilingual. Consequently, individuals concurrently use two or more languages or dialects in casual conversational…

Computation and Language · Computer Science 2022-11-01 Saurav K. Aryal, Howard Prioleau, Gloria Washington

Improving Zero-Shot Cross-Lingual Transfer via Progressive Code-Switching

Code-switching is a data augmentation scheme mixing words from multiple languages into source lingual text. It has achieved considerable generalization performance of cross-lingual transfer tasks by aligning cross-lingual contextual word…

Computation and Language · Computer Science 2024-06-21 Zhuoran Li, Chunming Hu, Junfan Chen, Zhijun Chen, Xiaohui Guo, Richong Zhang

Improved Sentiment Detection via Label Transfer from Monolingual to Synthetic Code-Switched Text

Multilingual writers and speakers often alternate between two languages in a single discourse, a practice called "code-switching". Existing sentiment detection methods are usually trained on sentiment-labeled monolingual text. Manually…

Computation and Language · Computer Science 2019-06-14 Bidisha Samanta, Niloy Ganguly, Soumen Chakrabarti

Code-Mixed Probes Show How Pre-Trained Models Generalise On Code-Switched Text

Code-switching is a prevalent linguistic phenomenon in which multilingual individuals seamlessly alternate between languages. Despite its widespread use online and recent research trends in this area, research in code-switching presents…

Computation and Language · Computer Science 2024-05-08 Frances A. Laureano De Leon, Harish Tayyar Madabushi, Mark Lee

Meta-Transfer Learning for Code-Switched Speech Recognition

An increasing number of people in the world today speak a mixed-language as a result of being multilingual. However, building a speech recognition system for code-switching remains difficult due to the availability of limited resources and…

Computation and Language · Computer Science 2020-04-30 Genta Indra Winata, Samuel Cahyawijaya, Zhaojiang Lin, Zihan Liu, Peng Xu, Pascale Fung

A Survey of Code-switched Speech and Language Processing

Code-switching, the alternation of languages within a conversation or utterance, is a common communicative phenomenon that occurs in multilingual communities across the world. This survey reviews computational approaches for code-switched…

Computation and Language · Computer Science 2020-07-24 Sunayana Sitaram, Khyathi Raghavi Chandu, Sai Krishna Rallabandi, Alan W Black

Enhancing Multilingual Language Models for Code-Switched Input Data

Code-switching, or alternating between languages within a single conversation, presents challenges for multilingual language models on NLP tasks. This research investigates if pre-training Multilingual BERT (mBERT) on code-switched datasets…

Computation and Language · Computer Science 2025-03-12 Katherine Xie, Nitya Babbar, Vicky Chen, Yoanna Turura

Towards Code-switched Classification Exploiting Constituent Language Resources

Code-switching is a commonly observed communicative phenomenon denoting a shift from one language to another within the same speech exchange. The analysis of code-switched data often becomes an assiduous task, owing to the limited…

Computation and Language · Computer Science 2020-11-04 Tanvi Dadu, Kartikey Pant

Zero-shot Sentiment Analysis in Low-Resource Languages Using a Multilingual Sentiment Lexicon

Improving multilingual language models' capabilities in low-resource languages is generally difficult due to the scarcity of large-scale data in those languages. In this paper, we relax the reliance on texts in low-resource languages by…

Computation and Language · Computer Science 2024-02-06 Fajri Koto, Tilman Beck, Zeerak Talat, Iryna Gurevych, Timothy Baldwin

Share What You Already Know: Cross-Language-Script Transfer and Alignment for Sentiment Detection in Code-Mixed Data

Code-switching entails mixing multiple languages. It is an increasingly occurring phenomenon in social media texts. Usually, code-mixed texts are written in a single script, even though the languages involved have different scripts.…

Computation and Language · Computer Science 2025-11-24 Niraj Pahari, Kazutaka Shimada

Semi-supervised and Transfer learning approaches for low resource sentiment classification

Sentiment classification involves quantifying the affective reaction of a human to a document, media item or an event. Although researchers have investigated several methods to reliably infer sentiment from lexical, speech and body language…

Information Retrieval · Computer Science 2018-06-11 Rahul Gupta, Saurabh Sahu, Carol Espy-Wilson, Shrikanth Narayanan

MetaXL: Meta Representation Transformation for Low-resource Cross-lingual Learning

The combination of multilingual pre-trained representations and cross-lingual transfer learning is one of the most effective methods for building functional NLP systems for low-resource languages. However, for extremely low-resource…

Computation and Language · Computer Science 2021-04-19 Mengzhou Xia, Guoqing Zheng, Subhabrata Mukherjee, Milad Shokouhi, Graham Neubig, Ahmed Hassan Awadallah

Investigating and Scaling up Code-Switching for Multilingual Language Model Pre-Training

Large language models (LLMs) exhibit remarkable multilingual capabilities despite the extreme language imbalance in the pre-training data. In this paper, we closely examine the reasons behind this phenomenon, focusing on the pre-training…

Computation and Language · Computer Science 2025-04-23 Zhijun Wang, Jiahuan Li, Hao Zhou, Rongxiang Weng, Jingang Wang, Xin Huang, Xue Han, Junlan Feng, Chao Deng, Shujian Huang

Leveraging Large Amounts of Weakly Supervised Data for Multi-Language Sentiment Classification

This paper presents a novel approach for multi-lingual sentiment classification in short texts. This is a challenging task as the amount of training data in languages other than English is very limited. Previously proposed multi-lingual…

Computation and Language · Computer Science 2017-03-08 Jan Deriu, Aurelien Lucchi, Valeria De Luca, Aliaksei Severyn, Simon Müller, Mark Cieliebak, Thomas Hofmann, Martin Jaggi

T3L: Translate-and-Test Transfer Learning for Cross-Lingual Text Classification

Cross-lingual text classification leverages text classifiers trained in a high-resource language to perform text classification in other languages with no or minimal fine-tuning (zero/few-shots cross-lingual transfer). Nowadays,…

Computation and Language · Computer Science 2023-06-09 Inigo Jauregi Unanue, Gholamreza Haffari, Massimo Piccardi

Embedding Projection for Targeted Cross-Lingual Sentiment: Model Comparisons and a Real-World Study

Sentiment analysis benefits from large, hand-annotated resources in order to train and test machine learning models, which are often data hungry. While some languages, e.g., English, have a vast array of these resources, most…

Computation and Language · Computer Science 2019-06-26 Jeremy Barnes, Roman Klinger

Model Transfer for Tagging Low-resource Languages using a Bilingual Dictionary

Cross-lingual model transfer is a compelling and popular method for predicting annotations in a low-resource language, whereby parallel corpora provide a bridge to a high-resource language and its associated annotated corpora. However,…

Computation and Language · Computer Science 2017-05-02 Meng Fang, Trevor Cohn

GLUECoS: An Evaluation Benchmark for Code-Switched NLP

Code-switching is the use of more than one language in the same conversation or utterance. Recently, multilingual contextual embedding models, trained on multiple monolingual corpora, have shown promising results on cross-lingual and…

Computation and Language · Computer Science 2020-05-15 Simran Khanuja, Sandipan Dandapat, Anirudh Srinivasan, Sunayana Sitaram, Monojit Choudhury

Unsupervised Self-Training for Sentiment Analysis of Code-Switched Data

Sentiment analysis is an important task in understanding social media content like customer reviews, Twitter and Facebook feeds etc. In multilingual communities around the world, a large amount of social media text is characterized by the…

Computation and Language · Computer Science 2021-10-04 Akshat Gupta, Sargam Menghani, Sai Krishna Rallabandi, Alan W Black