Related papers: Native Language Identification using i-vector

Unravelling Interlanguage Facts via Explainable Machine Learning

Native language identification (NLI) is the task of training (via supervised machine learning) a classifier that guesses the native language of the author of a text. This task has been extensively researched in the last decade, and the…

Computation and Language · Computer Science 2022-08-03 Barbara Berti, Andrea Esuli, Fabrizio Sebastiani

Native Language Identification on Text and Speech

This paper presents an ensemble system combining the output of multiple SVM classifiers for native language identification (NLI). The system was submitted to the NLI Shared Task 2017 fusion track which featured student essays and spoken…

Computation and Language · Computer Science 2017-07-25 Marcos Zampieri, Alina Maria Ciobanu, Liviu P. Dinu

Scaling Native Language Identification with Transformer Adapters

Native language identification (NLI) is the task of automatically identifying the native language (L1) of an individual based on their language production in a learned language. It is useful for a variety of purposes including marketing,…

Computation and Language · Computer Science 2022-11-21 Ahmet Yavuz Uluslu, Gerold Schneider

Native Language Identification with Large Language Models

We present the first experiments on Native Language Identification (NLI) using LLMs such as GPT-4. NLI is the task of predicting a writer's first language by analyzing their writings in a second language, and is used in second language…

Computation and Language · Computer Science 2023-12-14 Wei Zhang, Alexandre Salle

Leveraging Open-Source Large Language Models for Native Language Identification

Native Language Identification (NLI) - the task of identifying the native language (L1) of a person based on their writing in the second language (L2) - has applications in forensics, marketing, and second language acquisition.…

Computation and Language · Computer Science 2025-01-22 Yee Man Ng, Ilia Markov

Native Language Identification with Big Bird Embeddings

Native Language Identification (NLI) intends to classify an author's native language based on their writing in another language. Historically, the task has heavily relied on time-consuming linguistic feature engineering, and…

Computation and Language · Computer Science 2023-09-14 Sergey Kramp, Giovanni Cassani, Chris Emmery

Speaker Identification by GMM based i Vector

The speaker identification process is to identify a particular vocal cord from a set of existing speakers. In the speaker identification process, an unknown speaker's voice sample targets each of the existing speakers present in the system and…

Sound · Computer Science 2017-04-14 Soumen Kanrar

Robust Native Language Identification through Agentic Decomposition

Large language models (LLMs) often achieve high performance in native language identification (NLI) benchmarks by leveraging superficial contextual clues such as names, locations, and cultural stereotypes, rather than the underlying…

Computation and Language · Computer Science 2025-09-23 Ahmet Yavuz Uluslu, Tannon Kew, Tilia Ellendorff, Gerold Schneider, Rico Sennrich

Turkish Native Language Identification V2

This paper presents the first application of Native Language Identification (NLI) for the Turkish language. NLI is the task of automatically identifying an individual's native language (L1) based on their writing or speech in a non-native…

Computation and Language · Computer Science 2025-11-10 Ahmet Yavuz Uluslu, Gerold Schneider

Evaluating the Effectiveness of Natural Language Inference for Hate Speech Detection in Languages with Limited Labeled Data

Most research on hate speech detection has focused on English where a sizeable amount of labeled training data is available. However, to expand hate speech detection into more languages, approaches that require minimal training data are…

Computation and Language · Computer Science 2023-06-13 Janis Goldzycher, Moritz Preisig, Chantal Amrhein, Gerold Schneider

Leveraging Native Language Speech for Accent Identification using Deep Siamese Networks

The problem of automatic accent identification is important for several applications like speaker profiling and recognition as well as for improving speech recognition systems. The accented nature of speech can be primarily attributed to…

Computation and Language · Computer Science 2018-06-20 Aditya Siddhant, Preethi Jyothi, Sriram Ganapathy

Improving Natural Language Inference with a Pretrained Parser

We introduce a novel approach to incorporate syntax into natural language inference (NLI) models. Our method uses contextual token-level vector representations from a pretrained dependency parser. Like other contextual embedders, our method…

Computation and Language · Computer Science 2019-09-19 Deric Pang, Lucy H. Lin, Noah A. Smith

I-vector Based Within Speaker Voice Quality Identification on connected speech

Voice disorders affect a large portion of the population, especially heavy voice users such as teachers or call-center workers. Most voice disorders can be treated effectively with behavioral voice therapy, which teaches patients to replace…

Sound · Computer Science 2021-02-16 Chuyao Feng, Eva van Leer, Mackenzie Lee Curtis, David V. Anderson

Improving Language Identification of Accented Speech

Language identification from speech is a common preprocessing step in many spoken language processing systems. In recent years, this field has seen fast progress, mostly due to the use of self-supervised models pretrained on multilingual…

Audio and Speech Processing · Electrical Eng. & Systems 2022-07-04 Kunnar Kukk, Tanel Alumäe

I can tell whether you are a Native Hawlêri Speaker! How ANN, CNN, and RNN perform in NLI-Native Language Identification

Native Language Identification (NLI) is a task in Natural Language Processing (NLP) that typically determines the native language of an author through their writing or a speaker through their speaking. It has various applications in…

Computation and Language · Computer Science 2026-02-12 Hardi Garari, Hossein Hassani

Spoken Language Identification using ConvNets

Language Identification (LI) is an important first step in several speech processing systems. With a growing number of voice-based assistants, speech LI has emerged as a widely researched field. To approach the problem of identifying…

Computation and Language · Computer Science 2019-10-11 Sarthak, Shikhar Shukla, Govind Mittal

Language identification as improvement for lip-based biometric visual systems

Language has always been one of humanity's defining characteristics. Visual Language Identification (VLI) is a relatively new field of research that is complex and largely understudied. In this paper, we present a preliminary study in which…

Computer Vision and Pattern Recognition · Computer Science 2023-02-28 Lucia Cascone, Michele Nappi, Fabio Narducci

A Semisupervised Approach for Language Identification based on Ladder Networks

In this study we address the problem of training a neural network for language identification using both labeled and unlabeled speech samples in the form of i-vectors. We propose a neural network architecture that can also handle out-of-set…

Computation and Language · Computer Science 2016-04-04 Ehud Ben-Reuven, Jacob Goldberger

Open-Set Language Identification

We present the first open-set language identification experiments using one-class classification. We first highlight the shortcomings of traditional feature extraction methods and propose a hashing-based feature vectorization approach as a…

Computation and Language · Computer Science 2017-07-18 Shervin Malmasi

Accidental Learners: Spoken Language Identification in Multilingual Self-Supervised Models

In this paper, we extend previous self-supervised approaches for language identification by experimenting with Conformer based architecture in a multilingual pre-training paradigm. We find that pre-trained speech models optimally encode…

Audio and Speech Processing · Electrical Eng. & Systems 2023-03-14 Travis M. Bartley, Fei Jia, Krishna C. Puvvada, Samuel Kriman, Boris Ginsburg