Related papers: Open-Set Language Identification

Language identification (LID) is a fundamental step in many natural language processing pipelines. However, current LID systems are far from perfect, particularly on lower-resource languages. We present a LID model which achieves a…

Computation and Language · Computer Science 2023-08-31 Laurie Burchell, Alexandra Birch, Nikolay Bogoychev, Kenneth Heafield

Most state-of-the-art spoken language identification models are closed-set; in other words, they can only output a language label from the set of classes they were trained on. Open-set spoken language identification systems, however, gain…

Computation and Language · Computer Science 2023-08-30 Mustafa Eyceoz, Justin Lee, Siddharth Pittie, Homayoon Beigi

While most modern speech Language Identification methods are closed-set, we want to see if they can be modified and adapted for the open-set problem. When switching to the open-set problem, the solution gains the ability to reject an audio…

Computation and Language · Computer Science 2022-05-24 Mustafa Eyceoz, Justin Lee, Homayoon Beigi

Random Indexing is a simple implementation of Random Projections with a wide range of applications. It can solve a variety of problems with good accuracy without introducing much complexity. Here we use it for identifying the language of…

Computation and Language · Computer Science 2015-03-02 Aditya Joshi, Johan Halseth, Pentti Kanerva

Open-set learning and discovery (OSLD) is a challenging machine learning task in which samples from new (unknown) classes can appear at test time. It can be seen as a generalization of zero-shot learning, where the new classes are not known…

A common problem for automatic speech recognition systems is how to recognize words that they did not see during training. Currently there is no established method of evaluating different techniques for tackling this problem. We propose…

Computation and Language · Computer Science 2021-07-20 Rudolf A. Braun, Srikanth Madikeri, Petr Motlicek

In this research, we advanced a spoken language recognition system, moving beyond traditional feature vector-based models. Our improvements focused on effectively capturing language characteristics over extended periods using a specialized…

Sound · Computer Science 2025-01-22 Or Haim Anidjar, Roi Yozevitch

In this work, we propose an open-vocabulary object detection method that, based on image-caption pairs, learns to detect novel object classes along with a given set of known classes. It is a two-stage training approach that first uses a…

Computer Vision and Pattern Recognition · Computer Science 2022-07-29 Maria A. Bravo, Sudhanshu Mittal, Thomas Brox

Hate speech detection has become an important research topic within the past decade. More private corporations are needing to regulate user generated content on different platforms across the globe. In this paper, we introduce a study of…

Computation and Language · Computer Science 2022-01-28 Neha Deshpande, Nicholas Farris, Vidhur Kumar

This paper proposes a method to use deep neural networks as end-to-end open-set classifiers. It is based on intra-class data splitting. In open-set recognition, only samples from a limited number of known classes are available for training.…

Machine Learning · Computer Science 2019-11-21 Patrick Schlachter, Yiwen Liao, Bin Yang

Hate speech detection is a challenging problem with most of the datasets available in only one language: English. In this paper, we conduct a large scale analysis of multilingual hate speech in 9 languages from 16 different sources. We…

Social and Information Networks · Computer Science 2020-12-10 Sai Saketh Aluru, Binny Mathew, Punyajoy Saha, Animesh Mukherjee

Language Identification (LI) is an important first step in several speech processing systems. With a growing number of voice-based assistants, speech LI has emerged as a widely researched field. To approach the problem of identifying…

Computation and Language · Computer Science 2019-10-11 Sarthak, Shikhar Shukla, Govind Mittal

We introduce a few-shot transfer learning method for keyword spotting in any language. Leveraging open speech corpora in nine languages, we automate the extraction of a large multilingual keyword bank and use it to train an embedding model.…

Computation and Language · Computer Science 2021-09-13 Mark Mazumder, Colby Banbury, Josh Meyer, Pete Warden, Vijay Janapa Reddi

The task of determining a speaker's native language based only on his speeches in a second language is known as Native Language Identification or NLI. Due to its increasing applications in various domains of speech signal processing, this…

Computation and Language · Computer Science 2018-11-15 Ahmed Nazim Uddin, Md Ashequr Rahman, Md. Rafidul Islam, Mohammad Ariful Haque

Open set recognition (OSR) is a critical aspect of machine learning, addressing the challenge of detecting novel classes during inference. Within the realm of deep learning, neural classifiers trained on a closed set of data typically…

Computer Vision and Pattern Recognition · Computer Science 2026-05-05 Jiawen Xu, Margret Keuper

The pre-trained multi-lingual XLSR model generalizes well for language identification after fine-tuning on unseen languages. However, the performance significantly degrades when the languages are not very distinct from each other, for…

Machine Learning · Computer Science 2023-02-17 Shangeth Rajaa, Kriti Anandan, Swaraj Dalmia, Tarun Gupta, Eng Siong Chng

We propose a method that can perform one-class classification given only a small number of examples from the target class and none from the others. We formulate the learning of meaningful features for one-class classification as a…

Computer Vision and Pattern Recognition · Computer Science 2021-04-26 Gabriel Dahia, Maurício Pamplona Segundo

This paper proposes a novel generic one-class feature learning method based on intra-class splitting. In one-class classification, feature learning is challenging, because only samples of one class are available during training. Hence,…

Machine Learning · Computer Science 2019-11-21 Patrick Schlachter, Yiwen Liao, Bin Yang

Real-world classification tasks are frequently required to work in an open-set setting. This is especially challenging for few-shot learning problems due to the small sample size for each known category, which prevents existing open-set…

Computer Vision and Pattern Recognition · Computer Science 2021-09-15 Jedrzej Kozerawski, Matthew Turk

Existing approaches to automatic VerbNet-style verb classification are heavily dependent on feature engineering and therefore limited to languages with mature NLP pipelines. In this work, we propose a novel cross-lingual transfer method for…

Computation and Language · Computer Science 2017-07-24 Ivan Vulić, Nikola Mrkšić, Anna Korhonen