Related papers: Toxicity Classification in Ukrainian

200 papers

Prior works in cross-lingual named entity recognition (NER) with no/little labeled data fall into two primary categories: model transfer based and data transfer based methods. In this paper we find that both method types can complement each…

Computation and Language · Computer Science 2020-07-16 Qianhui Wu , Zijia Lin , Börje F. Karlsson , Biqing Huang , Jian-Guang Lou

The open-endedness of large language models (LLMs) combined with their impressive capabilities may lead to new safety issues when being exploited for malicious use. While recent studies primarily focus on probing toxic outputs that can be…

Computation and Language · Computer Science 2023-11-30 Jiaxin Wen , Pei Ke , Hao Sun , Zhexin Zhang , Chengfei Li , Jinfeng Bai , Minlie Huang

While the evaluation of multimodal English-centric models is an active area of research with numerous benchmarks, there is a profound lack of benchmarks or evaluation suites for low- and mid-resource languages. We introduce ZNO-Vision, a…

Computation and Language · Computer Science 2024-11-25 Yurii Paniv , Artur Kiulian , Dmytro Chaplynskyi , Mykola Khandoga , Anton Polishko , Tetiana Bas , Guillermo Gabrielli

With the recent rise of toxicity in online conversations on social media platforms, using modern machine learning algorithms for toxic comment detection has become a central focus of many online applications. Researchers and companies have…

Artificial Intelligence · Computer Science 2020-03-30 Ameya Vaidya , Feng Mai , Yue Ning

The spectacular expansion of the Internet has led to the development of a new research problem in the field of natural language processing: automatic toxic comment detection, since many countries prohibit hate speech in public media. There…

Machine Learning · Computer Science 2020-09-18 Ashwin Geet D'Sa , Irina Illina , Dominique Fohr

For languages with no annotated resources, unsupervised transfer of natural language processing models such as named-entity recognition (NER) from resource-rich languages would be an appealing capability. However, differences in words and…

Computation and Language · Computer Science 2018-09-13 Jiateng Xie , Zhilin Yang , Graham Neubig , Noah A. Smith , Jaime Carbonell

In this work, we investigated how one can use the LLM to transfer the dataset and its annotation from one language to another. This is crucial since sharing the knowledge between different languages could boost certain underresourced…

Computation and Language · Computer Science 2024-10-21 Dmitrii Popov , Egor Terentev , Igor Buyanov

Toxicity detection is crucial for maintaining the peace of the society. While existing methods perform well on normal toxic contents or those generated by specific perturbation methods, they are vulnerable to evolving perturbation patterns.…

Cryptography and Security · Computer Science 2025-03-05 Hankun Kang , Jianhao Chen , Yongqi Li , Xin Miao , Mayi Xu , Ming Zhong , Yuanyuan Zhu , Tieyun Qian

In the rapidly advancing field of AI and NLP, generative large language models (LLMs) stand at the forefront of innovation, showcasing unparalleled abilities in text understanding and generation. However, the limited representation of…

Computation and Language · Computer Science 2024-04-16 Artur Kiulian , Anton Polishko , Mykola Khandoga , Oryna Chubych , Jack Connor , Raghav Ravishankar , Adarsh Shirawalmath

Free-text responses are commonly collected in psychological studies, providing rich qualitative insights that quantitative measures may not capture. Labeling curated topics of research interest in free-text data by multiple trained human…

Biomedical concept normalization links concept mentions in texts to a semantically equivalent concept in a biomedical knowledge base. This task is challenging as concepts can have different expressions in natural languages, e.g.…

Computation and Language · Computer Science 2018-07-10 Roland Roller , Madeleine Kittner , Dirk Weissenborn , Ulf Leser

Existing Chinese toxic content detection methods mainly target sentence-level classification but often fail to provide readable and contiguous toxic evidence spans. We propose ToxiTrace, an explainability-oriented method for…

Computation and Language · Computer Science 2026-04-15 Boyang Li , Hongzhe Shou , Yuanyuan Liang , Jingbin Zhang , Fang Zhou

We introduce aligned probing, a novel interpretability framework that aligns the behavior of language models (LMs), based on their outputs, and their internal representations (internals). Using this framework, we examine over 20 OLMo,…

Computation and Language · Computer Science 2025-09-25 Andreas Waldis , Vagrant Gautam , Anne Lauscher , Dietrich Klakow , Iryna Gurevych

Large Language Models (LLMs) have fundamentally transformed approaches to Natural Language Processing (NLP) tasks across diverse domains. In healthcare, accurate and cost-efficient text classification is crucial, whether for clinical notes…

Computation and Language · Computer Science 2026-02-16 Hajar Sakai , Sarah S. Lam

Fake news detection is a challenging task aiming to reduce human time and effort to check the truthfulness of news. Automated approaches to combat fake news, however, are limited by the lack of labeled benchmark datasets, especially in…

Computation and Language · Computer Science 2021-03-02 Inna Vogel , Jeong-Eun Choi , Meghana Meghana

Due to the growing role of the SEO technologies, it is necessary to perform an automated analysis of the article's quality. Such approach helps both to return the most intelligible pages for the user's query and to raise the web sites…

Computation and Language · Computer Science 2020-11-03 S. D. Pogorilyy , A. A. Kramov

The rapid growth of social media platforms has raised significant concerns regarding online content toxicity. When Large Language Models (LLMs) are used for toxicity detection, two key challenges emerge: 1) the absence of domain-specific…

Computation and Language · Computer Science 2025-06-03 Yibo Zhao , Jiapeng Zhu , Can Xu , Yao Liu , Xiang Li

Cross-lingual transfer (CLT) is of various applications. However, labeled cross-lingual corpus is expensive or even inaccessible, especially in the fields where labels are private, such as diagnostic results of symptoms in medicine and user…

Computation and Language · Computer Science 2022-06-15 Yinpeng Guo , Liangyou Li , Xin Jiang , Qun Liu

In recent years, fake news detection has received increasing attention in public debate and scientific research. Despite advances in detection techniques, the production and spread of false information have become more sophisticated, driven…

Computation and Language · Computer Science 2026-03-27 Pietro Dell'Oglio , Alessandro Bondielli , Francesco Marcelloni , Lucia C. Passaro