Related papers: Summarizing Indian Languages using Multilingual Tr…

200 papers

With the advent of Deep Learning based Artificial Neural Networks models, Natural Language Processing (NLP) has witnessed significant improvements in textual data processing in terms of its efficiency and accuracy. However, the research is…

Computation and Language · Computer Science 2023-10-05 Mubashir Munaf, Hammad Afzal, Naima Iltaf, Khawir Mahmood

We present the MahaSUM dataset, a large-scale collection of diverse news articles in Marathi, designed to facilitate the training and evaluation of models for abstractive summarization tasks in Indic languages. The dataset, containing 25k…

Computation and Language · Computer Science 2024-10-15 Pranita Deshmukh, Nikita Kulkarni, Sanhita Kulkarni, Kareena Manghani, Raviraj Joshi

In this paper, we study pre-trained sequence-to-sequence models for a group of related languages, with a focus on Indic languages. We present IndicBART, a multilingual, sequence-to-sequence pre-trained model focusing on 11 Indic languages…

Computation and Language · Computer Science 2022-10-28 Raj Dabre, Himani Shrotriya, Anoop Kunchukuttan, Ratish Puduppully, Mitesh M. Khapra, Pratyush Kumar

The research on text summarization for low-resource Indian languages has been limited due to the availability of relevant datasets. This paper presents a summary of various deep-learning approaches used for the ILSUM 2022 Indic language…

Computation and Language · Computer Science 2022-12-13 Rahul Tangsali, Aabha Pingle, Aditya Vyawahare, Isha Joshi, Raviraj Joshi

Text summarization plays a crucial role in natural language processing by condensing large volumes of text into concise and coherent summaries. As digital content continues to grow rapidly and the demand for effective information retrieval…

Computation and Language · Computer Science 2025-03-14 Tohida Rehman, Soumabha Ghosh, Kuntal Das, Souvik Bhattacharjee, Debarshi Kumar Sanyal, Samiran Chattopadhyay

The rapid growth of machine translation (MT) systems has necessitated comprehensive studies to meta-evaluate evaluation metrics being used, which enables a better selection of metrics that best reflect MT quality. Unfortunately, most of the…

Computation and Language · Computer Science 2023-07-04 Ananya B. Sai, Vignesh Nagarajan, Tanay Dixit, Raj Dabre, Anoop Kunchukuttan, Pratyush Kumar, Mitesh M. Khapra

Recently, with the rapid development in the fields of technology and the increasing amount of text available on the internet, it has become urgent to develop effective tools for processing and understanding texts in a way that summaries…

Computation and Language · Computer Science 2024-06-13 Sari Masri, Yaqeen Raddad, Fidaa Khandaqji, Huthaifa I. Ashqar, Mohammed Elhenawy

As these are low-resource languages, the development of Indian-Indian and English-Indian MT systems faces difficulty in translating various lexical phenomena. In this paper, we present our work on a comparative study of 440 phrase-based…

Computation and Language · Computer Science 2017-10-09 Sreelekha S, Pushpak Bhattacharyya

Automatic text summarization has achieved high performance in high-resourced languages like English, but comparatively less attention has been given to summarization in less-resourced languages. This work compares a variety of different…

Computation and Language · Computer Science 2026-01-01 Chester Palen-Michel, Constantine Lignos

In this study, we present an analysis of the performance of state-of-the-art Phrase-based Statistical Machine Translation (SMT) on multiple Indian languages. We report baseline systems on several language pairs. The motivation of…

Computation and Language · Computer Science 2017-01-17 Nadeem Jadoon Khan, Waqas Anwar, Nadir Durrani

What can pre-trained multilingual sequence-to-sequence models like mBART contribute to translating low-resource languages? We conduct a thorough empirical experiment in 10 languages to ascertain this, considering five factors: (1) the…

Automatic text summarization in the Nepali language is an unexplored area in natural language processing (NLP). Although considerable research has been dedicated to extractive summarization, the area of abstractive summarization, especially for…

Computation and Language · Computer Science 2024-10-01 Prakash Dhakal, Daya Sagar Baral

Transformer based language models have led to impressive results across all domains in Natural Language Processing. Pretraining these models on language modeling tasks and finetuning them on downstream tasks such as Text Classification,…

Computation and Language · Computer Science 2021-12-06 Shaily Desai, Atharva Kshirsagar, Manisha Marathe

In this paper, we introduce Neural Information Retrieval resources for 11 widely spoken Indian Languages (Assamese, Bengali, Gujarati, Hindi, Kannada, Malayalam, Marathi, Oriya, Punjabi, Tamil, and Telugu) from two major Indian language…

Information Retrieval · Computer Science 2023-12-18 Saiful Haq, Ashutosh Sharma, Pushpak Bhattacharyya

Machine Translation (MT) systems generally aim at automatically representing a source language in a target language while retaining the original context, using various Natural Language Processing (NLP) techniques. Among various NLP methods,…

Computation and Language · Computer Science 2026-03-04 Sudhansu Bala Das, Divyajoti Panda, Tapas Kumar Mishra, Bidyut Kr. Patra

Language models based on the Transformer architecture have achieved state-of-the-art performance on a wide range of NLP tasks such as text classification, question-answering, and token classification. However, this performance is usually…

Computation and Language · Computer Science 2020-11-05 Kushal Jain, Adwait Deshpande, Kumar Shridhar, Felix Laumann, Ayushman Dash

Given the recent introduction of multiple language models and the ongoing demand for improved Natural Language Processing tasks, particularly summarization, this work provides a comprehensive benchmarking of 20 recent language models,…

Computation and Language · Computer Science 2025-01-31 Abdurrahman Odabaşı, Göksel Biricik

In this paper, we report the results of TeamNRC's participation in the BHASHA-Task 1 Grammatical Error Correction shared task (https://github.com/BHASHA-Workshop/IndicGEC2025/) for 5 Indian languages. Our approach, focusing on…

Computation and Language · Computer Science 2025-11-20 Sowmya Vajjala

We present our work on developing a multilingual, efficient text-to-text transformer that is suitable for handling long inputs. This model, called mLongT5, builds upon the architecture of LongT5, while leveraging the multilingual datasets…

Computation and Language · Computer Science 2023-10-30 David Uthus, Santiago Ontañón, Joshua Ainslie, Mandy Guo

Multimodal Machine Translation (MMT) enriches the source text with visual information for translation. It has gained popularity in recent years, and several pipelines have been proposed in the same direction. Yet, the task lacks quality…

Computation and Language · Computer Science 2021-06-29 Kshitij Gupta, Devansh Gautam, Radhika Mamidi