Related papers: Distributed NLP

200 papers

The readability assessment deals with estimating the level of difficulty in reading texts. Many readability tests, which do not indicate execution efficiency, have been applied on specific texts to measure the reading grade level in science…

Distributed, Parallel, and Cluster Computing · Computer Science 2018-02-13 Betul Karakus, Ibrahim Riza Hallac, Galip Aydin

In this work, we present VNLP: the first dedicated, complete, open-source, well-documented, lightweight, production-ready, state-of-the-art Natural Language Processing (NLP) package for the Turkish language. It contains a wide variety of…

Computation and Language · Computer Science 2024-03-05 Meliksah Turker, Mehmet Erdi Ari, Aydin Han

In this note, we present preliminary results on the use of "network calculus" for parallel processing systems, specifically MapReduce.

Performance · Computer Science 2015-02-03 G. Kesidis, B. Urgaonkar, Y. Shan, S. Kamarava, J. Liebeherr

The growth of the amount of medical image data produced on a daily basis in modern hospitals forces the adaptation of traditional medical image analysis and indexing approaches towards scalable solutions. The number of images and their…

Distributed, Parallel, and Cluster Computing · Computer Science 2015-10-26 Dimitrios Markonis, Roger Schaer, Ivan Eggel, Henning Müller, Adrien Depeursinge

Recent advances in neural machine translation (NMT) have pushed the quality of machine translation systems to the point where they are becoming widely adopted to build competitive systems. However, there is still a large number of languages…

Machine Learning (ML) techniques are indispensable in a wide range of fields. Unfortunately, the exponential increase of dataset sizes is rapidly extending the runtime of sequential algorithms and threatening to slow future progress in ML.…

Machine Learning · Computer Science 2011-07-06 Yucheng Low, Joseph Gonzalez, Aapo Kyrola, Danny Bickson, Carlos Guestrin

In this paper, we investigate the performance and success rates of Naïve Bayes Classification Algorithm for automatic classification of Turkish news into predetermined categories like economy, life, health etc. We use Apache Big Data…

Information Retrieval · Computer Science 2018-02-13 Galip Aydin, Ibrahim Riza Hallac

Research in NLP for Central Asian Turkic languages - Kazakh, Uzbek, Kyrgyz, and Turkmen - faces typical low-resource language challenges like data scarcity, limited linguistic resources and technology development. However, recent…

Computation and Language · Computer Science 2026-02-17 Yana Veitsman, Mareike Hartmann

Cloud infrastructures enable the efficient parallel execution of data-intensive tasks such as entity resolution on large datasets. We investigate challenges and possible solutions of using the MapReduce programming model for parallel entity…

Distributed, Parallel, and Cluster Computing · Computer Science 2010-10-18 Lars Kolb, Andreas Thor, Erhard Rahm

Recently, due to the rapid development of information and communication technologies, data are created and consumed in an avalanche-like way. Distributed computing creates preconditions for analyzing and processing such Big Data by distributing…

Distributed, Parallel, and Cluster Computing · Computer Science 2018-01-30 Vladyslav Taran, Oleg Alienin, Sergii Stirenko, A. Rojbi, Yuri Gordienko

Natural language processing, as a data analytics related technology, is used widely in many research areas such as artificial intelligence, human language processing, and translation. At present, due to explosive growth of data, there are…

Computation and Language · Computer Science 2016-08-17 Emre Erturk, Hong Shi

Spectral clustering and cloud computing is emerging branch of computer science or related discipline. It overcome the shortcomings of some traditional clustering algorithm and guarantee the convergence to the optimal solution, thus have to…

Distributed, Parallel, and Cluster Computing · Computer Science 2015-06-02 Yajun Cui, Yang Zhao, Kafei Xiao, Chenglong Zhang, Lei Wang

Word2Vec is a widely used algorithm for extracting low-dimensional vector representations of words. It generated considerable excitement in the machine learning and natural language processing (NLP) communities recently due to its…

Distributed, Parallel, and Cluster Computing · Computer Science 2016-08-09 Shihao Ji, Nadathur Satish, Sheng Li, Pradeep Dubey

NLP Workbench is a web-based platform for text mining that allows non-expert users to obtain semantic understanding of large-scale corpora using state-of-the-art text mining models. The platform is built upon latest pre-trained models and…

Computation and Language · Computer Science 2024-03-06 Peiran Yao, Matej Kosmajac, Abeer Waheed, Kostyantyn Guzhva, Natalie Hervieux, Denilson Barbosa

Hadoop is an open source implementation of the MapReduce Framework in the realm of distributed processing. A Hadoop cluster is a unique type of computational cluster designed for storing and analyzing large data sets across cluster of…

Distributed, Parallel, and Cluster Computing · Computer Science 2014-11-10 Muralikrishnan Ramane, Sharmila Krishnamoorthy, Sasikala Gowtham

As ML applications are becoming ever more pervasive, fully-trained systems are made increasingly available to a wide public, allowing end-users to submit queries with their own data, and to efficiently retrieve results. With increasingly…

Distributed, Parallel, and Cluster Computing · Computer Science 2020-06-01 Daniela Loreti, Marco Lippi, Paolo Torroni

Natural language processing for the Turkic language family, spoken by over 200 million people across Eurasia, remains fragmented, with most languages lacking unified tooling and resources. We present TurkicNLP, an open-source Python library…

Computation and Language · Computer Science 2026-05-25 Sherzod Hakimov

Having sufficient resources for language X lifts it from the under-resourced languages class, but not necessarily from the under-researched class. In this paper, we address the problem of the absence of organized benchmarks in the Turkish…

Computation and Language · Computer Science 2022-03-17 Ali Safaya, Emirhan Kurtuluş, Arda Göktoğan, Deniz Yuret

Linguistic diversity across the world creates a disparity with the availability of good quality digital language resources thereby restricting the technological benefits to majority of human population. The lack or absence of data resources…

Computation and Language · Computer Science 2025-10-16 Prawaal Sharma, Navneet Goyal, Poonam Goyal, Vishnupriyan R

There are a lot of tools and resources available for processing Finnish. In this paper, we survey recent papers focusing on Finnish NLP related to many different subcategories of NLP such as parsing, generation, semantics and speech. NLP…

Computation and Language · Computer Science 2021-09-24 Mika Hämäläinen, Khalid Alnajjar