Related papers: Exploring text datasets by visualizing relevant wo…

200 papers

When dealing with large collections of documents, it is imperative to quickly get an overview of the texts' contents. In this paper we show how this can be achieved by using a clustering algorithm to identify topics in the dataset and then…

Computation and Language · Computer Science 2017-07-20 Franziska Horn, Leila Arras, Grégoire Montavon, Klaus-Robert Müller, Wojciech Samek

Traditionally a document is visualized by a word cloud. Recently, distributed representation methods for documents have been developed, which map a document to a set of topic embeddings. Visualizing such a representation is useful to…

Information Retrieval · Computer Science 2017-02-07 Shaohua Li, Tat-Seng Chua

Generating word (tag) clouds is a powerful data visualization technique that allows people to get easily acquainted with the content of a large collection of textual documents and identify their subject domains in a matter of seconds,…

Information Retrieval · Computer Science 2022-01-03 Yordan Kalmukov

Word clouds and text visualization are among the most popular and widely used recent types of visualization. Despite the attractiveness and simplicity of producing word clouds, they do not provide a thorough visualization for the…

Human-Computer Interaction · Computer Science 2014-12-19 Fereshteh Sadeghi, Hamid Izadinia

We are presenting a text analysis tool set that allows analysts in various fields to sieve through large collections of multilingual news items quickly and to find information that is of relevance to them. For a given document collection,…

Computation and Language · Computer Science 2007-05-23 Ralf Steinberger, Bruno Pouliquen, Camelia Ignat

This paper presents a procedure to retrieve subsets of relevant documents from large text collections for Content Analysis, e.g. in social sciences. Document retrieval for this purpose needs to take account of the fact that analysts often…

Information Retrieval · Computer Science 2017-07-12 Gregor Wiedemann, Andreas Niekler

Content-based document classification is one of the biggest challenges in free-text mining. Current document classification algorithms mostly rely on cluster analysis based on the bag-of-words approach. However, that method is…

Information Retrieval · Computer Science 2015-12-15 Koushiki Sarkar, Ritwika Law

We present a hierarchical convolutional document model with an architecture designed to support introspection of the document structure. Using this model, we show how to use visualisation techniques from the computer vision literature to…

Computation and Language · Computer Science 2015-03-03 Misha Denil, Alban Demiraj, Nando de Freitas

Keyword extraction is used for summarizing the content of a document and supports efficient document retrieval, and is as such an indispensable part of modern text-based systems. We explore how load centrality, a graph-theoretic measure…

Computation and Language · Computer Science 2019-11-12 Blaž Škrlj, Andraž Repar, Senja Pollak

In the past few decades, there has been an explosion in the amount of available data produced from various sources with different topics. The availability of this enormous amount of data requires us to adopt effective computational tools to…

Computation and Language · Computer Science 2022-12-20 Mina Samizadeh

Word clouds are a popular tool for visualizing documents, but they are not a good tool for comparing documents, because identical words are not presented consistently across different clouds. We introduce the concept of word storms, a…

Information Retrieval · Computer Science 2013-01-04 Quim Castella, Charles Sutton

"Keyword Extraction" refers to the task of automatically identifying the most relevant and informative phrases in natural language text. As we are deluged with large amounts of text data in many different forms and content - emails, blogs,…

Computation and Language · Computer Science 2019-08-22 Shibamouli Lahiri

In this paper, we propose an alternative to deep neural networks for semantic information retrieval for the case of long documents. This new approach exploiting clustering techniques to take into account the meaning of words in Information…

Information Retrieval · Computer Science 2025-07-29 Paul Mbathe Mekontchou, Armel Fotsoh, Bernabe Batchakui, Eddy Ella

Point feature labeling is a classical problem in cartography and GIS that has been extensively studied for geospatial point data. At the same time, word clouds are a popular visualization tool to show the most important words in text data…

Computational Geometry · Computer Science 2021-09-10 Sujoy Bhore, Robert Ganian, Guangping Li, Martin Nöllenburg, Jules Wulms

In this paper, we propose a novel approach for text classification based on clustering word embeddings, inspired by the bag of visual words model, which is widely used in computer vision. After each word in a collection of documents is…

Computation and Language · Computer Science 2017-07-26 Andrei M. Butnaru, Radu Tudor Ionescu

Keyword extraction is an important document process that aims at finding a small set of terms that concisely describe a document's topics. The most popular state-of-the-art unsupervised approaches belong to the family of the graph-based…

Computation and Language · Computer Science 2020-08-24 Eirini Papagiannopoulou, Grigorios Tsoumakas, Apostolos N. Papadopoulos

Nowadays, the number of text documents on the internet, in e-mail, and on web pages is growing rapidly, and they are stored in electronic database formats. Arranging and browsing these documents becomes difficult. To overcome this problem, the…

Computation and Language · Computer Science 2013-03-05 Leena H. Patil, Mohammed Atique

We are presenting a set of multilingual text analysis tools that can help analysts in any field to explore large document collections quickly in order to determine whether the documents contain information of interest, and to find the…

Computation and Language · Computer Science 2007-05-23 Camelia Ignat, Bruno Pouliquen, Ralf Steinberger, Tomaz Erjavec

The avalanche of information produced by mankind has led to the concept of automated knowledge extraction, known as Data Mining ([1]). This direction is connected with a wide spectrum of problems, from recognition of the fuzzy set to…

Machine Learning · Computer Science 2009-06-05 A. A. Shumeyko, S. L. Sotnik

The importance of document clustering is now widely acknowledged by researchers for better management, smart navigation, efficient filtering, and concise summarization of large collections of documents such as the World Wide Web (WWW). The next…

Information Retrieval · Computer Science 2011-12-30 Muhammad Rafi, M. Shahid Shaikh, Amir Farooq