Related papers: Document Visualization using Topic Clouds

200 papers

Generating word (tag) clouds is a powerful data visualization technique that allows people to easily get acquainted with the content of a large collection of textual documents and identify their subject domains in a matter of seconds,…

Information Retrieval · Computer Science 2022-01-03 Yordan Kalmukov

The importance of document clustering is now widely acknowledged by researchers for better management, smart navigation, efficient filtering, and concise summarization of large collections of documents such as the World Wide Web (WWW). The next…

Information Retrieval · Computer Science 2011-12-30 Muhammad Rafi, M. Shahid Shaikh, Amir Farooq

Word clouds are a popular tool for visualizing documents, but they are not a good tool for comparing documents, because identical words are not presented consistently across different clouds. We introduce the concept of word storms, a…

Information Retrieval · Computer Science 2013-01-04 Quim Castella, Charles Sutton

When working with a new dataset, it is important to first explore and familiarize oneself with it, before applying any advanced machine learning algorithms. However, to the best of our knowledge, no tools exist that quickly and reliably…

Computation and Language · Computer Science 2017-07-18 Franziska Horn, Leila Arras, Grégoire Montavon, Klaus-Robert Müller, Wojciech Samek

Probabilistic topic modeling is a popular and powerful family of tools for uncovering thematic structure in large sets of unstructured text documents. While much attention has been directed towards the modeling algorithms and their various…

Information Retrieval · Computer Science 2014-12-01 Samuel Rönnqvist, Xiaolu Wang, Peter Sarlin

Legacy software documents are hard to understand and visualize. The tag cloud technique helps software developers to visualize the contents of software documents. A tag cloud is a well-known and simple visualization technique. This paper…

Software Engineering · Computer Science 2021-10-01 Ra'Fat Al-Msie'deen

A topic model is often formulated as a generative model that explains how each word of a document is generated given a set of topics and document-specific topic proportions. It is focused on capturing the word co-occurrences in a document…

Machine Learning · Computer Science 2022-03-16 Dongsheng Wang, Dandan Guo, He Zhao, Huangjie Zheng, Korawat Tanwisuth, Bo Chen, Mingyuan Zhou

When dealing with large collections of documents, it is imperative to quickly get an overview of the texts' contents. In this paper we show how this can be achieved by using a clustering algorithm to identify topics in the dataset and then…

Computation and Language · Computer Science 2017-07-20 Franziska Horn, Leila Arras, Grégoire Montavon, Klaus-Robert Müller, Wojciech Samek

Current text visualization techniques typically provide overviews of document content and structure using intrinsic properties such as term frequencies, co-occurrences, and sentence structures. Such visualizations lack conceptual overviews…

Human-Computer Interaction · Computer Science 2021-03-03 Xiaoyu Zhang, Senthil Chandrasegaran, Kwan-Liu Ma

Topic modeling is used for discovering latent semantic structure, usually referred to as topics, in a large collection of documents. The most widely used methods are Latent Dirichlet Allocation and Probabilistic Latent Semantic Analysis.…

Computation and Language · Computer Science 2020-08-24 Dimo Angelov

We describe a new method for visualizing topics, the distributions over terms that are automatically extracted from large text corpora using latent variable models. Our method finds significant $n$-grams related to a topic, which are then…

Machine Learning · Statistics 2009-07-07 David M. Blei, John D. Lafferty

Word clouds have become a standard tool for presenting results of natural language processing methods such as topic modelling. They exhibit the most important words, where word size is often chosen proportional to the relevance of words within a…

Computation · Statistics 2023-02-14 Peter Winker

The number of documents available on the Internet grows every day. For this reason, processing this amount of information effectively and expressibly becomes a major concern for companies and scientists. Methods that represent a textual…

Information Retrieval · Computer Science 2017-03-21 Mohamed Morchid, Juan-Manuel Torres-Moreno, Richard Dufour, Javier Ramírez-Rodríguez, Georges Linarès

Tag clouds provide an aggregate of tag-usage statistics. They are typically sent as in-line HTML to browsers. However, display mechanisms suited for ordinary text are not ideal for tags, because font sizes may vary widely on a line. As…

Data Structures and Algorithms · Computer Science 2009-04-22 Owen Kaser, Daniel Lemire

Point feature labeling is a classical problem in cartography and GIS that has been extensively studied for geospatial point data. At the same time, word clouds are a popular visualization tool to show the most important words in text data…

Computational Geometry · Computer Science 2021-09-10 Sujoy Bhore, Robert Ganian, Guangping Li, Martin Nöllenburg, Jules Wulms

Software visualization helps software engineers to understand and manage the size and complexity of the object-oriented source code. The tag cloud is a simple and popular visualization technique. The main idea of the tag cloud is to…

Software Engineering · Computer Science 2019-07-01 Ra'Fat Al-Msie'deen

Word embedding maps words into a low-dimensional continuous embedding space by exploiting the local word collocation patterns in a small context window. On the other hand, topic modeling maps documents onto a low-dimensional topic space, by…

Computation and Language · Computer Science 2016-08-09 Shaohua Li, Tat-Seng Chua, Jun Zhu, Chunyan Miao

Document clustering is an unsupervised approach in which a large collection of documents (corpus) is subdivided into smaller, meaningful, identifiable, and verifiable sub-groups (clusters). Meaningful representation of documents and…

Information Retrieval · Computer Science 2014-12-08 Muhammad Rafi, Farnaz Amin, Mohammad Shahid Shaikh

Researchers have succeeded in mining Web usage data effectively and efficiently. However, the mined patterns are often not represented in a form suitable for direct human consumption. Hence mechanisms and tools that can represent mined…

Human-Computer Interaction · Computer Science 2009-09-01 Ratnesh Kumar Jain, Dr. Suresh Jain, Dr. R. S. Kasana

Dynamic topic modeling is useful for discovering the development and change in latent topics over time. However, present methodology relies on algorithms that separate document and word representations. This prevents the creation of a…

Computation and Language · Computer Science 2024-09-19 Daniel Palamarchuk, Lemara Williams, Brian Mayer, Thomas Danielson, Rebecca Faust, Larry Deschaine, Chris North