Related papers: Decoding the Text Encoding
Word clouds are frequently used to analyze and communicate text data in many domains. In order to help guide research on improving the legibility of word clouds, we have conducted a survey of their usage in Digital Humanities academia and…
When working with a new dataset, it is important to first explore and familiarize oneself with it, before applying any advanced machine learning algorithms. However, to the best of our knowledge, no tools exist that quickly and reliably…
Word clouds are a popular tool for visualizing documents, but they are not a good tool for comparing documents, because identical words are not presented consistently across different clouds. We introduce the concept of word storms, a…
Word clouds are a popular text visualization technique that summarizes an input text by displaying its most important words in a compact image. The traditional layout methods do not take proximity effects between words into account; this has…
The Bag-of-Words (BoW) representation is widely used in computer vision. The size of the codebook impacts the time and space complexity of the applications that use BoW. Thus, given a training set for a particular computer vision task, a…
Traditionally a document is visualized by a word cloud. Recently, distributed representation methods for documents have been developed, which map a document to a set of topic embeddings. Visualizing such a representation is useful to…
Tag clouds provide an aggregate of tag-usage statistics. They are typically sent as in-line HTML to browsers. However, display mechanisms suited for ordinary text are not ideal for tags, because font sizes may vary widely on a line. As…
Software visualization helps software engineers understand and manage the size and complexity of object-oriented source code. The tag cloud is a simple and popular visualization technique. The main idea of the tag cloud is to…
Generating word (tag) clouds is a powerful data visualization technique that allows people to easily get acquainted with the content of a large collection of textual documents and identify their subject domains in a matter of seconds,…
Legacy software documents are hard to understand and visualize. The tag cloud technique helps software developers to visualize the contents of software documents. A tag cloud is a well-known and simple visualization technique. This paper…
Word clouds have become a standard tool for presenting the results of natural language processing methods such as topic modelling. They display the most important words, where word size is often chosen proportional to the relevance of words within a…
Software artifact visualization helps software developers manage the size and complexity of a software system. The tag cloud technique visualizes tags within the cloud according to their frequencies in software artifacts. A font size…
Shape-text matching is an important task in high-level shape understanding. Current methods mainly represent a 3D shape as multiple 2D rendered views, which cannot be understood well due to the structural ambiguity caused by…
In the new era of internet systems and applications, the concept of detecting distinguished topics from huge amounts of text has gained a lot of attention. These methods use a representation of text in a numerical format -- called embeddings --…
Data visualization has become an important aspect of big data analytics and has grown in sophistication and variety. We specifically identify the need for an analytical framework for data visualization with textual information. Data…
Automatically translating images to texts involves image scene understanding and language modeling. In this paper, we propose a novel model, termed RefineCap, that refines the output vocabulary of the language decoder using decoder-guided…
This paper attacks the challenging problem of video retrieval by text. In such a retrieval paradigm, an end user searches for unlabeled videos by ad-hoc queries described exclusively in the form of a natural-language sentence, with no…
In data-dominated systems and applications, the concept of representing words in a numerical format has gained a lot of attention. There are a few approaches used to generate such a representation. An interesting issue that should be…
Data visualization is the process by which data of any size or dimensionality is processed to produce an understandable set of data in a lower dimensionality, allowing it to be manipulated and understood more easily by people. The goal of…
In this paper, we propose a novel approach for text classification based on clustering word embeddings, inspired by the bag of visual words model, which is widely used in computer vision. After each word in a collection of documents is…