Related papers: Scattertext: a Browser-Based Tool for Visualizing …

Gatherplot: A Non-Overlapping Scatterplot

Scatterplots are a common tool for exploring multidimensional datasets, especially in the form of scatterplot matrices (SPLOMs). However, scatterplots suffer from overplotting when categorical variables are mapped to one or two axes, or the…

Human-Computer Interaction · Computer Science 2025-11-18 Deokgun Park, Sung-Hee Kim, Niklas Elmqvist

SCATTERSEARCH: Visual Querying of Scatterplot Visualizations

Scatterplots are one of the simplest and most commonly-used visualizations for understanding quantitative, multidimensional data. However, since scatterplots only depict two attributes at a time, analysts often need to manually generate and…

Human-Computer Interaction · Computer Science 2019-07-30 Doris Jung-Lin Lee, Jaewoo Kim, Renxuan Wang, Aditya Parameswaran

ChartText: Linking Text with Charts in Documents

Recent works show that interactive documents connecting text with visualizations facilitate reading comprehension. However, creating this type of content requires specialized knowledge. We present ChartText, a method that links text with…

Human-Computer Interaction · Computer Science 2022-01-14 Joao Pinheiro, Jorge Poco

ConceptScope: Organizing and Visualizing Knowledge in Documents based on Domain Ontology

Current text visualization techniques typically provide overviews of document content and structure using intrinsic properties such as term frequencies, co-occurrences, and sentence structures. Such visualizations lack conceptual overviews…

Human-Computer Interaction · Computer Science 2021-03-03 Xiaoyu Zhang, Senthil Chandrasegaran, Kwan-Liu Ma

PerspectroScope: A Window to the World of Diverse Perspectives

This work presents PerspectroScope, a web-based system which lets users query a discussion-worthy natural language claim, and extract and visualize various perspectives in support or against the claim, along with evidence supporting each…

Computation and Language · Computer Science 2019-06-13 Sihao Chen, Daniel Khashabi, Chris Callison-Burch, Dan Roth

ClassSPLOM -- A Scatterplot Matrix to Visualize Separation of Multiclass Multidimensional Data

In multiclass classification of multidimensional data, the user wants to build a model of the classes to predict the label of unseen data. The model is trained on the data and tested on unseen data with known labels to evaluate its quality.…

Human-Computer Interaction · Computer Science 2022-02-01 Michael Aupetit, Ahmed Ali

Word Storms: Multiples of Word Clouds for Visual Comparison of Documents

Word clouds are a popular tool for visualizing documents, but they are not a good tool for comparing documents, because identical words are not presented consistently across different clouds. We introduce the concept of word storms, a…

Information Retrieval · Computer Science 2013-01-04 Quim Castella, Charles Sutton

ScatterShot: Interactive In-context Example Curation for Text Transformation

The in-context learning capabilities of LLMs like GPT-3 allow annotators to customize an LLM to their specific tasks with a small number of examples. However, users tend to include only the most obvious patterns when crafting examples,…

Human-Computer Interaction · Computer Science 2023-02-16 Tongshuang Wu, Hua Shen, Daniel S. Weld, Jeffrey Heer, Marco Tulio Ribeiro

VisText: A Benchmark for Semantically Rich Chart Captioning

Captions that describe or explain charts help improve recall and comprehension of the depicted data and provide a more accessible medium for people with visual disabilities. However, current approaches for automatically generating such…

Computer Vision and Pattern Recognition · Computer Science 2023-07-12 Benny J. Tang, Angie Boggust, Arvind Satyanarayan

Generalized Word Shift Graphs: A Method for Visualizing and Explaining Pairwise Comparisons Between Texts

A common task in computational text analyses is to quantify how two corpora differ according to a measurement like word frequency, sentiment, or information content. However, collapsing the texts' rich stories into a single number is often…

Computation and Language · Computer Science 2021-02-05 Ryan J. Gallagher, Morgan R. Frank, Lewis Mitchell, Aaron J. Schwartz, Andrew J. Reagan, Christopher M. Danforth, Peter Sheridan Dodds

WordScape: a Pipeline to extract multilingual, visually rich Documents with Layout Annotations from Web Crawl Data

We introduce WordScape, a novel pipeline for the creation of cross-disciplinary, multilingual corpora comprising millions of pages with annotations for document layout detection. Relating visual and textual items on document pages has…

Machine Learning · Computer Science 2023-12-19 Maurice Weber, Carlo Siebenschuh, Rory Butler, Anton Alexandrov, Valdemar Thanner, Georgios Tsolakis, Haris Jabbar, Ian Foster, Bo Li, Rick Stevens, Ce Zhang

Glitter: Visualizing Lexical Surprisal for Readability in Administrative Texts

This work investigates how measuring information entropy of text can be used to estimate its readability. We propose a visualization framework that can be used to approximate information entropy of text using multiple language models and…

Computation and Language · Computer Science 2026-01-12 Jan Černý, Ivana Kvapilíková, Silvie Cinková

Text Windows and Phrases Differing by Discipline, Location in Document, and Syntactic Structure

Knowledge of window style, content, location and grammatical structure may be used to classify documents as originating within a particular discipline or may be used to place a document on a theory versus practice spectrum. This distinction…

cmp-lg · Computer Science 2008-02-03 Robert M. Losee

On The Spatiotemporal Burstiness of Terms

Thousands of documents are made available to the users via the web on a daily basis. One of the most extensively studied problems in the context of such document streams is burst identification. Given a term t, a burst is generally…

Databases · Computer Science 2012-05-31 Theodoros Lappas, Marcos R. Vieira, Dimitrios Gunopulos, Vassilis J. Tsotras

SpaText: Spatio-Textual Representation for Controllable Image Generation

Recent text-to-image diffusion models are able to generate convincing results of unprecedented quality. However, it is nearly impossible to control the shapes of different regions/objects or their layout in a fine-grained fashion. Previous…

Computer Vision and Pattern Recognition · Computer Science 2023-09-06 Omri Avrahami, Thomas Hayes, Oran Gafni, Sonal Gupta, Yaniv Taigman, Devi Parikh, Dani Lischinski, Ohad Fried, Xi Yin

TextEssence: A Tool for Interactive Analysis of Semantic Shifts Between Corpora

Embeddings of words and concepts capture syntactic and semantic regularities of language; however, they have seen limited use as tools to study characteristics of different corpora and how they relate to one another. We introduce…

Computation and Language · Computer Science 2021-03-23 Denis Newman-Griffis, Venkatesh Sivaraman, Adam Perer, Eric Fosler-Lussier, Harry Hochheiser

Dialectograms: Machine Learning Differences between Discursive Communities

Word embeddings provide an unsupervised way to understand differences in word usage between discursive communities. A number of recent papers have focused on identifying words that are used differently by two or more communities. But word…

Computation and Language · Computer Science 2023-02-14 Thyge Enggaard, August Lohse, Morten Axel Pedersen, Sune Lehmann

WordBias: An Interactive Visual Tool for Discovering Intersectional Biases Encoded in Word Embeddings

Intersectional bias is a bias caused by an overlap of multiple social factors like gender, sexuality, race, disability, religion, etc. A recent study has shown that word embedding models can be laden with biases against intersectional…

Computation and Language · Computer Science 2021-09-08 Bhavya Ghai, Md Naimul Hoque, Klaus Mueller

Visualizing Linguistic Shift

Neural network based models are a very powerful tool for creating word embeddings; the objective of these models is to group similar words together. These embeddings have been used as features to improve results in various applications such…

Computation and Language · Computer Science 2016-11-27 Salman Mahmood, Rami Al-Rfou, Klaus Mueller

Representing Mixtures of Word Embeddings with Mixtures of Topic Embeddings

A topic model is often formulated as a generative model that explains how each word of a document is generated given a set of topics and document-specific topic proportions. It is focused on capturing the word co-occurrences in a document…

Machine Learning · Computer Science 2022-03-16 Dongsheng Wang, Dandan Guo, He Zhao, Huangjie Zheng, Korawat Tanwisuth, Bo Chen, Mingyuan Zhou