Related papers: How Do Captions Affect Visualization Reading?

Towards Understanding How Readers Integrate Charts and Captions: A Case Study with Line Charts

Charts often contain visually prominent features that draw attention to aspects of the data and include text captions that emphasize aspects of the data. Through a crowdsourced study, we explore how readers gather takeaways when considering…

Human-Computer Interaction · Computer Science 2021-01-21 Dae Hyun Kim, Vidya Setlur, Maneesh Agrawala

Striking a Balance: Reader Takeaways and Preferences when Integrating Text and Charts

While visualizations are an effective way to represent insights about information, they rarely stand alone. When designing a visualization, text is often added to provide additional context and guidance for the reader. However, there is…

Human-Computer Interaction · Computer Science 2022-09-23 Chase Stokes, Vidya Setlur, Bridget Cogley, Arvind Satyanarayan, Marti Hearst

Paying More Attention to Saliency: Image Captioning with Saliency and Context Attention

Image captioning has been recently gaining a lot of attention thanks to the impressive achievements shown by deep captioning architectures, which combine Convolutional Neural Networks to extract image representations, and Recurrent Neural…

Computer Vision and Pattern Recognition · Computer Science 2018-05-22 Marcella Cornia, Lorenzo Baraldi, Giuseppe Serra, Rita Cucchiara

VisText: A Benchmark for Semantically Rich Chart Captioning

Captions that describe or explain charts help improve recall and comprehension of the depicted data and provide a more accessible medium for people with visual disabilities. However, current approaches for automatically generating such…

Computer Vision and Pattern Recognition · Computer Science 2023-07-12 Benny J. Tang, Angie Boggust, Arvind Satyanarayan

What Is the Difference Between a Mountain and a Molehill? Quantifying Semantic Labeling of Visual Features in Line Charts

Relevant language describing visual features in charts can be useful for authoring captions and summaries about the charts to help with readers' takeaways. To better understand the interplay between concepts that describe visual features…

Human-Computer Interaction · Computer Science 2023-08-04 Dennis Bromley, Vidya Setlur

Figure Captioning with Reasoning and Sequence-Level Training

Figures, such as bar charts, pie charts, and line plots, are widely used to convey important information in a concise format. They are usually human-friendly but difficult for computers to process automatically. In this work, we investigate…

Computer Vision and Pattern Recognition · Computer Science 2019-06-10 Charles Chen, Ruiyi Zhang, Eunyee Koh, Sungchul Kim, Scott Cohen, Tong Yu, Ryan Rossi, Razvan Bunescu

Improving Image Captioning with Better Use of Captions

Image captioning is a multimodal problem that has drawn extensive attention in both the natural language processing and computer vision community. In this paper, we present a novel image captioning architecture to better explore semantics…

Computer Vision and Pattern Recognition · Computer Science 2020-06-23 Zhan Shi, Xu Zhou, Xipeng Qiu, Xiaodan Zhu

Same Data, Diverging Perspectives: The Power of Visualizations to Elicit Competing Interpretations

People routinely rely on data to make decisions, but the process can be riddled with biases. We show that patterns in data might be noticed first or more strongly, depending on how the data is visually represented or what the viewer finds…

Human-Computer Interaction · Computer Science 2024-01-18 Cindy Xiong Bearfield, Lisanne van Weelden, Adam Waytz, Steven Franconeri

What Does the Chart Say? Grouping Cues Guide Viewer Comparisons and Conclusions in Bar Charts

Reading a visualization is like reading a paragraph. Each sentence is a comparison: the mean of these is higher than those; this difference is smaller than that. What determines which comparisons are made first? The viewer's goals and…

Human-Computer Interaction · Computer Science 2023-10-04 Cindy Xiong Bearfield, Chase Stokes, Andrew Lovett, Steven Franconeri

Boost Image Captioning with Knowledge Reasoning

Automatically generating a human-like description for a given image is a potential research area in artificial intelligence, which has attracted a great deal of attention recently. Most of the existing attention methods explore the mapping…

Computer Vision and Pattern Recognition · Computer Science 2020-11-03 Feicheng Huang, Zhixin Li, Haiyang Wei, Canlong Zhang, Huifang Ma

Look, Read and Enrich. Learning from Scientific Figures and their Captions

Compared to natural images, understanding scientific figures is particularly hard for machines. However, there is a valuable source of information in scientific literature that until now has remained untapped: the correspondence between a…

Artificial Intelligence · Computer Science 2019-09-20 Jose Manuel Gomez-Perez, Raul Ortega

Paying Attention to Descriptions Generated by Image Captioning Models

To bridge the gap between humans and machines in image understanding and describing, we need further insight into how people describe a perceived scene. In this paper, we study the agreement between bottom-up saliency-based visual attention…

Computer Vision and Pattern Recognition · Computer Science 2017-08-07 Hamed R. Tavakoli, Rakshith Shetty, Ali Borji, Jorma Laaksonen

Senti-Attend: Image Captioning using Sentiment and Attention

There has been much recent work on image captioning models that describe the factual aspects of an image. Recently, some models have incorporated non-factual aspects into the captions, such as sentiment or style. However, such models…

Computer Vision and Pattern Recognition · Computer Science 2018-11-27 Omid Mohamad Nezami, Mark Dras, Stephen Wan, Cecile Paris

Image Captioning using Facial Expression and Attention

Benefiting from advances in machine vision and natural language processing techniques, current image captioning systems are able to generate detailed visual descriptions. For the most part, these descriptions represent an objective…

Computer Vision and Pattern Recognition · Computer Science 2020-04-16 Omid Mohamad Nezami, Mark Dras, Stephen Wan, Cecile Paris

Contextualization or Rationalization? The Effect of Causal Priors on Data Visualization Interpretation

Understanding how individuals interpret charts is a crucial concern for visual data communication. This imperative has motivated a number of studies, including past work demonstrating that causal priors -- a priori beliefs about causal…

Human-Computer Interaction · Computer Science 2026-02-11 Arran Zeyu Wang, David Borland, Estella Calcaterra, David Gotz

LineCap: Line Charts for Data Visualization Captioning Models

Data visualization captions help readers understand the purpose of a visualization and are crucial for individuals with visual impairments. The prevalence of poor figure captions and the successful application of deep learning approaches to…

Computer Vision and Pattern Recognition · Computer Science 2022-07-18 Anita Mahinpei, Zona Kostic, Chris Tanner

Are scene graphs good enough to improve Image Captioning?

Many top-performing image captioning models rely solely on object features computed with an object detection model to generate image descriptions. However, recent studies propose to directly use scene graphs to introduce information about…

Computer Vision and Pattern Recognition · Computer Science 2020-10-28 Victor Milewski, Marie-Francine Moens, Iacer Calixto

Face-Cap: Image Captioning using Facial Expression Analysis

Image captioning is the process of generating a natural language description of an image. Most current image captioning models, however, do not take into account the emotional aspect of an image, which is very relevant to activities and…

Computer Vision and Pattern Recognition · Computer Science 2019-01-28 Omid Mohamad Nezami, Mark Dras, Peter Anderson, Len Hamey

ChartCap: Mitigating Hallucination of Dense Chart Captioning

Generating accurate, informative, and hallucination-free captions for charts remains challenging for vision language models, primarily due to the lack of large-scale, high-quality datasets of real-world charts. However, existing real-world…

Computer Vision and Pattern Recognition · Computer Science 2025-08-06 Junyoung Lim, Jaewoo Ahn, Gunhee Kim

Linguistic Structures as Weak Supervision for Visual Scene Graph Generation

Prior work in scene graph generation requires categorical supervision at the level of triplets - subjects and objects, and predicates that relate them, either with or without bounding box information. However, scene graph generation is a…

Computer Vision and Pattern Recognition · Computer Science 2021-05-31 Keren Ye, Adriana Kovashka