Related papers: AVscript: Accessible Video Editing with Audio-Visu…

200 papers

Audio Description (AD) provides essential access to visual media for blind and low vision (BLV) audiences. Yet current AD production tools remain largely inaccessible to BLV video creators, who possess valuable expertise but face barriers…

Human-Computer Interaction · Computer Science · 2026-02-10 · Franklin Mingzhe Li, Michael Xieyang Liu, Cynthia L. Bennett, Shaun K. Kane

Audio descriptions (AD) make videos accessible for blind and low vision (BLV) users by describing visual elements that cannot be understood from the main audio track. AD created by professionals or novice describers is time-consuming and…

Human-Computer Interaction · Computer Science · 2025-05-29 · Maryam Cheema, Hasti Seifi, Pooyan Fazli

Short videos on platforms such as TikTok, Instagram Reels, and YouTube Shorts (i.e. short-form videos) have become a primary source of information and entertainment. Many short-form videos are inaccessible to blind and low vision (BLV)…

Human-Computer Interaction · Computer Science · 2024-02-19 · Tess Van Daele, Akhil Iyer, Yuning Zhang, Jalyn C. Derry, Mina Huh, Amy Pavel

Video content remains largely inaccessible to blind and low-vision (BLV) users. To address this, we introduce a prototype that leverages a multimodal agent - powered by a novel conversational architecture using a multimodal large language…

In video production, inserting B-roll is a widely used technique to enrich the story and make a video more engaging. However, determining the right content and positions of B-roll and actually inserting it within the main footage can be…

Human-Computer Interaction · Computer Science · 2019-03-01 · Bernd Huber, Hijung Valentina Shin, Bryan Russell, Oliver Wang, Gautham J. Mysore

In order to offer a customized script tool and inspire professional scriptwriters, we present VScript. It is a controllable pipeline that generates complete scripts, including dialogues and scene descriptions, as well as presents visually…

Computation and Language · Computer Science · 2022-11-24 · Ziwei Ji, Yan Xu, I-Tsun Cheng, Samuel Cahyawijaya, Rita Frieske, Etsuko Ishii, Min Zeng, Andrea Madotto, Pascale Fung

Modern visual effects (VFX) software has made it possible for skilled artists to create imagery of virtually anything. However, the creation process remains laborious, complex, and largely inaccessible to everyday users. In this work, we…

Computer Vision and Pattern Recognition · Computer Science · 2024-11-05 · Hao-Yu Hsu, Zhi-Hao Lin, Albert Zhai, Hongchi Xia, Shenlong Wang

Blind and low vision (BLV) creators use images to communicate with sighted audiences. However, creating or retrieving images is challenging for BLV creators as it is difficult to use authoring tools or assess image search results. Thus,…

Human-Computer Interaction · Computer Science · 2023-07-24 · Mina Huh, Yi-Hao Peng, Amy Pavel

Image editing is an iterative process that requires precise visual evaluation and manipulation for the output to match the editing intent. However, current image editing tools do not provide accessible interaction nor sufficient feedback…

Human-Computer Interaction · Computer Science · 2024-08-14 · Ruei-Che Chang, Yuxuan Liu, Lotus Zhang, Anhong Guo

While audio description (AD) is the standard approach for making videos accessible to blind and low vision (BLV) people, existing AD guidelines do not consider BLV users' varied preferences across viewing scenarios. These scenarios range…

Human-Computer Interaction · Computer Science · 2024-03-19 · Lucy Jiang, Crescentia Jung, Mahika Phutane, Abigale Stangl, Shiri Azenkot

Advances in multimodal large language models enable automatic video narration and question answering (VQA), offering scalable alternatives to labor-intensive, human-authored audio descriptions (ADs) for blind and low vision (BLV) viewers.…

Human-Computer Interaction · Computer Science · 2026-03-17 · Maryam Cheema, Sina Elahimanesh, Pooyan Fazli, Hasti Seifi

Video descriptions are crucial for blind and low vision (BLV) users to access visual content. However, current artificial intelligence models for generating descriptions often fall short due to limitations in the quality of human…

Computer Vision and Pattern Recognition · Computer Science · 2025-03-03 · Chaoyu Li, Sid Padmanabhuni, Maryam Cheema, Hasti Seifi, Pooyan Fazli

Audio descriptions make videos accessible to those who cannot see them by describing visual content in audio. Producing audio descriptions is challenging due to the synchronous nature of the audio description that must fit into gaps of…

Human-Computer Interaction · Computer Science · 2020-10-09 · Amy Pavel, Gabriel Reyes, Jeffrey P. Bigham

People use videos to learn new recipes, exercises, and crafts. Such videos remain difficult for blind and low vision (BLV) people to follow as they rely on visual comparison. Our observations of visual rehabilitation therapists (VRTs)…

Human-Computer Interaction · Computer Science · 2025-07-28 · Mina Huh, Zihui Xue, Ujjaini Das, Kumar Ashutosh, Kristen Grauman, Amy Pavel

For blind and low-vision (BLV) individuals, digital math communication is uniquely difficult due to the lack of accessible tools. Currently, the state of the art is either code-based, like LaTeX, or WYSIWYG, like visual editors. However,…

Human-Computer Interaction · Computer Science · 2026-03-17 · Kenneth Ge, JooYoung Seo

Blind and low-vision (BLV) people use audio descriptions (ADs) to access videos. However, current ADs are unalterable by end users, thus are incapable of supporting BLV individuals' potentially diverse needs and preferences. This research…

Human-Computer Interaction · Computer Science · 2024-08-22 · Rosiana Natalie, Ruei-Che Chang, Smitha Sheshadri, Anhong Guo, Kotaro Hara

Blind and low vision (BLV) developers create websites to share knowledge and showcase their work. A well-designed website can engage audiences and deliver information effectively, yet it remains challenging for BLV developers to review…

Human-Computer Interaction · Computer Science · 2024-07-26 · Mina Huh, Amy Pavel

Effective visual accessibility in Virtual Reality (VR) is crucial for Blind and Low Vision (BLV) users. However, designing visual accessibility systems is challenging due to the complexity of 3D VR environments and the need for techniques…

Human-Computer Interaction · Computer Science · 2025-02-07 · Junlong Chen, Rosella P. Galindo Esparza, Vanja Garaj, Per Ola Kristensson, John Dudley

While image editing has advanced rapidly, video editing remains less explored, facing challenges in consistency, control, and generalization. We study the design space of data, architecture, and control, and introduce EasyV2V, a…

Computer Vision and Pattern Recognition · Computer Science · 2025-12-19 · Jinjie Mai, Chaoyang Wang, Guocheng Gordon Qian, Willi Menapace, Sergey Tulyakov, Bernard Ghanem, Peter Wonka, Ashkan Mirzaei

Audio description (AD) makes video content accessible to blind and low-vision (BLV) audiences, but producing high-quality descriptions is resource-intensive. Automated AD offers scalability, and prior studies show human-in-the-loop editing…

Human-Computer Interaction · Computer Science · 2026-02-04 · Lana Do, Shasta Ihorn, Charity Pitcher-Cooper, Juvenal Francisco Barajas, Gio Jung, Xuan Duy Anh Nguyen, Sanjay Mirani, Ilmi Yoon