Related papers: VScript: Controllable Script Generation with Visua…

ScriptWriter: Narrative-Guided Script Generation

It is appealing to have a system that generates a story or scripts automatically from a story-line, even though this is still out of our reach. In dialogue systems, it would also be useful to drive dialogues by a dialogue plan. In this…

Computation and Language · Computer Science 2020-11-17 Yutao Zhu, Ruihua Song, Zhicheng Dou, Jian-Yun Nie, Jin Zhou

DialogueScript: Using Dialogue Agents to Produce a Script

We present a novel approach to generating scripts by using agents with different personality types. To manage character interaction in the script, we employ simulated dramatic networks. Automatic and human evaluation on multiple criteria…

Computation and Language · Computer Science 2022-06-20 Patrícia Schmidtová, Dávid Javorský, Christián Mikláš, Tomáš Musil, Rudolf Rosa, Ondřej Dušek

Script2Screen: Supporting Dialogue Scriptwriting with Interactive Audiovisual Generation

Scriptwriting has traditionally been text-centric, a modality that only partially conveys the produced audiovisual experience. A formative study with professional writers informed us that connecting textual and audiovisual modalities can…

Human-Computer Interaction · Computer Science 2026-04-09 Zhecheng Wang, Jiaju Ma, Eitan Grinspun, Tovi Grossman, Bryan Wang

Visual Story Generation Based on Emotion and Keywords

Automated visual story generation aims to produce stories with corresponding illustrations that exhibit coherence, progression, and adherence to characters' emotional development. This work proposes a story generation pipeline to co-create…

Artificial Intelligence · Computer Science 2023-01-10 Yuetian Chen, Ruohua Li, Bowen Shi, Peiru Liu, Mei Si

MULTISCRIPT: Multimodal Script Learning for Supporting Open Domain Everyday Tasks

Automatically generating scripts (i.e. sequences of key steps described in text) from video demonstrations and reasoning about the subsequent steps are crucial to the modern AI virtual assistants to guide humans to complete everyday tasks,…

Computation and Language · Computer Science 2024-01-22 Jingyuan Qi, Minqian Liu, Ying Shen, Zhiyang Xu, Lifu Huang

AVscript: Accessible Video Editing with Audio-Visual Scripts

Sighted and blind and low vision (BLV) creators alike use videos to communicate with broad audiences. Yet, video editing remains inaccessible to BLV creators. Our formative study revealed that current video editing tools make it difficult…

Human-Computer Interaction · Computer Science 2023-03-01 Mina Huh, Saelyne Yang, Yi-Hao Peng, Xiang 'Anthony' Chen, Young-Ho Kim, Amy Pavel

OmniScript: Towards Audio-Visual Script Generation for Long-Form Cinematic Video

Current multimodal large language models (MLLMs) have demonstrated remarkable capabilities in short-form video understanding, yet translating long-form cinematic videos into detailed, temporally grounded scripts remains a significant…

Computer Vision and Pattern Recognition · Computer Science 2026-04-14 Junfu Pu, Yuxin Chen, Teng Wang, Ying Shan

Controlled Cue Generation for Play Scripts

In this paper, we use a large-scale play scripts dataset to propose the novel task of theatrical cue generation from dialogues. Using over one million lines of dialogue and cues, we approach the problem of cue generation as a controlled…

Computation and Language · Computer Science 2021-12-15 Alara Dirik, Hilal Donmez, Pinar Yanardag

Character-Centered Dialogue Generation from Scene-Level Prompts

Recent advances in scene-based video generation enable coherent visual narratives from structured prompts, yet a key aspect of storytelling -- character-driven dialogue and speech -- remains underexplored. We present a modular pipeline that…

Computer Vision and Pattern Recognition · Computer Science 2026-05-20 Taewon Kang, Ming C. Lin

SceneScript: Reconstructing Scenes With An Autoregressive Structured Language Model

We introduce SceneScript, a method that directly produces full scene models as a sequence of structured language commands using an autoregressive, token-based approach. Our proposed scene representation is inspired by recent successes in…

Computer Vision and Pattern Recognition · Computer Science 2024-03-21 Armen Avetisyan, Christopher Xie, Henry Howard-Jenkins, Tsun-Yi Yang, Samir Aroudj, Suvam Patra, Fuyang Zhang, Duncan Frost, Luke Holland, Campbell Orme, Jakob Engel, Edward Miller, Richard Newcombe, Vasileios Balntas

Generating Diverse Story Continuations with Controllable Semantics

We propose a simple and effective modeling framework for controlled generation of multiple, diverse outputs. We focus on the setting of generating the next sentence of a story given its context. As controllable dimensions, we consider…

Computation and Language · Computer Science 2020-06-03 Lifu Tu, Xiaoan Ding, Dong Yu, Kevin Gimpel

ViviDoc: Generating Interactive Documents through Human-Agent Collaboration

Interactive documents help readers engage with complex ideas through dynamic visualization, interactive animations, and exploratory interfaces. However, creating such documents remains costly, as it requires both domain expertise and web…

Human-Computer Interaction · Computer Science 2026-03-31 Yinghao Tang, Yupeng Xie, Yingchaojie Feng, Tingfeng Lan, Jiale Lao, Yue Cheng, Wei Chen

Telling Creative Stories Using Generative Visual Aids

Can visual artworks created using generative visual algorithms inspire human creativity in storytelling? We asked writers to write creative stories from a starting prompt, and provided them with visuals created by generative AI models from…

Human-Computer Interaction · Computer Science 2021-10-29 Safinah Ali, Devi Parikh

ScriptViz: A Visualization Tool to Aid Scriptwriting based on a Large Movie Database

Scriptwriters usually rely on their mental visualization to create a vivid story by using their imagination to see, feel, and experience the scenes they are writing. Besides mental visualization, they often refer to existing images or…

Human-Computer Interaction · Computer Science 2024-10-07 Anyi Rao, Jean-Peïc Chou, Maneesh Agrawala

Prompt-Driven Agentic Video Editing System: Autonomous Comprehension of Long-Form, Story-Driven Media

Creators struggle to edit long-form, narrative-rich videos not because of UI complexity, but due to the cognitive demands of searching, storyboarding, and sequencing hours of footage. Existing transcript- or embedding-based methods fall…

Artificial Intelligence · Computer Science 2025-09-30 Zihan Ding, Xinyi Wang, Junlong Chen, Per Ola Kristensson, Junxiao Shen

Sequentially Controlled Text Generation

While GPT-2 generates sentences that are remarkably human-like, longer documents can ramble and do not follow human-like writing structure. We study the problem of imposing structure on long-range text. We propose a novel controlled text…

Computation and Language · Computer Science 2023-01-09 Alexander Spangher, Xinyu Hua, Yao Ming, Nanyun Peng

ScenarioControl: Vision-Language Controllable Vectorized Latent Scenario Generation

We introduce ScenarioControl, the first vision-language control mechanism for learned driving scenario generation. Given a text prompt or an input image, Scenario-Control synthesizes diverse, realistic 3D scenario rollouts - including map,…

Computer Vision and Pattern Recognition · Computer Science 2026-04-21 Lili Gao, Yanbo Xu, William Koch, Samuele Ruffino, Luke Rowe, Behdad Chalaki, Dmitriy Rivkin, Julian Ost, Roger Girgis, Mario Bijelic, Felix Heide

A Pipeline for Creative Visual Storytelling

Computational visual storytelling produces a textual description of events and interpretations depicted in a sequence of images. These texts are made possible by advances and cross-disciplinary approaches in natural language processing,…

Computation and Language · Computer Science 2018-07-24 Stephanie M. Lukin, Reginald Hobbs, Clare R. Voss

A Fast Text-Driven Approach for Generating Artistic Content

In this work, we propose a complete framework that generates visual art. Unlike previous stylization methods that are not flexible with style parameters (i.e., they allow stylization with only one style image, a single stylization text or…

Computer Vision and Pattern Recognition · Computer Science 2025-08-08 Marian Lupascu, Ryan Murdock, Ionut Mironica, Yijun Li

Dialogue Director: Bridging the Gap in Dialogue Visualization for Multimodal Storytelling

Recent advances in AI-driven storytelling have enhanced video generation and story visualization. However, translating dialogue-centric scripts into coherent storyboards remains a significant challenge due to limited script detail,…

Computer Vision and Pattern Recognition · Computer Science 2024-12-31 Min Zhang, Zilin Wang, Liyan Chen, Kunhong Liu, Juncong Lin