Related papers: Language-Driven Interactive Shadow Detection

Revisiting Shadow Detection from a Vision-Language Perspective

Shadow detection is commonly formulated as a vision-driven dense prediction problem, where models rely primarily on pixel-wise visual supervision to distinguish shadows from non-shadow regions. However, this formulation can become…

Computer Vision and Pattern Recognition · Computer Science · 2026-05-13 · Yonghui Wang, Wengang Zhou, Hao Feng, Houqiang Li

Triple-cooperative Video Shadow Detection

Shadow detection in a single image has received significant research interest in recent years. However, much fewer works have been explored in shadow detection over dynamic scenes. The bottleneck is the lack of a well-established dataset…

Computer Vision and Pattern Recognition · Computer Science · 2021-03-12 · Zhihao Chen, Liang Wan, Lei Zhu, Jia Shen, Huazhu Fu, Wennan Liu, Jing Qin

Fine-Context Shadow Detection using Shadow Removal

Current shadow detection methods perform poorly when detecting shadow regions that are small, unclear or have blurry edges. In this work, we attempt to address this problem on two fronts. First, we propose a Fine Context-aware Shadow…

Computer Vision and Pattern Recognition · Computer Science · 2021-11-30 · Jeya Maria Jose Valanarasu, Vishal M. Patel

Sliding Window Recurrent Network for Efficient Video Super-Resolution

Video super-resolution (VSR) is the task of restoring high-resolution frames from a sequence of low-resolution inputs. Different from single image super-resolution, VSR can utilize frames' temporal information to reconstruct results with…

Image and Video Processing · Electrical Eng. & Systems · 2022-08-25 · Wenyi Lian, Wenjing Lian

Prompt-Aware Controllable Shadow Removal

Shadow removal aims to restore the image content in shadowed regions. While deep learning-based methods have shown promising results, they still face key challenges: 1) uncontrolled removal of all shadows, or 2) controllable removal but…

Computer Vision and Pattern Recognition · Computer Science · 2025-02-04 · Kerui Chen, Zhiliang Wu, Wenjin Hou, Kun Li, Hehe Fan, Yi Yang

Text Descriptions are Compressive and Invariant Representations for Visual Learning

Modern image classification is based upon directly predicting classes via large discriminative networks, which do not directly contain information about the intuitive visual features that may constitute a classification decision. Recently,…

Computer Vision and Pattern Recognition · Computer Science · 2023-10-31 · Zhili Feng, Anna Bair, J. Zico Kolter

Learning Physical-Spatio-Temporal Features for Video Shadow Removal

Shadow removal in a single image has received increasing attention in recent years. However, removing shadows over dynamic scenes remains largely under-explored. In this paper, we propose the first data-driven video shadow removal model,…

Computer Vision and Pattern Recognition · Computer Science · 2023-03-17 · Zhihao Chen, Liang Wan, Yefan Xiao, Lei Zhu, Huazhu Fu

Video Instance Shadow Detection Under the Sun and Sky

Instance shadow detection, crucial for applications such as photo editing and light direction estimation, has undergone significant advancements in predicting shadow instances, object instances, and their associations. The extension of this…

Computer Vision and Pattern Recognition · Computer Science · 2024-09-25 · Zhenghao Xing, Tianyu Wang, Xiaowei Hu, Haoran Wu, Chi-Wing Fu, Pheng-Ann Heng

Regional Attention for Shadow Removal

Shadow, as a natural consequence of light interacting with objects, plays a crucial role in shaping the aesthetics of an image, which however also impairs the content visibility and overall visual quality. Recent shadow removal approaches…

Computer Vision and Pattern Recognition · Computer Science · 2024-11-22 · Hengxing Liu, Mingjia Li, Xiaojie Guo

Exploiting Visual Semantic Reasoning for Video-Text Retrieval

Video retrieval is a challenging research topic bridging the vision and language areas and has attracted broad attention in recent years. Previous works have been devoted to representing videos by directly encoding from frame-level…

Computer Vision and Pattern Recognition · Computer Science · 2020-06-17 · Zerun Feng, Zhimin Zeng, Caili Guo, Zheng Li

R2SM: Referring and Reasoning for Selective Masks

We introduce a new task, Referring and Reasoning for Selective Masks (R2SM), which extends text-guided segmentation by incorporating mask-type selection driven by user intent. This task challenges vision-language models to determine whether…

Computer Vision and Pattern Recognition · Computer Science · 2025-06-03 · Yu-Lin Shih, Wei-En Tai, Cheng Sun, Yu-Chiang Frank Wang, Hwann-Tzong Chen

Weakly-Supervised Referring Video Object Segmentation through Text Supervision

Referring video object segmentation (RVOS) aims to segment the target instance in a video, referred by a text expression. Conventional approaches are mostly supervised learning, requiring expensive pixel-level mask annotations. To tackle…

Computer Vision and Pattern Recognition · Computer Science · 2026-04-22 · Miaojing Shi, Jun Huang, Zijie Yue, Hanli Wang

Enhancing Sa2VA for Referent Video Object Segmentation: 2nd Solution for 7th LSVOS RVOS Track

Referential Video Object Segmentation (RVOS) aims to segment all objects in a video that match a given natural language description, bridging the gap between vision and language understanding. Recent work, such as Sa2VA, combines Large…

Computer Vision and Pattern Recognition · Computer Science · 2025-09-22 · Ran Hong, Feng Lu, Leilei Cao, An Yan, Youhai Jiang, Fengjie Zhu

Referring Change Detection in Remote Sensing Imagery

Change detection in remote sensing imagery is essential for applications such as urban planning, environmental monitoring, and disaster management. Traditional change detection methods typically identify all changes between two temporal…

Computer Vision and Pattern Recognition · Computer Science · 2025-12-15 · Yilmaz Korkmaz, Jay N. Paranjape, Celso M. de Melo, Vishal M. Patel

Natural Language Guided Visual Relationship Detection

Reasoning about the relationships between object pairs in images is a crucial task for holistic scene understanding. Most of the existing works treat this task as a pure visual classification task: each type of relationship or phrase is…

Computer Vision and Pattern Recognition · Computer Science · 2017-11-22 · Wentong Liao, Lin Shuai, Bodo Rosenhahn, Michael Ying Yang

The Role of the Input in Natural Language Video Description

Natural Language Video Description (NLVD) has recently received strong interest in the Computer Vision, Natural Language Processing (NLP), Multimedia, and Autonomous Robotics communities. The State-of-the-Art (SotA) approaches obtained…

Computer Vision and Pattern Recognition · Computer Science · 2021-02-11 · Silvia Cascianelli, Gabriele Costante, Alessandro Devo, Thomas A. Ciarfuglia, Paolo Valigi, Mario L. Fravolini

DTTNet: Improving Video Shadow Detection via Dark-Aware Guidance and Tokenized Temporal Modeling

Video shadow detection confronts two entwined difficulties: distinguishing shadows from complex backgrounds and modeling dynamic shadow deformations under varying illumination. To address shadow-background ambiguity, we leverage linguistic…

Computer Vision and Pattern Recognition · Computer Science · 2025-11-11 · Zhicheng Li, Kunyang Sun, Rui Yao, Hancheng Zhu, Fuyuan Hu, Jiaqi Zhao, Zhiwen Shao, Yong Zhou

SSVOD: Semi-Supervised Video Object Detection with Sparse Annotations

Despite significant progress in semi-supervised learning for image object detection, several key issues are yet to be addressed for video object detection: (1) Achieving good performance for supervised video object detection greatly depends…

Computer Vision and Pattern Recognition · Computer Science · 2023-11-14 · Tanvir Mahmud, Chun-Hao Liu, Burhaneddin Yaman, Diana Marculescu

Direction-aware Spatial Context Features for Shadow Detection

Shadow detection is a fundamental and challenging task, since it requires an understanding of global image semantics and there are various backgrounds around shadows. This paper presents a novel network for shadow detection by analyzing…

Computer Vision and Pattern Recognition · Computer Science · 2020-05-19 · Xiaowei Hu, Lei Zhu, Chi-Wing Fu, Jing Qin, Pheng-Ann Heng

SE-BSFV: Online Subspace Learning based Shadow Enhancement and Background Suppression for ViSAR under Complex Background

Video synthetic aperture radar (ViSAR) has attracted substantial attention in the moving target detection (MTD) field due to its ability to continuously monitor changes in the target area. In ViSAR, the moving targets' shadows will not…

Computer Vision and Pattern Recognition · Computer Science · 2025-01-17 · Shangqu Yan, Chenyang Luo, Yaowen Fu, Wenpeng Zhang, Wei Yang, Ruofeng Yu