Related papers

Related papers: DiffAugment: Diffusion based Long-Tailed Visual Re…

200 papers

Visual relation detection (VRD) aims to identify relationships (or interactions) between object pairs in an image. Although recent VRD models have achieved impressive performance, they are all restricted to pre-defined relation categories,…

Computer Vision and Pattern Recognition · Computer Science 2025-04-17 Kaifeng Gao, Siqi Chen, Hanwang Zhang, Jun Xiao, Yueting Zhuang, Qianru Sun

The challenge in fine-grained visual categorization lies in how to explore the subtle differences between different subclasses and achieve accurate discrimination. Previous research has relied on large-scale annotated data and pre-trained…

Computer Vision and Pattern Recognition · Computer Science 2024-05-16 Tianxu Wu, Shuo Ye, Shuhuang Chen, Qinmu Peng, Xinge You

We address the problem of Visual Relationship Detection (VRD) which aims to describe the relationships between pairs of objects in the form of triplets of (subject, predicate, object). We observe that given a pair of bounding box proposals,…

Computer Vision and Pattern Recognition · Computer Science 2019-11-25 Mohammed Haroon Dupty, Zhen Zhang, Wee Sun Lee

Visual Relation Detection (VRD) aims to detect relationships between objects for image understanding. Most existing VRD methods rely on thousands of training samples of each relationship to achieve satisfactory performance. Some recent…

Computer Vision and Pattern Recognition · Computer Science 2023-03-10 Tianyu Yu, Yangning Li, Jiaoyan Chen, Yinghui Li, Hai-Tao Zheng, Xi Chen, Qingbin Liu, Wenqiang Liu, Dongxiao Huang, Bei Wu, Yexin Wang

We introduce DiffAug, a simple and efficient diffusion-based augmentation technique to train image classifiers for the crucial yet challenging goal of improved classifier robustness. Applying DiffAug to a given example consists of one…

Computer Vision and Pattern Recognition · Computer Science 2024-05-30 Chandramouli Sastry, Sri Harsha Dumpala, Sageev Oore

Detecting visual relationships, i.e. <Subject, Predicate, Object> triplets, is a challenging Scene Understanding task approached in the past via linguistic priors or spatial information in a single feature branch. We introduce a new deeply…

Computer Vision and Pattern Recognition · Computer Science 2019-02-18 Nikolaos Gkanatsios, Vassilis Pitsikalis, Petros Koutras, Athanasia Zlatintsi, Petros Maragos

Diffusion models have shown preliminary success in the virtual try-on (VTON) task. The typical dual-branch architecture comprises two UNets for implicit garment deformation and synthesized image generation respectively, and has emerged as the…

Computer Vision and Pattern Recognition · Computer Science 2025-05-23 Siqi Wan, Jingwen Chen, Yingwei Pan, Ting Yao, Tao Mei

The visual relationship recognition (VRR) task aims at understanding the pairwise visual relationships between interacting objects in an image. These relationships typically have a long-tail distribution due to their compositional nature.…

Computer Vision and Pattern Recognition · Computer Science 2022-03-30 Jun Chen, Aniket Agarwal, Sherif Abdelkarim, Deyao Zhu, Mohamed Elhoseiny

Text-to-image diffusion models are a class of deep generative models that have demonstrated an impressive capacity for high-quality image generation. However, these models are susceptible to implicit biases that arise from web-scale…

Computer Vision and Pattern Recognition · Computer Science 2024-01-24 Yinan Zhang, Eric Tzeng, Yilun Du, Dmitry Kislyuk

This paper proposes a new pipeline for long-tail (LT) recognition. Instead of re-weighting or re-sampling, we utilize the long-tailed dataset itself to generate a balanced proxy that can be optimized through cross-entropy (CE).…

Computer Vision and Pattern Recognition · Computer Science 2024-03-11 Jie Shao, Ke Zhu, Hanxiao Zhang, Jianxin Wu

Long-tailed imbalance distribution is a common issue in practical computer vision applications. Previous works proposed methods to address this problem, which can be categorized into several classes: re-sampling, re-weighting, transfer…

Computer Vision and Pattern Recognition · Computer Science 2024-04-24 Pengxiao Han, Changkun Ye, Jieming Zhou, Jing Zhang, Jie Hong, Xuesong Li

Diffusion models excel at modeling complex data distributions, including those of images, proteins, and small molecules. However, in many cases, our goal is to model parts of the distribution that maximize certain properties: for example,…

Diffusion models gain increasing popularity for their generative capabilities. Recently, there have been surging needs to generate customized images by inverting diffusion models from exemplar images, and existing inversion methods mainly…

Computer Vision and Pattern Recognition · Computer Science 2024-12-03 Ziqi Huang, Tianxing Wu, Yuming Jiang, Kelvin C. K. Chan, Ziwei Liu

Image retouching aims to enhance the visual quality of photos. Considering the different aesthetic preferences of users, the target of retouching is subjective. However, current retouching methods mostly adopt deterministic models, which…

Computer Vision and Pattern Recognition · Computer Science 2024-07-08 Zheng-Peng Duan, Jiawei Zhang, Zheng Lin, Xin Jin, Dongqing Zou, Chunle Guo, Chongyi Li

One of the most difficult tasks in scene understanding is recognizing interactions between objects in an image. This task is often called visual relationship detection (VRD). We consider the question of whether, given auxiliary textual data…

Computer Vision and Pattern Recognition · Computer Science 2019-10-29 Gal Sadeh Kenigsfield, Ran El-Yaniv

Virtual try-on (VTON) aims to synthesize realistic images of a person wearing a target garment, with broad applications in e-commerce and digital fashion. While recent advances in latent diffusion models have substantially improved visual…

Computer Vision and Pattern Recognition · Computer Science 2025-07-01 Xiang Xu

Long-tailed class imbalance remains a fundamental obstacle in semantic segmentation of high-resolution remote-sensing imagery, where dominant classes shape learned representations and rare classes are systematically under-segmented. This…

Computer Vision and Pattern Recognition · Computer Science 2026-04-28 Buddhi Wijenayake, Nichula Wasalathilake, Roshan Godaliyadda, Vijitha Herath, Parakrama Ekanayake, Vishal M. Patel

Detecting visual anomalies in diverse, multi-class real-world images is a significant challenge. We introduce \ours, a novel unsupervised multi-class visual anomaly detection framework. It integrates a Latent Diffusion Model (LDM) with a…

Computer Vision and Pattern Recognition · Computer Science 2025-11-12 Samet Hicsonmez, Abd El Rahman Shabayek, Djamila Aouada

Applications of diffusion models for visual tasks have been quite noteworthy. This paper targets making classification models more robust to occlusions for the task of object recognition by proposing a pipeline that utilizes a frozen…

Computer Vision and Pattern Recognition · Computer Science 2025-04-14 Rupayan Mallick, Sibo Dong, Nataniel Ruiz, Sarah Adel Bargal

Diffusion models have emerged as powerful generative tools across various domains, yet tailoring pre-trained models to exhibit specific desirable properties remains challenging. While reinforcement learning (RL) offers a promising…

Computer Vision and Pattern Recognition · Computer Science 2025-06-03 Fengyuan Dai, Zifeng Zhuang, Yufei Huang, Siteng Huang, Bangyan Liao, Donglin Wang, Fajie Yuan