Related papers: Relation Rectification in Diffusion Model

200 papers

Images produced by text-to-image diffusion models might not always faithfully represent the semantic intent of the provided text prompt, where the model might overlook or entirely fail to produce certain objects. Existing solutions often…

Computer Vision and Pattern Recognition · Computer Science 2023-12-12 Tuna Han Salih Meral, Enis Simsar, Federico Tombari, Pinar Yanardag

Large text-to-image diffusion models have impressive capabilities in generating photorealistic images from text prompts. How to effectively guide or control these powerful models to perform different downstream tasks becomes an important…

Computer Vision and Pattern Recognition · Computer Science 2024-03-15 Zeju Qiu, Weiyang Liu, Haiwen Feng, Yuxuan Xue, Yao Feng, Zhen Liu, Dan Zhang, Adrian Weller, Bernhard Schölkopf

Existing approaches for controlling text-to-image diffusion models, while powerful, do not allow for explicit 3D object-centric control, such as precise control of object orientation. In this work, we address the problem of multi-object…

Computer Vision and Pattern Recognition · Computer Science 2025-04-11 Rishubh Parihar, Vaibhav Agrawal, Sachidanand VS, R. Venkatesh Babu

Hypergraph neural networks (HNNs) using neural networks to encode hypergraphs provide a promising way to model higher-order relations in data and further solve relevant prediction tasks built upon such higher-order relations. However,…

Machine Learning · Computer Science 2023-02-16 Peihao Wang, Shenghao Yang, Yunyu Liu, Zhangyang Wang, Pan Li

This work aims to improve the applicability of diffusion models in realistic image restoration. Specifically, we enhance the diffusion model in several aspects such as network architecture, noise level, denoising steps, training image size,…

Computer Vision and Pattern Recognition · Computer Science 2023-04-18 Ziwei Luo, Fredrik K. Gustafsson, Zheng Zhao, Jens Sjölund, Thomas B. Schön

Text-to-video diffusion models have enabled high-quality video synthesis, yet often fail to generate temporally coherent and physically plausible motion. A key reason is the models' insufficient understanding of complex motions that natural…

Computer Vision and Pattern Recognition · Computer Science 2025-10-23 Aritra Bhowmik, Denis Korzhenkov, Cees G. M. Snoek, Amirhossein Habibian, Mohsen Ghafoorian

Heterogeneous graphs are pervasive in practical scenarios, where each graph consists of multiple types of nodes and edges. Representation learning on heterogeneous graphs aims to obtain low-dimensional node representations that could…

Machine Learning · Computer Science 2021-01-01 Le Yu, Leilei Sun, Bowen Du, Chuanren Liu, Weifeng Lv, Hui Xiong

Diffusion models have emerged as the leading approach for text-to-image generation. However, their iterative sampling process, which gradually morphs random noise into coherent images, introduces significant latency that limits their…

Computer Vision and Pattern Recognition · Computer Science 2026-02-16 Peijie Qiu, Hariharan Ramshankar, Arnau Ramisa, René Vidal, Amit Kumar K C, Vamsi Salaka, Rahul Bhagat

We focus on graph-to-sequence learning, which can be framed as transducing graph structures to sequences for text generation. To capture structural information associated with graphs, we investigate the problem of encoding graphs using…

Computation and Language · Computer Science 2019-09-10 Zhijiang Guo, Yan Zhang, Zhiyang Teng, Wei Lu

Text-to-image (T2I) diffusion models, when fine-tuned on a few personal images, can generate visuals with a high degree of consistency. However, such fine-tuned models are not robust; they often fail to compose with concepts of pretrained…

Computer Vision and Pattern Recognition · Computer Science 2024-12-13 Kyungmin Lee, Sangkyung Kwak, Kihyuk Sohn, Jinwoo Shin

Text-to-image diffusion models can generate visually stunning images, yet, controlling what appears and how it appears, remains surprisingly difficult, especially when operating solely within the constraints of the text-conditioning space.…

Computer Vision and Pattern Recognition · Computer Science 2026-05-11 Arani Roy, Shristi Das Biswas, Kaushik Roy

Remote sensing images are highly valued for their ability to address complex real-world issues such as risk management, security, and meteorology. However, manually captioning these images is challenging and requires specialized knowledge…

Machine Learning · Computer Science 2025-02-07 Swadhin Das, Raksha Sharma

Diffusion models have become a successful approach for solving various image inverse problems by providing a powerful diffusion prior. Many studies tried to combine the measurement into diffusion by score function replacement, matrix…

Computer Vision and Pattern Recognition · Computer Science 2024-05-20 Hanyu Chen, Zhixiu Hao, Liying Xiao

Knowledge graphs enable a wide variety of applications, including question answering and information retrieval. Despite the great effort invested in their creation and maintenance, even the largest (e.g., Yago, DBPedia or Wikidata) remain…

Machine Learning · Statistics 2017-10-30 Michael Schlichtkrull, Thomas N. Kipf, Peter Bloem, Rianne van den Berg, Ivan Titov, Max Welling

Despite the ability of existing large-scale text-to-image (T2I) models to generate high-quality images from detailed textual descriptions, they often lack the ability to precisely edit the generated or real images. In this paper, we propose…

Computer Vision and Pattern Recognition · Computer Science 2023-11-21 Chong Mou, Xintao Wang, Jiechong Song, Ying Shan, Jian Zhang

Deep domain adaptation methods have achieved appealing performance by learning transferable representations from a well-labeled source domain to a different but related unlabeled target domain. Most existing works assume source and target…

Computer Vision and Pattern Recognition · Computer Science 2020-04-13 Shuang Li, Chi Harold Liu, Qiuxia Lin, Qi Wen, Limin Su, Gao Huang, Zhengming Ding

One of the key issues of Visual Question Answering (VQA) is to reason with semantic clues in the visual content under the guidance of the question; how to model relational semantics remains a great challenge. To fully capture…

Multimedia · Computer Science 2019-08-22 Zhuoqian Yang, Zengchang Qin, Jing Yu, Yue Hu

Distantly supervised datasets for relation extraction mostly focus on sentence-level extraction, and they cover very few relations. In this work, we propose cross-document relation extraction, where the two entities of a relation tuple…

Computation and Language · Computer Science 2021-08-24 Tapas Nayak, Hwee Tou Ng

Dependency trees convey rich structural information that is proven useful for extracting relations among entities in text. However, how to effectively make use of relevant information while ignoring irrelevant information from the…

Computation and Language · Computer Science 2020-09-08 Zhijiang Guo, Yan Zhang, Wei Lu

The objective of image manipulation detection is to identify and locate the manipulated regions in the images. Recent approaches mostly adopt the sophisticated Convolutional Neural Networks (CNNs) to capture the tampering artifacts left in…

Computer Vision and Pattern Recognition · Computer Science 2022-01-19 Wenyan Pan, Zhili Zhou, Miaogen Ling, Xin Geng, Q. M. Jonathan Wu