Related papers: Behavior Optimized Image Generation

200 papers

While Multimodal Large Language Models (MLLMs) are adept at answering what is in an image (identifying objects and describing scenes), they often lack the ability to understand how an image feels to a human observer. This gap is most evident…

Computer Vision and Pattern Recognition · Computer Science 2025-12-01 Yiming Chen, Junlin Han, Tianyi Bai, Shengbang Tong, Filippos Kokkinos, Philip Torr

In web data, advertising images are crucial for capturing user attention and improving advertising effectiveness. Most existing methods that generate backgrounds for products focus primarily on aesthetic quality, which may fail to achieve…

Conventional, classification-based AI-generated image detection methods cannot explain why an image is considered real or AI-generated in a way a human expert would, which reduces the trustworthiness and persuasiveness of these detection…

Computer Vision and Pattern Recognition · Computer Science 2025-11-18 Michael Yang, Shijian Deng, William T. Doan, Kai Wang, Tianyu Yang, Harsh Singh, Yapeng Tian

Recent advancements in generative models have revolutionized the field of artificial intelligence, enabling the creation of highly realistic and detailed images. In this study, we propose a novel Mask Conditional Text-to-Image Generative…

Computer Vision and Pattern Recognition · Computer Science 2024-10-02 Rami Skaik, Leonardo Rossi, Tomaso Fontanini, Andrea Prati

Motivated by the remarkable progress of large language models (LLMs) in objective tasks like mathematics and coding, there is growing interest in their potential to simulate human behavior--a capability with profound implications for…

Computation and Language · Computer Science 2026-01-23 Yuxuan Lei, Tianfu Wang, Jianxun Lian, Zhengyu Hu, Defu Lian, Xing Xie

Recent breakthroughs in large multimodal models (LMMs) have significantly advanced both text-to-image (T2I) generation and image-to-text (I2T) interpretation. However, many generated images still suffer from issues related to perceptual…

Computer Vision and Pattern Recognition · Computer Science 2025-04-14 Jiarui Wang, Huiyu Duan, Yu Zhao, Juntong Wang, Guangtao Zhai, Xiongkuo Min

Recent text-to-image models excel at generating high-quality object-centric images from instructions. However, images should also encapsulate rich interactions between objects, where existing models often fall short, likely due to limited…

Computer Vision and Pattern Recognition · Computer Science 2026-03-05 Xinyi Gu, Jiayuan Mao

The rapid evolution of Multi-modality Large Language Models (MLLMs) is driving significant advancements in visual understanding and generation. Nevertheless, a comprehensive assessment of their capabilities, concerning the fine-grained…

Computer Vision and Pattern Recognition · Computer Science 2025-08-06 Xiaorong Zhu, Ziheng Jia, Jiarui Wang, Xiangyu Zhao, Haodong Duan, Xiongkuo Min, Jia Wang, Zicheng Zhang, Guangtao Zhai

We introduce ImageGem, a dataset for studying generative models that understand fine-grained individual preferences. We posit that a key challenge hindering the development of such a generative model is the lack of in-the-wild and…

Computer Vision and Pattern Recognition · Computer Science 2025-10-22 Yuanhe Guo, Linxi Xie, Zhuoran Chen, Kangrui Yu, Ryan Po, Guandao Yang, Gordon Wetzstein, Hongyi Wen

Generating instructional images of human daily actions from an egocentric viewpoint serves as a key step towards efficient skill transfer. In this paper, we introduce a novel problem -- egocentric action frame generation. The goal is to…

Computer Vision and Pattern Recognition · Computer Science 2024-03-25 Bolin Lai, Xiaoliang Dai, Lawrence Chen, Guan Pang, James M. Rehg, Miao Liu

Quality assessment of AI-generated content is crucial for evaluating model capability and guiding model optimization. However, most existing quality assessment datasets and models provide only a single quality score, which is too coarse to…

Computer Vision and Pattern Recognition · Computer Science 2026-04-02 Shushi Wang, Zicheng Zhang, Chunyi Li, Wei Wang, Liya Ma, Fengjiao Chen, Xiaoyu Li, Xuezhi Cao, Guangtao Zhai, Xiaohong Liu

Automatic image generation is no longer just of interest to researchers, but also to practitioners. However, current models are sensitive to the settings used and automatic optimization methods often require human involvement. To bridge…

Computer Vision and Pattern Recognition · Computer Science 2024-11-22 Dominik Sobania, Martin Briesch, Franz Rothlauf

While state-of-the-art image generation models achieve remarkable visual quality, their internal generative processes remain a "black box." This opacity limits human observation and intervention, and poses a barrier to ensuring model…

Computer Vision and Pattern Recognition · Computer Science 2025-12-10 Young Kyung Kim, Oded Schlesinger, Yuzhou Zhao, J. Matias Di Martino, Guillermo Sapiro

Fashionable image generation aims to synthesize images of diverse fashion prevalent around the globe, helping fashion designers in real-time visualization by giving them a basic customized structure of how a specific design preference would…

Computer Vision and Pattern Recognition · Computer Science 2023-06-14 Krishna Sri Ipsit Mantri, Nevasini Sasikumar

Multimodal Large Language Models (MLLMs) are reshaping how modern agentic systems reason over sequential user-behavior data. However, whether textual or image representations of user behavior data are more effective for maximizing MLLM…

Artificial Intelligence · Computer Science 2025-11-07 Tianning Dong, Luyi Ma, Varun Vasudevan, Jason Cho, Sushant Kumar, Kannan Achan

Traditional supervised methods for detecting AI-generated images depend on large, curated datasets for training and fail to generalize to novel, out-of-domain image generators. As an alternative, we explore pre-trained Vision-Language…

Machine Learning · Computer Science 2026-01-27 Zoher Kachwala, Danishjeet Singh, Danielle Yang, Filippo Menczer

Recent text-to-image generation models have demonstrated incredible success in generating images that faithfully follow input prompts. However, the requirement of using words to describe a desired concept provides limited control over the…

Computer Vision and Pattern Recognition · Computer Science 2024-01-26 Senthil Purushwalkam, Akash Gokul, Shafiq Joty, Nikhil Naik

Inference of online social network users' attributes and interests has been an active research topic. Accurate identification of users' attributes and interests is crucial for improving the performance of personalization and recommender…

Social and Information Networks · Computer Science 2015-04-21 Quanzeng You, Sumit Bhatia, Jiebo Luo

The success of modern machine learning, particularly in facial translation networks, is highly dependent on the availability of high-quality, paired, large-scale datasets. However, acquiring sufficient data is often challenging and costly.…

Computer Vision and Pattern Recognition · Computer Science 2025-09-09 Leyang Wang, Joice Lin

This paper introduces a retrieval-augmented framework for automatic fashion caption and hashtag generation, combining multi-garment detection, attribute reasoning, and Large Language Model (LLM) prompting. The system aims to produce…

Computer Vision and Pattern Recognition · Computer Science 2025-11-25 Moazzam Umer Gondal, Hamad Ul Qudous, Daniya Siddiqui, Asma Ahmad Farhan