Related papers: Semantic Manipulation Localization

Perceptual MAE for Image Manipulation Localization: A High-level Vision Learner Focusing on Low-level Features

Nowadays, multimedia forensics faces unprecedented challenges due to the rapid advancement of multimedia generation technology, thereby making Image Manipulation Localization (IML) crucial in the pursuit of truth. The key to IML lies in…

Computer Vision and Pattern Recognition · Computer Science · 2023-10-11 · Xiaochen Ma, Jizhe Zhou, Xiong Xu, Zhuohang Jiang, Chi-Man Pun

PROMPT-IML: Image Manipulation Localization with Pre-trained Foundation Models Through Prompt Tuning

Deceptive images can be shared in seconds with social networking services, posing substantial risks. Tampering traces, such as boundary artifacts and high-frequency information, have been significantly emphasized by massive networks in the…

Computer Vision and Pattern Recognition · Computer Science · 2024-01-02 · Xuntao Liu, Yuzhou Yang, Qichao Ying, Zhenxing Qian, Xinpeng Zhang, Sheng Li

Beyond Fully Supervised Pixel Annotations: Scribble-Driven Weakly-Supervised Framework for Image Manipulation Localization

Deep learning-based image manipulation localization (IML) methods have achieved remarkable performance in recent years, but typically rely on large-scale pixel-level annotated datasets. To address the challenge of acquiring high-quality…

Computer Vision and Pattern Recognition · Computer Science · 2025-11-26 · Songlin Li, Guofeng Yu, Zhiqing Guo, Yunfeng Diao, Dan Ma, Gaobo Yang

IML-ViT: Benchmarking Image Manipulation Localization by Vision Transformer

Advanced image tampering techniques are increasingly challenging the trustworthiness of multimedia, leading to the development of Image Manipulation Localization (IML). But what makes a good IML model? The answer lies in the way to capture…

Computer Vision and Pattern Recognition · Computer Science · 2024-11-27 · Xiaochen Ma, Bo Du, Zhuohang Jiang, Xia Du, Ahmed Y. Al Hammadi, Jizhe Zhou

The Courtroom Trial of Pixels: Robust Image Manipulation Localization via Adversarial Evidence and Reinforcement Learning Judgment

Although some existing image manipulation localization (IML) methods incorporate authenticity-related supervision, this information is typically utilized merely as an auxiliary training signal to enhance the model's sensitivity to…

Computer Vision and Pattern Recognition · Computer Science · 2026-04-17 · Songlin Li, Zhiqing Guo, Dan Ma, Changtao Miao, Gaobo Yang

SAPL: Semantic-Agnostic Prompt Learning in CLIP for Weakly Supervised Image Manipulation Localization

Malicious image manipulation threatens public safety and requires efficient localization methods. Existing approaches depend on costly pixel-level annotations which make training expensive. Existing weakly supervised methods rely only on…

Computer Vision and Pattern Recognition · Computer Science · 2026-01-13 · Xinghao Wang, Changtao Miao, Dianmo Sheng, Tao Gong, Qi Chu, Nenghai Yu, Quanchen Zou, Deyue Zhang, Xiangzheng Zhang

From Passive Perception to Active Memory: A Weakly Supervised Image Manipulation Localization Framework Driven by Coarse-Grained Annotations

Image manipulation localization (IML) faces a fundamental trade-off between minimizing annotation cost and achieving fine-grained localization accuracy. Existing fully-supervised IML methods depend heavily on dense pixel-level mask…

Computer Vision and Pattern Recognition · Computer Science · 2025-11-26 · Zhiqing Guo, Dongdong Xi, Songlin Li, Gaobo Yang

Multi-modal Semantic SLAM for Complex Dynamic Environments

Simultaneous Localization and Mapping (SLAM) is one of the most essential techniques in many real-world robotic applications. The assumption of static environments is common in most SLAM algorithms, which, however, is not the case for most…

Robotics · Computer Science · 2022-05-17 · Han Wang, Jing Ying Ko, Lihua Xie

Bridging Semantic Logic Gaps: A Cognition Inspired Multimodal Boundary Preserving Network for Image Manipulation Localization

Existing image manipulation localization (IML) models mainly rely on visual cues but ignore the semantic logical relationships between content features. In fact, the content semantics conveyed by real images often conform to human…

Computer Vision and Pattern Recognition · Computer Science · 2025-10-08 · Songlin Li, Zhiqing Guo, Yuanman Li, Zeyu Li, Yunfeng Diao, Gaobo Yang, Liejun Wang

Omni-IML: Towards Unified Image Manipulation Localization

Existing Image Manipulation Localization (IML) methods mostly rely heavily on task-specific designs, making them perform well only on the target IML task, while joint training on multiple IML tasks causes significant performance…

Computer Vision and Pattern Recognition · Computer Science · 2025-04-30 · Chenfan Qu, Yiwu Zhong, Fengjun Guo, Lianwen Jin

Propose and Rectify: A Forensics-Driven MLLM Framework for Image Manipulation Localization

The increasing sophistication of image manipulation techniques demands robust forensic solutions that can both reliably detect alterations and precisely localize tampered regions. Recent Multimodal Large Language Models (MLLMs) show promise…

Computer Vision and Pattern Recognition · Computer Science · 2025-08-26 · Keyang Zhang, Chenqi Kong, Hui Liu, Bo Ding, Xinghao Jiang, Haoliang Li

SIGMA: Semantic-Difference Instruction-Grounding Mask Annotator for Text-Driven Image Manipulation Localization

Text-driven image editing has advanced rapidly, but reliably localizing these manipulations requires image manipulation localization (IML) models trained on large pixel-annotated datasets, and there is still no low-cost way to obtain such…

Computer Vision and Pattern Recognition · Computer Science · 2026-05-28 · Peiyu Zhuang, Jianquan Yang, Haodong Li, Zhuoying Cai, Ruitao Xie, Jishen Zeng, Baoying Chen, Jiwu Huang, Xiaochun Cao

Image-to-Text Translation for Interactive Image Recognition: A Comparative User Study with Non-Expert Users

Interactive machine learning (IML) allows users to build their custom machine learning models without expert knowledge. While most existing IML systems are designed with classification algorithms, they sometimes oversimplify the…

Human-Computer Interaction · Computer Science · 2024-04-16 · Wataru Kawabe, Yusuke Sugano

Locate-Then-Examine: Grounded Region Reasoning Improves Detection of AI-Generated Images

The rapid growth of AI-generated imagery has blurred the boundary between real and synthetic content, raising practical concerns for digital integrity. Vision-language models (VLMs) can provide natural language explanations, but standard…

Computer Vision and Pattern Recognition · Computer Science · 2026-04-23 · Yikun Ji, Yan Hong, Bowen Deng, Jun Lan, Huijia Zhu, Weiqiang Wang, Liqing Zhang, Jianfu Zhang

LLM-Augmented Semantic Steering of Text Embedding Projection Spaces

Low-dimensional projections of text embeddings support visual analysis of document collections, but their spatial organization may not reflect the relationships an analyst intends to examine. Existing semantic interaction approaches encode…

Human-Computer Interaction · Computer Science · 2026-05-05 · Wei Liu, Eric Krokos, Kirsten Whitley, Rebecca Faust, Chris North

Rethinking Semantic Segmentation Evaluation for Explainability and Model Selection

Semantic segmentation aims to robustly predict coherent class labels for entire regions of an image. It is a scene understanding task that powers real-world applications (e.g., autonomous navigation). One important application, the use of…

Computer Vision and Pattern Recognition · Computer Science · 2023-02-16 · Yuxiang Zhang, Sachin Mehta, Anat Caspi

See What I Mean: Aligning Vision and Language Representations for Video Fine-grained Object Understanding

We present SWIM (See What I Mean), a novel training strategy that aligns vision and language representations to enable fine-grained object understanding solely from textual prompts. Unlike existing approaches that require explicit visual…

Computer Vision and Pattern Recognition · Computer Science · 2026-05-19 · Boyuan Sun, Bowen Yin, Yuanming Li, Xihan Wei, Qibin Hou

Learning Ordinality in Semantic Segmentation

Semantic segmentation consists of predicting a semantic label for each image pixel. While existing deep learning approaches achieve high accuracy, they often overlook the ordinal relationships between classes, which can provide critical…

Computer Vision and Pattern Recognition · Computer Science · 2025-02-06 · Ricardo P. M. Cruz, Rafael Cristino, Jaime S. Cardoso

ISLE: A Framework for Image Level Semantic Segmentation Ensemble

One key bottleneck of employing state-of-the-art semantic segmentation networks in the real world is the availability of training labels. Conventional semantic segmentation networks require massive pixel-wise annotated labels to reach…

Computer Vision and Pattern Recognition · Computer Science · 2023-09-21 · Erik Ostrowski, Muhammad Shafique

Learning to Evaluate Performance of Multi-modal Semantic Localization

Semantic localization (SeLo) refers to the task of obtaining the most relevant locations in large-scale remote sensing (RS) images using semantic information such as text. As an emerging task based on cross-modal retrieval, SeLo achieves…

Computer Vision and Pattern Recognition · Computer Science · 2022-09-20 · Zhiqiang Yuan, Wenkai Zhang, Chongyang Li, Zhaoying Pan, Yongqiang Mao, Jialiang Chen, Shouke Li, Hongqi Wang, Xian Sun