Related papers: Image-Based Geolocation Using Large Vision-Languag…

200 papers

Objectives: The rapid advancement of Multimodal Large Language Models (MLLMs) has significantly enhanced their reasoning capabilities, enabling a wide range of intelligent applications. However, these advancements also raise critical…

Computer Vision and Pattern Recognition · Computer Science 2025-07-01 Xian Zhang, Xiang Cheng

Geo-localization is the task of identifying the location of an image using visual cues alone. It has beneficial applications, such as improving disaster response, enhancing navigation, and geography education. Recently, Vision-Language…

Computer Vision and Pattern Recognition · Computer Science 2025-08-28 Oliver Grainge, Sania Waheed, Jack Stilgoe, Michael Milford, Shoaib Ehsan

Images shared on social media often expose geographic cues. While early geolocation methods required expert effort and lacked generalization, the rise of Large Vision Language Models (LVLMs) now enables accurate geolocation even for…

Cryptography and Security · Computer Science 2025-12-01 Xinyu Zhang, Yixin Wu, Boyang Zhang, Chenhao Lin, Chao Shen, Michael Backes, Yang Zhang

The prevalence of Vision-Language Models (VLMs) raises important questions about privacy in an era where visual information is increasingly available. While foundation VLMs demonstrate broad knowledge and learned capabilities, we…

Computer Vision and Pattern Recognition · Computer Science 2025-02-21 Neel Jay, Hieu Minh Nguyen, Trung Dung Hoang, Jacob Haimes

Vision-language models (VLMs) have demonstrated strong performance in image geolocation, a capability further sharpened by frontier multimodal large reasoning models (MLRMs). This poses a significant privacy risk, as these widely accessible…

Cryptography and Security · Computer Science 2026-02-19 Ruixin Yang, Ethan Mendes, Arthur Wang, James Hays, Sauvik Das, Wei Xu, Alan Ritter

This work tackles the problem of geo-localization with a new paradigm using a large vision-language model (LVLM) augmented with human inference knowledge. A primary challenge here is the scarcity of data for training the LVLM - existing…

Computer Vision and Pattern Recognition · Computer Science 2025-11-25 Ling Li, Yu Ye, Yao Zhou, Bingchuan Jiang, Wei Zeng

Geolocation, the task of identifying an image's location, requires complex reasoning and is crucial for navigation, monitoring, and cultural preservation. However, current methods often produce coarse, imprecise, and non-interpretable…

Computer Vision and Pattern Recognition · Computer Science 2026-01-07 Zirui Song, Jingpu Yang, Yuan Huang, Jonathan Tonglet, Zeyu Zhang, Tao Cheng, Meng Fang, Iryna Gurevych, Xiuying Chen

Vision Language Models (VLMs) are rapidly advancing in their capability to answer information-seeking questions. As these models are widely deployed in consumer applications, they could lead to new privacy risks due to emergent abilities to…

Computation and Language · Computer Science 2024-10-18 Ethan Mendes, Yang Chen, James Hays, Sauvik Das, Wei Xu, Alan Ritter

The advances in Vision-Language models (VLMs) offer exciting opportunities for robotic applications involving image geo-localization, the problem of identifying the geo-coordinates of a place based on visual data only. Recent research works…

Computer Vision and Pattern Recognition · Computer Science 2025-01-29 Sania Waheed, Bruno Ferrarini, Michael Milford, Sarvapali D. Ramchurn, Shoaib Ehsan

Image geolocalization, the task of identifying the geographic location depicted in an image, is important for applications in crisis response, digital forensics, and location-based intelligence. While recent advances in large language…

Computer Vision and Pattern Recognition · Computer Science 2026-05-19 Lingyao Li, Runlong Yu, Qikai Hu, Bowei Li, Min Deng, Yang Zhou, Xiaowei Jia

Image geolocalization has traditionally been addressed through retrieval-based place recognition or geometry-based visual localization pipelines. Recent advances in Vision-Language Models (VLMs) have demonstrated strong zero-shot reasoning…

Computer Vision and Pattern Recognition · Computer Science 2026-04-20 Siddhant Bharadwaj, Ashish Vashist, Fahimul Aleem, Shruti Vyas

Large Visual Language Models (LVLMs) now pose a serious yet overlooked privacy threat, as they can infer a social media user's geolocation directly from shared images, leading to unintended privacy leakage. While adversarial image…

Computer Vision and Pattern Recognition · Computer Science 2025-11-18 Jiayi Zhu, Yihao Huang, Yue Cao, Xiaojun Jia, Qing Guo, Felix Juefei-Xu, Geguang Pu, Bin Wang

Web search engines have long served as indispensable tools for information retrieval; user behavior and query formulation strategies have been well studied. The introduction of search engines powered by large language models (LLMs)…

Information Retrieval · Computer Science 2024-01-19 Albatool Wazzan, Stephen MacNeil, Richard Souvenir

Visual-Language Models (VLMs) have shown remarkable performance across various tasks, particularly in recognizing geographic information from images. However, VLMs still show regional biases in this task. To systematically evaluate these…

Computer Vision and Pattern Recognition · Computer Science 2025-09-09 Jingyuan Huang, Jen-tse Huang, Ziyi Liu, Xiaoyuan Liu, Wenxuan Wang, Jieyu Zhao

While numerous recent benchmarks focus on evaluating generic Vision-Language Models (VLMs), they do not effectively address the specific challenges of geospatial applications. Generic VLM benchmarks are not designed to handle the…

Computer Vision and Pattern Recognition · Computer Science 2025-03-14 Muhammad Sohail Danish, Muhammad Akhtar Munir, Syed Roshaan Ali Shah, Kartik Kuckreja, Fahad Shahbaz Khan, Paolo Fraccaro, Alexandre Lacoste, Salman Khan

Recent advances in multi-modal large reasoning models (MLRMs) have shown significant ability to interpret complex visual content. While these models enable impressive reasoning capabilities, they also introduce novel and underexplored…

Cryptography and Security · Computer Science 2026-03-04 Weidi Luo, Tianyu Lu, Qiming Zhang, Xiaogeng Liu, Bin Hu, Yue Zhao, Jieyu Zhao, Song Gao, Patrick McDaniel, Zhen Xiang, Chaowei Xiao

Vision-Language Foundation Models (VLFMs) have made remarkable progress on various multimodal tasks, such as image captioning, image-text retrieval, visual question answering, and visual grounding. However, most methods rely on training…

Computer Vision and Pattern Recognition · Computer Science 2026-01-06 Yue Zhou, Zhihang Zhong, Xue Yang

Vision Language Models (VLMs) are good at recognizing the global location of a photograph -- their geolocation prediction accuracy rivals the best human experts. But many VLMs are startlingly bad at explaining which image evidence…

Computer Vision and Pattern Recognition · Computer Science 2026-04-21 Mohit Talreja, Joshua Diao, Jim Thannikary James, Radu Casapu, Tejas Santanam, Ethan Mendes, Alan Ritter, Wei Xu, James Hays

Cross-view geo-localisation identifies the coarse geographical position of an automated vehicle by matching a ground-level image to a geo-tagged satellite image from a database. Despite the advancements in cross-view geo-localisation,…

Computer Vision and Pattern Recognition · Computer Science 2025-05-21 Barkin Dagda, Muhammad Awais, Saber Fallah

Global geolocation, which seeks to predict the geographical location of images captured anywhere in the world, is one of the most challenging tasks in the field of computer vision. In this paper, we introduce an innovative interactive…

Computer Vision and Pattern Recognition · Computer Science 2025-04-21 Zhiyang Dou, Zipeng Wang, Xumeng Han, Guorong Li, Zhipei Huang, Zhenjun Han