Related papers: Multimodal Query-guided Object Localization

200 papers

We introduce the novel problem of localizing all the instances of an object (seen or unseen during training) in a natural image via a sketch query. We refer to this problem as sketch-guided object localization. This problem is distinctively…

Computer Vision and Pattern Recognition · Computer Science 2020-08-18 Aditay Tripathi, Rajath R Dani, Anand Mishra, Anirban Chakraborty

In this work, we investigate the problem of sketch-based object localization on natural images, where given a crude hand-drawn sketch of an object, the goal is to localize all the instances of the same object on the target image. This…

Computer Vision and Pattern Recognition · Computer Science 2023-03-16 Aditay Tripathi, Anand Mishra, Anirban Chakraborty

This work investigates the problem of sketch-guided object localization (SGOL), where human sketches are used as queries to conduct the object localization in natural images. In this cross-modal setting, we first contribute with a…

Computer Vision and Pattern Recognition · Computer Science 2021-09-27 Pau Riba, Sounak Dey, Ali Furkan Biten, Josep Llados

In this work, we introduce a cross-modal image retrieval system that allows both text and sketch as input modalities for the query. A cross-modal deep network architecture is formulated to jointly model the sketch and text input modalities…

Computer Vision and Pattern Recognition · Computer Science 2018-05-01 Sounak Dey, Anjan Dutta, Suman K. Ghosh, Ernest Valveny, Josep Lladós, Umapada Pal

Most existing image retrieval systems use text queries as a way for the user to express what they are looking for. However, fine-grained image retrieval often requires the ability to also express where in the image the content they are…

Computer Vision and Pattern Recognition · Computer Science 2021-08-26 Soravit Changpinyo, Jordi Pont-Tuset, Vittorio Ferrari, Radu Soricut

We introduce Sketch-based Video Object Localization (SVOL), a new task aimed at localizing spatio-temporal object boxes in video queried by the input sketch. We first outline the challenges in the SVOL task and build the Sketch-Video…

Computer Vision and Pattern Recognition · Computer Science 2023-11-30 Sangmin Woo, So-Yeong Jeon, Jinyoung Park, Minji Son, Sumin Lee, Changick Kim

Sketches, with their expressive potential, allow humans to convey the essence of an object through even a rough contour. For the first time, we harness this expressive potential to improve segmentation performance in challenging tasks like…

Computer Vision and Pattern Recognition · Computer Science 2025-02-03 Ying Zang, Runlong Cao, Jianqi Zhang, Yidong Han, Ziyue Cao, Wenjun Hu, Didi Zhu, Lanyun Zhu, Zejian Li, Deyi Ji, Tianrun Chen

Visual object localization is the key step in a series of object detection tasks. In the literature, high localization accuracy is achieved with the mainstream strongly supervised frameworks. However, such methods require object-level…

Computer Vision and Pattern Recognition · Computer Science 2021-08-10 Yi-Geng Hong, Hui-Chu Xiao, Wan-Lei Zhao

Multimodal referring segmentation aims to segment target objects in visual scenes, such as images, videos, and 3D scenes, based on referring expressions in text or audio format. This task plays a crucial role in practical applications…

Computer Vision and Pattern Recognition · Computer Science 2025-08-06 Henghui Ding, Song Tang, Shuting He, Chang Liu, Zuxuan Wu, Yu-Gang Jiang

Keypoint detection, integral to modern machine perception, faces challenges in few-shot learning, particularly when source data from the same distribution as the query is unavailable. This gap is addressed by leveraging sketches, a popular…

Computer Vision and Pattern Recognition · Computer Science 2025-07-29 Subhajit Maity, Ayan Kumar Bhunia, Subhadeep Koley, Pinaki Nath Chowdhury, Aneeshan Sain, Yi-Zhe Song

Object discovery, which refers to the task of localizing objects without human annotations, has gained significant attention in 2D image analysis. However, despite this growing interest, it remains under-explored in 3D data, where…

Computer Vision and Pattern Recognition · Computer Science 2025-03-20 Saad Lahlali, Sandra Kara, Hejer Ammar, Florian Chabot, Nicolas Granger, Hervé Le Borgne, Quoc-Cuong Pham

We introduce a novel problem, i.e., the localization of an input image within a multi-modal reference map represented by a database of 3D scene graphs. These graphs comprise multiple modalities, including object-level point clouds, images,…

Computer Vision and Pattern Recognition · Computer Science 2024-07-15 Yang Miao, Francis Engelmann, Olga Vysotska, Federico Tombari, Marc Pollefeys, Dániel Béla Baráth

In this paper, we are interested in the problem of generating target grasps by understanding freehand sketches. Sketches are useful for people who cannot formulate language and in cases where a textual description is not available on…

Robotics · Computer Science 2022-05-10 Haitao Lin, Chilam Cheang, Yanwei Fu, Xiangyang Xue

The study of eye gaze fixations on photographic images is an active research area. In contrast, the image subcategory of freehand sketches has not received as much attention for such studies. In this paper, we analyze the results of a…

Computer Vision and Pattern Recognition · Computer Science 2017-08-09 Ravi Kiran Sarvadevabhatla, Sudharshan Suresh, R. Venkatesh Babu

With the human pursuit of knowledge, open-set object detection (OSOD) has been designed to identify unknown objects in a dynamic world. However, an issue with the current setting is that all the predicted unknown objects share the same…

Computer Vision and Pattern Recognition · Computer Science 2022-04-13 Jiyang Zheng, Weihao Li, Jie Hong, Lars Petersson, Nick Barnes

Proliferation of touch-based devices has made sketch-based image retrieval practical. While many methods exist for sketch-based object detection/image retrieval on small datasets, relatively less work has been done on large (web)-scale…

Computer Vision and Pattern Recognition · Computer Science 2015-11-03 Sarthak Parui, Anurag Mittal

Object localization is an important task in computer vision but requires a large amount of computational power due mainly to an exhaustive multiscale search on the input image. In this paper, we describe a near real-time multiscale search…

Computer Vision and Pattern Recognition · Computer Science 2016-04-14 Hyungtae Lee, Heesung Kwon, Archith J. Bency, William D. Nothwang

3D object detection from multi-view images has drawn much attention over the past few years. Existing methods mainly establish 3D representations from multi-view images and adopt a dense detection head for object detection, or employ object…

Computer Vision and Pattern Recognition · Computer Science 2023-11-07 Zitian Wang, Zehao Huang, Jiahui Fu, Naiyan Wang, Si Liu

Existing object localization methods are tailored to locate specific classes of objects, relying heavily on abundant labeled data for model optimization. However, acquiring large amounts of labeled data is challenging in many real-world…

Computer Vision and Pattern Recognition · Computer Science 2024-06-06 Yunhan Ren, Bo Li, Chengyang Zhang, Yong Zhang, Baocai Yin

Text-to-image models give rise to workflows which often begin with an exploration step, where users sift through a large collection of generated images. The global nature of the text-to-image generation process prevents users from narrowing…

Computer Vision and Pattern Recognition · Computer Science 2023-08-15 Or Patashnik, Daniel Garibi, Idan Azuri, Hadar Averbuch-Elor, Daniel Cohen-Or