Related papers: Visual Zero-Shot E-Commerce Product Attribute Valu…

200 papers

Pretrained vision-language models, such as CLIP, show promising zero-shot performance across a wide variety of datasets. For closed-set classification tasks, however, there is an inherent limitation: CLIP image encoders are typically…

Computer Vision and Pattern Recognition · Computer Science 2023-09-14 Piyapat Saranrittichai, Mauricio Munoz, Volker Fischer, Chaithanya Kumar Mummadi

E-commerce websites (e.g. Amazon) have a plethora of structured and unstructured information (text and images) present on the product pages. Sellers often either don't label or mislabel values of the attributes (e.g. color, size etc.) for…

Computer Vision and Pattern Recognition · Computer Science 2023-06-02 Anant Khandelwal, Happy Mittal, Shreyas Sunil Kulkarni, Deepak Gupta

Audio-visual zero-shot learning methods commonly build on features extracted from pre-trained models, e.g. video or audio classification models. However, existing benchmarks predate the popularization of large multi-modal models, such as…

Computer Vision and Pattern Recognition · Computer Science 2024-04-10 David Kurzendörfer, Otniel-Bogdan Mercea, A. Sophia Koepke, Zeynep Akata

Visual anomaly classification and segmentation are vital for automating industrial quality inspection. The focus of prior research in the field has been on training custom models for each quality inspection task, which requires…

Computer Vision and Pattern Recognition · Computer Science 2023-03-28 Jongheon Jeong, Yang Zou, Taewan Kim, Dongqing Zhang, Avinash Ravichandran, Onkar Dabeer

Recently, large-scale vision-language models such as CLIP have demonstrated immense potential in zero-shot anomaly segmentation (ZSAS) task, utilizing a unified model to directly detect anomalies on any unseen product with painstakingly…

Computer Vision and Pattern Recognition · Computer Science 2024-07-18 Zhen Qu, Xian Tao, Mukesh Prasad, Fei Shen, Zhengtao Zhang, Xinyi Gong, Guiguang Ding

Understanding product attributes plays an important role in improving online shopping experience for customers and serves as an integral part for constructing a product knowledge graph. Most existing methods focus on attribute extraction…

Computer Vision and Pattern Recognition · Computer Science 2021-06-10 Rongmei Lin, Xiang He, Jie Feng, Nasser Zalmout, Yan Liang, Li Xiong, Xin Luna Dong

Structured product data in the form of attribute/value pairs is the foundation of many e-commerce applications such as faceted product search, product comparison, and product recommendation. Product offers often only contain textual…

Computation and Language · Computer Science 2023-06-28 Alexander Brinkmann, Roee Shraga, Reng Chiz Der, Christian Bizer

With the prosperity of e-commerce industry, various modalities, e.g., vision and language, are utilized to describe product items. It is an enormous challenge to understand such diversified data, especially via extracting the…

Computer Vision and Pattern Recognition · Computer Science 2023-04-07 Mengyin Liu, Chao Zhu, Hongyu Gao, Weibo Gu, Hongfa Wang, Wei Liu, Xu-cheng Yin

The fusion of vision and language has brought about a transformative shift in computer vision through the emergence of Vision-Language Models (VLMs). However, the resource-intensive nature of existing VLMs poses a significant challenge. We…

Computer Vision and Pattern Recognition · Computer Science 2024-01-23 Jordan Shipard, Arnold Wiliem, Kien Nguyen Thanh, Wei Xiang, Clinton Fookes

In this study, we define and tackle zero shot "real" classification by description, a novel task that evaluates the ability of Vision-Language Models (VLMs) like CLIP to classify objects based solely on descriptive attributes, excluding…

Computer Vision and Pattern Recognition · Computer Science 2024-12-19 Ethan Baron, Idan Tankel, Peter Tu, Guy Ben-Yosef

Vision-Language Models like CLIP create aligned embedding spaces for text and images, making it possible for anyone to build a visual classifier by simply naming the classes they want to distinguish. However, a model that works well in one…

Computer Vision and Pattern Recognition · Computer Science 2026-03-26 Kevin Robbins, Xiaotong Liu, Yu Wu, Le Sun, Grady McPeak, Abby Stylianou, Robert Pless

Product attribute values are essential in many e-commerce scenarios, such as customer service robots, product recommendations, and product retrieval. While in the real world, the attribute values of a product are usually incomplete and vary…

Computation and Language · Computer Science 2020-09-16 Tiangang Zhu, Yue Wang, Haoran Li, Youzheng Wu, Xiaodong He, Bowen Zhou

Vehicle make and model recognition (VMMR) is an important task in intelligent transportation systems, but existing approaches struggle to adapt to newly released models. Contrastive Language-Image Pretraining (CLIP) provides strong…

Computer Vision and Pattern Recognition · Computer Science 2025-10-22 Wei-Chia Chang, Yan-Ann Chen

Zero-shot object counting (ZOC) aims to enumerate objects in images using only the names of object classes during testing, without the need for manual annotations. However, a critical challenge in current ZOC methods lies in their inability…

Computer Vision and Pattern Recognition · Computer Science 2024-07-10 Huilin Zhu, Jingling Yuan, Zhengwei Yang, Yu Guo, Zheng Wang, Xian Zhong, Shengfeng He

E-commerce platforms should provide detailed product descriptions (attribute values) for effective product search and recommendation. However, attribute value information is typically not available for new products. To predict unseen…

Information Retrieval · Computer Science 2024-02-15 Jiaying Gong, Hoda Eldardiry

Vision-Language Models (VLMs) have demonstrated impressive capabilities in zero-shot action recognition by learning to associate video embeddings with class embeddings. However, a significant challenge arises when relying solely on action…

Computer Vision and Pattern Recognition · Computer Science 2025-11-04 Yehna Kim, Young-Eun Kim, Seong-Whan Lee

Zero-shot anomaly detection (ZSAD) aims to detect anomalies without any target domain training samples, relying solely on external auxiliary data. Existing CLIP-based methods attempt to activate the model's ZSAD potential via handcrafted or…

Computer Vision and Pattern Recognition · Computer Science 2025-10-07 Ziteng Yang, Jingzehua Xu, Yanshu Li, Zepeng Li, Yeqiang Wang, Xinghui Li

Large-scale pre-trained multi-modal models (e.g., CLIP) demonstrate strong zero-shot transfer capability in many discriminative tasks. Their adaptation to zero-shot image-conditioned text generation tasks has drawn increasing interest.…

Computer Vision and Pattern Recognition · Computer Science 2023-03-07 Wei Li, Linchao Zhu, Longyin Wen, Yi Yang

Vision-language models trained on large, randomly collected data had significant impact in many areas since they appeared. But as they show great performance in various fields, such as image-text-retrieval, their inner workings are still…

Computer Vision and Pattern Recognition · Computer Science 2022-09-15 Felix Vogel, Nina Shvetsova, Leonid Karlinsky, Hilde Kuehne

Accurate video moment retrieval (VMR) requires universal visual-textual correlations that can handle unknown vocabulary and unseen scenes. However, the learned correlations are likely either biased when derived from a limited amount of…

Computer Vision and Pattern Recognition · Computer Science 2023-09-06 Dezhao Luo, Jiabo Huang, Shaogang Gong, Hailin Jin, Yang Liu