Related papers: Hyperbolic Image-Text Representations

Compositional Entailment Learning for Hyperbolic Vision-Language Models

Image-text representation learning forms a cornerstone in vision-language models, where pairs of images and textual descriptions are contrastively aligned in a shared embedding space. Since visual and textual concepts are naturally…

Computer Vision and Pattern Recognition · Computer Science · 2025-03-04 · Avik Pal, Max van Spengler, Guido Maria D'Amely di Melendugno, Alessandro Flaborea, Fabio Galasso, Pascal Mettes

Machine Unlearning in Hyperbolic vs. Euclidean Multimodal Contrastive Learning: Adapting Alignment Calibration to MERU

Machine unlearning methods have become increasingly important for selective concept removal in large pre-trained models. While recent work has explored unlearning in Euclidean contrastive vision-language models, the effectiveness of concept…

Computer Vision and Pattern Recognition · Computer Science · 2025-04-15 · Àlex Pujol Vidal, Sergio Escalera, Kamal Nasrollahi, Thomas B. Moeslund

HMID-Net: An Exploration of Masked Image Modeling and Knowledge Distillation in Hyperbolic Space

Visual and semantic concepts are often structured in a hierarchical manner. For instance, textual concept 'cat' entails all images of cats. A recent study, MERU, successfully adapts multimodal learning techniques from Euclidean space to…

Computer Vision and Pattern Recognition · Computer Science · 2025-07-22 · Changli Wang, Fang Yin, Jiafeng Liu, Rui Wu

Hyperbolic Contrastive Learning

Learning good image representations that are beneficial to downstream tasks is a challenging task in computer vision. As such, a wide variety of self-supervised learning approaches have been proposed. Among them, contrastive learning has…

Computer Vision and Pattern Recognition · Computer Science · 2023-02-06 · Yun Yue, Fangzhou Lin, Kazunori D Yamada, Ziming Zhang

Learning Visual Hierarchies in Hyperbolic Space for Image Retrieval

Structuring latent representations in a hierarchical manner enables models to learn patterns at multiple levels of abstraction. However, most prevalent image understanding models focus on visual similarity, and learning visual hierarchies…

Computer Vision and Pattern Recognition · Computer Science · 2026-01-07 · Ziwei Wang, Sameera Ramasinghe, Chenchen Xu, Julien Monteil, Loris Bazzani, Thalaiyasingam Ajanthan

Hyperbolic Learning with Multimodal Large Language Models

Hyperbolic embeddings have demonstrated their effectiveness in capturing measures of uncertainty and hierarchical relationships across various deep-learning tasks, including image segmentation and active learning. However, their application…

Machine Learning · Computer Science · 2024-08-12 · Paolo Mandica, Luca Franco, Konstantinos Kallidromitis, Suzanne Petryk, Fabio Galasso

HyperMiner: Topic Taxonomy Mining with Hyperbolic Embedding

Embedded topic models are able to learn interpretable topics even with large and heavy-tailed vocabularies. However, they generally hold the Euclidean embedding space assumption, leading to a basic limitation in capturing hierarchical…

Information Retrieval · Computer Science · 2022-10-20 · Yishi Xu, Dongsheng Wang, Bo Chen, Ruiying Lu, Zhibin Duan, Mingyuan Zhou

Embedding Text in Hyperbolic Spaces

Natural language text exhibits hierarchical structure in a variety of respects. Ideally, we could incorporate our prior knowledge of this hierarchical structure into unsupervised learning algorithms that work on text data. Recent work by…

Computation and Language · Computer Science · 2018-06-13 · Bhuwan Dhingra, Christopher J. Shallue, Mohammad Norouzi, Andrew M. Dai, George E. Dahl

ARGENT: Adaptive Hierarchical Image-Text Representations

Large-scale Vision-Language Models (VLMs) such as CLIP learn powerful semantic representations but operate in Euclidean space, which fails to capture the inherent hierarchical structure of visual and linguistic concepts. Hyperbolic…

Computer Vision and Pattern Recognition · Computer Science · 2026-03-25 · Chuong Huynh, Hossein Souri, Abhinav Kumar, Vitali Petsiuk, Deen Dayal Mohan, Suren Kumar

PHyCLIP: $\ell_1$-Product of Hyperbolic Factors Unifies Hierarchy and Compositionality in Vision-Language Representation Learning

Vision-language models have achieved remarkable success in multi-modal representation learning from large-scale pairs of visual scenes and linguistic descriptions. However, they still struggle to simultaneously express two distinct types of…

Computer Vision and Pattern Recognition · Computer Science · 2026-03-03 · Daiki Yoshikawa, Takashi Matsubara

Emergent Visual-Semantic Hierarchies in Image-Text Representations

While recent vision-and-language models (VLMs) like CLIP are a powerful tool for analyzing text and images in a shared semantic space, they do not explicitly model the hierarchical nature of the set of texts which may describe an image.…

Computer Vision and Pattern Recognition · Computer Science · 2024-07-17 · Morris Alper, Hadar Averbuch-Elor

Understanding and Mitigating Hyperbolic Dimensional Collapse in Graph Contrastive Learning

Learning generalizable self-supervised graph representations for downstream tasks is challenging. To this end, Contrastive Learning (CL) has emerged as a leading approach. The embeddings of CL are arranged on a hypersphere where similarity…

Machine Learning · Computer Science · 2025-02-25 · Yifei Zhang, Hao Zhu, Menglin Yang, Jiahong Liu, Rex Ying, Irwin King, Piotr Koniusz

Hyperbolic Interaction Model For Hierarchical Multi-Label Classification

Different from the traditional classification tasks which assume mutual exclusion of labels, hierarchical multi-label classification (HMLC) aims to assign multiple labels to every instance with the labels organized under hierarchical…

Machine Learning · Computer Science · 2019-09-05 · Boli Chen, Xin Huang, Lin Xiao, Zixin Cai, Liping Jing

Are "Hierarchical" Visual Representations Hierarchical?

Learned visual representations often capture large amounts of semantic information for accurate downstream applications. Human understanding of the world is fundamentally grounded in hierarchy. To mimic this and further improve…

Computer Vision and Pattern Recognition · Computer Science · 2023-11-27 · Ethan Shen, Ali Farhadi, Aditya Kusupati

Order-Embeddings of Images and Language

Hypernymy, textual entailment, and image captioning can be seen as special cases of a single visual-semantic hierarchy over words, sentences, and images. In this paper we advocate for explicitly modeling the partial order structure of this…

Machine Learning · Computer Science · 2016-03-02 · Ivan Vendrov, Ryan Kiros, Sanja Fidler, Raquel Urtasun

Hyperbolic Image-and-Pointcloud Contrastive Learning for 3D Classification

3D contrastive representation learning has exhibited remarkable efficacy across various downstream tasks. However, existing contrastive learning paradigms based on cosine similarity fail to deeply explore the potential intra-modal…

Computer Vision and Pattern Recognition · Computer Science · 2024-09-25 · Naiwen Hu, Haozhe Cheng, Yifan Xie, Pengcheng Shi, Jihua Zhu

Hyperbolic Contrastive Learning for Visual Representations beyond Objects

Although self-/un-supervised methods have led to rapid progress in visual representation learning, these methods generally treat objects and scenes using the same lens. In this paper, we focus on learning representations for objects and…

Computer Vision and Pattern Recognition · Computer Science · 2022-12-02 · Songwei Ge, Shlok Mishra, Simon Kornblith, Chun-Liang Li, David Jacobs

Unit Ball Model for Embedding Hierarchical Structures in the Complex Hyperbolic Space

Learning the representation of data with hierarchical structures in the hyperbolic space has attracted increasing attention in recent years. Due to the constant negative curvature, the hyperbolic space resembles tree metrics and captures the…

Machine Learning · Computer Science · 2022-02-21 · Huiru Xiao, Caigao Jiang, Yangqiu Song, James Zhang, Junwu Xiong

A Fully Hyperbolic Neural Model for Hierarchical Multi-Class Classification

Label inventories for fine-grained entity typing have grown in size and complexity. Nonetheless, they exhibit a hierarchical structure. Hyperbolic spaces offer a mathematically appealing approach for learning hierarchical representations of…

Computation and Language · Computer Science · 2020-10-06 · Federico López, Michael Strube

Hierarchical Representation Matching for CLIP-based Class-Incremental Learning

Class-Incremental Learning (CIL) aims to endow models with the ability to continuously adapt to evolving data streams. Recent advances in pre-trained vision-language models (e.g., CLIP) provide a powerful foundation for this task. However,…

Computer Vision and Pattern Recognition · Computer Science · 2025-09-29 · Zhen-Hao Wen, Yan Wang, Ji Feng, Han-Jia Ye, De-Chuan Zhan, Da-Wei Zhou