Related papers: Cross-view Brain Decoding

200 papers

Large-scale pre-trained multi-modal models (e.g., CLIP) demonstrate strong zero-shot transfer capability in many discriminative tasks. Their adaptation to zero-shot image-conditioned text generation tasks has drawn increasing interest.…

Computer Vision and Pattern Recognition · Computer Science 2023-03-07 Wei Li, Linchao Zhu, Longyin Wen, Yi Yang

Keyword spotting (KWS) is an important technique for speech applications, which enables users to activate devices by speaking a keyword phrase. Although a phoneme classifier can be used for KWS, exploiting a large amount of transcribed data…

Audio and Speech Processing · Electrical Eng. & Systems 2021-09-23 Takuya Higuchi, Anmol Gupta, Chandra Dhir

Decoding visual semantic representations from human brain activity is a significant challenge. While recent zero-shot decoding approaches have improved performance by leveraging aligned image-text datasets, they overlook a fundamental…

Neurons and Cognition · Quantitative Biology 2026-01-21 Zhengdi Zhang, Hao Zhang, Wenjun Xia

Contrastive Language-Image Pretraining (CLIP) performs zero-shot image classification by mapping images and textual class representation into a shared embedding space, then retrieving the class closest to the image. This work provides a new…

Computer Vision and Pattern Recognition · Computer Science 2024-12-19 Fawaz Sammani, Nikos Deligiannis

Building a semantic parser quickly in a new domain is a fundamental challenge for conversational interfaces, as current semantic parsers require expensive supervision and lack the ability to generalize to new domains. In this paper, we…

Computation and Language · Computer Science 2018-09-25 Jonathan Herzig, Jonathan Berant

The development of algorithms to accurately decode neural information has long been a research focus in the field of neuroscience. Brain decoding typically involves training machine learning models to map neural data onto a preestablished…

In this work, we explore new perspectives on cross-view completion learning by drawing an analogy to self-supervised correspondence learning. Through our analysis, we demonstrate that the cross-attention map within cross-view completion…

Computer Vision and Pattern Recognition · Computer Science 2024-12-13 Honggyu An, Jinhyeon Kim, Seonghoon Park, Jaewoo Jung, Jisang Han, Sunghwan Hong, Seungryong Kim

Recent advancements in text-to-image generative models have demonstrated a remarkable ability to capture a deep semantic understanding of images. In this work, we leverage this semantic knowledge to transfer the visual appearance between…

Computer Vision and Pattern Recognition · Computer Science 2023-11-07 Yuval Alaluf, Daniel Garibi, Or Patashnik, Hadar Averbuch-Elor, Daniel Cohen-Or

Recent text-to-image matching models apply contrastive learning to large corpora of uncurated pairs of images and sentences. While such models can provide a powerful score for matching and subsequent zero-shot tasks, they are not capable of…

Computer Vision and Pattern Recognition · Computer Science 2022-04-01 Yoad Tewel, Yoav Shalev, Idan Schwartz, Lior Wolf

Brain decoding is a field of computational neuroscience that uses measurable brain activity to infer mental states or internal representations of perceptual inputs. Therefore, we propose a novel approach to brain decoding that also relies…

Computer Vision and Pattern Recognition · Computer Science 2023-03-23 Matteo Ferrante, Tommaso Boccato, Nicola Toschi

Visual decoding from brain signals is a key challenge at the intersection of computer vision and neuroscience, requiring methods that bridge neural representations and computational models of vision. A field-wide goal is to achieve…

Complex Word Identification (CWI) is a task centered on detecting hard-to-understand words, or groups of words, in texts from different areas of expertise. The purpose of CWI is to highlight problematic structures that non-native speakers…

Computation and Language · Computer Science 2020-10-05 George-Eduard Zaharia, Dumitru-Clementin Cercel, Mihai Dascalu

Zero-shot scene understanding in real-world settings presents major challenges due to the complexity and variability of natural scenes, where models must recognize new objects, actions, and contexts without prior labeled examples. This work…

Computer Vision and Pattern Recognition · Computer Science 2025-10-30 Manjunath Prasad Holenarasipura Rajiv, B. M. Vidyavathi

Zero-shot image captioning (IC) without well-paired image-text data can be divided into two categories, training-free and text-only-training. Generally, these two types of methods realize zero-shot IC by integrating pretrained…

Computer Vision and Pattern Recognition · Computer Science 2024-03-07 Zequn Zeng, Yan Xie, Hao Zhang, Chiyu Chen, Zhengjue Wang, Bo Chen

State-of-the-art approaches for image captioning require supervised training data consisting of captions with paired image data. These methods are typically unable to use unsupervised data such as textual data with no corresponding images,…

Computer Vision and Pattern Recognition · Computer Science 2017-06-27 Wenhu Chen, Aurelien Lucchi, Thomas Hofmann

Multilingual neural machine translation systems learn to map sentences of different languages into a common representation space. Intuitively, with a growing number of seen languages the encoder sentence representation grows more flexible…

Computation and Language · Computer Science 2024-08-06 Carlos Mullov, Ngoc-Quan Pham, Alexander Waibel

There has been a recent spike in interest in multi-modal Language and Vision problems. On the language side, most of these models primarily focus on English since most multi-modal datasets are monolingual. We try to bridge this gap with a…

Machine Learning · Computer Science 2021-09-17 Pranav Aggarwal, Ritiz Tambi, Ajinkya Kale

Pretrained Language Models (PLMs) learn rich cross-lingual knowledge and can be finetuned to perform well on diverse tasks such as translation and multilingual word sense disambiguation (WSD). However, they often struggle at disambiguating…

Computation and Language · Computer Science 2023-04-28 Haoqiang Kang, Terra Blevins, Luke Zettlemoyer

The many-to-many multilingual neural machine translation can be regarded as the process of integrating semantic features from the source sentences and linguistic features from the target sentences. To enhance zero-shot translation, models…

Computation and Language · Computer Science 2024-08-05 Mengyu Bu, Shuhao Gu, Yang Feng

Obtaining the human-like perception ability of abstracting visual concepts from concrete pixels has always been a fundamental and important target in machine learning research fields such as disentangled representation learning and scene…

Computer Vision and Pattern Recognition · Computer Science 2022-10-14 Tao Yang, Yuwang Wang, Yan Lu, Nanning Zheng