
MIRAGE: Adaptive Multimodal Gating for Whole-Brain fMRI Encoding

Machine Learning · 2026-05 · v1

Abstract

Recent progress in task-optimized neural networks has established encoding models as a powerful tool for predicting brain responses to naturalistic stimuli, yet most existing approaches rely on unimodal representations. The emergence of omni-modal foundation models and rich multimodal neural datasets enables encoding models that jointly integrate visual, auditory, and linguistic information across subjects. We introduce MIRAGE, a brain encoding framework for predicting whole-brain fMRI responses to naturalistic audiovisual stimuli. MIRAGE achieves state-of-the-art performance via a natively multimodal backbone and adaptive feature gating across layers. The gated representations are then passed to a transformer-based brain encoder and a subject-specific linear head over the cortical parcels. Controlled comparisons show that natively multimodal features consistently outperform post-hoc aggregation of independent unimodal features, across architectural levels and backbones. Beyond predictive accuracy, the learned attention weights can be inspected directly to read off the modality-specific gating profile across the backbone, and each modality traces a distinct anatomical pattern across cortex. Together, these results support adaptive layer-wise aggregation of natively multimodal features as a generalizable, interpretable, and accurate approach for whole-brain encoding.
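
The pipeline described in the abstract (layer-wise gating of backbone features, followed by a subject-specific linear head over parcels) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, shapes, and the subject key "sub-01" are hypothetical, the gate here is a single static softmax over layers rather than the adaptive, modality-specific gating the abstract refers to, and the transformer-based brain encoder is omitted entirely.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    z = x - x.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def gate_layers(layer_feats, gate_logits):
    """Aggregate backbone layers with softmax gate weights.

    layer_feats: (n_layers, n_timepoints, d) features from each backbone layer
    gate_logits: (n_layers,) gate parameters (assumed static in this sketch)
    Returns the gated features (n_timepoints, d) and the gate weights.
    """
    w = softmax(gate_logits)
    # Weighted sum over the layer axis.
    return np.tensordot(w, layer_feats, axes=1), w

def predict_parcels(features, subject_weights, subject_bias):
    # Subject-specific linear head: one weight matrix per subject.
    return features @ subject_weights + subject_bias

rng = np.random.default_rng(0)
n_layers, n_time, d, n_parcels = 4, 6, 8, 10   # toy sizes, chosen arbitrarily

feats = rng.normal(size=(n_layers, n_time, d))  # stand-in for backbone features
logits = np.zeros(n_layers)                     # uniform gate before any training
pooled, weights = gate_layers(feats, logits)

# Hypothetical per-subject heads, keyed by subject id.
heads = {"sub-01": (rng.normal(size=(d, n_parcels)), np.zeros(n_parcels))}
W, b = heads["sub-01"]
pred = predict_parcels(pooled, W, b)            # (n_timepoints, n_parcels)
```

Because the gate weights form a distribution over layers, they can be read off directly after training, which is the kind of inspectability the abstract points to. In this sketch the zero-initialized logits reduce the gate to a plain average over layers.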

Comments: Preprint. The first two authors contributed equally.

Cite

@article{arxiv.2605.29850,
  title  = {MIRAGE: Adaptive Multimodal Gating for Whole-Brain fMRI Encoding},
  author = {Abdulkadir Gokce and Badr AlKhamissi and Martin Schrimpf},
  journal= {arXiv preprint arXiv:2605.29850},
  year   = {2026}
}