Related papers: Controllable Mind Visual Diffusion Model
Reconstructing visual stimuli from human brain activities provides a promising opportunity to advance our understanding of the brain's visual system and its connection with computer vision models. Although deep generative models have been…
Reconstructing visual stimuli from brain recordings has been a meaningful and challenging task. Especially, the achievement of precise and controllable image reconstruction bears great significance in propelling the progress and utilization…
Reconstructing visual stimuli from measured functional magnetic resonance imaging (fMRI) has been a meaningful and challenging task. Previous studies have successfully achieved reconstructions with structures similar to the original images,…
Understanding how humans process visual information is one of the crucial steps for unraveling the underlying mechanism of brain activity. Recently, this curiosity has motivated the fMRI-to-image reconstruction task; given the fMRI data…
Computational neuroimaging involves analyzing brain images or signals to provide mechanistic insights and predictive tools for human cognition and behavior. While diffusion models have shown stability and high-quality generation in natural…
Reconstructing visual stimuli from functional Magnetic Resonance Imaging fMRI enables fine-grained retrieval of brain activity. However, the accurate reconstruction of diverse details, including structure, background, texture, color, and…
Accurate detection and segmentation of brain tumors from magnetic resonance imaging (MRI) are essential for diagnosis, treatment planning, and clinical monitoring. While convolutional architectures such as U-Net have long been the backbone…
Concept Bottleneck Models (CBMs) enhance the interpretability of neural networks by basing predictions on human-understandable concepts. However, current CBMs typically rely on concept sets extracted from large language models or extensive…
Text-to-image diffusion models have made significant advancements in generating high-quality, diverse images from text prompts. However, the inherent limitations of textual signals often prevent these models from fully capturing specific…
Understanding how the brain encodes visual information is a central challenge in neuroscience and machine learning. A promising approach is to reconstruct visual stimuli, essentially images, from functional Magnetic Resonance Imaging (fMRI)…
Diffusion models have demonstrated significant potential in producing high-quality images in medical image translation to aid disease diagnosis, localization, and treatment. Nevertheless, current diffusion models have limited success in…
Visual information has been introduced for enhancing machine translation (MT), and its effectiveness heavily relies on the availability of large amounts of bilingual parallel sentence pairs with manual image annotations. In this paper, we…
Semantic Communication (SC) is an emerging technology that has attracted much attention in the sixth-generation (6G) mobile communication systems. However, few literature has fully considered the perceptual quality of the reconstructed…
Brain decoding is a field of computational neuroscience that uses measurable brain activity to infer mental states or internal representations of perceptual inputs. Therefore, we propose a novel approach to brain decoding that also relies…
The diffusion model has recently emerged as a potent approach in computer vision, demonstrating remarkable performances in the field of generative artificial intelligence. Capable of producing high-quality synthetic images, diffusion models…
We introduce MVControl, a novel neural network architecture that enhances existing pre-trained multi-view 2D diffusion models by incorporating additional input conditions, e.g. edge maps. Our approach enables the generation of controllable…
Recovering degraded low-resolution text images is challenging, especially for Chinese text images with complex strokes and severe degradation in real-world scenarios. Ensuring both text fidelity and style realness is crucial for…
With the development of deep learning, speech enhancement has been greatly optimized in terms of speech quality. Previous methods typically focus on the discriminative supervised learning or generative modeling, which tends to introduce…
While deep neural networks (NN) significantly advance image compressed sensing (CS) by improving reconstruction quality, the necessity of training current CS NNs from scratch constrains their effectiveness and hampers rapid deployment.…
Spatial control methods using additional modules on pretrained diffusion models have gained attention for enabling conditional generation in natural images. These methods guide the generation process with new conditions while leveraging the…