Related papers: Advancing EEG-Based Gaze Prediction Using Depthwis…
The task of Electroencephalogram (EEG) analysis is paramount to the development of Brain-Computer Interfaces (BCIs). However, to reach the goal of developing robust, useful BCIs depends heavily on the speed and the accuracy at which BCIs…
Learning harmful shortcuts such as spurious correlations and biases prevents deep neural networks from learning the meaningful and useful representations, thus jeopardizing the generalizability and interpretability of the learned…
Exploring the trustworthiness of deep learning models is crucial, especially in critical domains such as medical imaging decision support systems. Conformal prediction has emerged as a rigorous means of providing deep learning models with…
Deep learning models are increasingly utilized on resource-constrained edge devices for real-time data analytics. Recently, Vision Transformer and their variants have shown exceptional performance in various computer vision tasks. However,…
Deep learning with convolutional neural networks (ConvNets) have dramatically improved learning capabilities of computer vision applications just through considering raw data without any prior feature extraction. Nowadays, there is rising…
Pretrained vision foundation models deliver strong performance across tasks with limited fine-tuning. However, their Vision Transformer (ViT) backbones impose high inference costs, limiting deployment on resource-constrained devices. In…
In the context of electroencephalogram (EEG)-based driver drowsiness recognition, it is still challenging to design a calibration-free system, since EEG signals vary significantly among different subjects and recording sessions. Many…
In this paper, we present two approaches and algorithms that adapt areas of interest We present a new deep neural network (DNN) that can be used to directly determine gaze position using EEG data. EEG-based eye tracking is a new and…
Vision Transformers (ViTs) have revolutionized computer vision by leveraging self-attention to model long-range dependencies. However, ViTs face challenges such as high computational costs due to the quadratic scaling of self-attention and…
Deep learning has significantly advanced electrocardiogram (ECG) analysis, enabling automatic annotation, disease screening, and prognosis beyond traditional clinical capabilities. However, understanding these models remains a challenge,…
Vision Transformer (ViT) has shown its advantages over the convolutional neural network (CNN) with its ability to capture global long-range dependencies for visual representation learning. Besides ViT, contrastive learning is another…
Pneumonia, particularly when induced by diseases like COVID-19, remains a critical global health challenge requiring rapid and accurate diagnosis. This study presents a comprehensive comparison of traditional machine learning and…
Traditional fault diagnosis methods using Convolutional Neural Networks (CNNs) often struggle with capturing the temporal dynamics of vibration signals. To overcome this, the application of Transformer-based Vision Transformer (ViT) methods…
Vision Transformer (ViT) has prevailed in computer vision tasks due to its strong long-range dependency modelling ability. \textcolor{blue}{However, its large model size and weak local feature modeling ability hinder its application in real…
Convolutional architectures have proven extremely successful for vision tasks. Their hard inductive biases enable sample-efficient learning, but come at the cost of a potentially lower performance ceiling. Vision Transformers (ViTs) rely on…
The present study investigates the impact of the Rational Discrete Wavelet Transform (RDWT), used as a plug-in preprocessing step for motor imagery electroencephalographic (EEG) decoding prior to applying deep learning classifiers. A…
The complex application scenarios have raised critical requirements for precise and generalizable gaze estimation methods. Recently, the pre-trained CLIP has achieved remarkable performance on various vision tasks, but its potentials have…
Vision Transformers (ViT) become widely-adopted architectures for various vision tasks. Masked auto-encoding for feature pretraining and multi-scale hybrid convolution-transformer architectures can further unleash the potentials of ViT,…
Self-supervised pretrain techniques have been widely used to improve the downstream tasks' performance. However, real-world magnetic resonance (MR) studies usually consist of different sets of contrasts due to different acquisition…
Convolutional Neural Networks (CNNs) have achieved impressive performance on many computer vision related tasks, such as object detection, image recognition, image retrieval, etc. These achievements benefit from the CNNs' outstanding…