Related papers: Advancing EEG-Based Gaze Prediction Using Depthwis…

Fusing Pretrained ViTs with TCNet for Enhanced EEG Regression

The task of Electroencephalogram (EEG) analysis is paramount to the development of Brain-Computer Interfaces (BCIs). However, reaching the goal of developing robust, useful BCIs depends heavily on the speed and the accuracy at which BCIs…

Signal Processing · Electrical Eng. & Systems 2024-08-08 Eric Modesitt, Haicheng Yin, Williams Huang Wang, Brian Lu

Eye-gaze-guided Vision Transformer for Rectifying Shortcut Learning

Learning harmful shortcuts such as spurious correlations and biases prevents deep neural networks from learning meaningful and useful representations, thus jeopardizing the generalizability and interpretability of the learned…

Computer Vision and Pattern Recognition · Computer Science 2022-05-26 Chong Ma, Lin Zhao, Yuzhong Chen, Lu Zhang, Zhenxiang Xiao, Haixing Dai, David Liu, Zihao Wu, Zhengliang Liu, Sheng Wang, Jiaxing Gao, Changhe Li, Xi Jiang, Tuo Zhang, Qian Wang, Dinggang Shen, Dajiang Zhu, Tianming Liu

Domain Adaptive Skin Lesion Classification via Conformal Ensemble of Vision Transformers

Exploring the trustworthiness of deep learning models is crucial, especially in critical domains such as medical imaging decision support systems. Conformal prediction has emerged as a rigorous means of providing deep learning models with…

Computer Vision and Pattern Recognition · Computer Science 2025-05-23 Mehran Zoravar, Shadi Alijani, Homayoun Najjaran

Efficient Partitioning Vision Transformer on Edge Devices for Distributed Inference

Deep learning models are increasingly utilized on resource-constrained edge devices for real-time data analytics. Recently, Vision Transformers and their variants have shown exceptional performance in various computer vision tasks. However,…

Computer Vision and Pattern Recognition · Computer Science 2025-05-22 Xiang Liu, Yijun Song, Xia Li, Yifei Sun, Huiying Lan, Zemin Liu, Linshan Jiang, Jialin Li

Deep Learning with ConvNET Predicts Imagery Tasks Through EEG

Deep learning with convolutional neural networks (ConvNets) has dramatically improved the learning capabilities of computer vision applications just through considering raw data without any prior feature extraction. Nowadays, there is rising…

Signal Processing · Electrical Eng. & Systems 2019-07-15 Apdullah Yayık, Yakup Kutlu, Gökhan Altan

Accelerating Vision Foundation Models with Drop-in Depthwise Convolution

Pretrained vision foundation models deliver strong performance across tasks with limited fine-tuning. However, their Vision Transformer (ViT) backbones impose high inference costs, limiting deployment on resource-constrained devices. In…

Computer Vision and Pattern Recognition · Computer Science 2026-05-22 Carmelo Scribano, Mohammad Mahdi, Nedyalko Prisadnikov, Yuqian Fu, Giorgia Franchini, Danda Pani Paudel, Marko Bertogna, Luc Van Gool

EEG-based Cross-Subject Driver Drowsiness Recognition with an Interpretable Convolutional Neural Network

In the context of electroencephalogram (EEG)-based driver drowsiness recognition, it is still challenging to design a calibration-free system, since EEG signals vary significantly among different subjects and recording sessions. Many…

Signal Processing · Electrical Eng. & Systems 2022-02-21 Jian Cui, Zirui Lan, Olga Sourina, Wolfgang Müller-Wittig

One step closer to EEG based eye tracking

We present a new deep neural network (DNN) that can be used to directly determine gaze position using EEG data. EEG-based eye tracking is a new and…

Signal Processing · Electrical Eng. & Systems 2023-03-13 Wolfgang Fuhl, Susanne Zabel, Theresa Harbig, Julia Astrid Moldt, Teresa Festl Wiete, Anne Herrmann Werner, Kay Nieselt

ECViT: Efficient Convolutional Vision Transformer with Local-Attention and Multi-scale Stages

Vision Transformers (ViTs) have revolutionized computer vision by leveraging self-attention to model long-range dependencies. However, ViTs face challenges such as high computational costs due to the quadratic scaling of self-attention and…

Computer Vision and Pattern Recognition · Computer Science 2025-04-22 Zhoujie Qian

Explaining deep learning for ECG using time-localized clusters

Deep learning has significantly advanced electrocardiogram (ECG) analysis, enabling automatic annotation, disease screening, and prognosis beyond traditional clinical capabilities. However, understanding these models remains a challenge,…

Machine Learning · Computer Science 2025-09-19 Ahcène Boubekki, Konstantinos Patlatzoglou, Joseph Barker, Fu Siong Ng, Antônio H. Ribeiro

Vision Transformer for Contrastive Clustering

Vision Transformer (ViT) has shown its advantages over the convolutional neural network (CNN) with its ability to capture global long-range dependencies for visual representation learning. Besides ViT, contrastive learning is another…

Computer Vision and Pattern Recognition · Computer Science 2022-07-12 Hua-Bao Ling, Bowen Zhu, Dong Huang, Ding-Hua Chen, Chang-Dong Wang, Jian-Huang Lai

Comparative Analysis of Vision Transformers and Traditional Deep Learning Approaches for Automated Pneumonia Detection in Chest X-Rays

Pneumonia, particularly when induced by diseases like COVID-19, remains a critical global health challenge requiring rapid and accurate diagnosis. This study presents a comprehensive comparison of traditional machine learning and…

Image and Video Processing · Electrical Eng. & Systems 2025-07-16 Gaurav Singh

TSViT: A Time Series Vision Transformer for Fault Diagnosis

Traditional fault diagnosis methods using Convolutional Neural Networks (CNNs) often struggle with capturing the temporal dynamics of vibration signals. To overcome this, the application of Transformer-based Vision Transformer (ViT) methods…

Systems and Control · Electrical Eng. & Systems 2025-01-03 Shouhua Zhang, Jiehan Zhou, Xue Ma, Susanna Pirttikangas, Chunsheng Yang

A Lightweight Convolution and Vision Transformer integrated model with Multi-scale Self-attention Mechanism

Vision Transformer (ViT) has prevailed in computer vision tasks due to its strong long-range dependency modelling ability. However, its large model size and weak local feature modeling ability hinder its application in real…

Computer Vision and Pattern Recognition · Computer Science 2025-09-12 Yi Zhang, Lingxiao Wei, Bowei Zhang, Ziwei Liu, Kai Yi, Shu Hu

ConViT: Improving Vision Transformers with Soft Convolutional Inductive Biases

Convolutional architectures have proven extremely successful for vision tasks. Their hard inductive biases enable sample-efficient learning, but come at the cost of a potentially lower performance ceiling. Vision Transformers (ViTs) rely on…

Computer Vision and Pattern Recognition · Computer Science 2022-12-07 Stéphane d'Ascoli, Hugo Touvron, Matthew Leavitt, Ari Morcos, Giulio Biroli, Levent Sagun

Investigating the Impact of Rational Dilated Wavelet Transform on Motor Imagery EEG Decoding with Deep Learning Models

The present study investigates the impact of the Rational Discrete Wavelet Transform (RDWT), used as a plug-in preprocessing step for motor imagery electroencephalographic (EEG) decoding prior to applying deep learning classifiers. A…

Human-Computer Interaction · Computer Science 2025-10-13 Marco Siino, Giuseppe Bonomo, Rosario Sorbello, Ilenia Tinnirello

Differential Contrastive Training for Gaze Estimation

Complex application scenarios have raised critical requirements for precise and generalizable gaze estimation methods. Recently, the pre-trained CLIP has achieved remarkable performance on various vision tasks, but its potentials have…

Computer Vision and Pattern Recognition · Computer Science 2025-07-31 Lin Zhang, Yi Tian, XiYun Wang, Wanru Xu, Yi Jin, Yaping Huang

ConvMAE: Masked Convolution Meets Masked Autoencoders

Vision Transformers (ViT) have become widely adopted architectures for various vision tasks. Masked auto-encoding for feature pretraining and multi-scale hybrid convolution-transformer architectures can further unleash the potentials of ViT,…

Computer Vision and Pattern Recognition · Computer Science 2022-05-20 Peng Gao, Teli Ma, Hongsheng Li, Ziyi Lin, Jifeng Dai, Yu Qiao

VIViT: Variable-Input Vision Transformer Framework for 3D MR Image Segmentation

Self-supervised pretraining techniques have been widely used to improve the downstream tasks' performance. However, real-world magnetic resonance (MR) studies usually consist of different sets of contrasts due to different acquisition…

Image and Video Processing · Electrical Eng. & Systems 2025-06-17 Badhan Kumar Das, Ajay Singh, Gengyan Zhao, Han Liu, Thomas J. Re, Dorin Comaniciu, Eli Gibson, Andreas Maier

Assessing learned features of Deep Learning applied to EEG

Convolutional Neural Networks (CNNs) have achieved impressive performance on many computer vision related tasks, such as object detection, image recognition, image retrieval, etc. These achievements benefit from the CNNs' outstanding…

Machine Learning · Computer Science 2021-11-09 Dung Truong, Scott Makeig, Arnaud Delorme