Related papers: Transformer for Partial Differential Equations' Op…

200 papers

Operator learning for Partial Differential Equations (PDEs) is rapidly emerging as a promising approach for surrogate modeling of intricate systems. Transformers with the self-attention mechanism, a powerful tool originally…

Machine Learning · Computer Science 2024-05-17 Junfeng Chen, Kailiang Wu

Many machine learning tasks such as multiple instance learning, 3D shape recognition, and few-shot image classification are defined on sets of instances. Since solutions to such problems do not depend on the order of elements of the set,…

Machine Learning · Computer Science 2019-05-28 Juho Lee, Yoonho Lee, Jungtaek Kim, Adam R. Kosiorek, Seungjin Choi, Yee Whye Teh

Most approaches for semantic segmentation use only information from color cameras to parse the scenes, yet recent advances show that using depth data can further improve performance. In this work, we focus on transformer-based…

Computer Vision and Pattern Recognition · Computer Science 2023-03-28 Francesco Barbato, Giulia Rizzoli, Pietro Zanuttigh

In this paper, we propose an encoder-decoder neural architecture (called Channelformer) to achieve improved channel estimation for orthogonal frequency-division multiplexing (OFDM) waveforms in downlink scenarios. The self-attention…

Signal Processing · Electrical Eng. & Systems 2023-02-10 Dianxin Luan, John Thompson

While attention has been empirically shown to improve model performance, it lacks a rigorous mathematical justification. This short paper establishes a novel connection between attention mechanisms and multinomial regression. Specifically,…

Machine Learning · Computer Science 2025-10-28 Jonas A. Actor, Anthony Gruber, Eric C. Cyr

The Transformer is a ubiquitous model for natural language processing and has attracted wide attention in computer vision. The attention maps are indispensable for a transformer model to encode the dependencies among input tokens. However,…

Machine Learning · Computer Science 2021-02-26 Yujing Wang, Yaming Yang, Jiangang Bai, Mingliang Zhang, Jing Bai, Jing Yu, Ce Zhang, Gao Huang, Yunhai Tong

Neural operators have emerged as promising frameworks for learning mappings governed by partial differential equations (PDEs), serving as data-driven alternatives to traditional numerical methods. While methods such as the Fourier neural…

Machine Learning · Computer Science 2025-04-21 Minsu Koh, Beom-Chul Park, Heejo Kong, Seong-Whan Lee

Attention mechanisms have become a popular component in deep neural networks, yet there has been little examination of how different influencing factors and methods for computing attention from these factors affect performance. Toward a…

Computer Vision and Pattern Recognition · Computer Science 2019-04-15 Xizhou Zhu, Dazhi Cheng, Zheng Zhang, Stephen Lin, Jifeng Dai

Attention-based Transformers have demonstrated strong adaptability across a wide range of tasks and have become the backbone of modern Large Language Models (LLMs). However, their underlying mechanisms remain open for further exploration.…

Machine Learning · Computer Science 2026-01-13 Ruifeng Ren, Sheng Ouyang, Huayi Tang, Yong Liu

Transfer learning (TL) enables the transfer of knowledge gained in learning to perform one task (source) to a related but different task (target), hence addressing the expense of data acquisition and labeling, potential computational power…

Machine Learning · Computer Science 2022-12-20 Somdatta Goswami, Katiana Kontolati, Michael D. Shields, George Em Karniadakis

Transformers have recently gained attention in the computer vision domain due to their ability to model long-range dependencies. However, the self-attention mechanism, which is the core part of the Transformer model, usually suffers from…

Computer Vision and Pattern Recognition · Computer Science 2023-07-28 Reza Azad, René Arimond, Ehsan Khodapanah Aghdam, Amirhossein Kazerouni, Dorit Merhof

Transformers have recently shown superior performance on various vision tasks. The large, sometimes even global, receptive field endows Transformer models with higher representation power over their CNN counterparts. Nevertheless, simply…

Computer Vision and Pattern Recognition · Computer Science 2022-05-25 Zhuofan Xia, Xuran Pan, Shiji Song, Li Erran Li, Gao Huang

Transformer-based models have been achieving state-of-the-art results in several fields of Natural Language Processing. However, their direct application to speech tasks is not trivial. The nature of these sequences carries problems such as…

Computation and Language · Computer Science 2022-05-17 Gerard Sant, Gerard I. Gállego, Belen Alastruey, Marta R. Costa-Jussà

Origin-Destination (OD) matrices record directional flow data between pairs of OD regions. The intricate spatiotemporal dependency in the matrices makes the OD matrix forecasting (ODMF) problem not only intractable but also non-trivial.…

Artificial Intelligence · Computer Science 2022-08-18 Jin Huang, Bosong Huang, Weihao Yu, Jing Xiao, Ruzhong Xie, Ke Ruan

Partial differential equations (PDEs) are fundamental for modeling complex physical systems, yet classical numerical solvers face prohibitive computational costs in high-dimensional and multi-scale regimes. While Transformer-based neural…

Machine Learning · Computer Science 2026-03-04 Pengyu Lai, Yixiao Chen, Dewu Yang, Rui Wang, Feng Wang, Hui Xu

Transformer-based models are widely used in natural language processing (NLP). Their core component, self-attention, has aroused widespread interest. To understand the self-attention mechanism, a direct method is to visualize the attention…

Machine Learning · Computer Science 2021-07-02 Han Shi, Jiahui Gao, Xiaozhe Ren, Hang Xu, Xiaodan Liang, Zhenguo Li, James T. Kwok

Large-scale foundation models for scientific machine learning adapt to physical settings unseen during training, such as zero-shot transfer between turbulent scales. This phenomenon, in-context learning, challenges conventional…

Machine Learning · Computer Science 2026-04-14 Anthony Bao, Jeffrey Lai, William Gilpin

To capture user preference, transformer models have been widely applied to model sequential user behavior data. The core of transformer architecture lies in the self-attention mechanism, which computes the pairwise attention scores in a…

Information Retrieval · Computer Science 2024-04-05 Zhen Tian, Wayne Xin Zhao, Changwang Zhang, Xin Zhao, Zhongrui Ma, Ji-Rong Wen

Recent developments in neural models have connected the encoder and decoder through a self-attention mechanism. In particular, the Transformer, which is solely based on self-attention, has led to breakthroughs in Natural Language Processing (NLP)…

Computation and Language · Computer Science 2019-11-07 Xindian Ma, Peng Zhang, Shuai Zhang, Nan Duan, Yuexian Hou, Dawei Song, Ming Zhou

Neural operators, as an efficient surrogate model for learning the solutions of PDEs, have received extensive attention in the field of scientific machine learning. Among them, attention-based neural operators have become one of the…

Machine Learning · Computer Science 2024-12-30 Zipeng Xiao, Zhongkai Hao, Bokai Lin, Zhijie Deng, Hang Su