Related papers

Related papers: CodeSSM: Towards State Space Models for Code Under…

200 papers

State Space Models (SSMs) have emerged as an efficient alternative to the transformer architecture. Recent studies show that SSMs can match or surpass Transformers on code understanding tasks, such as code retrieval, when trained under…

Artificial Intelligence · Computer Science 2026-02-09 Jiali Wu , Abhinav Anand , Shweta Verma , Mira Mezini

State Space Models (SSMs) have emerged as a promising alternative to the popular transformer-based models and have been increasingly gaining attention. Compared to transformers, SSMs excel at tasks with sequential data or longer contexts,…

Machine Learning · Computer Science 2025-03-17 Xingtai Lv , Youbang Sun , Kaiyan Zhang , Shang Qu , Xuekai Zhu , Yuchen Fan , Yi Wu , Ermo Hua , Xinwei Long , Ning Ding , Bowen Zhou

State-space models (SSMs) have recently gained attention as an efficient alternative to computationally expensive attention-based models for sequence modeling. They rely on linear recurrences to integrate information over time, enabling fast…

Machine Learning · Computer Science 2026-01-01 Mahdi Karami , Ali Behrouz , Peilin Zhong , Razvan Pascanu , Vahab Mirrokni

In the post-deep learning era, the Transformer architecture has demonstrated its powerful performance across pre-trained big models and various downstream tasks. However, the enormous computational demands of this architecture have deterred…

Transformers are the dominant architecture for sequence modeling, but there is growing interest in models that use a fixed-size latent state that does not depend on the sequence length, which we refer to as "generalized state space models"…

Machine Learning · Computer Science 2024-06-05 Samy Jelassi , David Brandfonbrener , Sham M. Kakade , Eran Malach

Recently, recurrent models based on linear state space models (SSMs) have shown promising performance in language modeling (LM), competitive with transformers. However, there is little understanding of the in-principle abilities of such…

Computation and Language · Computer Science 2025-12-15 Yash Sarrof , Yana Veitsman , Michael Hahn

State Space Models (SSMs) have become the leading alternative to Transformers for sequence modeling. Their primary advantage is efficiency in long-context and long-form generation, enabled by fixed-size memory and linear scaling of…

Machine Learning · Computer Science 2025-10-17 Eran Malach , Omid Saremi , Sinead Williamson , Arwen Bradley , Aryo Lotfi , Emmanuel Abbe , Josh Susskind , Etai Littwin

State space models (SSMs) have recently emerged as a powerful framework for long sequence processing, outperforming traditional methods on diverse benchmarks. Fundamentally, SSMs can generalize both recurrent and convolutional networks and…

Signal Processing · Electrical Eng. & Systems 2025-12-24 Xiaoyu Zhang , Mingtao Hu , Sen Lu , Soohyeon Kim , Eric Yeu-Jer Lee , Yuyang Liu , Wei D. Lu

Large Audio Language Models (LALM) combine the audio perception models and the Large Language Models (LLM) and show a remarkable ability to reason about the input audio, infer the meaning, and understand the intent. However, these systems…

Audio and Speech Processing · Electrical Eng. & Systems 2024-11-26 Saurabhchand Bhati , Yuan Gong , Leonid Karlinsky , Hilde Kuehne , Rogerio Feris , James Glass

Emerging applications such as AR are driving demands for machine intelligence capable of processing continuous and/or long-context inputs on local devices. However, currently dominant models based on the Transformer architecture suffer from…

Hardware Architecture · Computer Science 2026-03-24 Saptarshi Mitra , Rachid Karami , Haocheng Xu , Sitao Huang , Hyoukjun Kwon

State-space models (SSMs) have emerged as a potential alternative architecture for building large language models (LLMs) compared to the previously ubiquitous transformer architecture. One theoretical weakness of transformers is that they…

Machine Learning · Computer Science 2025-03-07 William Merrill , Jackson Petty , Ashish Sabharwal

Deep neural networks based on state space models (SSMs) are attracting significant attention in sequence modeling since their computational cost is much smaller than that of Transformers. While the capabilities of SSMs have been…

Machine Learning · Statistics 2025-03-06 Naoki Nishikawa , Taiji Suzuki

State Space Models (SSMs) have emerged as a promising alternative to Transformers for long-context sequence modeling, offering linear $O(N)$ computational complexity compared to the Transformer's quadratic $O(N^2)$ scaling. This paper…

Machine Learning · Computer Science 2026-01-06 Abidemi Koledoye , Chinemerem Unachukwu , Gold Nwobu , Hasin Rana

Long-range dependencies are critical for understanding genomic structure and function, yet most conventional methods struggle with them. Widely adopted transformer-based models, while excelling at short-context tasks, are limited by the…

Long Short-Term Memory (LSTM) is one of the most powerful sequence models. Despite the strong performance, however, it lacks the nice interpretability as in state space models. In this paper, we present a way to combine the best of both…

Machine Learning · Computer Science 2017-12-04 Xun Zheng , Manzil Zaheer , Amr Ahmed , Yuan Wang , Eric P Xing , Alexander J Smola

Video diffusion models have recently shown promise for world modeling through autoregressive frame prediction conditioned on actions. However, they struggle to maintain long-term memory due to the high computational cost associated with…

Computer Vision and Pattern Recognition · Computer Science 2025-05-27 Ryan Po , Yotam Nitzan , Richard Zhang , Berlin Chen , Tri Dao , Eli Shechtman , Gordon Wetzstein , Xun Huang

State-space models (SSMs) are a highly expressive model class for learning patterns in time series data and for system identification. Deterministic versions of SSMs (e.g. LSTMs) proved extremely successful in modeling complex time series…

Selective state-space models (SSMs) are an emerging alternative to the Transformer, offering the unique advantage of parallel training and sequential inference. Although these models have shown promising performance on a variety of tasks,…

Machine Learning · Computer Science 2025-07-08 Aleksandar Terzić , Michael Hersche , Giacomo Camposampiero , Thomas Hofmann , Abu Sebastian , Abbas Rahimi

State Space Models (SSMs), developed to tackle long sequence modeling tasks efficiently, offer both parallelizable training and fast inference. At their core are recurrent dynamical systems that maintain a hidden state, with update costs…

Machine Learning · Computer Science 2026-02-26 Makram Chahine , Philipp Nazari , Daniela Rus , T. Konstantin Rusch

Selective State-Space Models (SSMs) such as Mamba have emerged as an alternative architecture to self-attention based transformers in sequence modeling tasks. Recent works have demonstrated the use of transformers in some filtering and…

Systems and Control · Electrical Eng. & Systems 2026-04-28 Alex Tang , M. Emrullah Ildiz , Batin Kurt , Samet Oymak , Necmiye Ozay