English
Related papers

Related papers: A model for learning to segment temporal sequences…

200 papers

Recurrent Neural Networks (RNNs) achieve state-of-the-art results in many sequence-to-sequence modeling tasks. However, RNNs are difficult to train and tend to suffer from overfitting. Motivated by the Data Processing Inequality (DPI), we…

Machine Learning · Statistics 2018-05-24 Ziv Aharoni , Gal Rattner , Haim Permuter

Although Recurrent Neural Network (RNN) has been a powerful tool for modeling sequential data, its performance is inadequate when processing sequences with multiple patterns. In this paper, we address this challenge by introducing a novel…

Machine Learning · Computer Science 2019-02-28 Kui Zhao , Yuechuan Li , Chi Zhang , Cheng Yang , Huan Xu

This paper presents a novel approach to imitation learning from observations, where an autoregressive mixture of experts model is deployed to fit the underlying policy. The parameters of the model are learned via a two-stage framework. By…

Machine Learning · Computer Science 2024-11-14 Renzi Wang , Flavia Sofia Acerbo , Tong Duy Son , Panagiotis Patrinos

Stochastic gradient methods are the workhorse (algorithms) of large-scale optimization problems in machine learning, signal processing, and other computational sciences and engineering. This paper studies Markov chain gradient descent, a…

Optimization and Control · Mathematics 2018-09-13 Tao Sun , Yuejiao Sun , Wotao Yin

Recurrent neural networks (RNNs) have shown promising performance for language modeling. However, traditional training of RNNs using back-propagation through time often suffers from overfitting. One reason for this is that stochastic…

Computation and Language · Computer Science 2017-04-25 Zhe Gan , Chunyuan Li , Changyou Chen , Yunchen Pu , Qinliang Su , Lawrence Carin

A common problem in training neural networks is the vanishing and/or exploding gradient problem which is more prominently seen in training of Recurrent Neural Networks (RNNs). Thus several algorithms have been proposed for training RNNs.…

Machine Learning · Computer Science 2019-09-10 S. Indrapriyadarsini , Shahrzad Mahboubi , Hiroshi Ninomiya , Hideki Asai

Mixture of experts method is a neural network based ensemble learning that has great ability to improve the overall classification accuracy. This method is based on the divide and conquer principle, in which the problem space is divided…

Machine Learning · Computer Science 2021-05-26 Laleh Armi , Elham Abbasi , Jamal Zarepour-Ahmadabadi

Deep Recurrent Neural Network architectures, though remarkably capable at modeling sequences, lack an intuitive high-level spatio-temporal structure. That is while many problems in computer vision inherently have an underlying high-level…

Computer Vision and Pattern Recognition · Computer Science 2016-04-12 Ashesh Jain , Amir R. Zamir , Silvio Savarese , Ashutosh Saxena

Manipulation tasks often consist of subtasks, each representing a distinct skill. Mastering these skills is essential for robots, as it enhances their autonomy, efficiency, adaptability, and ability to work in their environment. Learning…

Robotics · Computer Science 2025-05-21 Juyan Zhang , Dana Kulic , Michael Burke

We present a novel Recurrent Graph Network (RGN) approach for predicting discrete marked event sequences by learning the underlying complex stochastic process. Using the framework of Point Processes, we interpret a marked discrete event…

Machine Learning · Computer Science 2022-08-12 Saurabh Dash , Xueyuan She , Saibal Mukhopadhyay

Recurrent neural network is a powerful model that learns temporal patterns in sequential data. For a long time, it was believed that recurrent networks are difficult to train using simple optimizers, such as stochastic gradient descent, due…

Neural and Evolutionary Computing · Computer Science 2015-04-20 Tomas Mikolov , Armand Joulin , Sumit Chopra , Michael Mathieu , Marc'Aurelio Ranzato

The focus of this paper is dynamic gesture recognition in the context of the interaction between humans and machines. We propose a model consisting of two sub-networks, a transformer and an ordered-neuron long-short-term-memory (ON-LSTM)…

Computer Vision and Pattern Recognition · Computer Science 2023-11-03 Kenneth Lai , Svetlana Yanushkevich

Graph incremental learning is a learning paradigm that aims to adapt trained models to continuously incremented graphs and data over time without the need for retraining on the full dataset. However, regular graph machine learning methods…

Machine Learning · Computer Science 2025-08-14 Lecheng Kong , Theodore Vasiloudis , Seongjun Yun , Han Xie , Xiang Song

Models for sequential data such as the recurrent neural network (RNN) often implicitly model a sequence as having a fixed time interval between observations and do not account for group-level effects when multiple sequences are observed. We…

Machine Learning · Computer Science 2018-12-27 Ghazal Fazelnia , Mark Ibrahim , Ceena Modarres , Kevin Wu , John Paisley

In this work, we propose an efficient implementation of mixtures of experts distributional regression models which exploits robust estimation by using stochastic first-order optimization techniques with adaptive learning rate schedulers. We…

Computation · Statistics 2026-03-23 David Rügamer , Florian Pfisterer , Bernd Bischl , Bettina Grün

We introduce a Mixture of Raytraced Experts, a stacked Mixture of Experts (MoE) architecture which can dynamically select sequences of experts, producing computational graphs of variable width and depth. Existing MoE architectures generally…

Machine Learning · Computer Science 2025-07-17 Andrea Perin , Giacomo Lagomarsini , Claudio Gallicchio , Giuseppe Nuti

Currently, instance segmentation is attracting more and more attention in machine learning region. However, there exists some defects on the information propagation in previous Mask R-CNN and other network models. In this paper, we propose…

Computer Vision and Pattern Recognition · Computer Science 2021-06-08 Kuikun Liu , Jie Yang , Cai Sun , Haoyuan Chi

This study introduces a novel approach for learning mixtures of Markov chains, a critical process applicable to various fields, including healthcare and the analysis of web users. Existing research has identified a clear divide in…

Machine Learning · Computer Science 2024-05-27 Fabian Spaeh , Konstantinos Sotiropoulos , Charalampos E. Tsourakakis

Mix-up training approaches have proven to be effective in improving the generalization ability of Deep Neural Networks. Over the years, the research community expands mix-up methods into two directions, with extensive efforts to improve…

Computer Vision and Pattern Recognition · Computer Science 2023-08-14 Minh-Long Luu , Zeyi Huang , Eric P. Xing , Yong Jae Lee , Haohan Wang

The problem of efficiently sampling from a set of(undirected) graphs with a given degree sequence has many applications. One approach to this problem uses a simple Markov chain, which we call the switch chain, to perform the sampling. The…

Data Structures and Algorithms · Computer Science 2014-12-18 Catherine Greenhill
‹ Prev 1 2 3 10 Next ›