Related papers: Learning Overcomplete HMMs

Fundamental limits for learning hidden Markov model parameters

We study the frontier between learnable and unlearnable hidden Markov models (HMMs). HMMs are flexible tools for clustering dependent data coming from unknown populations. The model parameters are known to be fully identifiable (up to…

Machine Learning · Statistics 2022-10-25 Kweku Abraham, Zacharie Naulet, Elisabeth Gassiat

Minimal Realization Problems for Hidden Markov Models

Consider a stationary discrete random process with alphabet size d, which is assumed to be the output process of an unknown stationary Hidden Markov Model (HMM). Given the joint probabilities of finite length strings of the process, we are…

Machine Learning · Computer Science 2015-12-15 Qingqing Huang, Rong Ge, Sham Kakade, Munther Dahleh

End-to-End Training of a Neural HMM with Label and Transition Probabilities

We investigate a novel modeling approach for end-to-end neural network training using hidden Markov models (HMM) where the transition probabilities between hidden states are modeled and learned explicitly. Most contemporary…

Machine Learning · Computer Science 2023-10-10 Daniel Mann, Tina Raissi, Wilfried Michel, Ralf Schlüter, Hermann Ney

On Learning Latent Models with Multi-Instance Weak Supervision

We consider a weakly supervised learning scenario where the supervision signal is generated by a transition function $\sigma$ of labels associated with multiple input instances. We formulate this problem as \emph{multi-instance Partial…

Machine Learning · Computer Science 2024-07-16 Kaifu Wang, Efthymia Tsamoura, Dan Roth

When Is Partially Observable Reinforcement Learning Not Scary?

Applications of Reinforcement Learning (RL), in which agents learn to make a sequence of decisions despite lacking complete information about the latent states of the controlled system, that is, they act under partial observability of the…

Machine Learning · Computer Science 2022-05-26 Qinghua Liu, Alan Chung, Csaba Szepesvári, Chi Jin

Frontiers to the learning of nonparametric hidden Markov models

Hidden Markov models (HMMs) are flexible tools for clustering dependent data coming from unknown populations, allowing nonparametric modelling of the population densities. Identifiability fails when the data is in fact independent and…

Statistics Theory · Mathematics 2025-07-16 Kweku Abraham, Elisabeth Gassiat, Zacharie Naulet

On learning parametric-output HMMs

We present a novel approach for learning an HMM whose outputs are distributed according to a parametric family. This is done by decoupling the learning task into two steps: first estimating the output parameters, and then estimating…

Machine Learning · Computer Science 2013-02-26 Aryeh Kontorovich, Boaz Nadler, Roi Weiss

Transductive Learning Is Compact

We demonstrate a compactness result holding broadly across supervised learning with a general class of loss functions: Any hypothesis class $H$ is learnable with transductive sample complexity $m$ precisely when all of its finite…

Machine Learning · Computer Science 2024-10-31 Julian Asilis, Siddartha Devic, Shaddin Dughmi, Vatsal Sharan, Shang-Hua Teng

HyperMM : Robust Multimodal Learning with Varying-sized Inputs

Combining multiple modalities carrying complementary information through multimodal learning (MML) has shown considerable benefits for diagnosing multiple pathologies. However, the robustness of multimodal models to missing modalities is…

Machine Learning · Computer Science 2024-07-31 Hava Chaptoukaev, Vincenzo Marcianó, Francesco Galati, Maria A. Zuluaga

Learning through atypical "phase transitions" in overparameterized neural networks

Current deep neural networks are highly overparameterized (up to billions of connection weights) and nonlinear. Yet they can fit data almost perfectly through variants of gradient descent algorithms and achieve unexpected levels of…

Machine Learning · Computer Science 2022-07-27 Carlo Baldassi, Clarissa Lauditi, Enrico M. Malatesta, Rosalba Pacelli, Gabriele Perugini, Riccardo Zecchina

Does MAML Only Work via Feature Re-use? A Data Centric Perspective

Recent work has suggested that a good embedding is all we need to solve many few-shot learning benchmarks. Furthermore, other work has strongly suggested that Model Agnostic Meta-Learning (MAML) also works via this same method - by learning…

Machine Learning · Computer Science 2021-12-28 Brando Miranda, Yu-Xiong Wang, Sanmi Koyejo

More Algorithms for Provable Dictionary Learning

In dictionary learning, also known as sparse coding, the algorithm is given samples of the form $y = Ax$ where $x\in \mathbb{R}^m$ is an unknown random sparse vector and $A$ is an unknown dictionary matrix in $\mathbb{R}^{n\times m}$…

Data Structures and Algorithms · Computer Science 2014-01-06 Sanjeev Arora, Aditya Bhaskara, Rong Ge, Tengyu Ma

Impossible Triangle: What's Next for Pre-trained Language Models?

Recent developments of large-scale pre-trained language models (PLM) have significantly improved the capability of models in various NLP tasks, in terms of performance after task-specific fine-tuning and zero-shot / few-shot learning.…

Computation and Language · Computer Science 2022-04-21 Chenguang Zhu, Michael Zeng

Hierarchical Meta Learning

Meta learning is a promising solution to few-shot learning problems. However, existing meta learning methods are restricted to the scenarios where training and application tasks share the same output structure. To obtain a meta model…

Machine Learning · Computer Science 2019-04-22 Yingtian Zou, Jiashi Feng

Biasless Language Models Learn Unnaturally: How LLMs Fail to Distinguish the Possible from the Impossible

Are large language models (LLMs) sensitive to the distinction between humanly possible and impossible languages? This question was recently used in a broader debate on whether LLMs and humans share the same innate learning biases. Previous…

Computation and Language · Computer Science 2026-04-01 Imry Ziv, Nur Lan, Emmanuel Chemla

True Few-Shot Learning with Language Models

Pretrained language models (LMs) perform well on many tasks even when learning from a few examples, but prior work uses many held-out examples to tune various aspects of learning, such as hyperparameters, training objectives, and natural…

Computation and Language · Computer Science 2021-05-25 Ethan Perez, Douwe Kiela, Kyunghyun Cho

Learning Languages in the Limit from Positive Information with Finitely Many Memory Changes

We investigate learning collections of languages from texts by an inductive inference machine with access to the current datum and a bounded memory in form of states. Such a bounded memory states (BMS) learner is considered successful in…

Formal Languages and Automata Theory · Computer Science 2021-06-18 Timo Kötzing, Karen Seidel

Matrix Completion with Hypergraphs: Sharp Thresholds and Efficient Algorithms

This paper considers the problem of completing a rating matrix based on sub-sampled matrix entries as well as observed social graphs and hypergraphs. We show that there exists a sharp threshold on the sample probability for the task…

Machine Learning · Computer Science 2026-05-29 Zhongtian Ma, Qiaosheng Zhang, Zhen Wang

Breaking Chains: Unraveling the Links in Multi-Hop Knowledge Unlearning

Large language models (LLMs) serve as giant information stores, often including personal or copyrighted data, and retraining them from scratch is not a viable option. This has led to the development of various fast, approximate unlearning…

Computation and Language · Computer Science 2024-10-18 Minseok Choi, ChaeHun Park, Dohyun Lee, Jaegul Choo

Learning Hidden Markov Models Using Conditional Samples

This paper is concerned with the computational complexity of learning the Hidden Markov Model (HMM). Although HMMs are some of the most widely used tools in sequential and time series modeling, they are cryptographically hard to learn in…

Machine Learning · Computer Science 2024-02-27 Sham M. Kakade, Akshay Krishnamurthy, Gaurav Mahajan, Cyril Zhang