Related papers: LMLFM: Longitudinal Multi-Level Factorization Mach…
Selective inference aims at providing valid inference after a data-driven selection of models or hypotheses. It is essential to avoid overconfident results and replicability issues. While significant advances have been made in this area for…
Latent Factor Model (LFM) is one of the most successful methods for Collaborative filtering (CF) in the recommendation system, in which both users and items are projected into a joint latent factor space. Base on matrix factorization…
Probabilistic modeling is one of the foundations of modern machine learning and artificial intelligence. In this paper, we propose a novel type of probabilistic models named latent dependency forest models (LDFMs). A LDFM models the…
Multivariate longitudinal data of mixed-type are increasingly collected in many science domains. However, algorithms to cluster this kind of data remain scarce, due to the challenge to simultaneously model the within- and between-time…
In this paper we address the problem of modeling relational data, which appear in many applications such as social network analysis, recommender systems and bioinformatics. Previous studies either consider latent feature based models but…
Lifelong machine learning (LML) is an area of machine learning research concerned with human-like persistent and cumulative nature of learning. LML system's objective is consolidating new information into an existing machine learning model…
Linear mixed models (LMMs), which incorporate fixed and random effects, are key tools for analyzing heterogeneous data, such as in personalized medicine. Nowadays, this type of data is increasingly wide, sometimes containing thousands of…
In various web applications like targeted advertising and recommender systems, the available categorical features (e.g., product type) are often of great importance but sparse. As a widely adopted solution, models based on Factorization…
It is often of interest to perform clustering on longitudinal data, yet it is difficult to formulate an intuitive model for which estimation is computationally feasible. We propose a model-based clustering method for clustering objects that…
Long-term time series forecasting (LTSF) offers broad utility in practical settings like energy consumption and weather prediction. Accurately predicting long-term changes, however, is demanding due to the intricate temporal patterns and…
Longitudinal analysis is important in many disciplines, such as the study of behavioral transitions in social science. Only very recently, feature selection has drawn adequate attention in the context of longitudinal modeling. Standard…
Causal discovery for both cross-sectional and temporal data has traditionally followed a dataset-specific paradigm, where a new model is fitted for each individual dataset. Such an approach limits the potential of multi-dataset pretraining.…
Sequence labeling is a fundamental problem in machine learning, natural language processing and many other fields. A classic approach to sequence labeling is linear chain conditional random fields (CRFs). When combined with neural network…
Modern applications of machine learning (ML) deal with increasingly heterogeneous datasets comprised of data collected from overlapping latent subpopulations. As a result, traditional models trained over large datasets may fail to recognize…
Unmeasured or latent variables are often the cause of correlations between multivariate measurements, which are studied in a variety of fields such as psychology, ecology, and medicine. For Gaussian measurements, there are classical tools…
Latent Class Models (LCMs) are used to cluster multivariate categorical data (e.g. group participants based on survey responses). Traditional LCMs assume a property called conditional independence. This assumption can be restrictive,…
Generative models for sequential data often struggle with sparsely sampled and high-dimensional trajectories, typically reducing the learning of dynamics to pairwise transitions. We propose Interpolative Multi-Marginal Flow Matching…
We consider the problem of learning a classifier from observed functional data. Here, each data-point takes the form of a single time-series and contains numerous features. Assuming that each such series comes with a binary label, the…
We present generalized additive latent and mixed models (GALAMMs) for analysis of clustered data with responses and latent variables depending smoothly on observed variables. A scalable maximum likelihood estimation algorithm is proposed,…
Machine learning can provide deep insights into data, allowing machines to make high-quality predictions and having been widely used in real-world applications, such as text mining, visual classification, and recommender systems. However,…