Related papers: A model selection approach for multiple sequence s…

Variable selection using MM algorithms

Variable selection is fundamental to high-dimensional statistical modeling. Many variable selection techniques may be implemented by maximum penalized likelihood using various penalty functions. Optimizing the penalized likelihood function…

Statistics Theory · Mathematics 2007-06-13 David R. Hunter, Runze Li

Estimation and variable selection in high dimension in nonlinear mixed-effects models

We consider nonlinear mixed-effects models including high-dimensional covariates to model individual parameter variability. The objective is to identify relevant covariates among a large set under a sparsity assumption and to estimate model…

Statistics Theory · Mathematics 2025-08-06 Antoine Caillebotte, Estelle Kuhn, Sarah Lemler

Subset Selection for Multiple Linear Regression via Optimization

Subset selection in multiple linear regression aims to choose a subset of candidate explanatory variables that trade off fitting error (explanatory power) and model complexity (number of variables selected). We build mathematical programming…

Machine Learning · Statistics 2020-09-04 Young Woong Park, Diego Klabjan

Model selection for the segmentation of multiparameter exponential family distributions

We consider the segmentation problem of univariate distributions from the exponential family with multiple parameters. In segmentation, the choice of the number of segments remains a difficult issue due to the discrete nature of the…

Statistics Theory · Mathematics 2015-03-27 Alice Cleynen, Emilie Lebarbier

Multiple sequences Prophet Inequality Under Observation Constraints

In our problem, we are given access to a number of sequences of nonnegative i.i.d. random variables, whose realizations are observed sequentially. All sequences are of the same finite length. The goal is to pick one element from each…

Statistics Theory · Mathematics 2024-02-06 Aristomenis Tsopelakos, Olgica Milenkovic

On the Optimality of Averaging in Distributed Statistical Learning

A common approach to statistical learning with big data is to randomly split the data among $m$ machines and learn the parameter of interest by averaging the $m$ individual estimates. In this paper, focusing on empirical risk minimization, or…

Machine Learning · Statistics 2016-06-14 Jonathan Rosenblatt, Boaz Nadler

Fast Sequence Segmentation using Log-Linear Models

Sequence segmentation is a well-studied problem, where given a sequence of elements, an integer K, and some measure of homogeneity, the task is to split the sequence into K contiguous segments that are maximally homogeneous. A classic…

Data Structures and Algorithms · Computer Science 2019-02-12 Nikolaj Tatti

Best subset selection, persistence in high-dimensional statistical learning and optimization under $l_1$ constraint

Let $(Y,X_1,...,X_m)$ be a random vector. It is desired to predict $Y$ based on $(X_1,...,X_m)$. Examples of prediction methods are regression, classification using logistic regression or separating hyperplanes, and so on. We consider the…

Statistics Theory · Mathematics 2007-06-13 Eitan Greenshtein

Sequence Modeling via Segmentations

Segmental structure is a common pattern in many types of sequences such as phrases in human languages. In this paper, we present a probabilistic model for sequences via their segmentations. The probability of a segmented sequence is…

Machine Learning · Statistics 2018-07-20 Chong Wang, Yining Wang, Po-Sen Huang, Abdelrahman Mohamed, Dengyong Zhou, Li Deng

An Algorithm for Optimal Partitioning of Data on an Interval

Many signal processing problems can be solved by maximizing the fitness of a segmented model over all possible partitions of the data interval. This letter describes a simple but powerful algorithm that searches the exponentially large…

Numerical Analysis · Mathematics 2025-10-20 Brad Jackson, Jeffrey D. Scargle, David Barnes, Sundararajan Arabhi, Alina Alt, Peter Gioumousis, Elyus Gwin, Paungkaew Sangtrakulcharoen, Linda Tan, Tun Tao Tsai

Randomized maximum-contrast selection: subagging for large-scale regression

We introduce a very general method for sparse and large-scale variable selection. The large-scale regression setting is such that both the number of parameters and the number of samples are extremely large. The proposed method is based on…

Statistics Theory · Mathematics 2019-07-31 Jelena Bradic

Probabilistic Segmentation via Total Variation Regularization

We present a convex approach to probabilistic segmentation and modeling of time series data. Our approach builds upon recent advances in multivariate total variation regularization, and seeks to learn a separate set of parameters for the…

Machine Learning · Statistics 2015-11-17 Matt Wytock, J. Zico Kolter

Sequence Alignment Algorithm for Statistical Similarity Assessment

This paper presents a new approach to statistical similarity assessment based on sequence alignment. The algorithm performs mutual matching of two random sequences by successively searching for common elements and by applying sequence…

Signal Processing · Electrical Eng. & Systems 2021-06-09 Jakub Nikonowicz, Łukasz Matuszewski, Paweł Kubczak

Computational Complexity of Sub-Linear Convergent Algorithms

Optimizing machine learning algorithms that are used to solve the objective function has been of great interest. Several approaches to optimize common algorithms, such as gradient descent and stochastic gradient descent, were explored. One…

Machine Learning · Computer Science 2022-10-06 Hilal AlQuabeh, Farha AlBreiki, Dilshod Azizov

Detection and estimation of parameters in high dimensional multiple change point regression models via $\ell_1/\ell_0$ regularization and discrete optimization

Binary segmentation, which is sequential in nature, is thus far the most widely used method for identifying multiple change points in statistical models. Here we propose a top-down methodology called arbitrary segmentation that proceeds in a…

Statistics Theory · Mathematics 2019-06-12 Abhishek Kaul, Venkata K Jandhyala, Stergios B Fotopoulos

A Fast Hierarchical Multilevel Image Segmentation Method using Unbiased Estimators

This paper proposes a novel method for segmentation of images by hierarchical multilevel thresholding. The method is global, agglomerative in nature and disregards pixel locations. It involves the optimization of the ratio of the unbiased…

Computer Vision and Pattern Recognition · Computer Science 2007-12-27 Sreechakra Goparaju, Jayadev Acharya, Ajoy K. Ray, Jaideva C. Goswami

Streaming Algorithms for Partitioning Integer Sequences

We study the problem of partitioning integer sequences in the one-pass data streaming model. Given is an input stream of integers $X \in \{0, 1, \dots, m \}^n$ of length $n$ with maximum element $m$, and a parameter $p$. The goal is to…

Data Structures and Algorithms · Computer Science 2014-07-08 Christian Konrad, László Kozma

Learning-based Support Estimation in Sublinear Time

We consider the problem of estimating the number of distinct elements in a large data set (or, equivalently, the support size of the distribution induced by the data set) from a random sample of its elements. The problem occurs in many…

Machine Learning · Computer Science 2021-06-17 Talya Eden, Piotr Indyk, Shyam Narayanan, Ronitt Rubinfeld, Sandeep Silwal, Tal Wagner

Analysis of Adaptive Multilevel Splitting algorithms in an idealized case

The Adaptive Multilevel Splitting algorithm is a very powerful and versatile method to estimate rare event probabilities. It is an iterative procedure on an interacting particle system, where at each step, the $k$ less well-adapted…

Probability · Mathematics 2014-05-07 Charles-Edouard Bréhier, Tony Lelievre, Mathias Rousset

Strongly polynomial efficient approximation scheme for segmentation

Partitioning a sequence of length $n$ into $k$ coherent segments (Seg) is one of the classic optimization problems. As long as the optimization criterion is additive, Seg can be solved exactly in $O(n^2k)$ time using a classic dynamic…

Data Structures and Algorithms · Computer Science 2019-02-06 Nikolaj Tatti