Related papers: Algorithm for multivariate data standardization up…
Following the Student t-statistic, normalization has been a widely used method in statistics and other disciplines including economics, ecology and machine learning. We focus on statistics taking the form of a ratio over (some power of) the…
The classification of complex data usually requires the composition of processing steps. Here, a major challenge is the selection of optimal algorithms for preprocessing and classification (including parameterizations). Nowadays, parts of…
With increasing computing capabilities of modern supercomputers, the size of the data generated from the scientific simulations is growing rapidly. As a result, application scientists need effective data summarization techniques that can…
We propose a new algorithm for approximating the non-asymptotic second moment of the marginal likelihood estimate, or normalizing constant, provided by a particle filter. The computational cost of the new method is $O(M)$ per time step,…
We extend a general result showing that the asymptotic behavior of high moments, factorial or standard, of random variables determines asymptotic normality, from the one-dimensional to the multidimensional setting. This approach…
Many key problems in machine learning and data science are routinely modeled as optimization problems and solved via optimization algorithms. With the increase of the volume of data and the size and complexity of the statistical models used…
This paper focuses on regularisation methods using models up to the third order to search for up to second-order critical points of a finite-sum minimisation problem. The variant presented belongs to the framework of [3]: it employs random…
Normalization is a pre-processing stage for any type of problem statement. It plays an especially important role in fields such as soft computing and cloud computing, for manipulation of data such as scale down or scale…
Phase estimation is a quantum algorithm for measuring the eigenvalues of a Hamiltonian. We propose and rigorously analyse a randomized phase estimation algorithm with two distinctive features. First, our algorithm has complexity independent…
Meta-analysis, because of both logistical convenience and statistical efficiency, is widely popular for synthesizing information on common parameters of interest across multiple studies. We propose developing a generalized meta-analysis…
This paper is a continuation of our earlier work \cite{NRxx} in which a numerical moment method with arbitrary order of moments was presented. However, the computation may break down during the calculation of the structure of a shock wave…
Data transformation, normalization and handling of batch effects are key parts of data analysis for almost all spectrometry-based omics data. This paper reviews and contrasts these three distinct aspects. We present a systematic overview of…
Combining the representations of the words that make up a sentence into a cohesive whole is difficult, since it needs to account for the order of words, and to establish how the words present relate to each other. The solution we propose…
Principal Moment Analysis is a method designed for dimension reduction, analysis and visualization of high dimensional multivariate data. It generalizes Principal Component Analysis and allows for significant statistical modeling…
Multivariate longitudinal data of mixed type are increasingly collected in many scientific domains. However, algorithms to cluster this kind of data remain scarce, owing to the challenge of simultaneously modeling the within- and between-time…
Data segmentation a.k.a. multiple change point analysis has received considerable attention due to its importance in time series analysis and signal processing, with applications in a variety of fields including natural and social sciences,…
Shapelets are phase independent subsequences designed for time series classification. We propose three adaptations to the Shapelet Transform (ST) to capture multivariate features in multivariate time series classification. We create a…
Personalization is being applied to a great extent in many systems. This paper presents a multi-dimensional user data model and its application in web search. Online and offline activities of the user are tracked for creating the user model.…
We treat collaborative filtering as a univariate time series estimation problem: given a user's previous votes, predict the next vote. We describe two families of methods for transforming data to encode time order in ways amenable to…
This paper studies the application of the generalized method of moments (GMM) to multi-reference alignment (MRA): the problem of estimating a signal from its circularly-translated and noisy copies. We begin by proving that the GMM estimator…