Related papers: Skopus: Mining top-k sequential patterns under lev…

Efficient Discovering of Top-K Sequential Patterns in Event-Based Spatio-Temporal Data

We consider the problem of discovering sequential patterns from event-based spatio-temporal data. The dataset is described by a set of event types and their instances. Based on the given dataset, the task is to discover all significant…

Databases · Computer Science 2017-07-04 Piotr S. Maciąg

Towards Top-$K$ Non-Overlapping Sequential Patterns

Sequential pattern mining (SPM) has excellent prospects and application spaces and has been widely used in different fields. The non-overlapping SPM, as one of the data mining techniques, has been used to discover patterns that have…

Databases · Computer Science 2023-04-25 Zefeng Chen, Wensheng Gan, Gengsen Huang, Yan Li, Zhenlian Qi

TKUS: Mining Top-K High-Utility Sequential Patterns

High-utility sequential pattern mining (HUSPM) has recently emerged as a focus of intense research interest. The main task of HUSPM is to find all subsequences, within a quantitative sequential database, that have high utility with respect…

Databases · Computer Science 2020-11-30 Chunkai Zhang, Zilin Du, Wensheng Gan, Philip S. Yu

Efficient Support Coupled Frequent Pattern Mining Over Progressive Databases

There have been many recent studies on sequential pattern mining. The sequential pattern mining on progressive databases is relatively very new, in which we progressively discover the sequential patterns in period of interest. Period of…

Databases · Computer Science 2010-07-15 B. N. Keshavamurthy, Mitesh Sharma, Durga Toshniwal

Consecutive Support: Better Be Close!

We propose a new measure of support (the number of occurrences of a pattern), in which instances are more important if they occur with a certain frequency and close after each other in the stream of transactions. We will explain this…

Artificial Intelligence · Computer Science 2007-05-23 Edgar de Graaf, Jeannette de Graaf, Walter A. Kosters

A New Algorithm for Exploratory Projection Pursuit

In this paper, we propose a new algorithm for exploratory projection pursuit. The basis of the algorithm is the insight that previous approaches used fairly narrow definitions of interestingness / non interestingness. We argue that allowing…

Methodology · Statistics 2011-12-20 Mohit Dayal

A Subsequence Interleaving Model for Sequential Pattern Mining

Recent sequential pattern mining methods have used the minimum description length (MDL) principle to define an encoding scheme which describes an algorithm for mining the most compressing patterns in a database. We present a novel…

Machine Learning · Statistics 2016-11-14 Jaroslav Fowkes, Charles Sutton

Interestingness Measure for Mining Spatial Gene Expression Data using Association Rule

The search for interesting association rules is an important topic in knowledge discovery in spatial gene expression databases. The set of admissible rules for the selected support and confidence thresholds can easily be extracted by…

Databases · Computer Science 2010-03-25 M. Anandhavalli, M. K. Ghose, K. Gauthaman

An Efficient Rigorous Approach for Identifying Statistically Significant Frequent Itemsets

As advances in technology allow for the collection, storage, and analysis of vast amounts of data, the task of screening and assessing the significance of discovered patterns is becoming a major challenge in data mining applications. In…

Databases · Computer Science 2010-02-08 Adam Kirsch, Michael Mitzenmacher, Andrea Pietracaprina, Geppino Pucci, Eli Upfal, Fabio Vandin

Simple and Scalable Sparse k-means Clustering via Feature Ranking

Clustering, a fundamental activity in unsupervised learning, is notoriously difficult when the feature space is high-dimensional. Fortunately, in many realistic scenarios, only a handful of features are relevant in distinguishing clusters.…

Machine Learning · Statistics 2020-10-23 Zhiyue Zhang, Kenneth Lange, Jason Xu

Finding Skewed Subcubes Under a Distribution

Say that we are given samples from a distribution $\psi$ over an $n$-dimensional space. We expect or desire $\psi$ to behave like a product distribution (or a $k$-wise independent distribution over its marginals for small $k$). We propose…

Data Structures and Algorithms · Computer Science 2020-11-16 Parikshit Gopalan, Roie Levin, Udi Wieder

Interestingness as an Inductive Heuristic for Future Compression Progress

One of the bottlenecks on the way towards recursively self-improving systems is the challenge of interestingness: the ability to prospectively identify which tasks or data hold the potential for future progress. We formalize interestingness…

Artificial Intelligence · Computer Science 2026-05-15 Vincent Herrmann, Jürgen Schmidhuber

A Statistical Perspective on Algorithmic Leveraging

One popular method for dealing with large-scale data sets is sampling. For example, by using the empirical statistical leverage scores as an importance sampling distribution, the method of algorithmic leveraging samples and rescales…

Methodology · Statistics 2013-06-25 Ping Ma, Michael W. Mahoney, Bin Yu

Enhancing Clustering: An Explainable Approach via Filtered Patterns

Machine learning has become a central research area, with increasing attention devoted to explainable clustering, also known as conceptual clustering, which is a knowledge-driven unsupervised learning paradigm that partitions data into…

Artificial Intelligence · Computer Science 2026-04-15 Motaz Ben Hassine, Saïd Jabbour

Mode-wise Principal Subspace Pursuit and Matrix Spiked Covariance Model

This paper introduces a novel framework called Mode-wise Principal Subspace Pursuit (MOP-UP) to extract hidden variations in both the row and column dimensions for matrix data. To enhance the understanding of the framework, we introduce a…

Methodology · Statistics 2024-08-06 Runshi Tang, Ming Yuan, Anru R. Zhang

A Bayesian Network Model for Interesting Itemsets

Mining itemsets that are the most interesting under a statistical model of the underlying data is a commonly used and well-studied technique for exploratory data analysis, with the most recent interestingness models exhibiting state of the…

Machine Learning · Statistics 2016-11-14 Jaroslav Fowkes, Charles Sutton

Guided Exploration of Sequential Rules

In pattern mining, sequential rules provide a formal framework to capture the temporal relationships and inferential dependencies between items. However, the discovery process is computationally intensive. To obtain mining results…

Databases · Computer Science 2026-02-20 Wensheng Gan, Gengsen Huang, Junyu Ren, Philip S. Yu

A study of dependency features of spike trains through copulas

Simultaneous recordings from many neurons hide important information and the connections characterizing the network remain generally undiscovered despite the progresses of statistical and machine learning techniques. Discerning the presence…

Applications · Statistics 2019-03-21 Pietro Verzelli, Laura Sacerdote

Expedition: A System for the Unsupervised Learning of a Hierarchy of Concepts

We present a system for bottom-up cumulative learning of myriad concepts corresponding to meaningful character strings, and their part-related and prediction edges. The learning is self-supervised in that the concepts discovered are used as…

Machine Learning · Computer Science 2021-12-20 Omid Madani

Ranking Episodes using a Partition Model

One of the biggest setbacks in traditional frequent pattern mining is that overwhelmingly many of the discovered patterns are redundant. A prototypical example of such redundancy is a freerider pattern where the pattern contains a true…

Data Structures and Algorithms · Computer Science 2019-02-05 Nikolaj Tatti