Related papers: MDL-based Compressing Sequential Rules

A Subsequence Interleaving Model for Sequential Pattern Mining

Recent sequential pattern mining methods have used the minimum description length (MDL) principle to define an encoding scheme which describes an algorithm for mining the most compressing patterns in a database. We present a novel…

Machine Learning · Statistics 2016-11-14 Jaroslav Fowkes, Charles Sutton

Summarizing Event Sequences with Serial Episodes: A Statistical Model and an Application

In this paper we address the problem of discovering a small set of frequent serial episodes from sequential data so as to adequately characterize or summarize the data. We discuss an algorithm based on the Minimum Description Length (MDL)…

Machine Learning · Computer Science 2019-04-02 Soumyajit Mitra, P S Sastry

Discovering Useful Compact Sets of Sequential Rules in a Long Sequence

We are interested in understanding the underlying generation process for long sequences of symbolic events. To do so, we propose COSSU, an algorithm to mine small and meaningful sets of sequential rules. The rules are selected using an…

Machine Learning · Computer Science 2023-01-02 Erwan Bourrand, Luis Galárraga, Esther Galbrun, Elisa Fromont, Alexandre Termier

The Long and the Short of It: Summarising Event Sequences with Serial Episodes

An ideal outcome of pattern mining is a small set of informative patterns, containing no redundancy or noise, that identifies the key structure of the data at hand. Standard frequent pattern miners do not achieve this goal, as due to the…

Data Structures and Algorithms · Computer Science 2019-02-11 Nikolaj Tatti, Jilles Vreeken

Guided Exploration of Sequential Rules

In pattern mining, sequential rules provide a formal framework to capture the temporal relationships and inferential dependencies between items. However, the discovery process is computationally intensive. To obtain mining results…

Databases · Computer Science 2026-02-20 Wensheng Gan, Gengsen Huang, Junyu Ren, Philip S. Yu

Discovering Compressing Serial Episodes from Event Sequences

Most pattern mining methods output a very large number of frequent patterns and isolating a small but relevant subset is a challenging problem of current interest in frequent pattern mining. In this paper we consider discovery of a small…

Databases · Computer Science 2014-10-14 A. Ibrahim, Shivakumar Sastry, P. S. Sastry

Towards Target Sequential Rules

In many real-world applications, sequential rule mining (SRM) can provide prediction and recommendation functions for a variety of services. It is an important technique of pattern mining to discover all valuable rules that belong to…

Databases · Computer Science 2022-06-13 Wensheng Gan, Gengsen Huang, Jian Weng, Tianlong Gu, Philip S. Yu

Exploring User Retrieval Integration towards Large Language Models for Cross-Domain Sequential Recommendation

Cross-Domain Sequential Recommendation (CDSR) aims to mine and transfer users' sequential preferences across different domains to alleviate the long-standing cold-start issue. Traditional CDSR models capture collaborative information…

Machine Learning · Computer Science 2024-06-06 Tingjia Shen, Hao Wang, Jiaqing Zhang, Sirui Zhao, Liangyue Li, Zulong Chen, Defu Lian, Enhong Chen

Constraint-based Sequential Pattern Mining with Decision Diagrams

Constrained sequential pattern mining aims at identifying frequent patterns on a sequential database of items while observing constraints defined over the item attributes. We introduce novel techniques for constraint-based sequential…

Machine Learning · Computer Science 2019-01-01 Amin Hosseininasab, Willem-Jan van Hoeve, Andre A. Cire

Automatic Parameter Selection for Non-Redundant Clustering

High-dimensional datasets often contain multiple meaningful clusterings in different subspaces. For example, objects can be clustered either by color, weight, or size, revealing different interpretations of the given dataset. A variety of…

Machine Learning · Computer Science 2025-04-08 Collin Leiber, Dominik Mautz, Claudia Plant, Christian Böhm

Robust Subspace Clustering with Compressed Data

Dimension reduction is widely regarded as an effective way for decreasing the computation, storage and communication loads of data-driven intelligent systems, leading to a growing demand for statistical methods that allow analysis (e.g.,…

Computer Vision and Pattern Recognition · Computer Science 2019-08-23 Guangcan Liu, Zhao Zhang, Qingshan Liu, Kongkai Xiong

Learned Data Compression: Challenges and Opportunities for the Future

Compressing integer keys is a fundamental operation among multiple communities, such as database management (DB), information retrieval (IR), and high-performance computing (HPC). Recent advances in learned indexes have inspired the…

Databases · Computer Science 2024-12-17 Qiyu Liu, Siyuan Han, Jianwei Liao, Jin Li, Jingshu Peng, Jun Du, Lei Chen

A Generic Network Compression Framework for Sequential Recommender Systems

Sequential recommender systems (SRS) have become the key technology in capturing user's dynamic interests and generating high-quality recommendations. Current state-of-the-art sequential recommender models are typically based on a…

Information Retrieval · Computer Science 2020-05-27 Yang Sun, Fajie Yuan, Min Yang, Guoao Wei, Zhou Zhao, Duo Liu

A Rough Sets Partitioning Model for Mining Sequential Patterns with Time Constraint

Nowadays, data mining and knowledge discovery methods are applied to a variety of enterprise and engineering disciplines to uncover interesting patterns from databases. The study of sequential patterns is an important data mining problem…

Databases · Computer Science 2009-06-24 Jigyasa Bisaria, Namita Shrivastava, K. R. Pardasani

The Minimum Description Length Principle for Pattern Mining: A Survey

This is about the Minimum Description Length (MDL) principle applied to pattern mining. The length of this description is kept to the minimum. Mining patterns is a core task in data analysis and, beyond issues of efficient enumeration, the…

Databases · Computer Science 2022-07-29 Esther Galbrun

Extension of Dictionary-Based Compression Algorithms for the Quantitative Visualization of Patterns from Log Files

Many services today massively and continuously produce log files of different and varying formats. These logs are important since they contain information about the application activities, which is necessary for improvements by analyzing…

Information Retrieval · Computer Science 2023-04-11 Igor Cherepanov, Jonathan Geraldi Joewono, Arjan Kuijper, Jörn Kohlhammer

Progressive Compressed Records: Taking a Byte out of Deep Learning Data

Deep learning accelerators efficiently train over vast and growing amounts of data, placing a newfound burden on commodity networks and storage devices. A common approach to conserve bandwidth involves resizing or compressing data prior to…

Machine Learning · Computer Science 2021-08-13 Michael Kuchnik, George Amvrosiadis, Virginia Smith

Maximal co-occurrence nonoverlapping sequential rule mining

The aim of sequential pattern mining (SPM) is to discover potentially useful information from a given sequence. Although various SPM methods have been investigated, most of these focus on mining all of the patterns. However, users…

Databases · Computer Science 2023-01-31 Yan Li, Chang Zhang, Jie Li, Wei Song, Zhenlian Qi, Youxi Wu, Xindong Wu

The Minimal Compression Rate for Similarity Identification

Traditionally, data compression deals with the problem of concisely representing a data source, e.g. a sequence of letters, for the purpose of eventual reproduction (either exact or approximate). In this work we are interested in the case…

Information Theory · Computer Science 2013-12-10 Amir Ingber, Tsachy Weissman

Optimizations and Heuristics to Improve Compression in Columnar Database Systems

In-memory columnar databases have become mainstream over the last decade and have vastly improved the fast processing of large volumes of data through multi-core parallelism and in-memory compression thereby eliminating the usual…

Databases · Computer Science 2016-09-27 Jayanth Jayanth