Related papers: Efficient Model Selection for Predictive Pattern M…
In this paper we study predictive pattern mining problems where the goal is to construct a predictive model based on a subset of predictive patterns in the database. Our main contribution is to introduce a novel method called safe pattern…
The use of patterns in predictive models is a topic that has received a lot of attention in recent years. Pattern mining can help to obtain models for structured domains, such as graphs and sequences, and has been proposed as a means to…
Discovering statistically significant patterns from databases is an important challenging problem. The main obstacle of this problem is in the difficulty of taking into account the selection bias, i.e., the bias arising from the fact that…
We introduce a pruning algorithm that provably sparsifies the parameters of a trained model in a way that approximately preserves the model's predictive accuracy. Our algorithm uses a small batch of input points to construct a data-informed…
Given a set of sequences comprised of time-ordered events, sequential pattern mining is useful to identify frequent subsequences from different sequences or within the same sequence. However, in sport, these techniques cannot determine the…
Structured pruning is an effective approach for compressing large pre-trained neural networks without significantly affecting their performance. However, most current structured pruning methods do not provide any performance guarantees, and…
This paper proposes a frequent pattern data mining algorithm based on support vector machine (SVM), aiming to solve the performance bottleneck of traditional frequent pattern mining algorithms in high-dimensional and sparse data…
We introduce Probe Pruning (PP), a novel framework for online, dynamic, structured pruning of Large Language Models (LLMs) applied in a batch-wise manner. PP leverages the insight that not all samples and tokens contribute equally to the…
Statistically significant patterns mining (SSPM) is an essential and challenging data mining task in the field of knowledge discovery in databases (KDD), in which each pattern is evaluated via a hypothesis test. Our study aims to introduce…
Structured prediction problems are one of the fundamental tools in machine learning. In order to facilitate algorithm development for their numerical solution, we collect in one place a large number of datasets in easy to read formats for a…
Discriminative pattern mining is an essential task of data mining. This task aims to discover patterns which occur more frequently in a class than other classes in a class-labeled dataset. This type of patterns is valuable in various…
A powerful and flexible approach to structured prediction consists in embedding the structured objects to be predicted into a feature space of possibly infinite dimension by means of output kernels, and then, solving a regression problem in…
Data mining is the task of discovering interesting patterns from large amounts of data. There are many data mining tasks, such as classification, clustering, association rule mining, and sequential pattern mining. Sequential pattern mining…
With the growing popularity of shared resources, large volumes of complex data of different types are collected automatically. Traditional data mining algorithms generally have problems and challenges including huge memory cost, low…
This paper develops a memory-efficient approach for Sequential Pattern Mining (SPM), a fundamental topic in knowledge discovery that faces a well-known memory bottleneck for large data sets. Our methodology involves a novel hybrid trie data…
Prediction-powered inference is a framework for performing valid statistical inference when an experimental dataset is supplemented with predictions from a machine-learning system. The framework yields simple algorithms for computing…
A time series is a collection of measurements in chronological order. Discovering patterns from time series is useful in many domains, such as stock analysis, disease detection, and weather forecast. To discover patterns, existing methods…
Increasingly, researchers have suggested the benefits of temporal analysis to improve our understanding of the learning process. Sequential pattern mining (SPM), as a pattern recognition technique, has the potential to reveal the temporal…
Catastrophic forgetting is a challenge issue in continual learning when a deep neural network forgets the knowledge acquired from the former task after learning on subsequent tasks. However, existing methods try to find the joint…
In this paper, we propose a novel progressive parameter pruning method for Convolutional Neural Network acceleration, named Structured Probabilistic Pruning (SPP), which effectively prunes weights of convolutional layers in a probabilistic…