数据库
Utility-oriented pattern mining has become an emerging topic since it can reveal high-utility patterns (e.g., itemsets, rules, sequences) from different types of data, which provides more information than the traditional…
Useful knowledge, embedded in a database, is likely to change over time. Identifying recent changes in temporal databases can provide valuable up-to-date information to decision-makers. Nevertheless, techniques for mining high-utility…
Utility is an important concept in economics. A variety of applications consider utility in real-life situations, which has lead to the emergence of utility-oriented mining (also called utility mining) in the recent decade. Utility mining…
Modern Internet of Things (IoT) applications generate massive amounts of data, much of it in the form of objects/items of readings, events, and log entries. Specifically, most of the objects in these IoT data contain rich embedded…
Utility-oriented mining which integrates utility theory and data mining is a useful tool for understanding economic consumer behavior. Traditional algorithms for mining high-utility patterns (HUPs) applies a single/uniform minimum…
Mining useful patterns from varied types of databases is an important research topic, which has many real-life applications. Most studies have considered the frequency as sole interestingness measure for identifying high quality patterns.…
With the growing popularity of shared resources, large volumes of complex data of different types are collected automatically. Traditional data mining algorithms generally have problems and challenges including huge memory cost, low…
The main purpose of data mining and analytics is to find novel, potentially useful patterns that can be utilized in real-world applications to derive beneficial knowledge. For identifying and evaluating the usefulness of different kinds of…
Graph similarity search algorithms usually leverage the structural properties of a database. Hence, these algorithms are effective only on some structural variations of the data and are ineffective on other forms, which makes them hard to…
Fuzzy systems have good modeling capabilities in several data science scenarios, and can provide human-explainable intelligence models with explainability and interpretability. In contrast to transaction data, which have been extensively…
Machine learning (ML) is now commonplace, powering data-driven applications in various organizations. Unlike the traditional perception of ML in research, ML production pipelines are complex, with many interlocking analytical components…
Sample-based approximate query processing (AQP) suffers from many pitfalls such as the inability to answer very selective queries and unreliable confidence intervals when sample sizes are small. Recent research presented an intriguing…
The paper presents the results of the experiments that were conducted to confirm the main ideas of the proposed approach to determining the objects innovativeness. This approach assumed that the product life cycle of whose descriptions are…
The article discusses the approach for evaluating the innovation index of the products and technologies. The evaluation results can be used to create a warehouse of the object descriptions with significant innovation potential. The model of…
How can we track synchronized behavior in a stream of time-stamped tuples, such as mobile devices installing and uninstalling applications in the lockstep, to boost their ranks in the app store? We model such tuples as entries in a…
In this paper, we are revisiting pattern mining and especially itemset mining, which allows one to analyze binary datasets in searching for interesting and meaningful association rules and respective itemsets in an unsupervised way. While a…
In many data analysis applications, there is a need to explain why a surprising or interesting result was produced by a query. Previous approaches to explaining results have directly or indirectly used data provenance (input tuples…
Physical data layout is an important performance factor for modern databases. Clustering, i.e., storing similar values in proximity, can lead to performance gains in several ways. We present an automated model to determine beneficial…
The article presents a systematic review of the results of the development of the theoretical basis and the pilot implementation of data storage technology with automatic replenishment of data from sources belonging to different thematic…
Performing analytics tasks over large-scale video datasets is increasingly common in a wide range of applications. These tasks generally involve object detection and tracking operations that require applying expensive machine learning…