Related papers: Practical Range Aggregation, Selection and Set Mai…
In this paper we present several novel efficient techniques and multidimensional data structures which can improve the decision making process in many domains. We consider online range aggregation, range selection and range weighted median…
In this paper I present several novel, efficient, algorithmic techniques for solving some multidimensional geometric data management and analysis problems. The techniques are based on several data structures from computational geometry…
We introduce a new technique for the efficient management of large sequences of multidimensional data, which takes advantage of regularities that arise in real-world datasets and supports different types of aggregation queries. More…
The paper addresses aggregation issues for composite (modular) solutions. A systemic view point is suggested for various aggregation problems. Several solution structures are considered: sets, set morphologies, trees, etc. Mainly, the…
As data sets continue to grow in size and complexity, effective and efficient techniques are needed to target important features in the variable space. Many of the variable selection techniques that are commonly used alongside clustering…
Efficient processing of aggregated range queries on two-dimensional grids is a common requirement in information retrieval and data mining systems, for example in Geographic Information Systems and OLAP cubes. We introduce a technique to…
Distributed data aggregation is an important task, allowing the decentralized determination of meaningful global properties, that can then be used to direct the execution of other applications. The resulting values result from the…
The domains of data mining and knowledge discovery make use of large amounts of textual data, which need to be handled efficiently. Specific problems, like finding the maximum weight ordered common subset of a set of ordered sets or…
The problem of recovering (count and sum) range queries over multidimensional data only on the basis of aggregate information on such data is addressed. This problem can be formalized as follows. Suppose that a transformation T producing a…
Machine learning on sets towards sequential output is an important and ubiquitous task, with applications ranging from language modeling and meta-learning to multi-agent strategy games and power grid optimization. Combining elements of…
Box folding represents a crucial challenge for automated packaging systems. This work bridges the gap between existing methods for folding sequence extraction and approaches focused on the adaptability of automated systems to specific box…
Model-based clustering is a popular approach for clustering multivariate data which has seen applications in numerous fields. Nowadays, high-dimensional data are more and more common and the model-based clustering approach has adapted to…
Ensembles of artificial neural networks show improved generalization capabilities that outperform those of single networks. However, for aggregation to be effective, the individual networks must be as accurate and diverse as possible. An…
Dynamic data selection aims to accelerate training with lossless performance. However, reducing training data inherently limits data diversity, potentially hindering generalization. While data augmentation is widely used to enhance…
Variable selection for high-dimensional, highly correlated data has long been a challenging problem, often yielding unstable and unreliable models. We propose a resample-aggregate framework that exploits diffusion models' ability to…
This survey presents some historical background and recent developments in the area of selections for set-valued mappings along with several open questions. It was written with the hope that the presented material may pique an interest in…
In the era of big data, analysts usually explore various statistical models or machine learning methods for observed data in order to facilitate scientific discoveries or gain predictive power. Whatever data and fitting procedures are…
The work presented in this paper is related to the area of situational method engineering (SME). In this domain, approaches are developed accordingly to specific project specifications. We propose to adapt an existing method construction…
From a machine learning point of view, identifying a subset of relevant features from a real data set can be useful to improve the results achieved by classification methods and to reduce their time and space complexity. To achieve this…
Rank aggregation aims to combine the preference rankings of a number of alternatives from different voters into a single consensus ranking. As a useful model for a variety of practical applications, however, it is a computationally…