Related papers: Creating and Querying Data Cubes in Python using p…
Data cubes are widely used as a powerful tool to provide multidimensional views in data warehousing and On-Line Analytical Processing (OLAP). However, with increasing data sizes, it is becoming computationally expensive to perform data cube…
A large amount of data is produced every second from modern information systems such as mobile devices, the world wide web, Internet of Things, social media, etc. Analysis and mining of this massive data requires a lot of advanced tools and…
Python has become the prime language for application development in the Data Science and Machine Learning domains. However, data scientists are not necessarily experienced programmers. While Python lets them quickly implement their…
Advancements in Earth system science have seen a surge in diverse datasets. Earth System Data Cubes (ESDCs) have been introduced to efficiently handle this influx of high-dimensional data. ESDCs offer a structured, intuitive framework for…
Data stored in a data warehouse are inherently multidimensional, but most data-pruning techniques (such as iceberg and top-k queries) are unidimensional. However, analysts need to issue multidimensional queries. For example, an analyst may…
In today's world data is being generated at a high rate due to which it has become inevitable to analyze and quickly get results from this data. Most of the relational databases primarily support SQL querying with a limited support for…
In recent years, Deep Neural Networks were successfully adopted in numerous domains to solve various image-related tasks, ranging from simple classification to fine borders annotation. Naturally, many researches proposed to use it to solve…
Large-scale video repositories are increasingly available for modern video understanding and generation tasks. However, transforming raw videos into high-quality, task-specific datasets remains costly and inefficient. We present DataCube,…
Data extraction algorithms on data hypercubes, or datacubes, are traditionally only capable of cutting boxes of data along the datacube axes. For many use cases however, this is not a sufficient approach and returns more data than users…
The amount of multidimensional data published on the semantic web (SW) is constantly increasing, due to initiatives such as Open Data and Open Government Data, among other ones. Models, languages, and tools, that allow to obtain valuable…
Visual exploration of large multidimensional datasets has seen tremendous progress in recent years, allowing users to express rich data queries that produce informative visual summaries, all in real time. Techniques based on data cubes are…
Data analysis applications typically aggregate data across many dimensions looking for anomalies or unusual patterns. The SQL aggregate functions and the GROUP BY operator produce zero-dimensional or one-dimensional aggregates. Applications…
scida is a Python package for reading and analyzing large scientific data sets with support for various cosmological and galaxy formation simulations out-of-the-box. Data access is provided through a hierarchical dictionary-like data…
Python data science libraries such as Pandas and NumPy have recently gained immense popularity. Although these libraries are feature-rich and easy to use, their scalability limitations require more robust computational resources. In this…
Behavioral studies using personal digital devices typically produce rich longitudinal datasets of mixed data types. These data provide information about the behavior of users of these devices in real-time and in the users' natural…
{\mu}Manager, an open-source microscopy acquisition software, has been an essential tool for many microscopy experiments over the past 15 years, but is not easy to use for experiments in which image acquisition and analysis are closely…
BEANS software is a web based, easy to install and maintain, new tool to store and analyse data in a distributed way for a massive amount of data. It provides a clear interface for querying, filtering, aggregating, and plotting data from an…
With web and mobile platforms becoming more prominent devices utilized in data analysis, there are currently few systems which are not without flaw. In order to increase the performance of these systems and decrease errors of data…
This article presents the implementation process of a Data Warehouse and a multidimensional analysis of business data for a holding company in the financial sector. The goal is to create a business intelligence system that, in a simple,…
We describe how Python can be leveraged to streamline the curation, modelling and dissemination of drug discovery data as well as the development of innovative, freely available tools for the related scientific community. We look at various…