Related papers: BEANS - a software package for distributed Big Dat…
A large amount of data is produced every second from modern information systems such as mobile devices, the world wide web, Internet of Things, social media, etc. Analysis and mining of this massive data requires a lot of advanced tools and…
Nowadays, many scientific areas share the same need of being able to deal with massive and distributed datasets and to perform on them complex knowledge extraction tasks. This simple consideration is behind the international efforts to…
The term Big Data has been coined to refer to the extensive mass of data that can't be managed by traditional data handling methods or techniques. The field of Big Data plays an indispensable role in various fields, such as…
Nowadays, many scientific areas share the same broad requirements of being able to deal with massive and distributed datasets while, when possible, being integrated with services and applications. In order to solve the growing gap between…
The increasing availability of high-quality optical and near-infrared spectroscopic data, together with advances in modelling techniques, has greatly expanded the scientific potential of spectroscopic studies. However, the software tools…
We introduce NebulOS, a Big Data platform that allows a cluster of Linux machines to be treated as a single computer. With NebulOS, the process of writing a massively parallel program for a datacenter is no more complicated than writing a…
The amount of data in the world is expanding rapidly. Every day, huge amounts of data are created by scientific experiments, companies, and end users' activities. These large data sets have been labeled as "Big Data", and their storage,…
Scientific data sets continue to increase in both size and complexity. In the past, dedicated graphics systems at supercomputing centers were required to visualize large data sets, but as the price of commodity graphics hardware has dropped…
Astronomy produces extremely large data sets from ground-based telescopes, space missions, and simulation. The volume and complexity of these rich data sets require new approaches and advanced tools to understand the information contained…
The recent explosion of recorded digital data and its processed derivatives threatens to overwhelm researchers when analysing their experimental data or when looking up data items in archives and file systems. While current hardware…
scida is a Python package for reading and analyzing large scientific data sets with support for various cosmological and galaxy formation simulations out-of-the-box. Data access is provided through a hierarchical dictionary-like data…
We increasingly live in a data-driven world, with diverse kinds of data distributed across many locations. In some cases, the datasets are collected from multiple locations, such as sensors (e.g., mobile phones and street cameras) spread…
Big data refers to large and complex data sets that, under existing approaches, exceed the capacity and capability of current compute platforms, systems software, analytical tools and human understanding. Numerous lessons on the scalability…
Recently, we have been witnessing huge advancements in the scale of data we routinely generate and collect in pretty much everything we do, as well as our ability to exploit modern technologies to process, analyze and understand this data.…
In this paper we describe the main features of the software package named FITSH, intended to provide a standalone environment for analysis of data acquired by imaging astronomical detectors. The package provides utilities both for the full…
The Statistical Toolkit is an open source system specialized in the statistical comparison of distributions. It addresses requirements common to different experimental domains, such as simulation validation (e.g. comparison of experimental…
The excessive amounts of data generated by devices and Internet-based sources on a regular basis constitute big data. This data can be processed and analyzed to develop useful applications for specific domains. Several mathematical and…
Recent and forthcoming advances in instrumentation, and giant new surveys, are creating astronomical data sets that are not amenable to the methods of analysis familiar to astronomers. Traditional methods are often inadequate not merely…
The use of statistical software in academia and enterprises has been evolving over the last years. More often than not, students, professors, workers, and users, in general, have all had, at some point, exposure to statistical software.…
Today's astronomical projects need computational systems capable of storing and analyzing large amounts of scientific data, effectively sharing data with other research institutes, and easily implementing information services to present data…