Related papers: Multivariate Pointwise Information-Driven Data Sam…
With the increasing computational power of current supercomputers, the size of data produced by scientific simulations is rapidly growing. To reduce the storage footprint and facilitate scalable post-hoc analyses of such scientific data…
Multivariate spatial data plays an important role in computational science and engineering simulations. The potential features and hidden relationships in multivariate data can assist scientists to gain an in-depth understanding of a…
Subsampling is one of the popular methods to balance statistical efficiency and computational efficiency in the big data era. Most approaches aim at selecting informative or representative sample points to achieve good overall information…
In the era of burgeoning data generation, managing and storing large-scale time-varying datasets poses significant challenges. With the rise of supercomputing capabilities, the volume of data produced has soared, intensifying storage and…
Prompted by modern technologies in data acquisition, the statistical analysis of spatially distributed function-valued quantities has attracted a lot of attention in recent years. In particular, combinations of functional variables and…
Statistical topic models efficiently facilitate the exploration of large-scale data sets. Many models have been developed and broadly used to summarize the semantic structure in news, science, social media, and digital humanities. However,…
Interactive visualizations are crucial in ad hoc data exploration and analysis. However, with the growing number of massive datasets, generating visualizations in interactive timescales is increasingly challenging. One approach for…
In the present work we have selected a collection of statistical and mathematical tools useful for the exploration of multivariate data and we present them in a form that is meant to be particularly accessible to a classically trained…
The extensive adoption of Deep Neural Networks has led to their increased utilization in challenging scientific visualization tasks. Recent advancements in building compressed data models using implicit neural representations have shown…
The influx of massive amounts of data from current and upcoming cosmological surveys necessitates compression schemes that can efficiently summarize the data with minimal loss of information. We introduce a method that leverages the…
While advances in computing resources have made processing enormous amounts of data possible, human ability to identify patterns in such data has not scaled accordingly. Efficient computational methods for condensing and simplifying data…
The basic objective of data visualization is to provide an efficient graphical display for summarizing and reasoning about quantitative information. During the last decades, political science has accumulated a large corpus of various kinds…
Class imbalance and distributional differences in large datasets present significant challenges for classification tasks machine learning, often leading to biased models and poor predictive performance for minority classes. This work…
Modern scientific simulations, observations, and large-scale experiments generate data at volumes that often exceed the limits of storage, processing, and analysis. This challenge drives the development of data reduction methods that…
The continuous and rapid growth of highly interconnected datasets, which are both voluminous and complex, calls for the development of adequate processing and analytical techniques. One method for condensing and simplifying such datasets is…
The overview-driven visual analysis of large-scale dynamic graphs poses a major challenge. We propose Multiscale Snapshots, a visual analytics approach to analyze temporal summaries of dynamic graphs at multiple temporal scales. First, we…
Complex systems are fascinating because their rich macroscopic properties emerge from the interaction of many simple parts. Understanding the building principles of these emergent phenomena in nature requires assessing natural complex…
The ubiquitous availability of computing devices and the widespread use of the internet have generated a large amount of data continuously. Therefore, the amount of available information on any given topic is far beyond humans' processing…
Data-driven approaches to sequence-to-sequence modelling have been successfully applied to short text summarization of news articles. Such models are typically trained on input-summary pairs consisting of only a single or a few sentences,…
Visualizations support rapid analysis of scientific datasets, allowing viewers to glean aggregate information (e.g., the mean) within split-seconds. While prior research has explored this ability in conventional charts, it is unclear if…