Related papers: Parallelizing Convergent Cross Mapping Using Apach…
Convergent Cross Mapping (CCM) is a powerful method for detecting causality in coupled nonlinear dynamical systems, providing a model-free approach to capture dynamic causal interactions. Partial Cross Mapping (PCM) was introduced as an…
Convergent Cross-Mapping (CCM) is a technique for computing specific kinds of correlations between sets of time series. It was introduced by Sugihara et al. and is reported to be "a necessary condition for causation" capable of…
Identifying causal relationships in climate systems remains challenging due to nonlinear, coupled dynamics that limit the effectiveness of linear and stochastic causal discovery approaches. This study benchmarks Convergent Cross Mapping…
Linear causal analysis is central to a wide range of important applications spanning finance, the physical sciences, and engineering. Much of the existing literature in linear causal analysis operates in the time domain. Unfortunately, the…
Convergent Cross-Mapping (CCM) has shown high potential to perform causal inference in the absence of models. We assess the strengths and weaknesses of the method by varying coupling strength and noise levels in coupled logistic maps. We…
Soft sensor modeling plays a crucial role in process monitoring. Causal feature selection can enhance the performance of soft sensor models in industrial applications. However, existing methods ignore two critical characteristics of…
Identifying directed interactions between species from time series of their population densities has many uses in ecology. This key statistical task is equivalent to causal time series inference, which connects to the Granger causality (GC)…
CERN IT provides a set of Hadoop clusters featuring more than 5 PBytes of raw storage with different open-source, user-level tools available for analytical purposes. The CMS experiment started collecting a large set of computing…
As dataset sizes increase, data analysis tasks in high performance computing (HPC) are increasingly dependent on sophisticated dataflows and out-of-core methods for efficient system utilization. In addition, as HPC systems grow, memory…
Data analytic applications built upon big data processing frameworks such as Apache Spark are an important class of applications. Many of these applications are not latency-sensitive and thus can run as batch jobs in data centers. By…
We explore the trade-offs of performing linear algebra using Apache Spark, compared to traditional C and MPI implementations on HPC platforms. Spark is designed for data analytics on cluster computing platforms with access to local disks…
Infectious diseases are notorious for their complex dynamics, which make it difficult to fit models to test hypotheses. Methods based on state-space reconstruction have been proposed to infer causal interactions in noisy, nonlinear…
We present massively parallel (MPC) algorithms and hardness of approximation results for computing Single-Linkage Clustering of $n$ input $d$-dimensional vectors under Hamming, $\ell_1, \ell_2$ and $\ell_\infty$ distances. All our…
Most neural models of causality assume static causal graphs, failing to capture the dynamic and sparse nature of physical interactions where causal relationships emerge and dissolve over time. We introduce the Causal Process Framework and…
Causal discovery with time series data remains a challenging yet increasingly important task across many scientific domains. Convergent cross mapping (CCM) and related methods have been proposed to study time series that are generated by…
Random walk based distance measures for graphs such as commute-time distance are useful in a variety of graph algorithms, such as clustering, anomaly detection, and creating low dimensional embeddings. Since such measures hinge on the…
While witnessing the exceptional success of machine learning (ML) technologies in many applications, users are starting to notice a critical shortcoming of ML: correlation is a poor substitute for causation. The conventional way to discover…
Convergent cross mapping (CCM) provides a powerful technique for exploring causal relationships in nonlinear coupled systems. The method relies on Takens' theorem exploiting that time delay embeddings of infinite length general observations…
Training deep networks is expensive and time-consuming with the training period increasing with data size and growth in model parameters. In this paper, we provide a framework for distributed training of deep networks over a cluster of CPUs…
Understanding causal relationships within a system is crucial for uncovering its underlying mechanisms. Causal discovery methods, which facilitate the construction of such models from time-series data, hold the potential to significantly…