Related papers: A Communication-Efficient Distributed Algorithm fo…
In multicenter research, individual-level data are often protected against sharing across sites. To overcome the barrier of data sharing, many distributed algorithms, which only require sharing aggregated information, have been developed.…
With the rapid growth in the volume of data sets, models, and devices in the domain of deep learning, there is increasing attention on large-scale distributed deep learning. In contrast to traditional distributed deep learning, the…
Objective: To enable privacy-preserving learning of high quality generative and discriminative machine learning models from distributed electronic health records. Methods and Results: We describe general and scalable strategy to build…
Originated from distributed learning, federated learning enables privacy-preserved collaboration on a new abstracted level by sharing the model parameters only. While the current research mainly focuses on optimizing learning algorithms and…
With the ever-increasing range of applications of Internet in Things (IoT) and sensor networks, challenges are emerging in various categories of classification tasks. Applications such as vehicular networking, UAV swarm coordination and…
Electronic health records (EHRs) offer great promises for advancing precision medicine and, at the same time, present significant analytical challenges. Particularly, it is often the case that patient-level data in EHRs cannot be shared…
Motivated by multi-center biomedical studies that cannot share individual data due to privacy and ownership concerns, we develop communication-efficient iterative distributed algorithms for estimation and inference in the high-dimensional…
Data sharing barriers are paramount challenges arising from multicenter clinical trials where multiple data sources are stored in a distributed fashion at different local study sites. Merging such data sources into a common data storage for…
Due to the pervasive diffusion of personal mobile and IoT devices, many ``smart environments'' (e.g., smart cities and smart factories) will be, among others, generators of huge amounts of data. Currently, this is typically achieved through…
The recent decades have seen a surge of interests in distributed computing. Existing work focus primarily on either distributed computing platforms, data query tools, or, algorithms to divide big data and conquer at individual machines etc.…
We consider several estimation and learning problems that networked agents face when making decisions given their uncertainty about an unknown variable. Our methods are designed to efficiently deal with heterogeneity in both size and…
The paper studies distributed Dictionary Learning (DL) problems where the learning task is distributed over a multi-agent network with time-varying (nonsymmetric) connectivity. This formulation is relevant, for instance, in big-data…
Distributed optimization finds applications in large-scale machine learning, data processing and classification over multi-agent networks. In real-world scenarios, the communication network of agents may encounter latency that may affect…
Federated Learning has emerged as a transformative paradigm for collaborative machine learning across distributed environments. However, its performance is strongly influenced by the aggregation strategy used to combine local model updates…
Collaborative learning across heterogeneous model architectures presents significant challenges in ensuring interoperability and preserving privacy. We propose a communication-efficient distributed learning framework that supports model…
We are interested in estimating the effect of a treatment applied to individuals at multiple sites, where data is stored locally for each site. Due to privacy constraints, individual-level data cannot be shared across sites; the sites may…
Estimating statistical models within sensor networks requires distributed algorithms, in which both data and computation are distributed across the nodes of the network. We propose a general approach for distributed learning based on…
Distributed, online data mining systems have emerged as a result of applications requiring analysis of large amounts of correlated and high-dimensional data produced by multiple distributed data sources. We propose a distributed online data…
At this moment, databanks worldwide contain brain images of previously unimaginable numbers. Combined with developments in data science, these massive data provide the potential to better understand the genetic underpinnings of brain…
Efficient extraction of useful knowledge from these data is still a challenge, mainly when the data is distributed, heterogeneous and of different quality depending on its corresponding local infrastructure. To reduce the overhead cost,…