Related papers: Distributed TensorFlow with MPI

200 papers

TensorFlow is an interface for expressing machine learning algorithms, and an implementation for executing such algorithms. A computation expressed using TensorFlow can be executed with little or no change on a wide variety of heterogeneous…

Deep Learning (DL) algorithms have become the de facto choice for data analysis. Several DL implementations (primarily limited to a single compute node), such as Caffe, TensorFlow, Theano and Torch, have become readily available.…

Distributed, Parallel, and Cluster Computing · Computer Science 2017-04-18 Abhinav Vishnu, Joseph Manzano, Charles Siegel, Jeff Daily

TensorFlow is a machine learning system that operates at large scale and in heterogeneous environments. TensorFlow uses dataflow graphs to represent computation, shared state, and the operations that mutate that state. It maps the nodes of…

TensorFlow is a popular emerging open-source programming framework supporting the execution of distributed applications on heterogeneous hardware. While TensorFlow has been initially designed for developing Machine Learning (ML)…

Distributed, Parallel, and Cluster Computing · Computer Science 2020-03-03 Steven W. D. Chien, Stefano Markidis, Vyacheslav Olshevsky, Yaroslav Bulatov, Erwin Laure, Jeffrey S. Vetter

We describe TensorFlow-Serving, a system to serve machine learning models inside Google which is also available in the cloud and via open-source. It is extremely flexible in terms of the types of ML platforms it supports, and ways to…

Distributed, Parallel, and Cluster Computing · Computer Science 2017-12-29 Christopher Olston, Noah Fiedel, Kiril Gorovoy, Jeremiah Harmsen, Li Lao, Fangwei Li, Vinu Rajashekhar, Sukriti Ramesh, Jordan Soyke

As machine learning (ML) has seen increasing adoption in safety-critical domains (e.g., autonomous vehicles), the reliability of ML systems has also grown in importance. While prior studies have proposed techniques to enable efficient…

Distributed, Parallel, and Cluster Computing · Computer Science 2020-04-07 Zitao Chen, Niranjhana Narayanan, Bo Fang, Guanpeng Li, Karthik Pattabiraman, Nathan DeBardeleben

Deep learning inference on embedded devices is a burgeoning field with myriad applications because tiny embedded devices are omnipresent. But we must overcome major challenges before we can benefit from this opportunity. Embedded processors…

Many recent machine learning models rely on fine-grained dynamic control flow for training and inference. In particular, models based on recurrent neural networks and on reinforcement learning depend on recurrence relations, data-dependent…

Distributed, Parallel, and Cluster Computing · Computer Science 2018-05-09 Yuan Yu, Martín Abadi, Paul Barham, Eugene Brevdo, Mike Burrows, Andy Davis, Jeff Dean, Sanjay Ghemawat, Tim Harley, Peter Hawkins, Michael Isard, Manjunath Kudlur, Rajat Monga, Derek Murray, Xiaoqiang Zheng

The advent of multi-/many-core processors in clusters advocates hybrid parallel programming, which combines Message Passing Interface (MPI) for inter-node parallelism with a shared memory model for on-node parallelism. Compared to the…

Distributed, Parallel, and Cluster Computing · Computer Science 2020-07-15 Huan Zhou, Jose Gracia, Ralf Schneider

TensorFlow is a popular cloud computing framework that targets machine learning applications. It separates the specification of application logic (in a dataflow graph) from the execution of the logic. TensorFlow's native runtime executes…

Distributed, Parallel, and Cluster Computing · Computer Science 2019-08-27 Sam Whitlock, James Larus, Edouard Bugnion

TensorFlow has been the most widely adopted Machine/Deep Learning framework. However, little exists in the literature that provides a thorough understanding of the capabilities which TensorFlow offers for the distributed training of large…

Distributed, Parallel, and Cluster Computing · Computer Science 2019-11-14 Ammar Ahmad Awan, Jeroen Bedorf, Ching-Hsiang Chu, Hari Subramoni, Dhabaleswar K. Panda

Google's Machine Learning framework TensorFlow was open-sourced in November 2015 [1] and has since built a growing community around it. TensorFlow is supposed to be flexible for research purposes while also allowing its models to be…

Machine Learning · Computer Science 2016-12-06 Martin Schrimpf

Large-scale deep learning benefits from an emerging class of AI accelerators. Some of these accelerators' designs are general enough for compute-intensive applications beyond AI and Cloud TPU is one such example. In this paper, we…

Distributed, Parallel, and Cluster Computing · Computer Science 2019-11-19 Kun Yang, Yi-Fan Chen, Georgios Roumpos, Chris Colby, John Anderson

The current trend of multicore architectures on shared memory systems underscores the need for parallelism. While there are several programming models for expressing parallelism, the thread programming model has become a standard to support these system…

Distributed, Parallel, and Cluster Computing · Computer Science 2010-12-13 D. T. Hasta, A. B. Mutiara

TensorFlow.js is a library for building and executing machine learning algorithms in JavaScript. TensorFlow.js models run in a web browser and in the Node.js environment. The library is part of the TensorFlow ecosystem, providing a set of…

We present a framework for experimenting with secure multi-party computation directly in TensorFlow. By doing so we benefit from several properties valuable to both researchers and practitioners, including tight integration with ordinary…

Cryptography and Security · Computer Science 2018-10-24 Morten Dahl, Jason Mancuso, Yann Dupis, Ben Decoste, Morgan Giraud, Ian Livingstone, Justin Patriquin, Gavin Uhma

This system paper documents the technical foundations for the extension of the Topology ToolKit (TTK) to distributed-memory parallelism with the Message Passing Interface (MPI). While several recent papers introduced topology-based…

Distributed, Parallel, and Cluster Computing · Computer Science 2024-04-16 Eve Le Guillou, Michael Will, Pierre Guillou, Jonas Lukasczyk, Pierre Fortin, Christoph Garth, Julien Tierny

A number of popular systems, most notably Google's TensorFlow, have been implemented from the ground up to support machine learning tasks. We consider how to make a very small set of changes to a modern relational database management system…

Databases · Computer Science 2019-04-26 Dimitrije Jankov, Shangyu Luo, Binhang Yuan, Zhuhua Cai, Jia Zou, Chris Jermaine, Zekai J. Gao

The TensorFlow Distributions library implements a vision of probability theory adapted to the modern deep-learning paradigm of end-to-end differentiable computation. Building on two basic abstractions, it offers flexible building blocks for…

As a big data application, extreme multilabel classification has emerged as an important research topic with applications in ranking and recommendation of products and items. A scalable hybrid distributed and shared memory implementation of…

Distributed, Parallel, and Cluster Computing · Computer Science 2021-12-21 Pawan Kumar