Distributed Matrix Factorization using Asynchrounous Communication

Distributed, Parallel, and Cluster Computing 2017-05-31 v1

Abstract

Matrix factorization is a common machine learning technique, particularly in areas such as recommender systems. Despite its high prediction accuracy and its ability to avoid over-fitting the data, the Bayesian Probabilistic Matrix Factorization (BPMF) algorithm has not been widely used on large-scale data because of its prohibitive computational cost. In this paper, we propose a distributed, high-performance parallel implementation of BPMF using Gibbs sampling on shared and distributed architectures. We show that, by using work stealing for efficient load balancing on a single node and asynchronous communication in the distributed version, our implementation outperforms state-of-the-art implementations.

Cite

@article{arxiv.1705.10633,
  title  = {Distributed Matrix Factorization using Asynchrounous Communication},
  author = {Tom Vander Aa and Imen Chakroun and Tom Haber},
  journal= {arXiv preprint arXiv:1705.10633},
  year   = {2017}
}

Comments

arXiv admin note: substantial text overlap with arXiv:1705.04159