Related papers: Improved parallelization techniques for the densit…

200 papers

Shared-memory parallelization (SMP) strategies for density matrix renormalization group (DMRG) algorithms enable the treatment of complex systems in solid state physics. We present two different approaches by which parallelization of the…

Strongly Correlated Electrons · Physics 2009-11-10 G. Hager, E. Jeckelmann, H. Fehske, G. Wellein

There has been recent interest in the deployment of ab initio density matrix renormalization group computations on high performance computing platforms. Here, we introduce a reformulation of the conventional distributed memory ab initio…

Chemical Physics · Physics 2021-06-24 Huanchen Zhai, Garnet Kin-Lic Chan

The density matrix renormalization group (DMRG) algorithm is a popular alternating minimization scheme for solving high-dimensional optimization problems in the tensor train format. Classical DMRG, however, is based on sequential…

Numerical Analysis · Mathematics 2025-12-09 Laura Grigori, Muhammad Hassan

This paper presents a comparative analysis of distributed training strategies for large-scale neural networks, focusing on data parallelism, model parallelism, and hybrid approaches. We evaluate these strategies on image classification…

Distributed, Parallel, and Cluster Computing · Computer Science 2025-04-01 Vishnu Vardhan Baligodugula, Fathi Amsaad

We demonstrate how to parallelize the density matrix renormalization group (DMRG) algorithm in real space through a straightforward modification of serial DMRG. This makes it possible to apply at least an order of magnitude more…

Strongly Correlated Electrons · Physics 2013-04-25 E. M. Stoudenmire, Steven R. White

The Density Matrix Renormalization Group (DMRG) algorithm is a powerful tool for solving eigenvalue problems to model quantum systems. DMRG relies on tensor contractions and dense linear algebra to compute properties of condensed matter…

Distributed, Parallel, and Cluster Computing · Computer Science 2021-01-26 Ryan Levy, Edgar Solomonik, Bryan K. Clark

The approximate minimum degree algorithm is widely used before numerical factorization to reduce fill-in for sparse matrices. While considerable attention has been given to the numerical factorization process, less focus has been placed on…

Distributed, Parallel, and Cluster Computing · Computer Science 2026-02-26 Yen-Hsiang Chang, Aydın Buluç, James Demmel

Training a deep neural network (DNN) requires substantial computational and memory resources. It is common to use multiple devices to train a DNN to reduce the overall training time. There are several choices to parallelize each layer in…

Machine Learning · Computer Science 2024-07-08 Venmugil Elango

To prepare images for better segmentation, we need preprocessing applications, such as smoothing, to reduce noise. In this paper, we present an enhanced computation method for smoothing 2D objects in the binary case. Unlike existing approaches,…

Distributed, Parallel, and Cluster Computing · Computer Science 2016-03-31 Ramzi Mahmoudi, Mohamed Akil

Triangle counting is a fundamental graph analytic operation that is used extensively in network science and graph mining. As the size of the graphs that need to be analyzed continues to grow, there is a requirement to develop scalable…

Distributed, Parallel, and Cluster Computing · Computer Science 2019-07-24 Ancy Sarah Tom, George Karypis

Nonnegative matrix factorization (NMF) is a powerful technique for dimension reduction, extracting latent factors and learning part-based representation. For large datasets, NMF performance depends on some major issues: fast algorithms,…

Optimization and Control · Mathematics 2015-07-01 Duy-Khuong Nguyen, Tu-Bao Ho

In this paper, we present a concurrent implementation of a powerful topological thinning operator. This operator is able to act directly over grayscale images without modifying their topology. We introduce an adapted parallelization…

Distributed, Parallel, and Cluster Computing · Computer Science 2016-03-31 Ramzi Mahmoudi, Mohamed Akil, Petr Matas

The problem of finding dense components of a graph is a widely explored area in data analysis, with diverse applications in fields and branches of study including community mining, spam detection, computer security and bioinformatics. This…

Information Retrieval · Computer Science 2021-03-02 B. D. M. De Zoysa, Y. A. M. M. A. Ali, M. D. I. Maduranga, Indika Perera, Saliya Ekanayake, Anil Vullikanti

We propose new sequential sorting operations by adapting techniques and methods used for designing parallel sorting algorithms. Although the norm is to parallelize a sequential algorithm to improve performance, we adapt a contrarian…

Data Structures and Algorithms · Computer Science 2016-09-01 Alexandros V Gerbessiotis

The density matrix renormalization group (DMRG) is applied to some one-dimensional reaction-diffusion models in the vicinity of and at their critical point. The stochastic time evolution for these models is given in terms of a non-symmetric…

Statistical Mechanics · Physics 2011-10-11 Enrico Carlon, Malte Henkel, Ulrich Schollwoeck

We describe a computationally efficient, stochastic graph-regularization technique that can be utilized for the semi-supervised training of deep neural networks in a parallel or distributed setting. We utilize a technique, first described…

Machine Learning · Statistics 2018-05-31 Sunil Thulasidasan, Jeffrey Bilmes, Garrett Kenyon

We develop a novel parallel decomposition strategy for unweighted, undirected graphs, based on growing disjoint connected clusters from batches of centers progressively selected from yet uncovered nodes. With respect to similar previous…

Distributed, Parallel, and Cluster Computing · Computer Science 2015-02-09 Matteo Ceccarello, Andrea Pietracaprina, Geppino Pucci, Eli Upfal

Data processing systems offer an ever increasing degree of parallelism on the levels of cores, CPUs, and processing nodes. Query optimization must exploit high degrees of parallelism in order not to gradually become the bottleneck of query…

Databases · Computer Science 2015-11-06 Immanuel Trummer, Christoph Koch

Smoothing filters are the method of choice for image preprocessing and pattern recognition. We present a new concurrent method for smoothing 2D objects in the binary case. The proposed method provides parallel computation while preserving the…

Distributed, Parallel, and Cluster Computing · Computer Science 2016-04-01 Ramzi Mahmoudi, Mohamed Akil

Deep learning models trained on large data sets have been widely successful in both vision and language domains. As state-of-the-art deep learning architectures have continued to grow in parameter count so have the compute budgets and times…
