Related papers: A generalized GPU-based connected component labeli…
This work describes the hardware implementation of a connected component labelling (CCL) module in reprogrammable logic. The main novelty of the design is the "full", i.e. without any simplifications, support of a 4 pixel per clock format (4…
Connected Component Labeling (CCL) is an important step in pattern recognition and image processing. It assigns labels to the pixels such that adjacent pixels sharing the same features are assigned the same label. Typically, CCL requires…
We report on our implementation of LatticeQCD applications using OpenCL. We focus on the general concept and on distributing different parts on hybrid systems, consisting of both CPUs (Central Processing Units) and GPUs (Graphics Processing…
A connected component labeling algorithm is developed for implicitly-defined domains specified by multivariate polynomials. The algorithm operates by recursively subdividing the constraint domain into hyperrectangular subcells until the…
In this paper, we report an optimized union-find (UF) algorithm that can label the connected components on a 2D image efficiently by employing the GPU architecture. The proposed method contains three phases: UF-based local merge, boundary…
We study a class of simple algorithms for concurrently computing the connected components of an $n$-vertex, $m$-edge graph. Our algorithms are easy to implement in either the COMBINING CRCW PRAM or the MPC computing model. For two related…
Graph neural networks can accurately predict the chemical properties of many molecular systems, but their suitability for large, macromolecular assemblies such as gels is unknown. Here, graph neural networks were trained and optimised for…
Convolutional neural networks (CNNs) have achieved great success on grid-like data such as images, but face tremendous challenges in learning from more generic data such as graphs. In CNNs, the trainable local filters enable the automatic…
This paper studies the problem of graph-level clustering, which is a novel yet challenging task. This problem is critical in a variety of real-world applications such as protein clustering and genome analysis in bioinformatics. Recent years…
In a finite undirected simple graph, a chordless cycle is an induced subgraph which is a cycle. We propose a GPU parallel algorithm for enumerating all chordless cycles of such a graph. The algorithm, implemented in OpenCL, is based on a…
Modern distributed ML suffers from a fundamental gap between the theoretical and realized performance of collective communication algorithms due to congestion and hop-count induced dilation in practical GPU clusters. We present PCCL, a…
The rapid growth of large language models is driving organizations to expand their GPU clusters, often with GPUs from multiple vendors. However, current deep learning frameworks lack support for collective communication across heterogeneous…
A quantum network is a network of entangled states and can be used to transmit quantum information. Non-maximally entangled states are not really effective in establishing quantum communication across vast distances. Creating and…
Cluster identification tasks occur in a multitude of contexts in physics and engineering such as, for instance, cluster algorithms for simulating spin models, percolation simulations, segmentation problems in image processing, or network…
We formulate a practical yet challenging problem: General Partial Label Learning (GPLL). Compared to the traditional Partial Label Learning (PLL) problem, GPLL relaxes the supervision assumption from instance-level -- a label set partially…
Machine learning models are increasingly being trained across multiple GPUs and servers. In this setting, data is transferred between GPUs using communication collectives such as AlltoAll and AllReduce, which can become a significant…
Multi-Label Continual Learning (MLCL) builds a class-incremental framework in a sequential multi-label image recognition data stream. The critical challenges of MLCL are the construction of label relationships on past-missing and…
Graph Contrastive Learning (GCL) has shown superior performance in representation learning in graph-structured data. Despite their success, most existing GCL methods rely on prefabricated graph augmentation and homophily assumptions. Thus,…
In recent years, graph neural networks (GNNs) have achieved significant progress in a variety of graph analytical tasks. Nevertheless, the performance of GNNs degrades severely when the collected node features or…
In recent years, graph contrastive learning (GCL), which aims to learn representations from unlabeled graphs, has made great progress. However, existing GCL methods mostly adopt human-designed graph augmentations, which are sensitive to…