Related papers: A Parallel Implementation of Computing Mean Averag…
We propose Stochastic Weight Averaging in Parallel (SWAP), an algorithm to accelerate DNN training. Our algorithm uses large mini-batches to compute an approximate solution quickly and then refines it by averaging the weights of multiple…
The research on hashing techniques for visual data is gaining increased attention in recent years due to the need for compact representations supporting efficient search/retrieval in large-scale databases such as online images. Among many…
Designing and implementing efficient, provably correct parallel machine learning (ML) algorithms is challenging. Existing high-level parallel abstractions like MapReduce are insufficiently expressive while low-level tools like MPI and…
Designing and implementing efficient, provably correct parallel machine learning (ML) algorithms is challenging. Existing high-level parallel abstractions like MapReduce are insufficiently expressive while low-level tools like MPI and…
Object detection is a fundamental vision task. It has been highly researched in academia and has been widely adopted in industry. Average Precision (AP) is the standard score for evaluating object detectors. Our understanding of the…
We present a method for training CNN-based object class detectors directly using mean average precision (mAP) as the training loss, in a truly end-to-end fashion that includes non-maximum suppression (NMS) at training time. This contrasts…
The construction of Mapper has emerged in the last decade as a powerful and effective topological data analysis tool that approximates and generalizes other topological summaries, such as the Reeb graph, the contour tree, split, and joint…
Deep learning object detectors often return false positives with very high confidence. Although they optimize generic detection performance, such as mean average precision (mAP), they are not designed for reliability. For a reliable…
Being intensively studied, visual tracking has seen great recent advances in either speed (e.g., with correlation filters) or accuracy (e.g., with deep features). Real-time and high accuracy tracking algorithms, however, remain scarce. In…
Using parallel embedded systems these days is increasing. They are getting more complex due to integrating multiple functionalities in one application or running numerous ones concurrently. This concerns a wide range of applications,…
Pseudo-arclength continuation is a well-established method for generating a numerical curve approximating the solution of an underdetermined system of nonlinear equations. It is an inherently sequential predictor-corrector method in which…
Distributed learning is commonly used for training deep learning models, especially large models. In distributed learning, manual parallelism (MP) methods demand considerable human effort and have limited flexibility. Hence, automatic…
In-memory associative processor architectures are offered as a great candidate to overcome memory-wall bottleneck and to enable vector/parallel arithmetic operations. In this paper, we extend the functionality of the associative processor…
The convergence of the generalized alternating projection (GAP) algorithm is studied in this paper to solve the compressive sensing problem $\yv = \Amat \xv + \epsilonv$. By assuming that $\Amat\Amat\ts$ is invertible, we prove that GAP…
Markov Chain Monte Carlo (MCMC) algorithms are essential tools in computational statistics for sampling from unnormalised probability distributions, but can be fragile when targeting high-dimensional, multimodal, or complex target…
Exactly solving multi-objective integer programming (MOIP) problems is often a very time consuming process, especially for large and complex problems. Parallel computing has the potential to significantly reduce the time taken to solve such…
Alternating projection (AP) of various forms, including the Parallel AP (PAP), Real-constrained AP (RAP) and the Serial AP (SAP), are proposed to solve phase retrieval with at most two coded diffraction patterns. The proofs of geometric…
MatlabMPI is a Matlab implementation of the Message Passing Interface (MPI) standard and allows any Matlab program to exploit multiple processors. MatlabMPI currently implements the basic six functions that are the core of the MPI…
Bayesian inference remains one of the most important tool-kits for any scientist, but increasingly expensive likelihood functions are required for ever-more complex experiments, raising the cost of generating a Monte Carlo sample of the…
Despite the enormous success of Hamiltonian Monte Carlo and related Markov Chain Monte Carlo (MCMC) methods, sampling often still represents the computational bottleneck in scientific applications. Availability of parallel resources can…