Related papers: A Model for Communication in Clusters of Multi-cor…
Although recent works try to improve collective communication in grid systems by separating intra and inter-cluster communication, the optimisation of communications focus only on inter-cluster communications. We believe, instead, that the…
Modern large-scale scientific applications consist of thousands to millions of individual tasks. These tasks involve not only computation but also communication with one another. Typically, the communication pattern between tasks is sparse…
The latest trends in high-performance computing systems show an increasing demand on the use of a large scale multicore systems in a efficient way, so that high compute-intensive applications can be executed reasonably well. However, the…
With the rapid growth in the volume of data sets, models, and devices in the domain of deep learning, there is increasing attention on large-scale distributed deep learning. In contrast to traditional distributed deep learning, the…
Recent advances in computing architectures and networking are bringing parallel computing systems to the masses so increasing the number of potential users of these kinds of systems. In particular, two important technological evolutions are…
In this tutorial paper, we will firstly review some basic simulation concepts and then introduce the parallel and distributed simulation techniques in view of some new challenges of today and tomorrow. More in particular, in the last years…
Nowadays the number of available processing cores within computing nodes which are used in recent clustered environments, are growing up with a rapid rate. Despite this trend, the number of available network interfaces in such computing…
The rise of Big Data has led to new demands for Machine Learning (ML) systems to learn complex models with millions to billions of parameters, that promise adequate capacity to digest massive datasets and offer powerful predictive analytics…
In a technological landscape that is quickly moving toward dense multi-CPU and multi-core computer systems, where using multithreading is an increasingly popular application design decision, it is important to choose a proper model for…
Clustering large datasets is a fundamental problem with a number of applications in machine learning. Data is often collected on different sites and clustering needs to be performed in a distributed manner with low communication. We would…
We investigate the minimal number of failures that can partition a system where processes communicate both through shared memory and by message passing. We prove that this number precisely captures the resilience that can be achieved by…
Transfer learning has recently become the dominant paradigm of machine learning. Pre-trained models fine-tuned for downstream tasks achieve better performance with fewer labelled examples. Nonetheless, it remains unclear how to develop…
Next-generation wireless technologies (for immersive-massive communication, joint communication and sensing) demand highly parallel architectures for massive data processing. A common architectural template scales up by grouping tens to…
Grid Computing is an idea of a new kind of network technology in which research work in progress. There is a great deal of hype in this technology based area for that reason it is getting a great deal of attention of the computing…
Upcoming many core processors are expected to employ a distributed memory architecture similar to currently available supercomputers, but parallel pattern mining algorithms amenable to the architecture are not comprehensively studied. We…
Conventional cache models are not suited for real-time parallel processing because tasks may flush each other's data out of the cache in an unpredictable manner. In this way the system is not compositional so the overall performance is…
While load balancing in distributed-memory computing has been well-studied, we present an innovative approach to this problem: a unified, reduced-order model that combines three key components to describe "work" in a distributed system:…
This thesis is concerned with the design of distributed algorithms for solving optimization problems. We consider networks where each node has exclusive access to a cost function, and design algorithms that make all nodes cooperate to find…
The demand for higher data rates and the scarce spectrum resources drive the adoption of collaborative communication techniques. In this work we shown that the existing cluster based collaborative schemes can be greatly improved in terms of…
Artificial intelligence has made remarkable progress in handling complex tasks, thanks to advances in hardware acceleration and machine learning algorithms. However, to acquire more accurate outcomes and solve more complex issues,…