Related papers: Software Scalability Issues in Large Clusters
Having built up Linux clusters to more than 1000 nodes over the past five years, we already have practical experience confronting some of the LHC scale computing challenges: scalability, automation, hardware diversity, security, and rolling…
High-performance computing (HPC) is essential for tackling complex computational problems across various domains. As the scale and complexity of HPC applications continue to grow, the need for scalable systems and software architectures…
Many appplications in computational science are sufficiently compute-intensive that they depend on the power of parallel computing for viability. For all but the "embarrassingly parallel" problems, the performance depends upon the level of…
The emergence of the GRID architecture and related tools will have a large impact in the operation and design of present and future large clusters. We present here the ongoing efforts to equip the Linux Farm at the RHIC Computing Facility…
To save cost, recently more and more users choose to provision virtual machine resources in cluster systems, especially in data centres. Maintaining a consistent member view is the foundation of reliable cluster managements, and it also…
Containers are an emerging technology that hold promise for improving productivity and code portability in scientific computing. We examine Linux container technology for the distribution of a non-trivial scientific computing software stack…
To accommodate the needs of large-scale distributed P2P systems, scalable data management strategies are required, allowing applications to efficiently cope with continuously growing, highly dis tributed data. This paper addresses the…
This paper explores the issues around the construction of large-scale complex systems which are built as 'systems of systems' and suggests that there are fundamental reasons, derived from the inherent complexity in these systems, why our…
Container technologies such as Docker have become a crucial component of many software industry practices especially those pertaining to reproducibility and portability. The containerization philosophy has influenced the scientific…
Containers, enabling lightweight environment and performance isolation, fast and flexible deployment, and fine-grained resource sharing, have gained popularity in better application management and deployment in addition to hardware…
Enterprises and labs performing computationally expensive data science applications sooner or later face the problem of scale but unconnected infrastructure. For this up-scaling process, an IT service provider can be hired or in-house…
Nowadays, many individuals and teams involved on projects are already using agile development techniques as part of their daily work. However, we have much less experience in how to scale and manage agile practices in distributed software…
Recent developments in the commercial open source community have catalysed the use of Linux containers for scalable deployment of web-based applications to the cloud. Scientific software can be containerized with dependencies, configuration…
Nowadays most of the cloud applications process large amount of data to provide the desired results. Data volumes to be processed by cloud applications are growing much faster than computing power. This growth demands new strategies for…
This article examines the significant challenges encountered in implementing sharding within distributed replication systems. It identifies the impediments of achieving consensus among large participant sets, leading to scalability,…
This paper presents a stream-oriented architecture for structuring cluster applications. Clusters that run applications based on this architecture can scale to tenths of thousands of nodes with significantly less performance loss or…
Virtual clusters are widely used computing platforms than can be deployed in multiple cloud platforms. The ability to dynamically grow and shrink the number of nodes has paved the way for customised elastic computing both for High…
Microsoft Cluster Service (MSCS) extends the Win-dows NT operating system to support high-availability services. The goal is to offer an execution environment where off-the-shelf server applications can continue to operate, even in the…
Modern high-performance computing architectures (Multicore, GPU, Manycore) are based on tightly-coupled clusters of processing elements, physically implemented as rectangular tiles. Their size and aspect ratio strongly impact the achievable…
Software Product Lines are large-scale, multi-unit systems that enable massive, customized production. They consist of a base of reusable artifacts and points of variation that provide the system with flexibility, allowing generating…