Related papers: A Low Overhead Minimum Process Global Snapshop Col…

200 papers

The wireless mobile ad hoc network (MANET) architecture is one consisting of a set of mobile hosts capable of communicating with each other without the assistance of base stations. This has made possible creating a mobile distributed…

Distributed, Parallel, and Cluster Computing · Computer Science 2011-11-10 Ruchi Tuli, Parveen Kumar

A distributed system consisting of a huge number of computational entities is prone to faults, because faults in a few nodes cause the entire system to fail. Consequently, fault tolerance of distributed systems is a critical issue.…

Distributed, Parallel, and Cluster Computing · Computer Science 2021-03-30 Junya Nakamura, Yonghwan Kim, Yoshiaki Katayama, Toshimitsu Masuzawa

To efficiently scale large model (LM) training, researchers transition from data parallelism (DP) to hybrid parallelism (HP) on GPU clusters, which frequently experience hardware and software failures. Existing works introduce in-memory…

Distributed, Parallel, and Cluster Computing · Computer Science 2024-08-20 Yuxin Wang, Xueze Kang, Shaohuai Shi, Xin He, Zhenheng Tang, Xinglin Pan, Yang Zheng, Xiaoyu Wu, Amelie Chi Zhou, Bingsheng He, Xiaowen Chu

Recovery from transient failures is one of the prime issues in the context of distributed systems. These systems demand to have transparent yet efficient techniques to achieve the same. Checkpoint is defined as a designated place in a…

Networking and Internet Architecture · Computer Science 2011-09-01 Ruchi Tuli, Parveen Kumar

Taking snapshots of the state of a distributed computation is useful for off-line analysis of the computational state, for later restarting from the saved snapshot, for cloning a copy of the computation, and for migration to a new cluster.…

Distributed, Parallel, and Cluster Computing · Computer Science 2023-10-09 Yao Xu, Gene Cooperman

In wireless sensor networks (WSNs), the data sensed by sensors need to be gathered, so periodic data collection is a very important application. Much effort has been aimed at the data collection scheduling algorithm…

Data Structures and Algorithms · Computer Science 2018-10-30 Ngoc-Tu Nguyen, Bing-Hong Liu, Shao-I Chu, Hao-Zhe Weng

Real-time visual analysis tasks, like tracking and recognition, require swift execution of computationally intensive algorithms. Visual sensor networks can be enabled to perform such tasks by augmenting the sensor network with processing…

Computer Vision and Pattern Recognition · Computer Science 2017-05-24 Emil Eriksson, György Dán, Viktoria Fodor

A mobile computing system is a distributed system in which at least one of the processes is mobile. They are constrained by lack of stable storage, low network bandwidth, mobility, frequent disconnection and limited battery life.…

Databases · Computer Science 2012-06-08 Yogita Khatri

Massive machine-type communications protocols have typically been designed under the assumption that coordination between users requires significant communication overhead and is thus impractical. Recent progress in efficient activity…

Information Theory · Computer Science 2022-07-12 Justin Kang, Wei Yu

Distributed learning platforms for processing large scale data-sets are becoming increasingly prevalent. In typical distributed implementations, a centralized master node breaks the data-set into smaller batches for parallel processing…

Information Theory · Computer Science 2016-10-03 Mohamed Attia, Ravi Tandon

We focus on the problem of checkpointing (or taking a snapshot) in fully replicated eventually consistent distributed databases. In particular, we consider the problem of taking Distributed Transaction-Consistent Snapshots (DTCS). A typical…

Distributed, Parallel, and Cluster Computing · Computer Science 2026-05-19 Raaghav Ravishankar, Sandeep Kulkarni, Nitin H Vaidya

In a WSN, each sensor is responsible for sensing environmental conditions and sending them to one or more base stations. Battery-operated sensors are severely constrained by the amount of energy that can be spent for transmitting these…

Distributed, Parallel, and Cluster Computing · Computer Science 2013-10-16 Subhasis Bhattacharjee

Message aggregation is often used with a goal to reduce communication cost in HPC applications. The difference in the order of overhead of sending a message and cost of per byte transferred motivates the need for message aggregation, for…

Distributed, Parallel, and Cluster Computing · Computer Science 2024-11-07 Kavitha Chandrasekar, Laxmikant Kale

Computational offloading has become an enabling component for edge intelligence in mobile and smart devices. Existing offloading schemes mainly focus on mobile devices and servers, while ignoring the potential network congestion caused by…

Networking and Internet Architecture · Computer Science 2024-01-23 Zhongyuan Zhao, Jake Perazzone, Gunjan Verma, Santiago Segarra

Accurate network synchronization is a key enabler for services such as coherent transmission, cooperative decoding, and localization in distributed and cell-free networks. Unlike centralized networks, where synchronization is generally…

Signal Processing · Electrical Eng. & Systems 2023-03-03 Dieter Verbruggen, Hazem Sallouha, Sofie Pollin

Modern mobile terminals often produce a large number of small data packets. For these packets, it is inefficient to follow the conventional medium access control protocols because of poor utilization of service resources. We propose a novel…

Information Theory · Computer Science 2014-09-05 Ronggui Xie, Huarui Yin, Xiaohui Chen, Zhengdao Wang

This paper considers base station cooperation (BSC) strategies for the uplink of a multi-user multi-cell high frequency reuse scenario where distributed iterative detection (DID) schemes with soft/hard interference cancellation algorithms…

Information Theory · Computer Science 2014-01-03 Peng Li, Rodrigo C. de Lamare

We develop distributed algorithms to allocate resources in multi-hop wireless networks with the aim of minimizing total cost. In order to observe the fundamental duplexing constraint that co-located transmitters and receivers cannot operate…

Networking and Internet Architecture · Computer Science 2016-11-15 Yufang Xi, Edmund M. Yeh

NVM-based systems are naturally fit candidates for incorporating periodic checkpointing (or snapshotting). This increases the reliability of the system, makes it more immune to power failures, and reduces wasted work in especially an HPC…

Hardware Architecture · Computer Science 2023-01-30 Akshin Singh, Smruti R. Sarangi

This paper seeks to address the question of designing distributed algorithms for the setting of compact memory i.e. sublinear bits working memory for arbitrary connected networks. The nodes in our networks may have much lower internal…

Distributed, Parallel, and Cluster Computing · Computer Science 2018-05-22 Armando Castañeda, Jonas Lefèvre, Amitabh Trehan