Related papers: Observationally Cooperative Multithreading
Multi-threaded programs have traditionally fallen into one of two domains: cooperative and competitive. These two domains have traditionally remained mostly disjoint, with cooperative threading used for increasing throughput in…
There is an ongoing effort to provide programming abstractions that ease the burden of exploiting multicore hardware. Many programming abstractions (e.g., concurrent objects, transactional memory, etc.) simplify matters, but still involve…
OCaml is an industrial-strength, multi-paradigm programming language, widely used in industry and academia. OCaml is also one of the few modern managed system programming languages to lack support for shared memory parallel programming.…
The latest trends in high-performance computing systems show an increasing demand on the use of a large scale multicore systems in a efficient way, so that high compute-intensive applications can be executed reasonably well. However, the…
The inability to predict lasting languages and architectures led us to develop OCCA, a C++ library focused on host-device interaction. Using run-time compilation and macro expansions, the result is a novel single kernel language that…
Continual Model Merging (CMM) enables rapid customization of foundation models by sequentially incorporating task-adapted models without repeated retraining. However, existing merging rules usually update the deployed model through fixed…
The C/C++ memory model provides an interface and execution model for programmers of concurrent (shared-variable) code. It provides a range of mechanisms that abstract from underlying hardware memory models -- that govern how multicore…
Major chip manufacturers have all introduced Multithreaded processors. These processors are used for running a variety of workloads. Efficient resource utilization is an important design aspect in such processors. Particularly, it is…
Multicore parallel programming has some very difficult problems such as deadlocks during synchronizations and race conditions brought by concurrency. Added to the difficulty is the lack of a simple, well-accepted computing model for…
There is growing interest in accelerating irregular data-parallel algorithms on GPUs. These algorithms are typically blocking, so they require fair scheduling. But GPU programming models (e.g.\ OpenCL) do not mandate fair scheduling, and…
Design of an efficient thread-safe concurrent data structure is a balancing act between its implementation complexity and performance. Lock-based concurrent data structures, which are relatively easy to derive from their sequential…
The trend in industry is towards heterogeneous multicore processors (HMCs), including chips with CPUs and massively-threaded throughput-oriented processors (MTTOPs) such as GPUs. Although current homogeneous chips tightly couple the cores…
Coherent causal memory (CCM) is causal memory in which prefixes of an execution can be mapped to global memory states in a consistent way. While CCM requires conflicting pairs of writes to be globally ordered, it allows writes to remain…
Concurrency has been rapidly gaining importance in general-purpose computing, caused by the recent turn towards multicore processing architectures. As a result, an increasing number of developers have to learn to write concurrent programs,…
Concurrency, the art of doing many things at the same time is slowly becoming a science. It is very difficult to master, yet it arises all over modern computing systems, both when the communication medium is shared memory and when it is by…
Software Transactional memory (STM) is an emerging abstraction for concurrent programming alternative to lock-based synchronizations. Most STM models admit only isolated transactions, which are not adequate in multithreaded programming…
The memory model is the crux of the concurrency semantics of shared-memory systems. It defines the possible values that a read operation is allowed to return for any given set of write operations performed by a concurrent program, thereby…
Exactly solving multi-objective integer programming (MOIP) problems is often a very time consuming process, especially for large and complex problems. Parallel computing has the potential to significantly reduce the time taken to solve such…
Objective Caml is a famous dialect of the ML family languages. It is well-known for its performance as a compiled programming language, notably thanks to its incremental generational automatic memory collection. However, for historical…
Despite the various research initiatives and proposed programming models, efficient solutions for parallel programming in HPC clusters still rely on a complex combination of different programming models (e.g., OpenMP and MPI), languages…