操作系统
Memory tiering has received wide adoption in recent years as an effective solution to address the increasing memory demands of memory-intensive workloads. However, existing tiered memory systems often fail to meet service-level objectives…
NVMe SSD hardware has witnessed widespread deployment as commodity and enterprise hardware due to its high performance and rich feature set. Despite the open specifications of various NVMe protocols by the NVMe Express group and NVMe being…
Adaptive Mixed-Criticality (AMC) is a fixed-priority preemptive scheduling algorithm for mixed-criticality hard real-time systems. It dominates many other scheduling algorithms for mixed-criticality systems, but does so at the cost of…
We discovered that a GPU kernel can have both idempotent and non-idempotent instances depending on the input. These kernels, called conditionally-idempotent, are prevalent in real-world GPU applications (490 out of 547 from six…
This work addresses the problem of exact schedulability assessment in uniprocessor mixed-criticality real-time systems with sporadic task sets. We model the problem by means of a finite automaton that has to be explored in order to check…
Heap-based exploits that leverage memory management errors continue to pose a significant threat to application security. The root cause of these vulnerabilities are the memory management errors within the applications, however various…
In many use cases the execution time of tasks is unknown and can be chosen by the designer to increase or decrease the application features depending on the availability of processing capacity. If the application has real-time constraints,…
Extended Berkeley Packet Filter (eBPF) is a runtime that enables users to load programs into the operating system (OS) kernel, like Linux or Windows, and execute them safely and efficiently at designated kernel hooks. Each program passes…
Real-time systems are intrinsic components of many pivotal applications, such as self-driving vehicles, aerospace and defense systems. The trend in these applications is to incorporate multiple tasks onto fewer, more powerful hardware…
In the context of Project Lilliput, which attempts to reduce the size of object header in the HotSpot Java Virtual Machine (JVM), we explore a curated set of synchronization algorithms. Each of the algorithms could serve as a potential…
In the case of compute-intensive machine learning, efficient operating system scheduling is crucial for performance and energy efficiency. This paper conducts a comparative study over FIFO(First-In-First-Out) and RR(Round-Robin) scheduling…
We present SupMario, a characterization framework designed to thoroughly analyze, model, and optimize CXL memory performance. SupMario is based on extensive evaluation of 265 workloads spanning 4 real CXL devices within 7 memory latency…
This research analyzed the performance and consistency of four synchronization mechanisms-reentrant locks, semaphores, synchronized methods, and synchronized blocks-across three operating systems: macOS, Windows, and Linux. Synchronization…
We leverage eBPF in order to implement custom policies in the Linux memory subsystem. Inspired by CBMM, we create a mechanism that provides the kernel with hints regarding the benefit of promoting a page to a specific size. We introduce a…
Memory access efficiency is significantly enhanced by caching recent address translations in the CPUs' Translation Lookaside Buffers (TLBs). However, since the operating system is not aware of which core is using a particular mapping, it…
Although best-fit is known to be slow, it excels at optimizing memory space utilization. Interestingly, by keeping the free memory region at the top of the memory, the process of memory allocation and deallocation becomes approximately…
We introduce explicit speculation, a variant of I/O speculation technique where I/O system calls can be parallelized under the guidance of explicit application code knowledge. We propose a formal abstraction -- the foreaction graph -- which…
Fully-partitioned fixed-priority scheduling (FP-FPS) multiprocessor systems are widely found in real-time applications, where spin-based protocols are often deployed to manage the mutually exclusive access of shared resources.…
The extended Berkeley Packet Filter (eBPF) is extensively utilized for observability and performance analysis in cloud-native environments. However, deploying eBPF programs across a heterogeneous cloud environment presents challenges,…
NVM is used as a new hierarchy in the storage system, due to its intermediate speed and capacity between DRAM, and its byte granularity. However, consistency problems emerge when we attempt to put DRAM, NVM, and disk together as an…