Optimized Memoryless Fair-Share HPC Resources Scheduling using Transparent Checkpoint-Restart Preemption

25 Feb 2021 · Kfir Zvi, Gal Oren

Resource management in supercomputing systems commonly relies on hard divisions, capping, and quota allotment. Despite their advantages, these methods have known, serious disadvantages: they leave an expensive facility underutilized, and the resources must still occasionally be rescheduled and reallocated dynamically. Consequently, these methods manage supply and demand poorly, in contrast to a free-market playground that would eventually increase system utilization and productivity. In this work, we propose Optimized Memoryless Fair-Share HPC Resources Scheduling using Transparent Checkpoint-Restart Preemption, a new scheme in which social welfare increases through free-of-cost, interchangeable proprietary possession of resources. Accordingly, we permanently preserve the status quo with regard to the fairness of the resource distribution, while maximizing the ability of all users to obtain more CPUs and CPU hours for longer periods, without any non-straightforward costs, penalties, or additional human intervention.
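The abstract does not spell out the mechanism, so the following is only a toy sketch of one way to read it, not the authors' algorithm. It assumes each user owns a fixed number of CPUs, idle CPUs are lent to other users at no cost, and a borrower is preempted (in a real system, by checkpointing its jobs) as soon as an owner asks for its share back. The allocation is "memoryless" in the sense that it depends only on current shares and demands, never on past usage. The names `User`, `allocate`, and `preemptions` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class User:
    name: str
    owned: int   # CPUs the user is entitled to (assumed fixed share)
    demand: int  # CPUs the user is asking for right now


def allocate(users):
    """Return {name: CPUs} using only current shares and demands (no history)."""
    # Owners come first: each user gets up to their own share.
    alloc = {u.name: min(u.demand, u.owned) for u in users}
    idle = sum(u.owned for u in users) - sum(alloc.values())

    # Lend idle CPUs, one at a time in round-robin order, to users
    # whose demand exceeds their own share.
    wanting = [u for u in users if u.demand > alloc[u.name]]
    while idle > 0 and wanting:
        for u in list(wanting):
            if idle == 0:
                break
            alloc[u.name] += 1
            idle -= 1
            if alloc[u.name] == u.demand:
                wanting.remove(u)
    return alloc


def preemptions(old, new):
    """CPUs each user must release (e.g. via checkpoint) going from old to new."""
    return {n: old[n] - new.get(n, 0) for n in old if old[n] > new.get(n, 0)}


# User B is idle, so user A borrows B's four CPUs.
before = allocate([User("A", 4, 8), User("B", 4, 0)])
# B now wants its share back, so A's borrowed CPUs are preempted.
after = allocate([User("A", 4, 8), User("B", 4, 4)])
released = preemptions(before, after)
```

In this sketch A never drops below its own four CPUs, which is the sense in which the fairness status quo is kept while idle capacity is still put to use.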


Categories


Distributed, Parallel, and Cluster Computing
