Resource Management and Job Scheduling in Cloud Computing

Cloud computing has emerged as an important computing paradigm. One critical issue is guaranteeing service quality for end users. To address this problem, we design algorithms for cloud computing environments that improve the performance of user-submitted jobs. At present, we focus mainly on two job metrics: completion time and resource consumption.



1. Speculative Execution for a single job in a MapReduce-like cluster

Parallel processing plays an important role in large-scale data analytics. Frameworks such as MapReduce break a job into many small tasks that run in parallel on multiple machines. One fundamental challenge for such parallel processing is straggling tasks, which can seriously delay the completion of a job.
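To see why stragglers matter, note that a job made of parallel tasks finishes only when its slowest task does. The sketch below is an illustration under stated assumptions, not a model taken from this project: it assumes task durations are independent and Pareto-distributed (a heavy-tailed choice made here only for illustration), and the function names and parameters are hypothetical.

```python
import random


def job_completion_time(task_durations):
    """A job of parallel tasks finishes when its slowest task finishes."""
    return max(task_durations)


def simulate_mean_completion(num_tasks, num_jobs=2000, shape=2.0, scale=1.0, seed=0):
    """Estimate the mean job completion time by Monte Carlo simulation.

    Assumption (not from the source): task durations are i.i.d. Pareto
    with the given shape and minimum value `scale`.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(num_jobs):
        # random.paretovariate returns values >= 1, so `scale` sets the minimum duration.
        durations = [scale * rng.paretovariate(shape) for _ in range(num_tasks)]
        total += job_completion_time(durations)
    return total / num_jobs
```

Under these assumptions, comparing `simulate_mean_completion(1)` with `simulate_mean_completion(100)` shows the effect: the more tasks a job has, the more likely at least one of them lands in the tail and delays the whole job.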

In this project, we focus on speculative execution, the technique used in the literature to deal with the straggler problem. We present a theoretical framework for optimizing a single job, which differs substantially from previous heuristics-based work. More precisely, we propose two schemes for the case where the number of parallel tasks in the job is smaller than the cluster size. The first scheme needs no monitoring: it guarantees the job deadline with high probability while achieving the optimal level of resource consumption. The second scheme monitors task progress and launches the optimal number of duplicates when a task straggles. When the number of tasks in a job is larger than the cluster size, we instead propose an Enhanced Speculative Execution (ESE) algorithm that makes the optimal decision whenever a machine becomes available for scheduling. Simulation results show that the ESE algorithm can reduce the job flowtime by 50% while consuming fewer resources than the strategy without backup copies.
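The idea behind a no-monitoring scheme can be illustrated with a small calculation. The following is a simplified sketch, not the scheme developed in this project: it assumes every task is launched with the same number of copies at time zero, that copy durations are i.i.d. Pareto, and that "resource consumption" is approximated by the number of copies launched. All function names, parameters, and the distribution are assumptions made for illustration.

```python
def pareto_cdf(t, shape=2.0, scale=1.0):
    """P(duration <= t) under the assumed Pareto model (hypothetical choice)."""
    if t <= scale:
        return 0.0
    return 1.0 - (scale / t) ** shape


def prob_job_meets_deadline(num_tasks, copies, deadline, cdf=pareto_cdf):
    """Probability that every task has at least one copy done by the deadline."""
    p_copy_late = 1.0 - cdf(deadline)
    # A task is late only if all of its independent copies are late.
    p_task_on_time = 1.0 - p_copy_late ** copies
    return p_task_on_time ** num_tasks


def min_copies_for_deadline(num_tasks, deadline, target_prob, cluster_size, cdf=pareto_cdf):
    """Smallest per-task copy count that meets the deadline with target_prob.

    Returns None if no copy count that fits in the cluster is enough.
    """
    max_copies = cluster_size // num_tasks  # all copies must run at once
    for copies in range(1, max_copies + 1):
        if prob_job_meets_deadline(num_tasks, copies, deadline, cdf) >= target_prob:
            return copies
    return None
```

For example, with 10 tasks, a deadline of 5 time units, and the default Pareto parameters, a single copy per task meets the deadline with probability of about 0.66, while `min_copies_for_deadline(10, 5.0, 0.99, 100)` selects three copies per task. A scheme that monitors progress would instead add duplicates only for tasks observed to be straggling.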


Publications:


Copyright © 2015. All Rights Reserved. MobiTeC, The Chinese University of Hong Kong.