Workload modeling and prediction for workflow scheduling in dynamic grid environments
Many scientific applications use grid environments to process their workloads, which consist of workflows. Grid environments are highly dynamic because grid resources belong to different administrative domains and have site-specific scheduling and resource management policies. Workflow scheduling in dynamic grid environments involves mapping the tasks of a workflow to grid resources with the aim of optimizing certain objectives, e.g., the makespan of the workflow or the utilization of the grid resources. Designing and evaluating new workflow scheduling algorithms requires comprehensive workload modeling, in particular a workflow-based workload model that represents the characteristics of workflows (e.g., their type, structure, and requirements). Such a workload model is missing in contemporary research. Moreover, conventional grid workflow scheduling algorithms rely on unrealistic assumptions that do not necessarily hold in a dynamic grid environment. For example, some approaches assume that only one workflow executes in the grid environment at a time, that the computation and communication times of the workflows are known in advance, and that the grid resources are always fully available. This thesis presents a workflow-based workload model using a workflow test bench suite and integrates this workload model into a simulated grid environment. Moreover, this thesis presents the design and development of a decentralized grid workflow scheduler, which is based on predicting the Local Users’ (LUs) load at the grid resources in a dynamic grid environment. New algorithms for scheduling the tasks of a workflow to the grid resources in the presence of their LUs’ load are proposed and implemented in the developed grid workflow scheduler.
The proposed scheduling algorithms are named Local Users Load Prediction-based (LULP)-Max-Min, LULP-Min-Min, and LULP-Sufferage. Real grid workload traces are used to represent the LUs’ load in the evaluation of the LULP-based scheduling algorithms. The proposed algorithms are compared with the Heterogeneous Earliest Finish Time (HEFT) scheduling algorithm in terms of the makespan, the data movement time, and the resource effective utilization. The results show that the proposed algorithms achieve a low makespan and a low data movement time. The decentralized grid workflow scheduler is then extended to predict the queue wait time of the grid resources. A workflow scheduling algorithm called the Queue Wait time Prediction-based Grid Workflow Scheduling algorithm (QWP-GWS) is proposed and implemented in the grid workflow scheduler. QWP-GWS aims to optimize multiple objectives, i.e., minimizing the makespan while maximizing the resource effective utilization. The results of the QWP-GWS algorithm are compared with the Min-Min, Max-Min, Sufferage, and HEFT scheduling algorithms in terms of the makespan, the queue wait time, the resource effective utilization, and the data movement time. The results show that the QWP-GWS algorithm achieves a lower makespan without compromising the resource effective utilization or the data movement time. The LULP-Max-Min, LULP-Min-Min, LULP-Sufferage, and QWP-GWS algorithms integrate the data saving algorithms proposed in this thesis in order to achieve a low data movement time, which in turn results in a low makespan.
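To make the scheduling idea concrete, the following is a minimal sketch of the classic Min-Min heuristic with task run times stretched by a predicted local-user load. It is not the thesis's LULP-Min-Min algorithm, whose details the abstract does not give. The function name `min_min_schedule`, the `predicted_local_load` fraction, and the rule that a resource with load fraction p runs a task 1/(1 - p) times slower are all assumptions made for illustration. The sketch also treats the tasks as independent and ignores workflow dependencies and data movement.

```python
from typing import Dict, List, Tuple


def min_min_schedule(
    exec_time: Dict[str, Dict[str, float]],
    predicted_local_load: Dict[str, float],
) -> Tuple[List[Tuple[str, str, float]], float]:
    """Classic Min-Min over independent tasks, adjusted for predicted load.

    exec_time[task][resource]: estimated run time on an otherwise idle resource.
    predicted_local_load[resource]: hypothetical fraction in [0, 1) of the
    resource's capacity that local users are predicted to consume.
    Returns the plan as (task, resource, completion_time) and the makespan.
    """
    ready_at = {res: 0.0 for res in predicted_local_load}
    unscheduled = set(exec_time)
    plan: List[Tuple[str, str, float]] = []

    while unscheduled:
        best = None  # (completion_time, task, resource)
        # Sorted iteration keeps tie-breaking deterministic.
        for task in sorted(unscheduled):
            for res in sorted(ready_at):
                # Assumption: local load slows the task proportionally.
                effective = exec_time[task][res] / (1.0 - predicted_local_load[res])
                completion = ready_at[res] + effective
                if best is None or completion < best[0]:
                    best = (completion, task, res)
        # Min-Min: commit the task with the smallest minimum completion time.
        completion, task, res = best
        ready_at[res] = completion
        unscheduled.remove(task)
        plan.append((task, res, completion))

    makespan = max(ready_at.values())
    return plan, makespan


if __name__ == "__main__":
    times = {
        "t1": {"r1": 4.0, "r2": 6.0},
        "t2": {"r1": 2.0, "r2": 3.0},
        "t3": {"r1": 8.0, "r2": 4.0},
    }
    load = {"r1": 0.5, "r2": 0.0}  # r1 is half occupied by local users
    print(min_min_schedule(times, load))
```

In this example the nominally faster resource `r1` is predicted to be half occupied, so two of the three tasks are placed on `r2`. Max-Min differs only in committing the task whose minimum completion time is largest, and Sufferage in committing the task that would lose the most if denied its best resource.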
Use and reproduction:
All rights reserved