Adaptive Performance Control of Internet-based Grids in a Dynamic Environment

posted Jan 17, 2012, 4:29 AM by Eric Medvet   [ updated Dec 10, 2012, 6:26 AM ]
  • 16th International Conference on Computer Communications and Networks (ICCCN), 2007, Honolulu (HI, US)
  • Paolo Vercesi, Alberto Bartoli
Organizations are increasingly aggregating their computing resources to form Internet-based grids that are specialized in specific application workflows and made available to other organizations. Job scheduling in such grids can have a substantial impact on performance, but finding effective scheduling policies is hard because of the very nature of this scenario. Performance may depend on a myriad of parameters whose values can hardly be determined in practice. Moreover, the load injected by users is typically unpredictable, the performance of Internet links may vary widely during an execution, and the computing resources at participating organizations may also vary dynamically, perhaps because of additional workloads injected by competing activities. In this paper we propose mechanisms and policies for controlling the scheduling of jobs in such a highly dynamic environment. We attempt to minimize resource usage at the participating organizations while keeping the performance delivered to clients at an acceptable level. Our approach consists of a form of admission control at the entry point of the application workflow that is simple to deploy in practice and does not require any hooks from the participating organizations. We simply vary dynamically the maximum number of jobs that can be injected into the grid, based on performance measures taken online, and delay excess jobs. We have evaluated our proposal in detail, by simulation, focusing on its ability to adapt automatically to perturbations in the form of substantial and unexpected changes in the amount of computing resources available. We have found that our proposal is indeed capable of automatically finding a suitable trade-off between throughput and resource usage, even in such a dynamic scenario.
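The admission-control idea in the abstract can be illustrated with a small sketch. This is not the paper's actual policy, which the abstract does not spell out: the class name `AdmissionController`, its parameters (`initial_limit`, `min_limit`, `max_limit`, `tolerance`) and the simple probe-and-back-off adjustment rule are all assumptions chosen for illustration. The sketch only shows the general shape: a cap on the number of jobs in the grid, a queue that delays excess jobs, and a limit that moves in response to throughput samples.

```python
from collections import deque


class AdmissionController:
    """Caps the number of jobs in flight; excess jobs wait in a FIFO queue.

    Hypothetical illustration: the adjustment rule below is an assumed
    probe-and-back-off heuristic, not the policy evaluated in the paper.
    """

    def __init__(self, initial_limit=4, min_limit=1, max_limit=64, tolerance=0.05):
        self.limit = initial_limit
        self.min_limit = min_limit
        self.max_limit = max_limit
        self.tolerance = tolerance   # relative change treated as significant
        self.in_flight = 0
        self.delayed = deque()
        self._last_throughput = None
        self._step = 1               # direction of the next limit change

    def submit(self, job):
        """Queue a job and return the jobs to inject now (possibly none)."""
        self.delayed.append(job)
        return self._release()

    def complete(self):
        """Record one job completion and release delayed jobs if room exists."""
        self.in_flight -= 1
        return self._release()

    def _release(self):
        released = []
        while self.delayed and self.in_flight < self.limit:
            released.append(self.delayed.popleft())
            self.in_flight += 1
        return released

    def adjust(self, throughput):
        """Move the limit by one step based on a new throughput sample."""
        if self._last_throughput is not None:
            gain = (throughput - self._last_throughput) / max(self._last_throughput, 1e-9)
            if self._step > 0 and gain < self.tolerance:
                self._step = -1      # more jobs did not help: back off
            elif self._step < 0 and gain < -self.tolerance:
                self._step = 1       # fewer jobs hurt throughput: go back up
        self._last_throughput = throughput
        self.limit = min(self.max_limit, max(self.min_limit, self.limit + self._step))
        return self._release()
```

In this sketch the controller sits entirely at the entry point: it needs only job completions and a throughput measurement, which mirrors the abstract's point that no hooks are required from the participating organizations. Lowering the limit whenever a higher one brings no measurable gain is one way to favor low resource usage over marginal throughput.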