Computer Science Colloquium

Colloquium Speaker

Speaker:		Peter A. Dinda, Carnegie Mellon University

Topic:		Predicting Running Time Using Resource Signals

Date:		Monday, March 27, 2000

Time:		1:30 PM

Place:		Gould-Simpson, Room 701

Refreshments will be served in the 7th-floor lobby of Gould-Simpson at 1:15 PM.

ABSTRACT

Consider an application running in a typical off-the-shelf distributed computing environment, one whose resources are shared by competing users and that supports neither resource reservations nor global priorities. Because resource availability is highly dynamic, the running time of a task on any particular host can vary widely, making it difficult for the application to deliver consistent high performance. However, the application has considerable freedom: it can choose which host will run the task, and it may also be able to choose the task's resource requirements. If the application could accurately predict the running time of the task under each of these choices, it could exploit this freedom to control the running time of the task. I have developed a system, the running time advisor, that provides these predictions for compute-bound tasks. Each prediction is expressed as a confidence interval in order to characterize the expected prediction error in an application-friendly manner and to enable valid statistical reasoning during the application's decision-making process.
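As a rough illustration of how an application might use such predictions, the sketch below picks a host from a set of running-time confidence intervals. It is not the running time advisor's interface: the function name `choose_host`, the host names, the interval values, and the "upper bound must meet the deadline" rule are all assumptions made for this example.

```python
def choose_host(predictions, deadline):
    """Pick a host whose predicted running time is likely to meet the deadline.

    predictions: dict mapping host name -> (lower, upper) confidence interval
                 for the task's running time, in seconds (hypothetical format).
    deadline:    maximum acceptable running time, in seconds.

    Returns the feasible host with the smallest upper bound, or None if no
    host's upper bound fits within the deadline.
    """
    # Keep only hosts whose pessimistic (upper) estimate still meets the deadline.
    feasible = {host: iv for host, iv in predictions.items() if iv[1] <= deadline}
    if not feasible:
        return None
    # Among feasible hosts, prefer the one with the tightest worst case.
    return min(feasible, key=lambda host: feasible[host][1])


# Hypothetical intervals; real values would come from a prediction service.
predictions = {
    "hostA": (4.0, 9.5),
    "hostB": (5.0, 7.0),
    "hostC": (3.0, 12.0),
}
best = choose_host(predictions, deadline=8.0)
```

Deciding on the upper bound rather than the midpoint is one possible policy; an application that tolerates occasional misses might instead compare expected values.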
The running time advisor applies statistical signal analysis and prediction techniques to characterize and predict resource signals, which are easily measured, time-varying, scalar quantities that are strongly correlated with resource availability. The specific signal used is host load. Despite its complex properties, which include self-similarity and epochal behavior, host load can be usefully predicted using linear time series models. These predictions can in turn be used to estimate confidence intervals for the running times of tasks via a model of the Unix scheduler. I believe that the approach embodied in this system can be extended in a number of ways, including to other resource signals, resulting in even richer prediction-based services that help applications deliver consistent high performance to their users.
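The pipeline described above (load signal, linear model, running-time interval) can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the system presented in the talk: it uses a first-order autoregressive model, AR(1), as the simplest linear time series model, assumes roughly Gaussian one-step prediction errors for the 95% interval, and assumes a simplified scheduler model in which a compute-bound task needing `nominal_seconds` of CPU takes about `nominal_seconds * (1 + load)` to finish. All function names and sample values are hypothetical.

```python
import math


def fit_ar1(signal):
    """Least-squares fit of x[t] - mu = phi * (x[t-1] - mu) + noise."""
    n = len(signal)
    mu = sum(signal) / n
    centered = [x - mu for x in signal]
    num = sum(centered[t] * centered[t - 1] for t in range(1, n))
    den = sum(c * c for c in centered[:-1])
    phi = num / den if den > 0 else 0.0  # constant signal: no dynamics to fit
    residuals = [centered[t] - phi * centered[t - 1] for t in range(1, n)]
    sigma = math.sqrt(sum(r * r for r in residuals) / len(residuals))
    return mu, phi, sigma


def predict_load_interval(signal, z=1.96):
    """One-step-ahead load prediction as (low, point, high).

    z=1.96 gives an approximate 95% interval if errors are Gaussian
    (an assumption of this sketch). Load cannot be negative, so clamp at 0.
    """
    mu, phi, sigma = fit_ar1(signal)
    point = max(0.0, mu + phi * (signal[-1] - mu))
    low = max(0.0, point - z * sigma)
    high = point + z * sigma
    return low, point, high


def running_time_interval(nominal_seconds, load_interval):
    """Map a load interval to a running-time interval.

    Assumed scheduler model: the task shares the CPU equally with `load`
    other runnable processes, so it runs (1 + load) times slower.
    """
    return tuple(nominal_seconds * (1.0 + load) for load in load_interval)


# Hypothetical host load samples (e.g. periodic load-average measurements).
load_samples = [0.8, 1.1, 1.0, 1.3, 1.2, 0.9, 1.0, 1.4, 1.3, 1.1]
low_t, point_t, high_t = running_time_interval(
    10.0, predict_load_interval(load_samples)
)
```

A one-step AR(1) forecast only covers the next sample. A task that runs across many samples would need multi-step predictions and an interval for the average load over the task's lifetime, which this sketch does not attempt.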