|Topic:||A New Framework for Data Locality Optimization|
|Date:||Thursday, September 27, 2001|
|Place:||Gould-Simpson, Room 701|
With the growing processor-memory performance gap, understanding and optimizing a program's reference locality, and consequently, its cache performance, is becoming increasingly important. Unfortunately, traditional optimization frameworks rely heavily on static analysis and are ill suited to such cache-level memory system optimizations, which primarily use program profiles. This problem is compounded by the lack of a suitable representation of a program's dynamic data reference behavior. Consequently, current data layout techniques use profile information in a fairly ad hoc manner to guide data placement decisions.
We address these issues by proposing a quantitative basis for understanding and optimizing reference locality, and by describing efficient data reference representations and an exploitable locality abstraction that support this framework. Our data reference representations (Whole Program Streams and Stream Flow Graphs) are compact--two to four orders of magnitude smaller than the programs data reference trace--and permit efficient analysis--on the order of seconds to a few minutes--even for complex applications. Next, we describe a data layout optimization framework that uses our Whole Program Stream representation to replace the current profile-directed approach with an iterative profile-analysis approach, which prototypes and evaluates a transformation before it commits to it. Our framework unifies and strengthens current data layout optimizations and also enables a new style of optimization. The salient feature of our framework is that it relies neither on static analysis, nor uses the profile for merely assisting in tuning a single, hard-coded transformation, as is common in traditional profile-directed optimizations. Instead, the framework uses profile information to explore a broad space of possible optimizations, synthesizing a good one.