Instance Based Locality Optimization

In this project, we have developed a semi-automatic method to restructure performance-critical code sections for better cache usage, and implemented this method in a tool called iblOpt. Locality is a property of programs that reflects the level of temporal and spatial concentration of the accesses to the same data or memory block, during program execution. Locality optimization increases the degree of a program's locality, and thus the chance that a memory block moved into cache still resides there when it is accessed again. The iblOpt project has focused on reducing cache misses of sequential programs, moreover some initial work has been done on applying the approach to automatic parallelization.

Instance-based locality optimization assumes to be given a short, performance-critical program section. As its central idea, the method does not optimize this program section directly, but instead considers one or several small instances thereof, which are obtained by setting loop bounds and other variable values to small constants. The method is based on the assumption that these program instances (PIs) have a similar pattern of cache usage like the original program (for a smaller cache), and thus the original program profits from the same restructurings as the PIs. Vice versa, if the PIs can not be profitably restructured, iblOpt concludes that the original program can not be restructured as well.

The iblOpt approach has a wider scope than current compiler transformations insofar as it considers general restructurings as opposed to a fixed set of transformations. By general restructuring, we mean any re-scheduling of the statements in the PI (that respects data dependencies) in combination with any re-grouping of the data into memory blocks.

The iblOpt tool automatically optimizes the PIs. Moreover, it supports the user in recognizing patterns in the suggested restructuring by incorporating regularity as a secondary goal in the optimization process, and through a visualization component. The tool has been implemented in C and Java, and was tested for loop programs from the field of scientific computing.

For further information please contact one of the following members of our staff:
Prof. Dr. Claudia Leopold
Michael Süß