Register Pressure in Instruction Level Parallelism

Sid-Ahmed-Ali Touati

This thesis was defended in June, 25 th 2002 at the University of Versailles (France). The Ph.D. committee was composed of :

  1. Dr. Christine Eisenbeis, INRIA, Director
  2. Dr. Alain Darte, ENS Lyon, Referee
  3. Pr.Reinhard Wilhelm, University of Saarlandes, Referee
  4. Dr. Michael Schlansker, HP Labs, Referee
  5. Pr. François Bodin, University of Rennes
  6. Pr. William Jalby, University of Versailles, Director
  7. Pr. Dominique Barth, UVSQ, President

An electronic version on my Ph.D. thesis can be retrieved here.

Abstract

It has become a truism that memory accesses play the major role of degrading program performance. This is because the continuous increasing of the gap between instruction level parallelism (ILP) processor speed and memory access latency. Optimizing compilers must avoid requesting data from memory if possible by using at the best the available registers of underlying hardware. This thesis reconsiders the register pressure concept so that it gets higher priority than ILP scheduling, but with full respect to intrinsic fine grain parallelism. We propose to handle register pressure early in optimization process, before instruction scheduling. Two main strategies are developed.

In the first strategy, we handle data dependence graphs (DDGs) so that we guarantee register constraints without increasing critical execution paths if possible. We introduce and study the concept of register saturation (RS), which is the exact upper-bound of register requirement for all valid schedules independently of architectural constraints. Its aim is to add some serial arcs to the original DDG such that the worst register need does not exceed the number of available registers. On the other hand, register sufficiency (RF) is the exact minimal register requirement. Its aim is to detect unavoidable spilling decisions when it exceeds the number of available registers. After RS and RF analysis steps, ILP scheduler is free from register constraints and final allocator may not require avoidable spilling.

Our second strategy consists in directly applying an early register allocation with optimized ILP loss. It is built directly into the input DDG and hence register constraints are fixed.

Our thesis addresses basic blocks, acyclic control flow graphs (multiple basic blocs with branches) and innermost loops intended for software pipelining. We assume a generic architecture model so that it matches current ILP processors. We give an exact formulation with integer programming for all register pressure problems. We also provide algorithmic solutions. Experimental results show that our heuristics are nearly optimal. Our thesis proves that we can and must handle register constraints early while keeping freedom for a further ILP scheduling. This is more beneficial than a combined approach which tries to carry out register allocation and ILP scheduling in a single complex pass.

Keywords

Instruction Level Parallelism, Register Allocation, Register Saturation, Register Requirement, Register Sufficiency, Software Pipelining, Integer Linear Programming, Code Optimization, Optimizing Compilation.


Ce document a été traduit de LATEX par HEVEA