CMOS technology scaling has been a primary driving force behind increases in processor performance. A drawback of this trend is a continuing increase in leakage power dissipation, which now accounts for an increasingly large share of processor power dissipation. On-chip SRAM memories such as caches, branch predictors, and TLBs account for a large fraction of total processor power consumption, and because of their large size much of it is leakage power. High leakage power dissipation not only increases the overall processor power dissipation but also raises its temperature. The positive feedback loop between temperature and leakage power then drives both higher still. Furthermore, some SRAM-based structures, e.g. register files, the BTB, and the ITLB, are temperature hot-spots on a chip. Finally, higher temperature reduces chip reliability and usable lifetime and increases the complexity of packaging and cooling design.
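The feedback between temperature and leakage can be illustrated with a minimal fixed-point sketch. It assumes a single lumped thermal resistance and an exponential dependence of leakage on temperature; the function name and every parameter value below are illustrative assumptions, not figures from this work.

```python
import math

def steady_state(p_dyn, p_leak_ref, t_amb=45.0, r_th=0.8,
                 t_ref=45.0, k=0.02, tol=1e-6, max_iter=1000):
    """Iterate temperature and leakage until they stop changing.

    Assumed model (hypothetical parameters):
      T      = t_amb + r_th * (p_dyn + P_leak)        lumped thermal model
      P_leak = p_leak_ref * exp(k * (T - t_ref))      leakage grows with T
    Returns (temperature in deg C, leakage power in W).
    """
    # Start from the temperature obtained without any feedback.
    t = t_amb + r_th * (p_dyn + p_leak_ref)
    for _ in range(max_iter):
        p_leak = p_leak_ref * math.exp(k * (t - t_ref))
        t_next = t_amb + r_th * (p_dyn + p_leak)
        if abs(t_next - t) < tol:
            return t_next, p_leak
        t = t_next
    # If the loop gain is too high the iteration never settles.
    raise RuntimeError("no convergence: thermal runaway for these parameters")

# Feedback pushes both quantities above their no-feedback starting values.
t_hot, leak_hot = steady_state(p_dyn=20.0, p_leak_ref=10.0)
# Halving the reference leakage lowers the steady-state temperature as well.
t_cool, leak_cool = steady_state(p_dyn=20.0, p_leak_ref=5.0)
```

Under these assumptions, any reduction in reference leakage lowers the steady-state temperature by more than the no-feedback estimate would suggest, because the cooler chip also leaks less.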
A number of process and circuit techniques have been proposed to significantly reduce the leakage of the memory cell array in SRAMs. Recent results have shown that SRAM peripheral circuits, such as word-line drivers and input and output drivers, are now the main source of leakage. In response to the growing share of leakage in peripheral circuits, this thesis explores an integrated circuit and architectural approach to reduce leakage in the peripheral circuitry of on-chip SRAM memories.
At the circuit level, one approach to reducing sub-threshold leakage in SRAM peripheral circuits is to use stacked sleep transistors. The drawback of sleep transistors is the delay that they add to SRAM access time, which may lead to increased execution time and therefore potentially higher energy consumption. To reduce SRAM "wakeup" delay, this work proposes sharing sleep transistors and using them in a zig-zag, or alternating, fashion across the stages of multi-stage drivers, such as the SRAM word-line driver. We show that by adapting the bias voltage of the sleep transistor in the zig-zag share circuit, one can trade off leakage reduction against wakeup delay. We thus propose to use several low-leakage power modes with different wakeup times to better control SRAM peripheral circuit leakage. We further explore the design space of sleep transistor insertion in SRAM peripheral circuitry and show the effect of sleep transistor size, its gate bias, and the number of horizontal and vertical levels of sharing on the trade-off between leakage power savings and the impact on instability, area, dynamic power, propagation delay, wakeup delay, rise time, and fall time of the SRAM peripheral circuits.
The question is then when and how to use these different low-leakage modes for each of the SRAM units to maximize leakage reduction while minimizing the wakeup delay and its impact on performance. We answer these questions at the architecture level by proposing several micro-architectural techniques to control multiple sleep modes.
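As a rough illustration of the kind of decision such a controller has to make, the following Python sketch selects the deepest sleep mode whose wakeup delay fits within the cycles available before the unit is next needed. The mode names, leakage savings, wakeup latencies, and the `pick_mode` helper are all hypothetical placeholders, not the modes or policies proposed in this work.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class SleepMode:
    name: str
    leakage_saving: float  # fraction of peripheral leakage removed (assumed)
    wakeup_cycles: int     # cycles needed to return to active (assumed)

# Hypothetical mode table: deeper modes save more but wake up more slowly.
MODES = (
    SleepMode("active", 0.0, 0),
    SleepMode("light", 0.4, 1),
    SleepMode("medium", 0.6, 2),
    SleepMode("deep", 0.8, 4),
)

def pick_mode(slack_cycles: int) -> SleepMode:
    """Return the mode with the largest saving whose wakeup delay can be
    hidden within slack_cycles (cycles before the unit is next accessed)."""
    hideable = [m for m in MODES if m.wakeup_cycles <= slack_cycles]
    return max(hideable, key=lambda m: m.leakage_saving)

# A unit needed immediately stays active; one with ample slack sleeps deeply.
immediate = pick_mode(0)
short_idle = pick_mode(2)
long_idle = pick_mode(10)
```

A real controller would also need some way to estimate the available slack for each unit; that estimation is where the micro-architectural techniques come in and is not modeled here.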
We explore this integrated circuit and architectural approach to minimize leakage power dissipation, and consequently also reduce temperature, in the on-chip SRAMs of the L2, DL1, and IL1 caches, branch predictor, floating-point and integer register files, floating-point and integer rename units, and instruction and data TLBs, in both high-performance and embedded processors.
We evaluate the leakage reduction in each individual unit and show a large reduction in leakage power. We also evaluate the resulting temperature reduction, including the effect of the positive feedback between temperature and leakage power, and find a significant temperature reduction in each unit.