Power


Approximations

A first-order approximation of the power consumption of the design is obtained by applying blanket toggle rates to all nodes in the netlist of the SET-CSE design.  This gives a ballpark idea of the power consumed in the various components of the design, and of the expected savings of a DET-CSE design.  Although this first-order analysis provides some insight into the power consumption of the logic gates, memories, and clocked storage elements (CSEs), it does not give a good indication of the power consumed in the clock distribution network.  The netlist output from synthesis treats the clock node as an ideal node with infinite drive.  Thus, the wiring needed to balance and route the clock, as well as the buffers needed to drive the clock throughout this network, are not accounted for.  To account for a real clock distribution network, an estimate of the total loading of an H-tree distribution network was derived in previous research at ACSEL conducted by Nikola Nedovic.  The equation given below is that estimate of the total capacitance in an H-tree clock distribution network, based on fan-out-of-four (FO4) rules.

     equation (1)
 
equation 1 definitions:


The M*Cclk term is the total loading of the clock net due to the CSEs in the design.  This term can be taken directly from the netlist results.  The s*cw term can be calculated from the total area reported by the synthesis results minus the "non-routable" area (i.e. memory cells or other macros).  The capacitance per unit length of wire to be used is 0.2fF.  The number of levels in the H-tree should be chosen so that the leaf buffers do not drive too large a load (i.e. also around FO4).  Substituting equation 1 for the capacitance of the H-tree, an equation for the total power of the H-tree is derived and shown as equation (2).
 
    equation (2)
 
equation 2 definitions:



The summation term in equation 2 is the total power consumed by the clock buffers when switching.  The energy of a clock buffer is a parameter that is easily obtained from any standard cell technology library.
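
As a rough illustration of how the two contributions combine, the sketch below adds the switching power of a lumped clock capacitance to the summed buffer power.  It is not equation 2 itself: it assumes the standard dynamic power relation P = C * Vdd^2 * f, assumes 4^i buffers at level i of the H-tree, and uses hypothetical function names, parameter names, and numbers throughout.

```python
def clock_tree_power(c_total, vdd, freq, buffers_per_level, e_buf):
    """Rough H-tree clock power estimate in watts (a sketch, not equation 2).

    c_total           -- lumped clock capacitance in farads (e.g. from equation 1)
    vdd               -- supply voltage in volts
    freq              -- clock frequency in hertz
    buffers_per_level -- number of clock buffers at each H-tree level (assumed)
    e_buf             -- energy per buffer per clock cycle in joules (from the library)
    """
    # Switching power of the lumped capacitance: one charge/discharge per cycle.
    p_cap = c_total * vdd ** 2 * freq
    # Buffer power: every buffer spends e_buf once per clock cycle.
    p_buf = freq * e_buf * sum(buffers_per_level)
    return p_cap + p_buf


# Hypothetical example values, chosen only to exercise the function.
levels = 4
buffers = [4 ** i for i in range(levels)]  # assumption: 1, 4, 16, 64 buffers
power = clock_tree_power(c_total=10e-12, vdd=1.8, freq=250e6,
                         buffers_per_level=buffers, e_buf=50e-15)
print(f"estimated clock tree power: {power * 1e3:.3f} mW")
```

Because both terms scale linearly with frequency in this simplified model, halving the clock frequency halves the estimate when the capacitance and buffer energy are unchanged.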
 

The results of this approximate power estimate will be posted in the results section of the Project web-page when available. 

More accurate Power analysis

A more accurate power analysis of the design can be obtained by taking toggle rate data from simulations and applying it to the design.  The toggle data can come from a testbench simulation (which probably does not reflect nominal operation of the design), or from simulations running a sequence of instructions on the design (e.g. a piece of benchmark code).  A simplified tool flow for this method is shown below in Figure 1.


 



Figure 1 - Simplified tool flow for power analysis with toggle data
 

Details of Tool flow

  1. Synthesis - generate netlist of design
     Inputs:
       1. RTL description of design
       2. Standard cell library
     Outputs:
       1. Technology-specific netlist

  2. Functional Simulation
     Inputs:
       1. Technology library behavioral models
       2. Netlist of design
     Outputs:
       1. Toggle data for each node in netlist

  3. Power Analysis
     Inputs:
       1. Netlist of design
       2. Toggle data for each node in netlist
     Outputs:
       1. Power report
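
The power analysis step of the flow above can be illustrated with a minimal sketch that combines per-node capacitance with per-node toggle data.  It covers only the switching component, assuming each transition dissipates 0.5 * C * Vdd^2; internal cell power and leakage, which a real power analysis tool also reports, are ignored.  The node names, data layout, and numbers are hypothetical.

```python
def switching_power(node_caps, toggle_rates, vdd):
    """Sum the switching power over all nodes, in watts.

    node_caps    -- dict: node name -> capacitance in farads (from the netlist)
    toggle_rates -- dict: node name -> transitions per second (from simulation)
    vdd          -- supply voltage in volts
    """
    total = 0.0
    for node, cap in node_caps.items():
        # Nodes with no recorded toggle data are treated as static.
        rate = toggle_rates.get(node, 0.0)
        # Each 0->1 or 1->0 transition costs 0.5 * C * Vdd^2 on average.
        total += 0.5 * cap * vdd ** 2 * rate
    return total


# Hypothetical two-node example.
caps = {"n1": 2e-15, "n2": 5e-15}
rates = {"n1": 1e8, "n2": 2e7}
p_total = switching_power(caps, rates, vdd=1.8)
print(f"switching power: {p_total * 1e6:.3f} uW")
```

The blanket toggle rate approach of the first section corresponds to using the same rate for every node instead of simulated values.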


This flow produces more accurate results for the power consumption of the combinatorial logic functions throughout the design.  For the clock tree power, however, it has the same inaccuracies as the method described in the previous section.  More accurate estimates of the clock tree power can be obtained by extracting wire loading and parasitics from a floor plan or layout, and/or by performing clock-tree synthesis for the design.  Currently it is not deemed necessary to achieve this level of accuracy for the clock tree power.  However, if these steps are needed in the future to refine the power analysis, the performance analysis section has more details on the tool flow involved.

 

Power savings of DET-CSE design

As mentioned previously, the power savings of the DET-CSE design are achieved by running the design at a lower clock frequency (half the SET-CSE clock frequency), thereby reducing the power consumed in the clock distribution network.  Equation 1 gives an expression for the total switching capacitance in the clock distribution network.  The first term in equation 1 is the total switching capacitance due to clocked storage element loading, and the second term is due to clock distribution network wire loading.  For the case where M >> 4^L >> 1, the two terms can be approximated as shown in equations 3 and 4.

 

    equation (3)


     equation (4)
 


A useful parameter for estimating the power savings from using DET-CSEs is the alpha coefficient shown below in equation 5.  Once again, these results are the product of previous research at ACSEL conducted by Nikola Nedovic.  Alpha is the ratio of the power consumed in the clock tree with DET-CSEs to that with SET-CSEs.


 

equation 5 and 6

 
w is the average capacitance of the wire needed to route the clock throughout the entire hierarchy of the clock distribution network.  Figure 2 is a plot of the power ratio alpha versus the loading of the SET-CSE device.  In the case where the loading of the clock tree is dominated by the wiring capacitance, or where the loadings of the DET-CSEs and SET-CSEs are close, the DET-CSE design saves 50% of the power consumed in the clock tree (i.e. alpha = 0.5).  As shown, DET-CSEs offer power savings even for designs where wire capacitance does not dominate and the DET-CSE presents a significantly higher clock load.
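
The limiting cases described above can be checked with a simplified model of the ratio.  The sketch below is not equation 5: it assumes clock power is proportional to capacitance times frequency, that the DET-CSE design runs at half the clock frequency, and that buffer power is ignored.  The function name and all parameter values are hypothetical.

```python
def alpha_ratio(m, c_det, c_set, c_wire):
    """Simplified DET/SET clock power ratio (a sketch, not equation 5).

    m      -- number of clocked storage elements
    c_det  -- clock load of one DET-CSE in farads
    c_set  -- clock load of one SET-CSE in farads
    c_wire -- total clock wire capacitance in farads (same for both designs)
    """
    # The factor 0.5 reflects the halved clock frequency of the DET-CSE design.
    return 0.5 * (m * c_det + c_wire) / (m * c_set + c_wire)


# Equal CSE loading: exactly half the clock power.
equal_load = alpha_ratio(1000, 1e-15, 1e-15, 1e-12)
# Wire-dominated tree: close to 0.5 even with a heavier DET-CSE.
wire_dominated = alpha_ratio(1000, 2e-15, 1e-15, 1e-9)
# DET-CSE load 1.5x the SET-CSE load, wire equal to total SET-CSE load.
heavier_det = alpha_ratio(1000, 1.5e-15, 1e-15, 1e-12)
print(equal_load, wire_dominated, heavier_det)
```

In this model alpha stays below 1 as long as the DET-CSE clock load is less than twice the SET-CSE load plus the per-element share of the wire capacitance.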
 


Last updated: 07/28/04.