# TRANSIENT POWER MINIMIZATION THROUGH DATAPATH SCHEDULING IN MULTIPLE SUPPLY VOLTAGE ENVIRONMENT

Saraju P. Mohanty, N. Ranganathan and Sunil K. Chappidi

Department of Computer Science and Engineering Nanomaterial and Nanomanufacturing Research Center University of South Florida, Tampa, FL 33620 {smohanty,ranganat,chappidi}@csee.usf.edu

## ABSTRACT

In designs for battery-driven portable applications, the reduction of peak power, peak power differential, average power and energy are equally important. In [1], a parameter called the "cycle power profile function" (CPF) is defined that captures these power parameters, and a heuristic algorithm using multiple voltages and dynamic clocking is proposed for its minimization. In this paper, we redefine the CPF for multiple voltages and multicycling (MVMC) and denote it CPFMC. We then modify the nonlinear CPFMC to facilitate its minimization using integer linear programming (ILP) through datapath scheduling. Experiments conducted on various high-level synthesis benchmarks reveal significant reductions in all of these power parameters.

## 1. INTRODUCTION

With the increase in chip densities and clock frequencies, the demand for low-power integrated circuits has increased and reliability has become a critical issue. Both peak power and peak power differential drive the transient characteristics of a CMOS circuit. Large current flow due to high peak power causes IR drop in the power line, which reduces the supply voltage levels. High current flow can also reduce reliability because of hot electron effects and high current density. If the power dissipation is large, then the heat generated by the system is large. The larger $\frac{di}{dt}$ associated with a larger peak power differential can cause power supply noise because of the self-inductance of the power supply lines, and can also cause crosstalk. The greater the power fluctuation, the lower the electrochemical conversion in the battery, and hence the shorter the battery life. If the average power (or energy) consumption is high, the battery lifetime may be reduced.

Several datapath scheduling algorithms have been proposed that minimize energy or average power, but few datapath scheduling techniques minimize peak power or peak power differential. Datapath scheduling techniques such as [2, 3] use multiple voltages to minimize energy, but not transient power. In [4], genetic algorithms are used to optimize both average and peak power through simultaneous assignment and scheduling. ILP-based scheduling and force-directed scheduling are proposed in [5, 6] to minimize peak power under latency constraints. In [7], the authors propose ILP-based datapath scheduling schemes for peak power minimization under resource constraints using multiple voltages, dynamic clocking and multicycling. The authors in [8] propose the use of data monitor operations for the reduction of peak power and peak power differential. However, these works do not consider energy minimization. In this work, we consider the simultaneous minimization of transient power, average power and energy using multiple voltages and multicycling.

## 2. CYCLE POWER PROFILE FUNCTION (CPFMC)

In this section, a parameter called the cycle power profile function is introduced that captures the peak power, peak power differential, average power and mean cycle difference power of a datapath circuit. The CPFMC characterizes the transient power, and its minimization using multiple voltages also results in minimization of energy. The datapath is represented as a sequencing data flow graph (DFG). The following notation is used in the description:

| Symbol         | Description                                        |
|----------------|----------------------------------------------------|
| $N$            | total number of control steps                      |
| $c$            | a control step, $1 \le c \le N$                    |
| $P_c$          | power consumption in step $c$                      |
| $P_p$          | peak power consumption                             |
| $P$            | average power consumption                          |
| $P_n$          | normalised average power                           |
| $DP_c$         | difference power for cycle $c$                     |
| $DP_p$         | peak differential power                            |
| $DP$           | mean of the difference powers                      |
| $DP_n$         | normalised $DP$                                    |
| $R_c$          | number of resources active in step $c$             |
| $\alpha_{i,c}$ | switching activity of resource $i$, active in $c$  |
| $V_{i,c}$      | operating voltage of resource $i$, active in $c$   |
| $C_{i,c}$      | load capacitance of resource $i$, active in $c$    |
| $f$            | clock frequency                                    |

The power consumption for any step c is given by,

$$P_{c} = \sum_{i=1}^{R_{c}} \alpha_{i,c} C_{i,c} V_{i,c}^{2} f \tag{1}$$

The peak power consumption of the DFG is the maximum power consumption over all the control steps,

$$P_p = \max \left( P_c \right)_{\forall c} = \max \left( \sum_{i=1}^{R_c} \alpha_{i,c} C_{i,c} V_{i,c}^2 f \right)_{\forall c} \tag{2}$$
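Eqns. 1 and 2 can be evaluated directly once the active resources of each control step are known. The following is a minimal sketch; the `Resource` tuple layout, the function names and all numeric values are illustrative assumptions, not data from the paper's cell library.

```python
from typing import List, Tuple

# Each active resource is described by (alpha, C, V): switching activity,
# load capacitance in farads, and operating voltage in volts.
Resource = Tuple[float, float, float]


def cycle_power(resources: List[Resource], f: float) -> float:
    """Eqn. 1: P_c = sum over active resources of alpha * C * V^2 * f."""
    return sum(alpha * cap * volt ** 2 * f for alpha, cap, volt in resources)


def peak_power(steps: List[List[Resource]], f: float) -> float:
    """Eqn. 2: P_p is the maximum P_c over all control steps."""
    return max(cycle_power(step, f) for step in steps)


# Illustrative values only (hypothetical capacitances and clock frequency).
steps = [
    [(0.5, 1e-12, 5.0)],                     # step 1: one unit at 5.0 V
    [(0.5, 1e-12, 3.3), (0.5, 1e-12, 3.3)],  # step 2: two units at 3.3 V
]
f_clk = 1e6  # hypothetical clock frequency in Hz
profile = [cycle_power(s, f_clk) for s in steps]
p_peak = peak_power(steps, f_clk)
```

In this example the single 5.0 V unit dissipates more than the two 3.3 V units together, which illustrates why the quadratic dependence on voltage makes voltage assignment an effective lever on peak power.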

The mean cycle power (P) that captures the average power consumption of the datapath can be defined as,

$$P = \frac{1}{N} \sum_{c=1}^{N} P_c = \frac{1}{N} \sum_{c=1}^{N} \left( \sum_{i=1}^{R_c} \alpha_{i,c} C_{i,c} V_{i,c}^2 f \right) \tag{3}$$

The normalised mean cycle power  $(P_n)$  is given as,

$$P_{n} = \frac{P}{P_{p}} = \frac{\frac{1}{N} \sum_{c=1}^{N} \sum_{i=1}^{R_{c}} \alpha_{i,c} C_{i,c} V_{i,c}^{2} f}{\max \left( \sum_{i=1}^{R_{c}} \alpha_{i,c} C_{i,c} V_{i,c}^{2} f \right)_{\forall c}} \tag{4}$$

The cycle power fluctuation $(DP_c)$ for a control step is,

$$DP_c = |P - P_c| \tag{5}$$

The maximum power fluctuation $(DP_p)$ is given by:

$$DP_p = \max (|P - P_c|)_{\forall c} \tag{6}$$

The mean cycle difference power $(DP)$ is the mean (average) of the cycle power fluctuations $(DP_c)$.

$$DP = \frac{1}{N} \sum_{c=1}^{N} DP_c = \frac{1}{N} \sum_{c=1}^{N} |P - P_c| \tag{7}$$

The normalised mean cycle difference power is,

$$DP_n = \frac{DP}{DP_p} \tag{8}$$

The cycle power profile function is defined as the equally weighted sum of the normalized mean cycle power and the normalized mean cycle difference power, as given below.

$$CPFMC = P_n + DP_n = \frac{P}{P_p} + \frac{DP}{DP_p} \tag{9}$$
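As a rough illustration of Eqns. 2 to 9, the function below computes CPFMC from a list of per-step power values. It is a sketch under stated assumptions: the function name is hypothetical, the profile is assumed to contain at least one positive value, and the treatment of a perfectly flat profile (where $DP_p = 0$) is a choice made here, not something the paper specifies.

```python
from typing import List


def cpfmc(cycle_powers: List[float]) -> float:
    """Eqn. 9: CPFMC = P / P_p + DP / DP_p for a per-step power profile."""
    n = len(cycle_powers)
    p_peak = max(cycle_powers)                            # Eqn. 2
    p_mean = sum(cycle_powers) / n                        # Eqn. 3
    diffs = [abs(p_mean - p_c) for p_c in cycle_powers]   # Eqn. 5
    dp_peak = max(diffs)                                  # Eqn. 6
    dp_mean = sum(diffs) / n                              # Eqn. 7
    p_n = p_mean / p_peak                                 # Eqn. 4
    # Assumption: a flat profile has DP_p = 0; DP_n is taken as 0 in that case.
    dp_n = dp_mean / dp_peak if dp_peak > 0 else 0.0      # Eqn. 8
    return p_n + dp_n


# Hypothetical four-step profile in mW: P = 20, P_p = 30, DP = 5, DP_p = 10.
value = cpfmc([10.0, 20.0, 30.0, 20.0])
```

For the sample profile, $P_n = 20/30$ and $DP_n = 5/10$, so the function returns roughly 1.17.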

From Eqn. 9, we observe that CPFMC is a nonlinear function, both because of the absolute value in the differential component and because of its fractional form. Minimizing it optimally requires nonlinear optimization techniques, which have large time and space complexity. In this work, we aim to develop an ILP-based model for its minimization. To simplify the ILP-based model, we modify the CPFMC. The denominator parameters are $P_p = \max(P_c)_{\forall c}$ and $DP_p = \max(|P-P_c|)_{\forall c}$. Since $|P - P_c|$ measures the absolute deviation of $P_c$ from the mean $P$, and both $P$ and $P_c$ lie between 0 and $P_p$, $|P - P_c|$ cannot exceed $P_p$ for any $c$. Thus, we conclude that $DP_p$ is upper bounded by $P_p$. We modify the CPFMC by substituting $DP_p$ with $P_p$ and define $CPFMC^*$ as follows:

$$CPFMC^* = \frac{P}{P_p} + \frac{DP}{P_p} = \frac{P+DP}{P_p} \tag{10}$$

The absence of $DP_p$ in the denominator reduces the complexity of the ILP formulations to a great extent.
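The effect of the substitution can be checked numerically. The sketch below evaluates Eqn. 10 for a hypothetical power profile and also computes $DP_p$ so that the bound $DP_p \le P_p$ can be observed; the function name and the sample values are assumptions made for illustration.

```python
from typing import List


def cpfmc_star(cycle_powers: List[float]) -> float:
    """Eqn. 10: CPFMC* = (P + DP) / P_p."""
    n = len(cycle_powers)
    p_peak = max(cycle_powers)
    p_mean = sum(cycle_powers) / n
    dp_mean = sum(abs(p_mean - p_c) for p_c in cycle_powers) / n
    return (p_mean + dp_mean) / p_peak


# Hypothetical four-step profile in mW.
profile = [10.0, 20.0, 30.0, 20.0]
p_mean = sum(profile) / len(profile)
dp_peak = max(abs(p_mean - p_c) for p_c in profile)   # DP_p = 10
star = cpfmc_star(profile)                            # (20 + 5) / 30
```

Because the differential term is now divided by the larger quantity $P_p$, $CPFMC^*$ is never greater than CPFMC for the same profile.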

## 3. ILP FORMULATIONS TO MINIMIZE CPFMC

In this section, we describe the ILP formulations for the modified cycle power profile function $(CPFMC^*)$ using multiple supply voltages and multicycling. In this scheme, the functional units (FUs) are operated at multiple supply voltages, and operations assigned to lower-voltage functional units are scheduled over consecutive control steps. The following notation is used to formulate the ILP-based model:

| Symbol        | Description                                                                                                                      |
|---------------|----------------------------------------------------------------------------------------------------------------------------------|
| $O$           | total number of operations in the DFG                                                                                            |
| $o_i$         | any operation $i$, $1 \leq i \leq O$                                                                                             |
| $F_{k,v}$     | FU of type $k$ operating at voltage $v$                                                                                          |
| $M_{k,v}$     | maximum number of $F_{k,v}$                                                                                                      |
| $S_i$         | ASAP time stamp for the operation $o_i$                                                                                          |
| $E_i$         | ALAP time stamp for the operation $o_i$                                                                                          |
| $P(i, v, f)$  | power consumption of $F_{k,v}$ used by $o_i$                                                                                     |
| $y_{i,v,l,m}$ | decision variable which takes the value 1 if operation $o_i$ uses $F_{k,v}$ and is scheduled in control steps $l \rightarrow m$  |
| $L_{i,v}$     | latency for operation $o_i$ using $F_{k,v}$                                                                                      |

(a) <u>Objective Function</u>: The objective is to minimize the $CPFMC^*$ of the whole DFG over all control steps. Using Eqns. 10, 3 and 7, this can be represented as:

Minimize:

$$\frac{\frac{1}{N}\sum_{c=1}^{N}P_{c}+\frac{1}{N}\sum_{c=1}^{N}|P-P_{c}|}{P_{p}} \tag{11}$$

As discussed in the previous section, this objective function has two types of non-linearity, introduced by the absolute function and by the fractional form. The fractional non-linearity [9] is removed by introducing the denominator as a constraint; the corresponding constraints are known as "peak power constraints". The problem in Eqn. 11 then transforms to the one given below.

Minimize:

$$\frac{1}{N} \sum_{c=1}^{N} P_c + \frac{1}{N} \sum_{c=1}^{N} |P - P_c| \tag{12}$$

Subject to: Peak power constraints

This transformed problem is still non-linear because of the absolute function. We remove the absolute function non-linearity [9] by modifying the peak power constraints, which gives rise to "modified peak power constraints". Thus, the problem in Eqn. 12 transforms to,

Minimize:

$$\frac{1}{N} \sum_{c=1}^{N} P_c + \frac{1}{N} \sum_{c=1}^{N} (P + P_c) \tag{13}$$

Subject to: Modified peak power constraints

The "peak power constraints" and "modified peak power constraints" are discussed later in this section. Using Eqn. 3, the problem in Eqn. 13 is simplified to:

Minimize:

$$\left(\frac{3}{N}\right)\sum_{c=1}^{N}P_c \tag{14}$$

Subject to: Modified peak power constraints
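The step from Eqn. 13 to Eqn. 14 can be spelled out as follows. It uses only Eqn. 3 and the observation that $P$ does not depend on $c$, so that averaging $P$ over the $N$ control steps gives $P$ again:

$$\frac{1}{N} \sum_{c=1}^{N} P_c + \frac{1}{N} \sum_{c=1}^{N} (P + P_c) = P + \left( P + P \right) = 3P = \left(\frac{3}{N}\right) \sum_{c=1}^{N} P_c$$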

Using the decision variables the objective function becomes,

Minimize:

$$\left(\frac{3}{N}\right) \sum_{l} \sum_{i \in F_{k,v}} \sum_{v} y_{i,v,l,(l+L_{i,v}-1)} P(i,v,f) \tag{15}$$

Subject to: Modified peak power constraints

(b) <u>Uniqueness Constraints</u>: These constraints ensure that every operation $o_i$ is scheduled in appropriate control steps within the mobility range $(S_i, E_i)$ with a particular supply voltage. When the operators are operating at the highest voltage, they are scheduled in one unique control step, whereas at lower voltages they need more than one clock cycle for completion. Thus, for lower voltages the mobility is restricted. We represent them as, $\forall i, 1 \le i \le O$,

$$\sum_{v} \sum_{l=S_i}^{S_i+E_i+1-L_{i,v}} y_{i,v,l,(l+L_{i,v}-1)} = 1 \tag{16}$$

(c) <u>Precedence Constraints</u>: These constraints guarantee that for an operation $o_i$, all its predecessors are scheduled in earlier control steps and its successors are scheduled in later control steps. These constraints also take multicycling into consideration. They are enforced as, $\forall i, j, o_i \in Pred_{o_j}$,

$$\sum_{v} \sum_{l=S_{i}}^{E_{i}} (l+L_{i,v}-1) * y_{i,v,l,(l+L_{i,v}-1)} - \sum_{v} \sum_{l=S_{j}}^{E_{j}} l * y_{j,v,l,(l+L_{j,v}-1)} \le -1 \tag{17}$$
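Read in words, Eqn. 17 says that the last control step occupied by a predecessor $o_i$ must come at least one step before the first control step of its successor $o_j$. The checker below is a minimal sketch of that reading for an already-chosen schedule; the `Schedule` layout, the function name and the sample DFG are hypothetical.

```python
from typing import Dict, List, Tuple

# schedule[op] = (start_step, latency_in_steps); an operation occupies
# control steps start .. start + latency - 1.
Schedule = Dict[str, Tuple[int, int]]


def precedence_violations(schedule: Schedule,
                          edges: List[Tuple[str, str]]) -> List[Tuple[str, str]]:
    """Return every edge (i, j) where o_j starts before o_i has finished.

    Eqn. 17 requires (last step of o_i) - (first step of o_j) <= -1.
    """
    bad = []
    for pred, succ in edges:
        start_i, lat_i = schedule[pred]
        start_j, _ = schedule[succ]
        last_i = start_i + lat_i - 1
        if last_i - start_j > -1:
            bad.append((pred, succ))
    return bad


# Hypothetical three-operation DFG: m1 -> a1 and m2 -> a1.
edges = [("m1", "a1"), ("m2", "a1")]
ok = {"m1": (1, 2), "m2": (1, 1), "a1": (3, 1)}    # m1 multicycled over steps 1-2
bad = {"m1": (1, 2), "m2": (1, 1), "a1": (2, 1)}   # a1 overlaps m1's second step
```

In the second schedule only the edge from `m1` is reported, because the multicycled `m1` is still executing in step 2 while the single-cycle `m2` has already finished.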

(d) <u>Resource Constraints</u>: These constraints ensure that no control step needs more $F_{k,v}$ units than are available $(M_{k,v})$, and are enforced as, $\forall v$ and $\forall l$, $1 \leq l \leq N$,

$$\sum_{i \in F_{k,v}} \sum_{l} y_{i,v,l,(l+L_{i,v}-1)} \le M_{k,v} \tag{18}$$
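The intent of the resource constraints can be illustrated by counting, for a given binding, how many units of each type and voltage are busy in each control step. This is a sketch only: the `Binding` layout and function names are hypothetical, and it assumes that a multicycled operation keeps its functional unit busy in every step it spans.

```python
from collections import Counter
from typing import Dict, Tuple

# binding[op] = (fu_type, voltage, start_step, latency_in_steps)
Binding = Dict[str, Tuple[str, float, int, int]]


def resource_usage(binding: Binding) -> Counter:
    """Count how many F_{k,v} instances are busy in every control step.

    Assumption: a multicycled operation occupies its functional unit in
    each of the steps start .. start + latency - 1.
    """
    usage = Counter()
    for fu_type, volt, start, latency in binding.values():
        for step in range(start, start + latency):
            usage[(fu_type, volt, step)] += 1
    return usage


def within_limits(binding: Binding, limits: Dict[Tuple[str, float], int]) -> bool:
    """True if no step needs more F_{k,v} than the M_{k,v} given in limits."""
    return all(count <= limits.get((k, v), 0)
               for (k, v, _step), count in resource_usage(binding).items())


# Hypothetical binding: two multiplications at 3.3 V (two steps each)
# and one addition at 5.0 V.
binding = {
    "m1": ("mult", 3.3, 1, 2),
    "m2": ("mult", 3.3, 2, 2),
    "a1": ("alu", 5.0, 4, 1),
}
fits_two = within_limits(binding, {("mult", 3.3): 2, ("alu", 5.0): 1})
fits_one = within_limits(binding, {("mult", 3.3): 1, ("alu", 5.0): 1})
```

Here the two multiplications overlap in step 2, so the binding fits with two 3.3 V multipliers but not with one.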

(e) <u>Peak Power Constraints</u> : To eliminate the fractional non-linearity these constraints are used. These constraints ensure that the maximum power consumption of the DFG does not exceed  $P_p$  for any control step. We enforce these constraints as follows,  $\forall l, 1 \leq l \leq N$ ,

$$\sum_{i \in F_{k,v}} \sum_{v} y_{i,v,l,(l+L_{i,v}-1)} * P(i,v,f) \le P_p \tag{19}$$
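To make the peak power constraints concrete, the sketch below builds the per-step power profile of a given schedule and tests it against a bound $P_p$. The data layout, function names and power values are hypothetical, and the occupancy model (an operation draws its $P(i,v,f)$ in every control step it spans) is an assumption of this sketch rather than a statement of the formulation above.

```python
from typing import Dict, List, Tuple

# binding[op] = (start_step, latency_in_steps, power_mW), where power_mW
# plays the role of P(i, v, f) for the unit and voltage chosen for the op.
Binding = Dict[str, Tuple[int, int, float]]


def power_profile(binding: Binding, n_steps: int) -> List[float]:
    """Per-step power, assuming an op draws P(i, v, f) in every step it occupies."""
    profile = [0.0] * n_steps
    for start, latency, power in binding.values():
        for step in range(start, start + latency):
            profile[step - 1] += power   # control steps are numbered from 1
    return profile


def meets_peak(binding: Binding, n_steps: int, p_peak: float) -> bool:
    """True if no control step exceeds the peak power bound P_p."""
    return max(power_profile(binding, n_steps)) <= p_peak


# Hypothetical power numbers in mW.
binding = {"m1": (1, 2, 8.0), "m2": (2, 2, 8.0), "a1": (4, 1, 3.0)}
profile = power_profile(binding, 4)
```

With these numbers the profile peaks in step 2, where both multiplications are active, so a bound of 16 mW is met while a bound of 15 mW is not.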

(f) <u>Modified Peak Power Constraints</u>: To eliminate the non-linearity introduced by the absolute function, we modify the above peak power constraints (as outlined in Eqns. 13 to 15, [9]) to, $\forall l, 1 \leq l \leq N$,

$$\frac{1}{N} \sum_{l} \sum_{i \in F_{k,v}} \sum_{v} y_{i,v,l,(l+L_{i,v}-1)} * P(i,v,f) - \sum_{i \in F_{k,v}} \sum_{v} y_{i,v,l,(l+L_{i,v}-1)} * P(i,v,f) \le P_p^* \tag{20}$$

Here $P_p^*$ is a modified peak constraint, which is added to the objective function and minimized along with it.

## 4. SCHEDULING ALGORITHM

The target architecture model assumed by the scheduling schemes is the same as the one used in [3]. Each functional unit has one register and one multiplexor. The register and the multiplexor operate at the same voltage level as the functional unit. Level converters are used when a low-voltage functional unit drives a high-voltage functional unit. A controller decides which functional units are active in each control step, and those that are not active are disabled using the multiplexors. The ILP-based scheduling scheme using multiple voltages and multicycling is outlined below.

- Step 2: Determine the mobility for each node.
- Step 3: Modify mobility graph for multicycling.
- Step 4: Construct ILP formulations for the DFG.
- Step 5: Solve ILP formulations using LP-Solve.
- Step 6: Obtain the scheduled DFG.
- Step 7: Estimate the power, energy and delay.

The inputs to the algorithm are an unscheduled data flow graph (UDFG), the resource constraints, the allowable voltage levels, the delay of each resource, and the switching capacitance of each resource. The resource constraints include the number of ALUs and multipliers at different voltage levels. The scheduling algorithm determines the proper time stamp and voltage level for each operation such that the function $CPFMC^*$ is minimum.
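The mobility step can be sketched from the ASAP and ALAP time stamps $S_i$ and $E_i$ used in the ILP notation. The code below is an illustrative sketch, not the authors' implementation: it assumes unit-latency operations (before the multicycling adjustment), a DFG given as a predecessor map listed in topological order, and a hypothetical five-operation graph.

```python
from typing import Dict, List, Tuple


def asap_alap(preds: Dict[str, List[str]],
              n_steps: int) -> Tuple[Dict[str, int], Dict[str, int]]:
    """Return (S, E): ASAP and ALAP control steps for unit-latency operations.

    preds maps each operation to its list of predecessors; the keys are
    assumed to be listed in topological order.
    """
    order = list(preds)
    succs: Dict[str, List[str]] = {op: [] for op in order}
    for op, ps in preds.items():
        for p in ps:
            succs[p].append(op)

    asap: Dict[str, int] = {}
    for op in order:                      # forward pass: as soon as possible
        asap[op] = 1 + max((asap[p] for p in preds[op]), default=0)

    alap: Dict[str, int] = {}
    for op in reversed(order):            # backward pass: as late as possible
        alap[op] = min((alap[s] for s in succs[op]), default=n_steps + 1) - 1
    return asap, alap


# Hypothetical DFG: m1 and m2 feed a1, a1 feeds a2, and a3 is independent.
dfg = {"m1": [], "m2": [], "a3": [], "a1": ["m1", "m2"], "a2": ["a1"]}
S, E = asap_alap(dfg, n_steps=4)
mobility = {op: E[op] - S[op] for op in dfg}
```

With four control steps, the chain `m1`/`m2` → `a1` → `a2` has one step of slack, while the independent `a3` can be placed in any of the four steps.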

## 5. RESULTS AND CONCLUSIONS

The scheduling scheme is tested for the same benchmarks using the same characterised datapath cells as in [7]. Following are the notations used to express the results.

| Symbol            | Description                                                                                                           |
|-------------------|-----------------------------------------------------------------------------------------------------------------------|
| $S$               | single voltage operation                                                                                              |
| $MC$              | multiple voltages and multicycling                                                                                    |
| $P_{pS}, P_{pMC}$ | peak power consumption                                                                                                |
| $P_{mS}, P_{mMC}$ | minimum power consumption                                                                                             |
| $P_S, P_{MC}$     | average power consumption                                                                                             |
| $T_S, T_{MC}$     | critical path delay (ns)                                                                                              |
| $E_S, E_{MC}$     | total energy consumption (nJ)                                                                                         |
| $EDP_S$           | $E_S * T_S$ ($10^{-18}$ Js)                                                                                           |
| $EDP_{MC}$        | $E_{MC} * T_{MC}$ ($10^{-18}$ Js)                                                                                     |
| $\Delta P_p$      | reduction in $P_p$: $\frac{P_{pS} - P_{pMC}}{P_{pS}} * 100$                                                           |
| $\Delta DP$       | reduction in differential power: $\frac{(P_{pS} - P_{mS}) - (P_{pMC} - P_{mMC})}{(P_{pS} - P_{mS})} * 100$            |
| $\Delta P$        | reduction in $P$: $\frac{P_S - P_{MC}}{P_S} * 100$                                                                    |
| $\Delta E$        | reduction in $E$: $\frac{E_S - E_{MC}}{E_S} * 100$                                                                    |
| $\Delta EDP$      | reduction in EDP: $\frac{EDP_S - EDP_{MC}}{EDP_S} * 100$                                                              |
The sets of resource constraints used are given below.

| RC | Multipliers                 | ALUs                        |
|----|-----------------------------|-----------------------------|
| 1  | 2 at 3.3V and 1 at 5.0V     | 1 at 3.3V and 1 at 5.0V     |
| 2  | 3 at 3.3V                   | 1 at 3.3V and 1 at 5.0V     |
| 3  | 2 at 3.3V                   | 2 at 5.0V                   |
| 4  | 1 at 3.3V and 1 at 5.0V     | 1 at 5.0V                   |
| 5  | 2 at 3.3V                   | 1 at 5.0V                   |

The experimental results for various benchmarks are reported in Table 1. The power and energy estimates include the power consumption of the overheads. It is assumed that each resource has an equal switching activity of 0.5. From the experimental results it is evident that significant energy and power reductions could be achieved for all the benchmarks and resource constraints. There are no peak power reductions for resource constraint RC4 in the case of the EXP and ARF benchmarks. The scheduling scheme did not degrade the performance of the datapath circuit, as shown by the fact that the power and energy reductions are accompanied by reductions in the energy-delay product.

| Benchmark                   | RC | $P_{pS}$ (mW) | $P_{pMC}$ (mW) | $\Delta P_p$ (%) | $P_{mS}$ (mW) | $P_{mMC}$ (mW) | $\Delta DP$ (%) | $P_S$ (mW) | $P_{MC}$ (mW) | $\Delta P$ (%) | $E_S$ (nJ) | $E_{MC}$ (nJ) | $\Delta E$ (%) | $\Delta EDP$ (%) |
|-----------------------------|----|---------------|----------------|------------------|---------------|----------------|-----------------|------------|---------------|----------------|------------|---------------|----------------|------------------|
| (1) EXP                     | 1  | 79.3          | 56.9           | 28.2             | 2.0           | 1.4            | 28.2            | 40.7       | 27.6          | 32.0           | 6.7        | 4.2           | 37.6           | 16.8             |
| (1) EXP                     | 2  | 79.3          | 51.8           | 34.6             | 2.0           | 1.4            | 34.8            | 40.7       | 26.4          | 35.1           | 6.7        | 2.9           | 55.9           | 41.2             |
| (1) EXP                     | 3  | 79.3          | 34.5           | 56.4             | 2.0           | 2.0            | 57.9            | 40.7       | 21.3          | 47.5           | 6.7        | 3.0           | 55.0           | 25.0             |
| (1) EXP                     | 4  | 40.7          | 57.9           | 0                | 1.0           | 1.0            | 0               | 30.5       | 29.2          | 4.2            | 6.7        | 5.5           | 18.3           | 18.3             |
| (1) EXP                     | 5  | 79.3          | 35.6           | 55.1             | 1.0           | 1.0            | 55.8            | 30.5       | 21.3          | 30.0           | 6.7        | 3.0           | 55.0           | 43.7             |
| Average values              |    |               |                | 43.6             |               |                | 35.3            |            |               | 29.8           |            |               | 44.4           | 29.0             |
| (2) FIR                     | 1  | 80.3          | 74.2           | 7.6              | 1.0           | 1.0            | 7.7             | 40.3       | 30.3          | 24.8           | 11.2       | 6.2           | 44.2           | 33.1             |
| (2) FIR                     | 2  | 118.9         | 51.8           | 56.4             | 1.0           | 0.4            | 56.4            | 40.5       | 29.1          | 28.1           | 11.2       | 4.9           | 56.4           | 47.7             |
| (2) FIR                     | 3  | 80.3          | 35.5           | 55.7             | 1.0           | 1.0            | 56.4            | 40.5       | 25.2          | 37.5           | 11.2       | 5.0           | 55.3           | 37.4             |
| (2) FIR                     | 4  | 79.3          | 57.9           | 26.9             | 1.0           | 1.0            | 27.3            | 40.5       | 32.0          | 20.8           | 11.2       | 8.7           | 22.1           | 6.5              |
| (2) FIR                     | 5  | 80.3          | 35.5           | 55.7             | 1.0           | 1.0            | 56.4            | 40.5       | 25.2          | 37.5           | 11.2       | 5.0           | 55.3           | 37.4             |
| Average values              |    |               |                | 40.5             |               |                | 40.9            |            |               | 29.7           |            |               | 29.2           | 32.4             |
| (3) HAL                     | 1  | 80.3          | 74.2           | 7.6              | 2.0           | 1.5            | 7.8             | 60.7       | 36.7          | 39.5           | 13.5       | 8.4           | 37.8           | 6.6              |
| (3) HAL                     | 2  | 119.9         | 52.2           | 56.4             | 2.0           | 1.5            | 56.9            | 60.7       | 35.0          | 42.3           | 13.5       | 6.0           | 55.5           | 33.2             |
| (3) HAL                     | 3  | 81.3          | 36.6           | 55.0             | 2.0           | 2.0            | 56.4            | 60.7       | 30.3          | 50.0           | 13.5       | 6.0           | 55.2           | 21.6             |
| (3) HAL                     | 4  | 80.3          | 57.9           | 27.9             | 1.0           | 1.0            | 28.2            | 48.6       | 38.8          | 20.2           | 13.5       | 11.0          | 18.4           | 2.1              |
| (3) HAL                     | 5  | 80.3          | 35.5           | 55.7             | 1.0           | 1.0            | 56.4            | 48.6       | 26.5          | 45.3           | 13.5       | 6.0           | 55.2           | 28.3             |
| Average values              |    |               |                | 40.5             |               |                | 41.1            |            |               | 39.5           |            |               | 44.4           | 18.4             |
| (4) IIR                     | 1  | 118.9         | 74.2           | 37.6             | 1.0           | 0.4            | 37.4            | 50.6       | 38.0          | 24.7           | 11.2       | 8.6           | 23.0           | 3.8              |
| (4) IIR                     | 2  | 118.9         | 52.2           | 56.0             | 1.0           | 0.4            | 56.0            | 50.6       | 29.1          | 42.5           | 11.2       | 4.9           | 56.4           | 34.6             |
| (4) IIR                     | 3  | 80.3          | 34.5           | 57.0             | 1.0           | 1.0            | 57.7            | 40.5       | 22.1          | 45.5           | 11.2       | 5.0           | 55.3           | 28.4             |
| (4) IIR                     | 4  | 80.3          | 57.9           | 27.9             | 1.0           | 1.0            | 28.2            | 40.5       | 28.3          | 30.0           | 11.2       | 8.7           | 22.1           | 6.5              |
| (4) IIR                     | 5  | 80.3          | 35.5           | 55.7             | 1.0           | 1.0            | 56.4            | 40.5       | 22.1          | 45.3           | 11.2       | 5.0           | 55.3           | 64.2             |
| Average values              |    |               |                | 46.8             |               |                | 47.1            |            |               | 37.6           |            |               | 42.4           | 27.5             |
| (5) ARF                     | 1  | 40.7          | 35.0           | 13.9             | 1.0           | 0.4            | 12.8            | 20.6       | 12.2          | 40.7           | 11.5       | 5.0           | 56.4           | 43.3             |
| (5) ARF                     | 2  | 40.7          | 35.0           | 13.9             | 1.0           | 0.4            | 12.8            | 20.6       | 12.2          | 40.7           | 11.5       | 5.0           | 56.4           | 43.3             |
| (5) ARF                     | 3  | 40.7          | 35.5           | 12.5             | 1.0           | 1.0            | 12.8            | 20.6       | 13.9          | 32.5           | 11.5       | 5.2           | 54.2           | 40.4             |
| (5) ARF                     | 4  | 40.7          | 57.9           | 0                | 1.0           | 1.0            | 0               | 20.6       | 14.3          | 30.6           | 11.5       | 6.4           | 43.3           | 26.4             |
| (5) ARF                     | 5  | 40.7          | 35.5           | 12.5             | 1.0           | 1.03           | 12.8            | 20.6       | 13.9          | 32.5           | 11.5       | 5.2           | 54.2           | 40.4             |
| Average values              |    |               |                | 10.6             |               |                | 10.2            |            |               | 35.4           |            |               | 52.9           | 38.7             |
| Average over all benchmarks |    |               |                | 36.4             |               |                | 34.9            |            |               | 34.4           |            |               | 42.7           | 29.2             |

Table 1. Power, energy and EDP estimates for benchmarks

The $CPFMC^*$ parameter defined and used in this work facilitates simultaneous optimization of energy and transient power using ILP formulations. The datapath scheduling algorithms described are useful for synthesizing data-intensive ASICs. To keep track of the effect of the scheduling algorithms on circuit performance, we estimated the EDP for the scheduled DFGs. The scheduling algorithm does not consider exact switching activity for power or energy estimation. The scheduling scheme needs to be extended to consider pipelined datapaths.

## 6. REFERENCES

- [1] S. P. Mohanty and N. Ranganathan, "A framework for energy and transient power reduction during behavioral synthesis," in *Proc. of Intl. Conf. on VLSI Design*, Jan 2003, pp. 539–545.
- [2] J. M. Chang and M. Pedram, "Energy minimization using multiple supply voltages," *IEEE Trans. on VLSI Systems*, vol. 5, no. 4, pp. 436–443, Dec 1997.
- [3] M. Johnson and K. Roy, "Datapath scheduling with multiple supply voltages and level converters," *ACM Trans. on Design Automation of Electronic Systems*, vol. 2, no. 3, pp. 227–248, July 1997.
- [4] R. S. Martin and J. P. Knight, "Optimizing power in ASIC behavioral synthesis," *IEEE Design & Test of Computers*, vol. 13, no. 2, pp. 58–70, Summer 1996.
- [5] W. T. Shiue, "High level synthesis for peak power minimization using ILP," in *Proc. of IEEE International Conference on Application Specific Systems, Architectures and Processors*, 2000, pp. 103–112.
- [6] W. T. Shiue and C. Chakrabarti, "ILP based scheme for low power scheduling and resource binding," in *Proc. of ISCAS*, 2000, pp. III.279–III.282.
- [7] S. P. Mohanty, N. Ranganathan, and S. K. Chappidi, "Peak Power Minimization Through Datapath Scheduling," in *Proceedings of the IEEE Computer Society Annual Symposium on VLSI*, Feb 2003, pp. 121–126.
- [8] V. Raghunathan, S. Ravi, A. Raghunathan, and G. Lakshminarayana, "Transient power management through high level synthesis," in *Proc. of ICCAD*, 2001, pp. 545–552.
- [9] B. A. McCarl and T. H. Spreen, *Applied Mathematical Programming using Algebraic Systems*, Online Book at: http://agecon.tamu.edu/faculty/mccarl/regbook.htm, 1997.