# A DOE-ILP Assisted Conjugate-Gradient Based Power and Stability Optimization in High-K Nano-CMOS SRAM

Garima Thakral, Saraju P. Mohanty, and Dhruva Ghai Department of Computer Science & Engineering University of North Texas Denton, TX 76203. saraju.mohanty@unt.edu

# ABSTRACT

In this paper, a novel design flow is presented for power minimization of nano-CMOS SRAM (static random access memory) circuits, while maintaining their performance. A 32nm high-$\kappa$/metal-gate SRAM is used as an example circuit. The baseline SRAM circuit is subjected to power minimization using a dual-$V_{Th}$ assignment based on a novel Design of Experiments-Integer Linear Programming (DOE-ILP) approach. However, this leads to a 15% reduction in the Static Noise Margin (SNM) of the SRAM, which indicates degraded stability. This reduction in the SNM is then overcome using a conjugate gradient optimization, while maintaining the minimum power consumption. The final SRAM design shows an 86% reduction in power consumption (including leakage) and an 8% increase in the SNM compared to the baseline design. A variability analysis of the optimized cell is carried out, considering variability in 12 parameters, to study the robustness of the optimal SRAM circuit. An $8 \times 8$ array is constructed to show the feasibility of the proposed SRAM.

## **Categories and Subject Descriptors**

B.7.1 [**Integrated Circuits**]: Types and Design Styles—VLSI (very large scale integration)

### **General Terms**

Design, Optimization

## **Keywords**

SRAM, Nano-CMOS, Power, Leakage, Static Noise Margin

## 1. INTRODUCTION

The increasing complexity of integrated circuits has kept power dissipation an unresolved issue, especially for battery-powered portable applications. In processor-based systems-on-chip (SoCs), the memories occupy an increasing part of the area budget and are the main contributors to power dissipation. The trend in nanoscale technologies is towards an increased contribution of static power consumption, which is a major problem for the most frequent Static Random Access Memory (SRAM) application, the cache memories. This increased contribution becomes extremely important in applications with long idle modes, as in wireless micro sensor systems, in which the standby period is much longer than the active mode.

GLSVLSI'10, May 16-18, 2010, Providence, Rhode Island, USA.

Copyright 2010 ACM 978-1-4503-0012-4/10/06 ...\$5.00.

Dhiraj K. Pradhan Department of Computer Science University of Bristol, UK. pradhan@compsci.bristol.ac.uk

Stability is an important concern for embedded SRAMs. The SNM serves as a figure of merit for stability of SRAMs. It is a challenge to maintain an acceptable SNM in embedded SRAMs while scaling the minimum feature sizes and supply voltages of the SoCs. Process variation is critical for the nanoscale technologies. Precise control of the process parameters is difficult and the increased process variations are translated into a wider distribution of transistor and circuit characteristics. Any asymmetry in the SRAM cell structure due to process variations renders the affected cells less stable. Under adverse operating conditions such SRAMs may inadvertently flip and corrupt the stored data.

The novel contributions of this paper are as follows:

(1) A novel design flow is proposed for power minimization and stability maximization in nano-CMOS SRAM circuits.

(2) A high-$\kappa$/metal-gate 32nm 10-transistor SRAM is subjected to this methodology to show its effectiveness.

(3) A novel DOE-ILP based approach is proposed for power minimization in a SRAM circuit.

(4) A conjugate-gradient based algorithm is proposed for SNM maximization of the SRAM.

(5) Process variation analysis is performed to study the robustness of the SRAM.

(6) An  $8 \times 8$  array is constructed using optimal SRAM cells.

The paper is organized as follows: Related research is discussed in Section 2. Section 3 discusses the proposed flow. The SRAM is discussed in Section 4. Section 5 presents the DOE-ILP based power minimization. Section 6 describes the SNM maximization using the conjugate-gradient approach. Section 7 studies the effect of process variation. Conclusions are presented in Section 8.

## 2. PRIOR RESEARCH IN SRAM

In [2], a dual-$V_{Th}$/dual-$T_{ox}$ method is presented for low-power SRAM. In [1], an SRAM stability analysis method in the presence of random parameter fluctuations is proposed. A 9-transistor SRAM that is optimal in read stability and power consumption is proposed in [8, 7]. A Schmitt-trigger based SRAM proposed in [6] provides better read stability and write-ability while achieving process variation tolerance. In [11], a 10-transistor SRAM is presented which is tolerant to process variation induced read failure. A 10-transistor SRAM with low-voltage operation and faster readout is proposed in [9]. In [14], a DOE-ILP based methodology is proposed for dual-$V_{Th}$ assignment in a 7-transistor SRAM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.

## 3. PROPOSED FLOW FOR OPTIMAL DESIGN OF HIGH-κ NANO-CMOS SRAM

The design flow shown in Fig. 1 aims at power minimization (including leakage) of the SRAM while maximizing its stability. The input to the flow is a baseline SRAM design.

The figures of merit under consideration are measured for the baseline design. A dual-$V_{Th}$ technique is considered because it has a strong impact on the power dissipation and SNM of the SRAM.

The dual-$V_{Th}$ assignment is performed using a DOE-ILP based approach. DOE is used because it is an efficient way to understand the relationship between input factors and a response [10]. ILP is then used to determine the settings of the input factors that optimize the response: it solves the resulting linear problem and yields the minimum-power SRAM cell configuration. However, this results in degradation of the stability (SNM) of the SRAM.

To improve the stability of the SRAM, the minimum-power configuration SRAM is subjected to a conjugate-gradient based optimization loop for SNM maximization. The conjugate gradient method is chosen as an efficient method for the target objectives. It is suitable for handling linear systems arising from difference equations that approximate boundary value problems [4], and it has two useful properties. First, at each step of the algorithm an estimate of the solution is obtained that improves on the one from the preceding step. Second, at any step the procedure can be started anew by a simple device, keeping the latest estimate as the initial estimate.

The parameter set for optimization includes the widths and lengths of the access, load and driver transistors of the SRAM cell. The output of this optimization loop is a highly stable SRAM cell, which consumes minimum power.



Figure 1: The proposed design flow for optimal SRAM.

## 4. HIGH- $\kappa$ BASED 10-TRANSISTOR SRAM

The baseline 10-transistor SRAM cell is shown in Fig. 2 [11]. This topology is tolerant to read failures induced by nano-CMOS process variations [11]. The SRAM cell is composed of two inverters connected back to back in a closed loop in order to store 1 bit of information. Three transmission gates implement the read, write, and hold states, in place of the access transistors used in the traditional 6-transistor SRAM. The transmission gates pass data to and from the cell node Q at full logic level, which provides full swing during write and read operations. A sense amplifier handles small signals well but has difficulty in handling threshold voltages. Because the 10-transistor SRAM operates at full swing, it eliminates the sense amplifier and the circuitry for pre-charging the bit and bit-bar lines prior to read and write operations, which leads to an area-efficient cache.



Figure 2: A ten transistor SRAM cell.

#### 4.1 Operations of the 10-transistor SRAM

The 10-transistor SRAM cell initiates the read operation through the read nodes (Read and $\overline{Read}$). In the read '1' operation, the Read node turns on the NMOS transistor of the transmission gate, which provides a path from Q to the Data out node. The Read node goes high, and so does node Q as it is connected to the path. During the read operation, transmission gates 9 and 10 are in the 'ON' state and thus carry active-drain current. The transmission gate on the read side is also 'ON' and hence carries active-drain current. Transistors 4 and 5 are 'ON' and carry active-drain current, whereas transistors 3 and 6 are 'OFF' and carry subthreshold current, as shown in Fig. 3(a). During the read '0' operation (shown in Fig. 3(b)), transistors 3 and 6 are 'ON' and carry active-drain current, whereas transistors 4 and 5 are 'OFF' and carry subthreshold current. Subthreshold current flows through the transmission gate at the write node, and the transmission gates on the read side carry active-drain current. The write '1' and write '0' operations can be analyzed similarly; they are not discussed due to lack of space.

#### 4.2 High-κ Nano-CMOS SRAM Models

For the design and simulation of the SRAM presented in this paper, a 32nm high-$\kappa$/metal-gate CMOS PTM [15] is used. In the absence of published data and device models, the PTM provides timely and effective analysis. The simulation results obtained are highly accurate, and the calculated data are of comparable accuracy to TCAD simulations, which are typically time and computation intensive. For the PTM based on BSIM4/5, two steps are adopted:



(a) Current for read 1



(b) Current for read 0

Figure 3: Current paths in the 10-transistor SRAM.

(1) The model parameter in the model file that denotes relative permittivity (EPSROX) is changed.

(2) The equivalent oxide thickness (EOT) for the dielectric under consideration is calculated.

Using these steps, the EOT is calculated so as to keep the ratio of relative permittivity over dielectric thickness constant. The EOT is calculated by using the following expression:

$$T_{ox}^* = \left(\frac{\kappa_{SiO_2}}{\kappa_{gate}}\right) \times T_{gate},\tag{1}$$

where $\kappa_{gate}$ is the relative permittivity and $T_{gate}$ is the thickness of the gate dielectric material other than SiO<sub>2</sub>, while $\kappa_{SiO_2}$ is the dielectric constant of SiO<sub>2</sub> (= 3.9). We have taken $\kappa_{gate} = 21$ to emulate a HfO<sub>2</sub> dielectric. The EOT is calculated to be 0.9 nm.
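As a quick numerical check of Eq. 1, the short sketch below evaluates the expression in both directions. The function names are illustrative, and the physical thickness is not stated in the text; it is simply back-calculated here from the quoted EOT of 0.9 nm and $\kappa_{gate} = 21$, which gives roughly 4.85 nm.

```python
K_SIO2 = 3.9  # relative permittivity of SiO2, as stated in the text


def equivalent_oxide_thickness(k_gate: float, t_gate_nm: float) -> float:
    """Eq. (1): EOT = (k_SiO2 / k_gate) * T_gate, in nm."""
    return (K_SIO2 / k_gate) * t_gate_nm


def physical_thickness(k_gate: float, eot_nm: float) -> float:
    """Inverse of Eq. (1): physical dielectric thickness for a given EOT, in nm."""
    return (k_gate / K_SIO2) * eot_nm


# Back-calculated (not given in the paper): thickness implied by EOT = 0.9 nm.
t_gate = physical_thickness(21.0, 0.9)
eot_check = equivalent_oxide_thickness(21.0, t_gate)
print(f"T_gate implied by EOT = 0.9 nm: {t_gate:.2f} nm")
print(f"EOT recomputed from T_gate:     {eot_check:.2f} nm")
```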

The total power of a nano-CMOS circuit is defined as the summation of dynamic power and subthreshold leakage. The use of high- $\kappa$  metal-gate technology eliminates the gate leakage in SRAM. The power dissipation is calculated by using the following:

$$P_{total} = P_{dynamic} + P_{subthreshold},\tag{2}$$

where  $P_{dynamic}$  is the dynamic power consumed by the transistors and  $P_{subthreshold}$  is the subthreshold leakage.

The simulation setup for SNM measurement is as follows. Two equal voltage sources $V_N$ with opposite polarity are applied between the two inverters of the SRAM cell. These voltages are varied from 0 to 0.5V or more until the cell data flips. The voltage sources $V_N$ are the static noise sources. Since the SRAM cell holds the data and its complement, the cell nodes are considered to carry lower noise than during read access, also because the noise sources are isolated from the data in and out nodes. The value of SNM obtained is the worst case. The value of $V_N$ at which the node voltages change the cell logic state is called the SNM of the memory cell. Equivalently, SNM is defined as the side length of the largest square that fits inside the smaller lobe of the butterfly curve [2].

Table 1: Results of the baseline SRAM.

| Parameter                | Value          |
|--------------------------|----------------|
| Average power $P_{SRAM}$ | $2.27 \ \mu W$ |
| $SNM_{SRAM}$             | 271 mV         |

Figure 4: Pareto plot for SRAM power.

For the supply voltage  $V_{dd}$  of 0.7V, the experimental results for the baseline 10-transistor SRAM are presented in Table 1. The butterfly curve for SNM measurement is shown in Fig. 6(a).

## 5. DOE-ILP APPROACH FOR MINIMUM POWER/LEAKAGE CONFIGURATION

An approach that uses both DOE and ILP is deployed for power minimization of the SRAM. The baseline 10-transistor SRAM is first subjected to a Design of Experiments [3, 5] based approach using a 2-level Taguchi L-12 array. The factors are the $V_{Th}$ states of the 10 transistors of the SRAM cell (Fig. 2), and the response under consideration is the average power consumption of the cell ($f_{P_{SRAM}}$). Each factor can take a high-$V_{Th}$ state (+1) or a nominal-$V_{Th}$ state (-1). The Taguchi approach is used to reduce the run time of our experiments. For example, a full factorial design would take $2^{10} = 1024$ runs, whereas the L-12 Taguchi array requires 12 runs. Taguchi arrays are also used for screening critical factors, which will be considered in our future research, where a large number of factors such as the transistor sizes will be considered along with the $V_{Th}$ states. After running the experiments, the half-effects are recorded using the following expression:

$$\left(\frac{\Delta(n)}{2}\right) = \left(\frac{avg(+1) - avg(-1)}{2}\right),\tag{3}$$

where $\left(\frac{\Delta(n)}{2}\right)$ is the half-effect of the *n*th transistor, avg(+1) is the average power when transistor *n* is in the high-$V_{Th}$ state, and avg(-1) is the average power when it is in the nominal-$V_{Th}$ state. Fig. 4 shows a Pareto plot constructed using these half-effects.
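The half-effect computation of Eq. 3 can be sketched as follows. This is an illustration under stated assumptions, not the authors' tooling: the 12-run array is built from the standard Plackett-Burman generator row as a stand-in for the L-12 array, and the responses are synthetic values generated from the coefficients reported later in Eq. 5 rather than circuit simulation results.

```python
# Generator row of the 12-run Plackett-Burman design; used here as an
# assumed stand-in for the two-level L-12 array (not taken from the paper).
GENERATOR = [1, 1, -1, 1, 1, 1, -1, -1, -1, 1, -1]


def l12_design(n_factors=10):
    """Return 12 runs x n_factors coded levels (+1 high Vth, -1 nominal Vth)."""
    rows = [GENERATOR[-s:] + GENERATOR[:-s] for s in range(11)]  # cyclic shifts
    rows.append([-1] * 11)                                       # all-nominal run
    return [row[:n_factors] for row in rows]


def half_effects(design, responses):
    """Eq. (3): (avg(+1) - avg(-1)) / 2 for each factor column."""
    effects = []
    for n in range(len(design[0])):
        high = [y for row, y in zip(design, responses) if row[n] == +1]
        low = [y for row, y in zip(design, responses) if row[n] == -1]
        effects.append((sum(high) / len(high) - sum(low) / len(low)) / 2.0)
    return effects


# Synthetic responses (nW) generated from the coefficients of Eq. (5);
# in the actual flow each response would come from one circuit simulation.
MEAN_NW = 2192.4
COEFFS = [223.9, 243.7, 902.8, -1352.5, 211.9, -29.2, -179.1, 92.6, -128.2, -170.72]

design = l12_design()
responses = [MEAN_NW + sum(c * x for c, x in zip(COEFFS, row)) for row in design]
recovered = half_effects(design, responses)
for n, h in enumerate(recovered, start=1):
    print(f"transistor {n:2d}: half-effect = {h:9.2f} nW")
```

Because the columns of the array are balanced and mutually orthogonal, the 12 runs are enough to recover each half-effect of a purely linear response; sorting the magnitudes of `recovered` gives the ordering used in a Pareto plot.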



Figure 5: Minimum power configuration SRAM cell. The circled transistors are high  $V_{Th}$  transistors.

From this data, a predictive equation of the following form is obtained:

$$\widehat{f_{P_{SRAM}}} = \overline{f_{P_{SRAM}}} + \sum_{n=1}^{10} \left(\frac{\Delta(n)}{2} \times x_n\right),\tag{4}$$

where $\widehat{f_{P_{SRAM}}}$ is the predicted response, $\overline{f_{P_{SRAM}}}$ is the average response, $\left(\frac{\Delta(n)}{2}\right)$ is the half-effect of the *n*th transistor, and $x_n$ is the $V_{Th}$ state of the *n*th transistor. The predictive equation obtained for power is the following:

$$\begin{aligned} \widehat{f_{P_{SRAM}}}\,(nW) = {} & 2192.4 + 223.9 \times x_1 + 243.7 \times x_2 + 902.8 \times x_3 - 1352.5 \times x_4 + 211.9 \times x_5 \\ & - 29.2 \times x_6 - 179.1 \times x_7 + 92.6 \times x_8 - 128.2 \times x_9 - 170.72 \times x_{10}, \end{aligned}\tag{5}$$

where $x_1$ represents the $V_{Th}$ state of transistor 1 (Fig. 2), $x_2$ represents the $V_{Th}$ state of transistor 2, and so on. From this, an ILP problem is formulated as follows:

$$\begin{array}{ll} \min & f_{P_{SRAM}} \\ \text{s.t.} & -1 \leq x_1 \leq +1, -1 \leq x_2 \leq +1, \\ & -1 \leq x_3 \leq +1, -1 \leq x_4 \leq +1, \\ & -1 \leq x_5 \leq +1, -1 \leq x_6 \leq +1, \\ & -1 \leq x_7 \leq +1, -1 \leq x_8 \leq +1, \\ & -1 \leq x_9 \leq +1, -1 \leq x_{10} \leq +1, \end{array}\tag{6}$$

where the values '+1' and '-1' represent the coded values for the high-$V_{Th}$ and nominal-$V_{Th}$ states, respectively. ILP is applied here to a small circuit, but the methodology is automated and hence can be used for larger circuits. We form the predictive equations for power ($f_{PWR}$) and RSNM ($f_{RSNM}$) based on the experiments performed on the $V_{Th}$ state (high or nominal) of the transistors in the SRAM cell.

The predictive equations and constraints are linear. Solving the ILP problem therefore gives the optimal solution $P_{SRAM} = [x_1 = 0, x_2 = 0, x_3 = 0, x_4 = 1, x_5 = 0, x_6 = 1, x_7 = 1, x_8 = 0, x_9 = 1, x_{10} = 1]$, where 1 denotes the high-$V_{Th}$ state and 0 the nominal-$V_{Th}$ state. That is, transistors 4, 6, 7, 9, and 10 are high-$V_{Th}$ transistors, and transistors 1, 2, 3, 5, and 8 are nominal-$V_{Th}$ transistors. Fig. 5 shows the SRAM cell with the high-$V_{Th}$ transistors circled. Integer linear programming is a technique for optimizing a linear objective function subject to linear constraints over integer variables, and the predictive function is minimized using this approach.
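For a cell with only 10 two-level factors, the minimization can be cross-checked without a solver. The sketch below is not an ILP solver; it exhaustively enumerates all $2^{10}$ coded assignments of Eq. 5 and picks the one with the lowest predicted power. The variable names are illustrative.

```python
from itertools import product

# Coefficients of the predictive equation (Eq. 5), in nW.
MEAN_NW = 2192.4
COEFFS = [223.9, 243.7, 902.8, -1352.5, 211.9, -29.2, -179.1, 92.6, -128.2, -170.72]


def predicted_power(x):
    """Evaluate Eq. (5) for one coded assignment x (each entry -1 or +1)."""
    return MEAN_NW + sum(c * xi for c, xi in zip(COEFFS, x))


# Exhaustive search over all 1024 assignments (a stand-in for the ILP solve).
best = min(product((-1, +1), repeat=10), key=predicted_power)
high_vth = [n + 1 for n, xi in enumerate(best) if xi == +1]
nominal_vth = [n + 1 for n, xi in enumerate(best) if xi == -1]
print("high-Vth transistors:   ", high_vth)
print("nominal-Vth transistors:", nominal_vth)
```

Since the objective is linear and separable, the minimum simply sets $x_n = +1$ wherever the coefficient is negative and $x_n = -1$ elsewhere, which reproduces the assignment reported above.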

The experimental results obtained from the minimum power configuration are presented in Table 2. It shows an 86.15% power reduction over the baseline design. However, it also results in a 15% degradation in SNM, as is evident from Fig. 6(b).

Table 2: Minimum power configuration results.

| Parameter                | Value        |
|--------------------------|--------------|
| Average power $P_{SRAM}$ | $314.5 \ nW$ |
| $SNM_{SRAM}$             | 230.4 mV     |

Algorithm 1: SNM optimization using conjugate gradient method.

- 1: **Input:** Minimum power configuration SRAM, Baseline model file, High-threshold model file, Objective set $F = [SNM_{SRAM}, P_{SRAM}]$, Stopping criterion $S$, Parameter set $D = [W_{pl}, L_{pl}, W_{nd}, L_{nd}, W_{pa}, L_{pa}, W_{na}, L_{na}]$, Lower parameter constraint $C_{low}$, Upper parameter constraint $C_{up}$.
- 2: **Output:** Optimized objective set $F_{opt}$, Optimal parameter set $D_{opt}$ for $S \le \pm\beta$. {where $1\% \le \beta \le 5\%$}
- 3: Run initial simulation with initial guess of $D$.
- 4: **while** $(C_{low} < D < C_{up})$ **do**
- 5: Use conjugate gradient method to generate new set of parameters $D' = D \pm \delta D$.
- 6: Compute $F = [SNM_{SRAM}, P_{SRAM}]$.
- 7: **if** $(S \le \pm \beta)$ **then**
- 8: **return** $D_{opt} = D'$.
- 9: **end if**
- 10: **end while**
- 11: Using $D_{opt}$, simulate the optimal SRAM.
- 12: Record $F_{opt}$ for the optimal SRAM.

## 6. CONJUGATE-GRADIENT ALGORITHM FOR SNM MAXIMIZATION

The DOE-ILP method achieved the objective of minimum power consumption of the SRAM. To improve the SNM, which is degraded during power optimization, the conjugate gradient method shown in Algorithm 1 is used. The conjugate-gradient method is an approach for the numerical solution of systems of linear equations whose matrix is symmetric and positive-definite, offering the advantages of low memory requirements and high convergence speed. The minimum-power configuration SRAM cell is subjected to conjugate gradient based SNM maximization, where the parameter set takes on different values until the specifications are met [13]. The parameters considered during optimization are as follows: (1) $L_{na}$: NMOS access transistor channel length, (2) $L_{pa}$: PMOS access transistor channel length, (3) $W_{na}$: NMOS access transistor channel width, (4) $W_{pa}$: PMOS access transistor channel width, (5) $L_{nd}$: NMOS driver transistor channel length, (6) $W_{nd}$: NMOS driver transistor channel width, (7) $L_{pl}$: PMOS load transistor channel length, (8) $W_{pl}$: PMOS load transistor channel width.

The algorithm starts with an initial guess of $D$, followed by iterations that improve the guess until the result is close enough to the objective set $F_{opt}$ according to the stopping criterion $S$. $S$ stops the iteration when the objective set is reached within $\pm \epsilon$, where $\epsilon$ is a designer-specified error margin in percent, assumed here to be $\epsilon \leq 5\%$. On satisfying $S$, the algorithm outputs the optimized objective set $F_{opt}$ and the optimal values of the design variable set $D_{opt}$, subject to the upper and lower parameter constraints.
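For reference, the classical linear conjugate-gradient iteration of [4] for a symmetric positive-definite system $Ax = b$ can be sketched in a few lines. This shows only the underlying numerical method; it is not the simulator-in-the-loop procedure of Algorithm 1, and the $2 \times 2$ system used here is a hypothetical example with no relation to the SRAM parameters.

```python
def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))


def matvec(A, v):
    return [dot(row, v) for row in A]


def conjugate_gradient(A, b, tol=1e-12, max_iter=None):
    """Solve A x = b for symmetric positive-definite A, starting from x = 0."""
    n = len(b)
    x = [0.0] * n
    r = list(b)              # residual b - A x for the zero initial guess
    p = list(r)              # first search direction
    rs_old = dot(r, r)
    for _ in range(max_iter or n):
        if rs_old ** 0.5 < tol:          # stopping criterion on the residual norm
            break
        Ap = matvec(A, p)
        alpha = rs_old / dot(p, Ap)      # step length along p
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        rs_new = dot(r, r)
        # New direction, conjugate to the previous ones with respect to A.
        p = [ri + (rs_new / rs_old) * pi for ri, pi in zip(r, p)]
        rs_old = rs_new
    return x


# Hypothetical SPD test system; its exact solution is [1/11, 7/11].
A = [[4.0, 1.0], [1.0, 3.0]]
b = [1.0, 2.0]
x = conjugate_gradient(A, b)
print("solution:", x)
```

Each pass through the loop yields an improved estimate, and in exact arithmetic an $n \times n$ system is solved in at most $n$ steps, which is why only one vector per direction needs to be stored.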

The optimization algorithm converged in 9 iterations, with each iteration lasting 4 minutes. Table 3 shows the final values of the parameter set for the SNM-optimal SRAM. The results obtained after the optimization are presented in Table 4.

Table 3: Optimized values of the parameter set.

| $D$      | $C_{low}$ | $C_{up}$ | $D_{opt}$ |
|----------|-----------|----------|-----------|
| $W_{pl}$ | 64 nm     | 1.28 µm  | 1.18 µm   |
| $L_{pl}$ | 64 nm     | 1.28 µm  | 1.28 µm   |
| $W_{nd}$ | 64 nm     | 1.28 µm  | 1.28 µm   |
| $L_{nd}$ | 64 nm     | 1.28 µm  | 32.28 nm  |
| $W_{pa}$ | 64 nm     | 1.28 µm  | 1.28 µm   |
| $L_{pa}$ | 64 nm     | 1.28 µm  | 74.8 nm   |
| $W_{na}$ | 64 nm     | 1.28 µm  | 1.28 µm   |
| $L_{na}$ | 64 nm     | 1.28 µm  | 32 nm     |

Table 4: SRAM results after optimization.

| Parameter                | Value    |
|--------------------------|----------|
| Average power $P_{SRAM}$ | 314.5 nW |
| $SNM_{SRAM}$             | 295 mV   |

The experimental results show that an 86.15% power reduction is achieved over the baseline SRAM design. At the same time, an 8% improvement in SNM is obtained. The butterfly curve for the optimal SRAM is shown in Fig. 6(c).
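The percentages quoted above follow directly from the values in Tables 1, 2, and 4. The short sketch below redoes that arithmetic; the helper name is illustrative. With these table values the SNM gain evaluates to roughly 8.9%, which the text quotes as 8%.

```python
# Values taken from Table 1 (baseline), Table 2 (minimum power), Table 4 (optimal).
BASE_POWER_NW, BASE_SNM_MV = 2270.0, 271.0
MINP_POWER_NW, MINP_SNM_MV = 314.5, 230.4
OPT_POWER_NW, OPT_SNM_MV = 314.5, 295.0


def percent_change(new, old):
    """Signed change of `new` relative to `old`, in percent."""
    return 100.0 * (new - old) / old


power_reduction = -percent_change(OPT_POWER_NW, BASE_POWER_NW)
snm_drop_after_ilp = -percent_change(MINP_SNM_MV, BASE_SNM_MV)
snm_gain_final = percent_change(OPT_SNM_MV, BASE_SNM_MV)

print(f"power reduction vs. baseline:   {power_reduction:.2f} %")
print(f"SNM degradation after DOE-ILP:  {snm_drop_after_ilp:.2f} %")
print(f"SNM gain of final design:       {snm_gain_final:.2f} %")
```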

As per the design flow, an  $8 \times 8$  array is constructed using the optimized cell, which is shown in Fig. 8. The average power consumption of the array is 1.2  $\mu W$ .

## 7. PROCESS VARIATION ANALYSIS OF THE 10-TRANSISTOR SRAM

The attributes of the SRAM, such as power (leakage) dissipation and SNM, are strongly affected by the device threshold voltage. The process variations introduced in the threshold voltage have an impact on power and SNM and need analysis to ensure robustness of the SRAM. Current sensing does not require large voltage swings to maintain acceptable noise margins. The threshold voltage variation is strongly related to the device geometry and doping profile. Eqn. 7 relates the threshold voltage standard deviation ($\sigma_{V_{Th}}$) to the gate dielectric thickness, the channel dopant concentration ($N_{ch}$), the channel length, and the device width [12]:

$$\sigma_{V_{Th}} = \left(\frac{\sqrt[4]{4q^3\epsilon_{Si}\phi_B}}{2}\right) \times \left(\frac{T_{gate}}{\epsilon_{gate}}\right) \times \left(\frac{\sqrt[4]{N_{ch}}}{\sqrt{W \times L}}\right),\tag{7}$$

where  $\phi_B = 2 \times \kappa_B \times T \times \ln(N_{ch}/n_i)$  (with  $\kappa_B$  Boltzmann's constant, *T* the absolute temperature,  $n_i$  the intrinsic carrier concentration, *q* the elementary charge), and  $\epsilon_{gate}$  and  $\epsilon_{Si}$  are the permittivity of the gate and silicon, respectively. The above expression is consistent with observations that  $\sigma_{V_{Th}}$  is inversely proportional to the square root of the device area.
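A minimal numerical sketch of Eq. 7 is given below to illustrate the inverse-square-root area dependence. Several choices are assumptions made only for this illustration and are not taken from the paper: $\phi_B$ is expressed in volts by dividing $\kappa_B T$ by $q$ so that the result comes out in volts, $n_i$ is set to about $10^{10}\ cm^{-3}$ at 300 K, and the doping and geometry values are placeholders.

```python
import math

Q = 1.602176634e-19        # elementary charge (C)
K_B = 1.380649e-23         # Boltzmann's constant (J/K)
EPS0 = 8.8541878128e-12    # vacuum permittivity (F/m)
EPS_SI = 11.7 * EPS0       # permittivity of silicon (F/m)


def sigma_vth(t_gate_m, k_gate, n_ch_m3, w_m, l_m, temp_k=300.0, n_i_m3=1.0e16):
    """Eq. (7) in SI units; returns the threshold-voltage sigma in volts."""
    # Assumption: phi_B taken in volts, i.e. 2 (k_B T / q) ln(N_ch / n_i).
    phi_b = 2.0 * (K_B * temp_k / Q) * math.log(n_ch_m3 / n_i_m3)
    charge_term = (4.0 * Q ** 3 * EPS_SI * phi_b) ** 0.25 / 2.0
    dielectric_term = t_gate_m / (k_gate * EPS0)
    geometry_term = n_ch_m3 ** 0.25 / math.sqrt(w_m * l_m)
    return charge_term * dielectric_term * geometry_term


# Placeholder device (hypothetical values, not from the PTM file).
T_GATE = 4.85e-9           # m, roughly the thickness implied by EOT = 0.9 nm
N_CH = 3.0e24              # m^-3, i.e. 3e18 cm^-3
small = sigma_vth(T_GATE, 21.0, N_CH, w_m=64e-9, l_m=32e-9)
large = sigma_vth(T_GATE, 21.0, N_CH, w_m=128e-9, l_m=64e-9)
print(f"sigma_Vth (64 nm x 32 nm):  {small * 1e3:.2f} mV")
print(f"sigma_Vth (128 nm x 64 nm): {large * 1e3:.2f} mV")
print(f"ratio: {small / large:.3f}")
```

Quadrupling the gate area halves $\sigma_{V_{Th}}$ in this model, which is the trend the optimizer exploits when it pushes several widths toward the upper bound.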

The SRAM is exhaustively evaluated through 1000 Monte Carlo simulations to ensure that there are no process variation induced failures. The 12 process parameters on which the threshold voltage variation depends, as shown in the above equation, are considered for variability: (1) $T_{gaten}$: NMOS gate dielectric thickness (nm), (2) $T_{gatep}$: PMOS gate dielectric thickness (nm), (3) $L_{na}$, (4) $L_{pa}$, (5) $W_{na}$, (6) $W_{pa}$, (7) $L_{nd}$, (8) $W_{nd}$, (9) $L_{pl}$, (10) $W_{pl}$, (11) $N_{chn}$: NMOS channel doping concentration ($cm^{-3}$), (12) $N_{chp}$: PMOS channel doping concentration ($cm^{-3}$).

(c) Power/SNM Optimal

Figure 6: Butterfly curves for different SRAM configurations.

Figure 7: SNM and power comparison for the SRAM.

Each of the parameters is assumed to have a Gaussian distribution, with the mean taken as the nominal value specified in the PTM [15] and the standard deviation as 10% of the mean. The parameters are not independent. As a typical case, a correlation coefficient of 0.9 between $T_{gaten}$ and $T_{gatep}$ is assumed. Fig. 9(a) shows the effect of process variations on the butterfly curve of the SRAM. Fig. 9(b) shows the distributions for "SNM High" and "SNM Low" extracted from the Monte Carlo simulations, where, for each Monte Carlo run, "SNM High" is the higher SNM and "SNM Low" is the lower SNM due to asymmetry in the cell. "SNM Low" is treated as the actual SNM. Table 5 shows the corresponding statistical data. Fig. 9(c) shows the distribution of the average power of the SRAM. The average power distribution is observed to be lognormal in nature.
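The sampling step of such a Monte Carlo study can be sketched as follows. This only draws the parameter sets; each draw would then be passed to a circuit simulation, which is not shown. The nominal values are placeholders rather than PTM data, and since the text gives only the correlation between $T_{gaten}$ and $T_{gatep}$, all other parameters are treated as independent in this sketch.

```python
import math
import random

# Hypothetical nominal values, for illustration only (not taken from the PTM file).
NOMINAL = {
    "T_gaten": 4.85, "T_gatep": 4.85,                          # nm
    "L_na": 32.0, "L_pa": 32.0, "W_na": 64.0, "W_pa": 64.0,    # nm
    "L_nd": 32.0, "W_nd": 64.0, "L_pl": 32.0, "W_pl": 64.0,    # nm
    "N_chn": 3.0e18, "N_chp": 3.0e18,                          # cm^-3
}
RHO = 0.9          # correlation between T_gaten and T_gatep (from the text)
REL_SIGMA = 0.10   # standard deviation as a fraction of the mean (from the text)


def draw_sample(rng):
    """Draw one set of the 12 parameters with Gaussian spread around nominal."""
    z = {name: rng.gauss(0.0, 1.0) for name in NOMINAL}
    # Correlate the PMOS thickness deviate with the NMOS one.
    z["T_gatep"] = RHO * z["T_gaten"] + math.sqrt(1.0 - RHO ** 2) * rng.gauss(0.0, 1.0)
    return {name: mu * (1.0 + REL_SIGMA * z[name]) for name, mu in NOMINAL.items()}


def correlation(xs, ys):
    """Sample Pearson correlation coefficient."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


rng = random.Random(2010)   # fixed seed so the run is repeatable
samples = [draw_sample(rng) for _ in range(1000)]
r = correlation([s["T_gaten"] for s in samples], [s["T_gatep"] for s in samples])
print(f"sample correlation T_gaten/T_gatep: {r:.3f}")
```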

## 8. CONCLUSIONS

A methodology is presented for cell-level optimization of SRAM power and stability. A 32nm high-$\kappa$/metal-gate 10-transistor SRAM is subjected to the proposed methodology, which has shown an 86% reduction in power and an 8% increase in SNM. A novel DOE-ILP approach has been used for power minimization, and the conjugate gradient method is used for SNM maximization. The effect of process variation in 12 parameters on the proposed SRAM is evaluated. An $8 \times 8$ array has been constructed using the optimized cell, and data for power and read access time is presented. The future scope of this research involves array-level optimization of SRAM. For array optimization, both mismatch and process variation will be considered as part of the design flow.

Figure 8: One row of the $8 \times 8$ array constructed using proposed 10-transistor SRAM cells.

Figure 9: Butterfly curve, SNM distribution, and Power distribution for the optimal SRAM under process variations.

Table 5: Statistical data for SNM of optimal SRAM.

| SNM      | $\mu (mV)$ | $\sigma (mV)$ |
|----------|------------|---------------|
| SNM High | 330.7      | 71.9          |
| SNM Low  | 290.3      | 12.7          |

## 9. REFERENCES

- [1] K. Agarwal and S. Nassif. Statistical Analysis of SRAM Cell Stability. In *Proceedings of the Design Automation Conference*, pages 57–62, 2006.
- [2] B. Amelifard et al. Reducing the Sub-threshold and Gate-tunneling Leakage of SRAM Cells using Dual-$V_t$ and Dual-$T_{ox}$ Assignment. In *Proc. of DATE*, pages 1–6, 2006.
- [3] D. Ghai et al. Variability-aware optimization of nano-CMOS Active Pixel Sensors using design and analysis of Monte Carlo experiments. In *Proc. of the International Sympo. on Quality Electronic Design*, pages 172–178, 2009.
- [4] M. R. Hestenes and E. Stiefel. Methods of Conjugate Gradients for Solving Linear Systems. *Journal of Research of the National Bureau of Standards*, 49(6):409–436, Dec 1952.
- [5] E. Kougianos and S. P. Mohanty. Impact of Gate-Oxide Tunneling on Mixed-Signal Design and Simulation of a Nano-CMOS VCO. *Elsevier Microelectronics Journal*, 40(1):95–103, January 2009.
- [6] J. P. Kulkarni et al. Process Variation Tolerant SRAM Array for Ultra Low Voltage Applications. In *Proceedings of the Design Automation Conference*, pages 108–113, 2008.
- [7] S. Lin et al. A low leakage 9T SRAM cell for ultra-low power operation. In *Proc. of GLSVLSI*, pages 123–126, 2008.
- [8] Z. Liu and V. Kursun. High Read Stability and Low Leakage Cache Memory Cell. In *Proceedings of International Sympo. on Circuits and Systems*, pages 2774–2777, 2007.
- [9] S. Okumura et al. A 0.56-V 128kb 10T SRAM Using Column Line Assist (CLA) Scheme. In *Proc. of International Sympo. on Quality Electronic Design*, pages 659–663, 2009.
- [10] S. R. Schmidt and R. G. Launsby. *Understanding Industrial Design of Experiments*, 4th Edn. Air Academy Press, 1994.
- [11] J. Singh et al. A Nano-CMOS Process Variation Induced Read Failure Tolerant SRAM Cell. In *Proc. of International Sympo. Circuits and Systems*, pages 3334–3337, 2008.
- [12] P. A. Stolk et al. Modeling Statistical Dopant Fluctuations in MOS Transistors. *IEEE Trans. Electron Devices*, 45(9):1960–1971, September 1998.
- [13] X. D. Tan et al. Reliability-constrained area optimization of VLSI power/ground networks via sequence of linear programmings. *IEEE Transactions on CAD of Integrated Circuits and Systems*, 22(12):1678–1684, 2003.
- [14] G. Thakral et al. A Combined DOE-ILP Based Power and Read Stability Optimization in Nano-CMOS SRAM. In *Proc. 23rd International Conf. VLSI Design*, pages 45–50, 2010.
- [15] W. Zhao and Y. Cao. New Generation of Predictive Technology Model for sub-45nm Design Exploration. In *Proc. of the ISQED*, pages 585–590, 2006.