# Ultra-Fast Design Exploration of Nanoscale Circuits through Metamodeling

Saraju P. Mohanty, Oghenekarho Okobiah, and Geng Zheng NanoSystem Design Laboratory (NSDL) Dept. of Computer Science and Engineering University of North Texas, Denton, TX 76203, USA. Email: saraju.mohanty@unt.edu

#### Presented By Oghenekarho Okobiah





#### **Outline of the Talk**

- Nanoscale Design Challenges
- The Proposed Ultra-Fast Solution
- Metamodel Types and Proposed Techniques
- Algorithms for Optimization over Metamodels
- Experiments Using Case Studies
- Conclusions and Future Research










# **Analog/Mixed-Signal Systems**



- A typical consumer electronics device is an Analog/Mixed-Signal System-on-a-Chip (AMS-SoC).
- Individual subsystems can also be mixed-signal, e.g. Phase-Locked Loop (PLL).





#### **Nano-CMOS Circuit: Design Space**







# One of the Key Issues: Time/Effort

The simulation time for a Phase-Locked Loop (PLL) lock on a full-blown (RCLK) parasitic netlist is on the order of many days!



PLL



- Issues for AMS-SoC components:
  - How fast can design space exploration be performed?
  - How fast can layout generation and optimization be performed?





#### **Standard Design Flow – Very Slow**



Standard design flow requires multiple manual iterations on the back-end layout to achieve parasitic closure between front-end circuit and back-end layout.

- Longer design cycle time.
- Error-prone design.
- Higher non-recurring cost.
- Difficult to handle nanoscale challenges.



Mohanty 7

#### 06/26/2012






#### Automatic Optimization on Netlist (Faster than manual flow; still slow)








#### Ultra-Fast Design Exploration Through Metamodeling



The Actual Circuit (Netlist) Optimization -- Slow Approach



The Metamodel-Based Approach -- Ultra-Fast Approach







#### **Two Tier Speed Up**







#### **Proposed Flow: Key Perspective**

- Novel design and optimization methodology that will produce robust AMS-SoC components using ultra-fast automatic iterations over metamodels (instead of netlist) and two manual layout steps.
- The methodology easily accommodates multidimensional challenges, reduces design cycle time, improves circuit yield, and reduces chip cost.





#### **Metamodel-Based Design Flow**




# Metamodeling vs. Macromodeling

#### Macromodeling

- Simplified version of the circuit.
- Used in the same simulation tool.
- Hard to create.

#### Metamodeling

- Mathematical representation of output.
- Based on prediction equation or algorithm.
- Language and tool independent.
- Reusable for different specifications.
- Can be applied using non-EDA tools like MATLAB.














![](_page_15_Picture_2.jpeg)


# Metamodels: Polynomial Example

![](_page_16_Figure_1.jpeg)

![](_page_16_Picture_2.jpeg)

![](_page_16_Picture_3.jpeg)


## Metamodeling – Key Points

- Accuracy -- Capability of predicting the system response over the design space.
- Efficiency -- Computational effort required for constructing the metamodel.
- Transparency -- Capability of providing the information concerning contributions and variations of design variables and correlation among the variables.
- Simplicity -- Simple methods should require less user input and be easily adapted to different problems.

![](_page_17_Picture_5.jpeg)

![](_page_17_Picture_6.jpeg)

#### Metamodels: Performance Analysis

- **Root Mean Square Error (RMSE)**: Represents the departure of the metamodel from the real simulation (golden). A smaller RMSE means a more accurate metamodel:

$$RMSE = \sqrt{\frac{1}{N}\sum_{k=1}^{N} \left(FoM(x_k) - \widehat{FoM}(x_k)\right)^2}$$

- **Relative Average Absolute Error (RAAE)**: A smaller RAAE means a more accurate metamodel:

$$RAAE = \frac{\sum_{k=1}^{N} \left|FoM(x_k) - \widehat{FoM}(x_k)\right|}{N \times Standard \ Deviation}$$

- **R-Square**: A larger R-square means a more accurate metamodel:

$$R^2 = 1 - \frac{MSE}{Variance}$$
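The three metrics above can be computed directly from paired golden and metamodel-predicted values. The following is a minimal sketch in Python with NumPy. It assumes that "Standard Deviation" and "Variance" are taken over the golden (simulated) responses and that MSE is the mean squared error; the function name and the sample values are illustrative and not from the talk.

```python
import numpy as np

def metamodel_errors(fom_true, fom_pred):
    """Return (RMSE, RAAE, R^2) for golden vs. metamodel-predicted FoM values."""
    y = np.asarray(fom_true, dtype=float)
    y_hat = np.asarray(fom_pred, dtype=float)
    residual = y - y_hat
    mse = np.mean(residual ** 2)
    rmse = np.sqrt(mse)
    # Assumption: standard deviation and variance are taken over the golden values.
    raae = np.sum(np.abs(residual)) / (y.size * np.std(y))
    r2 = 1.0 - mse / np.var(y)
    return rmse, raae, r2

# Hypothetical frequencies in GHz (golden simulation vs. metamodel prediction)
golden = [9.8, 10.0, 10.2, 10.4]
predicted = [9.9, 10.0, 10.1, 10.4]
rmse, raae, r2 = metamodel_errors(golden, predicted)
```

RMSE carries the unit of the FoM, while RAAE and R-square are dimensionless, which makes the latter two easier to compare across different FoMs.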

![](_page_18_Picture_6.jpeg)

![](_page_18_Picture_7.jpeg)

![](_page_19_Figure_0.jpeg)

## Sampling Techniques: 45nm Ring Oscillator Circuit (5000 points)

**Monte Carlo** 

MLHS

![](_page_20_Figure_3.jpeg)

![](_page_20_Figure_4.jpeg)

LHS

DOE

![](_page_20_Figure_7.jpeg)
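As a rough illustration of one of the sampling techniques named on this slide, Latin Hypercube Sampling (LHS) can be sketched as follows. This is a minimal sketch, not the sampler used in the talk: the function name, the seed, and the width ranges are assumptions chosen only for illustration.

```python
import numpy as np

def latin_hypercube(n_samples, bounds, seed=0):
    """Draw n_samples points; each parameter range is split into n_samples
    equal strata and every stratum is sampled exactly once."""
    rng = np.random.default_rng(seed)
    bounds = np.asarray(bounds, dtype=float)          # shape (n_params, 2)
    n_params = bounds.shape[0]
    # One uniform draw inside each stratum, per parameter.
    unit = (np.arange(n_samples)[:, None]
            + rng.random((n_samples, n_params))) / n_samples
    # Shuffle the strata independently per parameter to decouple the columns.
    for j in range(n_params):
        unit[:, j] = rng.permutation(unit[:, j])
    low, high = bounds[:, 0], bounds[:, 1]
    return low + unit * (high - low)

# Hypothetical width ranges in meters for (W_n, W_p)
samples = latin_hypercube(100, [(100e-9, 500e-9), (200e-9, 1e-6)])
```

Compared with plain Monte Carlo sampling, this stratification guarantees that every slice of each parameter range is covered once, which is why it tends to need fewer simulations for the same coverage.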

![](_page_20_Picture_8.jpeg)

![](_page_20_Picture_9.jpeg)

![](_page_20_Picture_10.jpeg)

![](_page_21_Figure_0.jpeg)

#### **Polynomial Metamodels**

- The generated sample data can be fitted in many ways to generate a metamodel.
- The choice of fitting algorithm can affect the accuracy of the metamodel.
- A simple metamodel has the following form:

$$y = \sum_{i,j=0}^{\kappa} \left( \alpha_{ij} \times x_1^i \times x_2^j \right)$$

Here $y$ is the response being modeled (e.g. frequency), $x = [x_1, x_2] = [W_n, W_p]$ is the vector of design variables, $\kappa$ is the polynomial order, and $\alpha_{ij}$ are the coefficients.
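One common way to obtain the coefficients $\alpha_{ij}$ is ordinary least squares over the sampled points. The sketch below assumes that fitting method (the talk does not state which fitting algorithm was used) and assumes inputs normalized to a unit range, since raw widths in meters would make the powers badly scaled. All function names and the synthetic data are illustrative.

```python
import numpy as np

def fit_polynomial_metamodel(x1, x2, y, order):
    """Least-squares fit of y = sum_{i,j=0..order} alpha_ij * x1^i * x2^j."""
    x1, x2, y = (np.asarray(a, dtype=float) for a in (x1, x2, y))
    terms = [(i, j) for i in range(order + 1) for j in range(order + 1)]
    # One column of the design matrix per monomial x1^i * x2^j.
    design = np.column_stack([x1 ** i * x2 ** j for i, j in terms])
    coeffs, *_ = np.linalg.lstsq(design, y, rcond=None)
    return dict(zip(terms, coeffs))

def predict(alpha, x1, x2):
    """Evaluate the fitted metamodel at (x1, x2)."""
    return sum(a * x1 ** i * x2 ** j for (i, j), a in alpha.items())

# Synthetic, normalized sample data standing in for simulator output
g = np.linspace(0.0, 1.0, 5)
x1, x2 = (m.ravel() for m in np.meshgrid(g, g))
y = 2.0 + 3.0 * x1 - 1.0 * x2 + 0.5 * x1 * x2
alpha = fit_polynomial_metamodel(x1, x2, y, order=1)
```

Because the synthetic response is itself a polynomial of order 1 in each variable, the fit recovers its coefficients; real simulation data would leave a residual that the error metrics quantify.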

![](_page_22_Picture_6.jpeg)

![](_page_22_Picture_7.jpeg)

### **Metamodel: Polynomial Comparison**

| Case Study Circuit                            | Polynomial Order | $\mu$ error (MHz) | $\sigma$ error (MHz) |
|-----------------------------------------------|------------------|-------------------|----------------------|
| Ring Oscillator, 45nm CMOS, target f: 10GHz   | 1                | 571.0             | 286.7                |
|                                               | 2                | 195.4             | 78.1                 |
|                                               | 3                | 37.2              | 18.0                 |
|                                               | 4                | 20.0              | 10.7                 |
|                                               | 5                | 17.1              | 9.6                  |
| LC-VCO, 180nm CMOS, target f: 2.7GHz          | 1                | 42.3              | 40.1                 |
|                                               | 2                | 39.4              | 37.8                 |
|                                               | 3                | 35.4              | 33.9                 |
|                                               | 4                | 30.5              | 29.3                 |
|                                               | 5                | 26.5              | 25.2                 |

**Ring oscillator – Order 1**

$$f(W_n, W_p) = 7.94 \times 10^9 + 1.1 \times 10^{16} W_n + 1.28 \times 10^{15} W_p.$$

**LC-VCO – Order 1**

$$f(W_n, W_p) = 2.38 \times 10^9 - 3.49 \times 10^{12} W_n - 6.66 \times 10^{12} W_p.$$

![](_page_23_Picture_5.jpeg)

![](_page_23_Picture_6.jpeg)

![](_page_23_Picture_7.jpeg)

#### **Neural Network Metamodeling**

- Feed-forward dual-layer NNs (FFDL) are considered.
- An FFDL network is created for each FoM:
  - Nonlinear hidden-layer functions are considered, each with the number of hidden neurons varied from 1 to 20:

![](_page_24_Figure_4.jpeg)

 $b_j(v_j) = \tanh(\lambda v_j)$ 
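To make the hidden-layer function concrete, the sketch below runs a forward pass through a network with one tanh hidden layer and one output neuron. It is a minimal sketch under stated assumptions: the output layer is assumed to be linear, the weights are random placeholders rather than trained values, and all names are hypothetical.

```python
import numpy as np

def ffdl_forward(x, w_hidden, b_hidden, w_out, b_out, lam=1.0):
    """Forward pass: tanh hidden layer followed by a linear output neuron."""
    v = x @ w_hidden + b_hidden      # hidden-layer pre-activations v_j
    h = np.tanh(lam * v)             # b_j(v_j) = tanh(lambda * v_j)
    return h @ w_out + b_out         # assumed linear output layer

rng = np.random.default_rng(0)
n_inputs, n_hidden = 2, 5            # the talk varies hidden neurons from 1 to 20
w_hidden = rng.normal(size=(n_inputs, n_hidden))
b_hidden = rng.normal(size=n_hidden)
w_out = rng.normal(size=n_hidden)
b_out = 0.0

x = np.array([[0.2, 0.7], [0.5, 0.1]])   # hypothetical normalized design points
y = ffdl_forward(x, w_hidden, b_hidden, w_out, b_out)
```

In practice the weights would be trained on the sampled simulation data, with one such network per FoM.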

![](_page_24_Picture_6.jpeg)

![](_page_24_Picture_7.jpeg)


## Metamodel Comparison: Polynomial Vs Nonpolynomial

Nonpolynomial (Neural Network) metamodels are more suitable for large circuits.

180nm CMOS PLL with target specs: f = 2.7GHz, P = 3.9mW, locking time = $8.5\mu s$.

| Figure of Merit (FoM) | Polynomial: # of Coefficients | Polynomial: RMSE | Nonpolynomial (Neural Network) |
|-----------------------|-------------------------------|------------------|--------------------------------|
| Frequency             | 48                            | 77.96 MHz        | 48 MHz                         |
| Power                 | 50                            | 2.6 mW           | 0.29 mW                        |
| Locking Time          | 56                            | 1.9 µs           | 1.2 µs                         |

56% increase in accuracy over polynomial metamodels.

On average 3.2% error over golden design surface.

![](_page_25_Picture_7.jpeg)

![](_page_25_Picture_8.jpeg)


![](_page_26_Picture_7.jpeg)

![](_page_26_Picture_8.jpeg)

![](_page_27_Figure_0.jpeg)

![](_page_27_Picture_1.jpeg)

![](_page_27_Picture_2.jpeg)

![](_page_27_Picture_3.jpeg)

#### **Exhaustive Search: 45nm RO**

10 GHz Frequency Contour

![](_page_28_Figure_2.jpeg)

Searches over a two-parameter space.

Parameters are incremented in specified steps.
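The exhaustive search over two parameters can be sketched as a plain grid sweep. The example below reuses the first-order ring-oscillator metamodel quoted earlier in the talk as the function being searched; the width ranges, the 10 nm step, and all names are assumptions made only for illustration.

```python
import itertools
import numpy as np

def ro_frequency(w_n, w_p):
    """First-order ring-oscillator metamodel quoted in the talk (Hz, widths in m)."""
    return 7.94e9 + 1.1e16 * w_n + 1.28e15 * w_p

def exhaustive_search(objective, grid_1, grid_2):
    """Evaluate every grid point; return (best_cost, best_point, n_evaluations)."""
    best_cost, best_point, n_eval = float("inf"), None, 0
    for p1, p2 in itertools.product(grid_1, grid_2):
        cost = objective(p1, p2)
        n_eval += 1
        if cost < best_cost:
            best_cost, best_point = cost, (p1, p2)
    return best_cost, best_point, n_eval

target = 10e9                                   # 10 GHz target frequency
w_n_grid = np.arange(100, 301, 10) * 1e-9       # hypothetical range and step
w_p_grid = np.arange(100, 601, 10) * 1e-9       # hypothetical range and step
best_cost, (best_wn, best_wp), n_eval = exhaustive_search(
    lambda wn, wp: abs(ro_frequency(wn, wp) - target), w_n_grid, w_p_grid)
```

The number of evaluations is the product of the grid sizes, so the cost grows exponentially with the number of parameters; this is what motivates the heuristic algorithms.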

![](_page_28_Picture_6.jpeg)

![](_page_28_Picture_7.jpeg)

![](_page_29_Figure_0.jpeg)

![](_page_29_Picture_1.jpeg)

#### Comparison of the Running Time of Heuristic Algorithms: 45nm RO

![](_page_30_Figure_1.jpeg)

**Optimization without metamodels:** the tabu search optimization is $\sim 1000 \times$ faster than the exhaustive search and $\sim 4 \times$ faster than the simulated annealing optimization.

**Optimization with metamodels:** the simulated annealing optimization is ~1000× faster than the exhaustive search and ~6× faster than the tabu search optimization.

![](_page_30_Picture_4.jpeg)

![](_page_30_Picture_5.jpeg)

# **Bee-Colony Optimization: Overview**

1. Initial food sources are produced for all worker bees.
2. Do
   1. Each worker bee goes to a food source and evaluates its nectar amount.
   2. Each onlooker bee watches the dance of the worker bees, chooses one of their sources depending on the dances, and evaluates its nectar amount.
   3. Abandoned food sources are determined and replaced with new food sources discovered by scout bees.
   4. The best food source found so far is recorded.
3. Until (requirements are met)

Mapping to the optimization problem: a food source  $\rightarrow$  a solution; a position of a food source  $\rightarrow$  a design variable set; nectar amount  $\rightarrow$  quality of a solution; number of worker bees  $\rightarrow$  number of quality solutions.
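The steps above can be sketched in code as follows. This is a minimal, generic artificial bee colony loop and not the implementation used in the talk: the stopping requirement is assumed to be a fixed iteration budget, the neighbor move and the `limit` abandonment threshold follow a common textbook form, the cost is assumed to be non-negative, and every name and default value is hypothetical.

```python
import numpy as np

def abc_minimize(cost, bounds, n_sources=20, n_iter=100, limit=20, seed=0):
    """Minimal artificial bee colony sketch: worker, onlooker, and scout phases."""
    rng = np.random.default_rng(seed)
    low, high = np.asarray(bounds, dtype=float).T
    dim = low.size
    sources = rng.uniform(low, high, size=(n_sources, dim))   # step 1
    costs = np.array([cost(s) for s in sources])
    trials = np.zeros(n_sources, dtype=int)
    best_idx = int(np.argmin(costs))
    best, best_cost = sources[best_idx].copy(), costs[best_idx]

    def try_neighbor(i):
        # Perturb one coordinate relative to a randomly chosen other source.
        k = rng.choice([m for m in range(n_sources) if m != i])
        d = rng.integers(dim)
        candidate = sources[i].copy()
        candidate[d] += rng.uniform(-1, 1) * (sources[i, d] - sources[k, d])
        candidate = np.clip(candidate, low, high)
        c = cost(candidate)
        if c < costs[i]:                       # greedy selection
            sources[i], costs[i], trials[i] = candidate, c, 0
        else:
            trials[i] += 1

    for _ in range(n_iter):                    # step 2: Do
        for i in range(n_sources):             # 2.1 worker bees
            try_neighbor(i)
        fitness = 1.0 / (1.0 + costs)          # assumes cost >= 0
        probs = fitness / fitness.sum()
        for i in rng.choice(n_sources, size=n_sources, p=probs):  # 2.2 onlookers
            try_neighbor(i)
        for i in np.where(trials > limit)[0]:  # 2.3 scouts replace abandoned sources
            sources[i] = rng.uniform(low, high)
            costs[i] = cost(sources[i])
            trials[i] = 0
        i = int(np.argmin(costs))              # 2.4 record the best source so far
        if costs[i] < best_cost:
            best, best_cost = sources[i].copy(), costs[i]
    return best, best_cost

# Toy cost with a known minimum at (0.3, 0.3), standing in for a metamodel
best, best_cost = abc_minimize(lambda x: float(np.sum((x - 0.3) ** 2)),
                               [(0, 1), (0, 1)])
```

When `cost` is a metamodel evaluation rather than a SPICE run, each of the evaluations inside the loop takes a tiny fraction of a simulation's time, which is where the speed-up of the proposed flow comes from.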

![](_page_31_Picture_9.jpeg)

#### **Bee Colony Optimization: States**

![](_page_32_Figure_1.jpeg)

![](_page_32_Picture_2.jpeg)



![](_page_33_Picture_7.jpeg)

![](_page_33_Picture_8.jpeg)

# **Case Study Circuit: 180nm PLL**

![](_page_34_Figure_1.jpeg)

Block diagram of a PLL.

- PLL circuit is characterized for frequency, power, vertical and horizontal jitter (for simple phase noise), and locking time.
- Metamodels are created for each FoM from the same sample set.

![](_page_34_Figure_5.jpeg)

PLL for 180nm.

![](_page_34_Picture_8.jpeg)

![](_page_34_Picture_9.jpeg)

#### PLL: Polynomial Metamodels ...

- PLL circuit is characterized for output frequency, power, vertical and horizontal jitter (to simplify the phase noise calculations), and locking time (or settling time).
- A separate metamodel is created for each FoM from the same sample set.
- The Root Mean Square Error (RMSE) and coefficient of determination R<sup>2</sup> are the metrics used for goodness of fit.


![](_page_35_Figure_4.jpeg)

Generated R<sup>2</sup> and R<sup>2</sup><sub>adj</sub> for various orders of the polynomial metamodel for settling time. Notice possible overfitting.

#### PLL: Polynomial Metamodels ...

- The figure plots the number of coefficients against the order of the generated metamodel for settling time.
- The higher-order models are overfitted; therefore, a polynomial order of 4 is used for the metamodel that represents settling time.
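Overfitting of this kind is typically spotted by comparing R<sup>2</sup> with its adjusted counterpart, which penalizes additional coefficients. The sketch below uses the standard textbook definition of adjusted R<sup>2</sup>; it is an assumption that the talk uses the same definition, and the numeric values are hypothetical.

```python
def adjusted_r2(r2, n_samples, n_coeffs):
    """Adjusted R^2; n_coeffs counts fitted coefficients excluding the intercept."""
    if n_samples - n_coeffs - 1 <= 0:
        raise ValueError("need more samples than coefficients")
    return 1.0 - (1.0 - r2) * (n_samples - 1) / (n_samples - n_coeffs - 1)

# Hypothetical case: the same raw R^2 with a growing number of coefficients
for n_coeffs in (10, 30, 60):
    print(n_coeffs, round(adjusted_r2(0.95, 100, n_coeffs), 4))
```

When R<sup>2</sup> keeps rising with the polynomial order while the adjusted value stalls or drops, the extra coefficients are fitting noise in the sample set rather than the underlying response.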

![](_page_36_Figure_3.jpeg)

#### PLL: ABC over Poly. Metamodels ...

![](_page_37_Figure_1.jpeg)

#### **Power and Jitter Results of the PLL**

| Metric            | Before Optimization | After Optimization | Improvement   |
|-------------------|---------------------|--------------------|---------------|
| Power             | 9.29 mW             | 0.87 mW            | 90.6%         |
| Jitter Vertical   | $168.35\mu V$       | 3.28 nV            | ${\sim}100\%$ |
| Jitter Horizontal | 189 ps              | 180 ps             | 4.8%          |

![](_page_37_Picture_4.jpeg)

![](_page_37_Picture_5.jpeg)

![](_page_37_Picture_6.jpeg)

#### **PLL: ABC over Poly. Metamodels**

#### PLL parameters with constraints and optimized values

| Circuit        | Parameter   | Min (m) | Max (m) | Optimal Value (m) |
|----------------|-------------|---------|---------|-------------------|
| Phase Detector | $W_{ppd1}$  | 400n    | $2\mu$  | $1.66\mu$         |
|                | $W_{npd1}$  | 400n    | $2\mu$  | $1.11\mu$         |
|                | $W_{ppd2}$  | 400n    | $2\mu$  | 784n              |
|                | $W_{npd2}$  | 400n    | $2\mu$  | 689n              |
|                | $W_{ppd3}$  | 400n    | $2\mu$  | $1.54\mu$         |
|                | $W_{npd3}$  | 400n    | $2\mu$  | 737n              |
| Charge Pump    | $W_{nCP1}$  | 400n    | $2\mu$  | $1.24\mu$         |
|                | $W_{pCP1}$  | 400n    | $2\mu$  | $1.35\mu$         |
|                | $W_{nCP2}$  | $1\mu$  | $4\mu$  | $1.35\mu$         |
|                | $W_{pCP2}$  | $1\mu$  | $4\mu$  | $2.88\mu$         |
| LC-VCO         | $W_{nLC}$   | $3\mu$  | $20\mu$ | $18.62\mu$        |
|                | $W_{pLC}$   | $6\mu$  | $40\mu$ | $37.48\mu$        |
| Divider        | $W_{p1Div}$ | 400n    | $2\mu$  | $1.65\mu$         |
|                | $W_{p2Div}$ | 400n    | $2\mu$  | $1.54\mu$         |
|                | $W_{p3Div}$ | 400n    | $2\mu$  | $1.38\mu$         |
|                | $W_{p4Div}$ | 400n    | $2\mu$  | $1.96\mu$         |
|                | $W_{n1Div}$ | 400n    | $2\mu$  | $1.09\mu$         |
|                | $W_{n2Div}$ | 400n    | $2\mu$  | $1.17\mu$         |
|                | $W_{n3Div}$ | 400n    | $2\mu$  | $1.29\mu$         |
|                | $W_{n4Div}$ | 400n    | $2\mu$  | $1.95\mu$         |
|                | $W_{n5Div}$ | 400n    | $2\mu$  | 536n              |

An exhaustive search of the design space of 21 parameters with 10 intervals per parameter requires 10<sup>21</sup> simulations.

- 10<sup>21</sup> SPICE simulations are slow, at about 10 min each.
- 10<sup>21</sup> evaluations using polynomial metamodels are fast.
- Time savings: ≈10<sup>20</sup>× SPICE simulation time.
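To put the size of this design space in perspective, the arithmetic can be worked through directly. The short sketch below uses only the figures quoted on this slide (21 parameters, 10 intervals, 10 minutes per SPICE run); the variable names are illustrative.

```python
n_params = 21
intervals = 10
minutes_per_spice_run = 10                 # figure quoted in the talk

n_simulations = intervals ** n_params      # 10**21 grid points
total_minutes = n_simulations * minutes_per_spice_run
total_years = total_minutes / (60 * 24 * 365)

print(f"{n_simulations:.0e} simulations, about {total_years:.1e} years of SPICE time")
```

At roughly 10<sup>16</sup> years of simulation time, an exhaustive SPICE sweep is not feasible, so any practical exploration of this space has to run on a much cheaper model.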

![](_page_38_Picture_9.jpeg)

![](_page_38_Picture_10.jpeg)

#### **PLL: ABC Optimization: Poly. Vs NN**

- Figure-of-Merit used as the optimization objective function of the PLL:

$$FoM = \frac{1}{Power \times Locking\ Time}$$

![](_page_39_Figure_1.jpeg)

## **PLL: ABC Optimization: Poly. Vs NN**

#### **Optimization Results**

| FoM           | Poly. Metamodel | ANN Metamodel |
|---------------|-----------------|---------------|
| Average Power | 3.9 mW          | 3.9 mW        |
| Frequency     | 2.6909 GHz      | 2.7026 GHz    |

#### **Optimization Time Comparison**

| Algorithm            | Circuit Netlist                                                          | Poly. Metamodel                    | ANN Metamodel                                         |
|----------------------|--------------------------------------------------------------------------|------------------------------------|-------------------------------------------------------|
| ABC (100 iterations) | 20 bees × 5 min × 100 iterations = 10,000 minutes ≈ 7 days (worst case)  | 5 mins                             | 0.12 mins                                             |
| Metamodel Generation | 0                                                                        | 11 hours for LHS + 1 min creation  | 11 hours for LHS + 10 mins training and verification  |
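The entries of the timing table can be combined into end-to-end totals. The sketch below only does arithmetic on the numbers in the table; the speed-up ratios it prints are derived here and are not quoted in the talk, and the variable names are illustrative.

```python
bees, minutes_per_sim, iterations = 20, 5, 100
netlist_minutes = bees * minutes_per_sim * iterations   # worst-case netlist ABC run
netlist_days = netlist_minutes / (60 * 24)

lhs_minutes = 11 * 60                  # LHS sampling, shared by both metamodels
poly_total = lhs_minutes + 1 + 5       # creation + ABC over the polynomial metamodel
ann_total = lhs_minutes + 10 + 0.12    # training/verification + ABC over the ANN

print(f"netlist ABC: {netlist_minutes} min = {netlist_days:.2f} days")
print(f"poly end-to-end speed-up: {netlist_minutes / poly_total:.1f}x")
print(f"ANN end-to-end speed-up: {netlist_minutes / ann_total:.1f}x")
print(f"ABC-only speed-up: poly {netlist_minutes / 5:.0f}x, "
      f"ANN {netlist_minutes / 0.12:.0f}x")
```

The one-time sampling cost dominates the metamodel flows, so the advantage grows each time the same metamodel is reused for another optimization run or a different specification.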

![](_page_40_Picture_5.jpeg)

![](_page_40_Picture_6.jpeg)


![](_page_41_Picture_7.jpeg)

![](_page_41_Picture_8.jpeg)

#### **Related Prior Research**

![](_page_42_Figure_1.jpeg)

![](_page_42_Picture_2.jpeg)

![](_page_42_Picture_3.jpeg)


#### **Conclusions** ...

- Polynomial/nonpolynomial metamodels are explored.
- The use of metamodels and optimization algorithms speeds up design-space exploration for AMS circuits.
- LHS was identified as an accurate sampling method.
- Polynomial metamodels are easier to create but are best suited to small circuits.
- A 56% increase in accuracy is observed using feed-forward NN over polynomial metamodels.
- On average 3.2% error is observed using NN.

![](_page_43_Picture_7.jpeg)

![](_page_43_Picture_8.jpeg)

#### Conclusions

- As a case study, a 180nm PLL circuit was parameterized with 21 parameters and optimized using the ABC algorithm.
- The final outcome of the design flow was 90% power savings and an average of 52% jitter minimization.
- Only 100 simulations were used to generate accurate metamodels, and ABC converged faster.
- An exhaustive search of the design space of 21 parameters with 10 intervals per parameter would require 10<sup>21</sup> simulations. The time savings are enormous (≈10<sup>20</sup>× SPICE simulation time).

![](_page_44_Picture_5.jpeg)

![](_page_44_Picture_6.jpeg)

### Our Selected Publications on this Research

- O. Garitselov, S. P. Mohanty, and E. Kougianos, "A Comparative Study of Metamodels for Fast and Accurate Simulation of Nano-CMOS Circuits", *IEEE Transactions on Semiconductor Manufacturing* (*TSM*), Vol. 25, No. 1, February 2012, pp. 26--36.
- O. Garitselov, S. P. Mohanty, and E. Kougianos, "Fast-Accurate Non-Polynomial Metamodeling for nano-CMOS PLL Design Optimization", in *Proceedings of the 25th IEEE International Conference on VLSI Design (VLSID)*, pp. 316--321.
- O. Garitselov, S. P. Mohanty, E. Kougianos, and O. Okobiah, "Metamodel-Assisted Ultra-Fast Memetic Optimization of a PLL for WiMax and MMDS Applications", in *Proceedings of the 13th IEEE International Symposium on Quality Electronic Design (ISQED)*, pp. 580--585.
- O. Garitselov, S. P. Mohanty, and E. Kougianos, "Fast Optimization of Nano-CMOS Mixed-Signal Circuits Through Accurate Metamodeling", in *Proceedings of the 12th IEEE International Symposium on Quality Electronic Design (ISQED)*, pp. 405--410, 2011.

![](_page_45_Picture_6.jpeg)

![](_page_45_Picture_7.jpeg)

#### **Future Research**

- Capturing statistical process variations using metamodels
- Kriging metamodeling
  - Effectively handles correlations
  - Accurately models process variations
- Integration in HDLs
  - Used for accurate behavioral simulations
- Application to MEMS/NEMS
  - Unified simulation and design exploration of heterogeneous components

![](_page_46_Picture_10.jpeg)

![](_page_46_Picture_11.jpeg)

# Thank you !!!
