# Static Timing Analysis of a block for better Power, Performance and Area

Major Project Report

Submitted in Partial Fulfillment of the Requirements for the Degree of

#### MASTER OF TECHNOLOGY

IN

### ELECTRONICS AND COMMUNICATION ENGINEERING (VLSI DESIGN)

Submitted By Rhea Biji 19MECV15

Guided By Dr. Niranjan Devashrayee



Department Of Electronics and Communication Engineering Institute Of Technology Nirma University Ahmedabad-382481 May 2021

### Certificate

This is to certify that the M.Tech thesis entitled "Static Timing Analysis of a block for better Power, Performance and Area" submitted by Rhea Biji (Roll No: 19MECV15), towards the partial fulfillment of the requirements for the award of degree of Master of Technology in Electronics and Communication (VLSI Design) of Nirma University, Ahmedabad, is the record of work carried out by her under our supervision and guidance. In our opinion, the submitted work has reached a level required for being accepted for examination. The results embodied in this M.Tech Thesis, to the best of our knowledge, haven't been submitted to any other university or institution for award of any degree or diploma.

Internal Guide: Dr. Niranjan Devashrayee Adjunct Professor M.Tech(VLSI Design) Institute of Technology, Nirma University, Ahmedabad Dr. Usha Mehta PG-Coordinator M. Tech(VLSI Design) Professor (EC) Institute of Technology, Nirma University, Ahmedabad

Dr. Dhaval Pujara Professor and HOD Electronics and Communications Engineering Institute of Technology, Nirma University, Ahmedabad Dr. Rajesh N Patel Director Institute of Technology, Nirma University, Ahmedabad

#### **Statement of Originality**

I, Rhea Biji, Roll No: 19MECV15, give undertaking that the M.Tech thesis entitled "Static Timing Analysis of a block for better Power, Performance and Area" submitted by me, towards the partial fulfillment of the requirements for the degree of Master of Technology in Electronics & Communication Engineering (VLSI Design) of Institute of Technology, Nirma University, Ahmedabad, contains no material that has been awarded for any degree or diploma in any university or school in any territory to the best of my knowledge. It is the original work carried out by me and I give assurance that no attempt of plagiarism has been made. It contains no material that is previously published or written, except where reference has been made. I understand that in the event of any similarity found subsequently with any published work or any dissertation work elsewhere, it will result in severe disciplinary action.

Signature of Student Date: Place:

> Endorsed by Dr Niranjan Devashrayee (Signature of Guide)

### Acknowledgements

I take this opportunity to express my profound gratitude to my internal guide **Dr. Niranjan Devashrayee**, Adjunct Professor, M.Tech VLSI Design, for his exemplary guidance and constant encouragement throughout this project.

I am grateful to respected **Dr. Dhaval Pujara**, Head of Department, Electronics and Communication Engineering, Nirma University, and **Dr. Usha Mehta**, P.G. Coordinator, M.Tech VLSI Design, for their kind support and motivation and for providing a healthy research environment.

I express my sincere gratitude to my manager Mr. Kartik Ayyar and mentor Mr. Ankur Shukla for their valuable guidance during the project. Sincere thanks to my entire STA team at Qualcomm for the support and guidance. I also want to thank Qualcomm India Private Limited for assigning me such a project and guiding me through it.

I am grateful to **Dr. Rajesh N Patel** Hon'ble Director, Institute of Technology, Nirma University, Ahmedabad for the motivation extended throughout the course of this work.

I would also thank the Institution, all faculty members of Electronics and Communication Engineering Department, Nirma University, Ahmedabad for their special attention and suggestions towards the project work.

> Rhea Biji 19MECV15

#### Abstract

Timing analysis is an important step in the VLSI design flow. It validates whether the design can operate at the rated speed. The chip must meet not only the functional requirements but also the timing requirements. Static timing analysis (STA) is a technique to exhaustively verify the timing of a design. It performs timing checks for all possible paths and scenarios of a design, and it checks for timing violations without applying data vectors at the input pins. Thus, STA is a fast and efficient technique for verifying timing.

In this project, the timing runs of the design flow are carried out. In the initial stage of a project, the STA runs include a set of critical corners, which depend on the project. Other corners are added later as required, for exhaustive timing verification across all the scenarios the chip can encounter in real operation. STA checks include setup, hold and timing DRCs such as maximum capacitance, maximum data and clock transition, minimum period and minimum pulse width, along with other checks such as clock duty cycle distortion and recovery and removal time. Once the timing runs are completed, reports are dumped out and analysed thoroughly. Based on that analysis, the setup, hold and timing DRC violations on the violating paths are fixed using various techniques. Depending on the type of violation, these include upsizing cells, adding buffers in the path, breaking the net and swapping threshold voltage levels.

### List of Abbreviations

| Abbreviation | Expansion                          |
|--------------|------------------------------------|
| DRC          | Design Rule Check                  |
| ECO          | Engineering Change Order           |
| PVT          | Process Voltage and Temperature    |
| SDC          | Synopsys Design Constraints        |
| SDF          | Standard Delay Format              |
| SPEF         | Standard Parasitic Exchange Format |
| STA          | Static Timing Analysis             |
| VLSI         | Very Large Scale Integration       |

# Contents

- Certificate
- Statement of Originality
- Acknowledgements
- Abstract
- List of Abbreviations
- List of Figures
- List of Tables
- 1 Introduction
  - 1.1 Company Profile
  - 1.2 Group Profile
  - 1.3 Motivation
  - 1.4 Objectives
  - 1.5 Thesis Organization
- 2 STA concepts
  - 2.1 STA at different stages
  - 2.2 Configuration of STA environment
  - 2.3 Timing Verification
  - 2.4 Crosstalk and Noise Analysis
- 3 Flowchart of the project
  - 3.1 Introduction
  - 3.2 STA nodes
    - 3.2.1 Detailed explanation of each node
  - 3.3 Timing Analysis
    - 3.3.1 Two ways to analyze the timing paths
    - 3.3.2 IR drop aware STA analysis
- 4 Timing fixes
  - 4.1 Fixing Setup Violations
  - 4.2 Fixing Hold Violations
  - 4.3 Fixing Timing DRC
  - 4.4 Area and Power recovery
- 5 Results and Conclusion
  - 5.1 STA results of project 1
  - 5.2 STA results of project 2
    - 5.2.1 Hierarchical run
    - 5.2.2 Flat run
  - 5.3 Conclusion

# List of Figures

| Figure | Title                                     |
|--------|-------------------------------------------|
| 2.1    | Digital design flow                       |
| 2.2    | STA I/P's and O/P's                       |
| 2.3    | Setup timing check                        |
| 2.4    | Hold timing check                         |
| 2.5    | Glitch in victim net due to aggressor net |
| 2.6    | Positive crosstalk                        |
| 2.7    | Negative crosstalk                        |
| 3.1    | Flow diagram                              |
| 3.2    | Nodes in an STA                           |

# List of Tables

| Table | Title                                                   |
|-------|---------------------------------------------------------|
| 5.1   | Setup and Hold uncertainty for High performance mode    |
| 5.2   | Setup and Hold uncertainty for Nominal performance mode |
| 5.3   | Setup and Hold uncertainty for Low performance mode     |
| 5.4   | Extraction summary for various iterations               |
| 5.5   | Setup summary                                           |
| 5.6   | Extraction summary for various iterations               |
| 5.7   | Setup summary                                           |
| 5.8   | Hold summary                                            |
| 5.9   | Extraction summary for various iterations               |
| 5.10  | Setup summary                                           |
| 5.11  | Hold summary                                            |

# Chapter 1

### Introduction

#### 1.1 Company Profile

Qualcomm is a wireless technology company engaged in the development, launch and expansion of technologies such as fifth-generation (5G) wireless. The company operates through three segments: QCT (Qualcomm CDMA Technologies), QTL (Qualcomm Technology Licensing) and QSI (Qualcomm Strategic Initiatives). QCT develops and supplies integrated circuits and software based on 3G/4G/5G and other technologies for use in mobile devices, wireless networks and consumer electronic devices. QTL grants licenses to use portions of the intellectual property portfolio, which includes patent rights for the sale and management of certain wireless products. QSI makes strategic investments.

### 1.2 Group Profile

I am an Interim Engineering Intern in the QCT DSP core department, working in the STA team. The team focuses on the timing closure of various cores in a chip. The work proceeds in iterations, with each team giving feedback to the others based on its analysis to improve the results. Once the place and route team releases its database, teams such as STA, physical verification, low power and logical equivalence check run their validation checks and provide feedback. Timing violations are fixed over many timing ECO cycles.

#### 1.3 Motivation

In VLSI design, the three performance parameters are power, speed and area. A design should meet its target frequency and its power and area budgets. Timing analysis validates that the chip works at the rated speed. In nanometer technology nodes, coupling between interconnect traces causes noise and crosstalk. This can limit the operating speed of a design and must be taken into account during timing analysis. Static timing analysis performs exhaustive timing verification of a design. Signal integrity issues and on-chip variations are also included in STA to verify that the design is robust.

#### 1.4 Objectives

- Static timing analysis runs for all the scenarios of a design.
- Crosstalk, noise and IR drop aware STA.
- Analysing extraction, setup and hold summary and fixing the violations.
- Analysing the Timing DRCs and noise violations and fixing the violations.

#### 1.5 Thesis Organization

#### Chapter 1: Introduction

This chapter includes a brief description of the project. Sections 1.1 and 1.2 give an overview of the company and my role in the STA team. Section 1.3 describes the motivation behind the project and Section 1.4 lists its objectives.

#### Chapter 2: STA concepts

This chapter discusses the concepts related to STA. Section 2.1 shows the stages in a VLSI design flow where STA is performed. Section 2.2 describes the configuration needed to set up an environment for the STA runs. Section 2.3 explains in detail the timing checks done to verify a design, along with the types of analysis. Section 2.4 explains crosstalk and noise analysis and the techniques used to fix crosstalk violations.

#### Chapter 3: Flowchart of the project

This chapter describes the actual flow followed for timing analysis. Also, the timing fix techniques are explained. Section 3.1 introduces the flow of the project. Section 3.2 gives a detailed explanation of the STA nodes. Section 3.3 describes the ways in which timing paths are analysed and explains IR drop aware STA.

#### Chapter 4: Timing Fixes

Section 4.1 explains the techniques for fixing setup violations and Section 4.2 those for fixing hold violations. Section 4.3 describes fixing the timing DRCs. Section 4.4 describes the area and power recovery flow.

#### Chapter 5: Results and Conclusion

Section 5.1 shows the results obtained in project 1 and their significance. Section 5.2 shows the STA results of project 2 with a detailed explanation. Section 5.3 concludes the project.

# Chapter 2

### STA concepts

#### 2.1 STA at different stages

In a VLSI design flow, timing checks are done at various stages so that the worst and most critical paths are known and corrective action can be taken. The flow is shown in Figure 2.1. Once the design is synthesized and mapped into a technology-specific netlist, STA is run so that the worst timing paths can be identified and the netlist optimized. In the logical design phase, interconnects are ideal and defined by a wireload model, and clock trees are ideal, so only an estimate can be obtained. The RC values are estimated based on the fanout of the cells. [1] Analysis at this stage is still useful for understanding the overall design and its characteristics.

In the physical design stage, STA is run once the cells have been placed. After this stage the global routes are available, and the routing distance is estimated from them for extraction of the RC parasitics. Once clock tree synthesis and detailed routing are completed, the clock trees and routing are real, so actual extracted RC values are available for accurate STA. [1] After many iterations and rounds of feedback between teams, final timing closure of a design can be achieved.

#### 2.2 Configuration of STA environment

STA of a technology-mapped netlist (where cells are defined and taken from the timing libraries) depends on several parameters: interconnect modelling (ideal or real routes), clock modelling (ideal or real, i.e. propagated) and signal integrity (whether crosstalk analysis is included or not).



Figure 2.1: Digital design flow

The SDF file provides the cell and interconnect delays of the design. The SPEF file provides the RC parasitic information of the nets in a design. It is a compact format that still provides detailed parasitic information. The SDC file specifies timing-related information such as clock period, uncertainty and latency, along with the timing constraints to be followed for the input and output paths of the design. The Liberty file contains the propagation delay of a cell, slew (both rising and falling) and skew. The inputs and outputs of the STA tool are shown in Figure 2.2. The outputs are mainly the timing summary, the noise reports and the timing window file, which records the switching times of aggressor and victim nets so that overlaps can be seen. An overlap signifies that crosstalk can occur, and signal integrity aware STA is done for that timing window.

There are many STA checks: setup timing as shown in Figure 2.3, hold timing as shown in Figure 2.4, clock duty cycle distortion, data-to-data paths, clock-gating checks, and asynchronous checks such as recovery and removal. Timing DRCs such as maximum capacitance, maximum data transition, maximum clock transition, minimum pulse width and minimum period violations, and signal integrity issues, i.e. noise violations, also have to be analysed.

Figure 2.2: STA I/P's and O/P's



Figure 2.3: Setup timing check



Figure 2.4: Hold timing check

Setup timing check:

$$T_{launch} + T_{ck2q} + T_{comb} < T_{capture} + T_{cycle} - T_{setup}$$

Hold timing check:

$$T_{launch} + T_{ck2q} + T_{dp} > T_{capture} + T_{hold}$$
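The two checks above can be read as slack calculations, where slack is the margin by which the inequality holds. The following is a minimal Python sketch under that reading; the function names and all the delay values are illustrative assumptions, not figures from the project.

```python
def setup_slack_ps(t_launch, t_ck2q, t_comb, t_capture, t_cycle, t_setup):
    """Setup slack = required time - arrival time (all values in ps)."""
    arrival = t_launch + t_ck2q + t_comb
    required = t_capture + t_cycle - t_setup
    return required - arrival


def hold_slack_ps(t_launch, t_ck2q, t_dp, t_capture, t_hold):
    """Hold slack = arrival time - required time (all values in ps)."""
    arrival = t_launch + t_ck2q + t_dp
    required = t_capture + t_hold
    return arrival - required


# Illustrative numbers only (integer picoseconds), not project data.
setup = setup_slack_ps(t_launch=100, t_ck2q=80, t_comb=700,
                       t_capture=120, t_cycle=1000, t_setup=50)
hold = hold_slack_ps(t_launch=100, t_ck2q=80, t_dp=30,
                     t_capture=120, t_hold=40)
print(f"setup slack = {setup} ps, hold slack = {hold} ps")
```

A positive slack means the corresponding check is met; a negative slack marks a violating path.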

The limitations of STA are in handling indeterminate states, reset sequences and functional behavior across clock cycles. Only logic levels 0 and 1 can be propagated through the paths; the value X cannot be propagated for the analysis. STA cannot check whether all the flops have gone into the reset state and reached the required logic level. As this is a static type of timing analysis, it cannot verify functional correctness. Despite these limitations, STA is very useful for exhaustively verifying the timing of a design. [1]

#### 2.3 Timing Verification

A designer has to consider three parameters: speed, power and area. If high-speed cells are used, the power budget of the design may not be met. A designer must keep these tradeoffs in mind and reach an optimal solution that satisfies both the timing and the power budget.

**On-chip variations:** In practice, two regions of the same chip may not be at the same PVT conditions. These are called on-chip variations, and because of them the cell and net delays on different paths need to be derated accordingly. Both setup and hold timing checks are done with these derate values included. The three types of operating conditions are worst-case slow, best-case fast and nominal/typical.

In the worst-case slow condition, the process is slow, the temperature is high and the voltage is low. In nanometer technologies, however, temperature inversion occurs because the Vt effect dominates, so delay decreases at high temperature. In the best-case fast condition, the process is fast, the temperature is low and the voltage is high. In the typical condition, all parameters are nominal.

**Modes of on-chip variation:** The traditional on-chip variation (OCV) mode uses the same derate values regardless of the type of cell, which is pessimistic. Advanced on-chip variation (AOCV) addresses this pessimism: its derate values depend on the logic depth and the distance of the cell. With increasing circuit complexity, however, AOCV has certain drawbacks. One of them is that it does not consider the impact of the cell's transition (slew) and load on the delay variation. For technologies below 16nm, parametric on-chip variation (POCV) is used, in which cell delays are calculated statistically. This is a very accurate approach in STA. [2]
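To make the difference between a flat OCV derate and a depth-dependent AOCV derate concrete, the sketch below applies both to the same late path. The derate factors, the depth table and the stage delays are hypothetical values chosen for illustration; real values come from the derate tables supplied for the technology, and the distance dimension of AOCV is left out for brevity.

```python
# All numbers below are illustrative assumptions, not library data.
FLAT_LATE_DERATE = 1.10  # traditional OCV: one late factor for every cell

# Hypothetical AOCV table of (minimum logic depth, late derate).
# Deeper paths get a smaller derate in this sketch.
AOCV_LATE_DERATE_BY_DEPTH = [(1, 1.12), (4, 1.08), (8, 1.05), (16, 1.03)]


def aocv_late_derate(depth):
    """Pick the derate of the largest table depth that does not exceed depth."""
    chosen = AOCV_LATE_DERATE_BY_DEPTH[0][1]
    for table_depth, derate in AOCV_LATE_DERATE_BY_DEPTH:
        if depth >= table_depth:
            chosen = derate
    return chosen


def late_path_delay_ps(stage_delays_ps, derate):
    """Apply a single late derate factor to the summed stage delays."""
    return sum(stage_delays_ps) * derate


stages = [50] * 8  # eight cells of 50 ps each
ocv_delay = late_path_delay_ps(stages, FLAT_LATE_DERATE)
aocv_delay = late_path_delay_ps(stages, aocv_late_derate(len(stages)))
print(f"OCV late delay:  {ocv_delay:.1f} ps")
print(f"AOCV late delay: {aocv_delay:.1f} ps")
```

With these assumed numbers the AOCV delay is smaller than the flat OCV delay for the eight-stage path, which is the reduction in pessimism described above.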

What is a scenario? A scenario is a combination of a parasitic corner, an operating mode and a PVT corner. PVT corners were discussed above. Operating modes are of two types: functional and test. Functional modes include high-speed, nominal and slow clocks, sleep and debug. Test modes include scan capture, scan shift, JTAG and BIST. RC parasitic corners include nominal, MaxC, MinC, MaxRC and MinRC. As the names suggest, the interconnect resistance and capacitance are at their maximum or minimum in these corners. The largest delays correspond to max path analysis and the smallest delays to min path analysis. [1]

STA is run for many such scenarios so that it covers all the conditions a chip can be in while in use. This ensures that the timing requirement is met even if metal width or metal etch varies during manufacturing, or the PVT conditions vary with where the chip is used.
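Since a scenario is a combination of mode, PVT corner and RC corner, the full scenario list can be sketched as a Cartesian product. The short names below are illustrative labels for a subset of the modes and corners named above, and the naming scheme is an assumption; a real project would run only the subset of combinations it considers relevant.

```python
from itertools import product
from typing import NamedTuple


class Scenario(NamedTuple):
    mode: str
    pvt: str
    rc: str

    @property
    def name(self) -> str:
        # Hypothetical naming scheme: mode_pvt_rc
        return f"{self.mode}_{self.pvt}_{self.rc}"


# Illustrative labels only; the lists are not the project's actual corner set.
MODES = ["functional", "scan_shift", "scan_capture"]
PVT_CORNERS = ["wc_slow", "typical", "bc_fast"]
RC_CORNERS = ["nominal", "maxc", "minc", "maxrc", "minrc"]


def build_scenarios(modes, pvt_corners, rc_corners):
    """Return every (mode, PVT corner, RC corner) combination."""
    return [Scenario(m, p, r) for m, p, r in product(modes, pvt_corners, rc_corners)]


scenarios = build_scenarios(MODES, PVT_CORNERS, RC_CORNERS)
print(len(scenarios), "scenarios, first:", scenarios[0].name)
```

Even this small set of labels produces 45 scenarios, which illustrates why the number of STA runs grows quickly as corners are added.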

**Single scenario session:** A single scenario session is invoked to examine timing violations in detail. After detailed analysis, those violations are fixed using the various techniques available for setup and hold. For final timing closure of a chip, setup, hold, timing DRC and any other violations must be fixed using the appropriate techniques. Only violations that have been waived can be left unfixed. Towards the end of timing closure, waivers expressed as a percentage are taken into consideration. In general, a chip with unfixed setup violations will not work at the rated speed, but it will work at a reduced frequency, which means lower performance than expected. The chip will at least work, though with degraded performance. This is not the case with hold violations: a chip that does not satisfy hold timing has to be discarded.

**Multi-mode multi-corner scenario session:** Multi-mode multi-corner analysis is also available. Compared with single scenario sessions, this parallel approach can be advantageous in terms of STA runtime. In single session analysis, the design file and the parasitic information (SPEF) file have to be loaded individually for each mode and corner, so the files are loaded as many times as there are scenarios to run. A large amount of resources is therefore spent on single session runs of the whole design, and setting up the environment and the analysis scripts is very complex. Multi-mode multi-corner analysis saves resources because the design and parasitic information need to be loaded only once or twice. The runtime and the complexity of setting up the environment and the analysis scripts are also reduced. An additional advantage is that fixes are evaluated across all scenarios together, so fixing a violation in one scenario does not introduce violations in others. [1]

#### 2.4 Crosstalk and Noise Analysis

In nanometer technologies, noise must be included in the analysis. The reasons are the larger number of metal layers, high-frequency devices, increased routing density and interconnect effects. Crosstalk glitch analysis and crosstalk delay analysis are therefore included.

• Glitch analysis: The coupling capacitance between two nets routed close to each other causes a glitch on the neighbouring net, as shown in Figure 2.5. The aggressor net causes a glitch on the victim net whose height and width depend on the slew of the aggressor, the drive strength of the victim net and the victim's ground capacitance. If the glitch is within tolerable limits, it does not affect circuit performance. Otherwise it can be harmful to the circuit.



Figure 2.5: Glitch in victim net due to aggressor net

The effect of multiple aggressors is also considered, along with their switching times. The timing windows are analysed, the glitch magnitude is calculated, and the worst glitch is taken for a pessimistic calculation.

• Delay analysis: If the aggressor net is in steady state, there is no crosstalk. If the aggressor net switches in the opposite direction to the victim net, the delay of the victim net increases; this is used in max path analysis, as shown in Figure 2.6. If the aggressor net switches in the same direction as the victim net, the delay of the victim net decreases; this is used in min path analysis, as shown in Figure 2.7.



Figure 2.6: Positive crosstalk



Figure 2.7: Negative crosstalk

Aggressor victim timing and functional correlation is also taken into consideration in the crosstalk delay analysis similar to the crosstalk glitch analysis.

**Timing verification using crosstalk delay:** Signal-integrity based STA verifies the timing of the design under the worst conditions for setup and hold. [3] The worst case is applied to both data and clock paths. For the setup timing check, the worst case occurs when the capture clock path has negative crosstalk (Figure 2.7) and the launch clock path and data path have positive crosstalk (Figure 2.6). For the hold timing check, the worst case occurs when the capture clock path has positive crosstalk (Figure 2.6) and the launch clock path and data path have negative crosstalk (Figure 2.7).
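The worst-case combinations described above can be sketched by giving each path segment a base delay and a crosstalk delta, then applying the delta with the sign that hurts each check. The `Segment` class, the function names and all the numbers are illustrative assumptions; here the data segment is taken to include the clock-to-Q delay.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    base_ps: int         # delay without crosstalk
    xtalk_delta_ps: int  # magnitude of the crosstalk delay change

    def slowest(self) -> int:
        """Delay with positive crosstalk (aggressor switching oppositely)."""
        return self.base_ps + self.xtalk_delta_ps

    def fastest(self) -> int:
        """Delay with negative crosstalk (aggressor switching the same way)."""
        return self.base_ps - self.xtalk_delta_ps


def worst_setup_slack_ps(launch, data, capture, cycle_ps, setup_ps):
    # Setup worst case: launch clock and data slowed, capture clock sped up.
    arrival = launch.slowest() + data.slowest()
    required = capture.fastest() + cycle_ps - setup_ps
    return required - arrival


def worst_hold_slack_ps(launch, data, capture, hold_ps):
    # Hold worst case: launch clock and data sped up, capture clock slowed.
    arrival = launch.fastest() + data.fastest()
    required = capture.slowest() + hold_ps
    return arrival - required


# Illustrative values only (picoseconds).
launch = Segment(100, 5)
capture = Segment(120, 5)
setup_slack = worst_setup_slack_ps(launch, Segment(780, 20), capture, 1000, 50)
hold_slack = worst_hold_slack_ps(launch, Segment(110, 10), capture, 40)
print(f"setup slack = {setup_slack} ps, hold slack = {hold_slack} ps")
```

Setting every `xtalk_delta_ps` to zero gives the slack without crosstalk, so the two results can be compared to see how much margin the crosstalk delay consumes.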

**Techniques for noise avoidance:** The first technique is to place shield wires so that coupling between critical signal wires is minimized. The second is to keep the wire spacing large enough that coupling between wires is minimal. A block can also be isolated by adding routing halos along its boundary.

How to fix crosstalk violations? The first method is to increase the drive strength of the victim net, by upsizing the driver cell or by inserting buffers. Another method is to increase the wire spacing between nearby victim and aggressor nets.

# Chapter 3

# Flowchart of the project

### 3.1 Introduction

In the flow of a project, runs are done on many nodes based on specific teams like Synthesis, Place and Route, Clock tree synthesis, STA etc.

Validation checks: After running all the place and route stages, the PnR team releases its database to teams such as physical verification, power delivery network, logical equivalence check, low power and STA for their validation checks. Feedback from those analyses is given back to PnR so that placement and routing can be improved in further runs.

### 3.2 STA nodes

In STA, several nodes need to be run, and reports are dumped out for each node for further analysis. The major nodes are the place and route input database, constraints, extraction and the timing node, STA. The reports analysed cover setup, hold, parasitic extraction and timing DRCs such as maximum capacitance, minimum period, minimum pulse width, noise violations, and maximum data and clock transition. The STA team creates engineering change orders for timing fixes and passes them to the relevant teams, as shown in Figure 3.1. The next section explains each STA node in detail.



Figure 3.1: Flow diagram

#### 3.2.1 Detailed explanation of each node:

As seen in Figure 3.2, the nodes in an STA run are Library, Place and Route input, Extraction, constraints and STA.



Figure 3.2: Nodes in an STA

- **Input scripts:** Before starting the runs, the initial environment needs to be set up so that the STA run is carried out smoothly. The paths to the scripts that dump out the output reports must be correct. The memory and disk space requirements of the timing runs must also be taken care of.
- **Library:** The library team generates and releases versions of the timing libraries, taking the transition limits into account. The library versions change when there is a change in the design Verilog (.v), the design exchange format (.def) file holding the physical layout, the library or the GDS. For IPs, macros, memories and standard cells, feedback is given to the library team if the library or transition limits are exceeded. The latest version needs to be picked up for all the STA runs.
- **Extraction node:** The input to the extraction node is the design exchange format (.def) file, which has the net locations, the physical location of the block and the pin-to-pin connections. The output of extraction is the set of SPEF files needed for all the scenarios. The parasitic information consists of the extracted RC values of the nets. It needs to be extracted properly and recorded in the directory. Opens and shorts in a design change the setup and hold timing. The timing results and the extraction summary therefore need to be analysed for further timing fixes.
- **Constraints:** There is a constraints directory in which clock definitions, false paths, multicycle paths, input and output delays, maximum and minimum capacitance, fanout and transition values are given, so that the STA team can use those timing constraints for efficient and faster timing closure.
- **STA node:** In the STA node, all the corners and scenarios that can occur are considered and timing runs are performed. Different teams need different STA corner analyses, so separate nodes are run for power, extracted timing models and DFT: a timing session loaded with the Unified Power Format for power readout, a timing session for timing window file generation, an extracted timing model run for library generation, and at-speed scan-in and scan-out runs for DFT. DFT-related violations are seen in functional mode as well, so they must be fixed and the results provided to the DFT team. Reports of setup and hold timing, clock insertion delay and clock skew, along with the SDF file, are dumped.

- **Duty cycle distortion:** Clock duty cycle distortion can occur in the fast-slow and slow-fast corners, which causes duty cycle distortion violations. In the slow-fast (sf) case, nMOS is slow and pMOS is fast; in the fast-slow (fs) case, nMOS is fast and pMOS is slow. So in the sf case, charging through pMOS is fast and discharging through nMOS is slow, which means the rise time of the signal is fast and the fall time is slow. In the fs case, discharging through nMOS is fast and charging through pMOS is slow, which means the fall time is fast and the rise time is slow. In both cases the duty cycle is distorted and the violations need to be fixed. This STA check is called clock DCD (duty cycle distortion).
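The effect of unequal rise and fall delays on the duty cycle can be illustrated with a simple model of a clock passing through a chain of non-inverting buffers. The model, the function name and the per-stage delays are assumptions made for illustration: each buffer is given one delay for a rising output edge and one for a falling output edge, and the 50% input clock is hypothetical.

```python
def output_duty_cycle(period_ps, input_high_ps, stage_delays_ps):
    """Duty cycle at the end of a chain of non-inverting clock buffers.

    stage_delays_ps is a list of (rise_delay_ps, fall_delay_ps), one per buffer.
    """
    total_rise = sum(rise for rise, _ in stage_delays_ps)
    total_fall = sum(fall for _, fall in stage_delays_ps)
    # The rising edge is delayed by total_rise and the falling edge by
    # total_fall, so the high pulse changes by their difference.
    output_high_ps = input_high_ps + (total_fall - total_rise)
    return output_high_ps / period_ps


# Illustrative values only: 1000 ps period, 50% duty cycle at the source.
sf_chain = [(40, 55)] * 4  # slow nMOS, fast pMOS: falling edges are slower
fs_chain = [(55, 40)] * 4  # fast nMOS, slow pMOS: rising edges are slower

sf_duty = output_duty_cycle(1000, 500, sf_chain)
fs_duty = output_duty_cycle(1000, 500, fs_chain)
print(f"sf duty cycle = {sf_duty:.2%}, fs duty cycle = {fs_duty:.2%}")
```

In this sketch the sf corner stretches the high pulse and the fs corner shrinks it, so both corners move the duty cycle away from 50% in opposite directions.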

#### 3.3 Timing Analysis

In the STA node, timing runs are carried out for various corners. The timing paths are analysed by the methods explained below.

#### 3.3.1 Two ways to analyze the timing paths

First is the Graph based analysis and second is the Path based analysis.

- In graph-based analysis mode, the worst input transition of a cell is considered and used to calculate the output transition of that cell. The analysis continues with the worst slew values at every subsequent cell in the timing path. Graph-based analysis therefore gives very pessimistic results for a timing path.
- In path-based analysis mode, as the name suggests, only the cell and net delays of the elements actually present in the timing path are considered. The final slew is calculated from the delays along that path. This mode therefore uses path-specific values rather than the worst-case values of the previous mode.

Two options are available in the path-based analysis mode: path and exhaustive.

- The path option has a shorter runtime but lower accuracy. The reported paths may or may not be the ones with the worst slack, so it is suited to an initial analysis to understand the impact of path-based analysis.
- For an accurate analysis, the exhaustive option is chosen. It ensures that the reported paths are the paths with the worst slack value.
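
The difference between the two analysis modes can be sketched with a toy delay model. In the example below the linear dependence of cell delay and output slew on input slew, the coefficients and the slew values are all assumptions made for illustration; a real timing tool reads these values from library tables. The function names are hypothetical.

```python
def cell_delay(input_slew_ps):
    """Toy model: delay grows linearly with input slew (assumed coefficients)."""
    return 20.0 + 0.5 * input_slew_ps


def output_slew(input_slew_ps):
    """Toy model: output slew grows linearly with input slew."""
    return 10.0 + 0.4 * input_slew_ps


def path_delay(start_slew_ps, side_slews_per_stage, graph_based):
    """Sum the cell delays along one path.

    side_slews_per_stage lists, for each cell, the slews on its other inputs.
    Graph-based mode uses the worst slew on any input of the cell, while
    path-based mode uses only the slew that arrives along this path.
    """
    slew = start_slew_ps
    total = 0.0
    for side_slews in side_slews_per_stage:
        if graph_based:
            slew = max([slew, *side_slews])
        total += cell_delay(slew)
        slew = output_slew(slew)
    return total


# Three cells on the path; the first and third have a slower side input.
SIDE_SLEWS = [[120.0], [], [90.0]]

gba = path_delay(40.0, SIDE_SLEWS, graph_based=True)
pba = path_delay(40.0, SIDE_SLEWS, graph_based=False)
print(f"graph-based delay: {gba:.1f} ps, path-based delay: {pba:.1f} ps")
```

The graph-based delay can never be smaller than the path-based delay in this model, which is the pessimism described above.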

#### 3.3.2 IR drop aware STA analysis

This analysis is done so that voltage variations occurring in the actual chip are accounted for and the resulting timing violations are fixed beforehand. The timing numbers, such as worst negative slack, total negative slack and the number of violating paths, are analysed before and after IR drop is considered. This is done for all the critical corners where voltage drop can occur. In addition, timing is over-fixed by including additional margin in the clock uncertainty. This ensures that there will be no violation even if the voltage is reduced at different points of the chip.

The next chapter describes the STA checks and the techniques to fix setup, hold and timing DRC (TDRC) violations. Feedback is given to other teams, and many such iterations of ECO cycles happen before the final timing closure of a chip.

### Chapter 4

## Timing fixes

In Chapter 2, a detailed description of setup and hold timing was given. If the setup and hold timings are not met, violations occur on the respective timing paths, and the slack value for the violating paths is negative. We need to fix those violating paths and bring the slack up to zero or, if possible, to a positive value.

#### 4.1 Fixing Setup Violations

A setup violation signifies that the nth data launched from the launch flip-flop does not arrive at the data pin of the capture flop before the capture time of the capture clock. The data path, which starts from the launch flop's clock pin and includes the clock-to-Q delay of the flop and the combinational path delay, is too slow compared to the clock path. To mitigate this issue we need to make the data path faster.

- Reduction in buffering in data paths: The delay of the cells in the path and the number of buffers in the path must be reduced, so that the effective data path delay goes down. When buffering is reduced, care must be taken that the increase in wire delay does not outweigh the saving in cell delay.
- Reduce interconnect delay: Another way of making the data path faster is to place two inverters in place of a buffer, so that the transition time reduces, which helps in reducing the interconnect delay (RC delay of a wire).
- Vt swaps: Swapping cells to a lower threshold voltage can decrease the propagation delay, as this reduces the transition time. However, this method has the disadvantage of increasing the leakage power. Power and speed are therefore a tradeoff, and an optimal solution needs to be chosen according to the timing and power budget available for a project.
- Increasing drive strength of cell: Yet another method of making the data path faster is increasing the drive strength of a cell. This is called upsizing, and it results in a faster cell at the cost of increased area and power consumption. Decisions are made according to the limits on each of the parameters, namely performance, power and area. The fixes above focus on making the data path faster, so the data path is taken into consideration first.
- Positive skew in clock: If setup is not fixed by the above methods, the capture clock path can be made slower by delaying the clock in that path. The required amount of positive skew is added to the clock to fix the setup violation. This must be done with utmost care, as other paths that depend on this clock may be affected. In all cases, slack = required time - arrival time must not be negative to ensure that there is no setup violation.
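
The setup slack relation used above can be written out as a short calculation. The sketch below assumes a single-cycle path between two flops; the function name and every number in it are hypothetical and serve only to show how a data path fix and a positive skew each change the slack.

```python
def setup_slack(period, launch_clk, capture_clk, clk_to_q, comb_delay,
                setup_time, uncertainty):
    """Setup slack in ps: required time minus arrival time."""
    arrival = launch_clk + clk_to_q + comb_delay
    required = period + capture_clk - setup_time - uncertainty
    return required - arrival


# Hypothetical path, all values in ps.
path = dict(period=1000, launch_clk=200, capture_clk=200, clk_to_q=80,
            comb_delay=850, setup_time=40, uncertainty=72)

initial = setup_slack(**path)

# Data path fix: upsizing or a Vt swap removes 30 ps of combinational delay.
path["comb_delay"] -= 30
after_data_fix = setup_slack(**path)

# Clock path fix: 20 ps of positive skew delays the capture clock.
path["capture_clk"] += 20
after_skew = setup_slack(**path)

print(initial, after_data_fix, after_skew)
```

With these numbers the path starts with a negative slack, improves after the data path fix, and is met only after the positive skew is added, which mirrors the order in which the fixes are tried.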

#### 4.2 Fixing Hold Violations

Hold violations occur when the data path is faster than the clock path, so that the correct data is not captured. The newly launched (n+1)th data arrives before the capture clock has captured the nth data, so the nth data is missed and the (n+1)th data is captured instead. To resolve this issue, delay can be added in the data path to make it slower.

- Addition of buffers in data paths: Hold buffers, inverter pairs or delay cells can be included in the data path to fix the hold violation. Care must be taken that functionality is not disturbed and that no new setup violations are created while fixing the hold violations. So, we need to look at all the paths, startpoints and endpoints before fixing any violation, because setup and hold violations are opposite in nature.
- Downsize the cell: In the setup fix, upsizing the cell was the solution, while in the hold fix the solution is to downsize the cell in the data path. We reduce the drive strength of the cell in the data path so that the overall delay increases and the data path becomes slower. [4]
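
The hold fix by buffer insertion can be sketched in the same way as the setup calculation. The example below assumes that launch and capture happen on the same clock edge and that every inserted buffer adds the same delay; the function names, the buffer delay and the assumed setup slack of the path are hypothetical values for illustration. A real analysis uses minimum delays for hold and maximum delays for setup, which this sketch does not distinguish.

```python
import math


def hold_slack(launch_clk, capture_clk, clk_to_q, comb_delay, hold_time,
               uncertainty):
    """Hold slack in ps: arrival time minus required time (same clock edge)."""
    arrival = launch_clk + clk_to_q + comb_delay
    required = capture_clk + hold_time + uncertainty
    return arrival - required


def buffers_needed(slack_ps, buffer_delay_ps):
    """Smallest number of identical delay buffers that removes the violation."""
    if slack_ps >= 0:
        return 0
    return math.ceil(-slack_ps / buffer_delay_ps)


# Hypothetical fast path, all values in ps.
path = dict(launch_clk=200, capture_clk=230, clk_to_q=50, comb_delay=20,
            hold_time=30, uncertainty=28)
BUFFER_DELAY = 12
SETUP_SLACK_BEFORE = 300  # assumed setup slack of the same path

before = hold_slack(**path)
count = buffers_needed(before, BUFFER_DELAY)
path["comb_delay"] += count * BUFFER_DELAY
after = hold_slack(**path)

# The added delay is paid for out of the setup margin of the same path.
setup_after = SETUP_SLACK_BEFORE - count * BUFFER_DELAY
print(before, count, after, setup_after)
```

The last line of the calculation is the check mentioned above: the hold fix is acceptable only if the setup slack of the same path stays non-negative after the buffers are added.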

#### 4.3 Fixing Timing DRC

There are various timing design rule checks (TDRC), such as maximum capacitance, maximum transition on data and clock paths, and maximum fanout. A violation of the maximum capacitance check shows that the driver is loaded beyond the limit specified in the library, so the driver cell needs to be upsized. Breaking a long net by adding buffers in between can also help in reducing the transition times. Cloning of a cell can also be done: suppose cell 'A' was driving 8 fanout cells and that is causing a violation; it can be cloned so that 'A' drives 4 fanouts and the replica of 'A' drives the other 4. Thus, whichever net is degraded and is causing delay and violating the paths needs to be improved using the above techniques.
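
The cloning example with cell 'A' can be sketched as a simple split of the load list. The function name, the clone naming scheme and the load names below are hypothetical, and the loads are split in list order only for simplicity; in a physical design the loads would normally be grouped by their placement.

```python
def clone_driver(driver, loads, max_fanout):
    """Split loads into groups of at most max_fanout, one group per driver copy."""
    groups = [loads[i:i + max_fanout] for i in range(0, len(loads), max_fanout)]
    names = [driver] + [f"{driver}_clone{i}" for i in range(1, len(groups))]
    return dict(zip(names, groups))


# Cell 'A' drives 8 loads, but the assumed fanout limit is 4.
loads = [f"U{i}" for i in range(8)]
assignment = clone_driver("A", loads, max_fanout=4)

for name, group in assignment.items():
    print(name, "drives", group)
```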

Certain limits on the clock also need to be satisfied for proper capturing of the data. These are minimum pulse width and minimum period. If the pulse is so narrow that the edge cannot be detected properly, the flip-flop does not capture the data correctly. The minimum period of the clock must also be met. Noise reports are also analysed so that crosstalk does not cause violations in the paths.

#### 4.4 Area and Power recovery

There is an area and power budget for a project, and those limits are considered throughout the design flow. If additional area and power can be recovered, it is always beneficial. If a timing path in the design is met with a large positive margin, some cells in that path can be downsized until the path is met with a smaller but still positive margin. ECOs generated by downsizing cells help in recovering some area, and dynamic power recovery is also possible. For leakage power recovery, threshold voltage swaps are done, taking the slack value of the timing path into account.
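
The idea of downsizing cells on a path with spare margin can be sketched as a greedy selection. This is only an illustrative heuristic and not the recovery flow of the tool: the function name, the candidate cells, their area savings, their delay penalties and the margin are all hypothetical, and the sketch looks at a single path, whereas a real flow re-times every path that a resized cell belongs to.

```python
def recover_area(slack_ps, candidates, margin_ps):
    """Greedily downsize cells while the path slack stays at or above margin_ps.

    candidates is a list of (cell, area_saved_um2, delay_added_ps) tuples.
    Returns the chosen cells and the slack left on the path.
    """
    chosen = []
    # Try the cells with the best area saving per ps of added delay first.
    ranked = sorted(candidates, key=lambda c: c[1] / c[2], reverse=True)
    for cell, _area_saved, delay_added in ranked:
        if slack_ps - delay_added >= margin_ps:
            slack_ps -= delay_added
            chosen.append(cell)
    return chosen, slack_ps


CANDIDATES = [("U1", 0.6, 10), ("U2", 0.9, 30), ("U3", 0.4, 5), ("U4", 0.5, 25)]

chosen, remaining = recover_area(slack_ps=60, candidates=CANDIDATES, margin_ps=20)
print("downsized:", chosen, "remaining slack:", remaining, "ps")
```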

### Chapter 5

## Results and Conclusion

STA runs are done across various scenarios: PVT conditions, parasitic corners and different operating modes. The frequency values for the high, nominal and low voltage modes are given below. These vary with the RC corner, the PVT corner, and the functional and test modes.

### 5.1 STA results of project 1

The following results are for a particular corner in the high, nominal and low voltage modes.

• High performance mode: 1.1 GHz

This is the high performance mode, where the clock uncertainty is the least in comparison to the other modes. The table below lists a few clocks and their setup and hold uncertainty. There are many other clocks in the design, in the categories of main, virtual and generated. These uncertainty values are taken into consideration while calculating the slack.

| Object | Hold uncertainty(ps) | Setup uncertainty (ps) |
|--------|----------------------|------------------------|
| CLK1   | 28                   | 72                     |
| CLK2   | 28                   | 137                    |
| CLK3   | 28                   | 184                    |

Table 5.1: Setup and Hold uncertainty for High performance mode

• Nominal performance mode: 800 MHz

This is the nominal performance mode, where the clock uncertainty is greater than in the high performance mode and less than in the low performance mode. The table below lists a few clocks and their setup and hold uncertainty. There are many other clocks in the design, in the categories of main, virtual and generated. These uncertainty values are taken into consideration while calculating the slack for the corners run for the nominal case.

| Object | Hold uncertainty(ps) | Setup uncertainty (ps) |
|--------|----------------------|------------------------|
| CLK1   | 47                   | 87                     |
| CLK2   | 47                   | 164                    |
| CLK3   | 47                   | 212                    |

Table 5.2: Setup and Hold uncertainty for Nominal performance mode

• Low performance mode: 450 MHz

This is the low performance mode, where the clock uncertainty is the greatest in comparison to the other modes. The table below lists a few clocks and their setup and hold uncertainty. There are many other clocks in the design, in the categories of main, virtual and generated. These uncertainty values are taken into consideration while calculating the slack.

| Object | Hold uncertainty(ps) | Setup uncertainty (ps) |
|--------|----------------------|------------------------|
| CLK1   | 163                  | 290                    |
| CLK2   | 163                  | 670                    |
| CLK3   | 163                  | 688                    |

Table 5.3: Setup and Hold uncertainty for low performance mode
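
To relate the uncertainty values to the mode frequencies, the clock period and the part of it left after subtracting the setup uncertainty can be computed. The sketch below takes the mode frequencies and the CLK1 setup uncertainty from Tables 5.1 to 5.3. It assumes, purely for illustration, that CLK1 runs at the quoted frequency of each mode, which the tables do not state.

```python
# Mode frequency (MHz) and CLK1 setup uncertainty (ps) from Tables 5.1 to 5.3.
MODES = {
    "high": (1100, 72),
    "nominal": (800, 87),
    "low": (450, 290),
}


def period_ps(freq_mhz):
    """Clock period in ps for a frequency in MHz."""
    return 1e6 / freq_mhz


usable = {}
for mode, (freq_mhz, setup_unc_ps) in MODES.items():
    period = period_ps(freq_mhz)
    # Assumption: CLK1 runs at the mode frequency, so this much of the
    # period remains for the data path before other terms are subtracted.
    usable[mode] = period - setup_unc_ps
    share = setup_unc_ps / period
    print(f"{mode:8s} period={period:7.1f} ps  usable={usable[mode]:7.1f} ps  "
          f"uncertainty={share:.1%}")
```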

**STA run flow - Extract node:** The extraction summary is observed after the STA runs in the various scenarios. It shows whether there are any opens and shorts in the design. If these numbers are large, the PnR and PV teams are informed so that a new metal fill database can be provided; otherwise the timing violations will also be large. After the new metal fill is done, the extraction summary is observed again and then we proceed to the timing summary. This process is repeated for various iterations throughout the project. Finally, a database is fixed that is clean for all the validation checks, such as STA, Physical Verification, Low Power and Power Delivery.

To ensure that there is no error, certain files are checked to confirm that the proper DEF and fill paths have been picked up. Checks are done for each iteration, and an extraction summary is published for each database (db).

| Iteration | Opens | Shorts |
|-----------|-------|--------|
| 1         | 0     | 140    |
| 2         | 0     | 0      |
| 3         | 0     | 40     |
| 4         | 0     | 0      |
| 5         | 0     | 2      |
| 6         | 0     | 0      |

Table 5.4: Extraction summary for various iterations

**STA run flow - STA node:** Next is the STA summary, where the setup and hold timing reports are dumped. For each of the corners where an STA run has been done, the worst negative slack (WNS), total negative slack (TNS) and total number of violating paths are reported. The timing tool is then invoked so that we can fix the setup and hold violations. The delay, transition and capacitance value of each cell and net can be observed, and timing is fixed using techniques such as Vt swap, cell sizing or breaking a net by adding buffers.
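
The three summary numbers can be derived from the per-endpoint slacks as sketched below. The endpoint names and slack values are hypothetical, and the sketch follows the convention that WNS is reported as zero when no endpoint is violating; some tools report the worst slack even when it is positive.

```python
def summarize(endpoint_slacks_ns):
    """Return (WNS, TNS, number of violating endpoints) from endpoint slacks."""
    violating = [s for s in endpoint_slacks_ns.values() if s < 0]
    wns = min(violating, default=0.0)  # most negative slack
    tns = sum(violating)               # sum of all negative slacks
    return wns, tns, len(violating)


# Hypothetical endpoint slacks in ns.
slacks = {
    "ff1/D": 0.120,
    "ff2/D": -0.047,
    "ff3/D": -0.010,
    "ff4/D": 0.0,
    "ff5/D": -0.003,
}

wns, tns, count = summarize(slacks)
print(f"WNS={wns:.3f} ns, TNS={tns:.3f} ns, violating endpoints={count}")
```

An endpoint with exactly zero slack is treated as met, in line with the aim of bringing the slack to zero or a positive value.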

report\_timing, all\_clocks, all\_fanouts, check\_timing, report\_case\_analysis, report\_si\_noise\_analysis and size\_cell are some of the commands used for the timing analysis.

The initial setup numbers were WNS = -0.047 ns, TNS = -12.4 ns and 450 violating paths. After fixing over various iterations and obtaining waivers, the final number of setup violations is 0. The waivers cover violations that can be left unfixed because they do not affect the overall timing of the SoC.

**Timing fixes:** At each stage, the slack is observed and the violating paths are fixed. The transition values, capacitance and fanout are all analysed and an appropriate technique for fixing the setup violation is chosen. The net is improved so that the delay is reduced.

| Worst negative slack(ps) | Total negative slack(ns) | No. of failing end points |
|--------------------------|--------------------------|---------------------------|
| -47                      | -12.4                    | 450                       |
| -27                      | -1.2                     | 54                        |
| -18                      | -0.25                    | 20                        |

Table 5.5: Setup summary

**Hold summary:** Similarly, the hold numbers in the initial cycle were WNS = -0.0080 ns and TNS = -0.0120 ns, with 4 violating paths. Hold fixing was also done using various techniques.

### 5.2 STA results of project 2

Below are the results obtained in the second project where static timing analysis was performed. This is a more complex core than the first project. Two kinds of runs are used, namely hierarchical and flat. They are described in the following sections along with the summaries of setup and hold timing. The area and power recovery flow is also explained.

#### 5.2.1 Hierarchical run

In the hierarchical or blockwise run, only the timing paths of that particular block, and of the sub-block whose pins are given by the LEF file, are checked. Suppose block A has block B inside it and block B has block C inside it. When the hierarchical run of block A is done, the LEF file of block B (having pin location and block position) is given. When the hierarchical run of block B is done, the LEF file of block C is given. When the hierarchical run of block C is done, no LEF file is needed. This blockwise STA run helps in understanding the timing paths with a high number of failing endpoints, worst negative slack and total negative slack. Further ECOs are provided based on the feedback from the teams, and the next round of runs is done to see the impact of the given ECO.

| Iteration | Opens | Shorts |
|-----------|-------|--------|
| 1         | 2     | 783    |
| 2         | 2     | 260    |
| 3         | 6     | 216    |
| 4         | 7     | 172    |

Table 5.6: Extraction summary for various iterations

| Worst negative slack(ns) | Total negative slack(ns) | No. of failing end points |
|--------------------------|--------------------------|---------------------------|
| -0.428                   | -82.67                   | 856                       |
| -0.508                   | -65.26                   | 642                       |
| -0.384                   | -44.78                   | 267                       |

Table 5.7: Setup summary

| Worst negative slack(ns) | Total negative slack(ns) | No. of failing end points |
|--------------------------|--------------------------|---------------------------|
| -0.408                   | -30.68                   | 1827                      |
| -0.285                   | -9.675                   | 871                       |
| -0.3                     | -1.827                   | 253                       |

Table 5.8: Hold summary

#### 5.2.2 Flat run

In the flat STA run, all the blocks and their sub-blocks are integrated. Block-to-block and macro-to-macro timing paths are also checked, so the .v and .def files of all the sub-blocks are provided for the runs. Some macros and IPs are spread over all the sub-blocks, so these are kept flattened at all times. The power information for all the blocks is also provided for the runs. The full flat STA run helps in understanding the overall timing paths in the core. After analysing the reports and many further rounds of ECOs, the timing closure of the core is said to be done.

| Iteration | Opens | Shorts |
|-----------|-------|--------|
| 1         | 26    | 642    |
| 2         | 18    | 602    |
| 3         | 29    | 569    |
| 4         | 6     | 276    |

Table 5.9: Extraction summary for various iterations

| Worst negative slack(ns) | Total negative slack(ns) | No. of failing end points |
|--------------------------|--------------------------|---------------------------|
| -0.1511                  | -10.07                   | 567                       |
| -0.1197                  | -7.6192                  | 389                       |
| -0.0734                  | -1.872                   | 184                       |

Table 5.10: Setup summary

| Worst negative slack(ns) | Total negative slack(ns) | No. of failing end points |
|--------------------------|--------------------------|---------------------------|
| -0.467                   | -267.138                 | 8712                      |
| -0.0863                  | -13.612                  | 1111                      |
| -0.0732                  | -9.673                   | 953                       |

Table 5.11: Hold summary

Area and Power recovery flow: By doing area and power recovery, 2 percent of area and 1.5 percent of power respectively can be recovered while keeping all the timing paths intact. All the validation checks are then done before the final timing closure of the chip.

### 5.3 Conclusion

In this project, static timing analysis of a block is performed. Timing runs are done across all the scenarios to exhaustively verify the timing. On analysing the extraction, setup, hold and TDRC reports, timing fixes are given. Depending on the nature of the violation, fixing techniques are applied. At the final stage of the project, area and power recovery is targeted. A final timing check is done after the area and power recovery flow. This ensures the timing closure of the block.

# Bibliography

- [1] J. Bhasker and R. Chadha, Static timing analysis for nanometer designs: A practical approach. Springer Science & Business Media, 2009.
- [2] X. Peng, H. Wang, S. Wang, and J. Du, "A new generation of static timing analysis technology based on n7+ process—pocv," in 2019 IEEE 4th International Conference on Integrated Circuits and Microsystems (ICICM), IEEE, 2019, pp. 199–203.
- [3] Synopsys, PrimeTime user guide, version D-2010.06, 2010.
- [4] Static timing analysis, 2011. [Online]. Available: http://www.vlsi-expert.com/2011/03/static-timing-analysis-sta-basic-timing.html.