# DESIGN AND CHARACTERIZATION OF HIGH SPEED 10T SRAM AND ANALYSIS OF MEMORY COMPILER

Major Project Report

Submitted in partial fulfillment of the requirements

for the degree of

Master of Technology

in

Electronics & Communication Engineering

(VLSI Design)

By

Shah Hetansh Pareshbhai (13MECV30)



Department of Electrical Engineering, Institute of Technology, Nirma University, Ahmedabad-382 481

May 2015

Under the guidance of

External Project Guide: Mr. Nitesh Gautam, Manager, R&D, SYNOPSYS India Pvt. Ltd., Noida

Internal Project Guide: Prof. Piyush Bhatasana, Assistant Professor (EC Dept.), Institute of Technology, Nirma University, Ahmedabad




## Declaration

This is to certify that

- a. The thesis comprises my original work towards the degree of Master of Technology in VLSI Design at Nirma University and has not been submitted elsewhere for a degree.
- b. Due acknowledgment has been made in the text to all other material used.

- Shah Hetansh P.

## Certificate

This is to certify that the Major Project entitled "DESIGN AND CHARACTERIZATION OF HIGH SPEED 10T SRAM AND ANALYSIS OF MEMORY COMPILER" submitted by Shah Hetansh Pareshbhai (13MECV30), towards the partial fulfillment of the requirements for the degree of Master of Technology in VLSI Design, Nirma University, Ahmedabad, is the record of work carried out by him under our supervision and guidance. In our opinion, the submitted work has reached the level required for being accepted for examination. The results embodied in this major project, to the best of our knowledge, have not been submitted to any other university or institution for the award of any degree or diploma.

**Prof. Piyush Bhatasana**, Internal Project Guide

Mr. Nitesh Gautam, External Project Guide

Dr. N. M. Devashrayee, PG Co-ordinator (EC-VLSI Design)

**Dr. P. N. Tekwani**, Head of EE Dept.

Dr. Ketan Kotecha, Director, IT-NU

Date:

Place: Ahmedabad

## Acknowledgements

I would never have succeeded in completing my thesis without the cooperation, encouragement and help provided to me by various people. Firstly, my sincere thanks to the Memory Design team for their help and support during this training.

I am highly indebted to my Group manager Mrs. Nutan Agarwal, project manager Mr. Nitesh Gautam, and my immediate supervisors Mr. Dinesh Gautam and Ms. Shivangi Mittal for providing the necessary information regarding the project and also for their constant guidance, supervision, kind co-operation, and invaluable support in all aspects. My thanks and appreciation also go to my colleagues and team members for their help in developing the project and for providing me with a lively and energetic work environment.

I would like to express my sincere gratitude to **Dr. Ketan Kotecha** (Director, Nirma University, Ahmedabad) for his continuous guidance, support and enthusiasm. I would take this opportunity to thank **Dr. P. N. Tekwani** (Head of Department, Electrical Engineering), **Dr. N. M. Devashrayee** (Professor and Program Coordinator, M.Tech - EC (VLSI Design)), Internal Guide **Prof. Piyush Bhatasana** and all the faculty members at Nirma University (VLSI Design), for their vision and relentless effort, support, and encouragement in providing me with this excellent opportunity to carry out my project work in such a highly renowned and esteemed organization, Synopsys Inc. I am equally thankful to Synopsys Inc. for providing me with invaluable exposure to the industry and the current market trends.

- Shah Hetansh P. (13MECV30)

## Abstract

With the development of CMOS technology, memory occupies a large part of the chip area and hence becomes the main source of power dissipation in the SoC. SRAM is widely used as on-chip memory, and as the MOSFET channel length scales down, SRAM stability, density and speed become major concerns for future technologies. In this project, a novel, highly stable dual port 10T SRAM cell is proposed which is faster and more stable than the 6T bitcell. All simulations are performed for 28nm technology. With the four additional transistors, the read speed increases by 31%. The read current to leak current ratio also improves by 17.3%, which helps in increasing the maximum number of physical rows in a memory array. The read and write ports are separate, so the storage nodes are not affected by the read operation. This increases the read Static Noise Margin (SNM), and ultimately the stability of the 10T bitcell, by 142%. As a result, 10T can be operated at ultralow voltages at which 6T cannot. Due to the four extra transistors, the average power dissipation is 53% higher than that of 6T, but it can be reduced by operating 10T at lower voltages, so the average power dissipation is also optimized. The major disadvantages of 10T SRAM are the area occupied and the leakage current: the area is almost 2.5 times that of 6T, while the leakage current increases by 32%.

This project also presents my work on the memory compiler. In System on Chip (SoC) design, memories with different aspect ratios and different sizes are required, along with different memory features for different design requirements. A memory compiler is the tool that generates different memory instances depending on the input given to it. A design and development flow is shown which depicts the step by step work on memory compiler development. The main work focuses on the various design checks, margin analysis and Quality Assurance (QA) checks.

# Contents

- Declaration
- Certificate
- Acknowledgements
- Abstract
- List of Figures
- List of Tables
- Abbreviation Notation and Nomenclature
- 1 Introduction
  - 1.1 Objective and scope of project
  - 1.2 Memory classification
  - 1.3 Memory Architecture
- 2 Literature survey
  - 2.1 SRAM v/s DRAM
  - 2.2 6T SRAM
  - 2.3 Sense Amplifier
  - 2.4
  - 2.5 Disadvantages of 6T SRAM
  - 2.6
    - 2.6.1 What is Memory compiler
    - 2.6.2 Why Memory compiler
- 3 Analysis of Dual port 10T SRAM
  - 3.1 Read operation
  - 3.2 Write operation
  - 3.3 Read time analysis
  - 3.4 Read current analysis
  - 3.5 Leak current analysis
  - 3.6 SNM analysis
  - 3.7 Power analysis
  - 3.8 Power optimization
- 4 6T v/s 10T comparison
  - 4.1 Area
  - 4.2 Read time
  - 4.3 Read current
  - 4.4 Leak current
  - 4.5 Read/Leak current ratio
  - 4.6 Static Noise Margin (SNM)
  - 4.7 Average Power dissipation
- 5 Memory Compiler analysis
  - 5.1 Introduction
  - 5.2 Memory Compiler development flow
  - 5.3 Memory Compiler features
  - 5.4 Signal flow in memory compiler
    - 5.4.1 Read signal flow
    - 5.4.2 Write signal flow
  - 5.5 QA checks
    - 5.5.1 Primetime
    - 5.5.2 Timever
    - 5.5.3 LibCompare
    - 5.5.4 Ccsn & Ccst
    - 5.5.5 Espev
    - 5.5.6 Redhawk
    - 5.5.7 Familyverify
    - 5.5.8 IQA
- 6 Conclusion and Future scope
- References

# List of Figures

| Figure | Title |
|--------|-------|
| 1.1 | Memory classification |
| 1.2 | Memory Architecture |
| 2.1 | 6T SRAM cell |
| 2.2 | 6T Read operation |
| 2.3 | 6T Write operation |
| 2.4 | Sense Amplifier |
| 2.5 | 6T Read Current |
| 2.6 | 6T Leak Current |
| 2.7 | Butterfly curve |
| 2.8 | 6T SNM |
| 2.9 | Read margin 1 |
| 2.10 | Read margin 2 |
| 2.11 | Read margin settings |
| 2.12 | Static power dissipation |
| 3.1 | 10T SRAM cell |
| 3.2 | 10T read operation |
| 3.3 | 10T write operation |
| 3.4 | 10T SNM measurement circuit |
| 4.1 | Read time comparison with pull down size variation |
| 4.2 | Read time comparison with pass gate size variation |
| 4.3 | Read time comparison with process variation |
| 4.4 | Read time comparison with supply voltage variation |
| 4.5 | Read time comparison with temperature variation |
| 4.6 | Read current comparison with pull down size variation |
| 4.7 | Read current comparison with pass gate size variation |
| 4.8 | Read current comparison with process variation |
| 4.9 | Read current comparison with supply voltage variation |
| 4.10 | Read current comparison with temperature variation |
| 4.11 | Leak current comparison with pull down size variation |
| 4.12 | Leak current comparison with pass gate size variation |
| 4.13 | Leak current comparison with process variation |
| 4.14 | Leak current comparison with supply voltage variation |
| 4.15 | Leak current comparison with temperature variation |
| 4.16 | SNM comparison with pull down size variation |
| 4.17 | SNM comparison with pull up size variation |
| 4.18 | SNM comparison with pass gate size variation |
| 4.19 | SNM comparison with process variation |
| 4.20 | SNM comparison with supply voltage variation |
| 4.21 | SNM comparison with temperature variation |
| 4.22 | Average static power dissipation with supply voltage variation |
| 4.23 | Average read power dissipation with supply voltage variation |
| 4.24 | Average write power dissipation with supply voltage variation |
| 4.25 | Average dynamic power dissipation with supply voltage variation |
| 4.26 | Average power dissipation with supply voltage variation |
| 5.1 | Role of memory compiler |
| 5.2 | Memory compiler development flow |
| 5.3 | Center decoding concept |
| 5.4 | Column Muxing |
| 5.5 | Read signal flow in compiler |
| 5.6 | Write signal flow in compiler |

# List of Tables

| Table | Title |
|-------|-------|
| 3.1 | Read time analysis of 10T with varying tail transistor size |
| 3.2 | Read time analysis of 10T with varying pass transistor size |
| 3.3 | Read time analysis of 10T with process variation |
| 3.4 | Read time analysis of 10T with supply voltage variation |
| 3.5 | Read time analysis of 10T with temperature variation |
| 3.6 | Read current analysis with varying tail transistor size |
| 3.7 | Read current analysis with varying pass transistor size |
| 3.8 | Read current analysis with process variation |
| 3.9 | Read current analysis with supply voltage variation |
| 3.10 | Read current analysis with temperature variation |
| 3.11 | Leak current analysis with varying tail transistor size |
| 3.12 | Leak current analysis with varying pass transistor size |
| 3.13 | Leak current analysis with process variation |
| 3.14 | Leak current analysis with supply voltage variation |
| 3.15 | Leak current analysis with temperature variation |
| 3.16 | SNM analysis with varying pull down transistor size |
| 3.17 | SNM analysis with varying pull up transistor size |
| 3.18 | SNM analysis with varying pass transistor size |
| 3.19 | SNM analysis with process variation |
| 3.20 | SNM analysis with supply voltage variation |
| 3.21 | SNM analysis with temperature variation |
| 3.22 | Average static power dissipation with VDD variation |
| 3.23 | Average read power dissipation with VDD variation |
| 3.24 | Average write power dissipation with VDD variation |
| 3.25 | Average dynamic power dissipation with VDD variation |
| 3.26 | Average power dissipation with VDD variation |
| 3.27 | 6T vs 10T average power dissipation with VDD variation |
| 3.28 | 6T vs 10T SNM with VDD variation |
| 4.1 | 6T v/s 10T Area comparison |
| 4.2 | 6T v/s 10T Read/Leak current ratio comparison |

# Abbreviation Notation and Nomenclature

| Abbreviation | Expansion |
|--------------|-----------|
| SoC | System on Chip |
| CMOS | Complementary Metal Oxide Semiconductor |
| ROM | Read Only Memory |
| SRAM | Static Random Access Memory |
| DRAM | Dynamic Random Access Memory |
| UV | Ultra-Violet |
| VTC | Voltage Transfer Characteristic |
| SNM | Static Noise Margin |
| VCVS | Voltage Controlled Voltage Source |
| CMUX | Column Muxing |
| SAE | Sense Amplifier Enable |
| BT | Bitline |
| WL | Wordline |
| RM | Read Margin |
| WM | Write Margin |
| PVT | Process Voltage Temperature |
| GDS | Graphic Data System |
| FE | Front End |
| BE | Back End |
| QA | Quality Assurance |
# Chapter 1

# Introduction

## 1.1 Objective and scope of project

Advances in CMOS technology have had a significant effect on the field of semiconductor memories. Memory now occupies a major part of the area of any SoC, and the need for memories of different sizes and with different features has increased the complexity of the design. The standard 6T bitcell is used in SRAM memory compiler design, but with the need for high speed, high stability and lower operating voltage, a new bitcell architecture is required that performs better and can serve as a replacement for the 6T bitcell.

The objective of my project is to design and characterize a novel, high performance dual port 10T SRAM and to predict its actual behavior on silicon under different variations. Design parameters such as read current, leak current, read SNM, power dissipation, read speed and area occupied need to be analyzed and optimized by adjusting the transistor sizes. Because four extra transistors are added, the power dissipation is greater than that of 6T. But as the SNM of 10T is better, we can operate 10T at lower supply voltage and hence power dissipation is to

design a 10T with single read bitline, but in that case the read speed is decreased significantly. Similarly, we can increase the read speed by increasing the size of the four extra transistors, but the area then also increases drastically. So choosing the right 10T SRAM design is very important, and it depends on the specifications. A 10T SRAM can be designed in different ways to gain different advantages such as read SNM, read speed or low power. Here my main aim is to increase the speed and stability of the SRAM, and also to increase the read to leak current ratio, by designing a novel 10T bitcell.

In any SoC, memory occupies a major part of the silicon area. Depending on the requirement, memories of different sizes and with different features may need to be generated, so a memory compiler generates memory instances with the size and features the customer requires. These features include column muxing, center decode, bank architecture, dual rail power supply, power gating, BIST enable and redundancy enable. Read margin and write margin are very important concepts which have a direct impact on the design and are critical parameters from the design point of view. Different design and quality assurance (QA) checks are needed to ensure the correct functionality of the compiler. One can implement a memory compiler with a wide range of features depending on the requirement, but the complexity and price increase accordingly. Here, my main objective is to show the design and development flow of a memory compiler, i.e. the step by step work carried out, and to discuss the read and write margin parameters and how they affect the memory compiler design. Different design and QA checks are also discussed.
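To make the idea of generating instances with different sizes and aspect ratios concrete, the following minimal Python sketch shows how a column muxing factor could reshape the same capacity into different arrays. The class name, field names and the simple relation (rows = words / mux, columns = bits × mux) are illustrative assumptions and do not describe the interface of any actual memory compiler.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MemoryInstanceSpec:
    """Hypothetical description of one memory instance request."""
    words: int       # number of addressable words
    bits: int        # bits per word
    column_mux: int  # assumed: number of words sharing one physical row

    def physical_rows(self) -> int:
        if self.words % self.column_mux != 0:
            raise ValueError("words must be a multiple of column_mux")
        return self.words // self.column_mux

    def physical_columns(self) -> int:
        return self.bits * self.column_mux

    def aspect_ratio(self) -> float:
        # Columns per row; ignores bitcell dimensions and periphery.
        return self.physical_columns() / self.physical_rows()


# Same 1024 x 32 capacity, two different array shapes.
tall = MemoryInstanceSpec(words=1024, bits=32, column_mux=4)
wide = MemoryInstanceSpec(words=1024, bits=32, column_mux=8)
```

Under these assumptions, `tall` maps to 256 rows by 128 columns and `wide` to 128 rows by 256 columns, which is the kind of trade-off a compiler exposes to the user.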

## 1.2 Memory classification



Figure 1.1: Memory classification

Modern digital systems must be able to store and retrieve huge amounts of information at very high speed. Semiconductor memories are the circuits or systems that store this digital information in very large quantities. Memory circuits are available in a variety of forms such as SRAM, DRAM, ROM, EPROM, EEPROM, Flash and FRAM. Each has a different design, but the basic structure and working are nearly the same.

Random Access Memory (RAM) stores information in flip-flop style circuits or simply as charge on capacitors. The read and write speeds are approximately equal. These read-write memories are volatile because the data is stored in active circuits: the stored data is lost if the power supply to the memory is turned off. The two most common types of RAM are SRAM (Static Random Access Memory) and DRAM (Dynamic Random Access Memory). SRAMs hold the data at the storage nodes as long as the power supply is ON; once the supply is interrupted, the stored data is lost. These are the high speed memories. DRAMs store the data as charge on capacitors. As the capacitor discharges with time, they are more susceptible to noise than SRAMs. They are much slower than SRAMs but denser, because the storage capacitor is very small. DRAMs are up to 4 times denser for a given technology.

Read Only Memory (ROM) stores information according to the presence or absence of a transistor at the intersection of each row and column. The read speed of ROM is similar to that of RAM. ROMs are non-volatile, and the write speed is low compared to RAM; it depends on the method used to enter the stored data. A simple ROM is programmed during manufacturing by forming physical patterns on the chip, and the stored data cannot be changed afterwards. These are known as mask-programmed ROMs. In contrast, Programmable Read Only Memory (PROM) can also be written by the user. It has a data path between each row and column at the time of manufacturing, so logic 1 is stored in every data position; based on user requirement, selected cells can be switched to logic 0, only once, after manufacture. Appropriate electric pulses are applied to blow out the row-column data path. Once programmed, a data position blown to 0 cannot be changed back to 1.
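The one-time programmable behavior of a PROM described above can be sketched as a small Python model. This is a behavioral toy under the stated assumptions (every position starts at logic 1 and can only move to 0); the class and method names are hypothetical and say nothing about the actual circuit.

```python
class Prom:
    """Toy behavioral model of a one-time programmable ROM."""

    def __init__(self, size: int) -> None:
        # Every data position holds logic 1 after manufacture.
        self.bits = [1] * size

    def write(self, position: int, value: int) -> None:
        if value == 0:
            # Blowing the row-column data path stores a 0 permanently.
            self.bits[position] = 0
        elif self.bits[position] == 0:
            # A blown data path cannot be restored.
            raise ValueError("position already programmed to 0")

    def read(self, position: int) -> int:
        return self.bits[position]


prom = Prom(4)
prom.write(1, 0)   # program position 1 to logic 0
```

After this, `prom.read(1)` returns 0 while the untouched positions still read 1, and a later `prom.write(1, 1)` is rejected.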

In an Erasable Programmable Read Only Memory (EPROM), all the bits are initially in one binary state. Like PROM, EPROM is programmed electrically, but here all the bits can be erased by exposure to ultraviolet (UV) light. When these components are packaged, a transparent window over the chip allows the UV rays to reach it.

Electrically Erasable Programmable Read Only Memory (EEPROM, also written E2PROM or E-squared PROM) can be written and erased by means of electrical voltage. EEPROMs can be erased selectively, whereas an EPROM has to be totally erased and rewritten to change a single bit. The write operation takes much longer than the read operation, ranging from microseconds to milliseconds, but the data is retained even when the supply is turned OFF, so these ROMs are non-volatile memories.

Flash memory erases whole blocks of memory simultaneously. One type of Flash uses the hot electron effect for writing, whereas E2PROM-type Flash uses Fowler-Nordheim (FN) tunneling for the write operation. Both types are erased with FN tunneling. Its large storage capacity has made Flash an emerging choice for mass storage devices. It has also started to replace ROMs on many chips, even though additional processing is required to manufacture Flash in a given CMOS technology.
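The difference in erase granularity between these memories can be illustrated with a short Python sketch. The erased value of 1, the function names and the block size are assumptions made for illustration only; the text states only that all bits share one state after erasure.

```python
from typing import List

ERASED = 1  # assumed erased-state value


def erase_eprom(cells: List[int]) -> List[int]:
    """UV exposure erases every bit on the chip at once."""
    return [ERASED] * len(cells)


def erase_eeprom(cells: List[int], position: int) -> List[int]:
    """Electrical erase of a single selected position."""
    out = list(cells)
    out[position] = ERASED
    return out


def erase_flash(cells: List[int], block: int, block_size: int) -> List[int]:
    """Electrical erase of one whole block of positions."""
    out = list(cells)
    start = block * block_size
    for i in range(start, min(start + block_size, len(out))):
        out[i] = ERASED
    return out


data = [0, 1, 0, 0, 1, 0, 0, 0]
after_uv = erase_eprom(data)            # whole array erased
after_bit = erase_eeprom(data, 2)       # only position 2 erased
after_block = erase_flash(data, 1, 4)   # positions 4..7 erased
```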

FRAM or FeRAM are memories based on ferroelectric material. They can also be designed to retain stored data when power is turned off. The perovskite crystal material used in the memory cells of this RAM is polarized in one direction or the other in order to store the desired value. The polarization is retained when the supply voltage is removed, thus creating a non-volatile memory. However, semiconductor memories are preferred over ferroelectric memories for most applications because ferroelectric memories are very costly; their operating speed is low and the area occupied is also large compared to semiconductor memories. Recently, in certain applications like smart cards, FRAMs have proven to be a more useful non-volatile memory, and they may become more attractive in the future because of their extremely high storage density.

## 1.3 Memory Architecture

Figure 1.2: Memory Architecture

The organization shown here is a random access architecture. The name random access is given because the memory locations can be accessed in any order, at a fixed rate, for read and write, independent of their physical location. The storage array consists of horizontal rows and vertical columns of simple circuits arranged to share connections. The horizontal lines, driven from outside to access a particular cell, are called wordlines, while the vertical lines, through which data is transferred into and out of the cells, are called bitlines. A wordline and a bitline together access a particular cell for a read or write operation. Each cell can store logic 0 or logic 1. Depending on the application, 4, 8, 16, 32 or 64 columns can be selected at once. The binary address information is decoded to select a row and a column or group of columns. For example, an n-bit row decoder has $2^n$ output lines, a different one of which is enabled for each n-bit input code. The column decoder has m inputs and produces $2^m$ bitline access signals, of which 1, 4, 8, 16, 32 or 64 can be enabled at once depending on the user. A multiplexer circuit is used for bit selection to direct the output of the corresponding cell to the data registers. So a total of $2^n \times 2^m$ cells are stored in the memory core array.
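The row and column decoding described above can be sketched in Python as follows. The split of the address into upper row bits and lower column bits, and the function names, are assumptions for illustration; a real design may order the address bits differently and enable several columns at once.

```python
from typing import List, Tuple


def decode_address(address: int, n_row_bits: int, m_col_bits: int) -> Tuple[int, int]:
    """Split a flat address into (row, column) indices.

    Assumption: the upper n bits select the row and the lower m bits
    select the column.
    """
    total_bits = n_row_bits + m_col_bits
    if not 0 <= address < (1 << total_bits):
        raise ValueError("address out of range")
    row = address >> m_col_bits
    col = address & ((1 << m_col_bits) - 1)
    return row, col


def one_hot(index: int, width: int) -> List[int]:
    """Decoder outputs: exactly one of `width` lines is enabled."""
    return [1 if i == index else 0 for i in range(width)]


n, m = 3, 2                          # 3 row-address bits, 2 column-address bits
row, col = decode_address(0b10110, n, m)
wordlines = one_hot(row, 1 << n)     # 2**n wordlines, one enabled
col_select = one_hot(col, 1 << m)    # 2**m column access signals, one enabled
total_cells = (1 << n) * (1 << m)    # 2**n x 2**m cells in the core array
```

With n = 3 and m = 2, the array holds 32 cells, and each address enables exactly one wordline and one column access signal.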

Memory bitcell circuits are implemented in a wide variety of ways. In the main, bitcells are based on flip-flop designs because their intended function is to store bits of data. Flip-flops, however, require a very large amount of chip area and are therefore not suitable for applications that require millions of such cells. In fact, most memory circuits are simple compared to flip-flop and register circuits. The data storage function is preserved, while other properties such as quantization of amplitude, input and output isolation, regeneration of logic levels and fanout driving capability are sacrificed for cell simplicity. In this way the number of devices in a single cell can be reduced to between one and six transistors. With properly designed peripheral circuits, the desired logic properties can be recovered at the memory chip level. The peripheral circuits include the decoders, sense amplifiers, column precharge circuits, buffers etc. These circuits are designed so that they can be shared among many memory bitcells. Read and write circuits determine whether data is being stored in or retrieved from the cells and accordingly perform the necessary amplification, buffering and translation of voltage levels.

# Chapter 2

# Literature survey

## 2.1 SRAM v/s DRAM

RAMs can be categorized mainly into two types, each having its own advantages and disadvantages: SRAM (Static Random Access Memory) and DRAM (Dynamic Random Access Memory). Each of them holds data in a different way. Periodic refreshing of data is required in DRAM to retain the information, while SRAM does not require refreshing because the transistors continue to hold the data as long as the power supply is ON. This gives SRAM some advantages, one of which is faster read and write operation. In DRAM, additional circuitry and timing are needed for the refresh cycle, which creates some complications and makes it slower as well as less desirable than SRAM. One major disadvantage of DRAM is the much higher power consumption and dissipation because of the charging and discharging of capacitors. The SRAM structure is rather simple compared to DRAM and hence it is easier to create an interface to access the memory.

A large number of transistors is required by the SRAM structure in order to store a given amount of data. A DRAM cell needs one capacitor and one transistor to store one bit of data, whereas an SRAM cell requires 6 transistors. The number of transistors used determines the storage capacity of a memory module, so for the same number of transistors a DRAM module will have about 6 times the capacity of an SRAM module. This ultimately reduces the price, which is the major concern for most buyers. As DRAM is cheaper, it has been the mainstream choice for main memory in computers despite being slower and consuming more power than SRAM. On the other hand, SRAM is still used in devices where speed is a greater concern than capacity. SRAMs are mostly used in the cache memory of processors, where speed is more important. Optical drives, hard drives and other devices use SRAM for cache memory and buffers.



### 2.2 6T SRAM

Figure 2.1: 6T SRAM cell

Memories that require no periodic clock signal to retain the stored data are said to be static memories. Such memory cells have direct access to both the supply voltage and ground. SRAMs (Static Random Access Memories) are read-write memory bitcell circuits based on flip-flops. The basic structure of an SRAM cell is shown in the figure above. There are two cross coupled inverters, and at the output of each there is an access or pass transistor. The gate terminal of each access transistor is connected to the wordline, while its source/drain terminal is connected to a bitline. The wordline, which is the output of the row decoder, is used to select the cell depending on the n-bit address, while the bitlines are used to transfer data in and out of the selected cell. As shown in the figure above, data is stored at node XT in the cell whereas its complement is stored at node XB. The Voltage Transfer Characteristic (VTC) conveys the important cell design considerations for reading and writing the data. In this configuration, the stored value is represented by two stable states in the characteristic. When one of the internal nodes crosses the switching threshold, the cell loses its current state; otherwise it retains its original state. So during a read operation we should not disturb the current state of the cell, while during a write operation we must cross the switching threshold voltage and change the state of the cell, i.e. flip the cell.

An important design criterion for 6T is the transistor sizing. The sizing of the cell should be such that proper read and write operations can be performed. During a read operation we need to prevent the cell from flipping, while during a write operation we need to flip the cell, so the transistors should be sized accordingly. A sense amplifier is used for reading. As the transistor sizes are small and the load capacitance is very high, the read operation becomes very slow because of the charging and discharging of this capacitance. So sense amplifiers are enabled during the read operation to increase read speed. We now discuss the read and write operation of 6T, the sense amplifier operation and the transistor sizing in 6T. Then we will see the important design parameters for SRAM memory.

#### 2.2.1 Read operation

Assume that a '0' is stored at XT and a '1' at XB as shown in the figure below. Therefore MN1 is ON and MN2 is OFF. Initially BL and BLB are precharged to VDD by a pair of column pull up transistors (not shown). The wordline is initially held in its low standby state and is then raised to VDD, which turns ON the access transistors MN3 and MN4. As the wordline turns ON, current begins to flow through MN3 and MN1 to ground, as shown in the figure. As a result, the cell current slowly discharges the capacitance of BL (CBL). Meanwhile, XB on the other side of the cell remains unaffected since there is no path for the XB node to discharge to ground through MN2. The difference between BL and BLB is given to a sense amplifier, which senses the difference and accordingly pulls one node down to ground and the other up to VDD, and the output is stored in a data buffer. After the read operation is completed, the wordline is turned OFF and the bitlines are again precharged to VDD for the next operation.



Figure 2.2: 6T Read operation

For the read operation, we must ensure that the stored data is not disturbed in any way, and hence the transistor sizes should be designed accordingly. The main problem is the current flow through MN3 and MN1. When the wordline turns ON and current starts flowing, the voltage at storage node XT is raised. If this voltage rises above the threshold voltage of MN2, it can turn ON MN2 and bring down the voltage at node XB. Similarly, the voltage at node XB may drop a little but it should not fall below the threshold of MN1. To avoid such situations, we must control the voltage at node XT by proper sizing of MN1 and MN3. If we make the conductance of MN1 3 to 4 times that of MN3, the drain voltage of MN1 will not rise above the threshold of MN2. The read stability criterion decides the exact ratio of these two transistors:

$$W_{N1}/W_{N3} = 1.5 \tag{2.1}$$

### 2.2.2 Write operation

The figure below depicts the write operation in a standard 6T SRAM. To write '0' or '1', one bitline is forced LOW while the other bitline is forced HIGH. As shown in the figure below, '0' and '1' are stored initially at XT and XB. Now we want to write '1' at XT. Hence we force BL to VDD and BLB to GND. When the wordline is turned ON, node XB starts to discharge because of the current flow from MP2 through MN4 to ground. The design of the cell must be such that the conductance of MN4 is several times that of MP2, so that the drain terminal of MP2 is pulled below the threshold voltage of MN1. Due to this, a regenerative effect occurs between the two inverters: eventually MN1 turns OFF and the drain voltage of MN1 rises to VDD due to the pull up of MN3 and MP1. At the same time, MN2 turns ON and pulls node XB LOW. When the cell has flipped and the write operation is successful, the wordline turns OFF and returns to its low standby level.

The design of the pull up and pass gate transistors is very important for a proper write operation. We want to write '1' at node XT. Hence BL is at VDD while BLB is at GND. When the cell is selected, MP2 and MN4 form a pseudo-NMOS inverter configuration. When the wordline turns ON, current starts to flow from MP2 through MN4 to ground.



Figure 2.3: 6T Write operation

As time passes, node XB discharges, and when the voltage at node XB falls below the threshold of MN1, regenerative action takes place and MN1 turns OFF, forcing the drain of MN1 to VDD via the MP1 transistor. Here, the transistor sizes should be such that the voltage at node XB drops below the threshold of MN1. Now suppose the size of MP2 is greater than that of MN4. Then the current flowing from MP2 into node XB will be larger than the current flowing from XB through MN4 to ground, hence the voltage at XB will never go below the threshold of MN1 and the cell will not flip. Hence the pass transistor should be sized larger than the pull up transistor, and smaller than the pull down transistor, since otherwise the read operation is affected:

$$W_{N4}/W_{P2} = 1.5 \tag{2.2}$$

In general, the sizing of pull up, pass gate and pull down can be summarized as below

$$\text{pull up} < \text{pass gate} < \text{pull down} \tag{2.3}$$
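The sizing relations (2.1) to (2.3) can be checked for a candidate cell with a few lines of Python. This is a minimal sketch: the function name, the dictionary keys `cell_ratio` and `pull_up_ratio`, and the example widths are all hypothetical and are not taken from the design.

```python
def check_6t_sizing(w_pull_up: float, w_pass_gate: float, w_pull_down: float) -> dict:
    """Check relation (2.3) and report the two ratios used in (2.1) and (2.2)."""
    return {
        "cell_ratio": w_pull_down / w_pass_gate,    # W_N1 / W_N3, read stability
        "pull_up_ratio": w_pass_gate / w_pull_up,   # W_N4 / W_P2, writability
        "ordering_ok": w_pull_up < w_pass_gate < w_pull_down,
    }


# Hypothetical widths in arbitrary units, chosen so that both ratios equal 1.5.
report = check_6t_sizing(w_pull_up=1.0, w_pass_gate=1.5, w_pull_down=2.25)
```

A cell for which `ordering_ok` is false would be expected to fail either the read or the write requirement discussed above.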

### 2.3 Sense Amplifier



Figure 2.4: Sense Amplifier

The figure above shows a sense amplifier. It consists of back to back connected (cross coupled) inverters and resembles the 6T SRAM cell, except for the absence of pass transistors and the presence of an extra tail NMOS transistor. Sense amplifiers are used for the read operation only and are OFF during the write operation. The sense amplifier is controlled by the SAE (Sense Amplifier Enable) signal, which is applied to the gate of the tail transistor. The sense amplifier can work only when this tail transistor is turned ON. The transistors used in the sense amplifier are very large compared to the 6T bitcell transistors because of the requirement to drive a high load and increase the speed.

For the read operation, we precharge both RBL and RBLB to VDD. Now suppose the data stored at XT is '0' and that at XB is '1'. We want to read XT, i.e. '0'. When the wordline is turned ON, current starts to flow from RBL through the pass gate and the pull down transistor to ground. On the opposite side, no current flows because XB and RBLB are both at VDD. As current flows through XT, RBL discharges towards ground. This discharge rate is very low since the transistors are very small. Hence we need a sense amplifier to sense this differential voltage and increase the read speed by pulling RBL to ground. When a sufficient differential voltage of about 50-100 mV is established between RBL and RBLB, the SAE signal goes high and the sense amplifier comes into the picture. The voltage at node QB is VDD and the voltage at node Q is VDD-dV. Hence the Vgs of N2 will be lower than the Vgs of N1. So according to the drain current equation, which depends directly on Vgs, the current flowing through N1 will be larger than the current flowing through N2. As these are cross coupled inverters, they form a positive feedback loop and more current starts flowing through N1 compared to N2. In this way node Q is pulled down to logic '0' before node QB because of the bistable nature of this structure. Node Q is connected to the output data buffer. The sense amplifier thus increases the read speed because its transistors are much larger than those of the bitcell. There is one sense amplifier for each column, so we can afford larger transistors in it because only one cell operates in a column at a time. Normally the sense amplifier transistors are about 10 times the size of the bitcell transistors, but this may vary depending on the bitline load and speed requirement.
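The reason the sense amplifier is fired at 50-100 mV instead of waiting for a full swing can be illustrated with the relation I = C dV/dt. The sketch below treats the cell current as constant, which is an approximation, and the bitline capacitance and cell current are hypothetical values chosen only for illustration.

```python
def time_to_develop(delta_v: float, c_bitline: float, i_cell: float) -> float:
    """Time (s) for a constant current to discharge a capacitance by delta_v.

    Uses I = C * dV/dt with the cell current treated as constant (an assumption).
    """
    if i_cell <= 0:
        raise ValueError("cell current must be positive")
    return c_bitline * delta_v / i_cell


# Hypothetical values: 50 fF bitline capacitance, 20 uA cell current.
t_100mv = time_to_develop(0.100, 50e-15, 20e-6)   # time to reach 100 mV
t_full = time_to_develop(1.0, 50e-15, 20e-6)      # time for a full 1 V swing
```

With these assumed numbers, developing 100 mV takes one tenth of the time needed for a 1 V swing, which is the speed advantage the sense amplifier exploits.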

### 2.4 SRAM Design parameters

Proper design of the 6T SRAM bitcell is very important because it directly affects various parameters. For example, if we increase the pull down transistor size significantly to improve the read operation, the write operation becomes difficult and the cell will not flip. Similarly, if we increase the pass transistor size to increase the read current, then the leak current and power dissipation also increase. So the transistors need to be sized such that every parameter is optimized. The various design parameters include read current, leak current, read SNM (Static Noise Margin), flip time, read margin etc. Here we discuss them in detail.

#### 2.4.1 Read current

Figure 2.5 shows how read current is measured. Read current is the current flowing through the pass transistor when the wordline turns ON for the read operation. When the wordline is turned ON, the pass transistor connected to the node storing '0' is turned ON and current flows from the bitline to ground via the pull down transistor. As shown in the figure, XT is at '0' and XB is at '1'. BL and BLB are precharged to VDD. Read current is the current flowing through MN3 when WL is turned ON. Read current should be as high as possible because it determines the worst read/leak ratio, from which we can estimate the maximum number of physical rows possible.

Figure 2.5: 6T Read Current

#### 2.4.2 Leak current

Leak current is the current flowing through the pass transistor when the wordline is OFF and the cell is in idle state. Even when the wordline is OFF, leakage current flows from the bitline to ground via the pass gate and pull down transistor. As shown in Figure 2.6, the cell is in idle state with the XT and XB storage nodes at '1' and '0' respectively. Leak current is the current flowing through MN4 when the wordline is OFF. Leak current should be as low as possible because it determines the worst read/leak ratio, from which we can estimate the maximum number of physical rows possible. Leak current also determines the static power dissipation in the memory array. A large leakage current is therefore not affordable, since it can also flip the cell.

Figure 2.6: 6T Leak Current
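One way to turn the read/leak ratio into a row count is sketched below. The criterion used here is an assumption and not the method of the memory compiler: it requires the read current of the selected cell to exceed the combined leakage of all unselected cells on the same bitline by a chosen factor `min_ratio`. The current values are hypothetical.

```python
import math


def max_rows(i_read: float, i_leak: float, min_ratio: float = 10.0) -> int:
    """Largest row count for which I_read >= min_ratio * (rows - 1) * I_leak."""
    if i_read <= 0 or i_leak <= 0 or min_ratio <= 0:
        raise ValueError("currents and ratio must be positive")
    return math.floor(i_read / (min_ratio * i_leak)) + 1


# Hypothetical worst-case values: 20 uA read current, 3 nA leakage per cell.
rows = max_rows(i_read=20e-6, i_leak=3e-9, min_ratio=10.0)
```

Under this assumed criterion, a higher read current or a lower leak current directly allows more physical rows per bitline.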

#### 2.4.3 Flip time

Flip time is the time required by the cell to flip its value. When we are performing write operation, one bitline is at VDD while other is at ground. When WL turns ON, bitline which is at VDD discharges and after some time cell flipping takes place. This time required to flip the cell is flip time. There are two ways to determine flip time.

1) WL driven :-

For WL driven, flip time is measured from the 50% point of the WL rise to the point where the storage nodes flip.

2) BL driven :-

For BL driven, flip time is measured from the 50% point of the BL fall to the point where the storage nodes flip.
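Both definitions can be applied to sampled simulation waveforms in the same way. The following is a minimal sketch in which the helper names are hypothetical, the storage nodes are taken as flipped at the instant XT and XB cross each other (an assumption, since the exact criterion is not stated above), and the waveform samples are synthetic.

```python
def crossing_time(t, v, level, rising):
    """First time v crosses `level` in the given direction (linear interpolation)."""
    for i in range(1, len(t)):
        a, b = v[i - 1], v[i]
        crossed = (a < level <= b) if rising else (a > level >= b)
        if crossed:
            return t[i - 1] + (level - a) * (t[i] - t[i - 1]) / (b - a)
    return None


def flip_time(t, v_ref, xt, xb, vdd, ref_rising):
    """Time from the 50% point of the reference edge to XT and XB crossing."""
    t_ref = crossing_time(t, v_ref, 0.5 * vdd, ref_rising)
    diff = [p - q for p, q in zip(xt, xb)]
    # The nodes are taken as flipped when XT - XB changes sign (an assumption).
    t_flip = crossing_time(t, diff, 0.0, rising=diff[0] < 0)
    if t_ref is None or t_flip is None:
        return None
    return t_flip - t_ref


# Synthetic WL driven example, time in ps and voltages in V.
t_ps = [0.0, 10.0, 20.0, 30.0, 40.0]
wl = [0.0, 1.0, 1.0, 1.0, 1.0]
xt = [0.0, 0.0, 0.2, 0.8, 1.0]
xb = [1.0, 1.0, 0.8, 0.2, 0.0]
t_flip_wl = flip_time(t_ps, wl, xt, xb, vdd=1.0, ref_rising=True)
```

For the BL driven case the same function would be called with the falling bitline waveform as `v_ref` and `ref_rising=False`.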

### 2.4.4 Static Noise Margin (SNM)

Static Noise Margin in 6T SRAM is the maximum noise voltage that can be tolerated at the storage nodes without flipping the cell value. It is a very important design parameter as it determines the stability of the cell. SNM can be determined from the butterfly curve shown in Figure 2.7. This curve is the VTC (Voltage Transfer Characteristic) of the two back to back connected inverters. SNM is the side of the largest square that can be fitted inside the lobes of these curves.

Figure 2.7: Butterfly curve [4]

Figure 2.8 shows how SNM can actually be measured in the 6T SRAM circuit. Noise can occur at any of the storage nodes. Here we take the worst possible case, where the noise voltage is applied at both storage nodes. Vx is a voltage source and E1 is a VCVS. When Vx is increased, E1 also increases. To measure SNM for this circuit, WL is turned ON for the read operation and Vx is increased simultaneously. The voltage Vx at which the cell flips is the SNM of 6T.

Figure 2.8: 6T SNM
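The largest-square definition of SNM on the butterfly curve can be evaluated numerically. The sketch below is illustrative only: the inverter characteristic `vtc` is an idealised logistic curve assumed for the example, not a simulated 28nm VTC, and both inverters are assumed identical so that the two lobes are equal.

```python
import math


def vtc(v_in, vdd=1.0, gain=20.0):
    """Idealised inverter transfer curve (a logistic shape, assumed for illustration)."""
    return vdd / (1.0 + math.exp(gain * (v_in - 0.5 * vdd)))


def largest_square(vdd=1.0, gain=20.0, steps=2000):
    """Side of the largest square that fits in one lobe of the butterfly curve.

    The lower-left corner (x0, y0) sits on the mirrored curve x = f(y) and the
    upper-right corner (x0 + s, y0 + s) on the curve y = f(x).
    """
    best = 0.0
    for k in range(steps + 1):
        y0 = vdd * k / steps
        x0 = vtc(y0, vdd, gain)
        if vtc(x0, vdd, gain) <= y0:
            continue  # no square of positive side starts at this corner
        lo, hi = 0.0, vdd
        for _ in range(60):  # bisection on f(x0 + s) - (y0 + s) = 0
            mid = 0.5 * (lo + hi)
            if vtc(x0 + mid, vdd, gain) - (y0 + mid) > 0.0:
                lo = mid
            else:
                hi = mid
        best = max(best, lo)
    return best


snm = largest_square()
```

In this model a steeper transfer curve (larger `gain`) gives a larger square, which matches the intuition that higher inverter gain improves cell stability.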

### 2.4.5 Read margin

The memory array bitlines are precharged prior to a memory cell access. After the memory is accessed (read cycle), a differential signal develops between the bitlines (bitline and bitline bar). This differential signal is fed to the input of the bitline sense amplifier to determine what data is stored in the bitcell. Since it takes time for the differential signal on the bitlines to develop, the greater the delay before strobing the sense amplifier, the greater the differential signal at its input. However, delaying the strobe results in a longer cycle time, which reduces the maximum operating speed and increases the access time. Hence there is a trade-off between memory speed and yield/reliability: the longer we wait, the easier it is for the sense amplifier to determine what was stored in the memory cell (robustness), but the longer it takes to access the cell (speed). Read margin is therefore one of the most important design criteria, because the sense amplifier must be turned ON at the right time, otherwise the read data will be incorrect. In the memory compiler there are different modes of operation, i.e. DEFAULT, FAST and SLOW. The timing of the SAE signal is determined by the mode selected. There are 4 pins available to the user for RM. Depending on the configuration selected, the time until the SAE signal goes high changes. This concept of read margin is shown in the figures below.

As seen from the waveforms below, BT and BB are both initially precharged to VDD. When WL is turned ON for the read operation, BT discharges towards ground while BB remains at VDD, and hence a differential voltage develops between the two bitlines. In the first waveform, SAE is enabled early and so the output is obtained very fast. In the second waveform, SAE is enabled late, so that enough differential voltage has developed and a correct read operation can be ensured, but the read time also increases. So depending on when SAE is enabled, we have three modes, i.e. DEFAULT, FAST and SLOW.

Figure 2.11 shows how the RM settings are made to select a particular mode out of these 3 modes. For FAST mode we want the SAE signal to be generated early, so the four pin settings (PIN1, PIN2, PIN3, PIN4) are selected such that SAE is generated fastest, which turns ON the sense amplifier quickly while only a low differential voltage has developed. Similarly, for SLOW mode the settings are chosen so that SAE arrives very late, to ensure that enough differential voltage has developed. For DEFAULT mode, intermediate settings are used. All these settings take effect only when clk is active and the bitlines are precharged.



Figure 2.9: Read margin 1



Figure 2.10: Read margin 2



Figure 2.11: Read margin settings

#### 2.4.6 Average power dissipation

Power dissipation, which was previously considered an issue only in portable devices, is rapidly becoming a significant design constraint in many system designs. SRAM memory is widely used, and so power dissipation becomes an important design parameter. There are two types of power dissipation in SRAM: 1) dynamic and 2) static. Dynamic power is the power dissipated when the memory is in the active state. The main source of dynamic power dissipation is the switching activity from 0 to 1 and from 1 to 0, i.e. the charging and discharging of the load capacitance at the operating frequency. The power dissipated in the absence of any switching activity, or when the memory is in idle state, is known as static power dissipation. The main source of static power dissipation is the leakage current through the transistors.



Figure 2.12: Static power dissipation

The figure above shows how to measure static power dissipation. When the cell is in idle state, the wordline is OFF and so the pass transistors are OFF. Suppose 0 is stored at XT and 1 at XB. Hence transistors MN1 and MP2 are ON while transistors MN2 and MP1 are OFF. The bitlines are precharged to VDD. Leakage current now flows through the OFF transistors MN3, MN2 and MP1 due to the potential difference between source and drain. For MN4, source and drain are both at VDD, so no leakage current flows through MN4. The average static power dissipation here is the sum of all the leakage currents multiplied by the supply voltage, i.e. VDD.
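The static power calculation just described is a simple sum, shown here as a short Python sketch. The leakage values are hypothetical placeholders chosen for illustration and are not simulation results.

```python
def static_power(leakage_currents: dict, vdd: float) -> float:
    """Static power (W) as VDD times the sum of the leakage currents (A)."""
    return vdd * sum(leakage_currents.values())


# Hypothetical leakage values for the state XT = 0, XB = 1 described above.
# MN4 has both terminals at VDD, so its contribution is taken as zero.
leak = {"MN3": 2e-9, "MN2": 1e-9, "MP1": 0.5e-9, "MN4": 0.0}
p_static = static_power(leak, vdd=1.0)
```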

Similarly, the dynamic power is measured during read and write operations. The dynamic power is directly proportional to the load capacitance, the operating frequency and the square of the supply voltage, so all these components need to be adjusted to optimize dynamic power. Average dynamic power is the sum of the power dissipated in N cycles divided by N (the number of clock cycles). In any cycle either a read or a write operation can take place, so the average dynamic power dissipation is the average of the read and write power dissipation for one clock cycle.
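The proportionality stated above can be written as P = a C VDD^2 f and averaged over cycles as sketched below. The activity factor `activity` is an added assumption (taken as 1 by default), and the capacitance and frequency are hypothetical values used only to show the effect of lowering the supply voltage.

```python
def dynamic_power(c_load: float, vdd: float, freq: float, activity: float = 1.0) -> float:
    """Switching power (W): activity * C * VDD^2 * f."""
    return activity * c_load * vdd ** 2 * freq


def average_power(cycle_powers: list[float]) -> float:
    """Mean power over N cycles, each entry being one read or write cycle."""
    return sum(cycle_powers) / len(cycle_powers)


# Hypothetical load of 50 fF switched at 1 GHz.
p_1v1 = dynamic_power(c_load=50e-15, vdd=1.1, freq=1e9)
p_0v9 = dynamic_power(c_load=50e-15, vdd=0.9, freq=1e9)
saving = 1.0 - p_0v9 / p_1v1   # fraction of dynamic power saved by lowering VDD
```

Because of the square-law dependence, reducing VDD from 1.1 V to 0.9 V in this sketch saves about one third of the dynamic power.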

## 2.5 Disadvantages of 6T SRAM

6T SRAM is the most widely used memory cell, but with the advancement in CMOS technology and decreasing channel length, it faces a number of problems. At short channel lengths, stability, leakage current and power dissipation are of major concern. So a more robust SRAM is required. The disadvantages of 6T are:

- Low cell stability
- Low read current
- High read time
- High leakage
- More power dissipation
- Can not be operated at ultralow voltage

### 2.6 Memory compiler

#### 2.6.1 What is Memory compiler

A memory compiler tailors memory circuits for specific design needs. It contains software for the automatic generation of static memory circuits (SRAMs) based on parameters set by the user. It can generate a range of SRAMs with different output data formats for integrating memory into a design. Memory generators can also produce built-in self-test (BIST) logic to speed up manufacturing testing, and redundant rows/columns to improve yield. The compiler can be parameterized by number of words, number of bits per word, desired aspect ratio, number of sub banks, degree of column muxing, etc. Area, delay and energy consumption are complex functions of the design parameters and the generation algorithm.
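How the word count, word width and column muxing together set the shape of the core array can be sketched as follows. The class and field names are hypothetical and do not correspond to the interface of any actual memory compiler. The relations used (rows = words / mux, columns = bits x mux) are the usual ones for a column-multiplexed array and ignore banking and redundancy.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SramConfig:
    """Hypothetical parameter set; field names are illustrative only."""
    words: int          # number of addressable words
    bits: int           # bits per word
    column_mux: int     # words sharing one physical row

    def physical_rows(self) -> int:
        if self.words % self.column_mux:
            raise ValueError("words must be a multiple of column_mux")
        return self.words // self.column_mux

    def physical_columns(self) -> int:
        return self.bits * self.column_mux

    def aspect_ratio(self) -> float:
        """Columns per row of the core array (cell dimensions ignored)."""
        return self.physical_columns() / self.physical_rows()


narrow = SramConfig(words=1024, bits=32, column_mux=4)   # 256 rows x 128 columns
wide = SramConfig(words=1024, bits=32, column_mux=16)    # 64 rows x 512 columns
```

Both instances store the same number of bits. Only the column muxing, and with it the aspect ratio and the bitline length, changes.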

### 2.6.2 Why Memory compiler

With the rapid advancement in technology, a modern SoC often requires hundreds of memory instances, each with optimized area and performance and with different features. In the ASIC flow, memory compilers are used to generate the layout for the SRAM blocks in a design. Memory compilers consist of various generators to satisfy the requirements of the circuit. Each final building block, the physical layout, is implemented as a stand-alone, densely packed, pitch-matched array. Using this complex layout generator and adopting state-of-the-art logic and circuit design techniques, these memory cells can realize extreme density and performance. In each layout generator, we added an option which makes the aspect ratio of the physical layout selectable, so that ASIC designers can choose the aspect ratio according to the convenience of the chip level layout. Various features in a memory compiler include column muxing, center decode, bank selection, power gating, dual rail supply, BIST and redundancy enable etc. The customer can choose whichever of these options are needed. As the flexibility increases, the complexity and price also increase.

# Chapter 3

# Analysis of Dual port 10T SRAM

Figure 3.1 shows a schematic of a 10T SRAM with differential read bitlines (RBL and RBLB) and write bitlines (WBL and WBLB). Two NMOS transistors (MN5 and MN6) for RBL and two further NMOS transistors (MN7 and MN8) for RBLB are appended to the conventional 6T SRAM. Precharge circuits must be implemented on RBL and RBLB. This 10T SRAM is dual port because it has separate ports for read and write operation. It also has different wordlines for the read (RWL) and write (WWL) operations, and the respective wordline is turned ON one at a time. The main disadvantage of this circuit is the area occupied because of the four extra NMOS transistors. The advantage of this circuit is the faster read operation compared to 6T SRAM, because the four extra transistors can be made larger. So read speed increases and hence the speed of the memory increases. The other advantage is the stability of the cell. As the read and write operations have different ports, the stability of the 10T cell increases because the storage nodes are not affected by the read operation. This circuit requires a precharge circuit to precharge the read bitlines for the read operation. The two read bitlines are fed to the sense amplifier, which detects a small change in the read bitlines for correct read operation.



Figure 3.1: 10T SRAM cell

# 3.1 Read operation

The read operation is shown in Figure 3.2. For the read operation, the write wordline (WWL) is OFF. RBL and RBLB are precharged to VDD. Suppose the data to be read is '1', i.e. XT = HIGH. Hence MN6 is ON and MN8 is OFF. As RWL goes high, RBL is connected to ground and discharges while RBLB remains at VDD. When the differential voltage between RBL and RBLB reaches 50-100 mV, the pass transistor is turned ON (RPASS = LOW) as shown in the waveform. As the pass transistor turns ON, RBL and RBLB are connected to BT and BB of the sense amplifier, which are precharged to VDD with the help of the precharge circuit and the SAPR signal. As RBL is connected to BT, BT starts to discharge while BB remains constant at VDD. At that time, the sense amplifier enable (SAE) signal goes high. When SAE goes high, the small differential voltage between BT and BB is amplified: BT is pulled down to zero while BB remains at logic 1. BT is connected to an inverter with output QT, while BB is connected to an inverter with output QB. When BT goes to logic 0, QT goes high and hence '1' is read correctly.

Figure 3.2: 10T read operation

In 6T we cannot increase the size of pull down transistor beyond a limit because doing so will affect the SNM. But here in 10T, as read and write ports are different, we can increase the size of read transistors because doing so will not affect the SNM. So read speed increases as the differential voltage will be developed faster.

# 3.2 Write operation

For the write operation, the read wordline (RWL) is OFF and so the transistors MN5, MN6, MN7 and MN8 are not used. Hence the write operation is similar to that of 6T SRAM. The data to be written is placed on the write bitline (WBL) and its complement on the write bitline bar (WBLB). Suppose the initial data is XT = '1' and XB = '0', and we want to write '0' on XT. So WBL is connected to ground and WBLB is connected to VDD. When WWL is turned ON, XT starts to discharge and XB starts charging towards VDD. When the voltage at node XT drops below the threshold of MN2, MN2 turns OFF and, due to positive feedback, XT is pulled to GND while XB is pulled to VDD. In this way the write operation is performed. Flip time can be measured in two ways. 1) We turn ON WWL gradually. Here flip time is the time from the 50% point of the WWL rise to the flipping of the storage nodes. 2) We keep the bitlines and WWL at VDD and gradually lower one bitline from VDD to GND. Here flip time is the time from the 50% point of the WBL fall to the flipping of the storage nodes. We now look at the analysis of the design parameters for 10T, and in the next chapter we compare the simulation results for 6T and 10T.

Figure 3.3: 10T write operation

All simulations here are performed for 28nm technology.

# 3.3 Read time analysis

Read time is the time measured from the 50% point of the read wordline (RWL) rise to the point where a differential voltage of 100 mV is created between the two read bitlines.

### XT='1', XB='0', PVT(TT, 1.1, 25°C)

### i) (W/L) of MN5=MN7=8

Table 3.1: Read time analysis of 10T with varying tail transistor size

| (W/L) of MN6=MN8 | Read time (ps) | Area change of standard 10T (%) |
|------------------|----------------|---------------------------------|
| 4                | 103            | -34                             |
| 8                | 70.28          | -25                             |
| 20               | 53.71          | 0                               |
| 50               | 48.14          | 64                              |
| 100              | 46.42          | 170                             |

### ii) (W/L) of MN6=MN8=20

Table 3.2: Read time analysis of 10T with varying pass transistor size

| Read time (ps) | Area change of standard 10T (%) |
|----------------|---------------------------------|
| 53.71          | 0                               |
| 36.41          | 17                              |
| 33             | 26                              |
| 24.53          | 90                              |
| 20.63          | 196                             |

The above tables show the effect of tail and pass transistor size on read time. As seen from the tables, the read time decreases, i.e. the speed increases, with increasing tail and pass transistor size, since the bitline discharges faster and the differential voltage develops more quickly. But at the same time the area also increases drastically. So the transistor sizing must be chosen carefully: it is a trade-off between speed and area. Next we look at the effect of PVT variation on read speed.
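The speed versus area trade-off can be quantified directly from the values in Table 3.1. The short script below computes, for each step in tail transistor size, how many picoseconds of read time are saved per percentage point of added area. The metric itself is introduced here only for illustration and is not part of the original analysis.

```python
# (W/L of MN6 = MN8, read time in ps, area change in %) taken from Table 3.1
TABLE_3_1 = [(4, 103.0, -34), (8, 70.28, -25), (20, 53.71, 0), (50, 48.14, 64), (100, 46.42, 170)]


def marginal_gain(rows):
    """Read time saved (ps) per percentage point of added area, step by step."""
    gains = []
    for (w0, t0, a0), (w1, t1, a1) in zip(rows, rows[1:]):
        gains.append((w0, w1, (t0 - t1) / (a1 - a0)))
    return gains


for w0, w1, gain in marginal_gain(TABLE_3_1):
    print(f"W/L {w0:>3} -> {w1:>3}: {gain:.3f} ps saved per % area")
```

The gain per unit area falls with every step, so most of the speed benefit is obtained at moderate sizes and further upsizing mainly costs area.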

### iii) (W/L) of MN6=MN8=20, MN5=MN7=8, V=1.1V, T=25°C

| Process | Read time (ps) |
|---------|----------------|
| SS      | 66.44          |
| SF      | 63.71          |
| TT      | 53.71          |
| FS      | 45.63          |
| FF      | 43.76          |

Table 3.3: Read time analysis of 10T with Process variation

In 10T, only the four additional NMOS transistors are used for the read operation. So as the NMOS process corner changes from SLOW to TYPICAL to FAST, the read time decreases and the speed increases.

### iv) (W/L) of MN6=MN8=20, MN5=MN7=8, Process=TT, T=25°C

| Supply Voltage (V) | Read time (ps) |
|--------------------|----------------|
| 0.7                | 106.2          |
| 0.9                | 67.93          |
| 1.1                | 53.71          |
| 1.3                | 46.38          |
| 1.5                | 41.79          |

Table 3.4: Read time analysis of 10T with Supply voltage variation

As we increase the supply voltage, the Vgs of the tail and pass transistors increases. This increases the current flow and ultimately speeds up the development of the differential voltage. So the read time decreases with increasing supply voltage.

Similarly, the effect of temperature on read time is shown below. With increase in temperature the read speed decreases.

### v) (W/L) of MN6=MN8=20, MN5=MN7=8, Process=TT, V=1.1V

| Temperature (°C) | Read time (ps) |
|------------------|----------------|
| -40              | 53.37          |
| 0                | 53.59          |
| 25               | 53.71          |
| 50               | 53.83          |
| 100              | 53.98          |
| 125              | 54.02          |

Table 3.5: Read time analysis of 10T with temperature variation
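The relative sensitivity of read time to process, voltage and temperature can be compared using the values of Tables 3.3 to 3.5. The sketch below expresses the full range of each sweep as a percentage of the nominal read time. This spread metric is an illustration added here, and the ranges are not directly comparable in a strict sense because each sweep covers a different kind of variation.

```python
READ_TIME_PS = {
    "process": {"SS": 66.44, "SF": 63.71, "TT": 53.71, "FS": 45.63, "FF": 43.76},
    "voltage": {0.7: 106.2, 0.9: 67.93, 1.1: 53.71, 1.3: 46.38, 1.5: 41.79},
    "temperature": {-40: 53.37, 0: 53.59, 25: 53.71, 50: 53.83, 100: 53.98, 125: 54.02},
}
NOMINAL_PS = 53.71  # TT, 1.1 V, 25 degC


def spread_percent(values):
    """Full range of the read time as a percentage of the nominal value."""
    return 100.0 * (max(values) - min(values)) / NOMINAL_PS


spreads = {name: spread_percent(table.values()) for name, table in READ_TIME_PS.items()}
```

Over the ranges simulated, supply voltage has the largest effect on read time, process corner the next largest, and temperature by far the smallest.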

# 3.4 Read current analysis

Read current is the current measured through the pass transistor MN5 or MN7 after the RWL goes high for read operation.

#### XT='1', XB='0'

### i) (W/L) of MN5=MN7=8, PVT(TT, 1.1, 25°C)

Table 3.6: Read current analysis with varying tail transistor size

| (W/L) of MN6=MN8 | Read current $(\mu A)$ | Area change of standard 10T (%) |
|------------------|------------------------|---------------------------------|
| 4                | 115                    | -34                             |
| 8                | 179                    | -25                             |
| 20               | 248                    | 0                               |
| 50               | 285                    | 64                              |
| 100              | 298                    | 170                             |

The current increases up to a point, and after reaching its maximum value it does not increase further even if the tail transistor size is increased, because the pass transistor can pass current only up to a certain limit.

### ii) (W/L) of MN6=MN8=20, PVT(TT, 1.1, 25°C)

| (W/L) of MN5=MN7 | Read current $(\mu A)$ | Area change of standard 10T (%) |
|------------------|------------------------|---------------------------------|
| 4                | 115                    | -9                              |
| 8                | 179                    | 0                               |
| 20               | 248                    | 26                              |
| 50               | 285                    | 90                              |
| 100              | 298                    | 196                             |

Table 3.7: Read current analysis with varying pass transistor size

Read current is the current flowing through the pass transistor during read operation. As we increase (W/L) of pass transistor, the current will increase. But this current will not increase after a certain limit since the tail transistor has a limit of sinking current. So it will remain constant after a particular limit.
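The saturation of read current with increasing size of one device can be illustrated with a very simple series model. This is only a qualitative sketch: each transistor is replaced by a linear resistor inversely proportional to its W/L, which ignores real MOSFET behaviour, and `r_unit` is a hypothetical resistance for W/L = 1 that has not been fitted to the tables above.

```python
def series_current(w_pass: float, w_tail: float, vdd: float = 1.1, r_unit: float = 40e3) -> float:
    """Current (A) through two series devices modelled as resistors r_unit / (W/L).

    r_unit is a hypothetical resistance for W/L = 1, not a fitted value.
    """
    return vdd / (r_unit / w_pass + r_unit / w_tail)


limit = 1.1 / (40e3 / 8)   # ceiling set by the pass device (W/L = 8) alone
currents = [series_current(8, w) for w in (4, 8, 20, 50, 100)]
```

In this model the current rises with the tail transistor size but can never exceed the ceiling set by the fixed pass transistor, and the same argument applies with the roles of the two devices exchanged.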

### iii) (W/L) of MN5=MN7=8, MN6=MN8=20, V=1.1V, T=25°C

| Process       | Read current $(\mu A)$ |
|---------------|------------------------|
| SS            | 200                    |
| SF            | 210                    |
| TT            | 248                    |
| FS            | 286                    |
| FF            | 296                    |

Table 3.8: Read current analysis with process variation

As the process corner becomes fast, the current passing through the transistor increases. Here, the pass transistor is NMOS, so as NMOS changes from SLOW to TYPICAL to FAST, the read current increases.

### iv) (W/L) of MN5=MN7=8, MN6=MN8=20 Process=TT, T=25°C

| Supply voltage (V) | Read current $(\mu A)$ |
|--------------------|------------------------|
| 0.7                | 76                     |
| 0.9                | 159                    |
| 1.1                | 248                    |
| 1.3                | 338                    |
| 1.5                | 426                    |

Table 3.9: Read current analysis with supply voltage variation

As the supply voltage increases, the gate-source voltage (Vgs) of the transistor increases, and the drain current depends directly on Vgs. The read current flowing through the pass transistor therefore increases with supply voltage.

### v) (W/L) of MN5=MN7=8, MN6=MN8=20 Process=TT, V=1.1V

| Temperature (°C) | Read current $(\mu A)$ |
|------------------|------------------------|
| -40              | 255                    |
| 0                | 251                    |
| 25               | 248                    |
| 50               | 246                    |
| 100              | 241                    |
| 125              | 239                    |

Table 3.10: Read current analysis with temperature variation

Similarly, as temperature increases, carrier scattering increases and the transistor becomes slower, so the read current decreases.

# 3.5 Leak current analysis

Leak current is the current measured through pass transistor MN5 or MN7 when the RWL is OFF or the cell is in idle state.

XT='1', XB='0'

### i) (W/L) of MN5=MN7=8, PVT(TT, 1.1, 25°C)

Table 3.11: Leak current analysis with varying tail transistor size

| (W/L) of MN6=MN8 | Leak current (nA) | Area change of standard 10T (%) |
|------------------|-------------------|---------------------------------|
| 4                | 8.31              | -34                             |
| 8                | 8.31              | -25                             |
| 20               | 8.31              | 0                               |
| 50               | 8.32              | 64                              |
| 100              | 8.32              | 170                             |

### ii) (W/L) of MN6=MN8=20, PVT(TT, 1.1, 25°C)

Table 3.12: Leak current analysis with varying pass transistor size

| (W/L) of MN5=MN7 | Leak current (nA) | Area change of standard 10T (%) |
|------------------|-------------------|---------------------------------|
| 4                | 4.279             | -9                              |
| 8                | 8.31              | 0                               |
| 20               | 20.42             | 26                              |
| 50               | 50.66             | 90                              |
| 100              | 101               | 196                             |

Leak current is the current flowing through the pass transistor when the cell is in the idle state. In the idle state the wordline is OFF, so pass transistors MN5 and MN7 are OFF, but a leakage current still flows and causes unwanted power dissipation. As seen from the tables, changing the tail transistor size has no effect on leak current. Increasing the pass transistor size increases the leak current significantly, since the current is directly proportional to transistor size. The transistor size must therefore be chosen such that the leak current is optimum.
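The proportionality between leak current and pass transistor size can be checked against Table 3.12. The sketch below, in Python, divides each measured leak current by the corresponding (W/L); a nearly constant quotient indicates a linear dependence. The names used are illustrative only.

```python
# Leak current (nA) versus (W/L) of pass transistors MN5=MN7,
# copied from Table 3.12.
PASS_GATE_LEAK_NA = {4: 4.279, 8: 8.31, 20: 20.42, 50: 50.66, 100: 101.0}


def leak_per_unit_width(table):
    """Return leak current divided by (W/L) for every table entry."""
    return {wl: leak / wl for wl, leak in table.items()}


if __name__ == "__main__":
    for wl, ratio in leak_per_unit_width(PASS_GATE_LEAK_NA).items():
        print(f"W/L={wl:>3}: {ratio:.3f} nA per unit W/L")
```

For the tabulated data the quotient stays between about 1.01 and 1.07 nA per unit (W/L) over a 25x change in size.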

### iii) (W/L) of MN5=MN7=8, MN6=MN8=20 V=1.1V, T=25°C

| Process | Leak current (nA) |
|---------|-------------------|
| SS      | 0.771             |
| SF      | 2.79              |
| TT      | 8.31              |
| FS      | 75.22             |
| FF      | 126.3             |

Table 3.13: Leak current analysis with Process variation

### iv) (W/L) of MN5=MN7=8, MN6=MN8=20 Process=TT, T=25°C

| Supply voltage (V) | Leak current (nA) |
|--------------------|-------------------|
| 0.7                | 3.09              |
| 0.9                | 5.15              |
| 1.1                | 8.31              |
| 1.3                | 13.16             |
| 1.5                | 20.72             |

Table 3.14: Leak current analysis with supply voltage variation

### v) (W/L) of MN5=MN7=8, MN6=MN8=20 Process=TT, V=1.1V

| Temperature (°C) | Leak current (nA) |
|------------------|-------------------|
| -40              | 0.373             |
| 0                | 2.935             |
| 25               | 8.31              |
| 50               | 20.28             |
| 100              | 86.07             |
| 125              | 155.9             |

Table 3.15: Leak current analysis with temperature variation

The variation of leak current with PVT is shown in the tables above. For process and supply voltage, the reason for the change in leak current is the same as for read current. For temperature, the reverse saturation current increases as temperature increases, so the leak current also increases.

# 3.6 SNM analysis



Figure 3.4: 10T SNM measurement circuit

The circuit for measuring the SNM of 10T SRAM is shown above. Similar to 6T, a noise source is added between the storage nodes; E1 is a voltage-controlled voltage source (VCVS). When RWL goes high and the read operation starts, the noise voltage Vx is increased at the same time. The noise voltage at which the cell flips its value is measured, and this voltage is the SNM of the bitcell. The effect of transistor size and of PVT variation on the SNM is examined next. The transistor sizes need to be chosen accordingly in order to get the best SNM.

XT='0', XB='1', PVT (TT, 1.1V, 25°C)

### i) Pull down size variation. Pull up (W/L)=4, Pass gate (W/L)=6

| (W/L) of MN1=MN2 | SNM (mV) | Area change of standard 10T (%)    |
|------------------|----------|------------------------------------|
| 4                | 290      | -10.63                             |
| 6                | 312.3    | -6.38                              |
| 9                | 357.1    | 0                                  |
| 20               | 368.2    | 23.4                               |
| 50               | 383.5    | 87.23                              |
| 100              | 397.7    | 193.61                             |

Table 3.16: SNM analysis with varying pull down transistor size

### ii) Pull up size variation. Pull down (W/L)=9, Pass gate (W/L)=6

| (W/L) of MP1=MP2 | SNM (mV) | Area change of standard 10T (%) |
|------------------|----------|---------------------------------|
| 4                | 357.1    | 0                               |
| 6                | 368.2    | 4.25                            |
| 9                | 368.3    | 10.63                           |
| 20               | 379.1    | 34.04                           |
| 50               | 381.7    | 97.87                           |
| 100              | 387.4    | 204.25                          |

Table 3.17: SNM analysis with varying pull up transistor size

As seen from the above tables, increasing the sizes of the pull down and pull up transistors increases the SNM. This is because the storage nodes are pulled more strongly to '0' and '1' by the larger pull down and pull up transistors respectively, so more noise voltage is required to flip the cell. On the other hand, the area of the cell also increases considerably. The transistor sizes must therefore be selected to get optimum performance across all design parameters.

### iii) Pass gate size variation. Pull down (W/L)=9, Pull up (W/L)=4

| (W/L) of MN5=MN7 | SNM (mV) | Area change of standard 10T (%) |
|------------------|----------|---------------------------------|
| 4                | 357.1    | -8.51                           |
| 6                | 357.1    | -4.25                           |
| 9                | 357.1    | 2.12                            |
| 20               | 357.1    | 25.53                           |
| 50               | 357.1    | 89.36                           |
| 100              | 357.1    | 195.74                          |

Table 3.18: SNM analysis with varying pass transistor size

As seen above, the variation in pass transistor size has no effect on SNM, since in 10T these pass transistors do not affect the storage nodes. In 6T, the voltage bump at the storage nodes changes with pass transistor size, but here no voltage bump is observed on the storage nodes. So SNM remains constant.

#### PVT Variation. (W/L) of pd=9, pg=6, pu=4

### iv) Process variation. V=1.1V, T=25°C

| SNM (mV) |
|----------|
| 401.2    |
| 368.3    |
| 357.1    |
| 312.7    |
| 312.7    |

Table 3.19: SNM analysis with Process variation

As NMOS becomes fast, the pass transistor becomes fast, so the voltage bump at the storage node increases and less noise voltage is required to flip the cell. So SNM decreases. On the other hand, as PMOS becomes fast, the node storing '1' becomes stronger and more noise voltage is required to flip the cell. So SNM increases.

### v) Supply voltage variation. Process=TT, T=25°C

| Supply voltage (V) | SNM (mV) |
|--------------------|----------|
| 0.7                | 241.3    |
| 0.9                | 310.2    |
| 1.1                | 357.1    |
| 1.3                | 396      |
| 1.5                | 441.5    |

Table 3.20: SNM analysis with Supply voltage variation

The cell flips when both storage nodes cross 50% of their respective logic levels (rising or falling). As the supply voltage increases, the voltage bump increases only slightly compared with VDD. More noise voltage is therefore required to take the storage node to VDD/2, and hence SNM increases.

### vi) Temperature variation. Process=TT, V=1.1V

| Temperature (°C) | SNM (mV) |
|------------------|----------|
| -40              | 379.2    |
| 0                | 368.1    |
| 25               | 357.1    |
| 50               | 356.8    |
| 100              | 335      |
| 125              | 324      |

Table 3.21: SNM analysis with Temperature variation

With the increase in temperature, the threshold voltage of transistor decreases. So less noise voltage is required to flip the cell and so the SNM decreases with the increase in temperature.

# 3.7 Power analysis

### i) Average static power dissipation with VDD variation

Table 3.22: Average static power dissipation with VDD variation

| Supply voltage (V) | Average static power dissipation (nW/MHz) |
|--------------------|-------------------------------------------|
| 0.5                | 3.298                                     |
| 0.7                | 8.278                                     |
| 0.9                | 18.150                                    |
| 1.1                | 36.620                                    |
| 1.3                | 70.240                                    |
| 1.5                | 131.50                                    |

### ii) Average read power dissipation with VDD variation

Table 3.23: Average read power dissipation with VDD variation

| Supply voltage (V) | Average read power dissipation $(\mu W/MHz)$ |
|--------------------|----------------------------------------------|
| 0.5                | 4.234                                        |
| 0.7                | 25.40                                        |
| 0.9                | 68.75                                        |
| 1.1                | 132.3                                        |
| 1.3                | 213.8                                        |
| 1.5                | 312.6                                        |

The power dissipation of 10T is the sum of static and dynamic power dissipation. Static power dissipation is due to the leakage current when the bitcell is in the idle state. Dynamic power dissipation is directly proportional to the square of the supply voltage, so increasing the supply voltage increases the power dissipation drastically. This is why low power circuits are nowadays designed with a very low supply voltage. Table 3.22 shows the average static power dissipation of 10T SRAM and Table 3.23 the average read power dissipation. In both, the power increases with supply voltage. This increases the heat generated and can lead to failure, so the cell should be operated at a lower voltage.

### iii) Average write power dissipation with VDD variation

| Supply voltage (V) | Average write power dissipation $(\mu W/MHz)$ |
|--------------------|-----------------------------------------------|
| 0.5                | 0.0288                                        |
| 0.7                | 0.2987                                        |
| 0.9                | 1.132                                         |
| 1.1                | 2.896                                         |
| 1.3                | 4.217                                         |
| 1.5                | 6.532                                         |

Table 3.24: Average write power dissipation with VDD variation

### iv) Average dynamic power dissipation with VDD variation

Table 3.25: Average dynamic power dissipation with VDD variation

| Supply voltage (V) | Average dynamic power dissipation $(\mu W/MHz)$ |
|--------------------|-------------------------------------------------|
| 0.5                | 2.1314                                          |
| 0.7                | 12.849                                          |
| 0.9                | 34.941                                          |
| 1.1                | 67.590                                          |
| 1.3                | 109                                             |
| 1.5                | 159.56                                          |

The average dynamic power dissipation is shown in the above table. Dynamic power depends on three main components: supply voltage, load capacitance and operating frequency. For 10T, both read and write operations lead to power dissipation, so the average dynamic power dissipation is the average of the read and write power dissipation. The results shown above are for a frequency of 1 MHz with a capacitive load of 0.1 pF. The average power dissipation of 10T, which is the sum of static and dynamic power dissipation, is considered next.
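The averaging described above can be reproduced from Tables 3.23 and 3.24. The sketch below is a small Python illustration; the dictionary and function names are chosen here for illustration and do not come from the original simulation setup.

```python
# Average read and write power (uW/MHz) versus supply voltage (V),
# copied from Tables 3.23 and 3.24.
READ_POWER = {0.5: 4.234, 0.7: 25.40, 0.9: 68.75, 1.1: 132.3, 1.3: 213.8, 1.5: 312.6}
WRITE_POWER = {0.5: 0.0288, 0.7: 0.2987, 0.9: 1.132, 1.1: 2.896, 1.3: 4.217, 1.5: 6.532}


def average_dynamic_power(vdd):
    """Mean of read and write power at the given supply voltage, in uW/MHz."""
    return (READ_POWER[vdd] + WRITE_POWER[vdd]) / 2.0


if __name__ == "__main__":
    for vdd in sorted(READ_POWER):
        print(f"VDD={vdd:.1f} V: {average_dynamic_power(vdd):.4f} uW/MHz")
```

At 0.5V, for example, this gives (4.234 + 0.0288) / 2 = 2.1314 µW/MHz, the first entry of Table 3.25.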

### v) Average power dissipation in 10T with VDD variation

| Supply voltage (V) | Average power dissipation $(\mu W/MHz)$ |
|--------------------|-----------------------------------------|
| 0.5                | 2.1346                                  |
| 0.7                | 12.8576                                 |
| 0.9                | 34.9591                                 |
| 1.1                | 67.620                                  |
| 1.3                | 109.07                                  |
| 1.5                | 159.6915                                |

Table 3.26: Average power dissipation with VDD variation
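The total in Table 3.26 is the sum of the static power (Table 3.22, in nW/MHz) and the dynamic power (Table 3.25, in µW/MHz), so the static term has to be converted to µW/MHz first. The following Python sketch shows the arithmetic; the names are illustrative.

```python
# Static power (nW/MHz) from Table 3.22 and dynamic power (uW/MHz)
# from Table 3.25, indexed by supply voltage (V).
STATIC_POWER_NW = {0.5: 3.298, 0.7: 8.278, 0.9: 18.150, 1.1: 36.620, 1.3: 70.240, 1.5: 131.50}
DYNAMIC_POWER_UW = {0.5: 2.1314, 0.7: 12.849, 0.9: 34.941, 1.1: 67.590, 1.3: 109.0, 1.5: 159.56}


def average_total_power(vdd):
    """Static plus dynamic power at the given supply voltage, in uW/MHz."""
    static_uw = STATIC_POWER_NW[vdd] / 1000.0  # nW -> uW
    return DYNAMIC_POWER_UW[vdd] + static_uw


if __name__ == "__main__":
    for vdd in sorted(STATIC_POWER_NW):
        print(f"VDD={vdd:.1f} V: {average_total_power(vdd):.4f} uW/MHz")
```

The computed sums agree with Table 3.26 to within about 0.01 µW/MHz. The static contribution is three orders of magnitude smaller than the dynamic one, so the total is dominated by dynamic power.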

# 3.8 Power optimization

Table 3.27: 6T vs 10T average power dissipation with VDD variation

| Supply voltage (V) | 6T avg power $(\mu W/MHz)$ | 10T avg power $(\mu W/MHz)$ |
|--------------------|----------------------------|-----------------------------|
| 0.5                | 1.2737                     | 2.1346                      |
| 0.7                | 7.9914                     | 12.8576                     |
| 0.9                | 22.3398                    | 34.9591                     |
| 1.1                | 43.9402                    | 67.620                      |
| 1.3                | 72.5996                    | 109.07                      |
| 1.5                | 108.4849                   | 159.6915                    |

Table 3.28: 6T vs 10T SNM with VDD variation

| Supply voltage (V) | 6T  SNM (mV) | 10T  SNM (mV) |
|--------------------|--------------|---------------|
| 0.5                | 56.95        | 162.3         |
| 0.7                | 93.7         | 241.3         |
| 0.9                | 129.4        | 310.2         |
| 1.1                | 147.1        | 357.1         |
| 1.3                | 160.9        | 396           |
| 1.5                | 170.6        | 441.5         |

Power dissipation is a major concern with 10T SRAM because of the 4 extra NMOS transistors. As seen from Table 3.27, the average power dissipation of 10T is larger than that of 6T at every supply voltage. As memory dimensions shrink, power dissipation must be controlled, otherwise the system would fail, and low power devices are in demand. At a given supply voltage, however, the power of 10T cannot be brought down to that of 6T because of the 4 additional transistors.

On the other hand, the SNM of 10T is much larger than that of 6T because it has separate read and write ports, and this can be used to optimize power dissipation. 6T cannot work at lower voltages since its SNM degrades and the cell becomes prone to small noise voltages; it is generally used down to 0.9V only, while recent trends demand ultralow voltage memories. As seen from Table 3.28, the SNM at 0.5V is 162.3mV for 10T and 56.95mV for 6T. The power dissipation of 10T is 2.134 µW/MHz at 0.5V, whereas that of 6T is 22.33 µW/MHz at 0.9V. Because of its high SNM at 0.5V, the supply voltage of 10T can be reduced to lower the power dissipation, which is not possible in 6T because doing so would degrade the SNM significantly. In this way the average power dissipation of 10T SRAM can be optimized.
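The argument above amounts to picking, for each cell, the lowest supply voltage that still meets a required SNM, and comparing the power at those voltages. The Python sketch below illustrates this with the data of Tables 3.27 and 3.28. The SNM floor `MIN_SNM_MV` is a hypothetical value chosen here (the SNM of 6T at its 0.9V limit); it is not a specification from the original work.

```python
# SNM (mV) and average power (uW/MHz) versus supply voltage (V),
# copied from Tables 3.28 and 3.27.
SNM_MV = {
    "6T": {0.5: 56.95, 0.7: 93.7, 0.9: 129.4, 1.1: 147.1, 1.3: 160.9, 1.5: 170.6},
    "10T": {0.5: 162.3, 0.7: 241.3, 0.9: 310.2, 1.1: 357.1, 1.3: 396.0, 1.5: 441.5},
}
POWER_UW_PER_MHZ = {
    "6T": {0.5: 1.2737, 0.7: 7.9914, 0.9: 22.3398, 1.1: 43.9402, 1.3: 72.5996, 1.5: 108.4849},
    "10T": {0.5: 2.1346, 0.7: 12.8576, 0.9: 34.9591, 1.1: 67.620, 1.3: 109.07, 1.5: 159.6915},
}

MIN_SNM_MV = 129.4  # hypothetical SNM floor: 6T SNM at 0.9 V


def lowest_usable_vdd(cell, min_snm_mv):
    """Lowest tabulated supply voltage whose SNM meets the floor, or None."""
    for vdd in sorted(SNM_MV[cell]):
        if SNM_MV[cell][vdd] >= min_snm_mv:
            return vdd
    return None


def power_at_lowest_vdd(cell, min_snm_mv):
    """Average power at the lowest usable supply voltage, or None."""
    vdd = lowest_usable_vdd(cell, min_snm_mv)
    return None if vdd is None else POWER_UW_PER_MHZ[cell][vdd]


if __name__ == "__main__":
    for cell in ("6T", "10T"):
        vdd = lowest_usable_vdd(cell, MIN_SNM_MV)
        print(cell, vdd, power_at_lowest_vdd(cell, MIN_SNM_MV))
```

With this floor, 6T is limited to 0.9V (22.3398 µW/MHz) while 10T can run at 0.5V (2.1346 µW/MHz), which is the comparison made in the text.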

# Chapter 4

# 6T v/s 10T comparison

# 4.1 Area

| 6T Area $(\mu m^2)$ | 10T Area $(\mu m^2)$ | Area increment $(\%)$ |
|---------------------|----------------------|-----------------------|
| 0.0342              | 0.0846               | 147                   |

Table 4.1: 6T v/s 10T Area comparison

The table above compares the area of 6T and 10T SRAM. Area is the main disadvantage of 10T: as seen from the results, the area occupied by 10T is almost 2.5 times that of 6T. The main reason is the 4 extra NMOS transistors. These extra transistors are made larger in order to increase the read speed and read current; with normal sizing they would not give a performance boost. Increasing their size improves performance at the cost of area, and decreasing it degrades performance. The design parameters are significantly improved in 10T, so the 4 additional transistors must be sized such that all the design parameters are optimized with the minimum increase in area. The comparison shown here is for the optimum 10T SRAM.
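The area increment in Table 4.1 follows directly from the two cell areas. Writing $A_{6T}$ and $A_{10T}$ for the areas (notation introduced here for convenience):

$$
\text{Area increment} = \frac{A_{10T} - A_{6T}}{A_{6T}} \times 100 = \frac{0.0846 - 0.0342}{0.0342} \times 100 \approx 147\%
$$

Equivalently, $A_{10T} / A_{6T} = 0.0846 / 0.0342 \approx 2.47$, which is the "almost 2.5 times" quoted above.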

# 4.2 Read time



### i) Pull down size variation

Figure 4.1: Read time comparison with pull down size variation

### ii) Pass gate size variation

Figure 4.2: Read time comparison with pass gate size variation

### iii) Process variation

Figure 4.3: Read time comparison with process variation

### iv) Supply voltage variation

Figure 4.4: Read time comparison with supply voltage variation

### v) Temperature variation



Figure 4.5: Read time comparison with temperature variation

# 4.3 Read current

### i) Pull down size variation

Figure 4.6: Read current comparison with pull down size variation

### ii) Pass gate size variation

Figure 4.7: Read current comparison with pass gate size variation

### iii) Process variation

Figure 4.8: Read current comparison with process variation

### iv) Supply voltage variation

Figure 4.9: Read current comparison with supply voltage variation

### v) Temperature variation



Figure 4.10: Read current comparison with temperature variation

# 4.4 Leak current



### i) Pull down size variation

Figure 4.11: Leak current comparison with pull down size variation

### ii) Pass gate size variation

Figure 4.12: Leak current comparison with pass gate size variation

### iii) Process variation

Figure 4.13: Leak current comparison with process variation

### iv) Supply voltage variation

Figure 4.14: Leak current comparison with supply voltage variation

### v) Temperature variation

Figure 4.15: Leak current comparison with temperature variation

# 4.5 Read/Leak current ratio

Table 4.2: 6T v/s 10T Read/Leak current ratio comparison

| 6T Read/Leak current ratio | 10T Read/Leak current ratio | Increment (%)     |
|----------------------------|-----------------------------|-------------------|
| 25484                      | 29891                       | 17.3              |

The ratio of read current to leak current is known as the read/leak ratio, and it should be as high as possible. Suppose a column in the memory array has 10 bitcells, the read current of one bitcell is 9uA and the leak current is 1uA. During a read operation, 1 cell is active and 9 cells are idle, each contributing leak current. At the output it is then not possible to judge whether the current comes from the selected bitcell or from the total leak current. The read/leak ratio should therefore be as high as possible in order to increase the number of memory cells in a column. Here, the ratio of 10T is 17.3% higher than that of 6T, so a more densely packed array is possible with 10T.
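The 10-cell example above can be written as a simple check: the worst-case leakage on a bitline is the leak current of every unselected cell added together, and the read is ambiguous once that sum reaches the read current. The Python sketch below encodes only this inequality. A real sense amplifier needs considerably more margin, so the function is an illustrative lower bound rather than a design rule, and the names are assumptions made here.

```python
def aggregate_leak(cells_per_column, leak_per_cell):
    """Worst-case bitline leakage: every unselected cell in the column leaks."""
    return (cells_per_column - 1) * leak_per_cell


def read_is_distinguishable(cells_per_column, read_current, leak_per_cell):
    """True if the selected cell's read current exceeds the total leakage."""
    return read_current > aggregate_leak(cells_per_column, leak_per_cell)


if __name__ == "__main__":
    # Example from the text: 10 cells, 9 uA read current, 1 uA leak per cell.
    print(aggregate_leak(10, 1.0))                   # total leak in uA
    print(read_is_distinguishable(10, 9.0, 1.0))     # ambiguous read
    # 10T values from Chapter 3 (TT, 1.1 V, 25 C): 248 uA read, 8.31 nA leak.
    print(read_is_distinguishable(10, 248.0, 0.00831))
```

In the example from the text the total leak (9 uA) equals the read current (9 uA), so the check fails, whereas the measured 10T currents pass it by a wide margin.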

# 4.6 Static Noise Margin (SNM)



### i) Pull down size variation

Figure 4.16: SNM comparison with pull down size variation

### ii) Pull up size variation

Figure 4.17: SNM comparison with pull up size variation

### iii) Pass gate size variation

Figure 4.18: SNM comparison with pass gate size variation

### iv) Process variation

Figure 4.19: SNM comparison with process variation

### v) Supply voltage variation

Figure 4.20: SNM comparison with supply voltage variation

### vi) Temperature variation



Figure 4.21: SNM comparison with temperature variation

# 4.7 Average Power dissipation

### i) Average static power dissipation

Figure 4.22: Average static power dissipation with supply voltage variation

### ii) Average read power dissipation

Figure 4.23: Average read power dissipation with supply voltage variation

### iii) Average write power dissipation

Figure 4.24: Average write power dissipation with supply voltage variation

### iv) Average dynamic power dissipation

Figure 4.25: Average dynamic power dissipation with supply voltage variation

### v) Average power dissipation

Figure 4.26: Average power dissipation with supply voltage variation

The above figure compares the average power dissipation of 6T and 10T SRAM. The average power dissipation of 10T is larger than that of 6T over the whole supply voltage range, because 4 extra transistors are added in 10T to improve the performance. The main disadvantage of 6T is that its SNM degrades drastically when the supply voltage is reduced. The supply voltage of 6T therefore cannot be reduced beyond a limit, since doing so makes the bitcell prone to noise voltages. In 10T the SNM also decreases when the supply voltage is reduced, but it remains higher than that of 6T. 10T can therefore be operated at ultralow voltage to reduce the power dissipation without compromising stability.

# Chapter 5

# Memory Compiler analysis

# 5.1 Introduction

In System on Chip (SoC) design, memories of different sizes and aspect ratios are required, with different features for different design requirements. A memory compiler is the tool by which different instances of memory can be generated depending on the inputs given to it. In this chapter, the basic need for a memory compiler is described, followed by its design and development flow. The signal flow is then described in detail, and in the last section the various design and QA checks that are necessary for the correct functionality of the memory compiler are mentioned.

The role of the memory compiler is shown in Figure 5.1. The memory designer is the one who designs or architects the memory compiler using Memory Design Intellectual Property (IP). Based on the technology, different libraries for memory compilers are used to generate a specific memory instance (memory component). The IC designer (SoC designer) is the one who provides inputs to the memory compiler to generate a memory component based on the requirements of a particular SoC. The basic inputs to the memory compiler are NB (number of bits), NW (number of words), CD (Center Decoding), BK (Bank), CM (Column Muxing), PG (Power Gating), Vt (Periphery Vt option), BIST (Built In Self Test), Redundancy (redundancy required to replace a faulty column) and timing mode (FAST, SLOW, DEFAULT). Based on these inputs and many others, the memory compiler generates memory components of different sizes and features for the SoC. The main advantage of a memory compiler is the flexibility of choosing a memory instance of any size and features from the available ones for the SoC.

Figure 5.1: Role of memory compiler

In memory compiler design, it is sufficient to test the instances (memory components) generated through the memory compiler against a number of checks such as timing checks, functional checks, leakage checks and so on. After the schematic is designed, design checks are performed first to ensure a correct design. The design checks include latch check, Hi-Z & P-check, daisychain analysis etc. After the characterization process, thorough Quality Assurance (QA) checks are performed to check the performance of the instances. If any check fails, the problem is analysed and efforts are made to solve it. The QA checks ensure that the memory compiler won't fail in the field. The detailed memory compiler development flow is shown in the next section.

# 5.2 Memory Compiler development flow



Figure 5.2: Memory compiler development flow

The design and development flow for a memory compiler is shown above. It starts with the specification from the user, i.e. what type of memory and which features they would like in the memory. Memories are of different types, e.g. high speed, high density, low leakage, register file etc., and features include bank structure, column muxing option, dual supply for periphery and array, power gating options for low power memories, BIST and redundancy enable options etc. The customer chooses the type and features of the memory, and after that the design starts. Bitcell analysis is the first step in the design: all the parameters of the bitcell are analysed and optimized as per the requirement. The schematic is prepared after the bitcell is finalized. Once the schematic is prepared, the required layout preparation also starts in parallel with the design phase. The front end team also starts preparing the templates and plugins required for design simulation. From the schematic, the placement file is generated. Failure cannot be afforded, so before compiler creation starts it must be ensured that the design is functional. Design checks are therefore performed first on the design.

Design checks include latch check, daisychain analysis, wakeup analysis and Hi-Z & P-check. All these checks ensure correctness of the design. After this, characterization of the design starts. Characterization is the prediction of the actual behaviour of the circuit on silicon using simulations. This is a difficult task as it includes all the RC considerations and other variations which might occur during fabrication. Monte Carlo simulation is performed to identify the effect of PVT variation on performance. The mean and sigma of the Gaussian distribution determine the trade-off between accuracy and reliability. At the same time, the layout is prepared in parallel with design characterization. If some difficulty or error occurs in the design checks, the design is changed to eliminate the error, and hence that part of the layout also needs to be changed. After the characterization runs successfully and meets the requirements, the layout is finalized and layout checks are performed.

After the compiler is created, it must be validated before it goes on silicon to ensure that it will work properly post silicon. Quality Assurance (QA) checks are therefore performed, which is the final step in the compiler development flow. Many checks are performed to validate the timing requirements, power dissipation, IR drop, functional behaviour etc. of the compiler, and their effects are analysed. If any error occurs in these checks, it must be fixed before proceeding further, so that all the difficulties are resolved and all checks are passed successfully. After all the issues are fixed, the compiler is ready to be released to the customer. All the required documentation is done and finally the compiler is released. Compiler release takes place in two phases: 1) Front End (FE) release, in which only functional views are provided and no GDS (Graphical design structure); 2) Back End (BE) release, in which GDS is also provided to the customer and which is the final release.

# 5.3 Memory Compiler features

The compiler provides different features in the default configuration and also provides flexibility to the system designer to choose from a set of optional features. Some of the basic features of memory compiler are listed below.

- Synchronous operation
- Functional voltage range for SLOW, DEFAULT and FAST MODE
- Temperature range from -40°C to 125°C
- Minimum area for any instance selected
- 5 sigma accuracy
- Low leakage
- Word length variation as per MUX option selected

Some additional features can also be selected to increase the flexibility and performance of the compiler. As the number of features increases, the complexity of the design also increases. The additional features include:

#### 1) Center decode:

If in a memory array the number of columns is more than the number of rows, i.e. the memory is too wide, the RC delay of the wordline increases too much. This affects its rising slope, and hence the time required to access the bitcell increases. If the number of rows is more, the capacitive load on the bitlines increases, so charging and discharging of the bitlines takes longer and the speed of the memory is reduced. If the effective length of the wordline is reduced, the RC delay is reduced and the performance increases. This leads to the concept of center decoding. Figure 5.3 depicts the concept behind center decoding and how it differs from conventional side decoding.

Figure 5.3: Center decoding concept

The first part of the above figure shows the conventional architecture in a memory compiler, where the control circuitry and the row decoders are located at the left side. To access the cell at the extreme right, the wordline has to travel a longer distance, which increases the RC delay. With the center decoding scheme, the control circuitry and row decoders are placed in the center of the memory, as shown in the second part of the figure. This halves the maximum wordline travel length, and so the speed is increased. However, the complexity of the design increases, and a multiplexer and extra buffers are needed to select which side is accessed for a read or write operation. The overall performance of the memory increases with the CD scheme.
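The benefit of halving the wordline can be estimated to first order. As an assumption that is not part of the original analysis, model the wordline as a uniform distributed RC line with resistance $r$ and capacitance $c$ per unit length, and ignore the driver resistance. The Elmore delay to the far end of a line of length $L$ is then

$$
t_{WL} \approx \frac{1}{2} \, r \, c \, L^{2}
$$

With center decoding the farthest cell is at a distance $L/2$ from the decoder, so

$$
t_{WL,CD} \approx \frac{1}{2} \, r \, c \left(\frac{L}{2}\right)^{2} = \frac{1}{4} \, t_{WL}
$$

Under this simplified model the wire delay drops to about one quarter. The gain in a real instance is smaller, since the driver delay and the delay of the added multiplexer and buffers are not included here.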



#### 2) Column muxing:

Figure 5.4: Column Muxing

Often the memory has to be accommodated in a very compact predefined area on the SoC. It might happen that a square area is available on the SoC while the memory instance is rectangular. Column muxing helps to adjust the aspect ratio of the selected memory instance by reducing the height and increasing the width of the memory.

As shown in the above figure, on the left side we have an 8x4 memory (8 rows and 4 columns). Now we want to decrease the column height while keeping the same memory size, so we set the column muxing option to 2. Each column is then divided into two parts, the number of physical rows reduces to 4 and the number of physical columns increases to 8. Originally we required 3 address lines to select a particular row. Now we require 2 address lines to select a row, and the third address line is used to select a particular part of a column, i.e. the left or the right part. The column muxing option thus gives the user the flexibility to select the aspect ratio of the memory. If we choose column mux = 4, the number of physical rows reduces to 2 and the number of physical columns increases to 16.
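The arithmetic of the example above can be sketched as follows. The function name and the returned field names are hypothetical, and the sketch assumes that the number of words is divisible by the column mux value.

```python
def physical_organization(words, bits_per_word, column_mux):
    """Return physical rows/columns and the address-bit split for a column mux value."""
    if column_mux < 1 or words % column_mux != 0:
        raise ValueError("words must be a multiple of column_mux")
    rows = words // column_mux
    cols = bits_per_word * column_mux
    return {
        "rows": rows,
        "cols": cols,
        # ceil(log2(n)) address bits are needed to select one of n items
        "row_addr_bits": (rows - 1).bit_length(),
        "col_addr_bits": (column_mux - 1).bit_length(),
    }


# The 8-word x 4-bit memory from the text, for column mux = 1, 2 and 4.
for mux in (1, 2, 4):
    print(mux, physical_organization(words=8, bits_per_word=4, column_mux=mux))
```

The total number of address bits stays at 3 in every case; column muxing only moves bits from the row decoder to the column select.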

#### 3) Bank architecture:

When the number of physical columns in a memory is less than the number of physical rows, i.e. when the memory is tall, the RC delay of the bitline is very high. This degrades the rising and falling slopes of the bitlines, and the access time of the bitcell increases significantly. If we reduce the length of the bitline, the RC delay is also reduced and the performance of the compiler improves.

This leads to the concept of bank architecture, where we divide the rows into groups and each group is called a bank. Each bank is controlled by its own local control block, and a multiplexer is used to select a particular bank. In this architecture, a bank address is therefore provided along with the row and column address. The architecture breaks the bitline into sections and increases the performance of the design at the cost of some multiplexing overhead.
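The address decomposition described above can be sketched as a simple bit-field split. The field order used here (bank bits in the most significant positions, column-mux bits in the least significant positions) and the function name are assumptions made for illustration; the actual mapping depends on the compiler.

```python
def split_address(addr, bank_bits, row_bits, col_bits):
    """Split a flat address into (bank, row, column) fields.

    Assumed layout, MSB to LSB: [bank | row | column].
    """
    col = addr & ((1 << col_bits) - 1)
    row = (addr >> col_bits) & ((1 << row_bits) - 1)
    bank = (addr >> (col_bits + row_bits)) & ((1 << bank_bits) - 1)
    return bank, row, col


# Hypothetical instance: 4 banks, 8 rows per bank, column mux of 2.
bank, row, col = split_address(0b101101, bank_bits=2, row_bits=3, col_bits=1)
print(bank, row, col)
```

With this split, each bitline only spans the rows of one bank (8 rows in the hypothetical instance) instead of all 32 rows.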

#### 4) Dualrail support:

Dualrail support is also known as dynamic voltage support. Here we use two different supply voltages, one for the bitcell array and the other for the periphery devices, in order to reduce the power dissipation. Generally, only the accessed cell in the memory array is active in a clock cycle and the rest are off, whereas the periphery devices are ON in every clock cycle, so it is the power dissipation of the periphery that needs to be reduced. The periphery voltage is kept lower than the array voltage to reduce the power dissipation, while the array voltage is kept high to maintain the performance of the compiler. The array voltage is denoted by vdda and the periphery voltage by vddp. This increases the complexity of the design but reduces the power dissipation, so the feature is important for low power applications.
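The saving can be estimated with the standard dynamic power relation P = a·C·V²·f. The sketch below assumes that periphery power is dominated by dynamic switching power; the capacitance, frequency and supply values are hypothetical and are not taken from this work.

```python
def dynamic_power(c_switched_f, vdd_v, freq_hz, activity=1.0):
    """Dynamic switching power in watts: activity * C * Vdd^2 * f."""
    return activity * c_switched_f * vdd_v ** 2 * freq_hz


# Hypothetical operating point.
C_PERIPHERY = 1e-12  # switched periphery capacitance, farad
FREQ = 1e9           # clock frequency, hertz
VDDA = 1.0           # array supply, volt
VDDP = 0.8           # periphery supply, volt

single_rail = dynamic_power(C_PERIPHERY, VDDA, FREQ)  # periphery run at vdda
dual_rail = dynamic_power(C_PERIPHERY, VDDP, FREQ)    # periphery run at vddp

print(f"periphery power, single rail: {single_rail * 1e3:.2f} mW")
print(f"periphery power, dual rail  : {dual_rail * 1e3:.2f} mW")
print(f"ratio                       : {dual_rail / single_rail:.2f}")
```

Because dynamic power depends on the square of the supply voltage, a 20% reduction of vddp in this example gives roughly a 36% reduction in periphery dynamic power.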

#### 5) BIST interface:

Built-in Self Test (BIST) is another very important feature of a memory compiler. BIST is the mechanism by which a circuit can test its own operation. When BIST is enabled, the whole BIST design becomes part of the memory architecture. All test patterns and the test clock for BIST are generated internally, and the data and addresses coming from outside are isolated or bypassed by the BIST-controlled MUX.

#### 6) Power gating option:

As the technology advances to smaller nodes, the devices get faster, but at the same time the second order effects of the transistor become dominant. The subthreshold leakage increases, which in turn increases the overall leakage of the design. Leakage current is the main difficulty in planar technology when the gate length is reduced. To overcome this, a non-planar technology such as FinFET can be used, but it consumes significant area. So instead of changing the technology, some supporting circuitry is added to reduce the leakage. We provide three modes to reduce power: 1) LS (Light Sleep), 2) DS (Deep Sleep) and 3) SD (Shutdown). In Light Sleep mode we increase the substrate voltage to raise the threshold voltage, which reduces leakage. In Deep Sleep mode we cut the supply of the periphery, which dissipates more power than the array, and in Shutdown mode we cut the supply of both the array and the periphery.
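The three modes can be summarised as a small lookup table. The names below, the added ACTIVE entry and the data-retention helper are illustrative assumptions; the retention behaviour is inferred only from whether the array supply stays on, and is not a statement about a specific compiler.

```python
from enum import Enum


class PowerMode(Enum):
    ACTIVE = "active"    # assumed normal operating mode, added for comparison
    LIGHT_SLEEP = "LS"
    DEEP_SLEEP = "DS"
    SHUTDOWN = "SD"


# mode -> (array supply on, periphery supply on, raised substrate bias)
SUPPLY_STATE = {
    PowerMode.ACTIVE: (True, True, False),
    PowerMode.LIGHT_SLEEP: (True, True, True),
    PowerMode.DEEP_SLEEP: (True, False, False),
    PowerMode.SHUTDOWN: (False, False, False),
}


def retains_data(mode):
    """An SRAM array keeps its contents only while the array supply is on."""
    array_on, _periphery_on, _bias_raised = SUPPLY_STATE[mode]
    return array_on


for mode in PowerMode:
    print(mode.value, SUPPLY_STATE[mode], "retains data:", retains_data(mode))
```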

#### 7) Different Vt option:

By reducing the threshold voltage of the transistors, the time required to turn the transistors on and off is reduced. Generally, the bitcells are fast but the periphery that is designed to access a particular cell is slow. So, to speed up the periphery, low threshold voltage (Vt) devices can be used. The disadvantage of low Vt devices is that they increase the subthreshold current significantly, which increases the overall power consumption.

# 5.4 Signal flow in memory compiler



## 5.4.1 Read signal flow

Figure 5.5: Read signal flow in compiler

The read signal flow in a memory compiler is shown above. First, the address signals are latched in the control circuit block. Then the clock, Memory Enable (ME) and Write (WE) signals arrive and are also latched. Latching is done so that the input pins can accept the next address, ME or WE signals while the current operation is still in progress. The clock signal triggers the internal clock (INT CLK), which is a latched version of the clock. The address (ADR) is split into an X-address and a Y-address, which are used to select a particular wordline and bitline. These address lines select the particular cell and the respective operation is performed on it. There is a reference array at the top which determines the maximum time required for an operation; after that time period it stops INT CLK and allows the next cycle to proceed. The time taken by the reference signal to travel through the reference array equals the time required to access the corner cell. So if the reference signal has travelled through and reached the control block and the operation has still not been performed, i.e. the cell has not been read or written, a fault has occurred. The reference signal generates the SAE and STOP INT CLK signals to ensure correct data at the output for a read operation. When INT CLK is stopped, the memory is in the idle state and waits for the next clock cycle.
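The self-timing idea behind the reference array can be sketched with a very simple timing model. The sketch assumes that the sense amplifier enable (SAE) fires exactly when the reference signal returns to the control block; the access times and the function name are hypothetical.

```python
def self_timed_read(cell_access_ps, reference_delay_ps):
    """Return the SAE time and the indices of cells that are not ready by then."""
    sae_time_ps = reference_delay_ps  # SAE fires when the reference signal returns
    late_cells = [i for i, t in enumerate(cell_access_ps) if t > sae_time_ps]
    return sae_time_ps, late_cells


# Hypothetical access times in ps; the last entry is the far corner cell.
access_times = [120, 150, 180, 210]

# A reference path that tracks the corner cell leaves no cell unread.
sae, late = self_timed_read(access_times, reference_delay_ps=max(access_times))
print("SAE at", sae, "ps, late cells:", late)

# A reference path that is too fast would fire SAE before the corner cell is ready.
sae, late = self_timed_read(access_times, reference_delay_ps=200)
print("SAE at", sae, "ps, late cells:", late)
```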

## 5.4.2 Write signal flow



Figure 5.6: Write signal flow in compiler

The figure above shows the write signal flow in a memory compiler. First, the address signals are latched in the control circuit block. Then the clock, Memory Enable (ME) and Write (WE) signals arrive and are also latched. Latching is done so that the input pins can accept the next address, ME or WE signals while the current operation is still in progress. The clock signal triggers the internal clock (INT CLK), which is a latched version of the clock. The address (ADR) is split into an X-address and a Y-address, which are used to select a particular wordline and bitline. INT CLK triggers RWL, the reference wordline signal, which is used to perform a dummy write operation after a particular time period. This time period is the maximum time that a wordline can take to access the corner cell. RWL, along with the REF BL WRITE signal, performs the dummy write operation while the real write operation is going on at the same time. After the dummy write operation is over, it generates a signal to stop INT CLK, which indicates that by this time the real write operation has been performed and we are ready for the next clock cycle. Meanwhile, the real write operation is carried out and, based on column muxing, a series of bits is output to the write driver. Note that a whole word is accessed here and not a single bit. This then goes to the output buffer.

# 5.5 QA checks

## 5.5.1 Primetime

The Primetime check is used for timing analysis and also verifies the Verilog model. It checks the Synopsys library syntax and ensures that the timing arcs defined in the Synopsys library exist in the Verilog model. The Synopsys PrimeTime suite, including PrimeTime, PrimeTime SI, PrimeTime PX and PrimeTime VX, provides a single, golden, trusted signoff solution for timing, signal integrity, power and variation-aware analysis.

## 5.5.2 Timever

Timever is a timing verification tool. Timing verification is the process of determining that a given design can be operated at a specific clock frequency without errors caused by a signal arriving too soon or too late. For example, if the data input of a latch arrives after the closing edge of the clock (a setup violation), or if the data input changes before the closing edge of the previous clock (a hold violation), the latch may not store the data correctly. This check adds timing parameters to the Verilog model and checks read/write setup and hold times and tcc on the Verilog model.
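The kind of check involved can be illustrated with a generic setup/hold window test around a single capturing clock edge. This is a minimal sketch of the textbook definition, not the Timever algorithm; the function name and the numeric values are assumptions.

```python
def check_setup_hold(data_change_ps, clock_edge_ps, setup_ps, hold_ps):
    """Classify one data transition against the setup/hold window of one clock edge.

    The data must be stable from (edge - setup) to (edge + hold).
    Returns "setup", "hold", or None when the transition is outside the window.
    """
    if clock_edge_ps - setup_ps < data_change_ps <= clock_edge_ps:
        return "setup"  # data changed too late before the capturing edge
    if clock_edge_ps < data_change_ps < clock_edge_ps + hold_ps:
        return "hold"   # data changed too soon after the capturing edge
    return None


# Hypothetical numbers: clock edge at 100 ps, 10 ps setup, 5 ps hold.
for t in (80, 95, 102, 110):
    print(t, check_setup_hold(t, clock_edge_ps=100, setup_ps=10, hold_ps=5))
```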

## 5.5.3 LibCompare

LibCompare compares the datasheets of the different compiler versions specified in a tcl file. Area, timing and power are compared between the two compilers, and an Excel sheet showing the percentage deviation is reported for the specified parameters.
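The percentage deviation reported by such a comparison can be sketched as below. The parameter names and values are hypothetical, and the sketch does not reproduce the actual LibCompare input or output format.

```python
def percent_deviation(reference, candidate):
    """Percentage deviation of each parameter present in both datasheets."""
    return {
        name: 100.0 * (candidate[name] - reference[name]) / reference[name]
        for name in reference
        if name in candidate
    }


# Hypothetical datasheet values for the same instance in two compiler versions.
old_version = {"area_um2": 1000.0, "tcc_ns": 0.50, "leakage_uW": 2.0}
new_version = {"area_um2": 1100.0, "tcc_ns": 0.45, "leakage_uW": 2.0}

for name, dev in percent_deviation(old_version, new_version).items():
    print(f"{name}: {dev:+.1f}%")
```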

## 5.5.4 Ccsn & Ccst

Ccsn is the composite current source model for noise. Noise analysis is performed using this model. CCS noise data is generated to calculate noise accurately and to verify the noise immunity of the cells for compilers.

Ccst is the composite current source model for timing. This model is used for timing verification. The open-source CCS model for timing is a part of the Liberty format specification. Characterization guidelines, development tools, and library validation and correlation tools are available to speed up the library characterization and qualification process. To expedite and simplify CCS library qualification, Synopsys provides a library QA capability as part of Library Compiler that can be used to check the completeness and accuracy of all acquired CCS timing models in a library. In addition, Library Compiler can be used in correlation mode to verify the accuracy of the characterization.

## 5.5.5 Espcv

Espcv is a formal equivalence check between the Verilog model and a structural model created by the tool, used to verify functionality. ESP-CV is a symbolic simulation-based formal verification tool intended to perform custom equivalence (EQ) checking and provide functional verification coverage for full-custom IC design.

## 5.5.6 Redhawk

Redhawk is a power integrity solution. It is an IR/power analysis tool that enables more effective and accurate static and dynamic analysis for the on state and the ramp-up mode. Different Redhawk utilities can be used to run the IR/power analysis.

## 5.5.7 Familyverify

This check reports all the files and their corresponding errors, and also checks the structure of the compiler. Compiler family validation tests are intended to assure the quality of a compiler before it is released. Normally, family validation is run in order to:

- Validate that an existing compiler conforms to one or more family definitions
- Update an existing compiler to force it to conform to one or more family definitions
- Create a new compiler that already conforms to one or more family definitions

## 5.5.8 IQA

IQA is the integrated QA. It checks the full compiler. For corner instances it checks Lib vs db, pin transitions, etc. It ensures the quality of the instances before the compilers are released. The IQA check also includes the capability to ensure that antenna diodes are always present. Transistor recognition is performed so that a given piece of diffusion geometry can be recognized as a source or drain of a gate. If it is not a source/drain, it is a diode and is recorded appropriately.

# Chapter 6

# **Conclusion and Future scope**

From the analysis of the 10T SRAM, we found that it performs better than the 6T cell, but its major disadvantages are the area occupied and the power dissipation. The power dissipation can be optimized by operating the 10T cell at ultralow voltages. Another disadvantage is the leakage current, which is higher than that of 6T, but the overall read/leak current ratio of 10T is higher, due to which we can accommodate a larger number of physical rows in the memory array. 10T is also much faster for the read operation, while the write speed is the same as 6T. The main advantage of 10T is its stability: because it has separate read and write ports, the storage nodes are not affected during the read operation, so its SNM is high. We conclude that with increasing demand for speed, stability and low power, 10T is suitable for next generation dual port SRAM.

With the increasing need for memories of different sizes and features on an SoC, a memory compiler is the only solution for providing different memory instances. The customer can choose different options to increase flexibility, and can also choose the type of memory required, such as high speed, high density, low leakage or register file.

In future, the design of the 10T cell can be made application specific, depending on whether speed, low power or high density is required. A single read bitline 10T cell can be used for low power applications. For memory compilers, as technology scales down, a memory compiler can be implemented in a newer technology to improve the performance.

# References

- [1] Sung-Mo Kang, Yusuf Leblebici, "CMOS Digital Integrated Circuits Analysis and Design", Tata McGraw-Hill, 2003. pp. 402-430
- [2] David A. Hodges, "Semiconductor Memories". pp. 359-376
- [3] S. Khan, I. Agbo, S. Hamdioui, H. Kukner, B. Kaczer, P. Raghavan, F. Catthoor, "Bias Temperature Instability analysis of FinFET based SRAM cells", IEEE Design, Automation and Test in Europe Conference and Exhibition (DATE), pp. 1-6, March 2014
- [4] S. K. Singh, S. V. Singh, B. K. Kausik, C. Chauhan, T. Tripathi, "Characterization and improvement of SNM in deep submicron SNM design", IEEE International conference on Signal Processing and Integrated Networks (SPIN), pp. 538-542, February 2014
- [5] Sapna Singh, Neha Arora, Meenakshi Suthar, Neha Gupta, "PERFORMANCE EVALUATION OF DIFFERENT SRAM CELL STRUCTURE AT DIFFER-ENT TECHNOLOGIES", International Journal of VLSI design & Communication Systems (VLSICS) Vol.3, No.1, February 2012
- [6] Hiroki Noguchi, Shunsuke Okumura, Yusuke Iguchi, Hidehiro Fujiwara, Yasuhiro Morita, Koji Nii, Hiroshi Kawaguchi, Masahiko Yoshimoto, "Which is the Best Dual Port SRAM in 45-nm Technology? —8T, 10T Single End and 10T Differential", IEEE, 2008

- [7] Y.Nakagome, K. Itoh, M. Isoda, K. Takeuchi, and M.Aoki, "Architecture and Design of a High Performance SRAM for low power Applications", Symposium on VLSI Circuits, IEEE International Digest of Technical Papers, June 2002, pp. 82-83
- [8] http://iopscience.iop.org/00344885/75/7/076502;jsessionid=31910C49695E882B3BD12FB3B03903F8.c2
- [9] http://www.cse.scu.edu/~tschwarz/coen180\_04/LN/sram.html