Cache
Published:
Memory Types
Static RAM
- 6 Transistors per bit, of which make up two inverters & two other for off/on
- Optimized of speed primarly and density secondly
- Volatile
Dynamic RAM
- 1 Transistor and 1 Capacitor per bit
- Optimized for density
- Volatile
Volatile Storage
- Magnetic Disk
- Flash RAM
Memory Hierachy
Registers
- Accessed Directly via user level ISA
Memory
- Accessed Indirectly via user level ISA
Disk
- Not Accessed via user level USA
Memory Hierachy In Detail
Level:
0 - Registers.
1 - Primary Cache.
2 - Secondary Level Cache ~ Made of SRAM.
3 - Main Memory ~ Made of DRAM.
- Typically 1GB to 16GB for computers.
- In the hundreds of GB’s for servers.
4 - Disk (Swap & Files).
Locality
Temporal Locality
- Recently referenced data/instructions likely to be referenced soon again.
- Examples are for instance data in inner loops or hot global data.
- Here the hierachy is reactive; Move things up when accessed.
Spatial Locality
- Data/instructions nearby referenced data/instruction likely to be referenced soon.
- Examples are elements in an array, fields in a struct.
- Hierachy is proactive; Move things up speculatively.
Basic Memory Array Structure
Number of Entries
- 2^n; n is number of address bits
- Decoder changes n-bit address to 2^n bit signal
Size of Entries
- Width of data accessed, 256 bits / 32 bytes in example
Finding Data via Indexing
Basic Cache ~ Array of Block Frames
- 32 KB Cache ( 1024 Frames, 32B Blocks)
Finding Frame
- For a 32-bit address
- 32B Blocks ~ 5 Lowest Bits Locate Byte in Block ~ Offset Bits
- 1024 Frames ~ 10 bits Find Frame ~ Index Bits
Tags
- Each Frame Can Hold Multiple Blocks
- All blocks with the same index bit pattern
Lookup Algorithm
1 Read Frame Indicated By Index Bits
2 if(tag matches && valid bit set)
then hit -> data is good
else miss -> data not good
Handling Cache Miss
Tag overhead of 32KB cache with 1024 32B
- 32B Frames ~ 5 Bit Offset
- 1024 Frames ~ 10 Bit Index
- 32-bit Address ~ 32-bit Address - 5 Bit Offset - 10 Bit Index = 17 Bit Tag
- (17 Bit + 1 Bit Valid) * 1024 Frames = 18Kb Tags = 2.2KB ~~~ 2.2/32 = 0.68 ~ 6.8%
Cache Performance
- tavg = t_hit + (%miss * t_miss)
- %miss will change depending on program because:
- %miss = #miss/#accesses
- Each program will have a different number of accesses
Cache Calculations
4 Bit Address ~ 8B Cache, 2B Blocks
- Number of Frames = (Capacity/Block Size) = 8/2 = 4 Frames
- Offset = log2(2) = 1 ~ 0000
- Index = log2(#Frames) = log2(4) = 2 ~ 0000
- Tag = 4 - 1 - 2 = 0000