Generic Primary and Secondary Memory - Lecture Notes | CPSC 5155G

Material Type: Notes; Professor: Bosworth; Class: Computer Architecture; Subject: Computer Science; University: Columbus State University; Term: Fall 2006;

Generic Primary / Secondary Memory

This lecture covers two related subjects: Virtual Memory and Cache Memory. In each case, we have a fast primary memory backed by a bigger secondary memory. The "actors" in the two cases are as follows:

    Technology        Primary Memory        Secondary Memory       Block
    Cache Memory      SRAM Cache            DRAM Main Memory       Cache Line
    Virtual Memory    DRAM Main Memory      Disk Memory            Page
    Access Time       TP (Primary Time)     TS (Secondary Time)

Effective Access Time: TE = h·TP + (1 – h)·TS, where h (the primary hit rate) is the fraction of memory accesses satisfied by the primary memory; 0.0 ≤ h ≤ 1.0.

This formula does extend to multi–level caches. For example, a two–level cache has

    TE = h1·T1 + (1 – h1)·h2·T2 + (1 – h1)·(1 – h2)·TS.

NOTATION WARNING: In some contexts, the DRAM main memory is called "primary memory". I never use that terminology when discussing multi–level memory.

Examples: Cache Memory

Suppose a single cache fronting a main memory, which has an 80 nanosecond access time. Suppose the cache memory has an access time of 10 nanoseconds.

If the hit rate is 90%, then
    TE = 0.9·10.0 + (1 – 0.9)·80.0
       = 0.9·10.0 + 0.1·80.0
       = 9.0 + 8.0 = 17.0 nsec.

If the hit rate is 99%, then
    TE = 0.99·10.0 + (1 – 0.99)·80.0
       = 0.99·10.0 + 0.01·80.0
       = 9.9 + 0.8 = 10.7 nsec.

Suppose an L1 cache with T1 = 4 nanoseconds and h1 = 0.90.
Suppose an L2 cache with T2 = 10 nanoseconds and h2 = 0.99. (h2 is defined over references that are a miss at L1: it is the fraction of those references that hit in L2.)
Suppose a main memory with TS = 80.0 nanoseconds.

    TE = h1·T1 + (1 – h1)·h2·T2 + (1 – h1)·(1 – h2)·TS
       = 0.90·4.0 + 0.1·0.99·10.0 + 0.1·0.01·80.0
       = 0.90·4.0 + 0.1·9.9 + 0.1·0.80
       = 3.6 + 0.99 + 0.08 = 4.67 nanoseconds.

Note that with these hit rates, only 0.1·0.01 = 0.001 = 0.1% of the memory references are handled by the much slower main memory.

Generic Primary / Secondary Memory View

A small, fast, expensive memory is backed by a large, slow, cheap memory. Memory references are first made to the smaller memory.

1. If the address is present, we have a "hit".
2. If the address is absent, we have a "miss" and must transfer the addressed item from the slow memory. For efficiency, we transfer as a unit the block containing the addressed item.

The mapping of the secondary memory to the primary memory is "many to one" in that each primary memory block can hold the contents of any one of a number of secondary memory addresses. To identify which addresses are currently resident, we associate a tag with each primary block.

For example, consider a byte–addressable memory with 24–bit addresses and 16–byte blocks. The memory address would have six hexadecimal digits. Consider the 24–bit address 0xAB7129. The block containing that address is every item with an address beginning with 0xAB712: 0xAB7120, 0xAB7121, … , 0xAB7129, 0xAB712A, … 0xAB712F. The primary block would have 16 entries, indexed 0 through F. It would have the 20–bit tag 0xAB712 associated with the block, either explicitly or implicitly.

Valid and Dirty Bits

At system start–up, the faster memory contains no valid data; data are copied into it as needed from the slower memory. Each block would have three fields associated with it:

The tag field (discussed above), identifying the memory addresses contained.

Valid bit    set to 0 at system start–up;
             set to 1 when valid data have been copied into the block.

Dirty bit    set to 0 at system start–up;
             set to 1 whenever the CPU writes to the faster memory;
             set to 0 whenever the contents are copied to the slower memory.
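To make this bookkeeping concrete, here is a minimal C sketch, assuming the byte–addressable memory with 24–bit addresses and 16–byte blocks from the example above; the structure and function names are my own and are not part of the lecture.

    #include <stdint.h>
    #include <stdio.h>

    /* One primary-memory block: 16 data bytes plus the tag, valid and
       dirty fields described above.  Names are illustrative only.      */
    struct block {
        uint32_t tag;       /* 20-bit tag, e.g. 0xAB712                 */
        int      valid;     /* 0 at start-up, 1 once data are copied in */
        int      dirty;     /* 1 after a CPU write, 0 after copy-back   */
        uint8_t  data[16];  /* the 16 bytes of the block                */
    };

    /* Split a 24-bit address into a 20-bit tag and a 4-bit offset,
       assuming 16-byte blocks as in the example above.                 */
    static uint32_t tag_of(uint32_t addr)    { return (addr >> 4) & 0xFFFFF; }
    static uint32_t offset_of(uint32_t addr) { return addr & 0xF; }

    int main(void) {
        uint32_t addr = 0xAB7129;
        struct block b = { .tag = tag_of(addr), .valid = 1, .dirty = 0, .data = {0} };

        printf("address 0x%06X -> tag 0x%05X, offset 0x%X\n",
               addr, tag_of(addr), offset_of(addr));
        printf("block: tag 0x%05X, valid %d, dirty %d\n", b.tag, b.valid, b.dirty);
        return 0;
    }

Running this prints tag 0xAB712 and offset 0x9 for the address 0xAB7129, matching the worked example above.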
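The effective–access–time arithmetic worked in the cache examples above can also be checked with a few lines of C. This is only a sketch of the formula TE = h·TP + (1 – h)·TS and its two–level extension; the function names and layout are mine, and the numbers are the ones used in the lecture.

    #include <stdio.h>

    /* TE = h*TP + (1 - h)*TS for a single primary memory fronting a
       secondary memory.                                                */
    static double te_one_level(double h, double tp, double ts) {
        return h * tp + (1.0 - h) * ts;
    }

    /* Two-level extension: h2 is the hit rate among references that
       miss at level 1.                                                 */
    static double te_two_level(double h1, double t1,
                               double h2, double t2, double ts) {
        return h1 * t1 + (1.0 - h1) * h2 * t2 + (1.0 - h1) * (1.0 - h2) * ts;
    }

    int main(void) {
        printf("%.1f ns\n", te_one_level(0.90, 10.0, 80.0));            /* 17.0 ns */
        printf("%.1f ns\n", te_one_level(0.99, 10.0, 80.0));            /* 10.7 ns */
        printf("%.2f ns\n", te_two_level(0.90, 4.0, 0.99, 10.0, 80.0)); /* 4.67 ns */
        return 0;
    }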
Associative Memory

Associative memory is "content addressable" memory: the contents of the memory are searched in one memory cycle.

Consider an array of 256 entries, indexed from 0 to 255 (or 0x0 to 0xFF). Suppose that we are searching the memory for the entry 0xAB712. Normal memory would be searched using a standard search algorithm, as learned in beginning programming classes. If the memory is unordered, it would take on average 128 searches to find an item. If the memory is ordered, binary search would find it in 8 searches. Associative memory would find the item in one search. Think of the control circuitry as "broadcasting" the data value (here 0xAB712) to all memory cells at the same time. If one of the memory cells has the value, it raises a Boolean flag and the item is found.

We do not consider duplicate entries in the associative memory. This can be handled by some rather straightforward circuitry, but is not done in associative caches.

Set–Associative Caches

An N–way set–associative cache uses direct mapping, but allows a set of N memory blocks to be stored in each cache line. This allows some of the flexibility of a fully associative cache, without the complexity of a large associative memory for searching the cache.

Suppose a 2–way set–associative implementation of the same cache memory: again assume 256 cache lines, each holding 16 bytes, and a 24–bit address. Recall that 256 = 2^8, so we need eight bits to select the cache line.

Consider the addresses 0xCD4128 and 0xAB7129. Each would be stored in cache line 0x12. Set 0 of this cache line would hold one block, and set 1 would hold the other:

    Cache line 0x12
                 D   V   Tag     Contents
    Entry 0      1   1   0xCD4   M[0xCD4120] to M[0xCD412F]
    Entry 1      0   1   0xAB7   M[0xAB7120] to M[0xAB712F]

Virtual Memory (Again)

Suppose we want to support 32–bit logical addresses in a system in which physical memory is 24–bit addressable. We can follow the primary / secondary memory strategy seen in cache memory. We shall see this again when we study virtual memory in a later lecture. For now, we just note that the address structure of the disk determines the structure of virtual memory.

Each disk stores data in blocks of 512 bytes, called sectors. In some older disks, it is not possible to address each sector directly. This is due to the limitations of older file organization schemes, such as FAT–16. FAT–16 used a 16–bit addressing scheme for disk access, so 2^16 sectors could be addressed. Since each sector contained 2^9 bytes, the maximum disk size under "pure FAT–16" is 2^25 bytes = 2^5 · 2^20 bytes = 32 MB. To allow for larger disks, it was decided that a cluster of 2^K sectors would be the smallest addressable unit. Thus one would get clusters of 1,024 bytes, 2,048 bytes, etc. Virtual memory transfers data in units of clusters, the size of which is system dependent.
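The FAT–16 limit above is simple exponent arithmetic, and a short C sketch can verify it. The names here are mine; the sketch assumes the 2^16 addressable sector numbers and 512–byte sectors stated above, then shows how grouping 2^K sectors into a cluster raises the limit (the raised limits are my own calculation, following the same arithmetic).

    #include <stdint.h>
    #include <stdio.h>

    int main(void) {
        const uint64_t sectors     = 1ULL << 16;   /* 2^16 addressable sector numbers */
        const uint64_t sector_size = 1ULL << 9;    /* 512 bytes per sector            */

        /* "Pure" FAT-16: one sector per addressable unit. */
        uint64_t max_bytes = sectors * sector_size;           /* 2^25 bytes */
        printf("pure FAT-16 limit: %llu bytes = %llu MB\n",
               (unsigned long long)max_bytes,
               (unsigned long long)(max_bytes >> 20));        /* 32 MB */

        /* Clusters of 2^K sectors raise the limit by a factor of 2^K. */
        for (int k = 1; k <= 3; k++) {
            uint64_t cluster = sector_size << k;              /* 1024, 2048, 4096 bytes */
            uint64_t limit   = sectors * cluster;
            printf("cluster of 2^%d sectors (%llu bytes): limit %llu MB\n",
                   k, (unsigned long long)cluster,
                   (unsigned long long)(limit >> 20));
        }
        return 0;
    }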