In this assignment you’ll build a “simple” simulation of a multi-level memory: a CPU with one register, three levels of cache, and a main memory. You’ll run your simulation on a small program of load instructions, moving values from main memory into the register, and determine the total number of cycles needed to execute the program.

You can complete this assignment in C++, C, or any other programming language you are familiar with (Java, Python, R, etc.).

Memory structure

The system you will be simulating is intentionally simplified. Time measurements are in CPU cycles rather than nanoseconds. It offers

Here’s a summary:

Line sizeSetsLines/setLatency
L1256 1 4 1
L21024 1 64 10
L34096 1 256 100
Memoryn/a 1000

CPU Instruction

Our simulated CPU only supports a single instruction: loadaddress which loads the value stored in the specified memory address into the CPU. If the memory address requested is available in the L1 cache, then it will be loaded from there, otherwise, a L1 cache miss occurs, and it will search the L2 cache, and so forth, until either the address is found in one of the caches, or we have to retrieve it from memory. Thus, the fastest memory access (L1 hit) takes only 1 CPU cycle, while the slowest (misses on all caches) takes 1 + 10 + 100 + 1000 = 1111 CPU cycles.

Cache lines and loading

When a cache miss occurs, a cache will attempt to pull in an entire line of addresses. For L1, this means 256 addresses. As mentioned in class, the requested address is “rounded off” to the nearest lower multiple of 256 (line size) and then the next 256 addresses starting from that address are loaded into the “oldest” line in the cache (i.e., we are using LRU replacement policy). Newly-loaded lines have their age set to 0.

Each cache line needs to store its starting address (always a multiple of the line size), and its “age”. Somewhat unrealistically, we store a line’s age as an integer (normally, a small number of bits would be dedicated to age). When a line is accessed, the age of all other lines is incremented. The oldest line is the one with the greatest age. Of course, when a line is replaced, its age is reset to 0. (Note that cache lines and memory do not need to store their actual contents!) When a cache is accessed, whether it is a hit or a miss, all other lines have their ages incremented. (If it is a hit, then the hit line has its age left the same; if a miss, the newly-loaded line is set to 0.)

Simulated program

Here is the program which you should run your simulation on; addresses are given in hexadecimal.

0x1000  
0x1024  
0x1599  
0x100   
0x10245 
0x10246 
0x1027  
0x10247 
0x1600  
0x1601  
0x1700  

If you have simulated the caches perfectly, you should get a total of 3,471 cycles for this sequence of addresses.

Submission

Submit your program by uploading it onto the server into the directory ~/cs241/assign1/.