Quickwords28: Skylake Microarchitecture (11)

Store and Load Example

Consider the following initial state.

At the start, several store and load instructions have been streamed into the Load-Store Queue (LSQ), and four address-value pairs are stored in the cache.

We start with the first instruction in the LSQ: a load from address 0x3290.

It first checks whether any earlier store instruction writes to the same address. As this is the first instruction in the queue, no store can satisfy this requirement.

It then looks the address up in the cache. In our situation there is a cache hit: the value 42 is found and streamed to the corresponding slot in the Value Column of the LSQ.
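The two-step lookup described above can be sketched in a few lines of Python. This is only a toy model, not how the hardware is built: the `Entry` class, the `execute_load` function, and the dictionary used as a cache are all hypothetical names introduced for illustration. It also assumes that every store value is already known and that every address hits in the cache, as in this walkthrough.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Entry:
    op: str                      # "load" or "store"
    addr: int
    value: Optional[int] = None  # the Value Column; None until filled

def execute_load(lsq, index, cache):
    """Fill the Value Column of the load at lsq[index] (hypothetical helper)."""
    load = lsq[index]
    # Step 1: search the older entries, nearest first, for a store to the same address.
    for older in reversed(lsq[:index]):
        if older.op == "store" and older.addr == load.addr:
            load.value = older.value
            return "forwarded"
    # Step 2: no matching store, so read the cache (assumed to hit).
    load.value = cache[load.addr]
    return "cache"

# The two cache lines named in the text; the LSQ holds only the first load.
cache = {0x3290: 42, 0x3418: 1234}
lsq = [Entry("load", 0x3290)]
source = execute_load(lsq, 0, cache)
print(source, lsq[0].value)  # cache 42
```

Because the first load has no older entries at all, the loop body never runs and the value comes straight from the cache.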

The next instruction is a store. Assume its value, 25, has already been calculated; it is written directly into the Value Column of the LSQ.

Note: the value is written to the cache only after the store instruction reaches the commit phase.

The next store instruction follows the same pattern. Assume its calculated value is -17.

The next load instruction likewise checks whether any earlier store instruction writes to the same memory address. No store instruction is associated with address 0x3418, so it goes to the cache and gets a hit. The value 1234 is therefore written to the Value Column.

The next load instruction does find an earlier store instruction that writes to the same address, 0x3290. It reads the value of that store instruction directly into its own Value Column.

This is a store-forwarding operation.

The next load instruction also searches the earlier store instructions first, and the search fails. The cache then supplies the value 1 to this load.

For the next store, the value 0 is calculated and placed in the Value Column.

For the next load, store forwarding occurs again: 25 is read and placed in the Value Column.

For the load after that, the search returns multiple matching stores, but the load accepts only the value from the nearest earlier store instruction. Therefore, 0 is placed in the Value Column.
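The "nearest store wins" rule is easy to get backwards, so here is a minimal sketch of it. The function name `nearest_older_store`, the tuple layout, and the address 0x4000 are made up for illustration; the pairing of the values -17 and 0 on one address is likewise an assumption, since the text does not list the store addresses.

```python
def nearest_older_store(lsq, load_index):
    """Return the value of the youngest store that is older than the load
    at lsq[load_index] and targets the same address, or None if none exists."""
    load_addr = lsq[load_index][1]
    # Walk backwards from the load so the first match is the nearest store.
    for op, addr, value in reversed(lsq[:load_index]):
        if op == "store" and addr == load_addr:
            return value
    return None

# (op, address, value) in program order; 0x4000 is a hypothetical address.
lsq = [
    ("store", 0x4000, -17),   # older store, shadowed by the next one
    ("store", 0x4000, 0),     # nearest store to the load
    ("load",  0x4000, None),
]
print(nearest_older_store(lsq, 2))  # 0
```

Scanning from the load towards the head of the queue means the search can stop at the first match, which is the program-order-correct value.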

The last load instruction reads its value from the cache.

The instructions are then committed in order.

A load instruction is simply dequeued from the LSQ, because its value was already loaded into the register during the execution phase.

A store instruction writes its value to the cache and is then dequeued.

The next store instruction is handled the same way: its value is written to the cache and it is dequeued.

The next three load instructions are dequeued.

The following store updates the cache and is dequeued.

The last three load instructions are dequeued.
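The commit phase can be sketched as draining the queue from its head. As before, this is a toy model under stated assumptions: `commit_all`, the `(op, address, value)` tuples, and the dictionary cache are hypothetical, and the store value 25 at address 0x3290 is chosen only for illustration.

```python
from collections import deque

def commit_all(lsq, cache):
    """Commit entries in program order (hypothetical helper).
    Loads are only dequeued; stores write the cache and are then dequeued."""
    while lsq:
        op, addr, value = lsq.popleft()
        if op == "store":
            cache[addr] = value  # the only point at which the cache changes

cache = {0x3290: 42}
lsq = deque([
    ("load",  0x3290, 42),   # value was read from the cache during execution
    ("store", 0x3290, 25),   # illustrative store value
    ("load",  0x3290, 25),   # value was forwarded from the store above
])
commit_all(lsq, cache)
print(cache[0x3290], len(lsq))  # 25 0
```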

Stores update the cache only at the commit phase for a reason. If the processor detects a misprediction in the pipeline and the instructions after the last store instruction have to be flushed, the state of the cache is not affected, and the instructions on the correct branch can start from that untouched state.
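A small sketch makes the benefit concrete. It assumes a list-based queue and a hypothetical `flush_after` helper; the speculative store and its value are invented for the example. Because the store has not committed, throwing it away requires no repair of the cache.

```python
def flush_after(lsq, last_good_index):
    """Discard every entry younger than lsq[last_good_index] (hypothetical helper)."""
    del lsq[last_good_index + 1:]

cache = {0x3290: 42}
lsq = [
    ("load",  0x3290, 42),   # on the correct path
    ("store", 0x3290, 25),   # assumed to be on the mispredicted path, not yet committed
]
flush_after(lsq, 0)
print(len(lsq), cache[0x3290])  # 1 42
```

Had the store written the cache during execution instead, the flush would also need to restore the old value 42, which is exactly the work that deferring the write avoids.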

© 2020 DecodeZ All Rights Reserved.