Quickwords27 Skylake Microarchitecture(10)

Posted on April 20, 2019

Load and Store instructions

In previous chapters we discussed how the ROB, the RS, and the RAT work. You may have noticed that the examples did not include load and store instructions. This was partly for simplicity and partly because loads and stores rely on specialized mechanisms, which we introduce in this article.

Although we treat load and store instructions as a special class, all instructions and the pipeline design share one purpose: increasing instruction-level parallelism by eliminating dependencies. Specifically:

Eliminate control dependencies by leveraging branch prediction
Eliminate false dependencies by leveraging register renaming

Note that register renaming targets registers, not memory.

Do dependencies also exist between memory operations? If so, what can we do about them? These are the questions we try to address.

Load and Store Are Different From Read And Write

Load and store are the terms for memory instructions, whereas read and write are the terms for the actions performed directly on memory. Most of the time these terms are interchangeable. Within our scope, however, we must distinguish them to avoid misunderstanding in the following discussion:

Stores are instructions, and they follow the same procedure described before. The memory write happens only after the store instruction is committed.

Loads are also instructions, but the memory read does not wait until the load instruction is committed. That is mainly because a load can use the result of an earlier store to the same address. Therefore, loads are performed in the execute stage.
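The difference between the two can be sketched with a toy model. This is only an illustration, not a description of the real Skylake hardware: the class `Core`, the list `store_buffer`, and the addresses used are hypothetical names and values chosen for this example.

```python
class Core:
    """Toy model: stores write memory only at commit; loads read at execute."""

    def __init__(self, memory):
        self.memory = dict(memory)  # committed memory state
        self.store_buffer = []      # executed but uncommitted stores, oldest first

    def execute_store(self, addr, value):
        # Executing a store only records it; memory is not touched yet.
        self.store_buffer.append((addr, value))

    def commit_oldest_store(self):
        # The actual memory write happens here, at commit.
        addr, value = self.store_buffer.pop(0)
        self.memory[addr] = value

    def execute_load(self, addr):
        # A load first looks for the youngest earlier store to the same address.
        for st_addr, st_value in reversed(self.store_buffer):
            if st_addr == addr:
                return st_value
        # No matching store is pending, so read committed memory.
        return self.memory.get(addr, 0)


core = Core({0x100: 1})
core.execute_store(0x100, 42)
before_commit_memory = core.memory[0x100]  # memory still holds the old value
forwarded = core.execute_load(0x100)       # the load sees the pending store
core.commit_oldest_store()
after_commit_memory = core.memory[0x100]   # now the write has happened
print(before_commit_memory, forwarded, after_commit_memory)  # prints: 1 42 42
```

The store changes memory only at commit, while the load obtains its value at execute time, either from a pending store or from memory.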

Registers and Memory

Registers and memory share the same types of dependencies. False dependencies can be eliminated during out-of-order execution.

However, there is one important difference: the address of a memory operation is known only at runtime, which makes it much harder to tell whether a dependency exists. For example:

Load r3 = 0[r6]
Add r7 = r3 + r9
Store r4->0[r7]
Sub r1 = r1 - r2
Load r8 = 0[r1]

Here, the third instruction stores the value in r4 to the memory location addressed by r7, and the last instruction loads the value at memory location [r1] into r8. Assume the load hits in the cache. If r7 does not equal r1, there is no problem. The problem arises when r7 equals r1: because the store (the third instruction) has not yet been committed, the value in the cache, which the last instruction reads, is not the latest and therefore not correct. In other words, this is a true RAW (read-after-write) dependency. Our trusted friend, the compiler, cannot help in this circumstance either.
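The hazard can be reproduced with a small simulation of the five instructions above. This is a minimal sketch under assumed values: the function `run`, the flag `load_runs_before_store_commits`, and all register and memory contents are made up for illustration, and the model deliberately has no mechanism for a load to see an uncommitted store.

```python
def run(regs, memory, load_runs_before_store_commits):
    """Run the five-instruction example on toy register and memory dicts."""
    regs = dict(regs)
    memory = dict(memory)

    regs["r3"] = memory.get(regs["r6"], 0)    # Load  r3 = 0[r6]
    regs["r7"] = regs["r3"] + regs["r9"]      # Add   r7 = r3 + r9
    pending_store = (regs["r7"], regs["r4"])  # Store r4->0[r7], not yet committed
    regs["r1"] = regs["r1"] - regs["r2"]      # Sub   r1 = r1 - r2

    if not load_runs_before_store_commits:
        memory[pending_store[0]] = pending_store[1]  # store commits first
    regs["r8"] = memory.get(regs["r1"], 0)    # Load  r8 = 0[r1]
    if load_runs_before_store_commits:
        memory[pending_store[0]] = pending_store[1]  # store commits afterwards
    return regs["r8"]


memory = {0x10: 0x20, 0x28: 7, 0x30: 5}

# Aliasing case: r7 = 0x20 + 8 = 0x28 and r1 = 0x30 - 8 = 0x28.
alias_regs = {"r6": 0x10, "r9": 8, "r4": 99, "r1": 0x30, "r2": 8}
correct = run(alias_regs, memory, load_runs_before_store_commits=False)
stale = run(alias_regs, memory, load_runs_before_store_commits=True)

# Non-aliasing case: r7 = 0x28 but r1 stays 0x30.
other_regs = {"r6": 0x10, "r9": 8, "r4": 99, "r1": 0x30, "r2": 0}
no_alias_in_order = run(other_regs, memory, load_runs_before_store_commits=False)
no_alias_early = run(other_regs, memory, load_runs_before_store_commits=True)

print(correct, stale)                     # prints: 99 7
print(no_alias_in_order, no_alias_early)  # prints: 5 5
```

When the two addresses differ, the order does not matter. When they are equal, the early load returns the old value 7 instead of 99, and nothing in the instruction text reveals which case applies until r7 and r1 have been computed.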

This is memory aliasing: when two pointers refer to the same memory location, a true dependency arises. Although you can give the compiler a hint that rules out memory aliasing, it is then up to the unreliable programmer to take care of their spaghetti logic.

As before, we will work through an example to make the explanation easier to understand… in the next chapter.