
Completing memory instructions in the memory hierarchy

Source files: src/Processor/memprocess.cc, src/Processor/memunit.cc, src/Processor/funcs.cc

Header files: incl/Processor/memory.h, incl/Processor/memprocess.h

Memory references complete in two parts. First, the GlobalPerform function is called at the level of the memory hierarchy that responds to the reference. GlobalPerform calls the function associated with the instruction (as specified in src/Processor/funcs.cc) to read a value from, or write a value into, the UNIX address space of the simulator environment. For virtual store-buffer forwards, the load takes the value forwarded from the buffer rather than the value in the address space. For accesses that are not simulated, this step takes place as part of the CompleteMemOp function (described below).
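The choice between a forwarded value and the value in the address space can be illustrated with a minimal sketch. Every name below (Forward, emulateLoad, emulateStore, globalPerformLoad) is hypothetical and chosen only for illustration; the sketch does not reproduce the actual signatures in src/Processor/funcs.cc.

```cpp
#include <cstdint>

// Hypothetical record of a value forwarded from the virtual store buffer.
struct Forward {
    bool valid;           // true if the buffer supplied a value
    std::uint32_t value;  // the forwarded value (meaningful only if valid)
};

// Hypothetical emulation functions standing in for the per-instruction
// functions of funcs.cc: they touch the simulator's own address space.
std::uint32_t emulateLoad(const std::uint32_t* addr) { return *addr; }
void emulateStore(std::uint32_t* addr, std::uint32_t value) { *addr = value; }

// A load takes the forwarded value when one exists; otherwise it reads
// the simulator's address space.
std::uint32_t globalPerformLoad(const std::uint32_t* addr, const Forward& fwd) {
    return fwd.valid ? fwd.value : emulateLoad(addr);
}
```

In this sketch a forwarded value always wins over memory, which matches the behavior described above for virtual store-buffer forwards.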

Then, when a reference is ready to return from the caches, the MemDoneHeapInsert function is called to mark the instruction for completion. In the case of non-simulated accesses, the access is put into the MemDoneHeap by the memory_latency function invoked at the time of issue.
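As a rough illustration of how a completion heap of this kind can work, the sketch below keeps instructions ordered by completion cycle. The types MemInst and DoneHeap, the field names, and the tie-breaking rule are assumptions made for the example, not RSIM's actual data structures.

```cpp
#include <cstdint>
#include <queue>
#include <vector>

// Hypothetical stand-in for a memory instruction instance.
struct MemInst {
    std::uint64_t tag;        // program-order identifier
    std::uint64_t doneCycle;  // cycle at which the reference completes
};

// Comparator that puts the earliest completion cycle on top of the heap;
// ties are broken in program order (an assumption of this sketch).
struct LaterFirst {
    bool operator()(const MemInst& a, const MemInst& b) const {
        if (a.doneCycle != b.doneCycle) return a.doneCycle > b.doneCycle;
        return a.tag > b.tag;
    }
};

class DoneHeap {
public:
    // Analogue of MemDoneHeapInsert: mark an instruction for completion.
    void insert(const MemInst& inst) { heap_.push(inst); }

    // Remove and return every instruction due to complete by cycle `now`.
    std::vector<MemInst> drain(std::uint64_t now) {
        std::vector<MemInst> ready;
        while (!heap_.empty() && heap_.top().doneCycle <= now) {
            ready.push_back(heap_.top());
            heap_.pop();
        }
        return ready;
    }

    bool empty() const { return heap_.empty(); }

private:
    std::priority_queue<MemInst, std::vector<MemInst>, LaterFirst> heap_;
};
```

A simulated access would call insert when it returns from the caches, while a non-simulated access would call it at issue time with a completion cycle computed from a fixed latency.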

The function CompleteMemQueue processes instructions from the MemDoneHeap of the processor, calling CompleteMemOp for each instruction that completes in a given cycle. For accesses that were not simulated at the caches, CompleteMemOp calls the corresponding instruction emulation function. For loads, CompleteMemOp first checks whether a soft exception was marked on the load, for either address disambiguation or consistency constraints, while the load was outstanding. If so, the load must be forced to re-issue, but it does not actually need to take an exception. Otherwise, CompleteMemOp checks whether the limbo field for the load must be set (that is, whether any previous stores still have not generated their addresses), or whether the load must be redone (if a previous store disambiguated to an address that overlaps with the load). If the load does not need to be redone, and either its limbo field is not set or the processor allows values to be passed down from limbo loads (as discussed above), PerformMemOp is called to note that the value produced by this instruction is ready for use. PerformMemOp is called for every store that reaches CompleteMemOp.
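The checks applied to a completing load can be summarized as a small decision function. This is a sketch under stated assumptions: the type and field names are hypothetical, and the precedence of the redo check over the limbo check is a choice made here for illustration, since the text does not state which is tested first.

```cpp
// Possible outcomes for a load that reaches completion (hypothetical names).
enum class LoadAction { Reissue, Redo, Wait, Perform };

// Hypothetical summary of the load's state at completion time.
struct LoadState {
    bool softException;          // marked while the load was outstanding
    bool olderStoreAddrUnknown;  // a previous store has no address yet
    bool olderStoreOverlaps;     // a previous store overlaps this load
};

struct LoadOutcome {
    bool limbo;         // whether the limbo field ends up set
    LoadAction action;  // what happens to the load next
};

LoadOutcome completeLoad(const LoadState& ld, bool passLimboValues) {
    // A soft exception on an outstanding load only forces a re-issue.
    if (ld.softException) return {false, LoadAction::Reissue};

    // Limbo is set while any previous store address is still unknown.
    const bool limbo = ld.olderStoreAddrUnknown;

    // An overlapping previous store means the load must be redone.
    if (ld.olderStoreOverlaps) return {limbo, LoadAction::Redo};

    // A limbo load waits unless the processor passes limbo values down.
    if (limbo && !passLimboValues) return {true, LoadAction::Wait};

    // Otherwise the value is ready: the PerformMemOp step would run here.
    return {limbo, LoadAction::Perform};
}
```

Note that a limbo load with value passing enabled is both performed and still marked limbo, so it remains subject to later checks.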

PerformMemOp has two roles: removing instructions from the memory unit and passing values down from limbo loads. Under RC, PerformMemOp always removes the operation from either the memory unit or the virtual store buffer (as appropriate), except for loads that are either marked with a limbo field or past a MEMBAR that blocks loads. Under SC, memory operations must leave the memory unit strictly in order. The constraints for PC are identical to those for SC, except that loads may leave the memory unit past outstanding stores. Under no memory model may a limbo load leave the memory unit before all previous stores have disambiguated. If the memory unit policy allows values to be passed down from limbo loads, PerformMemOp fulfills some of the duties otherwise associated with the update_cycle function (filling in physical register values and clearing the busy bit and distributed stall queues for the destination register). Note that PerformMemOp will be called again for the same instruction when the limbo flag is cleared or, in the case of RC, when prior memory fences have been cleared.
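The removal rules for the three memory models can be written as a single predicate. The sketch below is an interpretation of the rules stated above, not the simulator's code: Model, OpView, and mayLeaveMemUnit are hypothetical names, and the flags are assumed to be computed elsewhere by scanning the memory unit.

```cpp
enum class Model { RC, PC, SC };

// Hypothetical view of one memory operation and of the operations ahead
// of it in program order.
struct OpView {
    bool isLoad;
    bool limbo;                     // load has its limbo field set
    bool behindLoadBlockingMembar;  // load is past a MEMBAR that blocks loads
    bool olderOpsOutstanding;       // some older memory operation remains
    bool olderLoadsOutstanding;     // some older load remains
};

bool mayLeaveMemUnit(Model m, const OpView& op) {
    // Under every model, a limbo load stays in the memory unit.
    if (op.isLoad && op.limbo) return false;

    switch (m) {
    case Model::RC:
        // Everything leaves except loads held back by a MEMBAR.
        return !(op.isLoad && op.behindLoadBlockingMembar);
    case Model::SC:
        // Strictly in order: nothing older may remain.
        return !op.olderOpsOutstanding;
    case Model::PC:
        // As SC, except that loads may pass outstanding stores.
        if (op.isLoad) return !op.olderLoadsOutstanding;
        return !op.olderOpsOutstanding;
    }
    return false;  // unreachable for valid Model values
}
```

For example, a non-limbo load with only an older store outstanding may leave under RC and PC but not under SC.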

If the system supports speculative load execution to improve the performance of its consistency model (with the "-K" option), the constraints enforced by PerformMemOp are sufficient to guarantee that no speculative load leaves the memory unit. Each coherence message received at the L1 cache because of an external invalidation or a replacement from the lowest level of local cache (L2 in our case) must be sent to the memory unit through the SpecLoadBufCohe function. If such a message invalidates or updates a cache line accessed by any outstanding or completed speculative load access, that access is marked with a soft exception. If the access is still outstanding, the soft exception will be ignored and the load will be forced to reissue; if the access has completed, the exception must be taken in order to guarantee that neither the load nor any later operation commits incorrect values into the architectural state of the processor [5, 28].
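The handling of coherence messages against speculative loads can be sketched as follows. The names SpecLoad, markOnCoherence, and reactionFor are hypothetical, and the 64-byte line size is an assumption for the example only; the real SpecLoadBufCohe function may be organized quite differently.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Assumed cache line size, used only for this illustration.
constexpr std::uint64_t kLineBytes = 64;

// Hypothetical record of one speculative load in the memory unit.
struct SpecLoad {
    std::uint64_t addr;  // address accessed by the load
    bool completed;      // true once the load has returned a value
    bool softException;  // set when a coherence message hits its line
};

enum class Reaction { None, Reissue, TakeException };

// Mark every speculative load whose cache line matches the line named by
// the coherence message; returns how many loads were marked.
std::size_t markOnCoherence(std::vector<SpecLoad>& loads, std::uint64_t msgAddr) {
    const std::uint64_t line = msgAddr / kLineBytes;
    std::size_t marked = 0;
    for (SpecLoad& ld : loads) {
        if (ld.addr / kLineBytes == line) {
            ld.softException = true;
            ++marked;
        }
    }
    return marked;
}

// An outstanding marked load is simply reissued; a completed marked load
// must take the exception so that no incorrect value is committed.
Reaction reactionFor(const SpecLoad& ld) {
    if (!ld.softException) return Reaction::None;
    return ld.completed ? Reaction::TakeException : Reaction::Reissue;
}
```

The split between marking and reacting mirrors the text: the mark is applied when the message arrives, while the consequence depends on whether the load has already completed.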



Vijay Sadananda Pai
Thu Aug 7 14:18:56 CDT 1997