...functions
Currently, the only tested platform that does not use ELF is the Convex Exemplar with HP-UX.
...scheduling
An option for in-order scheduling is provided as a straightforward modification to the out-of-order scheduling pipeline, but is not well tested. Details of the implementation of this feature are provided in Section 10.2.
...size
Code that models a smaller number of renaming registers, and the register stalls that would result, is included but has not been tested and is not exposed to the user. Chapter 10 gives a more detailed explanation.
...3.2.3
A command-line option is also provided to specify that non-memory instructions should also be dispatched to an issue window; by default, these instructions are issued directly from the active list. More details about this option are provided in Section 4.1.1.
...renaming
Single-precision floating-point operations experience WAW (output) dependences because floating-point registers are mapped and renamed on double-precision boundaries. This is further discussed in Chapter 10.
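As a rough illustration of the mapping, the following is a minimal sketch that assumes single-precision registers are numbered 0, 1, 2, ... and that each even/odd pair shares one renaming unit. The names rename_unit and shares_rename_unit are hypothetical and do not come from the RSIM source.

```c
#include <stdbool.h>

/* Hypothetical sketch: with renaming done on double-precision boundaries,
 * single-precision registers 2k and 2k+1 fall in the same renaming unit. */
static int rename_unit(int single_reg)
{
    return single_reg / 2;
}

/* Two single-precision destinations that share a unit cannot be renamed
 * independently of each other, so a write to one is ordered after the
 * earlier write to the other (an output dependence). */
static bool shares_rename_unit(int reg_a, int reg_b)
{
    return rename_unit(reg_a) == rename_unit(reg_b);
}
```

Under these assumptions, writes to registers 0 and 1 share a unit, while writes to registers 1 and 2 do not.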
...memory
RSIM also does not include the ability to mark certain regions of memory uncached, a feature commonly associated with virtual memory.
...below
We do not expect applications to use this type of MEMBAR. It is currently used in RSIM only in the RSIM system trap library.
...performed
A store is globally performed when its value is visible to all processors; i.e. all other caches with a copy of the line have been invalidated. In RSIM, this is indicated when an acknowledgment for the store is received by the processor. A load is globally performed when its return value is bound and when the store whose value it returns is globally performed. In RSIM, this is detected when the load returns its value to the processor.
...exceptions
We use the terms exception and trap interchangeably.
...boundary
The SPARC architecture also allows word-alignment for quadruple-precision floating-point loads and stores, but RSIM does not support such instructions.
...subtraction)
RSIM does not yet raise any exception on some unsupported instructions, such as 64-bit integer operations or quadruple-precision floating-point accesses. It is the user's responsibility to ensure that such instructions are not used. The compiler options we provide in Section 2.4 inform the compiler not to generate these instructions.
...network
The potential for supporting other network models is discussed in Section 15.3.
...filename
SpecialInitOutput is simply an fopen followed by an fwrite.
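The pattern described here can be sketched as follows. This is not the RSIM function: the helper name write_buffer_to_file, its parameters, and its return convention are assumptions made for illustration, and the fclose call is an addition of the sketch.

```c
#include <stdio.h>

/* Hypothetical helper, not the RSIM function: opens `filename` for writing
 * and writes `len` bytes from `buf`. Returns 0 on success, -1 on failure. */
static int write_buffer_to_file(const char *filename, const void *buf, size_t len)
{
    FILE *fp = fopen(filename, "wb");
    if (fp == NULL)
        return -1;

    size_t written = fwrite(buf, 1, len, fp);

    /* Closing the stream is an addition of this sketch; the text above
     * mentions only the fopen and the fwrite. */
    if (fclose(fp) != 0 || written != len)
        return -1;
    return 0;
}
```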
...hierarchy
Note that in this chapter, the terms issue and complete usually refer to issuing to the memory hierarchy and completion at the memory hierarchy. These are different from the issue and completion stages of the processor pipeline.
...length
For some operations, the minimum alignment requirement specified in the ISA is smaller than the actual length of data transferred. However, we simulate a processor that traps and emulates any access that is not aligned on a boundary equal to its length, as this seems more appropriate for a high-performance implementation. That is, allowing a single instruction to incur multiple cache line accesses and multiple page faults seems to be an undesirably difficult problem.
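The alignment rule can be sketched as a simple check, assuming access lengths are powers of two. The function name is_naturally_aligned is hypothetical and is not taken from the RSIM source.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical sketch: returns true if `addr` lies on a boundary equal to
 * the access length. `len` must be a power of two (1, 2, 4, 8, ...). */
static bool is_naturally_aligned(uint64_t addr, uint64_t len)
{
    return (addr & (len - 1)) == 0;
}
```

Under this check, an 8-byte access at address 0x1004 is word-aligned but not aligned to its own length, so it would fall into the trap-and-emulate case described above.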
...barriers
Because of the single-precision floating-point problems discussed in Chapter 10, single-precision loads do not issue until their output dependences are resolved. With static scheduling, subsequent loads will also be prevented from issuing.
...message
In the code, the data structure allocated for such a message is called the REQ data structure. This data structure is used for both request and reply messages, and for both data and coherence transactions. We do not use the term REQ in this manual, to prevent any possible confusion with the REQUEST message type.
...READ_SH
If no other processors sharing
...UPGRADE
In certain race conditions, UPGRADE is converted to READ_OWN, as described in Section 14.
...request
In the L2 cache, this will return NOMSHR_STALL_COHE if there is currently a COHE request pending on the line. This indicates that no MSHR has been consumed, but this REQUEST must wait for the pending COHE first.
...stall
WAR stalls are not usually seen with straightforward implementations of consistency models, as stores following loads to the same line generally depend on the values of those loads. Consequently, such stores cannot issue in straightforward implementations until the loads complete.
...used
Recall that the simulator code refers to the MESI states as PR_DY, PR_CL, SH_CL, and INVALID, respectively.
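The correspondence can be written out as a small sketch. Only the four state names come from the text above; the enum name line_state, the numeric values, and the helper mesi_name are assumptions of this sketch and may not match the simulator's actual declarations.

```c
/* The four state names follow the text above; the enum name and the
 * numeric values are arbitrary choices made for this sketch. */
enum line_state { PR_DY, PR_CL, SH_CL, INVALID };

/* Hypothetical helper: maps a simulator state name to its MESI name. */
static const char *mesi_name(enum line_state s)
{
    switch (s) {
    case PR_DY:   return "Modified";   /* private dirty */
    case PR_CL:   return "Exclusive";  /* private clean */
    case SH_CL:   return "Shared";     /* shared clean */
    case INVALID: return "Invalid";
    }
    return "Unknown";
}
```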
...studies [21]
Note that our architecture does permit "forwarding" of values in the processor memory unit.

Vijay Sadananda Pai
Thu Aug 7 14:18:56 CDT 1997