Next: Performance tuning Up: Porting Applications to RSIM Previous: Synchronization support for multiprocessor

Statistics collection

RSIM automatically generates statistics for many important characteristics of the simulated system. It also provides functions and macros that can be used to subdivide these statistics according to the phases of an application.

The user can add the newphase and endphase functions to indicate the start and end of an application phase. The newphase function takes a single integer argument that represents the new phase number (the simulation starts in phase 0). This function also clears out all current processor simulation statistics. The endphase function takes no arguments. This function prints out both a concise summary and a detailed set of processor simulation statistics, described in Chapter 6.
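As an illustration of how the two calls bracket application phases, the following is a minimal sketch rather than code from RSIM itself. The stub bodies of newphase and endphase are stand-ins added only so the example compiles outside the simulator, and run_application, current_phase, and reports_printed are hypothetical names. In a real port the stubs would be dropped in favor of the definitions supplied by RSIM.

```c
#include <stdio.h>

/* Stand-ins (assumptions) so this sketch compiles outside RSIM.
   A real port would use the versions supplied by RSIM instead. */
static int current_phase = 0;      /* the simulation starts in phase 0 */
static int reports_printed = 0;

static void newphase(int phase)    /* stub: the real call also clears processor statistics */
{
    current_phase = phase;
}

static void endphase(void)         /* stub: the real call prints processor statistics */
{
    printf("end of phase %d\n", current_phase);
    reports_printed++;
}

/* Hypothetical application with an initialization phase and a compute phase. */
long run_application(int n)
{
    long sum = 0;
    int i;

    /* Phase 0 is implicit: initialization work would go here. */
    endphase();                    /* report statistics for phase 0 */

    newphase(1);                   /* enter phase 1; processor statistics start over */
    for (i = 1; i <= n; i++)
        sum += i;
    endphase();                    /* report statistics for the compute loop */

    return sum;
}
```

Calling run_application from the application's entry point yields one statistics report per phase. Because newphase clears the current processor statistics, any phase whose numbers matter should be closed with endphase before the next newphase call.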

There are additional macros that can be used within a processor phase to aggregate a set of instructions into a single statistics class. These macros are START_USR<num> and END_USR<num>, where num is an integer between 1 and 9. When RSIM prints the processor statistics for a phase, all the time spent between a set of these aggregation macros is lumped together, rather than being associated with the individual instructions included therein. In addition to the USR<num> aggregation classes, there are also aggregation classes for various synchronization operations called ACQ, SPIN, REL, and BAR. These aggregation classes can be used with START_ACQ, END_ACQ, and so forth.

Note that aggregate classes cannot be nested. Additionally, an aggregate class is not counted as graduating until all instructions in the class have graduated. Consequently, the partial statistics printed according to the "-A" option in Section 4.1.7 (i.e., the number of cycles since each processor graduated an instruction) do not count instructions graduated within an aggregate class.
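The aggregation macros and the no-nesting rule can be sketched as follows. This is an assumption-laden illustration: the macro expansions shown here are stand-ins that only track which region is open, the invocation form (a bare macro name followed by a semicolon) is assumed and should be checked against the RSIM headers, and scale_then_sum, open_region, and regions_closed are hypothetical names.

```c
#include <assert.h>

/* Stand-ins (assumptions) so the sketch compiles outside RSIM.
   The real macros are provided by RSIM; their expansion is not shown here. */
static int open_region = 0;        /* 0 means no aggregate class is open */
static int regions_closed = 0;

static void stub_begin(int id)
{
    assert(open_region == 0);      /* aggregate classes cannot be nested */
    open_region = id;
}

static void stub_end(int id)
{
    assert(open_region == id);     /* each END must match the open START */
    open_region = 0;
    regions_closed++;
}

#define START_USR1 stub_begin(1)
#define END_USR1   stub_end(1)
#define START_USR2 stub_begin(2)
#define END_USR2   stub_end(2)

/* Hypothetical kernel with two back-to-back aggregate regions. */
long scale_then_sum(long *a, int n, long k)
{
    long sum = 0;
    int i;

    START_USR1;                    /* time spent in this loop is reported as USR1 */
    for (i = 0; i < n; i++)
        a[i] *= k;
    END_USR1;                      /* close USR1 before opening USR2 */

    START_USR2;                    /* time spent in this loop is reported as USR2 */
    for (i = 0; i < n; i++)
        sum += a[i];
    END_USR2;

    return sum;
}
```

The two regions are placed one after the other rather than one inside the other, since a START macro issued while another class is still open would violate the nesting restriction.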

The functions StatReportAll and StatClearAll handle the statistics associated with the caches, memory system, and network. Each one applies to the entire system, and thus should usually be called only by processor 0 just after a barrier. StatReportAll prints out a detailed set of statistics associated with the caches, memory system, and network of the system, while StatClearAll clears all the statistics gathered.
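The calling pattern described above (processor 0 only, just after a barrier) might look like the sketch below. Several pieces are assumptions: the text does not give the argument lists of StatReportAll and StatClearAll, so they are shown as taking no arguments; app_barrier, proc_id, sample_work, and timed_section are hypothetical names; and all function bodies other than timed_section are counting stubs so the example compiles outside RSIM.

```c
/* Stand-ins (assumptions) so the sketch compiles outside RSIM. */
static int reports = 0, clears = 0, barriers = 0, work_done = 0;

static void StatReportAll(void) { reports++; }   /* assumed to take no arguments */
static void StatClearAll(void)  { clears++; }    /* assumed to take no arguments */
static void app_barrier(void)   { barriers++; }  /* hypothetical barrier routine */

static void sample_work(void)   { work_done++; } /* hypothetical parallel work */

/* Run by every processor; proc_id is a hypothetical processor index. */
void timed_section(int proc_id, void (*work)(void))
{
    app_barrier();                 /* all processors reach the start together */
    if (proc_id == 0)
        StatClearAll();            /* system-wide, so only processor 0 calls it */

    work();

    app_barrier();                 /* all processors have finished the section */
    if (proc_id == 0)
        StatReportAll();           /* system-wide report, again from processor 0 only */
}
```

Every processor would call timed_section with its own index, for example timed_section(my_id, sample_work), so that the barriers line up while the statistics calls are issued once for the whole system.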



Vijay Sadananda Pai
Thu Aug 7 14:18:56 CDT 1997