Single and Multi-CPU Performance Modeling for Embedded Systems
Trevor Conrad Meyerowitz, Alberto L. Sangiovanni-Vincentelli, Mirko Sauermann and Dominik Langen
Software executing on one or more microprocessors has become the dominant factor in the majority of embedded systems. Most system-level design environments support only a handful of ISAs (Instruction Set Architectures), and typically even fewer microarchitectural instances. Furthermore, they have little or no support for adding new instruction sets or microarchitectures. Microarchitectural simulators such as SimpleScalar provide stand-alone environments for evaluating the impact of different microarchitectures implementing one or more ISAs, but are generally difficult to retarget or to use at the system level. Architecture description languages, such as LISA, often couple the microarchitecture with the description of the functionality, force the user to completely specify the functionality of the processor, and are generally not integrated with system-level design environments. All of these approaches simulate the processors at either the instruction level or the cycle level, and neither level scales well.
This work focuses on modeling the performance of software executing on embedded processors in the context of a heterogeneous multi-processor system-on-a-chip, in a more scalable manner than current approaches. It contains three major parts. The first part describes different levels of abstraction for modeling such systems and how their speed and accuracy tradeoffs relate. The second part presents our approach to modeling the microarchitectural performance of a single processor in an intuitive and retargetable manner, using a high-level description in Kahn Process Networks. The final part explores multi-processor modeling at different levels of abstraction using performance backwards annotation.
The first portion of this work defines different levels of abstraction for modeling the performance of microprocessors and of hardware in general. The levels from cycle-accurate models and below (e.g., signal-level models) are clearly defined, but the levels above cycle-accurate are not. Transaction Level Modeling (TLM) represents above-RTL modeling in which communication is implemented as function calls, but the definitions beyond this are still forming and remain unclear. We refine this definition and also present a framework for comparing models of software performance.
The microarchitectural models use the Kahn Process Network formalism, with the FIFOs kept at a constant length to represent delays. These models execute on traces generated by a functional ISS (Instruction Set Simulator) and model only instruction timings and operand dependencies, which makes them easier to retarget and to use for microarchitectural exploration than traditional techniques. These techniques are tested on models of the StrongARM and XScale microarchitectures and their accompanying memory models.
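The idea of a constant-length FIFO acting as a delay can be illustrated with a minimal sketch. It is not the authors' model: the names `DelayFifo` and `simulate`, the trace format, and the single fixed execution delay are all assumptions made for illustration, and the two processes (issue and execute) are run in a fixed sequential order rather than as concurrent Kahn processes.

```python
from collections import deque


class DelayFifo:
    """Hypothetical FIFO kept at constant length: one push and one pop per
    cycle, so a token comes out `delay` cycles after it went in."""

    def __init__(self, delay):
        self.q = deque([None] * delay)  # pre-filled with bubbles

    def step(self, token):
        self.q.append(token)
        return self.q.popleft()


def simulate(trace, exec_delay):
    """Count cycles needed to retire a trace of (dest, sources) tuples.

    Only timing and operand dependencies are modeled; no values are
    computed, since the trace is assumed to come from a functional ISS.
    """
    pipe = DelayFifo(exec_delay)
    pending = set()  # registers whose results are still in flight
    pc = retired = cycles = 0
    while retired < len(trace):
        token = None
        if pc < len(trace):
            dest, srcs = trace[pc]
            # Issue process: stall (emit a bubble) on a pending operand
            # or a pending destination.
            if dest not in pending and not any(s in pending for s in srcs):
                token = trace[pc]
                pc += 1
        # Execute process: the token leaving the FIFO has completed.
        done = pipe.step(token)
        if done is not None:
            pending.discard(done[0])
            retired += 1
        if token is not None:
            pending.add(token[0])
        cycles += 1
    return cycles


independent = [("r1", ()), ("r2", ()), ("r3", ())]
dependent = [("r1", ()), ("r2", ("r1",)), ("r3", ())]
print(simulate(independent, 2), simulate(dependent, 2))
```

In this sketch, changing `exec_delay` is the only step needed to explore a deeper or shallower pipeline, which hints at why a model that tracks only timings and dependencies is easy to retarget.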
The multi-processor portion of the work examines performance backwards annotation, which writes the execution times of code running on the target model back to the functional model at a user-specified level of granularity. Such models are more accurate than coarse-grained instruction-level models, yet much faster than co-simulation with low-level models. These techniques are demonstrated on a heterogeneous multiprocessor from Infineon intended for software-defined radio. The annotated models are compared to a cycle-level virtual prototype in terms of speed and accuracy.
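As a rough illustration of backwards annotation, the sketch below averages per-block cycle counts and replays them inside a functional model. Everything in it is assumed for the example: the block names, the cycle numbers, the helpers `average_block_cycles` and `TimingAnnotation`, and the choice of basic blocks as the granularity. In the actual flow the measurements would come from the target model rather than from a hand-written list.

```python
from collections import defaultdict


def average_block_cycles(samples):
    """Average the (block_id, cycles) samples measured on a target model."""
    totals = defaultdict(int)
    counts = defaultdict(int)
    for block, cycles in samples:
        totals[block] += cycles
        counts[block] += 1
    return {block: totals[block] / counts[block] for block in totals}


class TimingAnnotation:
    """Hypothetical annotation object: accumulates the time that the
    functional model would have spent on the target."""

    def __init__(self, block_cycles):
        self.block_cycles = block_cycles
        self.elapsed = 0

    def consume(self, block):
        self.elapsed += self.block_cycles[block]


def accumulate(data, timing):
    """Functional model with one annotation call per basic block."""
    timing.consume("init")
    total = 0
    for x in data:
        timing.consume("loop_body")
        total += x
    timing.consume("exit")
    return total


# Made-up measurements standing in for a run on the target model.
measured = [("init", 4), ("loop_body", 6), ("loop_body", 8), ("exit", 2)]
timing = TimingAnnotation(average_block_cycles(measured))
result = accumulate([1, 2, 3], timing)
print(result, timing.elapsed)
```

Annotating at a coarser granularity (for example, one call per function) would reduce overhead at the cost of accuracy, which is the speed and accuracy tradeoff the comparison against the virtual prototype is meant to quantify.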
Figure 1: Two-process model of a single CPU microarchitecture using Kahn Process Networks
- T. Meyerowitz and A. Sangiovanni-Vincentelli, "High Level CPU Micro-Architecture Models Using Kahn Process Networks," SRC Techcon Extended Abstract, Portland, OR, October 2005.
- T. Meyerowitz, M. Sauermann, D. Langen, and A. Sangiovanni-Vincentelli, "Source Level Timing Annotation and Simulation for a Heterogeneous Multiprocessor," Design Automation and Test Europe Conference, 2008 (submitted).