Experimental Evaluation of On-Chip Microprocessor Cache Memories

Mark D. Hill and Alan Jay Smith

EECS Department
University of California, Berkeley
Technical Report No. UCB/CSD-84-175
April 1984

http://www2.eecs.berkeley.edu/Pubs/TechRpts/1984/CSD-84-175.pdf

Advances in integrated circuit density are permitting the implementation on a single chip of functions and performance enhancements beyond those of a basic processor. One performance enhancement of proven value is a cache memory; placing a cache on the processor chip can reduce both mean memory access time and bus traffic. In this paper we use trace-driven simulation to study design tradeoffs for small (on-chip) caches. Miss ratio and traffic ratio (bus traffic) are the metrics for cache performance. Particular attention is paid to sub-block caches (also known as sector caches), in which address tags are associated with blocks, each of which contains multiple sub-blocks; sub-blocks are the transfer unit. Using traces from two 16-bit architectures (Z8000, PDP-11) and two 32-bit architectures (VAX-11, System/370), we find that general-purpose caches of 64 bytes (net size) are marginally useful in some cases, while 1024-byte caches perform fairly well; typical miss and traffic ratios for a 1024-byte (net size) cache, 4-way set associative with 8-byte blocks are: PDP-11: .039, .156, .060, VAX 11: .080, .160, Sys/370: .244, .489. (These figures are based on traces of user programs and the performance obtained in practice is likely to be less good.) The use of sub-blocks allows tradeoffs between miss ratio and traffic ratio for a given cache size. Load forward is quite useful. Extensive simulation results are presented.
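As an illustration of the two metrics named in the abstract, the following is a minimal sketch of a trace-driven simulation of a set-associative sub-block cache. It is not the authors' simulator. The function name `simulate`, its parameters, LRU replacement, a read-only trace, and a fixed word size per reference are all assumptions made for this example, and load forward is not modeled. Traffic ratio is taken here to be bytes fetched by the cache divided by bytes the processor would fetch with no cache.

```python
from collections import OrderedDict


def simulate(trace, cache_bytes=1024, block_bytes=8, sub_block_bytes=8,
             ways=4, word_bytes=2):
    """Return (miss_ratio, traffic_ratio) for a list of byte addresses.

    Assumptions (not from the report): every reference reads one word of
    `word_bytes` bytes, replacement is LRU, and writes are ignored.
    """
    num_sets = cache_bytes // (block_bytes * ways)
    subs_per_block = block_bytes // sub_block_bytes
    # One OrderedDict per set: tag -> per-sub-block valid bits.
    # Insertion order doubles as LRU order (oldest entry first).
    sets = [OrderedDict() for _ in range(num_sets)]
    refs = 0
    misses = 0

    for addr in trace:
        refs += 1
        block_no = addr // block_bytes
        set_idx = block_no % num_sets
        tag = block_no // num_sets
        sub = (addr % block_bytes) // sub_block_bytes

        cache_set = sets[set_idx]
        valid = cache_set.get(tag)
        if valid is None:
            # Tag miss: evict the LRU block if the set is full, then
            # allocate a new block with all sub-blocks invalid.
            if len(cache_set) >= ways:
                cache_set.popitem(last=False)
            valid = [False] * subs_per_block
            cache_set[tag] = valid
        else:
            # Tag hit: mark the block as most recently used.
            cache_set.move_to_end(tag)

        if not valid[sub]:
            # Sub-block miss: only this sub-block is transferred.
            misses += 1
            valid[sub] = True

    if refs == 0:
        return 0.0, 0.0
    miss_ratio = misses / refs
    traffic_ratio = (misses * sub_block_bytes) / (refs * word_bytes)
    return miss_ratio, traffic_ratio


if __name__ == "__main__":
    toy_trace = [0, 2, 4, 6, 0, 8]  # hypothetical addresses, not real trace data
    print(simulate(toy_trace))                     # sub-block = whole block
    print(simulate(toy_trace, sub_block_bytes=2))  # four sub-blocks per block
```

Under these assumptions the traffic ratio equals the miss ratio multiplied by (sub-block size / word size), so shrinking the sub-block tends to raise the miss ratio while lowering the bytes moved per miss. That is the kind of tradeoff the abstract attributes to sub-blocks.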


BibTeX citation:

@techreport{Hill:CSD-84-175,
    Author = {Hill, Mark D. and Smith, Alan Jay},
    Title = {Experimental Evaluation of On-Chip Microprocessor Cache Memories},
    Institution = {EECS Department, University of California, Berkeley},
    Year = {1984},
    Month = {Apr},
    URL = {http://www2.eecs.berkeley.edu/Pubs/TechRpts/1984/5964.html},
    Number = {UCB/CSD-84-175},
    Abstract = {Advances in integrated circuit density are permitting the implementation on a single chip of functions and performance enhancements beyond those of a basic processor. One performance enhancement of proven value is a cache memory; placing a cache on the processor chip can reduce both mean memory access time and bus traffic. In this paper we use trace-driven simulation to study design tradeoffs for small (on-chip) caches. Miss ratio and traffic ratio (bus traffic) are the metrics for cache performance. Particular attention is paid to sub-block caches (also known as sector caches), in which address tags are associated with blocks, each of which contains multiple sub-blocks; sub-blocks are the transfer unit. Using traces from two 16-bit architectures (Z8000, PDP-11) and two 32-bit architectures (VAX-11, System/370), we find that general-purpose caches of 64 bytes (net size) are marginally useful in some cases, while 1024-byte caches perform fairly well; typical miss and traffic ratios for a 1024-byte (net size) cache, 4-way set associative with 8-byte blocks are: PDP-11: .039, .156, .060, VAX 11: .080, .160, Sys/370: .244, .489. (These figures are based on traces of user programs and the performance obtained in practice is likely to be less good.) The use of sub-blocks allows tradeoffs between miss ratio and traffic ratio for a given cache size. Load forward is quite useful. Extensive simulation results are presented.}
}

EndNote citation:

%0 Report
%A Hill, Mark D.
%A Smith, Alan Jay
%T Experimental Evaluation of On-Chip Microprocessor Cache Memories
%I EECS Department, University of California, Berkeley
%D 1984
%@ UCB/CSD-84-175
%U http://www2.eecs.berkeley.edu/Pubs/TechRpts/1984/5964.html
%F Hill:CSD-84-175