Disk Caching in Large Databases and Timeshared Systems

Barbara Tockey Zivkov and Alan Jay Smith

EECS Department
University of California, Berkeley
Technical Report No. UCB/CSD-96-913
September 1996

http://www.eecs.berkeley.edu/Pubs/TechRpts/1996/CSD-96-913.pdf

We present the results of a variety of trace-driven simulations of disk cache design. Our traces come from a variety of mainframe timesharing and database systems in production use. We compute miss ratios, run lengths, traffic ratios, cache residency times, degree of memory pollution and other statistics for a variety of designs, varying block size, prefetching algorithm and write algorithm. We find that for this workload, sequential prefetching produces a significant (about 20%) but still limited improvement in the miss ratio, even using a powerful technique for detecting sequentiality. Copy-back writing decreased write traffic relative to write-through; periodic flushing of the dirty blocks increased write traffic only slightly compared to pure write-back, and then only for large cache sizes. Write-allocate had little effect compared to no-write-allocate. Block sizes of over a track don't appear to be useful. Limiting cache occupancy by a single processor transaction appears to have little effect. This study is unique in the variety and quality of the data used in the studies.
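
As a rough illustration of what a trace-driven comparison of write policies involves, the following minimal sketch replays a toy reference trace through a small cache and reports the read miss ratio and the number of disk writes. It is not the simulator used in the report: the LRU replacement policy, the write-allocate choice, the final flush of dirty blocks, the (op, block) trace format, and the name simulate are all assumptions made for this example.

```python
from collections import OrderedDict


def simulate(trace, capacity, copy_back=True):
    """Replay (op, block) pairs through an LRU cache.

    Returns (read miss ratio, number of disk writes).
    Assumptions of this sketch: LRU replacement, write-allocate,
    and a final flush of any blocks still dirty at end of trace.
    """
    cache = OrderedDict()  # block -> dirty flag, least recently used first
    reads = read_misses = disk_writes = 0

    for op, block in trace:
        if op == "r":
            reads += 1
            if block not in cache:
                read_misses += 1

        # Remove the block (if present) so re-inserting makes it most recent.
        dirty = cache.pop(block, False)

        if op == "w":
            if copy_back:
                dirty = True        # defer the disk write until eviction
            else:
                disk_writes += 1    # write-through: every write goes to disk

        cache[block] = dirty

        if len(cache) > capacity:
            _, evicted_dirty = cache.popitem(last=False)  # evict LRU block
            if evicted_dirty:
                disk_writes += 1

    # Flush whatever is still dirty when the trace ends.
    disk_writes += sum(1 for d in cache.values() if d)

    miss_ratio = read_misses / reads if reads else 0.0
    return miss_ratio, disk_writes


# Hypothetical toy trace: "r" = read, "w" = write, integers are block numbers.
toy_trace = [("w", 1), ("r", 1), ("w", 1), ("r", 2),
             ("r", 1), ("w", 2), ("r", 3), ("r", 1)]

for policy in (True, False):
    name = "copy-back" if policy else "write-through"
    print(name, simulate(toy_trace, capacity=2, copy_back=policy))
```

In this sketch, repeated writes to a cached block are coalesced under copy-back, so it can never issue more disk writes than write-through on the same trace. The size of the difference depends entirely on the trace and the cache size.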


BibTeX citation:

@techreport{Zivkov:CSD-96-913,
    Author = {Zivkov, Barbara Tockey and Smith, Alan Jay},
    Title = {Disk Caching in Large Databases and Timeshared Systems},
    Institution = {EECS Department, University of California, Berkeley},
    Year = {1996},
    Month = {Sep},
    URL = {http://www.eecs.berkeley.edu/Pubs/TechRpts/1996/5812.html},
    Number = {UCB/CSD-96-913},
    Abstract = {We present the results of a variety of trace-driven simulations of disk cache design. Our traces come from a variety of mainframe timesharing and database systems in production use. We compute miss ratios, run lengths, traffic ratios, cache residency times, degree of memory pollution and other statistics for a variety of designs, varying block size, prefetching algorithm and write algorithm. We find that for this workload, sequential prefetching produces a significant (about 20%) but still limited improvement in the miss ratio, even using a powerful technique for detecting sequentiality. Copy-back writing decreased write traffic relative to write-through; periodic flushing of the dirty blocks increased write traffic only slightly compared to pure write-back, and then only for large cache sizes. Write-allocate had little effect compared to no-write-allocate. Block sizes of over a track don't appear to be useful. Limiting cache occupancy by a single processor transaction appears to have little effect. This study is unique in the variety and quality of the data used in the studies.}
}

EndNote citation:

%0 Report
%A Zivkov, Barbara Tockey
%A Smith, Alan Jay
%T Disk Caching in Large Databases and Timeshared Systems
%I EECS Department, University of California, Berkeley
%D 1996
%@ UCB/CSD-96-913
%U http://www.eecs.berkeley.edu/Pubs/TechRpts/1996/5812.html
%F Zivkov:CSD-96-913