The slow mechanical nature of I/O devices, specifically disks, compared to the speed of electronic processing has long been recognized. To keep the processor supplied with data, systems rely on aggressive I/O optimization techniques that can be tuned to specific workloads. But as the improvement in processor performance continues to far exceed the improvement in disk access time, the I/O bottleneck is becoming increasingly severe. We now resort more and more to expensive measures for increasing I/O performance, such as configuring systems with large amounts of memory as an I/O cache or using many more disks than storage requirements warrant. As systems continue to grow in complexity beyond our ability to manage them cost-effectively, what is really needed is a storage system that delivers good performance without requiring extensive resources and time to configure and tune, even as the workloads evolve.
In this research, we explore mechanizable techniques for improving I/O performance by dynamically optimizing disk block layout in response to the actual usage pattern. Our techniques are based on the observation that physically, it is much more efficient to read a large contiguous chunk of data than many small chunks scattered throughout the disk. Users, however, typically have only limited knowledge and control of how their data are laid out on the disk, and most would rather not be thus burdened. The file system or application can guess how blocks are likely to be used based on static logical information such as name space relationships, but such information may not accurately reflect the actual dynamic usage pattern. On the other hand, technology trends are such that disk space and processing power are increasingly available for performing sophisticated optimizations without user involvement.
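The observation about contiguous versus scattered reads can be illustrated with a first-order cost model in which every request pays one positioning delay (seek plus rotational latency) before data is transferred. The sketch below is only an illustration: the `DiskModel` class, the `compare` helper, and all parameter values are hypothetical assumptions chosen for the example, not measurements or components of the system described here.

```python
# Illustrative first-order disk model. All names and parameter values are
# assumptions made for this example, not figures from the source.
from dataclasses import dataclass


@dataclass(frozen=True)
class DiskModel:
    avg_seek_ms: float = 5.0         # assumed average seek time
    avg_rotational_ms: float = 3.0   # assumed average rotational latency
    transfer_mb_per_s: float = 50.0  # assumed sustained transfer rate

    def read_time_ms(self, num_requests: int, bytes_per_request: int) -> float:
        """Time to service num_requests reads, each paying one positioning delay."""
        positioning = num_requests * (self.avg_seek_ms + self.avg_rotational_ms)
        total_bytes = num_requests * bytes_per_request
        transfer = total_bytes / (self.transfer_mb_per_s * 1e6) * 1e3
        return positioning + transfer


def compare(total_bytes: int, chunk_bytes: int, disk: DiskModel = DiskModel()):
    """Return (contiguous_ms, scattered_ms) for reading total_bytes."""
    contiguous = disk.read_time_ms(1, total_bytes)
    scattered = disk.read_time_ms(total_bytes // chunk_bytes, chunk_bytes)
    return contiguous, scattered


if __name__ == "__main__":
    contiguous_ms, scattered_ms = compare(1 << 20, 4096)
    print(f"contiguous: {contiguous_ms:.1f} ms, scattered: {scattered_ms:.1f} ms")
```

Under these assumed parameters, the transfer time is identical in both cases, so the entire difference comes from the number of positioning delays. This is the cost that a layout with better spatial locality avoids.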
Therefore, we propose automatic locality-improving storage (ALIS), a storage system that automatically replicates and reorganizes selected disk blocks to increase the spatial locality of reference, so that data can be fetched effectively in larger chunks. Using extensive trace-driven simulations, we demonstrate that ALIS can dramatically improve performance despite efforts already made by the database and file system to optimize the layout of data. Our results show that ALIS far outperforms prior reorganization techniques both in an environment where the storage system consists of only disks and low-end disk adaptors, and in one where there is a large outboard controller.