Optimizing Hierarchical Storage Management For Database System
Date
2014-06-17
Authors
Liu, Xin
Publisher
University of Waterloo
Abstract
Caching is a classical and effective way to improve system performance.
For this reason, servers such as database servers and storage servers contain
significant amounts of memory that acts as a fast cache.
Meanwhile, as new storage devices such as flash-based solid state drives (SSDs)
are added to storage systems, the
memory cache is no longer the only way to improve system performance.
In this thesis, we address the problems of how to manage the cache of a storage server and
how to utilize the SSD in a hybrid storage system.
Traditional caching policies are known to perform poorly for storage
server caches. One promising approach to solving this problem is to use hints
from the storage clients to manage the storage server cache. Previous
hinting approaches are ad hoc, in that a predefined reaction to
specific types of hints is hard-coded into the caching policy. With
ad hoc approaches, it is difficult to ensure that the best hints are
being used, and it is difficult to accommodate multiple types of hints
and multiple client applications. In this thesis, we propose
CLient-Informed Caching (CLIC), a generic hint-based technique for
managing storage server caches. CLIC automatically interprets hints
generated by storage clients and translates them into a server caching
policy. It does this without explicit knowledge of the
application-specific hint semantics. We demonstrate using trace-based
simulation of database workloads that CLIC outperforms hint-oblivious
and state-of-the-art hint-aware caching policies.
We also demonstrate that the space required to track and interpret
hints is small.
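To make the idea of interpreting opaque hints concrete, here is a minimal sketch in Python. It is not CLIC itself: the class name `HintStatsCache`, the re-reference ratio used as a priority, and the admission rule are all assumptions chosen for illustration. The sketch treats each hint as an uninterpreted label, counts how often requests carrying that label are followed by another request for the same page, and prefers to keep pages whose hints have the higher observed ratio.

```python
from collections import defaultdict


class HintStatsCache:
    """Toy hint-aware cache (illustrative assumption, not the thesis algorithm)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.cached = {}                   # page -> hint on its most recent request
        self.last_hint = {}                # page -> hint on its previous request (unbounded in this toy)
        self.requests = defaultdict(int)   # hint -> number of requests carrying it
        self.reuses = defaultdict(int)     # hint -> how many of those were followed by a re-reference

    def priority(self, hint):
        """Observed fraction of requests with this hint that were re-referenced."""
        n = self.requests[hint]
        return self.reuses[hint] / n if n else 0.0

    def access(self, page, hint):
        """Process one request; return True on a cache hit."""
        # The page is being re-referenced, so credit the hint of its previous request.
        prev = self.last_hint.get(page)
        if prev is not None:
            self.reuses[prev] += 1
        self.requests[hint] += 1
        self.last_hint[page] = hint

        if page in self.cached:
            self.cached[page] = hint
            return True
        if len(self.cached) < self.capacity:
            self.cached[page] = hint
            return False
        # Cache is full: admit the new page only if its hint currently
        # looks more valuable than the lowest-priority cached page.
        victim = min(self.cached, key=lambda p: self.priority(self.cached[p]))
        if self.priority(hint) > self.priority(self.cached[victim]):
            del self.cached[victim]
            self.cached[page] = hint
        return False
```

In this sketch the server never needs to know what a hint means to the client; it only needs per-hint counters, which is one way the tracking state can stay small relative to the cache.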
SSDs are becoming part of the storage system.
Adding an SSD to a storage system raises not only the question of how to manage the SSD,
but also the question of whether current buffer pool algorithms will still work effectively.
We are interested in the use of hybrid storage systems, consisting of SSDs and hard disk drives (HDDs),
for database management.
We present cost-aware replacement algorithms for both the DBMS buffer pool and the SSD.
These algorithms are aware of the different I/O performance
of HDDs and SSDs.
In such a hybrid storage system, the physical access pattern to the SSD depends on the management of the DBMS buffer pool.
We studied the impact of the buffer pool caching policies on the access patterns of the SSD.
Based on these studies, we designed a caching policy to effectively manage the SSD.
We implemented these algorithms in MySQL's InnoDB storage engine
and used the TPC-C workload to demonstrate that these cost-aware algorithms
outperform previous algorithms.
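As a rough illustration of what cost-aware replacement can mean, the following Python sketch evicts the page that is cheapest to read back among the few least recently used pages. It is not the algorithm implemented in InnoDB for this thesis: the class name `CostAwareBufferPool`, the `window` parameter, and the relative costs in `READ_COST` are hypothetical values chosen only to show the principle.

```python
from collections import OrderedDict

# Hypothetical relative read costs; real values depend on the devices.
READ_COST = {"ssd": 1.0, "hdd": 10.0}


class CostAwareBufferPool:
    """Toy buffer pool: LRU order, with a cost-aware choice among the oldest pages."""

    def __init__(self, capacity, window=2):
        self.capacity = capacity
        self.window = window          # how many LRU-end pages are eviction candidates
        self.pages = OrderedDict()    # page_id -> device it would be re-read from, oldest first

    def access(self, page_id, device):
        """Process one page request; return True on a buffer pool hit."""
        if page_id in self.pages:
            self.pages.move_to_end(page_id)   # mark as most recently used
            return True
        if len(self.pages) >= self.capacity:
            self._evict()
        self.pages[page_id] = device
        return False

    def _evict(self):
        """Evict the cheapest-to-reload page among the `window` oldest pages."""
        candidates = list(self.pages.items())[: self.window]
        # min() returns the first minimum, so ties go to the least recently used page.
        victim, _ = min(candidates, key=lambda item: READ_COST[item[1]])
        del self.pages[victim]
        return victim
```

With `window=1` the sketch degenerates to plain LRU; a larger window trades some recency information for fewer expensive HDD reads.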
Keywords
Database, Storage management, Cache management, Hybrid storage, SSD