Efficient Hierarchical Storage Management Framework Empowered by Reinforcement Learning

Zhang, Tianru; Toor, Salman; Hellander, Andreas

(Submitted on 12 Jan 2022)

Abstract: With the rapid development of big data and cloud computing, data management has become increasingly challenging. Over the years, a number of frameworks for data management and storage with various characteristics and features have become available. Most of these are highly efficient, but ultimately create data silos. Because no single framework can efficiently fulfill the data management needs of diverse applications, it becomes difficult to move and work coherently with data as new requirements emerge. A possible solution is to design smart and efficient hierarchical (multi-tier) storage solutions. A hierarchical storage system (HSS) is a meta-solution that combines different storage frameworks into a jointly constructed large storage pool. It brings a number of benefits, including better utilization of storage, cost-efficiency, and access to the different features provided by the underlying storage frameworks. To maximize the gains of hierarchical storage solutions, it is important that they include intelligent and autonomous mechanisms for data management grounded in the features of the different underlying frameworks. Data management decisions should be made according to the characteristics of the dataset, tier status, and access patterns. These parameters are highly dynamic, and defining a policy based on them is a non-trivial task. This paper presents an open-source hierarchical storage framework with a dynamic migration policy based on reinforcement learning (RL). We present a mathematical model, a software architecture, and an implementation based on both simulations and a live cloud-based environment. We compare the proposed RL-based strategy to a baseline of three rule-based policies, showing that the RL-based policy achieves significantly higher efficiency and optimal data distribution in different scenarios compared to the dynamic rule-based policies.

Comments:	20 pages, 13 figures
Subjects:	Distributed, Parallel, and Cluster Computing (cs.DC)
Cite as:	arXiv:2201.11668 [cs.DC]
	(or arXiv:2201.11668v1 [cs.DC] for this version)

Submission history

From: Tianru Zhang
[v1] Wed, 12 Jan 2022 15:10:33 GMT (6720kb,D)
