Title: Accelerating Parallel Write via Deeply Integrating Predictive Lossy Compression with HDF5

Abstract: Lossy compression is one of the most efficient ways to reduce storage overhead and improve I/O performance for HPC applications. However, existing parallel I/O libraries cannot fully exploit lossy compression to accelerate parallel writes because compression-write performance is not well understood. To this end, we propose deeply integrating predictive lossy compression with HDF5 to significantly improve parallel-write performance. Specifically, we propose analytical models that predict the compression time and the parallel-write time before the actual compression takes place, which enables compression-write overlapping. We also introduce extra space in the process to handle possible data overflows that result from prediction uncertainty in compression ratios. Moreover, we propose an optimization that reorders the compression tasks to increase the overlapping efficiency. Experiments with up to 4,096 cores on Summit show that our solution improves write performance by up to 4.5X over the non-compression solution and by up to 2.9X over the lossy compression solution, with only 1.5% storage overhead (compared to the original data), on two real-world HPC applications.
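
The overlapping and reordering ideas in the abstract can be illustrated with a small sketch. This is not the paper's implementation: it assumes a simple two-stage pipeline in which one chunk is written while the next is compressed, it uses Johnson's rule (a classic two-stage scheduling heuristic, swapped in here for illustration) as the reordering strategy, and it assumes one possible file layout in which an overflow region follows the slots reserved from predicted sizes. All function names, field names, and numbers are hypothetical.

```python
from typing import List, Tuple

# (name, predicted compression seconds, predicted write seconds); hypothetical values
Task = Tuple[str, float, float]


def pipeline_makespan(tasks: List[Task]) -> float:
    """Finish time when each write overlaps the compression of later tasks."""
    compress_done = 0.0
    write_done = 0.0
    for _, t_comp, t_write in tasks:
        compress_done += t_comp
        # A write starts once its data is compressed and the previous write ended.
        write_done = max(write_done, compress_done) + t_write
    return write_done


def johnson_order(tasks: List[Task]) -> List[Task]:
    """Reorder with Johnson's rule (an assumption, not the paper's algorithm)."""
    # Tasks that compress faster than they write go first, shortest compression first.
    front = sorted((t for t in tasks if t[1] <= t[2]), key=lambda t: t[1])
    # The remaining tasks go last, longest write first.
    back = sorted((t for t in tasks if t[1] > t[2]), key=lambda t: t[2], reverse=True)
    return front + back


def plan_offsets(predicted_sizes: List[int], actual_sizes: List[int]):
    """Reserve file slots from predicted sizes; spill any excess to an overflow region."""
    offsets = []
    cursor = 0
    for size in predicted_sizes:
        offsets.append(cursor)
        cursor += size
    overflow_cursor = cursor  # assumed layout: overflow region follows the reserved slots
    overflow = []  # entries are (chunk index, overflow offset, excess bytes)
    for i, (pred, act) in enumerate(zip(predicted_sizes, actual_sizes)):
        if act > pred:
            overflow.append((i, overflow_cursor, act - pred))
            overflow_cursor += act - pred
    return offsets, overflow


tasks = [("temperature", 4.0, 1.0), ("density", 1.0, 3.0), ("velocity", 2.0, 2.0)]
serial = sum(c + w for _, c, w in tasks)           # no overlap at all
overlapped = pipeline_makespan(tasks)              # overlap, original order
reordered = pipeline_makespan(johnson_order(tasks))  # overlap, reordered
```

With these made-up timings, overlapping alone shortens the total from 13 to 10 seconds, and reordering shortens it further to 8 seconds, which is the kind of gain the reordering optimization targets. The offset planner shows why accurate size prediction matters: offsets can be fixed before compression finishes, and only the mispredicted excess lands in the extra space.
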
Comments: 13 pages, 18 figures, accepted by ACM/IEEE SC'22
Subjects: Distributed, Parallel, and Cluster Computing (cs.DC); Performance (cs.PF)
Cite as: arXiv:2206.14761 [cs.DC]
  (or arXiv:2206.14761v1 [cs.DC] for this version)

Submission history

From: Dingwen Tao
[v1] Wed, 29 Jun 2022 16:47:01 GMT