Title: SIMD Lossy Compression for Scientific Data

Abstract: Modern HPC applications produce increasingly large amounts of data, which limits the performance of current extreme-scale systems. Data reduction techniques, such as lossy compression, help mitigate this issue by decreasing the size of the data these applications generate. SZ, a current state-of-the-art lossy compressor, achieves high compression ratios, but its prediction/quantization methods introduce dependencies that prevent parallelizing this step of the compression. Recent work proposes a parallel dual prediction/quantization algorithm for GPUs that removes these dependencies. However, some HPC systems and applications do not use GPUs and could still benefit from the fine-grained parallelism of this method. Using the dual-quantization technique, we implement and optimize a SIMD-vectorized CPU version of SZ, vecSZ, and create a heuristic for selecting the optimal block size and vector length. We also investigate the effect of non-zero block padding values on decreasing the number of unpredictable values along compression block borders. We evaluate vecSZ on the Intel Skylake and AMD Rome architectures using real-world scientific datasets, measuring its performance against pSZ, an O3-optimized CPU version of SZ using dual-quantization, as well as against SZ-1.4. We find that applying alternative padding reduces the number of outliers by 100% for some configurations. Our implementation also results in up to a 32% improvement in rate-distortion and up to a 15× speedup over SZ-1.4, achieving a prediction and quantization bandwidth in excess of 3.4 GB/s.
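
To make the dependency argument concrete, the following is a minimal 1D sketch of dual-quantization as it is commonly described in the literature, not the vecSZ implementation. The function names `dual_quantize` and `reconstruct`, the `pad_value` parameter, and the first-order (left-neighbor) predictor are assumptions chosen for illustration. The point is that prediction reads only prequantized inputs, never reconstructed outputs, so every element can be processed independently, which is what makes the step amenable to SIMD.

```python
def dual_quantize(data, error_bound, pad_value=0.0):
    """Hypothetical 1D dual-quantization sketch (illustrative only)."""
    step = 2.0 * error_bound
    # Prequantization: each value maps to an integer on its own,
    # so |x - q * step| <= error_bound holds per element.
    prequant = [round(x / step) for x in data]
    # Assumed padding: the value the predictor sees to the left of the block.
    pad = round(pad_value / step)
    # Postquantization: the delta against the left neighbor uses only
    # prequantized inputs, so there is no loop-carried dependency.
    return [
        prequant[i] - (prequant[i - 1] if i > 0 else pad)
        for i in range(len(prequant))
    ]


def reconstruct(codes, error_bound, pad_value=0.0):
    """Invert dual_quantize with a running sum over the codes."""
    step = 2.0 * error_bound
    prev = round(pad_value / step)
    out = []
    for c in codes:
        prev += c
        out.append(prev * step)
    return out
```

With zero padding, the first code of a block holds the full magnitude of the first value; choosing a padding value close to the data (here passed in by hand) keeps that border code small, which is the intuition behind the non-zero padding the abstract investigates.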
Subjects: Distributed, Parallel, and Cluster Computing (cs.DC)
Cite as: arXiv:2201.04614 [cs.DC]
  (or arXiv:2201.04614v1 [cs.DC] for this version)

Submission history

From: Griffin Dube
[v1] Wed, 12 Jan 2022 18:38:34 GMT (467kb,D)
