Title: ARcode: HPC Application Recognition Through Image-encoded Monitoring Data

Abstract: Knowing which applications HPC jobs run and analyzing their performance behavior are important for system management and optimization. Existing approaches detect and identify HPC applications with machine learning models, but they rely heavily on features manually extracted from resource utilization data to achieve high prediction accuracy. In this study, we propose ARcode, an application recognition method that encodes job monitoring data into images and leverages the automatic feature learning capability of convolutional neural networks to detect and identify applications. Our extensive evaluation on a dataset collected from a large-scale production HPC system shows that ARcode outperforms the state-of-the-art methodology by up to 18.87% in accuracy at high confidence thresholds. For some specific applications (BerkeleyGW and e3sm), ARcode outperforms it by over 20% at a confidence threshold of 0.8.
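
The two ideas in the abstract, image-encoding of monitoring data and prediction at a confidence threshold, can be illustrated with a minimal sketch. The abstract does not specify the actual encoding or classifier, so everything here is an assumption: the hypothetical `encode_job_as_image` simply min-max scales each metric's time series into one grayscale row, and the hypothetical `predict_with_threshold` rejects any prediction whose top class probability falls below the threshold. The probabilities in the usage note are made up for illustration.

```python
def encode_job_as_image(metrics, width=8):
    """Turn per-metric time series into a 2D grayscale image (one row per metric).

    Assumed encoding, not the paper's: each non-empty series is resampled to
    `width` samples and min-max scaled to integer pixel values in 0..255.
    """
    image = []
    for series in metrics:
        n = len(series)
        # Resample to a fixed width by picking evenly spaced samples.
        resampled = [series[i * n // width] for i in range(width)]
        lo, hi = min(resampled), max(resampled)
        span = hi - lo
        # A flat series carries no variation, so it maps to an all-zero row.
        row = [int(255 * (v - lo) / span) if span else 0 for v in resampled]
        image.append(row)
    return image


def predict_with_threshold(probabilities, threshold=0.8):
    """Return the most probable label, or "unknown" if it is not confident enough.

    `probabilities` maps application names to class probabilities, such as a
    CNN's softmax output would provide.
    """
    label, confidence = max(probabilities.items(), key=lambda kv: kv[1])
    return label if confidence >= threshold else "unknown"
```

For example, `predict_with_threshold({"BerkeleyGW": 0.6, "e3sm": 0.4})` yields `"unknown"` at the default threshold of 0.8, since the top probability is too low to count as a confident recognition.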
Subjects: Distributed, Parallel, and Cluster Computing (cs.DC)
Cite as: arXiv:2301.08612 [cs.DC]
  (or arXiv:2301.08612v1 [cs.DC] for this version)

Submission history

From: Jie Li
[v1] Fri, 20 Jan 2023 14:49:25 GMT (18089kb,D)
