Title: A Case Study of LLVM-Based Analysis for Optimizing SIMD Code Generation

Abstract: This paper presents a methodology for using LLVM-based tools to tune the DCA++ (dynamical cluster approximation) application that targets the new ARM A64FX processor. The goal is to describe the changes required for the new architecture and generate efficient single instruction/multiple data (SIMD) instructions that target the new Scalable Vector Extension instruction set. During manual tuning, the authors used the LLVM tools to improve code parallelization by using OpenMP SIMD, refactored the code and applied transformations that enabled SIMD optimizations, and ensured that the correct libraries were used to achieve optimal performance. By applying these code changes, code speed was increased by 1.98X and 78 GFlops were achieved on the A64FX processor. The authors aim to automate parts of these efforts in the OpenMP Advisor tool, which is built on top of existing and newly introduced LLVM tooling.
Subjects: Distributed, Parallel, and Cluster Computing (cs.DC); Materials Science (cond-mat.mtrl-sci); Hardware Architecture (cs.AR); Computation and Language (cs.CL); Software Engineering (cs.SE)
Cite as: arXiv:2106.14332 [cs.DC]
  (or arXiv:2106.14332v1 [cs.DC] for this version)

Submission history

From: Weile Wei
[v1] Sun, 27 Jun 2021 22:38:16 GMT (1728kb,D)
