Performance

New submissions

New submissions for Fri, 22 May 20

[1]  arXiv:2005.10413
Title: Mapping Matters: Application Process Mapping on 3-D Processor Topologies
Subjects: Distributed, Parallel, and Cluster Computing (cs.DC); Performance (cs.PF)

Applications' performance is influenced by the mapping of processes to computing nodes, the frequency and volume of exchanges among processing elements, the network capacity, and the routing protocol. A poor mapping of application processes degrades performance and wastes resources. Process mapping is frequently ignored as an explicit optimization step because the system typically offers a default mapping, users may lack awareness of their applications' communication behavior, and the opportunities for improving performance through mapping are often unclear. This work studies the impact of application process mapping on several processor topologies. We propose a workflow that renders mapping as an explicit optimization step for parallel applications. We apply the workflow to a set of four applications (NAS CG and BT-MZ, CORAL-2 AMG, and CORAL LULESH), twelve mapping algorithms (communication- and topology-oblivious/aware), and three direct network topologies (3-D mesh, 3-D torus, and a novel highly adaptive energy-efficient 3-D topology, called the HAEC Box). We assess the mappings' quality in terms of volume, frequency, and distance of exchanges using metrics such as dilation (measured in hop$\cdot$Byte). A parallel trace-based simulator predicts the applications' execution on the three topologies using the twelve mappings. We evaluate the impact of process mapping on the applications' simulated performance in terms of execution and communication times and identify the mapping that achieves the highest performance. To ensure correctness of the simulations, we compare the resulting volume, frequency, and distance of exchanges against their pre-simulation values. This work emphasizes the importance of process mapping as an explicit optimization step and offers a solution for parallel applications to exploit the full potential of the allocated resources on a given system.
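
The abstract names dilation, measured in hop·Byte, as one mapping-quality metric. The sketch below is only an illustration of how such a metric could be computed; it is not the authors' implementation. It assumes dilation is the sum over communicating process pairs of bytes exchanged times the shortest hop distance between the nodes they are mapped to, and that hop distance is Manhattan distance on a 3-D mesh (with wrap-around links on a 3-D torus). The names hop_distance, dilation, comm_bytes, and the example mappings are hypothetical.

```python
def hop_distance(a, b, dims, torus=False):
    """Shortest hop count between node coordinates a and b.

    Assumes dimension-wise (Manhattan) routing on a 3-D mesh; with
    torus=True each dimension also has a wrap-around link.
    """
    hops = 0
    for x, y, n in zip(a, b, dims):
        d = abs(x - y)
        if torus:
            d = min(d, n - d)  # going around the ring may be shorter
        hops += d
    return hops


def dilation(comm_bytes, mapping, dims, torus=False):
    """Assumed definition: sum of bytes * hops over all communicating pairs.

    comm_bytes maps (sender, receiver) process ranks to bytes exchanged;
    mapping maps each process rank to its (x, y, z) node coordinate.
    """
    return sum(
        volume * hop_distance(mapping[src], mapping[dst], dims, torus)
        for (src, dst), volume in comm_bytes.items()
    )


# Hypothetical workload: four processes exchanging 100 bytes in a ring.
dims = (4, 4, 4)
comm_bytes = {(0, 1): 100, (1, 2): 100, (2, 3): 100, (3, 0): 100}

# Two hypothetical mappings: neighbours along one axis vs. scattered.
compact = {0: (0, 0, 0), 1: (1, 0, 0), 2: (2, 0, 0), 3: (3, 0, 0)}
scattered = {0: (0, 0, 0), 1: (3, 3, 3), 2: (0, 0, 1), 3: (3, 3, 2)}

mesh_compact = dilation(comm_bytes, compact, dims)
torus_compact = dilation(comm_bytes, compact, dims, torus=True)
mesh_scattered = dilation(comm_bytes, scattered, dims)

print(mesh_compact, torus_compact, mesh_scattered)
```

Under these assumptions the compact mapping yields a lower hop·Byte total than the scattered one on the mesh, and the torus wrap-around link shortens the ring's closing exchange. This is the kind of difference the abstract describes when it compares mappings across topologies.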

Replacements for Fri, 22 May 20

[2]  arXiv:2003.04821 (replaced)
Title: Benchmarking TinyML Systems: Challenges and Direction
Comments: 6 pages, 1 figure, 2 tables
Subjects: Performance (cs.PF); Machine Learning (cs.LG)