Title: PaSh: Light-touch Data-Parallel Shell Processing

Authors: Nikos Vasilakis (MIT), Konstantinos Kallas (University of Pennsylvania), Konstantinos Mamouras (Rice University), Achilleas Benetopoulos (Unaffiliated), Lazar Cvetković (University of Belgrade)
Abstract: This paper presents PaSh, a system for parallelizing POSIX shell scripts. Given a script, PaSh converts it to a dataflow graph, performs a series of semantics-preserving program transformations that expose parallelism, and then converts the dataflow graph back into a script: one that adds POSIX constructs to explicitly guide parallelism, coupled with PaSh-provided Unix-aware runtime primitives for addressing performance- and correctness-related issues. A lightweight annotation language allows command developers to express key parallelizability properties about their commands. An accompanying parallelizability study of POSIX and GNU commands, two large and commonly used groups, guides the annotation language and optimized aggregator library that PaSh uses. Finally, PaSh's extensive evaluation over 44 unmodified Unix scripts shows significant speedups (0.89–61.1×, avg: 6.7×) stemming from the combination of its program transformations and runtime primitives.
Comments: 18 pages, 12 figures
Subjects: Distributed, Parallel, and Cluster Computing (cs.DC); Programming Languages (cs.PL)
DOI: 10.1145/3447786.3456228
Cite as: arXiv:2007.09436 [cs.DC]
  (or arXiv:2007.09436v4 [cs.DC] for this version)
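
The data-parallel idea summarized in the abstract can be illustrated with a small sketch. This is not PaSh's implementation or its annotation language; it is a hypothetical Python model of a two-stage pipeline in the spirit of `grep needle | sort`, assuming the filter stage is stateless (each line is processed independently) and the sort stage can be recombined by a merge-style aggregator. The function names `grep`, `split`, `sequential`, and `parallel` are invented for illustration.

```python
import heapq
from concurrent.futures import ThreadPoolExecutor


def grep(lines, needle):
    """Stateless stage: every line is handled independently of the others."""
    return [line for line in lines if needle in line]


def sequential(lines, needle):
    """Reference pipeline, analogous to `grep needle | sort`."""
    return sorted(grep(lines, needle))


def split(lines, n):
    """Split the input into at most n contiguous chunks."""
    size = max(1, -(-len(lines) // n))  # ceiling division
    return [lines[i:i + size] for i in range(0, len(lines), size)] or [[]]


def parallel(lines, needle, width=4):
    """Run filter+sort on each chunk concurrently, then aggregate.

    The aggregator is a k-way merge of the sorted partial outputs,
    which plays the role that a merge of sorted files plays for sort.
    """
    chunks = split(lines, width)
    with ThreadPoolExecutor(max_workers=width) as pool:
        parts = list(pool.map(lambda c: sorted(grep(c, needle)), chunks))
    return list(heapq.merge(*parts))


if __name__ == "__main__":
    data = ["b foo", "a foo", "c bar", "d foo", "a bar", "e foo"]
    # The parallel version must agree with the sequential one.
    assert parallel(data, "foo", width=3) == sequential(data, "foo")
    print(parallel(data, "foo", width=3))
```

The sketch relies on two properties of the kind the abstract attributes to command annotations: the filter can be applied to any chunk in isolation, and the sorted partial results admit an aggregator that reproduces the sequential output. Commands lacking such properties would not be safe to split this way.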

Submission history

From: Konstantinos Kallas
[v1] Sat, 18 Jul 2020 14:14:11 GMT (568kb,D)
[v2] Sun, 11 Oct 2020 20:24:41 GMT (697kb,D)
[v3] Mon, 4 Jan 2021 02:04:55 GMT (697kb,D)
[v4] Sat, 3 Apr 2021 16:02:11 GMT (468kb,D)
