Title: Stream Processing With Dependency-Guided Synchronization (Extended Version)
(Submitted on 9 Apr 2021 (v1), last revised 3 Jan 2022 (this version, v3))
Abstract: Real-time data processing applications with low latency requirements have led to the increasing popularity of stream processing systems. While such systems offer convenient APIs that can be used to achieve data parallelism automatically, they offer limited support for computations that require synchronization between parallel nodes. In this paper, we propose *dependency-guided synchronization (DGS)*, an alternative programming model for stateful streaming computations with complex synchronization requirements. In the proposed model, the input is viewed as partially ordered, and the program consists of a set of parallelization constructs which are applied to decompose the partial order and process events independently. Our programming model maps to an execution model called *synchronization plans* which supports synchronization between parallel nodes. Our evaluation shows that APIs offered by two widely used systems -- Flink and Timely Dataflow -- cannot suitably expose parallelism in some representative applications. In contrast, DGS enables implementations with scalable performance, the resulting synchronization plans offer throughput improvements when implemented manually in existing systems, and the programming overhead is small compared to writing sequential code.
Submission history
From: Caleb Stanford
[v1] Fri, 9 Apr 2021 17:50:53 GMT (247kb,D)
[v2] Sat, 17 Apr 2021 05:12:14 GMT (203kb,D)
[v3] Mon, 3 Jan 2022 08:11:19 GMT (1293kb,D)