Title: COSCO: Container Orchestration using Co-Simulation and Gradient Based Optimization for Fog Computing Environments

Abstract: Intelligent placement and management of tasks in large-scale fog platforms is challenging because modern workloads are highly volatile and users demand both low energy consumption and low response time. Container orchestration platforms have emerged to alleviate this problem, with prior art using either heuristics to reach scheduling decisions quickly or AI-driven methods, such as reinforcement learning and evolutionary approaches, to adapt to dynamic scenarios. The former often fail to adapt quickly in highly dynamic environments, whereas the latter have run-times long enough to negatively impact response time. There is therefore a need for scheduling policies that both react efficiently in volatile environments and have low scheduling overheads. To achieve this, we propose a Gradient Based Optimization Strategy using Back-propagation of gradients with respect to Input (GOBI). Further, we leverage the accuracy of predictive digital-twin models and simulation capabilities by developing a Coupled Simulation and Container Orchestration Framework (COSCO). Using this framework, we create a hybrid simulation-driven decision approach, GOBI*, to optimize Quality of Service (QoS) parameters. Co-simulation and back-propagation allow these methods to adapt quickly in volatile environments. Experiments on fog applications with real-world data show that the GOBI and GOBI* methods significantly improve energy consumption, response time, Service Level Objective and scheduling time by up to 15, 40, 4, and 82 percent, respectively, compared with state-of-the-art algorithms.
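The core idea named in the abstract, back-propagating gradients with respect to the input rather than the weights, can be illustrated with a minimal sketch. This is not the authors' implementation: the tiny surrogate network, its constant weights, and the names `forward`, `input_gradient` and `optimize_input` are all hypothetical, chosen only to show a frozen model whose input is updated by gradient descent to lower a predicted cost.

```python
import math

# Hypothetical frozen surrogate: a one-hidden-layer tanh network that maps a
# 2-dimensional decision vector x to a scalar predicted cost. The weights are
# made-up constants for illustration, not values from the paper.
W1 = [[0.8, -0.5], [0.3, 0.9], [-0.6, 0.4]]  # one row per hidden unit
B1 = [0.1, -0.2, 0.05]
W2 = [0.7, -0.4, 0.5]
B2 = 0.2


def forward(x):
    """Return (predicted cost, hidden activations) for input x."""
    hidden = [
        math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
        for row, b in zip(W1, B1)
    ]
    cost = sum(w * h for w, h in zip(W2, hidden)) + B2
    return cost, hidden


def input_gradient(x):
    """Back-propagate the cost to the input; the weights stay fixed."""
    _, hidden = forward(x)
    grad = [0.0] * len(x)
    for w2, h, row in zip(W2, hidden, W1):
        d_pre = w2 * (1.0 - h * h)  # derivative through tanh
        for j, w1 in enumerate(row):
            grad[j] += d_pre * w1
    return grad


def optimize_input(x0, lr=0.1, steps=200):
    """Plain gradient descent on the input vector (assumed step size)."""
    x = list(x0)
    for _ in range(steps):
        g = input_gradient(x)
        x = [xi - lr * gi for xi, gi in zip(x, g)]
    return x


if __name__ == "__main__":
    start = [1.0, 1.0]
    best = optimize_input(start)
    print("cost before:", forward(start)[0])
    print("cost after: ", forward(best)[0])
```

In a scheduler, the input would encode a candidate placement decision and the surrogate would be trained on observed QoS data; here both are stand-ins, so only the update rule itself should be read as the point of the example.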
Comments: Accepted in IEEE Transactions on Parallel and Distributed Systems, 2021
Subjects: Distributed, Parallel, and Cluster Computing (cs.DC); Performance (cs.PF)
DOI: 10.1109/TPDS.2021.3087349
Cite as: arXiv:2104.14392 [cs.DC]
  (or arXiv:2104.14392v3 [cs.DC] for this version)

Submission history

From: Shreshth Tuli
[v1] Thu, 29 Apr 2021 15:09:44 GMT (7847kb,D)
[v2] Thu, 3 Jun 2021 17:32:14 GMT (7847kb,D)
[v3] Fri, 9 Jul 2021 13:08:48 GMT (7847kb,D)