
Title: Causal Inference-Based Root Cause Analysis for Online Service Systems with Intervention Recognition

Abstract: Fault diagnosis is critical in many domains, as faults may lead to safety threats or economic losses. In online service systems, operators rely on large volumes of monitoring data to detect and mitigate failures. Quickly identifying a small set of root cause indicators for the underlying fault can save considerable time in failure mitigation. In this paper, we formulate the root cause analysis problem as a new causal inference task named intervention recognition. We propose a novel unsupervised causal inference-based method named Causal Inference-based Root Cause Analysis (CIRCA). The core idea is a sufficient condition for a monitoring variable to be a root cause indicator, namely a change in its probability distribution conditioned on its parents in the Causal Bayesian Network (CBN). To apply this to online service systems, CIRCA constructs a graph among monitoring metrics based on knowledge of the system architecture and a set of causal assumptions. A simulation study illustrates the theoretical reliability of CIRCA. Performance on a real-world dataset further shows that CIRCA can improve the recall of the top-1 recommendation by 25% over the best baseline method.
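The core idea in the abstract (flag a variable whose distribution conditioned on its parents changes) can be illustrated with a small sketch. This is not the CIRCA implementation: it assumes a linear regression as the conditional model and a maximum standardized residual as the change score, and the function names, the toy graph `a -> b -> c`, and all data are hypothetical.

```python
import numpy as np

def _design(data, parent_names, n):
    # Hypothetical helper: intercept column plus one column per parent.
    cols = [np.ones(n)] + [np.asarray(data[p], dtype=float) for p in parent_names]
    return np.column_stack(cols)

def score_intervention(normal, faulty, parents):
    """Score each variable by how far its values in the faulty period
    deviate from a model of P(variable | parents) fitted on normal data.
    Assumption: linear-Gaussian conditionals; a sketch, not the paper's test."""
    scores = {}
    for name, parent_names in parents.items():
        y_normal = np.asarray(normal[name], dtype=float)
        y_faulty = np.asarray(faulty[name], dtype=float)
        x_normal = _design(normal, parent_names, len(y_normal))
        x_faulty = _design(faulty, parent_names, len(y_faulty))
        # Fit the conditional model on fault-free data only.
        coef, *_ = np.linalg.lstsq(x_normal, y_normal, rcond=None)
        sigma = (y_normal - x_normal @ coef).std() + 1e-9
        # Large standardized residuals suggest the conditional has changed.
        residual = y_faulty - x_faulty @ coef
        scores[name] = float(np.max(np.abs(residual)) / sigma)
    return scores

def simulate(rng, n, shift_b=0.0):
    # Toy causal chain a -> b -> c; shift_b models an intervention on b.
    a = rng.normal(size=n)
    b = 2.0 * a + shift_b + 0.1 * rng.normal(size=n)
    c = -1.0 * b + 0.1 * rng.normal(size=n)
    return {"a": a, "b": b, "c": c}

rng = np.random.default_rng(0)
parents = {"a": [], "b": ["a"], "c": ["b"]}
normal = simulate(rng, 500)
faulty = simulate(rng, 50, shift_b=5.0)

scores = score_intervention(normal, faulty, parents)
ranking = sorted(scores, key=scores.get, reverse=True)
print(ranking)
```

In this toy setup `c` also shifts during the fault, but only because its parent `b` shifted; conditioned on `b`, its behavior is unchanged, so it should score far lower than `b`. That is the distinction the sufficient condition is meant to capture.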
Comments: Accepted at KDD 2022 Applied Data Science Track
Subjects: Software Engineering (cs.SE); Artificial Intelligence (cs.AI)
DOI: 10.1145/3534678.3539041
Cite as: arXiv:2206.05871 [cs.SE]
  (or arXiv:2206.05871v1 [cs.SE] for this version)

Submission history

From: Mingjie Li
[v1] Mon, 13 Jun 2022 01:45:13 GMT (631kb,D)
