Computer Science > Computational Geometry
Title: Same Stats, Different Graphs (Graph Statistics and Why We Need Graph Drawings)
(Submitted on 29 Aug 2018 (v1), revised 25 Sep 2018 (this version, v3), latest version 30 Oct 2019 (v5))
Abstract: Data analysts commonly use statistics to summarize large datasets. While it is often sufficient to explore only the summary statistics of a dataset (e.g., min/mean/max), Anscombe's Quartet demonstrates how such statistics can be misleading. Graph mining has a similar problem: graph statistics (e.g., density, connectivity, clustering coefficient) may not capture all of the critical properties of a given graph. To study the relationships between different graph properties and statistics, we examine all low-order (<= 10 nodes) non-isomorphic graphs and provide a simple visual analytics system to explore correlations across multiple graph properties. For graphs with more than ten nodes, however, generating the entire space of graphs quickly becomes intractable. We therefore use different random graph generation methods to examine the distribution of graph statistics for higher-order graphs and to investigate the impact of various sampling methodologies. We also describe a method for generating many graphs that are identical over a number of graph properties and statistics yet are clearly and identifiably distinct.
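As a small illustration of the abstract's central point, the sketch below compares two six-node graphs that agree on node count, edge count, density, and degree sequence but differ in connectivity and triangle count. This is not the paper's generation method or its analytics system; the two example graphs and all helper names (`make_graph`, `density`, `degree_sequence`, `triangle_count`, `component_count`) are assumptions chosen for illustration, using only the Python standard library.

```python
from itertools import combinations

def make_graph(n, edges):
    """Build an adjacency-set dict for an undirected simple graph (hypothetical helper)."""
    adj = {v: set() for v in range(n)}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return adj

def density(adj):
    """Fraction of possible edges that are present: 2m / (n(n-1))."""
    n = len(adj)
    m = sum(len(nbrs) for nbrs in adj.values()) // 2
    return 2 * m / (n * (n - 1))

def degree_sequence(adj):
    """Sorted list of node degrees."""
    return sorted(len(nbrs) for nbrs in adj.values())

def triangle_count(adj):
    """Count node triples that are pairwise adjacent."""
    return sum(
        1
        for a, b, c in combinations(adj, 3)
        if b in adj[a] and c in adj[a] and c in adj[b]
    )

def component_count(adj):
    """Count connected components with an iterative depth-first search."""
    seen, count = set(), 0
    for start in adj:
        if start in seen:
            continue
        count += 1
        stack = [start]
        while stack:
            v = stack.pop()
            if v not in seen:
                seen.add(v)
                stack.extend(adj[v] - seen)
    return count

# Illustrative graphs (assumed, not taken from the paper):
# a single 6-cycle versus two disjoint triangles.
cycle6 = make_graph(6, [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (5, 0)])
two_triangles = make_graph(6, [(0, 1), (1, 2), (2, 0), (3, 4), (4, 5), (5, 3)])

for name, g in [("cycle6", cycle6), ("two_triangles", two_triangles)]:
    print(name, density(g), degree_sequence(g), triangle_count(g), component_count(g))
```

Both graphs have density 0.4 and every node has degree 2, yet the cycle is connected and triangle-free while the other graph splits into two components with one triangle each. A drawing makes the difference obvious at a glance, whereas the shared summary statistics hide it.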
Submission history
From: Hang Chen
[v1] Wed, 29 Aug 2018 16:27:48 GMT (7148kb,D)
[v2] Fri, 31 Aug 2018 20:10:29 GMT (7408kb,D)
[v3] Tue, 25 Sep 2018 00:02:24 GMT (7543kb,D)
[v4] Wed, 5 Dec 2018 18:11:38 GMT (4245kb,D)
[v5] Wed, 30 Oct 2019 01:32:25 GMT (7543kb,D)