Computer Science > Databases
Title: A Hybrid Approach To Hierarchical Density-based Cluster Selection
(Submitted on 6 Nov 2019 (v1), last revised 21 Jan 2021 (this version, v4))
Abstract: HDBSCAN is a density-based clustering algorithm that constructs a cluster hierarchy tree and then uses a specific stability measure to extract flat clusters from the tree. We show how the application of an additional threshold value can result in a combination of DBSCAN* and HDBSCAN clusters, and demonstrate potential benefits of this hybrid approach when clustering data of variable densities. In particular, our approach is useful in scenarios where we require a low minimum cluster size but want to avoid an abundance of micro-clusters in high-density regions. The method can directly be applied to HDBSCAN's tree of cluster candidates and does not require any modifications to the hierarchy itself. It can easily be integrated as an addition to existing HDBSCAN implementations.
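The thresholding idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The toy tree, the names `parent`, `birth_eps`, `eps_hat`, and the function `apply_epsilon_threshold` are all hypothetical. The sketch assumes that each cluster candidate records the distance at which it split off from its parent, and that this distance never increases from the root towards the leaves. Any cluster already selected by HDBSCAN that was born below the threshold is replaced by its closest ancestor born at or above the threshold, without modifying the hierarchy itself.

```python
import math

# Hypothetical toy hierarchy: node -> parent (None marks the root).
parent = {0: None, 1: 0, 2: 0, 3: 1, 4: 1}

# Hypothetical distance at which each node split off from its parent.
# Nodes 3 and 4 are micro-clusters that appear only at a very small distance.
birth_eps = {0: math.inf, 1: 5.0, 2: 5.0, 3: 0.5, 4: 0.5}


def apply_epsilon_threshold(selected, parent, birth_eps, eps_hat):
    """Replace clusters born below eps_hat by a coarser ancestor.

    Assumes birth_eps is non-increasing from root to leaves, so all
    descendants of one ancestor climb to that same ancestor.
    """
    result = set()
    for node in selected:
        current = node
        # Climb while the cluster was born below the threshold and its
        # parent is not the root (the root itself is never selected here).
        while (
            birth_eps[current] < eps_hat
            and parent[current] is not None
            and parent[parent[current]] is not None
        ):
            current = parent[current]
        result.add(current)
    return result


# Clusters that a stability-based selection might have returned (assumed).
hdbscan_selection = {2, 3, 4}

merged = apply_epsilon_threshold(hdbscan_selection, parent, birth_eps, eps_hat=1.0)
unchanged = apply_epsilon_threshold(hdbscan_selection, parent, birth_eps, eps_hat=0.1)
```

With a threshold of 1.0 the two micro-clusters 3 and 4 collapse into their parent 1, while cluster 2 is kept as selected. With a threshold of 0.1 nothing is born below the threshold, so the original selection is returned. For real data, the `hdbscan` Python package exposes a `cluster_selection_epsilon` parameter that serves this purpose.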
Submission history
From: Claudia Malzer
[v1] Wed, 6 Nov 2019 09:59:56 GMT (1877kb,D)
[v2] Sun, 8 Dec 2019 09:47:23 GMT (2065kb,D)
[v3] Thu, 10 Dec 2020 09:25:13 GMT (3100kb,D)
[v4] Thu, 21 Jan 2021 13:39:38 GMT (1568kb,D)