Title: Using an expert deviation carrying the knowledge of climate data in usual clustering algorithms

Abstract: To help physicists expand their knowledge of the climate in the Lesser Antilles, we aim to identify spatio-temporal configurations by applying cluster analysis to wind speed and cumulative rainfall datasets. We show that using the L2 norm in conventional clustering methods such as K-Means (KMS) and Hierarchical Agglomerative Clustering (HAC) can induce undesirable effects. We therefore propose replacing the Euclidean distance (L2) with a dissimilarity measure named Expert Deviation (ED). Based on the symmetrized Kullback-Leibler divergence, the ED integrates the properties of the observed physical parameters and climate knowledge. This measure compares histograms of four patches, corresponding to geographical zones, that are influenced by atmospheric structures. We jointly evaluate the internal homogeneity and the separation of the clusters obtained with ED and with L2. The results, compared using the silhouette index, show five clusters with high index values. For the two available datasets, KMS-ED, unlike KMS-L2, discriminates the daily situations well, giving more physical meaning to the clusters discovered by the algorithm. The effect of the patches is visible in the spatial analysis of the representative elements for KMS-ED. The ED produces distinct configurations in which the usual atmospheric structures are clearly identifiable, so atmospheric physicists can interpret where each cluster affects a specific zone in terms of those structures. KMS-L2 does not offer such interpretability, because the situations it represents are spatially quite smooth. This climatological study illustrates the advantage of using ED as a new approach.
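
As an illustration of the idea behind a patch-wise, histogram-based dissimilarity, the following minimal sketch computes a symmetrized Kullback-Leibler divergence between the histograms of two daily fields on each of four patches and sums the results. It is not the ED as defined in the paper: the abstract does not give the exact formula or how expert knowledge enters it. The function names, the quadrant-shaped patches, the bin edges, the smoothing constant `eps`, and the plain sum over patches are all assumptions made for this example.

```python
import numpy as np


def symmetrized_kl(p, q, eps=1e-12):
    """Symmetrized KL divergence KL(p||q) + KL(q||p) between two histograms.

    eps is an assumed smoothing constant that avoids log(0) on empty bins.
    """
    p = np.asarray(p, dtype=float) + eps
    q = np.asarray(q, dtype=float) + eps
    p = p / p.sum()  # normalize counts to probabilities
    q = q / q.sum()
    # KL(p||q) + KL(q||p) simplifies to sum((p - q) * log(p / q)).
    return float(np.sum((p - q) * np.log(p / q)))


def quadrant_patches(shape):
    """Hypothetical patches: four quadrant masks of a 2D grid.

    In the paper the patches correspond to geographical zones instead.
    """
    rows, cols = shape
    r, c = np.indices(shape)
    top, left = r < rows // 2, c < cols // 2
    return [top & left, top & ~left, ~top & left, ~top & ~left]


def patch_dissimilarity(field_a, field_b, patches, bins):
    """Sum of per-patch symmetrized KL divergences (an assumed aggregation)."""
    total = 0.0
    for mask in patches:
        hist_a, _ = np.histogram(field_a[mask], bins=bins)
        hist_b, _ = np.histogram(field_b[mask], bins=bins)
        total += symmetrized_kl(hist_a, hist_b)
    return total


# Usage with synthetic fields standing in for two daily situations.
rng = np.random.default_rng(0)
day_a = rng.gamma(2.0, 3.0, size=(8, 8))
day_b = rng.gamma(2.0, 5.0, size=(8, 8))
patches = quadrant_patches(day_a.shape)
bins = np.linspace(0.0, 60.0, 13)  # assumed bin edges
print(patch_dissimilarity(day_a, day_b, patches, bins))
```

Because each patch is compared through its histogram rather than point by point, two fields with the same value distribution inside a zone are treated as close even when the values are arranged differently within that zone, which is one way such a measure can differ from the L2 distance.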
Subjects: Machine Learning (cs.LG); Atmospheric and Oceanic Physics (physics.ao-ph); Machine Learning (stat.ML)
Cite as: arXiv:2006.05603 [cs.LG]
  (or arXiv:2006.05603v1 [cs.LG] for this version)

Submission history

From: Emmanuel Biabiany
[v1] Wed, 10 Jun 2020 01:42:40 GMT (5356kb,D)
