Title: A Multi-disciplinary Ensemble Algorithm for Clustering Heterogeneous Datasets

Abstract: Clustering is a commonly used method for exploring and analysing data, where the primary objective is to group similar observations into clusters. In recent decades, several algorithms and methods have been developed for analysing clustered data. We observe that most of these techniques define a cluster deterministically, based on the attribute values, distance, and density of homogeneous and single-featured datasets. However, these definitions are not successful in adding clear semantic meaning to the clusters produced. Evolutionary operators and statistical and multi-disciplinary techniques may help in generating meaningful clusters. Based on this premise, we propose a new evolutionary clustering algorithm (ECAStar) based on social class ranking and meta-heuristic algorithms for stochastically analysing heterogeneous and multiple-featured datasets. The ECAStar is integrated with recombinational evolutionary operators, Levy flight optimisation, and some statistical techniques, such as quartiles and percentiles, as well as the Euclidean distance of the K-means algorithm. Experiments are conducted to evaluate the ECAStar against five conventional approaches: K-means (KM), K-meansPlusPlus (KMPlusPlus), expectation maximisation (EM), learning vector quantisation (LVQ), and the genetic algorithm for clusteringPlusPlus (GENCLUSTPlusPlus).
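
Illustration (not from the paper): the abstract names three generic building blocks, namely Euclidean nearest-centroid assignment as in K-means, Levy flight steps, and quartile statistics. The sketch below shows one common way each might be computed with NumPy. It is not the ECAStar algorithm. The function names, the use of Mantegna's method for the Levy step, and the default beta = 1.5 are all assumptions made for illustration only.

```python
import math
import numpy as np


def assign_to_nearest(points, centroids):
    """Label each point with the index of its closest centroid (Euclidean)."""
    # Pairwise differences via broadcasting: shape (n_points, n_centroids, n_dims).
    diffs = points[:, None, :] - centroids[None, :, :]
    dists = np.linalg.norm(diffs, axis=2)
    return np.argmin(dists, axis=1)


def levy_step(rng, shape, beta=1.5):
    """Draw heavy-tailed step lengths using Mantegna's method (assumed here)."""
    num = math.gamma(1 + beta) * math.sin(math.pi * beta / 2)
    den = math.gamma((1 + beta) / 2) * beta * 2 ** ((beta - 1) / 2)
    sigma_u = (num / den) ** (1 / beta)
    u = rng.normal(0.0, sigma_u, size=shape)
    v = rng.normal(0.0, 1.0, size=shape)
    return u / np.abs(v) ** (1 / beta)


def quartiles(values):
    """Return the 25th, 50th and 75th percentiles along the first axis."""
    return np.percentile(values, [25, 50, 75], axis=0)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    points = np.array([[0.0, 0.0], [0.0, 1.0], [10.0, 10.0], [10.0, 11.0]])
    centroids = np.array([[0.0, 0.5], [10.0, 10.5]])

    print("labels:", assign_to_nearest(points, centroids))
    print("per-feature quartiles:", quartiles(points))

    # Perturb the centroids with a small Levy step; the 0.01 scale is arbitrary.
    moved = centroids + 0.01 * levy_step(rng, centroids.shape)
    print("perturbed centroids:", moved)
```

How ECAStar actually combines these pieces with social class ranking and recombination operators is described in the paper itself, not in this sketch.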
Comments: 30 pages
Subjects: Machine Learning (cs.LG); Neural and Evolutionary Computing (cs.NE)
Journal reference: Neural Computing and Applications, 2021
DOI: 10.1007/s00521-020-05649-1
Cite as: arXiv:2102.08361 [cs.LG]
  (or arXiv:2102.08361v1 [cs.LG] for this version)

Submission history

From: Tarik A. Rashid
[v1] Fri, 1 Jan 2021 07:20:50 GMT (1769kb)
