Hierarchical clustering that takes advantage of both density-peak and density-connectivity

Ye Zhu*, Kai Ming Ting, Yuan Jin, Maia Angelova

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

Abstract

This paper focuses on density-based clustering, particularly the Density Peak (DP) algorithm and the density-connectivity-based algorithm DBSCAN, and proposes a new method that combines the individual strengths of the two to yield a density-based hierarchical clustering algorithm. We first formally define the types of clusters that DP and DBSCAN are designed to detect, and then identify the kinds of distributions for which DP and DBSCAN individually fail to detect all clusters in a dataset. These weaknesses motivate us to formally define a new kind of cluster and to propose a new method, called DC-HDP, that overcomes them and identifies clusters with arbitrary shapes and varied densities. In addition, the new method produces a richer clustering result in the form of a hierarchy, or dendrogram, which gives a better understanding of cluster structure. Our empirical evaluation shows that DC-HDP produces the best clustering results on 28 datasets in comparison with 8 state-of-the-art clustering algorithms.

Original language: English
Article number: 101871
Number of pages: 16
Journal: Information Systems
Volume: 103
Early online date: 28 Aug 2021
Publication status: Published - Jan 2022

Keywords

  • Density connectivity
  • Density peak
  • Density-based clustering
  • Hierarchical clustering
  • Local contrast
  • Varied density
