A distance scaling method to improve density-based clustering

Ye Zhu*, Kai Ming Ting, Maia Angelova

*Corresponding author for this work

Research output: Chapter in Book/Published conference output: Conference publication

Abstract

Density-based clustering can find clusters of arbitrary sizes and shapes while effectively separating noise. Despite this advantage over other types of clustering, it is well known that most density-based algorithms struggle to find clusters with varied densities. Recently, ReScale, a principled density-ratio preprocessing technique, has enabled density-based clustering algorithms to identify clusters with varied densities. However, because the technique is based on one-dimensional scaling, it performs poorly on datasets that require multi-dimensional scaling. In this paper, we propose a multi-dimensional scaling method, named DScale, which rescales based on the computed distance. It overcomes the key weakness of ReScale and requires one fewer parameter while maintaining the simplicity of the implementation. Our empirical evaluation shows that DScale has better clustering performance than ReScale for three existing density-based algorithms, i.e., DBSCAN, OPTICS and DP, on synthetic and real-world datasets.
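The abstract does not give the rescaling formula, so the following is only a minimal sketch of what neighbourhood-based distance rescaling could look like, not the authors' implementation. The function names, the neighbourhood radius `eta`, the dimensionality argument `dim`, and the scaling rule itself are all assumptions made for illustration: each point's distances are stretched when its `eta`-neighbourhood is crowded and shrunk when it is sparse, while the largest pairwise distance is left unchanged.

```python
import math


def pairwise_distances(points):
    """Euclidean distance matrix for a list of equal-length tuples."""
    n = len(points)
    return [[math.dist(points[i], points[j]) for j in range(n)] for i in range(n)]


def rescale_distances(dist, eta, dim):
    """Rescale each row of a distance matrix around its own point.

    Assumed rule (illustrative only, not taken from the abstract):
      r(x)    = (m / eta) * (|N_eta(x)| / n) ** (1 / dim)
      d'(x,y) = d(x,y) * r(x)                                   if d(x,y) <= eta
      d'(x,y) = (d(x,y) - eta) * (m - eta*r(x)) / (m - eta)
                + eta * r(x)                                    otherwise
    where m is the largest pairwise distance and N_eta(x) is the set of
    points within eta of x. The rescaled matrix is generally asymmetric.
    """
    n = len(dist)
    m = max(max(row) for row in dist)
    if not 0 < eta < m:
        raise ValueError("eta must lie strictly between 0 and the maximum distance")
    scaled = []
    for row in dist:
        # Neighbourhood size includes the point itself (distance 0).
        neighbours = sum(1 for d in row if d <= eta)
        r = (m / eta) * (neighbours / n) ** (1.0 / dim)
        new_row = []
        for d in row:
            if d <= eta:
                new_row.append(d * r)
            else:
                new_row.append((d - eta) * (m - eta * r) / (m - eta) + eta * r)
        scaled.append(new_row)
    return scaled


# Hypothetical 1-D data: a dense group near 0 and a sparse group from 5 to 8.
points = [(0.0,), (0.1,), (0.2,), (0.3,), (5.0,), (6.0,), (7.0,), (8.0,)]
dist = pairwise_distances(points)
scaled = rescale_distances(dist, eta=1.5, dim=1)
```

Under these assumptions, the rescaled matrix could be handed to any density-based algorithm that accepts precomputed distances. Because each row is rescaled around its own point, the matrix is not symmetric, which a downstream algorithm would need to tolerate.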

Original language: English
Title of host publication: Advances in Knowledge Discovery and Data Mining - 22nd Pacific-Asia Conference, PAKDD 2018, Proceedings
Editors: Geoffrey I. Webb, Dinh Phung, Mohadeseh Ganji, Lida Rashidi, Vincent S. Tseng, Bao Ho
Publisher: Springer-Verlag Italia Srl
Pages: 389-400
Number of pages: 12
ISBN (Print): 9783319930398
Publication status: Published - 17 Jun 2018
Event: 22nd Pacific-Asia Conference on Advances in Knowledge Discovery and Data Mining, PAKDD 2018 - Melbourne, Australia
Duration: 3 Jun 2018 to 6 Jun 2018

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 10939 LNAI
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 22nd Pacific-Asia Conference on Advances in Knowledge Discovery and Data Mining, PAKDD 2018
Country/Territory: Australia
City: Melbourne
Period: 3/06/18 to 6/06/18

Bibliographical note

Publisher Copyright:
© Springer International Publishing AG, part of Springer Nature 2018.

Keywords

  • Density-based clustering
  • Density-ratio
  • Scaling
  • Varied densities
