Density Peak Clustering Algorithm Based on the Mean of Local Density Information Entropy

2022,30(3):192-197
唐风扬, 覃仁超, 熊健
Southwest University of Science and Technology
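Abstract: The clustering results of the density peak clustering algorithm (DPC) are strongly affected by the distance threshold parameter dc. To address this problem, a method is proposed that introduces a local density capture range and applies weighted optimization using the mean of the local density information entropy (abbreviated as LDDPC). When the DPC algorithm selects an incorrect distance threshold dc, the method weights the relative distance of the maximum-density neighboring points to recover the correct number of classes and the correct cluster centers. Experimental results on classic data sets show that weighted optimization based on the mean of the local density information entropy can avoid the influence of the distance threshold dc on the clustering results of the DPC algorithm and improve classification accuracy.
Key words: clustering algorithm; density peaks; information entropy; weighting; local density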

Optimized Density Peaks Clustering Algorithm Based on Local Density Information Entropy

Abstract: To address the problem that the clustering result of the density peak clustering algorithm (DPC) is strongly affected by the distance threshold parameter dc, this paper proposes a method that combines a local density capture range with weighted optimization based on the mean value of the local density information entropy (abbreviated as LDDPC). When the DPC algorithm selects the wrong distance threshold dc, the relative distance of the maximum-density neighboring points is weighted so that the correct number of classes and the correct cluster centers are obtained again. Experimental results on classic data sets show that weighted optimization based on the mean value of the local density information entropy can avoid the influence of the distance threshold dc on the clustering results of the DPC algorithm and improve classification accuracy.
Key words: clustering algorithm; density peaks; information entropy; weighting; local density
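
To make the role of dc concrete, the following is a minimal sketch of the two quantities used by the standard DPC algorithm: the local density rho (here a hard cutoff count of neighbors within dc) and the relative distance delta (the distance to the nearest point of higher density). It does not reproduce the LDDPC weighting, which the abstract does not specify. The function names, the toy data, and the generic Shannon entropy helper are illustrative assumptions rather than the authors' definitions.

```python
import math


def dpc_quantities(points, dc):
    """Return (rho, delta) for standard DPC with a hard cutoff kernel."""
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]
    # Local density: number of other points strictly closer than dc.
    rho = [sum(1 for j in range(n) if j != i and dist[i][j] < dc)
           for i in range(n)]
    delta = []
    for i in range(n):
        higher = [dist[i][j] for j in range(n) if rho[j] > rho[i]]
        # A point with no denser neighbor conventionally gets its
        # largest distance to any other point.
        delta.append(min(higher) if higher else max(dist[i]))
    return rho, delta


def density_entropy(rho):
    """Shannon entropy of the normalized densities.

    Assumption: a generic entropy used only for illustration; it is
    not the paper's local density information entropy mean.
    """
    total = sum(rho)
    if total == 0:
        return 0.0
    probs = [r / total for r in rho if r > 0]
    return -sum(p * math.log(p) for p in probs)


# Hypothetical toy data: two well-separated groups of three points.
points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1),
          (5.0, 5.0), (5.1, 5.0), (5.0, 5.1)]

for dc in (0.12, 10.0):
    rho, delta = dpc_quantities(points, dc)
    print(dc, rho, [round(d, 3) for d in delta],
          round(density_entropy(rho), 3))
```

With the small dc, one point in each group has both a higher rho and a large delta, so two candidate centers stand out. With the overly large dc, every point has the same rho and the candidates become indistinguishable, which is the kind of sensitivity to dc that the abstract describes.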
Received: 2021-09-07
Funding: Key Research and Development Project of the Science and Technology Department of Sichuan Province (20ZDYF0978)