Abstract:
For datasets with uneven density distributions, the density peak clustering (DPC) algorithm is error-prone both in determining cluster centers and in assigning data points to clusters. To address these issues, this paper proposes a clustering algorithm based on sparsity factors and non-shared neighbors. The cutoff distance of each data point is adjusted dynamically according to its sparsity factor, and the geodesic distance is used to calculate the local density of each data point, so that the selection of cluster centers is less affected by sparsely distributed regions of the dataset. Based on the non-shared nearest neighbors of the data points, an inconsistency factor is calculated for the associated point pairs on the paths on which the cluster centers lie. The clustering result is obtained by removing the edges corresponding to the largest inconsistency factors from the minimum spanning tree. Experimental results show that the proposed algorithm outperforms the comparison algorithm.
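The final step described above, cutting edges of a minimum spanning tree to obtain clusters, can be illustrated with a minimal sketch. It is not the proposed algorithm. The abstract does not define the inconsistency factor, so plain Euclidean edge length stands in for it here. The sketch also cuts the highest-scoring edges anywhere on the tree rather than only on paths between cluster centers, and it assumes the number of clusters is given. The function names `minimum_spanning_tree` and `cut_largest_edges` are hypothetical.

```python
import math


def minimum_spanning_tree(points):
    """Prim's algorithm on the complete Euclidean graph.

    Returns a list of (weight, i, j) edges. The weight is an assumed
    stand-in for the paper's inconsistency factor.
    """
    n = len(points)
    if n == 0:
        return []
    in_tree = [False] * n
    best = [math.inf] * n  # cheapest known cost to connect each point to the tree
    parent = [-1] * n
    best[0] = 0.0
    edges = []
    for _ in range(n):
        # Pick the point outside the tree that is cheapest to attach.
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i])
        in_tree[u] = True
        if parent[u] >= 0:
            edges.append((best[u], parent[u], u))
        # Update the attachment cost of the remaining points.
        for v in range(n):
            if not in_tree[v]:
                d = math.dist(points[u], points[v])
                if d < best[v]:
                    best[v] = d
                    parent[v] = u
    return edges


def cut_largest_edges(points, n_clusters):
    """Drop the n_clusters - 1 highest-scoring MST edges and label the components."""
    edges = sorted(minimum_spanning_tree(points))
    kept = edges[: max(len(edges) - (n_clusters - 1), 0)]

    # Union-find over the kept edges yields the connected components.
    root = list(range(len(points)))

    def find(x):
        while root[x] != x:
            root[x] = root[root[x]]  # path halving
            x = root[x]
        return x

    for _, i, j in kept:
        root[find(i)] = find(j)

    # Map each component root to a consecutive cluster label.
    ids = {}
    return [ids.setdefault(find(i), len(ids)) for i in range(len(points))]


points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
labels = cut_largest_edges(points, n_clusters=2)
```

In this toy input the two groups of three points are joined by a single long tree edge, so removing that edge separates them. Replacing the edge weight with a density-aware score would be the natural place to plug in an inconsistency factor.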