Scikit Learn - Clustering Methods

Here, we will study the clustering methods in Sklearn, which help identify similarities among data samples.


Clustering methods, among the most useful unsupervised ML methods, are used to find similarity and relationship patterns among data samples. They then group those samples into clusters whose members have similar features. Clustering is important because it determines the intrinsic grouping within unlabeled data.

The Scikit-learn library provides the sklearn.cluster module to perform clustering of unlabeled data. This module offers the following clustering methods:

KMeans

This algorithm computes the centroids and iterates until they stop changing, that is, until it converges to a (locally) optimal set of centroids. It requires the number of clusters to be specified in advance, so it assumes that this number is already known. The main logic of the algorithm is to separate the samples into n groups of equal variance by minimizing a criterion known as inertia, the within-cluster sum of squared distances. The number of clusters is represented by ‘K’.
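
As a rough illustration of the inertia criterion described above, the sketch below fits KMeans on a small made-up data set and recomputes inertia by hand as the sum of squared distances from each sample to its assigned centroid. The data values, the variable names, and the n_init and random_state settings are assumptions chosen for the example, not part of the original text.

```python
import numpy as np
from sklearn.cluster import KMeans

# Toy data (made up for illustration): two well-separated groups of points.
X = np.array([[1.0, 1.0], [1.5, 2.0], [1.0, 1.5],
              [8.0, 8.0], [8.5, 9.0], [9.0, 8.0]])

# K is supplied up front through n_clusters.
km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)

# Inertia: sum of squared distances from each sample to its assigned centroid.
assigned_centers = km.cluster_centers_[km.labels_]
manual_inertia = float(np.sum((X - assigned_centers) ** 2))

print("labels:", km.labels_)
print("manual inertia:", manual_inertia)
print("km.inertia_:", km.inertia_)
```

The hand-computed value should agree with the fitted model's inertia_ attribute up to floating-point error.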

Scikit-learn provides the sklearn.cluster.KMeans class to perform K-Means clustering. When computing the cluster centers and the value of inertia, the sample_weight parameter of its fit method lets you assign more weight to some samples.
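
A minimal sketch of how sample_weight can shift a centroid, assuming a one-dimensional toy data set and a single cluster so that the centroid is simply the (weighted) mean. The data, the weights, and the variable names are hypothetical and chosen only to keep the arithmetic easy to follow.

```python
import numpy as np
from sklearn.cluster import KMeans

# One-dimensional toy data (made up for illustration).
X = np.array([[0.0], [1.0], [10.0]])

# With a single cluster, the centroid is the plain mean of the samples.
unweighted = KMeans(n_clusters=1, n_init=1, random_state=0).fit(X)

# Give the last sample eight times the weight of the others.
# Note that sample_weight is passed to fit(), not to the constructor.
weights = np.array([1.0, 1.0, 8.0])
weighted = KMeans(n_clusters=1, n_init=1, random_state=0).fit(
    X, sample_weight=weights
)

print("unweighted centroid:", unweighted.cluster_centers_.ravel())
print("weighted centroid:", weighted.cluster_centers_.ravel())
```

Without weights the centroid sits at the plain mean, 11 / 3. With the weights above it moves toward the heavily weighted sample, to the weighted mean (0 + 1 + 8 × 10) / 10 = 8.1.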

Affinity Propagation

This algorithm is based on the concept of ‘message passing’ between different pairs of samples until convergence. It does not require the number of clusters to be specified before running the algorithm. The algorithm has a time complexity of the order

