With SciPy how do I get clustering for k=? with doing hierarchical clustering

If you want to cut the tree at a specific level, then use:

fl = fcluster(cl,numclust,criterion='maxclust')

where cl is the output of your linkage method and numclust is the number of clusters you want to get.


Hierarchical clustering allows you to zoom in and out to get fine or coarse grained views of the clustering. So, it might not be clear in advance which level of the dendrogram to cut. A simple solution is to get the cluster membership at every level. It is also possible to select the desired number of clusters.

import numpy as np
from scipy import cluster
np.random.seed(23)
X = np.random.randn(20, 4)
Z = cluster.hierarchy.ward(X)
cutree_all = cluster.hierarchy.cut_tree(Z)
cutree1 = cluster.hierarchy.cut_tree(Z, n_clusters=[5, 10])
print("membership at all levels \n", cutree_all) 
print("membership for 5 and 10 clusters \n", cutree1)