k-means clustering 2. Quick start

Clustering with the k-means algorithm in Praat is done by selecting a PatternList and choosing To Categories.... In the requester that appears, you can specify the number of clusters (unique categories) sought. The cluster size ratio constraint (z) restricts the output such that cluster size(x) / cluster size(y) > z for all clusters x and y in the resulting set of clusters. Valid values of z are 0 < z <= 1: values near 0 impose practically no constraint on the cluster sizes, and a value of 1 tells the algorithm to attempt to create clusters of equal size. The size ratio constraint is enforced in a very naive fashion, by random reseeding. Since this can be rather time-consuming, an upper bound on the number of reseeds performed by the algorithm can be set with the parameter Maximum number of reseeds. Note, however, that there is normally no need to use the size ratio constraint: given well-distributed data, selecting the desired number of clusters will, on average, result in clusters of roughly equal size.
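The reseeding idea described above can be illustrated with a minimal Python sketch. This is not Praat's implementation: the function names (kmeans, size_ratio, kmeans_with_size_ratio), the fixed iteration cap, and the seeding by sampling data points are all assumptions made for illustration. The check also uses "smallest size / largest size >= z" rather than a strict inequality, so that z = 1 can be satisfied; that choice is likewise an assumption.

```python
import random


def kmeans(points, k, rng, iterations=50):
    """Plain Lloyd-style k-means on a list of equal-length tuples (illustrative only)."""
    centroids = rng.sample(points, k)  # random seeding from the data points
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assign each point to its nearest centroid (squared Euclidean distance).
        labels = [
            min(range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])))
            for p in points
        ]
        # Move each centroid to the mean of its members; keep it if the cluster is empty.
        new_centroids = []
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                new_centroids.append(tuple(sum(x) / len(members) for x in zip(*members)))
            else:
                new_centroids.append(centroids[c])
        if new_centroids == centroids:
            break  # converged
        centroids = new_centroids
    return labels


def size_ratio(labels, k):
    """Smallest cluster size divided by largest cluster size (0 if a cluster is empty)."""
    sizes = [labels.count(c) for c in range(k)]
    return min(sizes) / max(sizes)


def kmeans_with_size_ratio(points, k, z, max_reseeds, seed=0):
    """Rerun k-means from fresh random seeds until the size ratio reaches z
    or the (hypothetical) max_reseeds budget is used up."""
    rng = random.Random(seed)
    labels = kmeans(points, k, rng)
    reseeds = 0
    while size_ratio(labels, k) < z and reseeds < max_reseeds:
        labels = kmeans(points, k, rng)  # naive enforcement: simply try again
        reseeds += 1
    return labels, reseeds
```

Calling kmeans_with_size_ratio(points, 2, 0.5, 10) on a list of coordinate tuples returns the cluster labels and the number of reseeds used. If the budget runs out first, the last result is returned even though it may still violate the constraint, which shows why a value of z close to 1 can be costly on unevenly distributed data.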

© Ola Söder, May 29, 2008