Functions for clustering data.

Load this file with "load clustering".

function kmeanscluster (x : numerical, k : index)

  Cluster the rows of x into k clusters.

  It uses the algorithm proposed by Lloyd, and used by Steinhaus and MacQueen. The algorithm starts with a random partition into clusters. Then it computes the means of the clusters, and assigns each point to the cluster with the closest mean. It repeats this procedure until the assignment no longer changes.

  The function works for multi-dimensional x too. The means are then vector means, and the distance to the mean is the Euclidean distance.

  x : rows containing the data points
  k : number of clusters that should be used

  Returns j : indices of the clusters the rows should belong to.
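The loop described for kmeanscluster can be sketched in plain Python. This is only an illustration of the procedure under stated assumptions, not the library's implementation: the name kmeans_cluster, the seed and max_iterations parameters, the 0-based cluster indices, and the handling of empty clusters (re-seeding with a random data point) are all assumptions made for this example.

```python
import random


def kmeans_cluster(points, k, seed=0, max_iterations=1000):
    """Hypothetical sketch: cluster the rows of `points` into k clusters.

    Returns a list with one 0-based cluster index per row.
    """
    rng = random.Random(seed)
    n = len(points)
    dim = len(points[0])

    # Start with a random partition of the rows.
    labels = [rng.randrange(k) for _ in range(n)]

    for _ in range(max_iterations):
        # Compute the (vector) mean of each cluster.
        means = []
        for c in range(k):
            members = [points[i] for i in range(n) if labels[i] == c]
            if members:
                means.append(
                    [sum(p[d] for p in members) / len(members) for d in range(dim)]
                )
            else:
                # Assumption: an empty cluster is re-seeded with a random row.
                means.append(list(points[rng.randrange(n)]))

        # Assign each row to the cluster with the closest mean.
        # Squared Euclidean distance gives the same ordering as the distance.
        new_labels = [
            min(
                range(k),
                key=lambda c: sum((p[d] - means[c][d]) ** 2 for d in range(dim)),
            )
            for p in points
        ]

        # Stop as soon as the assignment no longer changes.
        if new_labels == labels:
            break
        labels = new_labels

    return labels


if __name__ == "__main__":
    data = [[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]]
    print(kmeans_cluster(data, 2))
```

Which of the two groups receives index 0 depends on the random start, so only the grouping itself is meaningful.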

function similaritycluster (S : numerical, k : index)

  Cluster data depending on the similarity matrix S.

  This clustering uses the eigenvectors belonging to the first k eigenvalues of S, and clusters the entries of these eigenvectors.

  S : similarity matrix (symmetric)
  k : number of clusters

  Returns j : indices of the clusters the rows should belong to.
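The eigenvector step can be illustrated with a small Python sketch. It assumes that the description refers to the eigenvectors belonging to the k eigenvalues of largest absolute value; the actual ordering used by similaritycluster is not stated here. The name spectral_embedding and its parameters are hypothetical, and power iteration with deflation is a technique chosen for this example, not necessarily what the library uses. The sketch stops at the embedding: row i of the result holds the i-th entries of the k eigenvectors, and these rows would then be clustered, for instance by a k-means step.

```python
import math
import random


def spectral_embedding(S, k, iterations=500, seed=0):
    """Hypothetical sketch: entries of k dominant eigenvectors of symmetric S.

    Returns n rows with k entries each; row i collects the i-th entry of
    every eigenvector found.
    """
    n = len(S)
    A = [row[:] for row in S]  # work on a copy, deflation modifies it
    rng = random.Random(seed)
    vectors = []

    for _ in range(k):
        # Random normalized start vector.
        v = [rng.gauss(0.0, 1.0) for _ in range(n)]
        norm = math.sqrt(sum(c * c for c in v))
        v = [c / norm for c in v]

        # Power iteration: repeatedly apply A and renormalize.
        for _ in range(iterations):
            w = [sum(A[i][j] * v[j] for j in range(n)) for i in range(n)]
            norm = math.sqrt(sum(c * c for c in w))
            if norm == 0.0:
                break
            v = [c / norm for c in w]

        # Rayleigh quotient gives the eigenvalue estimate for v.
        Av = [sum(A[i][j] * v[j] for j in range(n)) for i in range(n)]
        lam = sum(v[i] * Av[i] for i in range(n))
        vectors.append(v)

        # Deflation: remove the found component so the next pass
        # converges to the next dominant eigenvector.
        for i in range(n):
            for j in range(n):
                A[i][j] -= lam * v[i] * v[j]

    return [[vec[i] for vec in vectors] for i in range(n)]


if __name__ == "__main__":
    S = [
        [1.0, 0.9, 0.1, 0.1],
        [0.9, 1.0, 0.1, 0.1],
        [0.1, 0.1, 1.0, 0.9],
        [0.1, 0.1, 0.9, 1.0],
    ]
    for row in spectral_embedding(S, 2):
        print(row)
```

In this example the first two rows of the embedding coincide, as do the last two, so a subsequent clustering of the rows separates items 1, 2 from items 3, 4. Power iteration converges slowly when eigenvalues are close in absolute value, so a library eigenvalue routine is preferable in practice.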

function eigencluster (x : numerical, k : index)

  Cluster the rows of x into k clusters.

  This algorithm builds the similarity matrix S, which contains the Euclidean distance between each pair of rows in x. It then uses the function similaritycluster to obtain the clustering from this similarity matrix.

  x : rows containing the data points
  k : number of clusters that should be used

  Returns j : indices of the clusters the rows should belong to.
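The matrix that eigencluster hands to similaritycluster can be sketched as follows. This is a minimal Python illustration of the pairwise Euclidean distance matrix only; the name distance_matrix is hypothetical, and the sketch does not perform the clustering step itself.

```python
import math


def distance_matrix(points):
    """Hypothetical sketch: matrix of Euclidean distances between rows."""
    n = len(points)
    return [
        [
            math.sqrt(sum((a - b) ** 2 for a, b in zip(points[i], points[j])))
            for j in range(n)
        ]
        for i in range(n)
    ]


if __name__ == "__main__":
    x = [[0, 0], [3, 4], [6, 8]]
    for row in distance_matrix(x):
        print(row)
```

The result is symmetric with zeros on the diagonal, which matches the requirement that S be symmetric.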