Load this file with "load clustering".

Functions for clustering data.

function kmeanscluster (x : numerical, k : index)

Cluster the rows of x into k clusters.

It uses the algorithm proposed by Lloyd, and used by Steinhaus and MacQueen. The algorithm starts with a random partition of the rows into clusters. Then it computes the means of the clusters, and assigns each point to the cluster with the closest mean. It repeats this procedure until there are no changes.

The function works for multi-dimensional x too. The means are then vector means, and the distance to the mean is the Euclidean distance.

x : rows containing the data points
k : number of clusters that should be used

Returns j : indices of the clusters the rows should belong to.
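
The loop described for kmeanscluster can be sketched outside of this package. The following Python code is an illustration only, not the implementation behind kmeanscluster(). The names kmeans_cluster and sq_dist, the seed and max_iter parameters, the balanced random start, and the rule that an empty cluster keeps its previous mean are all assumptions made for this sketch. Labels are 0-based here.

```python
import random


def sq_dist(p, q):
    """Squared Euclidean distance between two points of equal length."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans_cluster(rows, k, seed=0, max_iter=1000):
    """Lloyd-style clustering of rows into k clusters (hypothetical helper).

    Returns (labels, means), where labels[i] is the 0-based cluster of rows[i].
    """
    rng = random.Random(seed)
    n, dim = len(rows), len(rows[0])

    # Random start: a shuffled, balanced assignment of rows to clusters.
    labels = [i % k for i in range(n)]
    rng.shuffle(labels)
    means = [[0.0] * dim for _ in range(k)]

    for _ in range(max_iter):
        # Mean step: vector mean of every non-empty cluster.
        for c in range(k):
            members = [rows[i] for i in range(n) if labels[i] == c]
            if members:  # assumption: an empty cluster keeps its old mean
                means[c] = [sum(col) / len(members) for col in zip(*members)]

        # Assignment step: each row goes to the cluster with the closest mean.
        new_labels = [
            min(range(k), key=lambda c: sq_dist(p, means[c])) for p in rows
        ]
        if new_labels == labels:  # no changes, so the loop stops
            break
        labels = new_labels

    return labels, means
```

A call such as kmeans_cluster([[0.0, 0.0], [0.0, 1.0], [9.0, 9.0]], 2) returns one label per row together with the final means. The result depends on the random start, so different seeds may give different partitions.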

function similaritycluster (S : numerical, k : index)

Cluster data depending on the similarity matrix S.

This clustering uses the first k eigenvalues of S, and clusters the entries of the corresponding eigenvectors.

S : similarity matrix (symmetric)
k : number of clusters

Returns j : indices of the clusters the rows should belong to.

function eigencluster (x : numerical, k : index)

Cluster the rows of x into k clusters.

This algorithm builds the similarity matrix S, which contains the Euclidean distances between each pair of rows in x. Then it uses the function similaritycluster() to get the clustering from the similarity matrix.

x : rows containing the data points
k : number of clusters that should be used

Returns j : indices of the clusters the rows should belong to.
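
The matrix that eigencluster is described as building can be illustrated with a short Python sketch. It covers only the construction of the pairwise Euclidean distance matrix, not the eigenvalue step. The name distance_matrix is hypothetical and is not part of this package.

```python
import math


def distance_matrix(rows):
    """Symmetric matrix S with S[i][j] = Euclidean distance of rows i and j."""
    return [[math.dist(p, q) for q in rows] for p in rows]


# Three points in the plane; the first and third coincide.
S = distance_matrix([[0.0, 0.0], [3.0, 4.0], [0.0, 0.0]])
```

The matrix has zeros on the diagonal and is symmetric, which matches the requirement that the S passed to similaritycluster() is symmetric.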