Clustering Data

Functions for clustering data.

Load this file with "load clustering".

function kmeanscluster (x : numerical, k : index)

  Clusters the rows of x into k clusters.
 
  It uses the algorithm proposed by Lloyd, which was also used by
  Steinhaus and MacQueen. The algorithm starts with a random partition
  of the points into clusters. Then it computes the mean of each
  cluster and assigns each point to the cluster with the closest
  mean. It repeats this procedure until the assignment no longer
  changes.
 
  The function also works for multi-dimensional x. The means are then
  vector means, and the distance to the mean is the Euclidean
  distance.
 
  x : rows containing the data points
  k : number of clusters that should be used
 
  Returns j : indices of the clusters the rows should belong to.
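
  The loop described above can be sketched in plain Python. This is
  only an illustration of Lloyd's iteration, not the code behind
  kmeanscluster. The name lloyd_kmeans, the seed and max_iter
  parameters, the handling of empty clusters, and the 0-based cluster
  indices are all assumptions made for this sketch.

```python
import random

def lloyd_kmeans(rows, k, seed=0, max_iter=100):
    """Hypothetical helper: Lloyd's iteration on a list of equal-length rows.

    Returns one cluster index (0-based, an assumption of this sketch) per row.
    Requires len(rows) >= k.
    """
    rng = random.Random(seed)
    n = len(rows)
    # Random starting partition; labels 0..k-1 each occur at least once.
    labels = [i if i < k else rng.randrange(k) for i in range(n)]
    rng.shuffle(labels)
    for _ in range(max_iter):
        # Step 1: compute the (vector) mean of every cluster.
        means = []
        for c in range(k):
            members = [rows[i] for i in range(n) if labels[i] == c]
            if not members:
                # Assumption: an empty cluster is re-seeded with a random point.
                members = [rows[rng.randrange(n)]]
            means.append([sum(col) / len(members) for col in zip(*members)])
        # Step 2: assign every row to the cluster with the closest mean
        # (squared Euclidean distance gives the same ordering).
        new_labels = [
            min(range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(row, means[c])))
            for row in rows
        ]
        # Stop as soon as the assignment no longer changes.
        if new_labels == labels:
            break
        labels = new_labels
    return labels

# Two well-separated groups of three points each.
points = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.2],
          [5.0, 5.1], [5.2, 4.9], [4.9, 5.0]]
labels = lloyd_kmeans(points, 2)
```

  For this data the first three rows and the last three rows should
  end up in different clusters. Which group receives which index
  depends on the random start.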

function similaritycluster (S : numerical, k : index)

  Clusters data based on the similarity matrix S.
 
  This clustering uses the first k eigenvalues of S and clusters
  the entries of the corresponding eigenvectors.
 
  S : similarity matrix (symmetric)
  k : number of clusters
 
  Returns j : indices of the clusters the rows should belong to.

function eigencluster (x : numerical, k : index)

  Clusters the rows of x into k clusters.
 
  This algorithm builds the similarity matrix S, which contains the
  Euclidean distances between each pair of rows of x. It then calls
  the function similaritycluster to obtain the clustering from this
  matrix.
 
  x : rows containing the data points
  k : number of clusters that should be used
 
  Returns j : indices of the clusters the rows should belong to.
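
  The matrix of pairwise Euclidean distances described above can be
  sketched in plain Python as follows. This only illustrates the
  construction of S. The name distance_matrix is hypothetical, and the
  eigenvalue step performed by similaritycluster is not shown.

```python
import math

def distance_matrix(rows):
    """Hypothetical helper: S[i][j] is the Euclidean distance
    between row i and row j of the data."""
    return [[math.dist(a, b) for b in rows] for a in rows]

# Three points on a line through the origin, spaced 5 apart.
S = distance_matrix([[0.0, 0.0], [3.0, 4.0], [6.0, 8.0]])
```

  The result is symmetric with zeros on the diagonal, which matches
  the symmetry that similaritycluster expects of S.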

Examples