cluster helpers

Data Lab helpers for clustering.

cluster.constructOutlines(x, y, clusterlabels)

Construct the convex hull (outline) of points in (x,y) feature space.

Parameters: x, y – Locations of points in (x,y) feature space (e.g., RA & Dec).
Returns: hull – The convex hull of points (x,y), an instance of scipy.spatial.qhull.ConvexHull.
Return type: instance


Given x & y coordinates as 1d sequences:

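import numpy as np
import matplotlib.pyplot as plt

points = np.vstack((x, y)).T  # 2-D array of shape (N, 2)
hull = constructOutlines(x, y, clusterlabels)  # third argument is required by the signature above
plt.plot(points[hull.vertices, 0], points[hull.vertices, 1], 'r-', lw=2)  # plot the hull outline
closing = hull.vertices[[-1, 0]]  # indices of the last and first hull vertices
plt.plot(points[closing, 0], points[closing, 1], 'r-', lw=2)  # draw the segment that closes the outline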
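
The outline construction can also be tried without Data Lab. The following is a minimal sketch that calls scipy.spatial.ConvexHull directly on synthetic points; it assumes the helper wraps this class (as the documented return type suggests) and does not reproduce the helper itself. The random data and the names closed and outline are illustrative only.

```python
import numpy as np
from scipy.spatial import ConvexHull

# Synthetic stand-in for real coordinates such as RA & Dec (illustrative data only).
rng = np.random.default_rng(0)
x = rng.normal(size=200)
y = rng.normal(size=200)

points = np.column_stack((x, y))  # shape (200, 2)
hull = ConvexHull(points)

# For 2-D input, hull.vertices lists the indices of the hull points in
# counterclockwise order. Appending the first index closes the polygon,
# so a single plot call can draw the full outline.
closed = np.append(hull.vertices, hull.vertices[0])
outline = points[closed]  # shape (n_vertices + 1, 2); first row equals last row
```

Passing outline[:, 0] and outline[:, 1] to a single plt.plot call then draws the closed hull, with no separate call for the closing segment.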

cluster.findClusters(x, y, method='MiniBatchKMeans', **kwargs)

Find 2D clusters from x & y data.

Parameters:

  • x, y – Locations of points in (x,y) feature space, e.g., RA & Dec, but x & y need not be spatial in nature.
  • method (str) – Cluster-finding method from sklearn.cluster to use. Default: ‘MiniBatchKMeans’ (a streaming implementation of KMeans), which is very fast but not the most robust. ‘DBSCAN’ is much more robust, but much slower. For other methods, consult sklearn.cluster.
  • **kwargs

    Any other keyword arguments are passed to the cluster-finding method. If method=‘MiniBatchKMeans’ or ‘KMeans’, n_clusters (the integer number of clusters to find) must be passed, e.g.

    clusters = findClusters(x,y,method='MiniBatchKMeans',n_clusters=3)
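
The dispatch described above (look up a class in sklearn.cluster by name, forward the keyword arguments, fit on the stacked points) might look roughly like the sketch below. This is an assumption about the mechanism, not the helper's actual source: find_clusters_sketch is a hypothetical name, and since this section does not document the return value, the sketch simply returns the fitted estimator. The synthetic blobs and the eps, min_samples, n_init, and random_state values are illustrative choices.

```python
import numpy as np
import sklearn.cluster


def find_clusters_sketch(x, y, method="MiniBatchKMeans", **kwargs):
    """Hypothetical stand-in for findClusters: fit an sklearn.cluster estimator."""
    points = np.column_stack((x, y))              # shape (N, 2)
    estimator_class = getattr(sklearn.cluster, method)  # e.g. sklearn.cluster.DBSCAN
    estimator = estimator_class(**kwargs)         # kwargs go straight to the estimator
    estimator.fit(points)
    return estimator                              # assumption: return the fitted estimator


# Three well-separated synthetic blobs (illustrative data only).
rng = np.random.default_rng(42)
centers = np.array([[0.0, 0.0], [10.0, 10.0], [-10.0, 10.0]])
points = np.vstack([c + rng.normal(scale=0.5, size=(100, 2)) for c in centers])
x, y = points[:, 0], points[:, 1]

# KMeans-type methods are told how many clusters to find.
km = find_clusters_sketch(x, y, method="MiniBatchKMeans",
                          n_clusters=3, n_init=3, random_state=0)

# DBSCAN takes a neighborhood radius and a density threshold instead,
# and labels noise points as -1.
db = find_clusters_sketch(x, y, method="DBSCAN", eps=1.0, min_samples=5)

km_labels = km.labels_  # one integer label per input point
db_labels = db.labels_
```

The two calls show the practical difference between the methods: the KMeans variants are given the number of clusters, while DBSCAN infers it from the point density set by eps and min_samples.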