A bisecting k-means algorithm based on the paper “A comparison of
document clustering techniques” by Steinbach, Karypis, and Kumar,
with modification to fit Spark.
The algorithm starts from a single cluster that contains all points.
Iteratively it finds divisible clusters on the bottom level and
bisects each of them using k-means, until there are k leaf
clusters in total or no leaf clusters are divisible.
The bisecting steps of clusters on the same level are grouped
together to increase parallelism. If bisecting all divisible
clusters on the bottom level would result in more than k leaf
clusters, larger clusters get higher priority.
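The level-by-level procedure described above can be illustrated with a small single-machine sketch. This is not Spark's implementation: the helper names (`sq_dist`, `mean`, `bisect`, `bisecting_kmeans`) are hypothetical, and "divisible" is assumed here to mean simply that a cluster holds at least two distinct points.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(c) / n for c in zip(*points))


def bisect(points, rng, iters=10):
    """Split one cluster in two with plain 2-means (Lloyd's algorithm)."""
    # Start from two distinct points so both halves are non-empty.
    centers = rng.sample(sorted(set(points)), 2)
    left, right = [], []
    for _ in range(iters):
        new_left = [p for p in points
                    if sq_dist(p, centers[0]) <= sq_dist(p, centers[1])]
        new_right = [p for p in points
                     if sq_dist(p, centers[0]) > sq_dist(p, centers[1])]
        if not new_left or not new_right:
            break
        left, right = new_left, new_right
        centers = [mean(left), mean(right)]
    return left, right


def bisecting_kmeans(points, k, seed=0):
    """Return a list of leaf clusters (each a list of points)."""
    rng = random.Random(seed)
    root = list(points)
    leaves = [root]   # all current leaf clusters
    bottom = [root]   # clusters created on the most recent level
    while len(leaves) < k:
        # Assumption: a cluster is divisible if it has two distinct points.
        divisible = [c for c in bottom if len(set(c)) >= 2]
        if not divisible:
            break
        # Larger clusters get priority if splitting all would exceed k.
        divisible.sort(key=len, reverse=True)
        to_split = divisible[: k - len(leaves)]
        chosen = {id(c) for c in to_split}
        leaves = [c for c in leaves if id(c) not in chosen]
        bottom = []
        for cluster in to_split:
            bottom.extend(bisect(cluster, rng))
        leaves.extend(bottom)
    return leaves


if __name__ == "__main__":
    data = [(0.0, 0.0), (1.0, 1.0), (9.0, 8.0), (8.0, 9.0)]
    for cluster in bisecting_kmeans(data, k=2):
        print(cluster)
```

In Spark the splits on one level run together as a distributed job; the sequential loop over `to_split` here only mirrors the grouping, not the parallelism.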
New in version 2.0.0.
See the original paper:
Steinbach, M. et al. “A Comparison of Document Clustering Techniques.”
KDD Workshop on Text Mining, 2000.
train(rdd[, k, maxIterations, …])
Runs the bisecting k-means algorithm and returns the model.
rdd: Training points as an RDD of Vector or convertible sequence types.
k: The desired number of leaf clusters. The actual number could
be smaller if there are no divisible leaf clusters.
maxIterations: Maximum number of iterations allowed to split clusters.
minDivisibleClusterSize: Minimum number of points (if >= 1.0) or the
minimum proportion of points (if < 1.0) of a divisible cluster.
seed: Random seed value for cluster initialization.
(default: -1888008604 from classOf[BisectingKMeans].getName.##)