Package org.apache.spark.ml.clustering
Class descriptions

A bisecting k-means algorithm based on the paper "A comparison of document clustering techniques" by Steinbach, Karypis, and Kumar, with modification to fit Spark.
Model fitted by BisectingKMeans.
Common params for BisectingKMeans and BisectingKMeansModel.
Summary of BisectingKMeans.
Helper class for storing model data.
Summary of clustering algorithms.
Distributed model fitted by LDA.
ExpectationAggregator computes the partial expectation results.
Gaussian Mixture clustering.
Multivariate Gaussian Mixture Model (GMM) consisting of k Gaussians, where points are drawn from each Gaussian i with probability weights(i).
Common params for GaussianMixture and GaussianMixtureModel.
Summary of GaussianMixture.
A writer for KMeans that handles the "internal" (or default) format.
K-means clustering with support for k-means|| initialization proposed by Bahmani et al.
KMeansAggregator computes the distances and updates the centers for blocks in sparse or dense matrix in an online fashion.
Model fitted by KMeans.
Common params for KMeans and KMeansModel.
Summary of KMeans.
Latent Dirichlet Allocation (LDA), a topic model designed for text documents.
Model fitted by LDA.
Local (non-distributed) model fitted by LDA.
A writer for KMeans that handles the "pmml" format.
Power Iteration Clustering (PIC), a scalable graph clustering algorithm developed by Lin and Cohen.
Common params for PowerIterationClustering.
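
The Gaussian Mixture Model description above says that points are drawn from each Gaussian i with probability weights(i). The sketch below illustrates that generative process using only the Java standard library. It is not Spark's GaussianMixtureModel API: the class name MixtureSketch, its method names, and the one-dimensional simplification are all assumptions made for illustration, and the weights are assumed to sum to 1.

```java
import java.util.Random;

// Illustrative only: a hypothetical helper, NOT part of org.apache.spark.ml.clustering.
final class MixtureSketch {

    // Pick component i with probability weights[i] (weights are assumed to sum to 1).
    static int sampleComponent(double[] weights, Random rng) {
        double u = rng.nextDouble();
        double cumulative = 0.0;
        for (int i = 0; i < weights.length; i++) {
            cumulative += weights[i];
            if (u < cumulative) {
                return i;
            }
        }
        // Guard against floating-point round-off in the cumulative sum.
        return weights.length - 1;
    }

    // Draw one point: choose a Gaussian, then sample from it (1-D for brevity).
    static double samplePoint(double[] weights, double[] means, double[] stdDevs, Random rng) {
        int i = sampleComponent(weights, rng);
        return means[i] + stdDevs[i] * rng.nextGaussian();
    }

    public static void main(String[] args) {
        double[] weights = {0.7, 0.3};
        double[] means = {0.0, 10.0};
        double[] stdDevs = {1.0, 1.0};
        Random rng = new Random(42L);

        int n = 10000;
        int nearSecondMean = 0;
        for (int j = 0; j < n; j++) {
            if (samplePoint(weights, means, stdDevs, rng) > 5.0) {
                nearSecondMean++;
            }
        }
        // With well-separated means, this fraction should land close to weights[1].
        System.out.println("fraction near second mean: " + (double) nearSecondMean / n);
    }
}
```

Fitting a mixture reverses this process: given only the points, it estimates the weights, means, and covariances that most plausibly generated them.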