Class MatrixFactorizationModel

Object
org.apache.spark.mllib.recommendation.MatrixFactorizationModel
All Implemented Interfaces:
Serializable, org.apache.spark.internal.Logging, Saveable, scala.Serializable

public class MatrixFactorizationModel extends Object implements Saveable, scala.Serializable, org.apache.spark.internal.Logging
Model representing the result of matrix factorization.

param: rank Rank for the features in this model.
param: userFeatures RDD of tuples where each tuple represents the userId and the features computed for this user.
param: productFeatures RDD of tuples where each tuple represents the productId and the features computed for this product.

Note:
If you create the model directly using the constructor, be aware that fast prediction requires cached user/product features and their associated partitioners.
  • Constructor Details

    • MatrixFactorizationModel

      public MatrixFactorizationModel(int rank, RDD<scala.Tuple2<Object,double[]>> userFeatures, RDD<scala.Tuple2<Object,double[]>> productFeatures)
  • Method Details

    • load

      public static MatrixFactorizationModel load(SparkContext sc, String path)
      Load a model from the given path.

      The model should have been saved by Saveable.save.

      Parameters:
      sc - Spark context used for loading model files.
      path - Path specifying the directory to which the model was saved.
      Returns:
      Model instance
    • rank

      public int rank()
    • userFeatures

      public RDD<scala.Tuple2<Object,double[]>> userFeatures()
    • productFeatures

      public RDD<scala.Tuple2<Object,double[]>> productFeatures()
    • predict

      public double predict(int user, int product)
      Predict the rating of one user for one product.
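As a rough illustration of what a single prediction means in a matrix factorization model, the predicted rating can be thought of as the inner product of the user's and the product's feature vectors, each of length rank. The sketch below is plain Java with no Spark dependency; the class name, method name, and feature values are hypothetical and are not part of this API.

```java
public class PredictSketch {
    // Hypothetical helper: scores a (user, product) pair as the inner
    // product of two feature vectors that share the same rank.
    public static double predict(double[] userFeatures, double[] productFeatures) {
        if (userFeatures.length != productFeatures.length) {
            throw new IllegalArgumentException("feature vectors must have the same rank");
        }
        double score = 0.0;
        for (int i = 0; i < userFeatures.length; i++) {
            score += userFeatures[i] * productFeatures[i];
        }
        return score;
    }

    public static void main(String[] args) {
        double[] user = {0.5, 1.0, -2.0};    // made-up rank-3 user factors
        double[] product = {2.0, 3.0, 0.5};  // made-up rank-3 product factors
        // 0.5*2.0 + 1.0*3.0 + (-2.0)*0.5 = 3.0
        System.out.println(predict(user, product));
    }
}
```

In the real model the two vectors would come from userFeatures() and productFeatures() rather than from literals.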
    • predict

      public RDD<Rating> predict(RDD<scala.Tuple2<Object,Object>> usersProducts)
      Predict the rating of many users for many products. The output RDD has one element for each element in the input RDD (including all duplicates), except for pairs whose user or product is missing from the training set.

      Parameters:
      usersProducts - RDD of (user, product) pairs.
      Returns:
      RDD of Ratings.
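The output-size rule above (duplicates kept, pairs with an unknown user or product dropped) can be sketched without Spark. The example below uses in-memory maps in place of the feature RDDs; the Scored class, the predictAll method, and all IDs and values are hypothetical stand-ins, not part of this API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class BatchPredictSketch {
    // Hypothetical stand-in for a (user, product, rating) triple.
    public static final class Scored {
        public final int user;
        public final int product;
        public final double rating;

        public Scored(int user, int product, double rating) {
            this.user = user;
            this.product = product;
            this.rating = rating;
        }
    }

    // Emits one result per input pair, in input order, skipping pairs
    // whose user or product has no feature vector.
    public static List<Scored> predictAll(List<int[]> usersProducts,
                                          Map<Integer, double[]> userFeatures,
                                          Map<Integer, double[]> productFeatures) {
        List<Scored> out = new ArrayList<>();
        for (int[] pair : usersProducts) {
            double[] u = userFeatures.get(pair[0]);
            double[] p = productFeatures.get(pair[1]);
            if (u == null || p == null) {
                continue; // unknown user or product: no output element
            }
            double score = 0.0;
            for (int i = 0; i < u.length; i++) {
                score += u[i] * p[i];
            }
            out.add(new Scored(pair[0], pair[1], score));
        }
        return out;
    }

    public static void main(String[] args) {
        Map<Integer, double[]> users = new HashMap<>();
        users.put(1, new double[]{1.0, 0.0});
        Map<Integer, double[]> products = new HashMap<>();
        products.put(10, new double[]{2.0, 5.0});

        // (1, 10) appears twice; user 2 and product 99 are unknown.
        List<int[]> pairs = Arrays.asList(
            new int[]{1, 10}, new int[]{1, 10}, new int[]{2, 10}, new int[]{1, 99});

        System.out.println(predictAll(pairs, users, products).size()); // prints 2
    }
}
```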
    • predict

      public JavaRDD<Rating> predict(JavaPairRDD<Integer,Integer> usersProducts)
      Java-friendly version of MatrixFactorizationModel.predict.
      Parameters:
      usersProducts - pair RDD of (user, product) pairs.
      Returns:
      JavaRDD of Ratings.
    • recommendProducts

      public Rating[] recommendProducts(int user, int num)
      Recommends products to a user.

      Parameters:
      user - the user to recommend products to
      num - how many products to return. The number returned may be less than this.
      Returns:
      Rating objects, each of which contains the given user ID, a product ID, and a "score" in the rating field. Each represents one recommended product, and they are sorted by score, decreasing. The first returned is the one predicted to be most strongly recommended to the user. The score is an opaque value that indicates how strongly recommended the product is.
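The ordering and "may be less than this" behavior described above can be illustrated with a small top-N sketch: score every known product for one user, sort by score in decreasing order, and keep at most num results. This is plain Java under the assumption that the score is the inner product of the feature vectors; the class name, method name, IDs, and values are hypothetical, and Spark's own implementation may differ in how it computes and selects the top results.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class TopNSketch {
    // Returns up to `num` (productId, score) entries, highest score first.
    public static List<Map.Entry<Integer, Double>> recommend(
            double[] userFeatures, Map<Integer, double[]> productFeatures, int num) {
        List<Map.Entry<Integer, Double>> scored = new ArrayList<>();
        for (Map.Entry<Integer, double[]> e : productFeatures.entrySet()) {
            double[] p = e.getValue();
            double score = 0.0;
            for (int i = 0; i < userFeatures.length; i++) {
                score += userFeatures[i] * p[i];
            }
            scored.add(new AbstractMap.SimpleEntry<>(e.getKey(), score));
        }
        // Sort by score, decreasing.
        scored.sort(Map.Entry.<Integer, Double>comparingByValue().reversed());
        // Fewer than `num` entries come back when fewer products exist.
        int limit = Math.max(0, Math.min(num, scored.size()));
        return new ArrayList<>(scored.subList(0, limit));
    }

    public static void main(String[] args) {
        Map<Integer, double[]> products = new HashMap<>();
        products.put(10, new double[]{1.0, 0.0}); // score 2.0 for the user below
        products.put(20, new double[]{0.0, 1.0}); // score 1.0
        products.put(30, new double[]{1.0, 1.0}); // score 3.0
        double[] user = {2.0, 1.0};

        List<Map.Entry<Integer, Double>> top = recommend(user, products, 2);
        System.out.println(top.get(0).getKey()); // prints 30
        System.out.println(top.get(1).getKey()); // prints 10
        System.out.println(recommend(user, products, 5).size()); // prints 3
    }
}
```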
    • recommendUsers

      public Rating[] recommendUsers(int product, int num)
      Recommends users to a product. That is, this returns users who are most likely to be interested in a product.

      Parameters:
      product - the product to recommend users to
      num - how many users to return. The number returned may be less than this.
      Returns:
      Rating objects, each of which contains a user ID, the given product ID, and a "score" in the rating field. Each represents one recommended user, and they are sorted by score, decreasing. The first returned is the one predicted to be most strongly recommended to the product. The score is an opaque value that indicates how strongly recommended the user is.
    • save

      public void save(SparkContext sc, String path)
      Save this model to the given path.

      This saves:
      - human-readable (JSON) model metadata to path/metadata/
      - Parquet formatted data to path/data/

      The model may be loaded using Loader.load.

      Specified by:
      save in interface Saveable
      Parameters:
      sc - Spark context used to save model data.
      path - Path specifying the directory in which to save this model. If the directory already exists, this method throws an exception.
    • recommendProductsForUsers

      public RDD<scala.Tuple2<Object,Rating[]>> recommendProductsForUsers(int num)
      Recommends top products for all users.

      Parameters:
      num - how many products to return for every user.
      Returns:
      [(Int, Array[Rating])] objects, where every tuple contains a user ID and an array of Rating objects, each of which contains the same user ID, a recommended product ID, and a "score" in the rating field. The semantics of the score are the same as in the recommendProducts API.
    • recommendUsersForProducts

      public RDD<scala.Tuple2<Object,Rating[]>> recommendUsersForProducts(int num)
      Recommends top users for all products.

      Parameters:
      num - how many users to return for every product.
      Returns:
      [(Int, Array[Rating])] objects, where every tuple contains a product ID and an array of Rating objects, each of which contains a recommended user ID, the same product ID, and a "score" in the rating field. The semantics of the score are the same as in the recommendUsers API.