org.apache.spark.mllib.recommendation
Class MatrixFactorizationModel

Object
  extended by org.apache.spark.mllib.recommendation.MatrixFactorizationModel
All Implemented Interfaces:
java.io.Serializable, Logging, Saveable

public class MatrixFactorizationModel
extends Object
implements Saveable, scala.Serializable, Logging

Model representing the result of matrix factorization.

Note: If you create the model directly using the constructor, be aware that fast prediction requires cached user/product features and their associated partitioners.

Parameters:
rank - Rank for the features in this model.
userFeatures - RDD of tuples where each tuple represents the userId and the features computed for this user.
productFeatures - RDD of tuples where each tuple represents the productId and the features computed for this product.

See Also:
Serialized Form

Constructor Summary
MatrixFactorizationModel(int rank, RDD<scala.Tuple2<Object,double[]>> userFeatures, RDD<scala.Tuple2<Object,double[]>> productFeatures)
           
 
Method Summary
static MatrixFactorizationModel load(SparkContext sc, String path)
           
 double predict(int user, int product)
          Predict the rating of one user for one product.
 JavaRDD<Rating> predict(JavaPairRDD<Integer,Integer> usersProducts)
          Java-friendly version of MatrixFactorizationModel.predict.
 RDD<Rating> predict(RDD<scala.Tuple2<Object,Object>> usersProducts)
          Predict the rating of many users for many products.
 RDD<scala.Tuple2<Object,double[]>> productFeatures()
           
 int rank()
           
 Rating[] recommendProducts(int user, int num)
          Recommends products to a user.
 RDD<scala.Tuple2<Object,Rating[]>> recommendProductsForUsers(int num)
          Recommends the top products (up to num) for every user.
 Rating[] recommendUsers(int product, int num)
          Recommends users to a product.
 RDD<scala.Tuple2<Object,Rating[]>> recommendUsersForProducts(int num)
          Recommends the top users (up to num) for every product.
 void save(SparkContext sc, String path)
          Save this model to the given path.
 RDD<scala.Tuple2<Object,double[]>> userFeatures()
           
 
Methods inherited from class Object
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 
Methods inherited from interface org.apache.spark.Logging
initializeIfNecessary, initializeLogging, isTraceEnabled, log_, log, logDebug, logDebug, logError, logError, logInfo, logInfo, logName, logTrace, logTrace, logWarning, logWarning
 

Constructor Detail

MatrixFactorizationModel

public MatrixFactorizationModel(int rank,
                                RDD<scala.Tuple2<Object,double[]>> userFeatures,
                                RDD<scala.Tuple2<Object,double[]>> productFeatures)

Method Detail

load

public static MatrixFactorizationModel load(SparkContext sc,
                                            String path)

rank

public int rank()

userFeatures

public RDD<scala.Tuple2<Object,double[]>> userFeatures()

productFeatures

public RDD<scala.Tuple2<Object,double[]>> productFeatures()

predict

public double predict(int user,
                      int product)
Predict the rating of one user for one product.
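
The following is a conceptual sketch in plain Java, not Spark code. It assumes that a matrix factorization model scores a (user, product) pair as the dot product of the two feature vectors of length rank. The class name DotProductSketch and the sample vectors are hypothetical and exist only for illustration.

```java
// Conceptual sketch only: this is not the Spark implementation.
// Assumption: a predicted rating is the dot product of a user's and a
// product's latent feature vectors, both of length "rank".
final class DotProductSketch {

    // Returns the dot product of two feature vectors of equal length.
    static double predict(double[] userVector, double[] productVector) {
        if (userVector.length != productVector.length) {
            throw new IllegalArgumentException("Feature vectors must have the same rank");
        }
        double score = 0.0;
        for (int i = 0; i < userVector.length; i++) {
            score += userVector[i] * productVector[i];
        }
        return score;
    }

    public static void main(String[] args) {
        double[] user = {1.0, 2.0};     // hypothetical rank-2 user features
        double[] product = {3.0, 4.0};  // hypothetical rank-2 product features
        System.out.println(predict(user, product)); // 1*3 + 2*4
    }
}
```

In the real model, the feature vectors would come from the userFeatures and productFeatures RDDs rather than from local arrays.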


predict

public RDD<Rating> predict(RDD<scala.Tuple2<Object,Object>> usersProducts)
Predict the rating of many users for many products. The output RDD has one element for each element in the input RDD (including all duplicates), except that pairs whose user or product is missing from the training set produce no output.

Parameters:
usersProducts - RDD of (user, product) pairs.
Returns:
RDD of Ratings.
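
The output rule described above (duplicates kept, pairs with an unknown user or product dropped) can be sketched with local collections. This is an illustration in plain Java, not Spark code. It assumes dot-product scoring, and the names BatchPredictSketch, Prediction, and predictAll are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Conceptual sketch only: local maps stand in for the feature RDDs.
final class BatchPredictSketch {

    // Hypothetical stand-in for a Rating: (user, product, rating).
    static final class Prediction {
        final int user;
        final int product;
        final double rating;

        Prediction(int user, int product, double rating) {
            this.user = user;
            this.product = product;
            this.rating = rating;
        }
    }

    // Emits one prediction per input pair, in input order. Pairs whose user
    // or product has no feature vector are skipped; duplicates are kept.
    static List<Prediction> predictAll(List<int[]> usersProducts,
                                       Map<Integer, double[]> userFeatures,
                                       Map<Integer, double[]> productFeatures) {
        List<Prediction> out = new ArrayList<>();
        for (int[] pair : usersProducts) {
            double[] u = userFeatures.get(pair[0]);
            double[] p = productFeatures.get(pair[1]);
            if (u == null || p == null) {
                continue; // user or product was not in the training set
            }
            double rating = 0.0;
            for (int i = 0; i < u.length; i++) {
                rating += u[i] * p[i]; // assumes both vectors share the same rank
            }
            out.add(new Prediction(pair[0], pair[1], rating));
        }
        return out;
    }

    public static void main(String[] args) {
        Map<Integer, double[]> users = new HashMap<>();
        users.put(1, new double[]{1.0, 0.0});
        Map<Integer, double[]> products = new HashMap<>();
        products.put(10, new double[]{2.0, 5.0});

        // (1,10) appears twice; user 2 and product 99 are unknown.
        List<int[]> pairs = Arrays.asList(
                new int[]{1, 10}, new int[]{1, 10}, new int[]{2, 10}, new int[]{1, 99});

        for (Prediction pr : predictAll(pairs, users, products)) {
            System.out.println(pr.user + "," + pr.product + " -> " + pr.rating);
        }
    }
}
```

Under these assumptions, the size of the output can be smaller than the size of the input, so callers should not rely on positional alignment between the two.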

predict

public JavaRDD<Rating> predict(JavaPairRDD<Integer,Integer> usersProducts)
Java-friendly version of MatrixFactorizationModel.predict.

Parameters:
usersProducts - JavaPairRDD of (user, product) pairs.
Returns:
JavaRDD of Ratings.

recommendProducts

public Rating[] recommendProducts(int user,
                                  int num)
Recommends products to a user.

Parameters:
user - the user to recommend products to
num - how many products to return. The number returned may be less than this.
Returns:
Rating objects, each of which contains the given user ID, a product ID, and a "score" in the rating field. Each represents one recommended product, and they are sorted by score, decreasing. The first returned is the one predicted to be most strongly recommended to the user. The score is an opaque value that indicates how strongly recommended the product is.
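
The ordering and the "may be less than this" behavior can be sketched in plain Java. This is not Spark code. The documentation above treats the score as opaque, so the dot-product scoring here is an assumption made only for illustration, and the names TopNSketch, Scored, and recommendProducts(double[], Map, int) are hypothetical.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Conceptual sketch only: a local map stands in for the productFeatures RDD.
final class TopNSketch {

    // Hypothetical stand-in for a Rating with the user ID left implicit.
    static final class Scored {
        final int product;
        final double score;

        Scored(int product, double score) {
            this.product = product;
            this.score = score;
        }
    }

    // Scores every product for one user, sorts by score in decreasing order,
    // and returns at most num entries.
    static List<Scored> recommendProducts(double[] userVector,
                                          Map<Integer, double[]> productFeatures,
                                          int num) {
        List<Scored> scored = new ArrayList<>();
        for (Map.Entry<Integer, double[]> e : productFeatures.entrySet()) {
            double[] p = e.getValue();
            double s = 0.0;
            for (int i = 0; i < userVector.length; i++) {
                s += userVector[i] * p[i]; // assumed scoring: dot product
            }
            scored.add(new Scored(e.getKey(), s));
        }
        scored.sort(Comparator.comparingDouble((Scored x) -> x.score).reversed());
        int count = Math.max(0, Math.min(num, scored.size()));
        return new ArrayList<>(scored.subList(0, count));
    }

    public static void main(String[] args) {
        Map<Integer, double[]> products = new LinkedHashMap<>();
        products.put(10, new double[]{1.0, 0.0});
        products.put(20, new double[]{0.0, 1.0});
        products.put(30, new double[]{1.0, 1.0});

        double[] user = {2.0, 1.0}; // hypothetical rank-2 user features
        for (Scored s : recommendProducts(user, products, 2)) {
            System.out.println(s.product + " -> " + s.score);
        }
    }
}
```

Asking for more entries than there are products returns every product, which mirrors the note that the number returned may be less than num.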

recommendUsers

public Rating[] recommendUsers(int product,
                               int num)
Recommends users to a product. That is, this returns users who are most likely to be interested in a product.

Parameters:
product - the product to recommend users to
num - how many users to return. The number returned may be less than this.
Returns:
Rating objects, each of which contains a user ID, the given product ID, and a "score" in the rating field. Each represents one recommended user, and they are sorted by score, decreasing. The first returned is the one predicted to be most strongly recommended to the product. The score is an opaque value that indicates how strongly recommended the user is.

save

public void save(SparkContext sc,
                 String path)
Description copied from interface: Saveable
Save this model to the given path.

This saves:
- human-readable (JSON) model metadata to path/metadata/
- Parquet formatted data to path/data/

The model may be loaded using Loader.load.

Specified by:
save in interface Saveable
Parameters:
sc - Spark context used to save model data.
path - Path specifying the directory in which to save this model. If the directory already exists, this method throws an exception.

recommendProductsForUsers

public RDD<scala.Tuple2<Object,Rating[]>> recommendProductsForUsers(int num)
Recommends the top products (up to num) for every user.

Parameters:
num - how many products to return for every user.
Returns:
RDD of (Int, Array[Rating]) tuples, where each tuple contains a user ID and an array of Rating objects. Each Rating contains that same user ID, a recommended product ID, and a "score" in the rating field. The score has the same semantics as in recommendProducts.

recommendUsersForProducts

public RDD<scala.Tuple2<Object,Rating[]>> recommendUsersForProducts(int num)
Recommends the top users (up to num) for every product.

Parameters:
num - how many users to return for every product.
Returns:
RDD of (Int, Array[Rating]) tuples, where each tuple contains a product ID and an array of Rating objects. Each Rating contains a recommended user ID, that same product ID, and a "score" in the rating field. The score has the same semantics as in recommendUsers.