org.apache.spark.ml.recommendation

ALS

class ALS extends Estimator[ALSModel] with ALSParams

Alternating Least Squares (ALS) matrix factorization.

ALS attempts to estimate the ratings matrix R as the product of two lower-rank matrices, X and Y, so that X * Yt ≈ R. These approximations are typically called 'factor' matrices. The general approach is iterative: during each iteration, one of the factor matrices is held constant while the other is solved for using least squares. The newly solved factor matrix is then held constant while solving for the other factor matrix.
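The alternation described above can be illustrated with a minimal, single-machine sketch for the rank-1 case. This is not the blocked Spark implementation: the names `AlsSketch`, `solve`, and `factorize` are hypothetical, the matrix is assumed to be dense (every rating observed), and `lambda` is a plain L2 regularization weight.

```scala
object AlsSketch {
  // Hypothetical helper: one least-squares update for a rank-1 factorization.
  // With `fixed` held constant, entry i of the result minimizes
  //   sum_j (r(i)(j) - out(i) * fixed(j))^2 + lambda * out(i)^2
  def solve(r: Array[Array[Double]], fixed: Array[Double], lambda: Double): Array[Double] =
    r.map { row =>
      val num = row.zip(fixed).map { case (rij, f) => rij * f }.sum
      val den = fixed.map(f => f * f).sum + lambda
      num / den
    }

  // Alternates between solving for X (rows) and Y (columns).
  def factorize(
      r: Array[Array[Double]],
      iterations: Int,
      lambda: Double): (Array[Double], Array[Double]) = {
    val rT = r.transpose
    var x = Array.fill(r.length)(1.0)
    var y = Array.fill(rT.length)(1.0)
    for (_ <- 0 until iterations) {
      x = solve(r, y, lambda)  // hold Y constant, solve for X
      y = solve(rT, x, lambda) // hold the new X constant, solve for Y
    }
    (x, y)
  }
}
```

For a matrix that is exactly rank 1, calling `AlsSketch.factorize(r, 5, 0.0)` should return factors whose outer product reproduces `r` up to floating-point error.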

This is a blocked implementation of the ALS factorization algorithm that groups the two sets of factors (referred to as "users" and "products") into blocks and reduces communication by only sending one copy of each user vector to each product block on each iteration, and only for the product blocks that need that user's feature vector. This is achieved by pre-computing some information about the ratings matrix to determine the "out-links" of each user (which blocks of products it will contribute to) and "in-link" information for each product (which of the feature vectors it receives from each user block it will depend on). This allows us to send only an array of feature vectors between each user block and product block, and have the product block find the users' ratings and update the products based on these messages.
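As a rough illustration of the "out-link" idea, the sketch below computes, for each user, the set of product blocks that would need that user's feature vector. It is a simplified, single-machine approximation and not Spark's internal data structure: the `Rating` case class, the `block` function, and the assumption that an id maps to a block by `id % numBlocks` are all hypothetical.

```scala
object OutLinkSketch {
  // Hypothetical rating record used only for this illustration.
  final case class Rating(user: Int, product: Int, rating: Double)

  // Assumption: ids are non-negative and a block is chosen by id % numBlocks.
  def block(id: Int, numBlocks: Int): Int = id % numBlocks

  // For each user, the set of product blocks that contain at least one
  // product the user has rated, i.e. the blocks that need the user's vector.
  def outLinks(ratings: Seq[Rating], numProductBlocks: Int): Map[Int, Set[Int]] =
    ratings.groupBy(_.user).map { case (user, rs) =>
      user -> rs.map(r => block(r.product, numProductBlocks)).toSet
    }
}
```

A user who rated several products in the same block appears only once for that block, which is what allows a single copy of the user vector to be sent per product block.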

For implicit preference data, the algorithm used is based on "Collaborative Filtering for Implicit Feedback Datasets", available at http://dx.doi.org/10.1109/ICDM.2008.22, adapted for the blocked approach used here.

Essentially, instead of finding low-rank approximations to the rating matrix R, this finds approximations for a preference matrix P whose elements are 1 if r > 0 and 0 if r <= 0. The ratings then act as 'confidence' values related to the strength of indicated user preferences rather than explicit ratings given to items.
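The mapping from a raw rating to a preference and a confidence can be sketched as follows. The preference rule is the one stated above; the confidence formula c = 1 + alpha * r follows the cited paper and assumes non-negative ratings, so the blocked implementation here may differ in detail. The names `ImplicitPrefsSketch`, `preference`, and `confidence` are hypothetical.

```scala
object ImplicitPrefsSketch {
  // Preference p: 1 if the rating is positive, 0 otherwise.
  def preference(r: Double): Double = if (r > 0) 1.0 else 0.0

  // Confidence as formulated in the cited paper: c = 1 + alpha * r.
  // Assumption: r is non-negative (e.g. a view or purchase count).
  def confidence(r: Double, alpha: Double): Double = 1.0 + alpha * r
}
```

With this formulation, a larger `alpha` makes observed interactions weigh more heavily relative to unobserved ones.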

Linear Supertypes
ALSParams, HasCheckpointInterval, HasPredictionCol, HasRegParam, HasMaxIter, Estimator[ALSModel], Params, Identifiable, PipelineStage, Logging, Serializable, Serializable, AnyRef, Any

Instance Constructors

  1. new ALS()

Value Members

  1. final def !=(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  2. final def !=(arg0: Any): Boolean

    Definition Classes
    Any
  3. final def ##(): Int

    Definition Classes
    AnyRef → Any
  4. final def ==(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  5. final def ==(arg0: Any): Boolean

    Definition Classes
    Any
  6. def addOutputColumn(schema: StructType, colName: String, dataType: DataType): StructType

    Attributes
    protected
    Definition Classes
    Params
  7. val alpha: DoubleParam

    Param for the alpha parameter in the implicit preference formulation.

    Definition Classes
    ALSParams
  8. final def asInstanceOf[T0]: T0

    Definition Classes
    Any
  9. def checkInputColumn(schema: StructType, colName: String, dataType: DataType): Unit

    Check whether the given schema contains an input column.

    colName

    Parameter name for the input column.

    dataType

    SQL DataType of the input column.

    Attributes
    protected
    Definition Classes
    Params
  10. val checkpointInterval: IntParam

    param for checkpoint interval

    Definition Classes
    HasCheckpointInterval
  11. def clone(): AnyRef

    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  12. final def eq(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  13. def equals(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  14. def explainParams(): String

    Returns the documentation of all params.

    Definition Classes
    Params
  15. def finalize(): Unit

    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( classOf[java.lang.Throwable] )
  16. def fit(dataset: DataFrame, paramMap: ParamMap): ALSModel

    Fits a single model to the input data with provided parameter map.

    dataset

    input dataset

    paramMap

    Parameter map. These values override any specified in this Estimator's embedded ParamMap.

    returns

    fitted model

    Definition Classes
    ALS → Estimator
  17. def fit(dataset: DataFrame, paramMaps: Array[ParamMap]): Seq[ALSModel]

    Fits multiple models to the input data with multiple sets of parameters. The default implementation uses a for loop on each parameter map. Subclasses could override this to optimize multi-model training.

    dataset

    input dataset

    paramMaps

    An array of parameter maps. These values override any specified in this Estimator's embedded ParamMap.

    returns

    fitted models, matching the input parameter maps

    Definition Classes
    Estimator
  18. def fit(dataset: DataFrame, paramPairs: ParamPair[_]*): ALSModel

    Fits a single model to the input data with optional parameters.

    dataset

    input dataset

    paramPairs

    Optional list of param pairs. These values override any specified in this Estimator's embedded ParamMap.

    returns

    fitted model

    Definition Classes
    Estimator
    Annotations
    @varargs()
  19. def get[T](param: Param[T]): T

    Gets the value of a parameter in the embedded param map.

    Attributes
    protected
    Definition Classes
    Params
  20. def getAlpha: Double

    Definition Classes
    ALSParams
  21. def getCheckpointInterval: Int

    Definition Classes
    HasCheckpointInterval
  22. final def getClass(): Class[_]

    Definition Classes
    AnyRef → Any
  23. def getImplicitPrefs: Boolean

    Definition Classes
    ALSParams
  24. def getItemCol: String

    Definition Classes
    ALSParams
  25. def getMaxIter: Int

    Definition Classes
    HasMaxIter
  26. val getNonnegative: Boolean

    Definition Classes
    ALSParams
  27. def getNumItemBlocks: Int

    Definition Classes
    ALSParams
  28. def getNumUserBlocks: Int

    Definition Classes
    ALSParams
  29. def getPredictionCol: String

    Definition Classes
    HasPredictionCol
  30. def getRank: Int

    Definition Classes
    ALSParams
  31. def getRatingCol: String

    Definition Classes
    ALSParams
  32. def getRegParam: Double

    Definition Classes
    HasRegParam
  33. def getUserCol: String

    Definition Classes
    ALSParams
  34. def hashCode(): Int

    Definition Classes
    AnyRef → Any
  35. val implicitPrefs: BooleanParam

    Param to decide whether to use implicit preference.

    Definition Classes
    ALSParams
  36. final def isInstanceOf[T0]: Boolean

    Definition Classes
    Any
  37. def isSet(param: Param[_]): Boolean

    Checks whether a param is explicitly set.

    Definition Classes
    Params
  38. def isTraceEnabled(): Boolean

    Attributes
    protected
    Definition Classes
    Logging
  39. val itemCol: Param[String]

    Param for the column name for item ids.

    Definition Classes
    ALSParams
  40. def log: Logger

    Attributes
    protected
    Definition Classes
    Logging
  41. def logDebug(msg: ⇒ String, throwable: Throwable): Unit

    Attributes
    protected
    Definition Classes
    Logging
  42. def logDebug(msg: ⇒ String): Unit

    Attributes
    protected
    Definition Classes
    Logging
  43. def logError(msg: ⇒ String, throwable: Throwable): Unit

    Attributes
    protected
    Definition Classes
    Logging
  44. def logError(msg: ⇒ String): Unit

    Attributes
    protected
    Definition Classes
    Logging
  45. def logInfo(msg: ⇒ String, throwable: Throwable): Unit

    Attributes
    protected
    Definition Classes
    Logging
  46. def logInfo(msg: ⇒ String): Unit

    Attributes
    protected
    Definition Classes
    Logging
  47. def logName: String

    Attributes
    protected
    Definition Classes
    Logging
  48. def logTrace(msg: ⇒ String, throwable: Throwable): Unit

    Attributes
    protected
    Definition Classes
    Logging
  49. def logTrace(msg: ⇒ String): Unit

    Attributes
    protected
    Definition Classes
    Logging
  50. def logWarning(msg: ⇒ String, throwable: Throwable): Unit

    Attributes
    protected
    Definition Classes
    Logging
  51. def logWarning(msg: ⇒ String): Unit

    Attributes
    protected
    Definition Classes
    Logging
  52. val maxIter: IntParam

    param for max number of iterations

    Definition Classes
    HasMaxIter
  53. final def ne(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  54. val nonnegative: BooleanParam

    Param for whether to apply nonnegativity constraints.

    Definition Classes
    ALSParams
  55. final def notify(): Unit

    Definition Classes
    AnyRef
  56. final def notifyAll(): Unit

    Definition Classes
    AnyRef
  57. val numItemBlocks: IntParam

    Param for number of item blocks.

    Definition Classes
    ALSParams
  58. val numUserBlocks: IntParam

    Param for number of user blocks.

    Definition Classes
    ALSParams
  59. val paramMap: ParamMap

    Internal param map.

    Attributes
    protected
    Definition Classes
    Params
  60. def params: Array[Param[_]]

    Returns all params.

    Definition Classes
    Params
  61. val predictionCol: Param[String]

    param for prediction column name

    Definition Classes
    HasPredictionCol
  62. val rank: IntParam

    Param for rank of the matrix factorization.

    Definition Classes
    ALSParams
  63. val ratingCol: Param[String]

    Param for the column name for ratings.

    Definition Classes
    ALSParams
  64. val regParam: DoubleParam

    param for regularization parameter

    Definition Classes
    HasRegParam
  65. def set[T](param: Param[T], value: T): ALS.this.type

    Sets a parameter in the embedded param map.

    Attributes
    protected
    Definition Classes
    Params
  66. def setAlpha(value: Double): ALS.this.type

  67. def setCheckpointInterval(value: Int): ALS.this.type

  68. def setImplicitPrefs(value: Boolean): ALS.this.type

  69. def setItemCol(value: String): ALS.this.type

  70. def setMaxIter(value: Int): ALS.this.type

  71. def setNonnegative(value: Boolean): ALS.this.type

  72. def setNumBlocks(value: Int): ALS.this.type

    Sets both numUserBlocks and numItemBlocks to the specific value.

  73. def setNumItemBlocks(value: Int): ALS.this.type

  74. def setNumUserBlocks(value: Int): ALS.this.type

  75. def setPredictionCol(value: String): ALS.this.type

  76. def setRank(value: Int): ALS.this.type

  77. def setRatingCol(value: String): ALS.this.type

  78. def setRegParam(value: Double): ALS.this.type

  79. def setUserCol(value: String): ALS.this.type

  80. final def synchronized[T0](arg0: ⇒ T0): T0

    Definition Classes
    AnyRef
  81. def toString(): String

    Definition Classes
    AnyRef → Any
  82. def transformSchema(schema: StructType, paramMap: ParamMap): StructType

    :: DeveloperApi ::

    Derives the output schema from the input schema and parameters. The schema describes the columns and types of the data.

    schema

    Input schema to this stage

    paramMap

    Parameters passed to this stage

    returns

    Output schema from this stage

    Definition Classes
    ALS → PipelineStage
  83. def transformSchema(schema: StructType, paramMap: ParamMap, logging: Boolean): StructType

    Derives the output schema from the input schema and parameters, optionally with logging.

    Attributes
    protected
    Definition Classes
    PipelineStage
  84. val userCol: Param[String]

    Param for the column name for user ids.

    Param for the column name for user ids.

    Definition Classes
    ALSParams
  85. def validate(): Unit

    Validates parameter values stored internally. Raises an exception if any parameter value is invalid.

    Definition Classes
    Params
  86. def validate(paramMap: ParamMap): Unit

    Validates parameter values stored internally plus the input parameter map. Raises an exception if any parameter is invalid.

    Definition Classes
    Params
  87. def validateAndTransformSchema(schema: StructType, paramMap: ParamMap): StructType

    Validates and transforms the input schema.

    schema

    input schema

    paramMap

    extra params

    returns

    output schema

    Attributes
    protected
    Definition Classes
    ALSParams
  88. final def wait(): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  89. final def wait(arg0: Long, arg1: Int): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  90. final def wait(arg0: Long): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )


Parameters

A list of (hyper-)parameter keys this algorithm can take. Users can set and get the parameter values through setters and getters, respectively.
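A minimal sketch of configuring the estimator through its setters and reading values back through the getters listed on this page. It assumes the Spark ML library is on the classpath; the column names `userId`, `movieId`, and `rating`, the object name `AlsConfigExample`, and the chosen parameter values are hypothetical and only illustrative.

```scala
import org.apache.spark.ml.recommendation.ALS

object AlsConfigExample {
  // Each setter returns the ALS instance itself, so calls can be chained.
  val als: ALS = new ALS()
    .setRank(8)             // rank of the factor matrices
    .setMaxIter(15)         // maximum number of iterations
    .setRegParam(0.05)      // regularization parameter
    .setImplicitPrefs(true) // use the implicit preference formulation
    .setAlpha(40.0)         // alpha for the implicit preference formulation
    .setUserCol("userId")   // hypothetical column names
    .setItemCol("movieId")
    .setRatingCol("rating")
}
```

After configuration, `AlsConfigExample.als.getRank` returns the rank that was set, and a call such as `als.fit(dataset)` on a DataFrame containing the three named columns would produce an `ALSModel`.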
