Class Predictor<FeaturesType,Learner extends Predictor<FeaturesType,Learner,M>,M extends PredictionModel<FeaturesType,M>>

Object
org.apache.spark.ml.PipelineStage
org.apache.spark.ml.Estimator<M>
org.apache.spark.ml.Predictor<FeaturesType,Learner,M>
Type Parameters:
FeaturesType - Type of features. E.g., VectorUDT for vector features.
Learner - Specialization of this class. If you subclass this type, use this type parameter to specify the concrete type.
M - Specialization of PredictionModel. If you subclass this type, use this type parameter to specify the concrete type for the corresponding model.
All Implemented Interfaces:
Serializable, org.apache.spark.internal.Logging, Params, HasFeaturesCol, HasLabelCol, HasPredictionCol, PredictorParams, Identifiable, scala.Serializable
Direct Known Subclasses:
Classifier, Regressor

public abstract class Predictor<FeaturesType,Learner extends Predictor<FeaturesType,Learner,M>,M extends PredictionModel<FeaturesType,M>> extends Estimator<M> implements PredictorParams
Abstraction for prediction problems (regression and classification). It accepts all NumericType labels and will automatically cast it to DoubleType in fit(). If this predictor supports weights, it accepts all NumericType weights, which will be automatically casted to DoubleType in fit().

See Also:
  • Constructor Details

    • Predictor

      public Predictor()
  • Method Details

    • copy

      public abstract Learner copy(ParamMap extra)
      Description copied from interface: Params
      Creates a copy of this instance with the same UID and some extra params. Subclasses should implement this method and set the return type properly. See defaultCopy().
      Specified by:
      copy in interface Params
      Specified by:
      copy in class Estimator<M extends PredictionModel<FeaturesType,M>>
      Parameters:
      extra - (undocumented)
      Returns:
      (undocumented)
    • featuresCol

      public final Param<String> featuresCol()
      Description copied from interface: HasFeaturesCol
      Param for features column name.
      Specified by:
      featuresCol in interface HasFeaturesCol
      Returns:
      (undocumented)
    • fit

      public M fit(Dataset<?> dataset)
      Description copied from class: Estimator
      Fits a model to the input data.
      Specified by:
      fit in class Estimator<M extends PredictionModel<FeaturesType,M>>
      Parameters:
      dataset - (undocumented)
      Returns:
      (undocumented)
    • labelCol

      public final Param<String> labelCol()
      Description copied from interface: HasLabelCol
      Param for label column name.
      Specified by:
      labelCol in interface HasLabelCol
      Returns:
      (undocumented)
    • predictionCol

      public final Param<String> predictionCol()
      Description copied from interface: HasPredictionCol
      Param for prediction column name.
      Specified by:
      predictionCol in interface HasPredictionCol
      Returns:
      (undocumented)
    • setFeaturesCol

      public Learner setFeaturesCol(String value)
    • setLabelCol

      public Learner setLabelCol(String value)
    • setPredictionCol

      public Learner setPredictionCol(String value)
    • transformSchema

      public StructType transformSchema(StructType schema)
      Description copied from class: PipelineStage
      Check transform validity and derive the output schema from the input schema.

      We check validity for interactions between parameters during transformSchema and raise an exception if any parameter value is invalid. Parameter value checks which do not depend on other parameters are handled by Param.validate().

      Typical implementation should first conduct verification on schema change and parameter validity, including complex parameter interaction checks.

      Specified by:
      transformSchema in class PipelineStage
      Parameters:
      schema - (undocumented)
      Returns:
      (undocumented)