org.apache.spark.ml.Predictor<FeaturesType,Learner,M>

Type Parameters:
- FeaturesType - Type of features. E.g., VectorUDT for vector features.
- Learner - Specialization of this class. If you subclass this type, use this type parameter to specify the concrete type.
- M - Specialization of PredictionModel. If you subclass this type, use this type parameter to specify the concrete type for the corresponding model.

All Implemented Interfaces: Serializable, org.apache.spark.internal.Logging, Params, HasFeaturesCol, HasLabelCol, HasPredictionCol, PredictorParams, Identifiable

Direct Known Subclasses: Classifier, Regressor

public abstract class Predictor<FeaturesType,Learner extends Predictor<FeaturesType,Learner,M>,M extends PredictionModel<FeaturesType,M>> extends Estimator<M> implements PredictorParams

Abstraction for prediction problems (regression and classification). It accepts labels of any NumericType and automatically casts them to DoubleType in fit(). If this predictor supports weights, it also accepts weights of any NumericType, which are automatically cast to DoubleType in fit().
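
The recursive bounds on Learner and M in the class declaration are what let inherited setters such as setLabelCol return the concrete subclass rather than the base type. The following is a minimal, Spark-free sketch of that pattern. Every name in it (SketchPredictor, SketchModel, MeanPredictor, MeanModel, self(), setShrinkage) is hypothetical and chosen for illustration only; it is not Spark's implementation.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for PredictionModel<FeaturesType, M>; not a Spark class.
abstract class SketchModel<F, M extends SketchModel<F, M>> {
    abstract double predict(F features);
}

// Hypothetical stand-in for Predictor<FeaturesType, Learner, M>.
abstract class SketchPredictor<F,
        L extends SketchPredictor<F, L, M>,
        M extends SketchModel<F, M>> {

    private final Map<String, String> params = new LinkedHashMap<>();

    SketchPredictor() {
        params.put("featuresCol", "features");
        params.put("labelCol", "label");
        params.put("predictionCol", "prediction");
    }

    // Assumption of this sketch: each subclass hands back itself as the concrete type L.
    protected abstract L self();

    // Setters return L, so a chained call keeps the subclass type.
    L setFeaturesCol(String value) { params.put("featuresCol", value); return self(); }
    L setLabelCol(String value) { params.put("labelCol", value); return self(); }
    L setPredictionCol(String value) { params.put("predictionCol", value); return self(); }

    String get(String name) { return params.get(name); }

    // Returns the model type M tied to this learner.
    abstract M fit(List<F> features, double[] labels);
}

// A toy model that always predicts one constant value.
final class MeanModel extends SketchModel<double[], MeanModel> {
    private final double mean;
    MeanModel(double mean) { this.mean = mean; }
    @Override double predict(double[] features) { return mean; }
}

// A toy learner that predicts the (optionally shrunk) mean of the labels.
final class MeanPredictor extends SketchPredictor<double[], MeanPredictor, MeanModel> {
    private double shrinkage = 1.0;

    @Override protected MeanPredictor self() { return this; }

    // Subclass-only setter; reachable after an inherited setter thanks to the bound on L.
    MeanPredictor setShrinkage(double value) { this.shrinkage = value; return this; }

    @Override MeanModel fit(List<double[]> features, double[] labels) {
        double sum = 0.0;
        for (double y : labels) {
            sum += y;
        }
        return new MeanModel(shrinkage * sum / labels.length);
    }
}
```

With these definitions, `new MeanPredictor().setLabelCol("y").setShrinkage(0.5)` compiles because setLabelCol is typed to return MeanPredictor. With a non-generic base class the chain would stop at the base type and setShrinkage would not be visible.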

See Also:

Serialized Form

Nested Class Summary

Nested classes/interfaces inherited from interface org.apache.spark.internal.Logging
org.apache.spark.internal.Logging.LogStringContext, org.apache.spark.internal.Logging.SparkShellLoggingFilter

Constructor Summary

| Constructor | Description |
| --- | --- |
| Predictor() | |

Method Summary

| Modifier and Type | Method | Description |
| --- | --- | --- |
| abstract Learner | copy(ParamMap extra) | Creates a copy of this instance with the same UID and some extra params. |
| final Param<String> | featuresCol() | Param for features column name. |
| M | fit(Dataset<?> dataset) | Fits a model to the input data. |
| final Param<String> | labelCol() | Param for label column name. |
| final Param<String> | predictionCol() | Param for prediction column name. |
| Learner | setFeaturesCol(String value) | |
| Learner | setLabelCol(String value) | |
| Learner | setPredictionCol(String value) | |
| StructType | transformSchema(StructType schema) | Check transform validity and derive the output schema from the input schema. |
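
The copy contract summarized above (same UID, extra params applied on top of the current ones) can be illustrated with a small, Spark-free sketch. SketchEstimator, its string-keyed param map, and the UID format are assumptions made for this example and do not reflect how Spark stores params internally.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.UUID;

// Hypothetical class used only to illustrate the copy contract; not a Spark class.
final class SketchEstimator {
    private final String uid;
    private final Map<String, Object> paramMap = new HashMap<>();

    // A fresh instance gets a new unique ID (the format here is an assumption).
    SketchEstimator() {
        this("sketch_" + UUID.randomUUID().toString().substring(0, 12));
    }

    private SketchEstimator(String uid) {
        this.uid = uid;
    }

    String uid() { return uid; }

    SketchEstimator set(String name, Object value) {
        paramMap.put(name, value);
        return this;
    }

    Object get(String name) { return paramMap.get(name); }

    // The copy keeps the UID, carries over current values, then applies the extras.
    SketchEstimator copy(Map<String, Object> extra) {
        SketchEstimator that = new SketchEstimator(uid);
        that.paramMap.putAll(paramMap);
        that.paramMap.putAll(extra);
        return that;
    }
}
```

In this sketch the original instance is left untouched: values passed in `extra` override only the copy's params, while both objects report the same `uid()`.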

Methods inherited from class org.apache.spark.ml.Estimator
fit, fit, fit, fit

Methods inherited from class org.apache.spark.ml.PipelineStage
params

Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

Methods inherited from interface org.apache.spark.ml.param.shared.HasFeaturesCol
getFeaturesCol

Methods inherited from interface org.apache.spark.ml.param.shared.HasLabelCol
getLabelCol

Methods inherited from interface org.apache.spark.ml.param.shared.HasPredictionCol
getPredictionCol

Methods inherited from interface org.apache.spark.ml.util.Identifiable
toString, uid

Methods inherited from interface org.apache.spark.internal.Logging
initializeForcefully, initializeLogIfNecessary, initializeLogIfNecessary, initializeLogIfNecessary$default$2, isTraceEnabled, log, logBasedOnLevel, logDebug, logDebug, logDebug, logDebug, logError, logError, logError, logError, logInfo, logInfo, logInfo, logInfo, logName, LogStringContext, logTrace, logTrace, logTrace, logTrace, logWarning, logWarning, logWarning, logWarning, MDC, org$apache$spark$internal$Logging$$log_, org$apache$spark$internal$Logging$$log__$eq, withLogContext

Methods inherited from interface org.apache.spark.ml.param.Params
clear, copyValues, defaultCopy, defaultParamMap, estimateMatadataSize, explainParam, explainParams, extractParamMap, extractParamMap, get, getDefault, getOrDefault, getParam, hasDefault, hasParam, isDefined, isSet, onParamChange, paramMap, params, set, set, set, setDefault, setDefault, shouldOwn

Methods inherited from interface org.apache.spark.ml.PredictorParams
validateAndTransformSchema

Constructor Details
- Predictor
  
  public Predictor()
Method Details
- copy
  
  public abstract Learner copy(ParamMap extra)
  
  Description copied from interface: Params
  
  Creates a copy of this instance with the same UID and some extra params. Subclasses should implement this method and set the return type properly. See defaultCopy().
  
  Specified by:
  
  copy in interface Params
  
  Specified by:
  
  copy in class Estimator<M extends PredictionModel<FeaturesType,M>>
  
  Parameters:
  
  extra - (undocumented)
  
  Returns:
  
  (undocumented)
- featuresCol
  
  public final Param<String> featuresCol()
  
  Description copied from interface: HasFeaturesCol
  
  Param for features column name.
  
  Specified by:
  
  featuresCol in interface HasFeaturesCol
  
  Returns:
  
  (undocumented)
- fit
  
  public M fit(Dataset<?> dataset)
  
  Description copied from class: Estimator
  
  Fits a model to the input data.
  
  Specified by:
  
  fit in class Estimator<M extends PredictionModel<FeaturesType,M>>
  
  Parameters:
  
  dataset - (undocumented)
  
  Returns:
  
  (undocumented)
- labelCol
  
  public final Param<String> labelCol()
  
  Description copied from interface: HasLabelCol
  
  Param for label column name.
  
  Specified by:
  
  labelCol in interface HasLabelCol
  
  Returns:
  
  (undocumented)
- predictionCol
  
  public final Param<String> predictionCol()
  
  Description copied from interface: HasPredictionCol
  
  Param for prediction column name.
  
  Specified by:
  
  predictionCol in interface HasPredictionCol
  
  Returns:
  
  (undocumented)
- setFeaturesCol
  
  public Learner setFeaturesCol(String value)
- setLabelCol
  
  public Learner setLabelCol(String value)
- setPredictionCol
  
  public Learner setPredictionCol(String value)
- transformSchema
  
  public StructType transformSchema(StructType schema)
  
  Description copied from class: PipelineStage
  
  Check transform validity and derive the output schema from the input schema.
  We check validity for interactions between parameters during transformSchema and raise an exception if any parameter value is invalid. Parameter value checks which do not depend on other parameters are handled by Param.validate().
  A typical implementation should first verify the schema change and parameter validity, including any complex parameter interaction checks.
  
  Specified by:
  
  transformSchema in class PipelineStage
  
  Parameters:
  
  schema - (undocumented)
  
  Returns:
  
  (undocumented)
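
To make the transformSchema description concrete, here is a rough, Spark-free sketch of the kind of check it describes: validate the input columns, then derive the output schema. The schema is modelled as an ordered map from column name to a type name, and the specific rules (features column present, numeric label, prediction column not already present, prediction appended as double) are assumptions of this sketch, not a statement of what Spark's validateAndTransformSchema does.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical helper for illustration; it does not use or mirror Spark's StructType.
final class SchemaSketch {
    // Type names treated as numeric in this sketch (an assumption).
    private static final Set<String> NUMERIC =
            Set.of("byte", "short", "int", "long", "float", "double", "decimal");

    private SchemaSketch() { }

    // Validates the input schema and returns a new schema with the prediction column appended.
    static Map<String, String> transformSchema(Map<String, String> schema,
                                               String featuresCol,
                                               String labelCol,
                                               String predictionCol) {
        require(schema.containsKey(featuresCol),
                "Missing features column: " + featuresCol);
        String labelType = schema.get(labelCol);
        require(labelType != null && NUMERIC.contains(labelType),
                "Label column must be numeric: " + labelCol);
        require(!schema.containsKey(predictionCol),
                "Prediction column already exists: " + predictionCol);

        // Copy preserves input column order; the new column goes last.
        Map<String, String> out = new LinkedHashMap<>(schema);
        out.put(predictionCol, "double");
        return out;
    }

    private static void require(boolean ok, String message) {
        if (!ok) {
            throw new IllegalArgumentException(message);
        }
    }
}
```

Failing fast on the schema alone, before any data is read, is the point of the design: an invalid column setup is reported when the schema is checked rather than partway through fitting.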
