public class CrossValidator extends Estimator<CrossValidatorModel> implements CrossValidatorParams, HasParallelism, HasCollectSubModels, MLWritable, org.apache.spark.internal.Logging
| Constructor and Description |
|---|
| CrossValidator() |
| CrossValidator(String uid) |
| Modifier and Type | Method and Description |
|---|---|
| BooleanParam | collectSubModels(): Param for whether to collect a list of sub-models trained during tuning. |
| CrossValidator | copy(ParamMap extra): Creates a copy of this instance with the same UID and some extra params. |
| Param<Estimator<?>> | estimator(): Param for the estimator to be validated. |
| Param<ParamMap[]> | estimatorParamMaps(): Param for estimator param maps. |
| Param<Evaluator> | evaluator(): Param for the evaluator used to select hyper-parameters that maximize the validated metric. |
| CrossValidatorModel | fit(Dataset<?> dataset): Fits a model to the input data. |
| Param<String> | foldCol(): Param for the column name of user-specified fold number. |
| static CrossValidator | load(String path) |
| IntParam | numFolds(): Param for number of folds for cross validation. |
| IntParam | parallelism(): The number of threads to use when running parallel algorithms. |
| static MLReader<CrossValidator> | read() |
| LongParam | seed(): Param for random seed. |
| CrossValidator | setCollectSubModels(boolean value): Whether to collect submodels when fitting. |
| CrossValidator | setEstimator(Estimator<?> value) |
| CrossValidator | setEstimatorParamMaps(ParamMap[] value) |
| CrossValidator | setEvaluator(Evaluator value) |
| CrossValidator | setFoldCol(String value) |
| CrossValidator | setNumFolds(int value) |
| CrossValidator | setParallelism(int value): Set the maximum level of parallelism to evaluate models in parallel. |
| CrossValidator | setSeed(long value) |
| StructType | transformSchema(StructType schema): Check transform validity and derive the output schema from the input schema. |
| String | uid(): An immutable unique ID for the object and its derivatives. |
| MLWriter | write(): Returns an MLWriter instance for this ML instance. |
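Methods inherited from class PipelineStage: params
Methods inherited from class java.lang.Object: equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Methods inherited from interface CrossValidatorParams: getFoldCol, getNumFolds
Methods inherited from interface ValidatorParams: getEstimator, getEstimatorParamMaps, getEvaluator, logTuningParams, transformSchemaImpl
Methods inherited from interface Params: clear, copyValues, defaultCopy, defaultParamMap, explainParam, explainParams, extractParamMap, extractParamMap, get, getDefault, getOrDefault, getParam, hasDefault, hasParam, isDefined, isSet, onParamChange, paramMap, params, set, set, set, setDefault, setDefault, shouldOwn
Methods inherited from interface Identifiable: toString
Methods inherited from interface HasParallelism: getExecutionContext, getParallelism
Methods inherited from interface HasCollectSubModels: getCollectSubModels
Methods inherited from interface MLWritable: save
Methods inherited from interface org.apache.spark.internal.Logging: $init$, initializeForcefully, initializeLogIfNecessary, initializeLogIfNecessary, initializeLogIfNecessary$default$2, initLock, isTraceEnabled, log, logDebug, logDebug, logError, logError, logInfo, logInfo, logName, logTrace, logTrace, logWarning, logWarning, org$apache$spark$internal$Logging$$log__$eq, org$apache$spark$internal$Logging$$log_, uninitialize

public CrossValidator(String uid)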
public CrossValidator()
public static MLReader<CrossValidator> read()
public static CrossValidator load(String path)
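public final BooleanParam collectSubModels()
Description copied from interface: HasCollectSubModels
Param for whether to collect a list of sub-models trained during tuning.
Specified by: collectSubModels in interface HasCollectSubModels

public IntParam parallelism()
Description copied from interface: HasParallelism
The number of threads to use when running parallel algorithms.
Specified by: parallelism in interface HasParallelism

public IntParam numFolds()
Description copied from interface: CrossValidatorParams
Param for number of folds for cross validation.
Specified by: numFolds in interface CrossValidatorParams

public Param<String> foldCol()
Description copied from interface: CrossValidatorParams
Param for the column name of user-specified fold number. Once this is specified, CrossValidator won't do random k-fold split. Note that this column should be of integer type with values in the range [0, numFolds), and Spark will throw an exception on out-of-range fold numbers.
Specified by: foldCol in interface CrossValidatorParams

public Param<Estimator<?>> estimator()
Description copied from interface: ValidatorParams
Param for the estimator to be validated.
Specified by: estimator in interface ValidatorParams

public Param<ParamMap[]> estimatorParamMaps()
Description copied from interface: ValidatorParams
Param for estimator param maps.
Specified by: estimatorParamMaps in interface ValidatorParams

public Param<Evaluator> evaluator()
Description copied from interface: ValidatorParams
Param for the evaluator used to select hyper-parameters that maximize the validated metric.
Specified by: evaluator in interface ValidatorParams

public final LongParam seed()
Description copied from interface: HasSeed
Param for random seed.

public String uid()
Description copied from interface: Identifiable
An immutable unique ID for the object and its derivatives.
Specified by: uid in interface Identifiable

public CrossValidator setEstimator(Estimator<?> value)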
public CrossValidator setEstimatorParamMaps(ParamMap[] value)
public CrossValidator setEvaluator(Evaluator value)
public CrossValidator setNumFolds(int value)
public CrossValidator setSeed(long value)
public CrossValidator setFoldCol(String value)
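The fold column contract stated in this reference for foldCol (an integer column with values in [0, numFolds), with an exception on out-of-range fold numbers) can be illustrated without Spark. The following is a minimal plain-Java sketch, not Spark's implementation: the class `FoldColSketch`, the method `groupByFold`, and the sample fold numbers are hypothetical names and data introduced only for this illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, NOT part of Spark: groups row indices by a
// user-supplied fold number, mirroring the documented foldCol contract.
final class FoldColSketch {

    // Returns one list of row indices per fold k in [0, numFolds).
    // Throws IllegalArgumentException on an out-of-range fold number.
    static List<List<Integer>> groupByFold(int[] foldOfRow, int numFolds) {
        if (numFolds <= 0) {
            // Guard chosen for this sketch only.
            throw new IllegalArgumentException("numFolds must be positive, got " + numFolds);
        }
        List<List<Integer>> folds = new ArrayList<>();
        for (int k = 0; k < numFolds; k++) {
            folds.add(new ArrayList<>());
        }
        for (int row = 0; row < foldOfRow.length; row++) {
            int fold = foldOfRow[row];
            if (fold < 0 || fold >= numFolds) {
                throw new IllegalArgumentException(
                        "fold number " + fold + " at row " + row
                                + " is outside [0, " + numFolds + ")");
            }
            folds.get(fold).add(row);
        }
        return folds;
    }

    public static void main(String[] args) {
        // Illustrative fold assignments for five rows and three folds.
        int[] foldOfRow = {0, 2, 1, 0, 2};
        List<List<Integer>> folds = groupByFold(foldOfRow, 3);
        for (int k = 0; k < folds.size(); k++) {
            // Rows in fold k form the validation set for split k;
            // all remaining rows form the training set.
            System.out.println("split " + k + " validates on rows " + folds.get(k));
        }
    }
}
```

In this sketch each fold's rows serve as the validation set for one split, which is the usual k-fold arrangement; how Spark materializes the splits internally is not described in this reference.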
public CrossValidator setParallelism(int value)
Set the maximum level of parallelism to evaluate models in parallel.
Parameters: value - (undocumented)

public CrossValidator setCollectSubModels(boolean value)
Whether to collect submodels when fitting.
Note: If this param is set, you can set the option "persistSubModels" to "true" before saving the returned model in order to save these submodels as well. See the documentation of CrossValidatorModel.CrossValidatorModelWriter for more information.
Parameters: value - (undocumented)

public CrossValidatorModel fit(Dataset<?> dataset)
Description copied from class: Estimator
Fits a model to the input data.
Specified by: fit in class Estimator<CrossValidatorModel>
Parameters: dataset - (undocumented)

public StructType transformSchema(StructType schema)
Description copied from class: PipelineStage
Check transform validity and derive the output schema from the input schema.
We check validity for interactions between parameters during transformSchema and raise an exception if any parameter value is invalid. Parameter value checks which do not depend on other parameters are handled by Param.validate().
A typical implementation should first verify the schema change and parameter validity, including complex parameter interaction checks.
Specified by: transformSchema in class PipelineStage
Parameters: schema - (undocumented)

public CrossValidator copy(ParamMap extra)
Description copied from interface: Params
Creates a copy of this instance with the same UID and some extra params. See defaultCopy().
Specified by: copy in interface Params
Specified by: copy in class Estimator<CrossValidatorModel>
Parameters: extra - (undocumented)

public MLWriter write()
Description copied from interface: MLWritable
Returns an MLWriter instance for this ML instance.
Specified by: write in interface MLWritable
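
As a rough illustration of two ideas in this reference, namely that setParallelism bounds how many models are evaluated at once and that the evaluator selects the hyper-parameters that maximize the validated metric, here is a plain-Java sketch using a fixed thread pool. It is not Spark's implementation: the class `ParallelSelectionSketch`, the method `selectBest`, and the metric values are hypothetical and exist only for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.IntToDoubleFunction;

// Hypothetical sketch, NOT Spark code: scores candidate parameter settings
// on a bounded thread pool and picks the one with the highest metric.
final class ParallelSelectionSketch {

    // Scores candidates 0..numCandidates-1 using at most `parallelism`
    // threads and returns the index of the candidate with the largest score.
    static int selectBest(int numCandidates, IntToDoubleFunction score, int parallelism) {
        ExecutorService pool = Executors.newFixedThreadPool(parallelism);
        try {
            List<Future<Double>> pending = new ArrayList<>();
            for (int i = 0; i < numCandidates; i++) {
                final int candidate = i;
                Callable<Double> task = () -> score.applyAsDouble(candidate);
                pending.add(pool.submit(task));
            }
            int best = -1;
            double bestScore = Double.NEGATIVE_INFINITY;
            for (int i = 0; i < pending.size(); i++) {
                double s;
                try {
                    s = pending.get(i).get(); // wait for candidate i to finish
                } catch (InterruptedException | ExecutionException e) {
                    throw new IllegalStateException("scoring candidate " + i + " failed", e);
                }
                if (s > bestScore) {
                    bestScore = s;
                    best = i;
                }
            }
            return best;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        // Made-up average validation metrics for three candidates.
        double[] avgMetric = {0.71, 0.84, 0.79};
        int best = selectBest(avgMetric.length, i -> avgMetric[i], 2);
        System.out.println("best candidate index: " + best);
    }
}
```

With a `parallelism` of 1 the pool degenerates to serial evaluation, so the selected candidate does not depend on the thread count, only the wall-clock time does.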