org.apache.spark.ml.PipelineStage

org.apache.spark.ml.Transformer

org.apache.spark.ml.UnaryTransformer<String,scala.collection.immutable.Seq<String>,Tokenizer>

org.apache.spark.ml.feature.Tokenizer

All Implemented Interfaces: Serializable, org.apache.spark.internal.Logging, Params, HasInputCol, HasOutputCol, DefaultParamsWritable, Identifiable, MLWritable

public class Tokenizer extends UnaryTransformer<String,scala.collection.immutable.Seq<String>,Tokenizer> implements DefaultParamsWritable

A tokenizer that converts the input string to lowercase and then splits it on whitespace.
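
The behavior described above can be illustrated without a Spark session. The following is a minimal plain-Java sketch, not Spark's implementation: the class `TokenizerSketch` and its `tokenize` method are hypothetical names, and the sketch assumes that each single whitespace character acts as a delimiter (regex `\\s`) and that lowercasing uses `Locale.ROOT` for reproducible results.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Hypothetical helper, not part of Spark: mirrors the documented behavior
// (lowercase first, then split on whitespace) on a plain Java String.
class TokenizerSketch {

    // Assumption: every single whitespace character is a delimiter ("\\s"),
    // so two consecutive spaces yield an empty token between them.
    static List<String> tokenize(String input) {
        String lowered = input.toLowerCase(Locale.ROOT);
        return Arrays.asList(lowered.split("\\s"));
    }

    public static void main(String[] args) {
        // prints [hi, i, heard, about, spark]
        System.out.println(tokenize("Hi I heard about Spark"));
    }
}
```

In a Spark job, the same transformation is applied to a DataFrame column by configuring the inherited `setInputCol` and `setOutputCol` methods and then calling `transform`, all of which are listed in the method summary on this page.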

Nested Class Summary

Nested classes/interfaces inherited from interface org.apache.spark.internal.Logging
org.apache.spark.internal.Logging.LogStringContext, org.apache.spark.internal.Logging.SparkShellLoggingFilter

Constructor Summary

Constructors
- Tokenizer()
- Tokenizer(String uid)

Method Summary

- Tokenizer copy(ParamMap extra): Creates a copy of this instance with the same UID and some extra params.
- static Tokenizer load(String path)
- static MLReader<T> read()
- String uid(): An immutable unique ID for the object and its derivatives.

Methods inherited from class org.apache.spark.ml.UnaryTransformer
inputCol, outputCol, setInputCol, setOutputCol, transform, transformSchema

Methods inherited from class org.apache.spark.ml.Transformer
transform, transform, transform

Methods inherited from class org.apache.spark.ml.PipelineStage
params

Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

Methods inherited from interface org.apache.spark.ml.util.DefaultParamsWritable
write

Methods inherited from interface org.apache.spark.ml.param.shared.HasInputCol
getInputCol

Methods inherited from interface org.apache.spark.ml.param.shared.HasOutputCol
getOutputCol

Methods inherited from interface org.apache.spark.ml.util.Identifiable
toString

Methods inherited from interface org.apache.spark.internal.Logging
initializeForcefully, initializeLogIfNecessary, initializeLogIfNecessary, initializeLogIfNecessary$default$2, isTraceEnabled, log, logDebug, logDebug, logDebug, logDebug, logError, logError, logError, logError, logInfo, logInfo, logInfo, logInfo, logName, LogStringContext, logTrace, logTrace, logTrace, logTrace, logWarning, logWarning, logWarning, logWarning, org$apache$spark$internal$Logging$$log_, org$apache$spark$internal$Logging$$log__$eq, withLogContext

Methods inherited from interface org.apache.spark.ml.util.MLWritable
save

Methods inherited from interface org.apache.spark.ml.param.Params
clear, copyValues, defaultCopy, defaultParamMap, explainParam, explainParams, extractParamMap, extractParamMap, get, getDefault, getOrDefault, getParam, hasDefault, hasParam, isDefined, isSet, onParamChange, paramMap, params, set, set, set, setDefault, setDefault, shouldOwn

Constructor Details
- Tokenizer
  
  public Tokenizer(String uid)
- Tokenizer
  
  public Tokenizer()
Method Details
- load
  
  public static Tokenizer load(String path)
- read
  
  public static MLReader<T> read()
- uid
  
  public String uid()
  
  Description copied from interface: Identifiable
  
  An immutable unique ID for the object and its derivatives.
  
  Specified by:
  
  uid in interface Identifiable
  
  Returns:
  
  (undocumented)
- copy
  
  public Tokenizer copy(ParamMap extra)
  
  Description copied from interface: Params
  
  Creates a copy of this instance with the same UID and some extra params. Subclasses should implement this method and set the return type properly. See defaultCopy().
  
  Specified by:
  
  copy in interface Params
  
  Overrides:
  
  copy in class UnaryTransformer<String,scala.collection.immutable.Seq<String>,Tokenizer>
  
  Parameters:
  
  extra - (undocumented)
  
  Returns:
  
  (undocumented)
