Package org.apache.spark.ml.feature
Class NGram
Object
org.apache.spark.ml.PipelineStage
org.apache.spark.ml.Transformer
org.apache.spark.ml.UnaryTransformer<scala.collection.Seq<String>,scala.collection.Seq<String>,NGram>
org.apache.spark.ml.feature.NGram
- All Implemented Interfaces:
Serializable, org.apache.spark.internal.Logging, Params, HasInputCol, HasOutputCol, DefaultParamsWritable, Identifiable, MLWritable, scala.Serializable
public class NGram
extends UnaryTransformer<scala.collection.Seq<String>,scala.collection.Seq<String>,NGram>
implements DefaultParamsWritable
A feature transformer that converts the input array of strings into an array of n-grams. Null
values in the input array are ignored.
It returns an array of n-grams where each n-gram is represented by a space-separated string of
words.
When the input is empty, an empty array is returned. When the input array length is less than n (number of elements per n-gram), no n-grams are returned.
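The behavior described above can be sketched in plain Java as a sliding window over the non-null tokens. This is only an illustration of the documented semantics, not the Spark implementation: the class name `NGramSketch` and the method `ngrams` are hypothetical and are not part of the Spark API. Skipping nulls and rejecting n < 1 are assumptions taken from this page's description of the transformer and of the `n` parameter.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of Spark: mirrors the documented NGram semantics.
final class NGramSketch {
    private NGramSketch() {}

    // Returns every run of n consecutive non-null tokens, joined by single spaces.
    static List<String> ngrams(List<String> tokens, int n) {
        if (n < 1) {
            throw new IllegalArgumentException("n must be >= 1, got " + n);
        }
        // Assumption from the class description: null values are ignored.
        List<String> words = new ArrayList<>();
        for (String token : tokens) {
            if (token != null) {
                words.add(token);
            }
        }
        // Slide a window of width n; fewer than n words yields an empty result.
        List<String> result = new ArrayList<>();
        for (int start = 0; start + n <= words.size(); start++) {
            result.add(String.join(" ", words.subList(start, start + n)));
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> tokens = Arrays.asList("Hi", "I", "heard", "about", "Spark");
        // prints [Hi I, I heard, heard about, about Spark]
        System.out.println(ngrams(tokens, 2));
        // prints [] because the input has fewer than 3 elements
        System.out.println(ngrams(Arrays.asList("too", "short"), 3));
    }
}
```

With n = 1 the sketch returns the non-null tokens unchanged, which matches the lower bound stated for the `n` parameter.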
-
Nested Class Summary
Nested classes/interfaces inherited from interface org.apache.spark.internal.Logging
org.apache.spark.internal.Logging.SparkShellLoggingFilter
-
Method Summary
Methods inherited from class org.apache.spark.ml.UnaryTransformer
copy, inputCol, outputCol, setInputCol, setOutputCol, transform, transformSchema
Methods inherited from class org.apache.spark.ml.Transformer
transform, transform, transform
Methods inherited from class org.apache.spark.ml.PipelineStage
params
Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, wait, wait, wait
Methods inherited from interface org.apache.spark.ml.util.DefaultParamsWritable
write
Methods inherited from interface org.apache.spark.ml.param.shared.HasInputCol
getInputCol
Methods inherited from interface org.apache.spark.ml.param.shared.HasOutputCol
getOutputCol
Methods inherited from interface org.apache.spark.internal.Logging
initializeForcefully, initializeLogIfNecessary, initializeLogIfNecessary, initializeLogIfNecessary$default$2, isTraceEnabled, log, logDebug, logDebug, logError, logError, logInfo, logInfo, logName, logTrace, logTrace, logWarning, logWarning, org$apache$spark$internal$Logging$$log_, org$apache$spark$internal$Logging$$log__$eq
Methods inherited from interface org.apache.spark.ml.util.MLWritable
save
Methods inherited from interface org.apache.spark.ml.param.Params
clear, copyValues, defaultCopy, defaultParamMap, explainParam, explainParams, extractParamMap, extractParamMap, get, getDefault, getOrDefault, getParam, hasDefault, hasParam, isDefined, isSet, onParamChange, paramMap, params, set, set, set, setDefault, setDefault, shouldOwn
-
Constructor Details
-
NGram
public NGram(String uid)
-
NGram
public NGram()
-
-
Method Details
-
load
public static NGram load(String path)
-
read
public static MLReader<NGram> read()
-
uid
public String uid()
Description copied from interface: Identifiable
An immutable unique ID for the object and its derivatives.
- Specified by:
uid in interface Identifiable
- Returns:
- (undocumented)
-
n
public IntParam n()
The n-gram length (number of elements per n-gram), which must be greater than or equal to 1. Default: 2 (bigram features).
- Returns:
- (undocumented)
-
setN
public NGram setN(int value)
-
getN
public int getN()
-
toString
public String toString()
- Specified by:
toString in interface Identifiable
- Overrides:
toString in class Object
-