Package org.apache.spark.mllib.fpm
Class FPGrowth
Object
org.apache.spark.mllib.fpm.FPGrowth
All Implemented Interfaces:
Serializable, org.apache.spark.internal.Logging
A parallel FP-growth algorithm to mine frequent itemsets. The algorithm is described in
Li et al., PFP: Parallel FP-Growth for Query
Recommendation. PFP distributes computation in such a way that each worker executes an
independent group of mining tasks. The FP-Growth algorithm is described in
Han et al., Mining frequent patterns without
candidate generation.
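As a rough illustration of how PFP can hand each worker an independent group of mining tasks, the Java sketch below splits one transaction into per-group sub-transactions. It is a simplified reading of the grouping step in the PFP paper, not Spark's internal code: the class PfpGroupingSketch, the method conditionalTransactions, the representation of items as ranks (0 = most frequent), and the modulo group assignment are all assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class PfpGroupingSketch {
    // Hypothetical helper (not part of Spark's API).
    // `ranked` holds one transaction's frequent items as ranks, sorted ascending
    // (0 = most frequent item). Returns, per group, the sub-transaction that the
    // worker responsible for that group would need.
    static Map<Integer, List<Integer>> conditionalTransactions(List<Integer> ranked, int numGroups) {
        Map<Integer, List<Integer>> byGroup = new TreeMap<>();
        // Walk from the least frequent item backwards so that each group keeps
        // the longest prefix ending in one of its own items.
        for (int i = ranked.size() - 1; i >= 0; i--) {
            int group = ranked.get(i) % numGroups; // assumed group assignment
            if (!byGroup.containsKey(group)) {
                byGroup.put(group, new ArrayList<>(ranked.subList(0, i + 1)));
            }
        }
        return byGroup;
    }

    public static void main(String[] args) {
        System.out.println(conditionalTransactions(Arrays.asList(0, 1, 3, 4), 2));
    }
}
```

For ranks [0, 1, 3, 4] and two groups this prints {0=[0, 1, 3, 4], 1=[0, 1, 3]}: the worker for group 0 receives the full transaction and the worker for group 1 receives the prefix ending at its last item, so each group can be mined without consulting the others.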
param: minSupport the minimal support level of a frequent pattern; any pattern that appears more than (minSupport * size-of-the-dataset) times will be output
param: numPartitions number of partitions used by parallel FP-growth
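To make the minSupport threshold concrete, here is a small Java sketch that enumerates every candidate itemset by brute force and keeps those whose count exceeds minSupport * size-of-the-dataset. It is deliberately not FP-growth and does not use Spark; the class MinSupportSketch and the method frequentItemsets are hypothetical names. The comparison follows the wording above literally, so behaviour when a count lands exactly on the threshold is an assumption and may differ from the library.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

class MinSupportSketch {
    // Hypothetical brute-force miner (NOT FP-growth); only practical for a handful of items.
    static Map<Set<String>, Integer> frequentItemsets(List<List<String>> transactions, double minSupport) {
        // Threshold taken literally from the description: count > minSupport * dataset size.
        double threshold = minSupport * transactions.size();

        // Distinct items in sorted order, so candidates are generated deterministically.
        TreeSet<String> distinct = new TreeSet<>();
        for (List<String> t : transactions) {
            distinct.addAll(t);
        }
        List<String> items = new ArrayList<>(distinct);

        Map<Set<String>, Integer> frequent = new LinkedHashMap<>();
        // Each bitmask selects one non-empty candidate itemset.
        for (int mask = 1; mask < (1 << items.size()); mask++) {
            Set<String> candidate = new TreeSet<>();
            for (int i = 0; i < items.size(); i++) {
                if ((mask & (1 << i)) != 0) {
                    candidate.add(items.get(i));
                }
            }
            // Count the transactions that contain every item of the candidate.
            int count = 0;
            for (List<String> t : transactions) {
                if (t.containsAll(candidate)) {
                    count++;
                }
            }
            if (count > threshold) {
                frequent.put(candidate, count);
            }
        }
        return frequent;
    }

    public static void main(String[] args) {
        List<List<String>> transactions = Arrays.asList(
            Arrays.asList("a", "b", "c"),
            Arrays.asList("a", "b"),
            Arrays.asList("a", "b", "d"),
            Arrays.asList("a", "e"),
            Arrays.asList("b", "e"));
        // 5 transactions * 0.5 = 2.5, so an itemset needs at least 3 occurrences.
        System.out.println(frequentItemsets(transactions, 0.5));
    }
}
```

With five transactions and minSupport 0.5 the threshold is 2.5, and the sketch prints {[a]=4, [b]=4, [a, b]=3}; item e appears only twice and is dropped.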
-
Nested Class Summary
Nested classes/interfaces inherited from interface org.apache.spark.internal.Logging
org.apache.spark.internal.Logging.LogStringContext, org.apache.spark.internal.Logging.SparkShellLoggingFilter
-
Constructor Summary
Constructor | Description
FPGrowth() | Constructs an instance with default parameters {minSupport: 0.3, numPartitions: same as the input data}.
-
Method Summary
Modifier and Type | Method | Description
<Item, Basket extends Iterable<Item>> FPGrowthModel<Item> | run | Java-friendly version of run.
<Item> FPGrowthModel<Item> | run | Computes an FP-Growth model that contains frequent itemsets.
FPGrowth | setMinSupport(double minSupport) | Sets the minimal support level (default: 0.3).
FPGrowth | setNumPartitions(int numPartitions) | Sets the number of partitions used by parallel FP-growth (default: same as input data).
Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Methods inherited from interface org.apache.spark.internal.Logging
initializeForcefully, initializeLogIfNecessary, initializeLogIfNecessary, initializeLogIfNecessary$default$2, isTraceEnabled, log, logDebug, logDebug, logDebug, logDebug, logError, logError, logError, logError, logInfo, logInfo, logInfo, logInfo, logName, LogStringContext, logTrace, logTrace, logTrace, logTrace, logWarning, logWarning, logWarning, logWarning, org$apache$spark$internal$Logging$$log_, org$apache$spark$internal$Logging$$log__$eq, withLogContext
-
Constructor Details
-
FPGrowth
public FPGrowth()
Constructs an instance with default parameters {minSupport: 0.3, numPartitions: same as the input data}.
-
Method Details
-
setMinSupport
Sets the minimal support level (default: 0.3).
Parameters:
minSupport - the minimal support level of a frequent pattern
Returns:
this FPGrowth instance
-
setNumPartitions
Sets the number of partitions used by parallel FP-growth (default: same as input data).
Parameters:
numPartitions - number of partitions used by parallel FP-growth
Returns:
this FPGrowth instance
-
run
Computes an FP-Growth model that contains frequent itemsets.
Parameters:
data - input data set, each element contains a transaction
evidence$4 - (undocumented)
Returns:
an FPGrowthModel
-
run
Java-friendly version of run.
Parameters:
data - input data set, each element contains a transaction
Returns:
an FPGrowthModel
-