Package org.apache.spark.mllib.fpm
Class FPGrowth
Object
org.apache.spark.mllib.fpm.FPGrowth
All Implemented Interfaces:
Serializable, org.apache.spark.internal.Logging
A parallel FP-growth algorithm to mine frequent itemsets. The algorithm is described in
Li et al., PFP: Parallel FP-Growth for Query
Recommendation. PFP distributes computation in such a way that each worker executes an
independent group of mining tasks. The FP-Growth algorithm is described in
Han et al., Mining frequent patterns without
candidate generation.
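The idea that each worker mines an independent group can be illustrated with a small plain-Java sketch. It shows one way a single transaction might be split into group-dependent prefixes so that each group can be mined without looking at the others. The class name GroupedTransactionsSketch, the method conditionalTransactions, the itemToRank map, and the rank-modulo group assignment are all assumptions made for this illustration; they are not part of the FPGrowth API and are not claimed to match Spark's internal implementation.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch only: hypothetical names, not Spark's implementation.
class GroupedTransactionsSketch {

    // Splits one transaction into at most one prefix per group.
    // itemToRank maps each frequent item to its rank (0 = most frequent);
    // items missing from the map are treated as infrequent and dropped.
    // Assumption: an item with rank r belongs to group (r % numGroups).
    static Map<Integer, List<Integer>> conditionalTransactions(
            List<String> transaction, Map<String, Integer> itemToRank, int numGroups) {
        List<Integer> ranks = new ArrayList<>();
        for (String item : transaction) {
            Integer rank = itemToRank.get(item);
            if (rank != null) {
                ranks.add(rank);
            }
        }
        // Order the surviving items from most to least frequent.
        Collections.sort(ranks);

        Map<Integer, List<Integer>> output = new LinkedHashMap<>();
        // Walk from the least frequent item backwards; the first time a group
        // is seen, emit the prefix that ends at that item for this group.
        for (int i = ranks.size() - 1; i >= 0; i--) {
            int group = ranks.get(i) % numGroups;
            if (!output.containsKey(group)) {
                output.put(group, new ArrayList<>(ranks.subList(0, i + 1)));
            }
        }
        return output;
    }

    public static void main(String[] args) {
        // Hypothetical ranks for four frequent items.
        Map<String, Integer> itemToRank = new HashMap<>();
        itemToRank.put("a", 0);
        itemToRank.put("b", 1);
        itemToRank.put("c", 2);
        itemToRank.put("d", 3);

        // "x" is not in the rank map, so it is dropped as infrequent.
        List<String> transaction = Arrays.asList("c", "a", "x", "d");
        System.out.println(conditionalTransactions(transaction, itemToRank, 2));
    }
}
```

Each emitted prefix contains only the items ranked at or before the last item of its group, so a worker that receives every prefix for its group can mine that group's patterns on its own.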
param: minSupport the minimal support level of the frequent pattern; any pattern that appears more than (minSupport * size-of-the-dataset) times will be output
param: numPartitions number of partitions used by parallel FP-growth
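To make the minSupport rule concrete, the following plain-Java sketch counts how many transactions contain a candidate itemset and compares that count with minSupport * size-of-the-dataset. It is a brute-force illustration, not the FP-growth algorithm and not Spark code. The class MinSupportSketch, its methods, and the sample data are hypothetical, and the strict "more than" comparison simply follows the wording of the parameter description; how the library treats a count exactly equal to the threshold is not asserted here.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Illustrative sketch only: a brute-force support check with hypothetical names.
class MinSupportSketch {

    // Counts the transactions that contain every item of the candidate itemset.
    static long supportCount(List<List<String>> transactions, Set<String> itemset) {
        long count = 0;
        for (List<String> transaction : transactions) {
            if (new HashSet<>(transaction).containsAll(itemset)) {
                count++;
            }
        }
        return count;
    }

    // Applies the rule as worded in the parameter description:
    // frequent if count > minSupport * size-of-the-dataset.
    // Assumption: the comparison is strict, following that wording.
    static boolean isFrequent(
            List<List<String>> transactions, Set<String> itemset, double minSupport) {
        return supportCount(transactions, itemset) > minSupport * transactions.size();
    }

    // Hypothetical sample data: five transactions over the items a, b, c, d.
    static List<List<String>> sampleTransactions() {
        return Arrays.asList(
            Arrays.asList("a", "b", "c"),
            Arrays.asList("a", "b"),
            Arrays.asList("a", "c"),
            Arrays.asList("b", "d"),
            Arrays.asList("a"));
    }

    public static void main(String[] args) {
        List<List<String>> transactions = sampleTransactions();
        double minSupport = 0.3; // the documented default
        Set<String> ab = new HashSet<>(Arrays.asList("a", "b"));
        Set<String> d = new HashSet<>(Arrays.asList("d"));
        System.out.println("{a, b} frequent: " + isFrequent(transactions, ab, minSupport));
        System.out.println("{d} frequent: " + isFrequent(transactions, d, minSupport));
    }
}
```

With five transactions and minSupport set to 0.3, the threshold is 1.5, so an itemset needs to appear in at least two transactions to pass this check.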
Nested Class Summary
Nested classes/interfaces inherited from interface org.apache.spark.internal.Logging
org.apache.spark.internal.Logging.LogStringContext, org.apache.spark.internal.Logging.SparkShellLoggingFilter
Constructor Summary
FPGrowth()
    Constructs a default instance with default parameters {minSupport: 0.3, numPartitions: same as the input data}.
Method Summary
<Item,Basket extends Iterable<Item>> FPGrowthModel<Item> run
    Java-friendly version of run.
<Item> FPGrowthModel<Item> run
    Computes an FP-Growth model that contains frequent itemsets.
setMinSupport(double minSupport)
    Sets the minimal support level (default: 0.3).
setNumPartitions(int numPartitions)
    Sets the number of partitions used by parallel FP-growth (default: same as input data).
Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Methods inherited from interface org.apache.spark.internal.Logging
initializeForcefully, initializeLogIfNecessary, initializeLogIfNecessary, initializeLogIfNecessary$default$2, isTraceEnabled, log, logDebug, logDebug, logDebug, logDebug, logError, logError, logError, logError, logInfo, logInfo, logInfo, logInfo, logName, LogStringContext, logTrace, logTrace, logTrace, logTrace, logWarning, logWarning, logWarning, logWarning, org$apache$spark$internal$Logging$$log_, org$apache$spark$internal$Logging$$log__$eq, withLogContext
Constructor Details
FPGrowth
public FPGrowth()
Constructs a default instance with default parameters {minSupport: 0.3, numPartitions: same as the input data}.
Method Details
setMinSupport
Sets the minimal support level (default: 0.3).
Parameters:
minSupport - the minimal support level of the frequent pattern
Returns:
(undocumented)
setNumPartitions
Sets the number of partitions used by parallel FP-growth (default: same as input data).
Parameters:
numPartitions - number of partitions used by parallel FP-growth
Returns:
(undocumented)
run
Computes an FP-Growth model that contains frequent itemsets.
Parameters:
data - input data set, each element contains a transaction
evidence$4 - (undocumented)
Returns:
an FPGrowthModel
run
Java-friendly version of run.
Parameters:
data - input data set, each element contains a transaction
Returns:
an FPGrowthModel