
A

Accumulable<R,T> - Class in org.apache.spark
A data type that can be accumulated, i.e., one that has a commutative and associative "add" operation, but where the result type, R, may be different from the element type being added, T.
Accumulable(R, AccumulableParam<R, T>) - Constructor for class org.apache.spark.Accumulable
 
accumulable(T, AccumulableParam<T, R>) - Method in class org.apache.spark.api.java.JavaSparkContext
Create an Accumulable shared variable of the given type, to which tasks can "add" values with add.
accumulable(T, AccumulableParam<T, R>) - Method in class org.apache.spark.SparkContext
Create an Accumulable shared variable, to which tasks can add values with +=.
accumulableCollection(R, Function1<R, Growable<T>>, ClassTag<R>) - Method in class org.apache.spark.SparkContext
Create an accumulator from a "mutable collection" type.
AccumulableParam<R,T> - Interface in org.apache.spark
Helper object defining how to accumulate values of a particular type.
Accumulator<T> - Class in org.apache.spark
A simpler value of Accumulable where the result type being accumulated is the same as the type of the elements being merged.
Accumulator(T, AccumulatorParam<T>) - Constructor for class org.apache.spark.Accumulator
 
accumulator(int) - Method in class org.apache.spark.api.java.JavaSparkContext
Create an Accumulator integer variable, which tasks can "add" values to using the add method.
accumulator(double) - Method in class org.apache.spark.api.java.JavaSparkContext
Create an Accumulator double variable, which tasks can "add" values to using the add method.
accumulator(T, AccumulatorParam<T>) - Method in class org.apache.spark.api.java.JavaSparkContext
Create an Accumulator variable of a given type, which tasks can "add" values to using the add method.
accumulator(T, AccumulatorParam<T>) - Method in class org.apache.spark.SparkContext
Create an Accumulator variable of a given type, which tasks can "add" values to using the += method.
AccumulatorParam<T> - Interface in org.apache.spark
A simpler version of AccumulableParam where the only data type you can add in is the same type as the accumulated value.
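
For illustration, a minimal Scala sketch of this accumulator pattern, assuming an existing SparkContext named sc (in this API generation the import supplies the implicit IntAccumulatorParam):

    import org.apache.spark.SparkContext._          // implicit AccumulatorParam instances

    val sum = sc.accumulator(0)                     // Accumulator[Int]
    sc.parallelize(1 to 100).foreach(x => sum += x) // tasks "add" with +=
    println(sum.value)                              // 5050, read on the driver only
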
active() - Method in class org.apache.spark.streaming.scheduler.ReceiverInfo
 
activeStages() - Method in class org.apache.spark.ui.jobs.JobProgressListener
 
actor() - Method in class org.apache.spark.streaming.scheduler.ReceiverInfo
 
ActorHelper - Interface in org.apache.spark.streaming.receiver
:: DeveloperApi :: A receiver trait to be mixed in with your Actor to gain access to the API for pushing received data into Spark Streaming for processing.
actorStream(Props, String, StorageLevel, SupervisorStrategy) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
Create an input stream with any arbitrary user implemented actor receiver.
actorStream(Props, String, StorageLevel) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
Create an input stream with any arbitrary user implemented actor receiver.
actorStream(Props, String) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
Create an input stream with any arbitrary user implemented actor receiver.
actorStream(Props, String, StorageLevel, SupervisorStrategy, ClassTag<T>) - Method in class org.apache.spark.streaming.StreamingContext
Create an input stream with any arbitrary user implemented actor receiver.
ActorSupervisorStrategy - Class in org.apache.spark.streaming.receiver
:: DeveloperApi :: A helper with a set of defaults for the supervisor strategy.
ActorSupervisorStrategy() - Constructor for class org.apache.spark.streaming.receiver.ActorSupervisorStrategy
 
actorSystem() - Method in class org.apache.spark.SparkEnv
 
add(T) - Method in class org.apache.spark.Accumulable
Add more data to this accumulator / accumulable
add(Vector) - Method in class org.apache.spark.util.Vector
 
addAccumulator(R, T) - Method in interface org.apache.spark.AccumulableParam
Add additional data to the accumulator value.
addAccumulator(T, T) - Method in interface org.apache.spark.AccumulatorParam
 
addedFiles() - Method in class org.apache.spark.SparkContext
 
addedJars() - Method in class org.apache.spark.SparkContext
 
addFile(String) - Method in class org.apache.spark.api.java.JavaSparkContext
Add a file to be downloaded with this Spark job on every node.
addFile(String) - Method in class org.apache.spark.SparkContext
Add a file to be downloaded with this Spark job on every node.
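
A short sketch of the addFile pattern, assuming an existing SparkContext sc; the path is a placeholder. Workers resolve the shipped file with SparkFiles.get:

    import org.apache.spark.SparkFiles

    sc.addFile("/path/to/lookup.txt")               // placeholder path, shipped to every node
    val localPaths = sc.parallelize(1 to 4)
      .map(_ => SparkFiles.get("lookup.txt"))       // absolute path of the copy on each worker
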
addInPlace(R, R) - Method in interface org.apache.spark.AccumulableParam
Merge two accumulated values together.
addInPlace(double, double) - Method in class org.apache.spark.SparkContext.DoubleAccumulatorParam$
 
addInPlace(float, float) - Method in class org.apache.spark.SparkContext.FloatAccumulatorParam$
 
addInPlace(int, int) - Method in class org.apache.spark.SparkContext.IntAccumulatorParam$
 
addInPlace(long, long) - Method in class org.apache.spark.SparkContext.LongAccumulatorParam$
 
addInPlace(Vector) - Method in class org.apache.spark.util.Vector
 
addInPlace(Vector, Vector) - Method in class org.apache.spark.util.Vector.VectorAccumParam$
 
addJar(String) - Method in class org.apache.spark.api.java.JavaSparkContext
Adds a JAR dependency for all tasks to be executed on this SparkContext in the future.
addJar(String) - Method in class org.apache.spark.SparkContext
Adds a JAR dependency for all tasks to be executed on this SparkContext in the future.
addLocalConfiguration(String, int, int, int, JobConf) - Static method in class org.apache.spark.rdd.HadoopRDD
Add Hadoop configuration specific to a single partition and attempt.
addOnCompleteCallback(Function0<BoxedUnit>) - Method in class org.apache.spark.TaskContext
Add a callback function to be executed on task completion.
addSparkListener(SparkListener) - Method in class org.apache.spark.SparkContext
:: DeveloperApi :: Register a listener to receive up-calls from events that happen during execution.
addStreamingListener(StreamingListener) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
Add a StreamingListener object for receiving system events related to streaming.
addStreamingListener(StreamingListener) - Method in class org.apache.spark.streaming.StreamingContext
Add a StreamingListener object for receiving system events related to streaming.
aggregate(U, Function2<U, T, U>, Function2<U, U, U>) - Method in interface org.apache.spark.api.java.JavaRDDLike
Aggregate the elements of each partition, and then the results for all the partitions, using given combine functions and a neutral "zero value".
aggregate(U, Function2<U, T, U>, Function2<U, U, U>, ClassTag<U>) - Method in class org.apache.spark.rdd.RDD
Aggregate the elements of each partition, and then the results for all the partitions, using given combine functions and a neutral "zero value".
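
For example, aggregate can compute a sum and a count in one pass (a sketch, assuming an existing SparkContext sc); the zero value, the per-partition fold, and the cross-partition merge are the three arguments described above:

    val rdd = sc.parallelize(1 to 100)
    val (sum, count) = rdd.aggregate((0, 0))(
      (acc, x) => (acc._1 + x, acc._2 + 1),         // fold a value into one partition's accumulator
      (a, b)   => (a._1 + b._1, a._2 + b._2))       // merge accumulators across partitions
    val mean = sum.toDouble / count                 // 50.5
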
Aggregate - Class in org.apache.spark.sql.execution
:: DeveloperApi :: Groups input data by groupingExpressions and computes the aggregateExpressions for each group.
Aggregate(boolean, Seq<Expression>, Seq<NamedExpression>, SparkPlan, SQLContext) - Constructor for class org.apache.spark.sql.execution.Aggregate
 
aggregate() - Method in class org.apache.spark.sql.execution.Aggregate.ComputedAggregate
 
aggregate(Seq<Expression>) - Method in class org.apache.spark.sql.SchemaRDD
Performs an aggregation over all Rows in this RDD.
Aggregate.ComputedAggregate - Class in org.apache.spark.sql.execution
An aggregate that needs to be computed for each row in a group.
Aggregate.ComputedAggregate(AggregateExpression, AggregateExpression, AttributeReference) - Constructor for class org.apache.spark.sql.execution.Aggregate.ComputedAggregate
 
Aggregate.ComputedAggregate$ - Class in org.apache.spark.sql.execution
 
Aggregate.ComputedAggregate$() - Constructor for class org.apache.spark.sql.execution.Aggregate.ComputedAggregate$
 
aggregateExpressions() - Method in class org.apache.spark.sql.execution.Aggregate
 
Aggregator<K,V,C> - Class in org.apache.spark
:: DeveloperApi :: A set of functions used to aggregate data.
Aggregator(Function1<V, C>, Function2<C, V, C>, Function2<C, C, C>) - Constructor for class org.apache.spark.Aggregator
 
Algo - Class in org.apache.spark.mllib.tree.configuration
:: Experimental :: Enum to select the algorithm for the decision tree
Algo() - Constructor for class org.apache.spark.mllib.tree.configuration.Algo
 
algo() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
 
algo() - Method in class org.apache.spark.mllib.tree.model.DecisionTreeModel
 
AlphaComponent - Annotation Type in org.apache.spark.annotation
A new component of Spark which may have unstable APIs.
alreadyPlanned() - Method in class org.apache.spark.sql.execution.SparkLogicalPlan
 
ALS - Class in org.apache.spark.mllib.recommendation
Alternating Least Squares matrix factorization.
ALS() - Constructor for class org.apache.spark.mllib.recommendation.ALS
Constructs an ALS instance with default parameters: {numBlocks: -1, rank: 10, iterations: 10, lambda: 0.01, implicitPrefs: false, alpha: 1.0}.
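
A toy training sketch, assuming an existing SparkContext sc; the user/product ids and ratings are made up:

    import org.apache.spark.mllib.recommendation.{ALS, Rating}

    val ratings = sc.parallelize(Seq(
      Rating(1, 10, 4.0), Rating(1, 20, 1.0), Rating(2, 10, 5.0)))
    val model = ALS.train(ratings, 10, 10, 0.01)    // rank, iterations, lambda
    val predicted = model.predict(2, 20)            // predicted rating for (user 2, product 20)
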
analyzed() - Method in class org.apache.spark.sql.hive.test.TestHiveContext.QueryExecution
 
ANY() - Static method in class org.apache.spark.scheduler.TaskLocality
 
appendBias(Vector) - Static method in class org.apache.spark.mllib.util.MLUtils
Returns a new vector with 1.0 (bias) appended to the input vector.
apply(int) - Method in class org.apache.spark.mllib.linalg.DenseVector
 
apply(int, int) - Method in interface org.apache.spark.mllib.linalg.Matrix
Gets the (i, j)-th element.
apply(int) - Method in interface org.apache.spark.mllib.linalg.Vector
Gets the value of the ith element.
apply(String) - Static method in class org.apache.spark.storage.BlockId
Converts a BlockId "name" String back into a BlockId.
apply(String, String, int, int) - Static method in class org.apache.spark.storage.BlockManagerId
Returns a BlockManagerId for the given configuration.
apply(ObjectInput) - Static method in class org.apache.spark.storage.BlockManagerId
 
apply(boolean, boolean, boolean, boolean, int) - Static method in class org.apache.spark.storage.StorageLevel
:: DeveloperApi :: Create a new StorageLevel object without setting useOffHeap.
apply(boolean, boolean, boolean, int) - Static method in class org.apache.spark.storage.StorageLevel
:: DeveloperApi :: Create a new StorageLevel object.
apply(int, int) - Static method in class org.apache.spark.storage.StorageLevel
:: DeveloperApi :: Create a new StorageLevel object from its integer representation.
apply(ObjectInput) - Static method in class org.apache.spark.storage.StorageLevel
:: DeveloperApi :: Read StorageLevel object from ObjectInput stream.
apply(long) - Static method in class org.apache.spark.streaming.Milliseconds
 
apply(long) - Static method in class org.apache.spark.streaming.Minutes
 
apply(long) - Static method in class org.apache.spark.streaming.Seconds
 
apply(TraversableOnce<Object>) - Static method in class org.apache.spark.util.StatCounter
Build a StatCounter from a list of values.
apply(Seq<Object>) - Static method in class org.apache.spark.util.StatCounter
Build a StatCounter from a list of values passed as variable-length arguments.
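
For example (a sketch using the Scala apply shown above):

    import org.apache.spark.util.StatCounter

    val stats = StatCounter(Seq(1.0, 2.0, 3.0, 4.0))
    println(stats.mean)                             // 2.5
    println(stats.variance)                         // population variance, 1.25
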
apply(int) - Method in class org.apache.spark.util.Vector
 
applySchema(JavaRDD<?>, Class<?>) - Method in class org.apache.spark.sql.api.java.JavaSQLContext
Applies a schema to an RDD of Java Beans.
appName() - Method in class org.apache.spark.api.java.JavaSparkContext
 
appName() - Method in class org.apache.spark.scheduler.SparkListenerApplicationStart
 
appName() - Method in class org.apache.spark.SparkContext
 
ApproxHist() - Static method in class org.apache.spark.mllib.tree.configuration.QuantileStrategy
 
areaUnderPR() - Method in class org.apache.spark.mllib.evaluation.BinaryClassificationMetrics
Computes the area under the precision-recall curve.
areaUnderROC() - Method in class org.apache.spark.mllib.evaluation.BinaryClassificationMetrics
Computes the area under the receiver operating characteristic (ROC) curve.
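
A minimal sketch, assuming an existing SparkContext sc and an RDD of (score, label) pairs with labels in {0.0, 1.0}; the numbers are made up:

    import org.apache.spark.mllib.evaluation.BinaryClassificationMetrics

    val scoreAndLabels = sc.parallelize(Seq(
      (0.9, 1.0), (0.6, 1.0), (0.4, 0.0), (0.1, 0.0)))
    val metrics = new BinaryClassificationMetrics(scoreAndLabels)
    println(metrics.areaUnderROC())                 // 1.0: this toy data is perfectly separated
    println(metrics.areaUnderPR())
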
as(Symbol) - Method in class org.apache.spark.sql.SchemaRDD
Applies a qualifier to the attributes of this relation.
asIterator() - Method in interface org.apache.spark.serializer.DeserializationStream
Read the elements of this stream through an iterator.
asRDDId() - Method in class org.apache.spark.storage.BlockId
 
AsyncRDDActions<T> - Class in org.apache.spark.rdd
:: Experimental :: A set of asynchronous RDD actions available through an implicit conversion.
AsyncRDDActions(RDD<T>, ClassTag<T>) - Constructor for class org.apache.spark.rdd.AsyncRDDActions
 
attemptId() - Method in class org.apache.spark.TaskContext
 
attributes() - Method in class org.apache.spark.sql.hive.execution.HiveTableScan
 
awaitTermination() - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
Wait for the execution to stop.
awaitTermination(long) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
Wait for the execution to stop.
awaitTermination() - Method in class org.apache.spark.streaming.StreamingContext
Wait for the execution to stop.
awaitTermination(long) - Method in class org.apache.spark.streaming.StreamingContext
Wait for the execution to stop.
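
A typical driver skeleton ends with awaitTermination, which blocks until stop() is called or an error occurs (a sketch; the master, app name, host, and port are placeholders):

    import org.apache.spark.SparkConf
    import org.apache.spark.streaming.{Seconds, StreamingContext}

    val conf = new SparkConf().setMaster("local[2]").setAppName("demo")
    val ssc = new StreamingContext(conf, Seconds(1))
    ssc.socketTextStream("localhost", 9999).count().print()
    ssc.start()
    ssc.awaitTermination()                          // block the driver until shutdown
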

B

baseLogicalPlan() - Method in class org.apache.spark.sql.api.java.JavaSchemaRDD
 
baseLogicalPlan() - Method in class org.apache.spark.sql.SchemaRDD
 
baseSchemaRDD() - Method in class org.apache.spark.sql.api.java.JavaSchemaRDD
 
baseSchemaRDD() - Method in class org.apache.spark.sql.SchemaRDD
 
BatchInfo - Class in org.apache.spark.streaming.scheduler
:: DeveloperApi :: Class having information on completed batches.
BatchInfo(Time, Map<Object, ReceivedBlockInfo[]>, long, Option<Object>, Option<Object>) - Constructor for class org.apache.spark.streaming.scheduler.BatchInfo
 
batchInfo() - Method in class org.apache.spark.streaming.scheduler.StreamingListenerBatchCompleted
 
batchInfo() - Method in class org.apache.spark.streaming.scheduler.StreamingListenerBatchStarted
 
batchInfo() - Method in class org.apache.spark.streaming.scheduler.StreamingListenerBatchSubmitted
 
batchInfos() - Method in class org.apache.spark.streaming.scheduler.StatsReportListener
 
batchTime() - Method in class org.apache.spark.streaming.scheduler.BatchInfo
 
BernoulliSampler<T> - Class in org.apache.spark.util.random
:: DeveloperApi :: A sampler based on Bernoulli trials.
BernoulliSampler(double, double, boolean, Random) - Constructor for class org.apache.spark.util.random.BernoulliSampler
 
BernoulliSampler(double, Random) - Constructor for class org.apache.spark.util.random.BernoulliSampler
 
BinaryClassificationMetrics - Class in org.apache.spark.mllib.evaluation
:: Experimental :: Evaluator for binary classification.
BinaryClassificationMetrics(RDD<Tuple2<Object, Object>>) - Constructor for class org.apache.spark.mllib.evaluation.BinaryClassificationMetrics
 
binaryLabelValidator() - Static method in class org.apache.spark.mllib.util.DataValidators
Function to check if labels used for classification are either zero or one.
BlockId - Class in org.apache.spark.storage
:: DeveloperApi :: Identifies a particular Block of data, usually associated with a single file.
BlockId() - Constructor for class org.apache.spark.storage.BlockId
 
blockManager() - Method in class org.apache.spark.SparkEnv
 
blockManagerId() - Method in class org.apache.spark.scheduler.SparkListenerBlockManagerAdded
 
blockManagerId() - Method in class org.apache.spark.scheduler.SparkListenerBlockManagerRemoved
 
BlockManagerId - Class in org.apache.spark.storage
:: DeveloperApi :: This class represents a unique identifier for a BlockManager.
blockManagerId() - Method in class org.apache.spark.storage.StorageStatus
 
blockManagerIdCache() - Static method in class org.apache.spark.storage.BlockManagerId
 
blockManagerIds() - Method in class org.apache.spark.ui.jobs.JobProgressListener
 
blocks() - Method in class org.apache.spark.storage.StorageStatus
 
BlockStatus - Class in org.apache.spark.storage
 
BlockStatus(StorageLevel, long, long, long) - Constructor for class org.apache.spark.storage.BlockStatus
 
bmAddress() - Method in class org.apache.spark.FetchFailed
 
booleanWritableConverter() - Static method in class org.apache.spark.SparkContext
 
boolToBoolWritable(boolean) - Static method in class org.apache.spark.SparkContext
 
boundCondition() - Method in class org.apache.spark.sql.execution.BroadcastNestedLoopJoin
 
boundCondition() - Method in class org.apache.spark.sql.execution.LeftSemiJoinBNL
 
BoundedDouble - Class in org.apache.spark.partial
:: Experimental :: A Double value with error bars and associated confidence.
BoundedDouble(double, double, double, double) - Constructor for class org.apache.spark.partial.BoundedDouble
 
broadcast(T) - Method in class org.apache.spark.api.java.JavaSparkContext
Broadcast a read-only variable to the cluster, returning a Broadcast object for reading it in distributed functions.
Broadcast<T> - Class in org.apache.spark.broadcast
A broadcast variable.
Broadcast(long, ClassTag<T>) - Constructor for class org.apache.spark.broadcast.Broadcast
 
broadcast(T, ClassTag<T>) - Method in class org.apache.spark.SparkContext
Broadcast a read-only variable to the cluster, returning a Broadcast object for reading it in distributed functions.
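
A sketch of the broadcast pattern, assuming an existing SparkContext sc: ship a read-only table to each node once, instead of once per task inside the closure:

    val lookup = sc.broadcast(Map(1 -> "a", 2 -> "b"))
    val named = sc.parallelize(Seq(1, 2, 1)).map(k => lookup.value(k))
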
broadcast() - Method in class org.apache.spark.sql.execution.BroadcastNestedLoopJoin
 
broadcast() - Method in class org.apache.spark.sql.execution.LeftSemiJoinBNL
 
BROADCAST() - Static method in class org.apache.spark.storage.BlockId
 
BroadcastBlockId - Class in org.apache.spark.storage
 
BroadcastBlockId(long, String) - Constructor for class org.apache.spark.storage.BroadcastBlockId
 
BroadcastFactory - Interface in org.apache.spark.broadcast
:: DeveloperApi :: An interface for all the broadcast implementations in Spark (to allow multiple broadcast implementations).
broadcastId() - Method in class org.apache.spark.storage.BroadcastBlockId
 
broadcastManager() - Method in class org.apache.spark.SparkEnv
 
BroadcastNestedLoopJoin - Class in org.apache.spark.sql.execution
:: DeveloperApi ::
BroadcastNestedLoopJoin(SparkPlan, SparkPlan, JoinType, Option<Expression>, SQLContext) - Constructor for class org.apache.spark.sql.execution.BroadcastNestedLoopJoin
 
build(Node[]) - Method in class org.apache.spark.mllib.tree.model.Node
Build the left and right child nodes if this node is not a leaf.
buildKeys() - Method in class org.apache.spark.sql.execution.HashJoin
 
buildKeys() - Method in class org.apache.spark.sql.execution.LeftSemiJoinHash
 
BuildLeft - Class in org.apache.spark.sql.execution
 
BuildLeft() - Constructor for class org.apache.spark.sql.execution.BuildLeft
 
buildPlan() - Method in class org.apache.spark.sql.execution.HashJoin
 
buildPlan() - Method in class org.apache.spark.sql.execution.LeftSemiJoinHash
 
BuildRight - Class in org.apache.spark.sql.execution
 
BuildRight() - Constructor for class org.apache.spark.sql.execution.BuildRight
 
BuildSide - Class in org.apache.spark.sql.execution
 
BuildSide() - Constructor for class org.apache.spark.sql.execution.BuildSide
 
buildSide() - Method in class org.apache.spark.sql.execution.HashJoin
 
buildSideKeyGenerator() - Method in class org.apache.spark.sql.execution.HashJoin
 
buildSideKeyGenerator() - Method in class org.apache.spark.sql.execution.LeftSemiJoinHash
 
bytesToBytesWritable(byte[]) - Static method in class org.apache.spark.SparkContext
 
bytesWritableConverter() - Static method in class org.apache.spark.SparkContext
 

C

cache() - Method in class org.apache.spark.api.java.JavaDoubleRDD
Persist this RDD with the default storage level (`MEMORY_ONLY`).
cache() - Method in class org.apache.spark.api.java.JavaPairRDD
Persist this RDD with the default storage level (`MEMORY_ONLY`).
cache() - Method in class org.apache.spark.api.java.JavaRDD
Persist this RDD with the default storage level (`MEMORY_ONLY`).
cache() - Method in class org.apache.spark.rdd.RDD
Persist this RDD with the default storage level (`MEMORY_ONLY`).
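
For example (a sketch, assuming an existing SparkContext sc; the input path is a placeholder): cache() is shorthand for persist(StorageLevel.MEMORY_ONLY), and the cache is populated by the first action:

    val lines = sc.textFile("data.txt").cache()     // placeholder path
    lines.count()                                   // first action materializes the cache
    lines.count()                                   // now served from memory
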
cache() - Method in class org.apache.spark.sql.api.java.JavaSchemaRDD
Persist this RDD with the default storage level (`MEMORY_ONLY`).
cache() - Method in class org.apache.spark.streaming.api.java.JavaDStream
Persist RDDs of this DStream with the default storage level (MEMORY_ONLY_SER)
cache() - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
Persist RDDs of this DStream with the default storage level (MEMORY_ONLY_SER)
cache() - Method in class org.apache.spark.streaming.dstream.DStream
Persist RDDs of this DStream with the default storage level (MEMORY_ONLY_SER)
CacheCommand - Class in org.apache.spark.sql.execution
:: DeveloperApi ::
CacheCommand(String, boolean, SQLContext) - Constructor for class org.apache.spark.sql.execution.CacheCommand
 
cacheManager() - Method in class org.apache.spark.SparkEnv
 
cacheTable(String) - Method in class org.apache.spark.sql.SQLContext
Caches the specified table in-memory.
cacheTables() - Method in class org.apache.spark.sql.hive.test.TestHiveContext
 
calculate(double, double) - Static method in class org.apache.spark.mllib.tree.impurity.Entropy
:: DeveloperApi :: Entropy calculation.
calculate(double, double, double) - Static method in class org.apache.spark.mllib.tree.impurity.Entropy
 
calculate(double, double) - Static method in class org.apache.spark.mllib.tree.impurity.Gini
:: DeveloperApi :: Gini coefficient calculation
calculate(double, double, double) - Static method in class org.apache.spark.mllib.tree.impurity.Gini
 
calculate(double, double) - Method in interface org.apache.spark.mllib.tree.impurity.Impurity
:: DeveloperApi :: Information calculation for binary classification.
calculate(double, double, double) - Method in interface org.apache.spark.mllib.tree.impurity.Impurity
:: DeveloperApi :: Information calculation for regression.
calculate(double, double) - Static method in class org.apache.spark.mllib.tree.impurity.Variance
 
calculate(double, double, double) - Static method in class org.apache.spark.mllib.tree.impurity.Variance
:: DeveloperApi :: Variance calculation.
call(T) - Method in interface org.apache.spark.api.java.function.DoubleFlatMapFunction
 
call(T) - Method in interface org.apache.spark.api.java.function.DoubleFunction
 
call(T) - Method in interface org.apache.spark.api.java.function.FlatMapFunction
 
call(T1, T2) - Method in interface org.apache.spark.api.java.function.FlatMapFunction2
 
call(T1) - Method in interface org.apache.spark.api.java.function.Function
 
call(T1, T2) - Method in interface org.apache.spark.api.java.function.Function2
 
call(T1, T2, T3) - Method in interface org.apache.spark.api.java.function.Function3
 
call(T) - Method in interface org.apache.spark.api.java.function.PairFlatMapFunction
 
call(T) - Method in interface org.apache.spark.api.java.function.PairFunction
 
call(T) - Method in interface org.apache.spark.api.java.function.VoidFunction
 
cancel() - Method in class org.apache.spark.ComplexFutureAction
 
cancel() - Method in interface org.apache.spark.FutureAction
Cancels the execution of this action.
cancel() - Method in class org.apache.spark.SimpleFutureAction
 
cancelAllJobs() - Method in class org.apache.spark.api.java.JavaSparkContext
Cancel all jobs that have been scheduled or are running.
cancelAllJobs() - Method in class org.apache.spark.SparkContext
Cancel all jobs that have been scheduled or are running.
cancelJobGroup(String) - Method in class org.apache.spark.api.java.JavaSparkContext
Cancel active jobs for the specified group.
cancelJobGroup(String) - Method in class org.apache.spark.SparkContext
Cancel active jobs for the specified group.
cancelled() - Method in class org.apache.spark.ComplexFutureAction
Returns whether the promise has been cancelled.
canEqual(Object) - Method in class org.apache.spark.util.MutablePair
 
cartesian(JavaRDDLike<U, ?>) - Method in interface org.apache.spark.api.java.JavaRDDLike
Return the Cartesian product of this RDD and another one, that is, the RDD of all pairs of elements (a, b) where a is in this and b is in other.
cartesian(RDD<U>, ClassTag<U>) - Method in class org.apache.spark.rdd.RDD
Return the Cartesian product of this RDD and another one, that is, the RDD of all pairs of elements (a, b) where a is in this and b is in other.
CartesianProduct - Class in org.apache.spark.sql.execution
:: DeveloperApi ::
CartesianProduct(SparkPlan, SparkPlan) - Constructor for class org.apache.spark.sql.execution.CartesianProduct
 
Categorical() - Static method in class org.apache.spark.mllib.tree.configuration.FeatureType
 
categoricalFeaturesInfo() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
 
categories() - Method in class org.apache.spark.mllib.tree.model.Split
 
checkpoint() - Method in interface org.apache.spark.api.java.JavaRDDLike
Mark this RDD for checkpointing.
checkpoint() - Method in class org.apache.spark.rdd.HadoopRDD
 
checkpoint() - Method in class org.apache.spark.rdd.RDD
Mark this RDD for checkpointing.
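
Checkpointing truncates the RDD lineage; a checkpoint directory must be set first, and the data is saved by the first action after the mark (a sketch, assuming an existing SparkContext sc; the directory is a placeholder):

    sc.setCheckpointDir("/tmp/spark-checkpoints")   // placeholder directory
    val rdd = sc.parallelize(1 to 10).map(_ * 2)
    rdd.checkpoint()                                // marked now ...
    rdd.count()                                     // ... written out by the first action
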
checkpoint(Duration) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
Enable periodic checkpointing of RDDs of this DStream.
checkpoint(String) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
Sets the context to periodically checkpoint the DStream operations for master fault-tolerance.
checkpoint(Duration) - Method in class org.apache.spark.streaming.dstream.DStream
Enable periodic checkpointing of RDDs of this DStream
checkpoint(String) - Method in class org.apache.spark.streaming.StreamingContext
Set the context to periodically checkpoint the DStream operations for driver fault-tolerance.
checkpointData() - Method in class org.apache.spark.rdd.RDD
 
checkpointData() - Method in class org.apache.spark.streaming.dstream.DStream
 
checkpointDir() - Method in class org.apache.spark.SparkContext
 
checkpointDir() - Method in class org.apache.spark.streaming.StreamingContext
 
checkpointDuration() - Method in class org.apache.spark.streaming.dstream.DStream
 
checkpointDuration() - Method in class org.apache.spark.streaming.StreamingContext
 
child() - Method in class org.apache.spark.sql.execution.Aggregate
 
child() - Method in class org.apache.spark.sql.execution.DescribeCommand
 
child() - Method in class org.apache.spark.sql.execution.Exchange
 
child() - Method in class org.apache.spark.sql.execution.Filter
 
child() - Method in class org.apache.spark.sql.execution.Generate
 
child() - Method in class org.apache.spark.sql.execution.Limit
 
child() - Method in class org.apache.spark.sql.execution.Project
 
child() - Method in class org.apache.spark.sql.execution.Sample
 
child() - Method in class org.apache.spark.sql.execution.Sort
 
child() - Method in class org.apache.spark.sql.execution.TakeOrdered
 
child() - Method in class org.apache.spark.sql.hive.execution.InsertIntoHiveTable
 
child() - Method in class org.apache.spark.sql.hive.execution.ScriptTransformation
 
child() - Method in class org.apache.spark.sql.parquet.InsertIntoParquetTable
 
children() - Method in class org.apache.spark.sql.execution.SparkLogicalPlan
 
children() - Method in class org.apache.spark.sql.execution.Union
 
Classification() - Static method in class org.apache.spark.mllib.tree.configuration.Algo
 
ClassificationModel - Interface in org.apache.spark.mllib.classification
:: Experimental :: Represents a classification model that predicts to which of a set of categories an example belongs.
className() - Method in class org.apache.spark.ExceptionFailure
 
classpathEntries() - Method in class org.apache.spark.ui.env.EnvironmentListener
 
classTag() - Method in class org.apache.spark.api.java.JavaDoubleRDD
 
classTag() - Method in class org.apache.spark.api.java.JavaPairRDD
 
classTag() - Method in class org.apache.spark.api.java.JavaRDD
 
classTag() - Method in interface org.apache.spark.api.java.JavaRDDLike
 
classTag() - Method in class org.apache.spark.sql.api.java.JavaSchemaRDD
 
classTag() - Method in class org.apache.spark.streaming.api.java.JavaDStream
 
classTag() - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
 
classTag() - Method in class org.apache.spark.streaming.api.java.JavaInputDStream
 
classTag() - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
 
classTag() - Method in class org.apache.spark.streaming.api.java.JavaReceiverInputDStream
 
cleaner() - Method in class org.apache.spark.SparkContext
 
clear() - Method in interface org.apache.spark.sql.SQLConf
 
clearCallSite() - Method in class org.apache.spark.api.java.JavaSparkContext
Pass-through to SparkContext.clearCallSite.
clearCallSite() - Method in class org.apache.spark.SparkContext
Support function for API backtraces.
clearDependencies() - Method in class org.apache.spark.rdd.CoGroupedRDD
 
clearDependencies() - Method in class org.apache.spark.rdd.ShuffledRDD
 
clearFiles() - Method in class org.apache.spark.api.java.JavaSparkContext
Clear the job's list of files added by addFile so that they do not get downloaded to any new nodes.
clearFiles() - Method in class org.apache.spark.SparkContext
Clear the job's list of files added by addFile so that they do not get downloaded to any new nodes.
clearJars() - Method in class org.apache.spark.api.java.JavaSparkContext
Clear the job's list of JARs added by addJar so that they do not get downloaded to any new nodes.
clearJars() - Method in class org.apache.spark.SparkContext
Clear the job's list of JARs added by addJar so that they do not get downloaded to any new nodes.
clearJobGroup() - Method in class org.apache.spark.api.java.JavaSparkContext
Clear the current thread's job group ID and its description.
clearJobGroup() - Method in class org.apache.spark.SparkContext
Clear the current thread's job group ID and its description.
clearThreshold() - Method in class org.apache.spark.mllib.classification.LogisticRegressionModel
:: Experimental :: Clears the threshold so that predict will output raw prediction scores.
clearThreshold() - Method in class org.apache.spark.mllib.classification.SVMModel
:: Experimental :: Clears the threshold so that predict will output raw prediction scores.
clone() - Method in class org.apache.spark.SparkConf
Copy this object
clone() - Method in class org.apache.spark.storage.StorageLevel
 
clone() - Method in class org.apache.spark.util.random.BernoulliSampler
 
clone() - Method in class org.apache.spark.util.random.PoissonSampler
 
clone() - Method in interface org.apache.spark.util.random.RandomSampler
 
cloneComplement() - Method in class org.apache.spark.util.random.BernoulliSampler
Return a sampler that is the complement of the range specified by the current sampler.
close() - Method in interface org.apache.spark.serializer.DeserializationStream
 
close() - Method in interface org.apache.spark.serializer.SerializationStream
 
closureSerializer() - Method in class org.apache.spark.SparkEnv
 
clusterCenters() - Method in class org.apache.spark.mllib.clustering.KMeansModel
 
coalesce(int) - Method in class org.apache.spark.api.java.JavaDoubleRDD
Return a new RDD that is reduced into numPartitions partitions.
coalesce(int, boolean) - Method in class org.apache.spark.api.java.JavaDoubleRDD
Return a new RDD that is reduced into numPartitions partitions.
coalesce(int) - Method in class org.apache.spark.api.java.JavaPairRDD
Return a new RDD that is reduced into numPartitions partitions.
coalesce(int, boolean) - Method in class org.apache.spark.api.java.JavaPairRDD
Return a new RDD that is reduced into numPartitions partitions.
coalesce(int) - Method in class org.apache.spark.api.java.JavaRDD
Return a new RDD that is reduced into numPartitions partitions.
coalesce(int, boolean) - Method in class org.apache.spark.api.java.JavaRDD
Return a new RDD that is reduced into numPartitions partitions.
coalesce(int, boolean, Ordering<T>) - Method in class org.apache.spark.rdd.RDD
Return a new RDD that is reduced into numPartitions partitions.
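
For example (a sketch, assuming an existing SparkContext sc): with shuffle left false, coalesce merges co-located partitions without moving data; passing shuffle = true redistributes it evenly:

    val wide = sc.parallelize(1 to 1000, 100)       // 100 partitions
    val narrow = wide.coalesce(10)                  // no shuffle
    val balanced = wide.coalesce(10, shuffle = true) // shuffles, but balances partition sizes
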
coalesce(int, boolean) - Method in class org.apache.spark.sql.api.java.JavaSchemaRDD
Return a new RDD that is reduced into numPartitions partitions.
coalesce(int, boolean, Ordering<Row>) - Method in class org.apache.spark.sql.SchemaRDD
 
cogroup(JavaPairRDD<K, W>, Partitioner) - Method in class org.apache.spark.api.java.JavaPairRDD
For each key k in this or other, return a resulting RDD that contains a tuple with the list of values for that key in this as well as other.
cogroup(JavaPairRDD<K, W1>, JavaPairRDD<K, W2>, Partitioner) - Method in class org.apache.spark.api.java.JavaPairRDD
For each key k in this or other1 or other2, return a resulting RDD that contains a tuple with the list of values for that key in this, other1 and other2.
cogroup(JavaPairRDD<K, W>) - Method in class org.apache.spark.api.java.JavaPairRDD
For each key k in this or other, return a resulting RDD that contains a tuple with the list of values for that key in this as well as other.
cogroup(JavaPairRDD<K, W1>, JavaPairRDD<K, W2>) - Method in class org.apache.spark.api.java.JavaPairRDD
For each key k in this or other1 or other2, return a resulting RDD that contains a tuple with the list of values for that key in this, other1 and other2.
cogroup(JavaPairRDD<K, W>, int) - Method in class org.apache.spark.api.java.JavaPairRDD
For each key k in this or other, return a resulting RDD that contains a tuple with the list of values for that key in this as well as other.
cogroup(JavaPairRDD<K, W1>, JavaPairRDD<K, W2>, int) - Method in class org.apache.spark.api.java.JavaPairRDD
For each key k in this or other1 or other2, return a resulting RDD that contains a tuple with the list of values for that key in this, other1 and other2.
cogroup(RDD<Tuple2<K, W>>, Partitioner) - Method in class org.apache.spark.rdd.PairRDDFunctions
For each key k in this or other, return a resulting RDD that contains a tuple with the list of values for that key in this as well as other.
cogroup(RDD<Tuple2<K, W1>>, RDD<Tuple2<K, W2>>, Partitioner) - Method in class org.apache.spark.rdd.PairRDDFunctions
 
cogroup(RDD<Tuple2<K, W>>) - Method in class org.apache.spark.rdd.PairRDDFunctions
 
cogroup(RDD<Tuple2<K, W1>>, RDD<Tuple2<K, W2>>) - Method in class org.apache.spark.rdd.PairRDDFunctions
For each key k in this or other1 or other2, return a resulting RDD that contains a tuple with the list of values for that key in this, other1 and other2.
cogroup(RDD<Tuple2<K, W>>, int) - Method in class org.apache.spark.rdd.PairRDDFunctions
For each key k in this or other, return a resulting RDD that contains a tuple with the list of values for that key in this as well as other.
cogroup(RDD<Tuple2<K, W1>>, RDD<Tuple2<K, W2>>, int) - Method in class org.apache.spark.rdd.PairRDDFunctions
For each key k in this or other1 or other2, return a resulting RDD that contains a tuple with the list of values for that key in this, other1 and other2.
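
A sketch of cogroup on two pair RDDs, assuming an existing SparkContext sc (in this API generation the import brings the pair-RDD functions into scope); each key maps to the collection of its values from each side:

    import org.apache.spark.SparkContext._          // implicit conversion to PairRDDFunctions

    val left  = sc.parallelize(Seq((1, "a"), (2, "b")))
    val right = sc.parallelize(Seq((1, "x"), (1, "y")))
    left.cogroup(right).collect()
    // roughly: Array((1, (Seq(a), Seq(x, y))), (2, (Seq(b), Seq())))
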
cogroup(JavaPairDStream<K, W>) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
Return a new DStream by applying 'cogroup' between RDDs of this DStream and other DStream.
cogroup(JavaPairDStream<K, W>, int) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
Return a new DStream by applying 'cogroup' between RDDs of this DStream and other DStream.
cogroup(JavaPairDStream<K, W>, Partitioner) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
Return a new DStream by applying 'cogroup' between RDDs of this DStream and other DStream.
cogroup(DStream<Tuple2<K, W>>, ClassTag<W>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
Return a new DStream by applying 'cogroup' between RDDs of this DStream and other DStream.
cogroup(DStream<Tuple2<K, W>>, int, ClassTag<W>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
Return a new DStream by applying 'cogroup' between RDDs of this DStream and other DStream.
cogroup(DStream<Tuple2<K, W>>, Partitioner, ClassTag<W>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
Return a new DStream by applying 'cogroup' between RDDs of this DStream and other DStream.
CoGroupedRDD<K> - Class in org.apache.spark.rdd
:: DeveloperApi :: An RDD that cogroups its parents.
CoGroupedRDD(Seq<RDD<? extends Product2<K, ?>>>, Partitioner) - Constructor for class org.apache.spark.rdd.CoGroupedRDD
 
collect() - Method in interface org.apache.spark.api.java.JavaRDDLike
Return an array that contains all of the elements in this RDD.
collect() - Method in class org.apache.spark.rdd.RDD
Return an array that contains all of the elements in this RDD.
collect(PartialFunction<T, U>, ClassTag<U>) - Method in class org.apache.spark.rdd.RDD
Return an RDD that contains all matching values by applying f.
collect() - Method in class org.apache.spark.sql.SchemaRDD
 
collectAsMap() - Method in class org.apache.spark.api.java.JavaPairRDD
Return the key-value pairs in this RDD to the master as a Map.
collectAsMap() - Method in class org.apache.spark.rdd.PairRDDFunctions
Return the key-value pairs in this RDD to the master as a Map.
collectAsync() - Method in class org.apache.spark.rdd.AsyncRDDActions
Returns a future for retrieving all elements of this RDD.
collectPartitions(int[]) - Method in interface org.apache.spark.api.java.JavaRDDLike
Return an array that contains all of the elements in a specific partition of this RDD.
columnPruningPred() - Method in class org.apache.spark.sql.parquet.ParquetTableScan
 
combineByKey(Function<V, C>, Function2<C, V, C>, Function2<C, C, C>, Partitioner) - Method in class org.apache.spark.api.java.JavaPairRDD
Generic function to combine the elements for each key using a custom set of aggregation functions.
combineByKey(Function<V, C>, Function2<C, V, C>, Function2<C, C, C>, int) - Method in class org.apache.spark.api.java.JavaPairRDD
Simplified version of combineByKey that hash-partitions the output RDD.
combineByKey(Function<V, C>, Function2<C, V, C>, Function2<C, C, C>) - Method in class org.apache.spark.api.java.JavaPairRDD
Simplified version of combineByKey that hash-partitions the resulting RDD using the existing partitioner/parallelism level.
combineByKey(Function1<V, C>, Function2<C, V, C>, Function2<C, C, C>, Partitioner, boolean, Serializer) - Method in class org.apache.spark.rdd.PairRDDFunctions
Generic function to combine the elements for each key using a custom set of aggregation functions.
combineByKey(Function1<V, C>, Function2<C, V, C>, Function2<C, C, C>, int) - Method in class org.apache.spark.rdd.PairRDDFunctions
Simplified version of combineByKey that hash-partitions the output RDD.
combineByKey(Function1<V, C>, Function2<C, V, C>, Function2<C, C, C>) - Method in class org.apache.spark.rdd.PairRDDFunctions
Simplified version of combineByKey that hash-partitions the resulting RDD using the existing partitioner/parallelism level.
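
For example, a per-key mean via combineByKey, using a (sum, count) pair as the combiner (a sketch, assuming an existing SparkContext sc):

    import org.apache.spark.SparkContext._          // implicit conversion to PairRDDFunctions

    val pairs = sc.parallelize(Seq(("a", 1.0), ("a", 3.0), ("b", 4.0)))
    val sumCounts = pairs.combineByKey(
      (v: Double) => (v, 1),                        // createCombiner: first value for a key
      (c: (Double, Int), v: Double) => (c._1 + v, c._2 + 1),                    // mergeValue
      (c1: (Double, Int), c2: (Double, Int)) => (c1._1 + c2._1, c1._2 + c2._2)) // mergeCombiners
    val means = sumCounts.mapValues { case (sum, n) => sum / n }
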
combineByKey(Function<V, C>, Function2<C, V, C>, Function2<C, C, C>, Partitioner) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
Combine elements of each key in DStream's RDDs using custom functions.
combineByKey(Function<V, C>, Function2<C, V, C>, Function2<C, C, C>, Partitioner, boolean) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
Combine elements of each key in DStream's RDDs using custom functions.
combineByKey(Function1<V, C>, Function2<C, V, C>, Function2<C, C, C>, Partitioner, boolean, ClassTag<C>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
Combine elements of each key in DStream's RDDs using custom functions.
combineCombinersByKey(Iterator<Tuple2<K, C>>) - Method in class org.apache.spark.Aggregator
 
combineCombinersByKey(Iterator<Tuple2<K, C>>, TaskContext) - Method in class org.apache.spark.Aggregator
 
combineValuesByKey(Iterator<Product2<K, V>>) - Method in class org.apache.spark.Aggregator
 
combineValuesByKey(Iterator<Product2<K, V>>, TaskContext) - Method in class org.apache.spark.Aggregator
 
Command - Interface in org.apache.spark.sql.execution
 
commands() - Method in class org.apache.spark.sql.hive.test.TestHiveContext.TestTable
 
compare(RDDInfo) - Method in class org.apache.spark.storage.RDDInfo
 
completed() - Method in class org.apache.spark.TaskContext
 
completedStages() - Method in class org.apache.spark.ui.jobs.JobProgressListener
 
completionTime() - Method in class org.apache.spark.scheduler.StageInfo
Time when all tasks in the stage completed or when the stage was cancelled.
ComplexFutureAction<T> - Class in org.apache.spark
:: Experimental :: A FutureAction for actions that could trigger multiple Spark jobs.
ComplexFutureAction() - Constructor for class org.apache.spark.ComplexFutureAction
 
compressedInputStream(InputStream) - Method in interface org.apache.spark.io.CompressionCodec
 
compressedInputStream(InputStream) - Method in class org.apache.spark.io.LZFCompressionCodec
 
compressedInputStream(InputStream) - Method in class org.apache.spark.io.SnappyCompressionCodec
 
compressedOutputStream(OutputStream) - Method in interface org.apache.spark.io.CompressionCodec
 
compressedOutputStream(OutputStream) - Method in class org.apache.spark.io.LZFCompressionCodec
 
compressedOutputStream(OutputStream) - Method in class org.apache.spark.io.SnappyCompressionCodec
 
CompressionCodec - Interface in org.apache.spark.io
:: DeveloperApi :: CompressionCodec allows the customization of choosing different compression implementations to be used in block storage.
compute(Vector, double, Vector) - Method in class org.apache.spark.mllib.optimization.Gradient
Compute the gradient and loss given the features of a single data point.
compute(Vector, double, Vector, Vector) - Method in class org.apache.spark.mllib.optimization.Gradient
Compute the gradient and loss given the features of a single data point, add the gradient to a provided vector to avoid creating new objects, and return loss.
compute(Vector, double, Vector) - Method in class org.apache.spark.mllib.optimization.HingeGradient
 
compute(Vector, double, Vector, Vector) - Method in class org.apache.spark.mllib.optimization.HingeGradient
 
compute(Vector, Vector, double, int, double) - Method in class org.apache.spark.mllib.optimization.L1Updater
 
compute(Vector, double, Vector) - Method in class org.apache.spark.mllib.optimization.LeastSquaresGradient
 
compute(Vector, double, Vector, Vector) - Method in class org.apache.spark.mllib.optimization.LeastSquaresGradient
 
compute(Vector, double, Vector) - Method in class org.apache.spark.mllib.optimization.LogisticGradient
 
compute(Vector, double, Vector, Vector) - Method in class org.apache.spark.mllib.optimization.LogisticGradient
 
compute(Vector, Vector, double, int, double) - Method in class org.apache.spark.mllib.optimization.SimpleUpdater
 
compute(Vector, Vector, double, int, double) - Method in class org.apache.spark.mllib.optimization.SquaredL2Updater
 
compute(Vector, Vector, double, int, double) - Method in class org.apache.spark.mllib.optimization.Updater
Compute an updated value for weights given the gradient, stepSize, iteration number and regularization parameter.
compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.CoGroupedRDD
 
compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.HadoopRDD
 
compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.JdbcRDD
 
compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.NewHadoopRDD
 
compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.PartitionPruningRDD
 
compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.RDD
:: DeveloperApi :: Implemented by subclasses to compute a given partition.
compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.ShuffledRDD
 
compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.UnionRDD
 
compute(Partition, TaskContext) - Method in class org.apache.spark.sql.SchemaRDD
 
compute(Time) - Method in class org.apache.spark.streaming.api.java.JavaDStream
Generate an RDD for the given duration
compute(Time) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
Method that generates an RDD for the given Duration
compute(Time) - Method in class org.apache.spark.streaming.dstream.ConstantInputDStream
 
compute(Time) - Method in class org.apache.spark.streaming.dstream.DStream
Method that generates an RDD for the given time
compute(Time) - Method in class org.apache.spark.streaming.dstream.ReceiverInputDStream
Asks the ReceiverInputTracker for received data blocks and generates RDDs from them.
computeColumnSummaryStatistics() - Method in class org.apache.spark.mllib.linalg.distributed.RowMatrix
Computes column-wise summary statistics.
computeCost(RDD<Vector>) - Method in class org.apache.spark.mllib.clustering.KMeansModel
Return the K-means cost (sum of squared distances of points to their nearest center) for this model on the given data.
computeCovariance() - Method in class org.apache.spark.mllib.linalg.distributed.RowMatrix
Computes the covariance matrix, treating each row as an observation.
computeGramianMatrix() - Method in class org.apache.spark.mllib.linalg.distributed.IndexedRowMatrix
Computes the Gramian matrix A^T A.
computeGramianMatrix() - Method in class org.apache.spark.mllib.linalg.distributed.RowMatrix
Computes the Gramian matrix A^T A.
computePreferredLocations(Seq<InputFormatInfo>) - Static method in class org.apache.spark.scheduler.InputFormatInfo
Computes the preferred locations based on the input(s) and returns a location-to-block map.
computePrincipalComponents(int) - Method in class org.apache.spark.mllib.linalg.distributed.RowMatrix
Computes the top k principal components.
computeSVD(int, boolean, double) - Method in class org.apache.spark.mllib.linalg.distributed.IndexedRowMatrix
Computes the singular value decomposition of this matrix.
computeSVD(int, boolean, double) - Method in class org.apache.spark.mllib.linalg.distributed.RowMatrix
Computes the singular value decomposition of this matrix.
condition() - Method in class org.apache.spark.sql.execution.BroadcastNestedLoopJoin
 
condition() - Method in class org.apache.spark.sql.execution.Filter
 
condition() - Method in class org.apache.spark.sql.execution.LeftSemiJoinBNL
 
conf() - Method in class org.apache.spark.SparkContext
 
conf() - Method in class org.apache.spark.SparkEnv
 
conf() - Method in class org.apache.spark.streaming.StreamingContext
 
confidence() - Method in class org.apache.spark.partial.BoundedDouble
 
configuration() - Method in class org.apache.spark.scheduler.InputFormatInfo
 
connectionManager() - Method in class org.apache.spark.SparkEnv
 
ConstantInputDStream<T> - Class in org.apache.spark.streaming.dstream
An input stream that always returns the same RDD on each timestep.
ConstantInputDStream(StreamingContext, RDD<T>, ClassTag<T>) - Constructor for class org.apache.spark.streaming.dstream.ConstantInputDStream
 
contains(String) - Method in class org.apache.spark.SparkConf
Does the configuration contain a given parameter?
contains(String) - Method in interface org.apache.spark.sql.SQLConf
 
containsCachedMetadata(String) - Static method in class org.apache.spark.rdd.HadoopRDD
 
context() - Method in interface org.apache.spark.api.java.JavaRDDLike
The SparkContext that this RDD was created on.
context() - Method in class org.apache.spark.InterruptibleIterator
 
context() - Method in class org.apache.spark.rdd.RDD
The SparkContext that this RDD was created on.
context() - Method in class org.apache.spark.sql.hive.execution.HiveTableScan
 
context() - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
Return the StreamingContext associated with this DStream
context() - Method in class org.apache.spark.streaming.dstream.DStream
Return the StreamingContext associated with this DStream
Continuous() - Static method in class org.apache.spark.mllib.tree.configuration.FeatureType
 
convertToCatalyst(Object) - Static method in class org.apache.spark.sql.execution.ExistingRdd
 
CoordinateMatrix - Class in org.apache.spark.mllib.linalg.distributed
:: Experimental :: Represents a matrix in coordinate format.
CoordinateMatrix(RDD<MatrixEntry>, long, long) - Constructor for class org.apache.spark.mllib.linalg.distributed.CoordinateMatrix
 
CoordinateMatrix(RDD<MatrixEntry>) - Constructor for class org.apache.spark.mllib.linalg.distributed.CoordinateMatrix
Alternative constructor leaving matrix dimensions to be determined automatically.
copy() - Method in class org.apache.spark.util.StatCounter
Clone this StatCounter
count() - Method in interface org.apache.spark.api.java.JavaRDDLike
Return the number of elements in the RDD.
count() - Method in interface org.apache.spark.mllib.stat.MultivariateStatisticalSummary
Sample size.
count() - Method in class org.apache.spark.rdd.RDD
Return the number of elements in the RDD.
count() - Method in class org.apache.spark.sql.SchemaRDD
:: Experimental :: Return the number of elements in the RDD.
count() - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
Return a new DStream in which each RDD has a single element generated by counting each RDD of this DStream.
count() - Method in class org.apache.spark.streaming.dstream.DStream
Return a new DStream in which each RDD has a single element generated by counting each RDD of this DStream.
count() - Method in class org.apache.spark.util.StatCounter
 
countApprox(long, double) - Method in interface org.apache.spark.api.java.JavaRDDLike
:: Experimental :: Approximate version of count() that returns a potentially incomplete result within a timeout, even if not all tasks have finished.
countApprox(long) - Method in interface org.apache.spark.api.java.JavaRDDLike
:: Experimental :: Approximate version of count() that returns a potentially incomplete result within a timeout, even if not all tasks have finished.
countApprox(long, double) - Method in class org.apache.spark.rdd.RDD
:: Experimental :: Approximate version of count() that returns a potentially incomplete result within a timeout, even if not all tasks have finished.
countApproxDistinct(double) - Method in interface org.apache.spark.api.java.JavaRDDLike
Return approximate number of distinct elements in the RDD.
countApproxDistinct(double) - Method in class org.apache.spark.rdd.RDD
:: Experimental :: Return approximate number of distinct elements in the RDD.
countApproxDistinctByKey(double, Partitioner) - Method in class org.apache.spark.api.java.JavaPairRDD
Return approximate number of distinct values for each key in this RDD.
countApproxDistinctByKey(double) - Method in class org.apache.spark.api.java.JavaPairRDD
Return approximate number of distinct values for each key in this RDD.
countApproxDistinctByKey(double, int) - Method in class org.apache.spark.api.java.JavaPairRDD
Return approximate number of distinct values for each key in this RDD.
countApproxDistinctByKey(double, Partitioner) - Method in class org.apache.spark.rdd.PairRDDFunctions
Return approximate number of distinct values for each key in this RDD.
countApproxDistinctByKey(double, int) - Method in class org.apache.spark.rdd.PairRDDFunctions
Return approximate number of distinct values for each key in this RDD.
countApproxDistinctByKey(double) - Method in class org.apache.spark.rdd.PairRDDFunctions
Return approximate number of distinct values for each key in this RDD.
countAsync() - Method in class org.apache.spark.rdd.AsyncRDDActions
Returns a future for counting the number of elements in the RDD.
countByKey() - Method in class org.apache.spark.api.java.JavaPairRDD
Count the number of elements for each key, and return the result to the master as a Map.
countByKey() - Method in class org.apache.spark.rdd.PairRDDFunctions
Count the number of elements for each key, and return the result to the master as a Map.
countByKeyApprox(long) - Method in class org.apache.spark.api.java.JavaPairRDD
:: Experimental :: Approximate version of countByKey that can return a partial result if it does not finish within a timeout.
countByKeyApprox(long, double) - Method in class org.apache.spark.api.java.JavaPairRDD
:: Experimental :: Approximate version of countByKey that can return a partial result if it does not finish within a timeout.
countByKeyApprox(long, double) - Method in class org.apache.spark.rdd.PairRDDFunctions
:: Experimental :: Approximate version of countByKey that can return a partial result if it does not finish within a timeout.
countByValue() - Method in interface org.apache.spark.api.java.JavaRDDLike
Return the count of each unique value in this RDD as a map of (value, count) pairs.
countByValue(Ordering<T>) - Method in class org.apache.spark.rdd.RDD
Return the count of each unique value in this RDD as a map of (value, count) pairs.
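
For example (a sketch, assuming an existing SparkContext sc); note the result is a local Map on the driver, not an RDD:

    val counts = sc.parallelize(Seq("a", "b", "a")).countByValue()
    // Map(a -> 2, b -> 1)
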
countByValue() - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
Return a new DStream in which each RDD contains the counts of each distinct value in each RDD of this DStream.
countByValue(int) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
Return a new DStream in which each RDD contains the counts of each distinct value in each RDD of this DStream.
countByValue(int, Ordering<T>) - Method in class org.apache.spark.streaming.dstream.DStream
Return a new DStream in which each RDD contains the counts of each distinct value in each RDD of this DStream.
countByValueAndWindow(Duration, Duration) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
Return a new DStream in which each RDD contains the count of distinct elements in RDDs in a sliding window over this DStream.
countByValueAndWindow(Duration, Duration, int) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
Return a new DStream in which each RDD contains the count of distinct elements in RDDs in a sliding window over this DStream.
countByValueAndWindow(Duration, Duration, int, Ordering<T>) - Method in class org.apache.spark.streaming.dstream.DStream
Return a new DStream in which each RDD contains the count of distinct elements in RDDs in a sliding window over this DStream.
countByValueApprox(long, double) - Method in interface org.apache.spark.api.java.JavaRDDLike
:: Experimental :: Approximate version of countByValue().
countByValueApprox(long) - Method in interface org.apache.spark.api.java.JavaRDDLike
:: Experimental :: Approximate version of countByValue().
countByValueApprox(long, double, Ordering<T>) - Method in class org.apache.spark.rdd.RDD
:: Experimental :: Approximate version of countByValue().
countByWindow(Duration, Duration) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
Return a new DStream in which each RDD has a single element generated by counting the number of elements in a window over this DStream.
countByWindow(Duration, Duration) - Method in class org.apache.spark.streaming.dstream.DStream
Return a new DStream in which each RDD has a single element generated by counting the number of elements in a sliding window over this DStream.
create(boolean, boolean, boolean, int) - Static method in class org.apache.spark.api.java.StorageLevels
Deprecated.
create(boolean, boolean, boolean, boolean, int) - Static method in class org.apache.spark.api.java.StorageLevels
Create a new StorageLevel object.
create(RDD<T>, Function1<Object, Object>) - Static method in class org.apache.spark.rdd.PartitionPruningRDD
Create a PartitionPruningRDD.
create() - Method in interface org.apache.spark.streaming.api.java.JavaStreamingContextFactory
 
createCodec(SparkConf) - Method in interface org.apache.spark.io.CompressionCodec
 
createCodec(SparkConf, String) - Method in interface org.apache.spark.io.CompressionCodec
 
createCombiner() - Method in class org.apache.spark.Aggregator
 
createFilter(Expression) - Static method in class org.apache.spark.sql.parquet.ParquetFilters
 
createParquetFile(Class<?>, String, boolean, Configuration) - Method in class org.apache.spark.sql.api.java.JavaSQLContext
:: Experimental :: Creates an empty parquet file with the schema of class beanClass, which can be registered as a table.
createParquetFile(String, boolean, Configuration, TypeTags.TypeTag<A>) - Method in class org.apache.spark.sql.SQLContext
:: Experimental :: Creates an empty parquet file with the schema of class A, which can be registered as a table.
createRecordFilter(Seq<Expression>) - Static method in class org.apache.spark.sql.parquet.ParquetFilters
 
createSchemaRDD(RDD<A>, TypeTags.TypeTag<A>) - Method in class org.apache.spark.sql.SQLContext
Creates a SchemaRDD from an RDD of case classes.
createStream(StreamingContext, String, int, StorageLevel) - Static method in class org.apache.spark.streaming.flume.FlumeUtils
Create a input stream from a Flume source.
createStream(JavaStreamingContext, String, int) - Static method in class org.apache.spark.streaming.flume.FlumeUtils
Creates an input stream from a Flume source.
createStream(JavaStreamingContext, String, int, StorageLevel) - Static method in class org.apache.spark.streaming.flume.FlumeUtils
Creates an input stream from a Flume source.
createStream(StreamingContext, String, String, Map<String, Object>, StorageLevel) - Static method in class org.apache.spark.streaming.kafka.KafkaUtils
Create an input stream that pulls messages from a Kafka Broker.
createStream(StreamingContext, Map<String, String>, Map<String, Object>, StorageLevel, ClassTag<K>, ClassTag<V>, Manifest<U>, Manifest<T>) - Static method in class org.apache.spark.streaming.kafka.KafkaUtils
Create an input stream that pulls messages from a Kafka Broker.
createStream(JavaStreamingContext, String, String, Map<String, Integer>) - Static method in class org.apache.spark.streaming.kafka.KafkaUtils
Create an input stream that pulls messages from a Kafka Broker.
createStream(JavaStreamingContext, String, String, Map<String, Integer>, StorageLevel) - Static method in class org.apache.spark.streaming.kafka.KafkaUtils
Create an input stream that pulls messages from a Kafka Broker.
createStream(JavaStreamingContext, Class<K>, Class<V>, Class<U>, Class<T>, Map<String, String>, Map<String, Integer>, StorageLevel) - Static method in class org.apache.spark.streaming.kafka.KafkaUtils
Create an input stream that pulls messages from a Kafka Broker.
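A minimal sketch of the Kafka receiver (the ZooKeeper quorum address, group id, and topic name are hypothetical; `ssc` is an existing StreamingContext):

    import org.apache.spark.streaming.kafka.KafkaUtils
    // One receiver thread consuming the "events" topic via ZooKeeper.
    val kafkaStream = KafkaUtils.createStream(ssc, "zk1:2181", "my-consumer-group", Map("events" -> 1))
    kafkaStream.map(_._2).print()  // elements are (key, message) pairs; keep the payloads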
createStream(StreamingContext, String, String, StorageLevel) - Static method in class org.apache.spark.streaming.mqtt.MQTTUtils
Create an input stream that receives messages pushed by an MQTT publisher.
createStream(JavaStreamingContext, String, String) - Static method in class org.apache.spark.streaming.mqtt.MQTTUtils
Create an input stream that receives messages pushed by an MQTT publisher.
createStream(JavaStreamingContext, String, String, StorageLevel) - Static method in class org.apache.spark.streaming.mqtt.MQTTUtils
Create an input stream that receives messages pushed by an MQTT publisher.
createStream(StreamingContext, Option<Authorization>, Seq<String>, StorageLevel) - Static method in class org.apache.spark.streaming.twitter.TwitterUtils
Create an input stream that returns tweets received from Twitter.
createStream(JavaStreamingContext) - Static method in class org.apache.spark.streaming.twitter.TwitterUtils
Create an input stream that returns tweets received from Twitter using Twitter4J's default OAuth authentication; this requires the system properties twitter4j.oauth.consumerKey, twitter4j.oauth.consumerSecret, twitter4j.oauth.accessToken and twitter4j.oauth.accessTokenSecret.
createStream(JavaStreamingContext, String[]) - Static method in class org.apache.spark.streaming.twitter.TwitterUtils
Create an input stream that returns tweets received from Twitter using Twitter4J's default OAuth authentication; this requires the system properties twitter4j.oauth.consumerKey, twitter4j.oauth.consumerSecret, twitter4j.oauth.accessToken and twitter4j.oauth.accessTokenSecret.
createStream(JavaStreamingContext, String[], StorageLevel) - Static method in class org.apache.spark.streaming.twitter.TwitterUtils
Create an input stream that returns tweets received from Twitter using Twitter4J's default OAuth authentication; this requires the system properties twitter4j.oauth.consumerKey, twitter4j.oauth.consumerSecret, twitter4j.oauth.accessToken and twitter4j.oauth.accessTokenSecret.
createStream(JavaStreamingContext, Authorization) - Static method in class org.apache.spark.streaming.twitter.TwitterUtils
Create an input stream that returns tweets received from Twitter.
createStream(JavaStreamingContext, Authorization, String[]) - Static method in class org.apache.spark.streaming.twitter.TwitterUtils
Create an input stream that returns tweets received from Twitter.
createStream(JavaStreamingContext, Authorization, String[], StorageLevel) - Static method in class org.apache.spark.streaming.twitter.TwitterUtils
Create an input stream that returns tweets received from Twitter.
createStream(StreamingContext, String, Subscribe, Function1<Seq<ByteString>, Iterator<T>>, StorageLevel, SupervisorStrategy, ClassTag<T>) - Static method in class org.apache.spark.streaming.zeromq.ZeroMQUtils
Create an input stream that receives messages pushed by a ZeroMQ publisher.
createStream(JavaStreamingContext, String, Subscribe, Function<byte[][], Iterable<T>>, StorageLevel, SupervisorStrategy) - Static method in class org.apache.spark.streaming.zeromq.ZeroMQUtils
Create an input stream that receives messages pushed by a ZeroMQ publisher.
createStream(JavaStreamingContext, String, Subscribe, Function<byte[][], Iterable<T>>, StorageLevel) - Static method in class org.apache.spark.streaming.zeromq.ZeroMQUtils
Create an input stream that receives messages pushed by a ZeroMQ publisher.
createStream(JavaStreamingContext, String, Subscribe, Function<byte[][], Iterable<T>>) - Static method in class org.apache.spark.streaming.zeromq.ZeroMQUtils
Create an input stream that receives messages pushed by a ZeroMQ publisher.
createTable(String, boolean, TypeTags.TypeTag<A>) - Method in class org.apache.spark.sql.hive.HiveContext
Creates a table using the schema of the given class.
creationSiteInfo() - Method in class org.apache.spark.rdd.RDD
User code that created this RDD (e.g.

D

dagScheduler() - Method in class org.apache.spark.SparkContext
 
DataValidators - Class in org.apache.spark.mllib.util
:: DeveloperApi :: A collection of methods used to validate data before applying ML algorithms.
DataValidators() - Constructor for class org.apache.spark.mllib.util.DataValidators
 
DecisionTree - Class in org.apache.spark.mllib.tree
:: Experimental :: A class that implements a decision tree algorithm for classification and regression.
DecisionTree(Strategy) - Constructor for class org.apache.spark.mllib.tree.DecisionTree
 
DecisionTreeModel - Class in org.apache.spark.mllib.tree.model
:: Experimental :: Model to store the decision tree parameters
DecisionTreeModel(Node, Enumeration.Value) - Constructor for class org.apache.spark.mllib.tree.model.DecisionTreeModel
 
DEFAULT_CLEANER_TTL() - Static method in class org.apache.spark.streaming.StreamingContext
 
DEFAULT_COMPRESSION_CODEC() - Method in interface org.apache.spark.io.CompressionCodec
 
DEFAULT_POOL_NAME() - Static method in class org.apache.spark.ui.jobs.JobProgressListener
 
DEFAULT_RETAINED_STAGES() - Static method in class org.apache.spark.ui.jobs.JobProgressListener
 
defaultMinPartitions() - Method in class org.apache.spark.api.java.JavaSparkContext
Default min number of partitions for Hadoop RDDs when not given by user
defaultMinPartitions() - Method in class org.apache.spark.SparkContext
Default min number of partitions for Hadoop RDDs when not given by user
defaultMinSplits() - Method in class org.apache.spark.api.java.JavaSparkContext
Deprecated.
As of Spark 1.0.0, defaultMinSplits is deprecated, use JavaSparkContext.defaultMinPartitions() instead
defaultMinSplits() - Method in class org.apache.spark.SparkContext
Default min number of partitions for Hadoop RDDs when not given by user
defaultParallelism() - Method in class org.apache.spark.api.java.JavaSparkContext
Default level of parallelism to use when not given by user (e.g.
defaultParallelism() - Method in class org.apache.spark.SparkContext
Default level of parallelism to use when not given by user (e.g.
defaultPartitioner(RDD<?>, Seq<RDD<?>>) - Static method in class org.apache.spark.Partitioner
Choose a partitioner to use for a cogroup-like operation between a number of RDDs.
defaultStrategy() - Static method in class org.apache.spark.streaming.receiver.ActorSupervisorStrategy
 
delegate() - Method in class org.apache.spark.InterruptibleIterator
 
dense(int, int, double[]) - Static method in class org.apache.spark.mllib.linalg.Matrices
Creates a column-major dense matrix.
dense(double, double...) - Static method in class org.apache.spark.mllib.linalg.Vectors
Creates a dense vector from its values.
dense(double, Seq<Object>) - Static method in class org.apache.spark.mllib.linalg.Vectors
Creates a dense vector from its values.
dense(double[]) - Static method in class org.apache.spark.mllib.linalg.Vectors
Creates a dense vector from a double array.
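For example, the two dense forms side by side:

    import org.apache.spark.mllib.linalg.Vectors
    val v1 = Vectors.dense(1.0, 0.0, 3.0)         // varargs form
    val v2 = Vectors.dense(Array(1.0, 0.0, 3.0))  // from a double array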
DenseMatrix - Class in org.apache.spark.mllib.linalg
Column-major dense matrix.
DenseMatrix(int, int, double[]) - Constructor for class org.apache.spark.mllib.linalg.DenseMatrix
 
DenseVector - Class in org.apache.spark.mllib.linalg
A dense vector represented by a value array.
DenseVector(double[]) - Constructor for class org.apache.spark.mllib.linalg.DenseVector
 
dependencies() - Method in class org.apache.spark.rdd.RDD
Get the list of dependencies of this RDD, taking into account whether the RDD is checkpointed or not.
dependencies() - Method in class org.apache.spark.streaming.dstream.DStream
List of parent DStreams on which this DStream depends.
dependencies() - Method in class org.apache.spark.streaming.dstream.InputDStream
 
Dependency<T> - Class in org.apache.spark
:: DeveloperApi :: Base class for dependencies.
Dependency(RDD<T>) - Constructor for class org.apache.spark.Dependency
 
DescribeCommand - Class in org.apache.spark.sql.execution
:: DeveloperApi ::
DescribeCommand(SparkPlan, Seq<Attribute>, SQLContext) - Constructor for class org.apache.spark.sql.execution.DescribeCommand
 
describedTable() - Method in class org.apache.spark.sql.hive.test.TestHiveContext
 
DescribeHiveTableCommand - Class in org.apache.spark.sql.hive.execution
Implementation for "describe [extended] table".
DescribeHiveTableCommand(org.apache.spark.sql.hive.MetastoreRelation, Seq<Attribute>, boolean, HiveContext) - Constructor for class org.apache.spark.sql.hive.execution.DescribeHiveTableCommand
 
description() - Method in class org.apache.spark.ExceptionFailure
 
description() - Method in class org.apache.spark.storage.StorageLevel
 
DeserializationStream - Interface in org.apache.spark.serializer
:: DeveloperApi :: A stream for reading serialized objects.
deserialize(ByteBuffer, ClassTag<T>) - Method in interface org.apache.spark.serializer.SerializerInstance
 
deserialize(ByteBuffer, ClassLoader, ClassTag<T>) - Method in interface org.apache.spark.serializer.SerializerInstance
 
deserialized() - Method in class org.apache.spark.storage.StorageLevel
 
deserializeFilterExpressions(Configuration) - Static method in class org.apache.spark.sql.parquet.ParquetFilters
Note: Inside the Hadoop API we only have access to Configuration, not to SparkContext, so we cannot use broadcasts to convey the actual filter predicate.
deserializeMany(ByteBuffer) - Method in interface org.apache.spark.serializer.SerializerInstance
 
deserializeStream(InputStream) - Method in interface org.apache.spark.serializer.SerializerInstance
 
DeveloperApi - Annotation Type in org.apache.spark.annotation
A lower-level, unstable API intended for developers.
DISK_ONLY - Static variable in class org.apache.spark.api.java.StorageLevels
 
DISK_ONLY() - Static method in class org.apache.spark.storage.StorageLevel
 
DISK_ONLY_2 - Static variable in class org.apache.spark.api.java.StorageLevels
 
DISK_ONLY_2() - Static method in class org.apache.spark.storage.StorageLevel
 
diskBytesSpilled() - Method in class org.apache.spark.ui.jobs.ExecutorSummary
 
diskSize() - Method in class org.apache.spark.storage.BlockStatus
 
diskSize() - Method in class org.apache.spark.storage.RDDInfo
 
diskUsed() - Method in class org.apache.spark.storage.StorageStatus
 
diskUsedByRDD(int) - Method in class org.apache.spark.storage.StorageStatus
 
dist(Vector) - Method in class org.apache.spark.util.Vector
 
distinct() - Method in class org.apache.spark.api.java.JavaDoubleRDD
Return a new RDD containing the distinct elements in this RDD.
distinct(int) - Method in class org.apache.spark.api.java.JavaDoubleRDD
Return a new RDD containing the distinct elements in this RDD.
distinct() - Method in class org.apache.spark.api.java.JavaPairRDD
Return a new RDD containing the distinct elements in this RDD.
distinct(int) - Method in class org.apache.spark.api.java.JavaPairRDD
Return a new RDD containing the distinct elements in this RDD.
distinct() - Method in class org.apache.spark.api.java.JavaRDD
Return a new RDD containing the distinct elements in this RDD.
distinct(int) - Method in class org.apache.spark.api.java.JavaRDD
Return a new RDD containing the distinct elements in this RDD.
distinct(int, Ordering<T>) - Method in class org.apache.spark.rdd.RDD
Return a new RDD containing the distinct elements in this RDD.
distinct() - Method in class org.apache.spark.rdd.RDD
Return a new RDD containing the distinct elements in this RDD.
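A minimal sketch of distinct (`sc` is an existing SparkContext):

    val nums = sc.parallelize(Seq(1, 2, 2, 3, 3, 3))
    nums.distinct().collect()   // Array(1, 2, 3), in no particular order
    nums.distinct(4)            // same elements, shuffled into 4 partitions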
distinct() - Method in class org.apache.spark.sql.api.java.JavaSchemaRDD
Return a new RDD containing the distinct elements in this RDD.
distinct(int) - Method in class org.apache.spark.sql.api.java.JavaSchemaRDD
Return a new RDD containing the distinct elements in this RDD.
distinct() - Method in class org.apache.spark.sql.SchemaRDD
 
distinct(int, Ordering<Row>) - Method in class org.apache.spark.sql.SchemaRDD
 
DistributedMatrix - Interface in org.apache.spark.mllib.linalg.distributed
Represents a distributively stored matrix backed by one or more RDDs.
divide(double) - Method in class org.apache.spark.util.Vector
 
doCache() - Method in class org.apache.spark.sql.execution.CacheCommand
 
dot(Vector) - Method in class org.apache.spark.util.Vector
 
doubleAccumulator(double) - Method in class org.apache.spark.api.java.JavaSparkContext
Create an Accumulator double variable, which tasks can "add" values to using the add method.
DoubleFlatMapFunction<T> - Interface in org.apache.spark.api.java.function
A function that returns zero or more records of type Double from each input record.
DoubleFunction<T> - Interface in org.apache.spark.api.java.function
A function that returns Doubles, and can be used to construct DoubleRDDs.
DoubleRDDFunctions - Class in org.apache.spark.rdd
Extra functions available on RDDs of Doubles through an implicit conversion.
DoubleRDDFunctions(RDD<Object>) - Constructor for class org.apache.spark.rdd.DoubleRDDFunctions
 
doubleRDDToDoubleRDDFunctions(RDD<Object>) - Static method in class org.apache.spark.SparkContext
 
doubleToDoubleWritable(double) - Static method in class org.apache.spark.SparkContext
 
doubleToMultiplier(double) - Static method in class org.apache.spark.util.Vector
 
doubleWritableConverter() - Static method in class org.apache.spark.SparkContext
 
dstream() - Method in class org.apache.spark.streaming.api.java.JavaDStream
 
dstream() - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
 
dstream() - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
 
DStream<T> - Class in org.apache.spark.streaming.dstream
A Discretized Stream (DStream), the basic abstraction in Spark Streaming, is a continuous sequence of RDDs (of the same type) representing a continuous stream of data (see org.apache.spark.rdd.RDD in the Spark core documentation for more details on RDDs).
DStream(StreamingContext, ClassTag<T>) - Constructor for class org.apache.spark.streaming.dstream.DStream
 
duration() - Method in class org.apache.spark.scheduler.TaskInfo
 
Duration - Class in org.apache.spark.streaming
 
Duration(long) - Constructor for class org.apache.spark.streaming.Duration
 

E

elements() - Method in class org.apache.spark.util.Vector
 
emittedTaskSizeWarning() - Method in class org.apache.spark.scheduler.StageInfo
 
emptyRDD(ClassTag<T>) - Method in class org.apache.spark.SparkContext
Get an RDD that has no partitions or elements.
entries() - Method in class org.apache.spark.mllib.linalg.distributed.CoordinateMatrix
 
Entropy - Class in org.apache.spark.mllib.tree.impurity
:: Experimental :: Class for calculating entropy during binary classification.
Entropy() - Constructor for class org.apache.spark.mllib.tree.impurity.Entropy
 
env() - Method in class org.apache.spark.api.java.JavaSparkContext
 
env() - Method in class org.apache.spark.SparkContext
 
env() - Method in class org.apache.spark.streaming.StreamingContext
 
environmentDetails() - Method in class org.apache.spark.scheduler.SparkListenerEnvironmentUpdate
 
EnvironmentListener - Class in org.apache.spark.ui.env
:: DeveloperApi :: A SparkListener that prepares information to be displayed on the EnvironmentTab
EnvironmentListener() - Constructor for class org.apache.spark.ui.env.EnvironmentListener
 
EPSILON() - Static method in class org.apache.spark.mllib.util.MLUtils
 
equals(Object) - Method in class org.apache.spark.HashPartitioner
 
equals(Object) - Method in interface org.apache.spark.mllib.linalg.Vector
 
equals(Object) - Method in class org.apache.spark.RangePartitioner
 
equals(Object) - Method in class org.apache.spark.scheduler.InputFormatInfo
 
equals(Object) - Method in class org.apache.spark.scheduler.SplitInfo
 
equals(Object) - Method in class org.apache.spark.storage.BlockId
 
equals(Object) - Method in class org.apache.spark.storage.BlockManagerId
 
equals(Object) - Method in class org.apache.spark.storage.StorageLevel
 
errorMessage() - Method in class org.apache.spark.ui.jobs.TaskUIData
 
event() - Method in class org.apache.spark.streaming.flume.SparkFlumeEvent
 
eventLogger() - Method in class org.apache.spark.SparkContext
 
ExceptionFailure - Class in org.apache.spark
:: DeveloperApi :: Task failed due to a runtime exception.
ExceptionFailure(String, String, StackTraceElement[], Option<TaskMetrics>) - Constructor for class org.apache.spark.ExceptionFailure
 
Exchange - Class in org.apache.spark.sql.execution
:: DeveloperApi ::
Exchange(Partitioning, SparkPlan) - Constructor for class org.apache.spark.sql.execution.Exchange
 
execute() - Method in class org.apache.spark.sql.execution.Aggregate
Substituted version of the aggregate expressions, used to compute final output rows given a group and the result of all aggregate computations.
execute() - Method in class org.apache.spark.sql.execution.BroadcastNestedLoopJoin
 
execute() - Method in class org.apache.spark.sql.execution.CacheCommand
 
execute() - Method in class org.apache.spark.sql.execution.CartesianProduct
 
execute() - Method in class org.apache.spark.sql.execution.DescribeCommand
 
execute() - Method in class org.apache.spark.sql.execution.Exchange
 
execute() - Method in class org.apache.spark.sql.execution.ExistingRdd
 
execute() - Method in class org.apache.spark.sql.execution.ExplainCommand
 
execute() - Method in class org.apache.spark.sql.execution.Filter
 
execute() - Method in class org.apache.spark.sql.execution.Generate
 
execute() - Method in class org.apache.spark.sql.execution.HashJoin
 
execute() - Method in class org.apache.spark.sql.execution.LeftSemiJoinBNL
 
execute() - Method in class org.apache.spark.sql.execution.LeftSemiJoinHash
 
execute() - Method in class org.apache.spark.sql.execution.Limit
 
execute() - Method in class org.apache.spark.sql.execution.Project
 
execute() - Method in class org.apache.spark.sql.execution.Sample
 
execute() - Method in class org.apache.spark.sql.execution.SetCommand
 
execute() - Method in class org.apache.spark.sql.execution.Sort
 
execute() - Method in class org.apache.spark.sql.execution.SparkPlan
Runs this query returning the result as an RDD.
execute() - Method in class org.apache.spark.sql.execution.TakeOrdered
 
execute() - Method in class org.apache.spark.sql.execution.Union
 
execute() - Method in class org.apache.spark.sql.hive.execution.DescribeHiveTableCommand
 
execute() - Method in class org.apache.spark.sql.hive.execution.HiveTableScan
 
execute() - Method in class org.apache.spark.sql.hive.execution.InsertIntoHiveTable
 
execute() - Method in class org.apache.spark.sql.hive.execution.NativeCommand
 
execute() - Method in class org.apache.spark.sql.hive.execution.ScriptTransformation
 
execute() - Method in class org.apache.spark.sql.parquet.InsertIntoParquetTable
Inserts all rows into the Parquet file.
execute() - Method in class org.apache.spark.sql.parquet.ParquetTableScan
 
executeCollect() - Method in class org.apache.spark.sql.execution.Limit
 
executeCollect() - Method in class org.apache.spark.sql.execution.SparkPlan
Runs this query returning the result as an array.
executeCollect() - Method in class org.apache.spark.sql.execution.TakeOrdered
 
executeOnCompleteCallbacks() - Method in class org.apache.spark.TaskContext
 
executePlan(LogicalPlan) - Method in class org.apache.spark.sql.hive.test.TestHiveContext
 
executor_() - Method in class org.apache.spark.streaming.receiver.Receiver
Handler object that runs the receiver.
executorEnvs() - Method in class org.apache.spark.SparkContext
 
executorId() - Method in class org.apache.spark.scheduler.TaskInfo
 
executorId() - Method in class org.apache.spark.SparkEnv
 
executorId() - Method in class org.apache.spark.storage.BlockManagerId
 
executorIdToBlockManagerId() - Method in class org.apache.spark.ui.jobs.JobProgressListener
 
executorIdToStorageStatus() - Method in class org.apache.spark.storage.StorageStatusListener
 
ExecutorLostFailure - Class in org.apache.spark
:: DeveloperApi :: The task failed because the executor that it was running on was lost.
ExecutorLostFailure() - Constructor for class org.apache.spark.ExecutorLostFailure
 
executorMemory() - Method in class org.apache.spark.SparkContext
 
ExecutorsListener - Class in org.apache.spark.ui.exec
:: DeveloperApi :: A SparkListener that prepares information to be displayed on the ExecutorsTab
ExecutorsListener(StorageStatusListener) - Constructor for class org.apache.spark.ui.exec.ExecutorsListener
 
ExecutorSummary - Class in org.apache.spark.ui.jobs
:: DeveloperApi :: Class for reporting aggregated metrics for each executor in stage UI.
ExecutorSummary() - Constructor for class org.apache.spark.ui.jobs.ExecutorSummary
 
executorToDuration() - Method in class org.apache.spark.ui.exec.ExecutorsListener
 
executorToShuffleRead() - Method in class org.apache.spark.ui.exec.ExecutorsListener
 
executorToShuffleWrite() - Method in class org.apache.spark.ui.exec.ExecutorsListener
 
executorToTasksActive() - Method in class org.apache.spark.ui.exec.ExecutorsListener
 
executorToTasksComplete() - Method in class org.apache.spark.ui.exec.ExecutorsListener
 
executorToTasksFailed() - Method in class org.apache.spark.ui.exec.ExecutorsListener
 
ExistingRdd - Class in org.apache.spark.sql.execution
:: DeveloperApi ::
ExistingRdd(Seq<Attribute>, RDD<Row>) - Constructor for class org.apache.spark.sql.execution.ExistingRdd
 
Experimental - Annotation Type in org.apache.spark.annotation
An experimental user-facing API.
ExplainCommand - Class in org.apache.spark.sql.execution
An explain command for users to see how a command will be executed.
ExplainCommand(LogicalPlan, Seq<Attribute>, SQLContext) - Constructor for class org.apache.spark.sql.execution.ExplainCommand
 
extractDistribution(Function1<BatchInfo, Option<Object>>) - Method in class org.apache.spark.streaming.scheduler.StatsReportListener
 
extractDoubleDistribution(Seq<Tuple2<TaskInfo, TaskMetrics>>, Function2<TaskInfo, TaskMetrics, Option<Object>>) - Static method in class org.apache.spark.scheduler.StatsReportListener
 
extractLongDistribution(Seq<Tuple2<TaskInfo, TaskMetrics>>, Function2<TaskInfo, TaskMetrics, Option<Object>>) - Static method in class org.apache.spark.scheduler.StatsReportListener
 

F

failed() - Method in class org.apache.spark.scheduler.TaskInfo
 
failedStages() - Method in class org.apache.spark.ui.jobs.JobProgressListener
 
failedTasks() - Method in class org.apache.spark.ui.jobs.ExecutorSummary
 
failureReason() - Method in class org.apache.spark.scheduler.StageInfo
If the stage failed, the reason why.
FAIR() - Static method in class org.apache.spark.scheduler.SchedulingMode
 
feature() - Method in class org.apache.spark.mllib.tree.model.Split
 
features() - Method in class org.apache.spark.mllib.regression.LabeledPoint
 
FeatureType - Class in org.apache.spark.mllib.tree.configuration
:: Experimental :: Enum to describe whether a feature is "continuous" or "categorical"
FeatureType() - Constructor for class org.apache.spark.mllib.tree.configuration.FeatureType
 
featureType() - Method in class org.apache.spark.mllib.tree.model.Split
 
FetchFailed - Class in org.apache.spark
:: DeveloperApi :: Task failed to fetch shuffle data from a remote node.
FetchFailed(BlockManagerId, int, int, int) - Constructor for class org.apache.spark.FetchFailed
 
field() - Method in class org.apache.spark.storage.BroadcastBlockId
 
FIFO() - Static method in class org.apache.spark.scheduler.SchedulingMode
 
files() - Method in class org.apache.spark.SparkContext
 
fileStream(String) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
Create an input stream that monitors a Hadoop-compatible filesystem for new files and reads them using the given key-value types and input format.
fileStream(String, ClassTag<K>, ClassTag<V>, ClassTag<F>) - Method in class org.apache.spark.streaming.StreamingContext
Create an input stream that monitors a Hadoop-compatible filesystem for new files and reads them using the given key-value types and input format.
fileStream(String, Function1<Path, Object>, boolean, ClassTag<K>, ClassTag<V>, ClassTag<F>) - Method in class org.apache.spark.streaming.StreamingContext
Create an input stream that monitors a Hadoop-compatible filesystem for new files and reads them using the given key-value types and input format.
filter(Function<Double, Boolean>) - Method in class org.apache.spark.api.java.JavaDoubleRDD
Return a new RDD containing only the elements that satisfy a predicate.
filter(Function<Tuple2<K, V>, Boolean>) - Method in class org.apache.spark.api.java.JavaPairRDD
Return a new RDD containing only the elements that satisfy a predicate.
filter(Function<T, Boolean>) - Method in class org.apache.spark.api.java.JavaRDD
Return a new RDD containing only the elements that satisfy a predicate.
filter(Function1<T, Object>) - Method in class org.apache.spark.rdd.RDD
Return a new RDD containing only the elements that satisfy a predicate.
filter(Function<Row, Boolean>) - Method in class org.apache.spark.sql.api.java.JavaSchemaRDD
Return a new RDD containing only the elements that satisfy a predicate.
Filter - Class in org.apache.spark.sql.execution
:: DeveloperApi ::
Filter(Expression, SparkPlan) - Constructor for class org.apache.spark.sql.execution.Filter
 
filter(Function1<Row, Object>) - Method in class org.apache.spark.sql.SchemaRDD
 
filter(Function<T, Boolean>) - Method in class org.apache.spark.streaming.api.java.JavaDStream
Return a new DStream containing only the elements that satisfy a predicate.
filter(Function<Tuple2<K, V>, Boolean>) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
Return a new DStream containing only the elements that satisfy a predicate.
filter(Function1<T, Object>) - Method in class org.apache.spark.streaming.dstream.DStream
Return a new DStream containing only the elements that satisfy a predicate.
filterWith(Function1<Object, A>, Function2<T, A, Object>) - Method in class org.apache.spark.rdd.RDD
Filters this RDD with p, where p takes an additional parameter of type A.
findExpression(CatalystFilter, Expression) - Static method in class org.apache.spark.sql.parquet.ParquetFilters
Try to find the given expression in the tree of filters in order to determine whether it is safe to remove it from the higher level filters.
finished() - Method in class org.apache.spark.scheduler.TaskInfo
 
finishTime() - Method in class org.apache.spark.scheduler.TaskInfo
The time when the task has completed successfully (including the time to remotely fetch results, if necessary).
first() - Method in class org.apache.spark.api.java.JavaDoubleRDD
 
first() - Method in class org.apache.spark.api.java.JavaPairRDD
 
first() - Method in interface org.apache.spark.api.java.JavaRDDLike
Return the first element in this RDD.
first() - Method in class org.apache.spark.rdd.RDD
Return the first element in this RDD.
flatMap(FlatMapFunction<T, U>) - Method in interface org.apache.spark.api.java.JavaRDDLike
Return a new RDD by first applying a function to all elements of this RDD, and then flattening the results.
flatMap(Function1<T, TraversableOnce<U>>, ClassTag<U>) - Method in class org.apache.spark.rdd.RDD
Return a new RDD by first applying a function to all elements of this RDD, and then flattening the results.
flatMap(FlatMapFunction<T, U>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
Return a new DStream by applying a function to all elements of this DStream, and then flattening the results
flatMap(Function1<T, Traversable<U>>, ClassTag<U>) - Method in class org.apache.spark.streaming.dstream.DStream
Return a new DStream by applying a function to all elements of this DStream, and then flattening the results
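For example, on an RDD (the DStream variant behaves analogously on each batch; `sc` is an existing SparkContext):

    val lines = sc.parallelize(Seq("hello world", "hello spark"))
    val words = lines.flatMap(_.split(" "))  // one output element per word
    words.collect()                          // Array(hello, world, hello, spark)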
FlatMapFunction<T,R> - Interface in org.apache.spark.api.java.function
A function that returns zero or more output records from each input record.
FlatMapFunction2<T1,T2,R> - Interface in org.apache.spark.api.java.function
A function that takes two inputs and returns zero or more output records.
flatMapToDouble(DoubleFlatMapFunction<T>) - Method in interface org.apache.spark.api.java.JavaRDDLike
Return a new RDD by first applying a function to all elements of this RDD, and then flattening the results.
flatMapToPair(PairFlatMapFunction<T, K2, V2>) - Method in interface org.apache.spark.api.java.JavaRDDLike
Return a new RDD by first applying a function to all elements of this RDD, and then flattening the results.
flatMapToPair(PairFlatMapFunction<T, K2, V2>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
Return a new DStream by applying a function to all elements of this DStream, and then flattening the results
flatMapValues(Function<V, Iterable<U>>) - Method in class org.apache.spark.api.java.JavaPairRDD
Pass each value in the key-value pair RDD through a flatMap function without changing the keys; this also retains the original RDD's partitioning.
flatMapValues(Function1<V, TraversableOnce<U>>) - Method in class org.apache.spark.rdd.PairRDDFunctions
Pass each value in the key-value pair RDD through a flatMap function without changing the keys; this also retains the original RDD's partitioning.
flatMapValues(Function<V, Iterable<U>>) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
Return a new DStream by applying a flatMap function to the value of each key-value pair in 'this' DStream without changing the key.
flatMapValues(Function1<V, TraversableOnce<U>>, ClassTag<U>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
Return a new DStream by applying a flatMap function to the value of each key-value pair in 'this' DStream without changing the key.
flatMapWith(Function1<Object, A>, boolean, Function2<T, A, Seq<U>>, ClassTag<U>) - Method in class org.apache.spark.rdd.RDD
FlatMaps f over this RDD, where f takes an additional parameter of type A.
floatToFloatWritable(float) - Static method in class org.apache.spark.SparkContext
 
floatWritableConverter() - Static method in class org.apache.spark.SparkContext
 
floor(Duration) - Method in class org.apache.spark.streaming.Time
 
FlumeUtils - Class in org.apache.spark.streaming.flume
 
FlumeUtils() - Constructor for class org.apache.spark.streaming.flume.FlumeUtils
 
flush() - Method in interface org.apache.spark.serializer.SerializationStream
 
fMeasureByThreshold(double) - Method in class org.apache.spark.mllib.evaluation.BinaryClassificationMetrics
Returns the (threshold, F-Measure) curve.
fMeasureByThreshold() - Method in class org.apache.spark.mllib.evaluation.BinaryClassificationMetrics
Returns the (threshold, F-Measure) curve with beta = 1.0.
fold(T, Function2<T, T, T>) - Method in interface org.apache.spark.api.java.JavaRDDLike
Aggregate the elements of each partition, and then the results for all the partitions, using a given associative function and a neutral "zero value".
fold(T, Function2<T, T, T>) - Method in class org.apache.spark.rdd.RDD
Aggregate the elements of each partition, and then the results for all the partitions, using a given associative function and a neutral "zero value".
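A minimal sketch of fold (`sc` is an existing SparkContext):

    val nums = sc.parallelize(1 to 5)
    // The zero value must be neutral for the operation: 0 for addition.
    nums.fold(0)(_ + _)   // 15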
foldByKey(V, Partitioner, Function2<V, V, V>) - Method in class org.apache.spark.api.java.JavaPairRDD
Merge the values for each key using an associative function and a neutral "zero value" which may be added to the result an arbitrary number of times, and must not change the result (e.g., Nil for list concatenation, 0 for addition, or 1 for multiplication).
foldByKey(V, int, Function2<V, V, V>) - Method in class org.apache.spark.api.java.JavaPairRDD
Merge the values for each key using an associative function and a neutral "zero value" which may be added to the result an arbitrary number of times, and must not change the result (e.g., Nil for list concatenation, 0 for addition, or 1 for multiplication).
foldByKey(V, Function2<V, V, V>) - Method in class org.apache.spark.api.java.JavaPairRDD
Merge the values for each key using an associative function and a neutral "zero value" which may be added to the result an arbitrary number of times, and must not change the result (e.g., Nil for list concatenation, 0 for addition, or 1 for multiplication).
foldByKey(V, Partitioner, Function2<V, V, V>) - Method in class org.apache.spark.rdd.PairRDDFunctions
Merge the values for each key using an associative function and a neutral "zero value" which may be added to the result an arbitrary number of times, and must not change the result (e.g., Nil for list concatenation, 0 for addition, or 1 for multiplication).
foldByKey(V, int, Function2<V, V, V>) - Method in class org.apache.spark.rdd.PairRDDFunctions
Merge the values for each key using an associative function and a neutral "zero value" which may be added to the result an arbitrary number of times, and must not change the result (e.g., Nil for list concatenation, 0 for addition, or 1 for multiplication).
foldByKey(V, Function2<V, V, V>) - Method in class org.apache.spark.rdd.PairRDDFunctions
Merge the values for each key using an associative function and a neutral "zero value" which may be added to the result an arbitrary number of times, and must not change the result (e.g., Nil for list concatenation, 0 for addition, or 1 for multiplication).
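For example (`sc` is an existing SparkContext):

    val pairs = sc.parallelize(Seq(("a", 1), ("a", 2), ("b", 3)))
    pairs.foldByKey(0)(_ + _).collect()   // Array((a,3), (b,3))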
foreach(VoidFunction<T>) - Method in interface org.apache.spark.api.java.JavaRDDLike
Applies a function f to all elements of this RDD.
foreach(Function1<T, BoxedUnit>) - Method in class org.apache.spark.rdd.RDD
Applies a function f to all elements of this RDD.
foreach(Function<R, Void>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
Deprecated.
As of release 0.9.0, replaced by foreachRDD
foreach(Function2<R, Time, Void>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
Deprecated.
As of release 0.9.0, replaced by foreachRDD
foreach(Function1<RDD<T>, BoxedUnit>) - Method in class org.apache.spark.streaming.dstream.DStream
Apply a function to each RDD in this DStream.
foreach(Function2<RDD<T>, Time, BoxedUnit>) - Method in class org.apache.spark.streaming.dstream.DStream
Apply a function to each RDD in this DStream.
foreachAsync(Function1<T, BoxedUnit>) - Method in class org.apache.spark.rdd.AsyncRDDActions
Applies a function f to all elements of this RDD.
foreachPartition(VoidFunction<Iterator<T>>) - Method in interface org.apache.spark.api.java.JavaRDDLike
Applies a function f to each partition of this RDD.
foreachPartition(Function1<Iterator<T>, BoxedUnit>) - Method in class org.apache.spark.rdd.RDD
Applies a function f to each partition of this RDD.
foreachPartitionAsync(Function1<Iterator<T>, BoxedUnit>) - Method in class org.apache.spark.rdd.AsyncRDDActions
Applies a function f to each partition of this RDD.
foreachRDD(Function<R, Void>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
Apply a function to each RDD in this DStream.
foreachRDD(Function2<R, Time, Void>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
Apply a function to each RDD in this DStream.
foreachRDD(Function1<RDD<T>, BoxedUnit>) - Method in class org.apache.spark.streaming.dstream.DStream
Apply a function to each RDD in this DStream.
foreachRDD(Function2<RDD<T>, Time, BoxedUnit>) - Method in class org.apache.spark.streaming.dstream.DStream
Apply a function to each RDD in this DStream.
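A common pattern with foreachRDD is pushing each micro-batch to an external store (a sketch; `wordCounts` and `saveRecord` are hypothetical names):

    wordCounts.foreachRDD { rdd =>
      rdd.foreachPartition { partition =>
        // Typically one connection per partition rather than per record.
        partition.foreach(record => saveRecord(record))
      }
    }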
foreachWith(Function1<Object, A>, Function2<T, A, BoxedUnit>) - Method in class org.apache.spark.rdd.RDD
Applies f to each element of this RDD, where f takes an additional parameter of type A.
formatExecutorId(String) - Method in class org.apache.spark.storage.StorageStatusListener
In the local mode, there is a discrepancy between the executor ID according to the task ("localhost") and that according to SparkEnv ("").
fraction() - Method in class org.apache.spark.sql.execution.Sample
 
fromAvroFlumeEvent(AvroFlumeEvent) - Static method in class org.apache.spark.streaming.flume.SparkFlumeEvent
 
fromDStream(DStream<T>, ClassTag<T>) - Static method in class org.apache.spark.streaming.api.java.JavaDStream
Convert a scala DStream to a Java-friendly JavaDStream.
fromInputDStream(InputDStream<T>, ClassTag<T>) - Static method in class org.apache.spark.streaming.api.java.JavaInputDStream
Convert a scala InputDStream to a Java-friendly JavaInputDStream.
fromInputDStream(InputDStream<Tuple2<K, V>>, ClassTag<K>, ClassTag<V>) - Static method in class org.apache.spark.streaming.api.java.JavaPairInputDStream
Convert a scala InputDStream of pairs to a Java-friendly JavaPairInputDStream.
fromJavaDStream(JavaDStream<Tuple2<K, V>>) - Static method in class org.apache.spark.streaming.api.java.JavaPairDStream
 
fromJavaRDD(JavaRDD<Tuple2<K, V>>) - Static method in class org.apache.spark.api.java.JavaPairRDD
Convert a JavaRDD of key-value pairs to JavaPairRDD.
fromPairDStream(DStream<Tuple2<K, V>>, ClassTag<K>, ClassTag<V>) - Static method in class org.apache.spark.streaming.api.java.JavaPairDStream
 
fromProductRdd(RDD<A>, TypeTags.TypeTag<A>) - Static method in class org.apache.spark.sql.execution.ExistingRdd
 
fromRDD(RDD<Object>) - Static method in class org.apache.spark.api.java.JavaDoubleRDD
 
fromRDD(RDD<Tuple2<K, V>>, ClassTag<K>, ClassTag<V>) - Static method in class org.apache.spark.api.java.JavaPairRDD
 
fromRDD(RDD<T>, ClassTag<T>) - Static method in class org.apache.spark.api.java.JavaRDD
 
fromRdd(RDD<?>) - Static method in class org.apache.spark.storage.RDDInfo
 
fromReceiverInputDStream(ReceiverInputDStream<Tuple2<K, V>>, ClassTag<K>, ClassTag<V>) - Static method in class org.apache.spark.streaming.api.java.JavaPairReceiverInputDStream
Convert a scala ReceiverInputDStream to a Java-friendly JavaReceiverInputDStream.
fromReceiverInputDStream(ReceiverInputDStream<T>, ClassTag<T>) - Static method in class org.apache.spark.streaming.api.java.JavaReceiverInputDStream
Convert a scala ReceiverInputDStream to a Java-friendly JavaReceiverInputDStream.
fromSparkContext(SparkContext) - Static method in class org.apache.spark.api.java.JavaSparkContext
 
fromStage(Stage) - Static method in class org.apache.spark.scheduler.StageInfo
Construct a StageInfo from a Stage.
Function<T1,R> - Interface in org.apache.spark.api.java.function
Base interface for functions whose return types do not create special RDDs.
Function2<T1,T2,R> - Interface in org.apache.spark.api.java.function
A two-argument function that takes arguments of type T1 and T2 and returns an R.
Function3<T1,T2,T3,R> - Interface in org.apache.spark.api.java.function
A three-argument function that takes arguments of type T1, T2 and T3 and returns an R.
FutureAction<T> - Interface in org.apache.spark
:: Experimental :: A future for the result of an action to support cancellation.

G

gain() - Method in class org.apache.spark.mllib.tree.model.InformationGainStats
 
GeneralizedLinearAlgorithm<M extends GeneralizedLinearModel> - Class in org.apache.spark.mllib.regression
:: DeveloperApi :: GeneralizedLinearAlgorithm implements methods to train a Generalized Linear Model (GLM).
GeneralizedLinearAlgorithm() - Constructor for class org.apache.spark.mllib.regression.GeneralizedLinearAlgorithm
 
GeneralizedLinearModel - Class in org.apache.spark.mllib.regression
:: DeveloperApi :: GeneralizedLinearModel (GLM) represents a model trained using GeneralizedLinearAlgorithm.
GeneralizedLinearModel(Vector, double) - Constructor for class org.apache.spark.mllib.regression.GeneralizedLinearModel
 
Generate - Class in org.apache.spark.sql.execution
:: DeveloperApi :: Applies a Generator to a stream of input rows, combining the output of each into a new stream of rows.
Generate(Generator, boolean, boolean, SparkPlan) - Constructor for class org.apache.spark.sql.execution.Generate
 
generate(Generator, boolean, boolean, Option<String>) - Method in class org.apache.spark.sql.SchemaRDD
:: Experimental :: Applies the given Generator, or table generating function, to this relation.
generatedRDDs() - Method in class org.apache.spark.streaming.dstream.DStream
 
generateKMeansRDD(SparkContext, int, int, int, double, int) - Static method in class org.apache.spark.mllib.util.KMeansDataGenerator
Generate an RDD containing test data for KMeans.
generateLinearInput(double, double[], int, int, double) - Static method in class org.apache.spark.mllib.util.LinearDataGenerator
 
generateLinearInputAsList(double, double[], int, int, double) - Static method in class org.apache.spark.mllib.util.LinearDataGenerator
Return a Java List of synthetic data randomly generated according to a multicollinear model.
generateLinearRDD(SparkContext, int, int, double, int, double) - Static method in class org.apache.spark.mllib.util.LinearDataGenerator
Generate an RDD containing sample data for Linear Regression models - including Ridge, Lasso, and unregularized variants.
generateLogisticRDD(SparkContext, int, int, double, int, double) - Static method in class org.apache.spark.mllib.util.LogisticRegressionDataGenerator
Generate an RDD containing test data for LogisticRegression.
generator() - Method in class org.apache.spark.sql.execution.Generate
 
get() - Method in interface org.apache.spark.FutureAction
Blocks and returns the result of this job.
get(String) - Method in class org.apache.spark.SparkConf
Get a parameter; throws a NoSuchElementException if it's not set
get(String, String) - Method in class org.apache.spark.SparkConf
Get a parameter, falling back to a default if not set
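For example, the throwing and defaulting variants on SparkConf:

    import org.apache.spark.SparkConf
    val conf = new SparkConf().setAppName("demo").setMaster("local[*]")
    conf.get("spark.app.name")               // "demo"; throws NoSuchElementException if unset
    conf.get("spark.executor.memory", "1g")  // falls back to the given default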
get() - Static method in class org.apache.spark.SparkEnv
Returns the ThreadLocal SparkEnv, if non-null.
get(String) - Static method in class org.apache.spark.SparkFiles
Get the absolute path of a file added through SparkContext.addFile().
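A minimal sketch pairing addFile with SparkFiles.get (the path is hypothetical; `sc` is an existing SparkContext):

    import org.apache.spark.SparkFiles
    sc.addFile("/path/to/lookup.txt")             // ship the file to every node
    val localPath = SparkFiles.get("lookup.txt")  // absolute local path on each node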
get(int) - Method in class org.apache.spark.sql.api.java.Row
Returns the value of column `i`.
get(String) - Method in interface org.apache.spark.sql.SQLConf
 
get(String, String) - Method in interface org.apache.spark.sql.SQLConf
 
getAkkaConf() - Method in class org.apache.spark.SparkConf
Get all akka conf variables set on this SparkConf
getAll() - Method in class org.apache.spark.SparkConf
Get all parameters as a list of pairs
getAll() - Method in interface org.apache.spark.sql.SQLConf
 
getAllPools() - Method in class org.apache.spark.SparkContext
:: DeveloperApi :: Return pools for fair scheduler
getBoolean(String, boolean) - Method in class org.apache.spark.SparkConf
Get a parameter as a boolean, falling back to a default if not set
getBoolean(int) - Method in class org.apache.spark.sql.api.java.Row
Returns the value of column i as a bool.
getByte(int) - Method in class org.apache.spark.sql.api.java.Row
Returns the value of column i as a byte.
getCachedBlockManagerId(BlockManagerId) - Static method in class org.apache.spark.storage.BlockManagerId
 
getCachedMetadata(String) - Static method in class org.apache.spark.rdd.HadoopRDD
The three methods below are helpers for accessing the local map, a property of the SparkEnv of the local process.
getCheckpointDir() - Method in class org.apache.spark.api.java.JavaSparkContext
 
getCheckpointDir() - Method in class org.apache.spark.SparkContext
 
getCheckpointFile() - Method in interface org.apache.spark.api.java.JavaRDDLike
Gets the name of the file to which this RDD was checkpointed
getCheckpointFile() - Method in class org.apache.spark.rdd.RDD
Gets the name of the file to which this RDD was checkpointed
getConf() - Method in class org.apache.spark.api.java.JavaSparkContext
Return a copy of this JavaSparkContext's configuration.
getConf() - Method in class org.apache.spark.rdd.HadoopRDD
 
getConf() - Method in class org.apache.spark.rdd.NewHadoopRDD
 
getConf() - Method in class org.apache.spark.SparkContext
Return a copy of this SparkContext's configuration.
getDependencies() - Method in class org.apache.spark.rdd.CoGroupedRDD
 
getDependencies() - Method in class org.apache.spark.rdd.ShuffledRDD
 
getDependencies() - Method in class org.apache.spark.rdd.UnionRDD
 
getDouble(String, double) - Method in class org.apache.spark.SparkConf
Get a parameter as a double, falling back to a default if not set
getDouble(int) - Method in class org.apache.spark.sql.api.java.Row
Returns the value of column i as a double.
getExecutorEnv() - Method in class org.apache.spark.SparkConf
Get all executor environment variables set on this SparkConf
getExecutorMemoryStatus() - Method in class org.apache.spark.SparkContext
Return a map from the slave to the max memory available for caching and the remaining memory available for caching.
getExecutorStorageStatus() - Method in class org.apache.spark.SparkContext
:: DeveloperApi :: Return information about blocks stored in all of the slaves
getFinalValue() - Method in class org.apache.spark.partial.PartialResult
Blocking method to wait for and return the final value.
getFloat(int) - Method in class org.apache.spark.sql.api.java.Row
Returns the value of column i as a float.
getHiveFile(String) - Method in class org.apache.spark.sql.hive.test.TestHiveContext
 
getInt(String, int) - Method in class org.apache.spark.SparkConf
Get a parameter as an integer, falling back to a default if not set
getInt(int) - Method in class org.apache.spark.sql.api.java.Row
Returns the value of column i as an int.
getLocalProperty(String) - Method in class org.apache.spark.api.java.JavaSparkContext
Get a local property set in this thread, or null if it is missing.
getLocalProperty(String) - Method in class org.apache.spark.SparkContext
Get a local property set in this thread, or null if it is missing.
getLong(String, long) - Method in class org.apache.spark.SparkConf
Get a parameter as a long, falling back to a default if not set
getLong(int) - Method in class org.apache.spark.sql.api.java.Row
Returns the value of column i as a long.
getOption(String) - Method in class org.apache.spark.SparkConf
Get a parameter as an Option
getOption(String) - Method in interface org.apache.spark.sql.SQLConf
 
getOrCreate(String, JavaStreamingContextFactory) - Static method in class org.apache.spark.streaming.api.java.JavaStreamingContext
Either recreate a StreamingContext from checkpoint data or create a new StreamingContext.
getOrCreate(String, Configuration, JavaStreamingContextFactory) - Static method in class org.apache.spark.streaming.api.java.JavaStreamingContext
Either recreate a StreamingContext from checkpoint data or create a new StreamingContext.
getOrCreate(String, Configuration, JavaStreamingContextFactory, boolean) - Static method in class org.apache.spark.streaming.api.java.JavaStreamingContext
Either recreate a StreamingContext from checkpoint data or create a new StreamingContext.
getOrCreate(String, Function0<StreamingContext>, Configuration, boolean) - Static method in class org.apache.spark.streaming.StreamingContext
Either recreate a StreamingContext from checkpoint data or create a new StreamingContext.
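A sketch of the usual recovery pattern (the checkpoint directory is hypothetical; `conf` is an existing SparkConf):

    import org.apache.spark.streaming.{Seconds, StreamingContext}
    val checkpointDir = "/tmp/checkpoint"
    def createContext(): StreamingContext = {
      val newSsc = new StreamingContext(conf, Seconds(1))
      newSsc.checkpoint(checkpointDir)
      newSsc
    }
    // Recreates from checkpoint data if present, otherwise calls createContext.
    val ssc = StreamingContext.getOrCreate(checkpointDir, createContext _)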
getParents(int) - Method in class org.apache.spark.NarrowDependency
Get the parent partitions for a child partition.
getParents(int) - Method in class org.apache.spark.OneToOneDependency
 
getParents(int) - Method in class org.apache.spark.RangeDependency
 
getPartition(Object) - Method in class org.apache.spark.HashPartitioner
 
getPartition(Object) - Method in class org.apache.spark.Partitioner
 
getPartition(Object) - Method in class org.apache.spark.RangePartitioner
 
getPartitions() - Method in class org.apache.spark.rdd.CoGroupedRDD
 
getPartitions() - Method in class org.apache.spark.rdd.HadoopRDD
 
getPartitions() - Method in class org.apache.spark.rdd.JdbcRDD
 
getPartitions() - Method in class org.apache.spark.rdd.NewHadoopRDD
 
getPartitions() - Method in class org.apache.spark.rdd.ShuffledRDD
 
getPartitions() - Method in class org.apache.spark.rdd.UnionRDD
 
getPartitions() - Method in class org.apache.spark.sql.SchemaRDD
 
getPersistentRDDs() - Method in class org.apache.spark.SparkContext
Returns an immutable map of RDDs that have marked themselves as persistent via cache() call.
getPoolForName(String) - Method in class org.apache.spark.SparkContext
:: DeveloperApi :: Return the pool associated with the given name, if one exists
getPreferredLocations(Partition) - Method in class org.apache.spark.rdd.HadoopRDD
 
getPreferredLocations(Partition) - Method in class org.apache.spark.rdd.NewHadoopRDD
 
getPreferredLocations(Partition) - Method in class org.apache.spark.rdd.UnionRDD
 
getRDDStorageInfo() - Method in class org.apache.spark.SparkContext
:: DeveloperApi :: Return information about what RDDs are cached, if they are in mem or on disk, how much space they take, etc.
getReceiver() - Method in class org.apache.spark.streaming.dstream.ReceiverInputDStream
Gets the receiver object that will be sent to the worker nodes to receive data.
getRootDirectory() - Static method in class org.apache.spark.SparkFiles
Get the root directory that contains files added through SparkContext.addFile().
getSchedulingMode() - Method in class org.apache.spark.SparkContext
Return current scheduling mode
getSerializer(Serializer) - Method in interface org.apache.spark.serializer.Serializer
 
getShort(int) - Method in class org.apache.spark.sql.api.java.Row
Returns the value of column i as a short.
getSparkHome() - Method in class org.apache.spark.api.java.JavaSparkContext
Get Spark's home location from either a value set through the constructor, or the spark.home Java property, or the SPARK_HOME environment variable (in that order of preference).
getStorageLevel() - Method in interface org.apache.spark.api.java.JavaRDDLike
Get the RDD's current storage level, or StorageLevel.NONE if none is set.
getStorageLevel() - Method in class org.apache.spark.rdd.RDD
Get the RDD's current storage level, or StorageLevel.NONE if none is set.
getString(int) - Method in class org.apache.spark.sql.api.java.Row
Returns the value of column i as a String.
getThreadLocal() - Static method in class org.apache.spark.SparkEnv
Returns the ThreadLocal SparkEnv.
gettingResult() - Method in class org.apache.spark.scheduler.TaskInfo
 
gettingResultTime() - Method in class org.apache.spark.scheduler.TaskInfo
The time when the task started remotely getting the result.
Gini - Class in org.apache.spark.mllib.tree.impurity
:: Experimental :: Class for calculating the Gini impurity during binary classification.
Gini() - Constructor for class org.apache.spark.mllib.tree.impurity.Gini
 
global() - Method in class org.apache.spark.sql.execution.Sort
 
glom() - Method in interface org.apache.spark.api.java.JavaRDDLike
Return an RDD created by coalescing all elements within each partition into an array.
glom() - Method in class org.apache.spark.rdd.RDD
Return an RDD created by coalescing all elements within each partition into an array.
glom() - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
Return a new DStream in which each RDD is generated by applying glom() to each RDD of this DStream.
glom() - Method in class org.apache.spark.streaming.dstream.DStream
Return a new DStream in which each RDD is generated by applying glom() to each RDD of this DStream.
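For example (`sc` is an existing SparkContext):

    val nums = sc.parallelize(1 to 8, 2)
    nums.glom().collect()   // Array(Array(1, 2, 3, 4), Array(5, 6, 7, 8))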
Gradient - Class in org.apache.spark.mllib.optimization
:: DeveloperApi :: Class used to compute the gradient for a loss function, given a single data point.
Gradient() - Constructor for class org.apache.spark.mllib.optimization.Gradient
 
GradientDescent - Class in org.apache.spark.mllib.optimization
Class used to solve an optimization problem using Gradient Descent.
graph() - Method in class org.apache.spark.streaming.dstream.DStream
 
graph() - Method in class org.apache.spark.streaming.StreamingContext
 
groupBy(Function<T, K>) - Method in interface org.apache.spark.api.java.JavaRDDLike
Return an RDD of grouped elements.
groupBy(Function<T, K>, int) - Method in interface org.apache.spark.api.java.JavaRDDLike
Return an RDD of grouped elements.
groupBy(Function1<T, K>, ClassTag<K>) - Method in class org.apache.spark.rdd.RDD
Return an RDD of grouped items.
groupBy(Function1<T, K>, int, ClassTag<K>) - Method in class org.apache.spark.rdd.RDD
Return an RDD of grouped elements.
groupBy(Function1<T, K>, Partitioner, ClassTag<K>, Ordering<K>) - Method in class org.apache.spark.rdd.RDD
Return an RDD of grouped items.
groupBy(Seq<Expression>, Seq<Expression>) - Method in class org.apache.spark.sql.SchemaRDD
Performs a grouping followed by an aggregation.
groupByKey(Partitioner) - Method in class org.apache.spark.api.java.JavaPairRDD
Group the values for each key in the RDD into a single sequence.
groupByKey(int) - Method in class org.apache.spark.api.java.JavaPairRDD
Group the values for each key in the RDD into a single sequence.
groupByKey() - Method in class org.apache.spark.api.java.JavaPairRDD
Group the values for each key in the RDD into a single sequence.
groupByKey(Partitioner) - Method in class org.apache.spark.rdd.PairRDDFunctions
Group the values for each key in the RDD into a single sequence.
groupByKey(int) - Method in class org.apache.spark.rdd.PairRDDFunctions
Group the values for each key in the RDD into a single sequence.
groupByKey() - Method in class org.apache.spark.rdd.PairRDDFunctions
Group the values for each key in the RDD into a single sequence.
groupByKey() - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
Return a new DStream by applying groupByKey to each RDD.
groupByKey(int) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
Return a new DStream by applying groupByKey to each RDD.
groupByKey(Partitioner) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
Return a new DStream by applying groupByKey on each RDD of this DStream.
groupByKey() - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
Return a new DStream by applying groupByKey to each RDD.
groupByKey(int) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
Return a new DStream by applying groupByKey to each RDD.
groupByKey(Partitioner) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
Return a new DStream by applying groupByKey on each RDD.
groupByKeyAndWindow(Duration) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
Return a new DStream by applying groupByKey over a sliding window.
groupByKeyAndWindow(Duration, Duration) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
Return a new DStream by applying groupByKey over a sliding window.
groupByKeyAndWindow(Duration, Duration, int) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
Return a new DStream by applying groupByKey over a sliding window on this DStream.
groupByKeyAndWindow(Duration, Duration, Partitioner) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
Return a new DStream by applying groupByKey over a sliding window on this DStream.
groupByKeyAndWindow(Duration) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
Return a new DStream by applying groupByKey over a sliding window.
groupByKeyAndWindow(Duration, Duration) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
Return a new DStream by applying groupByKey over a sliding window.
groupByKeyAndWindow(Duration, Duration, int) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
Return a new DStream by applying groupByKey over a sliding window on this DStream.
groupByKeyAndWindow(Duration, Duration, Partitioner) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
Create a new DStream by applying groupByKey over a sliding window on this DStream.
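A minimal sketch (`pairStream` is a hypothetical DStream[(String, Int)]):

    import org.apache.spark.streaming.Seconds
    // Group values per key over the last 60 seconds, recomputed every 20 seconds.
    val grouped = pairStream.groupByKeyAndWindow(Seconds(60), Seconds(20))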
groupingExpressions() - Method in class org.apache.spark.sql.execution.Aggregate
 
groupWith(JavaPairRDD<K, W>) - Method in class org.apache.spark.api.java.JavaPairRDD
Alias for cogroup.
groupWith(JavaPairRDD<K, W1>, JavaPairRDD<K, W2>) - Method in class org.apache.spark.api.java.JavaPairRDD
Alias for cogroup.
groupWith(RDD<Tuple2<K, W>>) - Method in class org.apache.spark.rdd.PairRDDFunctions
Alias for cogroup.
groupWith(RDD<Tuple2<K, W1>>, RDD<Tuple2<K, W2>>) - Method in class org.apache.spark.rdd.PairRDDFunctions
Alias for cogroup.

H

hadoopConfiguration() - Method in class org.apache.spark.api.java.JavaSparkContext
Returns the Hadoop configuration used for the Hadoop code (e.g.
hadoopConfiguration() - Method in class org.apache.spark.SparkContext
A default Hadoop Configuration for the Hadoop code (e.g.
hadoopFile(String, Class<F>, Class<K>, Class<V>, int) - Method in class org.apache.spark.api.java.JavaSparkContext
Get an RDD for a Hadoop file with an arbitrary InputFormat.
hadoopFile(String, Class<F>, Class<K>, Class<V>) - Method in class org.apache.spark.api.java.JavaSparkContext
Get an RDD for a Hadoop file with an arbitrary InputFormat.
hadoopFile(String, Class<? extends InputFormat<K, V>>, Class<K>, Class<V>, int) - Method in class org.apache.spark.SparkContext
Get an RDD for a Hadoop file with an arbitrary InputFormat.
hadoopFile(String, int, ClassTag<K>, ClassTag<V>, ClassTag<F>) - Method in class org.apache.spark.SparkContext
Smarter version of hadoopFile() that uses class tags to figure out the classes of keys, values and the InputFormat so that users don't need to pass them directly.
hadoopFile(String, ClassTag<K>, ClassTag<V>, ClassTag<F>) - Method in class org.apache.spark.SparkContext
Smarter version of hadoopFile() that uses class tags to figure out the classes of keys, values and the InputFormat so that users don't need to pass them directly.
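For illustration, a sketch of the class-tag version (the path is a placeholder, `sc` an existing SparkContext). Note that Hadoop RecordReaders may reuse Writable objects, so copy values out before caching:

    import org.apache.hadoop.io.{LongWritable, Text}
    import org.apache.hadoop.mapred.TextInputFormat

    val records = sc.hadoopFile[LongWritable, Text, TextInputFormat]("hdfs://host/path")
    val lines = records.map { case (_, text) => text.toString }  // copy out of the reused Text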
hadoopJobMetadata() - Method in class org.apache.spark.SparkEnv
 
hadoopRDD(JobConf, Class<F>, Class<K>, Class<V>, int) - Method in class org.apache.spark.api.java.JavaSparkContext
Get an RDD for a Hadoop-readable dataset from a Hadoop JobConf given its InputFormat and any other necessary info (e.g.
hadoopRDD(JobConf, Class<F>, Class<K>, Class<V>) - Method in class org.apache.spark.api.java.JavaSparkContext
Get an RDD for a Hadoop-readable dataset from a Hadoop JobConf given its InputFormat and any other necessary info (e.g.
HadoopRDD<K,V> - Class in org.apache.spark.rdd
:: DeveloperApi :: An RDD that provides core functionality for reading data stored in Hadoop (e.g., files in HDFS, sources in HBase, or S3), using the older MapReduce API (org.apache.hadoop.mapred).
HadoopRDD(SparkContext, Broadcast<SerializableWritable<Configuration>>, Option<Function1<JobConf, BoxedUnit>>, Class<? extends InputFormat<K, V>>, Class<K>, Class<V>, int) - Constructor for class org.apache.spark.rdd.HadoopRDD
 
HadoopRDD(SparkContext, JobConf, Class<? extends InputFormat<K, V>>, Class<K>, Class<V>, int) - Constructor for class org.apache.spark.rdd.HadoopRDD
 
hadoopRDD(JobConf, Class<? extends InputFormat<K, V>>, Class<K>, Class<V>, int) - Method in class org.apache.spark.SparkContext
Get an RDD for a Hadoop-readable dataset from a Hadoop JobConf given its InputFormat and other necessary info (e.g.
hashCode() - Method in interface org.apache.spark.mllib.linalg.Vector
 
hashCode() - Method in interface org.apache.spark.Partition
 
hashCode() - Method in class org.apache.spark.scheduler.InputFormatInfo
 
hashCode() - Method in class org.apache.spark.scheduler.SplitInfo
 
hashCode() - Method in class org.apache.spark.storage.BlockId
 
hashCode() - Method in class org.apache.spark.storage.BlockManagerId
 
hashCode() - Method in class org.apache.spark.storage.StorageLevel
 
HashJoin - Class in org.apache.spark.sql.execution
:: DeveloperApi ::
HashJoin(Seq<Expression>, Seq<Expression>, BuildSide, SparkPlan, SparkPlan) - Constructor for class org.apache.spark.sql.execution.HashJoin
 
HashPartitioner - Class in org.apache.spark
A Partitioner that implements hash-based partitioning using Java's Object.hashCode.
HashPartitioner(int) - Constructor for class org.apache.spark.HashPartitioner
 
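A sketch of partitioning a pair RDD by key into four partitions (data illustrative, `sc` assumed):

    import org.apache.spark.HashPartitioner
    import org.apache.spark.SparkContext._

    val pairs = sc.parallelize(Seq(("a", 1), ("b", 2), ("c", 3)))
    val partitioned = pairs.partitionBy(new HashPartitioner(4))
    println(partitioned.partitions.length)   // 4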
hasNext() - Method in class org.apache.spark.InterruptibleIterator
 
high() - Method in class org.apache.spark.partial.BoundedDouble
 
HingeGradient - Class in org.apache.spark.mllib.optimization
:: DeveloperApi :: Compute gradient and loss for a Hinge loss function, as used in SVM binary classification.
HingeGradient() - Constructor for class org.apache.spark.mllib.optimization.HingeGradient
 
histogram(int) - Method in class org.apache.spark.api.java.JavaDoubleRDD
Compute a histogram of the data using bucketCount number of buckets evenly spaced between the minimum and maximum of the RDD.
histogram(double[]) - Method in class org.apache.spark.api.java.JavaDoubleRDD
Compute a histogram using the provided buckets.
histogram(Double[], boolean) - Method in class org.apache.spark.api.java.JavaDoubleRDD
 
histogram(int) - Method in class org.apache.spark.rdd.DoubleRDDFunctions
Compute a histogram of the data using bucketCount number of buckets evenly spaced between the minimum and maximum of the RDD.
histogram(double[], boolean) - Method in class org.apache.spark.rdd.DoubleRDDFunctions
Compute a histogram using the provided buckets.
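A sketch of both histogram forms on an RDD of doubles (values illustrative, `sc` assumed):

    import org.apache.spark.SparkContext._   // implicit conversion to DoubleRDDFunctions

    val values = sc.parallelize(Seq(1.0, 2.5, 3.7, 4.2, 9.9))
    val (buckets, counts) = values.histogram(4)                // 4 evenly spaced buckets
    val fixedCounts = values.histogram(Array(0.0, 5.0, 10.0))  // counts per given bucket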
HiveContext - Class in org.apache.spark.sql.hive
An instance of the Spark SQL execution engine that integrates with data stored in Hive.
HiveContext(SparkContext) - Constructor for class org.apache.spark.sql.hive.HiveContext
 
hiveDevHome() - Method in class org.apache.spark.sql.hive.test.TestHiveContext
The location of the hive source code.
hiveFilesTemp() - Method in class org.apache.spark.sql.hive.test.TestHiveContext
 
hiveHome() - Method in class org.apache.spark.sql.hive.test.TestHiveContext
The location of the compiled Hive distribution.
HiveMetastoreTypes - Class in org.apache.spark.sql.hive
:: DeveloperApi :: Provides conversions between Spark SQL data types and Hive Metastore types.
HiveMetastoreTypes() - Constructor for class org.apache.spark.sql.hive.HiveMetastoreTypes
 
hivePlanner() - Method in class org.apache.spark.sql.hive.HiveContext
 
hiveql(String) - Method in class org.apache.spark.sql.hive.HiveContext
Executes a query expressed in HiveQL using Spark, returning the result as a SchemaRDD.
hiveQTestUtilTables() - Method in class org.apache.spark.sql.hive.test.TestHiveContext
 
hiveString() - Method in class org.apache.spark.sql.hive.execution.DescribeHiveTableCommand
 
HiveTableScan - Class in org.apache.spark.sql.hive.execution
:: DeveloperApi :: The Hive table scan operator.
HiveTableScan(Seq<Attribute>, org.apache.spark.sql.hive.MetastoreRelation, Option<Expression>, HiveContext) - Constructor for class org.apache.spark.sql.hive.execution.HiveTableScan
 
host() - Method in class org.apache.spark.scheduler.TaskInfo
 
host() - Method in class org.apache.spark.storage.BlockManagerId
 
hostLocation() - Method in class org.apache.spark.scheduler.SplitInfo
 
hostPort() - Method in class org.apache.spark.storage.BlockManagerId
 
hours() - Static method in class org.apache.spark.scheduler.StatsReportListener
 
hql(String) - Method in class org.apache.spark.sql.hive.api.java.JavaHiveContext
Executes a query expressed in HiveQL, returning the result as a JavaSchemaRDD.
hql(String) - Method in class org.apache.spark.sql.hive.HiveContext
An alias for `hiveql`.
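A sketch, assuming a reachable Hive deployment and an existing SparkContext `sc` (the table name is hypothetical):

    import org.apache.spark.sql.hive.HiveContext

    val hiveContext = new HiveContext(sc)
    val rows = hiveContext.hql("SELECT key, value FROM src LIMIT 10")  // a SchemaRDD
    rows.collect().foreach(println)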
HttpBroadcastFactory - Class in org.apache.spark.broadcast
A BroadcastFactory implementation that uses an HTTP server as the broadcast mechanism.
HttpBroadcastFactory() - Constructor for class org.apache.spark.broadcast.HttpBroadcastFactory
 
httpFileServer() - Method in class org.apache.spark.SparkEnv
 

I

i() - Method in class org.apache.spark.mllib.linalg.distributed.MatrixEntry
 
id() - Method in class org.apache.spark.Accumulable
 
id() - Method in interface org.apache.spark.api.java.JavaRDDLike
A unique ID for this RDD (within its SparkContext).
id() - Method in class org.apache.spark.broadcast.Broadcast
 
id() - Method in class org.apache.spark.mllib.tree.model.Node
 
id() - Method in class org.apache.spark.rdd.RDD
A unique ID for this RDD (within its SparkContext).
id() - Method in class org.apache.spark.storage.RDDInfo
 
id() - Method in class org.apache.spark.streaming.dstream.ReceiverInputDStream
This is a unique identifier for the network input stream.
impurity() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
 
Impurity - Interface in org.apache.spark.mllib.tree.impurity
:: Experimental :: Trait for calculating information gain.
impurity() - Method in class org.apache.spark.mllib.tree.model.InformationGainStats
 
index() - Method in class org.apache.spark.mllib.linalg.distributed.IndexedRow
 
index() - Method in interface org.apache.spark.Partition
Get the split's index within its parent RDD
index() - Method in class org.apache.spark.scheduler.TaskInfo
 
IndexedRow - Class in org.apache.spark.mllib.linalg.distributed
:: Experimental :: Represents a row of IndexedRowMatrix.
IndexedRow(long, Vector) - Constructor for class org.apache.spark.mllib.linalg.distributed.IndexedRow
 
IndexedRowMatrix - Class in org.apache.spark.mllib.linalg.distributed
:: Experimental :: Represents a row-oriented DistributedMatrix with indexed rows.
IndexedRowMatrix(RDD<IndexedRow>, long, int) - Constructor for class org.apache.spark.mllib.linalg.distributed.IndexedRowMatrix
 
IndexedRowMatrix(RDD<IndexedRow>) - Constructor for class org.apache.spark.mllib.linalg.distributed.IndexedRowMatrix
Alternative constructor leaving matrix dimensions to be determined automatically.
indices() - Method in class org.apache.spark.mllib.linalg.SparseVector
 
InformationGainStats - Class in org.apache.spark.mllib.tree.model
:: DeveloperApi :: Information gain statistics for each split
InformationGainStats(double, double, double, double, double) - Constructor for class org.apache.spark.mllib.tree.model.InformationGainStats
 
initialize(boolean, SparkConf, org.apache.spark.SecurityManager) - Method in interface org.apache.spark.broadcast.BroadcastFactory
 
initialize(boolean, SparkConf, org.apache.spark.SecurityManager) - Method in class org.apache.spark.broadcast.HttpBroadcastFactory
 
initialize(boolean, SparkConf, org.apache.spark.SecurityManager) - Method in class org.apache.spark.broadcast.TorrentBroadcastFactory
 
initialized() - Method in interface org.apache.spark.Logging
 
initializeIfNecessary() - Method in interface org.apache.spark.Logging
 
initializeLogging() - Method in interface org.apache.spark.Logging
 
initialValue() - Method in class org.apache.spark.partial.PartialResult
 
initLocalProperties() - Method in class org.apache.spark.SparkContext
 
initLock() - Method in interface org.apache.spark.Logging
 
input() - Method in class org.apache.spark.sql.hive.execution.ScriptTransformation
 
inputDStream() - Method in class org.apache.spark.streaming.api.java.JavaInputDStream
 
inputDStream() - Method in class org.apache.spark.streaming.api.java.JavaPairInputDStream
 
InputDStream<T> - Class in org.apache.spark.streaming.dstream
This is the abstract base class for all input streams.
InputDStream(StreamingContext, ClassTag<T>) - Constructor for class org.apache.spark.streaming.dstream.InputDStream
 
inputFormatClazz() - Method in class org.apache.spark.scheduler.InputFormatInfo
 
inputFormatClazz() - Method in class org.apache.spark.scheduler.SplitInfo
 
InputFormatInfo - Class in org.apache.spark.scheduler
:: DeveloperApi :: Parses and holds information about inputFormat (and files) specified as a parameter.
InputFormatInfo(Configuration, Class<?>, String) - Constructor for class org.apache.spark.scheduler.InputFormatInfo
 
inRepoTests() - Method in class org.apache.spark.sql.hive.test.TestHiveContext
 
InsertIntoHiveTable - Class in org.apache.spark.sql.hive.execution
:: DeveloperApi ::
InsertIntoHiveTable(org.apache.spark.sql.hive.MetastoreRelation, Map<String, Option<String>>, SparkPlan, boolean, HiveContext) - Constructor for class org.apache.spark.sql.hive.execution.InsertIntoHiveTable
 
InsertIntoParquetTable - Class in org.apache.spark.sql.parquet
Operator that acts as a sink for queries on RDDs and can be used to store the output inside a directory of Parquet files.
InsertIntoParquetTable(ParquetRelation, SparkPlan, boolean, SQLContext) - Constructor for class org.apache.spark.sql.parquet.InsertIntoParquetTable
 
intAccumulator(int) - Method in class org.apache.spark.api.java.JavaSparkContext
Create an Accumulator integer variable, which tasks can "add" values to using the add method.
intercept() - Method in class org.apache.spark.mllib.classification.LogisticRegressionModel
 
intercept() - Method in class org.apache.spark.mllib.classification.SVMModel
 
intercept() - Method in class org.apache.spark.mllib.regression.GeneralizedLinearModel
 
intercept() - Method in class org.apache.spark.mllib.regression.LassoModel
 
intercept() - Method in class org.apache.spark.mllib.regression.LinearRegressionModel
 
intercept() - Method in class org.apache.spark.mllib.regression.RidgeRegressionModel
 
interrupted() - Method in class org.apache.spark.TaskContext
 
InterruptibleIterator<T> - Class in org.apache.spark
:: DeveloperApi :: An iterator that wraps around an existing iterator to provide task killing functionality.
InterruptibleIterator(TaskContext, Iterator<T>) - Constructor for class org.apache.spark.InterruptibleIterator
 
intersection(JavaDoubleRDD) - Method in class org.apache.spark.api.java.JavaDoubleRDD
Return the intersection of this RDD and another one.
intersection(JavaPairRDD<K, V>) - Method in class org.apache.spark.api.java.JavaPairRDD
Return the intersection of this RDD and another one.
intersection(JavaRDD<T>) - Method in class org.apache.spark.api.java.JavaRDD
Return the intersection of this RDD and another one.
intersection(RDD<T>) - Method in class org.apache.spark.rdd.RDD
Return the intersection of this RDD and another one.
intersection(RDD<T>, Partitioner, Ordering<T>) - Method in class org.apache.spark.rdd.RDD
Return the intersection of this RDD and another one.
intersection(RDD<T>, int) - Method in class org.apache.spark.rdd.RDD
Return the intersection of this RDD and another one.
intersection(JavaSchemaRDD) - Method in class org.apache.spark.sql.api.java.JavaSchemaRDD
Return the intersection of this RDD and another one.
intersection(JavaSchemaRDD, Partitioner) - Method in class org.apache.spark.sql.api.java.JavaSchemaRDD
Return the intersection of this RDD and another one.
intersection(JavaSchemaRDD, int) - Method in class org.apache.spark.sql.api.java.JavaSchemaRDD
Return the intersection of this RDD and another one.
intersection(RDD<Row>) - Method in class org.apache.spark.sql.SchemaRDD
 
intersection(RDD<Row>, Partitioner, Ordering<Row>) - Method in class org.apache.spark.sql.SchemaRDD
 
intersection(RDD<Row>, int) - Method in class org.apache.spark.sql.SchemaRDD
 
intToIntWritable(int) - Static method in class org.apache.spark.SparkContext
 
intWritableConverter() - Static method in class org.apache.spark.SparkContext
 
isAllowed(Enumeration.Value, Enumeration.Value) - Static method in class org.apache.spark.scheduler.TaskLocality
 
isBroadcast() - Method in class org.apache.spark.storage.BlockId
 
isCached(String) - Method in class org.apache.spark.sql.SQLContext
Returns true if the table is currently cached in-memory.
isCheckpointed() - Method in interface org.apache.spark.api.java.JavaRDDLike
Return whether this RDD has been checkpointed or not
isCheckpointed() - Method in class org.apache.spark.rdd.RDD
Return whether this RDD has been checkpointed or not
isCheckpointPresent() - Method in class org.apache.spark.streaming.StreamingContext
 
isCompleted() - Method in class org.apache.spark.ComplexFutureAction
 
isCompleted() - Method in interface org.apache.spark.FutureAction
Returns whether the action has already been completed with a value or an exception.
isCompleted() - Method in class org.apache.spark.SimpleFutureAction
 
isExtended() - Method in class org.apache.spark.sql.hive.execution.DescribeHiveTableCommand
 
isInitialValueFinal() - Method in class org.apache.spark.partial.PartialResult
 
isLeaf() - Method in class org.apache.spark.mllib.tree.model.Node
 
isLocal() - Method in class org.apache.spark.api.java.JavaSparkContext
 
isLocal() - Method in class org.apache.spark.SparkContext
 
isMultipleOf(Duration) - Method in class org.apache.spark.streaming.Duration
 
isMultipleOf(Duration) - Method in class org.apache.spark.streaming.Time
 
isNullAt(int) - Method in class org.apache.spark.sql.api.java.Row
Returns true if the value at column `i` is NULL.
isRDD() - Method in class org.apache.spark.storage.BlockId
 
isShuffle() - Method in class org.apache.spark.storage.BlockId
 
isStarted() - Method in class org.apache.spark.streaming.receiver.Receiver
Check if the receiver has started or not.
isStopped() - Method in class org.apache.spark.streaming.receiver.Receiver
Check if the receiver has been marked for stopping.
isTraceEnabled() - Method in interface org.apache.spark.Logging
 
isValid() - Method in class org.apache.spark.storage.StorageLevel
 
isZero() - Method in class org.apache.spark.streaming.Duration
 
iterator(Partition, TaskContext) - Method in interface org.apache.spark.api.java.JavaRDDLike
Internal method to this RDD; will read from cache if applicable, or otherwise compute it.
iterator(Partition, TaskContext) - Method in class org.apache.spark.rdd.RDD
Internal method to this RDD; will read from cache if applicable, or otherwise compute it.

J

j() - Method in class org.apache.spark.mllib.linalg.distributed.MatrixEntry
 
jarOfClass(Class<?>) - Static method in class org.apache.spark.api.java.JavaSparkContext
Find the JAR from which a given class was loaded, to make it easy for users to pass their JARs to SparkContext.
jarOfClass(Class<?>) - Static method in class org.apache.spark.SparkContext
Find the JAR from which a given class was loaded, to make it easy for users to pass their JARs to SparkContext.
jarOfClass(Class<?>) - Static method in class org.apache.spark.streaming.api.java.JavaStreamingContext
Find the JAR from which a given class was loaded, to make it easy for users to pass their JARs to StreamingContext.
jarOfClass(Class<?>) - Static method in class org.apache.spark.streaming.StreamingContext
Find the JAR from which a given class was loaded, to make it easy for users to pass their JARs to StreamingContext.
jarOfObject(Object) - Static method in class org.apache.spark.api.java.JavaSparkContext
Find the JAR that contains the class of a particular object, to make it easy for users to pass their JARs to SparkContext.
jarOfObject(Object) - Static method in class org.apache.spark.SparkContext
Find the JAR that contains the class of a particular object, to make it easy for users to pass their JARs to SparkContext.
jars() - Method in class org.apache.spark.api.java.JavaSparkContext
 
jars() - Method in class org.apache.spark.SparkContext
 
JavaDoubleRDD - Class in org.apache.spark.api.java
 
JavaDoubleRDD(RDD<Object>) - Constructor for class org.apache.spark.api.java.JavaDoubleRDD
 
JavaDStream<T> - Class in org.apache.spark.streaming.api.java
A Java-friendly interface to DStream, the basic abstraction in Spark Streaming that represents a continuous stream of data.
JavaDStream(DStream<T>, ClassTag<T>) - Constructor for class org.apache.spark.streaming.api.java.JavaDStream
 
JavaDStreamLike<T,This extends JavaDStreamLike<T,This,R>,R extends JavaRDDLike<T,R>> - Interface in org.apache.spark.streaming.api.java
 
JavaHiveContext - Class in org.apache.spark.sql.hive.api.java
The entry point for executing Spark SQL queries from a Java program.
JavaHiveContext(JavaSparkContext) - Constructor for class org.apache.spark.sql.hive.api.java.JavaHiveContext
 
JavaInputDStream<T> - Class in org.apache.spark.streaming.api.java
A Java-friendly interface to InputDStream.
JavaInputDStream(InputDStream<T>, ClassTag<T>) - Constructor for class org.apache.spark.streaming.api.java.JavaInputDStream
 
JavaPairDStream<K,V> - Class in org.apache.spark.streaming.api.java
A Java-friendly interface to a DStream of key-value pairs, which provides extra methods like reduceByKey and join.
JavaPairDStream(DStream<Tuple2<K, V>>, ClassTag<K>, ClassTag<V>) - Constructor for class org.apache.spark.streaming.api.java.JavaPairDStream
 
JavaPairInputDStream<K,V> - Class in org.apache.spark.streaming.api.java
A Java-friendly interface to InputDStream of key-value pairs.
JavaPairInputDStream(InputDStream<Tuple2<K, V>>, ClassTag<K>, ClassTag<V>) - Constructor for class org.apache.spark.streaming.api.java.JavaPairInputDStream
 
JavaPairRDD<K,V> - Class in org.apache.spark.api.java
 
JavaPairRDD(RDD<Tuple2<K, V>>, ClassTag<K>, ClassTag<V>) - Constructor for class org.apache.spark.api.java.JavaPairRDD
 
JavaPairReceiverInputDStream<K,V> - Class in org.apache.spark.streaming.api.java
A Java-friendly interface to ReceiverInputDStream, the abstract class for defining any input stream that receives data over the network.
JavaPairReceiverInputDStream(ReceiverInputDStream<Tuple2<K, V>>, ClassTag<K>, ClassTag<V>) - Constructor for class org.apache.spark.streaming.api.java.JavaPairReceiverInputDStream
 
JavaRDD<T> - Class in org.apache.spark.api.java
 
JavaRDD(RDD<T>, ClassTag<T>) - Constructor for class org.apache.spark.api.java.JavaRDD
 
JavaRDDLike<T,This extends JavaRDDLike<T,This>> - Interface in org.apache.spark.api.java
 
JavaReceiverInputDStream<T> - Class in org.apache.spark.streaming.api.java
A Java-friendly interface to ReceiverInputDStream, the abstract class for defining any input stream that receives data over the network.
JavaReceiverInputDStream(ReceiverInputDStream<T>, ClassTag<T>) - Constructor for class org.apache.spark.streaming.api.java.JavaReceiverInputDStream
 
JavaSchemaRDD - Class in org.apache.spark.sql.api.java
An RDD of Row objects that is returned as the result of a Spark SQL query.
JavaSchemaRDD(SQLContext, LogicalPlan) - Constructor for class org.apache.spark.sql.api.java.JavaSchemaRDD
 
JavaSerializer - Class in org.apache.spark.serializer
:: DeveloperApi :: A Spark serializer that uses Java's built-in serialization.
JavaSerializer(SparkConf) - Constructor for class org.apache.spark.serializer.JavaSerializer
 
JavaSparkContext - Class in org.apache.spark.api.java
A Java-friendly version of SparkContext that returns JavaRDDs and works with Java collections instead of Scala ones.
JavaSparkContext(SparkContext) - Constructor for class org.apache.spark.api.java.JavaSparkContext
 
JavaSparkContext() - Constructor for class org.apache.spark.api.java.JavaSparkContext
Create a JavaSparkContext that loads settings from system properties (for instance, when launching with ./bin/spark-submit).
JavaSparkContext(SparkConf) - Constructor for class org.apache.spark.api.java.JavaSparkContext
 
JavaSparkContext(String, String) - Constructor for class org.apache.spark.api.java.JavaSparkContext
 
JavaSparkContext(String, String, SparkConf) - Constructor for class org.apache.spark.api.java.JavaSparkContext
 
JavaSparkContext(String, String, String, String) - Constructor for class org.apache.spark.api.java.JavaSparkContext
 
JavaSparkContext(String, String, String, String[]) - Constructor for class org.apache.spark.api.java.JavaSparkContext
 
JavaSparkContext(String, String, String, String[], Map<String, String>) - Constructor for class org.apache.spark.api.java.JavaSparkContext
 
JavaSQLContext - Class in org.apache.spark.sql.api.java
The entry point for executing Spark SQL queries from a Java program.
JavaSQLContext(SQLContext) - Constructor for class org.apache.spark.sql.api.java.JavaSQLContext
 
JavaSQLContext(JavaSparkContext) - Constructor for class org.apache.spark.sql.api.java.JavaSQLContext
 
JavaStreamingContext - Class in org.apache.spark.streaming.api.java
A Java-friendly version of StreamingContext which is the main entry point for Spark Streaming functionality.
JavaStreamingContext(StreamingContext) - Constructor for class org.apache.spark.streaming.api.java.JavaStreamingContext
 
JavaStreamingContext(String, String, Duration) - Constructor for class org.apache.spark.streaming.api.java.JavaStreamingContext
Create a StreamingContext.
JavaStreamingContext(String, String, Duration, String, String) - Constructor for class org.apache.spark.streaming.api.java.JavaStreamingContext
Create a StreamingContext.
JavaStreamingContext(String, String, Duration, String, String[]) - Constructor for class org.apache.spark.streaming.api.java.JavaStreamingContext
Create a StreamingContext.
JavaStreamingContext(String, String, Duration, String, String[], Map<String, String>) - Constructor for class org.apache.spark.streaming.api.java.JavaStreamingContext
Create a StreamingContext.
JavaStreamingContext(JavaSparkContext, Duration) - Constructor for class org.apache.spark.streaming.api.java.JavaStreamingContext
Create a JavaStreamingContext using an existing JavaSparkContext.
JavaStreamingContext(SparkConf, Duration) - Constructor for class org.apache.spark.streaming.api.java.JavaStreamingContext
Create a JavaStreamingContext using a SparkConf configuration.
JavaStreamingContext(String) - Constructor for class org.apache.spark.streaming.api.java.JavaStreamingContext
Recreate a JavaStreamingContext from a checkpoint file.
JavaStreamingContext(String, Configuration) - Constructor for class org.apache.spark.streaming.api.java.JavaStreamingContext
Re-creates a JavaStreamingContext from a checkpoint file.
JavaStreamingContextFactory - Interface in org.apache.spark.streaming.api.java
Factory interface for creating a new JavaStreamingContext.
JdbcRDD<T> - Class in org.apache.spark.rdd
An RDD that executes an SQL query on a JDBC connection and reads results.
JdbcRDD(SparkContext, Function0<Connection>, String, long, long, int, Function1<ResultSet, T>, ClassTag<T>) - Constructor for class org.apache.spark.rdd.JdbcRDD
 
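A sketch (the driver, URL, and table are placeholders). The query must contain exactly two '?' placeholders, which JdbcRDD binds to each partition's lower and upper bound:

    import java.sql.DriverManager
    import org.apache.spark.rdd.JdbcRDD

    val users = new JdbcRDD(
      sc,
      () => DriverManager.getConnection("jdbc:h2:mem:testdb"),   // hypothetical URL
      "SELECT id, name FROM users WHERE ? <= id AND id <= ?",
      1, 100, 3,                          // lowerBound, upperBound, numPartitions
      rs => (rs.getInt(1), rs.getString(2)))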
jobId() - Method in class org.apache.spark.scheduler.SparkListenerJobEnd
 
jobId() - Method in class org.apache.spark.scheduler.SparkListenerJobStart
 
JobLogger - Class in org.apache.spark.scheduler
:: DeveloperApi :: A logger class to record runtime information for jobs in Spark.
JobLogger(String, String) - Constructor for class org.apache.spark.scheduler.JobLogger
 
JobLogger() - Constructor for class org.apache.spark.scheduler.JobLogger
 
JobProgressListener - Class in org.apache.spark.ui.jobs
:: DeveloperApi :: Tracks task-level information to be displayed in the UI.
JobProgressListener(SparkConf) - Constructor for class org.apache.spark.ui.jobs.JobProgressListener
 
JobResult - Interface in org.apache.spark.scheduler
:: DeveloperApi :: A result of a job in the DAGScheduler.
jobResult() - Method in class org.apache.spark.scheduler.SparkListenerJobEnd
 
JobSucceeded - Class in org.apache.spark.scheduler
 
JobSucceeded() - Constructor for class org.apache.spark.scheduler.JobSucceeded
 
join(JavaPairRDD<K, W>, Partitioner) - Method in class org.apache.spark.api.java.JavaPairRDD
Return an RDD containing all pairs of elements with matching keys in this and other.
join(JavaPairRDD<K, W>) - Method in class org.apache.spark.api.java.JavaPairRDD
Return an RDD containing all pairs of elements with matching keys in this and other.
join(JavaPairRDD<K, W>, int) - Method in class org.apache.spark.api.java.JavaPairRDD
Return an RDD containing all pairs of elements with matching keys in this and other.
join(RDD<Tuple2<K, W>>, Partitioner) - Method in class org.apache.spark.rdd.PairRDDFunctions
Return an RDD containing all pairs of elements with matching keys in this and other.
join(RDD<Tuple2<K, W>>) - Method in class org.apache.spark.rdd.PairRDDFunctions
Return an RDD containing all pairs of elements with matching keys in this and other.
join(RDD<Tuple2<K, W>>, int) - Method in class org.apache.spark.rdd.PairRDDFunctions
Return an RDD containing all pairs of elements with matching keys in this and other.
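A sketch of inner and left outer joins on pair RDDs (data illustrative, `sc` assumed); an inner join yields (k, (v, w)), while a left outer join wraps the right side in an Option:

    import org.apache.spark.SparkContext._

    val ages  = sc.parallelize(Seq(("alice", 30), ("bob", 25)))
    val towns = sc.parallelize(Seq(("alice", "SF"), ("carol", "NYC")))
    ages.join(towns).collect()            // Array((alice,(30,SF)))
    ages.leftOuterJoin(towns).collect()   // bob pairs with None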
join() - Method in class org.apache.spark.sql.execution.Generate
 
join(SchemaRDD, JoinType, Option<Expression>) - Method in class org.apache.spark.sql.SchemaRDD
Performs a relational join on two SchemaRDDs
join(JavaPairDStream<K, W>) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
Return a new DStream by applying 'join' between RDDs of this DStream and other DStream.
join(JavaPairDStream<K, W>, int) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
Return a new DStream by applying 'join' between RDDs of this DStream and other DStream.
join(JavaPairDStream<K, W>, Partitioner) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
Return a new DStream by applying 'join' between RDDs of this DStream and other DStream.
join(DStream<Tuple2<K, W>>, ClassTag<W>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
Return a new DStream by applying 'join' between RDDs of this DStream and other DStream.
join(DStream<Tuple2<K, W>>, int, ClassTag<W>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
Return a new DStream by applying 'join' between RDDs of this DStream and other DStream.
join(DStream<Tuple2<K, W>>, Partitioner, ClassTag<W>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
Return a new DStream by applying 'join' between RDDs of this DStream and other DStream.
joinType() - Method in class org.apache.spark.sql.execution.BroadcastNestedLoopJoin
 
jsonFile(String) - Method in class org.apache.spark.sql.api.java.JavaSQLContext
Loads a JSON file (one object per line), returning the result as a JavaSchemaRDD.
jsonFile(String) - Method in class org.apache.spark.sql.SQLContext
Loads a JSON file (one object per line), returning the result as a SchemaRDD.
jsonFile(String, double) - Method in class org.apache.spark.sql.SQLContext
:: Experimental ::
jsonRDD(JavaRDD<String>) - Method in class org.apache.spark.sql.api.java.JavaSQLContext
Loads an RDD[String] storing JSON objects (one object per record), returning the result as a JavaSchemaRDD.
jsonRDD(RDD<String>) - Method in class org.apache.spark.sql.SQLContext
Loads an RDD[String] storing JSON objects (one object per record), returning the result as a SchemaRDD.
jsonRDD(RDD<String>, double) - Method in class org.apache.spark.sql.SQLContext
:: Experimental ::
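A sketch (the file path is hypothetical, `sc` assumed); the schema is inferred from the JSON records, after which the result can be queried like any other table:

    import org.apache.spark.sql.SQLContext

    val sqlContext = new SQLContext(sc)
    val people = sqlContext.jsonFile("examples/src/main/resources/people.json")
    people.printSchema()
    people.registerAsTable("people")
    sqlContext.sql("SELECT name FROM people").collect().foreach(println)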
jvmInformation() - Method in class org.apache.spark.ui.env.EnvironmentListener
 

K

k() - Method in class org.apache.spark.mllib.clustering.KMeansModel
Total number of clusters.
K_MEANS_PARALLEL() - Static method in class org.apache.spark.mllib.clustering.KMeans
 
KafkaUtils - Class in org.apache.spark.streaming.kafka
 
KafkaUtils() - Constructor for class org.apache.spark.streaming.kafka.KafkaUtils
 
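A sketch of the receiver-based Kafka stream (the ZooKeeper quorum, group id, and topic map are placeholders; `ssc` is an existing StreamingContext). The map value is the number of receiver threads per topic:

    import org.apache.spark.streaming.kafka.KafkaUtils

    val stream = KafkaUtils.createStream(
      ssc, "zk1:2181", "my-consumer-group", Map("events" -> 1))
    stream.map(_._2).print()   // message payloads; _._1 holds the message keys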
kClassTag() - Method in class org.apache.spark.api.java.JavaPairRDD
 
kClassTag() - Method in class org.apache.spark.streaming.api.java.JavaPairInputDStream
 
kClassTag() - Method in class org.apache.spark.streaming.api.java.JavaPairReceiverInputDStream
 
key() - Method in class org.apache.spark.sql.execution.SetCommand
 
keyBy(Function<T, K>) - Method in interface org.apache.spark.api.java.JavaRDDLike
Creates tuples of the elements in this RDD by applying f.
keyBy(Function1<T, K>) - Method in class org.apache.spark.rdd.RDD
Creates tuples of the elements in this RDD by applying f.
keys() - Method in class org.apache.spark.api.java.JavaPairRDD
Return an RDD with the keys of each tuple.
keys() - Method in class org.apache.spark.rdd.PairRDDFunctions
Return an RDD with the keys of each tuple.
kFold(RDD<T>, int, int, ClassTag<T>) - Static method in class org.apache.spark.mllib.util.MLUtils
:: Experimental :: Return a k-element array of pairs of RDDs, where in each pair the second element is the validation data (a unique 1/kth of the input) and the first element is the training data (the complement of that validation data).
kManifest() - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
 
KMeans - Class in org.apache.spark.mllib.clustering
K-means clustering with support for multiple parallel runs and a k-means++-like initialization mode (the k-means|| algorithm by Bahmani et al.).
KMeans() - Constructor for class org.apache.spark.mllib.clustering.KMeans
Constructs a KMeans instance with default parameters: {k: 2, maxIterations: 20, runs: 1, initializationMode: "k-means||", initializationSteps: 5, epsilon: 1e-4}.
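A sketch of training on a small in-memory dataset (points illustrative, `sc` assumed):

    import org.apache.spark.mllib.clustering.KMeans
    import org.apache.spark.mllib.linalg.Vectors

    val points = sc.parallelize(Seq(
      Vectors.dense(0.0, 0.0), Vectors.dense(0.1, 0.1),
      Vectors.dense(9.0, 9.0), Vectors.dense(9.1, 9.1)))
    val model = KMeans.train(points, 2, 20)   // k = 2, maxIterations = 20
    model.clusterCenters.foreach(println)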
KMeansDataGenerator - Class in org.apache.spark.mllib.util
:: DeveloperApi :: Generate test data for KMeans.
KMeansDataGenerator() - Constructor for class org.apache.spark.mllib.util.KMeansDataGenerator
 
KMeansModel - Class in org.apache.spark.mllib.clustering
A clustering model for K-means.
KryoRegistrator - Interface in org.apache.spark.serializer
Interface implemented by clients to register their classes with Kryo when using Kryo serialization.
KryoSerializer - Class in org.apache.spark.serializer
A Spark serializer that uses the Kryo serialization library.
KryoSerializer(SparkConf) - Constructor for class org.apache.spark.serializer.KryoSerializer
 
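A sketch of enabling Kryo and registering application classes (the class and package names are hypothetical):

    import com.esotericsoftware.kryo.Kryo
    import org.apache.spark.SparkConf
    import org.apache.spark.serializer.KryoRegistrator

    case class MyClass(x: Int)   // stand-in for an application class

    class MyRegistrator extends KryoRegistrator {
      override def registerClasses(kryo: Kryo) {
        kryo.register(classOf[MyClass])
      }
    }

    val conf = new SparkConf()
      .set("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
      .set("spark.kryo.registrator", "mypackage.MyRegistrator")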

L

L1Updater - Class in org.apache.spark.mllib.optimization
:: DeveloperApi :: Updater for L1 regularized problems.
L1Updater() - Constructor for class org.apache.spark.mllib.optimization.L1Updater
 
label() - Method in class org.apache.spark.mllib.regression.LabeledPoint
 
LabeledPoint - Class in org.apache.spark.mllib.regression
Class that represents the features and labels of a data point.
LabeledPoint(double, Vector) - Constructor for class org.apache.spark.mllib.regression.LabeledPoint
 
labels() - Method in class org.apache.spark.mllib.classification.NaiveBayesModel
 
LassoModel - Class in org.apache.spark.mllib.regression
Regression model trained using Lasso.
LassoWithSGD - Class in org.apache.spark.mllib.regression
Train a regression model with L1-regularization using Stochastic Gradient Descent.
LassoWithSGD() - Constructor for class org.apache.spark.mllib.regression.LassoWithSGD
Construct a Lasso object with default parameters: {stepSize: 1.0, numIterations: 100, regParam: 1.0, miniBatchFraction: 1.0}.
lastError() - Method in class org.apache.spark.streaming.scheduler.ReceiverInfo
 
lastErrorMessage() - Method in class org.apache.spark.streaming.scheduler.ReceiverInfo
 
lastValidTime() - Method in class org.apache.spark.streaming.dstream.InputDStream
 
launchTime() - Method in class org.apache.spark.scheduler.TaskInfo
 
LBFGS - Class in org.apache.spark.mllib.optimization
:: DeveloperApi :: Class used to solve an optimization problem using Limited-memory BFGS.
LBFGS(Gradient, Updater) - Constructor for class org.apache.spark.mllib.optimization.LBFGS
 
LeastSquaresGradient - Class in org.apache.spark.mllib.optimization
:: DeveloperApi :: Compute gradient and loss for a least-squares loss function, as used in linear regression.
LeastSquaresGradient() - Constructor for class org.apache.spark.mllib.optimization.LeastSquaresGradient
 
left() - Method in class org.apache.spark.sql.execution.BroadcastNestedLoopJoin
The streamed relation.
left() - Method in class org.apache.spark.sql.execution.CartesianProduct
 
left() - Method in class org.apache.spark.sql.execution.HashJoin
 
left() - Method in class org.apache.spark.sql.execution.LeftSemiJoinBNL
The streamed relation.
left() - Method in class org.apache.spark.sql.execution.LeftSemiJoinHash
 
leftImpurity() - Method in class org.apache.spark.mllib.tree.model.InformationGainStats
 
leftKeys() - Method in class org.apache.spark.sql.execution.HashJoin
 
leftKeys() - Method in class org.apache.spark.sql.execution.LeftSemiJoinHash
 
leftNode() - Method in class org.apache.spark.mllib.tree.model.Node
 
leftOuterJoin(JavaPairRDD<K, W>, Partitioner) - Method in class org.apache.spark.api.java.JavaPairRDD
Perform a left outer join of this and other.
leftOuterJoin(JavaPairRDD<K, W>) - Method in class org.apache.spark.api.java.JavaPairRDD
Perform a left outer join of this and other.
leftOuterJoin(JavaPairRDD<K, W>, int) - Method in class org.apache.spark.api.java.JavaPairRDD
Perform a left outer join of this and other.
leftOuterJoin(RDD<Tuple2<K, W>>, Partitioner) - Method in class org.apache.spark.rdd.PairRDDFunctions
Perform a left outer join of this and other.
leftOuterJoin(RDD<Tuple2<K, W>>) - Method in class org.apache.spark.rdd.PairRDDFunctions
Perform a left outer join of this and other.
leftOuterJoin(RDD<Tuple2<K, W>>, int) - Method in class org.apache.spark.rdd.PairRDDFunctions
Perform a left outer join of this and other.
leftOuterJoin(JavaPairDStream<K, W>) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
Return a new DStream by applying 'left outer join' between RDDs of this DStream and other DStream.
leftOuterJoin(JavaPairDStream<K, W>, int) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
Return a new DStream by applying 'left outer join' between RDDs of this DStream and other DStream.
leftOuterJoin(JavaPairDStream<K, W>, Partitioner) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
Return a new DStream by applying 'join' between RDDs of this DStream and other DStream.
leftOuterJoin(DStream<Tuple2<K, W>>, ClassTag<W>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
Return a new DStream by applying 'left outer join' between RDDs of this DStream and other DStream.
leftOuterJoin(DStream<Tuple2<K, W>>, int, ClassTag<W>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
Return a new DStream by applying 'left outer join' between RDDs of this DStream and other DStream.
leftOuterJoin(DStream<Tuple2<K, W>>, Partitioner, ClassTag<W>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
Return a new DStream by applying 'left outer join' between RDDs of this DStream and other DStream.
LeftSemiJoinBNL - Class in org.apache.spark.sql.execution
:: DeveloperApi :: Uses BroadcastNestedLoopJoin to compute the left semi join result when there are no join keys for a hash join.
LeftSemiJoinBNL(SparkPlan, SparkPlan, Option<Expression>, SQLContext) - Constructor for class org.apache.spark.sql.execution.LeftSemiJoinBNL
 
LeftSemiJoinHash - Class in org.apache.spark.sql.execution
:: DeveloperApi :: Build the right table's join keys into a HashSet, then iterate through the left table, checking whether each join key is in the hash set.
LeftSemiJoinHash(Seq<Expression>, Seq<Expression>, SparkPlan, SparkPlan) - Constructor for class org.apache.spark.sql.execution.LeftSemiJoinHash
 
length() - Method in class org.apache.spark.scheduler.SplitInfo
 
length() - Method in class org.apache.spark.sql.api.java.Row
Returns the number of columns present in this Row.
length() - Method in class org.apache.spark.util.Vector
 
Limit - Class in org.apache.spark.sql.execution
:: DeveloperApi :: Take the first limit elements.
Limit(int, SparkPlan, SQLContext) - Constructor for class org.apache.spark.sql.execution.Limit
 
limit() - Method in class org.apache.spark.sql.execution.Limit
 
limit() - Method in class org.apache.spark.sql.execution.TakeOrdered
 
limit(Expression) - Method in class org.apache.spark.sql.SchemaRDD
 
limit(int) - Method in class org.apache.spark.sql.SchemaRDD
Limits the results by the given integer.
LinearDataGenerator - Class in org.apache.spark.mllib.util
:: DeveloperApi :: Generate sample data used for linear regression.
LinearDataGenerator() - Constructor for class org.apache.spark.mllib.util.LinearDataGenerator
 
LinearRegressionModel - Class in org.apache.spark.mllib.regression
Regression model trained using LinearRegression.
LinearRegressionWithSGD - Class in org.apache.spark.mllib.regression
Train a linear regression model with no regularization using Stochastic Gradient Descent.
LinearRegressionWithSGD() - Constructor for class org.apache.spark.mllib.regression.LinearRegressionWithSGD
Construct a LinearRegression object with default parameters: {stepSize: 1.0, numIterations: 100, miniBatchFraction: 1.0}.
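A sketch of training and prediction (data illustrative, `sc` assumed):

    import org.apache.spark.mllib.linalg.Vectors
    import org.apache.spark.mllib.regression.{LabeledPoint, LinearRegressionWithSGD}

    val data = sc.parallelize(Seq(
      LabeledPoint(1.0, Vectors.dense(1.0)),
      LabeledPoint(2.0, Vectors.dense(2.0))))
    val model = LinearRegressionWithSGD.train(data, 100)   // numIterations = 100
    println(model.predict(Vectors.dense(3.0)))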
listenerBus() - Method in class org.apache.spark.SparkContext
 
loadLabeledData(SparkContext, String) - Static method in class org.apache.spark.mllib.util.MLUtils
:: Experimental :: Load labeled data from a file.
loadLibSVMFile(SparkContext, String, boolean, int, int) - Static method in class org.apache.spark.mllib.util.MLUtils
Loads labeled data in the LIBSVM format into an RDD[LabeledPoint].
loadLibSVMFile(SparkContext, String, boolean, int) - Static method in class org.apache.spark.mllib.util.MLUtils
Loads labeled data in the LIBSVM format into an RDD[LabeledPoint], with the default number of partitions.
loadLibSVMFile(SparkContext, String, boolean) - Static method in class org.apache.spark.mllib.util.MLUtils
Loads labeled data in the LIBSVM format into an RDD[LabeledPoint], with the number of features determined automatically and the default number of partitions.
loadLibSVMFile(SparkContext, String) - Static method in class org.apache.spark.mllib.util.MLUtils
Loads binary labeled data in the LIBSVM format into an RDD[LabeledPoint], with number of features determined automatically and the default number of partitions.
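A sketch (the path is hypothetical, `sc` assumed); each LIBSVM line has the form label index1:value1 index2:value2 ...:

    import org.apache.spark.mllib.util.MLUtils

    val examples = MLUtils.loadLibSVMFile(sc, "data/sample_libsvm_data.txt")
    println(examples.first().label)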
loadTestTable(String) - Method in class org.apache.spark.sql.hive.test.TestHiveContext
 
LocalHiveContext - Class in org.apache.spark.sql.hive
Starts up an instance of Hive where metadata is stored locally.
LocalHiveContext(SparkContext) - Constructor for class org.apache.spark.sql.hive.LocalHiveContext
 
localValue() - Method in class org.apache.spark.Accumulable
Get the current value of this accumulator from within a task.
location() - Method in class org.apache.spark.streaming.scheduler.ReceiverInfo
 
log() - Method in interface org.apache.spark.Logging
 
log_() - Method in interface org.apache.spark.Logging
 
logDebug(Function0<String>) - Method in interface org.apache.spark.Logging
 
logDebug(Function0<String>, Throwable) - Method in interface org.apache.spark.Logging
 
logDirName() - Method in class org.apache.spark.scheduler.JobLogger
 
logError(Function0<String>) - Method in interface org.apache.spark.Logging
 
logError(Function0<String>, Throwable) - Method in interface org.apache.spark.Logging
 
Logging - Interface in org.apache.spark
:: DeveloperApi :: Utility trait for classes that want to log data.
logicalPlan() - Method in class org.apache.spark.sql.execution.ExplainCommand
 
logicalPlanToSparkQuery(LogicalPlan) - Method in class org.apache.spark.sql.SQLContext
:: DeveloperApi :: Allows catalyst LogicalPlans to be executed as a SchemaRDD.
logInfo(Function0<String>) - Method in interface org.apache.spark.Logging
 
logInfo(Function0<String>, Throwable) - Method in interface org.apache.spark.Logging
 
LogisticGradient - Class in org.apache.spark.mllib.optimization
:: DeveloperApi :: Compute gradient and loss for a logistic loss function, as used in binary classification.
LogisticGradient() - Constructor for class org.apache.spark.mllib.optimization.LogisticGradient
 
LogisticRegressionDataGenerator - Class in org.apache.spark.mllib.util
:: DeveloperApi :: Generate test data for LogisticRegression.
LogisticRegressionDataGenerator() - Constructor for class org.apache.spark.mllib.util.LogisticRegressionDataGenerator
 
LogisticRegressionModel - Class in org.apache.spark.mllib.classification
Classification model trained using Logistic Regression.
LogisticRegressionWithSGD - Class in org.apache.spark.mllib.classification
Train a classification model for Logistic Regression using Stochastic Gradient Descent.
LogisticRegressionWithSGD() - Constructor for class org.apache.spark.mllib.classification.LogisticRegressionWithSGD
Construct a LogisticRegression object with default parameters.
logTrace(Function0<String>) - Method in interface org.apache.spark.Logging
 
logTrace(Function0<String>, Throwable) - Method in interface org.apache.spark.Logging
 
logWarning(Function0<String>) - Method in interface org.apache.spark.Logging
 
logWarning(Function0<String>, Throwable) - Method in interface org.apache.spark.Logging
 
longToLongWritable(long) - Static method in class org.apache.spark.SparkContext
 
longWritableConverter() - Static method in class org.apache.spark.SparkContext
 
lookup(K) - Method in class org.apache.spark.api.java.JavaPairRDD
Return the list of values in the RDD for key key.
lookup(K) - Method in class org.apache.spark.rdd.PairRDDFunctions
Return the list of values in the RDD for key key.
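A sketch (data illustrative, `sc` assumed); lookup is efficient when the RDD has a known partitioner, since only the partition owning the key is searched:

    import org.apache.spark.SparkContext._

    val pairs = sc.parallelize(Seq(("a", 1), ("a", 2), ("b", 3)))
    pairs.lookup("a")   // Seq(1, 2)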
low() - Method in class org.apache.spark.partial.BoundedDouble
 
LZFCompressionCodec - Class in org.apache.spark.io
:: DeveloperApi :: LZF implementation of CompressionCodec.
LZFCompressionCodec(SparkConf) - Constructor for class org.apache.spark.io.LZFCompressionCodec
 

M

main(String[]) - Static method in class org.apache.spark.mllib.util.KMeansDataGenerator
 
main(String[]) - Static method in class org.apache.spark.mllib.util.LinearDataGenerator
 
main(String[]) - Static method in class org.apache.spark.mllib.util.LogisticRegressionDataGenerator
 
main(String[]) - Static method in class org.apache.spark.mllib.util.MFDataGenerator
 
main(String[]) - Static method in class org.apache.spark.mllib.util.SVMDataGenerator
 
makeRDD(Seq<T>, int, ClassTag<T>) - Method in class org.apache.spark.SparkContext
Distribute a local Scala collection to form an RDD.
makeRDD(Seq<Tuple2<T, Seq<String>>>, ClassTag<T>) - Method in class org.apache.spark.SparkContext
Distribute a local Scala collection to form an RDD, with one or more location preferences (hostnames of Spark nodes) for each object.
map(Function<T, R>) - Method in interface org.apache.spark.api.java.JavaRDDLike
Return a new RDD by applying a function to all elements of this RDD.
map(Function1<R, T>) - Method in class org.apache.spark.partial.PartialResult
Transform this PartialResult into a PartialResult of type T.
map(Function1<T, U>, ClassTag<U>) - Method in class org.apache.spark.rdd.RDD
Return a new RDD by applying a function to all elements of this RDD.
map(Function<T, R>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
Return a new DStream by applying a function to all elements of this DStream.
map(Function1<T, U>, ClassTag<U>) - Method in class org.apache.spark.streaming.dstream.DStream
Return a new DStream by applying a function to all elements of this DStream.
mapId() - Method in class org.apache.spark.FetchFailed
 
mapId() - Method in class org.apache.spark.storage.ShuffleBlockId
 
mapOutputTracker() - Method in class org.apache.spark.SparkEnv
 
mapPartitions(FlatMapFunction<Iterator<T>, U>) - Method in interface org.apache.spark.api.java.JavaRDDLike
Return a new RDD by applying a function to each partition of this RDD.
mapPartitions(FlatMapFunction<Iterator<T>, U>, boolean) - Method in interface org.apache.spark.api.java.JavaRDDLike
Return a new RDD by applying a function to each partition of this RDD.
mapPartitions(Function1<Iterator<T>, Iterator<U>>, boolean, ClassTag<U>) - Method in class org.apache.spark.rdd.RDD
Return a new RDD by applying a function to each partition of this RDD.
mapPartitions(FlatMapFunction<Iterator<T>, U>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
Return a new DStream in which each RDD is generated by applying mapPartitions() to each RDD of this DStream.
mapPartitions(Function1<Iterator<T>, Iterator<U>>, boolean, ClassTag<U>) - Method in class org.apache.spark.streaming.dstream.DStream
Return a new DStream in which each RDD is generated by applying mapPartitions() to each RDD of this DStream.
mapPartitionsToDouble(DoubleFlatMapFunction<Iterator<T>>) - Method in interface org.apache.spark.api.java.JavaRDDLike
Return a new RDD by applying a function to each partition of this RDD.
mapPartitionsToDouble(DoubleFlatMapFunction<Iterator<T>>, boolean) - Method in interface org.apache.spark.api.java.JavaRDDLike
Return a new RDD by applying a function to each partition of this RDD.
mapPartitionsToPair(PairFlatMapFunction<Iterator<T>, K2, V2>) - Method in interface org.apache.spark.api.java.JavaRDDLike
Return a new RDD by applying a function to each partition of this RDD.
mapPartitionsToPair(PairFlatMapFunction<Iterator<T>, K2, V2>, boolean) - Method in interface org.apache.spark.api.java.JavaRDDLike
Return a new RDD by applying a function to each partition of this RDD.
mapPartitionsToPair(PairFlatMapFunction<Iterator<T>, K2, V2>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
Return a new DStream in which each RDD is generated by applying mapPartitions() to each RDD of this DStream.
mapPartitionsWithContext(Function2<TaskContext, Iterator<T>, Iterator<U>>, boolean, ClassTag<U>) - Method in class org.apache.spark.rdd.RDD
:: DeveloperApi :: Return a new RDD by applying a function to each partition of this RDD.
mapPartitionsWithIndex(Function2<Integer, Iterator<T>, Iterator<R>>, boolean) - Method in interface org.apache.spark.api.java.JavaRDDLike
Return a new RDD by applying a function to each partition of this RDD, while tracking the index of the original partition.
mapPartitionsWithIndex(Function2<Object, Iterator<T>, Iterator<U>>, boolean, ClassTag<U>) - Method in class org.apache.spark.rdd.RDD
Return a new RDD by applying a function to each partition of this RDD, while tracking the index of the original partition.
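A sketch that tags each element with the index of the partition that produced it (`sc` assumed):

    val rdd = sc.parallelize(1 to 6, 3)
    val tagged = rdd.mapPartitionsWithIndex { (idx, iter) =>
      iter.map(x => (idx, x))
    }
    tagged.collect()   // e.g. Array((0,1), (0,2), (1,3), (1,4), (2,5), (2,6))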
mapPartitionsWithSplit(Function2<Object, Iterator<T>, Iterator<U>>, boolean, ClassTag<U>) - Method in class org.apache.spark.rdd.RDD
Return a new RDD by applying a function to each partition of this RDD, while tracking the index of the original partition.
mapredInputFormat() - Method in class org.apache.spark.scheduler.InputFormatInfo
 
mapreduceInputFormat() - Method in class org.apache.spark.scheduler.InputFormatInfo
 
mapToDouble(DoubleFunction<T>) - Method in interface org.apache.spark.api.java.JavaRDDLike
Return a new RDD by applying a function to all elements of this RDD.
mapToPair(PairFunction<T, K2, V2>) - Method in interface org.apache.spark.api.java.JavaRDDLike
Return a new RDD by applying a function to all elements of this RDD.
mapToPair(PairFunction<T, K2, V2>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
Return a new DStream by applying a function to all elements of this DStream.
mapValues(Function<V, U>) - Method in class org.apache.spark.api.java.JavaPairRDD
Pass each value in the key-value pair RDD through a map function without changing the keys; this also retains the original RDD's partitioning.
mapValues(Function1<V, U>) - Method in class org.apache.spark.rdd.PairRDDFunctions
Pass each value in the key-value pair RDD through a map function without changing the keys; this also retains the original RDD's partitioning.
mapValues(Function<V, U>) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
Return a new DStream by applying a map function to the value of each key-value pair in 'this' DStream without changing the key.
mapValues(Function1<V, U>, ClassTag<U>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
Return a new DStream by applying a map function to the value of each key-value pair in 'this' DStream without changing the key.
mapWith(Function1<Object, A>, boolean, Function2<T, A, U>, ClassTag<U>) - Method in class org.apache.spark.rdd.RDD
Maps f over this RDD, where f takes an additional parameter of type A.
master() - Method in class org.apache.spark.api.java.JavaSparkContext
 
master() - Method in class org.apache.spark.SparkContext
 
Matrices - Class in org.apache.spark.mllib.linalg
Factory methods for Matrix.
Matrices() - Constructor for class org.apache.spark.mllib.linalg.Matrices
 
Matrix - Interface in org.apache.spark.mllib.linalg
Trait for a local matrix.
MatrixEntry - Class in org.apache.spark.mllib.linalg.distributed
:: Experimental :: Represents an entry in a distributed matrix.
MatrixEntry(long, long, double) - Constructor for class org.apache.spark.mllib.linalg.distributed.MatrixEntry
 
MatrixFactorizationModel - Class in org.apache.spark.mllib.recommendation
Model representing the result of matrix factorization.
max(Comparator<T>) - Method in interface org.apache.spark.api.java.JavaRDDLike
Returns the maximum element from this RDD as defined by the specified Comparator[T].
max() - Method in interface org.apache.spark.mllib.stat.MultivariateStatisticalSummary
Maximum value of each column.
max(Ordering<T>) - Method in class org.apache.spark.rdd.RDD
Returns the max of this RDD as defined by the implicit Ordering[T].
max(Duration) - Method in class org.apache.spark.streaming.Duration
 
max(Time) - Method in class org.apache.spark.streaming.Time
 
max() - Method in class org.apache.spark.util.StatCounter
 
maxBins() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
 
maxDepth() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
 
maxMem() - Method in class org.apache.spark.scheduler.SparkListenerBlockManagerAdded
 
maxMem() - Method in class org.apache.spark.storage.StorageStatus
 
maxMemoryInMB() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
 
mean() - Method in class org.apache.spark.api.java.JavaDoubleRDD
Compute the mean of this RDD's elements.
mean() - Method in interface org.apache.spark.mllib.stat.MultivariateStatisticalSummary
Sample mean vector.
mean() - Method in class org.apache.spark.partial.BoundedDouble
 
mean() - Method in class org.apache.spark.rdd.DoubleRDDFunctions
Compute the mean of this RDD's elements.
mean() - Method in class org.apache.spark.util.StatCounter
 
meanApprox(long, Double) - Method in class org.apache.spark.api.java.JavaDoubleRDD
Return the approximate mean of the elements in this RDD.
meanApprox(long) - Method in class org.apache.spark.api.java.JavaDoubleRDD
:: Experimental :: Approximate operation to return the mean within a timeout.
meanApprox(long, double) - Method in class org.apache.spark.rdd.DoubleRDDFunctions
:: Experimental :: Approximate operation to return the mean within a timeout.
MEMORY_AND_DISK - Static variable in class org.apache.spark.api.java.StorageLevels
 
MEMORY_AND_DISK() - Static method in class org.apache.spark.storage.StorageLevel
 
MEMORY_AND_DISK_2 - Static variable in class org.apache.spark.api.java.StorageLevels
 
MEMORY_AND_DISK_2() - Static method in class org.apache.spark.storage.StorageLevel
 
MEMORY_AND_DISK_SER - Static variable in class org.apache.spark.api.java.StorageLevels
 
MEMORY_AND_DISK_SER() - Static method in class org.apache.spark.storage.StorageLevel
 
MEMORY_AND_DISK_SER_2 - Static variable in class org.apache.spark.api.java.StorageLevels
 
MEMORY_AND_DISK_SER_2() - Static method in class org.apache.spark.storage.StorageLevel
 
MEMORY_ONLY - Static variable in class org.apache.spark.api.java.StorageLevels
 
MEMORY_ONLY() - Static method in class org.apache.spark.storage.StorageLevel
 
MEMORY_ONLY_2 - Static variable in class org.apache.spark.api.java.StorageLevels
 
MEMORY_ONLY_2() - Static method in class org.apache.spark.storage.StorageLevel
 
MEMORY_ONLY_SER - Static variable in class org.apache.spark.api.java.StorageLevels
 
MEMORY_ONLY_SER() - Static method in class org.apache.spark.storage.StorageLevel
 
MEMORY_ONLY_SER_2 - Static variable in class org.apache.spark.api.java.StorageLevels
 
MEMORY_ONLY_SER_2() - Static method in class org.apache.spark.storage.StorageLevel
 
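A sketch showing how these constants are used (the path is a placeholder, `sc` assumed); StorageLevels holds the Java-side constants and StorageLevel the Scala-side ones:

    import org.apache.spark.storage.StorageLevel

    val rdd = sc.textFile("hdfs://host/path")
    rdd.persist(StorageLevel.MEMORY_AND_DISK_SER)   // serialized in memory, spill to disk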
memoryBytesSpilled() - Method in class org.apache.spark.ui.jobs.ExecutorSummary
 
memRemaining() - Method in class org.apache.spark.storage.StorageStatus
 
memSize() - Method in class org.apache.spark.storage.BlockStatus
 
memSize() - Method in class org.apache.spark.storage.RDDInfo
 
memUsed() - Method in class org.apache.spark.storage.StorageStatus
 
memUsedByRDD(int) - Method in class org.apache.spark.storage.StorageStatus
 
merge(R) - Method in class org.apache.spark.Accumulable
Merge two accumulable objects together
merge(double) - Method in class org.apache.spark.util.StatCounter
Add a value into this StatCounter, updating the internal statistics.
merge(TraversableOnce<Object>) - Method in class org.apache.spark.util.StatCounter
Add multiple values into this StatCounter, updating the internal statistics.
merge(StatCounter) - Method in class org.apache.spark.util.StatCounter
Merge another StatCounter into this one, adding up the internal statistics.
mergeCombiners() - Method in class org.apache.spark.Aggregator
 
mergeValue() - Method in class org.apache.spark.Aggregator
 
metadataCleaner() - Method in class org.apache.spark.SparkContext
 
metastorePath() - Method in class org.apache.spark.sql.hive.LocalHiveContext
 
metastorePath() - Method in class org.apache.spark.sql.hive.test.TestHiveContext
 
metrics() - Method in class org.apache.spark.ExceptionFailure
 
metricsSystem() - Method in class org.apache.spark.SparkEnv
 
MFDataGenerator - Class in org.apache.spark.mllib.util
:: DeveloperApi :: Generate RDD(s) containing data for Matrix Factorization.
MFDataGenerator() - Constructor for class org.apache.spark.mllib.util.MFDataGenerator
 
milliseconds() - Method in class org.apache.spark.streaming.Duration
 
Milliseconds - Class in org.apache.spark.streaming
Helper object that creates an instance of Duration representing a given number of milliseconds.
Milliseconds() - Constructor for class org.apache.spark.streaming.Milliseconds
 
milliseconds() - Method in class org.apache.spark.streaming.Time
 
millisToString(long) - Static method in class org.apache.spark.scheduler.StatsReportListener
Reformat a time interval in milliseconds to a prettier format for output
min(Comparator<T>) - Method in interface org.apache.spark.api.java.JavaRDDLike
Returns the minimum element from this RDD as defined by the specified Comparator[T].
min() - Method in interface org.apache.spark.mllib.stat.MultivariateStatisticalSummary
Minimum value of each column.
min(Ordering<T>) - Method in class org.apache.spark.rdd.RDD
Returns the min of this RDD as defined by the implicit Ordering[T].
min(Duration) - Method in class org.apache.spark.streaming.Duration
 
min(Time) - Method in class org.apache.spark.streaming.Time
 
min() - Method in class org.apache.spark.util.StatCounter
 
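For example, RDD.min picks up the implicit Ordering for the element type (a minimal sketch, assuming a SparkContext named sc):

    val nums = sc.parallelize(Seq(3, 1, 4, 1, 5))
    nums.min()   // 1, using the implicit Ordering[Int]

Java code passes an explicit Comparator instead, e.g. JavaRDD.min(comparator).
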
MinMax() - Static method in class org.apache.spark.mllib.tree.configuration.QuantileStrategy
 
minutes() - Static method in class org.apache.spark.scheduler.StatsReportListener
 
Minutes - Class in org.apache.spark.streaming
Helper object that creates an instance of Duration representing a given number of minutes.
Minutes() - Constructor for class org.apache.spark.streaming.Minutes
 
MLUtils - Class in org.apache.spark.mllib.util
Helper methods to load, save and pre-process data used in MLlib.
MLUtils() - Constructor for class org.apache.spark.mllib.util.MLUtils
 
MQTTUtils - Class in org.apache.spark.streaming.mqtt
 
MQTTUtils() - Constructor for class org.apache.spark.streaming.mqtt.MQTTUtils
 
multiply(Matrix) - Method in class org.apache.spark.mllib.linalg.distributed.IndexedRowMatrix
Multiply this matrix by a local matrix on the right.
multiply(Matrix) - Method in class org.apache.spark.mllib.linalg.distributed.RowMatrix
Multiply this matrix by a local matrix on the right.
multiply(double) - Method in class org.apache.spark.util.Vector
 
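A sketch of multiplying a distributed RowMatrix by a small local matrix on the right; the values are illustrative and assume a SparkContext named sc:

    import org.apache.spark.mllib.linalg.{Matrices, Vectors}
    import org.apache.spark.mllib.linalg.distributed.RowMatrix

    val rows = sc.parallelize(Seq(
      Vectors.dense(1.0, 2.0),
      Vectors.dense(3.0, 4.0)))
    val mat = new RowMatrix(rows)                      // 2 x 2, distributed
    val local = Matrices.dense(2, 1, Array(1.0, 1.0))  // 2 x 1, local
    val product = mat.multiply(local)                  // 2 x 1 RowMatrix
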
MultivariateStatisticalSummary - Interface in org.apache.spark.mllib.stat
Trait for multivariate statistical summary of a data matrix.
mustCheckpoint() - Method in class org.apache.spark.streaming.dstream.DStream
 
MutablePair<T1,T2> - Class in org.apache.spark.util
:: DeveloperApi :: A tuple of 2 elements.
MutablePair(T1, T2) - Constructor for class org.apache.spark.util.MutablePair
 
MutablePair() - Constructor for class org.apache.spark.util.MutablePair
No-arg constructor for serialization

N

NaiveBayes - Class in org.apache.spark.mllib.classification
Trains a Naive Bayes model given an RDD of (label, features) pairs.
NaiveBayes() - Constructor for class org.apache.spark.mllib.classification.NaiveBayes
 
NaiveBayesModel - Class in org.apache.spark.mllib.classification
Model for Naive Bayes Classifiers.
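A minimal training sketch; the two labeled points are illustrative:

    import org.apache.spark.mllib.classification.NaiveBayes
    import org.apache.spark.mllib.linalg.Vectors
    import org.apache.spark.mllib.regression.LabeledPoint

    val training = sc.parallelize(Seq(
      LabeledPoint(0.0, Vectors.dense(1.0, 0.0)),
      LabeledPoint(1.0, Vectors.dense(0.0, 1.0))))
    val model = NaiveBayes.train(training, lambda = 1.0)  // additive smoothing
    model.predict(Vectors.dense(0.0, 1.0))                // expected: 1.0
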
name() - Method in interface org.apache.spark.api.java.JavaRDDLike
 
name() - Method in class org.apache.spark.rdd.RDD
A friendly name for this RDD
name() - Method in class org.apache.spark.scheduler.StageInfo
 
name() - Method in class org.apache.spark.sql.hive.test.TestHiveContext.TestTable
 
name() - Method in class org.apache.spark.storage.BlockId
A globally unique identifier for this Block.
name() - Method in class org.apache.spark.storage.BroadcastBlockId
 
name() - Method in class org.apache.spark.storage.RDDBlockId
 
name() - Method in class org.apache.spark.storage.RDDInfo
 
name() - Method in class org.apache.spark.storage.ShuffleBlockId
 
name() - Method in class org.apache.spark.storage.StreamBlockId
 
name() - Method in class org.apache.spark.storage.TaskResultBlockId
 
name() - Method in class org.apache.spark.streaming.scheduler.ReceiverInfo
 
NarrowDependency<T> - Class in org.apache.spark
:: DeveloperApi :: Base class for dependencies where each partition of the parent RDD is used by at most one partition of the child RDD.
NarrowDependency(RDD<T>) - Constructor for class org.apache.spark.NarrowDependency
 
NativeCommand - Class in org.apache.spark.sql.hive.execution
:: DeveloperApi ::
NativeCommand(String, Seq<Attribute>, HiveContext) - Constructor for class org.apache.spark.sql.hive.execution.NativeCommand
 
nettyPort() - Method in class org.apache.spark.storage.BlockManagerId
 
networkStream(Receiver<T>, ClassTag<T>) - Method in class org.apache.spark.streaming.StreamingContext
Create an input stream with any arbitrary user implemented receiver.
newAPIHadoopFile(String, Class<F>, Class<K>, Class<V>, Configuration) - Method in class org.apache.spark.api.java.JavaSparkContext
Get an RDD for a given Hadoop file with an arbitrary new API InputFormat and extra configuration options to pass to the input format.
newAPIHadoopFile(String, ClassTag<K>, ClassTag<V>, ClassTag<F>) - Method in class org.apache.spark.SparkContext
Get an RDD for a Hadoop file with an arbitrary new API InputFormat.
newAPIHadoopFile(String, Class<F>, Class<K>, Class<V>, Configuration) - Method in class org.apache.spark.SparkContext
Get an RDD for a given Hadoop file with an arbitrary new API InputFormat and extra configuration options to pass to the input format.
newAPIHadoopRDD(Configuration, Class<F>, Class<K>, Class<V>) - Method in class org.apache.spark.api.java.JavaSparkContext
Get an RDD for a given Hadoop file with an arbitrary new API InputFormat and extra configuration options to pass to the input format.
newAPIHadoopRDD(Configuration, Class<F>, Class<K>, Class<V>) - Method in class org.apache.spark.SparkContext
Get an RDD for a given Hadoop file with an arbitrary new API InputFormat and extra configuration options to pass to the input format.
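A sketch of reading a file through a new-API InputFormat in Scala; the path is hypothetical:

    import org.apache.hadoop.io.{LongWritable, Text}
    import org.apache.hadoop.mapreduce.lib.input.TextInputFormat

    // Key = byte offset, value = line of text, as produced by TextInputFormat.
    val records =
      sc.newAPIHadoopFile[LongWritable, Text, TextInputFormat]("hdfs://path/to/file")
    val lines = records.map(_._2.toString)
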
newBroadcast(T, boolean, long, ClassTag<T>) - Method in interface org.apache.spark.broadcast.BroadcastFactory
 
newBroadcast(T, boolean, long, ClassTag<T>) - Method in class org.apache.spark.broadcast.HttpBroadcastFactory
 
newBroadcast(T, boolean, long, ClassTag<T>) - Method in class org.apache.spark.broadcast.TorrentBroadcastFactory
 
NewHadoopRDD<K,V> - Class in org.apache.spark.rdd
:: DeveloperApi :: An RDD that provides core functionality for reading data stored in Hadoop (e.g., files in HDFS, sources in HBase, or S3), using the new MapReduce API (org.apache.hadoop.mapreduce).
NewHadoopRDD(SparkContext, Class<? extends InputFormat<K, V>>, Class<K>, Class<V>, Configuration) - Constructor for class org.apache.spark.rdd.NewHadoopRDD
 
newInstance() - Method in class org.apache.spark.serializer.JavaSerializer
 
newInstance() - Method in class org.apache.spark.serializer.KryoSerializer
 
newInstance() - Method in interface org.apache.spark.serializer.Serializer
 
newInstance() - Method in class org.apache.spark.sql.execution.SparkLogicalPlan
 
newKryo() - Method in class org.apache.spark.serializer.KryoSerializer
 
newKryoOutput() - Method in class org.apache.spark.serializer.KryoSerializer
 
newPartitioning() - Method in class org.apache.spark.sql.execution.Exchange
 
next() - Method in class org.apache.spark.InterruptibleIterator
 
Node - Class in org.apache.spark.mllib.tree.model
:: DeveloperApi :: Node in a decision tree
Node(int, double, boolean, Option<Split>, Option<Node>, Option<Node>, Option<InformationGainStats>) - Constructor for class org.apache.spark.mllib.tree.model.Node
 
NODE_LOCAL() - Static method in class org.apache.spark.scheduler.TaskLocality
 
NONE - Static variable in class org.apache.spark.api.java.StorageLevels
 
NONE() - Static method in class org.apache.spark.scheduler.SchedulingMode
 
NONE() - Static method in class org.apache.spark.storage.StorageLevel
 
numberOfHiccups() - Method in class org.apache.spark.streaming.receiver.Statistics
 
numberOfMsgs() - Method in class org.apache.spark.streaming.receiver.Statistics
 
numberOfWorkers() - Method in class org.apache.spark.streaming.receiver.Statistics
 
numCachedPartitions() - Method in class org.apache.spark.storage.RDDInfo
 
numCols() - Method in class org.apache.spark.mllib.linalg.DenseMatrix
 
numCols() - Method in class org.apache.spark.mllib.linalg.distributed.CoordinateMatrix
Gets or computes the number of columns.
numCols() - Method in interface org.apache.spark.mllib.linalg.distributed.DistributedMatrix
Gets or computes the number of columns.
numCols() - Method in class org.apache.spark.mllib.linalg.distributed.IndexedRowMatrix
 
numCols() - Method in class org.apache.spark.mllib.linalg.distributed.RowMatrix
Gets or computes the number of columns.
numCols() - Method in interface org.apache.spark.mllib.linalg.Matrix
Number of columns.
numericRDDToDoubleRDDFunctions(RDD<T>, Numeric<T>) - Static method in class org.apache.spark.SparkContext
 
numNonzeros() - Method in interface org.apache.spark.mllib.stat.MultivariateStatisticalSummary
Number of nonzero elements in each column (values explicitly presented as zero are counted as well).
numPartitions() - Method in class org.apache.spark.HashPartitioner
 
numPartitions() - Method in class org.apache.spark.Partitioner
 
numPartitions() - Method in class org.apache.spark.RangePartitioner
 
numPartitions() - Method in class org.apache.spark.storage.RDDInfo
 
numRows() - Method in class org.apache.spark.mllib.linalg.DenseMatrix
 
numRows() - Method in class org.apache.spark.mllib.linalg.distributed.CoordinateMatrix
Gets or computes the number of rows.
numRows() - Method in interface org.apache.spark.mllib.linalg.distributed.DistributedMatrix
Gets or computes the number of rows.
numRows() - Method in class org.apache.spark.mllib.linalg.distributed.IndexedRowMatrix
 
numRows() - Method in class org.apache.spark.mllib.linalg.distributed.RowMatrix
Gets or computes the number of rows.
numRows() - Method in interface org.apache.spark.mllib.linalg.Matrix
Number of rows.
numShufflePartitions() - Method in interface org.apache.spark.sql.SQLConf
Number of partitions to use for shuffle operators.
numTasks() - Method in class org.apache.spark.scheduler.StageInfo
 

O

objectFile(String, int) - Method in class org.apache.spark.api.java.JavaSparkContext
Load an RDD saved as a SequenceFile containing serialized objects, with NullWritable keys and BytesWritable values that contain a serialized partition.
objectFile(String) - Method in class org.apache.spark.api.java.JavaSparkContext
Load an RDD saved as a SequenceFile containing serialized objects, with NullWritable keys and BytesWritable values that contain a serialized partition.
objectFile(String, int, ClassTag<T>) - Method in class org.apache.spark.SparkContext
Load an RDD saved as a SequenceFile containing serialized objects, with NullWritable keys and BytesWritable values that contain a serialized partition.
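objectFile is the read side of RDD.saveAsObjectFile; a round-trip sketch with a hypothetical output directory:

    val data = sc.parallelize(1 to 100)
    data.saveAsObjectFile("hdfs://path/to/objects")              // write
    val restored = sc.objectFile[Int]("hdfs://path/to/objects")  // read back
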
OFF_HEAP - Static variable in class org.apache.spark.api.java.StorageLevels
 
OFF_HEAP() - Static method in class org.apache.spark.storage.StorageLevel
 
onApplicationEnd(SparkListenerApplicationEnd) - Method in interface org.apache.spark.scheduler.SparkListener
Called when the application ends
onApplicationStart(SparkListenerApplicationStart) - Method in interface org.apache.spark.scheduler.SparkListener
Called when the application starts
onBatchCompleted(StreamingListenerBatchCompleted) - Method in class org.apache.spark.streaming.scheduler.StatsReportListener
 
onBatchCompleted(StreamingListenerBatchCompleted) - Method in interface org.apache.spark.streaming.scheduler.StreamingListener
Called when processing of a batch of jobs has completed.
onBatchStarted(StreamingListenerBatchStarted) - Method in interface org.apache.spark.streaming.scheduler.StreamingListener
Called when processing of a batch of jobs has started.
onBatchSubmitted(StreamingListenerBatchSubmitted) - Method in interface org.apache.spark.streaming.scheduler.StreamingListener
Called when a batch of jobs has been submitted for processing.
onBlockManagerAdded(SparkListenerBlockManagerAdded) - Method in interface org.apache.spark.scheduler.SparkListener
Called when a new block manager has joined
onBlockManagerAdded(SparkListenerBlockManagerAdded) - Method in class org.apache.spark.storage.StorageStatusListener
 
onBlockManagerAdded(SparkListenerBlockManagerAdded) - Method in class org.apache.spark.ui.jobs.JobProgressListener
 
onBlockManagerRemoved(SparkListenerBlockManagerRemoved) - Method in interface org.apache.spark.scheduler.SparkListener
Called when an existing block manager has been removed
onBlockManagerRemoved(SparkListenerBlockManagerRemoved) - Method in class org.apache.spark.storage.StorageStatusListener
 
onBlockManagerRemoved(SparkListenerBlockManagerRemoved) - Method in class org.apache.spark.ui.jobs.JobProgressListener
 
onComplete(Function1<Try<T>, U>, ExecutionContext) - Method in class org.apache.spark.ComplexFutureAction
 
onComplete(Function1<Try<T>, U>, ExecutionContext) - Method in interface org.apache.spark.FutureAction
When this action is completed, either through an exception, or a value, applies the provided function.
onComplete(Function1<R, BoxedUnit>) - Method in class org.apache.spark.partial.PartialResult
Set a handler to be called when this PartialResult completes.
onComplete(Function1<Try<T>, U>, ExecutionContext) - Method in class org.apache.spark.SimpleFutureAction
 
onEnvironmentUpdate(SparkListenerEnvironmentUpdate) - Method in interface org.apache.spark.scheduler.SparkListener
Called when environment properties have been updated
onEnvironmentUpdate(SparkListenerEnvironmentUpdate) - Method in class org.apache.spark.ui.env.EnvironmentListener
 
onEnvironmentUpdate(SparkListenerEnvironmentUpdate) - Method in class org.apache.spark.ui.jobs.JobProgressListener
 
ones(int) - Static method in class org.apache.spark.util.Vector
 
OneToOneDependency<T> - Class in org.apache.spark
:: DeveloperApi :: Represents a one-to-one dependency between partitions of the parent and child RDDs.
OneToOneDependency(RDD<T>) - Constructor for class org.apache.spark.OneToOneDependency
 
onFail(Function1<Exception, BoxedUnit>) - Method in class org.apache.spark.partial.PartialResult
Set a handler to be called if this PartialResult's job fails.
onJobEnd(SparkListenerJobEnd) - Method in class org.apache.spark.scheduler.JobLogger
When a job ends, record its completion status and close the log file
onJobEnd(SparkListenerJobEnd) - Method in interface org.apache.spark.scheduler.SparkListener
Called when a job ends
onJobStart(SparkListenerJobStart) - Method in class org.apache.spark.scheduler.JobLogger
When a job starts, record the job properties and the stage graph
onJobStart(SparkListenerJobStart) - Method in interface org.apache.spark.scheduler.SparkListener
Called when a job starts
onReceiverError(StreamingListenerReceiverError) - Method in interface org.apache.spark.streaming.scheduler.StreamingListener
Called when a receiver has reported an error
onReceiverStarted(StreamingListenerReceiverStarted) - Method in interface org.apache.spark.streaming.scheduler.StreamingListener
Called when a receiver has been started
onReceiverStopped(StreamingListenerReceiverStopped) - Method in interface org.apache.spark.streaming.scheduler.StreamingListener
Called when a receiver has been stopped
onStageCompleted(SparkListenerStageCompleted) - Method in class org.apache.spark.scheduler.JobLogger
When a stage completes, record its completion status
onStageCompleted(SparkListenerStageCompleted) - Method in interface org.apache.spark.scheduler.SparkListener
Called when a stage completes successfully or fails, with information on the completed stage.
onStageCompleted(SparkListenerStageCompleted) - Method in class org.apache.spark.scheduler.StatsReportListener
 
onStageCompleted(SparkListenerStageCompleted) - Method in class org.apache.spark.ui.jobs.JobProgressListener
 
onStageCompleted(SparkListenerStageCompleted) - Method in class org.apache.spark.ui.storage.StorageListener
 
onStageSubmitted(SparkListenerStageSubmitted) - Method in class org.apache.spark.scheduler.JobLogger
When a stage is submitted, record its submission info
onStageSubmitted(SparkListenerStageSubmitted) - Method in interface org.apache.spark.scheduler.SparkListener
Called when a stage is submitted
onStageSubmitted(SparkListenerStageSubmitted) - Method in class org.apache.spark.ui.jobs.JobProgressListener
For FIFO scheduling, all stages belong to the "default" pool, although the pool is meaningless in that mode
onStageSubmitted(SparkListenerStageSubmitted) - Method in class org.apache.spark.ui.storage.StorageListener
 
onStart() - Method in class org.apache.spark.streaming.receiver.Receiver
This method is called by the system when the receiver is started.
onStop() - Method in class org.apache.spark.streaming.receiver.Receiver
This method is called by the system when the receiver is stopped.
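onStart must set up the receiving machinery (typically on its own thread) and return; onStop releases it. A minimal sketch of a custom receiver that pushes a constant record every second; the data source is imaginary:

    import org.apache.spark.storage.StorageLevel
    import org.apache.spark.streaming.receiver.Receiver

    class ConstantReceiver extends Receiver[String](StorageLevel.MEMORY_ONLY) {
      def onStart(): Unit = {
        new Thread("constant-receiver") {
          override def run(): Unit = {
            while (!isStopped()) {
              store("hello")       // hand one record to Spark Streaming
              Thread.sleep(1000)
            }
          }
        }.start()
      }
      def onStop(): Unit = {}      // the thread above exits via isStopped()
    }
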
onTaskEnd(SparkListenerTaskEnd) - Method in class org.apache.spark.scheduler.JobLogger
When a task ends, record its completion status and metrics
onTaskEnd(SparkListenerTaskEnd) - Method in interface org.apache.spark.scheduler.SparkListener
Called when a task ends
onTaskEnd(SparkListenerTaskEnd) - Method in class org.apache.spark.scheduler.StatsReportListener
 
onTaskEnd(SparkListenerTaskEnd) - Method in class org.apache.spark.storage.StorageStatusListener
 
onTaskEnd(SparkListenerTaskEnd) - Method in class org.apache.spark.ui.exec.ExecutorsListener
 
onTaskEnd(SparkListenerTaskEnd) - Method in class org.apache.spark.ui.jobs.JobProgressListener
 
onTaskEnd(SparkListenerTaskEnd) - Method in class org.apache.spark.ui.storage.StorageListener
Assumes the storage status list is fully up-to-date.
onTaskGettingResult(SparkListenerTaskGettingResult) - Method in interface org.apache.spark.scheduler.SparkListener
Called when a task begins remotely fetching its result (will not be called for tasks that do not need to fetch the result remotely).
onTaskGettingResult(SparkListenerTaskGettingResult) - Method in class org.apache.spark.ui.jobs.JobProgressListener
 
onTaskStart(SparkListenerTaskStart) - Method in interface org.apache.spark.scheduler.SparkListener
Called when a task starts
onTaskStart(SparkListenerTaskStart) - Method in class org.apache.spark.ui.exec.ExecutorsListener
 
onTaskStart(SparkListenerTaskStart) - Method in class org.apache.spark.ui.jobs.JobProgressListener
 
onUnpersistRDD(SparkListenerUnpersistRDD) - Method in interface org.apache.spark.scheduler.SparkListener
Called when an RDD is manually unpersisted by the application
onUnpersistRDD(SparkListenerUnpersistRDD) - Method in class org.apache.spark.storage.StorageStatusListener
 
onUnpersistRDD(SparkListenerUnpersistRDD) - Method in class org.apache.spark.ui.storage.StorageListener
 
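The on* callbacks above are no-ops by default, so a listener overrides only the events it cares about. A minimal sketch that logs completed stages, assuming a SparkContext named sc:

    import org.apache.spark.scheduler.{SparkListener, SparkListenerStageCompleted}

    class StageLogger extends SparkListener {
      override def onStageCompleted(stage: SparkListenerStageCompleted): Unit = {
        val info = stage.stageInfo
        println("Stage " + info.stageId + " (" + info.name + ") completed")
      }
    }
    sc.addSparkListener(new StageLogger)
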
optimize(RDD<Tuple2<Object, Vector>>, Vector) - Method in class org.apache.spark.mllib.optimization.GradientDescent
:: DeveloperApi :: Runs gradient descent on the given training data.
optimize(RDD<Tuple2<Object, Vector>>, Vector) - Method in class org.apache.spark.mllib.optimization.LBFGS
 
optimize(RDD<Tuple2<Object, Vector>>, Vector) - Method in interface org.apache.spark.mllib.optimization.Optimizer
Solve the provided convex optimization problem.
optimizer() - Method in class org.apache.spark.mllib.classification.LogisticRegressionWithSGD
 
optimizer() - Method in class org.apache.spark.mllib.classification.SVMWithSGD
 
Optimizer - Interface in org.apache.spark.mllib.optimization
:: DeveloperApi :: Trait for optimization problem solvers.
optimizer() - Method in class org.apache.spark.mllib.regression.GeneralizedLinearAlgorithm
The optimizer to solve the problem.
optimizer() - Method in class org.apache.spark.mllib.regression.LassoWithSGD
 
optimizer() - Method in class org.apache.spark.mllib.regression.LinearRegressionWithSGD
 
optimizer() - Method in class org.apache.spark.mllib.regression.RidgeRegressionWithSGD
 
orderBy(Seq<SortOrder>) - Method in class org.apache.spark.sql.SchemaRDD
Sorts the results by the given expressions.
OrderedRDDFunctions<K,V,P extends scala.Product2<K,V>> - Class in org.apache.spark.rdd
Extra functions available on RDDs of (key, value) pairs where the key is sortable through an implicit conversion.
OrderedRDDFunctions(RDD<P>, Ordering<K>, ClassTag<K>, ClassTag<V>, ClassTag<P>) - Constructor for class org.apache.spark.rdd.OrderedRDDFunctions
 
ordering() - Method in class org.apache.spark.sql.execution.Sort
 
ordering() - Method in class org.apache.spark.sql.execution.TakeOrdered
 
ordering() - Static method in class org.apache.spark.streaming.Time
 
org.apache.spark - package org.apache.spark
Core Spark classes in Scala.
org.apache.spark.annotation - package org.apache.spark.annotation
Spark annotations to mark an API experimental or intended only for advanced usages by developers.
org.apache.spark.api.java - package org.apache.spark.api.java
Spark Java programming APIs.
org.apache.spark.api.java.function - package org.apache.spark.api.java.function
Set of interfaces to represent functions in Spark's Java API.
org.apache.spark.broadcast - package org.apache.spark.broadcast
Spark's broadcast variables, used to broadcast immutable datasets to all nodes.
org.apache.spark.io - package org.apache.spark.io
IO codecs used for compression.
org.apache.spark.mllib.classification - package org.apache.spark.mllib.classification
 
org.apache.spark.mllib.clustering - package org.apache.spark.mllib.clustering
 
org.apache.spark.mllib.evaluation - package org.apache.spark.mllib.evaluation
 
org.apache.spark.mllib.linalg - package org.apache.spark.mllib.linalg
 
org.apache.spark.mllib.linalg.distributed - package org.apache.spark.mllib.linalg.distributed
 
org.apache.spark.mllib.optimization - package org.apache.spark.mllib.optimization
 
org.apache.spark.mllib.recommendation - package org.apache.spark.mllib.recommendation
 
org.apache.spark.mllib.regression - package org.apache.spark.mllib.regression
 
org.apache.spark.mllib.stat - package org.apache.spark.mllib.stat
 
org.apache.spark.mllib.tree - package org.apache.spark.mllib.tree
 
org.apache.spark.mllib.tree.configuration - package org.apache.spark.mllib.tree.configuration
 
org.apache.spark.mllib.tree.impurity - package org.apache.spark.mllib.tree.impurity
 
org.apache.spark.mllib.tree.model - package org.apache.spark.mllib.tree.model
 
org.apache.spark.mllib.util - package org.apache.spark.mllib.util
 
org.apache.spark.partial - package org.apache.spark.partial
 
org.apache.spark.rdd - package org.apache.spark.rdd
Provides implementations of various RDDs.
org.apache.spark.scheduler - package org.apache.spark.scheduler
Spark's DAG scheduler.
org.apache.spark.serializer - package org.apache.spark.serializer
Pluggable serializers for RDD and shuffle data.
org.apache.spark.sql - package org.apache.spark.sql
 
org.apache.spark.sql.api.java - package org.apache.spark.sql.api.java
 
org.apache.spark.sql.execution - package org.apache.spark.sql.execution
 
org.apache.spark.sql.hive - package org.apache.spark.sql.hive
 
org.apache.spark.sql.hive.api.java - package org.apache.spark.sql.hive.api.java
 
org.apache.spark.sql.hive.execution - package org.apache.spark.sql.hive.execution
 
org.apache.spark.sql.hive.test - package org.apache.spark.sql.hive.test
 
org.apache.spark.sql.parquet - package org.apache.spark.sql.parquet
 
org.apache.spark.sql.test - package org.apache.spark.sql.test
 
org.apache.spark.storage - package org.apache.spark.storage
 
org.apache.spark.streaming - package org.apache.spark.streaming
 
org.apache.spark.streaming.api.java - package org.apache.spark.streaming.api.java
Java APIs for Spark Streaming.
org.apache.spark.streaming.dstream - package org.apache.spark.streaming.dstream
Various implementations of DStreams.
org.apache.spark.streaming.flume - package org.apache.spark.streaming.flume
Spark streaming receiver for Flume.
org.apache.spark.streaming.kafka - package org.apache.spark.streaming.kafka
Kafka receiver for Spark Streaming.
org.apache.spark.streaming.mqtt - package org.apache.spark.streaming.mqtt
MQTT receiver for Spark Streaming.
org.apache.spark.streaming.receiver - package org.apache.spark.streaming.receiver
 
org.apache.spark.streaming.scheduler - package org.apache.spark.streaming.scheduler
 
org.apache.spark.streaming.twitter - package org.apache.spark.streaming.twitter
Twitter feed receiver for Spark Streaming.
org.apache.spark.streaming.zeromq - package org.apache.spark.streaming.zeromq
ZeroMQ receiver for Spark Streaming.
org.apache.spark.ui.env - package org.apache.spark.ui.env
 
org.apache.spark.ui.exec - package org.apache.spark.ui.exec
 
org.apache.spark.ui.jobs - package org.apache.spark.ui.jobs
 
org.apache.spark.ui.storage - package org.apache.spark.ui.storage
 
org.apache.spark.util - package org.apache.spark.util
Spark utilities.
org.apache.spark.util.random - package org.apache.spark.util.random
Utilities for random number generation.
otherCopyArgs() - Method in class org.apache.spark.sql.execution.Aggregate
 
otherCopyArgs() - Method in class org.apache.spark.sql.execution.BroadcastNestedLoopJoin
 
otherCopyArgs() - Method in class org.apache.spark.sql.execution.ExplainCommand
 
otherCopyArgs() - Method in class org.apache.spark.sql.execution.LeftSemiJoinBNL
 
otherCopyArgs() - Method in class org.apache.spark.sql.execution.Limit
 
otherCopyArgs() - Method in class org.apache.spark.sql.execution.SetCommand
 
otherCopyArgs() - Method in class org.apache.spark.sql.execution.TakeOrdered
 
otherCopyArgs() - Method in class org.apache.spark.sql.execution.Union
 
otherCopyArgs() - Method in class org.apache.spark.sql.hive.execution.DescribeHiveTableCommand
 
otherCopyArgs() - Method in class org.apache.spark.sql.hive.execution.InsertIntoHiveTable
 
otherCopyArgs() - Method in class org.apache.spark.sql.hive.execution.NativeCommand
 
otherCopyArgs() - Method in class org.apache.spark.sql.hive.execution.ScriptTransformation
 
otherCopyArgs() - Method in class org.apache.spark.sql.parquet.InsertIntoParquetTable
 
otherCopyArgs() - Method in class org.apache.spark.sql.parquet.ParquetTableScan
 
otherInfo() - Method in class org.apache.spark.streaming.receiver.Statistics
 
outer() - Method in class org.apache.spark.sql.execution.Generate
 
output() - Method in class org.apache.spark.sql.execution.Aggregate
 
output() - Method in class org.apache.spark.sql.execution.BroadcastNestedLoopJoin
 
output() - Method in class org.apache.spark.sql.execution.CacheCommand
 
output() - Method in class org.apache.spark.sql.execution.CartesianProduct
 
output() - Method in class org.apache.spark.sql.execution.DescribeCommand
 
output() - Method in class org.apache.spark.sql.execution.Exchange
 
output() - Method in class org.apache.spark.sql.execution.ExistingRdd
 
output() - Method in class org.apache.spark.sql.execution.ExplainCommand
 
output() - Method in class org.apache.spark.sql.execution.Filter
 
output() - Method in class org.apache.spark.sql.execution.Generate
 
output() - Method in class org.apache.spark.sql.execution.HashJoin
 
output() - Method in class org.apache.spark.sql.execution.LeftSemiJoinBNL
 
output() - Method in class org.apache.spark.sql.execution.LeftSemiJoinHash
 
output() - Method in class org.apache.spark.sql.execution.Limit
 
output() - Method in class org.apache.spark.sql.execution.Project
 
output() - Method in class org.apache.spark.sql.execution.Sample
 
output() - Method in class org.apache.spark.sql.execution.SetCommand
 
output() - Method in class org.apache.spark.sql.execution.Sort
 
output() - Method in class org.apache.spark.sql.execution.SparkLogicalPlan
 
output() - Method in class org.apache.spark.sql.execution.TakeOrdered
 
output() - Method in class org.apache.spark.sql.execution.Union
 
output() - Method in class org.apache.spark.sql.hive.execution.DescribeHiveTableCommand
 
output() - Method in class org.apache.spark.sql.hive.execution.HiveTableScan
 
output() - Method in class org.apache.spark.sql.hive.execution.InsertIntoHiveTable
 
output() - Method in class org.apache.spark.sql.hive.execution.NativeCommand
 
output() - Method in class org.apache.spark.sql.hive.execution.ScriptTransformation
 
output() - Method in class org.apache.spark.sql.parquet.InsertIntoParquetTable
 
output() - Method in class org.apache.spark.sql.parquet.ParquetTableScan
 
outputClass() - Method in class org.apache.spark.sql.hive.execution.InsertIntoHiveTable
 
outputPartitioning() - Method in class org.apache.spark.sql.execution.BroadcastNestedLoopJoin
 
outputPartitioning() - Method in class org.apache.spark.sql.execution.Exchange
 
outputPartitioning() - Method in class org.apache.spark.sql.execution.HashJoin
 
outputPartitioning() - Method in class org.apache.spark.sql.execution.LeftSemiJoinBNL
 
outputPartitioning() - Method in class org.apache.spark.sql.execution.LeftSemiJoinHash
 
outputPartitioning() - Method in class org.apache.spark.sql.execution.SparkPlan
Specifies how data is partitioned across different nodes in the cluster.
overwrite() - Method in class org.apache.spark.sql.hive.execution.InsertIntoHiveTable
 
overwrite() - Method in class org.apache.spark.sql.parquet.InsertIntoParquetTable
 

P

PairDStreamFunctions<K,V> - Class in org.apache.spark.streaming.dstream
Extra functions available on DStream of (key, value) pairs through an implicit conversion.
PairDStreamFunctions(DStream<Tuple2<K, V>>, ClassTag<K>, ClassTag<V>, Ordering<K>) - Constructor for class org.apache.spark.streaming.dstream.PairDStreamFunctions
 
PairFlatMapFunction<T,K,V> - Interface in org.apache.spark.api.java.function
A function that returns zero or more key-value pair records from each input record.
PairFunction<T,K,V> - Interface in org.apache.spark.api.java.function
A function that returns key-value pairs (Tuple2), and can be used to construct PairRDDs.
PairRDDFunctions<K,V> - Class in org.apache.spark.rdd
Extra functions available on RDDs of (key, value) pairs through an implicit conversion.
PairRDDFunctions(RDD<Tuple2<K, V>>, ClassTag<K>, ClassTag<V>, Ordering<K>) - Constructor for class org.apache.spark.rdd.PairRDDFunctions
 
parallelize(List<T>, int) - Method in class org.apache.spark.api.java.JavaSparkContext
Distribute a local Java collection to form an RDD.
parallelize(List<T>) - Method in class org.apache.spark.api.java.JavaSparkContext
Distribute a local Java collection to form an RDD.
parallelize(Seq<T>, int, ClassTag<T>) - Method in class org.apache.spark.SparkContext
Distribute a local Scala collection to form an RDD.
parallelizeDoubles(List<Double>, int) - Method in class org.apache.spark.api.java.JavaSparkContext
Distribute a local Java collection to form an RDD.
parallelizeDoubles(List<Double>) - Method in class org.apache.spark.api.java.JavaSparkContext
Distribute a local Java collection to form an RDD.
parallelizePairs(List<Tuple2<K, V>>, int) - Method in class org.apache.spark.api.java.JavaSparkContext
Distribute a local Java collection to form an RDD.
parallelizePairs(List<Tuple2<K, V>>) - Method in class org.apache.spark.api.java.JavaSparkContext
Distribute a local Java collection to form an RDD.
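A basic sketch of distributing a local collection, assuming a SparkContext named sc:

    val distData = sc.parallelize(Seq(1, 2, 3, 4, 5), 2)  // 2 partitions
    distData.reduce(_ + _)                                // 15
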
PARQUET_FILTER_DATA() - Static method in class org.apache.spark.sql.parquet.ParquetFilters
 
PARQUET_FILTER_PUSHDOWN_ENABLED() - Static method in class org.apache.spark.sql.parquet.ParquetFilters
 
parquetFile(String) - Method in class org.apache.spark.sql.api.java.JavaSQLContext
Loads a parquet file, returning the result as a JavaSchemaRDD.
parquetFile(String) - Method in class org.apache.spark.sql.SQLContext
Loads a Parquet file, returning the result as a SchemaRDD.
ParquetFilters - Class in org.apache.spark.sql.parquet
 
ParquetFilters() - Constructor for class org.apache.spark.sql.parquet.ParquetFilters
 
ParquetTableScan - Class in org.apache.spark.sql.parquet
Parquet table scan operator.
ParquetTableScan(Seq<Attribute>, ParquetRelation, Seq<Expression>, SQLContext) - Constructor for class org.apache.spark.sql.parquet.ParquetTableScan
 
partial() - Method in class org.apache.spark.sql.execution.Aggregate
 
PartialResult<R> - Class in org.apache.spark.partial
 
PartialResult(R, boolean) - Constructor for class org.apache.spark.partial.PartialResult
 
Partition - Interface in org.apache.spark
A partition of an RDD.
partition() - Method in class org.apache.spark.sql.hive.execution.InsertIntoHiveTable
 
partitionBy(Partitioner) - Method in class org.apache.spark.api.java.JavaPairRDD
Return a copy of the RDD partitioned using the specified partitioner.
partitionBy(Partitioner) - Method in class org.apache.spark.rdd.PairRDDFunctions
Return a copy of the RDD partitioned using the specified partitioner.
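A sketch of pre-partitioning a pair RDD; subsequent key-based operations that use the same partitioner can then avoid a shuffle:

    import org.apache.spark.HashPartitioner

    val pairs = sc.parallelize(Seq(("a", 1), ("b", 2), ("a", 3)))
    val partitioned = pairs.partitionBy(new HashPartitioner(4))
    partitioned.reduceByKey(_ + _)   // no extra shuffle: partitioner matches
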
Partitioner - Class in org.apache.spark
An object that defines how the elements in a key-value pair RDD are partitioned by key.
Partitioner() - Constructor for class org.apache.spark.Partitioner
 
partitioner() - Method in class org.apache.spark.rdd.CoGroupedRDD
 
partitioner() - Method in class org.apache.spark.rdd.RDD
Optionally overridden by subclasses to specify how they are partitioned.
partitioner() - Method in class org.apache.spark.rdd.ShuffledRDD
 
partitioner() - Method in class org.apache.spark.ShuffleDependency
 
partitionId() - Method in class org.apache.spark.TaskContext
 
partitionPruningPred() - Method in class org.apache.spark.sql.hive.execution.HiveTableScan
 
PartitionPruningRDD<T> - Class in org.apache.spark.rdd
:: DeveloperApi :: An RDD used to prune partitions so we can avoid launching tasks on all of them.
PartitionPruningRDD(RDD<T>, Function1<Object, Object>, ClassTag<T>) - Constructor for class org.apache.spark.rdd.PartitionPruningRDD
 
partitions() - Method in class org.apache.spark.rdd.RDD
Get the array of partitions of this RDD, taking into account whether the RDD is checkpointed or not.
path() - Method in class org.apache.spark.scheduler.InputFormatInfo
 
path() - Method in class org.apache.spark.scheduler.SplitInfo
 
percentiles() - Static method in class org.apache.spark.scheduler.StatsReportListener
 
percentilesHeader() - Static method in class org.apache.spark.scheduler.StatsReportListener
 
persist(StorageLevel) - Method in class org.apache.spark.api.java.JavaDoubleRDD
Set this RDD's storage level to persist its values across operations after the first time it is computed.
persist(StorageLevel) - Method in class org.apache.spark.api.java.JavaPairRDD
Set this RDD's storage level to persist its values across operations after the first time it is computed.
persist(StorageLevel) - Method in class org.apache.spark.api.java.JavaRDD
Set this RDD's storage level to persist its values across operations after the first time it is computed.
persist(StorageLevel) - Method in class org.apache.spark.rdd.RDD
Set this RDD's storage level to persist its values across operations after the first time it is computed.
persist() - Method in class org.apache.spark.rdd.RDD
Persist this RDD with the default storage level (`MEMORY_ONLY`).
persist() - Method in class org.apache.spark.sql.api.java.JavaSchemaRDD
Persist this RDD with the default storage level (`MEMORY_ONLY`).
persist(StorageLevel) - Method in class org.apache.spark.sql.api.java.JavaSchemaRDD
Set this RDD's storage level to persist its values across operations after the first time it is computed.
persist() - Method in class org.apache.spark.streaming.api.java.JavaDStream
Persist RDDs of this DStream with the default storage level (MEMORY_ONLY_SER)
persist(StorageLevel) - Method in class org.apache.spark.streaming.api.java.JavaDStream
Persist the RDDs of this DStream with the given storage level
persist() - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
Persist RDDs of this DStream with the default storage level (MEMORY_ONLY_SER)
persist(StorageLevel) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
Persist the RDDs of this DStream with the given storage level
persist(StorageLevel) - Method in class org.apache.spark.streaming.dstream.DStream
Persist the RDDs of this DStream with the given storage level
persist() - Method in class org.apache.spark.streaming.dstream.DStream
Persist RDDs of this DStream with the default storage level (MEMORY_ONLY_SER)
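A sketch of the cache-on-first-use behavior described above; the path is hypothetical:

    val logs = sc.textFile("hdfs://path/to/logs")
    val errors = logs.filter(_.contains("ERROR"))
    errors.persist()   // default level MEMORY_ONLY; nothing is computed yet
    errors.count()     // the first action materializes and caches the RDD
    errors.count()     // served from the cache
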
persistentRdds() - Method in class org.apache.spark.SparkContext
 
pi() - Method in class org.apache.spark.mllib.classification.NaiveBayesModel
 
pipe(String) - Method in interface org.apache.spark.api.java.JavaRDDLike
Return an RDD created by piping elements to a forked external process.
pipe(List<String>) - Method in interface org.apache.spark.api.java.JavaRDDLike
Return an RDD created by piping elements to a forked external process.
pipe(List<String>, Map<String, String>) - Method in interface org.apache.spark.api.java.JavaRDDLike
Return an RDD created by piping elements to a forked external process.
pipe(String) - Method in class org.apache.spark.rdd.RDD
Return an RDD created by piping elements to a forked external process.
pipe(String, Map<String, String>) - Method in class org.apache.spark.rdd.RDD
Return an RDD created by piping elements to a forked external process.
pipe(Seq<String>, Map<String, String>, Function1<Function1<String, BoxedUnit>, BoxedUnit>, Function2<T, Function1<String, BoxedUnit>, BoxedUnit>, boolean) - Method in class org.apache.spark.rdd.RDD
Return an RDD created by piping elements to a forked external process.
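A sketch of piping through an external process: elements are written to the process's stdin one per line, and its stdout lines become the elements of the new RDD:

    val nums = sc.parallelize(Seq("1", "2", "3"))
    val piped = nums.pipe("cat").collect()   // identity pipe: Array("1", "2", "3")
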
plusDot(Vector, Vector) - Method in class org.apache.spark.util.Vector
Return (this + plus) dot other, without creating any intermediate storage.
poisson() - Method in class org.apache.spark.util.random.PoissonSampler
 
PoissonSampler<T> - Class in org.apache.spark.util.random
:: DeveloperApi :: A sampler based on values drawn from a Poisson distribution.
PoissonSampler(double, Poisson) - Constructor for class org.apache.spark.util.random.PoissonSampler
 
poolToActiveStages() - Method in class org.apache.spark.ui.jobs.JobProgressListener
 
port() - Method in class org.apache.spark.storage.BlockManagerId
 
pr() - Method in class org.apache.spark.mllib.evaluation.BinaryClassificationMetrics
Returns the precision-recall curve, which is an RDD of (recall, precision), NOT (precision, recall), with (0.0, 1.0) prepended to it.
precisionByThreshold() - Method in class org.apache.spark.mllib.evaluation.BinaryClassificationMetrics
Returns the (threshold, precision) curve.
predict(RDD<Vector>) - Method in interface org.apache.spark.mllib.classification.ClassificationModel
Predict values for the given data set using the model trained.
predict(Vector) - Method in interface org.apache.spark.mllib.classification.ClassificationModel
Predict values for a single data point using the model trained.
predict(JavaRDD<Vector>) - Method in interface org.apache.spark.mllib.classification.ClassificationModel
Predict values for examples stored in a JavaRDD.
predict(RDD<Vector>) - Method in class org.apache.spark.mllib.classification.NaiveBayesModel
 
predict(Vector) - Method in class org.apache.spark.mllib.classification.NaiveBayesModel
 
predict(Vector) - Method in class org.apache.spark.mllib.clustering.KMeansModel
Returns the cluster index that a given point belongs to.
predict(RDD<Vector>) - Method in class org.apache.spark.mllib.clustering.KMeansModel
Maps given points to their cluster indices.
predict(JavaRDD<Vector>) - Method in class org.apache.spark.mllib.clustering.KMeansModel
Maps given points to their cluster indices.
predict(int, int) - Method in class org.apache.spark.mllib.recommendation.MatrixFactorizationModel
Predict the rating of one user for one product.
predict(RDD<Tuple2<Object, Object>>) - Method in class org.apache.spark.mllib.recommendation.MatrixFactorizationModel
Predict the rating of many users for many products.
predict(JavaRDD<byte[]>) - Method in class org.apache.spark.mllib.recommendation.MatrixFactorizationModel
:: DeveloperApi :: Predict the rating of many users for many products.
predict(RDD<Vector>) - Method in class org.apache.spark.mllib.regression.GeneralizedLinearModel
Predict values for the given data set using the model trained.
predict(Vector) - Method in class org.apache.spark.mllib.regression.GeneralizedLinearModel
Predict values for a single data point using the model trained.
predict(RDD<Vector>) - Method in interface org.apache.spark.mllib.regression.RegressionModel
Predict values for the given data set using the model trained.
predict(Vector) - Method in interface org.apache.spark.mllib.regression.RegressionModel
Predict values for a single data point using the model trained.
predict(JavaRDD<Vector>) - Method in interface org.apache.spark.mllib.regression.RegressionModel
Predict values for examples stored in a JavaRDD.
predict(Vector) - Method in class org.apache.spark.mllib.tree.model.DecisionTreeModel
Predict values for a single data point using the model trained.
predict(RDD<Vector>) - Method in class org.apache.spark.mllib.tree.model.DecisionTreeModel
Predict values for the given data set using the model trained.
predict() - Method in class org.apache.spark.mllib.tree.model.InformationGainStats
 
predict() - Method in class org.apache.spark.mllib.tree.model.Node
 
predictIfLeaf(Vector) - Method in class org.apache.spark.mllib.tree.model.Node
Predict the value for the given feature vector, recursing into child nodes when this node is not a leaf.
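A sketch of the train-then-predict pattern shared by these models, here with k-means; the points are illustrative and a SparkContext named sc is assumed:

    import org.apache.spark.mllib.clustering.KMeans
    import org.apache.spark.mllib.linalg.Vectors

    val points = sc.parallelize(Seq(
      Vectors.dense(0.0, 0.0),
      Vectors.dense(9.0, 9.0)))
    val model = KMeans.train(points, k = 2, maxIterations = 10)
    model.predict(Vectors.dense(8.5, 9.2))   // index of the nearest center
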
preferredLocation() - Method in class org.apache.spark.streaming.receiver.Receiver
Override this to specify a preferred location (hostname).
preferredLocations(Partition) - Method in class org.apache.spark.rdd.RDD
Get the preferred locations of a partition (as hostnames), taking into account whether the RDD is checkpointed.
preferredNodeLocationData() - Method in class org.apache.spark.SparkContext
 
prettyPrint() - Method in class org.apache.spark.streaming.Duration
 
prev() - Method in class org.apache.spark.rdd.ShuffledRDD
 
print() - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
Print the first ten elements of each RDD generated in this DStream.
print() - Method in class org.apache.spark.streaming.dstream.DStream
Print the first ten elements of each RDD generated in this DStream.
printStats() - Method in class org.apache.spark.streaming.scheduler.StatsReportListener
 
probabilities() - Static method in class org.apache.spark.scheduler.StatsReportListener
 
PROCESS_LOCAL() - Static method in class org.apache.spark.scheduler.TaskLocality
 
processingDelay() - Method in class org.apache.spark.streaming.scheduler.BatchInfo
Time taken for all the jobs of this batch to finish processing, measured from the time they started processing.
processingEndTime() - Method in class org.apache.spark.streaming.scheduler.BatchInfo
 
processingStartTime() - Method in class org.apache.spark.streaming.scheduler.BatchInfo
 
product() - Method in class org.apache.spark.mllib.recommendation.Rating
 
productFeatures() - Method in class org.apache.spark.mllib.recommendation.MatrixFactorizationModel
 
productToRowRdd(RDD<A>) - Static method in class org.apache.spark.sql.execution.ExistingRdd
 
Project - Class in org.apache.spark.sql.execution
:: DeveloperApi ::
Project(Seq<NamedExpression>, SparkPlan) - Constructor for class org.apache.spark.sql.execution.Project
 
projectList() - Method in class org.apache.spark.sql.execution.Project
 
properties() - Method in class org.apache.spark.scheduler.SparkListenerJobStart
 
properties() - Method in class org.apache.spark.scheduler.SparkListenerStageSubmitted
 
pruneColumns(Seq<Attribute>) - Method in class org.apache.spark.sql.parquet.ParquetTableScan
Applies a (candidate) projection.
Pseudorandom - Interface in org.apache.spark.util.random
:: DeveloperApi :: A class with pseudorandom behavior.
putCachedMetadata(String, Object) - Static method in class org.apache.spark.rdd.HadoopRDD
 

Q

quantileCalculationStrategy() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
 
QuantileStrategy - Class in org.apache.spark.mllib.tree.configuration
:: Experimental :: Enum for selecting the quantile calculation strategy
QuantileStrategy() - Constructor for class org.apache.spark.mllib.tree.configuration.QuantileStrategy
 
QueryExecutionException - Exception in org.apache.spark.sql.execution
 
QueryExecutionException(String) - Constructor for exception org.apache.spark.sql.execution.QueryExecutionException
 
queueStream(Queue<JavaRDD<T>>) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
Create an input stream from a queue of RDDs.
queueStream(Queue<JavaRDD<T>>, boolean) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
Create an input stream from a queue of RDDs.
queueStream(Queue<JavaRDD<T>>, boolean, JavaRDD<T>) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
Create an input stream from a queue of RDDs.
queueStream(Queue<RDD<T>>, boolean, ClassTag<T>) - Method in class org.apache.spark.streaming.StreamingContext
Create an input stream from a queue of RDDs.
queueStream(Queue<RDD<T>>, boolean, RDD<T>, ClassTag<T>) - Method in class org.apache.spark.streaming.StreamingContext
Create an input stream from a queue of RDDs.
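queueStream is mostly useful for testing: each batch interval, one (or all) of the queued RDDs is taken as that batch's data. A minimal sketch, assuming a SparkContext named sc:

    import scala.collection.mutable.Queue
    import org.apache.spark.rdd.RDD
    import org.apache.spark.streaming.{Seconds, StreamingContext}

    val ssc = new StreamingContext(sc, Seconds(1))
    val rddQueue = Queue[RDD[Int]]()
    ssc.queueStream(rddQueue).print()   // one queued RDD per batch by default
    rddQueue += sc.parallelize(1 to 10)
    ssc.start()
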

R

RACK_LOCAL() - Static method in class org.apache.spark.scheduler.TaskLocality
 
RANDOM() - Static method in class org.apache.spark.mllib.clustering.KMeans
 
random(int, Random) - Static method in class org.apache.spark.util.Vector
Creates a Vector of the given length containing random numbers between 0.0 and 1.0.
RandomSampler<T,U> - Interface in org.apache.spark.util.random
:: DeveloperApi :: A pseudorandom sampler.
randomSplit(double[], long) - Method in class org.apache.spark.rdd.RDD
Randomly splits this RDD with the provided weights.
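For example, a common train/test split (assuming an RDD named data; the weights and seed are illustrative):

    val Array(train, test) = data.randomSplit(Array(0.7, 0.3), seed = 11L)
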
RangeDependency<T> - Class in org.apache.spark
:: DeveloperApi :: Represents a one-to-one dependency between ranges of partitions in the parent and child RDDs.
RangeDependency(RDD<T>, int, int, int) - Constructor for class org.apache.spark.RangeDependency
 
RangePartitioner<K,V> - Class in org.apache.spark
A Partitioner that partitions sortable records by range into roughly equal ranges.
RangePartitioner(int, RDD<? extends Product2<K, V>>, boolean, Ordering<K>, ClassTag<K>) - Constructor for class org.apache.spark.RangePartitioner
 
rank() - Method in class org.apache.spark.mllib.recommendation.MatrixFactorizationModel
 
Rating - Class in org.apache.spark.mllib.recommendation
:: Experimental :: A more compact class to represent a rating than Tuple3[Int, Int, Double].
Rating(int, int, double) - Constructor for class org.apache.spark.mllib.recommendation.Rating
 
rating() - Method in class org.apache.spark.mllib.recommendation.Rating
 
rawSocketStream(String, int, StorageLevel) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
Create an input stream from a network source hostname:port, where data is received as serialized blocks (serialized using Spark's serializer) that can be pushed directly into the block manager without deserializing them.
rawSocketStream(String, int) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
Create an input stream from a network source hostname:port, where data is received as serialized blocks (serialized using Spark's serializer) that can be pushed directly into the block manager without deserializing them.
rawSocketStream(String, int, StorageLevel, ClassTag<T>) - Method in class org.apache.spark.streaming.StreamingContext
Create an input stream from a network source hostname:port, where data is received as serialized blocks (serialized using Spark's serializer) that can be pushed directly into the block manager without deserializing them.
rdd() - Method in class org.apache.spark.api.java.JavaDoubleRDD
 
rdd() - Method in class org.apache.spark.api.java.JavaPairRDD
 
rdd() - Method in class org.apache.spark.api.java.JavaRDD
 
rdd() - Method in interface org.apache.spark.api.java.JavaRDDLike
 
rdd() - Method in class org.apache.spark.Dependency
 
RDD<T> - Class in org.apache.spark.rdd
A Resilient Distributed Dataset (RDD), the basic abstraction in Spark.
RDD(SparkContext, Seq<Dependency<?>>, ClassTag<T>) - Constructor for class org.apache.spark.rdd.RDD
 
RDD(RDD<?>, ClassTag<T>) - Constructor for class org.apache.spark.rdd.RDD
Construct an RDD with just a one-to-one dependency on one parent
rdd() - Method in class org.apache.spark.sql.api.java.JavaSchemaRDD
 
rdd() - Method in class org.apache.spark.sql.execution.ExistingRdd
 
RDD() - Static method in class org.apache.spark.storage.BlockId
 
RDDBlockId - Class in org.apache.spark.storage
 
RDDBlockId(int, int) - Constructor for class org.apache.spark.storage.RDDBlockId
 
rddBlocks() - Method in class org.apache.spark.storage.StorageStatus
 
rddId() - Method in class org.apache.spark.scheduler.SparkListenerUnpersistRDD
 
rddId() - Method in class org.apache.spark.storage.RDDBlockId
 
RDDInfo - Class in org.apache.spark.storage
 
RDDInfo(int, String, int, StorageLevel) - Constructor for class org.apache.spark.storage.RDDInfo
 
rddInfoList() - Method in class org.apache.spark.ui.storage.StorageListener
Filter RDD info to include only those with cached partitions
rddInfos() - Method in class org.apache.spark.scheduler.StageInfo
 
rdds() - Method in class org.apache.spark.rdd.CoGroupedRDD
 
rdds() - Method in class org.apache.spark.rdd.UnionRDD
 
rddToAsyncRDDActions(RDD<T>, ClassTag<T>) - Static method in class org.apache.spark.SparkContext
 
rddToOrderedRDDFunctions(RDD<Tuple2<K, V>>, Ordering<K>, ClassTag<K>, ClassTag<V>) - Static method in class org.apache.spark.SparkContext
 
rddToPairRDDFunctions(RDD<Tuple2<K, V>>, ClassTag<K>, ClassTag<V>, Ordering<K>) - Static method in class org.apache.spark.SparkContext
 
rddToSequenceFileRDDFunctions(RDD<Tuple2<K, V>>, Function1<K, Writable>, ClassTag<K>, Function1<V, Writable>, ClassTag<V>) - Static method in class org.apache.spark.SparkContext
 
readExternal(ObjectInput) - Method in class org.apache.spark.serializer.JavaSerializer
 
readExternal(ObjectInput) - Method in class org.apache.spark.storage.BlockManagerId
 
readExternal(ObjectInput) - Method in class org.apache.spark.storage.StorageLevel
 
readExternal(ObjectInput) - Method in class org.apache.spark.streaming.flume.SparkFlumeEvent
 
readObject(ClassTag<T>) - Method in interface org.apache.spark.serializer.DeserializationStream
 
ready(Duration, CanAwait) - Method in class org.apache.spark.ComplexFutureAction
 
ready(Duration, CanAwait) - Method in interface org.apache.spark.FutureAction
Blocks until this action completes.
ready(Duration, CanAwait) - Method in class org.apache.spark.SimpleFutureAction
 
reason() - Method in class org.apache.spark.scheduler.SparkListenerTaskEnd
 
recallByThreshold() - Method in class org.apache.spark.mllib.evaluation.BinaryClassificationMetrics
Returns the (threshold, recall) curve.
receivedBlockInfo() - Method in class org.apache.spark.streaming.scheduler.BatchInfo
 
Receiver<T> - Class in org.apache.spark.streaming.receiver
:: DeveloperApi :: Abstract class of a receiver that can be run on worker nodes to receive external data.
Receiver(StorageLevel) - Constructor for class org.apache.spark.streaming.receiver.Receiver
 
ReceiverInfo - Class in org.apache.spark.streaming.scheduler
:: DeveloperApi :: Class containing information about a receiver
ReceiverInfo(int, String, ActorRef, boolean, String, String, String) - Constructor for class org.apache.spark.streaming.scheduler.ReceiverInfo
 
receiverInfo() - Method in class org.apache.spark.streaming.scheduler.StreamingListenerReceiverError
 
receiverInfo() - Method in class org.apache.spark.streaming.scheduler.StreamingListenerReceiverStarted
 
receiverInfo() - Method in class org.apache.spark.streaming.scheduler.StreamingListenerReceiverStopped
 
receiverInputDStream() - Method in class org.apache.spark.streaming.api.java.JavaPairReceiverInputDStream
 
receiverInputDStream() - Method in class org.apache.spark.streaming.api.java.JavaReceiverInputDStream
 
ReceiverInputDStream<T> - Class in org.apache.spark.streaming.dstream
Abstract class for defining any InputDStream that has to start a receiver on worker nodes to receive external data.
ReceiverInputDStream(StreamingContext, ClassTag<T>) - Constructor for class org.apache.spark.streaming.dstream.ReceiverInputDStream
 
receiverStream(Receiver<T>) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
Create an input stream with any arbitrary user implemented receiver.
receiverStream(Receiver<T>, ClassTag<T>) - Method in class org.apache.spark.streaming.StreamingContext
Create an input stream with any arbitrary user implemented receiver.
reduce(Function2<T, T, T>) - Method in interface org.apache.spark.api.java.JavaRDDLike
Reduces the elements of this RDD using the specified commutative and associative binary operator.
reduce(Function2<T, T, T>) - Method in class org.apache.spark.rdd.RDD
Reduces the elements of this RDD using the specified commutative and associative binary operator.
reduce(Function2<T, T, T>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
Return a new DStream in which each RDD has a single element generated by reducing each RDD of this DStream.
reduce(Function2<T, T, T>) - Method in class org.apache.spark.streaming.dstream.DStream
Return a new DStream in which each RDD has a single element generated by reducing each RDD of this DStream.
reduceByKey(Partitioner, Function2<V, V, V>) - Method in class org.apache.spark.api.java.JavaPairRDD
Merge the values for each key using an associative reduce function.
reduceByKey(Function2<V, V, V>, int) - Method in class org.apache.spark.api.java.JavaPairRDD
Merge the values for each key using an associative reduce function.
reduceByKey(Function2<V, V, V>) - Method in class org.apache.spark.api.java.JavaPairRDD
Merge the values for each key using an associative reduce function.
reduceByKey(Partitioner, Function2<V, V, V>) - Method in class org.apache.spark.rdd.PairRDDFunctions
Merge the values for each key using an associative reduce function.
reduceByKey(Function2<V, V, V>, int) - Method in class org.apache.spark.rdd.PairRDDFunctions
Merge the values for each key using an associative reduce function.
reduceByKey(Function2<V, V, V>) - Method in class org.apache.spark.rdd.PairRDDFunctions
Merge the values for each key using an associative reduce function.
reduceByKey(Function2<V, V, V>) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
Return a new DStream by applying reduceByKey to each RDD.
reduceByKey(Function2<V, V, V>, int) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
Return a new DStream by applying reduceByKey to each RDD.
reduceByKey(Function2<V, V, V>, Partitioner) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
Return a new DStream by applying reduceByKey to each RDD.
reduceByKey(Function2<V, V, V>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
Return a new DStream by applying reduceByKey to each RDD.
reduceByKey(Function2<V, V, V>, int) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
Return a new DStream by applying reduceByKey to each RDD.
reduceByKey(Function2<V, V, V>, Partitioner) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
Return a new DStream by applying reduceByKey to each RDD.
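The canonical counting sketch using reduceByKey on a pair RDD, assuming a SparkContext named sc:

    val pairs = sc.parallelize(Seq(("a", 1), ("b", 1), ("a", 1)))
    pairs.reduceByKey(_ + _).collect()   // Array((a,2), (b,1))
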
reduceByKeyAndWindow(Function2<V, V, V>, Duration) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
Create a new DStream by applying reduceByKey over a sliding window on this DStream.
reduceByKeyAndWindow(Function2<V, V, V>, Duration, Duration) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
Return a new DStream by applying reduceByKey over a sliding window.
reduceByKeyAndWindow(Function2<V, V, V>, Duration, Duration, int) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
Return a new DStream by applying reduceByKey over a sliding window.
reduceByKeyAndWindow(Function2<V, V, V>, Duration, Duration, Partitioner) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
Return a new DStream by applying reduceByKey over a sliding window.
reduceByKeyAndWindow(Function2<V, V, V>, Function2<V, V, V>, Duration, Duration) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
Return a new DStream by reducing over a sliding window using incremental computation.
reduceByKeyAndWindow(Function2<V, V, V>, Function2<V, V, V>, Duration, Duration, int, Function<Tuple2<K, V>, Boolean>) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
Return a new DStream by applying incremental reduceByKey over a sliding window.
reduceByKeyAndWindow(Function2<V, V, V>, Function2<V, V, V>, Duration, Duration, Partitioner, Function<Tuple2<K, V>, Boolean>) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
Return a new DStream by applying incremental reduceByKey over a sliding window.
reduceByKeyAndWindow(Function2<V, V, V>, Duration) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
Return a new DStream by applying reduceByKey over a sliding window on this DStream.
reduceByKeyAndWindow(Function2<V, V, V>, Duration, Duration) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
Return a new DStream by applying reduceByKey over a sliding window.
reduceByKeyAndWindow(Function2<V, V, V>, Duration, Duration, int) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
Return a new DStream by applying reduceByKey over a sliding window.
reduceByKeyAndWindow(Function2<V, V, V>, Duration, Duration, Partitioner) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
Return a new DStream by applying reduceByKey over a sliding window.
reduceByKeyAndWindow(Function2<V, V, V>, Function2<V, V, V>, Duration, Duration, int, Function1<Tuple2<K, V>, Object>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
Return a new DStream by applying incremental reduceByKey over a sliding window.
reduceByKeyAndWindow(Function2<V, V, V>, Function2<V, V, V>, Duration, Duration, Partitioner, Function1<Tuple2<K, V>, Object>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
Return a new DStream by applying incremental reduceByKey over a sliding window.
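A sketch of the incremental form, which subtracts values leaving the window instead of re-reducing the whole window from scratch (assuming a DStream of (word, 1) pairs named pairs, and a StreamingContext with checkpointing enabled, which the inverse-function form requires):

    import org.apache.spark.streaming.Seconds

    val windowedCounts = pairs.reduceByKeyAndWindow(
      (a: Int, b: Int) => a + b,   // fold in values entering the window
      (a: Int, b: Int) => a - b,   // "invert" values leaving the window
      Seconds(30),                 // window duration
      Seconds(10))                 // slide duration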
reduceByKeyLocally(Function2<V, V, V>) - Method in class org.apache.spark.api.java.JavaPairRDD
Merge the values for each key using an associative reduce function, but return the results immediately to the master as a Map.
reduceByKeyLocally(Function2<V, V, V>) - Method in class org.apache.spark.rdd.PairRDDFunctions
Merge the values for each key using an associative reduce function, but return the results immediately to the master as a Map.
reduceByKeyToDriver(Function2<V, V, V>) - Method in class org.apache.spark.rdd.PairRDDFunctions
Alias for reduceByKeyLocally
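Unlike reduceByKey, reduceByKeyLocally performs the final merge on the driver and returns a plain Map, so it only suits small result sets. A sketch (assuming the pairs RDD above):

    val totals: scala.collection.Map[String, Int] = pairs.reduceByKeyLocally(_ + _)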
reduceByWindow(Function2<T, T, T>, Duration, Duration) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
Return a new DStream in which each RDD has a single element generated by reducing all elements in a sliding window over this DStream.
reduceByWindow(Function2<T, T, T>, Function2<T, T, T>, Duration, Duration) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
Return a new DStream in which each RDD has a single element generated by reducing all elements in a sliding window over this DStream.
reduceByWindow(Function2<T, T, T>, Duration, Duration) - Method in class org.apache.spark.streaming.dstream.DStream
Return a new DStream in which each RDD has a single element generated by reducing all elements in a sliding window over this DStream.
reduceByWindow(Function2<T, T, T>, Function2<T, T, T>, Duration, Duration) - Method in class org.apache.spark.streaming.dstream.DStream
Return a new DStream in which each RDD has a single element generated by reducing all elements in a sliding window over this DStream.
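A sketch of reduceByWindow on a numeric DStream (assuming a DStream[Int] named nums):

    import org.apache.spark.streaming.Seconds

    // one summed value per window, recomputed every 10 seconds over the last 30
    val windowedSum = nums.reduceByWindow(_ + _, Seconds(30), Seconds(10))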
reduceId() - Method in class org.apache.spark.FetchFailed
 
reduceId() - Method in class org.apache.spark.storage.ShuffleBlockId
 
references() - Method in class org.apache.spark.sql.execution.SparkLogicalPlan
 
registerClasses(Kryo) - Method in interface org.apache.spark.serializer.KryoRegistrator
 
registerRDDAsTable(JavaSchemaRDD, String) - Method in class org.apache.spark.sql.api.java.JavaSQLContext
Registers the given RDD as a temporary table in the catalog.
registerRDDAsTable(SchemaRDD, String) - Method in class org.apache.spark.sql.SQLContext
Registers the given RDD as a temporary table in the catalog.
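A sketch of registering an RDD as a temporary table and querying it (assuming a SparkContext sc; the case class must be defined at top level so a schema can be inferred, and the table name "people" is illustrative):

    import org.apache.spark.sql.SQLContext

    case class Person(name: String, age: Int)

    val sqlContext = new SQLContext(sc)
    import sqlContext.createSchemaRDD        // implicit RDD[Person] => SchemaRDD

    val people = sc.parallelize(Seq(Person("Alice", 30), Person("Bob", 12)))
    sqlContext.registerRDDAsTable(people, "people")
    val adults = sqlContext.sql("SELECT name FROM people WHERE age >= 18")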
registerTestTable(TestHiveContext.TestTable) - Method in class org.apache.spark.sql.hive.test.TestHiveContext
 
Regression() - Static method in class org.apache.spark.mllib.tree.configuration.Algo
 
RegressionModel - Interface in org.apache.spark.mllib.regression
 
relation() - Method in class org.apache.spark.sql.hive.execution.HiveTableScan
 
relation() - Method in class org.apache.spark.sql.parquet.InsertIntoParquetTable
 
relation() - Method in class org.apache.spark.sql.parquet.ParquetTableScan
 
remember(Duration) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
Set each DStream in this context to remember the RDDs it generated in the last given duration.
remember(Duration) - Method in class org.apache.spark.streaming.StreamingContext
Set each DStream in this context to remember the RDDs it generated in the last given duration.
rememberDuration() - Method in class org.apache.spark.streaming.dstream.DStream
 
remove(String) - Method in class org.apache.spark.SparkConf
Remove a parameter from the configuration
repartition(int) - Method in class org.apache.spark.api.java.JavaDoubleRDD
Return a new RDD that has exactly numPartitions partitions.
repartition(int) - Method in class org.apache.spark.api.java.JavaPairRDD
Return a new RDD that has exactly numPartitions partitions.
repartition(int) - Method in class org.apache.spark.api.java.JavaRDD
Return a new RDD that has exactly numPartitions partitions.
repartition(int, Ordering<T>) - Method in class org.apache.spark.rdd.RDD
Return a new RDD that has exactly numPartitions partitions.
repartition(int) - Method in class org.apache.spark.sql.api.java.JavaSchemaRDD
Return a new RDD that has exactly numPartitions partitions.
repartition(int, Ordering<Row>) - Method in class org.apache.spark.sql.SchemaRDD
 
repartition(int) - Method in class org.apache.spark.streaming.api.java.JavaDStream
Return a new DStream with an increased or decreased level of parallelism.
repartition(int) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
Return a new DStream with an increased or decreased level of parallelism.
repartition(int) - Method in class org.apache.spark.streaming.dstream.DStream
Return a new DStream with an increased or decreased level of parallelism.
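repartition always shuffles; when only reducing the number of partitions, coalesce can avoid the shuffle. A sketch:

    val rdd = sc.parallelize(1 to 1000, 10)
    val wider = rdd.repartition(100)   // full shuffle into exactly 100 partitions
    val narrower = rdd.coalesce(2)     // shrink without a shuffle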
replication() - Method in class org.apache.spark.storage.StorageLevel
 
reportError(String, Throwable) - Method in class org.apache.spark.streaming.receiver.Receiver
Report exceptions in receiving data.
requiredChildDistribution() - Method in class org.apache.spark.sql.execution.Aggregate
 
requiredChildDistribution() - Method in class org.apache.spark.sql.execution.HashJoin
 
requiredChildDistribution() - Method in class org.apache.spark.sql.execution.LeftSemiJoinHash
 
requiredChildDistribution() - Method in class org.apache.spark.sql.execution.Sort
 
requiredChildDistribution() - Method in class org.apache.spark.sql.execution.SparkPlan
Specifies any partition requirements on the input data for this operator.
reset() - Method in class org.apache.spark.sql.hive.test.TestHiveContext
Resets the test instance by deleting any tables that have been created.
restart(String) - Method in class org.apache.spark.streaming.receiver.Receiver
Restart the receiver.
restart(String, Throwable) - Method in class org.apache.spark.streaming.receiver.Receiver
Restart the receiver.
restart(String, Throwable, int) - Method in class org.apache.spark.streaming.receiver.Receiver
Restart the receiver.
Resubmitted - Class in org.apache.spark
:: DeveloperApi :: A ShuffleMapTask that completed successfully earlier, but we lost the executor before the stage completed.
Resubmitted() - Constructor for class org.apache.spark.Resubmitted
 
result(Duration, CanAwait) - Method in class org.apache.spark.ComplexFutureAction
 
result(Duration, CanAwait) - Method in interface org.apache.spark.FutureAction
Awaits and returns the result (of type T) of this action.
result(Duration, CanAwait) - Method in class org.apache.spark.SimpleFutureAction
 
resultAttribute() - Method in class org.apache.spark.sql.execution.Aggregate.ComputedAggregate
 
resultSetToObjectArray(ResultSet) - Static method in class org.apache.spark.rdd.JdbcRDD
 
retainedStages() - Method in class org.apache.spark.ui.jobs.JobProgressListener
 
RidgeRegressionModel - Class in org.apache.spark.mllib.regression
Regression model trained using RidgeRegression.
RidgeRegressionWithSGD - Class in org.apache.spark.mllib.regression
Train a regression model with L2-regularization using Stochastic Gradient Descent.
RidgeRegressionWithSGD() - Constructor for class org.apache.spark.mllib.regression.RidgeRegressionWithSGD
Construct a RidgeRegression object with default parameters: {stepSize: 1.0, numIterations: 100, regParam: 1.0, miniBatchFraction: 1.0}.
right() - Method in class org.apache.spark.sql.execution.BroadcastNestedLoopJoin
The Broadcast relation
right() - Method in class org.apache.spark.sql.execution.CartesianProduct
 
right() - Method in class org.apache.spark.sql.execution.HashJoin
 
right() - Method in class org.apache.spark.sql.execution.LeftSemiJoinBNL
The Broadcast relation
right() - Method in class org.apache.spark.sql.execution.LeftSemiJoinHash
 
rightImpurity() - Method in class org.apache.spark.mllib.tree.model.InformationGainStats
 
rightKeys() - Method in class org.apache.spark.sql.execution.HashJoin
 
rightKeys() - Method in class org.apache.spark.sql.execution.LeftSemiJoinHash
 
rightNode() - Method in class org.apache.spark.mllib.tree.model.Node
 
rightOuterJoin(JavaPairRDD<K, W>, Partitioner) - Method in class org.apache.spark.api.java.JavaPairRDD
Perform a right outer join of this and other.
rightOuterJoin(JavaPairRDD<K, W>) - Method in class org.apache.spark.api.java.JavaPairRDD
Perform a right outer join of this and other.
rightOuterJoin(JavaPairRDD<K, W>, int) - Method in class org.apache.spark.api.java.JavaPairRDD
Perform a right outer join of this and other.
rightOuterJoin(RDD<Tuple2<K, W>>, Partitioner) - Method in class org.apache.spark.rdd.PairRDDFunctions
Perform a right outer join of this and other.
rightOuterJoin(RDD<Tuple2<K, W>>) - Method in class org.apache.spark.rdd.PairRDDFunctions
Perform a right outer join of this and other.
rightOuterJoin(RDD<Tuple2<K, W>>, int) - Method in class org.apache.spark.rdd.PairRDDFunctions
Perform a right outer join of this and other.
rightOuterJoin(JavaPairDStream<K, W>) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
Return a new DStream by applying 'right outer join' between RDDs of this DStream and other DStream.
rightOuterJoin(JavaPairDStream<K, W>, int) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
Return a new DStream by applying 'right outer join' between RDDs of this DStream and other DStream.
rightOuterJoin(JavaPairDStream<K, W>, Partitioner) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
Return a new DStream by applying 'right outer join' between RDDs of this DStream and other DStream.
rightOuterJoin(DStream<Tuple2<K, W>>, ClassTag<W>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
Return a new DStream by applying 'right outer join' between RDDs of this DStream and other DStream.
rightOuterJoin(DStream<Tuple2<K, W>>, int, ClassTag<W>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
Return a new DStream by applying 'right outer join' between RDDs of this DStream and other DStream.
rightOuterJoin(DStream<Tuple2<K, W>>, Partitioner, ClassTag<W>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
Return a new DStream by applying 'right outer join' between RDDs of this DStream and other DStream.
roc() - Method in class org.apache.spark.mllib.evaluation.BinaryClassificationMetrics
Returns the receiver operating characteristic (ROC) curve, which is an RDD of (false positive rate, true positive rate) with (0.0, 0.0) prepended and (1.0, 1.0) appended to it.
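A sketch (assuming an RDD of (score, label) pairs with labels in {0.0, 1.0}; the values are illustrative):

    import org.apache.spark.mllib.evaluation.BinaryClassificationMetrics

    val scoreAndLabels = sc.parallelize(Seq((0.9, 1.0), (0.7, 1.0), (0.2, 0.0)))
    val metrics = new BinaryClassificationMetrics(scoreAndLabels)
    metrics.roc().collect()    // (false positive rate, true positive rate) points
    metrics.areaUnderROC()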
Row - Class in org.apache.spark.sql.api.java
A result row from a SparkSQL query.
Row(Row) - Constructor for class org.apache.spark.sql.api.java.Row
 
row() - Method in class org.apache.spark.sql.api.java.Row
 
RowMatrix - Class in org.apache.spark.mllib.linalg.distributed
:: Experimental :: Represents a row-oriented distributed Matrix with no meaningful row indices.
RowMatrix(RDD<Vector>, long, int) - Constructor for class org.apache.spark.mllib.linalg.distributed.RowMatrix
 
RowMatrix(RDD<Vector>) - Constructor for class org.apache.spark.mllib.linalg.distributed.RowMatrix
Alternative constructor leaving matrix dimensions to be determined automatically.
rows() - Method in class org.apache.spark.mllib.linalg.distributed.IndexedRowMatrix
 
rows() - Method in class org.apache.spark.mllib.linalg.distributed.RowMatrix
 
run(Function0<T>, ExecutionContext) - Method in class org.apache.spark.ComplexFutureAction
Executes some action enclosed in the closure.
run(RDD<LabeledPoint>) - Method in class org.apache.spark.mllib.classification.NaiveBayes
Run the algorithm with the configured parameters on an input RDD of LabeledPoint entries.
run(RDD<Vector>) - Method in class org.apache.spark.mllib.clustering.KMeans
Train a K-means model on the given set of points; data should be cached for high performance, because this is an iterative algorithm.
run(RDD<Rating>) - Method in class org.apache.spark.mllib.recommendation.ALS
Run ALS with the configured parameters on an input RDD of (user, product, rating) triples.
run(RDD<LabeledPoint>) - Method in class org.apache.spark.mllib.regression.GeneralizedLinearAlgorithm
Run the algorithm with the configured parameters on an input RDD of LabeledPoint entries.
run(RDD<LabeledPoint>, Vector) - Method in class org.apache.spark.mllib.regression.GeneralizedLinearAlgorithm
Run the algorithm with the configured parameters on an input RDD of LabeledPoint entries starting from the initial weights provided.
runApproximateJob(RDD<T>, Function2<TaskContext, Iterator<T>, U>, <any>, long) - Method in class org.apache.spark.SparkContext
:: DeveloperApi :: Run a job that can return approximate results.
runJob(RDD<T>, Function1<Iterator<T>, U>, Seq<Object>, Function2<Object, U, BoxedUnit>, Function0<R>) - Method in class org.apache.spark.ComplexFutureAction
Runs a Spark job.
runJob(RDD<T>, Function2<TaskContext, Iterator<T>, U>, Seq<Object>, boolean, Function2<Object, U, BoxedUnit>, ClassTag<U>) - Method in class org.apache.spark.SparkContext
Run a function on a given set of partitions in an RDD and pass the results to the given handler function.
runJob(RDD<T>, Function2<TaskContext, Iterator<T>, U>, Seq<Object>, boolean, ClassTag<U>) - Method in class org.apache.spark.SparkContext
Run a function on a given set of partitions in an RDD and return the results as an array.
runJob(RDD<T>, Function1<Iterator<T>, U>, Seq<Object>, boolean, ClassTag<U>) - Method in class org.apache.spark.SparkContext
Run a job on a given set of partitions of an RDD, but take a function of type Iterator[T] => U instead of (TaskContext, Iterator[T]) => U.
runJob(RDD<T>, Function2<TaskContext, Iterator<T>, U>, ClassTag<U>) - Method in class org.apache.spark.SparkContext
Run a job on all partitions in an RDD and return the results in an array.
runJob(RDD<T>, Function1<Iterator<T>, U>, ClassTag<U>) - Method in class org.apache.spark.SparkContext
Run a job on all partitions in an RDD and return the results in an array.
runJob(RDD<T>, Function2<TaskContext, Iterator<T>, U>, Function2<Object, U, BoxedUnit>, ClassTag<U>) - Method in class org.apache.spark.SparkContext
Run a job on all partitions in an RDD and pass the results to a handler function.
runJob(RDD<T>, Function1<Iterator<T>, U>, Function2<Object, U, BoxedUnit>, ClassTag<U>) - Method in class org.apache.spark.SparkContext
Run a job on all partitions in an RDD and pass the results to a handler function.
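A sketch of the simplest runJob overload, which applies a function to every partition and returns one result per partition:

    val rdd = sc.parallelize(1 to 100, 4)
    val partitionSizes: Array[Int] = sc.runJob(rdd, (iter: Iterator[Int]) => iter.size)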
runLBFGS(RDD<Tuple2<Object, Vector>>, Gradient, Updater, int, double, int, double, Vector) - Static method in class org.apache.spark.mllib.optimization.LBFGS
Run Limited-memory BFGS (L-BFGS) in parallel.
runMiniBatchSGD(RDD<Tuple2<Object, Vector>>, Gradient, Updater, double, int, double, double, Vector) - Static method in class org.apache.spark.mllib.optimization.GradientDescent
Run stochastic gradient descent (SGD) in parallel using mini batches.
running() - Method in class org.apache.spark.scheduler.TaskInfo
 
runningLocally() - Method in class org.apache.spark.TaskContext
 
runSqlHive(String) - Method in class org.apache.spark.sql.hive.test.TestHiveContext
 

S

s() - Method in class org.apache.spark.mllib.linalg.SingularValueDecomposition
 
sample(boolean, Double) - Method in class org.apache.spark.api.java.JavaDoubleRDD
Return a sampled subset of this RDD.
sample(boolean, Double, long) - Method in class org.apache.spark.api.java.JavaDoubleRDD
Return a sampled subset of this RDD.
sample(boolean, double) - Method in class org.apache.spark.api.java.JavaPairRDD
Return a sampled subset of this RDD.
sample(boolean, double, long) - Method in class org.apache.spark.api.java.JavaPairRDD
Return a sampled subset of this RDD.
sample(boolean, double) - Method in class org.apache.spark.api.java.JavaRDD
Return a sampled subset of this RDD.
sample(boolean, double, long) - Method in class org.apache.spark.api.java.JavaRDD
Return a sampled subset of this RDD.
sample(boolean, double, long) - Method in class org.apache.spark.rdd.RDD
Return a sampled subset of this RDD.
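Note that the fraction argument is a per-element sampling probability, not an exact output size. A sketch:

    val rdd = sc.parallelize(1 to 10000)
    val sampled = rdd.sample(false, 0.01, 42L)   // without replacement, ~1% of elements, seed 42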
Sample - Class in org.apache.spark.sql.execution
:: DeveloperApi ::
Sample(double, boolean, long, SparkPlan) - Constructor for class org.apache.spark.sql.execution.Sample
 
sample(boolean, double, long) - Method in class org.apache.spark.sql.SchemaRDD
:: Experimental :: Returns a sampled version of the underlying dataset.
sample(Iterator<T>) - Method in class org.apache.spark.util.random.BernoulliSampler
 
sample(Iterator<T>) - Method in class org.apache.spark.util.random.PoissonSampler
 
sample(Iterator<T>) - Method in interface org.apache.spark.util.random.RandomSampler
Take a random sample.
sampleStdev() - Method in class org.apache.spark.api.java.JavaDoubleRDD
Compute the sample standard deviation of this RDD's elements (which corrects for bias in estimating the standard deviation by dividing by N-1 instead of N).
sampleStdev() - Method in class org.apache.spark.rdd.DoubleRDDFunctions
Compute the sample standard deviation of this RDD's elements (which corrects for bias in estimating the standard deviation by dividing by N-1 instead of N).
sampleStdev() - Method in class org.apache.spark.util.StatCounter
Return the sample standard deviation of the values, which corrects for bias in estimating the variance by dividing by N-1 instead of N.
sampleVariance() - Method in class org.apache.spark.api.java.JavaDoubleRDD
Compute the sample variance of this RDD's elements (which corrects for bias in estimating the variance by dividing by N-1 instead of N).
sampleVariance() - Method in class org.apache.spark.rdd.DoubleRDDFunctions
Compute the sample variance of this RDD's elements (which corrects for bias in estimating the variance by dividing by N-1 instead of N).
sampleVariance() - Method in class org.apache.spark.util.StatCounter
Return the sample variance, which corrects for bias in estimating the variance by dividing by N-1 instead of N.
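A sketch contrasting population and sample statistics (the implicit conversion to DoubleRDDFunctions comes from SparkContext._):

    import org.apache.spark.SparkContext._

    val nums = sc.parallelize(Seq(1.0, 2.0, 3.0, 4.0))
    nums.stdev()            // population standard deviation (divides by N)
    nums.sampleStdev()      // sample standard deviation (divides by N-1)
    nums.sampleVariance()   // sample variance (divides by N-1)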
saveAsHadoopDataset(JobConf) - Method in class org.apache.spark.api.java.JavaPairRDD
Output the RDD to any Hadoop-supported storage system, using a Hadoop JobConf object for that storage system.
saveAsHadoopDataset(JobConf) - Method in class org.apache.spark.rdd.PairRDDFunctions
Output the RDD to any Hadoop-supported storage system, using a Hadoop JobConf object for that storage system.
saveAsHadoopFile(String, Class<?>, Class<?>, Class<F>, JobConf) - Method in class org.apache.spark.api.java.JavaPairRDD
Output the RDD to any Hadoop-supported file system.
saveAsHadoopFile(String, Class<?>, Class<?>, Class<F>) - Method in class org.apache.spark.api.java.JavaPairRDD
Output the RDD to any Hadoop-supported file system.
saveAsHadoopFile(String, Class<?>, Class<?>, Class<F>, Class<? extends CompressionCodec>) - Method in class org.apache.spark.api.java.JavaPairRDD
Output the RDD to any Hadoop-supported file system, compressing with the supplied codec.
saveAsHadoopFile(String, ClassTag<F>) - Method in class org.apache.spark.rdd.PairRDDFunctions
Output the RDD to any Hadoop-supported file system, using a Hadoop OutputFormat class supporting the key and value types K and V in this RDD.
saveAsHadoopFile(String, Class<? extends CompressionCodec>, ClassTag<F>) - Method in class org.apache.spark.rdd.PairRDDFunctions
Output the RDD to any Hadoop-supported file system, using a Hadoop OutputFormat class supporting the key and value types K and V in this RDD.
saveAsHadoopFile(String, Class<?>, Class<?>, Class<? extends OutputFormat<?, ?>>, Class<? extends CompressionCodec>) - Method in class org.apache.spark.rdd.PairRDDFunctions
Output the RDD to any Hadoop-supported file system, using a Hadoop OutputFormat class supporting the key and value types K and V in this RDD.
saveAsHadoopFile(String, Class<?>, Class<?>, Class<? extends OutputFormat<?, ?>>, JobConf, Option<Class<? extends CompressionCodec>>) - Method in class org.apache.spark.rdd.PairRDDFunctions
Output the RDD to any Hadoop-supported file system, using a Hadoop OutputFormat class supporting the key and value types K and V in this RDD.
saveAsHadoopFiles(String, String) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
Save each RDD in this DStream as a Hadoop file.
saveAsHadoopFiles(String, String, Class<?>, Class<?>, Class<? extends OutputFormat<?, ?>>) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
Save each RDD in this DStream as a Hadoop file.
saveAsHadoopFiles(String, String, Class<?>, Class<?>, Class<? extends OutputFormat<?, ?>>, JobConf) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
Save each RDD in this DStream as a Hadoop file.
saveAsHadoopFiles(String, String, ClassTag<F>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
Save each RDD in this DStream as a Hadoop file.
saveAsHadoopFiles(String, String, Class<?>, Class<?>, Class<? extends OutputFormat<?, ?>>, JobConf) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
Save each RDD in this DStream as a Hadoop file.
saveAsHiveFile(RDD<Writable>, Class<?>, FileSinkDesc, JobConf, boolean) - Method in class org.apache.spark.sql.hive.execution.InsertIntoHiveTable
 
saveAsLibSVMFile(RDD<LabeledPoint>, String) - Static method in class org.apache.spark.mllib.util.MLUtils
Save labeled data in LIBSVM format.
saveAsNewAPIHadoopDataset(Configuration) - Method in class org.apache.spark.api.java.JavaPairRDD
Output the RDD to any Hadoop-supported storage system, using a Configuration object for that storage system.
saveAsNewAPIHadoopDataset(Configuration) - Method in class org.apache.spark.rdd.PairRDDFunctions
Output the RDD to any Hadoop-supported storage system with new Hadoop API, using a Hadoop Configuration object for that storage system.
saveAsNewAPIHadoopFile(String, Class<?>, Class<?>, Class<F>, Configuration) - Method in class org.apache.spark.api.java.JavaPairRDD
Output the RDD to any Hadoop-supported file system.
saveAsNewAPIHadoopFile(String, Class<?>, Class<?>, Class<F>) - Method in class org.apache.spark.api.java.JavaPairRDD
Output the RDD to any Hadoop-supported file system.
saveAsNewAPIHadoopFile(String, ClassTag<F>) - Method in class org.apache.spark.rdd.PairRDDFunctions
Output the RDD to any Hadoop-supported file system, using a new Hadoop API OutputFormat (mapreduce.OutputFormat) object supporting the key and value types K and V in this RDD.
saveAsNewAPIHadoopFile(String, Class<?>, Class<?>, Class<? extends OutputFormat<?, ?>>, Configuration) - Method in class org.apache.spark.rdd.PairRDDFunctions
Output the RDD to any Hadoop-supported file system, using a new Hadoop API OutputFormat (mapreduce.OutputFormat) object supporting the key and value types K and V in this RDD.
saveAsNewAPIHadoopFiles(String, String) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
Save each RDD in this DStream as a Hadoop file.
saveAsNewAPIHadoopFiles(String, String, Class<?>, Class<?>, Class<? extends OutputFormat<?, ?>>) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
Save each RDD in this DStream as a Hadoop file.
saveAsNewAPIHadoopFiles(String, String, Class<?>, Class<?>, Class<? extends OutputFormat<?, ?>>, Configuration) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
Save each RDD in this DStream as a Hadoop file.
saveAsNewAPIHadoopFiles(String, String, ClassTag<F>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
Save each RDD in this DStream as a Hadoop file.
saveAsNewAPIHadoopFiles(String, String, Class<?>, Class<?>, Class<? extends OutputFormat<?, ?>>, Configuration) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
Save each RDD in this DStream as a Hadoop file.
saveAsObjectFile(String) - Method in interface org.apache.spark.api.java.JavaRDDLike
Save this RDD as a SequenceFile of serialized objects.
saveAsObjectFile(String) - Method in class org.apache.spark.rdd.RDD
Save this RDD as a SequenceFile of serialized objects.
saveAsObjectFiles(String, String) - Method in class org.apache.spark.streaming.dstream.DStream
Save each RDD in this DStream as a SequenceFile of serialized objects.
saveAsSequenceFile(String, Option<Class<? extends CompressionCodec>>) - Method in class org.apache.spark.rdd.SequenceFileRDDFunctions
Output the RDD as a Hadoop SequenceFile using the Writable types we infer from the RDD's key and value types.
saveAsTextFile(String) - Method in interface org.apache.spark.api.java.JavaRDDLike
Save this RDD as a text file, using string representations of elements.
saveAsTextFile(String, Class<? extends CompressionCodec>) - Method in interface org.apache.spark.api.java.JavaRDDLike
Save this RDD as a compressed text file, using string representations of elements.
saveAsTextFile(String) - Method in class org.apache.spark.rdd.RDD
Save this RDD as a text file, using string representations of elements.
saveAsTextFile(String, Class<? extends CompressionCodec>) - Method in class org.apache.spark.rdd.RDD
Save this RDD as a compressed text file, using string representations of elements.
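A sketch of plain and compressed text output (the paths and codec choice are illustrative):

    import org.apache.hadoop.io.compress.GzipCodec

    val lines = sc.parallelize(Seq("alpha", "beta"))
    lines.saveAsTextFile("hdfs:///tmp/out")                        // one part-* file per partition
    lines.saveAsTextFile("hdfs:///tmp/out-gz", classOf[GzipCodec]) // gzip-compressed parts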
saveAsTextFiles(String, String) - Method in class org.apache.spark.streaming.dstream.DStream
Save each RDD in this DStream as a text file, using string representations of elements.
saveLabeledData(RDD<LabeledPoint>, String) - Static method in class org.apache.spark.mllib.util.MLUtils
:: Experimental :: Save labeled data to a file.
sc() - Method in class org.apache.spark.api.java.JavaSparkContext
 
sc() - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
 
sc() - Method in class org.apache.spark.streaming.StreamingContext
 
scalaIntToJavaLong(DStream<Object>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
 
scalaToJavaLong(JavaPairDStream<K, Object>, ClassTag<K>) - Static method in class org.apache.spark.streaming.api.java.JavaPairDStream
 
scheduler() - Method in class org.apache.spark.streaming.StreamingContext
 
schedulingDelay() - Method in class org.apache.spark.streaming.scheduler.BatchInfo
Time taken for the first job of this batch to start processing from the time this batch was submitted to the streaming scheduler.
SchedulingMode - Class in org.apache.spark.scheduler
"FAIR" and "FIFO" determines which policy is used to order tasks amongst a Schedulable's sub-queues "NONE" is used when the a Schedulable has no sub-queues.
SchedulingMode() - Constructor for class org.apache.spark.scheduler.SchedulingMode
 
schedulingMode() - Method in class org.apache.spark.ui.jobs.JobProgressListener
 
SchemaRDD - Class in org.apache.spark.sql
:: AlphaComponent :: An RDD of Row objects that has an associated schema.
SchemaRDD(SQLContext, LogicalPlan) - Constructor for class org.apache.spark.sql.SchemaRDD
 
script() - Method in class org.apache.spark.sql.hive.execution.ScriptTransformation
 
ScriptTransformation - Class in org.apache.spark.sql.hive.execution
:: DeveloperApi :: Transforms the input by forking and running the specified script.
ScriptTransformation(Seq<Expression>, String, Seq<Attribute>, SparkPlan, HiveContext) - Constructor for class org.apache.spark.sql.hive.execution.ScriptTransformation
 
seconds() - Static method in class org.apache.spark.scheduler.StatsReportListener
 
Seconds - Class in org.apache.spark.streaming
Helper object that creates instances of Duration representing a given number of seconds.
Seconds() - Constructor for class org.apache.spark.streaming.Seconds
 
securityManager() - Method in class org.apache.spark.SparkEnv
 
seed() - Method in class org.apache.spark.sql.execution.Sample
 
select(Seq<Expression>) - Method in class org.apache.spark.sql.SchemaRDD
Changes the output of this relation to the given expressions, similar to the SELECT clause in SQL.
sequenceFile(String, Class<K>, Class<V>, int) - Method in class org.apache.spark.api.java.JavaSparkContext
Get an RDD for a Hadoop SequenceFile with given key and value types.
sequenceFile(String, Class<K>, Class<V>) - Method in class org.apache.spark.api.java.JavaSparkContext
Get an RDD for a Hadoop SequenceFile.
sequenceFile(String, Class<K>, Class<V>, int) - Method in class org.apache.spark.SparkContext
Get an RDD for a Hadoop SequenceFile with given key and value types.
sequenceFile(String, Class<K>, Class<V>) - Method in class org.apache.spark.SparkContext
Get an RDD for a Hadoop SequenceFile with given key and value types.
sequenceFile(String, int, ClassTag<K>, ClassTag<V>, Function0<WritableConverter<K>>, Function0<WritableConverter<V>>) - Method in class org.apache.spark.SparkContext
Version of sequenceFile() for types implicitly convertible to Writables through a WritableConverter.
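A sketch of reading a SequenceFile of (Text, IntWritable) records (the path is illustrative; Hadoop reuses Writable objects, so converting them before caching is usually advisable):

    import org.apache.hadoop.io.{IntWritable, Text}

    val records = sc.sequenceFile("hdfs:///data/input.seq", classOf[Text], classOf[IntWritable])
      .map { case (k, v) => (k.toString, v.get) }   // copy out of the reused Writables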
SequenceFileRDDFunctions<K,V> - Class in org.apache.spark.rdd
Extra functions available on RDDs of (key, value) pairs to create a Hadoop SequenceFile, through an implicit conversion.
SequenceFileRDDFunctions(RDD<Tuple2<K, V>>, Function1<K, Writable>, ClassTag<K>, Function1<V, Writable>, ClassTag<V>) - Constructor for class org.apache.spark.rdd.SequenceFileRDDFunctions
 
SerializableWritable<T extends org.apache.hadoop.io.Writable> - Class in org.apache.spark
 
SerializableWritable(T) - Constructor for class org.apache.spark.SerializableWritable
 
SerializationStream - Interface in org.apache.spark.serializer
:: DeveloperApi :: A stream for writing serialized objects.
serialize(T, ClassTag<T>) - Method in interface org.apache.spark.serializer.SerializerInstance
 
serializedSize() - Method in class org.apache.spark.scheduler.TaskInfo
 
serializeFilterExpressions(Seq<Expression>, Configuration) - Static method in class org.apache.spark.sql.parquet.ParquetFilters
Note: Inside the Hadoop API we only have access to Configuration, not to SparkContext, so we cannot use broadcasts to convey the actual filter predicate.
serializeMany(Iterator<T>, ClassTag<T>) - Method in interface org.apache.spark.serializer.SerializerInstance
 
Serializer - Interface in org.apache.spark.serializer
:: DeveloperApi :: A serializer.
serializer() - Method in class org.apache.spark.ShuffleDependency
 
serializer() - Method in class org.apache.spark.SparkEnv
 
SerializerInstance - Interface in org.apache.spark.serializer
:: DeveloperApi :: An instance of a serializer, for use by one thread at a time.
serializeStream(OutputStream) - Method in interface org.apache.spark.serializer.SerializerInstance
 
set(String, String) - Method in class org.apache.spark.SparkConf
Set a configuration variable.
set(SparkEnv) - Static method in class org.apache.spark.SparkEnv
 
set(String, String) - Method in class org.apache.spark.sql.hive.HiveContext
 
set(Properties) - Method in interface org.apache.spark.sql.SQLConf
 
set(String, String) - Method in interface org.apache.spark.sql.SQLConf
 
setAll(Traversable<Tuple2<String, String>>) - Method in class org.apache.spark.SparkConf
Set multiple parameters together
setAlpha(double) - Method in class org.apache.spark.mllib.recommendation.ALS
:: Experimental :: Sets the constant used in computing confidence in implicit ALS.
setAppName(String) - Method in class org.apache.spark.SparkConf
Set a name for your application.
setBlocks(int) - Method in class org.apache.spark.mllib.recommendation.ALS
Set the number of blocks to parallelize the computation into; pass -1 for an auto-configured number of blocks.
setCallSite(String) - Method in class org.apache.spark.api.java.JavaSparkContext
Pass-through to SparkContext.setCallSite.
setCallSite(String) - Method in class org.apache.spark.SparkContext
Support function for API backtraces.
setCheckpointDir(String) - Method in class org.apache.spark.api.java.JavaSparkContext
Set the directory under which RDDs are going to be checkpointed.
setCheckpointDir(String) - Method in class org.apache.spark.SparkContext
Set the directory under which RDDs are going to be checkpointed.
SetCommand - Class in org.apache.spark.sql.execution
:: DeveloperApi ::
SetCommand(Option<String>, Option<String>, Seq<Attribute>, SQLContext) - Constructor for class org.apache.spark.sql.execution.SetCommand
 
setConvergenceTol(double) - Method in class org.apache.spark.mllib.optimization.LBFGS
Set the convergence tolerance of iterations for L-BFGS.
setEpsilon(double) - Method in class org.apache.spark.mllib.clustering.KMeans
Set the distance threshold within which we consider centers to have converged.
setExecutorEnv(String, String) - Method in class org.apache.spark.SparkConf
Set an environment variable to be used when launching executors for this application.
setExecutorEnv(Seq<Tuple2<String, String>>) - Method in class org.apache.spark.SparkConf
Set multiple environment variables to be used when launching executors.
setExecutorEnv(Tuple2<String, String>[]) - Method in class org.apache.spark.SparkConf
Set multiple environment variables to be used when launching executors.
setGradient(Gradient) - Method in class org.apache.spark.mllib.optimization.GradientDescent
Set the gradient function (of the loss function of one single data example) to be used for SGD.
setGradient(Gradient) - Method in class org.apache.spark.mllib.optimization.LBFGS
Set the gradient function (of the loss function of one single data example) to be used for L-BFGS.
setIfMissing(String, String) - Method in class org.apache.spark.SparkConf
Set a parameter if it isn't already configured
setImplicitPrefs(boolean) - Method in class org.apache.spark.mllib.recommendation.ALS
Sets whether to use implicit preference.
setInitializationMode(String) - Method in class org.apache.spark.mllib.clustering.KMeans
Set the initialization algorithm.
setInitializationSteps(int) - Method in class org.apache.spark.mllib.clustering.KMeans
Set the number of steps for the k-means|| initialization mode.
setIntercept(boolean) - Method in class org.apache.spark.mllib.regression.GeneralizedLinearAlgorithm
Set if the algorithm should add an intercept.
setIterations(int) - Method in class org.apache.spark.mllib.recommendation.ALS
Set the number of iterations to run.
setJars(Seq<String>) - Method in class org.apache.spark.SparkConf
Set JAR files to distribute to the cluster.
setJars(String[]) - Method in class org.apache.spark.SparkConf
Set JAR files to distribute to the cluster.
setJobDescription(String) - Method in class org.apache.spark.SparkContext
Set a human readable description of the current job.
setJobGroup(String, String, boolean) - Method in class org.apache.spark.api.java.JavaSparkContext
Assigns a group ID to all the jobs started by this thread until the group ID is set to a different value or cleared.
setJobGroup(String, String) - Method in class org.apache.spark.api.java.JavaSparkContext
Assigns a group ID to all the jobs started by this thread until the group ID is set to a different value or cleared.
setJobGroup(String, String, boolean) - Method in class org.apache.spark.SparkContext
Assigns a group ID to all the jobs started by this thread until the group ID is set to a different value or cleared.
setK(int) - Method in class org.apache.spark.mllib.clustering.KMeans
Set the number of clusters to create (k).
setLambda(double) - Method in class org.apache.spark.mllib.classification.NaiveBayes
Set the smoothing parameter.
setLambda(double) - Method in class org.apache.spark.mllib.recommendation.ALS
Set the regularization parameter, lambda.
setLocalProperty(String, String) - Method in class org.apache.spark.api.java.JavaSparkContext
Set a local property that affects jobs submitted from this thread, such as the Spark fair scheduler pool.
setLocalProperty(String, String) - Method in class org.apache.spark.SparkContext
Set a local property that affects jobs submitted from this thread, such as the Spark fair scheduler pool.
setMaster(String) - Method in class org.apache.spark.SparkConf
The master URL to connect to, such as "local" to run locally with one thread, "local[4]" to run locally with 4 cores, or "spark://master:7077" to run on a Spark standalone cluster.
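The SparkConf setters return the conf itself, so they chain. A minimal sketch (the application name, master URL, and memory setting are illustrative):

    import org.apache.spark.{SparkConf, SparkContext}

    val conf = new SparkConf()
      .setAppName("MyApp")
      .setMaster("local[4]")                 // or e.g. "spark://master:7077"
      .set("spark.executor.memory", "2g")
    val sc = new SparkContext(conf)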
setMaxIterations(int) - Method in class org.apache.spark.mllib.clustering.KMeans
Set maximum number of iterations to run.
setMaxNumIterations(int) - Method in class org.apache.spark.mllib.optimization.LBFGS
Set the maximal number of iterations for L-BFGS.
setMiniBatchFraction(double) - Method in class org.apache.spark.mllib.optimization.GradientDescent
:: Experimental :: Set fraction of data to be used for each SGD iteration.
setName(String) - Method in class org.apache.spark.api.java.JavaDoubleRDD
Assign a name to this RDD
setName(String) - Method in class org.apache.spark.api.java.JavaPairRDD
Assign a name to this RDD
setName(String) - Method in class org.apache.spark.api.java.JavaRDD
Assign a name to this RDD
setName(String) - Method in class org.apache.spark.rdd.RDD
Assign a name to this RDD
setName(String) - Method in class org.apache.spark.sql.api.java.JavaSchemaRDD
Assign a name to this RDD
setNumCorrections(int) - Method in class org.apache.spark.mllib.optimization.LBFGS
Set the number of corrections used in the LBFGS update.
setNumIterations(int) - Method in class org.apache.spark.mllib.optimization.GradientDescent
Set the number of iterations for SGD.
setRank(int) - Method in class org.apache.spark.mllib.recommendation.ALS
Set the rank of the feature matrices computed (number of features).
setRegParam(double) - Method in class org.apache.spark.mllib.optimization.GradientDescent
Set the regularization parameter.
setRegParam(double) - Method in class org.apache.spark.mllib.optimization.LBFGS
Set the regularization parameter.
setRuns(int) - Method in class org.apache.spark.mllib.clustering.KMeans
:: Experimental :: Set the number of runs of the algorithm to execute in parallel.
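The KMeans setters chain in the same way. A sketch with toy data (assuming a SparkContext sc; the points are illustrative):

    import org.apache.spark.mllib.clustering.KMeans
    import org.apache.spark.mllib.linalg.Vectors

    val points = sc.parallelize(Seq(
      Vectors.dense(0.0, 0.0), Vectors.dense(0.1, 0.1),
      Vectors.dense(9.0, 9.0), Vectors.dense(9.1, 9.1))).cache()   // iterative, so cache

    val model = new KMeans()
      .setK(2)
      .setMaxIterations(20)
      .setInitializationMode(KMeans.K_MEANS_PARALLEL)
      .run(points)
    model.clusterCenters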
setSeed(long) - Method in class org.apache.spark.mllib.recommendation.ALS
Sets a random seed to have deterministic results.
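Likewise for the ALS setters (the ratings are illustrative):

    import org.apache.spark.mllib.recommendation.{ALS, Rating}

    val ratings = sc.parallelize(Seq(Rating(1, 10, 4.0), Rating(2, 10, 5.0), Rating(1, 20, 1.0)))
    val model = new ALS()
      .setRank(10)
      .setIterations(10)
      .setLambda(0.01)
      .setSeed(42L)
      .run(ratings)
    model.predict(1, 20)   // predicted rating of product 20 by user 1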
setSeed(long) - Method in class org.apache.spark.util.random.BernoulliSampler
 
setSeed(long) - Method in class org.apache.spark.util.random.PoissonSampler
 
setSeed(long) - Method in interface org.apache.spark.util.random.Pseudorandom
Set random seed.
setSerializer(Serializer) - Method in class org.apache.spark.rdd.CoGroupedRDD
 
setSerializer(Serializer) - Method in class org.apache.spark.rdd.ShuffledRDD
 
setSparkHome(String) - Method in class org.apache.spark.SparkConf
Set the location where Spark is installed on worker nodes.
setStepSize(double) - Method in class org.apache.spark.mllib.optimization.GradientDescent
Set the initial step size of SGD for the first step.
setThreshold(double) - Method in class org.apache.spark.mllib.classification.LogisticRegressionModel
:: Experimental :: Sets the threshold that separates positive predictions from negative predictions.
setThreshold(double) - Method in class org.apache.spark.mllib.classification.SVMModel
:: Experimental :: Sets the threshold that separates positive predictions from negative predictions.
settings() - Method in interface org.apache.spark.sql.SQLConf
 
setUpdater(Updater) - Method in class org.apache.spark.mllib.optimization.GradientDescent
Set the updater function to actually perform a gradient step in a given direction.
setUpdater(Updater) - Method in class org.apache.spark.mllib.optimization.LBFGS
Set the updater function to actually perform a gradient step in a given direction.
setValidateData(boolean) - Method in class org.apache.spark.mllib.regression.GeneralizedLinearAlgorithm
Set if the algorithm should validate data before training.
setValue(R) - Method in class org.apache.spark.Accumulable
Set the accumulator's value; only allowed on master
showBytesDistribution(String, Function2<TaskInfo, TaskMetrics, Option<Object>>, Seq<Tuple2<TaskInfo, TaskMetrics>>) - Static method in class org.apache.spark.scheduler.StatsReportListener
 
showBytesDistribution(String, Option<org.apache.spark.util.Distribution>) - Static method in class org.apache.spark.scheduler.StatsReportListener
 
showBytesDistribution(String, org.apache.spark.util.Distribution) - Static method in class org.apache.spark.scheduler.StatsReportListener
 
showDistribution(String, org.apache.spark.util.Distribution, Function1<Object, String>) - Static method in class org.apache.spark.scheduler.StatsReportListener
 
showDistribution(String, Option<org.apache.spark.util.Distribution>, Function1<Object, String>) - Static method in class org.apache.spark.scheduler.StatsReportListener
 
showDistribution(String, Option<org.apache.spark.util.Distribution>, String) - Static method in class org.apache.spark.scheduler.StatsReportListener
 
showDistribution(String, String, Function2<TaskInfo, TaskMetrics, Option<Object>>, Seq<Tuple2<TaskInfo, TaskMetrics>>) - Static method in class org.apache.spark.scheduler.StatsReportListener
 
showMillisDistribution(String, Option<org.apache.spark.util.Distribution>) - Static method in class org.apache.spark.scheduler.StatsReportListener
 
showMillisDistribution(String, Function2<TaskInfo, TaskMetrics, Option<Object>>, Seq<Tuple2<TaskInfo, TaskMetrics>>) - Static method in class org.apache.spark.scheduler.StatsReportListener
 
showMillisDistribution(String, Function1<BatchInfo, Option<Object>>) - Method in class org.apache.spark.streaming.scheduler.StatsReportListener
 
SHUFFLE() - Static method in class org.apache.spark.storage.BlockId
 
ShuffleBlockId - Class in org.apache.spark.storage
 
ShuffleBlockId(int, int, int) - Constructor for class org.apache.spark.storage.ShuffleBlockId
 
ShuffleDependency<K,V> - Class in org.apache.spark
:: DeveloperApi :: Represents a dependency on the output of a shuffle stage.
ShuffleDependency(RDD<? extends Product2<K, V>>, Partitioner, Serializer) - Constructor for class org.apache.spark.ShuffleDependency
 
ShuffledRDD<K,V,P extends scala.Product2<K,V>> - Class in org.apache.spark.rdd
:: DeveloperApi :: The resulting RDD from a shuffle (e.g.
ShuffledRDD(RDD<P>, Partitioner, ClassTag<P>) - Constructor for class org.apache.spark.rdd.ShuffledRDD
 
shuffleFetcher() - Method in class org.apache.spark.SparkEnv
 
shuffleId() - Method in class org.apache.spark.FetchFailed
 
shuffleId() - Method in class org.apache.spark.ShuffleDependency
 
shuffleId() - Method in class org.apache.spark.storage.ShuffleBlockId
 
shuffleMemoryMap() - Method in class org.apache.spark.SparkEnv
 
shuffleRead() - Method in class org.apache.spark.ui.jobs.ExecutorSummary
 
shuffleWrite() - Method in class org.apache.spark.ui.jobs.ExecutorSummary
 
sideEffectResult() - Method in interface org.apache.spark.sql.execution.Command
A concrete command should override this lazy field to wrap up any side effects caused by the command or any other computation that should be evaluated exactly once.
SimpleFutureAction<T> - Class in org.apache.spark
:: Experimental :: A FutureAction holding the result of an action that triggers a single job.
SimpleUpdater - Class in org.apache.spark.mllib.optimization
:: DeveloperApi :: A simple updater for gradient descent *without* any regularization.
SimpleUpdater() - Constructor for class org.apache.spark.mllib.optimization.SimpleUpdater
 
SingularValueDecomposition<UType,VType> - Class in org.apache.spark.mllib.linalg
:: Experimental :: Represents singular value decomposition (SVD) factors.
SingularValueDecomposition(UType, Vector, VType) - Constructor for class org.apache.spark.mllib.linalg.SingularValueDecomposition
 
size() - Method in class org.apache.spark.mllib.linalg.DenseVector
 
size() - Method in class org.apache.spark.mllib.linalg.SparseVector
 
size() - Method in interface org.apache.spark.mllib.linalg.Vector
Size of the vector.
slice(Time, Time) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
Return all the RDDs between 'fromDuration' and 'toDuration' (both included)
slice(org.apache.spark.streaming.Interval) - Method in class org.apache.spark.streaming.dstream.DStream
Return all the RDDs defined by the Interval object (both end times included)
slice(Time, Time) - Method in class org.apache.spark.streaming.dstream.DStream
Return all the RDDs between 'fromTime' and 'toTime' (both included)
slideDuration() - Method in class org.apache.spark.streaming.dstream.DStream
Time interval after which the DStream generates an RDD
slideDuration() - Method in class org.apache.spark.streaming.dstream.InputDStream
 
SnappyCompressionCodec - Class in org.apache.spark.io
:: DeveloperApi :: Snappy implementation of CompressionCodec.
SnappyCompressionCodec(SparkConf) - Constructor for class org.apache.spark.io.SnappyCompressionCodec
 
socketStream(String, int, Function<InputStream, Iterable<T>>, StorageLevel) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
Create an input stream from network source hostname:port.
socketStream(String, int, Function1<InputStream, Iterator<T>>, StorageLevel, ClassTag<T>) - Method in class org.apache.spark.streaming.StreamingContext
Create an input stream from a TCP source hostname:port.
socketTextStream(String, int, StorageLevel) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
Create an input stream from network source hostname:port.
socketTextStream(String, int) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
Create an input stream from network source hostname:port.
socketTextStream(String, int, StorageLevel) - Method in class org.apache.spark.streaming.StreamingContext
Create an input stream from a TCP source hostname:port.
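A sketch wiring a socket text stream into a StreamingContext (assuming a SparkContext sc; the host, port, and batch interval are illustrative):

    import org.apache.spark.storage.StorageLevel
    import org.apache.spark.streaming.{Seconds, StreamingContext}

    val ssc = new StreamingContext(sc, Seconds(1))
    val lines = ssc.socketTextStream("localhost", 9999, StorageLevel.MEMORY_AND_DISK_SER)
    lines.print()
    ssc.start()
    ssc.awaitTermination()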
Sort() - Static method in class org.apache.spark.mllib.tree.configuration.QuantileStrategy
 
Sort - Class in org.apache.spark.sql.execution
:: DeveloperApi ::
Sort(Seq<SortOrder>, boolean, SparkPlan) - Constructor for class org.apache.spark.sql.execution.Sort
 
sortByKey() - Method in class org.apache.spark.api.java.JavaPairRDD
Sort the RDD by key, so that each partition contains a sorted range of the elements in ascending order.
sortByKey(boolean) - Method in class org.apache.spark.api.java.JavaPairRDD
Sort the RDD by key, so that each partition contains a sorted range of the elements.
sortByKey(Comparator<K>) - Method in class org.apache.spark.api.java.JavaPairRDD
Sort the RDD by key, so that each partition contains a sorted range of the elements.
sortByKey(Comparator<K>, boolean) - Method in class org.apache.spark.api.java.JavaPairRDD
Sort the RDD by key, so that each partition contains a sorted range of the elements.
sortByKey(Comparator<K>, boolean, int) - Method in class org.apache.spark.api.java.JavaPairRDD
Sort the RDD by key, so that each partition contains a sorted range of the elements.
sortByKey(boolean, int) - Method in class org.apache.spark.rdd.OrderedRDDFunctions
Sort the RDD by key, so that each partition contains a sorted range of the elements.
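A sketch (the implicit conversion to OrderedRDDFunctions comes from SparkContext._):

    import org.apache.spark.SparkContext._

    val pairs = sc.parallelize(Seq((3, "c"), (1, "a"), (2, "b")))
    pairs.sortByKey().collect()        // Array((1,a), (2,b), (3,c))
    pairs.sortByKey(false).collect()   // descending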
sortOrder() - Method in class org.apache.spark.sql.execution.Sort
 
sortOrder() - Method in class org.apache.spark.sql.execution.TakeOrdered
 
SPARK_JOB_DESCRIPTION() - Static method in class org.apache.spark.SparkContext
 
SPARK_JOB_GROUP_ID() - Static method in class org.apache.spark.SparkContext
 
SPARK_JOB_INTERRUPT_ON_CANCEL() - Static method in class org.apache.spark.SparkContext
 
SPARK_UNKNOWN_USER() - Static method in class org.apache.spark.SparkContext
 
SPARK_VERSION() - Static method in class org.apache.spark.SparkContext
 
SparkConf - Class in org.apache.spark
Configuration for a Spark application.
SparkConf(boolean) - Constructor for class org.apache.spark.SparkConf
 
SparkConf() - Constructor for class org.apache.spark.SparkConf
Create a SparkConf that loads defaults from system properties and the classpath
sparkContext() - Method in class org.apache.spark.rdd.RDD
The SparkContext that created this RDD.
SparkContext - Class in org.apache.spark
Main entry point for Spark functionality.
SparkContext(SparkConf) - Constructor for class org.apache.spark.SparkContext
 
SparkContext() - Constructor for class org.apache.spark.SparkContext
Create a SparkContext that loads settings from system properties (for instance, when launching with ./bin/spark-submit).
SparkContext(SparkConf, Map<String, Set<SplitInfo>>) - Constructor for class org.apache.spark.SparkContext
:: DeveloperApi :: Alternative constructor for setting preferred locations where Spark will create executors.
SparkContext(String, String, SparkConf) - Constructor for class org.apache.spark.SparkContext
Alternative constructor that allows setting common Spark properties directly
SparkContext(String, String, String, Seq<String>, Map<String, String>, Map<String, Set<SplitInfo>>) - Constructor for class org.apache.spark.SparkContext
Alternative constructor that allows setting common Spark properties directly
sparkContext() - Method in class org.apache.spark.sql.SQLContext
 
sparkContext() - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
The underlying SparkContext
sparkContext() - Method in class org.apache.spark.streaming.StreamingContext
Return the associated Spark context
SparkContext.DoubleAccumulatorParam$ - Class in org.apache.spark
 
SparkContext.DoubleAccumulatorParam$() - Constructor for class org.apache.spark.SparkContext.DoubleAccumulatorParam$
 
SparkContext.FloatAccumulatorParam$ - Class in org.apache.spark
 
SparkContext.FloatAccumulatorParam$() - Constructor for class org.apache.spark.SparkContext.FloatAccumulatorParam$
 
SparkContext.IntAccumulatorParam$ - Class in org.apache.spark
 
SparkContext.IntAccumulatorParam$() - Constructor for class org.apache.spark.SparkContext.IntAccumulatorParam$
 
SparkContext.LongAccumulatorParam$ - Class in org.apache.spark
 
SparkContext.LongAccumulatorParam$() - Constructor for class org.apache.spark.SparkContext.LongAccumulatorParam$
 
SparkEnv - Class in org.apache.spark
:: DeveloperApi :: Holds all the runtime environment objects for a running Spark instance (either master or worker), including the serializer, Akka actor system, block manager, map output tracker, etc.
SparkEnv(String, ActorSystem, Serializer, Serializer, CacheManager, MapOutputTracker, ShuffleFetcher, org.apache.spark.broadcast.BroadcastManager, org.apache.spark.storage.BlockManager, ConnectionManager, SecurityManager, HttpFileServer, String, org.apache.spark.metrics.MetricsSystem, SparkConf) - Constructor for class org.apache.spark.SparkEnv
 
SparkException - Exception in org.apache.spark
 
SparkException(String, Throwable) - Constructor for exception org.apache.spark.SparkException
 
SparkException(String) - Constructor for exception org.apache.spark.SparkException
 
SparkFiles - Class in org.apache.spark
Resolves paths to files added through SparkContext.addFile().
SparkFiles() - Constructor for class org.apache.spark.SparkFiles
 
sparkFilesDir() - Method in class org.apache.spark.SparkEnv
 
SparkFlumeEvent - Class in org.apache.spark.streaming.flume
A wrapper class for AvroFlumeEvents with a custom serialization format.
SparkFlumeEvent() - Constructor for class org.apache.spark.streaming.flume.SparkFlumeEvent
 
SparkListener - Interface in org.apache.spark.scheduler
:: DeveloperApi :: Interface for listening to events from the Spark scheduler.
SparkListenerApplicationEnd - Class in org.apache.spark.scheduler
 
SparkListenerApplicationEnd(long) - Constructor for class org.apache.spark.scheduler.SparkListenerApplicationEnd
 
SparkListenerApplicationStart - Class in org.apache.spark.scheduler
 
SparkListenerApplicationStart(String, long, String) - Constructor for class org.apache.spark.scheduler.SparkListenerApplicationStart
 
SparkListenerBlockManagerAdded - Class in org.apache.spark.scheduler
 
SparkListenerBlockManagerAdded(BlockManagerId, long) - Constructor for class org.apache.spark.scheduler.SparkListenerBlockManagerAdded
 
SparkListenerBlockManagerRemoved - Class in org.apache.spark.scheduler
 
SparkListenerBlockManagerRemoved(BlockManagerId) - Constructor for class org.apache.spark.scheduler.SparkListenerBlockManagerRemoved
 
SparkListenerEnvironmentUpdate - Class in org.apache.spark.scheduler
 
SparkListenerEnvironmentUpdate(Map<String, Seq<Tuple2<String, String>>>) - Constructor for class org.apache.spark.scheduler.SparkListenerEnvironmentUpdate
 
SparkListenerEvent - Interface in org.apache.spark.scheduler
 
SparkListenerJobEnd - Class in org.apache.spark.scheduler
 
SparkListenerJobEnd(int, JobResult) - Constructor for class org.apache.spark.scheduler.SparkListenerJobEnd
 
SparkListenerJobStart - Class in org.apache.spark.scheduler
 
SparkListenerJobStart(int, Seq<Object>, Properties) - Constructor for class org.apache.spark.scheduler.SparkListenerJobStart
 
SparkListenerStageCompleted - Class in org.apache.spark.scheduler
 
SparkListenerStageCompleted(StageInfo) - Constructor for class org.apache.spark.scheduler.SparkListenerStageCompleted
 
SparkListenerStageSubmitted - Class in org.apache.spark.scheduler
 
SparkListenerStageSubmitted(StageInfo, Properties) - Constructor for class org.apache.spark.scheduler.SparkListenerStageSubmitted
 
SparkListenerTaskEnd - Class in org.apache.spark.scheduler
 
SparkListenerTaskEnd(int, String, TaskEndReason, TaskInfo, TaskMetrics) - Constructor for class org.apache.spark.scheduler.SparkListenerTaskEnd
 
SparkListenerTaskGettingResult - Class in org.apache.spark.scheduler
 
SparkListenerTaskGettingResult(TaskInfo) - Constructor for class org.apache.spark.scheduler.SparkListenerTaskGettingResult
 
SparkListenerTaskStart - Class in org.apache.spark.scheduler
 
SparkListenerTaskStart(int, TaskInfo) - Constructor for class org.apache.spark.scheduler.SparkListenerTaskStart
 
SparkListenerUnpersistRDD - Class in org.apache.spark.scheduler
 
SparkListenerUnpersistRDD(int) - Constructor for class org.apache.spark.scheduler.SparkListenerUnpersistRDD
 
SparkLogicalPlan - Class in org.apache.spark.sql.execution
:: DeveloperApi :: Allows already planned SparkQueries to be linked into logical query plans.
SparkLogicalPlan(SparkPlan) - Constructor for class org.apache.spark.sql.execution.SparkLogicalPlan
 
SparkPlan - Class in org.apache.spark.sql.execution
:: DeveloperApi ::
SparkPlan() - Constructor for class org.apache.spark.sql.execution.SparkPlan
 
sparkProperties() - Method in class org.apache.spark.ui.env.EnvironmentListener
 
sparkUser() - Method in class org.apache.spark.api.java.JavaSparkContext
 
sparkUser() - Method in class org.apache.spark.scheduler.SparkListenerApplicationStart
 
sparkUser() - Method in class org.apache.spark.SparkContext
 
sparse(int, int[], double[]) - Static method in class org.apache.spark.mllib.linalg.Vectors
Creates a sparse vector providing its index array and value array.
sparse(int, Seq<Tuple2<Object, Object>>) - Static method in class org.apache.spark.mllib.linalg.Vectors
Creates a sparse vector using unordered (index, value) pairs.
sparse(int, Iterable<Tuple2<Integer, Double>>) - Static method in class org.apache.spark.mllib.linalg.Vectors
Creates a sparse vector using unordered (index, value) pairs in a Java friendly way.
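A sketch: both calls below build the vector (0.0, 3.0, 0.0, 5.0):

    import org.apache.spark.mllib.linalg.Vectors

    val v1 = Vectors.sparse(4, Array(1, 3), Array(3.0, 5.0))   // index array + value array
    val v2 = Vectors.sparse(4, Seq((1, 3.0), (3, 5.0)))        // unordered (index, value) pairs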
SparseVector - Class in org.apache.spark.mllib.linalg
A sparse vector represented by an index array and a value array.
SparseVector(int, int[], double[]) - Constructor for class org.apache.spark.mllib.linalg.SparseVector
 
split() - Method in class org.apache.spark.mllib.tree.model.Node
 
Split - Class in org.apache.spark.mllib.tree.model
:: DeveloperApi :: Split applied to a feature
Split(int, double, Enumeration.Value, List<Object>) - Constructor for class org.apache.spark.mllib.tree.model.Split
 
splitId() - Method in class org.apache.spark.TaskContext
 
splitIndex() - Method in class org.apache.spark.storage.RDDBlockId
 
SplitInfo - Class in org.apache.spark.scheduler
 
SplitInfo(Class<?>, String, String, long, Object) - Constructor for class org.apache.spark.scheduler.SplitInfo
 
splits() - Method in interface org.apache.spark.api.java.JavaRDDLike
Set of partitions in this RDD.
sql(String) - Method in class org.apache.spark.sql.api.java.JavaSQLContext
Executes a query expressed in SQL, returning the result as a JavaSchemaRDD
sql() - Method in class org.apache.spark.sql.hive.execution.NativeCommand
 
sql(String) - Method in class org.apache.spark.sql.SQLContext
Executes a SQL query using Spark, returning the result as a SchemaRDD.
SQLConf - Interface in org.apache.spark.sql
SQLConf holds mutable config parameters and hints.
sqlContext() - Method in class org.apache.spark.sql.api.java.JavaSchemaRDD
 
sqlContext() - Method in class org.apache.spark.sql.api.java.JavaSQLContext
 
sqlContext() - Method in class org.apache.spark.sql.hive.api.java.JavaHiveContext
 
sqlContext() - Method in class org.apache.spark.sql.parquet.InsertIntoParquetTable
 
sqlContext() - Method in class org.apache.spark.sql.parquet.ParquetTableScan
 
sqlContext() - Method in class org.apache.spark.sql.SchemaRDD
 
SQLContext - Class in org.apache.spark.sql
:: AlphaComponent :: The entry point for running relational queries using Spark.
SQLContext(SparkContext) - Constructor for class org.apache.spark.sql.SQLContext
 
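A minimal usage sketch in Scala, assuming `sc` is an existing SparkContext and that a table named "people" has already been registered:

    import org.apache.spark.sql.SQLContext

    val sqlContext = new SQLContext(sc)   // sc: an existing SparkContext
    // "people" is a hypothetical, previously registered table
    val teenagers = sqlContext.sql("SELECT name FROM people WHERE age >= 13 AND age <= 19")
    teenagers.collect().foreach(println)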
squaredDist(Vector) - Method in class org.apache.spark.util.Vector
 
SquaredL2Updater - Class in org.apache.spark.mllib.optimization
:: DeveloperApi :: Updater for L2 regularized problems.
SquaredL2Updater() - Constructor for class org.apache.spark.mllib.optimization.SquaredL2Updater
 
srdd() - Method in class org.apache.spark.api.java.JavaDoubleRDD
 
ssc() - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
 
ssc() - Method in class org.apache.spark.streaming.dstream.DStream
 
stackTrace() - Method in class org.apache.spark.ExceptionFailure
 
stageFailed(String) - Method in class org.apache.spark.scheduler.StageInfo
 
stageId() - Method in class org.apache.spark.scheduler.SparkListenerTaskEnd
 
stageId() - Method in class org.apache.spark.scheduler.SparkListenerTaskStart
 
stageId() - Method in class org.apache.spark.scheduler.StageInfo
 
stageId() - Method in class org.apache.spark.TaskContext
 
stageIds() - Method in class org.apache.spark.scheduler.SparkListenerJobStart
 
stageIdToDescription() - Method in class org.apache.spark.ui.jobs.JobProgressListener
 
stageIdToDiskBytesSpilled() - Method in class org.apache.spark.ui.jobs.JobProgressListener
 
stageIdToExecutorSummaries() - Method in class org.apache.spark.ui.jobs.JobProgressListener
 
stageIdToMemoryBytesSpilled() - Method in class org.apache.spark.ui.jobs.JobProgressListener
 
stageIdToPool() - Method in class org.apache.spark.ui.jobs.JobProgressListener
 
stageIdToShuffleRead() - Method in class org.apache.spark.ui.jobs.JobProgressListener
 
stageIdToShuffleWrite() - Method in class org.apache.spark.ui.jobs.JobProgressListener
 
stageIdToTaskData() - Method in class org.apache.spark.ui.jobs.JobProgressListener
 
stageIdToTasksActive() - Method in class org.apache.spark.ui.jobs.JobProgressListener
 
stageIdToTasksComplete() - Method in class org.apache.spark.ui.jobs.JobProgressListener
 
stageIdToTasksFailed() - Method in class org.apache.spark.ui.jobs.JobProgressListener
 
stageIdToTime() - Method in class org.apache.spark.ui.jobs.JobProgressListener
 
stageInfo() - Method in class org.apache.spark.scheduler.SparkListenerStageCompleted
 
stageInfo() - Method in class org.apache.spark.scheduler.SparkListenerStageSubmitted
 
StageInfo - Class in org.apache.spark.scheduler
:: DeveloperApi :: Stores information about a stage to pass from the scheduler to SparkListeners.
StageInfo(int, String, int, Seq<RDDInfo>) - Constructor for class org.apache.spark.scheduler.StageInfo
 
start() - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
Start the execution of the streams.
start() - Method in class org.apache.spark.streaming.dstream.ConstantInputDStream
 
start() - Method in class org.apache.spark.streaming.dstream.InputDStream
Method called to start receiving data.
start() - Method in class org.apache.spark.streaming.dstream.ReceiverInputDStream
 
start() - Method in class org.apache.spark.streaming.StreamingContext
Start the execution of the streams.
startTime() - Method in class org.apache.spark.api.java.JavaSparkContext
 
startTime() - Method in class org.apache.spark.SparkContext
 
StatCounter - Class in org.apache.spark.util
A class for tracking the statistics of a set of numbers (count, mean and variance) in a numerically robust way.
StatCounter(TraversableOnce<Object>) - Constructor for class org.apache.spark.util.StatCounter
 
StatCounter() - Constructor for class org.apache.spark.util.StatCounter
Initialize the StatCounter with no values.
state() - Method in class org.apache.spark.streaming.StreamingContext
 
Statistics - Class in org.apache.spark.streaming.receiver
:: DeveloperApi :: Statistics for querying the supervisor about the state of workers.
Statistics(int, int, int, String) - Constructor for class org.apache.spark.streaming.receiver.Statistics
 
stats() - Method in class org.apache.spark.api.java.JavaDoubleRDD
Return a StatCounter object that captures the mean, variance and count of the RDD's elements in one operation.
stats() - Method in class org.apache.spark.mllib.tree.model.Node
 
stats() - Method in class org.apache.spark.rdd.DoubleRDDFunctions
Return a StatCounter object that captures the mean, variance and count of the RDD's elements in one operation.
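A small sketch of stats() on a double RDD (Scala; `sc` is assumed to be an existing SparkContext):

    import org.apache.spark.SparkContext._   // implicit conversion to DoubleRDDFunctions

    val data = sc.parallelize(Seq(1.0, 2.0, 3.0, 4.0))
    val s = data.stats()                     // one pass computes count, mean, variance, ...
    println(s"count=${s.count} mean=${s.mean} stdev=${s.stdev}")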
StatsReportListener - Class in org.apache.spark.scheduler
:: DeveloperApi :: Simple SparkListener that logs a few summary statistics when each stage completes.
StatsReportListener() - Constructor for class org.apache.spark.scheduler.StatsReportListener
 
StatsReportListener - Class in org.apache.spark.streaming.scheduler
:: DeveloperApi :: A simple StreamingListener that logs summary statistics across Spark Streaming batches.
StatsReportListener(int) - Constructor for class org.apache.spark.streaming.scheduler.StatsReportListener
 
status() - Method in class org.apache.spark.scheduler.TaskInfo
 
stdev() - Method in class org.apache.spark.api.java.JavaDoubleRDD
Compute the standard deviation of this RDD's elements.
stdev() - Method in class org.apache.spark.rdd.DoubleRDDFunctions
Compute the standard deviation of this RDD's elements.
stdev() - Method in class org.apache.spark.util.StatCounter
Return the standard deviation of the values.
stop() - Method in class org.apache.spark.api.java.JavaSparkContext
Shut down the SparkContext.
stop() - Method in interface org.apache.spark.broadcast.BroadcastFactory
 
stop() - Method in class org.apache.spark.broadcast.HttpBroadcastFactory
 
stop() - Method in class org.apache.spark.broadcast.TorrentBroadcastFactory
 
stop() - Method in class org.apache.spark.SparkContext
Shut down the SparkContext.
stop() - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
Stop the execution of the streams.
stop(boolean) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
Stop the execution of the streams.
stop(boolean, boolean) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
Stop the execution of the streams.
stop() - Method in class org.apache.spark.streaming.dstream.ConstantInputDStream
 
stop() - Method in class org.apache.spark.streaming.dstream.InputDStream
Method called to stop receiving data.
stop() - Method in class org.apache.spark.streaming.dstream.ReceiverInputDStream
 
stop(String) - Method in class org.apache.spark.streaming.receiver.Receiver
Stop the receiver completely.
stop(String, Throwable) - Method in class org.apache.spark.streaming.receiver.Receiver
Stop the receiver completely due to an exception.
stop(boolean) - Method in class org.apache.spark.streaming.StreamingContext
Stop the execution of the streams immediately (does not wait for all received data to be processed).
stop(boolean, boolean) - Method in class org.apache.spark.streaming.StreamingContext
Stop the execution of the streams, with option of ensuring all received data has been processed.
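For example, a graceful shutdown that keeps the SparkContext alive might look like this (Scala; `ssc` is assumed to be a running StreamingContext):

    // keep the SparkContext; wait for all received data to be processed first
    ssc.stop(false, true)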
storageLevel() - Method in class org.apache.spark.storage.BlockStatus
 
storageLevel() - Method in class org.apache.spark.storage.RDDInfo
 
StorageLevel - Class in org.apache.spark.storage
:: DeveloperApi :: Flags for controlling the storage of an RDD.
StorageLevel() - Constructor for class org.apache.spark.storage.StorageLevel
 
storageLevel() - Method in class org.apache.spark.streaming.dstream.DStream
 
storageLevel() - Method in class org.apache.spark.streaming.receiver.Receiver
 
storageLevelCache() - Static method in class org.apache.spark.storage.StorageLevel
:: DeveloperApi :: Read StorageLevel object from ObjectInput stream.
StorageLevels - Class in org.apache.spark.api.java
Expose some commonly useful storage level constants.
StorageLevels() - Constructor for class org.apache.spark.api.java.StorageLevels
 
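A small Scala sketch of persisting an RDD at an explicit storage level (the input path and `sc` are assumptions):

    import org.apache.spark.storage.StorageLevel

    val lines = sc.textFile("/tmp/input.txt")        // hypothetical path
    lines.persist(StorageLevel.MEMORY_AND_DISK_SER)  // serialize blocks, spill to disk when memory is full
    println(lines.count())                           // the first action materializes the cache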
StorageListener - Class in org.apache.spark.ui.storage
:: DeveloperApi :: A SparkListener that prepares information to be displayed on the BlockManagerUI.
StorageListener(StorageStatusListener) - Constructor for class org.apache.spark.ui.storage.StorageListener
 
StorageStatus - Class in org.apache.spark.storage
:: DeveloperApi :: Storage information for each BlockManager.
StorageStatus(BlockManagerId, long, Map<BlockId, BlockStatus>) - Constructor for class org.apache.spark.storage.StorageStatus
 
storageStatusList() - Method in class org.apache.spark.storage.StorageStatusListener
 
storageStatusList() - Method in class org.apache.spark.ui.exec.ExecutorsListener
 
storageStatusList() - Method in class org.apache.spark.ui.storage.StorageListener
 
StorageStatusListener - Class in org.apache.spark.storage
:: DeveloperApi :: A SparkListener that maintains executor storage status.
StorageStatusListener() - Constructor for class org.apache.spark.storage.StorageStatusListener
 
store(Iterator<T>) - Method in interface org.apache.spark.streaming.receiver.ActorHelper
Store an iterator of received data as a data block into Spark's memory.
store(ByteBuffer) - Method in interface org.apache.spark.streaming.receiver.ActorHelper
Store the bytes of received data as a data block into Spark's memory.
store(T) - Method in interface org.apache.spark.streaming.receiver.ActorHelper
Store a single item of received data to Spark's memory.
store(T) - Method in class org.apache.spark.streaming.receiver.Receiver
Store a single item of received data to Spark's memory.
store(ArrayBuffer<T>) - Method in class org.apache.spark.streaming.receiver.Receiver
Store an ArrayBuffer of received data as a data block into Spark's memory.
store(ArrayBuffer<T>, Object) - Method in class org.apache.spark.streaming.receiver.Receiver
Store an ArrayBuffer of received data as a data block into Spark's memory.
store(Iterator<T>) - Method in class org.apache.spark.streaming.receiver.Receiver
Store an iterator of received data as a data block into Spark's memory.
store(Iterator<T>, Object) - Method in class org.apache.spark.streaming.receiver.Receiver
Store an iterator of received data as a data block into Spark's memory.
store(ByteBuffer) - Method in class org.apache.spark.streaming.receiver.Receiver
Store the bytes of received data as a data block into Spark's memory.
store(ByteBuffer, Object) - Method in class org.apache.spark.streaming.receiver.Receiver
Store the bytes of received data as a data block into Spark's memory.
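The store(...) variants above are typically called from a custom receiver. A minimal sketch (Scala; the class name and payload are hypothetical):

    import org.apache.spark.storage.StorageLevel
    import org.apache.spark.streaming.receiver.Receiver

    class ConstantReceiver(value: String)
        extends Receiver[String](StorageLevel.MEMORY_ONLY) {
      def onStart(): Unit = {
        new Thread("constant-receiver") {
          override def run(): Unit = {
            while (!isStopped()) {
              store(value)        // push one item into Spark's memory
              Thread.sleep(1000)
            }
          }
        }.start()
      }
      def onStop(): Unit = {}     // the polling thread exits once isStopped() is true
    }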
Strategy - Class in org.apache.spark.mllib.tree.configuration
:: Experimental :: Stores all the configuration options for tree construction.
Strategy(Enumeration.Value, Impurity, int, int, Enumeration.Value, Map<Object, Object>, int) - Constructor for class org.apache.spark.mllib.tree.configuration.Strategy
 
STREAM() - Static method in class org.apache.spark.storage.BlockId
 
StreamBlockId - Class in org.apache.spark.storage
 
StreamBlockId(int, long) - Constructor for class org.apache.spark.storage.StreamBlockId
 
streamed() - Method in class org.apache.spark.sql.execution.BroadcastNestedLoopJoin
 
streamed() - Method in class org.apache.spark.sql.execution.LeftSemiJoinBNL
 
streamedKeys() - Method in class org.apache.spark.sql.execution.HashJoin
 
streamedKeys() - Method in class org.apache.spark.sql.execution.LeftSemiJoinHash
 
streamedPlan() - Method in class org.apache.spark.sql.execution.HashJoin
 
streamedPlan() - Method in class org.apache.spark.sql.execution.LeftSemiJoinHash
 
streamId() - Method in class org.apache.spark.storage.StreamBlockId
 
streamId() - Method in class org.apache.spark.streaming.receiver.Receiver
Get the unique identifier of the receiver input stream that this receiver is associated with.
streamId() - Method in class org.apache.spark.streaming.scheduler.ReceiverInfo
 
StreamingContext - Class in org.apache.spark.streaming
Main entry point for Spark Streaming functionality.
StreamingContext(SparkContext, Duration) - Constructor for class org.apache.spark.streaming.StreamingContext
Create a StreamingContext using an existing SparkContext.
StreamingContext(SparkConf, Duration) - Constructor for class org.apache.spark.streaming.StreamingContext
Create a StreamingContext by providing the configuration necessary for a new SparkContext.
StreamingContext(String, String, Duration, String, Seq<String>, Map<String, String>) - Constructor for class org.apache.spark.streaming.StreamingContext
Create a StreamingContext by providing the details necessary for creating a new SparkContext.
StreamingContext(String, Configuration) - Constructor for class org.apache.spark.streaming.StreamingContext
Recreate a StreamingContext from a checkpoint file.
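A minimal construction sketch (Scala; the application name and batch interval are arbitrary):

    import org.apache.spark.SparkConf
    import org.apache.spark.streaming.{Seconds, StreamingContext}

    val conf = new SparkConf().setAppName("example").setMaster("local[2]")
    val ssc = new StreamingContext(conf, Seconds(1))   // 1-second batches
    // ... define input streams and transformations, then:
    ssc.start()
    ssc.awaitTermination()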
StreamingContextState() - Method in class org.apache.spark.streaming.StreamingContext
Accessor for nested Scala object.
StreamingListener - Interface in org.apache.spark.streaming.scheduler
:: DeveloperApi :: A listener interface for receiving information about an ongoing streaming computation.
StreamingListenerBatchCompleted - Class in org.apache.spark.streaming.scheduler
 
StreamingListenerBatchCompleted(BatchInfo) - Constructor for class org.apache.spark.streaming.scheduler.StreamingListenerBatchCompleted
 
StreamingListenerBatchStarted - Class in org.apache.spark.streaming.scheduler
 
StreamingListenerBatchStarted(BatchInfo) - Constructor for class org.apache.spark.streaming.scheduler.StreamingListenerBatchStarted
 
StreamingListenerBatchSubmitted - Class in org.apache.spark.streaming.scheduler
 
StreamingListenerBatchSubmitted(BatchInfo) - Constructor for class org.apache.spark.streaming.scheduler.StreamingListenerBatchSubmitted
 
StreamingListenerEvent - Interface in org.apache.spark.streaming.scheduler
:: DeveloperApi :: Base trait for events related to StreamingListener.
StreamingListenerReceiverError - Class in org.apache.spark.streaming.scheduler
 
StreamingListenerReceiverError(ReceiverInfo) - Constructor for class org.apache.spark.streaming.scheduler.StreamingListenerReceiverError
 
StreamingListenerReceiverStarted - Class in org.apache.spark.streaming.scheduler
 
StreamingListenerReceiverStarted(ReceiverInfo) - Constructor for class org.apache.spark.streaming.scheduler.StreamingListenerReceiverStarted
 
StreamingListenerReceiverStopped - Class in org.apache.spark.streaming.scheduler
 
StreamingListenerReceiverStopped(ReceiverInfo) - Constructor for class org.apache.spark.streaming.scheduler.StreamingListenerReceiverStopped
 
streamSideKeyGenerator() - Method in class org.apache.spark.sql.execution.HashJoin
 
streamSideKeyGenerator() - Method in class org.apache.spark.sql.execution.LeftSemiJoinHash
 
stringToText(String) - Static method in class org.apache.spark.SparkContext
 
stringWritableConverter() - Static method in class org.apache.spark.SparkContext
 
submissionTime() - Method in class org.apache.spark.scheduler.StageInfo
When this stage was submitted from the DAGScheduler to a TaskScheduler.
submissionTime() - Method in class org.apache.spark.streaming.scheduler.BatchInfo
 
submitJob(RDD<T>, Function1<Iterator<T>, U>, Seq<Object>, Function2<Object, U, BoxedUnit>, Function0<R>) - Method in class org.apache.spark.SparkContext
:: Experimental :: Submit a job for execution and return a FutureAction holding the result.
subtract(JavaDoubleRDD) - Method in class org.apache.spark.api.java.JavaDoubleRDD
Return an RDD with the elements from this that are not in other.
subtract(JavaDoubleRDD, int) - Method in class org.apache.spark.api.java.JavaDoubleRDD
Return an RDD with the elements from this that are not in other.
subtract(JavaDoubleRDD, Partitioner) - Method in class org.apache.spark.api.java.JavaDoubleRDD
Return an RDD with the elements from this that are not in other.
subtract(JavaPairRDD<K, V>) - Method in class org.apache.spark.api.java.JavaPairRDD
Return an RDD with the elements from this that are not in other.
subtract(JavaPairRDD<K, V>, int) - Method in class org.apache.spark.api.java.JavaPairRDD
Return an RDD with the elements from this that are not in other.
subtract(JavaPairRDD<K, V>, Partitioner) - Method in class org.apache.spark.api.java.JavaPairRDD
Return an RDD with the elements from this that are not in other.
subtract(JavaRDD<T>) - Method in class org.apache.spark.api.java.JavaRDD
Return an RDD with the elements from this that are not in other.
subtract(JavaRDD<T>, int) - Method in class org.apache.spark.api.java.JavaRDD
Return an RDD with the elements from this that are not in other.
subtract(JavaRDD<T>, Partitioner) - Method in class org.apache.spark.api.java.JavaRDD
Return an RDD with the elements from this that are not in other.
subtract(RDD<T>) - Method in class org.apache.spark.rdd.RDD
Return an RDD with the elements from this that are not in other.
subtract(RDD<T>, int) - Method in class org.apache.spark.rdd.RDD
Return an RDD with the elements from this that are not in other.
subtract(RDD<T>, Partitioner, Ordering<T>) - Method in class org.apache.spark.rdd.RDD
Return an RDD with the elements from this that are not in other.
subtract(JavaSchemaRDD) - Method in class org.apache.spark.sql.api.java.JavaSchemaRDD
Return an RDD with the elements from this that are not in other.
subtract(JavaSchemaRDD, int) - Method in class org.apache.spark.sql.api.java.JavaSchemaRDD
Return an RDD with the elements from this that are not in other.
subtract(JavaSchemaRDD, Partitioner) - Method in class org.apache.spark.sql.api.java.JavaSchemaRDD
Return an RDD with the elements from this that are not in other.
subtract(RDD<Row>) - Method in class org.apache.spark.sql.SchemaRDD
 
subtract(RDD<Row>, int) - Method in class org.apache.spark.sql.SchemaRDD
 
subtract(RDD<Row>, Partitioner, Ordering<Row>) - Method in class org.apache.spark.sql.SchemaRDD
 
subtract(Vector) - Method in class org.apache.spark.util.Vector
 
subtractByKey(JavaPairRDD<K, W>) - Method in class org.apache.spark.api.java.JavaPairRDD
Return an RDD with the pairs from `this` whose keys are not in `other`.
subtractByKey(JavaPairRDD<K, W>, int) - Method in class org.apache.spark.api.java.JavaPairRDD
Return an RDD with the pairs from `this` whose keys are not in `other`.
subtractByKey(JavaPairRDD<K, W>, Partitioner) - Method in class org.apache.spark.api.java.JavaPairRDD
Return an RDD with the pairs from `this` whose keys are not in `other`.
subtractByKey(RDD<Tuple2<K, W>>, ClassTag<W>) - Method in class org.apache.spark.rdd.PairRDDFunctions
Return an RDD with the pairs from `this` whose keys are not in `other`.
subtractByKey(RDD<Tuple2<K, W>>, int, ClassTag<W>) - Method in class org.apache.spark.rdd.PairRDDFunctions
Return an RDD with the pairs from `this` whose keys are not in `other`.
subtractByKey(RDD<Tuple2<K, W>>, Partitioner, ClassTag<W>) - Method in class org.apache.spark.rdd.PairRDDFunctions
Return an RDD with the pairs from `this` whose keys are not in `other`.
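A small Scala example of subtractByKey (`sc` is an assumed SparkContext; the values are hypothetical):

    import org.apache.spark.SparkContext._   // pair-RDD implicits

    val left  = sc.parallelize(Seq(("a", 1), ("b", 2), ("c", 3)))
    val right = sc.parallelize(Seq(("b", 99)))
    // keys present in `right` are dropped, regardless of their values
    left.subtractByKey(right).collect()      // pairs ("a", 1) and ("c", 3)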
succeededTasks() - Method in class org.apache.spark.ui.jobs.ExecutorSummary
 
Success - Class in org.apache.spark
:: DeveloperApi :: Task succeeded.
Success() - Constructor for class org.apache.spark.Success
 
successful() - Method in class org.apache.spark.scheduler.TaskInfo
 
sum() - Method in class org.apache.spark.api.java.JavaDoubleRDD
Add up the elements in this RDD.
sum() - Method in class org.apache.spark.rdd.DoubleRDDFunctions
Add up the elements in this RDD.
sum() - Method in class org.apache.spark.util.StatCounter
 
sum() - Method in class org.apache.spark.util.Vector
 
sumApprox(long, Double) - Method in class org.apache.spark.api.java.JavaDoubleRDD
:: Experimental :: Approximate operation to return the sum within a timeout.
sumApprox(long) - Method in class org.apache.spark.api.java.JavaDoubleRDD
:: Experimental :: Approximate operation to return the sum within a timeout.
sumApprox(long, double) - Method in class org.apache.spark.rdd.DoubleRDDFunctions
:: Experimental :: Approximate operation to return the sum within a timeout.
SVMDataGenerator - Class in org.apache.spark.mllib.util
:: DeveloperApi :: Generate sample data used for SVM.
SVMDataGenerator() - Constructor for class org.apache.spark.mllib.util.SVMDataGenerator
 
SVMModel - Class in org.apache.spark.mllib.classification
Model for Support Vector Machines (SVMs).
SVMWithSGD - Class in org.apache.spark.mllib.classification
Train a Support Vector Machine (SVM) using Stochastic Gradient Descent.
SVMWithSGD() - Constructor for class org.apache.spark.mllib.classification.SVMWithSGD
Construct an SVM object with default parameters.
systemProperties() - Method in class org.apache.spark.ui.env.EnvironmentListener
 

T

t() - Method in class org.apache.spark.SerializableWritable
 
table() - Method in class org.apache.spark.sql.hive.execution.DescribeHiveTableCommand
 
table() - Method in class org.apache.spark.sql.hive.execution.InsertIntoHiveTable
 
table(String) - Method in class org.apache.spark.sql.SQLContext
Returns the specified table as a SchemaRDD.
tableName() - Method in class org.apache.spark.sql.execution.CacheCommand
 
tachyonFolderName() - Method in class org.apache.spark.SparkContext
 
tachyonSize() - Method in class org.apache.spark.storage.BlockStatus
 
tachyonSize() - Method in class org.apache.spark.storage.RDDInfo
 
take(int) - Method in interface org.apache.spark.api.java.JavaRDDLike
Take the first num elements of the RDD.
take(int) - Method in class org.apache.spark.rdd.RDD
Take the first num elements of the RDD.
take(int) - Method in class org.apache.spark.sql.SchemaRDD
 
takeAsync(int) - Method in class org.apache.spark.rdd.AsyncRDDActions
Returns a future for retrieving the first num elements of the RDD.
takeOrdered(int, Comparator<T>) - Method in interface org.apache.spark.api.java.JavaRDDLike
Returns the first K elements from this RDD as defined by the specified Comparator[T] and maintains the order.
takeOrdered(int) - Method in interface org.apache.spark.api.java.JavaRDDLike
Returns the first K elements from this RDD using the natural ordering for T while maintaining the order.
takeOrdered(int, Ordering<T>) - Method in class org.apache.spark.rdd.RDD
Returns the first K (smallest) elements from this RDD as defined by the specified implicit Ordering[T] and maintains the ordering.
TakeOrdered - Class in org.apache.spark.sql.execution
:: DeveloperApi :: Take the first limit elements as defined by the sortOrder.
TakeOrdered(int, Seq<SortOrder>, SparkPlan, SQLContext) - Constructor for class org.apache.spark.sql.execution.TakeOrdered
 
takeSample(boolean, int) - Method in interface org.apache.spark.api.java.JavaRDDLike
 
takeSample(boolean, int, long) - Method in interface org.apache.spark.api.java.JavaRDDLike
 
takeSample(boolean, int, long) - Method in class org.apache.spark.rdd.RDD
 
TaskContext - Class in org.apache.spark
:: DeveloperApi :: Contextual information about a task which can be read or mutated during execution.
TaskContext(int, int, long, boolean, TaskMetrics) - Constructor for class org.apache.spark.TaskContext
 
TaskEndReason - Interface in org.apache.spark
:: DeveloperApi :: Various possible reasons why a task ended.
TaskFailedReason - Interface in org.apache.spark
:: DeveloperApi :: Various possible reasons why a task failed.
taskId() - Method in class org.apache.spark.scheduler.TaskInfo
 
taskId() - Method in class org.apache.spark.storage.TaskResultBlockId
 
taskInfo() - Method in class org.apache.spark.scheduler.SparkListenerTaskEnd
 
taskInfo() - Method in class org.apache.spark.scheduler.SparkListenerTaskGettingResult
 
taskInfo() - Method in class org.apache.spark.scheduler.SparkListenerTaskStart
 
TaskInfo - Class in org.apache.spark.scheduler
:: DeveloperApi :: Information about a running task attempt inside a TaskSet.
TaskInfo(long, int, long, String, String, Enumeration.Value) - Constructor for class org.apache.spark.scheduler.TaskInfo
 
taskInfo() - Method in class org.apache.spark.ui.jobs.TaskUIData
 
TaskKilled - Class in org.apache.spark
:: DeveloperApi :: Task was killed intentionally and needs to be rescheduled.
TaskKilled() - Constructor for class org.apache.spark.TaskKilled
 
TaskKilledException - Exception in org.apache.spark
:: DeveloperApi :: Exception thrown when a task is explicitly killed (i.e., task failure is expected).
TaskKilledException() - Constructor for exception org.apache.spark.TaskKilledException
 
taskLocality() - Method in class org.apache.spark.scheduler.TaskInfo
 
TaskLocality - Class in org.apache.spark.scheduler
 
TaskLocality() - Constructor for class org.apache.spark.scheduler.TaskLocality
 
taskMetrics() - Method in class org.apache.spark.scheduler.SparkListenerTaskEnd
 
taskMetrics() - Method in class org.apache.spark.TaskContext
 
taskMetrics() - Method in class org.apache.spark.ui.jobs.TaskUIData
 
TASKRESULT() - Static method in class org.apache.spark.storage.BlockId
 
TaskResultBlockId - Class in org.apache.spark.storage
 
TaskResultBlockId(long) - Constructor for class org.apache.spark.storage.TaskResultBlockId
 
TaskResultLost - Class in org.apache.spark
:: DeveloperApi :: The task finished successfully, but the result was lost from the executor's block manager before it was fetched.
TaskResultLost() - Constructor for class org.apache.spark.TaskResultLost
 
taskScheduler() - Method in class org.apache.spark.SparkContext
 
taskTime() - Method in class org.apache.spark.ui.jobs.ExecutorSummary
 
taskType() - Method in class org.apache.spark.scheduler.SparkListenerTaskEnd
 
TaskUIData - Class in org.apache.spark.ui.jobs
 
TaskUIData(TaskInfo, Option<TaskMetrics>, Option<String>) - Constructor for class org.apache.spark.ui.jobs.TaskUIData
 
TEST() - Static method in class org.apache.spark.storage.BlockId
 
TestHive - Class in org.apache.spark.sql.hive.test
 
TestHive() - Constructor for class org.apache.spark.sql.hive.test.TestHive
 
TestHiveContext - Class in org.apache.spark.sql.hive.test
A locally running test instance of Spark's Hive execution engine.
TestHiveContext(SparkContext) - Constructor for class org.apache.spark.sql.hive.test.TestHiveContext
 
TestHiveContext.QueryExecution - Class in org.apache.spark.sql.hive.test
Override QueryExecution with special debug workflow.
TestHiveContext.QueryExecution() - Constructor for class org.apache.spark.sql.hive.test.TestHiveContext.QueryExecution
 
TestHiveContext.TestTable - Class in org.apache.spark.sql.hive.test
 
TestHiveContext.TestTable(String, Seq<Function0<BoxedUnit>>) - Constructor for class org.apache.spark.sql.hive.test.TestHiveContext.TestTable
 
TestSQLContext - Class in org.apache.spark.sql.test
A SQLContext that can be used for local testing.
TestSQLContext() - Constructor for class org.apache.spark.sql.test.TestSQLContext
 
testTables() - Method in class org.apache.spark.sql.hive.test.TestHiveContext
A list of test tables and the DDL required to initialize them.
textFile(String) - Method in class org.apache.spark.api.java.JavaSparkContext
Read a text file from HDFS, a local file system (available on all nodes), or any Hadoop-supported file system URI, and return it as an RDD of Strings.
textFile(String, int) - Method in class org.apache.spark.api.java.JavaSparkContext
Read a text file from HDFS, a local file system (available on all nodes), or any Hadoop-supported file system URI, and return it as an RDD of Strings.
textFile(String, int) - Method in class org.apache.spark.SparkContext
Read a text file from HDFS, a local file system (available on all nodes), or any Hadoop-supported file system URI, and return it as an RDD of Strings.
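For example (Scala; the URI is hypothetical):

    // request at least 4 partitions for the resulting RDD of lines
    val lines = sc.textFile("hdfs://namenode:9000/data/input.txt", 4)
    println(lines.count())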
textFileStream(String) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
Create an input stream that monitors a Hadoop-compatible filesystem for new files and reads them as text files (using key as LongWritable, value as Text and input format as TextInputFormat).
textFileStream(String) - Method in class org.apache.spark.streaming.StreamingContext
Create an input stream that monitors a Hadoop-compatible filesystem for new files and reads them as text files (using key as LongWritable, value as Text and input format as TextInputFormat).
theta() - Method in class org.apache.spark.mllib.classification.NaiveBayesModel
 
threshold() - Method in class org.apache.spark.mllib.tree.model.Split
 
thresholds() - Method in class org.apache.spark.mllib.evaluation.BinaryClassificationMetrics
Returns thresholds in descending order.
time() - Method in class org.apache.spark.scheduler.SparkListenerApplicationEnd
 
time() - Method in class org.apache.spark.scheduler.SparkListenerApplicationStart
 
Time - Class in org.apache.spark.streaming
This is a simple class that represents an absolute instant of time.
Time(long) - Constructor for class org.apache.spark.streaming.Time
 
to(Time, Duration) - Method in class org.apache.spark.streaming.Time
 
toArray() - Method in interface org.apache.spark.api.java.JavaRDDLike
Deprecated.
As of Spark 1.0.0, toArray() is deprecated; use JavaRDDLike.collect() instead.
toArray() - Method in class org.apache.spark.mllib.linalg.DenseMatrix
 
toArray() - Method in class org.apache.spark.mllib.linalg.DenseVector
 
toArray() - Method in interface org.apache.spark.mllib.linalg.Matrix
Converts to a dense array in column-major order.
toArray() - Method in class org.apache.spark.mllib.linalg.SparseVector
 
toArray() - Method in interface org.apache.spark.mllib.linalg.Vector
Converts the instance to a double array.
toArray() - Method in class org.apache.spark.rdd.RDD
Return an array that contains all of the elements in this RDD.
toBreeze() - Method in interface org.apache.spark.mllib.linalg.distributed.DistributedMatrix
Collects data and assembles a local dense breeze matrix (for test only).
toBreeze() - Method in interface org.apache.spark.mllib.linalg.Matrix
Converts to a breeze matrix.
toBreeze() - Method in interface org.apache.spark.mllib.linalg.Vector
Converts the instance to a breeze vector.
toDataType(String) - Static method in class org.apache.spark.sql.hive.HiveMetastoreTypes
 
toDebugString() - Method in interface org.apache.spark.api.java.JavaRDDLike
A description of this RDD and its recursive dependencies for debugging.
toDebugString() - Method in class org.apache.spark.rdd.RDD
A description of this RDD and its recursive dependencies for debugging.
toDebugString() - Method in class org.apache.spark.SparkConf
Return a string listing all keys and values, one per line.
toDebugString() - Method in interface org.apache.spark.sql.SQLConf
 
toErrorString() - Method in class org.apache.spark.ExceptionFailure
 
toErrorString() - Static method in class org.apache.spark.ExecutorLostFailure
 
toErrorString() - Method in class org.apache.spark.FetchFailed
 
toErrorString() - Static method in class org.apache.spark.Resubmitted
 
toErrorString() - Method in interface org.apache.spark.TaskFailedReason
Error message displayed in the web UI.
toErrorString() - Static method in class org.apache.spark.TaskKilled
 
toErrorString() - Static method in class org.apache.spark.TaskResultLost
 
toErrorString() - Static method in class org.apache.spark.UnknownReason
 
toFormattedString() - Method in class org.apache.spark.streaming.Duration
 
toIndexedRowMatrix() - Method in class org.apache.spark.mllib.linalg.distributed.CoordinateMatrix
Converts to IndexedRowMatrix.
toInt() - Method in class org.apache.spark.storage.StorageLevel
 
toJavaDStream() - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
Convert to a JavaDStream.
toJavaRDD() - Method in class org.apache.spark.rdd.RDD
 
toJavaSchemaRDD() - Method in class org.apache.spark.sql.SchemaRDD
Returns this RDD as a JavaSchemaRDD.
toLocalIterator() - Method in interface org.apache.spark.api.java.JavaRDDLike
Return an iterator that contains all of the elements in this RDD.
toLocalIterator() - Method in class org.apache.spark.rdd.RDD
Return an iterator that contains all of the elements in this RDD.
toMetastoreType(DataType) - Static method in class org.apache.spark.sql.hive.HiveMetastoreTypes
 
top(int, Comparator<T>) - Method in interface org.apache.spark.api.java.JavaRDDLike
Returns the top K elements from this RDD as defined by the specified Comparator[T].
top(int) - Method in interface org.apache.spark.api.java.JavaRDDLike
Returns the top K elements from this RDD using the natural ordering for T.
top(int, Ordering<T>) - Method in class org.apache.spark.rdd.RDD
 
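top is the mirror image of takeOrdered; a quick Scala comparison (`sc` assumed):

    val nums = sc.parallelize(Seq(5, 1, 4, 2, 3))
    nums.takeOrdered(2)   // Array(1, 2) -- smallest elements first
    nums.top(2)           // Array(5, 4) -- largest elements first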
toPairDStreamFunctions(DStream<Tuple2<K, V>>, ClassTag<K>, ClassTag<V>, Ordering<K>) - Static method in class org.apache.spark.streaming.StreamingContext
 
topNode() - Method in class org.apache.spark.mllib.tree.model.DecisionTreeModel
 
toRDD(JavaDoubleRDD) - Static method in class org.apache.spark.api.java.JavaDoubleRDD
 
toRDD(JavaPairRDD<K, V>) - Static method in class org.apache.spark.api.java.JavaPairRDD
 
toRDD(JavaRDD<T>) - Static method in class org.apache.spark.api.java.JavaRDD
 
toRowMatrix() - Method in class org.apache.spark.mllib.linalg.distributed.CoordinateMatrix
Converts to RowMatrix, dropping row indices after grouping by row index.
toRowMatrix() - Method in class org.apache.spark.mllib.linalg.distributed.IndexedRowMatrix
Drops row indices and converts this matrix to a RowMatrix.
TorrentBroadcastFactory - Class in org.apache.spark.broadcast
A Broadcast implementation that uses a BitTorrent-like protocol to do a distributed transfer of the broadcasted data to the executors.
TorrentBroadcastFactory() - Constructor for class org.apache.spark.broadcast.TorrentBroadcastFactory
 
toSchemaRDD() - Method in class org.apache.spark.sql.SchemaRDD
Returns this RDD as a SchemaRDD.
toSparkContext(JavaSparkContext) - Static method in class org.apache.spark.api.java.JavaSparkContext
 
toSplitInfo(Class<?>, String, InputSplit) - Static method in class org.apache.spark.scheduler.SplitInfo
 
toString() - Method in class org.apache.spark.Accumulable
 
toString() - Method in class org.apache.spark.api.java.JavaRDD
 
toString() - Method in class org.apache.spark.broadcast.Broadcast
 
toString() - Method in class org.apache.spark.mllib.linalg.DenseVector
 
toString() - Method in interface org.apache.spark.mllib.linalg.Matrix
 
toString() - Method in class org.apache.spark.mllib.linalg.SparseVector
 
toString() - Method in class org.apache.spark.mllib.regression.LabeledPoint
 
toString() - Method in class org.apache.spark.mllib.tree.model.InformationGainStats
 
toString() - Method in class org.apache.spark.mllib.tree.model.Node
 
toString() - Method in class org.apache.spark.mllib.tree.model.Split
 
toString() - Method in class org.apache.spark.partial.BoundedDouble
 
toString() - Method in class org.apache.spark.partial.PartialResult
 
toString() - Method in class org.apache.spark.rdd.RDD
 
toString() - Method in class org.apache.spark.scheduler.InputFormatInfo
 
toString() - Method in class org.apache.spark.scheduler.SplitInfo
 
toString() - Method in class org.apache.spark.SerializableWritable
 
toString() - Method in class org.apache.spark.sql.api.java.JavaSchemaRDD
 
toString() - Method in class org.apache.spark.storage.BlockId
 
toString() - Method in class org.apache.spark.storage.BlockManagerId
 
toString() - Method in class org.apache.spark.storage.RDDInfo
 
toString() - Method in class org.apache.spark.storage.StorageLevel
 
toString() - Method in class org.apache.spark.streaming.Duration
 
toString() - Method in class org.apache.spark.streaming.Time
 
toString() - Method in class org.apache.spark.util.MutablePair
 
toString() - Method in class org.apache.spark.util.StatCounter
 
toString() - Method in class org.apache.spark.util.Vector
 
totalDelay() - Method in class org.apache.spark.streaming.scheduler.BatchInfo
Time taken for all the jobs of this batch to finish processing from the time they were submitted.
totalShuffleRead() - Method in class org.apache.spark.ui.jobs.JobProgressListener
 
totalShuffleWrite() - Method in class org.apache.spark.ui.jobs.JobProgressListener
 
totalTime() - Method in class org.apache.spark.ui.jobs.JobProgressListener
 
train(RDD<LabeledPoint>, int, double, double, Vector) - Static method in class org.apache.spark.mllib.classification.LogisticRegressionWithSGD
Train a logistic regression model given an RDD of (label, features) pairs.
train(RDD<LabeledPoint>, int, double, double) - Static method in class org.apache.spark.mllib.classification.LogisticRegressionWithSGD
Train a logistic regression model given an RDD of (label, features) pairs.
train(RDD<LabeledPoint>, int, double) - Static method in class org.apache.spark.mllib.classification.LogisticRegressionWithSGD
Train a logistic regression model given an RDD of (label, features) pairs.
train(RDD<LabeledPoint>, int) - Static method in class org.apache.spark.mllib.classification.LogisticRegressionWithSGD
Train a logistic regression model given an RDD of (label, features) pairs.
train(RDD<LabeledPoint>) - Static method in class org.apache.spark.mllib.classification.NaiveBayes
Trains a Naive Bayes model given an RDD of (label, features) pairs.
train(RDD<LabeledPoint>, double) - Static method in class org.apache.spark.mllib.classification.NaiveBayes
Trains a Naive Bayes model given an RDD of (label, features) pairs.
train(RDD<LabeledPoint>, int, double, double, double, Vector) - Static method in class org.apache.spark.mllib.classification.SVMWithSGD
Train an SVM model given an RDD of (label, features) pairs.
train(RDD<LabeledPoint>, int, double, double, double) - Static method in class org.apache.spark.mllib.classification.SVMWithSGD
Train an SVM model given an RDD of (label, features) pairs.
train(RDD<LabeledPoint>, int, double, double) - Static method in class org.apache.spark.mllib.classification.SVMWithSGD
Train an SVM model given an RDD of (label, features) pairs.
train(RDD<LabeledPoint>, int) - Static method in class org.apache.spark.mllib.classification.SVMWithSGD
Train an SVM model given an RDD of (label, features) pairs.
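A minimal training sketch (Scala; the toy data and iteration count are arbitrary):

    import org.apache.spark.mllib.classification.SVMWithSGD
    import org.apache.spark.mllib.linalg.Vectors
    import org.apache.spark.mllib.regression.LabeledPoint

    val data = sc.parallelize(Seq(
      LabeledPoint(1.0, Vectors.dense(1.0, 0.0)),
      LabeledPoint(0.0, Vectors.dense(0.0, 1.0))))
    val model = SVMWithSGD.train(data, 100)              // numIterations = 100
    val prediction = model.predict(Vectors.dense(1.0, 0.0))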
train(RDD<Vector>, int, int, int, String) - Static method in class org.apache.spark.mllib.clustering.KMeans
Trains a k-means model using the given set of parameters.
train(RDD<Vector>, int, int) - Static method in class org.apache.spark.mllib.clustering.KMeans
Trains a k-means model using specified parameters and the default values for unspecified.
train(RDD<Vector>, int, int, int) - Static method in class org.apache.spark.mllib.clustering.KMeans
Trains a k-means model using specified parameters and the default values for unspecified.
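For example (Scala; the points, k, and iteration count are arbitrary):

    import org.apache.spark.mllib.clustering.KMeans
    import org.apache.spark.mllib.linalg.Vectors

    val points = sc.parallelize(Seq(
      Vectors.dense(0.0, 0.0), Vectors.dense(0.1, 0.1),
      Vectors.dense(9.0, 9.0), Vectors.dense(9.1, 9.1)))
    val model = KMeans.train(points, 2, 20)              // k = 2, maxIterations = 20
    model.clusterCenters.foreach(println)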
train(RDD<Rating>, int, int, double, int, long) - Static method in class org.apache.spark.mllib.recommendation.ALS
Train a matrix factorization model given an RDD of ratings given by users to some products, in the form of (userID, productID, rating) pairs.
train(RDD<Rating>, int, int, double, int) - Static method in class org.apache.spark.mllib.recommendation.ALS
Train a matrix factorization model given an RDD of ratings given by users to some products, in the form of (userID, productID, rating) pairs.
train(RDD<Rating>, int, int, double) - Static method in class org.apache.spark.mllib.recommendation.ALS
Train a matrix factorization model given an RDD of ratings given by users to some products, in the form of (userID, productID, rating) pairs.
train(RDD<Rating>, int, int) - Static method in class org.apache.spark.mllib.recommendation.ALS
Train a matrix factorization model given an RDD of ratings given by users to some products, in the form of (userID, productID, rating) pairs.
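A toy training sketch (Scala; the ratings, rank, and iteration count are made up):

    import org.apache.spark.mllib.recommendation.{ALS, Rating}

    val ratings = sc.parallelize(Seq(
      Rating(1, 10, 5.0), Rating(1, 20, 1.0), Rating(2, 10, 4.0)))
    val model = ALS.train(ratings, 10, 20)   // rank = 10, iterations = 20
    val score = model.predict(2, 20)         // predicted rating of product 20 by user 2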
train(RDD<LabeledPoint>, int, double, double, double, Vector) - Static method in class org.apache.spark.mllib.regression.LassoWithSGD
Train a Lasso model given an RDD of (label, features) pairs.
train(RDD<LabeledPoint>, int, double, double, double) - Static method in class org.apache.spark.mllib.regression.LassoWithSGD
Train a Lasso model given an RDD of (label, features) pairs.
train(RDD<LabeledPoint>, int, double, double) - Static method in class org.apache.spark.mllib.regression.LassoWithSGD
Train a Lasso model given an RDD of (label, features) pairs.
train(RDD<LabeledPoint>, int) - Static method in class org.apache.spark.mllib.regression.LassoWithSGD
Train a Lasso model given an RDD of (label, features) pairs.
train(RDD<LabeledPoint>, int, double, double, Vector) - Static method in class org.apache.spark.mllib.regression.LinearRegressionWithSGD
Train a linear regression model given an RDD of (label, features) pairs.
train(RDD<LabeledPoint>, int, double, double) - Static method in class org.apache.spark.mllib.regression.LinearRegressionWithSGD
Train a linear regression model given an RDD of (label, features) pairs.
train(RDD<LabeledPoint>, int, double) - Static method in class org.apache.spark.mllib.regression.LinearRegressionWithSGD
Train a linear regression model given an RDD of (label, features) pairs.
train(RDD<LabeledPoint>, int) - Static method in class org.apache.spark.mllib.regression.LinearRegressionWithSGD
Train a linear regression model given an RDD of (label, features) pairs.
train(RDD<LabeledPoint>, int, double, double, double, Vector) - Static method in class org.apache.spark.mllib.regression.RidgeRegressionWithSGD
Train a ridge regression model given an RDD of (label, features) pairs.
train(RDD<LabeledPoint>, int, double, double, double) - Static method in class org.apache.spark.mllib.regression.RidgeRegressionWithSGD
Train a ridge regression model given an RDD of (label, features) pairs.
train(RDD<LabeledPoint>, int, double, double) - Static method in class org.apache.spark.mllib.regression.RidgeRegressionWithSGD
Train a ridge regression model given an RDD of (label, features) pairs.
train(RDD<LabeledPoint>, int) - Static method in class org.apache.spark.mllib.regression.RidgeRegressionWithSGD
Train a ridge regression model given an RDD of (label, features) pairs.
train(RDD<LabeledPoint>) - Method in class org.apache.spark.mllib.tree.DecisionTree
Method to train a decision tree model over an RDD.
trainImplicit(RDD<Rating>, int, int, double, int, double, long) - Static method in class org.apache.spark.mllib.recommendation.ALS
Train a matrix factorization model given an RDD of 'implicit preferences' given by users to some products, in the form of (userID, productID, preference) pairs.
trainImplicit(RDD<Rating>, int, int, double, int, double) - Static method in class org.apache.spark.mllib.recommendation.ALS
Train a matrix factorization model given an RDD of 'implicit preferences' given by users to some products, in the form of (userID, productID, preference) pairs.
trainImplicit(RDD<Rating>, int, int, double, double) - Static method in class org.apache.spark.mllib.recommendation.ALS
Train a matrix factorization model given an RDD of 'implicit preferences' given by users to some products, in the form of (userID, productID, preference) pairs.
trainImplicit(RDD<Rating>, int, int) - Static method in class org.apache.spark.mllib.recommendation.ALS
Train a matrix factorization model given an RDD of 'implicit preferences' given by users to some products, in the form of (userID, productID, preference) pairs.
transform(Function<R, JavaRDD<U>>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
Return a new DStream in which each RDD is generated by applying a function on each RDD of 'this' DStream.
transform(Function2<R, Time, JavaRDD<U>>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
Return a new DStream in which each RDD is generated by applying a function on each RDD of 'this' DStream.
transform(List<JavaDStream<?>>, Function2<List<JavaRDD<?>>, Time, JavaRDD<T>>) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
Create a new DStream in which each RDD is generated by applying a function on RDDs of the DStreams.
transform(Function1<RDD<T>, RDD<U>>, ClassTag<U>) - Method in class org.apache.spark.streaming.dstream.DStream
Return a new DStream in which each RDD is generated by applying a function on each RDD of 'this' DStream.
transform(Function2<RDD<T>, Time, RDD<U>>, ClassTag<U>) - Method in class org.apache.spark.streaming.dstream.DStream
Return a new DStream in which each RDD is generated by applying a function on each RDD of 'this' DStream.
transform(Seq<DStream<?>>, Function2<Seq<RDD<?>>, Time, RDD<T>>, ClassTag<T>) - Method in class org.apache.spark.streaming.StreamingContext
Create a new DStream in which each RDD is generated by applying a function on RDDs of the DStreams.
transformToPair(Function<R, JavaPairRDD<K2, V2>>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
Return a new DStream in which each RDD is generated by applying a function on each RDD of 'this' DStream.
transformToPair(Function2<R, Time, JavaPairRDD<K2, V2>>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
Return a new DStream in which each RDD is generated by applying a function on each RDD of 'this' DStream.
transformToPair(List<JavaDStream<?>>, Function2<List<JavaRDD<?>>, Time, JavaPairRDD<K, V>>) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
Create a new DStream in which each RDD is generated by applying a function on RDDs of the DStreams.
transformWith(JavaDStream<U>, Function3<R, JavaRDD<U>, Time, JavaRDD<W>>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
Return a new DStream in which each RDD is generated by applying a function on each RDD of 'this' DStream and 'other' DStream.
transformWith(JavaPairDStream<K2, V2>, Function3<R, JavaPairRDD<K2, V2>, Time, JavaRDD<W>>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
Return a new DStream in which each RDD is generated by applying a function on each RDD of 'this' DStream and 'other' DStream.
transformWith(DStream<U>, Function2<RDD<T>, RDD<U>, RDD<V>>, ClassTag<U>, ClassTag<V>) - Method in class org.apache.spark.streaming.dstream.DStream
Return a new DStream in which each RDD is generated by applying a function on each RDD of 'this' DStream and 'other' DStream.
transformWith(DStream<U>, Function3<RDD<T>, RDD<U>, Time, RDD<V>>, ClassTag<U>, ClassTag<V>) - Method in class org.apache.spark.streaming.dstream.DStream
Return a new DStream in which each RDD is generated by applying a function on each RDD of 'this' DStream and 'other' DStream.
transformWithToPair(JavaDStream<U>, Function3<R, JavaRDD<U>, Time, JavaPairRDD<K2, V2>>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
Return a new DStream in which each RDD is generated by applying a function on each RDD of 'this' DStream and 'other' DStream.
transformWithToPair(JavaPairDStream<K2, V2>, Function3<R, JavaPairRDD<K2, V2>, Time, JavaPairRDD<K3, V3>>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
Return a new DStream in which each RDD is generated by applying a function on each RDD of 'this' DStream and 'other' DStream.
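transform lets arbitrary RDD operations be applied per batch. A small sketch (Scala; `stream` is assumed to be a DStream[String]):

    // drop empty records from every batch using a plain RDD operation
    val cleaned = stream.transform(rdd => rdd.filter(_.nonEmpty))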
TwitterUtils - Class in org.apache.spark.streaming.twitter
 
TwitterUtils() - Constructor for class org.apache.spark.streaming.twitter.TwitterUtils
 

U

U() - Method in class org.apache.spark.mllib.linalg.SingularValueDecomposition
 
ui() - Method in class org.apache.spark.SparkContext
 
uiTab() - Method in class org.apache.spark.streaming.StreamingContext
 
unbound() - Method in class org.apache.spark.sql.execution.Aggregate.ComputedAggregate
 
unbroadcast(long, boolean, boolean) - Method in interface org.apache.spark.broadcast.BroadcastFactory
 
unbroadcast(long, boolean, boolean) - Method in class org.apache.spark.broadcast.HttpBroadcastFactory
Remove all persisted state associated with the HTTP broadcast with the given ID.
unbroadcast(long, boolean, boolean) - Method in class org.apache.spark.broadcast.TorrentBroadcastFactory
Remove all persisted state associated with the torrent broadcast with the given ID.
uncacheTable(String) - Method in class org.apache.spark.sql.SQLContext
Removes the specified table from the in-memory cache.
underlyingSplit() - Method in class org.apache.spark.scheduler.SplitInfo
 
union(JavaDoubleRDD) - Method in class org.apache.spark.api.java.JavaDoubleRDD
Return the union of this RDD and another one.
union(JavaPairRDD<K, V>) - Method in class org.apache.spark.api.java.JavaPairRDD
Return the union of this RDD and another one.
union(JavaRDD<T>) - Method in class org.apache.spark.api.java.JavaRDD
Return the union of this RDD and another one.
union(JavaRDD<T>, List<JavaRDD<T>>) - Method in class org.apache.spark.api.java.JavaSparkContext
Build the union of two or more RDDs.
union(JavaPairRDD<K, V>, List<JavaPairRDD<K, V>>) - Method in class org.apache.spark.api.java.JavaSparkContext
Build the union of two or more RDDs.
union(JavaDoubleRDD, List<JavaDoubleRDD>) - Method in class org.apache.spark.api.java.JavaSparkContext
Build the union of two or more RDDs.
union(RDD<T>) - Method in class org.apache.spark.rdd.RDD
Return the union of this RDD and another one.
union(Seq<RDD<T>>, ClassTag<T>) - Method in class org.apache.spark.SparkContext
Build the union of a list of RDDs.
union(RDD<T>, Seq<RDD<T>>, ClassTag<T>) - Method in class org.apache.spark.SparkContext
Build the union of a list of RDDs passed as variable-length arguments.
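For example (Scala; `sc` assumed):

    val part1 = sc.parallelize(1 to 3)
    val part2 = sc.parallelize(4 to 6)
    sc.union(Seq(part1, part2)).collect()   // elements 1 through 6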
Union - Class in org.apache.spark.sql.execution
:: DeveloperApi ::
Union(Seq<SparkPlan>, SQLContext) - Constructor for class org.apache.spark.sql.execution.Union
 
union(JavaDStream<T>) - Method in class org.apache.spark.streaming.api.java.JavaDStream
Return a new DStream by unifying data of another DStream with this DStream.
union(JavaPairDStream<K, V>) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
Return a new DStream by unifying data of another DStream with this DStream.
union(JavaDStream<T>, List<JavaDStream<T>>) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
Create a unified DStream from multiple DStreams of the same type and same slide duration.
union(JavaPairDStream<K, V>, List<JavaPairDStream<K, V>>) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
Create a unified DStream from multiple DStreams of the same type and same slide duration.
union(DStream<T>) - Method in class org.apache.spark.streaming.dstream.DStream
Return a new DStream by unifying data of another DStream with this DStream.
union(Seq<DStream<T>>, ClassTag<T>) - Method in class org.apache.spark.streaming.StreamingContext
Create a unified DStream from multiple DStreams of the same type and same slide duration.
unionAll(SchemaRDD) - Method in class org.apache.spark.sql.SchemaRDD
Combines the tuples of two RDDs with the same schema, keeping duplicates.
UnionRDD<T> - Class in org.apache.spark.rdd
 
UnionRDD(SparkContext, Seq<RDD<T>>, ClassTag<T>) - Constructor for class org.apache.spark.rdd.UnionRDD
 
uniqueId() - Method in class org.apache.spark.storage.StreamBlockId
 
UnknownReason - Class in org.apache.spark
:: DeveloperApi :: We don't know why the task ended -- for example, because of a ClassNotFound exception when deserializing the task result.
UnknownReason() - Constructor for class org.apache.spark.UnknownReason
 
unpersist() - Method in class org.apache.spark.api.java.JavaDoubleRDD
Mark the RDD as non-persistent, and remove all blocks for it from memory and disk.
unpersist(boolean) - Method in class org.apache.spark.api.java.JavaDoubleRDD
Mark the RDD as non-persistent, and remove all blocks for it from memory and disk.
unpersist() - Method in class org.apache.spark.api.java.JavaPairRDD
Mark the RDD as non-persistent, and remove all blocks for it from memory and disk.
unpersist(boolean) - Method in class org.apache.spark.api.java.JavaPairRDD
Mark the RDD as non-persistent, and remove all blocks for it from memory and disk.
unpersist() - Method in class org.apache.spark.api.java.JavaRDD
Mark the RDD as non-persistent, and remove all blocks for it from memory and disk.
unpersist(boolean) - Method in class org.apache.spark.api.java.JavaRDD
Mark the RDD as non-persistent, and remove all blocks for it from memory and disk.
unpersist() - Method in class org.apache.spark.broadcast.Broadcast
Asynchronously delete cached copies of this broadcast on the executors.
unpersist(boolean) - Method in class org.apache.spark.broadcast.Broadcast
Delete cached copies of this broadcast on the executors.
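A common pattern is to broadcast a lookup table and release it once it is no longer needed; a sketch (Scala; the map contents are hypothetical):

    val lookup = sc.broadcast(Map("a" -> 1, "b" -> 2))
    val ids = sc.parallelize(Seq("a", "b", "c"))
    ids.map(k => lookup.value.getOrElse(k, 0)).collect()  // 1, 2, 0
    lookup.unpersist()   // asynchronously drop executor-side copies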
unpersist() - Method in class org.apache.spark.mllib.evaluation.BinaryClassificationMetrics
Unpersist intermediate RDDs used in the computation.
unpersist(boolean) - Method in class org.apache.spark.rdd.RDD
Mark the RDD as non-persistent, and remove all blocks for it from memory and disk.
unpersist(boolean) - Method in class org.apache.spark.sql.api.java.JavaSchemaRDD
Mark the RDD as non-persistent, and remove all blocks for it from memory and disk.
until(Time, Duration) - Method in class org.apache.spark.streaming.Time
 
update(T1, T2) - Method in class org.apache.spark.util.MutablePair
Updates this pair with new values and returns itself.
Updater - Class in org.apache.spark.mllib.optimization
:: DeveloperApi :: Class used to perform steps (weight update) using Gradient Descent methods.
Updater() - Constructor for class org.apache.spark.mllib.optimization.Updater
 
updateStateByKey(Function2<List<V>, Optional<S>, Optional<S>>) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
Return a new "state" DStream where the state for each key is updated by applying the given function on the previous state of the key and the new values of each key.
updateStateByKey(Function2<List<V>, Optional<S>, Optional<S>>, int) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
Return a new "state" DStream where the state for each key is updated by applying the given function on the previous state of the key and the new values of each key.
updateStateByKey(Function2<List<V>, Optional<S>, Optional<S>>, Partitioner) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
Return a new "state" DStream where the state for each key is updated by applying the given function on the previous state of the key and the new values of the key.
updateStateByKey(Function2<Seq<V>, Option<S>, Option<S>>, ClassTag<S>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
Return a new "state" DStream where the state for each key is updated by applying the given function on the previous state of the key and the new values of each key.
updateStateByKey(Function2<Seq<V>, Option<S>, Option<S>>, int, ClassTag<S>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
Return a new "state" DStream where the state for each key is updated by applying the given function on the previous state of the key and the new values of each key.
updateStateByKey(Function2<Seq<V>, Option<S>, Option<S>>, Partitioner, ClassTag<S>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
Return a new "state" DStream where the state for each key is updated by applying the given function on the previous state of the key and the new values of the key.
updateStateByKey(Function1<Iterator<Tuple3<K, Seq<V>, Option<S>>>, Iterator<Tuple2<K, S>>>, Partitioner, boolean, ClassTag<S>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
Return a new "state" DStream where the state for each key is updated by applying the given function on the previous state of the key and the new values of each key.
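A running-count sketch (Scala; `pairs` is assumed to be a DStream[(String, Int)], and checkpointing must be enabled via ssc.checkpoint(...)):

    import org.apache.spark.streaming.StreamingContext._   // pair-DStream implicits

    val updateFunc = (newValues: Seq[Int], state: Option[Int]) =>
      Some(newValues.sum + state.getOrElse(0))
    val runningCounts = pairs.updateStateByKey[Int](updateFunc)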
useDisk() - Method in class org.apache.spark.storage.StorageLevel
 
useMemory() - Method in class org.apache.spark.storage.StorageLevel
 
useOffHeap() - Method in class org.apache.spark.storage.StorageLevel
 
user() - Method in class org.apache.spark.mllib.recommendation.Rating
 
user() - Method in class org.apache.spark.scheduler.JobLogger
 
userFeatures() - Method in class org.apache.spark.mllib.recommendation.MatrixFactorizationModel
 

V

V() - Method in class org.apache.spark.mllib.linalg.SingularValueDecomposition
 
value() - Method in class org.apache.spark.Accumulable
Access the accumulator's current value; only allowed on master.
value() - Method in class org.apache.spark.broadcast.Broadcast
Get the broadcasted value.
value() - Method in class org.apache.spark.ComplexFutureAction
 
value() - Method in interface org.apache.spark.FutureAction
The value of this Future.
value() - Method in class org.apache.spark.mllib.linalg.distributed.MatrixEntry
 
value() - Method in class org.apache.spark.SerializableWritable
 
value() - Method in class org.apache.spark.SimpleFutureAction
 
value() - Method in class org.apache.spark.sql.execution.SetCommand
 
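
The value() accessors above read shared-variable payloads: Broadcast.value is readable inside tasks, while Accumulable.value is readable only on the driver. A brief sketch, assuming an existing SparkContext sc:

    val factor = sc.broadcast(3)     // Broadcast[Int]
    val total  = sc.accumulator(0)   // Accumulator[Int]

    sc.parallelize(1 to 10).foreach { x =>
      total += x * factor.value      // broadcast value read on the executors
    }
    println(total.value)             // accumulator value read on the driver: 165
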
values() - Method in class org.apache.spark.api.java.JavaPairRDD
Return an RDD with the values of each tuple.
values() - Method in class org.apache.spark.mllib.linalg.DenseMatrix
 
values() - Method in class org.apache.spark.mllib.linalg.DenseVector
 
values() - Method in class org.apache.spark.mllib.linalg.SparseVector
 
values() - Method in class org.apache.spark.rdd.PairRDDFunctions
Return an RDD with the values of each tuple.
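
For example (a sketch assuming a SparkContext sc; the implicit conversion to PairRDDFunctions comes from SparkContext._):

    import org.apache.spark.SparkContext._

    val pairs = sc.parallelize(Seq(("a", 1), ("b", 2), ("a", 3)))
    pairs.values.collect()   // Array(1, 2, 3)
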
variance() - Method in class org.apache.spark.api.java.JavaDoubleRDD
Compute the variance of this RDD's elements.
variance() - Method in interface org.apache.spark.mllib.stat.MultivariateStatisticalSummary
Sample variance vector.
Variance - Class in org.apache.spark.mllib.tree.impurity
:: Experimental :: Class for calculating variance during regression.
Variance() - Constructor for class org.apache.spark.mllib.tree.impurity.Variance
 
variance() - Method in class org.apache.spark.rdd.DoubleRDDFunctions
Compute the variance of this RDD's elements.
variance() - Method in class org.apache.spark.util.StatCounter
Return the variance of the values.
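
Note that variance() on an RDD of doubles divides by n (population variance), while sampleVariance() divides by n - 1; stats() returns the full StatCounter in one pass. A quick sketch, assuming a SparkContext sc:

    import org.apache.spark.SparkContext._   // DoubleRDDFunctions implicits

    val xs = sc.parallelize(Seq(1.0, 2.0, 3.0, 4.0))
    xs.variance()   // 1.25 (population variance)
    xs.stats()      // StatCounter: count, mean, stdev, max, min
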
vClassTag() - Method in class org.apache.spark.api.java.JavaPairRDD
 
vClassTag() - Method in class org.apache.spark.streaming.api.java.JavaPairInputDStream
 
vClassTag() - Method in class org.apache.spark.streaming.api.java.JavaPairReceiverInputDStream
 
vector() - Method in class org.apache.spark.mllib.linalg.distributed.IndexedRow
 
Vector - Interface in org.apache.spark.mllib.linalg
Represents a numeric vector, whose index type is Int and value type is Double.
Vector - Class in org.apache.spark.util
 
Vector(double[]) - Constructor for class org.apache.spark.util.Vector
 
Vector.Multiplier - Class in org.apache.spark.util
 
Vector.Multiplier(double) - Constructor for class org.apache.spark.util.Vector.Multiplier
 
Vector.VectorAccumParam$ - Class in org.apache.spark.util
 
Vector.VectorAccumParam$() - Constructor for class org.apache.spark.util.Vector.VectorAccumParam$
 
Vectors - Class in org.apache.spark.mllib.linalg
 
Vectors() - Constructor for class org.apache.spark.mllib.linalg.Vectors
 
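
The Vectors factory above is the usual entry point for building mllib vectors, in dense or sparse form — a short sketch:

    import org.apache.spark.mllib.linalg.{Vector, Vectors}

    val dv: Vector = Vectors.dense(1.0, 0.0, 3.0)
    // sparse(size, indices, values): the same vector, stored sparsely
    val sv: Vector = Vectors.sparse(3, Array(0, 2), Array(1.0, 3.0))
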
version() - Method in class org.apache.spark.SparkContext
The version of Spark on which this application is running.
vManifest() - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
 
VoidFunction<T> - Interface in org.apache.spark.api.java.function
A function with no return value.

W

waiter() - Method in class org.apache.spark.streaming.StreamingContext
 
warehousePath() - Method in class org.apache.spark.sql.hive.LocalHiveContext
 
warehousePath() - Method in class org.apache.spark.sql.hive.test.TestHiveContext
 
weights() - Method in class org.apache.spark.mllib.classification.LogisticRegressionModel
 
weights() - Method in class org.apache.spark.mllib.classification.SVMModel
 
weights() - Method in class org.apache.spark.mllib.regression.GeneralizedLinearModel
 
weights() - Method in class org.apache.spark.mllib.regression.LassoModel
 
weights() - Method in class org.apache.spark.mllib.regression.LinearRegressionModel
 
weights() - Method in class org.apache.spark.mllib.regression.RidgeRegressionModel
 
where(Expression) - Method in class org.apache.spark.sql.SchemaRDD
Filters the output, returning only those rows where the condition evaluates to true.
where(Symbol, Function1<T1, Object>) - Method in class org.apache.spark.sql.SchemaRDD
Filters tuples using a function over the value of the specified column.
where(Function1<DynamicRow, Object>) - Method in class org.apache.spark.sql.SchemaRDD
:: Experimental :: Filters tuples using a function over a Dynamic version of a given Row.
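
A sketch of the Expression-based variant, assuming a SQLContext named sqlContext and a SchemaRDD people with an integer age column (hypothetical names); importing sqlContext._ brings the Symbol-to-attribute DSL into scope:

    import sqlContext._

    val adults = people.where('age >= 21)
    adults.collect()
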
wholeTextFiles(String, int) - Method in class org.apache.spark.api.java.JavaSparkContext
Read a directory of text files from HDFS, a local file system (available on all nodes), or any Hadoop-supported file system URI.
wholeTextFiles(String) - Method in class org.apache.spark.api.java.JavaSparkContext
Read a directory of text files from HDFS, a local file system (available on all nodes), or any Hadoop-supported file system URI.
wholeTextFiles(String, int) - Method in class org.apache.spark.SparkContext
Read a directory of text files from HDFS, a local file system (available on all nodes), or any Hadoop-supported file system URI.
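
Unlike textFile, which yields one record per line, wholeTextFiles yields one (path, content) pair per file. For example, assuming a SparkContext sc and a hypothetical input directory:

    val files = sc.wholeTextFiles("hdfs://namenode:8020/data/logs")  // RDD[(String, String)]
    files.map { case (path, content) => (path, content.length) }.collect()
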
window(Duration) - Method in class org.apache.spark.streaming.api.java.JavaDStream
Return a new DStream in which each RDD contains all the elements seen in a sliding window of time over this DStream.
window(Duration, Duration) - Method in class org.apache.spark.streaming.api.java.JavaDStream
Return a new DStream in which each RDD contains all the elements seen in a sliding window of time over this DStream.
window(Duration) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
Return a new DStream which is computed based on windowed batches of this DStream.
window(Duration, Duration) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
Return a new DStream which is computed based on windowed batches of this DStream.
window(Duration) - Method in class org.apache.spark.streaming.dstream.DStream
Return a new DStream in which each RDD contains all the elements seen in a sliding window of time over this DStream.
window(Duration, Duration) - Method in class org.apache.spark.streaming.dstream.DStream
Return a new DStream in which each RDD contains all the elements seen in a sliding window of time over this DStream.
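
For example, with a 10-second batch interval (a sketch assuming an existing DStream named stream; window and slide durations must be multiples of the batch interval):

    import org.apache.spark.streaming.Seconds

    val last30s = stream.window(Seconds(30))                // slides every batch
    val counts  = stream.window(Seconds(30), Seconds(10)).count()
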
withReplacement() - Method in class org.apache.spark.sql.execution.Sample
 
wrapRDD(RDD<Double>) - Method in class org.apache.spark.api.java.JavaDoubleRDD
 
wrapRDD(RDD<Tuple2<K, V>>) - Method in class org.apache.spark.api.java.JavaPairRDD
 
wrapRDD(RDD<T>) - Method in class org.apache.spark.api.java.JavaRDD
 
wrapRDD(RDD<T>) - Method in interface org.apache.spark.api.java.JavaRDDLike
 
wrapRDD(RDD<Row>) - Method in class org.apache.spark.sql.api.java.JavaSchemaRDD
 
wrapRDD(RDD<T>) - Method in class org.apache.spark.streaming.api.java.JavaDStream
 
wrapRDD(RDD<T>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
 
wrapRDD(RDD<Tuple2<K, V>>) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
 
writableWritableConverter() - Static method in class org.apache.spark.SparkContext
 
writeAll(Iterator<T>, ClassTag<T>) - Method in interface org.apache.spark.serializer.SerializationStream
 
writeExternal(ObjectOutput) - Method in class org.apache.spark.serializer.JavaSerializer
 
writeExternal(ObjectOutput) - Method in class org.apache.spark.storage.BlockManagerId
 
writeExternal(ObjectOutput) - Method in class org.apache.spark.storage.StorageLevel
 
writeExternal(ObjectOutput) - Method in class org.apache.spark.streaming.flume.SparkFlumeEvent
 
writeObject(T, ClassTag<T>) - Method in interface org.apache.spark.serializer.SerializationStream
 

Z

zero() - Method in class org.apache.spark.Accumulable
 
zero(R) - Method in interface org.apache.spark.AccumulableParam
Return the "zero" (identity) value for an accumulator type, given its initial value.
zero(double) - Method in class org.apache.spark.SparkContext.DoubleAccumulatorParam$
 
zero(float) - Method in class org.apache.spark.SparkContext.FloatAccumulatorParam$
 
zero(int) - Method in class org.apache.spark.SparkContext.IntAccumulatorParam$
 
zero(long) - Method in class org.apache.spark.SparkContext.LongAccumulatorParam$
 
zero(Vector) - Method in class org.apache.spark.util.Vector.VectorAccumParam$
 
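
The zero() implementations above supply the identity element for each built-in accumulator type. A sketch of a custom AccumulatorParam, assuming a SparkContext sc (the param object and names are hypothetical):

    import org.apache.spark.AccumulatorParam

    // Accumulates the set of distinct strings seen across tasks.
    implicit object StringSetParam extends AccumulatorParam[Set[String]] {
      def zero(initial: Set[String]): Set[String] = Set.empty
      def addInPlace(s1: Set[String], s2: Set[String]): Set[String] = s1 ++ s2
    }

    val seen = sc.accumulator(Set.empty[String])
    sc.parallelize(Seq("a", "b", "a")).foreach(x => seen += Set(x))
    println(seen.value)   // Set(a, b)
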
ZeroMQUtils - Class in org.apache.spark.streaming.zeromq
 
ZeroMQUtils() - Constructor for class org.apache.spark.streaming.zeromq.ZeroMQUtils
 
zeros(int) - Static method in class org.apache.spark.util.Vector
 
zeroTime() - Method in class org.apache.spark.streaming.dstream.DStream
 
zip(JavaRDDLike<U, ?>) - Method in interface org.apache.spark.api.java.JavaRDDLike
Zips this RDD with another one, returning key-value pairs with the first element in each RDD, second element in each RDD, etc.
zip(RDD<U>, ClassTag<U>) - Method in class org.apache.spark.rdd.RDD
Zips this RDD with another one, returning key-value pairs with the first element in each RDD, second element in each RDD, etc.
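
Both RDDs must have the same number of partitions and the same number of elements in each partition. For example, assuming a SparkContext sc:

    val nums  = sc.parallelize(1 to 3, 2)
    val chars = sc.parallelize(Seq("x", "y", "z"), 2)
    nums.zip(chars).collect()   // Array((1,x), (2,y), (3,z))
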
zipPartitions(JavaRDDLike<U, ?>, FlatMapFunction2<Iterator<T>, Iterator<U>, V>) - Method in interface org.apache.spark.api.java.JavaRDDLike
Zip this RDD's partitions with one (or more) RDD(s) and return a new RDD by applying a function to the zipped partitions.
zipPartitions(RDD<B>, boolean, Function2<Iterator<T>, Iterator<B>, Iterator<V>>, ClassTag<B>, ClassTag<V>) - Method in class org.apache.spark.rdd.RDD
Zip this RDD's partitions with one (or more) RDD(s) and return a new RDD by applying a function to the zipped partitions.
zipPartitions(RDD<B>, Function2<Iterator<T>, Iterator<B>, Iterator<V>>, ClassTag<B>, ClassTag<V>) - Method in class org.apache.spark.rdd.RDD
 
zipPartitions(RDD<B>, RDD<C>, boolean, Function3<Iterator<T>, Iterator<B>, Iterator<C>, Iterator<V>>, ClassTag<B>, ClassTag<C>, ClassTag<V>) - Method in class org.apache.spark.rdd.RDD
 
zipPartitions(RDD<B>, RDD<C>, Function3<Iterator<T>, Iterator<B>, Iterator<C>, Iterator<V>>, ClassTag<B>, ClassTag<C>, ClassTag<V>) - Method in class org.apache.spark.rdd.RDD
 
zipPartitions(RDD<B>, RDD<C>, RDD<D>, boolean, Function4<Iterator<T>, Iterator<B>, Iterator<C>, Iterator<D>, Iterator<V>>, ClassTag<B>, ClassTag<C>, ClassTag<D>, ClassTag<V>) - Method in class org.apache.spark.rdd.RDD
 
zipPartitions(RDD<B>, RDD<C>, RDD<D>, Function4<Iterator<T>, Iterator<B>, Iterator<C>, Iterator<D>, Iterator<V>>, ClassTag<B>, ClassTag<C>, ClassTag<D>, ClassTag<V>) - Method in class org.apache.spark.rdd.RDD
 
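
zipPartitions hands the zipping function whole partition iterators at once, avoiding the materialization of per-element pairs; both RDDs need the same number of partitions. A sketch assuming a SparkContext sc:

    val nums  = sc.parallelize(1 to 4, 2)
    val words = sc.parallelize(Seq("a", "b", "c", "d"), 2)

    val glued = nums.zipPartitions(words) { (it1, it2) =>
      it1.zip(it2).map { case (n, w) => n + "-" + w }
    }
    glued.collect()   // Array(1-a, 2-b, 3-c, 4-d)
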
zipWithIndex() - Method in interface org.apache.spark.api.java.JavaRDDLike
Zips this RDD with its element indices.
zipWithIndex() - Method in class org.apache.spark.rdd.RDD
Zips this RDD with its element indices.
zipWithUniqueId() - Method in interface org.apache.spark.api.java.JavaRDDLike
Zips this RDD with generated unique Long ids.
zipWithUniqueId() - Method in class org.apache.spark.rdd.RDD
Zips this RDD with generated unique Long ids.
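
The two differ in cost and in the ids produced: zipWithIndex assigns contiguous indices but must first run a Spark job to count the elements in each partition, while zipWithUniqueId assigns items in the k-th partition the ids k, k+n, k+2n, ... (n = number of partitions) without a separate job. A sketch assuming a SparkContext sc:

    val r = sc.parallelize(Seq("a", "b", "c", "d"), 2)
    r.zipWithIndex().collect()     // Array((a,0), (b,1), (c,2), (d,3))
    r.zipWithUniqueId().collect()  // Array((a,0), (b,2), (c,1), (d,3))
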

_

_1() - Method in class org.apache.spark.util.MutablePair
 
_2() - Method in class org.apache.spark.util.MutablePair
 