Index (Spark 1.0.0 JavaDoc)

A

Accumulable<R,T> - Class in org.apache.spark: A data type that can be accumulated, i.e. has a commutative and associative "add" operation, but where the result type, R, may be different from the element type being added, T.
Accumulable(R, AccumulableParam<R, T>) - Constructor for class org.apache.spark.Accumulable
accumulable(T, AccumulableParam<T, R>) - Method in class org.apache.spark.api.java.JavaSparkContext: Create an Accumulable shared variable of the given type, to which tasks can "add" values with add.
accumulable(T, AccumulableParam<T, R>) - Method in class org.apache.spark.SparkContext: Create an Accumulable shared variable, to which tasks can add values with +=.
accumulableCollection(R, Function1<R, Growable<T>>, ClassTag<R>) - Method in class org.apache.spark.SparkContext: Create an accumulator from a "mutable collection" type.
AccumulableParam<R,T> - Interface in org.apache.spark: Helper object defining how to accumulate values of a particular type.
Accumulator<T> - Class in org.apache.spark: A simpler version of Accumulable where the result type being accumulated is the same as the type of the elements being merged.
Accumulator(T, AccumulatorParam<T>) - Constructor for class org.apache.spark.Accumulator
accumulator(int) - Method in class org.apache.spark.api.java.JavaSparkContext: Create an Accumulator integer variable, which tasks can "add" values to using the add method.
accumulator(double) - Method in class org.apache.spark.api.java.JavaSparkContext: Create an Accumulator double variable, which tasks can "add" values to using the add method.
accumulator(T, AccumulatorParam<T>) - Method in class org.apache.spark.api.java.JavaSparkContext: Create an Accumulator variable of a given type, which tasks can "add" values to using the add method.
accumulator(T, AccumulatorParam<T>) - Method in class org.apache.spark.SparkContext: Create an Accumulator variable of a given type, which tasks can "add" values to using the += method.
AccumulatorParam<T> - Interface in org.apache.spark: A simpler version of AccumulableParam where the only data type you can add in is the same type as the accumulated value.
active() - Method in class org.apache.spark.streaming.scheduler.ReceiverInfo
activeStages() - Method in class org.apache.spark.ui.jobs.JobProgressListener
actor() - Method in class org.apache.spark.streaming.scheduler.ReceiverInfo
ActorHelper - Interface in org.apache.spark.streaming.receiver: :: DeveloperApi :: A receiver trait to be mixed in with your Actor to gain access to the API for pushing received data into Spark Streaming for processing.
actorStream(Props, String, StorageLevel, SupervisorStrategy) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext: Create an input stream with an arbitrary user-implemented actor receiver.
actorStream(Props, String, StorageLevel) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext: Create an input stream with an arbitrary user-implemented actor receiver.
actorStream(Props, String) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext: Create an input stream with an arbitrary user-implemented actor receiver.
actorStream(Props, String, StorageLevel, SupervisorStrategy, ClassTag<T>) - Method in class org.apache.spark.streaming.StreamingContext: Create an input stream with an arbitrary user-implemented actor receiver.
ActorSupervisorStrategy - Class in org.apache.spark.streaming.receiver: :: DeveloperApi :: A helper with a set of defaults for the supervisor strategy.
ActorSupervisorStrategy() - Constructor for class org.apache.spark.streaming.receiver.ActorSupervisorStrategy
actorSystem() - Method in class org.apache.spark.SparkEnv
add(T) - Method in class org.apache.spark.Accumulable: Add more data to this accumulator / accumulable
add(Vector) - Method in class org.apache.spark.util.Vector
addAccumulator(R, T) - Method in interface org.apache.spark.AccumulableParam: Add additional data to the accumulator value.
addAccumulator(T, T) - Method in interface org.apache.spark.AccumulatorParam
addedFiles() - Method in class org.apache.spark.SparkContext
addedJars() - Method in class org.apache.spark.SparkContext
addFile(String) - Method in class org.apache.spark.api.java.JavaSparkContext: Add a file to be downloaded with this Spark job on every node.
addFile(String) - Method in class org.apache.spark.SparkContext: Add a file to be downloaded with this Spark job on every node.
addInPlace(R, R) - Method in interface org.apache.spark.AccumulableParam: Merge two accumulated values together.
addInPlace(double, double) - Method in class org.apache.spark.SparkContext.DoubleAccumulatorParam$
addInPlace(float, float) - Method in class org.apache.spark.SparkContext.FloatAccumulatorParam$
addInPlace(int, int) - Method in class org.apache.spark.SparkContext.IntAccumulatorParam$
addInPlace(long, long) - Method in class org.apache.spark.SparkContext.LongAccumulatorParam$
addInPlace(Vector) - Method in class org.apache.spark.util.Vector
addInPlace(Vector, Vector) - Method in class org.apache.spark.util.Vector.VectorAccumParam$
addJar(String) - Method in class org.apache.spark.api.java.JavaSparkContext: Adds a JAR dependency for all tasks to be executed on this SparkContext in the future.
addJar(String) - Method in class org.apache.spark.SparkContext: Adds a JAR dependency for all tasks to be executed on this SparkContext in the future.
addLocalConfiguration(String, int, int, int, JobConf) - Static method in class org.apache.spark.rdd.HadoopRDD: Add Hadoop configuration specific to a single partition and attempt.
addOnCompleteCallback(Function0<BoxedUnit>) - Method in class org.apache.spark.TaskContext: Add a callback function to be executed on task completion.
addSparkListener(SparkListener) - Method in class org.apache.spark.SparkContext: :: DeveloperApi :: Register a listener to receive up-calls from events that happen during execution.
addStreamingListener(StreamingListener) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext: Add a StreamingListener object for receiving system events related to streaming.
addStreamingListener(StreamingListener) - Method in class org.apache.spark.streaming.StreamingContext: Add a StreamingListener object for receiving system events related to streaming.
aggregate(U, Function2<U, T, U>, Function2<U, U, U>) - Method in interface org.apache.spark.api.java.JavaRDDLike: Aggregate the elements of each partition, and then the results for all the partitions, using given combine functions and a neutral "zero value".
aggregate(U, Function2<U, T, U>, Function2<U, U, U>, ClassTag) - Method in class org.apache.spark.rdd.RDD: Aggregate the elements of each partition, and then the results for all the partitions, using given combine functions and a neutral "zero value".
Aggregate - Class in org.apache.spark.sql.execution: :: DeveloperApi :: Groups input data by groupingExpressions and computes the aggregateExpressions for each group.
Aggregate(boolean, Seq<Expression>, Seq<NamedExpression>, SparkPlan, SparkContext) - Constructor for class org.apache.spark.sql.execution.Aggregate
aggregate() - Method in class org.apache.spark.sql.execution.Aggregate.ComputedAggregate
aggregate(Seq<Expression>) - Method in class org.apache.spark.sql.SchemaRDD: Performs an aggregation over all Rows in this RDD.
Aggregate.ComputedAggregate - Class in org.apache.spark.sql.execution: An aggregate that needs to be computed for each row in a group.
Aggregate.ComputedAggregate(AggregateExpression, AggregateExpression, AttributeReference) - Constructor for class org.apache.spark.sql.execution.Aggregate.ComputedAggregate
Aggregate.ComputedAggregate$ - Class in org.apache.spark.sql.execution
Aggregate.ComputedAggregate$() - Constructor for class org.apache.spark.sql.execution.Aggregate.ComputedAggregate$
aggregateExpressions() - Method in class org.apache.spark.sql.execution.Aggregate
Aggregator<K,V,C> - Class in org.apache.spark: :: DeveloperApi :: A set of functions used to aggregate data.
Aggregator(Function1<V, C>, Function2<C, V, C>, Function2<C, C, C>) - Constructor for class org.apache.spark.Aggregator
Algo - Class in org.apache.spark.mllib.tree.configuration: :: Experimental :: Enum to select the algorithm for the decision tree
Algo() - Constructor for class org.apache.spark.mllib.tree.configuration.Algo
algo() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
algo() - Method in class org.apache.spark.mllib.tree.model.DecisionTreeModel
AlphaComponent - Annotation Type in org.apache.spark.annotation: A new component of Spark which may have unstable APIs.
alreadyPlanned() - Method in class org.apache.spark.sql.execution.SparkLogicalPlan
ALS - Class in org.apache.spark.mllib.recommendation: Alternating Least Squares matrix factorization.
ALS() - Constructor for class org.apache.spark.mllib.recommendation.ALS: Constructs an ALS instance with default parameters: {numBlocks: -1, rank: 10, iterations: 10, lambda: 0.01, implicitPrefs: false, alpha: 1.0}.
analyzed() - Method in class org.apache.spark.sql.hive.test.TestHiveContext.QueryExecution
ANY() - Static method in class org.apache.spark.scheduler.TaskLocality
appendBias(Vector) - Static method in class org.apache.spark.mllib.util.MLUtils: Returns a new vector with 1.0 (bias) appended to the input vector.
apply(int) - Method in class org.apache.spark.mllib.linalg.DenseVector
apply(int, int) - Method in interface org.apache.spark.mllib.linalg.Matrix: Gets the (i, j)-th element.
apply(int) - Method in interface org.apache.spark.mllib.linalg.Vector: Gets the value of the ith element.
apply(String) - Static method in class org.apache.spark.storage.BlockId: Converts a BlockId "name" String back into a BlockId.
apply(String, String, int, int) - Static method in class org.apache.spark.storage.BlockManagerId: Returns a BlockManagerId for the given configuration.
apply(ObjectInput) - Static method in class org.apache.spark.storage.BlockManagerId
apply(boolean, boolean, boolean, boolean, int) - Static method in class org.apache.spark.storage.StorageLevel: :: DeveloperApi :: Create a new StorageLevel object without setting useOffHeap.
apply(boolean, boolean, boolean, int) - Static method in class org.apache.spark.storage.StorageLevel: :: DeveloperApi :: Create a new StorageLevel object.
apply(int, int) - Static method in class org.apache.spark.storage.StorageLevel: :: DeveloperApi :: Create a new StorageLevel object from its integer representation.
apply(ObjectInput) - Static method in class org.apache.spark.storage.StorageLevel: :: DeveloperApi :: Read StorageLevel object from ObjectInput stream.
apply(long) - Static method in class org.apache.spark.streaming.Milliseconds
apply(long) - Static method in class org.apache.spark.streaming.Minutes
apply(long) - Static method in class org.apache.spark.streaming.Seconds
apply(TraversableOnce<Object>) - Static method in class org.apache.spark.util.StatCounter: Build a StatCounter from a list of values.
apply(Seq<Object>) - Static method in class org.apache.spark.util.StatCounter: Build a StatCounter from a list of values passed as variable-length arguments.
apply(int) - Method in class org.apache.spark.util.Vector
applySchema(JavaRDD<?>, Class<?>) - Method in class org.apache.spark.sql.api.java.JavaSQLContext: Applies a schema to an RDD of Java Beans.
appName() - Method in class org.apache.spark.api.java.JavaSparkContext
appName() - Method in class org.apache.spark.scheduler.SparkListenerApplicationStart
appName() - Method in class org.apache.spark.SparkContext
ApproxHist() - Static method in class org.apache.spark.mllib.tree.configuration.QuantileStrategy
areaUnderPR() - Method in class org.apache.spark.mllib.evaluation.BinaryClassificationMetrics: Computes the area under the precision-recall curve.
areaUnderROC() - Method in class org.apache.spark.mllib.evaluation.BinaryClassificationMetrics: Computes the area under the receiver operating characteristic (ROC) curve.
as(Symbol) - Method in class org.apache.spark.sql.SchemaRDD: Applies a qualifier to the attributes of this relation.
asIterator() - Method in interface org.apache.spark.serializer.DeserializationStream: Read the elements of this stream through an iterator.
asRDDId() - Method in class org.apache.spark.storage.BlockId
AsyncRDDActions<T> - Class in org.apache.spark.rdd: :: Experimental :: A set of asynchronous RDD actions available through an implicit conversion.
AsyncRDDActions(RDD<T>, ClassTag<T>) - Constructor for class org.apache.spark.rdd.AsyncRDDActions
attemptId() - Method in class org.apache.spark.TaskContext
attributes() - Method in class org.apache.spark.sql.hive.execution.HiveTableScan
awaitTermination() - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext: Wait for the execution to stop.
awaitTermination(long) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext: Wait for the execution to stop.
awaitTermination() - Method in class org.apache.spark.streaming.StreamingContext: Wait for the execution to stop.
awaitTermination(long) - Method in class org.apache.spark.streaming.StreamingContext: Wait for the execution to stop.
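
The aggregate entries above describe a two-level fold: the elements of each partition are combined starting from a neutral "zero value", and the per-partition results are then merged. The following is a minimal plain-Java sketch of that idea. It does not use Spark: the class AggregateSketch, its aggregate helper, and the list-of-lists stand-in for partitions are illustrative assumptions, not part of the Spark API.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.BiFunction;
import java.util.function.BinaryOperator;

public class AggregateSketch {
    // Hypothetical helper (not a Spark method). Each inner list stands in for
    // one partition. seqOp folds the elements of a partition into a partial
    // result; combOp merges the partial results of all partitions.
    // zeroValue is reused for every partition, so it should be immutable here.
    public static <T, U> U aggregate(List<List<T>> partitions, U zeroValue,
                                     BiFunction<U, T, U> seqOp,
                                     BinaryOperator<U> combOp) {
        U result = zeroValue;
        for (List<T> partition : partitions) {
            U partial = zeroValue;
            for (T element : partition) {
                partial = seqOp.apply(partial, element);
            }
            result = combOp.apply(result, partial);
        }
        return result;
    }

    public static void main(String[] args) {
        List<List<Integer>> partitions = Arrays.asList(
                Arrays.asList(1, 2),
                Arrays.asList(3, 4, 5));
        // Sum all elements: 0 is neutral for addition in both functions.
        int sum = aggregate(partitions, 0, (acc, x) -> acc + x, Integer::sum);
        System.out.println(sum); // prints 15
    }
}
```

Because the zero value is applied once per partition and once more when merging, it has to be neutral for both functions; a non-neutral value such as 1 would be counted several times in the sum above.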

B

baseSchemaRDD() - Method in class org.apache.spark.sql.api.java.JavaSchemaRDD
baseSchemaRDD() - Method in class org.apache.spark.sql.SchemaRDD
BatchInfo - Class in org.apache.spark.streaming.scheduler: :: DeveloperApi :: Class having information on completed batches.
BatchInfo(Time, Map<Object, ReceivedBlockInfo[]>, long, Option<Object>, Option<Object>) - Constructor for class org.apache.spark.streaming.scheduler.BatchInfo
batchInfo() - Method in class org.apache.spark.streaming.scheduler.StreamingListenerBatchCompleted
batchInfo() - Method in class org.apache.spark.streaming.scheduler.StreamingListenerBatchStarted
batchInfo() - Method in class org.apache.spark.streaming.scheduler.StreamingListenerBatchSubmitted
batchInfos() - Method in class org.apache.spark.streaming.scheduler.StatsReportListener
batchTime() - Method in class org.apache.spark.streaming.scheduler.BatchInfo
BernoulliSampler<T> - Class in org.apache.spark.util.random: :: DeveloperApi :: A sampler based on Bernoulli trials.
BernoulliSampler(double, double, boolean, Random) - Constructor for class org.apache.spark.util.random.BernoulliSampler
BernoulliSampler(double, Random) - Constructor for class org.apache.spark.util.random.BernoulliSampler
BinaryClassificationMetrics - Class in org.apache.spark.mllib.evaluation: :: Experimental :: Evaluator for binary classification.
BinaryClassificationMetrics(RDD<Tuple2<Object, Object>>) - Constructor for class org.apache.spark.mllib.evaluation.BinaryClassificationMetrics
binaryLabelValidator() - Static method in class org.apache.spark.mllib.util.DataValidators: Function to check if labels used for classification are either zero or one.
BlockId - Class in org.apache.spark.storage: :: DeveloperApi :: Identifies a particular Block of data, usually associated with a single file.
BlockId() - Constructor for class org.apache.spark.storage.BlockId
blockManager() - Method in class org.apache.spark.SparkEnv
blockManagerId() - Method in class org.apache.spark.scheduler.SparkListenerBlockManagerAdded
blockManagerId() - Method in class org.apache.spark.scheduler.SparkListenerBlockManagerRemoved
BlockManagerId - Class in org.apache.spark.storage: :: DeveloperApi :: This class represents a unique identifier for a BlockManager.
blockManagerId() - Method in class org.apache.spark.storage.StorageStatus
blockManagerIdCache() - Static method in class org.apache.spark.storage.BlockManagerId
blockManagerIds() - Method in class org.apache.spark.ui.jobs.JobProgressListener
blocks() - Method in class org.apache.spark.storage.StorageStatus
BlockStatus - Class in org.apache.spark.storage
BlockStatus(StorageLevel, long, long, long) - Constructor for class org.apache.spark.storage.BlockStatus
bmAddress() - Method in class org.apache.spark.FetchFailed
booleanWritableConverter() - Static method in class org.apache.spark.SparkContext
boolToBoolWritable(boolean) - Static method in class org.apache.spark.SparkContext
boundCondition() - Method in class org.apache.spark.sql.execution.BroadcastNestedLoopJoin
BoundedDouble - Class in org.apache.spark.partial: :: Experimental :: A Double value with error bars and associated confidence.
BoundedDouble(double, double, double, double) - Constructor for class org.apache.spark.partial.BoundedDouble
broadcast(T) - Method in class org.apache.spark.api.java.JavaSparkContext: Broadcast a read-only variable to the cluster, returning a Broadcast object for reading it in distributed functions.
Broadcast<T> - Class in org.apache.spark.broadcast: A broadcast variable.
Broadcast(long, ClassTag<T>) - Constructor for class org.apache.spark.broadcast.Broadcast
broadcast(T, ClassTag<T>) - Method in class org.apache.spark.SparkContext: Broadcast a read-only variable to the cluster, returning a Broadcast object for reading it in distributed functions.
broadcast() - Method in class org.apache.spark.sql.execution.BroadcastNestedLoopJoin
BROADCAST() - Static method in class org.apache.spark.storage.BlockId
BroadcastBlockId - Class in org.apache.spark.storage
BroadcastBlockId(long, String) - Constructor for class org.apache.spark.storage.BroadcastBlockId
BroadcastFactory - Interface in org.apache.spark.broadcast: :: DeveloperApi :: An interface for all the broadcast implementations in Spark (to allow multiple broadcast implementations).
broadcastId() - Method in class org.apache.spark.storage.BroadcastBlockId
broadcastManager() - Method in class org.apache.spark.SparkEnv
BroadcastNestedLoopJoin - Class in org.apache.spark.sql.execution: :: DeveloperApi ::
BroadcastNestedLoopJoin(SparkPlan, SparkPlan, JoinType, Option<Expression>, SparkContext) - Constructor for class org.apache.spark.sql.execution.BroadcastNestedLoopJoin
build(Node[]) - Method in class org.apache.spark.mllib.tree.model.Node: Build the left and right nodes if this node is not a leaf.
buildKeys() - Method in class org.apache.spark.sql.execution.HashJoin
BuildLeft - Class in org.apache.spark.sql.execution
BuildLeft() - Constructor for class org.apache.spark.sql.execution.BuildLeft
buildPlan() - Method in class org.apache.spark.sql.execution.HashJoin
BuildRight - Class in org.apache.spark.sql.execution
BuildRight() - Constructor for class org.apache.spark.sql.execution.BuildRight
BuildSide - Class in org.apache.spark.sql.execution
BuildSide() - Constructor for class org.apache.spark.sql.execution.BuildSide
buildSide() - Method in class org.apache.spark.sql.execution.HashJoin
buildSideKeyGenerator() - Method in class org.apache.spark.sql.execution.HashJoin
bytesToBytesWritable(byte[]) - Static method in class org.apache.spark.SparkContext
bytesWritableConverter() - Static method in class org.apache.spark.SparkContext
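
The BernoulliSampler entries above describe a sampler based on Bernoulli trials, with a constructor taking two doubles, a boolean, and a Random. As a rough plain-Java sketch of the technique (not the Spark class itself), the code below keeps an item when a uniform draw falls inside an acceptance range, and optionally inverts that decision. The class BernoulliSketch, the parameter names lb, ub and complement, and the use of a seed in place of a Random instance are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

public class BernoulliSketch {
    // Hypothetical helper (not a Spark method). For every item, draw x
    // uniformly from [0, 1). The item is kept when x lies in [lb, ub);
    // complement = true inverts that decision.
    public static <T> List<T> sample(List<T> items, double lb, double ub,
                                     boolean complement, long seed) {
        Random rng = new Random(seed);
        List<T> kept = new ArrayList<>();
        for (T item : items) {
            double x = rng.nextDouble();
            boolean inRange = x >= lb && x < ub;
            if (inRange != complement) {
                kept.add(item);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<Integer> data = new ArrayList<>();
        for (int i = 0; i < 20; i++) {
            data.add(i);
        }
        // Same seed, same range: the two calls see identical draws, so every
        // item lands in exactly one of the two lists.
        List<Integer> chosen = sample(data, 0.0, 0.3, false, 42L);
        List<Integer> rest = sample(data, 0.0, 0.3, true, 42L);
        System.out.println(chosen.size() + rest.size()); // prints 20
    }
}
```

Running the sampler and its complement with the same seed splits the input into two disjoint parts, which is one way to obtain, for example, a training set and a held-out set from the same data.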

C

cache() - Method in class org.apache.spark.api.java.JavaDoubleRDD: Persist this RDD with the default storage level (`MEMORY_ONLY`).
cache() - Method in class org.apache.spark.api.java.JavaPairRDD: Persist this RDD with the default storage level (`MEMORY_ONLY`).
cache() - Method in class org.apache.spark.api.java.JavaRDD: Persist this RDD with the default storage level (`MEMORY_ONLY`).
cache() - Method in class org.apache.spark.rdd.RDD: Persist this RDD with the default storage level (`MEMORY_ONLY`).
cache() - Method in class org.apache.spark.sql.api.java.JavaSchemaRDD: Persist this RDD with the default storage level (`MEMORY_ONLY`).
cache() - Method in class org.apache.spark.streaming.api.java.JavaDStream: Persist RDDs of this DStream with the default storage level (MEMORY_ONLY_SER)
cache() - Method in class org.apache.spark.streaming.api.java.JavaPairDStream: Persist RDDs of this DStream with the default storage level (MEMORY_ONLY_SER)
cache() - Method in class org.apache.spark.streaming.dstream.DStream: Persist RDDs of this DStream with the default storage level (MEMORY_ONLY_SER)
cacheManager() - Method in class org.apache.spark.SparkEnv
cacheTable(String) - Method in class org.apache.spark.sql.SQLContext: Caches the specified table in-memory.
cacheTables() - Method in class org.apache.spark.sql.hive.test.TestHiveContext
calculate(double, double) - Static method in class org.apache.spark.mllib.tree.impurity.Entropy: :: DeveloperApi :: entropy calculation
calculate(double, double, double) - Static method in class org.apache.spark.mllib.tree.impurity.Entropy
calculate(double, double) - Static method in class org.apache.spark.mllib.tree.impurity.Gini: :: DeveloperApi :: Gini coefficient calculation
calculate(double, double, double) - Static method in class org.apache.spark.mllib.tree.impurity.Gini
calculate(double, double) - Method in interface org.apache.spark.mllib.tree.impurity.Impurity: :: DeveloperApi :: information calculation for binary classification
calculate(double, double, double) - Method in interface org.apache.spark.mllib.tree.impurity.Impurity: :: DeveloperApi :: information calculation for regression
calculate(double, double) - Static method in class org.apache.spark.mllib.tree.impurity.Variance
calculate(double, double, double) - Static method in class org.apache.spark.mllib.tree.impurity.Variance: :: DeveloperApi :: variance calculation
call(T) - Method in interface org.apache.spark.api.java.function.DoubleFlatMapFunction
call(T) - Method in interface org.apache.spark.api.java.function.DoubleFunction
call(T) - Method in interface org.apache.spark.api.java.function.FlatMapFunction
call(T1, T2) - Method in interface org.apache.spark.api.java.function.FlatMapFunction2
call(T1) - Method in interface org.apache.spark.api.java.function.Function
call(T1, T2) - Method in interface org.apache.spark.api.java.function.Function2
call(T1, T2, T3) - Method in interface org.apache.spark.api.java.function.Function3
call(T) - Method in interface org.apache.spark.api.java.function.PairFlatMapFunction
call(T) - Method in interface org.apache.spark.api.java.function.PairFunction
call(T) - Method in interface org.apache.spark.api.java.function.VoidFunction
cancel() - Method in class org.apache.spark.ComplexFutureAction
cancel() - Method in interface org.apache.spark.FutureAction: Cancels the execution of this action.
cancel() - Method in class org.apache.spark.SimpleFutureAction
cancelAllJobs() - Method in class org.apache.spark.api.java.JavaSparkContext: Cancel all jobs that have been scheduled or are running.
cancelAllJobs() - Method in class org.apache.spark.SparkContext: Cancel all jobs that have been scheduled or are running.
cancelJobGroup(String) - Method in class org.apache.spark.api.java.JavaSparkContext: Cancel active jobs for the specified group.
cancelJobGroup(String) - Method in class org.apache.spark.SparkContext: Cancel active jobs for the specified group.
cancelled() - Method in class org.apache.spark.ComplexFutureAction: Returns whether the promise has been cancelled.
canEqual(Object) - Method in class org.apache.spark.util.MutablePair
cartesian(JavaRDDLike<U, ?>) - Method in interface org.apache.spark.api.java.JavaRDDLike: Return the Cartesian product of this RDD and another one, that is, the RDD of all pairs of elements (a, b) where a is in this and b is in other.
cartesian(RDD, ClassTag) - Method in class org.apache.spark.rdd.RDD: Return the Cartesian product of this RDD and another one, that is, the RDD of all pairs of elements (a, b) where a is in this and b is in other.
CartesianProduct - Class in org.apache.spark.sql.execution: :: DeveloperApi ::
CartesianProduct(SparkPlan, SparkPlan) - Constructor for class org.apache.spark.sql.execution.CartesianProduct
Categorical() - Static method in class org.apache.spark.mllib.tree.configuration.FeatureType
categoricalFeaturesInfo() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
categories() - Method in class org.apache.spark.mllib.tree.model.Split
checkpoint() - Method in interface org.apache.spark.api.java.JavaRDDLike: Mark this RDD for checkpointing.
checkpoint() - Method in class org.apache.spark.rdd.HadoopRDD
checkpoint() - Method in class org.apache.spark.rdd.RDD: Mark this RDD for checkpointing.
checkpoint(Duration) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike: Enable periodic checkpointing of RDDs of this DStream.
checkpoint(String) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext: Sets the context to periodically checkpoint the DStream operations for master fault-tolerance.
checkpoint(Duration) - Method in class org.apache.spark.streaming.dstream.DStream: Enable periodic checkpointing of RDDs of this DStream
checkpoint(String) - Method in class org.apache.spark.streaming.StreamingContext: Set the context to periodically checkpoint the DStream operations for driver fault-tolerance.
checkpointData() - Method in class org.apache.spark.rdd.RDD
checkpointData() - Method in class org.apache.spark.streaming.dstream.DStream
checkpointDir() - Method in class org.apache.spark.SparkContext
checkpointDir() - Method in class org.apache.spark.streaming.StreamingContext
checkpointDuration() - Method in class org.apache.spark.streaming.dstream.DStream
checkpointDuration() - Method in class org.apache.spark.streaming.StreamingContext
child() - Method in class org.apache.spark.sql.execution.Aggregate
child() - Method in class org.apache.spark.sql.execution.Exchange
child() - Method in class org.apache.spark.sql.execution.Filter
child() - Method in class org.apache.spark.sql.execution.Generate
child() - Method in class org.apache.spark.sql.execution.Limit
child() - Method in class org.apache.spark.sql.execution.Project
child() - Method in class org.apache.spark.sql.execution.Sample
child() - Method in class org.apache.spark.sql.execution.Sort
child() - Method in class org.apache.spark.sql.execution.TakeOrdered
child() - Method in class org.apache.spark.sql.hive.execution.InsertIntoHiveTable
child() - Method in class org.apache.spark.sql.hive.execution.ScriptTransformation
child() - Method in class org.apache.spark.sql.parquet.InsertIntoParquetTable
children() - Method in class org.apache.spark.sql.execution.SparkLogicalPlan
children() - Method in class org.apache.spark.sql.execution.Union
Classification() - Static method in class org.apache.spark.mllib.tree.configuration.Algo
ClassificationModel - Interface in org.apache.spark.mllib.classification: :: Experimental :: Represents a classification model that predicts to which of a set of categories an example belongs.
className() - Method in class org.apache.spark.ExceptionFailure
classpathEntries() - Method in class org.apache.spark.ui.env.EnvironmentListener
classTag() - Method in class org.apache.spark.api.java.JavaDoubleRDD
classTag() - Method in class org.apache.spark.api.java.JavaPairRDD
classTag() - Method in class org.apache.spark.api.java.JavaRDD
classTag() - Method in interface org.apache.spark.api.java.JavaRDDLike
classTag() - Method in class org.apache.spark.sql.api.java.JavaSchemaRDD
classTag() - Method in class org.apache.spark.streaming.api.java.JavaDStream
classTag() - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
classTag() - Method in class org.apache.spark.streaming.api.java.JavaInputDStream
classTag() - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
classTag() - Method in class org.apache.spark.streaming.api.java.JavaReceiverInputDStream
cleaner() - Method in class org.apache.spark.SparkContext
clearCallSite() - Method in class org.apache.spark.api.java.JavaSparkContext: Pass-through to SparkContext.clearCallSite.
clearCallSite() - Method in class org.apache.spark.SparkContext: Support function for API backtraces.
clearDependencies() - Method in class org.apache.spark.rdd.CoGroupedRDD
clearDependencies() - Method in class org.apache.spark.rdd.ShuffledRDD
clearFiles() - Method in class org.apache.spark.api.java.JavaSparkContext: Clear the job's list of files added by addFile so that they do not get downloaded to any new nodes.
clearFiles() - Method in class org.apache.spark.SparkContext: Clear the job's list of files added by addFile so that they do not get downloaded to any new nodes.
clearJars() - Method in class org.apache.spark.api.java.JavaSparkContext: Clear the job's list of JARs added by addJar so that they do not get downloaded to any new nodes.
clearJars() - Method in class org.apache.spark.SparkContext: Clear the job's list of JARs added by addJar so that they do not get downloaded to any new nodes.
clearJobGroup() - Method in class org.apache.spark.api.java.JavaSparkContext: Clear the current thread's job group ID and its description.
clearJobGroup() - Method in class org.apache.spark.SparkContext: Clear the current thread's job group ID and its description.
clearThreshold() - Method in class org.apache.spark.mllib.classification.LogisticRegressionModel: :: Experimental :: Clears the threshold so that predict will output raw prediction scores.
clearThreshold() - Method in class org.apache.spark.mllib.classification.SVMModel: :: Experimental :: Clears the threshold so that predict will output raw prediction scores.
clone() - Method in class org.apache.spark.SparkConf: Copy this object
clone() - Method in class org.apache.spark.storage.StorageLevel
clone() - Method in class org.apache.spark.util.random.BernoulliSampler
clone() - Method in class org.apache.spark.util.random.PoissonSampler
clone() - Method in interface org.apache.spark.util.random.RandomSampler
cloneComplement() - Method in class org.apache.spark.util.random.BernoulliSampler: Return a sampler that is the complement of the range specified by the current sampler.
close() - Method in interface org.apache.spark.serializer.DeserializationStream
close() - Method in interface org.apache.spark.serializer.SerializationStream
closureSerializer() - Method in class org.apache.spark.SparkEnv
clusterCenters() - Method in class org.apache.spark.mllib.clustering.KMeansModel
coalesce(int) - Method in class org.apache.spark.api.java.JavaDoubleRDD: Return a new RDD that is reduced into numPartitions partitions.
coalesce(int, boolean) - Method in class org.apache.spark.api.java.JavaDoubleRDD: Return a new RDD that is reduced into numPartitions partitions.
coalesce(int) - Method in class org.apache.spark.api.java.JavaPairRDD: Return a new RDD that is reduced into numPartitions partitions.
coalesce(int, boolean) - Method in class org.apache.spark.api.java.JavaPairRDD: Return a new RDD that is reduced into numPartitions partitions.
coalesce(int) - Method in class org.apache.spark.api.java.JavaRDD: Return a new RDD that is reduced into numPartitions partitions.
coalesce(int, boolean) - Method in class org.apache.spark.api.java.JavaRDD: Return a new RDD that is reduced into numPartitions partitions.
coalesce(int, boolean, Ordering<T>) - Method in class org.apache.spark.rdd.RDD: Return a new RDD that is reduced into numPartitions partitions.
coalesce(int, boolean) - Method in class org.apache.spark.sql.api.java.JavaSchemaRDD: Return a new RDD that is reduced into numPartitions partitions.
coalesce(int, boolean, Ordering<Row>) - Method in class org.apache.spark.sql.SchemaRDD
cogroup(JavaPairRDD<K, W>, Partitioner) - Method in class org.apache.spark.api.java.JavaPairRDD: For each key k in this or other, return a resulting RDD that contains a tuple with the list of values for that key in this as well as other.
cogroup(JavaPairRDD<K, W1>, JavaPairRDD<K, W2>, Partitioner) - Method in class org.apache.spark.api.java.JavaPairRDD: For each key k in this or other1 or other2, return a resulting RDD that contains a tuple with the list of values for that key in this, other1 and other2.
cogroup(JavaPairRDD<K, W>) - Method in class org.apache.spark.api.java.JavaPairRDD: For each key k in this or other, return a resulting RDD that contains a tuple with the list of values for that key in this as well as other.
cogroup(JavaPairRDD<K, W1>, JavaPairRDD<K, W2>) - Method in class org.apache.spark.api.java.JavaPairRDD: For each key k in this or other1 or other2, return a resulting RDD that contains a tuple with the list of values for that key in this, other1 and other2.
cogroup(JavaPairRDD<K, W>, int) - Method in class org.apache.spark.api.java.JavaPairRDD: For each key k in this or other, return a resulting RDD that contains a tuple with the list of values for that key in this as well as other.
cogroup(JavaPairRDD<K, W1>, JavaPairRDD<K, W2>, int) - Method in class org.apache.spark.api.java.JavaPairRDD: For each key k in this or other1 or other2, return a resulting RDD that contains a tuple with the list of values for that key in this, other1 and other2.
cogroup(RDD<Tuple2<K, W>>, Partitioner) - Method in class org.apache.spark.rdd.PairRDDFunctions: For each key k in this or other, return a resulting RDD that contains a tuple with the list of values for that key in this as well as other.
cogroup(RDD<Tuple2<K, W1>>, RDD<Tuple2<K, W2>>, Partitioner) - Method in class org.apache.spark.rdd.PairRDDFunctions
cogroup(RDD<Tuple2<K, W>>) - Method in class org.apache.spark.rdd.PairRDDFunctions
cogroup(RDD<Tuple2<K, W1>>, RDD<Tuple2<K, W2>>) - Method in class org.apache.spark.rdd.PairRDDFunctions: For each key k in this or other1 or other2, return a resulting RDD that contains a tuple with the list of values for that key in this, other1 and other2.
cogroup(RDD<Tuple2<K, W>>, int) - Method in class org.apache.spark.rdd.PairRDDFunctions: For each key k in this or other, return a resulting RDD that contains a tuple with the list of values for that key in this as well as other.
cogroup(RDD<Tuple2<K, W1>>, RDD<Tuple2<K, W2>>, int) - Method in class org.apache.spark.rdd.PairRDDFunctions: For each key k in this or other1 or other2, return a resulting RDD that contains a tuple with the list of values for that key in this, other1 and other2.
cogroup(JavaPairDStream<K, W>) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream: Return a new DStream by applying 'cogroup' between RDDs of this DStream and other DStream.
cogroup(JavaPairDStream<K, W>, int) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream: Return a new DStream by applying 'cogroup' between RDDs of this DStream and other DStream.
cogroup(JavaPairDStream<K, W>, Partitioner) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream: Return a new DStream by applying 'cogroup' between RDDs of this DStream and other DStream.
cogroup(DStream<Tuple2<K, W>>, ClassTag<W>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions: Return a new DStream by applying 'cogroup' between RDDs of this DStream and other DStream.
cogroup(DStream<Tuple2<K, W>>, int, ClassTag<W>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions: Return a new DStream by applying 'cogroup' between RDDs of this DStream and other DStream.
cogroup(DStream<Tuple2<K, W>>, Partitioner, ClassTag<W>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions: Return a new DStream by applying 'cogroup' between RDDs of this DStream and other DStream.
CoGroupedRDD<K> - Class in org.apache.spark.rdd: :: DeveloperApi :: An RDD that cogroups its parents.
CoGroupedRDD(Seq<RDD<? extends Product2<K, ?>>>, Partitioner) - Constructor for class org.apache.spark.rdd.CoGroupedRDD
collect() - Method in interface org.apache.spark.api.java.JavaRDDLike: Return an array that contains all of the elements in this RDD.
collect() - Method in class org.apache.spark.rdd.RDD: Return an array that contains all of the elements in this RDD.
collect(PartialFunction<T, U>, ClassTag) - Method in class org.apache.spark.rdd.RDD: Return an RDD that contains all matching values by applying f.
collectAsMap() - Method in class org.apache.spark.api.java.JavaPairRDD: Return the key-value pairs in this RDD to the master as a Map.
collectAsMap() - Method in class org.apache.spark.rdd.PairRDDFunctions: Return the key-value pairs in this RDD to the master as a Map.
collectAsync() - Method in class org.apache.spark.rdd.AsyncRDDActions: Returns a future for retrieving all elements of this RDD.
collectPartitions(int[]) - Method in interface org.apache.spark.api.java.JavaRDDLike: Return an array that contains all of the elements in a specific partition of this RDD.
columnPruningPred() - Method in class org.apache.spark.sql.parquet.ParquetTableScan
combineByKey(Function<V, C>, Function2<C, V, C>, Function2<C, C, C>, Partitioner) - Method in class org.apache.spark.api.java.JavaPairRDD: Generic function to combine the elements for each key using a custom set of aggregation functions.
combineByKey(Function<V, C>, Function2<C, V, C>, Function2<C, C, C>, int) - Method in class org.apache.spark.api.java.JavaPairRDD: Simplified version of combineByKey that hash-partitions the output RDD.
combineByKey(Function<V, C>, Function2<C, V, C>, Function2<C, C, C>) - Method in class org.apache.spark.api.java.JavaPairRDD: Simplified version of combineByKey that hash-partitions the resulting RDD using the existing partitioner/parallelism level.
combineByKey(Function1<V, C>, Function2<C, V, C>, Function2<C, C, C>, Partitioner, boolean, Serializer) - Method in class org.apache.spark.rdd.PairRDDFunctions: Generic function to combine the elements for each key using a custom set of aggregation functions.
combineByKey(Function1<V, C>, Function2<C, V, C>, Function2<C, C, C>, int) - Method in class org.apache.spark.rdd.PairRDDFunctions: Simplified version of combineByKey that hash-partitions the output RDD.
combineByKey(Function1<V, C>, Function2<C, V, C>, Function2<C, C, C>) - Method in class org.apache.spark.rdd.PairRDDFunctions: Simplified version of combineByKey that hash-partitions the resulting RDD using the existing partitioner/parallelism level.
combineByKey(Function<V, C>, Function2<C, V, C>, Function2<C, C, C>, Partitioner) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream: Combine elements of each key in DStream's RDDs using custom function.
combineByKey(Function<V, C>, Function2<C, V, C>, Function2<C, C, C>, Partitioner, boolean) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream: Combine elements of each key in DStream's RDDs using custom function.
combineByKey(Function1<V, C>, Function2<C, V, C>, Function2<C, C, C>, Partitioner, boolean, ClassTag<C>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions: Combine elements of each key in DStream's RDDs using custom functions.
combineCombinersByKey(Iterator<Tuple2<K, C>>) - Method in class org.apache.spark.Aggregator
combineCombinersByKey(Iterator<Tuple2<K, C>>, TaskContext) - Method in class org.apache.spark.Aggregator
combineValuesByKey(Iterator<Product2<K, V>>) - Method in class org.apache.spark.Aggregator
combineValuesByKey(Iterator<Product2<K, V>>, TaskContext) - Method in class org.apache.spark.Aggregator
commands() - Method in class org.apache.spark.sql.hive.test.TestHiveContext.TestTable
compare(RDDInfo) - Method in class org.apache.spark.storage.RDDInfo
completed() - Method in class org.apache.spark.TaskContext
completedStages() - Method in class org.apache.spark.ui.jobs.JobProgressListener
completionTime() - Method in class org.apache.spark.scheduler.StageInfo: Time when all tasks in the stage completed or when the stage was cancelled.
ComplexFutureAction<T> - Class in org.apache.spark: :: Experimental :: A FutureAction for actions that could trigger multiple Spark jobs.
ComplexFutureAction() - Constructor for class org.apache.spark.ComplexFutureAction
compressedInputStream(InputStream) - Method in interface org.apache.spark.io.CompressionCodec
compressedInputStream(InputStream) - Method in class org.apache.spark.io.LZFCompressionCodec
compressedInputStream(InputStream) - Method in class org.apache.spark.io.SnappyCompressionCodec
compressedOutputStream(OutputStream) - Method in interface org.apache.spark.io.CompressionCodec
compressedOutputStream(OutputStream) - Method in class org.apache.spark.io.LZFCompressionCodec
compressedOutputStream(OutputStream) - Method in class org.apache.spark.io.SnappyCompressionCodec
CompressionCodec - Interface in org.apache.spark.io: :: DeveloperApi :: CompressionCodec allows customizing which compression implementation is used in block storage.
compute(Vector, double, Vector) - Method in class org.apache.spark.mllib.optimization.Gradient: Compute the gradient and loss given the features of a single data point.
compute(Vector, double, Vector, Vector) - Method in class org.apache.spark.mllib.optimization.Gradient: Compute the gradient and loss given the features of a single data point, add the gradient to a provided vector to avoid creating new objects, and return loss.
compute(Vector, double, Vector) - Method in class org.apache.spark.mllib.optimization.HingeGradient
compute(Vector, double, Vector, Vector) - Method in class org.apache.spark.mllib.optimization.HingeGradient
compute(Vector, Vector, double, int, double) - Method in class org.apache.spark.mllib.optimization.L1Updater
compute(Vector, double, Vector) - Method in class org.apache.spark.mllib.optimization.LeastSquaresGradient
compute(Vector, double, Vector, Vector) - Method in class org.apache.spark.mllib.optimization.LeastSquaresGradient
compute(Vector, double, Vector) - Method in class org.apache.spark.mllib.optimization.LogisticGradient
compute(Vector, double, Vector, Vector) - Method in class org.apache.spark.mllib.optimization.LogisticGradient
compute(Vector, Vector, double, int, double) - Method in class org.apache.spark.mllib.optimization.SimpleUpdater
compute(Vector, Vector, double, int, double) - Method in class org.apache.spark.mllib.optimization.SquaredL2Updater
compute(Vector, Vector, double, int, double) - Method in class org.apache.spark.mllib.optimization.Updater: Compute an updated value for weights given the gradient, stepSize, iteration number and regularization parameter.
compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.CoGroupedRDD
compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.HadoopRDD
compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.JdbcRDD
compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.NewHadoopRDD
compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.PartitionPruningRDD
compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.RDD: :: DeveloperApi :: Implemented by subclasses to compute a given partition.
compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.ShuffledRDD
compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.UnionRDD
compute(Partition, TaskContext) - Method in class org.apache.spark.sql.SchemaRDD
compute(Time) - Method in class org.apache.spark.streaming.api.java.JavaDStream: Generate an RDD for the given duration
compute(Time) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream: Method that generates an RDD for the given Duration
compute(Time) - Method in class org.apache.spark.streaming.dstream.ConstantInputDStream
compute(Time) - Method in class org.apache.spark.streaming.dstream.DStream: Method that generates an RDD for the given time
compute(Time) - Method in class org.apache.spark.streaming.dstream.ReceiverInputDStream: Asks ReceiverInputTracker for received data blocks and generates RDDs with them.
computeColumnSummaryStatistics() - Method in class org.apache.spark.mllib.linalg.distributed.RowMatrix: Computes column-wise summary statistics.
computeCost(RDD<Vector>) - Method in class org.apache.spark.mllib.clustering.KMeansModel: Return the K-means cost (sum of squared distances of points to their nearest center) for this model on the given data.
computeCovariance() - Method in class org.apache.spark.mllib.linalg.distributed.RowMatrix: Computes the covariance matrix, treating each row as an observation.
computeGramianMatrix() - Method in class org.apache.spark.mllib.linalg.distributed.IndexedRowMatrix: Computes the Gramian matrix A^T A.
computeGramianMatrix() - Method in class org.apache.spark.mllib.linalg.distributed.RowMatrix: Computes the Gramian matrix A^T A.
computePreferredLocations(Seq<InputFormatInfo>) - Static method in class org.apache.spark.scheduler.InputFormatInfo: Computes the preferred locations based on input(s) and returns a location-to-block map.
computePrincipalComponents(int) - Method in class org.apache.spark.mllib.linalg.distributed.RowMatrix: Computes the top k principal components.
computeSVD(int, boolean, double) - Method in class org.apache.spark.mllib.linalg.distributed.IndexedRowMatrix: Computes the singular value decomposition of this matrix.
computeSVD(int, boolean, double) - Method in class org.apache.spark.mllib.linalg.distributed.RowMatrix: Computes the singular value decomposition of this matrix.
condition() - Method in class org.apache.spark.sql.execution.BroadcastNestedLoopJoin
condition() - Method in class org.apache.spark.sql.execution.Filter
conf() - Method in class org.apache.spark.SparkContext
conf() - Method in class org.apache.spark.SparkEnv
conf() - Method in class org.apache.spark.streaming.StreamingContext
confidence() - Method in class org.apache.spark.partial.BoundedDouble
configuration() - Method in class org.apache.spark.scheduler.InputFormatInfo
connectionManager() - Method in class org.apache.spark.SparkEnv
ConstantInputDStream<T> - Class in org.apache.spark.streaming.dstream: An input stream that always returns the same RDD on each timestep.
ConstantInputDStream(StreamingContext, RDD<T>, ClassTag<T>) - Constructor for class org.apache.spark.streaming.dstream.ConstantInputDStream
contains(String) - Method in class org.apache.spark.SparkConf: Does the configuration contain a given parameter?
containsCachedMetadata(String) - Static method in class org.apache.spark.rdd.HadoopRDD
context() - Method in interface org.apache.spark.api.java.JavaRDDLike: The SparkContext that this RDD was created on.
context() - Method in class org.apache.spark.InterruptibleIterator
context() - Method in class org.apache.spark.rdd.RDD: The SparkContext that this RDD was created on.
context() - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike: Return the StreamingContext associated with this DStream
context() - Method in class org.apache.spark.streaming.dstream.DStream: Return the StreamingContext associated with this DStream
Continuous() - Static method in class org.apache.spark.mllib.tree.configuration.FeatureType
convertToCatalyst(Object) - Static method in class org.apache.spark.sql.execution.ExistingRdd
CoordinateMatrix - Class in org.apache.spark.mllib.linalg.distributed: :: Experimental :: Represents a matrix in coordinate format.
CoordinateMatrix(RDD<MatrixEntry>, long, long) - Constructor for class org.apache.spark.mllib.linalg.distributed.CoordinateMatrix
CoordinateMatrix(RDD<MatrixEntry>) - Constructor for class org.apache.spark.mllib.linalg.distributed.CoordinateMatrix: Alternative constructor leaving matrix dimensions to be determined automatically.
copy() - Method in class org.apache.spark.util.StatCounter: Clone this StatCounter
count() - Method in interface org.apache.spark.api.java.JavaRDDLike: Return the number of elements in the RDD.
count() - Method in interface org.apache.spark.mllib.stat.MultivariateStatisticalSummary: Sample size.
count() - Method in class org.apache.spark.rdd.RDD: Return the number of elements in the RDD.
count() - Method in class org.apache.spark.sql.SchemaRDD: :: Experimental :: Return the number of elements in the RDD.
count() - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike: Return a new DStream in which each RDD has a single element generated by counting each RDD of this DStream.
count() - Method in class org.apache.spark.streaming.dstream.DStream: Return a new DStream in which each RDD has a single element generated by counting each RDD of this DStream.
count() - Method in class org.apache.spark.util.StatCounter
countApprox(long, double) - Method in interface org.apache.spark.api.java.JavaRDDLike: :: Experimental :: Approximate version of count() that returns a potentially incomplete result within a timeout, even if not all tasks have finished.
countApprox(long) - Method in interface org.apache.spark.api.java.JavaRDDLike: :: Experimental :: Approximate version of count() that returns a potentially incomplete result within a timeout, even if not all tasks have finished.
countApprox(long, double) - Method in class org.apache.spark.rdd.RDD: :: Experimental :: Approximate version of count() that returns a potentially incomplete result within a timeout, even if not all tasks have finished.
countApproxDistinct(double) - Method in interface org.apache.spark.api.java.JavaRDDLike: Return approximate number of distinct elements in the RDD.
countApproxDistinct(double) - Method in class org.apache.spark.rdd.RDD: :: Experimental :: Return approximate number of distinct elements in the RDD.
countApproxDistinctByKey(double, Partitioner) - Method in class org.apache.spark.api.java.JavaPairRDD: Return approximate number of distinct values for each key in this RDD.
countApproxDistinctByKey(double) - Method in class org.apache.spark.api.java.JavaPairRDD: Return approximate number of distinct values for each key in this RDD.
countApproxDistinctByKey(double, int) - Method in class org.apache.spark.api.java.JavaPairRDD: Return approximate number of distinct values for each key in this RDD.
countApproxDistinctByKey(double, Partitioner) - Method in class org.apache.spark.rdd.PairRDDFunctions: Return approximate number of distinct values for each key in this RDD.
countApproxDistinctByKey(double, int) - Method in class org.apache.spark.rdd.PairRDDFunctions: Return approximate number of distinct values for each key in this RDD.
countApproxDistinctByKey(double) - Method in class org.apache.spark.rdd.PairRDDFunctions: Return approximate number of distinct values for each key in this RDD.
countAsync() - Method in class org.apache.spark.rdd.AsyncRDDActions: Returns a future for counting the number of elements in the RDD.
countByKey() - Method in class org.apache.spark.api.java.JavaPairRDD: Count the number of elements for each key, and return the result to the master as a Map.
countByKey() - Method in class org.apache.spark.rdd.PairRDDFunctions: Count the number of elements for each key, and return the result to the master as a Map.
countByKeyApprox(long) - Method in class org.apache.spark.api.java.JavaPairRDD: :: Experimental :: Approximate version of countByKey that can return a partial result if it does not finish within a timeout.
countByKeyApprox(long, double) - Method in class org.apache.spark.api.java.JavaPairRDD: :: Experimental :: Approximate version of countByKey that can return a partial result if it does not finish within a timeout.
countByKeyApprox(long, double) - Method in class org.apache.spark.rdd.PairRDDFunctions: :: Experimental :: Approximate version of countByKey that can return a partial result if it does not finish within a timeout.
countByValue() - Method in interface org.apache.spark.api.java.JavaRDDLike: Return the count of each unique value in this RDD as a map of (value, count) pairs.
countByValue(Ordering<T>) - Method in class org.apache.spark.rdd.RDD: Return the count of each unique value in this RDD as a map of (value, count) pairs.
countByValue() - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike: Return a new DStream in which each RDD contains the counts of each distinct value in each RDD of this DStream.
countByValue(int) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike: Return a new DStream in which each RDD contains the counts of each distinct value in each RDD of this DStream.
countByValue(int, Ordering<T>) - Method in class org.apache.spark.streaming.dstream.DStream: Return a new DStream in which each RDD contains the counts of each distinct value in each RDD of this DStream.
countByValueAndWindow(Duration, Duration) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike: Return a new DStream in which each RDD contains the count of distinct elements in RDDs in a sliding window over this DStream.
countByValueAndWindow(Duration, Duration, int) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike: Return a new DStream in which each RDD contains the count of distinct elements in RDDs in a sliding window over this DStream.
countByValueAndWindow(Duration, Duration, int, Ordering<T>) - Method in class org.apache.spark.streaming.dstream.DStream: Return a new DStream in which each RDD contains the count of distinct elements in RDDs in a sliding window over this DStream.
countByValueApprox(long, double) - Method in interface org.apache.spark.api.java.JavaRDDLike: :: Experimental :: Approximate version of countByValue().
countByValueApprox(long) - Method in interface org.apache.spark.api.java.JavaRDDLike: :: Experimental :: Approximate version of countByValue().
countByValueApprox(long, double, Ordering<T>) - Method in class org.apache.spark.rdd.RDD: :: Experimental :: Approximate version of countByValue().
countByWindow(Duration, Duration) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike: Return a new DStream in which each RDD has a single element generated by counting the number of elements in a window over this DStream.
countByWindow(Duration, Duration) - Method in class org.apache.spark.streaming.dstream.DStream: Return a new DStream in which each RDD has a single element generated by counting the number of elements in a sliding window over this DStream.
create(boolean, boolean, boolean, int) - Static method in class org.apache.spark.api.java.StorageLevels: Deprecated.
create(boolean, boolean, boolean, boolean, int) - Static method in class org.apache.spark.api.java.StorageLevels: Create a new StorageLevel object.
create(RDD<T>, Function1<Object, Object>) - Static method in class org.apache.spark.rdd.PartitionPruningRDD: Create a PartitionPruningRDD.
create() - Method in interface org.apache.spark.streaming.api.java.JavaStreamingContextFactory
createCodec(SparkConf) - Method in interface org.apache.spark.io.CompressionCodec
createCodec(SparkConf, String) - Method in interface org.apache.spark.io.CompressionCodec
createCombiner() - Method in class org.apache.spark.Aggregator
createParquetFile(Class<?>, String, boolean, Configuration) - Method in class org.apache.spark.sql.api.java.JavaSQLContext: :: Experimental :: Creates an empty parquet file with the schema of class beanClass, which can be registered as a table.
createParquetFile(String, boolean, Configuration, TypeTags.TypeTag<A>) - Method in class org.apache.spark.sql.SQLContext: :: Experimental :: Creates an empty parquet file with the schema of class A, which can be registered as a table.
createSchemaRDD(RDD<A>, TypeTags.TypeTag<A>) - Method in class org.apache.spark.sql.SQLContext: Creates a SchemaRDD from an RDD of case classes.
createStream(StreamingContext, String, int, StorageLevel) - Static method in class org.apache.spark.streaming.flume.FlumeUtils: Create an input stream from a Flume source.
createStream(JavaStreamingContext, String, int) - Static method in class org.apache.spark.streaming.flume.FlumeUtils: Creates an input stream from a Flume source.
createStream(JavaStreamingContext, String, int, StorageLevel) - Static method in class org.apache.spark.streaming.flume.FlumeUtils: Creates an input stream from a Flume source.
createStream(StreamingContext, String, String, Map<String, Object>, StorageLevel) - Static method in class org.apache.spark.streaming.kafka.KafkaUtils: Create an input stream that pulls messages from a Kafka Broker.
createStream(StreamingContext, Map<String, String>, Map<String, Object>, StorageLevel, ClassTag<K>, ClassTag<V>, Manifest, Manifest<T>) - Static method in class org.apache.spark.streaming.kafka.KafkaUtils: Create an input stream that pulls messages from a Kafka Broker.
createStream(JavaStreamingContext, String, String, Map<String, Integer>) - Static method in class org.apache.spark.streaming.kafka.KafkaUtils: Create an input stream that pulls messages from a Kafka Broker.
createStream(JavaStreamingContext, String, String, Map<String, Integer>, StorageLevel) - Static method in class org.apache.spark.streaming.kafka.KafkaUtils: Create an input stream that pulls messages from a Kafka Broker.
createStream(JavaStreamingContext, Class<K>, Class<V>, Class, Class<T>, Map<String, String>, Map<String, Integer>, StorageLevel) - Static method in class org.apache.spark.streaming.kafka.KafkaUtils: Create an input stream that pulls messages from a Kafka Broker.
createStream(StreamingContext, String, String, StorageLevel) - Static method in class org.apache.spark.streaming.mqtt.MQTTUtils: Create an input stream that receives messages pushed by a MQTT publisher.
createStream(JavaStreamingContext, String, String) - Static method in class org.apache.spark.streaming.mqtt.MQTTUtils: Create an input stream that receives messages pushed by a MQTT publisher.
createStream(JavaStreamingContext, String, String, StorageLevel) - Static method in class org.apache.spark.streaming.mqtt.MQTTUtils: Create an input stream that receives messages pushed by a MQTT publisher.
createStream(StreamingContext, Option<Authorization>, Seq<String>, StorageLevel) - Static method in class org.apache.spark.streaming.twitter.TwitterUtils: Create an input stream that returns tweets received from Twitter.
createStream(JavaStreamingContext) - Static method in class org.apache.spark.streaming.twitter.TwitterUtils: Create an input stream that returns tweets received from Twitter using Twitter4J's default OAuth authentication; this requires the system properties twitter4j.oauth.consumerKey, twitter4j.oauth.consumerSecret, twitter4j.oauth.accessToken and twitter4j.oauth.accessTokenSecret.
createStream(JavaStreamingContext, String[]) - Static method in class org.apache.spark.streaming.twitter.TwitterUtils: Create an input stream that returns tweets received from Twitter using Twitter4J's default OAuth authentication; this requires the system properties twitter4j.oauth.consumerKey, twitter4j.oauth.consumerSecret, twitter4j.oauth.accessToken and twitter4j.oauth.accessTokenSecret.
createStream(JavaStreamingContext, String[], StorageLevel) - Static method in class org.apache.spark.streaming.twitter.TwitterUtils: Create an input stream that returns tweets received from Twitter using Twitter4J's default OAuth authentication; this requires the system properties twitter4j.oauth.consumerKey, twitter4j.oauth.consumerSecret, twitter4j.oauth.accessToken and twitter4j.oauth.accessTokenSecret.
createStream(JavaStreamingContext, Authorization) - Static method in class org.apache.spark.streaming.twitter.TwitterUtils: Create an input stream that returns tweets received from Twitter.
createStream(JavaStreamingContext, Authorization, String[]) - Static method in class org.apache.spark.streaming.twitter.TwitterUtils: Create an input stream that returns tweets received from Twitter.
createStream(JavaStreamingContext, Authorization, String[], StorageLevel) - Static method in class org.apache.spark.streaming.twitter.TwitterUtils: Create an input stream that returns tweets received from Twitter.
createStream(StreamingContext, String, Subscribe, Function1<Seq<ByteString>, Iterator<T>>, StorageLevel, SupervisorStrategy, ClassTag<T>) - Static method in class org.apache.spark.streaming.zeromq.ZeroMQUtils: Create an input stream that receives messages pushed by a zeromq publisher.
createStream(JavaStreamingContext, String, Subscribe, Function<byte[][], Iterable<T>>, StorageLevel, SupervisorStrategy) - Static method in class org.apache.spark.streaming.zeromq.ZeroMQUtils: Create an input stream that receives messages pushed by a zeromq publisher.
createStream(JavaStreamingContext, String, Subscribe, Function<byte[][], Iterable<T>>, StorageLevel) - Static method in class org.apache.spark.streaming.zeromq.ZeroMQUtils: Create an input stream that receives messages pushed by a zeromq publisher.
createStream(JavaStreamingContext, String, Subscribe, Function<byte[][], Iterable<T>>) - Static method in class org.apache.spark.streaming.zeromq.ZeroMQUtils: Create an input stream that receives messages pushed by a zeromq publisher.
createTable(String, boolean, TypeTags.TypeTag<A>) - Method in class org.apache.spark.sql.hive.HiveContext: Creates a table using the schema of the given class.
creationSiteInfo() - Method in class org.apache.spark.rdd.RDD: User code that created this RDD (e.g.
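
The combineByKey entries above describe combining the elements for each key with a custom set of aggregation functions. As a rough illustration only, the following plain-Java sketch mimics that contract on in-memory lists standing in for partitions. It does not use Spark and is not Spark's implementation. The class CombineByKeySketch, the helpers entry and averageByKey, and the parameter names createCombiner, mergeValue and mergeCombiners are all assumptions made for this example.

```java
import java.util.AbstractMap;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BiFunction;
import java.util.function.BinaryOperator;
import java.util.function.Function;

// Illustrative only: a single-JVM model of the combineByKey contract.
final class CombineByKeySketch {

    // Hypothetical helper: builds an immutable key-value pair.
    static <K, V> Map.Entry<K, V> entry(K key, V value) {
        return new AbstractMap.SimpleImmutableEntry<>(key, value);
    }

    // Each inner list stands in for one partition of key-value pairs.
    static <K, V, C> Map<K, C> combineByKey(
            List<List<Map.Entry<K, V>>> partitions,
            Function<V, C> createCombiner,
            BiFunction<C, V, C> mergeValue,
            BinaryOperator<C> mergeCombiners) {
        Map<K, C> merged = new LinkedHashMap<>();
        for (List<Map.Entry<K, V>> partition : partitions) {
            // Step 1: fold the values of this partition into per-key combiners.
            Map<K, C> local = new LinkedHashMap<>();
            for (Map.Entry<K, V> kv : partition) {
                K key = kv.getKey();
                if (local.containsKey(key)) {
                    local.put(key, mergeValue.apply(local.get(key), kv.getValue()));
                } else {
                    local.put(key, createCombiner.apply(kv.getValue()));
                }
            }
            // Step 2: merge this partition's combiners into the overall result.
            for (Map.Entry<K, C> kc : local.entrySet()) {
                merged.merge(kc.getKey(), kc.getValue(), mergeCombiners);
            }
        }
        return merged;
    }

    // Example use: per-key average via a {sum, count} combiner.
    static Map<String, Double> averageByKey(
            List<List<Map.Entry<String, Integer>>> partitions) {
        Map<String, int[]> sumCount =
            CombineByKeySketch.<String, Integer, int[]>combineByKey(
                partitions,
                v -> new int[] {v, 1},
                (c, v) -> { c[0] += v; c[1] += 1; return c; },
                (c1, c2) -> new int[] {c1[0] + c2[0], c1[1] + c2[1]});
        Map<String, Double> averages = new LinkedHashMap<>();
        for (Map.Entry<String, int[]> e : sumCount.entrySet()) {
            averages.put(e.getKey(), (double) e.getValue()[0] / e.getValue()[1]);
        }
        return averages;
    }

    public static void main(String[] args) {
        List<Map.Entry<String, Integer>> p0 = Arrays.asList(entry("a", 1), entry("b", 4));
        List<Map.Entry<String, Integer>> p1 = Arrays.asList(entry("a", 3), entry("a", 5));
        System.out.println(averageByKey(Arrays.asList(p0, p1))); // prints {a=3.0, b=4.0}
    }
}
```

The combiner type C (here a two-element int array) differs from the value type V (Integer), which is the case the three-function form is meant to cover. The real methods additionally take a Partitioner or partition count, as listed in the entries above.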

D

dagScheduler() - Method in class org.apache.spark.SparkContext
DataValidators - Class in org.apache.spark.mllib.util: :: DeveloperApi :: A collection of methods used to validate data before applying ML algorithms.
DataValidators() - Constructor for class org.apache.spark.mllib.util.DataValidators
DecisionTree - Class in org.apache.spark.mllib.tree: :: Experimental :: A class that implements a decision tree algorithm for classification and regression.
DecisionTree(Strategy) - Constructor for class org.apache.spark.mllib.tree.DecisionTree
DecisionTreeModel - Class in org.apache.spark.mllib.tree.model: :: Experimental :: Model to store the decision tree parameters
DecisionTreeModel(Node, Enumeration.Value) - Constructor for class org.apache.spark.mllib.tree.model.DecisionTreeModel
DEFAULT_CLEANER_TTL() - Static method in class org.apache.spark.streaming.StreamingContext
DEFAULT_COMPRESSION_CODEC() - Method in interface org.apache.spark.io.CompressionCodec
DEFAULT_POOL_NAME() - Static method in class org.apache.spark.ui.jobs.JobProgressListener
DEFAULT_RETAINED_STAGES() - Static method in class org.apache.spark.ui.jobs.JobProgressListener
defaultMinPartitions() - Method in class org.apache.spark.api.java.JavaSparkContext: Default min number of partitions for Hadoop RDDs when not given by user
defaultMinPartitions() - Method in class org.apache.spark.SparkContext: Default min number of partitions for Hadoop RDDs when not given by user
defaultMinSplits() - Method in class org.apache.spark.api.java.JavaSparkContext: Deprecated. As of Spark 1.0.0, defaultMinSplits is deprecated; use JavaSparkContext.defaultMinPartitions() instead.
defaultMinSplits() - Method in class org.apache.spark.SparkContext: Default min number of partitions for Hadoop RDDs when not given by user
defaultParallelism() - Method in class org.apache.spark.api.java.JavaSparkContext: Default level of parallelism to use when not given by user (e.g.
defaultParallelism() - Method in class org.apache.spark.SparkContext: Default level of parallelism to use when not given by user (e.g.
defaultPartitioner(RDD<?>, Seq<RDD<?>>) - Static method in class org.apache.spark.Partitioner: Choose a partitioner to use for a cogroup-like operation between a number of RDDs.
defaultStrategy() - Static method in class org.apache.spark.streaming.receiver.ActorSupervisorStrategy
delegate() - Method in class org.apache.spark.InterruptibleIterator
dense(int, int, double[]) - Static method in class org.apache.spark.mllib.linalg.Matrices: Creates a column-major dense matrix.
dense(double, double...) - Static method in class org.apache.spark.mllib.linalg.Vectors: Creates a dense vector from its values.
dense(double, Seq<Object>) - Static method in class org.apache.spark.mllib.linalg.Vectors: Creates a dense vector from its values.
dense(double[]) - Static method in class org.apache.spark.mllib.linalg.Vectors: Creates a dense vector from a double array.
DenseMatrix - Class in org.apache.spark.mllib.linalg: Column-major dense matrix.
DenseMatrix(int, int, double[]) - Constructor for class org.apache.spark.mllib.linalg.DenseMatrix
DenseVector - Class in org.apache.spark.mllib.linalg: A dense vector represented by a value array.
DenseVector(double[]) - Constructor for class org.apache.spark.mllib.linalg.DenseVector
dependencies() - Method in class org.apache.spark.rdd.RDD: Get the list of dependencies of this RDD, taking into account whether the RDD is checkpointed or not.
dependencies() - Method in class org.apache.spark.streaming.dstream.DStream: List of parent DStreams on which this DStream depends
dependencies() - Method in class org.apache.spark.streaming.dstream.InputDStream
Dependency<T> - Class in org.apache.spark: :: DeveloperApi :: Base class for dependencies.
Dependency(RDD<T>) - Constructor for class org.apache.spark.Dependency
describedTable() - Method in class org.apache.spark.sql.hive.test.TestHiveContext
description() - Method in class org.apache.spark.ExceptionFailure
description() - Method in class org.apache.spark.storage.StorageLevel
DeserializationStream - Interface in org.apache.spark.serializer: :: DeveloperApi :: A stream for reading serialized objects.
deserialize(ByteBuffer, ClassTag<T>) - Method in interface org.apache.spark.serializer.SerializerInstance
deserialize(ByteBuffer, ClassLoader, ClassTag<T>) - Method in interface org.apache.spark.serializer.SerializerInstance
deserialized() - Method in class org.apache.spark.storage.StorageLevel
deserializeMany(ByteBuffer) - Method in interface org.apache.spark.serializer.SerializerInstance
deserializeStream(InputStream) - Method in interface org.apache.spark.serializer.SerializerInstance
DeveloperApi - Annotation Type in org.apache.spark.annotation: A lower-level, unstable API intended for developers.
DISK_ONLY - Static variable in class org.apache.spark.api.java.StorageLevels
DISK_ONLY() - Static method in class org.apache.spark.storage.StorageLevel
DISK_ONLY_2 - Static variable in class org.apache.spark.api.java.StorageLevels
DISK_ONLY_2() - Static method in class org.apache.spark.storage.StorageLevel
diskBytesSpilled() - Method in class org.apache.spark.ui.jobs.ExecutorSummary
diskSize() - Method in class org.apache.spark.storage.BlockStatus
diskSize() - Method in class org.apache.spark.storage.RDDInfo
diskUsed() - Method in class org.apache.spark.storage.StorageStatus
diskUsedByRDD(int) - Method in class org.apache.spark.storage.StorageStatus
dist(Vector) - Method in class org.apache.spark.util.Vector
distinct() - Method in class org.apache.spark.api.java.JavaDoubleRDD: Return a new RDD containing the distinct elements in this RDD.
distinct(int) - Method in class org.apache.spark.api.java.JavaDoubleRDD: Return a new RDD containing the distinct elements in this RDD.
distinct() - Method in class org.apache.spark.api.java.JavaPairRDD: Return a new RDD containing the distinct elements in this RDD.
distinct(int) - Method in class org.apache.spark.api.java.JavaPairRDD: Return a new RDD containing the distinct elements in this RDD.
distinct() - Method in class org.apache.spark.api.java.JavaRDD: Return a new RDD containing the distinct elements in this RDD.
distinct(int) - Method in class org.apache.spark.api.java.JavaRDD: Return a new RDD containing the distinct elements in this RDD.
distinct(int, Ordering<T>) - Method in class org.apache.spark.rdd.RDD: Return a new RDD containing the distinct elements in this RDD.
distinct() - Method in class org.apache.spark.rdd.RDD: Return a new RDD containing the distinct elements in this RDD.
distinct() - Method in class org.apache.spark.sql.api.java.JavaSchemaRDD: Return a new RDD containing the distinct elements in this RDD.
distinct(int) - Method in class org.apache.spark.sql.api.java.JavaSchemaRDD: Return a new RDD containing the distinct elements in this RDD.
distinct() - Method in class org.apache.spark.sql.SchemaRDD
distinct(int, Ordering<Row>) - Method in class org.apache.spark.sql.SchemaRDD
DistributedMatrix - Interface in org.apache.spark.mllib.linalg.distributed: Represents a distributively stored matrix backed by one or more RDDs.
divide(double) - Method in class org.apache.spark.util.Vector
dot(Vector) - Method in class org.apache.spark.util.Vector
doubleAccumulator(double) - Method in class org.apache.spark.api.java.JavaSparkContext: Create an Accumulator double variable, which tasks can "add" values to using the add method.
DoubleFlatMapFunction<T> - Interface in org.apache.spark.api.java.function: A function that returns zero or more records of type Double from each input record.
DoubleFunction<T> - Interface in org.apache.spark.api.java.function: A function that returns Doubles, and can be used to construct DoubleRDDs.
DoubleRDDFunctions - Class in org.apache.spark.rdd: Extra functions available on RDDs of Doubles through an implicit conversion.
DoubleRDDFunctions(RDD<Object>) - Constructor for class org.apache.spark.rdd.DoubleRDDFunctions
doubleRDDToDoubleRDDFunctions(RDD<Object>) - Static method in class org.apache.spark.SparkContext
doubleToDoubleWritable(double) - Static method in class org.apache.spark.SparkContext
doubleToMultiplier(double) - Static method in class org.apache.spark.util.Vector
doubleWritableConverter() - Static method in class org.apache.spark.SparkContext
dstream() - Method in class org.apache.spark.streaming.api.java.JavaDStream
dstream() - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
dstream() - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
DStream<T> - Class in org.apache.spark.streaming.dstream: A Discretized Stream (DStream), the basic abstraction in Spark Streaming, is a continuous sequence of RDDs (of the same type) representing a continuous stream of data (see org.apache.spark.rdd.RDD in the Spark core documentation for more details on RDDs).
DStream(StreamingContext, ClassTag<T>) - Constructor for class org.apache.spark.streaming.dstream.DStream
duration() - Method in class org.apache.spark.scheduler.TaskInfo
Duration - Class in org.apache.spark.streaming
Duration(long) - Constructor for class org.apache.spark.streaming.Duration
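
The Matrices.dense and DenseMatrix entries above describe a column-major layout: the values array holds the first column, then the second, and so on. The following is a minimal plain-Java sketch of that indexing rule. It does not use Spark; the class ColumnMajorSketch and its get method are hypothetical names introduced only for illustration.

```java
public class ColumnMajorSketch {
    // Hypothetical helper: returns entry (i, j) of a numRows x numCols matrix
    // whose values are stored column by column in a flat array.
    public static double get(int numRows, int numCols, double[] values, int i, int j) {
        if (values.length != numRows * numCols) {
            throw new IllegalArgumentException("values must have numRows * numCols entries");
        }
        if (i < 0 || i >= numRows || j < 0 || j >= numCols) {
            throw new IndexOutOfBoundsException("(" + i + ", " + j + ") is outside the matrix");
        }
        // Column j starts at offset numRows * j; row i is the i-th entry within it.
        return values[i + numRows * j];
    }

    public static void main(String[] args) {
        // The 3 x 2 matrix
        //   1.0  2.0
        //   3.0  4.0
        //   5.0  6.0
        // is stored as its first column followed by its second column.
        double[] values = {1.0, 3.0, 5.0, 2.0, 4.0, 6.0};
        System.out.println(get(3, 2, values, 0, 1)); // prints 2.0
        System.out.println(get(3, 2, values, 2, 0)); // prints 5.0
    }
}
```

Under this assumption, an array passed to dense(int, int, double[]) would be ordered the same way as values in the sketch.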

E

elements() - Method in class org.apache.spark.util.Vector
emittedTaskSizeWarning() - Method in class org.apache.spark.scheduler.StageInfo
emptyRDD(ClassTag<T>) - Method in class org.apache.spark.SparkContext: Get an RDD that has no partitions or elements.
entries() - Method in class org.apache.spark.mllib.linalg.distributed.CoordinateMatrix
Entropy - Class in org.apache.spark.mllib.tree.impurity: :: Experimental :: Class for calculating entropy during binary classification.
Entropy() - Constructor for class org.apache.spark.mllib.tree.impurity.Entropy
env() - Method in class org.apache.spark.api.java.JavaSparkContext
env() - Method in class org.apache.spark.SparkContext
env() - Method in class org.apache.spark.streaming.StreamingContext
environmentDetails() - Method in class org.apache.spark.scheduler.SparkListenerEnvironmentUpdate
EnvironmentListener - Class in org.apache.spark.ui.env: :: DeveloperApi :: A SparkListener that prepares information to be displayed on the EnvironmentTab
EnvironmentListener() - Constructor for class org.apache.spark.ui.env.EnvironmentListener
EPSILON() - Static method in class org.apache.spark.mllib.util.MLUtils
equals(Object) - Method in class org.apache.spark.HashPartitioner
equals(Object) - Method in interface org.apache.spark.mllib.linalg.Vector
equals(Object) - Method in class org.apache.spark.RangePartitioner
equals(Object) - Method in class org.apache.spark.scheduler.InputFormatInfo
equals(Object) - Method in class org.apache.spark.scheduler.SplitInfo
equals(Object) - Method in class org.apache.spark.storage.BlockId
equals(Object) - Method in class org.apache.spark.storage.BlockManagerId
equals(Object) - Method in class org.apache.spark.storage.StorageLevel
event() - Method in class org.apache.spark.streaming.flume.SparkFlumeEvent
eventLogger() - Method in class org.apache.spark.SparkContext
exception() - Method in class org.apache.spark.ui.jobs.TaskUIData
ExceptionFailure - Class in org.apache.spark
ExceptionFailure(String, String, StackTraceElement[], Option<TaskMetrics>) - Constructor for class org.apache.spark.ExceptionFailure
Exchange - Class in org.apache.spark.sql.execution: :: DeveloperApi ::
Exchange(Partitioning, SparkPlan) - Constructor for class org.apache.spark.sql.execution.Exchange
execute() - Method in class org.apache.spark.sql.execution.Aggregate
execute() - Method in class org.apache.spark.sql.execution.BroadcastNestedLoopJoin
execute() - Method in class org.apache.spark.sql.execution.CartesianProduct
execute() - Method in class org.apache.spark.sql.execution.Exchange
execute() - Method in class org.apache.spark.sql.execution.ExistingRdd
execute() - Method in class org.apache.spark.sql.execution.Filter
execute() - Method in class org.apache.spark.sql.execution.Generate
execute() - Method in class org.apache.spark.sql.execution.HashJoin
execute() - Method in class org.apache.spark.sql.execution.Limit
execute() - Method in class org.apache.spark.sql.execution.Project
execute() - Method in class org.apache.spark.sql.execution.Sample
execute() - Method in class org.apache.spark.sql.execution.Sort
execute() - Method in class org.apache.spark.sql.execution.SparkPlan: Runs this query returning the result as an RDD.
execute() - Method in class org.apache.spark.sql.execution.TakeOrdered
execute() - Method in class org.apache.spark.sql.execution.Union
execute() - Method in class org.apache.spark.sql.hive.execution.HiveTableScan
execute() - Method in class org.apache.spark.sql.hive.execution.InsertIntoHiveTable: Inserts all the rows in the table into Hive.
execute() - Method in class org.apache.spark.sql.hive.execution.ScriptTransformation
execute() - Method in class org.apache.spark.sql.parquet.InsertIntoParquetTable: Inserts all rows into the Parquet file.
execute() - Method in class org.apache.spark.sql.parquet.ParquetTableScan
executeCollect() - Method in class org.apache.spark.sql.execution.Limit
executeCollect() - Method in class org.apache.spark.sql.execution.SparkPlan: Runs this query returning the result as an array.
executeCollect() - Method in class org.apache.spark.sql.execution.TakeOrdered
executeOnCompleteCallbacks() - Method in class org.apache.spark.TaskContext
executePlan(LogicalPlan) - Method in class org.apache.spark.sql.hive.test.TestHiveContext
executor_() - Method in class org.apache.spark.streaming.receiver.Receiver: Handler object that runs the receiver.
executorEnvs() - Method in class org.apache.spark.SparkContext
executorId() - Method in class org.apache.spark.scheduler.TaskInfo
executorId() - Method in class org.apache.spark.SparkEnv
executorId() - Method in class org.apache.spark.storage.BlockManagerId
executorIdToBlockManagerId() - Method in class org.apache.spark.ui.jobs.JobProgressListener
ExecutorLostFailure - Class in org.apache.spark: :: DeveloperApi :: The task failed because the executor that it was running on was lost.
ExecutorLostFailure() - Constructor for class org.apache.spark.ExecutorLostFailure
executorMemory() - Method in class org.apache.spark.SparkContext
ExecutorsListener - Class in org.apache.spark.ui.exec: :: DeveloperApi :: A SparkListener that prepares information to be displayed on the ExecutorsTab
ExecutorsListener(StorageStatusListener) - Constructor for class org.apache.spark.ui.exec.ExecutorsListener
ExecutorSummary - Class in org.apache.spark.ui.jobs: :: DeveloperApi :: Class for reporting aggregated metrics for each executor in stage UI.
ExecutorSummary() - Constructor for class org.apache.spark.ui.jobs.ExecutorSummary
executorToDuration() - Method in class org.apache.spark.ui.exec.ExecutorsListener
executorToShuffleRead() - Method in class org.apache.spark.ui.exec.ExecutorsListener
executorToShuffleWrite() - Method in class org.apache.spark.ui.exec.ExecutorsListener
executorToTasksActive() - Method in class org.apache.spark.ui.exec.ExecutorsListener
executorToTasksComplete() - Method in class org.apache.spark.ui.exec.ExecutorsListener
executorToTasksFailed() - Method in class org.apache.spark.ui.exec.ExecutorsListener
ExistingRdd - Class in org.apache.spark.sql.execution: :: DeveloperApi ::
ExistingRdd(Seq<Attribute>, RDD<Row>) - Constructor for class org.apache.spark.sql.execution.ExistingRdd
Experimental - Annotation Type in org.apache.spark.annotation: An experimental user-facing API.
extractDistribution(Function1<BatchInfo, Option<Object>>) - Method in class org.apache.spark.streaming.scheduler.StatsReportListener
extractDoubleDistribution(Seq<Tuple2<TaskInfo, TaskMetrics>>, Function2<TaskInfo, TaskMetrics, Option<Object>>) - Static method in class org.apache.spark.scheduler.StatsReportListener
extractLongDistribution(Seq<Tuple2<TaskInfo, TaskMetrics>>, Function2<TaskInfo, TaskMetrics, Option<Object>>) - Static method in class org.apache.spark.scheduler.StatsReportListener

F

failed() - Method in class org.apache.spark.scheduler.TaskInfo
failedStages() - Method in class org.apache.spark.ui.jobs.JobProgressListener
failedTasks() - Method in class org.apache.spark.ui.jobs.ExecutorSummary
failureReason() - Method in class org.apache.spark.scheduler.StageInfo: If the stage failed, the reason why.
FAIR() - Static method in class org.apache.spark.scheduler.SchedulingMode
feature() - Method in class org.apache.spark.mllib.tree.model.Split
features() - Method in class org.apache.spark.mllib.regression.LabeledPoint
FeatureType - Class in org.apache.spark.mllib.tree.configuration: :: Experimental :: Enum to describe whether a feature is "continuous" or "categorical"
FeatureType() - Constructor for class org.apache.spark.mllib.tree.configuration.FeatureType
featureType() - Method in class org.apache.spark.mllib.tree.model.Split
FetchFailed - Class in org.apache.spark
FetchFailed(BlockManagerId, int, int, int) - Constructor for class org.apache.spark.FetchFailed
field() - Method in class org.apache.spark.storage.BroadcastBlockId
FIFO() - Static method in class org.apache.spark.scheduler.SchedulingMode
files() - Method in class org.apache.spark.SparkContext
fileStream(String) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext: Create an input stream that monitors a Hadoop-compatible filesystem for new files and reads them using the given key-value types and input format.
fileStream(String, ClassTag<K>, ClassTag<V>, ClassTag<F>) - Method in class org.apache.spark.streaming.StreamingContext: Create an input stream that monitors a Hadoop-compatible filesystem for new files and reads them using the given key-value types and input format.
fileStream(String, Function1<Path, Object>, boolean, ClassTag<K>, ClassTag<V>, ClassTag<F>) - Method in class org.apache.spark.streaming.StreamingContext: Create an input stream that monitors a Hadoop-compatible filesystem for new files and reads them using the given key-value types and input format.
filter(Function<Double, Boolean>) - Method in class org.apache.spark.api.java.JavaDoubleRDD: Return a new RDD containing only the elements that satisfy a predicate.
filter(Function<Tuple2<K, V>, Boolean>) - Method in class org.apache.spark.api.java.JavaPairRDD: Return a new RDD containing only the elements that satisfy a predicate.
filter(Function<T, Boolean>) - Method in class org.apache.spark.api.java.JavaRDD: Return a new RDD containing only the elements that satisfy a predicate.
filter(Function1<T, Object>) - Method in class org.apache.spark.rdd.RDD: Return a new RDD containing only the elements that satisfy a predicate.
filter(Function<Row, Boolean>) - Method in class org.apache.spark.sql.api.java.JavaSchemaRDD: Return a new RDD containing only the elements that satisfy a predicate.
Filter - Class in org.apache.spark.sql.execution: :: DeveloperApi ::
Filter(Expression, SparkPlan) - Constructor for class org.apache.spark.sql.execution.Filter
filter(Function1<Row, Object>) - Method in class org.apache.spark.sql.SchemaRDD
filter(Function<T, Boolean>) - Method in class org.apache.spark.streaming.api.java.JavaDStream: Return a new DStream containing only the elements that satisfy a predicate.
filter(Function<Tuple2<K, V>, Boolean>) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream: Return a new DStream containing only the elements that satisfy a predicate.
filter(Function1<T, Object>) - Method in class org.apache.spark.streaming.dstream.DStream: Return a new DStream containing only the elements that satisfy a predicate.
filterWith(Function1<Object, A>, Function2<T, A, Object>) - Method in class org.apache.spark.rdd.RDD: Filters this RDD with p, where p takes an additional parameter of type A.
finished() - Method in class org.apache.spark.scheduler.TaskInfo
finishTime() - Method in class org.apache.spark.scheduler.TaskInfo: The time when the task has completed successfully (including the time to remotely fetch results, if necessary).
first() - Method in class org.apache.spark.api.java.JavaDoubleRDD
first() - Method in class org.apache.spark.api.java.JavaPairRDD
first() - Method in interface org.apache.spark.api.java.JavaRDDLike: Return the first element in this RDD.
first() - Method in class org.apache.spark.rdd.RDD: Return the first element in this RDD.
flatMap(FlatMapFunction<T, U>) - Method in interface org.apache.spark.api.java.JavaRDDLike: Return a new RDD by first applying a function to all elements of this RDD, and then flattening the results.
flatMap(Function1<T, TraversableOnce>, ClassTag) - Method in class org.apache.spark.rdd.RDD: Return a new RDD by first applying a function to all elements of this RDD, and then flattening the results.
flatMap(FlatMapFunction<T, U>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike: Return a new DStream by applying a function to all elements of this DStream, and then flattening the results
flatMap(Function1<T, Traversable>, ClassTag) - Method in class org.apache.spark.streaming.dstream.DStream: Return a new DStream by applying a function to all elements of this DStream, and then flattening the results
FlatMapFunction<T,R> - Interface in org.apache.spark.api.java.function: A function that returns zero or more output records from each input record.
FlatMapFunction2<T1,T2,R> - Interface in org.apache.spark.api.java.function: A function that takes two inputs and returns zero or more output records.
flatMapToDouble(DoubleFlatMapFunction<T>) - Method in interface org.apache.spark.api.java.JavaRDDLike: Return a new RDD by first applying a function to all elements of this RDD, and then flattening the results.
flatMapToPair(PairFlatMapFunction<T, K2, V2>) - Method in interface org.apache.spark.api.java.JavaRDDLike: Return a new RDD by first applying a function to all elements of this RDD, and then flattening the results.
flatMapToPair(PairFlatMapFunction<T, K2, V2>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike: Return a new DStream by applying a function to all elements of this DStream, and then flattening the results
flatMapValues(Function<V, Iterable>) - Method in class org.apache.spark.api.java.JavaPairRDD: Pass each value in the key-value pair RDD through a flatMap function without changing the keys; this also retains the original RDD's partitioning.
flatMapValues(Function1<V, TraversableOnce>) - Method in class org.apache.spark.rdd.PairRDDFunctions: Pass each value in the key-value pair RDD through a flatMap function without changing the keys; this also retains the original RDD's partitioning.
flatMapValues(Function<V, Iterable>) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream: Return a new DStream by applying a flatmap function to the value of each key-value pair in 'this' DStream without changing the key.
flatMapValues(Function1<V, TraversableOnce>, ClassTag) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions: Return a new DStream by applying a flatmap function to the value of each key-value pair in 'this' DStream without changing the key.
flatMapWith(Function1<Object, A>, boolean, Function2<T, A, Seq>, ClassTag) - Method in class org.apache.spark.rdd.RDD: FlatMaps f over this RDD, where f takes an additional parameter of type A.
floatToFloatWritable(float) - Static method in class org.apache.spark.SparkContext
floatWritableConverter() - Static method in class org.apache.spark.SparkContext
floor(Duration) - Method in class org.apache.spark.streaming.Time
FlumeUtils - Class in org.apache.spark.streaming.flume
FlumeUtils() - Constructor for class org.apache.spark.streaming.flume.FlumeUtils
flush() - Method in interface org.apache.spark.serializer.SerializationStream
fMeasureByThreshold(double) - Method in class org.apache.spark.mllib.evaluation.BinaryClassificationMetrics: Returns the (threshold, F-Measure) curve.
fMeasureByThreshold() - Method in class org.apache.spark.mllib.evaluation.BinaryClassificationMetrics: Returns the (threshold, F-Measure) curve with beta = 1.0.
fold(T, Function2<T, T, T>) - Method in interface org.apache.spark.api.java.JavaRDDLike: Aggregate the elements of each partition, and then the results for all the partitions, using a given associative function and a neutral "zero value".
fold(T, Function2<T, T, T>) - Method in class org.apache.spark.rdd.RDD: Aggregate the elements of each partition, and then the results for all the partitions, using a given associative function and a neutral "zero value".
foldByKey(V, Partitioner, Function2<V, V, V>) - Method in class org.apache.spark.api.java.JavaPairRDD: Merge the values for each key using an associative function and a neutral "zero value" which may be added to the result an arbitrary number of times, and must not change the result (e.g., Nil for list concatenation, 0 for addition, or 1 for multiplication).
foldByKey(V, int, Function2<V, V, V>) - Method in class org.apache.spark.api.java.JavaPairRDD: Merge the values for each key using an associative function and a neutral "zero value" which may be added to the result an arbitrary number of times, and must not change the result (e.g., Nil for list concatenation, 0 for addition, or 1 for multiplication).
foldByKey(V, Function2<V, V, V>) - Method in class org.apache.spark.api.java.JavaPairRDD: Merge the values for each key using an associative function and a neutral "zero value" which may be added to the result an arbitrary number of times, and must not change the result (e.g., Nil for list concatenation, 0 for addition, or 1 for multiplication).
foldByKey(V, Partitioner, Function2<V, V, V>) - Method in class org.apache.spark.rdd.PairRDDFunctions: Merge the values for each key using an associative function and a neutral "zero value" which may be added to the result an arbitrary number of times, and must not change the result (e.g., Nil for list concatenation, 0 for addition, or 1 for multiplication).
foldByKey(V, int, Function2<V, V, V>) - Method in class org.apache.spark.rdd.PairRDDFunctions: Merge the values for each key using an associative function and a neutral "zero value" which may be added to the result an arbitrary number of times, and must not change the result (e.g., Nil for list concatenation, 0 for addition, or 1 for multiplication).
foldByKey(V, Function2<V, V, V>) - Method in class org.apache.spark.rdd.PairRDDFunctions: Merge the values for each key using an associative function and a neutral "zero value" which may be added to the result an arbitrary number of times, and must not change the result (e.g., Nil for list concatenation, 0 for addition, or 1 for multiplication).
foreach(VoidFunction<T>) - Method in interface org.apache.spark.api.java.JavaRDDLike: Applies a function f to all elements of this RDD.
foreach(Function1<T, BoxedUnit>) - Method in class org.apache.spark.rdd.RDD: Applies a function f to all elements of this RDD.
foreach(Function<R, Void>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike: Deprecated.
As of release 0.9.0, replaced by foreachRDD
foreach(Function2<R, Time, Void>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike: Deprecated.
As of release 0.9.0, replaced by foreachRDD
foreach(Function1<RDD<T>, BoxedUnit>) - Method in class org.apache.spark.streaming.dstream.DStream: Apply a function to each RDD in this DStream.
foreach(Function2<RDD<T>, Time, BoxedUnit>) - Method in class org.apache.spark.streaming.dstream.DStream: Apply a function to each RDD in this DStream.
foreachAsync(Function1<T, BoxedUnit>) - Method in class org.apache.spark.rdd.AsyncRDDActions: Applies a function f to all elements of this RDD.
foreachPartition(VoidFunction<Iterator<T>>) - Method in interface org.apache.spark.api.java.JavaRDDLike: Applies a function f to each partition of this RDD.
foreachPartition(Function1<Iterator<T>, BoxedUnit>) - Method in class org.apache.spark.rdd.RDD: Applies a function f to each partition of this RDD.
foreachPartitionAsync(Function1<Iterator<T>, BoxedUnit>) - Method in class org.apache.spark.rdd.AsyncRDDActions: Applies a function f to each partition of this RDD.
foreachRDD(Function<R, Void>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike: Apply a function to each RDD in this DStream.
foreachRDD(Function2<R, Time, Void>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike: Apply a function to each RDD in this DStream.
foreachRDD(Function1<RDD<T>, BoxedUnit>) - Method in class org.apache.spark.streaming.dstream.DStream: Apply a function to each RDD in this DStream.
foreachRDD(Function2<RDD<T>, Time, BoxedUnit>) - Method in class org.apache.spark.streaming.dstream.DStream: Apply a function to each RDD in this DStream.
foreachWith(Function1<Object, A>, Function2<T, A, BoxedUnit>) - Method in class org.apache.spark.rdd.RDD: Applies f to each element of this RDD, where f takes an additional parameter of type A.
formatExecutorId(String) - Method in class org.apache.spark.storage.StorageStatusListener: In the local mode, there is a discrepancy between the executor ID according to the task ("localhost") and that according to SparkEnv ("<driver>").
fraction() - Method in class org.apache.spark.sql.execution.Sample
fromAvroFlumeEvent(AvroFlumeEvent) - Static method in class org.apache.spark.streaming.flume.SparkFlumeEvent
fromDStream(DStream<T>, ClassTag<T>) - Static method in class org.apache.spark.streaming.api.java.JavaDStream: Convert a Scala DStream to a Java-friendly JavaDStream.
fromInputDStream(InputDStream<T>, ClassTag<T>) - Static method in class org.apache.spark.streaming.api.java.JavaInputDStream: Convert a Scala InputDStream to a Java-friendly JavaInputDStream.
fromInputDStream(InputDStream<Tuple2<K, V>>, ClassTag<K>, ClassTag<V>) - Static method in class org.apache.spark.streaming.api.java.JavaPairInputDStream: Convert a Scala InputDStream of pairs to a Java-friendly JavaPairInputDStream.
fromJavaDStream(JavaDStream<Tuple2<K, V>>) - Static method in class org.apache.spark.streaming.api.java.JavaPairDStream
fromJavaRDD(JavaRDD<Tuple2<K, V>>) - Static method in class org.apache.spark.api.java.JavaPairRDD: Convert a JavaRDD of key-value pairs to JavaPairRDD.
fromPairDStream(DStream<Tuple2<K, V>>, ClassTag<K>, ClassTag<V>) - Static method in class org.apache.spark.streaming.api.java.JavaPairDStream
fromProductRdd(RDD<A>, TypeTags.TypeTag<A>) - Static method in class org.apache.spark.sql.execution.ExistingRdd
fromRDD(RDD<Object>) - Static method in class org.apache.spark.api.java.JavaDoubleRDD
fromRDD(RDD<Tuple2<K, V>>, ClassTag<K>, ClassTag<V>) - Static method in class org.apache.spark.api.java.JavaPairRDD
fromRDD(RDD<T>, ClassTag<T>) - Static method in class org.apache.spark.api.java.JavaRDD
fromRdd(RDD<?>) - Static method in class org.apache.spark.storage.RDDInfo
fromReceiverInputDStream(ReceiverInputDStream<Tuple2<K, V>>, ClassTag<K>, ClassTag<V>) - Static method in class org.apache.spark.streaming.api.java.JavaPairReceiverInputDStream: Convert a Scala ReceiverInputDStream to a Java-friendly JavaReceiverInputDStream.
fromReceiverInputDStream(ReceiverInputDStream<T>, ClassTag<T>) - Static method in class org.apache.spark.streaming.api.java.JavaReceiverInputDStream: Convert a Scala ReceiverInputDStream to a Java-friendly JavaReceiverInputDStream.
fromSparkContext(SparkContext) - Static method in class org.apache.spark.api.java.JavaSparkContext
fromStage(Stage) - Static method in class org.apache.spark.scheduler.StageInfo: Construct a StageInfo from a Stage.
Function<T1,R> - Interface in org.apache.spark.api.java.function: Base interface for functions whose return types do not create special RDDs.
Function2<T1,T2,R> - Interface in org.apache.spark.api.java.function: A two-argument function that takes arguments of type T1 and T2 and returns an R.
Function3<T1,T2,T3,R> - Interface in org.apache.spark.api.java.function: A three-argument function that takes arguments of type T1, T2 and T3 and returns an R.
FutureAction<T> - Interface in org.apache.spark: :: Experimental :: A future for the result of an action to support cancellation.
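
The fold entries above describe a two-level aggregation: each partition is folded starting from the "zero value", and the per-partition results are then folded together, again starting from the zero value. The sketch below simulates that description in plain Java, with lists standing in for partitions. It is an illustration under those assumptions, not Spark's implementation; FoldSketch is a hypothetical name, and java.util.function.BinaryOperator (Java 8) is used in place of Spark's own Function2 interface.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.BinaryOperator;

public class FoldSketch {
    // Hypothetical helper: folds each simulated partition from the zero value,
    // then folds the per-partition results together, again from the zero value.
    public static <T> T fold(List<List<T>> partitions, T zero, BinaryOperator<T> op) {
        T result = zero;
        for (List<T> partition : partitions) {
            T partial = zero;
            for (T element : partition) {
                partial = op.apply(partial, element);
            }
            result = op.apply(result, partial);
        }
        return result;
    }

    public static void main(String[] args) {
        List<List<Integer>> partitions = Arrays.asList(
                Arrays.asList(1, 2),
                Arrays.asList(3),
                Arrays.asList(4, 5));

        // 0 is neutral for addition, so the result is the plain sum.
        System.out.println(fold(partitions, 0, Integer::sum)); // prints 15

        // 1 is not neutral for addition: it is added once per partition and
        // once more in the final merge, so the result depends on partitioning.
        System.out.println(fold(partitions, 1, Integer::sum)); // prints 19
    }
}
```

The second call shows why the zero value must not change the result: in this sketch a non-neutral value is applied once per partition plus once in the merge, so the answer would vary with the number of partitions.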

G

gain() - Method in class org.apache.spark.mllib.tree.model.InformationGainStats
GeneralizedLinearAlgorithm<M extends GeneralizedLinearModel> - Class in org.apache.spark.mllib.regression: :: DeveloperApi :: GeneralizedLinearAlgorithm implements methods to train a Generalized Linear Model (GLM).
GeneralizedLinearAlgorithm() - Constructor for class org.apache.spark.mllib.regression.GeneralizedLinearAlgorithm
GeneralizedLinearModel - Class in org.apache.spark.mllib.regression: :: DeveloperApi :: GeneralizedLinearModel (GLM) represents a model trained using GeneralizedLinearAlgorithm.
GeneralizedLinearModel(Vector, double) - Constructor for class org.apache.spark.mllib.regression.GeneralizedLinearModel
Generate - Class in org.apache.spark.sql.execution: :: DeveloperApi :: Applies a Generator to a stream of input rows, combining the output of each into a new stream of rows.
Generate(Generator, boolean, boolean, SparkPlan) - Constructor for class org.apache.spark.sql.execution.Generate
generate(Generator, boolean, boolean, Option<String>) - Method in class org.apache.spark.sql.SchemaRDD: :: Experimental :: Applies the given Generator, or table generating function, to this relation.
generatedRDDs() - Method in class org.apache.spark.streaming.dstream.DStream
generateKMeansRDD(SparkContext, int, int, int, double, int) - Static method in class org.apache.spark.mllib.util.KMeansDataGenerator: Generate an RDD containing test data for KMeans.
generateLinearInput(double, double[], int, int, double) - Static method in class org.apache.spark.mllib.util.LinearDataGenerator
generateLinearInputAsList(double, double[], int, int, double) - Static method in class org.apache.spark.mllib.util.LinearDataGenerator: Return a Java List of synthetic data randomly generated according to a multi collinear model.
generateLinearRDD(SparkContext, int, int, double, int, double) - Static method in class org.apache.spark.mllib.util.LinearDataGenerator: Generate an RDD containing sample data for Linear Regression models - including Ridge, Lasso, and unregularized variants.
generateLogisticRDD(SparkContext, int, int, double, int, double) - Static method in class org.apache.spark.mllib.util.LogisticRegressionDataGenerator: Generate an RDD containing test data for LogisticRegression.
generator() - Method in class org.apache.spark.sql.execution.Generate
get() - Method in interface org.apache.spark.FutureAction: Blocks and returns the result of this job.
get(String) - Method in class org.apache.spark.SparkConf: Get a parameter; throws a NoSuchElementException if it's not set
get(String, String) - Method in class org.apache.spark.SparkConf: Get a parameter, falling back to a default if not set
get() - Static method in class org.apache.spark.SparkEnv: Returns the ThreadLocal SparkEnv, if non-null.
get(String) - Static method in class org.apache.spark.SparkFiles: Get the absolute path of a file added through SparkContext.addFile().
get(int) - Method in class org.apache.spark.sql.api.java.Row: Returns the value of column i.
getAkkaConf() - Method in class org.apache.spark.SparkConf: Get all akka conf variables set on this SparkConf
getAll() - Method in class org.apache.spark.SparkConf: Get all parameters as a list of pairs
getAllPools() - Method in class org.apache.spark.SparkContext: :: DeveloperApi :: Return pools for fair scheduler
getBoolean(String, boolean) - Method in class org.apache.spark.SparkConf: Get a parameter as a boolean, falling back to a default if not set
getBoolean(int) - Method in class org.apache.spark.sql.api.java.Row: Returns the value of column i as a bool.
getByte(int) - Method in class org.apache.spark.sql.api.java.Row: Returns the value of column i as a byte.
getCachedBlockManagerId(BlockManagerId) - Static method in class org.apache.spark.storage.BlockManagerId
getCachedMetadata(String) - Static method in class org.apache.spark.rdd.HadoopRDD: Helper for accessing the local map, a property of the SparkEnv of the local process.
getCheckpointDir() - Method in class org.apache.spark.api.java.JavaSparkContext
getCheckpointDir() - Method in class org.apache.spark.SparkContext
getCheckpointFile() - Method in interface org.apache.spark.api.java.JavaRDDLike: Gets the name of the file to which this RDD was checkpointed
getCheckpointFile() - Method in class org.apache.spark.rdd.RDD: Gets the name of the file to which this RDD was checkpointed
getConf() - Method in class org.apache.spark.api.java.JavaSparkContext: Return a copy of this JavaSparkContext's configuration.
getConf() - Method in class org.apache.spark.rdd.HadoopRDD
getConf() - Method in class org.apache.spark.rdd.NewHadoopRDD
getConf() - Method in class org.apache.spark.SparkContext: Return a copy of this SparkContext's configuration.
getDependencies() - Method in class org.apache.spark.rdd.CoGroupedRDD
getDependencies() - Method in class org.apache.spark.rdd.ShuffledRDD
getDependencies() - Method in class org.apache.spark.rdd.UnionRDD
getDouble(String, double) - Method in class org.apache.spark.SparkConf: Get a parameter as a double, falling back to a default if not set
getDouble(int) - Method in class org.apache.spark.sql.api.java.Row: Returns the value of column i as a double.
getExecutorEnv() - Method in class org.apache.spark.SparkConf: Get all executor environment variables set on this SparkConf
getExecutorMemoryStatus() - Method in class org.apache.spark.SparkContext: Return a map from the slave to the max memory available for caching and the remaining memory available for caching.
getExecutorStorageStatus() - Method in class org.apache.spark.SparkContext: Return information about blocks stored in all of the slaves
getFinalValue() - Method in class org.apache.spark.partial.PartialResult: Blocking method to wait for and return the final value.
getFloat(int) - Method in class org.apache.spark.sql.api.java.Row: Returns the value of column i as a float.
getHiveFile(String) - Method in class org.apache.spark.sql.hive.test.TestHiveContext
getInt(String, int) - Method in class org.apache.spark.SparkConf: Get a parameter as an integer, falling back to a default if not set
getInt(int) - Method in class org.apache.spark.sql.api.java.Row: Returns the value of column i as an int.
getLocalProperty(String) - Method in class org.apache.spark.api.java.JavaSparkContext: Get a local property set in this thread, or null if it is missing.
getLocalProperty(String) - Method in class org.apache.spark.SparkContext: Get a local property set in this thread, or null if it is missing.
getLong(String, long) - Method in class org.apache.spark.SparkConf: Get a parameter as a long, falling back to a default if not set
getLong(int) - Method in class org.apache.spark.sql.api.java.Row: Returns the value of column i as a long.
getOption(String) - Method in class org.apache.spark.SparkConf: Get a parameter as an Option
getOrCreate(String, JavaStreamingContextFactory) - Static method in class org.apache.spark.streaming.api.java.JavaStreamingContext: Either recreate a StreamingContext from checkpoint data or create a new StreamingContext.
getOrCreate(String, Configuration, JavaStreamingContextFactory) - Static method in class org.apache.spark.streaming.api.java.JavaStreamingContext: Either recreate a StreamingContext from checkpoint data or create a new StreamingContext.
getOrCreate(String, Configuration, JavaStreamingContextFactory, boolean) - Static method in class org.apache.spark.streaming.api.java.JavaStreamingContext: Either recreate a StreamingContext from checkpoint data or create a new StreamingContext.
getOrCreate(String, Function0<StreamingContext>, Configuration, boolean) - Static method in class org.apache.spark.streaming.StreamingContext: Either recreate a StreamingContext from checkpoint data or create a new StreamingContext.
getParents(int) - Method in class org.apache.spark.NarrowDependency: Get the parent partitions for a child partition.
getParents(int) - Method in class org.apache.spark.OneToOneDependency
getParents(int) - Method in class org.apache.spark.RangeDependency
getPartition(Object) - Method in class org.apache.spark.HashPartitioner
getPartition(Object) - Method in class org.apache.spark.Partitioner
getPartition(Object) - Method in class org.apache.spark.RangePartitioner
getPartitions() - Method in class org.apache.spark.rdd.CoGroupedRDD
getPartitions() - Method in class org.apache.spark.rdd.HadoopRDD
getPartitions() - Method in class org.apache.spark.rdd.JdbcRDD
getPartitions() - Method in class org.apache.spark.rdd.NewHadoopRDD
getPartitions() - Method in class org.apache.spark.rdd.ShuffledRDD
getPartitions() - Method in class org.apache.spark.rdd.UnionRDD
getPartitions() - Method in class org.apache.spark.sql.SchemaRDD
getPersistentRDDs() - Method in class org.apache.spark.SparkContext: Returns an immutable map of RDDs that have marked themselves as persistent via cache() call.
getPoolForName(String) - Method in class org.apache.spark.SparkContext: :: DeveloperApi :: Return the pool associated with the given name, if one exists
getPreferredLocations(Partition) - Method in class org.apache.spark.rdd.HadoopRDD
getPreferredLocations(Partition) - Method in class org.apache.spark.rdd.NewHadoopRDD
getPreferredLocations(Partition) - Method in class org.apache.spark.rdd.UnionRDD
getRDDStorageInfo() - Method in class org.apache.spark.SparkContext: Return information about which RDDs are cached, whether they are in memory or on disk, how much space they take, etc.
getReceiver() - Method in class org.apache.spark.streaming.dstream.ReceiverInputDStream: Gets the receiver object that will be sent to the worker nodes to receive data.
getRootDirectory() - Static method in class org.apache.spark.SparkFiles: Get the root directory that contains files added through SparkContext.addFile().
getSchedulingMode() - Method in class org.apache.spark.SparkContext: Return current scheduling mode
getSerializer(Serializer) - Method in interface org.apache.spark.serializer.Serializer
getShort(int) - Method in class org.apache.spark.sql.api.java.Row: Returns the value of column i as a short.
getSparkHome() - Method in class org.apache.spark.api.java.JavaSparkContext: Get Spark's home location from either a value set through the constructor, or the spark.home Java property, or the SPARK_HOME environment variable (in that order of preference).
getStorageLevel() - Method in interface org.apache.spark.api.java.JavaRDDLike: Get the RDD's current storage level, or StorageLevel.NONE if none is set.
getStorageLevel() - Method in class org.apache.spark.rdd.RDD: Get the RDD's current storage level, or StorageLevel.NONE if none is set.
getString(int) - Method in class org.apache.spark.sql.api.java.Row: Returns the value of column i as a String.
getThreadLocal() - Static method in class org.apache.spark.SparkEnv: Returns the ThreadLocal SparkEnv.
gettingResult() - Method in class org.apache.spark.scheduler.TaskInfo
gettingResultTime() - Method in class org.apache.spark.scheduler.TaskInfo: The time when the task started remotely getting the result.
Gini - Class in org.apache.spark.mllib.tree.impurity: :: Experimental :: Class for calculating the Gini impurity during binary classification.
Gini() - Constructor for class org.apache.spark.mllib.tree.impurity.Gini
global() - Method in class org.apache.spark.sql.execution.Sort
glom() - Method in interface org.apache.spark.api.java.JavaRDDLike: Return an RDD created by coalescing all elements within each partition into an array.
glom() - Method in class org.apache.spark.rdd.RDD: Return an RDD created by coalescing all elements within each partition into an array.
glom() - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike: Return a new DStream in which each RDD is generated by applying glom() to each RDD of this DStream.
glom() - Method in class org.apache.spark.streaming.dstream.DStream: Return a new DStream in which each RDD is generated by applying glom() to each RDD of this DStream.
Gradient - Class in org.apache.spark.mllib.optimization: :: DeveloperApi :: Class used to compute the gradient for a loss function, given a single data point.
Gradient() - Constructor for class org.apache.spark.mllib.optimization.Gradient
GradientDescent - Class in org.apache.spark.mllib.optimization: Class used to solve an optimization problem using Gradient Descent.
graph() - Method in class org.apache.spark.streaming.dstream.DStream
graph() - Method in class org.apache.spark.streaming.StreamingContext
groupBy(Function<T, K>) - Method in interface org.apache.spark.api.java.JavaRDDLike: Return an RDD of grouped elements.
groupBy(Function<T, K>, int) - Method in interface org.apache.spark.api.java.JavaRDDLike: Return an RDD of grouped elements.
groupBy(Function1<T, K>, ClassTag<K>) - Method in class org.apache.spark.rdd.RDD: Return an RDD of grouped items.
groupBy(Function1<T, K>, int, ClassTag<K>) - Method in class org.apache.spark.rdd.RDD: Return an RDD of grouped elements.
groupBy(Function1<T, K>, Partitioner, ClassTag<K>, Ordering<K>) - Method in class org.apache.spark.rdd.RDD: Return an RDD of grouped items.
groupBy(Seq<Expression>, Seq<Expression>) - Method in class org.apache.spark.sql.SchemaRDD: Performs a grouping followed by an aggregation.
groupByKey(Partitioner) - Method in class org.apache.spark.api.java.JavaPairRDD: Group the values for each key in the RDD into a single sequence.
groupByKey(int) - Method in class org.apache.spark.api.java.JavaPairRDD: Group the values for each key in the RDD into a single sequence.
groupByKey() - Method in class org.apache.spark.api.java.JavaPairRDD: Group the values for each key in the RDD into a single sequence.
groupByKey(Partitioner) - Method in class org.apache.spark.rdd.PairRDDFunctions: Group the values for each key in the RDD into a single sequence.
groupByKey(int) - Method in class org.apache.spark.rdd.PairRDDFunctions: Group the values for each key in the RDD into a single sequence.
groupByKey() - Method in class org.apache.spark.rdd.PairRDDFunctions: Group the values for each key in the RDD into a single sequence.
groupByKey() - Method in class org.apache.spark.streaming.api.java.JavaPairDStream: Return a new DStream by applying groupByKey to each RDD.
groupByKey(int) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream: Return a new DStream by applying groupByKey to each RDD.
groupByKey(Partitioner) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream: Return a new DStream by applying groupByKey on each RDD of this DStream.
groupByKey() - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions: Return a new DStream by applying groupByKey to each RDD.
groupByKey(int) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions: Return a new DStream by applying groupByKey to each RDD.
groupByKey(Partitioner) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions: Return a new DStream by applying groupByKey on each RDD.
groupByKeyAndWindow(Duration) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream: Return a new DStream by applying groupByKey over a sliding window.
groupByKeyAndWindow(Duration, Duration) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream: Return a new DStream by applying groupByKey over a sliding window.
groupByKeyAndWindow(Duration, Duration, int) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream: Return a new DStream by applying groupByKey over a sliding window on this DStream.
groupByKeyAndWindow(Duration, Duration, Partitioner) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream: Return a new DStream by applying groupByKey over a sliding window on this DStream.
groupByKeyAndWindow(Duration) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions: Return a new DStream by applying groupByKey over a sliding window.
groupByKeyAndWindow(Duration, Duration) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions: Return a new DStream by applying groupByKey over a sliding window.
groupByKeyAndWindow(Duration, Duration, int) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions: Return a new DStream by applying groupByKey over a sliding window on this DStream.
groupByKeyAndWindow(Duration, Duration, Partitioner) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions: Create a new DStream by applying groupByKey over a sliding window on this DStream.
groupingExpressions() - Method in class org.apache.spark.sql.execution.Aggregate
groupWith(JavaPairRDD<K, W>) - Method in class org.apache.spark.api.java.JavaPairRDD: Alias for cogroup.
groupWith(JavaPairRDD<K, W1>, JavaPairRDD<K, W2>) - Method in class org.apache.spark.api.java.JavaPairRDD: Alias for cogroup.
groupWith(RDD<Tuple2<K, W>>) - Method in class org.apache.spark.rdd.PairRDDFunctions: Alias for cogroup.
groupWith(RDD<Tuple2<K, W1>>, RDD<Tuple2<K, W2>>) - Method in class org.apache.spark.rdd.PairRDDFunctions: Alias for cogroup.
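
The groupByKey entries above all describe the same idea: the values for each key are gathered into a single sequence. A minimal plain-Java sketch of that grouping follows. It runs on an in-memory list in one JVM, with no partitioning or shuffle, and the class and method names (GroupByKeySketch, groupByKey) are hypothetical stand-ins rather than Spark API.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of Spark: mimics the grouping that
// groupByKey describes, on an in-memory list of key-value pairs.
final class GroupByKeySketch {
    private GroupByKeySketch() {}

    // Collects the values of each key into one list.
    // Keys keep first-seen order; values keep input order (a property of
    // this sketch only, not a claim about Spark).
    static <K, V> Map<K, List<V>> groupByKey(List<Map.Entry<K, V>> pairs) {
        Map<K, List<V>> grouped = new LinkedHashMap<>();
        for (Map.Entry<K, V> pair : pairs) {
            grouped.computeIfAbsent(pair.getKey(), k -> new ArrayList<>())
                   .add(pair.getValue());
        }
        return grouped;
    }
}
```

Given the pairs (a, 1), (b, 2), (a, 3), the sketch returns a map with a -> [1, 3] and b -> [2].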

H

hadoopConfiguration() - Method in class org.apache.spark.api.java.JavaSparkContext: Returns the Hadoop configuration used for the Hadoop code (e.g.
hadoopConfiguration() - Method in class org.apache.spark.SparkContext: A default Hadoop Configuration for the Hadoop code (e.g.
hadoopFile(String, Class<F>, Class<K>, Class<V>, int) - Method in class org.apache.spark.api.java.JavaSparkContext: Get an RDD for a Hadoop file with an arbitrary InputFormat.
hadoopFile(String, Class<F>, Class<K>, Class<V>) - Method in class org.apache.spark.api.java.JavaSparkContext: Get an RDD for a Hadoop file with an arbitrary InputFormat
hadoopFile(String, Class<? extends InputFormat<K, V>>, Class<K>, Class<V>, int) - Method in class org.apache.spark.SparkContext: Get an RDD for a Hadoop file with an arbitrary InputFormat
hadoopFile(String, int, ClassTag<K>, ClassTag<V>, ClassTag<F>) - Method in class org.apache.spark.SparkContext: Smarter version of hadoopFile() that uses class tags to figure out the classes of keys, values and the InputFormat so that users don't need to pass them directly.
hadoopFile(String, ClassTag<K>, ClassTag<V>, ClassTag<F>) - Method in class org.apache.spark.SparkContext: Smarter version of hadoopFile() that uses class tags to figure out the classes of keys, values and the InputFormat so that users don't need to pass them directly.
hadoopJobMetadata() - Method in class org.apache.spark.SparkEnv
hadoopRDD(JobConf, Class<F>, Class<K>, Class<V>, int) - Method in class org.apache.spark.api.java.JavaSparkContext: Get an RDD for a Hadoop-readable dataset from a Hadoop JobConf given its InputFormat and any other necessary info (e.g.
hadoopRDD(JobConf, Class<F>, Class<K>, Class<V>) - Method in class org.apache.spark.api.java.JavaSparkContext: Get an RDD for a Hadoop-readable dataset from a Hadoop JobConf given its InputFormat and any other necessary info (e.g.
HadoopRDD<K,V> - Class in org.apache.spark.rdd: :: DeveloperApi :: An RDD that provides core functionality for reading data stored in Hadoop (e.g., files in HDFS, sources in HBase, or S3), using the older MapReduce API (org.apache.hadoop.mapred).
HadoopRDD(SparkContext, Broadcast<SerializableWritable<Configuration>>, Option<Function1<JobConf, BoxedUnit>>, Class<? extends InputFormat<K, V>>, Class<K>, Class<V>, int) - Constructor for class org.apache.spark.rdd.HadoopRDD
HadoopRDD(SparkContext, JobConf, Class<? extends InputFormat<K, V>>, Class<K>, Class<V>, int) - Constructor for class org.apache.spark.rdd.HadoopRDD
hadoopRDD(JobConf, Class<? extends InputFormat<K, V>>, Class<K>, Class<V>, int) - Method in class org.apache.spark.SparkContext: Get an RDD for a Hadoop-readable dataset from a Hadoop JobConf given its InputFormat and other necessary info (e.g.
hadoopReader() - Method in class org.apache.spark.sql.hive.execution.HiveTableScan
hashCode() - Method in interface org.apache.spark.mllib.linalg.Vector
hashCode() - Method in interface org.apache.spark.Partition
hashCode() - Method in class org.apache.spark.scheduler.InputFormatInfo
hashCode() - Method in class org.apache.spark.scheduler.SplitInfo
hashCode() - Method in class org.apache.spark.storage.BlockId
hashCode() - Method in class org.apache.spark.storage.BlockManagerId
hashCode() - Method in class org.apache.spark.storage.StorageLevel
HashJoin - Class in org.apache.spark.sql.execution: :: DeveloperApi ::
HashJoin(Seq<Expression>, Seq<Expression>, BuildSide, SparkPlan, SparkPlan) - Constructor for class org.apache.spark.sql.execution.HashJoin
HashPartitioner - Class in org.apache.spark: A Partitioner that implements hash-based partitioning using Java's Object.hashCode.
HashPartitioner(int) - Constructor for class org.apache.spark.HashPartitioner
hasNext() - Method in class org.apache.spark.InterruptibleIterator
high() - Method in class org.apache.spark.partial.BoundedDouble
HingeGradient - Class in org.apache.spark.mllib.optimization: :: DeveloperApi :: Compute gradient and loss for a Hinge loss function, as used in SVM binary classification.
HingeGradient() - Constructor for class org.apache.spark.mllib.optimization.HingeGradient
histogram(int) - Method in class org.apache.spark.api.java.JavaDoubleRDD: Compute a histogram of the data using bucketCount number of buckets evenly spaced between the minimum and maximum of the RDD.
histogram(double[]) - Method in class org.apache.spark.api.java.JavaDoubleRDD: Compute a histogram using the provided buckets.
histogram(Double[], boolean) - Method in class org.apache.spark.api.java.JavaDoubleRDD
histogram(int) - Method in class org.apache.spark.rdd.DoubleRDDFunctions: Compute a histogram of the data using bucketCount number of buckets evenly spaced between the minimum and maximum of the RDD.
histogram(double[], boolean) - Method in class org.apache.spark.rdd.DoubleRDDFunctions: Compute a histogram using the provided buckets.
HiveContext - Class in org.apache.spark.sql.hive: An instance of the Spark SQL execution engine that integrates with data stored in Hive.
HiveContext(SparkContext) - Constructor for class org.apache.spark.sql.hive.HiveContext
hiveDevHome() - Method in class org.apache.spark.sql.hive.test.TestHiveContext: The location of the hive source code.
hiveFilesTemp() - Method in class org.apache.spark.sql.hive.test.TestHiveContext
hiveHome() - Method in class org.apache.spark.sql.hive.test.TestHiveContext: The location of the compiled hive distribution
HiveMetastoreTypes - Class in org.apache.spark.sql.hive: :: DeveloperApi :: Provides conversions between Spark SQL data types and Hive Metastore types.
HiveMetastoreTypes() - Constructor for class org.apache.spark.sql.hive.HiveMetastoreTypes
hivePlanner() - Method in class org.apache.spark.sql.hive.HiveContext
hiveql(String) - Method in class org.apache.spark.sql.hive.HiveContext: Executes a query expressed in HiveQL using Spark, returning the result as a SchemaRDD.
hiveQTestUtilTables() - Method in class org.apache.spark.sql.hive.test.TestHiveContext
HiveTableScan - Class in org.apache.spark.sql.hive.execution: :: DeveloperApi :: The Hive table scan operator.
HiveTableScan(Seq<Attribute>, org.apache.spark.sql.hive.MetastoreRelation, Option<Expression>, HiveContext) - Constructor for class org.apache.spark.sql.hive.execution.HiveTableScan
host() - Method in class org.apache.spark.scheduler.TaskInfo
host() - Method in class org.apache.spark.storage.BlockManagerId
hostLocation() - Method in class org.apache.spark.scheduler.SplitInfo
hostPort() - Method in class org.apache.spark.storage.BlockManagerId
hours() - Static method in class org.apache.spark.scheduler.StatsReportListener
hql(String) - Method in class org.apache.spark.sql.hive.api.java.JavaHiveContext: Executes a query expressed in HiveQL, returning the result as a JavaSchemaRDD.
hql(String) - Method in class org.apache.spark.sql.hive.HiveContext: An alias for `hiveql`.
HttpBroadcastFactory - Class in org.apache.spark.broadcast: A BroadcastFactory implementation that uses an HTTP server as the broadcast mechanism.
HttpBroadcastFactory() - Constructor for class org.apache.spark.broadcast.HttpBroadcastFactory
httpFileServer() - Method in class org.apache.spark.SparkEnv
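
The HashPartitioner entry above mentions hash-based partitioning using Java's Object.hashCode. The sketch below shows one way such a mapping can be written in plain Java. It is an illustration, not Spark's source: the class name HashPartitionSketch and the method partitionFor are hypothetical, and sending null keys to partition 0 is an assumption made here.

```java
// Hypothetical stand-in, not Spark's class: maps a key to a partition
// index using Object.hashCode.
final class HashPartitionSketch {
    private HashPartitionSketch() {}

    // Returns a partition index in the range [0, numPartitions).
    static int partitionFor(Object key, int numPartitions) {
        if (numPartitions <= 0) {
            throw new IllegalArgumentException("numPartitions must be positive");
        }
        // Assumption of this sketch: null keys always land in partition 0.
        if (key == null) {
            return 0;
        }
        // Math.floorMod keeps the result non-negative even when
        // hashCode() is negative, unlike the % operator.
        return Math.floorMod(key.hashCode(), numPartitions);
    }
}
```

The non-negative modulus matters because hashCode() may return a negative int, and the plain % operator would then produce a negative index.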

I

i() - Method in class org.apache.spark.mllib.linalg.distributed.MatrixEntry
id() - Method in class org.apache.spark.Accumulable
id() - Method in interface org.apache.spark.api.java.JavaRDDLike: A unique ID for this RDD (within its SparkContext).
id() - Method in class org.apache.spark.broadcast.Broadcast
id() - Method in class org.apache.spark.mllib.tree.model.Node
id() - Method in class org.apache.spark.rdd.RDD: A unique ID for this RDD (within its SparkContext).
id() - Method in class org.apache.spark.storage.RDDInfo
id() - Method in class org.apache.spark.streaming.dstream.ReceiverInputDStream: This is a unique identifier for the network input stream.
impurity() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
Impurity - Interface in org.apache.spark.mllib.tree.impurity: :: Experimental :: Trait for calculating information gain.
impurity() - Method in class org.apache.spark.mllib.tree.model.InformationGainStats
index() - Method in class org.apache.spark.mllib.linalg.distributed.IndexedRow
index() - Method in interface org.apache.spark.Partition: Get the split's index within its parent RDD
index() - Method in class org.apache.spark.scheduler.TaskInfo
IndexedRow - Class in org.apache.spark.mllib.linalg.distributed: :: Experimental :: Represents a row of IndexedRowMatrix.
IndexedRow(long, Vector) - Constructor for class org.apache.spark.mllib.linalg.distributed.IndexedRow
IndexedRowMatrix - Class in org.apache.spark.mllib.linalg.distributed: :: Experimental :: Represents a row-oriented DistributedMatrix with indexed rows.
IndexedRowMatrix(RDD<IndexedRow>, long, int) - Constructor for class org.apache.spark.mllib.linalg.distributed.IndexedRowMatrix
IndexedRowMatrix(RDD<IndexedRow>) - Constructor for class org.apache.spark.mllib.linalg.distributed.IndexedRowMatrix: Alternative constructor leaving matrix dimensions to be determined automatically.
indices() - Method in class org.apache.spark.mllib.linalg.SparseVector
InformationGainStats - Class in org.apache.spark.mllib.tree.model: :: DeveloperApi :: Information gain statistics for each split
InformationGainStats(double, double, double, double, double) - Constructor for class org.apache.spark.mllib.tree.model.InformationGainStats
initialize(boolean, SparkConf, org.apache.spark.SecurityManager) - Method in interface org.apache.spark.broadcast.BroadcastFactory
initialize(boolean, SparkConf, org.apache.spark.SecurityManager) - Method in class org.apache.spark.broadcast.HttpBroadcastFactory
initialize(boolean, SparkConf, org.apache.spark.SecurityManager) - Method in class org.apache.spark.broadcast.TorrentBroadcastFactory
initialized() - Method in interface org.apache.spark.Logging
initializeIfNecessary() - Method in interface org.apache.spark.Logging
initializeLogging() - Method in interface org.apache.spark.Logging
initialValue() - Method in class org.apache.spark.partial.PartialResult
initLocalProperties() - Method in class org.apache.spark.SparkContext
initLock() - Method in interface org.apache.spark.Logging
input() - Method in class org.apache.spark.sql.hive.execution.ScriptTransformation
inputDStream() - Method in class org.apache.spark.streaming.api.java.JavaInputDStream
inputDStream() - Method in class org.apache.spark.streaming.api.java.JavaPairInputDStream
InputDStream<T> - Class in org.apache.spark.streaming.dstream: This is the abstract base class for all input streams.
InputDStream(StreamingContext, ClassTag<T>) - Constructor for class org.apache.spark.streaming.dstream.InputDStream
inputFormatClazz() - Method in class org.apache.spark.scheduler.InputFormatInfo
inputFormatClazz() - Method in class org.apache.spark.scheduler.SplitInfo
InputFormatInfo - Class in org.apache.spark.scheduler: :: DeveloperApi :: Parses and holds information about inputFormat (and files) specified as a parameter.
InputFormatInfo(Configuration, Class<?>, String) - Constructor for class org.apache.spark.scheduler.InputFormatInfo
inputRdd() - Method in class org.apache.spark.sql.hive.execution.HiveTableScan
inRepoTests() - Method in class org.apache.spark.sql.hive.test.TestHiveContext
InsertIntoHiveTable - Class in org.apache.spark.sql.hive.execution: :: DeveloperApi ::
InsertIntoHiveTable(org.apache.spark.sql.hive.MetastoreRelation, Map<String, Option<String>>, SparkPlan, boolean, HiveContext) - Constructor for class org.apache.spark.sql.hive.execution.InsertIntoHiveTable
InsertIntoParquetTable - Class in org.apache.spark.sql.parquet: Operator that acts as a sink for queries on RDDs and can be used to store the output inside a directory of Parquet files.
InsertIntoParquetTable(ParquetRelation, SparkPlan, boolean, SparkContext) - Constructor for class org.apache.spark.sql.parquet.InsertIntoParquetTable
intAccumulator(int) - Method in class org.apache.spark.api.java.JavaSparkContext: Create an Accumulator integer variable, which tasks can "add" values to using the add method.
intercept() - Method in class org.apache.spark.mllib.classification.LogisticRegressionModel
intercept() - Method in class org.apache.spark.mllib.classification.SVMModel
intercept() - Method in class org.apache.spark.mllib.regression.GeneralizedLinearModel
intercept() - Method in class org.apache.spark.mllib.regression.LassoModel
intercept() - Method in class org.apache.spark.mllib.regression.LinearRegressionModel
intercept() - Method in class org.apache.spark.mllib.regression.RidgeRegressionModel
interrupted() - Method in class org.apache.spark.TaskContext
InterruptibleIterator<T> - Class in org.apache.spark: :: DeveloperApi :: An iterator that wraps around an existing iterator to provide task killing functionality.
InterruptibleIterator(TaskContext, Iterator<T>) - Constructor for class org.apache.spark.InterruptibleIterator
intersection(JavaDoubleRDD) - Method in class org.apache.spark.api.java.JavaDoubleRDD: Return the intersection of this RDD and another one.
intersection(JavaPairRDD<K, V>) - Method in class org.apache.spark.api.java.JavaPairRDD: Return the intersection of this RDD and another one.
intersection(JavaRDD<T>) - Method in class org.apache.spark.api.java.JavaRDD: Return the intersection of this RDD and another one.
intersection(RDD<T>) - Method in class org.apache.spark.rdd.RDD: Return the intersection of this RDD and another one.
intersection(RDD<T>, Partitioner, Ordering<T>) - Method in class org.apache.spark.rdd.RDD: Return the intersection of this RDD and another one.
intersection(RDD<T>, int) - Method in class org.apache.spark.rdd.RDD: Return the intersection of this RDD and another one.
intersection(JavaSchemaRDD) - Method in class org.apache.spark.sql.api.java.JavaSchemaRDD: Return the intersection of this RDD and another one.
intersection(JavaSchemaRDD, Partitioner) - Method in class org.apache.spark.sql.api.java.JavaSchemaRDD: Return the intersection of this RDD and another one.
intersection(JavaSchemaRDD, int) - Method in class org.apache.spark.sql.api.java.JavaSchemaRDD: Return the intersection of this RDD and another one.
intersection(RDD<Row>) - Method in class org.apache.spark.sql.SchemaRDD
intersection(RDD<Row>, Partitioner, Ordering<Row>) - Method in class org.apache.spark.sql.SchemaRDD
intersection(RDD<Row>, int) - Method in class org.apache.spark.sql.SchemaRDD
intToIntWritable(int) - Static method in class org.apache.spark.SparkContext
intWritableConverter() - Static method in class org.apache.spark.SparkContext
isAllowed(Enumeration.Value, Enumeration.Value) - Static method in class org.apache.spark.scheduler.TaskLocality
isBroadcast() - Method in class org.apache.spark.storage.BlockId
isCheckpointed() - Method in interface org.apache.spark.api.java.JavaRDDLike: Return whether this RDD has been checkpointed or not
isCheckpointed() - Method in class org.apache.spark.rdd.RDD: Return whether this RDD has been checkpointed or not
isCheckpointPresent() - Method in class org.apache.spark.streaming.StreamingContext
isCompleted() - Method in class org.apache.spark.ComplexFutureAction
isCompleted() - Method in interface org.apache.spark.FutureAction: Returns whether the action has already been completed with a value or an exception.
isCompleted() - Method in class org.apache.spark.SimpleFutureAction
isInitialValueFinal() - Method in class org.apache.spark.partial.PartialResult
isLeaf() - Method in class org.apache.spark.mllib.tree.model.Node
isLocal() - Method in class org.apache.spark.api.java.JavaSparkContext
isLocal() - Method in class org.apache.spark.SparkContext
isMultipleOf(Duration) - Method in class org.apache.spark.streaming.Duration
isMultipleOf(Duration) - Method in class org.apache.spark.streaming.Time
isNullAt(int) - Method in class org.apache.spark.sql.api.java.Row: Returns true if value at column `i` is NULL.
isRDD() - Method in class org.apache.spark.storage.BlockId
isShuffle() - Method in class org.apache.spark.storage.BlockId
isStarted() - Method in class org.apache.spark.streaming.receiver.Receiver: Check if the receiver has started or not.
isStopped() - Method in class org.apache.spark.streaming.receiver.Receiver: Check if receiver has been marked for stopping.
isTraceEnabled() - Method in interface org.apache.spark.Logging
isValid() - Method in class org.apache.spark.storage.StorageLevel
isZero() - Method in class org.apache.spark.streaming.Duration
iterator(Partition, TaskContext) - Method in interface org.apache.spark.api.java.JavaRDDLike: Internal method to this RDD; will read from cache if applicable, or otherwise compute it.
iterator(Partition, TaskContext) - Method in class org.apache.spark.rdd.RDD: Internal method to this RDD; will read from cache if applicable, or otherwise compute it.
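
The intersection entries above return the elements common to two datasets. A minimal plain-Java sketch on in-memory lists is given here. The names IntersectionSketch and intersection are hypothetical, nothing is distributed, and the removal of duplicates from the result is an assumption of this sketch rather than something the index entries state.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper, not part of Spark: intersection of two in-memory lists.
final class IntersectionSketch {
    private IntersectionSketch() {}

    // Returns the elements present in both lists, each at most once,
    // in the order they first appear in `left`.
    static <T> List<T> intersection(List<T> left, List<T> right) {
        Set<T> common = new LinkedHashSet<>(left); // drops duplicates from left
        common.retainAll(new HashSet<>(right));    // keeps only shared elements
        return new ArrayList<>(common);
    }
}
```

For the inputs [1, 2, 2, 3] and [2, 3, 3, 4], the sketch returns [2, 3].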

J

j() - Method in class org.apache.spark.mllib.linalg.distributed.MatrixEntry
jarOfClass(Class<?>) - Static method in class org.apache.spark.api.java.JavaSparkContext: Find the JAR from which a given class was loaded, to make it easy for users to pass their JARs to SparkContext.
jarOfClass(Class<?>) - Static method in class org.apache.spark.SparkContext: Find the JAR from which a given class was loaded, to make it easy for users to pass their JARs to SparkContext.
jarOfClass(Class<?>) - Static method in class org.apache.spark.streaming.api.java.JavaStreamingContext: Find the JAR from which a given class was loaded, to make it easy for users to pass their JARs to StreamingContext.
jarOfClass(Class<?>) - Static method in class org.apache.spark.streaming.StreamingContext: Find the JAR from which a given class was loaded, to make it easy for users to pass their JARs to StreamingContext.
jarOfObject(Object) - Static method in class org.apache.spark.api.java.JavaSparkContext: Find the JAR that contains the class of a particular object, to make it easy for users to pass their JARs to SparkContext.
jarOfObject(Object) - Static method in class org.apache.spark.SparkContext: Find the JAR that contains the class of a particular object, to make it easy for users to pass their JARs to SparkContext.
jars() - Method in class org.apache.spark.api.java.JavaSparkContext
jars() - Method in class org.apache.spark.SparkContext
JavaDoubleRDD - Class in org.apache.spark.api.java
JavaDoubleRDD(RDD<Object>) - Constructor for class org.apache.spark.api.java.JavaDoubleRDD
JavaDStream<T> - Class in org.apache.spark.streaming.api.java: A Java-friendly interface to DStream, the basic abstraction in Spark Streaming that represents a continuous stream of data.
JavaDStream(DStream<T>, ClassTag<T>) - Constructor for class org.apache.spark.streaming.api.java.JavaDStream
JavaDStreamLike<T,This extends JavaDStreamLike<T,This,R>,R extends JavaRDDLike<T,R>> - Interface in org.apache.spark.streaming.api.java
JavaHiveContext - Class in org.apache.spark.sql.hive.api.java: The entry point for executing Spark SQL queries from a Java program.
JavaHiveContext(JavaSparkContext) - Constructor for class org.apache.spark.sql.hive.api.java.JavaHiveContext
JavaInputDStream<T> - Class in org.apache.spark.streaming.api.java: A Java-friendly interface to InputDStream.
JavaInputDStream(InputDStream<T>, ClassTag<T>) - Constructor for class org.apache.spark.streaming.api.java.JavaInputDStream
JavaPairDStream<K,V> - Class in org.apache.spark.streaming.api.java: A Java-friendly interface to a DStream of key-value pairs, which provides extra methods like reduceByKey and join.
JavaPairDStream(DStream<Tuple2<K, V>>, ClassTag<K>, ClassTag<V>) - Constructor for class org.apache.spark.streaming.api.java.JavaPairDStream
JavaPairInputDStream<K,V> - Class in org.apache.spark.streaming.api.java: A Java-friendly interface to InputDStream of key-value pairs.
JavaPairInputDStream(InputDStream<Tuple2<K, V>>, ClassTag<K>, ClassTag<V>) - Constructor for class org.apache.spark.streaming.api.java.JavaPairInputDStream
JavaPairRDD<K,V> - Class in org.apache.spark.api.java
JavaPairRDD(RDD<Tuple2<K, V>>, ClassTag<K>, ClassTag<V>) - Constructor for class org.apache.spark.api.java.JavaPairRDD
JavaPairReceiverInputDStream<K,V> - Class in org.apache.spark.streaming.api.java: A Java-friendly interface to ReceiverInputDStream, the abstract class for defining any input stream that receives data over the network.
JavaPairReceiverInputDStream(ReceiverInputDStream<Tuple2<K, V>>, ClassTag<K>, ClassTag<V>) - Constructor for class org.apache.spark.streaming.api.java.JavaPairReceiverInputDStream
JavaRDD<T> - Class in org.apache.spark.api.java
JavaRDD(RDD<T>, ClassTag<T>) - Constructor for class org.apache.spark.api.java.JavaRDD
JavaRDDLike<T,This extends JavaRDDLike<T,This>> - Interface in org.apache.spark.api.java
JavaReceiverInputDStream<T> - Class in org.apache.spark.streaming.api.java: A Java-friendly interface to ReceiverInputDStream, the abstract class for defining any input stream that receives data over the network.
JavaReceiverInputDStream(ReceiverInputDStream<T>, ClassTag<T>) - Constructor for class org.apache.spark.streaming.api.java.JavaReceiverInputDStream
JavaSchemaRDD - Class in org.apache.spark.sql.api.java: An RDD of Row objects that is returned as the result of a Spark SQL query.
JavaSchemaRDD(SQLContext, LogicalPlan) - Constructor for class org.apache.spark.sql.api.java.JavaSchemaRDD
JavaSerializer - Class in org.apache.spark.serializer: :: DeveloperApi :: A Spark serializer that uses Java's built-in serialization.
JavaSerializer(SparkConf) - Constructor for class org.apache.spark.serializer.JavaSerializer
JavaSparkContext - Class in org.apache.spark.api.java: A Java-friendly version of SparkContext that returns JavaRDDs and works with Java collections instead of Scala ones.
JavaSparkContext(SparkContext) - Constructor for class org.apache.spark.api.java.JavaSparkContext
JavaSparkContext() - Constructor for class org.apache.spark.api.java.JavaSparkContext: Create a JavaSparkContext that loads settings from system properties (for instance, when launching with ./bin/spark-submit).
JavaSparkContext(SparkConf) - Constructor for class org.apache.spark.api.java.JavaSparkContext
JavaSparkContext(String, String) - Constructor for class org.apache.spark.api.java.JavaSparkContext
JavaSparkContext(String, String, SparkConf) - Constructor for class org.apache.spark.api.java.JavaSparkContext
JavaSparkContext(String, String, String, String) - Constructor for class org.apache.spark.api.java.JavaSparkContext
JavaSparkContext(String, String, String, String[]) - Constructor for class org.apache.spark.api.java.JavaSparkContext
JavaSparkContext(String, String, String, String[], Map<String, String>) - Constructor for class org.apache.spark.api.java.JavaSparkContext
JavaSQLContext - Class in org.apache.spark.sql.api.java: The entry point for executing Spark SQL queries from a Java program.
JavaSQLContext(SQLContext) - Constructor for class org.apache.spark.sql.api.java.JavaSQLContext
JavaSQLContext(JavaSparkContext) - Constructor for class org.apache.spark.sql.api.java.JavaSQLContext
JavaStreamingContext - Class in org.apache.spark.streaming.api.java: A Java-friendly version of StreamingContext which is the main entry point for Spark Streaming functionality.
JavaStreamingContext(StreamingContext) - Constructor for class org.apache.spark.streaming.api.java.JavaStreamingContext
JavaStreamingContext(String, String, Duration) - Constructor for class org.apache.spark.streaming.api.java.JavaStreamingContext: Create a StreamingContext.
JavaStreamingContext(String, String, Duration, String, String) - Constructor for class org.apache.spark.streaming.api.java.JavaStreamingContext: Create a StreamingContext.
JavaStreamingContext(String, String, Duration, String, String[]) - Constructor for class org.apache.spark.streaming.api.java.JavaStreamingContext: Create a StreamingContext.
JavaStreamingContext(String, String, Duration, String, String[], Map<String, String>) - Constructor for class org.apache.spark.streaming.api.java.JavaStreamingContext: Create a StreamingContext.
JavaStreamingContext(JavaSparkContext, Duration) - Constructor for class org.apache.spark.streaming.api.java.JavaStreamingContext: Create a JavaStreamingContext using an existing JavaSparkContext.
JavaStreamingContext(SparkConf, Duration) - Constructor for class org.apache.spark.streaming.api.java.JavaStreamingContext: Create a JavaStreamingContext using a SparkConf configuration.
JavaStreamingContext(String) - Constructor for class org.apache.spark.streaming.api.java.JavaStreamingContext: Recreate a JavaStreamingContext from a checkpoint file.
JavaStreamingContext(String, Configuration) - Constructor for class org.apache.spark.streaming.api.java.JavaStreamingContext: Recreate a JavaStreamingContext from a checkpoint file.
JavaStreamingContextFactory - Interface in org.apache.spark.streaming.api.java: Factory interface for creating a new JavaStreamingContext
JdbcRDD<T> - Class in org.apache.spark.rdd: An RDD that executes an SQL query on a JDBC connection and reads results.
JdbcRDD(SparkContext, Function0<Connection>, String, long, long, int, Function1<ResultSet, T>, ClassTag<T>) - Constructor for class org.apache.spark.rdd.JdbcRDD
jobId() - Method in class org.apache.spark.scheduler.SparkListenerJobEnd
jobId() - Method in class org.apache.spark.scheduler.SparkListenerJobStart
JobLogger - Class in org.apache.spark.scheduler: :: DeveloperApi :: A logger class to record runtime information for jobs in Spark.
JobLogger(String, String) - Constructor for class org.apache.spark.scheduler.JobLogger
JobLogger() - Constructor for class org.apache.spark.scheduler.JobLogger
JobProgressListener - Class in org.apache.spark.ui.jobs: :: DeveloperApi :: Tracks task-level information to be displayed in the UI.
JobProgressListener(SparkConf) - Constructor for class org.apache.spark.ui.jobs.JobProgressListener
JobResult - Interface in org.apache.spark.scheduler: :: DeveloperApi :: A result of a job in the DAGScheduler.
jobResult() - Method in class org.apache.spark.scheduler.SparkListenerJobEnd
JobSucceeded - Class in org.apache.spark.scheduler
JobSucceeded() - Constructor for class org.apache.spark.scheduler.JobSucceeded
join(JavaPairRDD<K, W>, Partitioner) - Method in class org.apache.spark.api.java.JavaPairRDD: Return an RDD containing all pairs of elements with matching keys in this and other.
join(JavaPairRDD<K, W>) - Method in class org.apache.spark.api.java.JavaPairRDD: Return an RDD containing all pairs of elements with matching keys in this and other.
join(JavaPairRDD<K, W>, int) - Method in class org.apache.spark.api.java.JavaPairRDD: Return an RDD containing all pairs of elements with matching keys in this and other.
join(RDD<Tuple2<K, W>>, Partitioner) - Method in class org.apache.spark.rdd.PairRDDFunctions: Return an RDD containing all pairs of elements with matching keys in this and other.
join(RDD<Tuple2<K, W>>) - Method in class org.apache.spark.rdd.PairRDDFunctions: Return an RDD containing all pairs of elements with matching keys in this and other.
join(RDD<Tuple2<K, W>>, int) - Method in class org.apache.spark.rdd.PairRDDFunctions: Return an RDD containing all pairs of elements with matching keys in this and other.
join() - Method in class org.apache.spark.sql.execution.Generate
join(SchemaRDD, JoinType, Option<Expression>) - Method in class org.apache.spark.sql.SchemaRDD: Performs a relational join on two SchemaRDDs
join(JavaPairDStream<K, W>) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream: Return a new DStream by applying 'join' between RDDs of this DStream and other DStream.
join(JavaPairDStream<K, W>, int) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream: Return a new DStream by applying 'join' between RDDs of this DStream and other DStream.
join(JavaPairDStream<K, W>, Partitioner) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream: Return a new DStream by applying 'join' between RDDs of this DStream and other DStream.
join(DStream<Tuple2<K, W>>, ClassTag<W>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions: Return a new DStream by applying 'join' between RDDs of this DStream and other DStream.
join(DStream<Tuple2<K, W>>, int, ClassTag<W>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions: Return a new DStream by applying 'join' between RDDs of this DStream and other DStream.
join(DStream<Tuple2<K, W>>, Partitioner, ClassTag<W>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions: Return a new DStream by applying 'join' between RDDs of this DStream and other DStream.
joinType() - Method in class org.apache.spark.sql.execution.BroadcastNestedLoopJoin
jvmInformation() - Method in class org.apache.spark.ui.env.EnvironmentListener

K

k() - Method in class org.apache.spark.mllib.clustering.KMeansModel: Total number of clusters.
K_MEANS_PARALLEL() - Static method in class org.apache.spark.mllib.clustering.KMeans
KafkaUtils - Class in org.apache.spark.streaming.kafka
KafkaUtils() - Constructor for class org.apache.spark.streaming.kafka.KafkaUtils
kClassTag() - Method in class org.apache.spark.api.java.JavaPairRDD
kClassTag() - Method in class org.apache.spark.streaming.api.java.JavaPairInputDStream
kClassTag() - Method in class org.apache.spark.streaming.api.java.JavaPairReceiverInputDStream
keyBy(Function<T, K>) - Method in interface org.apache.spark.api.java.JavaRDDLike: Creates tuples of the elements in this RDD by applying f.
keyBy(Function1<T, K>) - Method in class org.apache.spark.rdd.RDD: Creates tuples of the elements in this RDD by applying f.
keys() - Method in class org.apache.spark.api.java.JavaPairRDD: Return an RDD with the keys of each tuple.
keys() - Method in class org.apache.spark.rdd.PairRDDFunctions: Return an RDD with the keys of each tuple.
kFold(RDD<T>, int, int, ClassTag<T>) - Static method in class org.apache.spark.mllib.util.MLUtils: :: Experimental :: Return a k-element array of pairs of RDDs, where the first element of each pair is the training data (the complement of the validation data) and the second element is the validation data, a unique 1/kth of the data.
kManifest() - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
KMeans - Class in org.apache.spark.mllib.clustering: K-means clustering with support for multiple parallel runs and a k-means++-like initialization mode (the k-means|| algorithm by Bahmani et al.).
KMeans() - Constructor for class org.apache.spark.mllib.clustering.KMeans: Constructs a KMeans instance with default parameters: {k: 2, maxIterations: 20, runs: 1, initializationMode: "k-means||", initializationSteps: 5, epsilon: 1e-4}.
KMeansDataGenerator - Class in org.apache.spark.mllib.util: :: DeveloperApi :: Generate test data for KMeans.
KMeansDataGenerator() - Constructor for class org.apache.spark.mllib.util.KMeansDataGenerator
KMeansModel - Class in org.apache.spark.mllib.clustering: A clustering model for K-means.
KryoRegistrator - Interface in org.apache.spark.serializer: Interface implemented by clients to register their classes with Kryo when using Kryo serialization.
KryoSerializer - Class in org.apache.spark.serializer: A Spark serializer that uses the Kryo serialization library.
KryoSerializer(SparkConf) - Constructor for class org.apache.spark.serializer.KryoSerializer

L

L1Updater - Class in org.apache.spark.mllib.optimization: :: DeveloperApi :: Updater for L1 regularized problems.
L1Updater() - Constructor for class org.apache.spark.mllib.optimization.L1Updater
label() - Method in class org.apache.spark.mllib.regression.LabeledPoint
LabeledPoint - Class in org.apache.spark.mllib.regression: Class that represents the features and labels of a data point.
LabeledPoint(double, Vector) - Constructor for class org.apache.spark.mllib.regression.LabeledPoint
labels() - Method in class org.apache.spark.mllib.classification.NaiveBayesModel
LassoModel - Class in org.apache.spark.mllib.regression: Regression model trained using Lasso.
LassoWithSGD - Class in org.apache.spark.mllib.regression: Train a regression model with L1-regularization using Stochastic Gradient Descent.
LassoWithSGD() - Constructor for class org.apache.spark.mllib.regression.LassoWithSGD: Construct a Lasso object with default parameters: {stepSize: 1.0, numIterations: 100, regParam: 1.0, miniBatchFraction: 1.0}.
lastError() - Method in class org.apache.spark.streaming.scheduler.ReceiverInfo
lastErrorMessage() - Method in class org.apache.spark.streaming.scheduler.ReceiverInfo
lastValidTime() - Method in class org.apache.spark.streaming.dstream.InputDStream
launchTime() - Method in class org.apache.spark.scheduler.TaskInfo
LBFGS - Class in org.apache.spark.mllib.optimization: :: DeveloperApi :: Class used to solve an optimization problem using Limited-memory BFGS.
LBFGS(Gradient, Updater) - Constructor for class org.apache.spark.mllib.optimization.LBFGS
LeastSquaresGradient - Class in org.apache.spark.mllib.optimization: :: DeveloperApi :: Compute gradient and loss for a least-squares loss function, as used in linear regression.
LeastSquaresGradient() - Constructor for class org.apache.spark.mllib.optimization.LeastSquaresGradient
left() - Method in class org.apache.spark.sql.execution.BroadcastNestedLoopJoin: The Streamed Relation
left() - Method in class org.apache.spark.sql.execution.CartesianProduct
left() - Method in class org.apache.spark.sql.execution.HashJoin
leftImpurity() - Method in class org.apache.spark.mllib.tree.model.InformationGainStats
leftKeys() - Method in class org.apache.spark.sql.execution.HashJoin
leftNode() - Method in class org.apache.spark.mllib.tree.model.Node
leftOuterJoin(JavaPairRDD<K, W>, Partitioner) - Method in class org.apache.spark.api.java.JavaPairRDD: Perform a left outer join of this and other.
leftOuterJoin(JavaPairRDD<K, W>) - Method in class org.apache.spark.api.java.JavaPairRDD: Perform a left outer join of this and other.
leftOuterJoin(JavaPairRDD<K, W>, int) - Method in class org.apache.spark.api.java.JavaPairRDD: Perform a left outer join of this and other.
leftOuterJoin(RDD<Tuple2<K, W>>, Partitioner) - Method in class org.apache.spark.rdd.PairRDDFunctions: Perform a left outer join of this and other.
leftOuterJoin(RDD<Tuple2<K, W>>) - Method in class org.apache.spark.rdd.PairRDDFunctions: Perform a left outer join of this and other.
leftOuterJoin(RDD<Tuple2<K, W>>, int) - Method in class org.apache.spark.rdd.PairRDDFunctions: Perform a left outer join of this and other.
leftOuterJoin(JavaPairDStream<K, W>) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream: Return a new DStream by applying 'left outer join' between RDDs of this DStream and other DStream.
leftOuterJoin(JavaPairDStream<K, W>, int) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream: Return a new DStream by applying 'left outer join' between RDDs of this DStream and other DStream.
leftOuterJoin(JavaPairDStream<K, W>, Partitioner) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream: Return a new DStream by applying 'left outer join' between RDDs of this DStream and other DStream.
leftOuterJoin(DStream<Tuple2<K, W>>, ClassTag<W>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions: Return a new DStream by applying 'left outer join' between RDDs of this DStream and other DStream.
leftOuterJoin(DStream<Tuple2<K, W>>, int, ClassTag<W>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions: Return a new DStream by applying 'left outer join' between RDDs of this DStream and other DStream.
leftOuterJoin(DStream<Tuple2<K, W>>, Partitioner, ClassTag<W>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions: Return a new DStream by applying 'left outer join' between RDDs of this DStream and other DStream.
length() - Method in class org.apache.spark.scheduler.SplitInfo
length() - Method in class org.apache.spark.sql.api.java.Row: Returns the number of columns present in this Row.
length() - Method in class org.apache.spark.util.Vector
Limit - Class in org.apache.spark.sql.execution: :: DeveloperApi :: Take the first limit elements.
Limit(int, SparkPlan, SparkContext) - Constructor for class org.apache.spark.sql.execution.Limit
limit() - Method in class org.apache.spark.sql.execution.Limit
limit() - Method in class org.apache.spark.sql.execution.TakeOrdered
limit(Expression) - Method in class org.apache.spark.sql.SchemaRDD: Limits the results by the given expressions.
LinearDataGenerator - Class in org.apache.spark.mllib.util: :: DeveloperApi :: Generate sample data used for Linear Data.
LinearDataGenerator() - Constructor for class org.apache.spark.mllib.util.LinearDataGenerator
LinearRegressionModel - Class in org.apache.spark.mllib.regression: Regression model trained using LinearRegression.
LinearRegressionWithSGD - Class in org.apache.spark.mllib.regression: Train a linear regression model with no regularization using Stochastic Gradient Descent.
LinearRegressionWithSGD() - Constructor for class org.apache.spark.mllib.regression.LinearRegressionWithSGD: Construct a LinearRegression object with default parameters: {stepSize: 1.0, numIterations: 100, miniBatchFraction: 1.0}.
listenerBus() - Method in class org.apache.spark.SparkContext
loadLabeledData(SparkContext, String) - Static method in class org.apache.spark.mllib.util.MLUtils: :: Experimental :: Load labeled data from a file.
loadLibSVMFile(SparkContext, String, boolean, int, int) - Static method in class org.apache.spark.mllib.util.MLUtils: Loads labeled data in the LIBSVM format into an RDD[LabeledPoint].
loadLibSVMFile(SparkContext, String, boolean, int) - Static method in class org.apache.spark.mllib.util.MLUtils: Loads labeled data in the LIBSVM format into an RDD[LabeledPoint], with the default number of partitions.
loadLibSVMFile(SparkContext, String, boolean) - Static method in class org.apache.spark.mllib.util.MLUtils: Loads labeled data in the LIBSVM format into an RDD[LabeledPoint], with the number of features determined automatically and the default number of partitions.
loadLibSVMFile(SparkContext, String) - Static method in class org.apache.spark.mllib.util.MLUtils: Loads binary labeled data in the LIBSVM format into an RDD[LabeledPoint], with number of features determined automatically and the default number of partitions.
loadTestTable(String) - Method in class org.apache.spark.sql.hive.test.TestHiveContext
LocalHiveContext - Class in org.apache.spark.sql.hive: Starts up an instance of Hive where metadata is stored locally.
LocalHiveContext(SparkContext) - Constructor for class org.apache.spark.sql.hive.LocalHiveContext
localValue() - Method in class org.apache.spark.Accumulable: Get the current value of this accumulator from within a task.
location() - Method in class org.apache.spark.streaming.scheduler.ReceiverInfo
log() - Method in interface org.apache.spark.Logging
log_() - Method in interface org.apache.spark.Logging
logDebug(Function0<String>) - Method in interface org.apache.spark.Logging
logDebug(Function0<String>, Throwable) - Method in interface org.apache.spark.Logging
logDirName() - Method in class org.apache.spark.scheduler.JobLogger
logError(Function0<String>) - Method in interface org.apache.spark.Logging
logError(Function0<String>, Throwable) - Method in interface org.apache.spark.Logging
Logging - Interface in org.apache.spark: :: DeveloperApi :: Utility trait for classes that want to log data.
logicalPlanToSparkQuery(LogicalPlan) - Method in class org.apache.spark.sql.SQLContext: :: DeveloperApi :: Allows catalyst LogicalPlans to be executed as a SchemaRDD.
logInfo(Function0<String>) - Method in interface org.apache.spark.Logging
logInfo(Function0<String>, Throwable) - Method in interface org.apache.spark.Logging
LogisticGradient - Class in org.apache.spark.mllib.optimization: :: DeveloperApi :: Compute gradient and loss for a logistic loss function, as used in binary classification.
LogisticGradient() - Constructor for class org.apache.spark.mllib.optimization.LogisticGradient
LogisticRegressionDataGenerator - Class in org.apache.spark.mllib.util: :: DeveloperApi :: Generate test data for LogisticRegression.
LogisticRegressionDataGenerator() - Constructor for class org.apache.spark.mllib.util.LogisticRegressionDataGenerator
LogisticRegressionModel - Class in org.apache.spark.mllib.classification: Classification model trained using Logistic Regression.
LogisticRegressionWithSGD - Class in org.apache.spark.mllib.classification: Train a classification model for Logistic Regression using Stochastic Gradient Descent.
LogisticRegressionWithSGD() - Constructor for class org.apache.spark.mllib.classification.LogisticRegressionWithSGD: Construct a LogisticRegression object with default parameters
logTrace(Function0<String>) - Method in interface org.apache.spark.Logging
logTrace(Function0<String>, Throwable) - Method in interface org.apache.spark.Logging
logWarning(Function0<String>) - Method in interface org.apache.spark.Logging
logWarning(Function0<String>, Throwable) - Method in interface org.apache.spark.Logging
longToLongWritable(long) - Static method in class org.apache.spark.SparkContext
longWritableConverter() - Static method in class org.apache.spark.SparkContext
lookup(K) - Method in class org.apache.spark.api.java.JavaPairRDD: Return the list of values in the RDD for key key.
lookup(K) - Method in class org.apache.spark.rdd.PairRDDFunctions: Return the list of values in the RDD for key key.
low() - Method in class org.apache.spark.partial.BoundedDouble
LZFCompressionCodec - Class in org.apache.spark.io: :: DeveloperApi :: LZF implementation of CompressionCodec.
LZFCompressionCodec(SparkConf) - Constructor for class org.apache.spark.io.LZFCompressionCodec

M

main(String[]) - Static method in class org.apache.spark.mllib.util.KMeansDataGenerator
main(String[]) - Static method in class org.apache.spark.mllib.util.LinearDataGenerator
main(String[]) - Static method in class org.apache.spark.mllib.util.LogisticRegressionDataGenerator
main(String[]) - Static method in class org.apache.spark.mllib.util.MFDataGenerator
main(String[]) - Static method in class org.apache.spark.mllib.util.SVMDataGenerator
makeRDD(Seq<T>, int, ClassTag<T>) - Method in class org.apache.spark.SparkContext: Distribute a local Scala collection to form an RDD.
makeRDD(Seq<Tuple2<T, Seq<String>>>, ClassTag<T>) - Method in class org.apache.spark.SparkContext: Distribute a local Scala collection to form an RDD, with one or more location preferences (hostnames of Spark nodes) for each object.
map(Function<T, R>) - Method in interface org.apache.spark.api.java.JavaRDDLike: Return a new RDD by applying a function to all elements of this RDD.
map(Function1<R, T>) - Method in class org.apache.spark.partial.PartialResult: Transform this PartialResult into a PartialResult of type T.
map(Function1<T, U>, ClassTag) - Method in class org.apache.spark.rdd.RDD: Return a new RDD by applying a function to all elements of this RDD.
map(Function<T, R>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike: Return a new DStream by applying a function to all elements of this DStream.
map(Function1<T, U>, ClassTag) - Method in class org.apache.spark.streaming.dstream.DStream: Return a new DStream by applying a function to all elements of this DStream.
mapId() - Method in class org.apache.spark.FetchFailed
mapId() - Method in class org.apache.spark.storage.ShuffleBlockId
mapOutputTracker() - Method in class org.apache.spark.SparkEnv
mapPartitions(FlatMapFunction<Iterator<T>, U>) - Method in interface org.apache.spark.api.java.JavaRDDLike: Return a new RDD by applying a function to each partition of this RDD.
mapPartitions(FlatMapFunction<Iterator<T>, U>, boolean) - Method in interface org.apache.spark.api.java.JavaRDDLike: Return a new RDD by applying a function to each partition of this RDD.
mapPartitions(Function1<Iterator<T>, Iterator>, boolean, ClassTag) - Method in class org.apache.spark.rdd.RDD: Return a new RDD by applying a function to each partition of this RDD.
mapPartitions(FlatMapFunction<Iterator<T>, U>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike: Return a new DStream in which each RDD is generated by applying mapPartitions() to each RDD of this DStream.
mapPartitions(Function1<Iterator<T>, Iterator>, boolean, ClassTag) - Method in class org.apache.spark.streaming.dstream.DStream: Return a new DStream in which each RDD is generated by applying mapPartitions() to each RDD of this DStream.
mapPartitionsToDouble(DoubleFlatMapFunction<Iterator<T>>) - Method in interface org.apache.spark.api.java.JavaRDDLike: Return a new RDD by applying a function to each partition of this RDD.
mapPartitionsToDouble(DoubleFlatMapFunction<Iterator<T>>, boolean) - Method in interface org.apache.spark.api.java.JavaRDDLike: Return a new RDD by applying a function to each partition of this RDD.
mapPartitionsToPair(PairFlatMapFunction<Iterator<T>, K2, V2>) - Method in interface org.apache.spark.api.java.JavaRDDLike: Return a new RDD by applying a function to each partition of this RDD.
mapPartitionsToPair(PairFlatMapFunction<Iterator<T>, K2, V2>, boolean) - Method in interface org.apache.spark.api.java.JavaRDDLike: Return a new RDD by applying a function to each partition of this RDD.
mapPartitionsToPair(PairFlatMapFunction<Iterator<T>, K2, V2>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike: Return a new DStream in which each RDD is generated by applying mapPartitions() to each RDD of this DStream.
mapPartitionsWithContext(Function2<TaskContext, Iterator<T>, Iterator>, boolean, ClassTag) - Method in class org.apache.spark.rdd.RDD: :: DeveloperApi :: Return a new RDD by applying a function to each partition of this RDD.
mapPartitionsWithIndex(Function2<Integer, Iterator<T>, Iterator<R>>, boolean) - Method in interface org.apache.spark.api.java.JavaRDDLike: Return a new RDD by applying a function to each partition of this RDD, while tracking the index of the original partition.
mapPartitionsWithIndex(Function2<Object, Iterator<T>, Iterator>, boolean, ClassTag) - Method in class org.apache.spark.rdd.RDD: Return a new RDD by applying a function to each partition of this RDD, while tracking the index of the original partition.
mapPartitionsWithSplit(Function2<Object, Iterator<T>, Iterator>, boolean, ClassTag) - Method in class org.apache.spark.rdd.RDD: Return a new RDD by applying a function to each partition of this RDD, while tracking the index of the original partition.
mapredInputFormat() - Method in class org.apache.spark.scheduler.InputFormatInfo
mapreduceInputFormat() - Method in class org.apache.spark.scheduler.InputFormatInfo
mapToDouble(DoubleFunction<T>) - Method in interface org.apache.spark.api.java.JavaRDDLike: Return a new RDD by applying a function to all elements of this RDD.
mapToPair(PairFunction<T, K2, V2>) - Method in interface org.apache.spark.api.java.JavaRDDLike: Return a new RDD by applying a function to all elements of this RDD.
mapToPair(PairFunction<T, K2, V2>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike: Return a new DStream by applying a function to all elements of this DStream.
mapValues(Function<V, U>) - Method in class org.apache.spark.api.java.JavaPairRDD: Pass each value in the key-value pair RDD through a map function without changing the keys; this also retains the original RDD's partitioning.
mapValues(Function1<V, U>) - Method in class org.apache.spark.rdd.PairRDDFunctions: Pass each value in the key-value pair RDD through a map function without changing the keys; this also retains the original RDD's partitioning.
mapValues(Function<V, U>) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream: Return a new DStream by applying a map function to the value of each key-value pair in 'this' DStream without changing the key.
mapValues(Function1<V, U>, ClassTag) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions: Return a new DStream by applying a map function to the value of each key-value pair in 'this' DStream without changing the key.
mapWith(Function1<Object, A>, boolean, Function2<T, A, U>, ClassTag) - Method in class org.apache.spark.rdd.RDD: Maps f over this RDD, where f takes an additional parameter of type A.
master() - Method in class org.apache.spark.api.java.JavaSparkContext
master() - Method in class org.apache.spark.SparkContext
Matrices - Class in org.apache.spark.mllib.linalg: Factory methods for Matrix.
Matrices() - Constructor for class org.apache.spark.mllib.linalg.Matrices
Matrix - Interface in org.apache.spark.mllib.linalg: Trait for a local matrix.
MatrixEntry - Class in org.apache.spark.mllib.linalg.distributed: :: Experimental :: Represents an entry in a distributed matrix.
MatrixEntry(long, long, double) - Constructor for class org.apache.spark.mllib.linalg.distributed.MatrixEntry
MatrixFactorizationModel - Class in org.apache.spark.mllib.recommendation: Model representing the result of matrix factorization.
max(Comparator<T>) - Method in interface org.apache.spark.api.java.JavaRDDLike: Returns the maximum element from this RDD as defined by the specified Comparator[T].
max() - Method in interface org.apache.spark.mllib.stat.MultivariateStatisticalSummary: Maximum value of each column.
max(Ordering<T>) - Method in class org.apache.spark.rdd.RDD: Returns the max of this RDD as defined by the implicit Ordering[T].
max(Duration) - Method in class org.apache.spark.streaming.Duration
max(Time) - Method in class org.apache.spark.streaming.Time
max() - Method in class org.apache.spark.util.StatCounter
maxBins() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
maxDepth() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
maxMem() - Method in class org.apache.spark.scheduler.SparkListenerBlockManagerAdded
maxMem() - Method in class org.apache.spark.storage.StorageStatus
maxMemoryInMB() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
mean() - Method in class org.apache.spark.api.java.JavaDoubleRDD: Compute the mean of this RDD's elements.
mean() - Method in interface org.apache.spark.mllib.stat.MultivariateStatisticalSummary: Sample mean vector.
mean() - Method in class org.apache.spark.partial.BoundedDouble
mean() - Method in class org.apache.spark.rdd.DoubleRDDFunctions: Compute the mean of this RDD's elements.
mean() - Method in class org.apache.spark.util.StatCounter
meanApprox(long, Double) - Method in class org.apache.spark.api.java.JavaDoubleRDD: Return the approximate mean of the elements in this RDD.
meanApprox(long) - Method in class org.apache.spark.api.java.JavaDoubleRDD: :: Experimental :: Approximate operation to return the mean within a timeout.
meanApprox(long, double) - Method in class org.apache.spark.rdd.DoubleRDDFunctions: :: Experimental :: Approximate operation to return the mean within a timeout.
MEMORY_AND_DISK - Static variable in class org.apache.spark.api.java.StorageLevels
MEMORY_AND_DISK() - Static method in class org.apache.spark.storage.StorageLevel
MEMORY_AND_DISK_2 - Static variable in class org.apache.spark.api.java.StorageLevels
MEMORY_AND_DISK_2() - Static method in class org.apache.spark.storage.StorageLevel
MEMORY_AND_DISK_SER - Static variable in class org.apache.spark.api.java.StorageLevels
MEMORY_AND_DISK_SER() - Static method in class org.apache.spark.storage.StorageLevel
MEMORY_AND_DISK_SER_2 - Static variable in class org.apache.spark.api.java.StorageLevels
MEMORY_AND_DISK_SER_2() - Static method in class org.apache.spark.storage.StorageLevel
MEMORY_ONLY - Static variable in class org.apache.spark.api.java.StorageLevels
MEMORY_ONLY() - Static method in class org.apache.spark.storage.StorageLevel
MEMORY_ONLY_2 - Static variable in class org.apache.spark.api.java.StorageLevels
MEMORY_ONLY_2() - Static method in class org.apache.spark.storage.StorageLevel
MEMORY_ONLY_SER - Static variable in class org.apache.spark.api.java.StorageLevels
MEMORY_ONLY_SER() - Static method in class org.apache.spark.storage.StorageLevel
MEMORY_ONLY_SER_2 - Static variable in class org.apache.spark.api.java.StorageLevels
MEMORY_ONLY_SER_2() - Static method in class org.apache.spark.storage.StorageLevel
memoryBytesSpilled() - Method in class org.apache.spark.ui.jobs.ExecutorSummary
memRemaining() - Method in class org.apache.spark.storage.StorageStatus
memSize() - Method in class org.apache.spark.storage.BlockStatus
memSize() - Method in class org.apache.spark.storage.RDDInfo
memUsed() - Method in class org.apache.spark.storage.StorageStatus
memUsedByRDD(int) - Method in class org.apache.spark.storage.StorageStatus
merge(R) - Method in class org.apache.spark.Accumulable: Merge two accumulable objects together
merge(double) - Method in class org.apache.spark.util.StatCounter: Add a value into this StatCounter, updating the internal statistics.
merge(TraversableOnce<Object>) - Method in class org.apache.spark.util.StatCounter: Add multiple values into this StatCounter, updating the internal statistics.
merge(StatCounter) - Method in class org.apache.spark.util.StatCounter: Merge another StatCounter into this one, adding up the internal statistics.
mergeCombiners() - Method in class org.apache.spark.Aggregator
mergeValue() - Method in class org.apache.spark.Aggregator
metadataCleaner() - Method in class org.apache.spark.SparkContext
metastorePath() - Method in class org.apache.spark.sql.hive.LocalHiveContext
metastorePath() - Method in class org.apache.spark.sql.hive.test.TestHiveContext
metrics() - Method in class org.apache.spark.ExceptionFailure
metricsSystem() - Method in class org.apache.spark.SparkEnv
MFDataGenerator - Class in org.apache.spark.mllib.util: :: DeveloperApi :: Generate RDD(s) containing data for Matrix Factorization.
MFDataGenerator() - Constructor for class org.apache.spark.mllib.util.MFDataGenerator
milliseconds() - Method in class org.apache.spark.streaming.Duration
Milliseconds - Class in org.apache.spark.streaming: Helper object that creates an instance of Duration representing a given number of milliseconds.
Milliseconds() - Constructor for class org.apache.spark.streaming.Milliseconds
milliseconds() - Method in class org.apache.spark.streaming.Time
millisToString(long) - Static method in class org.apache.spark.scheduler.StatsReportListener: Reformat a time interval in milliseconds to a prettier format for output
min(Comparator<T>) - Method in interface org.apache.spark.api.java.JavaRDDLike: Returns the minimum element from this RDD as defined by the specified Comparator[T].
min() - Method in interface org.apache.spark.mllib.stat.MultivariateStatisticalSummary: Minimum value of each column.
min(Ordering<T>) - Method in class org.apache.spark.rdd.RDD: Returns the min of this RDD as defined by the implicit Ordering[T].
min(Duration) - Method in class org.apache.spark.streaming.Duration
min(Time) - Method in class org.apache.spark.streaming.Time
min() - Method in class org.apache.spark.util.StatCounter
MinMax() - Static method in class org.apache.spark.mllib.tree.configuration.QuantileStrategy
minutes() - Static method in class org.apache.spark.scheduler.StatsReportListener
Minutes - Class in org.apache.spark.streaming: Helper object that creates an instance of Duration representing a given number of minutes.
Minutes() - Constructor for class org.apache.spark.streaming.Minutes
MLUtils - Class in org.apache.spark.mllib.util: Helper methods to load, save and pre-process data used in MLlib.
MLUtils() - Constructor for class org.apache.spark.mllib.util.MLUtils
MQTTUtils - Class in org.apache.spark.streaming.mqtt
MQTTUtils() - Constructor for class org.apache.spark.streaming.mqtt.MQTTUtils
multiply(Matrix) - Method in class org.apache.spark.mllib.linalg.distributed.IndexedRowMatrix: Multiply this matrix by a local matrix on the right.
multiply(Matrix) - Method in class org.apache.spark.mllib.linalg.distributed.RowMatrix: Multiply this matrix by a local matrix on the right.
multiply(double) - Method in class org.apache.spark.util.Vector
MultivariateStatisticalSummary - Interface in org.apache.spark.mllib.stat: Trait for multivariate statistical summary of a data matrix.
mustCheckpoint() - Method in class org.apache.spark.streaming.dstream.DStream
MutablePair<T1,T2> - Class in org.apache.spark.util: :: DeveloperApi :: A tuple of 2 elements.
MutablePair(T1, T2) - Constructor for class org.apache.spark.util.MutablePair
MutablePair() - Constructor for class org.apache.spark.util.MutablePair: No-arg constructor for serialization
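
The three merge entries listed for StatCounter above describe adding a single value, adding many values, and combining two counters. As a rough illustration of how such merging can work, here is a minimal sketch using the standard running-mean and pairwise-combination formulas. The class name RunningStats and all of its members are hypothetical stand-ins for this example; this is not Spark's StatCounter source, and it reports population variance by assumption.

```java
// Hypothetical illustration only; not the org.apache.spark.util.StatCounter implementation.
final class RunningStats {
    private long n = 0;          // number of values merged so far
    private double mu = 0.0;     // running mean
    private double m2 = 0.0;     // running sum of squared deviations from the mean
    private double max = Double.NEGATIVE_INFINITY;
    private double min = Double.POSITIVE_INFINITY;

    // Add one value, updating the internal statistics incrementally.
    RunningStats merge(double value) {
        double delta = value - mu;
        n += 1;
        mu += delta / n;
        m2 += delta * (value - mu);
        max = Math.max(max, value);
        min = Math.min(min, value);
        return this;
    }

    // Merge another instance into this one, adding up the internal statistics.
    RunningStats merge(RunningStats other) {
        if (other.n == 0) {
            return this;             // nothing to add
        }
        if (n == 0) {                // this side is empty: copy the other side
            n = other.n;
            mu = other.mu;
            m2 = other.m2;
            max = other.max;
            min = other.min;
            return this;
        }
        double delta = other.mu - mu;
        long total = n + other.n;
        // Pairwise combination of the two partial sums of squared deviations.
        m2 += other.m2 + delta * delta * n * other.n / total;
        mu += delta * other.n / total;
        n = total;
        max = Math.max(max, other.max);
        min = Math.min(min, other.min);
        return this;
    }

    long count() { return n; }
    double mean() { return mu; }
    double max() { return max; }
    double min() { return min; }

    // Population variance; NaN when no values have been merged.
    double variance() { return n == 0 ? Double.NaN : m2 / n; }
}
```

Because the combining step needs only each side's count, mean, and sum of squared deviations, partial results computed independently (for example, one per partition) can be merged in any order.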

N

NaiveBayes - Class in org.apache.spark.mllib.classification: Trains a Naive Bayes model given an RDD of (label, features) pairs.
NaiveBayes() - Constructor for class org.apache.spark.mllib.classification.NaiveBayes
NaiveBayesModel - Class in org.apache.spark.mllib.classification: Model for Naive Bayes Classifiers.
name() - Method in interface org.apache.spark.api.java.JavaRDDLike
name() - Method in class org.apache.spark.rdd.RDD: A friendly name for this RDD
name() - Method in class org.apache.spark.scheduler.StageInfo
name() - Method in class org.apache.spark.sql.hive.test.TestHiveContext.TestTable
name() - Method in class org.apache.spark.storage.BlockId: A globally unique identifier for this Block.
name() - Method in class org.apache.spark.storage.BroadcastBlockId
name() - Method in class org.apache.spark.storage.RDDBlockId
name() - Method in class org.apache.spark.storage.RDDInfo
name() - Method in class org.apache.spark.storage.ShuffleBlockId
name() - Method in class org.apache.spark.storage.StreamBlockId
name() - Method in class org.apache.spark.storage.TaskResultBlockId
name() - Method in class org.apache.spark.streaming.scheduler.ReceiverInfo
NarrowDependency<T> - Class in org.apache.spark: :: DeveloperApi :: Base class for dependencies where each partition of the parent RDD is used by at most one partition of the child RDD.
NarrowDependency(RDD<T>) - Constructor for class org.apache.spark.NarrowDependency
nettyPort() - Method in class org.apache.spark.storage.BlockManagerId
networkStream(Receiver<T>, ClassTag<T>) - Method in class org.apache.spark.streaming.StreamingContext: Create an input stream with any arbitrary user-implemented receiver.
newAPIHadoopFile(String, Class<F>, Class<K>, Class<V>, Configuration) - Method in class org.apache.spark.api.java.JavaSparkContext: Get an RDD for a given Hadoop file with an arbitrary new API InputFormat and extra configuration options to pass to the input format.
newAPIHadoopFile(String, ClassTag<K>, ClassTag<V>, ClassTag<F>) - Method in class org.apache.spark.SparkContext: Get an RDD for a Hadoop file with an arbitrary new API InputFormat.
newAPIHadoopFile(String, Class<F>, Class<K>, Class<V>, Configuration) - Method in class org.apache.spark.SparkContext: Get an RDD for a given Hadoop file with an arbitrary new API InputFormat and extra configuration options to pass to the input format.
newAPIHadoopRDD(Configuration, Class<F>, Class<K>, Class<V>) - Method in class org.apache.spark.api.java.JavaSparkContext: Get an RDD for a given Hadoop file with an arbitrary new API InputFormat and extra configuration options to pass to the input format.
newAPIHadoopRDD(Configuration, Class<F>, Class<K>, Class<V>) - Method in class org.apache.spark.SparkContext: Get an RDD for a given Hadoop file with an arbitrary new API InputFormat and extra configuration options to pass to the input format.
newBroadcast(T, boolean, long, ClassTag<T>) - Method in interface org.apache.spark.broadcast.BroadcastFactory
newBroadcast(T, boolean, long, ClassTag<T>) - Method in class org.apache.spark.broadcast.HttpBroadcastFactory
newBroadcast(T, boolean, long, ClassTag<T>) - Method in class org.apache.spark.broadcast.TorrentBroadcastFactory
NewHadoopRDD<K,V> - Class in org.apache.spark.rdd: :: DeveloperApi :: An RDD that provides core functionality for reading data stored in Hadoop (e.g., files in HDFS, sources in HBase, or S3), using the new MapReduce API (org.apache.hadoop.mapreduce).
NewHadoopRDD(SparkContext, Class<? extends InputFormat<K, V>>, Class<K>, Class<V>, Configuration) - Constructor for class org.apache.spark.rdd.NewHadoopRDD
newInstance() - Method in class org.apache.spark.serializer.JavaSerializer
newInstance() - Method in class org.apache.spark.serializer.KryoSerializer
newInstance() - Method in interface org.apache.spark.serializer.Serializer
newInstance() - Method in class org.apache.spark.sql.execution.SparkLogicalPlan
newKryo() - Method in class org.apache.spark.serializer.KryoSerializer
newKryoOutput() - Method in class org.apache.spark.serializer.KryoSerializer
newPartitioning() - Method in class org.apache.spark.sql.execution.Exchange
next() - Method in class org.apache.spark.InterruptibleIterator
Node - Class in org.apache.spark.mllib.tree.model: :: DeveloperApi :: Node in a decision tree
Node(int, double, boolean, Option<Split>, Option<Node>, Option<Node>, Option<InformationGainStats>) - Constructor for class org.apache.spark.mllib.tree.model.Node
NODE_LOCAL() - Static method in class org.apache.spark.scheduler.TaskLocality
NONE - Static variable in class org.apache.spark.api.java.StorageLevels
NONE() - Static method in class org.apache.spark.scheduler.SchedulingMode
NONE() - Static method in class org.apache.spark.storage.StorageLevel
numberOfHiccups() - Method in class org.apache.spark.streaming.receiver.Statistics
numberOfMsgs() - Method in class org.apache.spark.streaming.receiver.Statistics
numberOfWorkers() - Method in class org.apache.spark.streaming.receiver.Statistics
numCachedPartitions() - Method in class org.apache.spark.storage.RDDInfo
numCols() - Method in class org.apache.spark.mllib.linalg.DenseMatrix
numCols() - Method in class org.apache.spark.mllib.linalg.distributed.CoordinateMatrix: Gets or computes the number of columns.
numCols() - Method in interface org.apache.spark.mllib.linalg.distributed.DistributedMatrix: Gets or computes the number of columns.
numCols() - Method in class org.apache.spark.mllib.linalg.distributed.IndexedRowMatrix
numCols() - Method in class org.apache.spark.mllib.linalg.distributed.RowMatrix: Gets or computes the number of columns.
numCols() - Method in interface org.apache.spark.mllib.linalg.Matrix: Number of columns.
numericRDDToDoubleRDDFunctions(RDD<T>, Numeric<T>) - Static method in class org.apache.spark.SparkContext
numNonzeros() - Method in interface org.apache.spark.mllib.stat.MultivariateStatisticalSummary: Number of nonzero elements (including explicitly presented zero values) in each column.
numPartitions() - Method in class org.apache.spark.HashPartitioner
numPartitions() - Method in class org.apache.spark.Partitioner
numPartitions() - Method in class org.apache.spark.RangePartitioner
numPartitions() - Method in class org.apache.spark.storage.RDDInfo
numRows() - Method in class org.apache.spark.mllib.linalg.DenseMatrix
numRows() - Method in class org.apache.spark.mllib.linalg.distributed.CoordinateMatrix: Gets or computes the number of rows.
numRows() - Method in interface org.apache.spark.mllib.linalg.distributed.DistributedMatrix: Gets or computes the number of rows.
numRows() - Method in class org.apache.spark.mllib.linalg.distributed.IndexedRowMatrix
numRows() - Method in class org.apache.spark.mllib.linalg.distributed.RowMatrix: Gets or computes the number of rows.
numRows() - Method in interface org.apache.spark.mllib.linalg.Matrix: Number of rows.
numTasks() - Method in class org.apache.spark.scheduler.StageInfo

O

objectFile(String, int) - Method in class org.apache.spark.api.java.JavaSparkContext: Load an RDD saved as a SequenceFile containing serialized objects, with NullWritable keys and BytesWritable values that contain a serialized partition.
objectFile(String) - Method in class org.apache.spark.api.java.JavaSparkContext: Load an RDD saved as a SequenceFile containing serialized objects, with NullWritable keys and BytesWritable values that contain a serialized partition.
objectFile(String, int, ClassTag<T>) - Method in class org.apache.spark.SparkContext: Load an RDD saved as a SequenceFile containing serialized objects, with NullWritable keys and BytesWritable values that contain a serialized partition.
objectInspector() - Method in class org.apache.spark.sql.hive.execution.HiveTableScan: The Hive object inspector for this table, which can be used to extract values from the serialized row representation.
OFF_HEAP - Static variable in class org.apache.spark.api.java.StorageLevels
OFF_HEAP() - Static method in class org.apache.spark.storage.StorageLevel
onApplicationEnd(SparkListenerApplicationEnd) - Method in interface org.apache.spark.scheduler.SparkListener: Called when the application ends
onApplicationStart(SparkListenerApplicationStart) - Method in interface org.apache.spark.scheduler.SparkListener: Called when the application starts
onBatchCompleted(StreamingListenerBatchCompleted) - Method in class org.apache.spark.streaming.scheduler.StatsReportListener
onBatchCompleted(StreamingListenerBatchCompleted) - Method in interface org.apache.spark.streaming.scheduler.StreamingListener: Called when processing of a batch of jobs has completed.
onBatchStarted(StreamingListenerBatchStarted) - Method in interface org.apache.spark.streaming.scheduler.StreamingListener: Called when processing of a batch of jobs has started.
onBatchSubmitted(StreamingListenerBatchSubmitted) - Method in interface org.apache.spark.streaming.scheduler.StreamingListener: Called when a batch of jobs has been submitted for processing.
onBlockManagerAdded(SparkListenerBlockManagerAdded) - Method in interface org.apache.spark.scheduler.SparkListener: Called when a new block manager has joined
onBlockManagerAdded(SparkListenerBlockManagerAdded) - Method in class org.apache.spark.storage.StorageStatusListener
onBlockManagerAdded(SparkListenerBlockManagerAdded) - Method in class org.apache.spark.ui.jobs.JobProgressListener
onBlockManagerRemoved(SparkListenerBlockManagerRemoved) - Method in interface org.apache.spark.scheduler.SparkListener: Called when an existing block manager has been removed
onBlockManagerRemoved(SparkListenerBlockManagerRemoved) - Method in class org.apache.spark.storage.StorageStatusListener
onBlockManagerRemoved(SparkListenerBlockManagerRemoved) - Method in class org.apache.spark.ui.jobs.JobProgressListener
onComplete(Function1<Try<T>, U>, ExecutionContext) - Method in class org.apache.spark.ComplexFutureAction
onComplete(Function1<Try<T>, U>, ExecutionContext) - Method in interface org.apache.spark.FutureAction: When this action is completed, either through an exception, or a value, applies the provided function.
onComplete(Function1<R, BoxedUnit>) - Method in class org.apache.spark.partial.PartialResult: Set a handler to be called when this PartialResult completes.
onComplete(Function1<Try<T>, U>, ExecutionContext) - Method in class org.apache.spark.SimpleFutureAction
onEnvironmentUpdate(SparkListenerEnvironmentUpdate) - Method in interface org.apache.spark.scheduler.SparkListener: Called when environment properties have been updated
onEnvironmentUpdate(SparkListenerEnvironmentUpdate) - Method in class org.apache.spark.ui.env.EnvironmentListener
onEnvironmentUpdate(SparkListenerEnvironmentUpdate) - Method in class org.apache.spark.ui.jobs.JobProgressListener
ones(int) - Static method in class org.apache.spark.util.Vector
OneToOneDependency<T> - Class in org.apache.spark: :: DeveloperApi :: Represents a one-to-one dependency between partitions of the parent and child RDDs.
OneToOneDependency(RDD<T>) - Constructor for class org.apache.spark.OneToOneDependency
onFail(Function1<Exception, BoxedUnit>) - Method in class org.apache.spark.partial.PartialResult: Set a handler to be called if this PartialResult's job fails.
onJobEnd(SparkListenerJobEnd) - Method in class org.apache.spark.scheduler.JobLogger: When job ends, record job completion status and close log file
onJobEnd(SparkListenerJobEnd) - Method in interface org.apache.spark.scheduler.SparkListener: Called when a job ends
onJobStart(SparkListenerJobStart) - Method in class org.apache.spark.scheduler.JobLogger: When job starts, record job property and stage graph
onJobStart(SparkListenerJobStart) - Method in interface org.apache.spark.scheduler.SparkListener: Called when a job starts
onReceiverError(StreamingListenerReceiverError) - Method in interface org.apache.spark.streaming.scheduler.StreamingListener: Called when a receiver has reported an error
onReceiverStarted(StreamingListenerReceiverStarted) - Method in interface org.apache.spark.streaming.scheduler.StreamingListener: Called when a receiver has been started
onReceiverStopped(StreamingListenerReceiverStopped) - Method in interface org.apache.spark.streaming.scheduler.StreamingListener: Called when a receiver has been stopped
onStageCompleted(SparkListenerStageCompleted) - Method in class org.apache.spark.scheduler.JobLogger: When stage is completed, record stage completion status
onStageCompleted(SparkListenerStageCompleted) - Method in interface org.apache.spark.scheduler.SparkListener: Called when a stage completes successfully or fails, with information on the completed stage.
onStageCompleted(SparkListenerStageCompleted) - Method in class org.apache.spark.scheduler.StatsReportListener
onStageCompleted(SparkListenerStageCompleted) - Method in class org.apache.spark.ui.jobs.JobProgressListener
onStageCompleted(SparkListenerStageCompleted) - Method in class org.apache.spark.ui.storage.StorageListener
onStageSubmitted(SparkListenerStageSubmitted) - Method in class org.apache.spark.scheduler.JobLogger: When stage is submitted, record stage submit info
onStageSubmitted(SparkListenerStageSubmitted) - Method in interface org.apache.spark.scheduler.SparkListener: Called when a stage is submitted
onStageSubmitted(SparkListenerStageSubmitted) - Method in class org.apache.spark.ui.jobs.JobProgressListener: For FIFO, all stages are contained by the "default" pool, but the "default" pool here is meaningless
onStageSubmitted(SparkListenerStageSubmitted) - Method in class org.apache.spark.ui.storage.StorageListener
onStart() - Method in class org.apache.spark.streaming.receiver.Receiver: This method is called by the system when the receiver is started.
onStop() - Method in class org.apache.spark.streaming.receiver.Receiver: This method is called by the system when the receiver is stopped.
onTaskEnd(SparkListenerTaskEnd) - Method in class org.apache.spark.scheduler.JobLogger: When task ends, record task completion status and metrics
onTaskEnd(SparkListenerTaskEnd) - Method in interface org.apache.spark.scheduler.SparkListener: Called when a task ends
onTaskEnd(SparkListenerTaskEnd) - Method in class org.apache.spark.scheduler.StatsReportListener
onTaskEnd(SparkListenerTaskEnd) - Method in class org.apache.spark.storage.StorageStatusListener
onTaskEnd(SparkListenerTaskEnd) - Method in class org.apache.spark.ui.exec.ExecutorsListener
onTaskEnd(SparkListenerTaskEnd) - Method in class org.apache.spark.ui.jobs.JobProgressListener
onTaskEnd(SparkListenerTaskEnd) - Method in class org.apache.spark.ui.storage.StorageListener: Assumes the storage status list is fully up-to-date.
onTaskGettingResult(SparkListenerTaskGettingResult) - Method in interface org.apache.spark.scheduler.SparkListener: Called when a task begins remotely fetching its result (will not be called for tasks that do not need to fetch the result remotely).
onTaskGettingResult(SparkListenerTaskGettingResult) - Method in class org.apache.spark.ui.jobs.JobProgressListener
onTaskStart(SparkListenerTaskStart) - Method in interface org.apache.spark.scheduler.SparkListener: Called when a task starts
onTaskStart(SparkListenerTaskStart) - Method in class org.apache.spark.ui.exec.ExecutorsListener
onTaskStart(SparkListenerTaskStart) - Method in class org.apache.spark.ui.jobs.JobProgressListener
onUnpersistRDD(SparkListenerUnpersistRDD) - Method in interface org.apache.spark.scheduler.SparkListener: Called when an RDD is manually unpersisted by the application
onUnpersistRDD(SparkListenerUnpersistRDD) - Method in class org.apache.spark.storage.StorageStatusListener
onUnpersistRDD(SparkListenerUnpersistRDD) - Method in class org.apache.spark.ui.storage.StorageListener
optimize(RDD<Tuple2<Object, Vector>>, Vector) - Method in class org.apache.spark.mllib.optimization.GradientDescent: :: DeveloperApi :: Runs gradient descent on the given training data.
optimize(RDD<Tuple2<Object, Vector>>, Vector) - Method in class org.apache.spark.mllib.optimization.LBFGS
optimize(RDD<Tuple2<Object, Vector>>, Vector) - Method in interface org.apache.spark.mllib.optimization.Optimizer: Solve the provided convex optimization problem.
optimizer() - Method in class org.apache.spark.mllib.classification.LogisticRegressionWithSGD
optimizer() - Method in class org.apache.spark.mllib.classification.SVMWithSGD
Optimizer - Interface in org.apache.spark.mllib.optimization: :: DeveloperApi :: Trait for optimization problem solvers.
optimizer() - Method in class org.apache.spark.mllib.regression.GeneralizedLinearAlgorithm: The optimizer to solve the problem.
optimizer() - Method in class org.apache.spark.mllib.regression.LassoWithSGD
optimizer() - Method in class org.apache.spark.mllib.regression.LinearRegressionWithSGD
optimizer() - Method in class org.apache.spark.mllib.regression.RidgeRegressionWithSGD
orderBy(Seq<SortOrder>) - Method in class org.apache.spark.sql.SchemaRDD: Sorts the results by the given expressions.
OrderedRDDFunctions<K,V,P extends scala.Product2<K,V>> - Class in org.apache.spark.rdd: Extra functions available on RDDs of (key, value) pairs where the key is sortable through an implicit conversion.
OrderedRDDFunctions(RDD, Ordering<K>, ClassTag<K>, ClassTag<V>, ClassTag) - Constructor for class org.apache.spark.rdd.OrderedRDDFunctions
ordering() - Method in class org.apache.spark.sql.execution.Sort
ordering() - Method in class org.apache.spark.sql.execution.TakeOrdered
ordering() - Static method in class org.apache.spark.streaming.Time
org.apache.spark - package org.apache.spark: Core Spark classes in Scala.
org.apache.spark.annotation - package org.apache.spark.annotation: Spark annotations to mark an API experimental or intended only for advanced usages by developers.
org.apache.spark.api.java - package org.apache.spark.api.java: Spark Java programming APIs.
org.apache.spark.api.java.function - package org.apache.spark.api.java.function: Set of interfaces to represent functions in Spark's Java API.
org.apache.spark.broadcast - package org.apache.spark.broadcast: Spark's broadcast variables, used to broadcast immutable datasets to all nodes.
org.apache.spark.io - package org.apache.spark.io: IO codecs used for compression.
org.apache.spark.mllib.classification - package org.apache.spark.mllib.classification
org.apache.spark.mllib.clustering - package org.apache.spark.mllib.clustering
org.apache.spark.mllib.evaluation - package org.apache.spark.mllib.evaluation
org.apache.spark.mllib.linalg - package org.apache.spark.mllib.linalg
org.apache.spark.mllib.linalg.distributed - package org.apache.spark.mllib.linalg.distributed
org.apache.spark.mllib.optimization - package org.apache.spark.mllib.optimization
org.apache.spark.mllib.recommendation - package org.apache.spark.mllib.recommendation
org.apache.spark.mllib.regression - package org.apache.spark.mllib.regression
org.apache.spark.mllib.stat - package org.apache.spark.mllib.stat
org.apache.spark.mllib.tree - package org.apache.spark.mllib.tree
org.apache.spark.mllib.tree.configuration - package org.apache.spark.mllib.tree.configuration
org.apache.spark.mllib.tree.impurity - package org.apache.spark.mllib.tree.impurity
org.apache.spark.mllib.tree.model - package org.apache.spark.mllib.tree.model
org.apache.spark.mllib.util - package org.apache.spark.mllib.util
org.apache.spark.partial - package org.apache.spark.partial
org.apache.spark.rdd - package org.apache.spark.rdd: Provides implementations of various RDDs.
org.apache.spark.scheduler - package org.apache.spark.scheduler: Spark's DAG scheduler.
org.apache.spark.serializer - package org.apache.spark.serializer: Pluggable serializers for RDD and shuffle data.
org.apache.spark.sql - package org.apache.spark.sql
org.apache.spark.sql.api.java - package org.apache.spark.sql.api.java
org.apache.spark.sql.execution - package org.apache.spark.sql.execution
org.apache.spark.sql.hive - package org.apache.spark.sql.hive
org.apache.spark.sql.hive.api.java - package org.apache.spark.sql.hive.api.java
org.apache.spark.sql.hive.execution - package org.apache.spark.sql.hive.execution
org.apache.spark.sql.hive.test - package org.apache.spark.sql.hive.test
org.apache.spark.sql.parquet - package org.apache.spark.sql.parquet
org.apache.spark.sql.test - package org.apache.spark.sql.test
org.apache.spark.storage - package org.apache.spark.storage
org.apache.spark.streaming - package org.apache.spark.streaming
org.apache.spark.streaming.api.java - package org.apache.spark.streaming.api.java: Java APIs for Spark Streaming.
org.apache.spark.streaming.dstream - package org.apache.spark.streaming.dstream: Various implementations of DStreams.
org.apache.spark.streaming.flume - package org.apache.spark.streaming.flume: Spark Streaming receiver for Flume.
org.apache.spark.streaming.kafka - package org.apache.spark.streaming.kafka: Kafka receiver for Spark Streaming.
org.apache.spark.streaming.mqtt - package org.apache.spark.streaming.mqtt: MQTT receiver for Spark Streaming.
org.apache.spark.streaming.receiver - package org.apache.spark.streaming.receiver
org.apache.spark.streaming.scheduler - package org.apache.spark.streaming.scheduler
org.apache.spark.streaming.twitter - package org.apache.spark.streaming.twitter: Twitter feed receiver for Spark Streaming.
org.apache.spark.streaming.zeromq - package org.apache.spark.streaming.zeromq: ZeroMQ receiver for Spark Streaming.
org.apache.spark.ui.env - package org.apache.spark.ui.env
org.apache.spark.ui.exec - package org.apache.spark.ui.exec
org.apache.spark.ui.jobs - package org.apache.spark.ui.jobs
org.apache.spark.ui.storage - package org.apache.spark.ui.storage
org.apache.spark.util - package org.apache.spark.util: Spark utilities.
org.apache.spark.util.random - package org.apache.spark.util.random: Utilities for random number generation.
otherCopyArgs() - Method in class org.apache.spark.sql.execution.Aggregate
otherCopyArgs() - Method in class org.apache.spark.sql.execution.BroadcastNestedLoopJoin
otherCopyArgs() - Method in class org.apache.spark.sql.execution.Limit
otherCopyArgs() - Method in class org.apache.spark.sql.execution.TakeOrdered
otherCopyArgs() - Method in class org.apache.spark.sql.execution.Union
otherCopyArgs() - Method in class org.apache.spark.sql.hive.execution.InsertIntoHiveTable
otherCopyArgs() - Method in class org.apache.spark.sql.hive.execution.ScriptTransformation
otherCopyArgs() - Method in class org.apache.spark.sql.parquet.InsertIntoParquetTable
otherCopyArgs() - Method in class org.apache.spark.sql.parquet.ParquetTableScan
otherInfo() - Method in class org.apache.spark.streaming.receiver.Statistics
outer() - Method in class org.apache.spark.sql.execution.Generate
output() - Method in class org.apache.spark.sql.execution.Aggregate
output() - Method in class org.apache.spark.sql.execution.BroadcastNestedLoopJoin
output() - Method in class org.apache.spark.sql.execution.CartesianProduct
output() - Method in class org.apache.spark.sql.execution.Exchange
output() - Method in class org.apache.spark.sql.execution.ExistingRdd
output() - Method in class org.apache.spark.sql.execution.Filter
output() - Method in class org.apache.spark.sql.execution.Generate
output() - Method in class org.apache.spark.sql.execution.HashJoin
output() - Method in class org.apache.spark.sql.execution.Limit
output() - Method in class org.apache.spark.sql.execution.Project
output() - Method in class org.apache.spark.sql.execution.Sample
output() - Method in class org.apache.spark.sql.execution.Sort
output() - Method in class org.apache.spark.sql.execution.SparkLogicalPlan
output() - Method in class org.apache.spark.sql.execution.TakeOrdered
output() - Method in class org.apache.spark.sql.execution.Union
output() - Method in class org.apache.spark.sql.hive.execution.HiveTableScan
output() - Method in class org.apache.spark.sql.hive.execution.InsertIntoHiveTable
output() - Method in class org.apache.spark.sql.hive.execution.ScriptTransformation
output() - Method in class org.apache.spark.sql.parquet.InsertIntoParquetTable
output() - Method in class org.apache.spark.sql.parquet.ParquetTableScan
outputClass() - Method in class org.apache.spark.sql.hive.execution.InsertIntoHiveTable
outputPartitioning() - Method in class org.apache.spark.sql.execution.BroadcastNestedLoopJoin
outputPartitioning() - Method in class org.apache.spark.sql.execution.Exchange
outputPartitioning() - Method in class org.apache.spark.sql.execution.HashJoin
outputPartitioning() - Method in class org.apache.spark.sql.execution.SparkPlan: Specifies how data is partitioned across different nodes in the cluster.
overwrite() - Method in class org.apache.spark.sql.hive.execution.InsertIntoHiveTable
overwrite() - Method in class org.apache.spark.sql.parquet.InsertIntoParquetTable

P

PairDStreamFunctions<K,V> - Class in org.apache.spark.streaming.dstream: Extra functions available on DStream of (key, value) pairs through an implicit conversion.
PairDStreamFunctions(DStream<Tuple2<K, V>>, ClassTag<K>, ClassTag<V>, Ordering<K>) - Constructor for class org.apache.spark.streaming.dstream.PairDStreamFunctions
PairFlatMapFunction<T,K,V> - Interface in org.apache.spark.api.java.function: A function that returns zero or more key-value pair records from each input record.
PairFunction<T,K,V> - Interface in org.apache.spark.api.java.function: A function that returns key-value pairs (Tuple2), and can be used to construct PairRDDs.
PairRDDFunctions<K,V> - Class in org.apache.spark.rdd: Extra functions available on RDDs of (key, value) pairs through an implicit conversion.
PairRDDFunctions(RDD<Tuple2<K, V>>, ClassTag<K>, ClassTag<V>, Ordering<K>) - Constructor for class org.apache.spark.rdd.PairRDDFunctions
parallelize(List<T>, int) - Method in class org.apache.spark.api.java.JavaSparkContext: Distribute a local Scala collection to form an RDD.
parallelize(List<T>) - Method in class org.apache.spark.api.java.JavaSparkContext: Distribute a local Scala collection to form an RDD.
parallelize(Seq<T>, int, ClassTag<T>) - Method in class org.apache.spark.SparkContext: Distribute a local Scala collection to form an RDD.
parallelizeDoubles(List<Double>, int) - Method in class org.apache.spark.api.java.JavaSparkContext: Distribute a local collection to form an RDD.
parallelizeDoubles(List<Double>) - Method in class org.apache.spark.api.java.JavaSparkContext: Distribute a local collection to form an RDD.
parallelizePairs(List<Tuple2<K, V>>, int) - Method in class org.apache.spark.api.java.JavaSparkContext: Distribute a local collection to form an RDD.
parallelizePairs(List<Tuple2<K, V>>) - Method in class org.apache.spark.api.java.JavaSparkContext: Distribute a local collection to form an RDD.
parquetFile(String) - Method in class org.apache.spark.sql.api.java.JavaSQLContext: Loads a Parquet file, returning the result as a JavaSchemaRDD.
parquetFile(String) - Method in class org.apache.spark.sql.SQLContext: Loads a Parquet file, returning the result as a SchemaRDD.
ParquetTableScan - Class in org.apache.spark.sql.parquet: Parquet table scan operator.
ParquetTableScan(Seq<Attribute>, ParquetRelation, Option<Expression>, SparkContext) - Constructor for class org.apache.spark.sql.parquet.ParquetTableScan
partial() - Method in class org.apache.spark.sql.execution.Aggregate
PartialResult<R> - Class in org.apache.spark.partial
PartialResult(R, boolean) - Constructor for class org.apache.spark.partial.PartialResult
Partition - Interface in org.apache.spark: A partition of an RDD.
partition() - Method in class org.apache.spark.sql.hive.execution.InsertIntoHiveTable
partitionBy(Partitioner) - Method in class org.apache.spark.api.java.JavaPairRDD: Return a copy of the RDD partitioned using the specified partitioner.
partitionBy(Partitioner) - Method in class org.apache.spark.rdd.PairRDDFunctions: Return a copy of the RDD partitioned using the specified partitioner.
Partitioner - Class in org.apache.spark: An object that defines how the elements in a key-value pair RDD are partitioned by key.
Partitioner() - Constructor for class org.apache.spark.Partitioner
partitioner() - Method in class org.apache.spark.rdd.CoGroupedRDD
partitioner() - Method in class org.apache.spark.rdd.RDD: Optionally overridden by subclasses to specify how they are partitioned.
partitioner() - Method in class org.apache.spark.rdd.ShuffledRDD
partitioner() - Method in class org.apache.spark.ShuffleDependency
partitionId() - Method in class org.apache.spark.TaskContext
partitionPruningPred() - Method in class org.apache.spark.sql.hive.execution.HiveTableScan
PartitionPruningRDD<T> - Class in org.apache.spark.rdd: :: DeveloperApi :: An RDD used to prune RDD partitions so we can avoid launching tasks on all partitions.
PartitionPruningRDD(RDD<T>, Function1<Object, Object>, ClassTag<T>) - Constructor for class org.apache.spark.rdd.PartitionPruningRDD
partitions() - Method in class org.apache.spark.rdd.RDD: Get the array of partitions of this RDD, taking into account whether the RDD is checkpointed or not.
path() - Method in class org.apache.spark.scheduler.InputFormatInfo
path() - Method in class org.apache.spark.scheduler.SplitInfo
percentiles() - Static method in class org.apache.spark.scheduler.StatsReportListener
percentilesHeader() - Static method in class org.apache.spark.scheduler.StatsReportListener
persist(StorageLevel) - Method in class org.apache.spark.api.java.JavaDoubleRDD: Set this RDD's storage level to persist its values across operations after the first time it is computed.
persist(StorageLevel) - Method in class org.apache.spark.api.java.JavaPairRDD: Set this RDD's storage level to persist its values across operations after the first time it is computed.
persist(StorageLevel) - Method in class org.apache.spark.api.java.JavaRDD: Set this RDD's storage level to persist its values across operations after the first time it is computed.
persist(StorageLevel) - Method in class org.apache.spark.rdd.RDD: Set this RDD's storage level to persist its values across operations after the first time it is computed.
persist() - Method in class org.apache.spark.rdd.RDD: Persist this RDD with the default storage level (`MEMORY_ONLY`).
persist() - Method in class org.apache.spark.sql.api.java.JavaSchemaRDD: Persist this RDD with the default storage level (`MEMORY_ONLY`).
persist(StorageLevel) - Method in class org.apache.spark.sql.api.java.JavaSchemaRDD: Set this RDD's storage level to persist its values across operations after the first time it is computed.
persist() - Method in class org.apache.spark.streaming.api.java.JavaDStream: Persist RDDs of this DStream with the default storage level (MEMORY_ONLY_SER)
persist(StorageLevel) - Method in class org.apache.spark.streaming.api.java.JavaDStream: Persist the RDDs of this DStream with the given storage level
persist() - Method in class org.apache.spark.streaming.api.java.JavaPairDStream: Persist RDDs of this DStream with the default storage level (MEMORY_ONLY_SER)
persist(StorageLevel) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream: Persist the RDDs of this DStream with the given storage level
persist(StorageLevel) - Method in class org.apache.spark.streaming.dstream.DStream: Persist the RDDs of this DStream with the given storage level
persist() - Method in class org.apache.spark.streaming.dstream.DStream: Persist RDDs of this DStream with the default storage level (MEMORY_ONLY_SER)
persistentRdds() - Method in class org.apache.spark.SparkContext
pi() - Method in class org.apache.spark.mllib.classification.NaiveBayesModel
pipe(String) - Method in interface org.apache.spark.api.java.JavaRDDLike: Return an RDD created by piping elements to a forked external process.
pipe(List<String>) - Method in interface org.apache.spark.api.java.JavaRDDLike: Return an RDD created by piping elements to a forked external process.
pipe(List<String>, Map<String, String>) - Method in interface org.apache.spark.api.java.JavaRDDLike: Return an RDD created by piping elements to a forked external process.
pipe(String) - Method in class org.apache.spark.rdd.RDD: Return an RDD created by piping elements to a forked external process.
pipe(String, Map<String, String>) - Method in class org.apache.spark.rdd.RDD: Return an RDD created by piping elements to a forked external process.
pipe(Seq<String>, Map<String, String>, Function1<Function1<String, BoxedUnit>, BoxedUnit>, Function2<T, Function1<String, BoxedUnit>, BoxedUnit>, boolean) - Method in class org.apache.spark.rdd.RDD: Return an RDD created by piping elements to a forked external process.
plusDot(Vector, Vector) - Method in class org.apache.spark.util.Vector: Returns (this + plus) dot other without creating any intermediate storage.
poisson() - Method in class org.apache.spark.util.random.PoissonSampler
PoissonSampler<T> - Class in org.apache.spark.util.random: :: DeveloperApi :: A sampler based on values drawn from a Poisson distribution.
PoissonSampler(double, Poisson) - Constructor for class org.apache.spark.util.random.PoissonSampler
poolToActiveStages() - Method in class org.apache.spark.ui.jobs.JobProgressListener
port() - Method in class org.apache.spark.storage.BlockManagerId
pr() - Method in class org.apache.spark.mllib.evaluation.BinaryClassificationMetrics: Returns the precision-recall curve, which is an RDD of (recall, precision), NOT (precision, recall), with (0.0, 1.0) prepended to it.
precisionByThreshold() - Method in class org.apache.spark.mllib.evaluation.BinaryClassificationMetrics: Returns the (threshold, precision) curve.
predict(RDD<Vector>) - Method in interface org.apache.spark.mllib.classification.ClassificationModel: Predict values for the given data set using the trained model.
predict(Vector) - Method in interface org.apache.spark.mllib.classification.ClassificationModel: Predict values for a single data point using the trained model.
predict(JavaRDD<Vector>) - Method in interface org.apache.spark.mllib.classification.ClassificationModel: Predict values for examples stored in a JavaRDD.
predict(RDD<Vector>) - Method in class org.apache.spark.mllib.classification.NaiveBayesModel
predict(Vector) - Method in class org.apache.spark.mllib.classification.NaiveBayesModel
predict(Vector) - Method in class org.apache.spark.mllib.clustering.KMeansModel: Returns the cluster index that a given point belongs to.
predict(RDD<Vector>) - Method in class org.apache.spark.mllib.clustering.KMeansModel: Maps given points to their cluster indices.
predict(JavaRDD<Vector>) - Method in class org.apache.spark.mllib.clustering.KMeansModel: Maps given points to their cluster indices.
predict(int, int) - Method in class org.apache.spark.mllib.recommendation.MatrixFactorizationModel: Predict the rating of one user for one product.
predict(RDD<Tuple2<Object, Object>>) - Method in class org.apache.spark.mllib.recommendation.MatrixFactorizationModel: Predict the rating of many users for many products.
predict(JavaRDD<byte[]>) - Method in class org.apache.spark.mllib.recommendation.MatrixFactorizationModel: :: DeveloperApi :: Predict the rating of many users for many products.
predict(RDD<Vector>) - Method in class org.apache.spark.mllib.regression.GeneralizedLinearModel: Predict values for the given data set using the trained model.
predict(Vector) - Method in class org.apache.spark.mllib.regression.GeneralizedLinearModel: Predict values for a single data point using the trained model.
predict(RDD<Vector>) - Method in interface org.apache.spark.mllib.regression.RegressionModel: Predict values for the given data set using the trained model.
predict(Vector) - Method in interface org.apache.spark.mllib.regression.RegressionModel: Predict values for a single data point using the trained model.
predict(JavaRDD<Vector>) - Method in interface org.apache.spark.mllib.regression.RegressionModel: Predict values for examples stored in a JavaRDD.
predict(Vector) - Method in class org.apache.spark.mllib.tree.model.DecisionTreeModel: Predict values for a single data point using the trained model.
predict(RDD<Vector>) - Method in class org.apache.spark.mllib.tree.model.DecisionTreeModel: Predict values for the given data set using the trained model.
predict() - Method in class org.apache.spark.mllib.tree.model.InformationGainStats
predict() - Method in class org.apache.spark.mllib.tree.model.Node
predictIfLeaf(Vector) - Method in class org.apache.spark.mllib.tree.model.Node: Predict the value at this node if it is a leaf; otherwise delegate to the appropriate child node.
preferredLocation() - Method in class org.apache.spark.streaming.receiver.Receiver: Override this to specify a preferred location (hostname).
preferredLocations(Partition) - Method in class org.apache.spark.rdd.RDD: Get the preferred locations of a partition (as hostnames), taking into account whether the RDD is checkpointed.
preferredNodeLocationData() - Method in class org.apache.spark.SparkContext
prettyPrint() - Method in class org.apache.spark.streaming.Duration
prev() - Method in class org.apache.spark.rdd.ShuffledRDD
print() - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike: Print the first ten elements of each RDD generated in this DStream.
print() - Method in class org.apache.spark.streaming.dstream.DStream: Print the first ten elements of each RDD generated in this DStream.
printStats() - Method in class org.apache.spark.streaming.scheduler.StatsReportListener
probabilities() - Static method in class org.apache.spark.scheduler.StatsReportListener
PROCESS_LOCAL() - Static method in class org.apache.spark.scheduler.TaskLocality
processingDelay() - Method in class org.apache.spark.streaming.scheduler.BatchInfo: Time taken for all jobs of this batch to finish processing, measured from the time they started processing.
processingEndTime() - Method in class org.apache.spark.streaming.scheduler.BatchInfo
processingStartTime() - Method in class org.apache.spark.streaming.scheduler.BatchInfo
product() - Method in class org.apache.spark.mllib.recommendation.Rating
productFeatures() - Method in class org.apache.spark.mllib.recommendation.MatrixFactorizationModel
productToRowRdd(RDD<A>) - Static method in class org.apache.spark.sql.execution.ExistingRdd
Project - Class in org.apache.spark.sql.execution: :: DeveloperApi ::
Project(Seq<NamedExpression>, SparkPlan) - Constructor for class org.apache.spark.sql.execution.Project
projectList() - Method in class org.apache.spark.sql.execution.Project
properties() - Method in class org.apache.spark.scheduler.SparkListenerJobStart
properties() - Method in class org.apache.spark.scheduler.SparkListenerStageSubmitted
pruneColumns(Seq<Attribute>) - Method in class org.apache.spark.sql.parquet.ParquetTableScan: Applies a (candidate) projection.
Pseudorandom - Interface in org.apache.spark.util.random: :: DeveloperApi :: A class with pseudorandom behavior.
putCachedMetadata(String, Object) - Static method in class org.apache.spark.rdd.HadoopRDD
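
The Partitioner entry in this section describes an object that decides how the elements of a key-value pair RDD are assigned to partitions by key. As a rough illustration only, the plain-Java sketch below shows one common way such an assignment can work: take the key's hashCode modulo the number of partitions and shift negative results into range. It has no Spark dependency, and the names HashPartitionSketch, partitionFor, and assign are hypothetical, not part of the Spark API. Sending null keys to partition 0 is likewise an assumption of this sketch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class HashPartitionSketch {

    // Hypothetical helper (not a Spark API): maps a key to a partition
    // index in the range [0, numPartitions).
    public static int partitionFor(Object key, int numPartitions) {
        if (numPartitions <= 0) {
            throw new IllegalArgumentException("numPartitions must be positive");
        }
        if (key == null) {
            return 0; // assumption of this sketch: null keys go to partition 0
        }
        int mod = key.hashCode() % numPartitions;
        // Java's % can return a negative value, so shift it into range.
        return mod < 0 ? mod + numPartitions : mod;
    }

    // Groups keys into one bucket per partition, preserving input order
    // within each bucket.
    public static List<List<String>> assign(List<String> keys, int numPartitions) {
        List<List<String>> buckets = new ArrayList<>();
        for (int i = 0; i < numPartitions; i++) {
            buckets.add(new ArrayList<>());
        }
        for (String key : keys) {
            buckets.get(partitionFor(key, numPartitions)).add(key);
        }
        return buckets;
    }

    public static void main(String[] args) {
        List<String> keys = Arrays.asList("apple", "banana", "cherry", "apple");
        List<List<String>> buckets = assign(keys, 3);
        for (int i = 0; i < buckets.size(); i++) {
            System.out.println("partition " + i + ": " + buckets.get(i));
        }
    }
}
```

Because the assignment depends only on the key, equal keys always land in the same partition, which is the property that per-key operations such as partitionBy rely on.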

Q

quantileCalculationStrategy() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
QuantileStrategy - Class in org.apache.spark.mllib.tree.configuration: :: Experimental :: Enum for selecting the quantile calculation strategy
QuantileStrategy() - Constructor for class org.apache.spark.mllib.tree.configuration.QuantileStrategy
QueryExecutionException - Exception in org.apache.spark.sql.execution
QueryExecutionException(String) - Constructor for exception org.apache.spark.sql.execution.QueryExecutionException
queueStream(Queue<JavaRDD<T>>) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext: Create an input stream from a queue of RDDs.
queueStream(Queue<JavaRDD<T>>, boolean) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext: Create an input stream from a queue of RDDs.
queueStream(Queue<JavaRDD<T>>, boolean, JavaRDD<T>) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext: Create an input stream from a queue of RDDs.
queueStream(Queue<RDD<T>>, boolean, ClassTag<T>) - Method in class org.apache.spark.streaming.StreamingContext: Create an input stream from a queue of RDDs.
queueStream(Queue<RDD<T>>, boolean, RDD<T>, ClassTag<T>) - Method in class org.apache.spark.streaming.StreamingContext: Create an input stream from a queue of RDDs.

R

RACK_LOCAL() - Static method in class org.apache.spark.scheduler.TaskLocality
RANDOM() - Static method in class org.apache.spark.mllib.clustering.KMeans
random(int, Random) - Static method in class org.apache.spark.util.Vector: Creates a Vector of the given length containing random numbers between 0.0 and 1.0.
RandomSampler<T,U> - Interface in org.apache.spark.util.random: :: DeveloperApi :: A pseudorandom sampler.
randomSplit(double[], long) - Method in class org.apache.spark.rdd.RDD: Randomly splits this RDD with the provided weights.
RangeDependency<T> - Class in org.apache.spark: :: DeveloperApi :: Represents a one-to-one dependency between ranges of partitions in the parent and child RDDs.
RangeDependency(RDD<T>, int, int, int) - Constructor for class org.apache.spark.RangeDependency
RangePartitioner<K,V> - Class in org.apache.spark: A Partitioner that partitions sortable records by range into roughly equal ranges.
RangePartitioner(int, RDD<? extends Product2<K, V>>, boolean, Ordering<K>, ClassTag<K>) - Constructor for class org.apache.spark.RangePartitioner
rank() - Method in class org.apache.spark.mllib.recommendation.MatrixFactorizationModel
Rating - Class in org.apache.spark.mllib.recommendation: :: Experimental :: A more compact class to represent a rating than Tuple3[Int, Int, Double].
Rating(int, int, double) - Constructor for class org.apache.spark.mllib.recommendation.Rating
rating() - Method in class org.apache.spark.mllib.recommendation.Rating
rawSocketStream(String, int, StorageLevel) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext: Create an input stream from network source hostname:port, where data is received as serialized blocks (serialized using Spark's serializer) that can be directly pushed into the block manager without deserializing them.
rawSocketStream(String, int) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext: Create an input stream from network source hostname:port, where data is received as serialized blocks (serialized using Spark's serializer) that can be directly pushed into the block manager without deserializing them.
rawSocketStream(String, int, StorageLevel, ClassTag<T>) - Method in class org.apache.spark.streaming.StreamingContext: Create an input stream from network source hostname:port, where data is received as serialized blocks (serialized using Spark's serializer) that can be directly pushed into the block manager without deserializing them.
rdd() - Method in class org.apache.spark.api.java.JavaDoubleRDD
rdd() - Method in class org.apache.spark.api.java.JavaPairRDD
rdd() - Method in class org.apache.spark.api.java.JavaRDD
rdd() - Method in interface org.apache.spark.api.java.JavaRDDLike
rdd() - Method in class org.apache.spark.Dependency
RDD<T> - Class in org.apache.spark.rdd: A Resilient Distributed Dataset (RDD), the basic abstraction in Spark.
RDD(SparkContext, Seq<Dependency<?>>, ClassTag<T>) - Constructor for class org.apache.spark.rdd.RDD
RDD(RDD<?>, ClassTag<T>) - Constructor for class org.apache.spark.rdd.RDD: Construct an RDD with just a one-to-one dependency on one parent
rdd() - Method in class org.apache.spark.sql.api.java.JavaSchemaRDD
rdd() - Method in class org.apache.spark.sql.execution.ExistingRdd
RDD() - Static method in class org.apache.spark.storage.BlockId
RDDBlockId - Class in org.apache.spark.storage
RDDBlockId(int, int) - Constructor for class org.apache.spark.storage.RDDBlockId
rddBlocks() - Method in class org.apache.spark.storage.StorageStatus
rddId() - Method in class org.apache.spark.scheduler.SparkListenerUnpersistRDD
rddId() - Method in class org.apache.spark.storage.RDDBlockId
RDDInfo - Class in org.apache.spark.storage
RDDInfo(int, String, int, StorageLevel) - Constructor for class org.apache.spark.storage.RDDInfo
rddInfoList() - Method in class org.apache.spark.ui.storage.StorageListener: Filter RDD info to include only those with cached partitions
rddInfos() - Method in class org.apache.spark.scheduler.StageInfo
rdds() - Method in class org.apache.spark.rdd.CoGroupedRDD
rdds() - Method in class org.apache.spark.rdd.UnionRDD
rddToAsyncRDDActions(RDD<T>, ClassTag<T>) - Static method in class org.apache.spark.SparkContext
rddToOrderedRDDFunctions(RDD<Tuple2<K, V>>, Ordering<K>, ClassTag<K>, ClassTag<V>) - Static method in class org.apache.spark.SparkContext
rddToPairRDDFunctions(RDD<Tuple2<K, V>>, ClassTag<K>, ClassTag<V>, Ordering<K>) - Static method in class org.apache.spark.SparkContext
rddToSequenceFileRDDFunctions(RDD<Tuple2<K, V>>, Function1<K, Writable>, ClassTag<K>, Function1<V, Writable>, ClassTag<V>) - Static method in class org.apache.spark.SparkContext
readExternal(ObjectInput) - Method in class org.apache.spark.serializer.JavaSerializer
readExternal(ObjectInput) - Method in class org.apache.spark.storage.BlockManagerId
readExternal(ObjectInput) - Method in class org.apache.spark.storage.StorageLevel
readExternal(ObjectInput) - Method in class org.apache.spark.streaming.flume.SparkFlumeEvent
readObject(ClassTag<T>) - Method in interface org.apache.spark.serializer.DeserializationStream
ready(Duration, CanAwait) - Method in class org.apache.spark.ComplexFutureAction
ready(Duration, CanAwait) - Method in interface org.apache.spark.FutureAction: Blocks until this action completes.
ready(Duration, CanAwait) - Method in class org.apache.spark.SimpleFutureAction
reason() - Method in class org.apache.spark.scheduler.SparkListenerTaskEnd
recallByThreshold() - Method in class org.apache.spark.mllib.evaluation.BinaryClassificationMetrics: Returns the (threshold, recall) curve.
receivedBlockInfo() - Method in class org.apache.spark.streaming.scheduler.BatchInfo
Receiver<T> - Class in org.apache.spark.streaming.receiver: :: DeveloperApi :: Abstract class of a receiver that can be run on worker nodes to receive external data.
Receiver(StorageLevel) - Constructor for class org.apache.spark.streaming.receiver.Receiver
ReceiverInfo - Class in org.apache.spark.streaming.scheduler: :: DeveloperApi :: Class having information about a receiver
ReceiverInfo(int, String, ActorRef, boolean, String, String, String) - Constructor for class org.apache.spark.streaming.scheduler.ReceiverInfo
receiverInfo() - Method in class org.apache.spark.streaming.scheduler.StreamingListenerReceiverError
receiverInfo() - Method in class org.apache.spark.streaming.scheduler.StreamingListenerReceiverStarted
receiverInfo() - Method in class org.apache.spark.streaming.scheduler.StreamingListenerReceiverStopped
receiverInputDStream() - Method in class org.apache.spark.streaming.api.java.JavaPairReceiverInputDStream
receiverInputDStream() - Method in class org.apache.spark.streaming.api.java.JavaReceiverInputDStream
ReceiverInputDStream<T> - Class in org.apache.spark.streaming.dstream: Abstract class for defining any InputDStream that has to start a receiver on worker nodes to receive external data.
ReceiverInputDStream(StreamingContext, ClassTag<T>) - Constructor for class org.apache.spark.streaming.dstream.ReceiverInputDStream
receiverStream(Receiver<T>) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext: Create an input stream with an arbitrary user-implemented receiver.
receiverStream(Receiver<T>, ClassTag<T>) - Method in class org.apache.spark.streaming.StreamingContext: Create an input stream with an arbitrary user-implemented receiver.
reduce(Function2<T, T, T>) - Method in interface org.apache.spark.api.java.JavaRDDLike: Reduces the elements of this RDD using the specified commutative and associative binary operator.
reduce(Function2<T, T, T>) - Method in class org.apache.spark.rdd.RDD: Reduces the elements of this RDD using the specified commutative and associative binary operator.
reduce(Function2<T, T, T>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike: Return a new DStream in which each RDD has a single element generated by reducing each RDD of this DStream.
reduce(Function2<T, T, T>) - Method in class org.apache.spark.streaming.dstream.DStream: Return a new DStream in which each RDD has a single element generated by reducing each RDD of this DStream.
reduceByKey(Partitioner, Function2<V, V, V>) - Method in class org.apache.spark.api.java.JavaPairRDD: Merge the values for each key using an associative reduce function.
reduceByKey(Function2<V, V, V>, int) - Method in class org.apache.spark.api.java.JavaPairRDD: Merge the values for each key using an associative reduce function.
reduceByKey(Function2<V, V, V>) - Method in class org.apache.spark.api.java.JavaPairRDD: Merge the values for each key using an associative reduce function.
reduceByKey(Partitioner, Function2<V, V, V>) - Method in class org.apache.spark.rdd.PairRDDFunctions: Merge the values for each key using an associative reduce function.
reduceByKey(Function2<V, V, V>, int) - Method in class org.apache.spark.rdd.PairRDDFunctions: Merge the values for each key using an associative reduce function.
reduceByKey(Function2<V, V, V>) - Method in class org.apache.spark.rdd.PairRDDFunctions: Merge the values for each key using an associative reduce function.
reduceByKey(Function2<V, V, V>) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream: Return a new DStream by applying reduceByKey to each RDD.
reduceByKey(Function2<V, V, V>, int) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream: Return a new DStream by applying reduceByKey to each RDD.
reduceByKey(Function2<V, V, V>, Partitioner) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream: Return a new DStream by applying reduceByKey to each RDD.
reduceByKey(Function2<V, V, V>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions: Return a new DStream by applying reduceByKey to each RDD.
reduceByKey(Function2<V, V, V>, int) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions: Return a new DStream by applying reduceByKey to each RDD.
reduceByKey(Function2<V, V, V>, Partitioner) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions: Return a new DStream by applying reduceByKey to each RDD.
reduceByKeyAndWindow(Function2<V, V, V>, Duration) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream: Create a new DStream by applying reduceByKey over a sliding window on this DStream.
reduceByKeyAndWindow(Function2<V, V, V>, Duration, Duration) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream: Return a new DStream by applying reduceByKey over a sliding window.
reduceByKeyAndWindow(Function2<V, V, V>, Duration, Duration, int) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream: Return a new DStream by applying reduceByKey over a sliding window.
reduceByKeyAndWindow(Function2<V, V, V>, Duration, Duration, Partitioner) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream: Return a new DStream by applying reduceByKey over a sliding window.
reduceByKeyAndWindow(Function2<V, V, V>, Function2<V, V, V>, Duration, Duration) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream: Return a new DStream by reducing over a sliding window using incremental computation.
reduceByKeyAndWindow(Function2<V, V, V>, Function2<V, V, V>, Duration, Duration, int, Function<Tuple2<K, V>, Boolean>) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream: Return a new DStream by applying incremental reduceByKey over a sliding window.
reduceByKeyAndWindow(Function2<V, V, V>, Function2<V, V, V>, Duration, Duration, Partitioner, Function<Tuple2<K, V>, Boolean>) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream: Return a new DStream by applying incremental reduceByKey over a sliding window.
reduceByKeyAndWindow(Function2<V, V, V>, Duration) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions: Return a new DStream by applying reduceByKey over a sliding window on this DStream.
reduceByKeyAndWindow(Function2<V, V, V>, Duration, Duration) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions: Return a new DStream by applying reduceByKey over a sliding window.
reduceByKeyAndWindow(Function2<V, V, V>, Duration, Duration, int) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions: Return a new DStream by applying reduceByKey over a sliding window.
reduceByKeyAndWindow(Function2<V, V, V>, Duration, Duration, Partitioner) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions: Return a new DStream by applying reduceByKey over a sliding window.
reduceByKeyAndWindow(Function2<V, V, V>, Function2<V, V, V>, Duration, Duration, int, Function1<Tuple2<K, V>, Object>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions: Return a new DStream by applying incremental reduceByKey over a sliding window.
reduceByKeyAndWindow(Function2<V, V, V>, Function2<V, V, V>, Duration, Duration, Partitioner, Function1<Tuple2<K, V>, Object>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions: Return a new DStream by applying incremental reduceByKey over a sliding window.
reduceByKeyLocally(Function2<V, V, V>) - Method in class org.apache.spark.api.java.JavaPairRDD: Merge the values for each key using an associative reduce function, but return the results immediately to the master as a Map.
reduceByKeyLocally(Function2<V, V, V>) - Method in class org.apache.spark.rdd.PairRDDFunctions: Merge the values for each key using an associative reduce function, but return the results immediately to the master as a Map.
reduceByKeyToDriver(Function2<V, V, V>) - Method in class org.apache.spark.rdd.PairRDDFunctions: Alias for reduceByKeyLocally
reduceByWindow(Function2<T, T, T>, Duration, Duration) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike: Return a new DStream in which each RDD has a single element generated by reducing all elements in a sliding window over this DStream.
reduceByWindow(Function2<T, T, T>, Function2<T, T, T>, Duration, Duration) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike: Return a new DStream in which each RDD has a single element generated by reducing all elements in a sliding window over this DStream.
reduceByWindow(Function2<T, T, T>, Duration, Duration) - Method in class org.apache.spark.streaming.dstream.DStream: Return a new DStream in which each RDD has a single element generated by reducing all elements in a sliding window over this DStream.
reduceByWindow(Function2<T, T, T>, Function2<T, T, T>, Duration, Duration) - Method in class org.apache.spark.streaming.dstream.DStream: Return a new DStream in which each RDD has a single element generated by reducing all elements in a sliding window over this DStream.
reduceId() - Method in class org.apache.spark.FetchFailed
reduceId() - Method in class org.apache.spark.storage.ShuffleBlockId
references() - Method in class org.apache.spark.sql.execution.SparkLogicalPlan
registerClasses(Kryo) - Method in interface org.apache.spark.serializer.KryoRegistrator
registerRDDAsTable(JavaSchemaRDD, String) - Method in class org.apache.spark.sql.api.java.JavaSQLContext: Registers the given RDD as a temporary table in the catalog.
registerRDDAsTable(SchemaRDD, String) - Method in class org.apache.spark.sql.SQLContext: Registers the given RDD as a temporary table in the catalog.
registerTestTable(TestHiveContext.TestTable) - Method in class org.apache.spark.sql.hive.test.TestHiveContext
Regression() - Static method in class org.apache.spark.mllib.tree.configuration.Algo
RegressionModel - Interface in org.apache.spark.mllib.regression
relation() - Method in class org.apache.spark.sql.hive.execution.HiveTableScan
relation() - Method in class org.apache.spark.sql.parquet.InsertIntoParquetTable
relation() - Method in class org.apache.spark.sql.parquet.ParquetTableScan
remember(Duration) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext: Sets each DStream in this context to remember the RDDs it generated in the last given duration.
remember(Duration) - Method in class org.apache.spark.streaming.StreamingContext: Sets each DStream in this context to remember the RDDs it generated in the last given duration.
rememberDuration() - Method in class org.apache.spark.streaming.dstream.DStream
remove(String) - Method in class org.apache.spark.SparkConf: Remove a parameter from the configuration
repartition(int) - Method in class org.apache.spark.api.java.JavaDoubleRDD: Return a new RDD that has exactly numPartitions partitions.
repartition(int) - Method in class org.apache.spark.api.java.JavaPairRDD: Return a new RDD that has exactly numPartitions partitions.
repartition(int) - Method in class org.apache.spark.api.java.JavaRDD: Return a new RDD that has exactly numPartitions partitions.
repartition(int, Ordering<T>) - Method in class org.apache.spark.rdd.RDD: Return a new RDD that has exactly numPartitions partitions.
repartition(int) - Method in class org.apache.spark.sql.api.java.JavaSchemaRDD: Return a new RDD that has exactly numPartitions partitions.
repartition(int, Ordering<Row>) - Method in class org.apache.spark.sql.SchemaRDD
repartition(int) - Method in class org.apache.spark.streaming.api.java.JavaDStream: Return a new DStream with an increased or decreased level of parallelism.
repartition(int) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream: Return a new DStream with an increased or decreased level of parallelism.
repartition(int) - Method in class org.apache.spark.streaming.dstream.DStream: Return a new DStream with an increased or decreased level of parallelism.
replication() - Method in class org.apache.spark.storage.StorageLevel
reportError(String, Throwable) - Method in class org.apache.spark.streaming.receiver.Receiver: Report exceptions in receiving data.
requiredChildDistribution() - Method in class org.apache.spark.sql.execution.Aggregate
requiredChildDistribution() - Method in class org.apache.spark.sql.execution.HashJoin
requiredChildDistribution() - Method in class org.apache.spark.sql.execution.Sort
requiredChildDistribution() - Method in class org.apache.spark.sql.execution.SparkPlan: Specifies any partition requirements on the input data for this operator.
reset() - Method in class org.apache.spark.sql.hive.test.TestHiveContext: Resets the test instance by deleting any tables that have been created.
restart(String) - Method in class org.apache.spark.streaming.receiver.Receiver: Restart the receiver.
restart(String, Throwable) - Method in class org.apache.spark.streaming.receiver.Receiver: Restart the receiver.
restart(String, Throwable, int) - Method in class org.apache.spark.streaming.receiver.Receiver: Restart the receiver.
Resubmitted - Class in org.apache.spark
Resubmitted() - Constructor for class org.apache.spark.Resubmitted
result(Duration, CanAwait) - Method in class org.apache.spark.ComplexFutureAction
result(Duration, CanAwait) - Method in interface org.apache.spark.FutureAction: Awaits and returns the result (of type T) of this action.
result(Duration, CanAwait) - Method in class org.apache.spark.SimpleFutureAction
resultAttribute() - Method in class org.apache.spark.sql.execution.Aggregate.ComputedAggregate
resultSetToObjectArray(ResultSet) - Static method in class org.apache.spark.rdd.JdbcRDD
retainedStages() - Method in class org.apache.spark.ui.jobs.JobProgressListener
RidgeRegressionModel - Class in org.apache.spark.mllib.regression: Regression model trained using RidgeRegression.
RidgeRegressionWithSGD - Class in org.apache.spark.mllib.regression: Train a regression model with L2-regularization using Stochastic Gradient Descent.
RidgeRegressionWithSGD() - Constructor for class org.apache.spark.mllib.regression.RidgeRegressionWithSGD: Construct a RidgeRegression object with default parameters: {stepSize: 1.0, numIterations: 100, regParam: 1.0, miniBatchFraction: 1.0}.
right() - Method in class org.apache.spark.sql.execution.BroadcastNestedLoopJoin: The Broadcast relation
right() - Method in class org.apache.spark.sql.execution.CartesianProduct
right() - Method in class org.apache.spark.sql.execution.HashJoin
rightImpurity() - Method in class org.apache.spark.mllib.tree.model.InformationGainStats
rightKeys() - Method in class org.apache.spark.sql.execution.HashJoin
rightNode() - Method in class org.apache.spark.mllib.tree.model.Node
rightOuterJoin(JavaPairRDD<K, W>, Partitioner) - Method in class org.apache.spark.api.java.JavaPairRDD: Perform a right outer join of this and other.
rightOuterJoin(JavaPairRDD<K, W>) - Method in class org.apache.spark.api.java.JavaPairRDD: Perform a right outer join of this and other.
rightOuterJoin(JavaPairRDD<K, W>, int) - Method in class org.apache.spark.api.java.JavaPairRDD: Perform a right outer join of this and other.
rightOuterJoin(RDD<Tuple2<K, W>>, Partitioner) - Method in class org.apache.spark.rdd.PairRDDFunctions: Perform a right outer join of this and other.
rightOuterJoin(RDD<Tuple2<K, W>>) - Method in class org.apache.spark.rdd.PairRDDFunctions: Perform a right outer join of this and other.
rightOuterJoin(RDD<Tuple2<K, W>>, int) - Method in class org.apache.spark.rdd.PairRDDFunctions: Perform a right outer join of this and other.
rightOuterJoin(JavaPairDStream<K, W>) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream: Return a new DStream by applying 'right outer join' between RDDs of this DStream and other DStream.
rightOuterJoin(JavaPairDStream<K, W>, int) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream: Return a new DStream by applying 'right outer join' between RDDs of this DStream and other DStream.
rightOuterJoin(JavaPairDStream<K, W>, Partitioner) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream: Return a new DStream by applying 'right outer join' between RDDs of this DStream and other DStream.
rightOuterJoin(DStream<Tuple2<K, W>>, ClassTag<W>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions: Return a new DStream by applying 'right outer join' between RDDs of this DStream and other DStream.
rightOuterJoin(DStream<Tuple2<K, W>>, int, ClassTag<W>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions: Return a new DStream by applying 'right outer join' between RDDs of this DStream and other DStream.
rightOuterJoin(DStream<Tuple2<K, W>>, Partitioner, ClassTag<W>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions: Return a new DStream by applying 'right outer join' between RDDs of this DStream and other DStream.
roc() - Method in class org.apache.spark.mllib.evaluation.BinaryClassificationMetrics: Returns the receiver operating characteristic (ROC) curve, which is an RDD of (false positive rate, true positive rate) with (0.0, 0.0) prepended and (1.0, 1.0) appended to it.
Row - Class in org.apache.spark.sql.api.java: A result row from a SparkSQL query.
Row(Row) - Constructor for class org.apache.spark.sql.api.java.Row
row() - Method in class org.apache.spark.sql.api.java.Row
RowMatrix - Class in org.apache.spark.mllib.linalg.distributed: :: Experimental :: Represents a row-oriented distributed Matrix with no meaningful row indices.
RowMatrix(RDD<Vector>, long, int) - Constructor for class org.apache.spark.mllib.linalg.distributed.RowMatrix
RowMatrix(RDD<Vector>) - Constructor for class org.apache.spark.mllib.linalg.distributed.RowMatrix: Alternative constructor leaving matrix dimensions to be determined automatically.
rows() - Method in class org.apache.spark.mllib.linalg.distributed.IndexedRowMatrix
rows() - Method in class org.apache.spark.mllib.linalg.distributed.RowMatrix
run(Function0<T>, ExecutionContext) - Method in class org.apache.spark.ComplexFutureAction: Executes some action enclosed in the closure.
run(RDD<LabeledPoint>) - Method in class org.apache.spark.mllib.classification.NaiveBayes: Run the algorithm with the configured parameters on an input RDD of LabeledPoint entries.
run(RDD<Vector>) - Method in class org.apache.spark.mllib.clustering.KMeans: Train a K-means model on the given set of points; data should be cached for high performance, because this is an iterative algorithm.
run(RDD<Rating>) - Method in class org.apache.spark.mllib.recommendation.ALS: Run ALS with the configured parameters on an input RDD of (user, product, rating) triples.
run(RDD<LabeledPoint>) - Method in class org.apache.spark.mllib.regression.GeneralizedLinearAlgorithm: Run the algorithm with the configured parameters on an input RDD of LabeledPoint entries.
run(RDD<LabeledPoint>, Vector) - Method in class org.apache.spark.mllib.regression.GeneralizedLinearAlgorithm: Run the algorithm with the configured parameters on an input RDD of LabeledPoint entries starting from the initial weights provided.
runApproximateJob(RDD<T>, Function2<TaskContext, Iterator<T>, U>, <any>, long) - Method in class org.apache.spark.SparkContext: :: DeveloperApi :: Run a job that can return approximate results.
runJob(RDD<T>, Function1<Iterator<T>, U>, Seq<Object>, Function2<Object, U, BoxedUnit>, Function0<R>) - Method in class org.apache.spark.ComplexFutureAction: Runs a Spark job.
runJob(RDD<T>, Function2<TaskContext, Iterator<T>, U>, Seq<Object>, boolean, Function2<Object, U, BoxedUnit>, ClassTag) - Method in class org.apache.spark.SparkContext: Run a function on a given set of partitions in an RDD and pass the results to the given handler function.
runJob(RDD<T>, Function2<TaskContext, Iterator<T>, U>, Seq<Object>, boolean, ClassTag) - Method in class org.apache.spark.SparkContext: Run a function on a given set of partitions in an RDD and return the results as an array.
runJob(RDD<T>, Function1<Iterator<T>, U>, Seq<Object>, boolean, ClassTag) - Method in class org.apache.spark.SparkContext: Run a job on a given set of partitions of an RDD, but take a function of type Iterator[T] => U instead of (TaskContext, Iterator[T]) => U.
runJob(RDD<T>, Function2<TaskContext, Iterator<T>, U>, ClassTag) - Method in class org.apache.spark.SparkContext: Run a job on all partitions in an RDD and return the results in an array.
runJob(RDD<T>, Function1<Iterator<T>, U>, ClassTag) - Method in class org.apache.spark.SparkContext: Run a job on all partitions in an RDD and return the results in an array.
runJob(RDD<T>, Function2<TaskContext, Iterator<T>, U>, Function2<Object, U, BoxedUnit>, ClassTag) - Method in class org.apache.spark.SparkContext: Run a job on all partitions in an RDD and pass the results to a handler function.
runJob(RDD<T>, Function1<Iterator<T>, U>, Function2<Object, U, BoxedUnit>, ClassTag) - Method in class org.apache.spark.SparkContext: Run a job on all partitions in an RDD and pass the results to a handler function.
runLBFGS(RDD<Tuple2<Object, Vector>>, Gradient, Updater, int, double, int, double, Vector) - Static method in class org.apache.spark.mllib.optimization.LBFGS: Run Limited-memory BFGS (L-BFGS) in parallel.
runMiniBatchSGD(RDD<Tuple2<Object, Vector>>, Gradient, Updater, double, int, double, double, Vector) - Static method in class org.apache.spark.mllib.optimization.GradientDescent: Run stochastic gradient descent (SGD) in parallel using mini batches.
running() - Method in class org.apache.spark.scheduler.TaskInfo
runningLocally() - Method in class org.apache.spark.TaskContext
runSqlHive(String) - Method in class org.apache.spark.sql.hive.test.TestHiveContext
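
The rightOuterJoin entries above all describe the same semantics: every pair from other appears in the result, paired with each matching value from this, or with an absent value when this has no such key. The following is a minimal plain-Java sketch of that behaviour on in-memory lists. It does not use Spark and is not how Spark implements the join; the class RightOuterJoinSketch, the helper Pair, and the "Some(...)"/"None" string rendering are hypothetical names chosen for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class RightOuterJoinSketch {

    /** Hypothetical (key, value) holder; stands in for a tuple, not a Spark type. */
    static final class Pair<K, V> {
        final K key;
        final V value;

        Pair(K key, V value) {
            this.key = key;
            this.value = value;
        }
    }

    /**
     * Right outer join of two in-memory lists of pairs.
     * Each right-side pair is emitted once per matching left-side pair,
     * or once with "None" on the left when no left key matches.
     */
    static <K, V, W> List<String> rightOuterJoin(List<Pair<K, V>> left,
                                                 List<Pair<K, W>> right) {
        List<String> out = new ArrayList<>();
        for (Pair<K, W> r : right) {
            boolean matched = false;
            for (Pair<K, V> l : left) {
                if (l.key.equals(r.key)) {
                    out.add("(" + r.key + ", (Some(" + l.value + "), " + r.value + "))");
                    matched = true;
                }
            }
            if (!matched) {
                // The right-side pair is kept even though the left side has no such key.
                out.add("(" + r.key + ", (None, " + r.value + "))");
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<Pair<String, Integer>> left = Arrays.asList(
                new Pair<>("a", 1), new Pair<>("b", 2));
        List<Pair<String, String>> right = Arrays.asList(
                new Pair<>("a", "x"), new Pair<>("c", "y"));

        // Key "b" exists only on the left, so it is dropped;
        // key "c" exists only on the right, so it is kept with None.
        for (String row : rightOuterJoin(left, right)) {
            System.out.println(row);
        }
    }
}
```

The nested loop is quadratic and only meant to make the semantics visible; the Partitioner and int overloads listed above control how a real distributed join is partitioned, which this sketch does not model.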

S

s() - Method in class org.apache.spark.mllib.linalg.SingularValueDecomposition
sample(boolean, Double) - Method in class org.apache.spark.api.java.JavaDoubleRDD: Return a sampled subset of this RDD.
sample(boolean, Double, long) - Method in class org.apache.spark.api.java.JavaDoubleRDD: Return a sampled subset of this RDD.
sample(boolean, double) - Method in class org.apache.spark.api.java.JavaPairRDD: Return a sampled subset of this RDD.
sample(boolean, double, long) - Method in class org.apache.spark.api.java.JavaPairRDD: Return a sampled subset of this RDD.
sample(boolean, double) - Method in class org.apache.spark.api.java.JavaRDD: Return a sampled subset of this RDD.
sample(boolean, double, long) - Method in class org.apache.spark.api.java.JavaRDD: Return a sampled subset of this RDD.
sample(boolean, double, long) - Method in class org.apache.spark.rdd.RDD: Return a sampled subset of this RDD.
Sample - Class in org.apache.spark.sql.execution: :: DeveloperApi ::
Sample(double, boolean, long, SparkPlan) - Constructor for class org.apache.spark.sql.execution.Sample
sample(boolean, double, long) - Method in class org.apache.spark.sql.SchemaRDD: :: Experimental :: Returns a sampled version of the underlying dataset.
sample(Iterator<T>) - Method in class org.apache.spark.util.random.BernoulliSampler
sample(Iterator<T>) - Method in class org.apache.spark.util.random.PoissonSampler
sample(Iterator<T>) - Method in interface org.apache.spark.util.random.RandomSampler: Take a random sample.
sampleStdev() - Method in class org.apache.spark.api.java.JavaDoubleRDD: Compute the sample standard deviation of this RDD's elements (which corrects for bias in estimating the standard deviation by dividing by N-1 instead of N).
sampleStdev() - Method in class org.apache.spark.rdd.DoubleRDDFunctions: Compute the sample standard deviation of this RDD's elements (which corrects for bias in estimating the standard deviation by dividing by N-1 instead of N).
sampleStdev() - Method in class org.apache.spark.util.StatCounter: Return the sample standard deviation of the values, which corrects for bias in estimating the variance by dividing by N-1 instead of N.
sampleVariance() - Method in class org.apache.spark.api.java.JavaDoubleRDD: Compute the sample variance of this RDD's elements (which corrects for bias in estimating the variance by dividing by N-1 instead of N).
sampleVariance() - Method in class org.apache.spark.rdd.DoubleRDDFunctions: Compute the sample variance of this RDD's elements (which corrects for bias in estimating the variance by dividing by N-1 instead of N).
sampleVariance() - Method in class org.apache.spark.util.StatCounter: Return the sample variance, which corrects for bias in estimating the variance by dividing by N-1 instead of N.
saveAsHadoopDataset(JobConf) - Method in class org.apache.spark.api.java.JavaPairRDD: Output the RDD to any Hadoop-supported storage system, using a Hadoop JobConf object for that storage system.
saveAsHadoopDataset(JobConf) - Method in class org.apache.spark.rdd.PairRDDFunctions: Output the RDD to any Hadoop-supported storage system, using a Hadoop JobConf object for that storage system.
saveAsHadoopFile(String, Class<?>, Class<?>, Class<F>, JobConf) - Method in class org.apache.spark.api.java.JavaPairRDD: Output the RDD to any Hadoop-supported file system.
saveAsHadoopFile(String, Class<?>, Class<?>, Class<F>) - Method in class org.apache.spark.api.java.JavaPairRDD: Output the RDD to any Hadoop-supported file system.
saveAsHadoopFile(String, Class<?>, Class<?>, Class<F>, Class<? extends CompressionCodec>) - Method in class org.apache.spark.api.java.JavaPairRDD: Output the RDD to any Hadoop-supported file system, compressing with the supplied codec.
saveAsHadoopFile(String, ClassTag<F>) - Method in class org.apache.spark.rdd.PairRDDFunctions: Output the RDD to any Hadoop-supported file system, using a Hadoop OutputFormat class supporting the key and value types K and V in this RDD.
saveAsHadoopFile(String, Class<? extends CompressionCodec>, ClassTag<F>) - Method in class org.apache.spark.rdd.PairRDDFunctions: Output the RDD to any Hadoop-supported file system, using a Hadoop OutputFormat class supporting the key and value types K and V in this RDD.
saveAsHadoopFile(String, Class<?>, Class<?>, Class<? extends OutputFormat<?, ?>>, Class<? extends CompressionCodec>) - Method in class org.apache.spark.rdd.PairRDDFunctions: Output the RDD to any Hadoop-supported file system, using a Hadoop OutputFormat class supporting the key and value types K and V in this RDD.
saveAsHadoopFile(String, Class<?>, Class<?>, Class<? extends OutputFormat<?, ?>>, JobConf, Option<Class<? extends CompressionCodec>>) - Method in class org.apache.spark.rdd.PairRDDFunctions: Output the RDD to any Hadoop-supported file system, using a Hadoop OutputFormat class supporting the key and value types K and V in this RDD.
saveAsHadoopFiles(String, String) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream: Save each RDD in this DStream as a Hadoop file.
saveAsHadoopFiles(String, String, Class<?>, Class<?>, Class<? extends OutputFormat<?, ?>>) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream: Save each RDD in this DStream as a Hadoop file.
saveAsHadoopFiles(String, String, Class<?>, Class<?>, Class<? extends OutputFormat<?, ?>>, JobConf) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream: Save each RDD in this DStream as a Hadoop file.
saveAsHadoopFiles(String, String, ClassTag<F>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions: Save each RDD in this DStream as a Hadoop file.
saveAsHadoopFiles(String, String, Class<?>, Class<?>, Class<? extends OutputFormat<?, ?>>, JobConf) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions: Save each RDD in this DStream as a Hadoop file.
saveAsHiveFile(RDD<Writable>, Class<?>, FileSinkDesc, JobConf, boolean) - Method in class org.apache.spark.sql.hive.execution.InsertIntoHiveTable
saveAsLibSVMFile(RDD<LabeledPoint>, String) - Static method in class org.apache.spark.mllib.util.MLUtils: Save labeled data in LIBSVM format.
saveAsNewAPIHadoopDataset(Configuration) - Method in class org.apache.spark.api.java.JavaPairRDD: Output the RDD to any Hadoop-supported storage system, using a Configuration object for that storage system.
saveAsNewAPIHadoopDataset(Configuration) - Method in class org.apache.spark.rdd.PairRDDFunctions: Output the RDD to any Hadoop-supported storage system with new Hadoop API, using a Hadoop Configuration object for that storage system.
saveAsNewAPIHadoopFile(String, Class<?>, Class<?>, Class<F>, Configuration) - Method in class org.apache.spark.api.java.JavaPairRDD: Output the RDD to any Hadoop-supported file system.
saveAsNewAPIHadoopFile(String, Class<?>, Class<?>, Class<F>) - Method in class org.apache.spark.api.java.JavaPairRDD: Output the RDD to any Hadoop-supported file system.
saveAsNewAPIHadoopFile(String, ClassTag<F>) - Method in class org.apache.spark.rdd.PairRDDFunctions: Output the RDD to any Hadoop-supported file system, using a new Hadoop API OutputFormat (mapreduce.OutputFormat) object supporting the key and value types K and V in this RDD.
saveAsNewAPIHadoopFile(String, Class<?>, Class<?>, Class<? extends OutputFormat<?, ?>>, Configuration) - Method in class org.apache.spark.rdd.PairRDDFunctions: Output the RDD to any Hadoop-supported file system, using a new Hadoop API OutputFormat (mapreduce.OutputFormat) object supporting the key and value types K and V in this RDD.
saveAsNewAPIHadoopFiles(String, String) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream: Save each RDD in this DStream as a Hadoop file.
saveAsNewAPIHadoopFiles(String, String, Class<?>, Class<?>, Class<? extends OutputFormat<?, ?>>) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream: Save each RDD in this DStream as a Hadoop file.
saveAsNewAPIHadoopFiles(String, String, Class<?>, Class<?>, Class<? extends OutputFormat<?, ?>>, Configuration) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream: Save each RDD in this DStream as a Hadoop file.
saveAsNewAPIHadoopFiles(String, String, ClassTag<F>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions: Save each RDD in this DStream as a Hadoop file.
saveAsNewAPIHadoopFiles(String, String, Class<?>, Class<?>, Class<? extends OutputFormat<?, ?>>, Configuration) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions: Save each RDD in this DStream as a Hadoop file.
saveAsObjectFile(String) - Method in interface org.apache.spark.api.java.JavaRDDLike: Save this RDD as a SequenceFile of serialized objects.
saveAsObjectFile(String) - Method in class org.apache.spark.rdd.RDD: Save this RDD as a SequenceFile of serialized objects.
saveAsObjectFiles(String, String) - Method in class org.apache.spark.streaming.dstream.DStream: Save each RDD in this DStream as a SequenceFile of serialized objects.
saveAsSequenceFile(String, Option<Class<? extends CompressionCodec>>) - Method in class org.apache.spark.rdd.SequenceFileRDDFunctions: Output the RDD as a Hadoop SequenceFile using the Writable types we infer from the RDD's key and value types.
saveAsTextFile(String) - Method in interface org.apache.spark.api.java.JavaRDDLike: Save this RDD as a text file, using string representations of elements.
saveAsTextFile(String, Class<? extends CompressionCodec>) - Method in interface org.apache.spark.api.java.JavaRDDLike: Save this RDD as a compressed text file, using string representations of elements.
saveAsTextFile(String) - Method in class org.apache.spark.rdd.RDD: Save this RDD as a text file, using string representations of elements.
saveAsTextFile(String, Class<? extends CompressionCodec>) - Method in class org.apache.spark.rdd.RDD: Save this RDD as a compressed text file, using string representations of elements.
saveAsTextFiles(String, String) - Method in class org.apache.spark.streaming.dstream.DStream: Save each RDD in this DStream as a text file, using string representations of elements.
saveLabeledData(RDD<LabeledPoint>, String) - Static method in class org.apache.spark.mllib.util.MLUtils: :: Experimental :: Save labeled data to a file.
sc() - Method in class org.apache.spark.api.java.JavaSparkContext
sc() - Method in class org.apache.spark.sql.hive.execution.HiveTableScan
sc() - Method in class org.apache.spark.sql.parquet.InsertIntoParquetTable
sc() - Method in class org.apache.spark.sql.parquet.ParquetTableScan
sc() - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
sc() - Method in class org.apache.spark.streaming.StreamingContext
scalaIntToJavaLong(DStream<Object>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
scalaToJavaLong(JavaPairDStream<K, Object>, ClassTag<K>) - Static method in class org.apache.spark.streaming.api.java.JavaPairDStream
scheduler() - Method in class org.apache.spark.streaming.StreamingContext
schedulingDelay() - Method in class org.apache.spark.streaming.scheduler.BatchInfo: Time taken for the first job of this batch to start processing from the time this batch was submitted to the streaming scheduler.
SchedulingMode - Class in org.apache.spark.scheduler: "FAIR" and "FIFO" determine which policy is used to order tasks amongst a Schedulable's sub-queues; "NONE" is used when a Schedulable has no sub-queues.
SchedulingMode() - Constructor for class org.apache.spark.scheduler.SchedulingMode
schedulingMode() - Method in class org.apache.spark.ui.jobs.JobProgressListener
SchemaRDD - Class in org.apache.spark.sql: :: AlphaComponent :: An RDD of Row objects that has an associated schema.
SchemaRDD(SQLContext, LogicalPlan) - Constructor for class org.apache.spark.sql.SchemaRDD
script() - Method in class org.apache.spark.sql.hive.execution.ScriptTransformation
ScriptTransformation - Class in org.apache.spark.sql.hive.execution: :: DeveloperApi :: Transforms the input by forking and running the specified script.
ScriptTransformation(Seq<Expression>, String, Seq<Attribute>, SparkPlan, HiveContext) - Constructor for class org.apache.spark.sql.hive.execution.ScriptTransformation
seconds() - Static method in class org.apache.spark.scheduler.StatsReportListener
Seconds - Class in org.apache.spark.streaming: Helper object that creates an instance of Duration representing a given number of seconds.
Seconds() - Constructor for class org.apache.spark.streaming.Seconds
securityManager() - Method in class org.apache.spark.SparkEnv
seed() - Method in class org.apache.spark.sql.execution.Sample
select(Seq<NamedExpression>) - Method in class org.apache.spark.sql.SchemaRDD: Changes the output of this relation to the given expressions, similar to the SELECT clause in SQL.
sequenceFile(String, Class<K>, Class<V>, int) - Method in class org.apache.spark.api.java.JavaSparkContext: Get an RDD for a Hadoop SequenceFile with given key and value types.
sequenceFile(String, Class<K>, Class<V>) - Method in class org.apache.spark.api.java.JavaSparkContext: Get an RDD for a Hadoop SequenceFile.
sequenceFile(String, Class<K>, Class<V>, int) - Method in class org.apache.spark.SparkContext: Get an RDD for a Hadoop SequenceFile with given key and value types.
sequenceFile(String, Class<K>, Class<V>) - Method in class org.apache.spark.SparkContext: Get an RDD for a Hadoop SequenceFile with given key and value types.
sequenceFile(String, int, ClassTag<K>, ClassTag<V>, Function0<WritableConverter<K>>, Function0<WritableConverter<V>>) - Method in class org.apache.spark.SparkContext: Version of sequenceFile() for types implicitly convertible to Writables through a WritableConverter.
SequenceFileRDDFunctions<K,V> - Class in org.apache.spark.rdd: Extra functions available on RDDs of (key, value) pairs to create a Hadoop SequenceFile, through an implicit conversion.
SequenceFileRDDFunctions(RDD<Tuple2<K, V>>, Function1<K, Writable>, ClassTag<K>, Function1<V, Writable>, ClassTag<V>) - Constructor for class org.apache.spark.rdd.SequenceFileRDDFunctions
SerializableWritable<T extends org.apache.hadoop.io.Writable> - Class in org.apache.spark
SerializableWritable(T) - Constructor for class org.apache.spark.SerializableWritable
SerializationStream - Interface in org.apache.spark.serializer: :: DeveloperApi :: A stream for writing serialized objects.
serialize(T, ClassTag<T>) - Method in interface org.apache.spark.serializer.SerializerInstance
serializedSize() - Method in class org.apache.spark.scheduler.TaskInfo
serializeMany(Iterator<T>, ClassTag<T>) - Method in interface org.apache.spark.serializer.SerializerInstance
Serializer - Interface in org.apache.spark.serializer: :: DeveloperApi :: A serializer.
serializer() - Method in class org.apache.spark.ShuffleDependency
serializer() - Method in class org.apache.spark.SparkEnv
SerializerInstance - Interface in org.apache.spark.serializer: :: DeveloperApi :: An instance of a serializer, for use by one thread at a time.
serializeStream(OutputStream) - Method in interface org.apache.spark.serializer.SerializerInstance
set(String, String) - Method in class org.apache.spark.SparkConf: Set a configuration variable.
set(SparkEnv) - Static method in class org.apache.spark.SparkEnv
setAll(Traversable<Tuple2<String, String>>) - Method in class org.apache.spark.SparkConf: Set multiple parameters together
setAlpha(double) - Method in class org.apache.spark.mllib.recommendation.ALS: :: Experimental :: Sets the constant used in computing confidence in implicit ALS.
setAppName(String) - Method in class org.apache.spark.SparkConf: Set a name for your application.
setBlocks(int) - Method in class org.apache.spark.mllib.recommendation.ALS: Set the number of blocks to parallelize the computation into; pass -1 for an auto-configured number of blocks.
setCallSite(String) - Method in class org.apache.spark.api.java.JavaSparkContext: Pass-through to SparkContext.setCallSite.
setCallSite(String) - Method in class org.apache.spark.SparkContext: Support function for API backtraces.
setCheckpointDir(String) - Method in class org.apache.spark.api.java.JavaSparkContext: Set the directory under which RDDs are going to be checkpointed.
setCheckpointDir(String) - Method in class org.apache.spark.SparkContext: Set the directory under which RDDs are going to be checkpointed.
setConvergenceTol(int) - Method in class org.apache.spark.mllib.optimization.LBFGS: Set the convergence tolerance of iterations for L-BFGS.
setEpsilon(double) - Method in class org.apache.spark.mllib.clustering.KMeans: Set the distance threshold within which we consider centers to have converged.
setExecutorEnv(String, String) - Method in class org.apache.spark.SparkConf: Set an environment variable to be used when launching executors for this application.
setExecutorEnv(Seq<Tuple2<String, String>>) - Method in class org.apache.spark.SparkConf: Set multiple environment variables to be used when launching executors.
setExecutorEnv(Tuple2<String, String>[]) - Method in class org.apache.spark.SparkConf: Set multiple environment variables to be used when launching executors.
setGradient(Gradient) - Method in class org.apache.spark.mllib.optimization.GradientDescent: Set the gradient function (of the loss function of one single data example) to be used for SGD.
setGradient(Gradient) - Method in class org.apache.spark.mllib.optimization.LBFGS: Set the gradient function (of the loss function of one single data example) to be used for L-BFGS.
setIfMissing(String, String) - Method in class org.apache.spark.SparkConf: Set a parameter if it isn't already configured
setImplicitPrefs(boolean) - Method in class org.apache.spark.mllib.recommendation.ALS: Sets whether to use implicit preference.
setInitializationMode(String) - Method in class org.apache.spark.mllib.clustering.KMeans: Set the initialization algorithm.
setInitializationSteps(int) - Method in class org.apache.spark.mllib.clustering.KMeans: Set the number of steps for the k-means|| initialization mode.
setIntercept(boolean) - Method in class org.apache.spark.mllib.regression.GeneralizedLinearAlgorithm: Set if the algorithm should add an intercept.
setIterations(int) - Method in class org.apache.spark.mllib.recommendation.ALS: Set the number of iterations to run.
setJars(Seq<String>) - Method in class org.apache.spark.SparkConf: Set JAR files to distribute to the cluster.
setJars(String[]) - Method in class org.apache.spark.SparkConf: Set JAR files to distribute to the cluster.
setJobDescription(String) - Method in class org.apache.spark.SparkContext: Set a human readable description of the current job.
setJobGroup(String, String, boolean) - Method in class org.apache.spark.api.java.JavaSparkContext: Assigns a group ID to all the jobs started by this thread until the group ID is set to a different value or cleared.
setJobGroup(String, String) - Method in class org.apache.spark.api.java.JavaSparkContext: Assigns a group ID to all the jobs started by this thread until the group ID is set to a different value or cleared.
setJobGroup(String, String, boolean) - Method in class org.apache.spark.SparkContext: Assigns a group ID to all the jobs started by this thread until the group ID is set to a different value or cleared.
setK(int) - Method in class org.apache.spark.mllib.clustering.KMeans: Set the number of clusters to create (k).
setLambda(double) - Method in class org.apache.spark.mllib.classification.NaiveBayes: Set the smoothing parameter.
setLambda(double) - Method in class org.apache.spark.mllib.recommendation.ALS: Set the regularization parameter, lambda.
setLocalProperty(String, String) - Method in class org.apache.spark.api.java.JavaSparkContext: Set a local property that affects jobs submitted from this thread, such as the Spark fair scheduler pool.
setLocalProperty(String, String) - Method in class org.apache.spark.SparkContext: Set a local property that affects jobs submitted from this thread, such as the Spark fair scheduler pool.
setMaster(String) - Method in class org.apache.spark.SparkConf: The master URL to connect to, such as "local" to run locally with one thread, "local[4]" to run locally with 4 cores, or "spark://master:7077" to run on a Spark standalone cluster.
setMaxIterations(int) - Method in class org.apache.spark.mllib.clustering.KMeans: Set maximum number of iterations to run.
setMaxNumIterations(int) - Method in class org.apache.spark.mllib.optimization.LBFGS: Set the maximal number of iterations for L-BFGS.
setMiniBatchFraction(double) - Method in class org.apache.spark.mllib.optimization.GradientDescent: :: Experimental :: Set fraction of data to be used for each SGD iteration.
setName(String) - Method in class org.apache.spark.api.java.JavaDoubleRDD: Assign a name to this RDD
setName(String) - Method in class org.apache.spark.api.java.JavaPairRDD: Assign a name to this RDD
setName(String) - Method in class org.apache.spark.api.java.JavaRDD: Assign a name to this RDD
setName(String) - Method in class org.apache.spark.rdd.RDD: Assign a name to this RDD
setName(String) - Method in class org.apache.spark.sql.api.java.JavaSchemaRDD: Assign a name to this RDD
setNumCorrections(int) - Method in class org.apache.spark.mllib.optimization.LBFGS: Set the number of corrections used in the LBFGS update.
setNumIterations(int) - Method in class org.apache.spark.mllib.optimization.GradientDescent: Set the number of iterations for SGD.
setRank(int) - Method in class org.apache.spark.mllib.recommendation.ALS: Set the rank of the feature matrices computed (number of features).
setRegParam(double) - Method in class org.apache.spark.mllib.optimization.GradientDescent: Set the regularization parameter.
setRegParam(double) - Method in class org.apache.spark.mllib.optimization.LBFGS: Set the regularization parameter.
setRuns(int) - Method in class org.apache.spark.mllib.clustering.KMeans: :: Experimental :: Set the number of runs of the algorithm to execute in parallel.
setSeed(long) - Method in class org.apache.spark.mllib.recommendation.ALS: Sets a random seed to have deterministic results.
setSeed(long) - Method in class org.apache.spark.util.random.BernoulliSampler
setSeed(long) - Method in class org.apache.spark.util.random.PoissonSampler
setSeed(long) - Method in interface org.apache.spark.util.random.Pseudorandom: Set random seed.
setSerializer(Serializer) - Method in class org.apache.spark.rdd.CoGroupedRDD
setSerializer(Serializer) - Method in class org.apache.spark.rdd.ShuffledRDD
setSparkHome(String) - Method in class org.apache.spark.SparkConf: Set the location where Spark is installed on worker nodes.
setStepSize(double) - Method in class org.apache.spark.mllib.optimization.GradientDescent: Set the initial step size of SGD for the first step.
setThreshold(double) - Method in class org.apache.spark.mllib.classification.LogisticRegressionModel: :: Experimental :: Sets the threshold that separates positive predictions from negative predictions.
setThreshold(double) - Method in class org.apache.spark.mllib.classification.SVMModel: :: Experimental :: Sets the threshold that separates positive predictions from negative predictions.
setUpdater(Updater) - Method in class org.apache.spark.mllib.optimization.GradientDescent: Set the updater function to actually perform a gradient step in a given direction.
setUpdater(Updater) - Method in class org.apache.spark.mllib.optimization.LBFGS: Set the updater function to actually perform a gradient step in a given direction.
setValidateData(boolean) - Method in class org.apache.spark.mllib.regression.GeneralizedLinearAlgorithm: Set whether the algorithm should validate data before training.
setValue(R) - Method in class org.apache.spark.Accumulable: Set the accumulator's value; only allowed on master
showBytesDistribution(String, Function2<TaskInfo, TaskMetrics, Option<Object>>, Seq<Tuple2<TaskInfo, TaskMetrics>>) - Static method in class org.apache.spark.scheduler.StatsReportListener
showBytesDistribution(String, Option<org.apache.spark.util.Distribution>) - Static method in class org.apache.spark.scheduler.StatsReportListener
showBytesDistribution(String, org.apache.spark.util.Distribution) - Static method in class org.apache.spark.scheduler.StatsReportListener
showDistribution(String, org.apache.spark.util.Distribution, Function1<Object, String>) - Static method in class org.apache.spark.scheduler.StatsReportListener
showDistribution(String, Option<org.apache.spark.util.Distribution>, Function1<Object, String>) - Static method in class org.apache.spark.scheduler.StatsReportListener
showDistribution(String, Option<org.apache.spark.util.Distribution>, String) - Static method in class org.apache.spark.scheduler.StatsReportListener
showDistribution(String, String, Function2<TaskInfo, TaskMetrics, Option<Object>>, Seq<Tuple2<TaskInfo, TaskMetrics>>) - Static method in class org.apache.spark.scheduler.StatsReportListener
showMillisDistribution(String, Option<org.apache.spark.util.Distribution>) - Static method in class org.apache.spark.scheduler.StatsReportListener
showMillisDistribution(String, Function2<TaskInfo, TaskMetrics, Option<Object>>, Seq<Tuple2<TaskInfo, TaskMetrics>>) - Static method in class org.apache.spark.scheduler.StatsReportListener
showMillisDistribution(String, Function1<BatchInfo, Option<Object>>) - Method in class org.apache.spark.streaming.scheduler.StatsReportListener
SHUFFLE() - Static method in class org.apache.spark.storage.BlockId
ShuffleBlockId - Class in org.apache.spark.storage
ShuffleBlockId(int, int, int) - Constructor for class org.apache.spark.storage.ShuffleBlockId
ShuffleDependency<K,V> - Class in org.apache.spark: :: DeveloperApi :: Represents a dependency on the output of a shuffle stage.
ShuffleDependency(RDD<? extends Product2<K, V>>, Partitioner, Serializer) - Constructor for class org.apache.spark.ShuffleDependency
ShuffledRDD<K,V,P extends scala.Product2<K,V>> - Class in org.apache.spark.rdd: :: DeveloperApi :: The resulting RDD from a shuffle (e.g.
ShuffledRDD(RDD, Partitioner, ClassTag) - Constructor for class org.apache.spark.rdd.ShuffledRDD
shuffleFetcher() - Method in class org.apache.spark.SparkEnv
shuffleId() - Method in class org.apache.spark.FetchFailed
shuffleId() - Method in class org.apache.spark.ShuffleDependency
shuffleId() - Method in class org.apache.spark.storage.ShuffleBlockId
shuffleMemoryMap() - Method in class org.apache.spark.SparkEnv
shuffleRead() - Method in class org.apache.spark.ui.jobs.ExecutorSummary
shuffleWrite() - Method in class org.apache.spark.ui.jobs.ExecutorSummary
SimpleFutureAction<T> - Class in org.apache.spark: :: Experimental :: A FutureAction holding the result of an action that triggers a single job.
SimpleUpdater - Class in org.apache.spark.mllib.optimization: :: DeveloperApi :: A simple updater for gradient descent *without* any regularization.
SimpleUpdater() - Constructor for class org.apache.spark.mllib.optimization.SimpleUpdater
SingularValueDecomposition<UType,VType> - Class in org.apache.spark.mllib.linalg: :: Experimental :: Represents singular value decomposition (SVD) factors.
SingularValueDecomposition(UType, Vector, VType) - Constructor for class org.apache.spark.mllib.linalg.SingularValueDecomposition
size() - Method in class org.apache.spark.mllib.linalg.DenseVector
size() - Method in class org.apache.spark.mllib.linalg.SparseVector
size() - Method in interface org.apache.spark.mllib.linalg.Vector: Size of the vector.
slice(Time, Time) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike: Return all the RDDs between 'fromDuration' and 'toDuration' (both included)
slice(org.apache.spark.streaming.Interval) - Method in class org.apache.spark.streaming.dstream.DStream: Return all the RDDs defined by the Interval object (both end times included)
slice(Time, Time) - Method in class org.apache.spark.streaming.dstream.DStream: Return all the RDDs between 'fromTime' and 'toTime' (both included)
slideDuration() - Method in class org.apache.spark.streaming.dstream.DStream: Time interval after which the DStream generates an RDD
slideDuration() - Method in class org.apache.spark.streaming.dstream.InputDStream
SnappyCompressionCodec - Class in org.apache.spark.io: :: DeveloperApi :: Snappy implementation of CompressionCodec.
SnappyCompressionCodec(SparkConf) - Constructor for class org.apache.spark.io.SnappyCompressionCodec
socketStream(String, int, Function<InputStream, Iterable<T>>, StorageLevel) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext: Create an input stream from network source hostname:port.
socketStream(String, int, Function1<InputStream, Iterator<T>>, StorageLevel, ClassTag<T>) - Method in class org.apache.spark.streaming.StreamingContext: Create an input stream from TCP source hostname:port.
socketTextStream(String, int, StorageLevel) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext: Create an input stream from network source hostname:port.
socketTextStream(String, int) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext: Create an input stream from network source hostname:port.
socketTextStream(String, int, StorageLevel) - Method in class org.apache.spark.streaming.StreamingContext: Create an input stream from TCP source hostname:port.
Sort() - Static method in class org.apache.spark.mllib.tree.configuration.QuantileStrategy
Sort - Class in org.apache.spark.sql.execution: :: DeveloperApi ::
Sort(Seq<SortOrder>, boolean, SparkPlan) - Constructor for class org.apache.spark.sql.execution.Sort
sortByKey() - Method in class org.apache.spark.api.java.JavaPairRDD: Sort the RDD by key, so that each partition contains a sorted range of the elements in ascending order.
sortByKey(boolean) - Method in class org.apache.spark.api.java.JavaPairRDD: Sort the RDD by key, so that each partition contains a sorted range of the elements.
sortByKey(Comparator<K>) - Method in class org.apache.spark.api.java.JavaPairRDD: Sort the RDD by key, so that each partition contains a sorted range of the elements.
sortByKey(Comparator<K>, boolean) - Method in class org.apache.spark.api.java.JavaPairRDD: Sort the RDD by key, so that each partition contains a sorted range of the elements.
sortByKey(Comparator<K>, boolean, int) - Method in class org.apache.spark.api.java.JavaPairRDD: Sort the RDD by key, so that each partition contains a sorted range of the elements.
sortByKey(boolean, int) - Method in class org.apache.spark.rdd.OrderedRDDFunctions: Sort the RDD by key, so that each partition contains a sorted range of the elements.
sortOrder() - Method in class org.apache.spark.sql.execution.Sort
sortOrder() - Method in class org.apache.spark.sql.execution.TakeOrdered
SPARK_JOB_DESCRIPTION() - Static method in class org.apache.spark.SparkContext
SPARK_JOB_GROUP_ID() - Static method in class org.apache.spark.SparkContext
SPARK_JOB_INTERRUPT_ON_CANCEL() - Static method in class org.apache.spark.SparkContext
SPARK_UNKNOWN_USER() - Static method in class org.apache.spark.SparkContext
SPARK_VERSION() - Static method in class org.apache.spark.SparkContext
SparkConf - Class in org.apache.spark: Configuration for a Spark application.
SparkConf(boolean) - Constructor for class org.apache.spark.SparkConf
SparkConf() - Constructor for class org.apache.spark.SparkConf: Create a SparkConf that loads defaults from system properties and the classpath
sparkContext() - Method in class org.apache.spark.rdd.RDD: The SparkContext that created this RDD.
SparkContext - Class in org.apache.spark: Main entry point for Spark functionality.
SparkContext(SparkConf) - Constructor for class org.apache.spark.SparkContext
SparkContext() - Constructor for class org.apache.spark.SparkContext: Create a SparkContext that loads settings from system properties (for instance, when launching with ./bin/spark-submit).
SparkContext(SparkConf, Map<String, Set<SplitInfo>>) - Constructor for class org.apache.spark.SparkContext: :: DeveloperApi :: Alternative constructor for setting preferred locations where Spark will create executors.
SparkContext(String, String, SparkConf) - Constructor for class org.apache.spark.SparkContext: Alternative constructor that allows setting common Spark properties directly
SparkContext(String, String, String, Seq<String>, Map<String, String>, Map<String, Set<SplitInfo>>) - Constructor for class org.apache.spark.SparkContext: Alternative constructor that allows setting common Spark properties directly
sparkContext() - Method in class org.apache.spark.sql.SQLContext
sparkContext() - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext: The underlying SparkContext
sparkContext() - Method in class org.apache.spark.streaming.StreamingContext: Return the associated Spark context
SparkContext.DoubleAccumulatorParam$ - Class in org.apache.spark
SparkContext.DoubleAccumulatorParam$() - Constructor for class org.apache.spark.SparkContext.DoubleAccumulatorParam$
SparkContext.FloatAccumulatorParam$ - Class in org.apache.spark
SparkContext.FloatAccumulatorParam$() - Constructor for class org.apache.spark.SparkContext.FloatAccumulatorParam$
SparkContext.IntAccumulatorParam$ - Class in org.apache.spark
SparkContext.IntAccumulatorParam$() - Constructor for class org.apache.spark.SparkContext.IntAccumulatorParam$
SparkContext.LongAccumulatorParam$ - Class in org.apache.spark
SparkContext.LongAccumulatorParam$() - Constructor for class org.apache.spark.SparkContext.LongAccumulatorParam$
SparkEnv - Class in org.apache.spark: :: DeveloperApi :: Holds all the runtime environment objects for a running Spark instance (either master or worker), including the serializer, Akka actor system, block manager, map output tracker, etc.
SparkEnv(String, ActorSystem, Serializer, Serializer, CacheManager, MapOutputTracker, ShuffleFetcher, org.apache.spark.broadcast.BroadcastManager, org.apache.spark.storage.BlockManager, ConnectionManager, SecurityManager, HttpFileServer, String, org.apache.spark.metrics.MetricsSystem, SparkConf) - Constructor for class org.apache.spark.SparkEnv
SparkException - Exception in org.apache.spark
SparkException(String, Throwable) - Constructor for exception org.apache.spark.SparkException
SparkException(String) - Constructor for exception org.apache.spark.SparkException
SparkFiles - Class in org.apache.spark: Resolves paths to files added through SparkContext.addFile().
SparkFiles() - Constructor for class org.apache.spark.SparkFiles
sparkFilesDir() - Method in class org.apache.spark.SparkEnv
SparkFlumeEvent - Class in org.apache.spark.streaming.flume: A wrapper class for AvroFlumeEvents with a custom serialization format.
SparkFlumeEvent() - Constructor for class org.apache.spark.streaming.flume.SparkFlumeEvent
SparkListener - Interface in org.apache.spark.scheduler: :: DeveloperApi :: Interface for listening to events from the Spark scheduler.
SparkListenerApplicationEnd - Class in org.apache.spark.scheduler
SparkListenerApplicationEnd(long) - Constructor for class org.apache.spark.scheduler.SparkListenerApplicationEnd
SparkListenerApplicationStart - Class in org.apache.spark.scheduler
SparkListenerApplicationStart(String, long, String) - Constructor for class org.apache.spark.scheduler.SparkListenerApplicationStart
SparkListenerBlockManagerAdded - Class in org.apache.spark.scheduler
SparkListenerBlockManagerAdded(BlockManagerId, long) - Constructor for class org.apache.spark.scheduler.SparkListenerBlockManagerAdded
SparkListenerBlockManagerRemoved - Class in org.apache.spark.scheduler
SparkListenerBlockManagerRemoved(BlockManagerId) - Constructor for class org.apache.spark.scheduler.SparkListenerBlockManagerRemoved
SparkListenerEnvironmentUpdate - Class in org.apache.spark.scheduler
SparkListenerEnvironmentUpdate(Map<String, Seq<Tuple2<String, String>>>) - Constructor for class org.apache.spark.scheduler.SparkListenerEnvironmentUpdate
SparkListenerEvent - Interface in org.apache.spark.scheduler
SparkListenerJobEnd - Class in org.apache.spark.scheduler
SparkListenerJobEnd(int, JobResult) - Constructor for class org.apache.spark.scheduler.SparkListenerJobEnd
SparkListenerJobStart - Class in org.apache.spark.scheduler
SparkListenerJobStart(int, Seq<Object>, Properties) - Constructor for class org.apache.spark.scheduler.SparkListenerJobStart
SparkListenerStageCompleted - Class in org.apache.spark.scheduler
SparkListenerStageCompleted(StageInfo) - Constructor for class org.apache.spark.scheduler.SparkListenerStageCompleted
SparkListenerStageSubmitted - Class in org.apache.spark.scheduler
SparkListenerStageSubmitted(StageInfo, Properties) - Constructor for class org.apache.spark.scheduler.SparkListenerStageSubmitted
SparkListenerTaskEnd - Class in org.apache.spark.scheduler
SparkListenerTaskEnd(int, String, TaskEndReason, TaskInfo, TaskMetrics) - Constructor for class org.apache.spark.scheduler.SparkListenerTaskEnd
SparkListenerTaskGettingResult - Class in org.apache.spark.scheduler
SparkListenerTaskGettingResult(TaskInfo) - Constructor for class org.apache.spark.scheduler.SparkListenerTaskGettingResult
SparkListenerTaskStart - Class in org.apache.spark.scheduler
SparkListenerTaskStart(int, TaskInfo) - Constructor for class org.apache.spark.scheduler.SparkListenerTaskStart
SparkListenerUnpersistRDD - Class in org.apache.spark.scheduler
SparkListenerUnpersistRDD(int) - Constructor for class org.apache.spark.scheduler.SparkListenerUnpersistRDD
SparkLogicalPlan - Class in org.apache.spark.sql.execution: :: DeveloperApi :: Allows already planned SparkQueries to be linked into logical query plans.
SparkLogicalPlan(SparkPlan) - Constructor for class org.apache.spark.sql.execution.SparkLogicalPlan
SparkPlan - Class in org.apache.spark.sql.execution: :: DeveloperApi ::
SparkPlan() - Constructor for class org.apache.spark.sql.execution.SparkPlan
sparkProperties() - Method in class org.apache.spark.ui.env.EnvironmentListener
sparkUser() - Method in class org.apache.spark.api.java.JavaSparkContext
sparkUser() - Method in class org.apache.spark.scheduler.SparkListenerApplicationStart
sparkUser() - Method in class org.apache.spark.SparkContext
sparse(int, int[], double[]) - Static method in class org.apache.spark.mllib.linalg.Vectors: Creates a sparse vector providing its index array and value array.
sparse(int, Seq<Tuple2<Object, Object>>) - Static method in class org.apache.spark.mllib.linalg.Vectors: Creates a sparse vector using unordered (index, value) pairs.
sparse(int, Iterable<Tuple2<Integer, Double>>) - Static method in class org.apache.spark.mllib.linalg.Vectors: Creates a sparse vector using unordered (index, value) pairs in a Java friendly way.
SparseVector - Class in org.apache.spark.mllib.linalg: A sparse vector represented by an index array and a value array.
SparseVector(int, int[], double[]) - Constructor for class org.apache.spark.mllib.linalg.SparseVector
split() - Method in class org.apache.spark.mllib.tree.model.Node
Split - Class in org.apache.spark.mllib.tree.model: :: DeveloperApi :: Split applied to a feature
Split(int, double, Enumeration.Value, List<Object>) - Constructor for class org.apache.spark.mllib.tree.model.Split
splitId() - Method in class org.apache.spark.TaskContext
splitIndex() - Method in class org.apache.spark.storage.RDDBlockId
SplitInfo - Class in org.apache.spark.scheduler
SplitInfo(Class<?>, String, String, long, Object) - Constructor for class org.apache.spark.scheduler.SplitInfo
splits() - Method in interface org.apache.spark.api.java.JavaRDDLike: Set of partitions in this RDD.
sql(String) - Method in class org.apache.spark.sql.api.java.JavaSQLContext: Executes a query expressed in SQL, returning the result as a JavaSchemaRDD
sql(String) - Method in class org.apache.spark.sql.SQLContext: Executes a SQL query using Spark, returning the result as a SchemaRDD.
sqlContext() - Method in class org.apache.spark.sql.api.java.JavaSchemaRDD
sqlContext() - Method in class org.apache.spark.sql.api.java.JavaSQLContext
sqlContext() - Method in class org.apache.spark.sql.hive.api.java.JavaHiveContext
sqlContext() - Method in class org.apache.spark.sql.SchemaRDD
SQLContext - Class in org.apache.spark.sql: :: AlphaComponent :: The entry point for running relational queries using Spark.
SQLContext(SparkContext) - Constructor for class org.apache.spark.sql.SQLContext
squaredDist(Vector) - Method in class org.apache.spark.util.Vector
SquaredL2Updater - Class in org.apache.spark.mllib.optimization: :: DeveloperApi :: Updater for L2 regularized problems.
SquaredL2Updater() - Constructor for class org.apache.spark.mllib.optimization.SquaredL2Updater
srdd() - Method in class org.apache.spark.api.java.JavaDoubleRDD
ssc() - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
ssc() - Method in class org.apache.spark.streaming.dstream.DStream
stackTrace() - Method in class org.apache.spark.ExceptionFailure
stageFailed(String) - Method in class org.apache.spark.scheduler.StageInfo
stageId() - Method in class org.apache.spark.scheduler.SparkListenerTaskEnd
stageId() - Method in class org.apache.spark.scheduler.SparkListenerTaskStart
stageId() - Method in class org.apache.spark.scheduler.StageInfo
stageId() - Method in class org.apache.spark.TaskContext
stageIds() - Method in class org.apache.spark.scheduler.SparkListenerJobStart
stageIdToDescription() - Method in class org.apache.spark.ui.jobs.JobProgressListener
stageIdToDiskBytesSpilled() - Method in class org.apache.spark.ui.jobs.JobProgressListener
stageIdToExecutorSummaries() - Method in class org.apache.spark.ui.jobs.JobProgressListener
stageIdToMemoryBytesSpilled() - Method in class org.apache.spark.ui.jobs.JobProgressListener
stageIdToPool() - Method in class org.apache.spark.ui.jobs.JobProgressListener
stageIdToShuffleRead() - Method in class org.apache.spark.ui.jobs.JobProgressListener
stageIdToShuffleWrite() - Method in class org.apache.spark.ui.jobs.JobProgressListener
stageIdToTaskData() - Method in class org.apache.spark.ui.jobs.JobProgressListener
stageIdToTasksActive() - Method in class org.apache.spark.ui.jobs.JobProgressListener
stageIdToTasksComplete() - Method in class org.apache.spark.ui.jobs.JobProgressListener
stageIdToTasksFailed() - Method in class org.apache.spark.ui.jobs.JobProgressListener
stageIdToTime() - Method in class org.apache.spark.ui.jobs.JobProgressListener
stageInfo() - Method in class org.apache.spark.scheduler.SparkListenerStageCompleted
stageInfo() - Method in class org.apache.spark.scheduler.SparkListenerStageSubmitted
StageInfo - Class in org.apache.spark.scheduler: :: DeveloperApi :: Stores information about a stage to pass from the scheduler to SparkListeners.
StageInfo(int, String, int, Seq<RDDInfo>) - Constructor for class org.apache.spark.scheduler.StageInfo
start() - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext: Start the execution of the streams.
start() - Method in class org.apache.spark.streaming.dstream.ConstantInputDStream
start() - Method in class org.apache.spark.streaming.dstream.InputDStream: Method called to start receiving data.
start() - Method in class org.apache.spark.streaming.dstream.ReceiverInputDStream
start() - Method in class org.apache.spark.streaming.StreamingContext: Start the execution of the streams.
startTime() - Method in class org.apache.spark.api.java.JavaSparkContext
startTime() - Method in class org.apache.spark.SparkContext
StatCounter - Class in org.apache.spark.util: A class for tracking the statistics of a set of numbers (count, mean and variance) in a numerically robust way.
StatCounter(TraversableOnce<Object>) - Constructor for class org.apache.spark.util.StatCounter
StatCounter() - Constructor for class org.apache.spark.util.StatCounter: Initialize the StatCounter with no values.
state() - Method in class org.apache.spark.streaming.StreamingContext
Statistics - Class in org.apache.spark.streaming.receiver: :: DeveloperApi :: Statistics for querying the supervisor about the state of workers.
Statistics(int, int, int, String) - Constructor for class org.apache.spark.streaming.receiver.Statistics
stats() - Method in class org.apache.spark.api.java.JavaDoubleRDD: Return a StatCounter object that captures the mean, variance and count of the RDD's elements in one operation.
stats() - Method in class org.apache.spark.mllib.tree.model.Node
stats() - Method in class org.apache.spark.rdd.DoubleRDDFunctions: Return a StatCounter object that captures the mean, variance and count of the RDD's elements in one operation.
StatsReportListener - Class in org.apache.spark.scheduler: :: DeveloperApi :: Simple SparkListener that logs a few summary statistics when each stage completes
StatsReportListener() - Constructor for class org.apache.spark.scheduler.StatsReportListener
StatsReportListener - Class in org.apache.spark.streaming.scheduler: :: DeveloperApi :: A simple StreamingListener that logs summary statistics across Spark Streaming batches
StatsReportListener(int) - Constructor for class org.apache.spark.streaming.scheduler.StatsReportListener
status() - Method in class org.apache.spark.scheduler.TaskInfo
stdev() - Method in class org.apache.spark.api.java.JavaDoubleRDD: Compute the standard deviation of this RDD's elements.
stdev() - Method in class org.apache.spark.rdd.DoubleRDDFunctions: Compute the standard deviation of this RDD's elements.
stdev() - Method in class org.apache.spark.util.StatCounter: Return the standard deviation of the values.
stop() - Method in class org.apache.spark.api.java.JavaSparkContext: Shut down the SparkContext.
stop() - Method in interface org.apache.spark.broadcast.BroadcastFactory
stop() - Method in class org.apache.spark.broadcast.HttpBroadcastFactory
stop() - Method in class org.apache.spark.broadcast.TorrentBroadcastFactory
stop() - Method in class org.apache.spark.SparkContext: Shut down the SparkContext.
stop() - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext: Stop the execution of the streams.
stop(boolean) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext: Stop the execution of the streams.
stop(boolean, boolean) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext: Stop the execution of the streams.
stop() - Method in class org.apache.spark.streaming.dstream.ConstantInputDStream
stop() - Method in class org.apache.spark.streaming.dstream.InputDStream: Method called to stop receiving data.
stop() - Method in class org.apache.spark.streaming.dstream.ReceiverInputDStream
stop(String) - Method in class org.apache.spark.streaming.receiver.Receiver: Stop the receiver completely.
stop(String, Throwable) - Method in class org.apache.spark.streaming.receiver.Receiver: Stop the receiver completely due to an exception
stop(boolean) - Method in class org.apache.spark.streaming.StreamingContext: Stop the execution of the streams immediately (does not wait for all received data to be processed).
stop(boolean, boolean) - Method in class org.apache.spark.streaming.StreamingContext: Stop the execution of the streams, with option of ensuring all received data has been processed.
storageLevel() - Method in class org.apache.spark.storage.BlockStatus
storageLevel() - Method in class org.apache.spark.storage.RDDInfo
StorageLevel - Class in org.apache.spark.storage: :: DeveloperApi :: Flags for controlling the storage of an RDD.
StorageLevel() - Constructor for class org.apache.spark.storage.StorageLevel
storageLevel() - Method in class org.apache.spark.streaming.dstream.DStream
storageLevel() - Method in class org.apache.spark.streaming.receiver.Receiver
storageLevelCache() - Static method in class org.apache.spark.storage.StorageLevel: :: DeveloperApi :: Read StorageLevel object from ObjectInput stream.
StorageLevels - Class in org.apache.spark.api.java: Expose some commonly useful storage level constants.
StorageLevels() - Constructor for class org.apache.spark.api.java.StorageLevels
StorageListener - Class in org.apache.spark.ui.storage: :: DeveloperApi :: A SparkListener that prepares information to be displayed on the BlockManagerUI.
StorageListener(StorageStatusListener) - Constructor for class org.apache.spark.ui.storage.StorageListener
StorageStatus - Class in org.apache.spark.storage: :: DeveloperApi :: Storage information for each BlockManager.
StorageStatus(BlockManagerId, long, Map<BlockId, BlockStatus>) - Constructor for class org.apache.spark.storage.StorageStatus
storageStatusList() - Method in class org.apache.spark.storage.StorageStatusListener
storageStatusList() - Method in class org.apache.spark.ui.exec.ExecutorsListener
storageStatusList() - Method in class org.apache.spark.ui.storage.StorageListener
StorageStatusListener - Class in org.apache.spark.storage: :: DeveloperApi :: A SparkListener that maintains executor storage status.
StorageStatusListener() - Constructor for class org.apache.spark.storage.StorageStatusListener
store(Iterator<T>) - Method in interface org.apache.spark.streaming.receiver.ActorHelper: Store an iterator of received data as a data block into Spark's memory.
store(ByteBuffer) - Method in interface org.apache.spark.streaming.receiver.ActorHelper: Store the bytes of received data as a data block into Spark's memory.
store(T) - Method in interface org.apache.spark.streaming.receiver.ActorHelper: Store a single item of received data to Spark's memory.
store(T) - Method in class org.apache.spark.streaming.receiver.Receiver: Store a single item of received data to Spark's memory.
store(ArrayBuffer<T>) - Method in class org.apache.spark.streaming.receiver.Receiver: Store an ArrayBuffer of received data as a data block into Spark's memory.
store(ArrayBuffer<T>, Object) - Method in class org.apache.spark.streaming.receiver.Receiver: Store an ArrayBuffer of received data as a data block into Spark's memory.
store(Iterator<T>) - Method in class org.apache.spark.streaming.receiver.Receiver: Store an iterator of received data as a data block into Spark's memory.
store(Iterator<T>, Object) - Method in class org.apache.spark.streaming.receiver.Receiver: Store an iterator of received data as a data block into Spark's memory.
store(Iterator<T>) - Method in class org.apache.spark.streaming.receiver.Receiver: Store an iterator of received data as a data block into Spark's memory.
store(Iterator<T>, Object) - Method in class org.apache.spark.streaming.receiver.Receiver: Store an iterator of received data as a data block into Spark's memory.
store(ByteBuffer) - Method in class org.apache.spark.streaming.receiver.Receiver: Store the bytes of received data as a data block into Spark's memory.
store(ByteBuffer, Object) - Method in class org.apache.spark.streaming.receiver.Receiver: Store the bytes of received data as a data block into Spark's memory.
Strategy - Class in org.apache.spark.mllib.tree.configuration: :: Experimental :: Stores all the configuration options for tree construction
Strategy(Enumeration.Value, Impurity, int, int, Enumeration.Value, Map<Object, Object>, int) - Constructor for class org.apache.spark.mllib.tree.configuration.Strategy
STREAM() - Static method in class org.apache.spark.storage.BlockId
StreamBlockId - Class in org.apache.spark.storage
StreamBlockId(int, long) - Constructor for class org.apache.spark.storage.StreamBlockId
streamed() - Method in class org.apache.spark.sql.execution.BroadcastNestedLoopJoin
streamedKeys() - Method in class org.apache.spark.sql.execution.HashJoin
streamedPlan() - Method in class org.apache.spark.sql.execution.HashJoin
streamId() - Method in class org.apache.spark.storage.StreamBlockId
streamId() - Method in class org.apache.spark.streaming.receiver.Receiver: Get the unique identifier of the receiver input stream that this receiver is associated with.
streamId() - Method in class org.apache.spark.streaming.scheduler.ReceiverInfo
StreamingContext - Class in org.apache.spark.streaming: Main entry point for Spark Streaming functionality.
StreamingContext(SparkContext, Duration) - Constructor for class org.apache.spark.streaming.StreamingContext: Create a StreamingContext using an existing SparkContext.
StreamingContext(SparkConf, Duration) - Constructor for class org.apache.spark.streaming.StreamingContext: Create a StreamingContext by providing the configuration necessary for a new SparkContext.
StreamingContext(String, String, Duration, String, Seq<String>, Map<String, String>) - Constructor for class org.apache.spark.streaming.StreamingContext: Create a StreamingContext by providing the details necessary for creating a new SparkContext.
StreamingContext(String, Configuration) - Constructor for class org.apache.spark.streaming.StreamingContext: Recreate a StreamingContext from a checkpoint file.
StreamingContextState() - Method in class org.apache.spark.streaming.StreamingContext: Accessor for nested Scala object
StreamingListener - Interface in org.apache.spark.streaming.scheduler: :: DeveloperApi :: A listener interface for receiving information about an ongoing streaming computation.
StreamingListenerBatchCompleted - Class in org.apache.spark.streaming.scheduler
StreamingListenerBatchCompleted(BatchInfo) - Constructor for class org.apache.spark.streaming.scheduler.StreamingListenerBatchCompleted
StreamingListenerBatchStarted - Class in org.apache.spark.streaming.scheduler
StreamingListenerBatchStarted(BatchInfo) - Constructor for class org.apache.spark.streaming.scheduler.StreamingListenerBatchStarted
StreamingListenerBatchSubmitted - Class in org.apache.spark.streaming.scheduler
StreamingListenerBatchSubmitted(BatchInfo) - Constructor for class org.apache.spark.streaming.scheduler.StreamingListenerBatchSubmitted
StreamingListenerEvent - Interface in org.apache.spark.streaming.scheduler: :: DeveloperApi :: Base trait for events related to StreamingListener
StreamingListenerReceiverError - Class in org.apache.spark.streaming.scheduler
StreamingListenerReceiverError(ReceiverInfo) - Constructor for class org.apache.spark.streaming.scheduler.StreamingListenerReceiverError
StreamingListenerReceiverStarted - Class in org.apache.spark.streaming.scheduler
StreamingListenerReceiverStarted(ReceiverInfo) - Constructor for class org.apache.spark.streaming.scheduler.StreamingListenerReceiverStarted
StreamingListenerReceiverStopped - Class in org.apache.spark.streaming.scheduler
StreamingListenerReceiverStopped(ReceiverInfo) - Constructor for class org.apache.spark.streaming.scheduler.StreamingListenerReceiverStopped
streamSideKeyGenerator() - Method in class org.apache.spark.sql.execution.HashJoin
stringToText(String) - Static method in class org.apache.spark.SparkContext
stringWritableConverter() - Static method in class org.apache.spark.SparkContext
submissionTime() - Method in class org.apache.spark.scheduler.StageInfo: When this stage was submitted from the DAGScheduler to a TaskScheduler.
submissionTime() - Method in class org.apache.spark.streaming.scheduler.BatchInfo
submitJob(RDD<T>, Function1<Iterator<T>, U>, Seq<Object>, Function2<Object, U, BoxedUnit>, Function0<R>) - Method in class org.apache.spark.SparkContext: :: Experimental :: Submit a job for execution and return a FutureJob holding the result.
subtract(JavaDoubleRDD) - Method in class org.apache.spark.api.java.JavaDoubleRDD: Return an RDD with the elements from this that are not in other.
subtract(JavaDoubleRDD, int) - Method in class org.apache.spark.api.java.JavaDoubleRDD: Return an RDD with the elements from this that are not in other.
subtract(JavaDoubleRDD, Partitioner) - Method in class org.apache.spark.api.java.JavaDoubleRDD: Return an RDD with the elements from this that are not in other.
subtract(JavaPairRDD<K, V>) - Method in class org.apache.spark.api.java.JavaPairRDD: Return an RDD with the elements from this that are not in other.
subtract(JavaPairRDD<K, V>, int) - Method in class org.apache.spark.api.java.JavaPairRDD: Return an RDD with the elements from this that are not in other.
subtract(JavaPairRDD<K, V>, Partitioner) - Method in class org.apache.spark.api.java.JavaPairRDD: Return an RDD with the elements from this that are not in other.
subtract(JavaRDD<T>) - Method in class org.apache.spark.api.java.JavaRDD: Return an RDD with the elements from this that are not in other.
subtract(JavaRDD<T>, int) - Method in class org.apache.spark.api.java.JavaRDD: Return an RDD with the elements from this that are not in other.
subtract(JavaRDD<T>, Partitioner) - Method in class org.apache.spark.api.java.JavaRDD: Return an RDD with the elements from this that are not in other.
subtract(RDD<T>) - Method in class org.apache.spark.rdd.RDD: Return an RDD with the elements from this that are not in other.
subtract(RDD<T>, int) - Method in class org.apache.spark.rdd.RDD: Return an RDD with the elements from this that are not in other.
subtract(RDD<T>, Partitioner, Ordering<T>) - Method in class org.apache.spark.rdd.RDD: Return an RDD with the elements from this that are not in other.
subtract(JavaSchemaRDD) - Method in class org.apache.spark.sql.api.java.JavaSchemaRDD: Return an RDD with the elements from this that are not in other.
subtract(JavaSchemaRDD, int) - Method in class org.apache.spark.sql.api.java.JavaSchemaRDD: Return an RDD with the elements from this that are not in other.
subtract(JavaSchemaRDD, Partitioner) - Method in class org.apache.spark.sql.api.java.JavaSchemaRDD: Return an RDD with the elements from this that are not in other.
subtract(RDD<Row>) - Method in class org.apache.spark.sql.SchemaRDD
subtract(RDD<Row>, int) - Method in class org.apache.spark.sql.SchemaRDD
subtract(RDD<Row>, Partitioner, Ordering<Row>) - Method in class org.apache.spark.sql.SchemaRDD
subtract(Vector) - Method in class org.apache.spark.util.Vector
subtractByKey(JavaPairRDD<K, W>) - Method in class org.apache.spark.api.java.JavaPairRDD: Return an RDD with the pairs from this whose keys are not in other.
subtractByKey(JavaPairRDD<K, W>, int) - Method in class org.apache.spark.api.java.JavaPairRDD: Return an RDD with the pairs from `this` whose keys are not in `other`.
subtractByKey(JavaPairRDD<K, W>, Partitioner) - Method in class org.apache.spark.api.java.JavaPairRDD: Return an RDD with the pairs from `this` whose keys are not in `other`.
subtractByKey(RDD<Tuple2<K, W>>, ClassTag<W>) - Method in class org.apache.spark.rdd.PairRDDFunctions: Return an RDD with the pairs from this whose keys are not in other.
subtractByKey(RDD<Tuple2<K, W>>, int, ClassTag<W>) - Method in class org.apache.spark.rdd.PairRDDFunctions: Return an RDD with the pairs from `this` whose keys are not in `other`.
subtractByKey(RDD<Tuple2<K, W>>, Partitioner, ClassTag<W>) - Method in class org.apache.spark.rdd.PairRDDFunctions: Return an RDD with the pairs from `this` whose keys are not in `other`.
succeededTasks() - Method in class org.apache.spark.ui.jobs.ExecutorSummary
Success - Class in org.apache.spark
Success() - Constructor for class org.apache.spark.Success
successful() - Method in class org.apache.spark.scheduler.TaskInfo
sum() - Method in class org.apache.spark.api.java.JavaDoubleRDD: Add up the elements in this RDD.
sum() - Method in class org.apache.spark.rdd.DoubleRDDFunctions: Add up the elements in this RDD.
sum() - Method in class org.apache.spark.util.StatCounter
sum() - Method in class org.apache.spark.util.Vector
sumApprox(long, Double) - Method in class org.apache.spark.api.java.JavaDoubleRDD: :: Experimental :: Approximate operation to return the sum within a timeout.
sumApprox(long) - Method in class org.apache.spark.api.java.JavaDoubleRDD: :: Experimental :: Approximate operation to return the sum within a timeout.
sumApprox(long, double) - Method in class org.apache.spark.rdd.DoubleRDDFunctions: :: Experimental :: Approximate operation to return the sum within a timeout.
SVMDataGenerator - Class in org.apache.spark.mllib.util: :: DeveloperApi :: Generate sample data used for SVM.
SVMDataGenerator() - Constructor for class org.apache.spark.mllib.util.SVMDataGenerator
SVMModel - Class in org.apache.spark.mllib.classification: Model for Support Vector Machines (SVMs).
SVMWithSGD - Class in org.apache.spark.mllib.classification: Train a Support Vector Machine (SVM) using Stochastic Gradient Descent.
SVMWithSGD() - Constructor for class org.apache.spark.mllib.classification.SVMWithSGD: Construct an SVM object with default parameters.
systemProperties() - Method in class org.apache.spark.ui.env.EnvironmentListener
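
The subtract and subtractByKey entries above describe set-difference style operations. As a rough illustration of the described semantics only, the sketch below uses plain Java collections in place of RDDs. The class name SubtractSketch and both helper methods are hypothetical and are not part of the Spark API. The retention of duplicates and of input order is a simplification of this sketch, not a statement about Spark's behavior.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Illustrative only: plain Java collections stand in for RDDs; no Spark API is used.
public class SubtractSketch {

    // Keep the elements of self that do not occur in other.
    public static <T> List<T> subtract(List<T> self, List<T> other) {
        Set<T> excluded = new HashSet<>(other);
        List<T> result = new ArrayList<>();
        for (T element : self) {
            if (!excluded.contains(element)) {
                result.add(element);
            }
        }
        return result;
    }

    // Keep the pairs of self whose key does not occur as a key in other.
    // The value types V and W may differ, as in the subtractByKey signatures.
    public static <K, V, W> List<Map.Entry<K, V>> subtractByKey(
            List<Map.Entry<K, V>> self, List<Map.Entry<K, W>> other) {
        Set<K> excludedKeys = new HashSet<>();
        for (Map.Entry<K, W> pair : other) {
            excludedKeys.add(pair.getKey());
        }
        List<Map.Entry<K, V>> result = new ArrayList<>();
        for (Map.Entry<K, V> pair : self) {
            if (!excludedKeys.contains(pair.getKey())) {
                result.add(pair);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // prints [1, 3]
        System.out.println(subtract(Arrays.asList(1, 2, 3, 4), Arrays.asList(2, 4)));

        List<Map.Entry<String, Integer>> left = new ArrayList<>();
        left.add(new SimpleEntry<>("a", 1));
        left.add(new SimpleEntry<>("b", 2));
        left.add(new SimpleEntry<>("c", 3));
        List<Map.Entry<String, String>> right = new ArrayList<>();
        right.add(new SimpleEntry<>("b", "x"));
        // prints [a=1, c=3]
        System.out.println(subtractByKey(left, right));
    }
}
```

In subtractByKey only the keys of the second collection matter, which is why its value type is left independent of the first.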

T

t() - Method in class org.apache.spark.SerializableWritable
table() - Method in class org.apache.spark.sql.hive.execution.InsertIntoHiveTable
table(String) - Method in class org.apache.spark.sql.SQLContext: Returns the specified table as a SchemaRDD
tachyonFolderName() - Method in class org.apache.spark.SparkContext
tachyonSize() - Method in class org.apache.spark.storage.BlockStatus
tachyonSize() - Method in class org.apache.spark.storage.RDDInfo
take(int) - Method in interface org.apache.spark.api.java.JavaRDDLike: Take the first num elements of the RDD.
take(int) - Method in class org.apache.spark.rdd.RDD: Take the first num elements of the RDD.
takeAsync(int) - Method in class org.apache.spark.rdd.AsyncRDDActions: Returns a future for retrieving the first num elements of the RDD.
takeOrdered(int, Comparator<T>) - Method in interface org.apache.spark.api.java.JavaRDDLike: Returns the first K elements from this RDD as defined by the specified Comparator[T] and maintains the order.
takeOrdered(int) - Method in interface org.apache.spark.api.java.JavaRDDLike: Returns the first K elements from this RDD using the natural ordering for T while maintaining the order.
takeOrdered(int, Ordering<T>) - Method in class org.apache.spark.rdd.RDD: Returns the first K (smallest) elements from this RDD as defined by the specified implicit Ordering[T] and maintains the ordering.
TakeOrdered - Class in org.apache.spark.sql.execution: :: DeveloperApi :: Take the first limit elements as defined by the sortOrder.
TakeOrdered(int, Seq<SortOrder>, SparkPlan, SparkContext) - Constructor for class org.apache.spark.sql.execution.TakeOrdered
takeSample(boolean, int) - Method in interface org.apache.spark.api.java.JavaRDDLike
takeSample(boolean, int, long) - Method in interface org.apache.spark.api.java.JavaRDDLike
takeSample(boolean, int, long) - Method in class org.apache.spark.rdd.RDD
TaskContext - Class in org.apache.spark: :: DeveloperApi :: Contextual information about a task which can be read or mutated during execution.
TaskContext(int, int, long, boolean, TaskMetrics) - Constructor for class org.apache.spark.TaskContext
TaskEndReason - Interface in org.apache.spark: :: DeveloperApi :: Various possible reasons why a task ended.
taskId() - Method in class org.apache.spark.scheduler.TaskInfo
taskId() - Method in class org.apache.spark.storage.TaskResultBlockId
taskInfo() - Method in class org.apache.spark.scheduler.SparkListenerTaskEnd
taskInfo() - Method in class org.apache.spark.scheduler.SparkListenerTaskGettingResult
taskInfo() - Method in class org.apache.spark.scheduler.SparkListenerTaskStart
TaskInfo - Class in org.apache.spark.scheduler: :: DeveloperApi :: Information about a running task attempt inside a TaskSet.
TaskInfo(long, int, long, String, String, Enumeration.Value) - Constructor for class org.apache.spark.scheduler.TaskInfo
taskInfo() - Method in class org.apache.spark.ui.jobs.TaskUIData
TaskKilled - Class in org.apache.spark
TaskKilled() - Constructor for class org.apache.spark.TaskKilled
TaskKilledException - Exception in org.apache.spark: :: DeveloperApi :: Exception thrown when a task is explicitly killed (i.e., task failure is expected).
TaskKilledException() - Constructor for exception org.apache.spark.TaskKilledException
taskLocality() - Method in class org.apache.spark.scheduler.TaskInfo
TaskLocality - Class in org.apache.spark.scheduler
TaskLocality() - Constructor for class org.apache.spark.scheduler.TaskLocality
taskMetrics() - Method in class org.apache.spark.scheduler.SparkListenerTaskEnd
taskMetrics() - Method in class org.apache.spark.TaskContext
taskMetrics() - Method in class org.apache.spark.ui.jobs.TaskUIData
TASKRESULT() - Static method in class org.apache.spark.storage.BlockId
TaskResultBlockId - Class in org.apache.spark.storage
TaskResultBlockId(long) - Constructor for class org.apache.spark.storage.TaskResultBlockId
TaskResultLost - Class in org.apache.spark: :: DeveloperApi :: The task finished successfully, but the result was lost from the executor's block manager before it was fetched.
TaskResultLost() - Constructor for class org.apache.spark.TaskResultLost
taskScheduler() - Method in class org.apache.spark.SparkContext
taskTime() - Method in class org.apache.spark.ui.jobs.ExecutorSummary
taskType() - Method in class org.apache.spark.scheduler.SparkListenerTaskEnd
TaskUIData - Class in org.apache.spark.ui.jobs
TaskUIData(TaskInfo, Option<TaskMetrics>, Option<ExceptionFailure>) - Constructor for class org.apache.spark.ui.jobs.TaskUIData
TEST() - Static method in class org.apache.spark.storage.BlockId
TestHive - Class in org.apache.spark.sql.hive.test
TestHive() - Constructor for class org.apache.spark.sql.hive.test.TestHive
TestHiveContext - Class in org.apache.spark.sql.hive.test: A locally running test instance of Spark's Hive execution engine.
TestHiveContext(SparkContext) - Constructor for class org.apache.spark.sql.hive.test.TestHiveContext
TestHiveContext.QueryExecution - Class in org.apache.spark.sql.hive.test: Override QueryExecution with special debug workflow.
TestHiveContext.QueryExecution() - Constructor for class org.apache.spark.sql.hive.test.TestHiveContext.QueryExecution
TestHiveContext.TestTable - Class in org.apache.spark.sql.hive.test
TestHiveContext.TestTable(String, Seq<Function0<BoxedUnit>>) - Constructor for class org.apache.spark.sql.hive.test.TestHiveContext.TestTable
TestSQLContext - Class in org.apache.spark.sql.test: A SQLContext that can be used for local testing.
TestSQLContext() - Constructor for class org.apache.spark.sql.test.TestSQLContext
testTables() - Method in class org.apache.spark.sql.hive.test.TestHiveContext: A list of test tables and the DDL required to initialize them.
textFile(String) - Method in class org.apache.spark.api.java.JavaSparkContext: Read a text file from HDFS, a local file system (available on all nodes), or any Hadoop-supported file system URI, and return it as an RDD of Strings.
textFile(String, int) - Method in class org.apache.spark.api.java.JavaSparkContext: Read a text file from HDFS, a local file system (available on all nodes), or any Hadoop-supported file system URI, and return it as an RDD of Strings.
textFile(String, int) - Method in class org.apache.spark.SparkContext: Read a text file from HDFS, a local file system (available on all nodes), or any Hadoop-supported file system URI, and return it as an RDD of Strings.
textFileStream(String) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext: Create an input stream that monitors a Hadoop-compatible filesystem for new files and reads them as text files (using key as LongWritable, value as Text and input format as TextInputFormat).
textFileStream(String) - Method in class org.apache.spark.streaming.StreamingContext: Create an input stream that monitors a Hadoop-compatible filesystem for new files and reads them as text files (using key as LongWritable, value as Text and input format as TextInputFormat).
theta() - Method in class org.apache.spark.mllib.classification.NaiveBayesModel
threshold() - Method in class org.apache.spark.mllib.tree.model.Split
thresholds() - Method in class org.apache.spark.mllib.evaluation.BinaryClassificationMetrics: Returns thresholds in descending order.
time() - Method in class org.apache.spark.scheduler.SparkListenerApplicationEnd
time() - Method in class org.apache.spark.scheduler.SparkListenerApplicationStart
Time - Class in org.apache.spark.streaming: This is a simple class that represents an absolute instant of time.
Time(long) - Constructor for class org.apache.spark.streaming.Time
to(Time, Duration) - Method in class org.apache.spark.streaming.Time
toArray() - Method in interface org.apache.spark.api.java.JavaRDDLike: Deprecated. As of Spark 1.0.0, toArray() is deprecated; use JavaRDDLike.collect() instead.
toArray() - Method in class org.apache.spark.mllib.linalg.DenseMatrix
toArray() - Method in class org.apache.spark.mllib.linalg.DenseVector
toArray() - Method in interface org.apache.spark.mllib.linalg.Matrix: Converts to a dense array in column-major order.
toArray() - Method in class org.apache.spark.mllib.linalg.SparseVector
toArray() - Method in interface org.apache.spark.mllib.linalg.Vector: Converts the instance to a double array.
toArray() - Method in class org.apache.spark.rdd.RDD: Return an array that contains all of the elements in this RDD.
toBreeze() - Method in interface org.apache.spark.mllib.linalg.distributed.DistributedMatrix: Collects data and assembles a local dense breeze matrix (for test only).
toBreeze() - Method in interface org.apache.spark.mllib.linalg.Matrix: Converts to a breeze matrix.
toBreeze() - Method in interface org.apache.spark.mllib.linalg.Vector: Converts the instance to a breeze vector.
toDataType(String) - Static method in class org.apache.spark.sql.hive.HiveMetastoreTypes
toDebugString() - Method in interface org.apache.spark.api.java.JavaRDDLike: A description of this RDD and its recursive dependencies for debugging.
toDebugString() - Method in class org.apache.spark.rdd.RDD: A description of this RDD and its recursive dependencies for debugging.
toDebugString() - Method in class org.apache.spark.SparkConf: Return a string listing all keys and values, one per line.
toFormattedString() - Method in class org.apache.spark.streaming.Duration
toIndexedRowMatrix() - Method in class org.apache.spark.mllib.linalg.distributed.CoordinateMatrix: Converts to IndexedRowMatrix.
toInt() - Method in class org.apache.spark.storage.StorageLevel
toJavaDStream() - Method in class org.apache.spark.streaming.api.java.JavaPairDStream: Convert to a JavaDStream
toJavaRDD() - Method in class org.apache.spark.rdd.RDD
toJavaSchemaRDD() - Method in class org.apache.spark.sql.SchemaRDD: Returns this RDD as a JavaSchemaRDD.
toLocalIterator() - Method in interface org.apache.spark.api.java.JavaRDDLike: Return an iterator that contains all of the elements in this RDD.
toLocalIterator() - Method in class org.apache.spark.rdd.RDD: Return an iterator that contains all of the elements in this RDD.
toMetastoreType(DataType) - Static method in class org.apache.spark.sql.hive.HiveMetastoreTypes
top(int, Comparator<T>) - Method in interface org.apache.spark.api.java.JavaRDDLike: Returns the top K elements from this RDD as defined by the specified Comparator[T].
top(int) - Method in interface org.apache.spark.api.java.JavaRDDLike: Returns the top K elements from this RDD using the natural ordering for T.
top(int, Ordering<T>) - Method in class org.apache.spark.rdd.RDD
toPairDStreamFunctions(DStream<Tuple2<K, V>>, ClassTag<K>, ClassTag<V>, Ordering<K>) - Static method in class org.apache.spark.streaming.StreamingContext
topNode() - Method in class org.apache.spark.mllib.tree.model.DecisionTreeModel
toRDD(JavaDoubleRDD) - Static method in class org.apache.spark.api.java.JavaDoubleRDD
toRDD(JavaPairRDD<K, V>) - Static method in class org.apache.spark.api.java.JavaPairRDD
toRDD(JavaRDD<T>) - Static method in class org.apache.spark.api.java.JavaRDD
toRowMatrix() - Method in class org.apache.spark.mllib.linalg.distributed.CoordinateMatrix: Converts to RowMatrix, dropping row indices after grouping by row index.
toRowMatrix() - Method in class org.apache.spark.mllib.linalg.distributed.IndexedRowMatrix: Drops row indices and converts this matrix to a RowMatrix.
TorrentBroadcastFactory - Class in org.apache.spark.broadcast: A Broadcast implementation that uses a BitTorrent-like protocol to do a distributed transfer of the broadcasted data to the executors.
TorrentBroadcastFactory() - Constructor for class org.apache.spark.broadcast.TorrentBroadcastFactory
toSchemaRDD() - Method in class org.apache.spark.sql.SchemaRDD: Returns this RDD as a SchemaRDD.
toSparkContext(JavaSparkContext) - Static method in class org.apache.spark.api.java.JavaSparkContext
toSplitInfo(Class<?>, String, InputSplit) - Static method in class org.apache.spark.scheduler.SplitInfo
toSplitInfo(Class<?>, String, InputSplit) - Static method in class org.apache.spark.scheduler.SplitInfo
toString() - Method in class org.apache.spark.Accumulable
toString() - Method in class org.apache.spark.api.java.JavaRDD
toString() - Method in class org.apache.spark.broadcast.Broadcast
toString() - Method in class org.apache.spark.mllib.linalg.DenseVector
toString() - Method in interface org.apache.spark.mllib.linalg.Matrix
toString() - Method in class org.apache.spark.mllib.linalg.SparseVector
toString() - Method in class org.apache.spark.mllib.regression.LabeledPoint
toString() - Method in class org.apache.spark.mllib.tree.model.InformationGainStats
toString() - Method in class org.apache.spark.mllib.tree.model.Node
toString() - Method in class org.apache.spark.mllib.tree.model.Split
toString() - Method in class org.apache.spark.partial.BoundedDouble
toString() - Method in class org.apache.spark.partial.PartialResult
toString() - Method in class org.apache.spark.rdd.RDD
toString() - Method in class org.apache.spark.scheduler.InputFormatInfo
toString() - Method in class org.apache.spark.scheduler.SplitInfo
toString() - Method in class org.apache.spark.SerializableWritable
toString() - Method in class org.apache.spark.sql.api.java.JavaSchemaRDD
toString() - Method in class org.apache.spark.storage.BlockId
toString() - Method in class org.apache.spark.storage.BlockManagerId
toString() - Method in class org.apache.spark.storage.RDDInfo
toString() - Method in class org.apache.spark.storage.StorageLevel
toString() - Method in class org.apache.spark.streaming.Duration
toString() - Method in class org.apache.spark.streaming.Time
toString() - Method in class org.apache.spark.util.MutablePair
toString() - Method in class org.apache.spark.util.StatCounter
toString() - Method in class org.apache.spark.util.Vector
totalDelay() - Method in class org.apache.spark.streaming.scheduler.BatchInfo: Time taken for all the jobs of this batch to finish processing from the time they were submitted.
totalShuffleRead() - Method in class org.apache.spark.ui.jobs.JobProgressListener
totalShuffleWrite() - Method in class org.apache.spark.ui.jobs.JobProgressListener
totalTime() - Method in class org.apache.spark.ui.jobs.JobProgressListener
train(RDD<LabeledPoint>, int, double, double, Vector) - Static method in class org.apache.spark.mllib.classification.LogisticRegressionWithSGD: Train a logistic regression model given an RDD of (label, features) pairs.
train(RDD<LabeledPoint>, int, double, double) - Static method in class org.apache.spark.mllib.classification.LogisticRegressionWithSGD: Train a logistic regression model given an RDD of (label, features) pairs.
train(RDD<LabeledPoint>, int, double) - Static method in class org.apache.spark.mllib.classification.LogisticRegressionWithSGD: Train a logistic regression model given an RDD of (label, features) pairs.
train(RDD<LabeledPoint>, int) - Static method in class org.apache.spark.mllib.classification.LogisticRegressionWithSGD: Train a logistic regression model given an RDD of (label, features) pairs.
train(RDD<LabeledPoint>) - Static method in class org.apache.spark.mllib.classification.NaiveBayes: Trains a Naive Bayes model given an RDD of (label, features) pairs.
train(RDD<LabeledPoint>, double) - Static method in class org.apache.spark.mllib.classification.NaiveBayes: Trains a Naive Bayes model given an RDD of (label, features) pairs.
train(RDD<LabeledPoint>, int, double, double, double, Vector) - Static method in class org.apache.spark.mllib.classification.SVMWithSGD: Train an SVM model given an RDD of (label, features) pairs.
train(RDD<LabeledPoint>, int, double, double, double) - Static method in class org.apache.spark.mllib.classification.SVMWithSGD: Train an SVM model given an RDD of (label, features) pairs.
train(RDD<LabeledPoint>, int, double, double) - Static method in class org.apache.spark.mllib.classification.SVMWithSGD: Train an SVM model given an RDD of (label, features) pairs.
train(RDD<LabeledPoint>, int) - Static method in class org.apache.spark.mllib.classification.SVMWithSGD: Train an SVM model given an RDD of (label, features) pairs.
train(RDD<Vector>, int, int, int, String) - Static method in class org.apache.spark.mllib.clustering.KMeans: Trains a k-means model using the given set of parameters.
train(RDD<Vector>, int, int) - Static method in class org.apache.spark.mllib.clustering.KMeans: Trains a k-means model using specified parameters and the default values for unspecified.
train(RDD<Vector>, int, int, int) - Static method in class org.apache.spark.mllib.clustering.KMeans: Trains a k-means model using specified parameters and the default values for unspecified.
train(RDD<Rating>, int, int, double, int, long) - Static method in class org.apache.spark.mllib.recommendation.ALS: Train a matrix factorization model given an RDD of ratings given by users to some products, in the form of (userID, productID, rating) pairs.
train(RDD<Rating>, int, int, double, int) - Static method in class org.apache.spark.mllib.recommendation.ALS: Train a matrix factorization model given an RDD of ratings given by users to some products, in the form of (userID, productID, rating) pairs.
train(RDD<Rating>, int, int, double) - Static method in class org.apache.spark.mllib.recommendation.ALS: Train a matrix factorization model given an RDD of ratings given by users to some products, in the form of (userID, productID, rating) pairs.
train(RDD<Rating>, int, int) - Static method in class org.apache.spark.mllib.recommendation.ALS: Train a matrix factorization model given an RDD of ratings given by users to some products, in the form of (userID, productID, rating) pairs.
train(RDD<LabeledPoint>, int, double, double, double, Vector) - Static method in class org.apache.spark.mllib.regression.LassoWithSGD: Train a Lasso model given an RDD of (label, features) pairs.
train(RDD<LabeledPoint>, int, double, double, double) - Static method in class org.apache.spark.mllib.regression.LassoWithSGD: Train a Lasso model given an RDD of (label, features) pairs.
train(RDD<LabeledPoint>, int, double, double) - Static method in class org.apache.spark.mllib.regression.LassoWithSGD: Train a Lasso model given an RDD of (label, features) pairs.
train(RDD<LabeledPoint>, int) - Static method in class org.apache.spark.mllib.regression.LassoWithSGD: Train a Lasso model given an RDD of (label, features) pairs.
train(RDD<LabeledPoint>, int, double, double, Vector) - Static method in class org.apache.spark.mllib.regression.LinearRegressionWithSGD: Train a LinearRegression model given an RDD of (label, features) pairs.
train(RDD<LabeledPoint>, int, double, double) - Static method in class org.apache.spark.mllib.regression.LinearRegressionWithSGD: Train a LinearRegression model given an RDD of (label, features) pairs.
train(RDD<LabeledPoint>, int, double) - Static method in class org.apache.spark.mllib.regression.LinearRegressionWithSGD: Train a LinearRegression model given an RDD of (label, features) pairs.
train(RDD<LabeledPoint>, int) - Static method in class org.apache.spark.mllib.regression.LinearRegressionWithSGD: Train a LinearRegression model given an RDD of (label, features) pairs.
train(RDD<LabeledPoint>, int, double, double, double, Vector) - Static method in class org.apache.spark.mllib.regression.RidgeRegressionWithSGD: Train a RidgeRegression model given an RDD of (label, features) pairs.
train(RDD<LabeledPoint>, int, double, double, double) - Static method in class org.apache.spark.mllib.regression.RidgeRegressionWithSGD: Train a RidgeRegression model given an RDD of (label, features) pairs.
train(RDD<LabeledPoint>, int, double, double) - Static method in class org.apache.spark.mllib.regression.RidgeRegressionWithSGD: Train a RidgeRegression model given an RDD of (label, features) pairs.
train(RDD<LabeledPoint>, int) - Static method in class org.apache.spark.mllib.regression.RidgeRegressionWithSGD: Train a RidgeRegression model given an RDD of (label, features) pairs.
train(RDD<LabeledPoint>) - Method in class org.apache.spark.mllib.tree.DecisionTree: Method to train a decision tree model over an RDD
trainImplicit(RDD<Rating>, int, int, double, int, double, long) - Static method in class org.apache.spark.mllib.recommendation.ALS: Train a matrix factorization model given an RDD of 'implicit preferences' given by users to some products, in the form of (userID, productID, preference) pairs.
trainImplicit(RDD<Rating>, int, int, double, int, double) - Static method in class org.apache.spark.mllib.recommendation.ALS: Train a matrix factorization model given an RDD of 'implicit preferences' given by users to some products, in the form of (userID, productID, preference) pairs.
trainImplicit(RDD<Rating>, int, int, double, double) - Static method in class org.apache.spark.mllib.recommendation.ALS: Train a matrix factorization model given an RDD of 'implicit preferences' given by users to some products, in the form of (userID, productID, preference) pairs.
trainImplicit(RDD<Rating>, int, int) - Static method in class org.apache.spark.mllib.recommendation.ALS: Train a matrix factorization model given an RDD of 'implicit preferences' given by users to some products, in the form of (userID, productID, preference) pairs.
transform(Function<R, JavaRDD>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike: Return a new DStream in which each RDD is generated by applying a function on each RDD of 'this' DStream.
transform(Function2<R, Time, JavaRDD>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike: Return a new DStream in which each RDD is generated by applying a function on each RDD of 'this' DStream.
transform(List<JavaDStream<?>>, Function2<List<JavaRDD<?>>, Time, JavaRDD<T>>) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext: Create a new DStream in which each RDD is generated by applying a function on RDDs of the DStreams.
transform(Function1<RDD<T>, RDD>, ClassTag) - Method in class org.apache.spark.streaming.dstream.DStream: Return a new DStream in which each RDD is generated by applying a function on each RDD of 'this' DStream.
transform(Function2<RDD<T>, Time, RDD>, ClassTag) - Method in class org.apache.spark.streaming.dstream.DStream: Return a new DStream in which each RDD is generated by applying a function on each RDD of 'this' DStream.
transform(Seq<DStream<?>>, Function2<Seq<RDD<?>>, Time, RDD<T>>, ClassTag<T>) - Method in class org.apache.spark.streaming.StreamingContext: Create a new DStream in which each RDD is generated by applying a function on RDDs of the DStreams.
transformToPair(Function<R, JavaPairRDD<K2, V2>>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike: Return a new DStream in which each RDD is generated by applying a function on each RDD of 'this' DStream.
transformToPair(Function2<R, Time, JavaPairRDD<K2, V2>>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike: Return a new DStream in which each RDD is generated by applying a function on each RDD of 'this' DStream.
transformToPair(List<JavaDStream<?>>, Function2<List<JavaRDD<?>>, Time, JavaPairRDD<K, V>>) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext: Create a new DStream in which each RDD is generated by applying a function on RDDs of the DStreams.
transformWith(JavaDStream, Function3<R, JavaRDD, Time, JavaRDD<W>>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike: Return a new DStream in which each RDD is generated by applying a function on each RDD of 'this' DStream and 'other' DStream.
transformWith(JavaPairDStream<K2, V2>, Function3<R, JavaPairRDD<K2, V2>, Time, JavaRDD<W>>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike: Return a new DStream in which each RDD is generated by applying a function on each RDD of 'this' DStream and 'other' DStream.
transformWith(DStream, Function2<RDD<T>, RDD, RDD<V>>, ClassTag, ClassTag<V>) - Method in class org.apache.spark.streaming.dstream.DStream: Return a new DStream in which each RDD is generated by applying a function on each RDD of 'this' DStream and 'other' DStream.
transformWith(DStream, Function3<RDD<T>, RDD, Time, RDD<V>>, ClassTag, ClassTag<V>) - Method in class org.apache.spark.streaming.dstream.DStream: Return a new DStream in which each RDD is generated by applying a function on each RDD of 'this' DStream and 'other' DStream.
transformWithToPair(JavaDStream, Function3<R, JavaRDD, Time, JavaPairRDD<K2, V2>>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike: Return a new DStream in which each RDD is generated by applying a function on each RDD of 'this' DStream and 'other' DStream.
transformWithToPair(JavaPairDStream<K2, V2>, Function3<R, JavaPairRDD<K2, V2>, Time, JavaPairRDD<K3, V3>>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike: Return a new DStream in which each RDD is generated by applying a function on each RDD of 'this' DStream and 'other' DStream.
TwitterUtils - Class in org.apache.spark.streaming.twitter
TwitterUtils() - Constructor for class org.apache.spark.streaming.twitter.TwitterUtils
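
The takeOrdered and top entries above differ only in which end of the ordering they return: takeOrdered yields the first K (smallest) elements, and top yields the top K. A minimal sketch of that relationship follows, assuming plain Java lists in place of RDDs. The class name TopKSketch and its methods are hypothetical and do not belong to Spark, and sorting the whole list is a simplification chosen for clarity rather than a description of how Spark computes the result.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

// Illustrative only: plain Java lists stand in for RDDs; no Spark API is used.
public class TopKSketch {

    // First k elements under the given ordering, returned in that ordering.
    public static <T> List<T> takeOrdered(List<T> elements, int k, Comparator<T> order) {
        List<T> sorted = new ArrayList<>(elements);
        Collections.sort(sorted, order);
        // Clamp k so that negative or oversized values do not throw.
        int limit = Math.max(0, Math.min(k, sorted.size()));
        return new ArrayList<>(sorted.subList(0, limit));
    }

    // Top k elements: the same operation under the reversed ordering.
    public static <T> List<T> top(List<T> elements, int k, Comparator<T> order) {
        return takeOrdered(elements, k, Collections.reverseOrder(order));
    }

    public static void main(String[] args) {
        List<Integer> data = Arrays.asList(5, 2, 6, 3, 4);
        Comparator<Integer> natural = Comparator.<Integer>naturalOrder();
        // prints [2, 3]
        System.out.println(takeOrdered(data, 2, natural));
        // prints [6, 5]
        System.out.println(top(data, 2, natural));
    }
}
```

Expressing top as takeOrdered under a reversed comparator keeps a single code path for both operations.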

U

U() - Method in class org.apache.spark.mllib.linalg.SingularValueDecomposition
ui() - Method in class org.apache.spark.SparkContext
uiTab() - Method in class org.apache.spark.streaming.StreamingContext
unbound() - Method in class org.apache.spark.sql.execution.Aggregate.ComputedAggregate
unbroadcast(long, boolean, boolean) - Method in interface org.apache.spark.broadcast.BroadcastFactory
unbroadcast(long, boolean, boolean) - Method in class org.apache.spark.broadcast.HttpBroadcastFactory: Remove all persisted state associated with the HTTP broadcast with the given ID.
unbroadcast(long, boolean, boolean) - Method in class org.apache.spark.broadcast.TorrentBroadcastFactory: Remove all persisted state associated with the torrent broadcast with the given ID.
uncacheTable(String) - Method in class org.apache.spark.sql.SQLContext: Removes the specified table from the in-memory cache.
underlyingSplit() - Method in class org.apache.spark.scheduler.SplitInfo
union(JavaDoubleRDD) - Method in class org.apache.spark.api.java.JavaDoubleRDD: Return the union of this RDD and another one.
union(JavaPairRDD<K, V>) - Method in class org.apache.spark.api.java.JavaPairRDD: Return the union of this RDD and another one.
union(JavaRDD<T>) - Method in class org.apache.spark.api.java.JavaRDD: Return the union of this RDD and another one.
union(JavaRDD<T>, List<JavaRDD<T>>) - Method in class org.apache.spark.api.java.JavaSparkContext: Build the union of two or more RDDs.
union(JavaPairRDD<K, V>, List<JavaPairRDD<K, V>>) - Method in class org.apache.spark.api.java.JavaSparkContext: Build the union of two or more RDDs.
union(JavaDoubleRDD, List<JavaDoubleRDD>) - Method in class org.apache.spark.api.java.JavaSparkContext: Build the union of two or more RDDs.
union(RDD<T>) - Method in class org.apache.spark.rdd.RDD: Return the union of this RDD and another one.
union(Seq<RDD<T>>, ClassTag<T>) - Method in class org.apache.spark.SparkContext: Build the union of a list of RDDs.
union(RDD<T>, Seq<RDD<T>>, ClassTag<T>) - Method in class org.apache.spark.SparkContext: Build the union of a list of RDDs passed as variable-length arguments.
Union - Class in org.apache.spark.sql.execution: :: DeveloperApi ::
Union(Seq<SparkPlan>, SparkContext) - Constructor for class org.apache.spark.sql.execution.Union
union(JavaDStream<T>) - Method in class org.apache.spark.streaming.api.java.JavaDStream: Return a new DStream by unifying data of another DStream with this DStream.
union(JavaPairDStream<K, V>) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream: Return a new DStream by unifying data of another DStream with this DStream.
union(JavaDStream<T>, List<JavaDStream<T>>) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext: Create a unified DStream from multiple DStreams of the same type and same slide duration.
union(JavaPairDStream<K, V>, List<JavaPairDStream<K, V>>) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext: Create a unified DStream from multiple DStreams of the same type and same slide duration.
union(DStream<T>) - Method in class org.apache.spark.streaming.dstream.DStream: Return a new DStream by unifying data of another DStream with this DStream.
union(Seq<DStream<T>>, ClassTag<T>) - Method in class org.apache.spark.streaming.StreamingContext: Create a unified DStream from multiple DStreams of the same type and same slide duration.
unionAll(SchemaRDD) - Method in class org.apache.spark.sql.SchemaRDD: Combines the tuples of two RDDs with the same schema, keeping duplicates.
UnionRDD<T> - Class in org.apache.spark.rdd
UnionRDD(SparkContext, Seq<RDD<T>>, ClassTag<T>) - Constructor for class org.apache.spark.rdd.UnionRDD
uniqueId() - Method in class org.apache.spark.storage.StreamBlockId
UnknownReason - Class in org.apache.spark: :: DeveloperApi :: We don't know why the task ended -- for example, because of a ClassNotFound exception when deserializing the task result.
UnknownReason() - Constructor for class org.apache.spark.UnknownReason
unpersist() - Method in class org.apache.spark.api.java.JavaDoubleRDD: Mark the RDD as non-persistent, and remove all blocks for it from memory and disk.
unpersist(boolean) - Method in class org.apache.spark.api.java.JavaDoubleRDD: Mark the RDD as non-persistent, and remove all blocks for it from memory and disk.
unpersist() - Method in class org.apache.spark.api.java.JavaPairRDD: Mark the RDD as non-persistent, and remove all blocks for it from memory and disk.
unpersist(boolean) - Method in class org.apache.spark.api.java.JavaPairRDD: Mark the RDD as non-persistent, and remove all blocks for it from memory and disk.
unpersist() - Method in class org.apache.spark.api.java.JavaRDD: Mark the RDD as non-persistent, and remove all blocks for it from memory and disk.
unpersist(boolean) - Method in class org.apache.spark.api.java.JavaRDD: Mark the RDD as non-persistent, and remove all blocks for it from memory and disk.
unpersist() - Method in class org.apache.spark.broadcast.Broadcast: Asynchronously delete cached copies of this broadcast on the executors.
unpersist(boolean) - Method in class org.apache.spark.broadcast.Broadcast: Delete cached copies of this broadcast on the executors.
unpersist() - Method in class org.apache.spark.mllib.evaluation.BinaryClassificationMetrics: Unpersist intermediate RDDs used in the computation.
unpersist(boolean) - Method in class org.apache.spark.rdd.RDD: Mark the RDD as non-persistent, and remove all blocks for it from memory and disk.
unpersist(boolean) - Method in class org.apache.spark.sql.api.java.JavaSchemaRDD: Mark the RDD as non-persistent, and remove all blocks for it from memory and disk.
until(Time, Duration) - Method in class org.apache.spark.streaming.Time
update(T1, T2) - Method in class org.apache.spark.util.MutablePair: Updates this pair with new values and returns itself.
Updater - Class in org.apache.spark.mllib.optimization: :: DeveloperApi :: Class used to perform steps (weight update) using Gradient Descent methods.
Updater() - Constructor for class org.apache.spark.mllib.optimization.Updater
updateStateByKey(Function2<List<V>, Optional<S>, Optional<S>>) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream: Return a new "state" DStream where the state for each key is updated by applying the given function on the previous state of the key and the new values of each key.
updateStateByKey(Function2<List<V>, Optional<S>, Optional<S>>, int) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream: Return a new "state" DStream where the state for each key is updated by applying the given function on the previous state of the key and the new values of each key.
updateStateByKey(Function2<List<V>, Optional<S>, Optional<S>>, Partitioner) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream: Return a new "state" DStream where the state for each key is updated by applying the given function on the previous state of the key and the new values of the key.
updateStateByKey(Function2<Seq<V>, Option<S>, Option<S>>, ClassTag<S>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions: Return a new "state" DStream where the state for each key is updated by applying the given function on the previous state of the key and the new values of each key.
updateStateByKey(Function2<Seq<V>, Option<S>, Option<S>>, int, ClassTag<S>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions: Return a new "state" DStream where the state for each key is updated by applying the given function on the previous state of the key and the new values of each key.
updateStateByKey(Function2<Seq<V>, Option<S>, Option<S>>, Partitioner, ClassTag<S>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions: Return a new "state" DStream where the state for each key is updated by applying the given function on the previous state of the key and the new values of the key.
updateStateByKey(Function1<Iterator<Tuple3<K, Seq<V>, Option<S>>>, Iterator<Tuple2<K, S>>>, Partitioner, boolean, ClassTag<S>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions: Return a new "state" DStream where the state for each key is updated by applying the given function on the previous state of the key and the new values of each key.
updateStorageStatus(String, Seq<Tuple2<BlockId, BlockStatus>>) - Method in class org.apache.spark.storage.StorageStatusListener: Update storage status list to reflect updated block statuses.
updateStorageStatus(int) - Method in class org.apache.spark.storage.StorageStatusListener: Update storage status list to reflect the removal of an RDD from the cache.
useDisk() - Method in class org.apache.spark.storage.StorageLevel
useMemory() - Method in class org.apache.spark.storage.StorageLevel
useOffHeap() - Method in class org.apache.spark.storage.StorageLevel
user() - Method in class org.apache.spark.mllib.recommendation.Rating
user() - Method in class org.apache.spark.scheduler.JobLogger
userFeatures() - Method in class org.apache.spark.mllib.recommendation.MatrixFactorizationModel
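
As a rough illustration of the updateStateByKey entries above, the following plain-Java sketch (no Spark dependency) mimics how a per-key state could be carried from one batch to the next. The class name UpdateStateSketch, the helper applyBatch, the example function runningCount, and the use of java.util.Optional and BiFunction are all assumptions made for illustration; the indexed methods take Spark's own Function2 and may use a different Optional class. The sketch also assumes that the update function is invoked for keys that have prior state but no new values, and that an absent result drops the key.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.Set;
import java.util.function.BiFunction;

public class UpdateStateSketch {

    /**
     * Hypothetical stand-in for one updateStateByKey step: combines the
     * previous per-key state with the new values that arrived in one batch.
     */
    static <K, V, S> Map<K, S> applyBatch(
            Map<K, S> previousState,
            Map<K, List<V>> batch,
            BiFunction<List<V>, Optional<S>, Optional<S>> updateFunc) {
        // Visit every key that has either prior state or new values.
        Set<K> keys = new HashSet<>(previousState.keySet());
        keys.addAll(batch.keySet());

        Map<K, S> nextState = new HashMap<>();
        for (K key : keys) {
            List<V> newValues = batch.getOrDefault(key, Collections.<V>emptyList());
            Optional<S> oldState = Optional.ofNullable(previousState.get(key));
            // An empty result means the key is not carried forward (assumption).
            updateFunc.apply(newValues, oldState).ifPresent(s -> nextState.put(key, s));
        }
        return nextState;
    }

    /** Example update function: previous count plus the number of new values. */
    static Optional<Integer> runningCount(List<String> newValues, Optional<Integer> oldState) {
        return Optional.of(oldState.orElse(0) + newValues.size());
    }

    public static void main(String[] args) {
        Map<String, Integer> state = new HashMap<>();

        Map<String, List<String>> batch1 = new HashMap<>();
        batch1.put("a", Arrays.asList("x", "y"));
        batch1.put("b", Arrays.asList("z"));
        state = applyBatch(state, batch1, UpdateStateSketch::runningCount);

        Map<String, List<String>> batch2 = new HashMap<>();
        batch2.put("a", Arrays.asList("w"));
        state = applyBatch(state, batch2, UpdateStateSketch::runningCount);

        // Key "a" saw 2 + 1 values; key "b" saw 1 value and keeps its state.
        System.out.println(state.get("a") + " " + state.get("b")); // prints "3 1"
    }
}
```

The overloads listed above that take an int or a Partitioner differ only in how the resulting state DStream is partitioned, which this single-process sketch does not model.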

V

V() - Method in class org.apache.spark.mllib.linalg.SingularValueDecomposition
value() - Method in class org.apache.spark.Accumulable: Access the accumulator's current value; only allowed on master.
value() - Method in class org.apache.spark.broadcast.Broadcast: Get the broadcasted value.
value() - Method in class org.apache.spark.ComplexFutureAction
value() - Method in interface org.apache.spark.FutureAction: The value of this Future.
value() - Method in class org.apache.spark.mllib.linalg.distributed.MatrixEntry
value() - Method in class org.apache.spark.SerializableWritable
value() - Method in class org.apache.spark.SimpleFutureAction
values() - Method in class org.apache.spark.api.java.JavaPairRDD: Return an RDD with the values of each tuple.
values() - Method in class org.apache.spark.mllib.linalg.DenseMatrix
values() - Method in class org.apache.spark.mllib.linalg.DenseVector
values() - Method in class org.apache.spark.mllib.linalg.SparseVector
values() - Method in class org.apache.spark.rdd.PairRDDFunctions: Return an RDD with the values of each tuple.
variance() - Method in class org.apache.spark.api.java.JavaDoubleRDD: Compute the variance of this RDD's elements.
variance() - Method in interface org.apache.spark.mllib.stat.MultivariateStatisticalSummary: Sample variance vector.
Variance - Class in org.apache.spark.mllib.tree.impurity: :: Experimental :: Class for calculating variance during regression.
Variance() - Constructor for class org.apache.spark.mllib.tree.impurity.Variance
variance() - Method in class org.apache.spark.rdd.DoubleRDDFunctions: Compute the variance of this RDD's elements.
variance() - Method in class org.apache.spark.util.StatCounter: Return the variance of the values.
vClassTag() - Method in class org.apache.spark.api.java.JavaPairRDD
vClassTag() - Method in class org.apache.spark.streaming.api.java.JavaPairInputDStream
vClassTag() - Method in class org.apache.spark.streaming.api.java.JavaPairReceiverInputDStream
vector() - Method in class org.apache.spark.mllib.linalg.distributed.IndexedRow
Vector - Interface in org.apache.spark.mllib.linalg: Represents a numeric vector, whose index type is Int and value type is Double.
Vector - Class in org.apache.spark.util
Vector(double[]) - Constructor for class org.apache.spark.util.Vector
Vector.Multiplier - Class in org.apache.spark.util
Vector.Multiplier(double) - Constructor for class org.apache.spark.util.Vector.Multiplier
Vector.VectorAccumParam$ - Class in org.apache.spark.util
Vector.VectorAccumParam$() - Constructor for class org.apache.spark.util.Vector.VectorAccumParam$
Vectors - Class in org.apache.spark.mllib.linalg
Vectors() - Constructor for class org.apache.spark.mllib.linalg.Vectors
version() - Method in class org.apache.spark.SparkContext: The version of Spark on which this application is running.
vManifest() - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
VoidFunction<T> - Interface in org.apache.spark.api.java.function: A function with no return value.

W

waiter() - Method in class org.apache.spark.streaming.StreamingContext
warehousePath() - Method in class org.apache.spark.sql.hive.LocalHiveContext
warehousePath() - Method in class org.apache.spark.sql.hive.test.TestHiveContext
weights() - Method in class org.apache.spark.mllib.classification.LogisticRegressionModel
weights() - Method in class org.apache.spark.mllib.classification.SVMModel
weights() - Method in class org.apache.spark.mllib.regression.GeneralizedLinearModel
weights() - Method in class org.apache.spark.mllib.regression.LassoModel
weights() - Method in class org.apache.spark.mllib.regression.LinearRegressionModel
weights() - Method in class org.apache.spark.mllib.regression.RidgeRegressionModel
where(Expression) - Method in class org.apache.spark.sql.SchemaRDD: Filters the output, only returning those rows where condition evaluates to true.
where(Symbol, Function1<T1, Object>) - Method in class org.apache.spark.sql.SchemaRDD: Filters tuples using a function over the value of the specified column.
where(Function1<DynamicRow, Object>) - Method in class org.apache.spark.sql.SchemaRDD: :: Experimental :: Filters tuples using a function over a Dynamic version of a given Row.
wholeTextFiles(String, int) - Method in class org.apache.spark.api.java.JavaSparkContext: Read a directory of text files from HDFS, a local file system (available on all nodes), or any Hadoop-supported file system URI.
wholeTextFiles(String) - Method in class org.apache.spark.api.java.JavaSparkContext: Read a directory of text files from HDFS, a local file system (available on all nodes), or any Hadoop-supported file system URI.
wholeTextFiles(String, int) - Method in class org.apache.spark.SparkContext: Read a directory of text files from HDFS, a local file system (available on all nodes), or any Hadoop-supported file system URI.
window(Duration) - Method in class org.apache.spark.streaming.api.java.JavaDStream: Return a new DStream in which each RDD contains all the elements seen in a sliding window of time over this DStream.
window(Duration, Duration) - Method in class org.apache.spark.streaming.api.java.JavaDStream: Return a new DStream in which each RDD contains all the elements seen in a sliding window of time over this DStream.
window(Duration) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream: Return a new DStream which is computed based on windowed batches of this DStream.
window(Duration, Duration) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream: Return a new DStream which is computed based on windowed batches of this DStream.
window(Duration) - Method in class org.apache.spark.streaming.dstream.DStream: Return a new DStream in which each RDD contains all the elements seen in a sliding window of time over this DStream.
window(Duration, Duration) - Method in class org.apache.spark.streaming.dstream.DStream: Return a new DStream in which each RDD contains all the elements seen in a sliding window of time over this DStream.
withReplacement() - Method in class org.apache.spark.sql.execution.Sample
wrapRDD(RDD<Double>) - Method in class org.apache.spark.api.java.JavaDoubleRDD
wrapRDD(RDD<Tuple2<K, V>>) - Method in class org.apache.spark.api.java.JavaPairRDD
wrapRDD(RDD<T>) - Method in class org.apache.spark.api.java.JavaRDD
wrapRDD(RDD<T>) - Method in interface org.apache.spark.api.java.JavaRDDLike
wrapRDD(RDD<Row>) - Method in class org.apache.spark.sql.api.java.JavaSchemaRDD
wrapRDD(RDD<T>) - Method in class org.apache.spark.streaming.api.java.JavaDStream
wrapRDD(RDD<T>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
wrapRDD(RDD<Tuple2<K, V>>) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
writableWritableConverter() - Static method in class org.apache.spark.SparkContext
writeAll(Iterator<T>, ClassTag<T>) - Method in interface org.apache.spark.serializer.SerializationStream
writeExternal(ObjectOutput) - Method in class org.apache.spark.serializer.JavaSerializer
writeExternal(ObjectOutput) - Method in class org.apache.spark.storage.BlockManagerId
writeExternal(ObjectOutput) - Method in class org.apache.spark.storage.StorageLevel
writeExternal(ObjectOutput) - Method in class org.apache.spark.streaming.flume.SparkFlumeEvent
writeObject(T, ClassTag<T>) - Method in interface org.apache.spark.serializer.SerializationStream

Z

zero() - Method in class org.apache.spark.Accumulable
zero(R) - Method in interface org.apache.spark.AccumulableParam: Return the "zero" (identity) value for an accumulator type, given its initial value.
zero(double) - Method in class org.apache.spark.SparkContext.DoubleAccumulatorParam$
zero(float) - Method in class org.apache.spark.SparkContext.FloatAccumulatorParam$
zero(int) - Method in class org.apache.spark.SparkContext.IntAccumulatorParam$
zero(long) - Method in class org.apache.spark.SparkContext.LongAccumulatorParam$
zero(Vector) - Method in class org.apache.spark.util.Vector.VectorAccumParam$
ZeroMQUtils - Class in org.apache.spark.streaming.zeromq
ZeroMQUtils() - Constructor for class org.apache.spark.streaming.zeromq.ZeroMQUtils
zeros(int) - Static method in class org.apache.spark.util.Vector
zeroTime() - Method in class org.apache.spark.streaming.dstream.DStream
zip(JavaRDDLike<U, ?>) - Method in interface org.apache.spark.api.java.JavaRDDLike: Zips this RDD with another one, returning key-value pairs with the first element in each RDD, second element in each RDD, etc.
zip(RDD, ClassTag) - Method in class org.apache.spark.rdd.RDD: Zips this RDD with another one, returning key-value pairs with the first element in each RDD, second element in each RDD, etc.
zipPartitions(JavaRDDLike<U, ?>, FlatMapFunction2<Iterator<T>, Iterator, V>) - Method in interface org.apache.spark.api.java.JavaRDDLike: Zip this RDD's partitions with one (or more) RDD(s) and return a new RDD by applying a function to the zipped partitions.
zipPartitions(RDD, boolean, Function2<Iterator<T>, Iterator, Iterator<V>>, ClassTag, ClassTag<V>) - Method in class org.apache.spark.rdd.RDD: Zip this RDD's partitions with one (or more) RDD(s) and return a new RDD by applying a function to the zipped partitions.
zipPartitions(RDD, Function2<Iterator<T>, Iterator, Iterator<V>>, ClassTag, ClassTag<V>) - Method in class org.apache.spark.rdd.RDD
zipPartitions(RDD, RDD<C>, boolean, Function3<Iterator<T>, Iterator, Iterator<C>, Iterator<V>>, ClassTag, ClassTag<C>, ClassTag<V>) - Method in class org.apache.spark.rdd.RDD
zipPartitions(RDD, RDD<C>, Function3<Iterator<T>, Iterator, Iterator<C>, Iterator<V>>, ClassTag, ClassTag<C>, ClassTag<V>) - Method in class org.apache.spark.rdd.RDD
zipPartitions(RDD, RDD<C>, RDD<D>, boolean, Function4<Iterator<T>, Iterator, Iterator<C>, Iterator<D>, Iterator<V>>, ClassTag, ClassTag<C>, ClassTag<D>, ClassTag<V>) - Method in class org.apache.spark.rdd.RDD
zipPartitions(RDD, RDD<C>, RDD<D>, Function4<Iterator<T>, Iterator, Iterator<C>, Iterator<D>, Iterator<V>>, ClassTag, ClassTag<C>, ClassTag<D>, ClassTag<V>) - Method in class org.apache.spark.rdd.RDD
zipWithIndex() - Method in interface org.apache.spark.api.java.JavaRDDLike: Zips this RDD with its element indices.
zipWithIndex() - Method in class org.apache.spark.rdd.RDD: Zips this RDD with its element indices.
zipWithUniqueId() - Method in interface org.apache.spark.api.java.JavaRDDLike: Zips this RDD with generated unique Long ids.
zipWithUniqueId() - Method in class org.apache.spark.rdd.RDD: Zips this RDD with generated unique Long ids.
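
The one-line summaries for zipWithIndex and zipWithUniqueId above do not show how the two numbering schemes differ. The plain-Java sketch below (no Spark dependency) models an RDD as a list of partitions and computes the ids each element would receive. It assumes that zipWithIndex numbers elements consecutively in partition order, and that zipWithUniqueId gives the items of the k-th of n partitions the ids k, n+k, 2n+k, and so on. The class ZipIdSketch and both method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ZipIdSketch {

    /** Consecutive indices: ordered by partition, then by position within it. */
    static List<List<Long>> zipWithIndexIds(List<List<String>> partitions) {
        List<List<Long>> ids = new ArrayList<>();
        long next = 0;
        for (List<String> partition : partitions) {
            List<Long> row = new ArrayList<>();
            for (int i = 0; i < partition.size(); i++) {
                row.add(next++);
            }
            ids.add(row);
        }
        return ids;
    }

    /** Unique but non-consecutive ids: item i of partition k gets i * n + k. */
    static List<List<Long>> zipWithUniqueIdIds(List<List<String>> partitions) {
        int n = partitions.size();
        List<List<Long>> ids = new ArrayList<>();
        for (int k = 0; k < n; k++) {
            List<Long> row = new ArrayList<>();
            for (int i = 0; i < partitions.get(k).size(); i++) {
                row.add((long) i * n + k);
            }
            ids.add(row);
        }
        return ids;
    }

    public static void main(String[] args) {
        // Three partitions of unequal size.
        List<List<String>> partitions = Arrays.asList(
                Arrays.asList("a", "b"),
                Arrays.asList("c"),
                Arrays.asList("d", "e", "f"));

        System.out.println(zipWithIndexIds(partitions));    // [[0, 1], [2], [3, 4, 5]]
        System.out.println(zipWithUniqueIdIds(partitions)); // [[0, 3], [1], [2, 5, 8]]
    }
}
```

Under these assumptions, the consecutive scheme needs to know the size of every earlier partition before it can number a later one, while the unique-id scheme can number each partition independently at the cost of gaps in the id range.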

_

_1() - Method in class org.apache.spark.util.MutablePair
_2() - Method in class org.apache.spark.util.MutablePair
