
A

abort(String) - Method in class org.apache.spark.scheduler.TaskSetManager
 
abortStage(Stage, String) - Method in class org.apache.spark.scheduler.DAGScheduler
Aborts all jobs depending on a particular Stage.
AbsoluteError - Class in org.apache.spark.mllib.tree.loss
:: DeveloperApi :: Class for absolute error loss calculation (for regression).
AbsoluteError() - Constructor for class org.apache.spark.mllib.tree.loss.AbsoluteError
 
AbstractJavaDStreamLike<T,This extends JavaDStreamLike<T,This,R>,R extends JavaRDDLike<T,R>> - Class in org.apache.spark.streaming.api.java
As a workaround for https://issues.scala-lang.org/browse/SI-8905, implementations of JavaDStreamLike should extend this dummy abstract class instead of directly inheriting from the trait.
AbstractJavaDStreamLike() - Constructor for class org.apache.spark.streaming.api.java.AbstractJavaDStreamLike
 
AbstractJavaRDDLike<T,This extends JavaRDDLike<T,This>> - Class in org.apache.spark.api.java
As a workaround for https://issues.scala-lang.org/browse/SI-8905, implementations of JavaRDDLike should extend this dummy abstract class instead of directly inheriting from the trait.
AbstractJavaRDDLike() - Constructor for class org.apache.spark.api.java.AbstractJavaRDDLike
 
accept(File, String) - Method in class org.apache.spark.rdd.PipedRDD.NotEqualsFileNameFilter
 
AcceptanceResult - Class in org.apache.spark.util.random
Object used by seqOp to keep track of the number of items accepted and items waitlisted per stratum, as well as the bounds for accepting and waitlisting items.
AcceptanceResult(long, long) - Constructor for class org.apache.spark.util.random.AcceptanceResult
 
acceptBound() - Method in class org.apache.spark.util.random.AcceptanceResult
 
Accumulable<R,T> - Class in org.apache.spark
A data type that can be accumulated, i.e., has a commutative and associative "add" operation, but where the result type, R, may be different from the element type being added, T.
Accumulable(R, AccumulableParam<R, T>, Option<String>) - Constructor for class org.apache.spark.Accumulable
 
Accumulable(R, AccumulableParam<R, T>) - Constructor for class org.apache.spark.Accumulable
 
accumulable(T, AccumulableParam<T, R>) - Method in class org.apache.spark.api.java.JavaSparkContext
Create an Accumulable shared variable of the given type, to which tasks can "add" values with add.
accumulable(T, String, AccumulableParam<T, R>) - Method in class org.apache.spark.api.java.JavaSparkContext
Create an Accumulable shared variable of the given type, to which tasks can "add" values with add.
accumulable(R, AccumulableParam<R, T>) - Method in class org.apache.spark.SparkContext
Create an Accumulable shared variable, to which tasks can add values with +=.
accumulable(R, String, AccumulableParam<R, T>) - Method in class org.apache.spark.SparkContext
Create an Accumulable shared variable, with a name for display in the Spark UI.
accumulableCollection(R, Function1<R, Growable<T>>, ClassTag<R>) - Method in class org.apache.spark.SparkContext
Create an accumulator from a "mutable collection" type.
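For example, a minimal sketch of accumulating into a mutable collection, assuming a live SparkContext named sc:

    import scala.collection.mutable

    // Each task adds elements; the driver reads the merged set afterwards.
    val seen = sc.accumulableCollection(mutable.HashSet[String]())
    sc.parallelize(Seq("a", "b", "a")).foreach(x => seen += x)
    // seen.value == Set("a", "b") on the driver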
AccumulableInfo - Class in org.apache.spark.scheduler
:: DeveloperApi :: Information about an Accumulable modified during a task or stage.
AccumulableInfo(long, String, Option<String>, String) - Constructor for class org.apache.spark.scheduler.AccumulableInfo
 
accumulableInfoFromJson(JsonAST.JValue) - Static method in class org.apache.spark.util.JsonProtocol
 
accumulableInfoToJson(AccumulableInfo) - Static method in class org.apache.spark.util.JsonProtocol
 
AccumulableParam<R,T> - Interface in org.apache.spark
Helper object defining how to accumulate values of a particular type.
accumulables() - Method in class org.apache.spark.scheduler.StageInfo
Terminal values of accumulables updated during this stage.
accumulables() - Method in class org.apache.spark.scheduler.TaskInfo
Intermediate updates to accumulables during this task.
accumulables() - Method in class org.apache.spark.ui.jobs.UIData.StageUIData
 
Accumulator<T> - Class in org.apache.spark
A simpler value of Accumulable where the result type being accumulated is the same as the type of the elements being merged.
Accumulator(T, AccumulatorParam<T>, Option<String>) - Constructor for class org.apache.spark.Accumulator
 
Accumulator(T, AccumulatorParam<T>) - Constructor for class org.apache.spark.Accumulator
 
accumulator(int) - Method in class org.apache.spark.api.java.JavaSparkContext
Create an Accumulator integer variable, which tasks can "add" values to using the add method.
accumulator(int, String) - Method in class org.apache.spark.api.java.JavaSparkContext
Create an Accumulator integer variable, which tasks can "add" values to using the add method.
accumulator(double) - Method in class org.apache.spark.api.java.JavaSparkContext
Create an Accumulator double variable, which tasks can "add" values to using the add method.
accumulator(double, String) - Method in class org.apache.spark.api.java.JavaSparkContext
Create an Accumulator double variable, which tasks can "add" values to using the add method.
accumulator(T, AccumulatorParam<T>) - Method in class org.apache.spark.api.java.JavaSparkContext
Create an Accumulator variable of a given type, which tasks can "add" values to using the add method.
accumulator(T, String, AccumulatorParam<T>) - Method in class org.apache.spark.api.java.JavaSparkContext
Create an Accumulator variable of a given type, which tasks can "add" values to using the add method.
accumulator(T, AccumulatorParam<T>) - Method in class org.apache.spark.SparkContext
Create an Accumulator variable of a given type, which tasks can "add" values to using the += method.
accumulator(T, String, AccumulatorParam<T>) - Method in class org.apache.spark.SparkContext
Create an Accumulator variable of a given type, with a name for display in the Spark UI.
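For example, a minimal counter sketch, assuming a live SparkContext named sc:

    // Named accumulators show up in the Spark UI; tasks may only add to them.
    val badRecords = sc.accumulator(0, "bad records")
    sc.parallelize(1 to 100).foreach { i =>
      if (i % 7 == 0) badRecords += 1
    }
    println(badRecords.value)   // read the merged total on the driver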
accumulator() - Method in class org.apache.spark.sql.execution.PythonUDF
 
AccumulatorParam<T> - Interface in org.apache.spark
A simpler version of AccumulableParam where the only data type you can add in is the same type as the accumulated value.
Accumulators - Class in org.apache.spark
 
Accumulators() - Constructor for class org.apache.spark.Accumulators
 
accumUpdates() - Method in class org.apache.spark.scheduler.CompletionEvent
 
accumUpdates() - Method in class org.apache.spark.scheduler.DirectTaskResult
 
accuracy() - Method in class org.apache.spark.mllib.evaluation.MultilabelMetrics
Returns the accuracy.
aclsEnabled() - Method in class org.apache.spark.SecurityManager
Check whether ACLs for the UI are enabled.
active() - Method in class org.apache.spark.streaming.scheduler.ReceiverInfo
 
activeExecutorIds() - Method in class org.apache.spark.scheduler.TaskSchedulerImpl
 
ActiveJob - Class in org.apache.spark.scheduler
Tracks information about an active job in the DAGScheduler.
ActiveJob(int, Stage, Function2<TaskContext, Iterator<Object>, ?>, int[], CallSite, JobListener, Properties) - Constructor for class org.apache.spark.scheduler.ActiveJob
 
activeJobs() - Method in class org.apache.spark.scheduler.DAGScheduler
 
activeJobs() - Method in class org.apache.spark.ui.jobs.JobProgressListener
 
activeStages() - Method in class org.apache.spark.ui.jobs.JobProgressListener
 
activeTasks() - Method in class org.apache.spark.ui.exec.ExecutorSummaryInfo
 
activeTaskSets() - Method in class org.apache.spark.scheduler.TaskSchedulerImpl
 
actor() - Method in class org.apache.spark.streaming.scheduler.ReceiverInfo
 
ACTOR_NAME() - Static method in class org.apache.spark.scheduler.cluster.CoarseGrainedSchedulerBackend
 
ACTOR_NAME() - Static method in class org.apache.spark.scheduler.cluster.YarnSchedulerBackend
 
ActorHelper - Interface in org.apache.spark.streaming.receiver
:: DeveloperApi :: A receiver trait to be mixed in with your Actor to gain access to the API for pushing received data into Spark Streaming for processing.
ActorLogReceive - Interface in org.apache.spark.util
A trait to enable logging all Akka actor messages.
ActorReceiver<T> - Class in org.apache.spark.streaming.receiver
Provides Actors as receivers for receiving data streams.
ActorReceiver(Props, String, StorageLevel, SupervisorStrategy, ClassTag<T>) - Constructor for class org.apache.spark.streaming.receiver.ActorReceiver
 
ActorReceiver.Supervisor - Class in org.apache.spark.streaming.receiver
 
ActorReceiver.Supervisor() - Constructor for class org.apache.spark.streaming.receiver.ActorReceiver.Supervisor
 
ActorReceiverData - Interface in org.apache.spark.streaming.receiver
Case class to receive data sent by child actors
actorStream(Props, String, StorageLevel, SupervisorStrategy) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
Create an input stream with any arbitrary user implemented actor receiver.
actorStream(Props, String, StorageLevel) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
Create an input stream with any arbitrary user implemented actor receiver.
actorStream(Props, String) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
Create an input stream with any arbitrary user implemented actor receiver.
actorStream(Props, String, StorageLevel, SupervisorStrategy, ClassTag<T>) - Method in class org.apache.spark.streaming.StreamingContext
Create an input stream with any arbitrary user implemented actor receiver.
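A minimal receiver sketch, assuming a live StreamingContext named ssc; the actor class is hypothetical:

    import akka.actor.{Actor, Props}
    import org.apache.spark.streaming.receiver.ActorHelper

    // Hypothetical actor that forwards every String it receives into Spark Streaming.
    class EchoReceiver extends Actor with ActorHelper {
      def receive = { case s: String => store(s) }
    }

    val lines = ssc.actorStream[String](Props[EchoReceiver], "EchoReceiver")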
ActorSupervisorStrategy - Class in org.apache.spark.streaming.receiver
:: DeveloperApi :: A helper with a set of defaults for the supervisor strategy.
ActorSupervisorStrategy() - Constructor for class org.apache.spark.streaming.receiver.ActorSupervisorStrategy
 
actorSystem() - Method in class org.apache.spark.scheduler.cluster.CoarseGrainedSchedulerBackend
 
actorSystem() - Method in class org.apache.spark.SparkEnv
 
actualSize(Row, int) - Method in class org.apache.spark.sql.columnar.ByteArrayColumnType
 
actualSize(Row, int) - Method in class org.apache.spark.sql.columnar.ColumnType
Returns the size of the value row(ordinal).
actualSize(Row, int) - Static method in class org.apache.spark.sql.columnar.STRING
 
add(T) - Method in class org.apache.spark.Accumulable
Add more data to this accumulator / accumulable
add(Map<Object, Object>) - Static method in class org.apache.spark.Accumulators
 
add(long, long, ED) - Method in class org.apache.spark.graphx.impl.EdgePartitionBuilder
Add a new edge to the partition.
add(long, long, int, int, ED) - Method in class org.apache.spark.graphx.impl.ExistingEdgePartitionBuilder
Add a new edge to the partition.
add(Vector) - Method in class org.apache.spark.mllib.feature.IDF.DocumentFrequencyAggregator
Adds a new document.
add(Vector) - Method in class org.apache.spark.mllib.stat.MultivariateOnlineSummarizer
Add a new sample to this summarizer, and update the statistical summary.
add(ImpurityCalculator) - Method in class org.apache.spark.mllib.tree.impurity.ImpurityCalculator
Add the stats from another calculator into this one, modifying and returning this calculator.
add(long, long) - Static method in class org.apache.spark.streaming.util.RawTextHelper
 
add(Vector) - Method in class org.apache.spark.util.Vector
 
addAccumulator(R, T) - Method in interface org.apache.spark.AccumulableParam
Add additional data to the accumulator value.
addAccumulator(T, T) - Method in interface org.apache.spark.AccumulatorParam
 
addAccumulator(R, T) - Method in class org.apache.spark.GrowableAccumulableParam
 
addBinary(Binary) - Method in class org.apache.spark.sql.parquet.CatalystPrimitiveConverter
 
addBlock(BlockId, BlockStatus) - Method in class org.apache.spark.storage.StorageStatus
Add the given block to this storage status.
AddBlock - Class in org.apache.spark.streaming.scheduler
 
AddBlock(ReceivedBlockInfo) - Constructor for class org.apache.spark.streaming.scheduler.AddBlock
 
addBlock(ReceivedBlockInfo) - Method in class org.apache.spark.streaming.scheduler.ReceivedBlockTracker
Add received block.
addBoolean(boolean) - Method in class org.apache.spark.sql.parquet.CatalystPrimitiveConverter
 
addData(Object) - Method in class org.apache.spark.streaming.receiver.BlockGenerator
Push a single data item into the buffer.
addDataWithCallback(Object, Object) - Method in class org.apache.spark.streaming.receiver.BlockGenerator
Push a single data item into the buffer.
addDouble(double) - Method in class org.apache.spark.sql.parquet.CatalystPrimitiveConverter
 
addedFiles() - Method in class org.apache.spark.SparkContext
 
addedJars() - Method in class org.apache.spark.SparkContext
 
AddExchange - Class in org.apache.spark.sql.execution
Ensures that the Partitioning of input data meets the Distribution requirements for each operator by inserting Exchange Operators where required.
AddExchange(SQLContext) - Constructor for class org.apache.spark.sql.execution.AddExchange
 
addFile(String) - Method in class org.apache.spark.api.java.JavaSparkContext
Add a file to be downloaded with this Spark job on every node.
addFile(File) - Method in class org.apache.spark.HttpFileServer
 
addFile(String) - Method in class org.apache.spark.SparkContext
Add a file to be downloaded with this Spark job on every node.
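A minimal sketch, assuming a live SparkContext named sc; the file path is hypothetical:

    import org.apache.spark.SparkFiles

    sc.addFile("/path/to/lookup.txt")
    val counts = sc.parallelize(1 to 4).map { _ =>
      // Each executor resolves its local copy of the distributed file.
      scala.io.Source.fromFile(SparkFiles.get("lookup.txt")).getLines().length
    }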
AddFile - Class in org.apache.spark.sql.hive
 
AddFile(String) - Constructor for class org.apache.spark.sql.hive.AddFile
 
AddFile - Class in org.apache.spark.sql.hive.execution
:: DeveloperApi ::
AddFile(String) - Constructor for class org.apache.spark.sql.hive.execution.AddFile
 
addFileToDir(File, File) - Method in class org.apache.spark.HttpFileServer
 
addFilters(Seq<ServletContextHandler>, SparkConf) - Static method in class org.apache.spark.ui.JettyUtils
Add filters, if any, to the given list of ServletContextHandlers
addFloat(float) - Method in class org.apache.spark.sql.parquet.CatalystPrimitiveConverter
 
addGrid(Param<T>, Iterable<T>) - Method in class org.apache.spark.ml.tuning.ParamGridBuilder
Adds a param with multiple values (overwrites if the input param exists).
addGrid(DoubleParam, double[]) - Method in class org.apache.spark.ml.tuning.ParamGridBuilder
Adds a double param with multiple values.
addGrid(IntParam, int[]) - Method in class org.apache.spark.ml.tuning.ParamGridBuilder
Adds an int param with multiple values.
addGrid(FloatParam, float[]) - Method in class org.apache.spark.ml.tuning.ParamGridBuilder
Adds a float param with multiple values.
addGrid(LongParam, long[]) - Method in class org.apache.spark.ml.tuning.ParamGridBuilder
Adds a long param with multiple values.
addGrid(BooleanParam) - Method in class org.apache.spark.ml.tuning.ParamGridBuilder
Adds a boolean param with true and false.
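A minimal grid sketch, assuming an estimator such as LogisticRegression:

    import org.apache.spark.ml.classification.LogisticRegression
    import org.apache.spark.ml.tuning.ParamGridBuilder

    val lr = new LogisticRegression()
    // Cross product of the values: 2 x 2 = 4 ParamMaps for model selection.
    val paramGrid = new ParamGridBuilder()
      .addGrid(lr.regParam, Array(0.01, 0.1))
      .addGrid(lr.maxIter, Array(10, 100))
      .build()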
addInPlace(R, R) - Method in interface org.apache.spark.AccumulableParam
Merge two accumulated values together.
addInPlace(R, R) - Method in class org.apache.spark.GrowableAccumulableParam
 
addInPlace(double, double) - Method in class org.apache.spark.SparkContext.DoubleAccumulatorParam$
 
addInPlace(float, float) - Method in class org.apache.spark.SparkContext.FloatAccumulatorParam$
 
addInPlace(int, int) - Method in class org.apache.spark.SparkContext.IntAccumulatorParam$
 
addInPlace(long, long) - Method in class org.apache.spark.SparkContext.LongAccumulatorParam$
 
addInPlace(Vector) - Method in class org.apache.spark.util.Vector
 
addInPlace(Vector, Vector) - Method in class org.apache.spark.util.Vector.VectorAccumParam$
 
addInputStream(InputDStream<?>) - Method in class org.apache.spark.streaming.DStreamGraph
 
addInt(int) - Method in class org.apache.spark.sql.parquet.CatalystPrimitiveConverter
 
addJar(String) - Method in class org.apache.spark.api.java.JavaSparkContext
Adds a JAR dependency for all tasks to be executed on this SparkContext in the future.
addJar(File) - Method in class org.apache.spark.HttpFileServer
 
addJar(String) - Method in class org.apache.spark.SparkContext
Adds a JAR dependency for all tasks to be executed on this SparkContext in the future.
AddJar - Class in org.apache.spark.sql.hive
 
AddJar(String) - Constructor for class org.apache.spark.sql.hive.AddJar
 
AddJar - Class in org.apache.spark.sql.hive.execution
:: DeveloperApi ::
AddJar(String) - Constructor for class org.apache.spark.sql.hive.execution.AddJar
 
addListener(SparkListener) - Method in interface org.apache.spark.scheduler.SparkListenerBus
 
addListener(StreamingListener) - Method in class org.apache.spark.streaming.scheduler.StreamingListenerBus
 
addLocalConfiguration(String, int, int, int, JobConf) - Static method in class org.apache.spark.rdd.HadoopRDD
Add Hadoop configuration specific to a single partition and attempt.
addLong(long) - Method in class org.apache.spark.sql.parquet.CatalystPrimitiveConverter
 
addOnCompleteCallback(Function0<Unit>) - Method in class org.apache.spark.TaskContext
Deprecated.
addOnCompleteCallback(Function0<BoxedUnit>) - Method in class org.apache.spark.TaskContextImpl
 
addOutputLoc(int, MapStatus) - Method in class org.apache.spark.scheduler.Stage
 
addOutputStream(DStream<?>) - Method in class org.apache.spark.streaming.DStreamGraph
 
addPartitioningAttributes(Seq<Attribute>) - Method in class org.apache.spark.sql.hive.HiveStrategies.ParquetConversion.LogicalPlanHacks
 
addPartToPGroup(Partition, PartitionGroup) - Method in class org.apache.spark.rdd.PartitionCoalescer
 
address() - Method in class org.apache.spark.storage.ShuffleBlockFetcherIterator.FetchRequest
 
addresses() - Method in class org.apache.spark.streaming.flume.FlumePollingInputDStream
 
addRunningTask(long) - Method in class org.apache.spark.scheduler.TaskSetManager
If the given task ID is not in the set of running tasks, adds it.
addSchedulable(Schedulable) - Method in class org.apache.spark.scheduler.Pool
 
addSchedulable(Schedulable) - Method in interface org.apache.spark.scheduler.Schedulable
 
addSchedulable(Schedulable) - Method in class org.apache.spark.scheduler.TaskSetManager
 
addSparkListener(SparkListener) - Method in class org.apache.spark.SparkContext
:: DeveloperApi :: Register a listener to receive up-calls from events that happen during execution.
addStreamingListener(StreamingListener) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
Add a StreamingListener object for receiving system events related to streaming.
addStreamingListener(StreamingListener) - Method in class org.apache.spark.streaming.StreamingContext
Add a StreamingListener object for receiving system events related to streaming.
addTaskCompletionListener(TaskCompletionListener) - Method in class org.apache.spark.TaskContext
Add a (Java friendly) listener to be executed on task completion.
addTaskCompletionListener(Function1<TaskContext, Unit>) - Method in class org.apache.spark.TaskContext
Add a listener in the form of a Scala closure to be executed on task completion.
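A minimal cleanup sketch, assuming a live SparkContext named sc; the per-task resource is hypothetical:

    import org.apache.spark.TaskContext

    val rdd = sc.parallelize(1 to 10, 2).mapPartitions { iter =>
      val resource = new java.io.ByteArrayInputStream(Array.empty[Byte])
      // Close the per-task resource when the task finishes, even on failure.
      TaskContext.get().addTaskCompletionListener(_ => resource.close())
      iter
    }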
addTaskCompletionListener(TaskCompletionListener) - Method in class org.apache.spark.TaskContextImpl
 
addTaskCompletionListener(Function1<TaskContext, BoxedUnit>) - Method in class org.apache.spark.TaskContextImpl
 
addTaskSetManager(Schedulable, Properties) - Method in class org.apache.spark.scheduler.FairSchedulableBuilder
 
addTaskSetManager(Schedulable, Properties) - Method in class org.apache.spark.scheduler.FIFOSchedulableBuilder
 
addTaskSetManager(Schedulable, Properties) - Method in interface org.apache.spark.scheduler.SchedulableBuilder
 
addToTime(long) - Method in class org.apache.spark.streaming.util.ManualClock
 
adminAcls() - Method in class org.apache.spark.scheduler.ApplicationEventListener
 
advanceCheckpoint() - Method in class org.apache.spark.streaming.kinesis.KinesisCheckpointState
Advance the checkpoint clock by the checkpoint interval.
aggregate(U, Function2<U, T, U>, Function2<U, U, U>) - Method in interface org.apache.spark.api.java.JavaRDDLike
Aggregate the elements of each partition, and then the results for all the partitions, using the given combine functions and a neutral "zero value".
aggregate(U, Function2<U, T, U>, Function2<U, U, U>, ClassTag<U>) - Method in class org.apache.spark.rdd.RDD
Aggregate the elements of each partition, and then the results for all the partitions, using the given combine functions and a neutral "zero value".
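For example, computing a sum and a count in one pass, assuming a live SparkContext named sc:

    val nums = sc.parallelize(Seq(1, 2, 3, 4))
    val (sum, count) = nums.aggregate((0, 0))(
      (acc, x) => (acc._1 + x, acc._2 + 1),    // fold an element into a partition's accumulator
      (a, b) => (a._1 + b._1, a._2 + b._2))    // merge the per-partition accumulators
    // sum == 10, count == 4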
Aggregate - Class in org.apache.spark.sql.execution
:: DeveloperApi :: Groups input data by groupingExpressions and computes the aggregateExpressions for each group.
Aggregate(boolean, Seq<Expression>, Seq<NamedExpression>, SparkPlan) - Constructor for class org.apache.spark.sql.execution.Aggregate
 
aggregate() - Method in class org.apache.spark.sql.execution.Aggregate.ComputedAggregate
 
aggregate(Seq<Expression>) - Method in class org.apache.spark.sql.SchemaRDD
Performs an aggregation over all Rows in this RDD.
Aggregate.ComputedAggregate - Class in org.apache.spark.sql.execution
An aggregate that needs to be computed for each row in a group.
Aggregate.ComputedAggregate(AggregateExpression, AggregateExpression, AttributeReference) - Constructor for class org.apache.spark.sql.execution.Aggregate.ComputedAggregate
 
Aggregate.ComputedAggregate$ - Class in org.apache.spark.sql.execution
 
Aggregate.ComputedAggregate$() - Constructor for class org.apache.spark.sql.execution.Aggregate.ComputedAggregate$
 
aggregateByKey(U, Partitioner, Function2<U, V, U>, Function2<U, U, U>) - Method in class org.apache.spark.api.java.JavaPairRDD
Aggregate the values of each key, using the given combine functions and a neutral "zero value".
aggregateByKey(U, int, Function2<U, V, U>, Function2<U, U, U>) - Method in class org.apache.spark.api.java.JavaPairRDD
Aggregate the values of each key, using the given combine functions and a neutral "zero value".
aggregateByKey(U, Function2<U, V, U>, Function2<U, U, U>) - Method in class org.apache.spark.api.java.JavaPairRDD
Aggregate the values of each key, using the given combine functions and a neutral "zero value".
aggregateByKey(U, Partitioner, Function2<U, V, U>, Function2<U, U, U>, ClassTag<U>) - Method in class org.apache.spark.rdd.PairRDDFunctions
Aggregate the values of each key, using the given combine functions and a neutral "zero value".
aggregateByKey(U, int, Function2<U, V, U>, Function2<U, U, U>, ClassTag<U>) - Method in class org.apache.spark.rdd.PairRDDFunctions
Aggregate the values of each key, using the given combine functions and a neutral "zero value".
aggregateByKey(U, Function2<U, V, U>, Function2<U, U, U>, ClassTag<U>) - Method in class org.apache.spark.rdd.PairRDDFunctions
Aggregate the values of each key, using the given combine functions and a neutral "zero value".
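The same pattern per key, assuming a live SparkContext named sc:

    val pairs = sc.parallelize(Seq(("a", 1), ("a", 3), ("b", 5)))
    // Per-key (sum, count) without materializing groups as groupByKey would.
    val sumCounts = pairs.aggregateByKey((0, 0))(
      (acc, v) => (acc._1 + v, acc._2 + 1),
      (a, b) => (a._1 + b._1, a._2 + b._2))
    // ("a", (4, 2)), ("b", (5, 1))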
AggregateEvaluation - Class in org.apache.spark.sql.execution
 
AggregateEvaluation(Seq<Attribute>, Seq<Expression>, Seq<Expression>, Expression) - Constructor for class org.apache.spark.sql.execution.AggregateEvaluation
 
aggregateExpressions() - Method in class org.apache.spark.sql.execution.Aggregate
 
aggregateExpressions() - Method in class org.apache.spark.sql.execution.GeneratedAggregate
 
aggregateMessages(Function1<EdgeContext<VD, ED, A>, BoxedUnit>, Function2<A, A, A>, TripletFields, ClassTag<A>) - Method in class org.apache.spark.graphx.Graph
Aggregates values from the neighboring edges and vertices of each vertex.
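For example, counting in-degrees, assuming a Graph[VD, ED] named graph:

    import org.apache.spark.graphx._

    val inDegrees: VertexRDD[Int] = graph.aggregateMessages[Int](
      ctx => ctx.sendToDst(1),   // send one message along each edge
      _ + _)                     // sum the messages arriving at each vertex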
aggregateMessagesEdgeScan(Function1<EdgeContext<VD, ED, A>, BoxedUnit>, Function2<A, A, A>, TripletFields, EdgeActiveness, ClassTag<A>) - Method in class org.apache.spark.graphx.impl.EdgePartition
Send messages along edges and aggregate them at the receiving vertices.
aggregateMessagesIndexScan(Function1<EdgeContext<VD, ED, A>, BoxedUnit>, Function2<A, A, A>, TripletFields, EdgeActiveness, ClassTag<A>) - Method in class org.apache.spark.graphx.impl.EdgePartition
Send messages along edges and aggregate them at the receiving vertices.
aggregateMessagesWithActiveSet(Function1<EdgeContext<VD, ED, A>, BoxedUnit>, Function2<A, A, A>, TripletFields, Option<Tuple2<VertexRDD<?>, EdgeDirection>>, ClassTag<A>) - Method in class org.apache.spark.graphx.Graph
Aggregates values from the neighboring edges and vertices of each vertex.
aggregateMessagesWithActiveSet(Function1<EdgeContext<VD, ED, A>, BoxedUnit>, Function2<A, A, A>, TripletFields, Option<Tuple2<VertexRDD<?>, EdgeDirection>>, ClassTag<A>) - Method in class org.apache.spark.graphx.impl.GraphImpl
 
aggregateSizeForNode(DecisionTreeMetadata, Option<int[]>) - Static method in class org.apache.spark.mllib.tree.RandomForest
Get the number of values to be stored for this node in the bin aggregates.
aggregateUsingIndex(Iterator<Product2<Object, VD2>>, Function2<VD2, VD2, VD2>, ClassTag<VD2>) - Method in class org.apache.spark.graphx.impl.VertexPartitionBaseOps
 
aggregateUsingIndex(RDD<Tuple2<Object, VD2>>, Function2<VD2, VD2, VD2>, ClassTag<VD2>) - Method in class org.apache.spark.graphx.impl.VertexRDDImpl
 
aggregateUsingIndex(RDD<Tuple2<Object, VD2>>, Function2<VD2, VD2, VD2>, ClassTag<VD2>) - Method in class org.apache.spark.graphx.VertexRDD
Aggregates vertices in messages that have the same ids using reduceFunc, returning a VertexRDD co-indexed with this.
AggregatingEdgeContext<VD,ED,A> - Class in org.apache.spark.graphx.impl
 
AggregatingEdgeContext(Function2<A, A, A>, Object, BitSet) - Constructor for class org.apache.spark.graphx.impl.AggregatingEdgeContext
 
Aggregator<K,V,C> - Class in org.apache.spark
:: DeveloperApi :: A set of functions used to aggregate data.
Aggregator(Function1<V, C>, Function2<C, V, C>, Function2<C, C, C>) - Constructor for class org.apache.spark.Aggregator
 
aggregator() - Method in class org.apache.spark.ShuffleDependency
 
AkkaUtils - Class in org.apache.spark.util
Various utility classes for working with Akka.
AkkaUtils() - Constructor for class org.apache.spark.util.AkkaUtils
 
Algo - Class in org.apache.spark.mllib.tree.configuration
:: Experimental :: Enum to select the algorithm for the decision tree
Algo() - Constructor for class org.apache.spark.mllib.tree.configuration.Algo
 
algo() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
 
algo() - Method in class org.apache.spark.mllib.tree.model.DecisionTreeModel
 
algo() - Method in class org.apache.spark.mllib.tree.model.GradientBoostedTreesModel
 
algo() - Method in class org.apache.spark.mllib.tree.model.RandomForestModel
 
algorithm() - Method in class org.apache.spark.mllib.regression.StreamingLinearRegressionWithSGD
 
alias() - Method in class org.apache.spark.sql.hive.MetastoreRelation
 
aliasNames() - Method in class org.apache.spark.sql.hive.HiveGenericUdtf
 
All - Static variable in class org.apache.spark.graphx.TripletFields
Expose all the fields (source, edge, and destination).
allAggregates(Seq<Expression>) - Method in class org.apache.spark.sql.execution.SparkStrategies.HashAggregation
 
AllCompressionSchemes - Interface in org.apache.spark.sql.columnar.compression
 
AllJobsCancelled - Class in org.apache.spark.scheduler
 
AllJobsCancelled() - Constructor for class org.apache.spark.scheduler.AllJobsCancelled
 
AllJobsPage - Class in org.apache.spark.ui.jobs
Page showing list of all ongoing and recently finished jobs
AllJobsPage(JobsTab) - Constructor for class org.apache.spark.ui.jobs.AllJobsPage
 
allJoinTokens() - Static method in class org.apache.spark.sql.hive.HiveQl
 
allocateBlocksToBatch(Time) - Method in class org.apache.spark.streaming.scheduler.ReceivedBlockTracker
Allocate all unallocated blocks to the given batch.
allocateBlocksToBatch(Time) - Method in class org.apache.spark.streaming.scheduler.ReceiverTracker
Allocate all unallocated blocks to the given batch.
AllocatedBlocks - Class in org.apache.spark.streaming.scheduler
Class representing the blocks of all the streams allocated to a batch
AllocatedBlocks(Map<Object, Seq<ReceivedBlockInfo>>) - Constructor for class org.apache.spark.streaming.scheduler.AllocatedBlocks
 
allocatedBlocks() - Method in class org.apache.spark.streaming.scheduler.BatchAllocationEvent
 
allowExisting() - Method in class org.apache.spark.sql.hive.execution.CreateTableAsSelect
 
allowLocal() - Method in class org.apache.spark.scheduler.JobSubmitted
 
allPendingTasks() - Method in class org.apache.spark.scheduler.TaskSetManager
 
AllStagesPage - Class in org.apache.spark.ui.jobs
Page showing list of all ongoing and recently finished stages and pools
AllStagesPage(StagesTab) - Constructor for class org.apache.spark.ui.jobs.AllStagesPage
 
AlphaComponent - Annotation Type in org.apache.spark.annotation
A new component of Spark which may have unstable APIs.
alreadyPlanned() - Method in class org.apache.spark.sql.execution.SparkLogicalPlan
 
ALS - Class in org.apache.spark.mllib.recommendation
Alternating Least Squares matrix factorization.
ALS() - Constructor for class org.apache.spark.mllib.recommendation.ALS
Constructs an ALS instance with default parameters: {numBlocks: -1, rank: 10, iterations: 10, lambda: 0.01, implicitPrefs: false, alpha: 1.0}.
ALS.BlockStats - Class in org.apache.spark.mllib.recommendation
:: DeveloperApi :: Statistics of a block in ALS computation.
ALS.BlockStats(String, int, long, long, long, long) - Constructor for class org.apache.spark.mllib.recommendation.ALS.BlockStats
 
ALS.BlockStats$ - Class in org.apache.spark.mllib.recommendation
 
ALS.BlockStats$() - Constructor for class org.apache.spark.mllib.recommendation.ALS.BlockStats$
 
ALSPartitioner - Class in org.apache.spark.mllib.recommendation
Partitioner for ALS.
ALSPartitioner(int) - Constructor for class org.apache.spark.mllib.recommendation.ALSPartitioner
 
analyze(String) - Method in class org.apache.spark.sql.hive.HiveContext
Analyzes the given table in the current database to generate statistics, which will be used in query optimizations.
analyzeBlocks(RDD<Rating>, int, int) - Static method in class org.apache.spark.mllib.recommendation.ALS
:: DeveloperApi :: Given an RDD of ratings, number of user blocks, and number of product blocks, computes the statistics of each block in ALS computation.
analyzed() - Method in class org.apache.spark.sql.hive.test.TestHiveContext.QueryExecution
 
AnalyzeTable - Class in org.apache.spark.sql.hive
 
AnalyzeTable(String) - Constructor for class org.apache.spark.sql.hive.AnalyzeTable
 
AnalyzeTable - Class in org.apache.spark.sql.hive.execution
:: DeveloperApi :: Analyzes the given table in the current database to generate statistics, which will be used in query optimizations.
AnalyzeTable(String) - Constructor for class org.apache.spark.sql.hive.execution.AnalyzeTable
 
AND() - Static method in class org.apache.spark.sql.hive.HiveQl
 
ANY() - Static method in class org.apache.spark.scheduler.TaskLocality
 
append(boolean, ByteBuffer) - Static method in class org.apache.spark.sql.columnar.BOOLEAN
 
append(Row, int, ByteBuffer) - Static method in class org.apache.spark.sql.columnar.BOOLEAN
 
append(byte, ByteBuffer) - Static method in class org.apache.spark.sql.columnar.BYTE
 
append(Row, int, ByteBuffer) - Static method in class org.apache.spark.sql.columnar.BYTE
 
append(byte[], ByteBuffer) - Method in class org.apache.spark.sql.columnar.ByteArrayColumnType
 
append(JvmType, ByteBuffer) - Method in class org.apache.spark.sql.columnar.ColumnType
Appends the given value v of type T into the given ByteBuffer.
append(Row, int, ByteBuffer) - Method in class org.apache.spark.sql.columnar.ColumnType
Appends row(ordinal) of type T into the given ByteBuffer.
append(Date, ByteBuffer) - Static method in class org.apache.spark.sql.columnar.DATE
 
append(double, ByteBuffer) - Static method in class org.apache.spark.sql.columnar.DOUBLE
 
append(Row, int, ByteBuffer) - Static method in class org.apache.spark.sql.columnar.DOUBLE
 
append(float, ByteBuffer) - Static method in class org.apache.spark.sql.columnar.FLOAT
 
append(Row, int, ByteBuffer) - Static method in class org.apache.spark.sql.columnar.FLOAT
 
append(int, ByteBuffer) - Static method in class org.apache.spark.sql.columnar.INT
 
append(Row, int, ByteBuffer) - Static method in class org.apache.spark.sql.columnar.INT
 
append(long, ByteBuffer) - Static method in class org.apache.spark.sql.columnar.LONG
 
append(Row, int, ByteBuffer) - Static method in class org.apache.spark.sql.columnar.LONG
 
append(short, ByteBuffer) - Static method in class org.apache.spark.sql.columnar.SHORT
 
append(Row, int, ByteBuffer) - Static method in class org.apache.spark.sql.columnar.SHORT
 
append(String, ByteBuffer) - Static method in class org.apache.spark.sql.columnar.STRING
 
append(Timestamp, ByteBuffer) - Static method in class org.apache.spark.sql.columnar.TIMESTAMP
 
append(AvroFlumeEvent) - Method in class org.apache.spark.streaming.flume.FlumeEventServer
 
appendBatch(List<AvroFlumeEvent>) - Method in class org.apache.spark.streaming.flume.FlumeEventServer
 
appendBias(Vector) - Static method in class org.apache.spark.mllib.util.MLUtils
Returns a new vector with 1.0 (bias) appended to the input vector.
appendFrom(Row, int) - Method in class org.apache.spark.sql.columnar.BasicColumnBuilder
 
appendFrom(Row, int) - Method in interface org.apache.spark.sql.columnar.ColumnBuilder
Appends row(ordinal) to the column builder.
appendFrom(Row, int) - Method in interface org.apache.spark.sql.columnar.compression.CompressibleColumnBuilder
 
appendFrom(Row, int) - Method in interface org.apache.spark.sql.columnar.NullableColumnBuilder
 
AppendingParquetOutputFormat - Class in org.apache.spark.sql.parquet
TODO: this will be able to append to directories it created itself, not necessarily to imported ones.
AppendingParquetOutputFormat(int) - Constructor for class org.apache.spark.sql.parquet.AppendingParquetOutputFormat
 
appendReadColumns(Configuration, Seq<Integer>, Seq<String>) - Static method in class org.apache.spark.sql.hive.HiveShim
 
appId() - Method in class org.apache.spark.scheduler.ApplicationEventListener
 
appId() - Method in class org.apache.spark.scheduler.cluster.mesos.CoarseMesosSchedulerBackend
 
appId() - Method in class org.apache.spark.scheduler.cluster.mesos.MesosSchedulerBackend
 
appId() - Method in class org.apache.spark.scheduler.cluster.SparkDeploySchedulerBackend
 
appId() - Method in interface org.apache.spark.scheduler.SchedulerBackend
 
appId() - Method in class org.apache.spark.scheduler.SparkListenerApplicationStart
 
appId() - Method in interface org.apache.spark.scheduler.TaskScheduler
 
APPLICATION_COMPLETE() - Static method in class org.apache.spark.scheduler.EventLoggingListener
 
applicationComplete() - Method in class org.apache.spark.scheduler.EventLoggingInfo
 
applicationEndFromJson(JsonAST.JValue) - Static method in class org.apache.spark.util.JsonProtocol
 
applicationEndToJson(SparkListenerApplicationEnd) - Static method in class org.apache.spark.util.JsonProtocol
 
ApplicationEventListener - Class in org.apache.spark.scheduler
A simple listener for application events.
ApplicationEventListener() - Constructor for class org.apache.spark.scheduler.ApplicationEventListener
 
applicationId() - Method in class org.apache.spark.scheduler.cluster.mesos.CoarseMesosSchedulerBackend
 
applicationId() - Method in class org.apache.spark.scheduler.cluster.mesos.MesosSchedulerBackend
 
applicationId() - Method in class org.apache.spark.scheduler.cluster.SparkDeploySchedulerBackend
 
applicationId() - Method in class org.apache.spark.scheduler.local.LocalBackend
 
applicationId() - Method in interface org.apache.spark.scheduler.SchedulerBackend
Get an application ID associated with the job.
applicationId() - Method in interface org.apache.spark.scheduler.TaskScheduler
Get an application ID associated with the job.
applicationId() - Method in class org.apache.spark.scheduler.TaskSchedulerImpl
 
applicationId() - Method in class org.apache.spark.SparkContext
 
applicationStartFromJson(JsonAST.JValue) - Static method in class org.apache.spark.util.JsonProtocol
 
applicationStartToJson(SparkListenerApplicationStart) - Static method in class org.apache.spark.util.JsonProtocol
 
apply(RDD<Tuple2<Object, VD>>, RDD<Edge<ED>>, VD, StorageLevel, StorageLevel, ClassTag<VD>, ClassTag<ED>) - Static method in class org.apache.spark.graphx.Graph
Construct a graph from a collection of vertices and edges with attributes.
apply(RDD<Edge<ED>>, VD, StorageLevel, StorageLevel, ClassTag<VD>, ClassTag<ED>) - Static method in class org.apache.spark.graphx.impl.GraphImpl
Create a graph from edges, setting referenced vertices to `defaultVertexAttr`.
apply(RDD<Tuple2<Object, VD>>, RDD<Edge<ED>>, VD, StorageLevel, StorageLevel, ClassTag<VD>, ClassTag<ED>) - Static method in class org.apache.spark.graphx.impl.GraphImpl
Create a graph from vertices and edges, setting missing vertices to `defaultVertexAttr`.
apply(VertexRDD<VD>, EdgeRDD<ED>, ClassTag<VD>, ClassTag<ED>) - Static method in class org.apache.spark.graphx.impl.GraphImpl
Create a graph from a VertexRDD and an EdgeRDD with arbitrary replicated vertices.
apply(Iterator<Tuple2<Object, VD>>, ClassTag<VD>) - Static method in class org.apache.spark.graphx.impl.ShippableVertexPartition
Construct a `ShippableVertexPartition` from the given vertices without any routing table.
apply(Iterator<Tuple2<Object, VD>>, RoutingTablePartition, VD, ClassTag<VD>) - Static method in class org.apache.spark.graphx.impl.ShippableVertexPartition
Construct a ShippableVertexPartition from the given vertices with the specified routing table, filling in missing vertices mentioned in the routing table using defaultVal.
apply(Iterator<Tuple2<Object, VD>>, RoutingTablePartition, VD, Function2<VD, VD, VD>, ClassTag<VD>) - Static method in class org.apache.spark.graphx.impl.ShippableVertexPartition
Construct a ShippableVertexPartition from the given vertices with the specified routing table, filling in missing vertices mentioned in the routing table using defaultVal, and merging duplicate vertex attributes with mergeFunc.
apply(Iterator<Tuple2<Object, VD>>, ClassTag<VD>) - Static method in class org.apache.spark.graphx.impl.VertexPartition
Construct a `VertexPartition` from the given vertices.
apply(long) - Method in class org.apache.spark.graphx.impl.VertexPartitionBase
Return the vertex attribute for the given vertex ID.
apply(Graph<VD, ED>, A, int, EdgeDirection, Function3<Object, VD, A, VD>, Function1<EdgeTriplet<VD, ED>, Iterator<Tuple2<Object, A>>>, Function2<A, A, A>, ClassTag<VD>, ClassTag<ED>, ClassTag<A>) - Static method in class org.apache.spark.graphx.Pregel
Execute a Pregel-like iterative vertex-parallel abstraction.
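The classic single-source shortest-paths sketch, assuming a graph with Double edge weights named graph and a source vertex id named sourceId:

    import org.apache.spark.graphx._

    val init = graph.mapVertices((id, _) =>
      if (id == sourceId) 0.0 else Double.PositiveInfinity)
    val sssp = Pregel(init, Double.PositiveInfinity)(
      (id, dist, newDist) => math.min(dist, newDist),             // vertex program
      triplet =>                                                  // send improved distances
        if (triplet.srcAttr + triplet.attr < triplet.dstAttr)
          Iterator((triplet.dstId, triplet.srcAttr + triplet.attr))
        else Iterator.empty,
      (a, b) => math.min(a, b))                                   // merge messages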
apply(RDD<Tuple2<Object, VD>>, ClassTag<VD>) - Static method in class org.apache.spark.graphx.VertexRDD
Constructs a standalone VertexRDD (one that is not set up for efficient joins with an EdgeRDD) from an RDD of vertex-attribute pairs.
apply(RDD<Tuple2<Object, VD>>, EdgeRDD<?>, VD, ClassTag<VD>) - Static method in class org.apache.spark.graphx.VertexRDD
Constructs a VertexRDD from an RDD of vertex-attribute pairs.
apply(RDD<Tuple2<Object, VD>>, EdgeRDD<?>, VD, Function2<VD, VD, VD>, ClassTag<VD>) - Static method in class org.apache.spark.graphx.VertexRDD
Constructs a VertexRDD from an RDD of vertex-attribute pairs.
apply(Param<T>) - Method in class org.apache.spark.ml.param.ParamMap
Gets the value of the input param or its default value if it does not exist.
apply(BinaryConfusionMatrix) - Method in interface org.apache.spark.mllib.evaluation.binary.BinaryClassificationMetricComputer
 
apply(BinaryConfusionMatrix) - Static method in class org.apache.spark.mllib.evaluation.binary.FalsePositiveRate
 
apply(BinaryConfusionMatrix) - Method in class org.apache.spark.mllib.evaluation.binary.FMeasure
 
apply(BinaryConfusionMatrix) - Static method in class org.apache.spark.mllib.evaluation.binary.Precision
 
apply(BinaryConfusionMatrix) - Static method in class org.apache.spark.mllib.evaluation.binary.Recall
 
apply(int) - Method in class org.apache.spark.mllib.linalg.DenseMatrix
 
apply(int, int) - Method in class org.apache.spark.mllib.linalg.DenseMatrix
 
apply(int) - Method in class org.apache.spark.mllib.linalg.DenseVector
 
apply(int, int) - Method in interface org.apache.spark.mllib.linalg.Matrix
Gets the (i, j)-th element.
apply(int, int) - Method in class org.apache.spark.mllib.linalg.SparseMatrix
 
apply(int) - Method in interface org.apache.spark.mllib.linalg.Vector
Gets the value of the ith element.
apply(int, Predict, double, boolean) - Static method in class org.apache.spark.mllib.tree.model.Node
Construct a node with nodeIndex, predict, impurity and isLeaf parameters.
apply(String) - Static method in class org.apache.spark.rdd.PartitionGroup
 
apply(long, String, Option<String>, String) - Static method in class org.apache.spark.scheduler.AccumulableInfo
 
apply(long, String, String) - Static method in class org.apache.spark.scheduler.AccumulableInfo
 
apply(String, long, Enumeration.Value, ByteBuffer) - Method in class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.StatusUpdate$
Alternate factory method that takes a ByteBuffer directly for the data field
apply(BlockManagerId, long[]) - Static method in class org.apache.spark.scheduler.HighlyCompressedMapStatus
 
apply(long, TaskMetrics) - Static method in class org.apache.spark.scheduler.RuntimePercentage
 
apply(DataType) - Static method in class org.apache.spark.sql.columnar.ColumnType
 
apply(boolean, int, StorageLevel, SparkPlan, Option<String>) - Static method in class org.apache.spark.sql.columnar.InMemoryRelation
 
apply(SparkPlan) - Method in class org.apache.spark.sql.execution.AddExchange
 
apply(PythonUDF, LogicalPlan) - Static method in class org.apache.spark.sql.execution.EvaluatePython
 
apply(LogicalPlan) - Static method in class org.apache.spark.sql.execution.ExtractPythonUdfs
 
apply(LogicalPlan) - Method in class org.apache.spark.sql.execution.SparkStrategies.BasicOperators
 
apply(LogicalPlan) - Method in class org.apache.spark.sql.execution.SparkStrategies.BroadcastNestedLoopJoin
 
apply(LogicalPlan) - Method in class org.apache.spark.sql.execution.SparkStrategies.CartesianProduct
 
apply(LogicalPlan) - Method in class org.apache.spark.sql.execution.SparkStrategies.CommandStrategy
 
apply(LogicalPlan) - Method in class org.apache.spark.sql.execution.SparkStrategies.HashAggregation
 
apply(LogicalPlan) - Method in class org.apache.spark.sql.execution.SparkStrategies.HashJoin
 
apply(LogicalPlan) - Method in class org.apache.spark.sql.execution.SparkStrategies.InMemoryScans
 
apply(LogicalPlan) - Method in class org.apache.spark.sql.execution.SparkStrategies.LeftSemiJoin
 
apply(LogicalPlan) - Method in class org.apache.spark.sql.execution.SparkStrategies.ParquetOperations
 
apply(LogicalPlan) - Method in class org.apache.spark.sql.execution.SparkStrategies.TakeOrdered
 
apply(LogicalPlan) - Method in class org.apache.spark.sql.hive.HiveMetastoreCatalog.CreateTables
 
apply(LogicalPlan) - Method in class org.apache.spark.sql.hive.HiveMetastoreCatalog.PreInsertionCasts
 
apply(LogicalPlan) - Method in class org.apache.spark.sql.hive.HiveStrategies.DataSinks
 
apply(LogicalPlan) - Method in class org.apache.spark.sql.hive.HiveStrategies.HiveCommandStrategy
 
apply(LogicalPlan) - Method in class org.apache.spark.sql.hive.HiveStrategies.HiveTableScans
 
apply(LogicalPlan) - Method in class org.apache.spark.sql.hive.HiveStrategies.ParquetConversion
 
apply(LogicalPlan) - Method in class org.apache.spark.sql.hive.HiveStrategies.Scripts
 
apply(LogicalPlan) - Static method in class org.apache.spark.sql.sources.DataSourceStrategy
 
apply(String) - Method in class org.apache.spark.sql.sources.DDLParser
 
apply(String) - Static method in class org.apache.spark.storage.BlockId
Converts a BlockId "name" String back into a BlockId.
apply(String, String, int) - Static method in class org.apache.spark.storage.BlockManagerId
Returns a BlockManagerId for the given configuration.
apply(ObjectInput) - Static method in class org.apache.spark.storage.BlockManagerId
 
apply(boolean, boolean, boolean, boolean, int) - Static method in class org.apache.spark.storage.StorageLevel
:: DeveloperApi :: Create a new StorageLevel object without setting useOffHeap.
apply(boolean, boolean, boolean, int) - Static method in class org.apache.spark.storage.StorageLevel
:: DeveloperApi :: Create a new StorageLevel object.
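For example, a custom level equivalent in spirit to MEMORY_AND_DISK_SER_2, assuming an RDD named rdd:

    import org.apache.spark.storage.StorageLevel

    val level = StorageLevel(useDisk = true, useMemory = true,
      deserialized = false, replication = 2)
    rdd.persist(level)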
apply(int, int) - Static method in class org.apache.spark.storage.StorageLevel
:: DeveloperApi :: Create a new StorageLevel object from its integer representation.
apply(ObjectInput) - Static method in class org.apache.spark.storage.StorageLevel
:: DeveloperApi :: Read StorageLevel object from ObjectInput stream.
apply(long) - Static method in class org.apache.spark.streaming.Milliseconds
 
apply(long) - Static method in class org.apache.spark.streaming.Minutes
 
apply(long) - Static method in class org.apache.spark.streaming.Seconds
 
apply(I, Function0<BoxedUnit>) - Static method in class org.apache.spark.util.CompletionIterator
 
apply(Traversable<Object>) - Static method in class org.apache.spark.util.Distribution
 
apply(InputStream, File, SparkConf) - Static method in class org.apache.spark.util.logging.FileAppender
Create the right appender based on Spark configuration
apply(TraversableOnce<Object>) - Static method in class org.apache.spark.util.StatCounter
Build a StatCounter from a list of values.
apply(Seq<Object>) - Static method in class org.apache.spark.util.StatCounter
Build a StatCounter from a list of values passed as variable-length arguments.
apply(A) - Method in class org.apache.spark.util.TimeStampedHashMap
 
apply(A) - Method in class org.apache.spark.util.TimeStampedWeakValueHashMap
 
apply(int) - Method in class org.apache.spark.util.Vector
 
applySchema(JavaRDD<?>, Class<?>) - Method in class org.apache.spark.sql.api.java.JavaSQLContext
Applies a schema to an RDD of Java Beans.
applySchema(JavaRDD<Row>, StructType) - Method in class org.apache.spark.sql.api.java.JavaSQLContext
:: DeveloperApi :: Creates a JavaSchemaRDD from an RDD containing Rows by applying a schema to this RDD.
applySchema(RDD<Row>, StructType) - Method in class org.apache.spark.sql.SQLContext
:: DeveloperApi :: Creates a SchemaRDD from an RDD containing Rows by applying a schema to this RDD.
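A minimal sketch, assuming a SQLContext named sqlContext and a SparkContext named sc:

    import org.apache.spark.sql._

    val rowRDD = sc.parallelize(Seq(Row("alice", 30), Row("bob", 25)))
    val schema = StructType(Seq(
      StructField("name", StringType, false),
      StructField("age", IntegerType, false)))
    val people = sqlContext.applySchema(rowRDD, schema)
    people.registerTempTable("people")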
applySchemaToPythonRDD(RDD<Object[]>, String) - Method in class org.apache.spark.sql.SQLContext
Apply a schema defined by the schemaString to an RDD.
applySchemaToPythonRDD(RDD<Object[]>, StructType) - Method in class org.apache.spark.sql.SQLContext
Apply a schema defined by the schema to an RDD.
appName() - Method in class org.apache.spark.api.java.JavaSparkContext
 
appName() - Method in class org.apache.spark.scheduler.ApplicationEventListener
 
appName() - Method in class org.apache.spark.scheduler.SparkListenerApplicationStart
 
appName() - Method in class org.apache.spark.SparkContext
 
appName() - Method in class org.apache.spark.ui.SparkUI
 
appName() - Method in class org.apache.spark.ui.SparkUITab
 
ApproxHist() - Static method in class org.apache.spark.mllib.tree.configuration.QuantileStrategy
 
ApproximateActionListener<T,U,R> - Class in org.apache.spark.partial
A JobListener for an approximate single-result action, such as count() or non-parallel reduce().
ApproximateActionListener(RDD<T>, Function2<TaskContext, Iterator<T>, U>, ApproximateEvaluator<U, R>, long) - Constructor for class org.apache.spark.partial.ApproximateActionListener
 
ApproximateEvaluator<U,R> - Interface in org.apache.spark.partial
An object that computes a function incrementally by merging in results of type U from multiple tasks.
appUIAddress() - Method in class org.apache.spark.ui.SparkUI
 
appUIHostPort() - Method in class org.apache.spark.ui.SparkUI
Return the application UI host:port.
AreaUnderCurve - Class in org.apache.spark.mllib.evaluation
Computes the area under the curve (AUC) using the trapezoidal rule.
AreaUnderCurve() - Constructor for class org.apache.spark.mllib.evaluation.AreaUnderCurve
 
areaUnderPR() - Method in class org.apache.spark.mllib.evaluation.BinaryClassificationMetrics
Computes the area under the precision-recall curve.
areaUnderROC() - Method in class org.apache.spark.mllib.evaluation.BinaryClassificationMetrics
Computes the area under the receiver operating characteristic (ROC) curve.
areBoundsEmpty() - Method in class org.apache.spark.util.random.AcceptanceResult
 
argString() - Method in class org.apache.spark.sql.hive.execution.CreateTableAsSelect
 
arr() - Method in class org.apache.spark.rdd.PartitionGroup
 
ARRAY() - Static method in class org.apache.spark.sql.hive.HiveQl
 
ARRAY_CONTAINS_NULL_BAG_SCHEMA_NAME() - Static method in class org.apache.spark.sql.parquet.CatalystConverter
 
ARRAY_ELEMENTS_SCHEMA_NAME() - Static method in class org.apache.spark.sql.parquet.CatalystConverter
 
arrayBuffer() - Method in class org.apache.spark.streaming.receiver.ArrayBufferBlock
 
ArrayBufferBlock - Class in org.apache.spark.streaming.receiver
Class representing a block received as an ArrayBuffer.
ArrayBufferBlock(ArrayBuffer<?>) - Constructor for class org.apache.spark.streaming.receiver.ArrayBufferBlock
 
ArrayType - Class in org.apache.spark.sql.api.java
The data type representing Lists.
ArrayValues - Class in org.apache.spark.storage
 
ArrayValues(Object[]) - Constructor for class org.apache.spark.storage.ArrayValues
 
as(Symbol) - Method in class org.apache.spark.sql.SchemaRDD
Applies a qualifier to the attributes of this relation.
asIterator() - Method in class org.apache.spark.serializer.DeserializationStream
Read the elements of this stream through an iterator.
asJavaDataType(DataType) - Static method in class org.apache.spark.sql.types.util.DataTypeConversions
Returns the equivalent DataType in Java for the given DataType in Scala.
asJavaStructField(StructField) - Static method in class org.apache.spark.sql.types.util.DataTypeConversions
Returns the equivalent StructField in Java for the given StructField in Scala.
askSlaves() - Method in class org.apache.spark.storage.BlockManagerMessages.GetBlockStatus
 
askSlaves() - Method in class org.apache.spark.storage.BlockManagerMessages.GetMatchingBlockIds
 
askTimeout(SparkConf) - Static method in class org.apache.spark.util.AkkaUtils
Returns the default Spark timeout to use for Akka ask operations.
askWithReply(Object, ActorRef, FiniteDuration) - Static method in class org.apache.spark.util.AkkaUtils
Send a message to the given actor and get its result within a default timeout, or throw a SparkException if this fails.
askWithReply(Object, ActorRef, int, int, FiniteDuration) - Static method in class org.apache.spark.util.AkkaUtils
Send a message to the given actor and get its result within a default timeout, or throw a SparkException if this fails even after the specified number of retries.
asRDDId() - Method in class org.apache.spark.storage.BlockId
 
asScalaDataType(DataType) - Static method in class org.apache.spark.sql.types.util.DataTypeConversions
Returns the equivalent DataType in Scala for the given DataType in Java.
asScalaStructField(StructField) - Static method in class org.apache.spark.sql.types.util.DataTypeConversions
Returns the equivalent StructField in Scala for the given StructField in Java.
assertValid() - Method in class org.apache.spark.mllib.tree.configuration.BoostingStrategy
Check validity of parameters.
assertValid() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
Check validity of parameters.
assertValid() - Method in class org.apache.spark.rdd.BlockRDD
Check if this BlockRDD is valid.
AsyncRDDActions<T> - Class in org.apache.spark.rdd
A set of asynchronous RDD actions available through an implicit conversion.
AsyncRDDActions(RDD<T>, ClassTag<T>) - Constructor for class org.apache.spark.rdd.AsyncRDDActions
 
attachExecutor(ReceiverSupervisor) - Method in class org.apache.spark.streaming.receiver.Receiver
Attach a network receiver executor to this receiver.
attachHandler(ServletContextHandler) - Method in class org.apache.spark.ui.WebUI
Attach a handler to this UI.
attachListener(CleanerListener) - Method in class org.apache.spark.ContextCleaner
Attach a listener object to be notified when objects are cleaned.
attachPage(WebUIPage) - Method in class org.apache.spark.ui.WebUI
Attach a page to this UI.
attachPage(WebUIPage) - Method in class org.apache.spark.ui.WebUITab
Attach a page to this tab.
attachTab(WebUITab) - Method in class org.apache.spark.ui.WebUI
Attach a tab to this UI, along with all of its attached pages.
attempt() - Method in class org.apache.spark.scheduler.TaskInfo
 
attempt() - Method in class org.apache.spark.scheduler.TaskSet
 
attemptId() - Method in class org.apache.spark.scheduler.Stage
 
attemptId() - Method in class org.apache.spark.scheduler.StageInfo
 
attemptId() - Method in class org.apache.spark.TaskContext
 
attemptId() - Method in class org.apache.spark.TaskContextImpl
 
attr() - Method in class org.apache.spark.graphx.Edge
 
attr() - Method in class org.apache.spark.graphx.EdgeContext
The attribute associated with the edge.
attr() - Method in class org.apache.spark.graphx.impl.AggregatingEdgeContext
 
attr() - Method in class org.apache.spark.graphx.impl.EdgeWithLocalIds
 
attribute() - Method in class org.apache.spark.sql.sources.EqualTo
 
attribute() - Method in class org.apache.spark.sql.sources.GreaterThan
 
attribute() - Method in class org.apache.spark.sql.sources.GreaterThanOrEqual
 
attribute() - Method in class org.apache.spark.sql.sources.In
 
attribute() - Method in class org.apache.spark.sql.sources.LessThan
 
attribute() - Method in class org.apache.spark.sql.sources.LessThanOrEqual
 
attributeMap() - Method in class org.apache.spark.sql.hive.MetastoreRelation
An attribute map that can be used to look up original attributes based on expression id.
attributeMap() - Method in class org.apache.spark.sql.parquet.ParquetRelation
 
attributeMap() - Method in class org.apache.spark.sql.sources.LogicalRelation
Used to look up the original attribute capitalization.
attributes() - Method in class org.apache.spark.sql.columnar.InMemoryColumnarTableScan
 
attributes() - Method in class org.apache.spark.sql.hive.execution.HiveTableScan
 
attributes() - Method in class org.apache.spark.sql.hive.MetastoreRelation
Non-partitionKey attributes
attributes() - Method in class org.apache.spark.sql.parquet.ParquetTableScan
 
attributes() - Method in class org.apache.spark.sql.parquet.RowWriteSupport
 
attrs() - Method in class org.apache.spark.graphx.impl.VertexAttributeBlock
 
autoBroadcastJoinThreshold() - Method in interface org.apache.spark.sql.SQLConf
Upper bound on the size (in bytes) of tables that qualify for automatic conversion to a broadcast value during the physical execution of join operations.
Average() - Static method in class org.apache.spark.mllib.tree.configuration.EnsembleCombiningStrategy
 
AVG() - Static method in class org.apache.spark.sql.hive.HiveQl
 
awaitResult() - Method in class org.apache.spark.partial.ApproximateActionListener
Waits for up to timeout milliseconds since the listener was created and then returns a PartialResult with the result so far.
awaitResult() - Method in class org.apache.spark.scheduler.JobWaiter
 
awaitTermination() - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
Wait for the execution to stop.
awaitTermination(long) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
Wait for the execution to stop.
awaitTermination() - Method in class org.apache.spark.streaming.receiver.ReceiverSupervisor
Block the calling thread until the supervisor is stopped.
awaitTermination() - Method in class org.apache.spark.streaming.StreamingContext
Wait for the execution to stop.
awaitTermination(long) - Method in class org.apache.spark.streaming.StreamingContext
Wait for the execution to stop.
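The typical driver shape, assuming a configured StreamingContext named ssc:

    ssc.start()               // start receiving and processing data
    ssc.awaitTermination()    // block until stop() is called or an error occurs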
awaitTermination() - Method in class org.apache.spark.util.logging.FileAppender
Wait for the appender to stop appending, either because the input stream is closed or because of an error in appending.
axpy(double, Vector, Vector) - Static method in class org.apache.spark.mllib.linalg.BLAS
y += a * x

B

backend() - Method in class org.apache.spark.scheduler.TaskSchedulerImpl
 
BaggedPoint<Datum> - Class in org.apache.spark.mllib.tree.impl
Internal representation of a datapoint which belongs to several subsamples of the same dataset, particularly for bagging (e.g., for random forests).
BaggedPoint(Datum, double[]) - Constructor for class org.apache.spark.mllib.tree.impl.BaggedPoint
 
base() - Method in class org.apache.spark.sql.hive.HiveUdafFunction
 
baseDir() - Method in class org.apache.spark.HttpFileServer
 
baseLogicalPlan() - Method in class org.apache.spark.sql.api.java.JavaSchemaRDD
 
baseLogicalPlan() - Method in class org.apache.spark.sql.SchemaRDD
 
baseLogicalPlan() - Method in interface org.apache.spark.sql.SchemaRDDLike
 
baseOn(ParamPair<?>...) - Method in class org.apache.spark.ml.tuning.ParamGridBuilder
Sets the given parameters in this grid to fixed values.
baseOn(ParamMap) - Method in class org.apache.spark.ml.tuning.ParamGridBuilder
Sets the given parameters in this grid to fixed values.
baseOn(Seq<ParamPair<?>>) - Method in class org.apache.spark.ml.tuning.ParamGridBuilder
Sets the given parameters in this grid to fixed values.
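A minimal sketch of pinning one parameter with baseOn while varying another with addGrid, assuming the 1.2 ml API in which Param.w(value) wraps a value into a ParamPair:

    import org.apache.spark.ml.classification.LogisticRegression
    import org.apache.spark.ml.tuning.ParamGridBuilder

    val lr = new LogisticRegression()
    val paramGrid = new ParamGridBuilder()
      .baseOn(lr.maxIter.w(50))                     // fixed in every combination
      .addGrid(lr.regParam, Array(0.01, 0.1, 1.0))  // varied across combinations
      .build()                                      // Array[ParamMap] with 3 entries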
basePath() - Method in class org.apache.spark.ui.SparkUI
 
basePath() - Method in class org.apache.spark.ui.WebUITab
 
BaseRelation - Class in org.apache.spark.sql.sources
:: DeveloperApi :: Represents a collection of tuples with a known schema.
BaseRelation() - Constructor for class org.apache.spark.sql.sources.BaseRelation
 
baseRelationToSchemaRDD(BaseRelation) - Method in class org.apache.spark.sql.api.java.JavaSQLContext
 
baseRelationToSchemaRDD(BaseRelation) - Method in class org.apache.spark.sql.SQLContext
 
baseSchemaRDD() - Method in class org.apache.spark.sql.api.java.JavaSchemaRDD
 
baseSchemaRDD() - Method in class org.apache.spark.sql.SchemaRDD
 
baseSchemaRDD() - Method in interface org.apache.spark.sql.SchemaRDDLike
 
BasicColumnAccessor<T extends org.apache.spark.sql.catalyst.types.DataType,JvmType> - Class in org.apache.spark.sql.columnar
 
BasicColumnAccessor(ByteBuffer, ColumnType<T, JvmType>) - Constructor for class org.apache.spark.sql.columnar.BasicColumnAccessor
 
BasicColumnBuilder<T extends org.apache.spark.sql.catalyst.types.DataType,JvmType> - Class in org.apache.spark.sql.columnar
 
BasicColumnBuilder(ColumnStats, ColumnType<T, JvmType>) - Constructor for class org.apache.spark.sql.columnar.BasicColumnBuilder
 
BasicOperators() - Method in class org.apache.spark.sql.execution.SparkStrategies
 
basicSparkPage(Function0<Seq<Node>>, String) - Static method in class org.apache.spark.ui.UIUtils
Returns a page with the Spark CSS/JS and a simple format.
BatchAllocationEvent - Class in org.apache.spark.streaming.scheduler
 
BatchAllocationEvent(Time, AllocatedBlocks) - Constructor for class org.apache.spark.streaming.scheduler.BatchAllocationEvent
 
BatchCleanupEvent - Class in org.apache.spark.streaming.scheduler
 
BatchCleanupEvent(Seq<Time>) - Constructor for class org.apache.spark.streaming.scheduler.BatchCleanupEvent
 
batchDuration() - Method in class org.apache.spark.streaming.DStreamGraph
 
batchDuration() - Method in class org.apache.spark.streaming.ui.StreamingJobProgressListener
 
BATCHES() - Static method in class org.apache.spark.mllib.clustering.StreamingKMeans
 
BatchInfo - Class in org.apache.spark.streaming.scheduler
:: DeveloperApi :: Class containing information about completed batches.
BatchInfo(Time, Map<Object, ReceivedBlockInfo[]>, long, Option<Object>, Option<Object>) - Constructor for class org.apache.spark.streaming.scheduler.BatchInfo
 
batchInfo() - Method in class org.apache.spark.streaming.scheduler.StreamingListenerBatchCompleted
 
batchInfo() - Method in class org.apache.spark.streaming.scheduler.StreamingListenerBatchStarted
 
batchInfo() - Method in class org.apache.spark.streaming.scheduler.StreamingListenerBatchSubmitted
 
batchInfos() - Method in class org.apache.spark.streaming.scheduler.StatsReportListener
 
BatchPythonEvaluation - Class in org.apache.spark.sql.execution
:: DeveloperApi :: Uses PythonRDD to evaluate a PythonUDF, one partition of tuples at a time.
BatchPythonEvaluation(PythonUDF, Seq<Attribute>, SparkPlan) - Constructor for class org.apache.spark.sql.execution.BatchPythonEvaluation
 
batchSize() - Method in class org.apache.spark.sql.columnar.InMemoryRelation
 
batchTime() - Method in class org.apache.spark.streaming.scheduler.BatchInfo
 
batchTimeToSelectedFiles() - Method in class org.apache.spark.streaming.dstream.FileInputDStream
 
BeginEvent - Class in org.apache.spark.scheduler
 
BeginEvent(Task<?>, TaskInfo) - Constructor for class org.apache.spark.scheduler.BeginEvent
 
beginTime() - Method in class org.apache.spark.streaming.Interval
 
benchmark(int) - Static method in class org.apache.spark.util.random.XORShiftRandom
 
BernoulliCellSampler<T> - Class in org.apache.spark.util.random
:: DeveloperApi :: A sampler based on Bernoulli trials for partitioning a data sequence.
BernoulliCellSampler(double, double, boolean) - Constructor for class org.apache.spark.util.random.BernoulliCellSampler
 
BernoulliSampler<T> - Class in org.apache.spark.util.random
:: DeveloperApi :: A sampler based on Bernoulli trials.
BernoulliSampler(double, ClassTag<T>) - Constructor for class org.apache.spark.util.random.BernoulliSampler
 
bestModel() - Method in class org.apache.spark.ml.tuning.CrossValidatorModel
 
beta() - Method in class org.apache.spark.mllib.evaluation.binary.FMeasure
 
BETWEEN() - Static method in class org.apache.spark.sql.hive.HiveQl
 
BigDecimalSerializer - Class in org.apache.spark.sql.execution
 
BigDecimalSerializer() - Constructor for class org.apache.spark.sql.execution.BigDecimalSerializer
 
Bin - Class in org.apache.spark.mllib.tree.model
Used for "binning" the feature values for faster best split calculation.
Bin(Split, Split, Enumeration.Value, double) - Constructor for class org.apache.spark.mllib.tree.model.Bin
 
BINARY - Class in org.apache.spark.sql.columnar
 
BINARY() - Constructor for class org.apache.spark.sql.columnar.BINARY
 
BinaryClassificationEvaluator - Class in org.apache.spark.ml.evaluation
:: AlphaComponent :: Evaluator for binary classification, which expects two input columns: score and label.
BinaryClassificationEvaluator() - Constructor for class org.apache.spark.ml.evaluation.BinaryClassificationEvaluator
 
BinaryClassificationMetricComputer - Interface in org.apache.spark.mllib.evaluation.binary
Trait for a binary classification evaluation metric computer.
BinaryClassificationMetrics - Class in org.apache.spark.mllib.evaluation
:: Experimental :: Evaluator for binary classification.
BinaryClassificationMetrics(RDD<Tuple2<Object, Object>>) - Constructor for class org.apache.spark.mllib.evaluation.BinaryClassificationMetrics
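A minimal sketch, assuming an existing SparkContext sc; the constructor takes an RDD of (score, label) pairs with labels in {0, 1}:

    import org.apache.spark.mllib.evaluation.BinaryClassificationMetrics

    val scoreAndLabels = sc.parallelize(Seq((0.9, 1.0), (0.8, 1.0), (0.4, 0.0), (0.1, 0.0)))
    val metrics = new BinaryClassificationMetrics(scoreAndLabels)
    println(metrics.areaUnderROC())  // area under the ROC curve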
 
BinaryColumnAccessor - Class in org.apache.spark.sql.columnar
 
BinaryColumnAccessor(ByteBuffer) - Constructor for class org.apache.spark.sql.columnar.BinaryColumnAccessor
 
BinaryColumnBuilder - Class in org.apache.spark.sql.columnar
 
BinaryColumnBuilder() - Constructor for class org.apache.spark.sql.columnar.BinaryColumnBuilder
 
BinaryColumnStats - Class in org.apache.spark.sql.columnar
 
BinaryColumnStats() - Constructor for class org.apache.spark.sql.columnar.BinaryColumnStats
 
BinaryConfusionMatrix - Interface in org.apache.spark.mllib.evaluation.binary
Trait for a binary confusion matrix.
BinaryConfusionMatrixImpl - Class in org.apache.spark.mllib.evaluation.binary
Implementation of BinaryConfusionMatrix.
BinaryConfusionMatrixImpl(BinaryLabelCounter, BinaryLabelCounter) - Constructor for class org.apache.spark.mllib.evaluation.binary.BinaryConfusionMatrixImpl
 
BinaryFileRDD<T> - Class in org.apache.spark.rdd
 
BinaryFileRDD(SparkContext, Class<? extends StreamFileInputFormat<T>>, Class<String>, Class<T>, Configuration, int) - Constructor for class org.apache.spark.rdd.BinaryFileRDD
 
binaryFiles(String, int) - Method in class org.apache.spark.api.java.JavaSparkContext
Read a directory of binary files from HDFS, a local file system (available on all nodes), or any Hadoop-supported file system URI as a byte array.
binaryFiles(String) - Method in class org.apache.spark.api.java.JavaSparkContext
:: Experimental ::
binaryFiles(String, int) - Method in class org.apache.spark.SparkContext
:: Experimental ::
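A minimal sketch, assuming an existing SparkContext sc and a hypothetical directory path; each record pairs a file path with a lazily read PortableDataStream:

    val files = sc.binaryFiles("hdfs://namenode/data/images")  // hypothetical path
    val sizes = files.map { case (path, stream) =>
      (path, stream.toArray().length)  // materialize this file's bytes and count them
    }.collect()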
BinaryLabelCounter - Class in org.apache.spark.mllib.evaluation.binary
A counter for positives and negatives.
BinaryLabelCounter(long, long) - Constructor for class org.apache.spark.mllib.evaluation.binary.BinaryLabelCounter
 
binaryLabelValidator() - Static method in class org.apache.spark.mllib.util.DataValidators
Function to check if labels used for classification are either zero or one.
BinaryNode - Interface in org.apache.spark.sql.execution
 
binaryRecords(String, int) - Method in class org.apache.spark.api.java.JavaSparkContext
:: Experimental ::
binaryRecords(String, int, Configuration) - Method in class org.apache.spark.SparkContext
:: Experimental ::
BinaryType - Class in org.apache.spark.sql.api.java
The data type representing byte[] values.
BinaryType - Static variable in class org.apache.spark.sql.api.java.DataType
Gets the BinaryType object.
bind() - Method in class org.apache.spark.ui.WebUI
Bind to the HTTP server behind this web interface.
binnedFeatures() - Method in class org.apache.spark.mllib.tree.impl.TreePoint
 
BinomialBounds - Class in org.apache.spark.util.random
Utility functions that help us determine bounds on adjusted sampling rate to guarantee exact sample size with high confidence when sampling without replacement.
BinomialBounds() - Constructor for class org.apache.spark.util.random.BinomialBounds
 
BITS_PER_LONG() - Static method in class org.apache.spark.sql.columnar.compression.BooleanBitSet
 
BLAS - Class in org.apache.spark.mllib.linalg
BLAS routines for MLlib's vectors and matrices.
BLAS() - Constructor for class org.apache.spark.mllib.linalg.BLAS
 
BLOCK_MANAGER() - Static method in class org.apache.spark.util.MetadataCleanerType
 
BlockAdditionEvent - Class in org.apache.spark.streaming.scheduler
 
BlockAdditionEvent(ReceivedBlockInfo) - Constructor for class org.apache.spark.streaming.scheduler.BlockAdditionEvent
 
BlockException - Exception in org.apache.spark.storage
 
BlockException(BlockId, String) - Constructor for exception org.apache.spark.storage.BlockException
 
BlockGenerator - Class in org.apache.spark.streaming.receiver
Generates batches of objects received by a Receiver and puts them into appropriately named blocks at regular intervals.
BlockGenerator(BlockGeneratorListener, int, SparkConf) - Constructor for class org.apache.spark.streaming.receiver.BlockGenerator
 
BlockGeneratorListener - Interface in org.apache.spark.streaming.receiver
Listener object for BlockGenerator events.
blockId() - Method in class org.apache.spark.rdd.BlockRDDPartition
 
blockId() - Method in class org.apache.spark.scheduler.IndirectTaskResult
 
blockId() - Method in exception org.apache.spark.storage.BlockException
 
BlockId - Class in org.apache.spark.storage
:: DeveloperApi :: Identifies a particular Block of data, usually associated with a single file.
BlockId() - Constructor for class org.apache.spark.storage.BlockId
 
blockId() - Method in class org.apache.spark.storage.BlockManagerMessages.GetBlockStatus
 
blockId() - Method in class org.apache.spark.storage.BlockManagerMessages.GetLocations
 
blockId() - Method in class org.apache.spark.storage.BlockManagerMessages.RemoveBlock
 
blockId() - Method in class org.apache.spark.storage.BlockManagerMessages.UpdateBlockInfo
 
blockId() - Method in class org.apache.spark.storage.BlockObjectWriter
 
blockId() - Method in class org.apache.spark.storage.ShuffleBlockFetcherIterator.FailureFetchResult
 
blockId() - Method in interface org.apache.spark.storage.ShuffleBlockFetcherIterator.FetchResult
 
blockId() - Method in class org.apache.spark.storage.ShuffleBlockFetcherIterator.SuccessFetchResult
 
blockId() - Method in class org.apache.spark.streaming.rdd.WriteAheadLogBackedBlockRDDPartition
 
blockId() - Method in class org.apache.spark.streaming.receiver.BlockManagerBasedStoreResult
 
blockId() - Method in interface org.apache.spark.streaming.receiver.ReceivedBlockStoreResult
 
blockId() - Method in class org.apache.spark.streaming.receiver.WriteAheadLogBasedStoreResult
 
blockIds() - Method in class org.apache.spark.rdd.BlockRDD
 
blockIds() - Method in class org.apache.spark.storage.BlockManagerMessages.GetLocationsMultipleBlockIds
 
blockIdsToBlockManagers(BlockId[], SparkEnv, BlockManagerMaster) - Static method in class org.apache.spark.storage.BlockManager
 
blockIdsToExecutorIds(BlockId[], SparkEnv, BlockManagerMaster) - Static method in class org.apache.spark.storage.BlockManager
 
blockIdsToHosts(BlockId[], SparkEnv, BlockManagerMaster) - Static method in class org.apache.spark.storage.BlockManager
 
blockifyObject(T, int, Serializer, Option<CompressionCodec>, ClassTag<T>) - Static method in class org.apache.spark.broadcast.TorrentBroadcast
 
BlockInfo - Class in org.apache.spark.storage
 
BlockInfo(StorageLevel, boolean) - Constructor for class org.apache.spark.storage.BlockInfo
 
blockManager() - Method in class org.apache.spark.SparkEnv
 
BlockManager - Class in org.apache.spark.storage
Manager running on every node (driver and executors) which provides interfaces for putting and retrieving blocks both locally and remotely into various stores (memory, disk, and off-heap).
BlockManager(String, ActorSystem, BlockManagerMaster, Serializer, long, SparkConf, MapOutputTracker, ShuffleManager, BlockTransferService, SecurityManager, int) - Constructor for class org.apache.spark.storage.BlockManager
 
BlockManager(String, ActorSystem, BlockManagerMaster, Serializer, SparkConf, MapOutputTracker, ShuffleManager, BlockTransferService, SecurityManager, int) - Constructor for class org.apache.spark.storage.BlockManager
Construct a BlockManager with a memory limit set based on system properties.
blockManager() - Method in class org.apache.spark.storage.BlockManagerSource
 
blockManager() - Method in class org.apache.spark.storage.BlockStore
 
blockManagerAddedFromJson(JsonAST.JValue) - Static method in class org.apache.spark.util.JsonProtocol
 
blockManagerAddedToJson(SparkListenerBlockManagerAdded) - Static method in class org.apache.spark.util.JsonProtocol
 
BlockManagerBasedBlockHandler - Class in org.apache.spark.streaming.receiver
Implementation of a ReceivedBlockHandler which stores the received blocks into a block manager with the specified storage level.
BlockManagerBasedBlockHandler(BlockManager, StorageLevel) - Constructor for class org.apache.spark.streaming.receiver.BlockManagerBasedBlockHandler
 
BlockManagerBasedStoreResult - Class in org.apache.spark.streaming.receiver
Implementation of ReceivedBlockStoreResult that stores the metadata related to storage of blocks using BlockManagerBasedBlockHandler.
BlockManagerBasedStoreResult(StreamBlockId) - Constructor for class org.apache.spark.streaming.receiver.BlockManagerBasedStoreResult
 
blockManagerId() - Method in class org.apache.spark.Heartbeat
 
blockManagerId() - Method in class org.apache.spark.scheduler.SparkListenerBlockManagerAdded
 
blockManagerId() - Method in class org.apache.spark.scheduler.SparkListenerBlockManagerRemoved
 
blockManagerId() - Method in class org.apache.spark.storage.BlockManager
 
BlockManagerId - Class in org.apache.spark.storage
:: DeveloperApi :: This class represents a unique identifier for a BlockManager.
blockManagerId() - Method in class org.apache.spark.storage.BlockManagerInfo
 
blockManagerId() - Method in class org.apache.spark.storage.BlockManagerMessages.BlockManagerHeartbeat
 
blockManagerId() - Method in class org.apache.spark.storage.BlockManagerMessages.GetPeers
 
blockManagerId() - Method in class org.apache.spark.storage.BlockManagerMessages.RegisterBlockManager
 
blockManagerId() - Method in class org.apache.spark.storage.BlockManagerMessages.UpdateBlockInfo
 
blockManagerId() - Method in class org.apache.spark.storage.StorageStatus
 
blockManagerIdCache() - Static method in class org.apache.spark.storage.BlockManagerId
 
blockManagerIdFromJson(JsonAST.JValue) - Static method in class org.apache.spark.util.JsonProtocol
 
blockManagerIds() - Method in class org.apache.spark.ui.jobs.JobProgressListener
 
blockManagerIdToJson(BlockManagerId) - Static method in class org.apache.spark.util.JsonProtocol
 
BlockManagerInfo - Class in org.apache.spark.storage
 
BlockManagerInfo(BlockManagerId, long, long, ActorRef) - Constructor for class org.apache.spark.storage.BlockManagerInfo
 
BlockManagerMaster - Class in org.apache.spark.storage
 
BlockManagerMaster(ActorRef, SparkConf, boolean) - Constructor for class org.apache.spark.storage.BlockManagerMaster
 
BlockManagerMasterActor - Class in org.apache.spark.storage
BlockManagerMasterActor is an actor on the master node to track statuses of all slaves' block managers.
BlockManagerMasterActor(boolean, SparkConf, LiveListenerBus) - Constructor for class org.apache.spark.storage.BlockManagerMasterActor
 
BlockManagerMessages - Class in org.apache.spark.storage
 
BlockManagerMessages() - Constructor for class org.apache.spark.storage.BlockManagerMessages
 
BlockManagerMessages.BlockManagerHeartbeat - Class in org.apache.spark.storage
 
BlockManagerMessages.BlockManagerHeartbeat(BlockManagerId) - Constructor for class org.apache.spark.storage.BlockManagerMessages.BlockManagerHeartbeat
 
BlockManagerMessages.BlockManagerHeartbeat$ - Class in org.apache.spark.storage
 
BlockManagerMessages.BlockManagerHeartbeat$() - Constructor for class org.apache.spark.storage.BlockManagerMessages.BlockManagerHeartbeat$
 
BlockManagerMessages.ExpireDeadHosts$ - Class in org.apache.spark.storage
 
BlockManagerMessages.ExpireDeadHosts$() - Constructor for class org.apache.spark.storage.BlockManagerMessages.ExpireDeadHosts$
 
BlockManagerMessages.GetActorSystemHostPortForExecutor - Class in org.apache.spark.storage
 
BlockManagerMessages.GetActorSystemHostPortForExecutor(String) - Constructor for class org.apache.spark.storage.BlockManagerMessages.GetActorSystemHostPortForExecutor
 
BlockManagerMessages.GetActorSystemHostPortForExecutor$ - Class in org.apache.spark.storage
 
BlockManagerMessages.GetActorSystemHostPortForExecutor$() - Constructor for class org.apache.spark.storage.BlockManagerMessages.GetActorSystemHostPortForExecutor$
 
BlockManagerMessages.GetBlockStatus - Class in org.apache.spark.storage
 
BlockManagerMessages.GetBlockStatus(BlockId, boolean) - Constructor for class org.apache.spark.storage.BlockManagerMessages.GetBlockStatus
 
BlockManagerMessages.GetBlockStatus$ - Class in org.apache.spark.storage
 
BlockManagerMessages.GetBlockStatus$() - Constructor for class org.apache.spark.storage.BlockManagerMessages.GetBlockStatus$
 
BlockManagerMessages.GetLocations - Class in org.apache.spark.storage
 
BlockManagerMessages.GetLocations(BlockId) - Constructor for class org.apache.spark.storage.BlockManagerMessages.GetLocations
 
BlockManagerMessages.GetLocations$ - Class in org.apache.spark.storage
 
BlockManagerMessages.GetLocations$() - Constructor for class org.apache.spark.storage.BlockManagerMessages.GetLocations$
 
BlockManagerMessages.GetLocationsMultipleBlockIds - Class in org.apache.spark.storage
 
BlockManagerMessages.GetLocationsMultipleBlockIds(BlockId[]) - Constructor for class org.apache.spark.storage.BlockManagerMessages.GetLocationsMultipleBlockIds
 
BlockManagerMessages.GetLocationsMultipleBlockIds$ - Class in org.apache.spark.storage
 
BlockManagerMessages.GetLocationsMultipleBlockIds$() - Constructor for class org.apache.spark.storage.BlockManagerMessages.GetLocationsMultipleBlockIds$
 
BlockManagerMessages.GetMatchingBlockIds - Class in org.apache.spark.storage
 
BlockManagerMessages.GetMatchingBlockIds(Function1<BlockId, Object>, boolean) - Constructor for class org.apache.spark.storage.BlockManagerMessages.GetMatchingBlockIds
 
BlockManagerMessages.GetMatchingBlockIds$ - Class in org.apache.spark.storage
 
BlockManagerMessages.GetMatchingBlockIds$() - Constructor for class org.apache.spark.storage.BlockManagerMessages.GetMatchingBlockIds$
 
BlockManagerMessages.GetMemoryStatus$ - Class in org.apache.spark.storage
 
BlockManagerMessages.GetMemoryStatus$() - Constructor for class org.apache.spark.storage.BlockManagerMessages.GetMemoryStatus$
 
BlockManagerMessages.GetPeers - Class in org.apache.spark.storage
 
BlockManagerMessages.GetPeers(BlockManagerId) - Constructor for class org.apache.spark.storage.BlockManagerMessages.GetPeers
 
BlockManagerMessages.GetPeers$ - Class in org.apache.spark.storage
 
BlockManagerMessages.GetPeers$() - Constructor for class org.apache.spark.storage.BlockManagerMessages.GetPeers$
 
BlockManagerMessages.GetStorageStatus$ - Class in org.apache.spark.storage
 
BlockManagerMessages.GetStorageStatus$() - Constructor for class org.apache.spark.storage.BlockManagerMessages.GetStorageStatus$
 
BlockManagerMessages.RegisterBlockManager - Class in org.apache.spark.storage
 
BlockManagerMessages.RegisterBlockManager(BlockManagerId, long, ActorRef) - Constructor for class org.apache.spark.storage.BlockManagerMessages.RegisterBlockManager
 
BlockManagerMessages.RegisterBlockManager$ - Class in org.apache.spark.storage
 
BlockManagerMessages.RegisterBlockManager$() - Constructor for class org.apache.spark.storage.BlockManagerMessages.RegisterBlockManager$
 
BlockManagerMessages.RemoveBlock - Class in org.apache.spark.storage
 
BlockManagerMessages.RemoveBlock(BlockId) - Constructor for class org.apache.spark.storage.BlockManagerMessages.RemoveBlock
 
BlockManagerMessages.RemoveBlock$ - Class in org.apache.spark.storage
 
BlockManagerMessages.RemoveBlock$() - Constructor for class org.apache.spark.storage.BlockManagerMessages.RemoveBlock$
 
BlockManagerMessages.RemoveBroadcast - Class in org.apache.spark.storage
 
BlockManagerMessages.RemoveBroadcast(long, boolean) - Constructor for class org.apache.spark.storage.BlockManagerMessages.RemoveBroadcast
 
BlockManagerMessages.RemoveBroadcast$ - Class in org.apache.spark.storage
 
BlockManagerMessages.RemoveBroadcast$() - Constructor for class org.apache.spark.storage.BlockManagerMessages.RemoveBroadcast$
 
BlockManagerMessages.RemoveExecutor - Class in org.apache.spark.storage
 
BlockManagerMessages.RemoveExecutor(String) - Constructor for class org.apache.spark.storage.BlockManagerMessages.RemoveExecutor
 
BlockManagerMessages.RemoveExecutor$ - Class in org.apache.spark.storage
 
BlockManagerMessages.RemoveExecutor$() - Constructor for class org.apache.spark.storage.BlockManagerMessages.RemoveExecutor$
 
BlockManagerMessages.RemoveRdd - Class in org.apache.spark.storage
 
BlockManagerMessages.RemoveRdd(int) - Constructor for class org.apache.spark.storage.BlockManagerMessages.RemoveRdd
 
BlockManagerMessages.RemoveRdd$ - Class in org.apache.spark.storage
 
BlockManagerMessages.RemoveRdd$() - Constructor for class org.apache.spark.storage.BlockManagerMessages.RemoveRdd$
 
BlockManagerMessages.RemoveShuffle - Class in org.apache.spark.storage
 
BlockManagerMessages.RemoveShuffle(int) - Constructor for class org.apache.spark.storage.BlockManagerMessages.RemoveShuffle
 
BlockManagerMessages.RemoveShuffle$ - Class in org.apache.spark.storage
 
BlockManagerMessages.RemoveShuffle$() - Constructor for class org.apache.spark.storage.BlockManagerMessages.RemoveShuffle$
 
BlockManagerMessages.StopBlockManagerMaster$ - Class in org.apache.spark.storage
 
BlockManagerMessages.StopBlockManagerMaster$() - Constructor for class org.apache.spark.storage.BlockManagerMessages.StopBlockManagerMaster$
 
BlockManagerMessages.ToBlockManagerMaster - Interface in org.apache.spark.storage
 
BlockManagerMessages.ToBlockManagerSlave - Interface in org.apache.spark.storage
 
BlockManagerMessages.UpdateBlockInfo - Class in org.apache.spark.storage
 
BlockManagerMessages.UpdateBlockInfo(BlockManagerId, BlockId, StorageLevel, long, long, long) - Constructor for class org.apache.spark.storage.BlockManagerMessages.UpdateBlockInfo
 
BlockManagerMessages.UpdateBlockInfo() - Constructor for class org.apache.spark.storage.BlockManagerMessages.UpdateBlockInfo
 
BlockManagerMessages.UpdateBlockInfo$ - Class in org.apache.spark.storage
 
BlockManagerMessages.UpdateBlockInfo$() - Constructor for class org.apache.spark.storage.BlockManagerMessages.UpdateBlockInfo$
 
blockManagerRemovedFromJson(JsonAST.JValue) - Static method in class org.apache.spark.util.JsonProtocol
 
blockManagerRemovedToJson(SparkListenerBlockManagerRemoved) - Static method in class org.apache.spark.util.JsonProtocol
 
BlockManagerSlaveActor - Class in org.apache.spark.storage
An actor that takes commands from the master to execute operations.
BlockManagerSlaveActor(BlockManager, MapOutputTracker) - Constructor for class org.apache.spark.storage.BlockManagerSlaveActor
 
BlockManagerSource - Class in org.apache.spark.storage
 
BlockManagerSource(BlockManager) - Constructor for class org.apache.spark.storage.BlockManagerSource
 
BlockNotFoundException - Exception in org.apache.spark.storage
 
BlockNotFoundException(String) - Constructor for exception org.apache.spark.storage.BlockNotFoundException
 
BlockObjectWriter - Class in org.apache.spark.storage
An interface for writing JVM objects to some underlying storage.
BlockObjectWriter(BlockId) - Constructor for class org.apache.spark.storage.BlockObjectWriter
 
blockPushingThread() - Method in class org.apache.spark.streaming.dstream.RawNetworkReceiver
 
BlockRDD<T> - Class in org.apache.spark.rdd
 
BlockRDD(SparkContext, BlockId[], ClassTag<T>) - Constructor for class org.apache.spark.rdd.BlockRDD
 
BlockRDDPartition - Class in org.apache.spark.rdd
 
BlockRDDPartition(BlockId, int) - Constructor for class org.apache.spark.rdd.BlockRDDPartition
 
BlockResult - Class in org.apache.spark.storage
 
BlockResult(Iterator<Object>, Enumeration.Value, long) - Constructor for class org.apache.spark.storage.BlockResult
 
blocks() - Method in class org.apache.spark.storage.BlockManagerInfo
 
blocks() - Method in class org.apache.spark.storage.ShuffleBlockFetcherIterator.FetchRequest
 
blocks() - Method in class org.apache.spark.storage.StorageStatus
Return the blocks stored in this block manager.
BlockStatus - Class in org.apache.spark.storage
 
BlockStatus(StorageLevel, long, long, long) - Constructor for class org.apache.spark.storage.BlockStatus
 
blockStatusFromJson(JsonAST.JValue) - Static method in class org.apache.spark.util.JsonProtocol
 
blockStatusToJson(BlockStatus) - Static method in class org.apache.spark.util.JsonProtocol
 
BlockStore - Class in org.apache.spark.storage
Abstract class to store blocks.
BlockStore(BlockManager) - Constructor for class org.apache.spark.storage.BlockStore
 
blockStoreResult() - Method in class org.apache.spark.streaming.scheduler.ReceivedBlockInfo
 
blockTransferService() - Method in class org.apache.spark.SparkEnv
 
BlockValues - Interface in org.apache.spark.storage
 
bmAddress() - Method in class org.apache.spark.FetchFailed
 
BOOLEAN - Class in org.apache.spark.sql.columnar
 
BOOLEAN() - Constructor for class org.apache.spark.sql.columnar.BOOLEAN
 
BooleanBitSet - Class in org.apache.spark.sql.columnar.compression
 
BooleanBitSet() - Constructor for class org.apache.spark.sql.columnar.compression.BooleanBitSet
 
BooleanBitSet.Decoder - Class in org.apache.spark.sql.columnar.compression
 
BooleanBitSet.Decoder(ByteBuffer) - Constructor for class org.apache.spark.sql.columnar.compression.BooleanBitSet.Decoder
 
BooleanBitSet.Encoder - Class in org.apache.spark.sql.columnar.compression
 
BooleanBitSet.Encoder() - Constructor for class org.apache.spark.sql.columnar.compression.BooleanBitSet.Encoder
 
BooleanColumnAccessor - Class in org.apache.spark.sql.columnar
 
BooleanColumnAccessor(ByteBuffer) - Constructor for class org.apache.spark.sql.columnar.BooleanColumnAccessor
 
BooleanColumnBuilder - Class in org.apache.spark.sql.columnar
 
BooleanColumnBuilder() - Constructor for class org.apache.spark.sql.columnar.BooleanColumnBuilder
 
BooleanColumnStats - Class in org.apache.spark.sql.columnar
 
BooleanColumnStats() - Constructor for class org.apache.spark.sql.columnar.BooleanColumnStats
 
BooleanParam - Class in org.apache.spark.ml.param
Specialized version of Param[Boolean] for Java.
BooleanParam(Params, String, String, Option<Object>) - Constructor for class org.apache.spark.ml.param.BooleanParam
 
BooleanType - Class in org.apache.spark.sql.api.java
The data type representing boolean and Boolean values.
BooleanType - Static variable in class org.apache.spark.sql.api.java.DataType
Gets the BooleanType object.
booleanWritableConverter() - Static method in class org.apache.spark.SparkContext
 
boolToBoolWritable(boolean) - Static method in class org.apache.spark.SparkContext
 
BoostingStrategy - Class in org.apache.spark.mllib.tree.configuration
:: Experimental :: Configuration options for GradientBoostedTrees.
BoostingStrategy(Strategy, Loss, int, double) - Constructor for class org.apache.spark.mllib.tree.configuration.BoostingStrategy
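A minimal sketch, assuming data is an existing RDD[LabeledPoint]; defaultParams supplies a reasonable starting configuration:

    import org.apache.spark.mllib.tree.GradientBoostedTrees
    import org.apache.spark.mllib.tree.configuration.BoostingStrategy

    val boostingStrategy = BoostingStrategy.defaultParams("Regression")
    boostingStrategy.numIterations = 10  // number of trees in the ensemble
    val model = GradientBoostedTrees.train(data, boostingStrategy)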
 
Both() - Static method in class org.apache.spark.graphx.EdgeDirection
Edges originating from *and* arriving at a vertex of interest.
BoundedDouble - Class in org.apache.spark.partial
:: Experimental :: A Double value with error bars and associated confidence.
BoundedDouble(double, double, double, double) - Constructor for class org.apache.spark.partial.BoundedDouble
 
BoundedPriorityQueue<A> - Class in org.apache.spark.util
Bounded priority queue.
BoundedPriorityQueue(int, Ordering<A>) - Constructor for class org.apache.spark.util.BoundedPriorityQueue
 
boundGenerator() - Method in class org.apache.spark.sql.execution.Generate
 
boundPort() - Method in class org.apache.spark.ui.ServerInfo
 
boundPort() - Method in class org.apache.spark.ui.WebUI
Return the actual port to which this server is bound.
broadcast(T) - Method in class org.apache.spark.api.java.JavaSparkContext
Broadcast a read-only variable to the cluster, returning a Broadcast object for reading it in distributed functions.
Broadcast<T> - Class in org.apache.spark.broadcast
A broadcast variable.
Broadcast(long, ClassTag<T>) - Constructor for class org.apache.spark.broadcast.Broadcast
 
broadcast(T, ClassTag<T>) - Method in class org.apache.spark.SparkContext
Broadcast a read-only variable to the cluster, returning a Broadcast object for reading it in distributed functions.
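A minimal, self-contained sketch of broadcasting a small lookup table once and reading it inside a closure:

    import org.apache.spark.{SparkConf, SparkContext}

    val sc = new SparkContext(new SparkConf().setAppName("BroadcastExample").setMaster("local[2]"))
    val lookup = sc.broadcast(Map("a" -> 1, "b" -> 2))  // shipped to each executor once
    val counts = sc.parallelize(Seq("a", "b", "a"))
      .map(k => lookup.value.getOrElse(k, 0))           // read, never modify
      .collect()                                        // Array(1, 2, 1)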
broadcast() - Method in class org.apache.spark.sql.execution.joins.LeftSemiJoinBNL
 
BROADCAST() - Static method in class org.apache.spark.storage.BlockId
 
BROADCAST_VARS() - Static method in class org.apache.spark.util.MetadataCleanerType
 
BroadcastBlockId - Class in org.apache.spark.storage
 
BroadcastBlockId(long, String) - Constructor for class org.apache.spark.storage.BroadcastBlockId
 
broadcastCleaned(long) - Method in interface org.apache.spark.CleanerListener
 
broadcastedConf() - Method in class org.apache.spark.rdd.CheckpointRDD
 
BroadcastFactory - Interface in org.apache.spark.broadcast
:: DeveloperApi :: An interface for all the broadcast implementations in Spark (to allow multiple broadcast implementations).
BroadcastHashJoin - Class in org.apache.spark.sql.execution.joins
:: DeveloperApi :: Performs an inner hash join of two child relations.
BroadcastHashJoin(Seq<Expression>, Seq<Expression>, org.apache.spark.sql.execution.joins.BuildSide, SparkPlan, SparkPlan) - Constructor for class org.apache.spark.sql.execution.joins.BroadcastHashJoin
 
broadcastId() - Method in class org.apache.spark.CleanBroadcast
 
broadcastId() - Method in class org.apache.spark.storage.BlockManagerMessages.RemoveBroadcast
 
broadcastId() - Method in class org.apache.spark.storage.BroadcastBlockId
 
BroadcastManager - Class in org.apache.spark.broadcast
 
BroadcastManager(boolean, SparkConf, SecurityManager) - Constructor for class org.apache.spark.broadcast.BroadcastManager
 
broadcastManager() - Method in class org.apache.spark.SparkEnv
 
BroadcastNestedLoopJoin - Class in org.apache.spark.sql.execution.joins
:: DeveloperApi ::
BroadcastNestedLoopJoin(SparkPlan, SparkPlan, org.apache.spark.sql.execution.joins.BuildSide, JoinType, Option<Expression>) - Constructor for class org.apache.spark.sql.execution.joins.BroadcastNestedLoopJoin
 
BroadcastNestedLoopJoin() - Method in class org.apache.spark.sql.execution.SparkStrategies
 
broadcastVars() - Method in class org.apache.spark.sql.execution.PythonUDF
 
buf() - Method in class org.apache.spark.storage.ShuffleBlockFetcherIterator.SuccessFetchResult
 
buffer() - Method in class org.apache.spark.storage.ArrayValues
 
buffer() - Method in class org.apache.spark.storage.ByteBufferValues
 
buffer() - Method in class org.apache.spark.util.SerializableBuffer
 
buffers() - Method in class org.apache.spark.sql.columnar.CachedBatch
 
build() - Method in class org.apache.spark.ml.tuning.ParamGridBuilder
Builds and returns all combinations of parameters specified by the param grid.
build(Node[]) - Method in class org.apache.spark.mllib.tree.model.Node
Build the left and right child nodes if this node is not a leaf.
build() - Method in class org.apache.spark.sql.api.java.MetadataBuilder
 
build() - Method in class org.apache.spark.sql.columnar.BasicColumnBuilder
 
build() - Method in interface org.apache.spark.sql.columnar.ColumnBuilder
Returns the final columnar byte buffer.
build() - Method in interface org.apache.spark.sql.columnar.compression.CompressibleColumnBuilder
 
build() - Method in interface org.apache.spark.sql.columnar.NullableColumnBuilder
 
buildFilter() - Method in class org.apache.spark.sql.columnar.InMemoryColumnarTableScan
 
buildKeys() - Method in interface org.apache.spark.sql.execution.joins.HashJoin
 
buildMetadata(RDD<LabeledPoint>, Strategy, int, String) - Static method in class org.apache.spark.mllib.tree.impl.DecisionTreeMetadata
Construct a DecisionTreeMetadata instance for this dataset and parameters.
buildMetadata(RDD<LabeledPoint>, Strategy) - Static method in class org.apache.spark.mllib.tree.impl.DecisionTreeMetadata
Version of buildMetadata() for DecisionTree.
buildNonNulls() - Method in interface org.apache.spark.sql.columnar.NullableColumnBuilder
 
buildPlan() - Method in interface org.apache.spark.sql.execution.joins.HashJoin
 
buildPools() - Method in class org.apache.spark.scheduler.FairSchedulableBuilder
 
buildPools() - Method in class org.apache.spark.scheduler.FIFOSchedulableBuilder
 
buildPools() - Method in interface org.apache.spark.scheduler.SchedulableBuilder
 
buildProjection() - Method in class org.apache.spark.sql.execution.Project
 
buildRegistryName(Source) - Method in class org.apache.spark.metrics.MetricsSystem
Build a name that uniquely identifies each metric source.
buildScan() - Method in class org.apache.spark.sql.json.JSONRelation
 
buildScan(Seq<Attribute>, Seq<Expression>) - Method in class org.apache.spark.sql.parquet.ParquetRelation2
 
buildScan(Seq<Attribute>, Seq<Expression>) - Method in class org.apache.spark.sql.sources.CatalystScan
 
buildScan(String[], Filter[]) - Method in class org.apache.spark.sql.sources.PrunedFilteredScan
 
buildScan(String[]) - Method in class org.apache.spark.sql.sources.PrunedScan
 
buildScan() - Method in class org.apache.spark.sql.sources.TableScan
 
buildSide() - Method in class org.apache.spark.sql.execution.joins.BroadcastHashJoin
 
buildSide() - Method in class org.apache.spark.sql.execution.joins.BroadcastNestedLoopJoin
 
buildSide() - Method in interface org.apache.spark.sql.execution.joins.HashJoin
 
buildSide() - Method in class org.apache.spark.sql.execution.joins.LeftSemiJoinHash
 
buildSide() - Method in class org.apache.spark.sql.execution.joins.ShuffledHashJoin
 
buildSideKeyGenerator() - Method in interface org.apache.spark.sql.execution.joins.HashJoin
 
BYTE - Class in org.apache.spark.sql.columnar
 
BYTE() - Constructor for class org.apache.spark.sql.columnar.BYTE
 
ByteArrayChunkOutputStream - Class in org.apache.spark.util.io
An OutputStream that writes to fixed-size chunks of byte arrays.
ByteArrayChunkOutputStream(int) - Constructor for class org.apache.spark.util.io.ByteArrayChunkOutputStream
 
ByteArrayColumnType<T extends org.apache.spark.sql.catalyst.types.DataType> - Class in org.apache.spark.sql.columnar
 
ByteArrayColumnType(int, int) - Constructor for class org.apache.spark.sql.columnar.ByteArrayColumnType
 
byteBuffer() - Method in class org.apache.spark.streaming.receiver.ByteBufferBlock
 
ByteBufferBlock - Class in org.apache.spark.streaming.receiver
Class representing a block received as a ByteBuffer.
ByteBufferBlock(ByteBuffer) - Constructor for class org.apache.spark.streaming.receiver.ByteBufferBlock
 
ByteBufferData - Class in org.apache.spark.streaming.receiver
 
ByteBufferData(ByteBuffer) - Constructor for class org.apache.spark.streaming.receiver.ByteBufferData
 
ByteBufferInputStream - Class in org.apache.spark.util
Reads data from a ByteBuffer, and optionally cleans it up using BlockManager.dispose() at the end of the stream (e.g., to close a memory-mapped file).
ByteBufferInputStream(ByteBuffer, boolean) - Constructor for class org.apache.spark.util.ByteBufferInputStream
 
ByteBufferValues - Class in org.apache.spark.storage
 
ByteBufferValues(ByteBuffer) - Constructor for class org.apache.spark.storage.ByteBufferValues
 
BytecodeUtils - Class in org.apache.spark.graphx.util
Includes a utility function to test whether a function accesses a specific attribute of an object.
BytecodeUtils() - Constructor for class org.apache.spark.graphx.util.BytecodeUtils
 
ByteColumnAccessor - Class in org.apache.spark.sql.columnar
 
ByteColumnAccessor(ByteBuffer) - Constructor for class org.apache.spark.sql.columnar.ByteColumnAccessor
 
ByteColumnBuilder - Class in org.apache.spark.sql.columnar
 
ByteColumnBuilder() - Constructor for class org.apache.spark.sql.columnar.ByteColumnBuilder
 
ByteColumnStats - Class in org.apache.spark.sql.columnar
 
ByteColumnStats() - Constructor for class org.apache.spark.sql.columnar.ByteColumnStats
 
bytes() - Method in class org.apache.spark.streaming.receiver.ByteBufferData
 
BYTES_FOR_PRECISION() - Static method in class org.apache.spark.sql.parquet.ParquetTypesConverter
Compute the FIXED_LEN_BYTE_ARRAY length needed to represent a given DECIMAL precision.
bytesToBytesWritable(byte[]) - Static method in class org.apache.spark.SparkContext
 
bytesToLines(InputStream) - Static method in class org.apache.spark.streaming.dstream.SocketReceiver
This method translates the data from an input stream (say, from a socket) into '\n'-delimited strings and returns an iterator to access the strings.
bytesToString(long) - Static method in class org.apache.spark.util.Utils
Convert a quantity in bytes to a human-readable string such as "4.0 MB".
bytesWritableConverter() - Static method in class org.apache.spark.SparkContext
 
bytesWritten(long) - Method in interface org.apache.spark.util.logging.RollingPolicy
Notify that bytes have been written.
bytesWritten(long) - Method in class org.apache.spark.util.logging.SizeBasedRollingPolicy
Increment the bytes that have been written in the current file.
bytesWritten(long) - Method in class org.apache.spark.util.logging.TimeBasedRollingPolicy
 
ByteType - Class in org.apache.spark.sql.api.java
The data type representing byte and Byte values.
ByteType - Static variable in class org.apache.spark.sql.api.java.DataType
Gets the ByteType object.

C

cache() - Method in class org.apache.spark.api.java.JavaDoubleRDD
Persist this RDD with the default storage level (`MEMORY_ONLY`).
cache() - Method in class org.apache.spark.api.java.JavaPairRDD
Persist this RDD with the default storage level (`MEMORY_ONLY`).
cache() - Method in class org.apache.spark.api.java.JavaRDD
Persist this RDD with the default storage level (`MEMORY_ONLY`).
cache() - Method in class org.apache.spark.graphx.Graph
Caches the vertices and edges associated with this graph at the previously-specified target storage levels, which default to MEMORY_ONLY.
cache() - Method in class org.apache.spark.graphx.impl.EdgeRDDImpl
Persists the edge partitions using `targetStorageLevel`, which defaults to MEMORY_ONLY.
cache() - Method in class org.apache.spark.graphx.impl.GraphImpl
 
cache() - Method in class org.apache.spark.graphx.impl.VertexRDDImpl
Persists the vertex partitions at `targetStorageLevel`, which defaults to MEMORY_ONLY.
cache() - Method in class org.apache.spark.partial.StudentTCacher
 
cache() - Method in class org.apache.spark.rdd.RDD
Persist this RDD with the default storage level (`MEMORY_ONLY`).
cache() - Method in class org.apache.spark.sql.api.java.JavaSchemaRDD
Persist this RDD with the default storage level (`MEMORY_ONLY`).
cache() - Method in class org.apache.spark.sql.SchemaRDD
The overridden cache function always uses in-memory columnar caching.
cache() - Method in class org.apache.spark.streaming.api.java.JavaDStream
Persist RDDs of this DStream with the default storage level (MEMORY_ONLY_SER)
cache() - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
Persist RDDs of this DStream with the default storage level (MEMORY_ONLY_SER)
cache() - Method in class org.apache.spark.streaming.dstream.DStream
Persist RDDs of this DStream with the default storage level (MEMORY_ONLY_SER)
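A minimal sketch, assuming an existing SparkContext sc and a hypothetical input path; note that cache() only marks the RDD, and nothing is stored until an action runs:

    val words = sc.textFile("hdfs://namenode/data/corpus.txt").flatMap(_.split(" "))
    words.cache()                          // mark as MEMORY_ONLY; no work happens yet
    val total = words.count()              // first action computes and caches partitions
    val unique = words.distinct().count()  // reuses cached data instead of re-reading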
CachedBatch - Class in org.apache.spark.sql.columnar
 
CachedBatch(byte[][], Row) - Constructor for class org.apache.spark.sql.columnar.CachedBatch
 
cachedColumnBuffers() - Method in class org.apache.spark.sql.columnar.InMemoryRelation
 
CachedData - Class in org.apache.spark.sql
Holds a cached logical plan and its data.
CachedData(LogicalPlan, InMemoryRelation) - Constructor for class org.apache.spark.sql.CachedData
 
cachedData() - Method in interface org.apache.spark.sql.CacheManager
 
cachedRepresentation() - Method in class org.apache.spark.sql.CachedData
 
cacheLock() - Method in interface org.apache.spark.sql.CacheManager
 
CacheManager - Class in org.apache.spark
Spark class responsible for passing RDDs partition contents to the BlockManager and making sure a node doesn't load two copies of an RDD at once.
CacheManager(BlockManager) - Constructor for class org.apache.spark.CacheManager
 
cacheManager() - Method in class org.apache.spark.SparkEnv
 
CacheManager - Interface in org.apache.spark.sql
Provides support in a SQLContext for caching query results and automatically using these cached results when subsequent queries are executed.
cacheQuery(SchemaRDD, Option<String>, StorageLevel) - Method in interface org.apache.spark.sql.CacheManager
Caches the data produced by the logical representation of the given SchemaRDD.
cacheTable(String) - Method in interface org.apache.spark.sql.CacheManager
Caches the specified table in memory.
CacheTableCommand - Class in org.apache.spark.sql.execution
:: DeveloperApi ::
CacheTableCommand(String, Option<LogicalPlan>, boolean) - Constructor for class org.apache.spark.sql.execution.CacheTableCommand
 
cacheTables() - Method in class org.apache.spark.sql.hive.test.TestHiveContext
 
calculate(double[], double) - Static method in class org.apache.spark.mllib.tree.impurity.Entropy
:: DeveloperApi :: Information calculation for multiclass classification.
calculate(double, double, double) - Static method in class org.apache.spark.mllib.tree.impurity.Entropy
:: DeveloperApi :: Variance calculation.
calculate() - Method in class org.apache.spark.mllib.tree.impurity.EntropyCalculator
Calculate the impurity from the stored sufficient statistics.
calculate(double[], double) - Static method in class org.apache.spark.mllib.tree.impurity.Gini
:: DeveloperApi :: Information calculation for multiclass classification.
calculate(double, double, double) - Static method in class org.apache.spark.mllib.tree.impurity.Gini
:: DeveloperApi :: Variance calculation.
calculate() - Method in class org.apache.spark.mllib.tree.impurity.GiniCalculator
Calculate the impurity from the stored sufficient statistics.
calculate(double[], double) - Method in interface org.apache.spark.mllib.tree.impurity.Impurity
:: DeveloperApi :: Information calculation for multiclass classification.
calculate(double, double, double) - Method in interface org.apache.spark.mllib.tree.impurity.Impurity
:: DeveloperApi :: Information calculation for regression.
calculate() - Method in class org.apache.spark.mllib.tree.impurity.ImpurityCalculator
Calculate the impurity from the stored sufficient statistics.
calculate(double[], double) - Static method in class org.apache.spark.mllib.tree.impurity.Variance
:: DeveloperApi :: Information calculation for multiclass classification.
calculate(double, double, double) - Static method in class org.apache.spark.mllib.tree.impurity.Variance
:: DeveloperApi :: Variance calculation.
calculate() - Method in class org.apache.spark.mllib.tree.impurity.VarianceCalculator
Calculate the impurity from the stored sufficient statistics.
calculatedTasks() - Method in class org.apache.spark.scheduler.TaskSetManager
 
calculateNumBatchesToRemember(Duration) - Static method in class org.apache.spark.streaming.dstream.FileInputDStream
Calculate the number of recent batches to remember, such that all files selected within at least the last MIN_REMEMBER_DURATION can be remembered.
calculateTotalMemory(SparkContext) - Static method in class org.apache.spark.scheduler.cluster.mesos.MemoryUtils
 
call(T) - Method in interface org.apache.spark.api.java.function.DoubleFlatMapFunction
 
call(T) - Method in interface org.apache.spark.api.java.function.DoubleFunction
 
call(T) - Method in interface org.apache.spark.api.java.function.FlatMapFunction
 
call(T1, T2) - Method in interface org.apache.spark.api.java.function.FlatMapFunction2
 
call(T1) - Method in interface org.apache.spark.api.java.function.Function
 
call(T1, T2) - Method in interface org.apache.spark.api.java.function.Function2
 
call(T1, T2, T3) - Method in interface org.apache.spark.api.java.function.Function3
 
call(T) - Method in interface org.apache.spark.api.java.function.PairFlatMapFunction
 
call(T) - Method in interface org.apache.spark.api.java.function.PairFunction
 
call(T) - Method in interface org.apache.spark.api.java.function.VoidFunction
 
call(T1) - Method in interface org.apache.spark.sql.api.java.UDF1
 
call(T1, T2, T3, T4, T5, T6, T7, T8, T9, T10) - Method in interface org.apache.spark.sql.api.java.UDF10
 
call(T1, T2, T3, T4, T5, T6, T7, T8, T9, T10, T11) - Method in interface org.apache.spark.sql.api.java.UDF11
 
call(T1, T2, T3, T4, T5, T6, T7, T8, T9, T10, T11, T12) - Method in interface org.apache.spark.sql.api.java.UDF12
 
call(T1, T2, T3, T4, T5, T6, T7, T8, T9, T10, T11, T12, T13) - Method in interface org.apache.spark.sql.api.java.UDF13
 
call(T1, T2, T3, T4, T5, T6, T7, T8, T9, T10, T11, T12, T13, T14) - Method in interface org.apache.spark.sql.api.java.UDF14
 
call(T1, T2, T3, T4, T5, T6, T7, T8, T9, T10, T11, T12, T13, T14, T15) - Method in interface org.apache.spark.sql.api.java.UDF15
 
call(T1, T2, T3, T4, T5, T6, T7, T8, T9, T10, T11, T12, T13, T14, T15, T16) - Method in interface org.apache.spark.sql.api.java.UDF16
 
call(T1, T2, T3, T4, T5, T6, T7, T8, T9, T10, T11, T12, T13, T14, T15, T16, T17) - Method in interface org.apache.spark.sql.api.java.UDF17
 
call(T1, T2, T3, T4, T5, T6, T7, T8, T9, T10, T11, T12, T13, T14, T15, T16, T17, T18) - Method in interface org.apache.spark.sql.api.java.UDF18
 
call(T1, T2, T3, T4, T5, T6, T7, T8, T9, T10, T11, T12, T13, T14, T15, T16, T17, T18, T19) - Method in interface org.apache.spark.sql.api.java.UDF19
 
call(T1, T2) - Method in interface org.apache.spark.sql.api.java.UDF2
 
call(T1, T2, T3, T4, T5, T6, T7, T8, T9, T10, T11, T12, T13, T14, T15, T16, T17, T18, T19, T20) - Method in interface org.apache.spark.sql.api.java.UDF20
 
call(T1, T2, T3, T4, T5, T6, T7, T8, T9, T10, T11, T12, T13, T14, T15, T16, T17, T18, T19, T20, T21) - Method in interface org.apache.spark.sql.api.java.UDF21
 
call(T1, T2, T3, T4, T5, T6, T7, T8, T9, T10, T11, T12, T13, T14, T15, T16, T17, T18, T19, T20, T21, T22) - Method in interface org.apache.spark.sql.api.java.UDF22
 
call(T1, T2, T3) - Method in interface org.apache.spark.sql.api.java.UDF3
 
call(T1, T2, T3, T4) - Method in interface org.apache.spark.sql.api.java.UDF4
 
call(T1, T2, T3, T4, T5) - Method in interface org.apache.spark.sql.api.java.UDF5
 
call(T1, T2, T3, T4, T5, T6) - Method in interface org.apache.spark.sql.api.java.UDF6
 
call(T1, T2, T3, T4, T5, T6, T7) - Method in interface org.apache.spark.sql.api.java.UDF7
 
call(T1, T2, T3, T4, T5, T6, T7, T8) - Method in interface org.apache.spark.sql.api.java.UDF8
 
call(T1, T2, T3, T4, T5, T6, T7, T8, T9) - Method in interface org.apache.spark.sql.api.java.UDF9
 
callSite() - Method in class org.apache.spark.scheduler.ActiveJob
 
callSite() - Method in class org.apache.spark.scheduler.JobSubmitted
 
callSite() - Method in class org.apache.spark.scheduler.Stage
 
CallSite - Class in org.apache.spark.util
CallSite represents a place in user code.
CallSite(String, String) - Constructor for class org.apache.spark.util.CallSite
 
canBeCodeGened(Seq<AggregateExpression>) - Method in class org.apache.spark.sql.execution.SparkStrategies.HashAggregation
 
cancel() - Method in class org.apache.spark.ComplexFutureAction
 
cancel() - Method in interface org.apache.spark.FutureAction
Cancels the execution of this action.
cancel(boolean) - Method in class org.apache.spark.JavaFutureActionWrapper
 
cancel() - Method in class org.apache.spark.scheduler.JobWaiter
Sends a signal to the DAGScheduler to cancel the job.
cancel() - Method in class org.apache.spark.SimpleFutureAction
 
cancel() - Method in class org.apache.spark.util.MetadataCleaner
 
cancelAllJobs() - Method in class org.apache.spark.api.java.JavaSparkContext
Cancel all jobs that have been scheduled or are running.
cancelAllJobs() - Method in class org.apache.spark.scheduler.DAGScheduler
Cancel all jobs that are running or waiting in the queue.
cancelAllJobs() - Method in class org.apache.spark.SparkContext
Cancel all jobs that have been scheduled or are running.
cancelJob(int) - Method in class org.apache.spark.scheduler.DAGScheduler
Cancel a job that is running or waiting in the queue.
cancelJob(int) - Method in class org.apache.spark.SparkContext
Cancel a given job if it's scheduled or running.
cancelJobGroup(String) - Method in class org.apache.spark.api.java.JavaSparkContext
Cancel active jobs for the specified group.
cancelJobGroup(String) - Method in class org.apache.spark.scheduler.DAGScheduler
 
cancelJobGroup(String) - Method in class org.apache.spark.SparkContext
Cancel active jobs for the specified group.
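A minimal sketch, assuming an existing SparkContext sc; jobs submitted from a thread inherit the group id set in that thread:

    sc.setJobGroup("etl", "nightly ETL jobs", interruptOnCancel = true)
    // ... actions submitted here are tagged with the "etl" group ...
    sc.cancelJobGroup("etl")  // cancels active jobs belonging to the group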
cancelStage(int) - Method in class org.apache.spark.scheduler.DAGScheduler
Cancel all jobs associated with a running or scheduled stage.
cancelStage(int) - Method in class org.apache.spark.SparkContext
Cancel a given stage and all jobs associated with it.
cancelTasks(int, boolean) - Method in interface org.apache.spark.scheduler.TaskScheduler
 
cancelTasks(int, boolean) - Method in class org.apache.spark.scheduler.TaskSchedulerImpl
 
canEqual(Object) - Method in class org.apache.spark.sql.api.java.Row
 
canEqual(Object) - Method in class org.apache.spark.util.MutablePair
 
canFetchMoreResults(long) - Method in class org.apache.spark.scheduler.TaskSetManager
Check whether there is enough quota to fetch a result of the given size in bytes.
capacity() - Method in class org.apache.spark.graphx.impl.VertexPartitionBase
 
cartesian(JavaRDDLike<U, ?>) - Method in interface org.apache.spark.api.java.JavaRDDLike
Return the Cartesian product of this RDD and another one, that is, the RDD of all pairs of elements (a, b) where a is in this and b is in other.
cartesian(RDD<U>, ClassTag<U>) - Method in class org.apache.spark.rdd.RDD
Return the Cartesian product of this RDD and another one, that is, the RDD of all pairs of elements (a, b) where a is in this and b is in other.
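A minimal sketch, assuming an existing SparkContext sc; beware that the result has |this| * |other| elements:

    val colors = sc.parallelize(Seq("red", "blue"))
    val sizes  = sc.parallelize(Seq(1, 2))
    val pairs  = colors.cartesian(sizes).collect()
    // pairs contains (red,1), (red,2), (blue,1), (blue,2); element order may vary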
CartesianPartition - Class in org.apache.spark.rdd
 
CartesianPartition(int, RDD<?>, RDD<?>, int, int) - Constructor for class org.apache.spark.rdd.CartesianPartition
 
CartesianProduct - Class in org.apache.spark.sql.execution.joins
:: DeveloperApi ::
CartesianProduct(SparkPlan, SparkPlan) - Constructor for class org.apache.spark.sql.execution.joins.CartesianProduct
 
CartesianProduct() - Method in class org.apache.spark.sql.execution.SparkStrategies
 
CartesianRDD<T,U> - Class in org.apache.spark.rdd
 
CartesianRDD(SparkContext, RDD<T>, RDD<U>, ClassTag<T>, ClassTag<U>) - Constructor for class org.apache.spark.rdd.CartesianRDD
 
CASE() - Static method in class org.apache.spark.sql.hive.HiveQl
 
caseSensitive() - Method in class org.apache.spark.sql.hive.HiveMetastoreCatalog
 
castChildOutput(InsertIntoTable, MetastoreRelation, LogicalPlan) - Method in class org.apache.spark.sql.hive.HiveMetastoreCatalog.PreInsertionCasts
 
CatalystArrayContainsNullConverter - Class in org.apache.spark.sql.parquet
A parquet.io.api.GroupConverter that converts single-element groups that match the characteristics of an array containing nulls (see ParquetTypesConverter) into an ArrayType.
CatalystArrayContainsNullConverter(DataType, int, CatalystConverter, Buffer<Object>) - Constructor for class org.apache.spark.sql.parquet.CatalystArrayContainsNullConverter
 
CatalystArrayContainsNullConverter(DataType, int, CatalystConverter) - Constructor for class org.apache.spark.sql.parquet.CatalystArrayContainsNullConverter
 
CatalystArrayConverter - Class in org.apache.spark.sql.parquet
A parquet.io.api.GroupConverter that converts single-element groups that match the characteristics of an array (see ParquetTypesConverter) into an ArrayType.
CatalystArrayConverter(DataType, int, CatalystConverter, Buffer<Object>) - Constructor for class org.apache.spark.sql.parquet.CatalystArrayConverter
 
CatalystArrayConverter(DataType, int, CatalystConverter) - Constructor for class org.apache.spark.sql.parquet.CatalystArrayConverter
 
CatalystConverter - Class in org.apache.spark.sql.parquet
 
CatalystConverter() - Constructor for class org.apache.spark.sql.parquet.CatalystConverter
 
CatalystGroupConverter - Class in org.apache.spark.sql.parquet
A parquet.io.api.GroupConverter that is able to convert a Parquet record to a Row object.
CatalystGroupConverter(StructField[], int, CatalystConverter, ArrayBuffer<Object>, ArrayBuffer<Row>) - Constructor for class org.apache.spark.sql.parquet.CatalystGroupConverter
 
CatalystGroupConverter(StructField[], int, CatalystConverter) - Constructor for class org.apache.spark.sql.parquet.CatalystGroupConverter
 
CatalystGroupConverter(Attribute[]) - Constructor for class org.apache.spark.sql.parquet.CatalystGroupConverter
This constructor is used for the root converter only!
CatalystMapConverter - Class in org.apache.spark.sql.parquet
A parquet.io.api.GroupConverter that converts two-element groups that match the characteristics of a map (see ParquetTypesConverter) into a MapType.
CatalystMapConverter(StructField[], int, CatalystConverter) - Constructor for class org.apache.spark.sql.parquet.CatalystMapConverter
 
CatalystNativeArrayConverter - Class in org.apache.spark.sql.parquet
A parquet.io.api.GroupConverter that converts single-element groups that match the characteristics of an array (see ParquetTypesConverter) into an ArrayType.
CatalystNativeArrayConverter(NativeType, int, CatalystConverter, int) - Constructor for class org.apache.spark.sql.parquet.CatalystNativeArrayConverter
 
CatalystPrimitiveConverter - Class in org.apache.spark.sql.parquet
A parquet.io.api.PrimitiveConverter that converts Parquet types to Catalyst types.
CatalystPrimitiveConverter(CatalystConverter, int) - Constructor for class org.apache.spark.sql.parquet.CatalystPrimitiveConverter
 
CatalystPrimitiveRowConverter - Class in org.apache.spark.sql.parquet
A parquet.io.api.GroupConverter that is able to convert a Parquet record to a Row object.
CatalystPrimitiveRowConverter(StructField[], MutableRow) - Constructor for class org.apache.spark.sql.parquet.CatalystPrimitiveRowConverter
 
CatalystPrimitiveRowConverter(Attribute[]) - Constructor for class org.apache.spark.sql.parquet.CatalystPrimitiveRowConverter
 
CatalystScan - Class in org.apache.spark.sql.sources
:: Experimental :: An interface for experimenting with a more direct connection to the query planner.
CatalystScan() - Constructor for class org.apache.spark.sql.sources.CatalystScan
 
CatalystStructConverter - Class in org.apache.spark.sql.parquet
This converter is for multi-element groups of primitive or complex types that have repetition level optional or required (i.e., struct fields).
CatalystStructConverter(StructField[], int, CatalystConverter) - Constructor for class org.apache.spark.sql.parquet.CatalystStructConverter
 
Categorical() - Static method in class org.apache.spark.mllib.tree.configuration.FeatureType
 
categoricalFeaturesInfo() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
 
categories() - Method in class org.apache.spark.mllib.tree.model.Split
 
category() - Method in class org.apache.spark.mllib.recommendation.ALS.BlockStats
 
category() - Method in class org.apache.spark.mllib.tree.model.Bin
 
channelFactory() - Method in class org.apache.spark.streaming.flume.FlumePollingReceiver
 
channelFactoryExecutor() - Method in class org.apache.spark.streaming.flume.FlumePollingReceiver
 
checkEquals(ASTNode) - Method in class org.apache.spark.sql.hive.HiveQl.TransformableNode
Throws an error if this is not equal to other.
checkHost(String, String) - Static method in class org.apache.spark.util.Utils
 
checkHostPort(String, String) - Static method in class org.apache.spark.util.Utils
 
checkMinimalPollingPeriod(TimeUnit, int) - Static method in class org.apache.spark.metrics.MetricsSystem
 
checkModifyPermissions(String) - Method in class org.apache.spark.SecurityManager
Checks the given user against the modify acl list to see if they have authorization to modify the application.
checkOutputSpecs(JobContext) - Method in class org.apache.spark.sql.parquet.AppendingParquetOutputFormat
 
checkpoint() - Method in interface org.apache.spark.api.java.JavaRDDLike
Mark this RDD for checkpointing.
checkpoint() - Method in class org.apache.spark.graphx.Graph
Mark this Graph for checkpointing.
checkpoint() - Method in class org.apache.spark.graphx.impl.EdgeRDDImpl
 
checkpoint() - Method in class org.apache.spark.graphx.impl.GraphImpl
 
checkpoint() - Method in class org.apache.spark.graphx.impl.VertexRDDImpl
 
checkpoint() - Method in class org.apache.spark.rdd.CheckpointRDD
 
checkpoint() - Method in class org.apache.spark.rdd.HadoopRDD
 
checkpoint() - Method in class org.apache.spark.rdd.RDD
Mark this RDD for checkpointing.
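For example, a minimal sketch of RDD checkpointing (assumes an existing SparkContext sc and a hypothetical HDFS path):

    sc.setCheckpointDir("hdfs://namenode:8020/rdd-checkpoints") // must be set before checkpointing
    val rdd = sc.parallelize(1 to 100).map(_ * 2)
    rdd.checkpoint() // only marks the RDD; the checkpoint is written by the next job
    rdd.count()      // triggers a job and materializes the checkpoint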
checkpoint(Duration) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
Enable periodic checkpointing of RDDs of this DStream.
checkpoint(String) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
Sets the context to periodically checkpoint the DStream operations for driver fault-tolerance.
Checkpoint - Class in org.apache.spark.streaming
 
Checkpoint(StreamingContext, Time) - Constructor for class org.apache.spark.streaming.Checkpoint
 
checkpoint(Duration) - Method in class org.apache.spark.streaming.dstream.DStream
Enable periodic checkpointing of RDDs of this DStream.
checkpoint(Duration) - Method in class org.apache.spark.streaming.dstream.ReducedWindowedDStream
 
checkpoint(String) - Method in class org.apache.spark.streaming.StreamingContext
Set the context to periodically checkpoint the DStream operations for driver fault-tolerance.
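A minimal sketch combining both checkpoint calls (assumes an existing SparkConf conf, an input DStream stream, and a hypothetical HDFS path):

    import org.apache.spark.streaming.{Seconds, StreamingContext}

    val ssc = new StreamingContext(conf, Seconds(1))
    ssc.checkpoint("hdfs://namenode:8020/streaming-checkpoints") // driver fault-tolerance
    stream.checkpoint(Seconds(10)) // checkpoint this DStream's RDDs every 10 seconds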
checkpointBackupFile(String, Time) - Static method in class org.apache.spark.streaming.Checkpoint
Get the checkpoint backup file for the given checkpoint time.
checkpointClock() - Method in class org.apache.spark.streaming.kinesis.KinesisCheckpointState
 
checkpointData() - Method in class org.apache.spark.rdd.RDD
 
checkpointData() - Method in class org.apache.spark.streaming.dstream.DStream
 
checkpointDir() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
 
checkpointDir() - Method in class org.apache.spark.mllib.tree.impl.NodeIdCache
 
checkpointDir() - Method in class org.apache.spark.SparkContext
 
checkpointDir() - Method in class org.apache.spark.streaming.Checkpoint
 
checkpointDir() - Method in class org.apache.spark.streaming.StreamingContext
 
checkpointDirToLogDir(String, int) - Static method in class org.apache.spark.streaming.receiver.WriteAheadLogBasedBlockHandler
 
checkpointDirToLogDir(String) - Static method in class org.apache.spark.streaming.scheduler.ReceivedBlockTracker
 
checkpointDuration() - Method in class org.apache.spark.streaming.Checkpoint
 
checkpointDuration() - Method in class org.apache.spark.streaming.dstream.DStream
 
checkpointDuration() - Method in class org.apache.spark.streaming.StreamingContext
 
Checkpointed() - Static method in class org.apache.spark.rdd.CheckpointState
 
checkpointFile(String, Time) - Static method in class org.apache.spark.streaming.Checkpoint
Get the checkpoint file for the given checkpoint time.
CheckpointingInProgress() - Static method in class org.apache.spark.rdd.CheckpointState
 
checkpointInProgress() - Method in class org.apache.spark.streaming.DStreamGraph
 
checkpointInterval() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
 
checkpointInterval() - Method in class org.apache.spark.mllib.tree.impl.NodeIdCache
 
checkpointPath() - Method in class org.apache.spark.rdd.CheckpointRDD
 
CheckpointRDD<T> - Class in org.apache.spark.rdd
This RDD represents an RDD checkpoint file (similar to HadoopRDD).
CheckpointRDD(SparkContext, String, ClassTag<T>) - Constructor for class org.apache.spark.rdd.CheckpointRDD
 
checkpointRDD() - Method in class org.apache.spark.rdd.RDDCheckpointData
 
CheckpointRDDPartition - Class in org.apache.spark.rdd
 
CheckpointRDDPartition(int) - Constructor for class org.apache.spark.rdd.CheckpointRDDPartition
 
CheckpointReader - Class in org.apache.spark.streaming
 
CheckpointReader() - Constructor for class org.apache.spark.streaming.CheckpointReader
 
CheckpointState - Class in org.apache.spark.rdd
Enumeration to manage the state transitions of an RDD through checkpointing: Initialized --> marked for checkpointing --> checkpointing in progress --> checkpointed.
CheckpointState() - Constructor for class org.apache.spark.rdd.CheckpointState
 
checkpointTime() - Method in class org.apache.spark.streaming.Checkpoint
 
CheckpointWriter - Class in org.apache.spark.streaming
Convenience class that handles writing the graph checkpoint to a file.
CheckpointWriter(JobGenerator, SparkConf, String, Configuration) - Constructor for class org.apache.spark.streaming.CheckpointWriter
 
CheckpointWriter.CheckpointWriteHandler - Class in org.apache.spark.streaming
 
CheckpointWriter.CheckpointWriteHandler(Time, byte[]) - Constructor for class org.apache.spark.streaming.CheckpointWriter.CheckpointWriteHandler
 
checkSpeculatableTasks() - Method in class org.apache.spark.scheduler.Pool
 
checkSpeculatableTasks() - Method in interface org.apache.spark.scheduler.Schedulable
 
checkSpeculatableTasks() - Method in class org.apache.spark.scheduler.TaskSchedulerImpl
 
checkSpeculatableTasks() - Method in class org.apache.spark.scheduler.TaskSetManager
Check for tasks to be speculated and return true if there are any.
checkState(boolean, Function0<String>) - Static method in class org.apache.spark.streaming.util.HdfsUtils
 
checkTimeoutInterval() - Method in class org.apache.spark.storage.BlockManagerMasterActor
 
checkUIViewPermissions(String) - Method in class org.apache.spark.SecurityManager
Checks the given user against the view acl list to see if they have authorization to view the UI.
child() - Method in class org.apache.spark.sql.columnar.InMemoryRelation
 
child() - Method in class org.apache.spark.sql.execution.Aggregate
 
child() - Method in class org.apache.spark.sql.execution.BatchPythonEvaluation
 
child() - Method in class org.apache.spark.sql.execution.DescribeCommand
 
child() - Method in class org.apache.spark.sql.execution.Distinct
 
child() - Method in class org.apache.spark.sql.execution.EvaluatePython
 
child() - Method in class org.apache.spark.sql.execution.Exchange
 
child() - Method in class org.apache.spark.sql.execution.ExternalSort
 
child() - Method in class org.apache.spark.sql.execution.Filter
 
child() - Method in class org.apache.spark.sql.execution.Generate
 
child() - Method in class org.apache.spark.sql.execution.GeneratedAggregate
 
child() - Method in class org.apache.spark.sql.execution.Limit
 
child() - Method in class org.apache.spark.sql.execution.OutputFaker
 
child() - Method in class org.apache.spark.sql.execution.Project
 
child() - Method in class org.apache.spark.sql.execution.Sample
 
child() - Method in class org.apache.spark.sql.execution.Sort
 
child() - Method in class org.apache.spark.sql.execution.TakeOrdered
 
child() - Method in class org.apache.spark.sql.hive.execution.InsertIntoHiveTable
 
child() - Method in class org.apache.spark.sql.hive.execution.ScriptTransformation
 
child() - Method in class org.apache.spark.sql.hive.InsertIntoHiveTable
 
child() - Method in class org.apache.spark.sql.parquet.InsertIntoParquetTable
 
children() - Method in class org.apache.spark.sql.columnar.InMemoryRelation
 
children() - Method in class org.apache.spark.sql.execution.BatchPythonEvaluation
 
children() - Method in class org.apache.spark.sql.execution.ExecutedCommand
 
children() - Method in class org.apache.spark.sql.execution.LogicalRDD
 
children() - Method in class org.apache.spark.sql.execution.OutputFaker
 
children() - Method in class org.apache.spark.sql.execution.PythonUDF
 
children() - Method in class org.apache.spark.sql.execution.SparkLogicalPlan
 
children() - Method in class org.apache.spark.sql.execution.Union
 
children() - Method in class org.apache.spark.sql.hive.HiveGenericUdaf
 
children() - Method in class org.apache.spark.sql.hive.HiveGenericUdf
 
children() - Method in class org.apache.spark.sql.hive.HiveGenericUdtf
 
children() - Method in class org.apache.spark.sql.hive.HiveSimpleUdf
 
children() - Method in class org.apache.spark.sql.hive.HiveUdaf
 
children() - Method in class org.apache.spark.sql.hive.InsertIntoHiveTable
 
chiSqFunc() - Method in class org.apache.spark.mllib.stat.test.ChiSqTest.Method
 
chiSqTest(Vector, Vector) - Static method in class org.apache.spark.mllib.stat.Statistics
:: Experimental :: Conduct Pearson's chi-squared goodness of fit test of the observed data against the expected distribution.
chiSqTest(Vector) - Static method in class org.apache.spark.mllib.stat.Statistics
:: Experimental :: Conduct Pearson's chi-squared goodness of fit test of the observed data against the uniform distribution, with each category having an expected frequency of 1 / observed.size.
chiSqTest(Matrix) - Static method in class org.apache.spark.mllib.stat.Statistics
:: Experimental :: Conduct Pearson's independence test on the input contingency matrix, which cannot contain negative entries or columns or rows that sum up to 0.
chiSqTest(RDD<LabeledPoint>) - Static method in class org.apache.spark.mllib.stat.Statistics
:: Experimental :: Conduct Pearson's independence test for every feature against the label across the input RDD.
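For example, a goodness-of-fit test (a minimal sketch; the observed and expected counts are made up):

    import org.apache.spark.mllib.linalg.Vectors
    import org.apache.spark.mllib.stat.Statistics

    val observed = Vectors.dense(4.0, 6.0, 5.0)
    val expected = Vectors.dense(5.0, 5.0, 5.0)
    val result = Statistics.chiSqTest(observed, expected)
    println(result.pValue) // ChiSqTestResult also exposes statistic and degreesOfFreedom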
ChiSqTest - Class in org.apache.spark.mllib.stat.test
Conduct the chi-squared test for the input RDDs using the specified method.
ChiSqTest() - Constructor for class org.apache.spark.mllib.stat.test.ChiSqTest
 
ChiSqTest.Method - Class in org.apache.spark.mllib.stat.test
 
ChiSqTest.Method(String, Function2<Object, Object, Object>) - Constructor for class org.apache.spark.mllib.stat.test.ChiSqTest.Method
 
ChiSqTest.Method$ - Class in org.apache.spark.mllib.stat.test
 
ChiSqTest.Method$() - Constructor for class org.apache.spark.mllib.stat.test.ChiSqTest.Method$
 
ChiSqTest.NullHypothesis$ - Class in org.apache.spark.mllib.stat.test
 
ChiSqTest.NullHypothesis$() - Constructor for class org.apache.spark.mllib.stat.test.ChiSqTest.NullHypothesis$
 
ChiSqTestResult - Class in org.apache.spark.mllib.stat.test
:: Experimental :: Object containing the test results for the chi-squared hypothesis test.
ChiSqTestResult(double, int, double, String, String) - Constructor for class org.apache.spark.mllib.stat.test.ChiSqTestResult
 
chiSquared(Vector, Vector, String) - Static method in class org.apache.spark.mllib.stat.test.ChiSqTest
 
chiSquaredFeatures(RDD<LabeledPoint>, String) - Static method in class org.apache.spark.mllib.stat.test.ChiSqTest
Conduct Pearson's independence test for each feature against the label across the input RDD.
chiSquaredMatrix(Matrix, String) - Static method in class org.apache.spark.mllib.stat.test.ChiSqTest
 
chmod700(File) - Static method in class org.apache.spark.util.Utils
JDK equivalent of chmod 700 file.
classForName(String) - Static method in class org.apache.spark.util.Utils
Preferred alternative to Class.forName(className)
Classification() - Static method in class org.apache.spark.mllib.tree.configuration.Algo
 
ClassificationModel - Interface in org.apache.spark.mllib.classification
:: Experimental :: Represents a classification model that predicts to which of a set of categories an example belongs.
classIsLoadable(String) - Static method in class org.apache.spark.util.Utils
Determines whether the provided class is loadable in the current thread.
classLoader() - Method in class org.apache.spark.scheduler.cluster.mesos.MesosSchedulerBackend
 
className() - Method in class org.apache.spark.ExceptionFailure
 
classpathEntries() - Method in class org.apache.spark.ui.env.EnvironmentListener
 
classTag() - Method in class org.apache.spark.api.java.JavaDoubleRDD
 
classTag() - Method in class org.apache.spark.api.java.JavaPairRDD
 
classTag() - Method in class org.apache.spark.api.java.JavaRDD
 
classTag() - Method in interface org.apache.spark.api.java.JavaRDDLike
 
classTag() - Method in class org.apache.spark.sql.api.java.JavaSchemaRDD
 
classTag() - Method in class org.apache.spark.streaming.api.java.JavaDStream
 
classTag() - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
 
classTag() - Method in class org.apache.spark.streaming.api.java.JavaInputDStream
 
classTag() - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
 
classTag() - Method in class org.apache.spark.streaming.api.java.JavaReceiverInputDStream
 
clean(F, boolean) - Method in class org.apache.spark.SparkContext
Clean a closure to make it ready to be serialized and sent to tasks (removes unreferenced variables in $outer's, updates REPL variables). If checkSerializable is set, clean will also proactively check whether f is serializable and throw a SparkException if it is not.
clean(Object, boolean) - Static method in class org.apache.spark.util.ClosureCleaner
 
CleanBroadcast - Class in org.apache.spark
 
CleanBroadcast(long) - Constructor for class org.apache.spark.CleanBroadcast
 
cleaner() - Method in class org.apache.spark.SparkContext
 
CleanerListener - Interface in org.apache.spark
Listener class used for testing when any item has been cleaned by the Cleaner class.
CleanRDD - Class in org.apache.spark
 
CleanRDD(int) - Constructor for class org.apache.spark.CleanRDD
 
CleanShuffle - Class in org.apache.spark
 
CleanShuffle(int) - Constructor for class org.apache.spark.CleanShuffle
 
cleanup(long) - Method in class org.apache.spark.SparkContext
Called by MetadataCleaner to clean up the persistentRdds map periodically.
cleanup(Time) - Method in class org.apache.spark.streaming.dstream.DStreamCheckpointData
Clean up old checkpoint data.
cleanup(Time) - Method in class org.apache.spark.streaming.dstream.FileInputDStream.FileInputDStreamCheckpointData
 
cleanUpAfterSchedulerStop() - Method in class org.apache.spark.scheduler.DAGScheduler
 
cleanupOldBatches(Time, boolean) - Method in class org.apache.spark.streaming.scheduler.ReceivedBlockTracker
Clean up block information of old batches.
cleanupOldBlocks(long) - Method in class org.apache.spark.streaming.receiver.BlockManagerBasedBlockHandler
 
CleanupOldBlocks - Class in org.apache.spark.streaming.receiver
 
CleanupOldBlocks(Time) - Constructor for class org.apache.spark.streaming.receiver.CleanupOldBlocks
 
cleanupOldBlocks(long) - Method in interface org.apache.spark.streaming.receiver.ReceivedBlockHandler
Clean up blocks older than the given threshold time.
cleanupOldBlocks(long) - Method in class org.apache.spark.streaming.receiver.WriteAheadLogBasedBlockHandler
 
cleanupOldBlocksAndBatches(Time) - Method in class org.apache.spark.streaming.scheduler.ReceiverTracker
Clean up the data and metadata of blocks and batches that are strictly older than the threshold time.
cleanupOldLogs(long, boolean) - Method in class org.apache.spark.streaming.util.WriteAheadLogManager
Delete the log files that are older than the threshold time.
CleanupTask - Interface in org.apache.spark
Classes that represent cleaning tasks.
CleanupTaskWeakReference - Class in org.apache.spark
A WeakReference associated with a CleanupTask.
CleanupTaskWeakReference(CleanupTask, Object, ReferenceQueue<Object>) - Constructor for class org.apache.spark.CleanupTaskWeakReference
 
clear() - Static method in class org.apache.spark.Accumulators
 
clear() - Method in interface org.apache.spark.sql.SQLConf
 
clear() - Method in class org.apache.spark.storage.BlockManagerInfo
 
clear() - Method in class org.apache.spark.storage.BlockStore
 
clear() - Method in class org.apache.spark.storage.MemoryStore
 
clear() - Method in class org.apache.spark.util.BoundedPriorityQueue
 
CLEAR_NULL_VALUES_INTERVAL() - Static method in class org.apache.spark.util.TimeStampedWeakValueHashMap
 
clearActiveContext() - Static method in class org.apache.spark.SparkContext
Clears the active SparkContext metadata.
clearCache() - Method in interface org.apache.spark.sql.CacheManager
 
clearCallSite() - Method in class org.apache.spark.api.java.JavaSparkContext
Pass-through to SparkContext.clearCallSite.
clearCallSite() - Method in class org.apache.spark.SparkContext
Clear the thread-local property for overriding the call sites of actions and RDDs.
clearCheckpointData(Time) - Method in class org.apache.spark.streaming.dstream.DStream
 
clearCheckpointData(Time) - Method in class org.apache.spark.streaming.DStreamGraph
 
ClearCheckpointData - Class in org.apache.spark.streaming.scheduler
 
ClearCheckpointData(Time) - Constructor for class org.apache.spark.streaming.scheduler.ClearCheckpointData
 
clearDependencies() - Method in class org.apache.spark.rdd.CartesianRDD
 
clearDependencies() - Method in class org.apache.spark.rdd.CoalescedRDD
 
clearDependencies() - Method in class org.apache.spark.rdd.CoGroupedRDD
 
clearDependencies() - Method in class org.apache.spark.rdd.PartitionerAwareUnionRDD
 
clearDependencies() - Method in class org.apache.spark.rdd.ShuffledRDD
 
clearDependencies() - Method in class org.apache.spark.rdd.SubtractedRDD
 
clearDependencies() - Method in class org.apache.spark.rdd.UnionRDD
 
clearDependencies() - Method in class org.apache.spark.rdd.ZippedPartitionsBaseRDD
 
clearDependencies() - Method in class org.apache.spark.rdd.ZippedPartitionsRDD2
 
clearDependencies() - Method in class org.apache.spark.rdd.ZippedPartitionsRDD3
 
clearDependencies() - Method in class org.apache.spark.rdd.ZippedPartitionsRDD4
 
clearFiles() - Method in class org.apache.spark.api.java.JavaSparkContext
Clear the job's list of files added by addFile so that they do not get downloaded to any new nodes.
clearFiles() - Method in class org.apache.spark.SparkContext
Clear the job's list of files added by addFile so that they do not get downloaded to any new nodes.
clearJars() - Method in class org.apache.spark.api.java.JavaSparkContext
Clear the job's list of JARs added by addJar so that they do not get downloaded to any new nodes.
clearJars() - Method in class org.apache.spark.SparkContext
Clear the job's list of JARs added by addJar so that they do not get downloaded to any new nodes.
clearJobGroup() - Method in class org.apache.spark.api.java.JavaSparkContext
Clear the current thread's job group ID and its description.
clearJobGroup() - Method in class org.apache.spark.SparkContext
Clear the current thread's job group ID and its description.
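For example (a minimal sketch, assuming an existing SparkContext sc; the group id and description are made up):

    sc.setJobGroup("nightly-etl", "nightly ETL jobs", interruptOnCancel = true)
    // ... run actions that should belong to the group ...
    sc.clearJobGroup() // jobs submitted from this thread are no longer grouped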
clearMetadata(Time) - Method in class org.apache.spark.streaming.dstream.DStream
Clear metadata that is older than the rememberDuration of this DStream.
clearMetadata(Time) - Method in class org.apache.spark.streaming.DStreamGraph
 
ClearMetadata - Class in org.apache.spark.streaming.scheduler
 
ClearMetadata(Time) - Constructor for class org.apache.spark.streaming.scheduler.ClearMetadata
 
clearNullValues() - Method in class org.apache.spark.util.TimeStampedWeakValueHashMap
Remove entries with values that are no longer strongly reachable.
clearOldValues(long, Function2<A, B, BoxedUnit>) - Method in class org.apache.spark.util.TimeStampedHashMap
 
clearOldValues(long) - Method in class org.apache.spark.util.TimeStampedHashMap
Removes old key-value pairs that have a timestamp earlier than `threshTime`.
clearOldValues(long) - Method in class org.apache.spark.util.TimeStampedHashSet
Removes old values that have a timestamp earlier than `threshTime`.
clearOldValues(long) - Method in class org.apache.spark.util.TimeStampedWeakValueHashMap
Remove old key-value pairs with timestamps earlier than `threshTime`.
clearThreshold() - Method in class org.apache.spark.mllib.classification.LogisticRegressionModel
:: Experimental :: Clears the threshold so that predict will output raw prediction scores.
clearThreshold() - Method in class org.apache.spark.mllib.classification.SVMModel
:: Experimental :: Clears the threshold so that predict will output raw prediction scores.
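For example, a minimal sketch that switches an SVM model from class labels to raw scores (assumes a trained SVMModel model and an RDD[LabeledPoint] test):

    model.clearThreshold() // predict now returns raw margins instead of 0/1 labels
    val scoreAndLabel = test.map(p => (model.predict(p.features), p.label))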
client() - Method in class org.apache.spark.scheduler.cluster.SparkDeploySchedulerBackend
 
client() - Method in class org.apache.spark.storage.TachyonBlockManager
 
client() - Method in class org.apache.spark.streaming.flume.FlumeConnection
 
Clock - Interface in org.apache.spark
An abstract clock for measuring elapsed time.
clock() - Method in class org.apache.spark.streaming.scheduler.JobGenerator
 
clock() - Method in class org.apache.spark.streaming.scheduler.JobScheduler
 
Clock - Interface in org.apache.spark.streaming.util
 
Clock - Interface in org.apache.spark.util
An interface to represent clocks, so that they can be mocked out in unit tests.
clone() - Method in class org.apache.spark.mllib.evaluation.binary.BinaryLabelCounter
 
clone() - Method in class org.apache.spark.SparkConf
Copy this object.
clone(JvmType) - Method in class org.apache.spark.sql.columnar.ColumnType
Creates a duplicate of the value.
clone() - Method in class org.apache.spark.storage.StorageLevel
 
clone() - Method in class org.apache.spark.util.random.BernoulliCellSampler
 
clone() - Method in class org.apache.spark.util.random.BernoulliSampler
 
clone() - Method in class org.apache.spark.util.random.PoissonSampler
 
clone() - Method in interface org.apache.spark.util.random.RandomSampler
Return a copy of the RandomSampler object.
clone(T, SerializerInstance, ClassTag<T>) - Static method in class org.apache.spark.util.Utils
Clone an object using a Spark serializer.
cloneComplement() - Method in class org.apache.spark.util.random.BernoulliCellSampler
Return a sampler whose range is the complement of the range specified for the current sampler.
close() - Method in class org.apache.spark.api.java.JavaSparkContext
 
close() - Method in class org.apache.spark.input.FixedLengthBinaryRecordReader
 
close() - Method in class org.apache.spark.input.PortableDataStream
Close the file (if it is currently open)
close() - Method in class org.apache.spark.input.StreamBasedRecordReader
 
close() - Method in class org.apache.spark.input.WholeTextFileRecordReader
 
close() - Method in class org.apache.spark.serializer.DeserializationStream
 
close() - Method in class org.apache.spark.serializer.JavaDeserializationStream
 
close() - Method in class org.apache.spark.serializer.JavaSerializationStream
 
close() - Method in class org.apache.spark.serializer.KryoDeserializationStream
 
close() - Method in class org.apache.spark.serializer.KryoSerializationStream
 
close() - Method in class org.apache.spark.serializer.SerializationStream
 
close() - Method in class org.apache.spark.SparkHadoopWriter
 
close() - Method in class org.apache.spark.sql.hive.SparkHiveDynamicPartitionWriterContainer
 
close() - Method in class org.apache.spark.sql.hive.SparkHiveWriterContainer
 
close() - Method in class org.apache.spark.storage.BlockObjectWriter
 
close() - Method in class org.apache.spark.storage.DiskBlockObjectWriter
 
close() - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
 
close() - Method in class org.apache.spark.streaming.util.RateLimitedOutputStream
 
close() - Method in class org.apache.spark.streaming.util.WriteAheadLogRandomReader
 
close() - Method in class org.apache.spark.streaming.util.WriteAheadLogReader
 
close() - Method in class org.apache.spark.streaming.util.WriteAheadLogWriter
 
close() - Method in class org.apache.spark.util.FileLogger
Close the writer.
closeIfNeeded() - Method in class org.apache.spark.util.NextIterator
Calls the subclass-defined close method, but only once.
ClosureCleaner - Class in org.apache.spark.util
 
ClosureCleaner() - Constructor for class org.apache.spark.util.ClosureCleaner
 
closureSerializer() - Method in class org.apache.spark.SparkEnv
 
clusterCenters() - Method in class org.apache.spark.mllib.clustering.KMeansModel
 
clusterCenters() - Method in class org.apache.spark.mllib.clustering.StreamingKMeansModel
 
clusterWeights() - Method in class org.apache.spark.mllib.clustering.StreamingKMeansModel
 
cmd() - Method in class org.apache.spark.sql.execution.ExecutedCommand
 
cn() - Method in class org.apache.spark.mllib.feature.VocabWord
 
coalesce(int) - Method in class org.apache.spark.api.java.JavaDoubleRDD
Return a new RDD that is reduced into numPartitions partitions.
coalesce(int, boolean) - Method in class org.apache.spark.api.java.JavaDoubleRDD
Return a new RDD that is reduced into numPartitions partitions.
coalesce(int) - Method in class org.apache.spark.api.java.JavaPairRDD
Return a new RDD that is reduced into numPartitions partitions.
coalesce(int, boolean) - Method in class org.apache.spark.api.java.JavaPairRDD
Return a new RDD that is reduced into numPartitions partitions.
coalesce(int) - Method in class org.apache.spark.api.java.JavaRDD
Return a new RDD that is reduced into numPartitions partitions.
coalesce(int, boolean) - Method in class org.apache.spark.api.java.JavaRDD
Return a new RDD that is reduced into numPartitions partitions.
coalesce(int, boolean, Ordering<T>) - Method in class org.apache.spark.rdd.RDD
Return a new RDD that is reduced into numPartitions partitions.
coalesce(int, boolean) - Method in class org.apache.spark.sql.api.java.JavaSchemaRDD
Return a new RDD that is reduced into numPartitions partitions.
coalesce(int, boolean, Ordering<Row>) - Method in class org.apache.spark.sql.SchemaRDD
 
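For example (a minimal sketch, assuming an existing SparkContext sc and a hypothetical input path):

    val rdd = sc.textFile("hdfs://namenode:8020/logs", 1000)
    val fewer = rdd.coalesce(100)                     // narrow dependency, no shuffle
    val balanced = rdd.coalesce(100, shuffle = true)  // shuffles to rebalance the data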
CoalescedRDD<T> - Class in org.apache.spark.rdd
Represents a coalesced RDD that has fewer partitions than its parent RDD. This class uses the PartitionCoalescer class to find a good partitioning of the parent RDD so that each new partition has roughly the same number of parent partitions and the preferred location of each new partition overlaps with as many preferred locations of its parent partitions as possible.
CoalescedRDD(RDD<T>, int, double, ClassTag<T>) - Constructor for class org.apache.spark.rdd.CoalescedRDD
 
CoalescedRDDPartition - Class in org.apache.spark.rdd
Class that captures a coalesced RDD by essentially keeping track of its parent partitions.
CoalescedRDDPartition(int, RDD<?>, int[], Option<String>) - Constructor for class org.apache.spark.rdd.CoalescedRDDPartition
 
CoarseGrainedClusterMessage - Interface in org.apache.spark.scheduler.cluster
 
CoarseGrainedClusterMessages - Class in org.apache.spark.scheduler.cluster
 
CoarseGrainedClusterMessages() - Constructor for class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages
 
CoarseGrainedClusterMessages.AddWebUIFilter - Class in org.apache.spark.scheduler.cluster
 
CoarseGrainedClusterMessages.AddWebUIFilter(String, Map<String, String>, String) - Constructor for class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.AddWebUIFilter
 
CoarseGrainedClusterMessages.AddWebUIFilter$ - Class in org.apache.spark.scheduler.cluster
 
CoarseGrainedClusterMessages.AddWebUIFilter$() - Constructor for class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.AddWebUIFilter$
 
CoarseGrainedClusterMessages.KillExecutors - Class in org.apache.spark.scheduler.cluster
 
CoarseGrainedClusterMessages.KillExecutors(Seq<String>) - Constructor for class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.KillExecutors
 
CoarseGrainedClusterMessages.KillExecutors$ - Class in org.apache.spark.scheduler.cluster
 
CoarseGrainedClusterMessages.KillExecutors$() - Constructor for class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.KillExecutors$
 
CoarseGrainedClusterMessages.KillTask - Class in org.apache.spark.scheduler.cluster
 
CoarseGrainedClusterMessages.KillTask(long, String, boolean) - Constructor for class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.KillTask
 
CoarseGrainedClusterMessages.KillTask$ - Class in org.apache.spark.scheduler.cluster
 
CoarseGrainedClusterMessages.KillTask$() - Constructor for class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.KillTask$
 
CoarseGrainedClusterMessages.LaunchTask - Class in org.apache.spark.scheduler.cluster
 
CoarseGrainedClusterMessages.LaunchTask(SerializableBuffer) - Constructor for class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.LaunchTask
 
CoarseGrainedClusterMessages.LaunchTask$ - Class in org.apache.spark.scheduler.cluster
 
CoarseGrainedClusterMessages.LaunchTask$() - Constructor for class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.LaunchTask$
 
CoarseGrainedClusterMessages.RegisterClusterManager$ - Class in org.apache.spark.scheduler.cluster
 
CoarseGrainedClusterMessages.RegisterClusterManager$() - Constructor for class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.RegisterClusterManager$
 
CoarseGrainedClusterMessages.RegisteredExecutor$ - Class in org.apache.spark.scheduler.cluster
 
CoarseGrainedClusterMessages.RegisteredExecutor$() - Constructor for class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.RegisteredExecutor$
 
CoarseGrainedClusterMessages.RegisterExecutor - Class in org.apache.spark.scheduler.cluster
 
CoarseGrainedClusterMessages.RegisterExecutor(String, String, int) - Constructor for class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.RegisterExecutor
 
CoarseGrainedClusterMessages.RegisterExecutor$ - Class in org.apache.spark.scheduler.cluster
 
CoarseGrainedClusterMessages.RegisterExecutor$() - Constructor for class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.RegisterExecutor$
 
CoarseGrainedClusterMessages.RegisterExecutorFailed - Class in org.apache.spark.scheduler.cluster
 
CoarseGrainedClusterMessages.RegisterExecutorFailed(String) - Constructor for class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.RegisterExecutorFailed
 
CoarseGrainedClusterMessages.RegisterExecutorFailed$ - Class in org.apache.spark.scheduler.cluster
 
CoarseGrainedClusterMessages.RegisterExecutorFailed$() - Constructor for class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.RegisterExecutorFailed$
 
CoarseGrainedClusterMessages.RemoveExecutor - Class in org.apache.spark.scheduler.cluster
 
CoarseGrainedClusterMessages.RemoveExecutor(String, String) - Constructor for class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.RemoveExecutor
 
CoarseGrainedClusterMessages.RemoveExecutor$ - Class in org.apache.spark.scheduler.cluster
 
CoarseGrainedClusterMessages.RemoveExecutor$() - Constructor for class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.RemoveExecutor$
 
CoarseGrainedClusterMessages.RequestExecutors - Class in org.apache.spark.scheduler.cluster
 
CoarseGrainedClusterMessages.RequestExecutors(int) - Constructor for class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.RequestExecutors
 
CoarseGrainedClusterMessages.RequestExecutors$ - Class in org.apache.spark.scheduler.cluster
 
CoarseGrainedClusterMessages.RequestExecutors$() - Constructor for class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.RequestExecutors$
 
CoarseGrainedClusterMessages.RetrieveSparkProps$ - Class in org.apache.spark.scheduler.cluster
 
CoarseGrainedClusterMessages.RetrieveSparkProps$() - Constructor for class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.RetrieveSparkProps$
 
CoarseGrainedClusterMessages.ReviveOffers$ - Class in org.apache.spark.scheduler.cluster
 
CoarseGrainedClusterMessages.ReviveOffers$() - Constructor for class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.ReviveOffers$
 
CoarseGrainedClusterMessages.StatusUpdate - Class in org.apache.spark.scheduler.cluster
 
CoarseGrainedClusterMessages.StatusUpdate(String, long, Enumeration.Value, SerializableBuffer) - Constructor for class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.StatusUpdate
 
CoarseGrainedClusterMessages.StatusUpdate$ - Class in org.apache.spark.scheduler.cluster
 
CoarseGrainedClusterMessages.StatusUpdate$() - Constructor for class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.StatusUpdate$
 
CoarseGrainedClusterMessages.StopDriver$ - Class in org.apache.spark.scheduler.cluster
 
CoarseGrainedClusterMessages.StopDriver$() - Constructor for class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.StopDriver$
 
CoarseGrainedClusterMessages.StopExecutor$ - Class in org.apache.spark.scheduler.cluster
 
CoarseGrainedClusterMessages.StopExecutor$() - Constructor for class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.StopExecutor$
 
CoarseGrainedClusterMessages.StopExecutors$ - Class in org.apache.spark.scheduler.cluster
 
CoarseGrainedClusterMessages.StopExecutors$() - Constructor for class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.StopExecutors$
 
CoarseGrainedSchedulerBackend - Class in org.apache.spark.scheduler.cluster
A scheduler backend that waits for coarse-grained executors to connect to it through Akka.
CoarseGrainedSchedulerBackend(TaskSchedulerImpl, ActorSystem) - Constructor for class org.apache.spark.scheduler.cluster.CoarseGrainedSchedulerBackend
 
CoarseGrainedSchedulerBackend.DriverActor - Class in org.apache.spark.scheduler.cluster
 
CoarseGrainedSchedulerBackend.DriverActor(Seq<Tuple2<String, String>>) - Constructor for class org.apache.spark.scheduler.cluster.CoarseGrainedSchedulerBackend.DriverActor
 
CoarseMesosSchedulerBackend - Class in org.apache.spark.scheduler.cluster.mesos
A SchedulerBackend that runs tasks on Mesos, but uses "coarse-grained" tasks, where it holds onto each Mesos node for the duration of the Spark job instead of relinquishing cores whenever a task is done.
CoarseMesosSchedulerBackend(TaskSchedulerImpl, SparkContext, String) - Constructor for class org.apache.spark.scheduler.cluster.mesos.CoarseMesosSchedulerBackend
 
code() - Method in class org.apache.spark.mllib.feature.VocabWord
 
codegenEnabled() - Method in class org.apache.spark.sql.execution.SparkPlan
 
codegenEnabled() - Method in interface org.apache.spark.sql.SQLConf
When set to true, Spark SQL will use the Scala compiler at runtime to generate custom bytecode that evaluates expressions found in queries.
codeLen() - Method in class org.apache.spark.mllib.feature.VocabWord
 
cogroup(JavaPairRDD<K, W>, Partitioner) - Method in class org.apache.spark.api.java.JavaPairRDD
For each key k in this or other, return a resulting RDD that contains a tuple with the list of values for that key in this as well as other.
cogroup(JavaPairRDD<K, W1>, JavaPairRDD<K, W2>, Partitioner) - Method in class org.apache.spark.api.java.JavaPairRDD
For each key k in this or other1 or other2, return a resulting RDD that contains a tuple with the list of values for that key in this, other1 and other2.
cogroup(JavaPairRDD<K, W1>, JavaPairRDD<K, W2>, JavaPairRDD<K, W3>, Partitioner) - Method in class org.apache.spark.api.java.JavaPairRDD
For each key k in this or other1 or other2 or other3, return a resulting RDD that contains a tuple with the list of values for that key in this, other1, other2 and other3.
cogroup(JavaPairRDD<K, W>) - Method in class org.apache.spark.api.java.JavaPairRDD
For each key k in this or other, return a resulting RDD that contains a tuple with the list of values for that key in this as well as other.
cogroup(JavaPairRDD<K, W1>, JavaPairRDD<K, W2>) - Method in class org.apache.spark.api.java.JavaPairRDD
For each key k in this or other1 or other2, return a resulting RDD that contains a tuple with the list of values for that key in this, other1 and other2.
cogroup(JavaPairRDD<K, W1>, JavaPairRDD<K, W2>, JavaPairRDD<K, W3>) - Method in class org.apache.spark.api.java.JavaPairRDD
For each key k in this or other1 or other2 or other3, return a resulting RDD that contains a tuple with the list of values for that key in this, other1, other2 and other3.
cogroup(JavaPairRDD<K, W>, int) - Method in class org.apache.spark.api.java.JavaPairRDD
For each key k in this or other, return a resulting RDD that contains a tuple with the list of values for that key in this as well as other.
cogroup(JavaPairRDD<K, W1>, JavaPairRDD<K, W2>, int) - Method in class org.apache.spark.api.java.JavaPairRDD
For each key k in this or other1 or other2, return a resulting RDD that contains a tuple with the list of values for that key in this, other1 and other2.
cogroup(JavaPairRDD<K, W1>, JavaPairRDD<K, W2>, JavaPairRDD<K, W3>, int) - Method in class org.apache.spark.api.java.JavaPairRDD
For each key k in this or other1 or other2 or other3, return a resulting RDD that contains a tuple with the list of values for that key in this, other1, other2 and other3.
cogroup(RDD<Tuple2<K, W1>>, RDD<Tuple2<K, W2>>, RDD<Tuple2<K, W3>>, Partitioner) - Method in class org.apache.spark.rdd.PairRDDFunctions
For each key k in this or other1 or other2 or other3, return a resulting RDD that contains a tuple with the list of values for that key in this, other1, other2 and other3.
cogroup(RDD<Tuple2<K, W>>, Partitioner) - Method in class org.apache.spark.rdd.PairRDDFunctions
 
cogroup(RDD<Tuple2<K, W1>>, RDD<Tuple2<K, W2>>, Partitioner) - Method in class org.apache.spark.rdd.PairRDDFunctions
 
cogroup(RDD<Tuple2<K, W1>>, RDD<Tuple2<K, W2>>, RDD<Tuple2<K, W3>>) - Method in class org.apache.spark.rdd.PairRDDFunctions
 
cogroup(RDD<Tuple2<K, W>>) - Method in class org.apache.spark.rdd.PairRDDFunctions
For each key k in this or other, return a resulting RDD that contains a tuple with the list of values for that key in this as well as other.
cogroup(RDD<Tuple2<K, W1>>, RDD<Tuple2<K, W2>>) - Method in class org.apache.spark.rdd.PairRDDFunctions
For each key k in this or other1 or other2, return a resulting RDD that contains a tuple with the list of values for that key in this, other1 and other2.
cogroup(RDD<Tuple2<K, W>>, int) - Method in class org.apache.spark.rdd.PairRDDFunctions
For each key k in this or other, return a resulting RDD that contains a tuple with the list of values for that key in this as well as other.
cogroup(RDD<Tuple2<K, W1>>, RDD<Tuple2<K, W2>>, int) - Method in class org.apache.spark.rdd.PairRDDFunctions
For each key k in this or other1 or other2, return a resulting RDD that contains a tuple with the list of values for that key in this, other1 and other2.
cogroup(RDD<Tuple2<K, W1>>, RDD<Tuple2<K, W2>>, RDD<Tuple2<K, W3>>, int) - Method in class org.apache.spark.rdd.PairRDDFunctions
For each key k in this or other1 or other2 or other3, return a resulting RDD that contains a tuple with the list of values for that key in this, other1, other2 and other3.
cogroup(JavaPairDStream<K, W>) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
Return a new DStream by applying 'cogroup' between RDDs of this DStream and other DStream.
cogroup(JavaPairDStream<K, W>, int) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
Return a new DStream by applying 'cogroup' between RDDs of this DStream and other DStream.
cogroup(JavaPairDStream<K, W>, Partitioner) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
Return a new DStream by applying 'cogroup' between RDDs of this DStream and other DStream.
cogroup(DStream<Tuple2<K, W>>, ClassTag<W>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
Return a new DStream by applying 'cogroup' between RDDs of this DStream and other DStream.
cogroup(DStream<Tuple2<K, W>>, int, ClassTag<W>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
Return a new DStream by applying 'cogroup' between RDDs of this DStream and other DStream.
cogroup(DStream<Tuple2<K, W>>, Partitioner, ClassTag<W>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
Return a new DStream by applying 'cogroup' between RDDs of this DStream and other DStream.
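For example, a minimal sketch of cogroup on two pair RDDs (assumes an existing SparkContext sc):

    val a = sc.parallelize(Seq((1, "x"), (2, "y")))
    val b = sc.parallelize(Seq((1, 10), (1, 11)))
    val grouped = a.cogroup(b) // RDD[(Int, (Iterable[String], Iterable[Int]))]
    // yields (1, ([x], [10, 11])) and (2, ([y], []))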
CoGroupedRDD<K> - Class in org.apache.spark.rdd
:: DeveloperApi :: An RDD that cogroups its parents.
CoGroupedRDD(Seq<RDD<? extends Product2<K, ?>>>, Partitioner) - Constructor for class org.apache.spark.rdd.CoGroupedRDD
 
CoGroupPartition - Class in org.apache.spark.rdd
 
CoGroupPartition(int, CoGroupSplitDep[]) - Constructor for class org.apache.spark.rdd.CoGroupPartition
 
cogroupResult2ToJava(RDD<Tuple2<K, Tuple3<Iterable<V>, Iterable<W1>, Iterable<W2>>>>, ClassTag<K>) - Static method in class org.apache.spark.api.java.JavaPairRDD
 
cogroupResult3ToJava(RDD<Tuple2<K, Tuple4<Iterable<V>, Iterable<W1>, Iterable<W2>, Iterable<W3>>>>, ClassTag<K>) - Static method in class org.apache.spark.api.java.JavaPairRDD
 
cogroupResultToJava(RDD<Tuple2<K, Tuple2<Iterable<V>, Iterable<W>>>>, ClassTag<K>) - Static method in class org.apache.spark.api.java.JavaPairRDD
 
CoGroupSplitDep - Interface in org.apache.spark.rdd
 
collect() - Method in interface org.apache.spark.api.java.JavaRDDLike
Return an array that contains all of the elements in this RDD.
collect() - Method in class org.apache.spark.graphx.impl.EdgeRDDImpl
 
collect() - Method in class org.apache.spark.rdd.RDD
Return an array that contains all of the elements in this RDD.
collect(PartialFunction<T, U>, ClassTag<U>) - Method in class org.apache.spark.rdd.RDD
Return an RDD that contains all matching values by applying f.
collect() - Method in class org.apache.spark.sql.api.java.JavaSchemaRDD
 
collect() - Method in class org.apache.spark.sql.SchemaRDD
 
collectAsMap() - Method in class org.apache.spark.api.java.JavaPairRDD
Return the key-value pairs in this RDD to the master as a Map.
collectAsMap() - Method in class org.apache.spark.rdd.PairRDDFunctions
Return the key-value pairs in this RDD to the master as a Map.
collectAsync() - Method in interface org.apache.spark.api.java.JavaRDDLike
The asynchronous version of collect, which returns a future for retrieving an array containing all of the elements in this RDD.
collectAsync() - Method in class org.apache.spark.rdd.AsyncRDDActions
Returns a future for retrieving all elements of this RDD.
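For example, blocking versus asynchronous collection (a minimal sketch, assuming an existing SparkContext sc):

    import org.apache.spark.SparkContext._ // enables collectAsync via AsyncRDDActions
    import scala.concurrent.ExecutionContext.Implicits.global

    val rdd = sc.parallelize(1 to 10)
    val all: Array[Int] = rdd.collect() // blocks until the job finishes
    val future = rdd.collectAsync()     // FutureAction[Seq[Int]], returns immediately
    future.onComplete(result => println(result))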
collectEdges(EdgeDirection) - Method in class org.apache.spark.graphx.GraphOps
Returns an RDD that contains for each vertex v its local edges, i.e., the edges that are incident on v, in the user-specified direction.
collectedStatistics() - Method in class org.apache.spark.sql.columnar.BinaryColumnStats
 
collectedStatistics() - Method in class org.apache.spark.sql.columnar.BooleanColumnStats
 
collectedStatistics() - Method in class org.apache.spark.sql.columnar.ByteColumnStats
 
collectedStatistics() - Method in interface org.apache.spark.sql.columnar.ColumnStats
Column statistics represented as a single row, currently including closed lower bound, closed upper bound and null count.
collectedStatistics() - Method in class org.apache.spark.sql.columnar.DateColumnStats
 
collectedStatistics() - Method in class org.apache.spark.sql.columnar.DoubleColumnStats
 
collectedStatistics() - Method in class org.apache.spark.sql.columnar.FloatColumnStats
 
collectedStatistics() - Method in class org.apache.spark.sql.columnar.GenericColumnStats
 
collectedStatistics() - Method in class org.apache.spark.sql.columnar.IntColumnStats
 
collectedStatistics() - Method in class org.apache.spark.sql.columnar.LongColumnStats
 
collectedStatistics() - Method in class org.apache.spark.sql.columnar.NoopColumnStats
 
collectedStatistics() - Method in class org.apache.spark.sql.columnar.ShortColumnStats
 
collectedStatistics() - Method in class org.apache.spark.sql.columnar.StringColumnStats
 
collectedStatistics() - Method in class org.apache.spark.sql.columnar.TimestampColumnStats
 
CollectionsUtils - Class in org.apache.spark.util
 
CollectionsUtils() - Constructor for class org.apache.spark.util.CollectionsUtils
 
collectNeighborIds(EdgeDirection) - Method in class org.apache.spark.graphx.GraphOps
Collect the neighbor vertex ids for each vertex.
collectNeighbors(EdgeDirection) - Method in class org.apache.spark.graphx.GraphOps
Collect the neighbor vertex attributes for each vertex.
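For example (a minimal sketch, assuming an existing graph: Graph[VD, ED]):

    import org.apache.spark.graphx.EdgeDirection

    val neighborIds = graph.collectNeighborIds(EdgeDirection.Either) // VertexRDD[Array[VertexId]]
    val neighbors   = graph.collectNeighbors(EdgeDirection.Out)      // VertexRDD[Array[(VertexId, VD)]]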
collectPartitions(int[]) - Method in interface org.apache.spark.api.java.JavaRDDLike
Return an array that contains all of the elements in a specific partition of this RDD.
collectPartitions() - Method in class org.apache.spark.rdd.RDD
A private method for tests, to look at the contents of each partition.
colPtrs() - Method in class org.apache.spark.mllib.linalg.SparseMatrix
 
colStats(RDD<Vector>) - Static method in class org.apache.spark.mllib.stat.Statistics
:: Experimental :: Computes column-wise summary statistics for the input RDD[Vector].
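For example (a minimal sketch, assuming an existing SparkContext sc):

    import org.apache.spark.mllib.linalg.Vectors
    import org.apache.spark.mllib.stat.Statistics

    val rows = sc.parallelize(Seq(Vectors.dense(1.0, 2.0), Vectors.dense(3.0, 4.0)))
    val summary = Statistics.colStats(rows)
    println(summary.mean)     // column-wise means
    println(summary.variance) // column-wise variances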
ColumnAccessor - Interface in org.apache.spark.sql.columnar
An Iterator-like trait used to extract values from a columnar byte buffer.
columnBatchSize() - Method in interface org.apache.spark.sql.SQLConf
The number of rows that will be grouped together as a single batch when caching data in the in-memory columnar format.
ColumnBuilder - Interface in org.apache.spark.sql.columnar
 
columnNameOfCorruptRecord() - Method in interface org.apache.spark.sql.SQLConf
 
columnOrdinals() - Method in class org.apache.spark.sql.hive.MetastoreRelation
An attribute map for determining the ordinal for non-partition columns.
columnPruningPred() - Method in class org.apache.spark.sql.parquet.ParquetTableScan
 
columnSimilarities() - Method in class org.apache.spark.mllib.linalg.distributed.RowMatrix
Compute all cosine similarities between columns of this matrix using the brute-force approach of computing normalized dot products.
columnSimilarities(double) - Method in class org.apache.spark.mllib.linalg.distributed.RowMatrix
Compute similarities between columns of this matrix using a sampling approach.
columnSimilaritiesDIMSUM(double[], double) - Method in class org.apache.spark.mllib.linalg.distributed.RowMatrix
Find all similar columns using the DIMSUM sampling algorithm, described in two papers
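For example (a minimal sketch, assuming rows: RDD[Vector]):

    import org.apache.spark.mllib.linalg.distributed.RowMatrix

    val mat = new RowMatrix(rows)
    val exact  = mat.columnSimilarities()    // brute-force, all pairs
    val approx = mat.columnSimilarities(0.1) // DIMSUM sampling; the threshold trades accuracy for cost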
ColumnStatisticsSchema - Class in org.apache.spark.sql.columnar
 
ColumnStatisticsSchema(Attribute) - Constructor for class org.apache.spark.sql.columnar.ColumnStatisticsSchema
 
columnStats() - Method in class org.apache.spark.sql.columnar.BasicColumnBuilder
 
columnStats() - Method in interface org.apache.spark.sql.columnar.ColumnBuilder
Column statistics information
ColumnStats - Interface in org.apache.spark.sql.columnar
Used to collect statistical information when building in-memory columns.
columnStats() - Method in class org.apache.spark.sql.columnar.NativeColumnBuilder
 
columnType() - Method in class org.apache.spark.sql.columnar.BasicColumnBuilder
 
ColumnType<T extends org.apache.spark.sql.catalyst.types.DataType,JvmType> - Class in org.apache.spark.sql.columnar
An abstract class that represents the type of a column.
ColumnType(int, int) - Constructor for class org.apache.spark.sql.columnar.ColumnType
 
columnType() - Method in class org.apache.spark.sql.columnar.NativeColumnBuilder
 
combineByKey(Function<V, C>, Function2<C, V, C>, Function2<C, C, C>, Partitioner) - Method in class org.apache.spark.api.java.JavaPairRDD
Generic function to combine the elements for each key using a custom set of aggregation functions.
combineByKey(Function<V, C>, Function2<C, V, C>, Function2<C, C, C>, int) - Method in class org.apache.spark.api.java.JavaPairRDD
Simplified version of combineByKey that hash-partitions the output RDD.
combineByKey(Function<V, C>, Function2<C, V, C>, Function2<C, C, C>) - Method in class org.apache.spark.api.java.JavaPairRDD
Simplified version of combineByKey that hash-partitions the resulting RDD using the existing partitioner/parallelism level.
combineByKey(Function1<V, C>, Function2<C, V, C>, Function2<C, C, C>, Partitioner, boolean, Serializer) - Method in class org.apache.spark.rdd.PairRDDFunctions
Generic function to combine the elements for each key using a custom set of aggregation functions.
combineByKey(Function1<V, C>, Function2<C, V, C>, Function2<C, C, C>, int) - Method in class org.apache.spark.rdd.PairRDDFunctions
Simplified version of combineByKey that hash-partitions the output RDD.
combineByKey(Function1<V, C>, Function2<C, V, C>, Function2<C, C, C>) - Method in class org.apache.spark.rdd.PairRDDFunctions
 
combineByKey(Function<V, C>, Function2<C, V, C>, Function2<C, C, C>, Partitioner) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
Combine elements of each key in DStream's RDDs using custom function.
combineByKey(Function<V, C>, Function2<C, V, C>, Function2<C, C, C>, Partitioner, boolean) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
Combine elements of each key in DStream's RDDs using custom function.
combineByKey(Function1<V, C>, Function2<C, V, C>, Function2<C, C, C>, Partitioner, boolean, ClassTag<C>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
Combine elements of each key in DStream's RDDs using custom functions.
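For example, computing a per-key average with combineByKey (a minimal sketch, assuming an existing SparkContext sc):

    import org.apache.spark.SparkContext._ // enables pair RDD functions

    val pairs = sc.parallelize(Seq(("a", 1.0), ("a", 3.0), ("b", 4.0)))
    val avg = pairs.combineByKey(
      (v: Double) => (v, 1),                                             // createCombiner
      (acc: (Double, Int), v: Double) => (acc._1 + v, acc._2 + 1),       // mergeValue
      (a: (Double, Int), b: (Double, Int)) => (a._1 + b._1, a._2 + b._2) // mergeCombiners
    ).mapValues { case (sum, n) => sum / n }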
combineCombinersByKey(Iterator<Product2<K, C>>) - Method in class org.apache.spark.Aggregator
 
combineCombinersByKey(Iterator<Product2<K, C>>, TaskContext) - Method in class org.apache.spark.Aggregator
 
combineValuesByKey(Iterator<Product2<K, V>>) - Method in class org.apache.spark.Aggregator
 
combineValuesByKey(Iterator<Product2<K, V>>, TaskContext) - Method in class org.apache.spark.Aggregator
 
Command - Interface in org.apache.spark.sql.execution
 
command() - Method in class org.apache.spark.sql.execution.PythonUDF
 
commands() - Method in class org.apache.spark.sql.hive.test.TestHiveContext.TestTable
 
commit() - Method in class org.apache.spark.SparkHadoopWriter
 
commitAndClose() - Method in class org.apache.spark.storage.BlockObjectWriter
Flush the partial writes and commit them as a single atomic block.
commitAndClose() - Method in class org.apache.spark.storage.DiskBlockObjectWriter
 
commitJob() - Method in class org.apache.spark.SparkHadoopWriter
 
commitJob() - Method in class org.apache.spark.sql.hive.SparkHiveDynamicPartitionWriterContainer
 
commitJob() - Method in class org.apache.spark.sql.hive.SparkHiveWriterContainer
 
commonHeaderNodes() - Static method in class org.apache.spark.ui.UIUtils
 
comparator(Schedulable, Schedulable) - Method in class org.apache.spark.scheduler.FairSchedulingAlgorithm
 
comparator(Schedulable, Schedulable) - Method in class org.apache.spark.scheduler.FIFOSchedulingAlgorithm
 
comparator(Schedulable, Schedulable) - Method in interface org.apache.spark.scheduler.SchedulingAlgorithm
 
compare(PartitionGroup, PartitionGroup) - Method in class org.apache.spark.rdd.PartitionCoalescer
 
compare(Option<PartitionGroup>, Option<PartitionGroup>) - Method in class org.apache.spark.rdd.PartitionCoalescer
 
compare(RDDInfo) - Method in class org.apache.spark.storage.RDDInfo
 
compatibilityBlackList() - Static method in class org.apache.spark.sql.hive.HiveShim
 
compatibleType(DataType, DataType) - Static method in class org.apache.spark.sql.json.JsonRDD
Returns the most general data type for two given data types.
completedIndices() - Method in class org.apache.spark.ui.jobs.UIData.StageUIData
 
completedJobs() - Method in class org.apache.spark.ui.jobs.JobProgressListener
 
completedStageIndices() - Method in class org.apache.spark.ui.jobs.UIData.JobUIData
 
completedStages() - Method in class org.apache.spark.ui.jobs.JobProgressListener
 
completedTasks() - Method in class org.apache.spark.ui.exec.ExecutorSummaryInfo
 
completion() - Method in class org.apache.spark.util.CompletionIterator
 
CompletionEvent - Class in org.apache.spark.scheduler
 
CompletionEvent(Task<?>, TaskEndReason, Object, Map<Object, Object>, TaskInfo, TaskMetrics) - Constructor for class org.apache.spark.scheduler.CompletionEvent
 
CompletionIterator<A,I extends scala.collection.Iterator<A>> - Class in org.apache.spark.util
Wrapper around an iterator which calls a completion method after it successfully iterates through all the elements.
CompletionIterator(I) - Constructor for class org.apache.spark.util.CompletionIterator
 
completionTime() - Method in class org.apache.spark.scheduler.StageInfo
Time when all tasks in the stage completed or when the stage was cancelled.
ComplexColumnBuilder<T extends org.apache.spark.sql.catalyst.types.DataType,JvmType> - Class in org.apache.spark.sql.columnar
 
ComplexColumnBuilder(ColumnStats, ColumnType<T, JvmType>) - Constructor for class org.apache.spark.sql.columnar.ComplexColumnBuilder
 
ComplexFutureAction<T> - Class in org.apache.spark
A FutureAction for actions that could trigger multiple Spark jobs.
ComplexFutureAction() - Constructor for class org.apache.spark.ComplexFutureAction
 
compress(ByteBuffer, ByteBuffer) - Method in class org.apache.spark.sql.columnar.compression.BooleanBitSet.Encoder
 
compress(ByteBuffer, ByteBuffer) - Method in class org.apache.spark.sql.columnar.compression.DictionaryEncoding.Encoder
 
compress(ByteBuffer, ByteBuffer) - Method in interface org.apache.spark.sql.columnar.compression.Encoder
 
compress(ByteBuffer, ByteBuffer) - Method in class org.apache.spark.sql.columnar.compression.IntDelta.Encoder
 
compress(ByteBuffer, ByteBuffer) - Method in class org.apache.spark.sql.columnar.compression.LongDelta.Encoder
 
compress(ByteBuffer, ByteBuffer) - Method in class org.apache.spark.sql.columnar.compression.PassThrough.Encoder
 
compress(ByteBuffer, ByteBuffer) - Method in class org.apache.spark.sql.columnar.compression.RunLengthEncoding.Encoder
 
compressCodec() - Method in class org.apache.spark.sql.hive.ShimFileSinkDesc
 
compressed() - Method in class org.apache.spark.sql.hive.ShimFileSinkDesc
 
compressedInputStream(InputStream) - Method in interface org.apache.spark.io.CompressionCodec
 
compressedInputStream(InputStream) - Method in class org.apache.spark.io.LZ4CompressionCodec
 
compressedInputStream(InputStream) - Method in class org.apache.spark.io.LZFCompressionCodec
 
compressedInputStream(InputStream) - Method in class org.apache.spark.io.SnappyCompressionCodec
 
CompressedMapStatus - Class in org.apache.spark.scheduler
A MapStatus implementation that tracks the size of each block.
CompressedMapStatus(BlockManagerId, byte[]) - Constructor for class org.apache.spark.scheduler.CompressedMapStatus
 
CompressedMapStatus(BlockManagerId, long[]) - Constructor for class org.apache.spark.scheduler.CompressedMapStatus
 
compressedOutputStream(OutputStream) - Method in interface org.apache.spark.io.CompressionCodec
 
compressedOutputStream(OutputStream) - Method in class org.apache.spark.io.LZ4CompressionCodec
 
compressedOutputStream(OutputStream) - Method in class org.apache.spark.io.LZFCompressionCodec
 
compressedOutputStream(OutputStream) - Method in class org.apache.spark.io.SnappyCompressionCodec
 
compressedSize() - Method in class org.apache.spark.sql.columnar.compression.BooleanBitSet.Encoder
 
compressedSize() - Method in class org.apache.spark.sql.columnar.compression.DictionaryEncoding.Encoder
 
compressedSize() - Method in interface org.apache.spark.sql.columnar.compression.Encoder
 
compressedSize() - Method in class org.apache.spark.sql.columnar.compression.IntDelta.Encoder
 
compressedSize() - Method in class org.apache.spark.sql.columnar.compression.LongDelta.Encoder
 
compressedSize() - Method in class org.apache.spark.sql.columnar.compression.PassThrough.Encoder
 
compressedSize() - Method in class org.apache.spark.sql.columnar.compression.RunLengthEncoding.Encoder
 
CompressibleColumnAccessor<T extends org.apache.spark.sql.catalyst.types.NativeType> - Interface in org.apache.spark.sql.columnar.compression
 
CompressibleColumnBuilder<T extends org.apache.spark.sql.catalyst.types.NativeType> - Interface in org.apache.spark.sql.columnar.compression
A stackable trait that builds an optionally compressed byte buffer for a column.
COMPRESSION_CODEC_PREFIX() - Static method in class org.apache.spark.scheduler.EventLoggingListener
 
CompressionCodec - Interface in org.apache.spark.io
:: DeveloperApi :: CompressionCodec allows the customization of the compression implementation used in block storage.
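The codec is normally selected through configuration rather than instantiated directly; a minimal sketch (assuming the short codec names supported by recent Spark versions):

    import org.apache.spark.SparkConf

    // "lz4", "lzf", and "snappy" name the built-in codecs; a fully qualified
    // class name of a CompressionCodec implementation also works
    val conf = new SparkConf().set("spark.io.compression.codec", "lz4")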
compressionCodec() - Method in class org.apache.spark.scheduler.EventLoggingInfo
 
compressionCodec() - Method in class org.apache.spark.streaming.CheckpointWriter
 
compressionEncoders() - Method in interface org.apache.spark.sql.columnar.compression.CompressibleColumnBuilder
 
compressionRatio() - Method in interface org.apache.spark.sql.columnar.compression.Encoder
 
CompressionScheme - Interface in org.apache.spark.sql.columnar.compression
 
compressType() - Method in class org.apache.spark.sql.hive.ShimFileSinkDesc
 
compute(Partition, TaskContext) - Method in class org.apache.spark.graphx.EdgeRDD
 
compute(Partition, TaskContext) - Method in class org.apache.spark.graphx.VertexRDD
Provides the RDD[(VertexId, VD)] equivalent output.
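compute(Partition, TaskContext) is the method each RDD subclass overrides to produce the elements of a partition; a minimal, purely illustrative sketch of a custom RDD:

    import org.apache.spark.{Partition, SparkContext, TaskContext}
    import org.apache.spark.rdd.RDD

    class ConstantRDD(sc: SparkContext, numParts: Int, value: Int)
      extends RDD[Int](sc, Nil) {
      override def getPartitions: Array[Partition] =
        Array.tabulate[Partition](numParts)(i => new Partition { override def index: Int = i })
      override def compute(split: Partition, context: TaskContext): Iterator[Int] =
        Iterator.fill(5)(value) // each partition yields five copies of the value
    }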
compute(Vector, double, Vector) - Method in class org.apache.spark.mllib.optimization.Gradient
Compute the gradient and loss given the features of a single data point.
compute(Vector, double, Vector, Vector) - Method in class org.apache.spark.mllib.optimization.Gradient
Compute the gradient and loss given the features of a single data point, add the gradient to a provided vector to avoid creating new objects, and return loss.
compute(Vector, double, Vector) - Method in class org.apache.spark.mllib.optimization.HingeGradient
 
compute(Vector, double, Vector, Vector) - Method in class org.apache.spark.mllib.optimization.HingeGradient
 
compute(Vector, Vector, double, int, double) - Method in class org.apache.spark.mllib.optimization.L1Updater
 
compute(Vector, double, Vector) - Method in class org.apache.spark.mllib.optimization.LeastSquaresGradient
 
compute(Vector, double, Vector, Vector) - Method in class org.apache.spark.mllib.optimization.LeastSquaresGradient
 
compute(Vector, double, Vector) - Method in class org.apache.spark.mllib.optimization.LogisticGradient
 
compute(Vector, double, Vector, Vector) - Method in class org.apache.spark.mllib.optimization.LogisticGradient
 
compute(Vector, Vector, double, int, double) - Method in class org.apache.spark.mllib.optimization.SimpleUpdater
 
compute(Vector, Vector, double, int, double) - Method in class org.apache.spark.mllib.optimization.SquaredL2Updater
 
compute(Vector, Vector, double, int, double) - Method in class org.apache.spark.mllib.optimization.Updater
Compute an updated value for weights given the gradient, stepSize, iteration number and regularization parameter.
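As a hedged sketch of this contract (SimpleUpdater applies plain gradient descent with no regularization; the numbers are illustrative):

    import org.apache.spark.mllib.linalg.Vectors
    import org.apache.spark.mllib.optimization.SimpleUpdater

    val updater = new SimpleUpdater()
    val weights = Vectors.dense(1.0, 2.0)
    val gradient = Vectors.dense(0.5, -0.5)
    // returns (newWeights, regularization value); SimpleUpdater scales the step by 1 / sqrt(iter)
    val (newWeights, reg) = updater.compute(weights, gradient, 1.0, 1, 0.0)
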
compute(Partition, TaskContext) - Method in class org.apache.spark.mllib.rdd.RandomRDD
 
compute(Partition, TaskContext) - Method in class org.apache.spark.mllib.rdd.RandomVectorRDD
 
compute(Partition, TaskContext) - Method in class org.apache.spark.mllib.rdd.SlidingRDD
 
compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.BlockRDD
 
compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.CartesianRDD
 
compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.CheckpointRDD
 
compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.CoalescedRDD
 
compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.CoGroupedRDD
 
compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.EmptyRDD
 
compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.FilteredRDD
 
compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.FlatMappedRDD
 
compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.FlatMappedValuesRDD
 
compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.GlommedRDD
 
compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.HadoopRDD
 
compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.HadoopRDD.HadoopMapPartitionsWithSplitRDD
 
compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.JdbcRDD
 
compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.MapPartitionsRDD
 
compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.MappedRDD
 
compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.MappedValuesRDD
 
compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.NewHadoopRDD
 
compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.NewHadoopRDD.NewHadoopMapPartitionsWithSplitRDD
 
compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.ParallelCollectionRDD
 
compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.PartitionerAwareUnionRDD
 
compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.PartitionPruningRDD
 
compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.PartitionwiseSampledRDD
 
compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.PipedRDD
 
compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.RDD
:: DeveloperApi :: Implemented by subclasses to compute a given partition.
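A minimal custom RDD illustrating the contract, assuming only getPartitions and compute need overriding (RangeRDD and RangePartition are hypothetical names):

    import org.apache.spark.{Partition, SparkContext, TaskContext}
    import org.apache.spark.rdd.RDD

    class RangePartition(val index: Int, val start: Int, val end: Int) extends Partition

    class RangeRDD(sc: SparkContext, n: Int, slices: Int) extends RDD[Int](sc, Nil) {
      override protected def getPartitions: Array[Partition] =
        Array.tabulate(slices)(i => new RangePartition(i, i * n / slices, (i + 1) * n / slices))
      // called once per partition on the executors to produce that partition's data
      override def compute(split: Partition, context: TaskContext): Iterator[Int] = {
        val p = split.asInstanceOf[RangePartition]
        (p.start until p.end).iterator
      }
    }
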
compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.SampledRDD
 
compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.ShuffledRDD
 
compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.SubtractedRDD
 
compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.UnionRDD
 
compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.ZippedPartitionsRDD2
 
compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.ZippedPartitionsRDD3
 
compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.ZippedPartitionsRDD4
 
compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.ZippedWithIndexRDD
 
compute(Partition, TaskContext) - Method in class org.apache.spark.sql.SchemaRDD
 
compute(Time) - Method in class org.apache.spark.streaming.api.java.JavaDStream
Generate an RDD for the given duration
compute(Time) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
Method that generates an RDD for the given Duration.
compute(Time) - Method in class org.apache.spark.streaming.dstream.ConstantInputDStream
 
compute(Time) - Method in class org.apache.spark.streaming.dstream.DStream
Method that generates an RDD for the given time.
compute(Time) - Method in class org.apache.spark.streaming.dstream.FileInputDStream
Finds the files that were modified since the last time this method was called and makes a union RDD out of them.
compute(Time) - Method in class org.apache.spark.streaming.dstream.FilteredDStream
 
compute(Time) - Method in class org.apache.spark.streaming.dstream.FlatMappedDStream
 
compute(Time) - Method in class org.apache.spark.streaming.dstream.FlatMapValuedDStream
 
compute(Time) - Method in class org.apache.spark.streaming.dstream.ForEachDStream
 
compute(Time) - Method in class org.apache.spark.streaming.dstream.GlommedDStream
 
compute(Time) - Method in class org.apache.spark.streaming.dstream.MapPartitionedDStream
 
compute(Time) - Method in class org.apache.spark.streaming.dstream.MappedDStream
 
compute(Time) - Method in class org.apache.spark.streaming.dstream.MapValuedDStream
 
compute(Time) - Method in class org.apache.spark.streaming.dstream.QueueInputDStream
 
compute(Time) - Method in class org.apache.spark.streaming.dstream.ReceiverInputDStream
Generates RDDs with blocks received by the receiver of this stream.
compute(Time) - Method in class org.apache.spark.streaming.dstream.ReducedWindowedDStream
 
compute(Time) - Method in class org.apache.spark.streaming.dstream.ShuffledDStream
 
compute(Time) - Method in class org.apache.spark.streaming.dstream.StateDStream
 
compute(Time) - Method in class org.apache.spark.streaming.dstream.TransformedDStream
 
compute(Time) - Method in class org.apache.spark.streaming.dstream.UnionDStream
 
compute(Time) - Method in class org.apache.spark.streaming.dstream.WindowedDStream
 
compute(Partition, TaskContext) - Method in class org.apache.spark.streaming.rdd.WriteAheadLogBackedBlockRDD
Gets the partition data by getting the corresponding block from the block manager.
computeColumnSummaryStatistics() - Method in class org.apache.spark.mllib.linalg.distributed.RowMatrix
Computes column-wise summary statistics.
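A short sketch (assuming an existing SparkContext sc; the data values are illustrative):

    import org.apache.spark.mllib.linalg.Vectors
    import org.apache.spark.mllib.linalg.distributed.RowMatrix

    val rows = sc.parallelize(Seq(Vectors.dense(1.0, 2.0), Vectors.dense(3.0, 4.0)))
    val summary = new RowMatrix(rows).computeColumnSummaryStatistics()
    summary.mean      // per-column means
    summary.variance  // per-column variances
    summary.count     // number of rows
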
computeCorrelation(RDD<Object>, RDD<Object>) - Method in interface org.apache.spark.mllib.stat.correlation.Correlation
Compute correlation for two datasets.
computeCorrelation(RDD<Object>, RDD<Object>) - Static method in class org.apache.spark.mllib.stat.correlation.PearsonCorrelation
Compute the Pearson correlation for two datasets.
computeCorrelation(RDD<Object>, RDD<Object>) - Static method in class org.apache.spark.mllib.stat.correlation.SpearmanCorrelation
Compute Spearman's correlation for two datasets.
computeCorrelationMatrix(RDD<Vector>) - Method in interface org.apache.spark.mllib.stat.correlation.Correlation
Compute the correlation matrix S, for the input matrix, where S(i, j) is the correlation between column i and j.
computeCorrelationMatrix(RDD<Vector>) - Static method in class org.apache.spark.mllib.stat.correlation.PearsonCorrelation
Compute the Pearson correlation matrix S, for the input matrix, where S(i, j) is the correlation between column i and j.
computeCorrelationMatrix(RDD<Vector>) - Static method in class org.apache.spark.mllib.stat.correlation.SpearmanCorrelation
Compute Spearman's correlation matrix S, for the input matrix, where S(i, j) is the correlation between column i and j.
computeCorrelationMatrixFromCovariance(Matrix) - Static method in class org.apache.spark.mllib.stat.correlation.PearsonCorrelation
Compute the Pearson correlation matrix from the covariance matrix.
computeCorrelationWithMatrixImpl(RDD<Object>, RDD<Object>) - Method in interface org.apache.spark.mllib.stat.correlation.Correlation
Combine the two input RDD[Double]s into an RDD[Vector] and compute the correlation using the correlation implementation for RDD[Vector].
computeCost(RDD<Vector>) - Method in class org.apache.spark.mllib.clustering.KMeansModel
Return the K-means cost (sum of squared distances of points to their nearest center) for this model on the given data.
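A hedged end-to-end sketch (assuming an existing SparkContext sc; points and parameters are illustrative):

    import org.apache.spark.mllib.clustering.KMeans
    import org.apache.spark.mllib.linalg.Vectors

    val points = sc.parallelize(Seq(
      Vectors.dense(0.0, 0.0), Vectors.dense(1.0, 1.0), Vectors.dense(9.0, 8.0)))
    val model = KMeans.train(points, 2, 20)  // k = 2, maxIterations = 20
    val cost = model.computeCost(points)     // sum of squared distances to nearest center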
computeCovariance() - Method in class org.apache.spark.mllib.linalg.distributed.RowMatrix
Computes the covariance matrix, treating each row as an observation.
computeError(TreeEnsembleModel, RDD<LabeledPoint>) - Static method in class org.apache.spark.mllib.tree.loss.AbsoluteError
Method to calculate loss of the base learner for the gradient boosting calculation.
computeError(TreeEnsembleModel, RDD<LabeledPoint>) - Static method in class org.apache.spark.mllib.tree.loss.LogLoss
Method to calculate loss of the base learner for the gradient boosting calculation.
computeError(TreeEnsembleModel, RDD<LabeledPoint>) - Method in interface org.apache.spark.mllib.tree.loss.Loss
Method to calculate error of the base learner for the gradient boosting calculation.
computeError(TreeEnsembleModel, RDD<LabeledPoint>) - Static method in class org.apache.spark.mllib.tree.loss.SquaredError
Method to calculate loss of the base learner for the gradient boosting calculation.
computeFractionForSampleSize(int, long, boolean) - Static method in class org.apache.spark.util.random.SamplingUtils
Returns a sampling rate that guarantees a sample of size >= sampleSizeLowerBound 99.99% of the time.
computeGramianMatrix() - Method in class org.apache.spark.mllib.linalg.distributed.IndexedRowMatrix
Computes the Gramian matrix A^T A.
computeGramianMatrix() - Method in class org.apache.spark.mllib.linalg.distributed.RowMatrix
Computes the Gramian matrix A^T A.
computeOrReadCheckpoint(Partition, TaskContext) - Method in class org.apache.spark.rdd.RDD
Compute an RDD partition or read it from a checkpoint if the RDD is checkpointing.
computePreferredLocations(Seq<InputFormatInfo>) - Static method in class org.apache.spark.scheduler.InputFormatInfo
Computes the preferred locations based on input(s) and returns a location-to-block map.
computePrincipalComponents(int) - Method in class org.apache.spark.mllib.linalg.distributed.RowMatrix
Computes the top k principal components.
computeSplitSize(long, long, long) - Method in class org.apache.spark.input.FixedLengthBinaryInputFormat
This input format overrides computeSplitSize() to make sure that each split only contains full records.
computeSVD(int, boolean, double) - Method in class org.apache.spark.mllib.linalg.distributed.IndexedRowMatrix
Computes the singular value decomposition of this IndexedRowMatrix.
computeSVD(int, boolean, double) - Method in class org.apache.spark.mllib.linalg.distributed.RowMatrix
Computes singular value decomposition of this matrix.
computeSVD(int, boolean, double, int, double, String) - Method in class org.apache.spark.mllib.linalg.distributed.RowMatrix
The actual SVD implementation, visible for testing.
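A brief sketch of the public entry point (rows: RDD[Vector] is assumed to exist):

    import org.apache.spark.mllib.linalg.distributed.RowMatrix

    val mat = new RowMatrix(rows)
    val svd = mat.computeSVD(2, computeU = true, 1e-9)
    svd.U  // RowMatrix of left singular vectors (populated because computeU = true)
    svd.s  // Vector of singular values, in descending order
    svd.V  // local Matrix of right singular vectors
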
computeThresholdByKey(Map<K, AcceptanceResult>, Map<K, Object>) - Static method in class org.apache.spark.util.random.StratifiedSamplingUtils
Given the result returned by getCounts, determine the threshold for accepting items so as to generate the exact sample size.
condition() - Method in class org.apache.spark.sql.execution.Filter
 
condition() - Method in class org.apache.spark.sql.execution.joins.BroadcastNestedLoopJoin
 
condition() - Method in class org.apache.spark.sql.execution.joins.HashOuterJoin
 
condition() - Method in class org.apache.spark.sql.execution.joins.LeftSemiJoinBNL
 
conditionEvaluator() - Method in class org.apache.spark.sql.execution.Filter
 
conf() - Method in class org.apache.spark.rdd.RDD
 
conf() - Method in class org.apache.spark.scheduler.cluster.CoarseGrainedSchedulerBackend
 
conf() - Method in class org.apache.spark.scheduler.TaskSchedulerImpl
 
conf() - Method in class org.apache.spark.scheduler.TaskSetManager
 
conf() - Method in class org.apache.spark.SparkContext
 
conf() - Method in class org.apache.spark.SparkEnv
 
conf() - Method in class org.apache.spark.sql.parquet.ParquetRelation
 
conf() - Method in class org.apache.spark.storage.BlockManager
 
conf() - Method in class org.apache.spark.streaming.StreamingContext
 
conf() - Method in class org.apache.spark.ui.SparkUI
 
confidence() - Method in class org.apache.spark.partial.BoundedDouble
 
configFile() - Method in class org.apache.spark.metrics.MetricsConfig
 
configTestLog4j(String) - Static method in class org.apache.spark.util.Utils
Configures log4j properties for use in test suites.
configuration() - Method in class org.apache.spark.scheduler.InputFormatInfo
 
CONFIGURATION_INSTANTIATION_LOCK() - Static method in class org.apache.spark.rdd.HadoopRDD
Configuration's constructor is not threadsafe (see SPARK-1097 and HADOOP-10456).
confusionMatrix() - Method in class org.apache.spark.mllib.evaluation.MulticlassMetrics
Returns the confusion matrix: predicted classes are in columns, ordered by ascending class label, as in "labels".
connected(String) - Method in class org.apache.spark.scheduler.cluster.SparkDeploySchedulerBackend
 
connectedComponents() - Method in class org.apache.spark.graphx.GraphOps
Compute the connected component membership of each vertex and return a graph with the vertex value containing the lowest vertex id in the connected component containing that vertex.
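A small sketch (assuming an existing SparkContext sc; the edge list is illustrative):

    import org.apache.spark.graphx.{Edge, Graph}

    val edges = sc.parallelize(Seq(Edge(1L, 2L, 1), Edge(2L, 3L, 1), Edge(10L, 11L, 1)))
    val graph = Graph.fromEdges(edges, defaultValue = 0)
    // every vertex ends up labeled with the smallest vertex id in its component
    val components = graph.connectedComponents().vertices
    components.collect()  // e.g. (1,1), (2,1), (3,1), (10,10), (11,10)
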
ConnectedComponents - Class in org.apache.spark.graphx.lib
Connected components algorithm.
ConnectedComponents() - Constructor for class org.apache.spark.graphx.lib.ConnectedComponents
 
CONSOLE_DEFAULT_PERIOD() - Method in class org.apache.spark.metrics.sink.ConsoleSink
 
CONSOLE_DEFAULT_UNIT() - Method in class org.apache.spark.metrics.sink.ConsoleSink
 
CONSOLE_KEY_PERIOD() - Method in class org.apache.spark.metrics.sink.ConsoleSink
 
CONSOLE_KEY_UNIT() - Method in class org.apache.spark.metrics.sink.ConsoleSink
 
ConsoleProgressBar - Class in org.apache.spark.ui
ConsoleProgressBar shows the progress of stages in the next line of the console.
ConsoleProgressBar(SparkContext) - Constructor for class org.apache.spark.ui.ConsoleProgressBar
 
ConsoleSink - Class in org.apache.spark.metrics.sink
 
ConsoleSink(Properties, MetricRegistry, SecurityManager) - Constructor for class org.apache.spark.metrics.sink.ConsoleSink
 
ConstantInputDStream<T> - Class in org.apache.spark.streaming.dstream
An input stream that always returns the same RDD on each timestep.
ConstantInputDStream(StreamingContext, RDD<T>, ClassTag<T>) - Constructor for class org.apache.spark.streaming.dstream.ConstantInputDStream
 
constructURIForAuthentication(URI, SecurityManager) - Static method in class org.apache.spark.util.Utils
Construct a URI containing information used for authentication.
consumerConnector() - Method in class org.apache.spark.streaming.kafka.KafkaReceiver
 
contains(Param<?>) - Method in class org.apache.spark.ml.param.ParamMap
Checks whether a parameter is explicitly specified.
contains(String) - Method in class org.apache.spark.SparkConf
Does the configuration contain a given parameter?
contains(BlockId) - Method in class org.apache.spark.storage.BlockManagerMaster
Check if block manager master has a block.
contains(BlockId) - Method in class org.apache.spark.storage.BlockStore
 
contains(BlockId) - Method in class org.apache.spark.storage.DiskStore
 
contains(BlockId) - Method in class org.apache.spark.storage.MemoryStore
 
contains(BlockId) - Method in class org.apache.spark.storage.TachyonStore
 
contains(A) - Method in class org.apache.spark.util.TimeStampedHashSet
 
containsBlock(BlockId) - Method in class org.apache.spark.storage.DiskBlockManager
Check if disk block manager has a block.
containsBlock(BlockId) - Method in class org.apache.spark.storage.StorageStatus
Return whether the given block is stored in this block manager in O(1) time.
containsCachedMetadata(String) - Static method in class org.apache.spark.rdd.HadoopRDD
 
containsShuffle(int) - Method in class org.apache.spark.MapOutputTrackerMaster
Check if the given shuffle is being tracked
contentType() - Method in class org.apache.spark.ui.JettyUtils.ServletParams
 
context() - Method in interface org.apache.spark.api.java.JavaRDDLike
The SparkContext that this RDD was created on.
context() - Method in class org.apache.spark.InterruptibleIterator
 
context() - Method in class org.apache.spark.rdd.RDD
The SparkContext that this RDD was created on.
context() - Method in class org.apache.spark.sql.execution.SparkStrategies.CommandStrategy
 
context() - Method in class org.apache.spark.sql.hive.execution.HiveTableScan
 
context() - Method in class org.apache.spark.sql.hive.HiveStrategies.HiveCommandStrategy
 
context() - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
Return the StreamingContext associated with this DStream
context() - Method in class org.apache.spark.streaming.dstream.DStream
Return the StreamingContext associated with this DStream
ContextCleaner - Class in org.apache.spark
An asynchronous cleaner for RDD, shuffle, and broadcast state.
ContextCleaner(SparkContext) - Constructor for class org.apache.spark.ContextCleaner
 
ContextWaiter - Class in org.apache.spark.streaming
 
ContextWaiter() - Constructor for class org.apache.spark.streaming.ContextWaiter
 
Continuous() - Static method in class org.apache.spark.mllib.tree.configuration.FeatureType
 
convert() - Method in class org.apache.spark.WritableConverter
 
convertCatalystToJava(Object) - Static method in class org.apache.spark.sql.types.util.DataTypeConversions
Converts Catalyst rows / types to Java objects.
convertFromAttributes(Seq<Attribute>) - Static method in class org.apache.spark.sql.parquet.ParquetTypesConverter
 
convertFromString(String) - Static method in class org.apache.spark.sql.parquet.ParquetTypesConverter
 
convertJavaToCatalyst(Object, DataType) - Static method in class org.apache.spark.sql.types.util.DataTypeConversions
Converts Java objects to catalyst rows / types
convertMetastoreParquet() - Method in class org.apache.spark.sql.hive.HiveContext
When true, enables an experimental feature where metastore tables that use the parquet SerDe are automatically converted to use the Spark SQL parquet table scan, instead of the Hive SerDe.
convertSplitLocationInfo(Object[]) - Static method in class org.apache.spark.rdd.HadoopRDD
 
convertToAttributes(Type, boolean) - Static method in class org.apache.spark.sql.parquet.ParquetTypesConverter
 
convertToBaggedRDD(RDD<Datum>, double, int, boolean, int) - Static method in class org.apache.spark.mllib.tree.impl.BaggedPoint
Convert an input dataset into its BaggedPoint representation, sampling a count for each instance according to subsamplingRate.
convertToString(Seq<Attribute>) - Static method in class org.apache.spark.sql.parquet.ParquetTypesConverter
 
convertToTreeRDD(RDD<LabeledPoint>, Bin[][], DecisionTreeMetadata) - Static method in class org.apache.spark.mllib.tree.impl.TreePoint
Convert an input dataset into its TreePoint representation, binning feature values in preparation for DecisionTree training.
CoordinateMatrix - Class in org.apache.spark.mllib.linalg.distributed
:: Experimental :: Represents a matrix in coordinate format.
CoordinateMatrix(RDD<MatrixEntry>, long, long) - Constructor for class org.apache.spark.mllib.linalg.distributed.CoordinateMatrix
 
CoordinateMatrix(RDD<MatrixEntry>) - Constructor for class org.apache.spark.mllib.linalg.distributed.CoordinateMatrix
Alternative constructor leaving matrix dimensions to be determined automatically.
copiesRunning() - Method in class org.apache.spark.scheduler.TaskSetManager
 
copy() - Method in class org.apache.spark.ml.param.ParamMap
Make a copy of this param map.
copy(Vector, Vector) - Static method in class org.apache.spark.mllib.linalg.BLAS
y = x
copy() - Method in class org.apache.spark.mllib.linalg.DenseMatrix
 
copy() - Method in class org.apache.spark.mllib.linalg.DenseVector
 
copy() - Method in interface org.apache.spark.mllib.linalg.Matrix
Get a deep copy of the matrix.
copy() - Method in class org.apache.spark.mllib.linalg.SparseMatrix
 
copy() - Method in class org.apache.spark.mllib.linalg.SparseVector
 
copy() - Method in interface org.apache.spark.mllib.linalg.Vector
Makes a deep copy of this vector.
copy() - Method in class org.apache.spark.mllib.random.PoissonGenerator
 
copy() - Method in interface org.apache.spark.mllib.random.RandomDataGenerator
Returns a copy of the RandomDataGenerator with a new instance of the rng object used in the class when applicable for non-locking concurrent usage.
copy() - Method in class org.apache.spark.mllib.random.StandardNormalGenerator
 
copy() - Method in class org.apache.spark.mllib.random.UniformGenerator
 
copy() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
Returns a shallow copy of this instance.
copy() - Method in class org.apache.spark.mllib.tree.impurity.EntropyCalculator
Make a deep copy of this ImpurityCalculator.
copy() - Method in class org.apache.spark.mllib.tree.impurity.GiniCalculator
Make a deep copy of this ImpurityCalculator.
copy() - Method in class org.apache.spark.mllib.tree.impurity.ImpurityCalculator
Make a deep copy of this ImpurityCalculator.
copy() - Method in class org.apache.spark.mllib.tree.impurity.VarianceCalculator
Make a deep copy of this ImpurityCalculator.
copy() - Method in class org.apache.spark.util.StatCounter
Clone this StatCounter
copyField(Row, int, MutableRow, int) - Static method in class org.apache.spark.sql.columnar.BOOLEAN
 
copyField(Row, int, MutableRow, int) - Static method in class org.apache.spark.sql.columnar.BYTE
 
copyField(Row, int, MutableRow, int) - Method in class org.apache.spark.sql.columnar.ColumnType
Copies the field at position fromOrdinal of row from to position toOrdinal of row to.
copyField(Row, int, MutableRow, int) - Static method in class org.apache.spark.sql.columnar.DOUBLE
 
copyField(Row, int, MutableRow, int) - Static method in class org.apache.spark.sql.columnar.FLOAT
 
copyField(Row, int, MutableRow, int) - Static method in class org.apache.spark.sql.columnar.INT
 
copyField(Row, int, MutableRow, int) - Static method in class org.apache.spark.sql.columnar.LONG
 
copyField(Row, int, MutableRow, int) - Static method in class org.apache.spark.sql.columnar.SHORT
 
copyField(Row, int, MutableRow, int) - Static method in class org.apache.spark.sql.columnar.STRING
 
copyStream(InputStream, OutputStream, boolean, boolean) - Static method in class org.apache.spark.util.Utils
Copy all data from an InputStream to an OutputStream.
cores() - Method in class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.RegisterExecutor
 
cores() - Method in class org.apache.spark.scheduler.WorkerOffer
 
coresByTaskId() - Method in class org.apache.spark.scheduler.cluster.mesos.CoarseMesosSchedulerBackend
 
corr(RDD<Object>, RDD<Object>, String) - Static method in class org.apache.spark.mllib.stat.correlation.Correlations
 
corr(RDD<Vector>) - Static method in class org.apache.spark.mllib.stat.Statistics
:: Experimental :: Compute the Pearson correlation matrix for the input RDD of Vectors.
corr(RDD<Vector>, String) - Static method in class org.apache.spark.mllib.stat.Statistics
:: Experimental :: Compute the correlation matrix for the input RDD of Vectors using the specified method.
corr(RDD<Object>, RDD<Object>) - Static method in class org.apache.spark.mllib.stat.Statistics
:: Experimental :: Compute the Pearson correlation for the input RDDs.
corr(RDD<Object>, RDD<Object>, String) - Static method in class org.apache.spark.mllib.stat.Statistics
:: Experimental :: Compute the correlation for the input RDDs using the specified method.
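A quick sketch of both variants (assuming an existing SparkContext sc; the series are illustrative):

    import org.apache.spark.mllib.stat.Statistics

    val x = sc.parallelize(Seq(1.0, 2.0, 3.0, 4.0))
    val y = sc.parallelize(Seq(2.0, 4.0, 6.0, 8.1))
    val pearson = Statistics.corr(x, y)               // method defaults to "pearson"
    val spearman = Statistics.corr(x, y, "spearman")
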
Correlation - Interface in org.apache.spark.mllib.stat.correlation
Trait for correlation algorithms.
CorrelationNames - Class in org.apache.spark.mllib.stat.correlation
Maintains supported and default correlation names.
CorrelationNames() - Constructor for class org.apache.spark.mllib.stat.correlation.CorrelationNames
 
Correlations - Class in org.apache.spark.mllib.stat.correlation
Delegates computation to the specific correlation object based on the input method name.
Correlations() - Constructor for class org.apache.spark.mllib.stat.correlation.Correlations
 
corrMatrix(RDD<Vector>, String) - Static method in class org.apache.spark.mllib.stat.correlation.Correlations
 
count() - Method in interface org.apache.spark.api.java.JavaRDDLike
Return the number of elements in the RDD.
count() - Method in class org.apache.spark.graphx.impl.EdgeRDDImpl
The number of edges in the RDD.
count() - Method in class org.apache.spark.graphx.impl.VertexRDDImpl
The number of vertices in the RDD.
count() - Method in class org.apache.spark.mllib.evaluation.binary.BinaryConfusionMatrixImpl
 
count() - Method in class org.apache.spark.mllib.recommendation.ALS.BlockStats
 
count() - Method in class org.apache.spark.mllib.stat.MultivariateOnlineSummarizer
 
count() - Method in interface org.apache.spark.mllib.stat.MultivariateStatisticalSummary
Sample size.
count() - Method in class org.apache.spark.mllib.tree.impurity.EntropyCalculator
Number of data points accounted for in the sufficient statistics.
count() - Method in class org.apache.spark.mllib.tree.impurity.GiniCalculator
Number of data points accounted for in the sufficient statistics.
count() - Method in class org.apache.spark.mllib.tree.impurity.ImpurityCalculator
Number of data points accounted for in the sufficient statistics.
count() - Method in class org.apache.spark.mllib.tree.impurity.VarianceCalculator
Number of data points accounted for in the sufficient statistics.
count() - Method in class org.apache.spark.rdd.RDD
Return the number of elements in the RDD.
count() - Method in class org.apache.spark.sql.api.java.JavaSchemaRDD
 
count() - Method in class org.apache.spark.sql.columnar.ColumnStatisticsSchema
 
count() - Method in interface org.apache.spark.sql.columnar.ColumnStats
 
COUNT() - Static method in class org.apache.spark.sql.hive.HiveQl
 
count() - Method in class org.apache.spark.sql.SchemaRDD
:: Experimental :: Return the number of elements in the RDD.
count() - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
Return a new DStream in which each RDD has a single element generated by counting each RDD of this DStream.
count() - Method in class org.apache.spark.streaming.dstream.DStream
Return a new DStream in which each RDD has a single element generated by counting each RDD of this DStream.
count() - Method in class org.apache.spark.util.StatCounter
 
countApprox(long, double) - Method in interface org.apache.spark.api.java.JavaRDDLike
:: Experimental :: Approximate version of count() that returns a potentially incomplete result within a timeout, even if not all tasks have finished.
countApprox(long) - Method in interface org.apache.spark.api.java.JavaRDDLike
:: Experimental :: Approximate version of count() that returns a potentially incomplete result within a timeout, even if not all tasks have finished.
countApprox(long, double) - Method in class org.apache.spark.rdd.RDD
:: Experimental :: Approximate version of count() that returns a potentially incomplete result within a timeout, even if not all tasks have finished.
countApproxDistinct(double) - Method in interface org.apache.spark.api.java.JavaRDDLike
Return approximate number of distinct elements in the RDD.
countApproxDistinct(int, int) - Method in class org.apache.spark.rdd.RDD
:: Experimental :: Return approximate number of distinct elements in the RDD.
countApproxDistinct(double) - Method in class org.apache.spark.rdd.RDD
Return approximate number of distinct elements in the RDD.
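A one-liner sketch (assuming an existing SparkContext sc):

    val ids = sc.parallelize(1 to 100000).map(_ % 1000)
    // relative accuracy 0.05; smaller values are more accurate but use more memory
    val approx = ids.countApproxDistinct(0.05)  // close to 1000, not exact
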
countApproxDistinctByKey(double, Partitioner) - Method in class org.apache.spark.api.java.JavaPairRDD
Return approximate number of distinct values for each key in this RDD.
countApproxDistinctByKey(double, int) - Method in class org.apache.spark.api.java.JavaPairRDD
Return approximate number of distinct values for each key in this RDD.
countApproxDistinctByKey(double) - Method in class org.apache.spark.api.java.JavaPairRDD
Return approximate number of distinct values for each key in this RDD.
countApproxDistinctByKey(int, int, Partitioner) - Method in class org.apache.spark.rdd.PairRDDFunctions
:: Experimental :: Return approximate number of distinct values for each key in this RDD.
countApproxDistinctByKey(double, Partitioner) - Method in class org.apache.spark.rdd.PairRDDFunctions
Return approximate number of distinct values for each key in this RDD.
countApproxDistinctByKey(double, int) - Method in class org.apache.spark.rdd.PairRDDFunctions
Return approximate number of distinct values for each key in this RDD.
countApproxDistinctByKey(double) - Method in class org.apache.spark.rdd.PairRDDFunctions
Return approximate number of distinct values for each key in this RDD.
countAsync() - Method in interface org.apache.spark.api.java.JavaRDDLike
The asynchronous version of count, which returns a future for counting the number of elements in this RDD.
countAsync() - Method in class org.apache.spark.rdd.AsyncRDDActions
Returns a future for counting the number of elements in the RDD.
countByKey() - Method in class org.apache.spark.api.java.JavaPairRDD
Count the number of elements for each key, and return the result to the master as a Map.
countByKey() - Method in class org.apache.spark.rdd.PairRDDFunctions
Count the number of elements for each key, collecting the results to a local Map.
countByKeyApprox(long) - Method in class org.apache.spark.api.java.JavaPairRDD
:: Experimental :: Approximate version of countByKey that can return a partial result if it does not finish within a timeout.
countByKeyApprox(long, double) - Method in class org.apache.spark.api.java.JavaPairRDD
:: Experimental :: Approximate version of countByKey that can return a partial result if it does not finish within a timeout.
countByKeyApprox(long, double) - Method in class org.apache.spark.rdd.PairRDDFunctions
:: Experimental :: Approximate version of countByKey that can return a partial result if it does not finish within a timeout.
countByValue() - Method in interface org.apache.spark.api.java.JavaRDDLike
Return the count of each unique value in this RDD as a map of (value, count) pairs.
countByValue(Ordering<T>) - Method in class org.apache.spark.rdd.RDD
Return the count of each unique value in this RDD as a local map of (value, count) pairs.
countByValue() - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
Return a new DStream in which each RDD contains the counts of each distinct value in each RDD of this DStream.
countByValue(int) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
Return a new DStream in which each RDD contains the counts of each distinct value in each RDD of this DStream.
countByValue(int, Ordering<T>) - Method in class org.apache.spark.streaming.dstream.DStream
Return a new DStream in which each RDD contains the counts of each distinct value in each RDD of this DStream.
countByValueAndWindow(Duration, Duration) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
Return a new DStream in which each RDD contains the count of distinct elements in RDDs in a sliding window over this DStream.
countByValueAndWindow(Duration, Duration, int) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
Return a new DStream in which each RDD contains the count of distinct elements in RDDs in a sliding window over this DStream.
countByValueAndWindow(Duration, Duration, int, Ordering<T>) - Method in class org.apache.spark.streaming.dstream.DStream
Return a new DStream in which each RDD contains the count of distinct elements in RDDs in a sliding window over this DStream.
countByValueApprox(long, double) - Method in interface org.apache.spark.api.java.JavaRDDLike
(Experimental) Approximate version of countByValue().
countByValueApprox(long) - Method in interface org.apache.spark.api.java.JavaRDDLike
(Experimental) Approximate version of countByValue().
countByValueApprox(long, double, Ordering<T>) - Method in class org.apache.spark.rdd.RDD
:: Experimental :: Approximate version of countByValue().
countByWindow(Duration, Duration) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
Return a new DStream in which each RDD has a single element generated by counting the number of elements in a window over this DStream.
countByWindow(Duration, Duration) - Method in class org.apache.spark.streaming.dstream.DStream
Return a new DStream in which each RDD has a single element generated by counting the number of elements in a sliding window over this DStream.
counter() - Method in class org.apache.spark.partial.MeanEvaluator
 
counter() - Method in class org.apache.spark.partial.SumEvaluator
 
CountEvaluator - Class in org.apache.spark.partial
An ApproximateEvaluator for counts.
CountEvaluator(int, double) - Constructor for class org.apache.spark.partial.CountEvaluator
 
cpFile() - Method in class org.apache.spark.rdd.RDDCheckpointData
 
cpRDD() - Method in class org.apache.spark.rdd.RDDCheckpointData
 
cpState() - Method in class org.apache.spark.rdd.RDDCheckpointData
 
CPUS_PER_TASK() - Method in class org.apache.spark.scheduler.TaskSchedulerImpl
 
CR() - Method in class org.apache.spark.ui.ConsoleProgressBar
 
create(boolean, boolean, boolean, int) - Static method in class org.apache.spark.api.java.StorageLevels
Deprecated.
create(boolean, boolean, boolean, boolean, int) - Static method in class org.apache.spark.api.java.StorageLevels
Create a new StorageLevel object.
create(JavaSparkContext, JdbcRDD.ConnectionFactory, String, long, long, int, Function<ResultSet, T>) - Static method in class org.apache.spark.rdd.JdbcRDD
Create an RDD that executes an SQL query on a JDBC connection and reads results.
create(JavaSparkContext, JdbcRDD.ConnectionFactory, String, long, long, int) - Static method in class org.apache.spark.rdd.JdbcRDD
Create an RDD that executes an SQL query on a JDBC connection and reads results.
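For reference, a hedged sketch using the equivalent Scala constructor (the JDBC URL and table are hypothetical; the query must contain two '?' placeholders, which receive each partition's bounds):

    import java.sql.DriverManager
    import org.apache.spark.rdd.JdbcRDD

    val url = "jdbc:h2:mem:testdb"  // hypothetical connection URL
    val rdd = new JdbcRDD(
      sc,
      () => DriverManager.getConnection(url),
      "SELECT id, name FROM users WHERE ? <= id AND id <= ?",  // bounds fill the two ?s
      1, 1000, 3,                                              // lowerBound, upperBound, numPartitions
      rs => (rs.getInt(1), rs.getString(2)))
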
create(RDD<T>, Function1<Object, Object>) - Static method in class org.apache.spark.rdd.PartitionPruningRDD
Create a PartitionPruningRDD.
create(Object...) - Static method in class org.apache.spark.sql.api.java.Row
Creates a Row with the given values.
create(Seq<Object>) - Static method in class org.apache.spark.sql.api.java.Row
Creates a Row with the given values.
create(String, LogicalPlan, Configuration, SQLContext) - Static method in class org.apache.spark.sql.parquet.ParquetRelation
Creates a new ParquetRelation and underlying Parquet file for the given LogicalPlan.
create() - Method in interface org.apache.spark.streaming.api.java.JavaStreamingContextFactory
 
createActorSystem(String, String, int, SparkConf, SecurityManager) - Static method in class org.apache.spark.util.AkkaUtils
Creates an ActorSystem ready for remoting, with various Spark features.
createArrayType(DataType) - Static method in class org.apache.spark.sql.api.java.DataType
Creates an ArrayType by specifying the data type of elements (elementType).
createArrayType(DataType, boolean) - Static method in class org.apache.spark.sql.api.java.DataType
Creates an ArrayType by specifying the data type of elements (elementType) and whether the array contains null values (containsNull).
createCombiner() - Method in class org.apache.spark.Aggregator
 
createCommand(Protos.Offer, int) - Method in class org.apache.spark.scheduler.cluster.mesos.CoarseMesosSchedulerBackend
 
createCompiledClass(String, File, String) - Static method in class org.apache.spark.TestUtils
Creates a compiled class with the given name.
createDecimal(BigDecimal) - Static method in class org.apache.spark.sql.hive.HiveShim
 
createDefaultDBIfNeeded(HiveContext) - Static method in class org.apache.spark.sql.hive.HiveShim
 
createDirectory(String, String) - Static method in class org.apache.spark.util.Utils
Create a directory inside the given parent directory.
createDriverEnv(SparkConf, boolean, LiveListenerBus) - Static method in class org.apache.spark.SparkEnv
Create a SparkEnv for the driver.
createDriverResultsArray() - Static method in class org.apache.spark.sql.hive.HiveShim
 
createEmpty(String, Seq<Attribute>, boolean, Configuration, SQLContext) - Static method in class org.apache.spark.sql.parquet.ParquetRelation
Creates an empty ParquetRelation and underlying Parquet file that consists only of the metadata for the given schema.
createExecutorEnv(SparkConf, String, String, int, int, boolean, ActorSystem) - Static method in class org.apache.spark.SparkEnv
Create a SparkEnv for an executor.
createExecutorInfo(String) - Method in class org.apache.spark.scheduler.cluster.mesos.MesosSchedulerBackend
 
createFilter(Expression) - Static method in class org.apache.spark.sql.parquet.ParquetFilters
 
createFunction() - Method in class org.apache.spark.sql.hive.HiveFunctionWrapper
 
createHistoryUI(SparkConf, SparkListenerBus, SecurityManager, String, String) - Static method in class org.apache.spark.ui.SparkUI
 
createJar(Seq<File>, File) - Static method in class org.apache.spark.TestUtils
Create a jar file that contains this set of files.
createJarWithClasses(Seq<String>, String) - Static method in class org.apache.spark.TestUtils
Create a jar that defines classes with the given names.
createJobID(Date, int) - Static method in class org.apache.spark.SparkHadoopWriter
 
createLiveUI(SparkContext, SparkConf, SparkListenerBus, JobProgressListener, SecurityManager, String) - Static method in class org.apache.spark.ui.SparkUI
 
createMapType(DataType, DataType) - Static method in class org.apache.spark.sql.api.java.DataType
Creates a MapType by specifying the data type of keys (keyType) and values (valueType).
createMapType(DataType, DataType, boolean) - Static method in class org.apache.spark.sql.api.java.DataType
Creates a MapType by specifying the data type of keys (keyType), the data type of values (valueType), and whether values contain any null value (valueContainsNull).
createMesosTask(TaskDescription, String) - Method in class org.apache.spark.scheduler.cluster.mesos.MesosSchedulerBackend
Turn a Spark TaskDescription into a Mesos task
createMetricsSystem(String, SparkConf, SecurityManager) - Static method in class org.apache.spark.metrics.MetricsSystem
 
createNewSparkContext(SparkConf) - Static method in class org.apache.spark.streaming.StreamingContext
 
createNewSparkContext(String, String, String, Seq<String>, Map<String, String>) - Static method in class org.apache.spark.streaming.StreamingContext
 
createParquetFile(Class<?>, String, boolean, Configuration) - Method in class org.apache.spark.sql.api.java.JavaSQLContext
:: Experimental :: Creates an empty parquet file with the schema of class beanClass, which can be registered as a table.
createParquetFile(String, boolean, Configuration, TypeTags.TypeTag<A>) - Method in class org.apache.spark.sql.SQLContext
:: Experimental :: Creates an empty parquet file with the schema of class A, which can be registered as a table.
createPathFromString(String, JobConf) - Static method in class org.apache.spark.SparkHadoopWriter
 
createPathFromString(String, JobConf) - Static method in class org.apache.spark.sql.hive.SparkHiveWriterContainer
 
createPlan(String) - Static method in class org.apache.spark.sql.hive.HiveQl
Creates LogicalPlan for a given HiveQL string.
createPlanForView(Table, Option<String>) - Static method in class org.apache.spark.sql.hive.HiveQl
Creates LogicalPlan for a given VIEW
createPollingStream(StreamingContext, String, int, StorageLevel) - Static method in class org.apache.spark.streaming.flume.FlumeUtils
Creates an input stream that is to be used with the Spark Sink deployed on a Flume agent.
createPollingStream(StreamingContext, Seq<InetSocketAddress>, StorageLevel) - Static method in class org.apache.spark.streaming.flume.FlumeUtils
Creates an input stream that is to be used with the Spark Sink deployed on a Flume agent.
createPollingStream(StreamingContext, Seq<InetSocketAddress>, StorageLevel, int, int) - Static method in class org.apache.spark.streaming.flume.FlumeUtils
Creates an input stream that is to be used with the Spark Sink deployed on a Flume agent.
createPollingStream(JavaStreamingContext, String, int) - Static method in class org.apache.spark.streaming.flume.FlumeUtils
Creates an input stream that is to be used with the Spark Sink deployed on a Flume agent.
createPollingStream(JavaStreamingContext, String, int, StorageLevel) - Static method in class org.apache.spark.streaming.flume.FlumeUtils
Creates an input stream that is to be used with the Spark Sink deployed on a Flume agent.
createPollingStream(JavaStreamingContext, InetSocketAddress[], StorageLevel) - Static method in class org.apache.spark.streaming.flume.FlumeUtils
Creates an input stream that is to be used with the Spark Sink deployed on a Flume agent.
createPollingStream(JavaStreamingContext, InetSocketAddress[], StorageLevel, int, int) - Static method in class org.apache.spark.streaming.flume.FlumeUtils
Creates an input stream that is to be used with the Spark Sink deployed on a Flume agent.
createPythonWorker(String, Map<String, String>) - Method in class org.apache.spark.SparkEnv
 
createRecordFilter(Seq<Expression>) - Static method in class org.apache.spark.sql.parquet.ParquetFilters
 
createRecordReader(InputSplit, TaskAttemptContext) - Method in class org.apache.spark.input.FixedLengthBinaryInputFormat
Create a FixedLengthBinaryRecordReader
createRecordReader(InputSplit, TaskAttemptContext) - Method in class org.apache.spark.input.StreamFileInputFormat
 
createRecordReader(InputSplit, TaskAttemptContext) - Method in class org.apache.spark.input.StreamInputFormat
 
createRecordReader(InputSplit, TaskAttemptContext) - Method in class org.apache.spark.input.WholeTextFileInputFormat
 
createRecordReader(InputSplit, TaskAttemptContext) - Method in class org.apache.spark.sql.parquet.FilteringParquetRowInputFormat
 
createRedirectHandler(String, String, Function1<HttpServletRequest, BoxedUnit>, String) - Static method in class org.apache.spark.ui.JettyUtils
Create a handler that always redirects the user to the given path
createRelation(SQLContext, Map<String, String>) - Method in class org.apache.spark.sql.json.DefaultSource
Returns a new base relation with the given parameters.
createRelation(SQLContext, Map<String, String>) - Method in class org.apache.spark.sql.parquet.DefaultSource
Returns a new base relation with the given parameters.
createRelation(SQLContext, Map<String, String>) - Method in interface org.apache.spark.sql.sources.RelationProvider
Returns a new base relation with the given parameters.
createRoutingTables(EdgeRDD<?>, Partitioner) - Static method in class org.apache.spark.graphx.VertexRDD
 
createSchemaRDD(RDD<A>, TypeTags.TypeTag<A>) - Method in class org.apache.spark.sql.SQLContext
Creates a SchemaRDD from an RDD of case classes.
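A minimal sketch (assuming an existing SparkContext sc; the Person case class and data are hypothetical):

    import org.apache.spark.sql.SQLContext

    case class Person(name: String, age: Int)

    val sqlContext = new SQLContext(sc)
    val people = sc.parallelize(Seq(Person("Alice", 30), Person("Bob", 25)))
    val schemaRDD = sqlContext.createSchemaRDD(people)
    schemaRDD.registerTempTable("people")
    sqlContext.sql("SELECT name FROM people WHERE age > 26").collect()
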
createServlet(JettyUtils.ServletParams<T>, SecurityManager, Function1<T, Object>) - Static method in class org.apache.spark.ui.JettyUtils
 
createServletHandler(String, JettyUtils.ServletParams<T>, SecurityManager, String, Function1<T, Object>) - Static method in class org.apache.spark.ui.JettyUtils
Create a context handler that responds to a request with the given path prefix
createServletHandler(String, HttpServlet, String) - Static method in class org.apache.spark.ui.JettyUtils
Create a context handler that responds to a request with the given path prefix
createStaticHandler(String, String) - Static method in class org.apache.spark.ui.JettyUtils
Create a handler for serving files from a static directory
createStream(StreamingContext, String, int, StorageLevel) - Static method in class org.apache.spark.streaming.flume.FlumeUtils
Create an input stream from a Flume source.
createStream(StreamingContext, String, int, StorageLevel, boolean) - Static method in class org.apache.spark.streaming.flume.FlumeUtils
Create an input stream from a Flume source.
createStream(JavaStreamingContext, String, int) - Static method in class org.apache.spark.streaming.flume.FlumeUtils
Creates an input stream from a Flume source.
createStream(JavaStreamingContext, String, int, StorageLevel) - Static method in class org.apache.spark.streaming.flume.FlumeUtils
Creates an input stream from a Flume source.
createStream(JavaStreamingContext, String, int, StorageLevel, boolean) - Static method in class org.apache.spark.streaming.flume.FlumeUtils
Creates an input stream from a Flume source.
createStream(StreamingContext, String, String, Map<String, Object>, StorageLevel) - Static method in class org.apache.spark.streaming.kafka.KafkaUtils
Create an input stream that pulls messages from a Kafka Broker.
createStream(StreamingContext, Map<String, String>, Map<String, Object>, StorageLevel, ClassTag<K>, ClassTag<V>, ClassTag<U>, ClassTag<T>) - Static method in class org.apache.spark.streaming.kafka.KafkaUtils
Create an input stream that pulls messages from a Kafka Broker.
createStream(JavaStreamingContext, String, String, Map<String, Integer>) - Static method in class org.apache.spark.streaming.kafka.KafkaUtils
Create an input stream that pulls messages from a Kafka Broker.
createStream(JavaStreamingContext, String, String, Map<String, Integer>, StorageLevel) - Static method in class org.apache.spark.streaming.kafka.KafkaUtils
Create an input stream that pulls messages from a Kafka Broker.
createStream(JavaStreamingContext, Class<K>, Class<V>, Class<U>, Class<T>, Map<String, String>, Map<String, Integer>, StorageLevel) - Static method in class org.apache.spark.streaming.kafka.KafkaUtils
Create an input stream that pulls messages from a Kafka Broker.
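A hedged sketch of the Scala variant (ssc is an existing StreamingContext; the quorum, group, and topic names are hypothetical):

    import org.apache.spark.storage.StorageLevel
    import org.apache.spark.streaming.kafka.KafkaUtils

    val lines = KafkaUtils.createStream(
      ssc,
      "zk1:2181",           // ZooKeeper quorum
      "my-consumer-group",  // consumer group id
      Map("events" -> 2),   // topic -> number of receiver threads
      StorageLevel.MEMORY_AND_DISK_SER_2)
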
createStream(StreamingContext, String, String, Duration, InitialPositionInStream, StorageLevel) - Static method in class org.apache.spark.streaming.kinesis.KinesisUtils
Create an InputDStream that pulls messages from a Kinesis stream.
createStream(JavaStreamingContext, String, String, Duration, InitialPositionInStream, StorageLevel) - Static method in class org.apache.spark.streaming.kinesis.KinesisUtils
Create a Java-friendly InputDStream that pulls messages from a Kinesis stream.
createStream(StreamingContext, String, String, StorageLevel) - Static method in class org.apache.spark.streaming.mqtt.MQTTUtils
Create an input stream that receives messages pushed by an MQTT publisher.
createStream(JavaStreamingContext, String, String) - Static method in class org.apache.spark.streaming.mqtt.MQTTUtils
Create an input stream that receives messages pushed by an MQTT publisher.
createStream(JavaStreamingContext, String, String, StorageLevel) - Static method in class org.apache.spark.streaming.mqtt.MQTTUtils
Create an input stream that receives messages pushed by an MQTT publisher.
createStream(StreamingContext, Option<Authorization>, Seq<String>, StorageLevel) - Static method in class org.apache.spark.streaming.twitter.TwitterUtils
Create an input stream that returns tweets received from Twitter.
createStream(JavaStreamingContext) - Static method in class org.apache.spark.streaming.twitter.TwitterUtils
Create an input stream that returns tweets received from Twitter using Twitter4J's default OAuth authentication; this requires the system properties twitter4j.oauth.consumerKey, twitter4j.oauth.consumerSecret, twitter4j.oauth.accessToken and twitter4j.oauth.accessTokenSecret.
createStream(JavaStreamingContext, String[]) - Static method in class org.apache.spark.streaming.twitter.TwitterUtils
Create an input stream that returns tweets received from Twitter using Twitter4J's default OAuth authentication; this requires the system properties twitter4j.oauth.consumerKey, twitter4j.oauth.consumerSecret, twitter4j.oauth.accessToken and twitter4j.oauth.accessTokenSecret.
createStream(JavaStreamingContext, String[], StorageLevel) - Static method in class org.apache.spark.streaming.twitter.TwitterUtils
Create an input stream that returns tweets received from Twitter using Twitter4J's default OAuth authentication; this requires the system properties twitter4j.oauth.consumerKey, twitter4j.oauth.consumerSecret, twitter4j.oauth.accessToken and twitter4j.oauth.accessTokenSecret.
createStream(JavaStreamingContext, Authorization) - Static method in class org.apache.spark.streaming.twitter.TwitterUtils
Create an input stream that returns tweets received from Twitter.
createStream(JavaStreamingContext, Authorization, String[]) - Static method in class org.apache.spark.streaming.twitter.TwitterUtils
Create an input stream that returns tweets received from Twitter.
createStream(JavaStreamingContext, Authorization, String[], StorageLevel) - Static method in class org.apache.spark.streaming.twitter.TwitterUtils
Create an input stream that returns tweets received from Twitter.
createStream(StreamingContext, String, Subscribe, Function1<Seq<ByteString>, Iterator<T>>, StorageLevel, SupervisorStrategy, ClassTag<T>) - Static method in class org.apache.spark.streaming.zeromq.ZeroMQUtils
Create an input stream that receives messages pushed by a ZeroMQ publisher.
createStream(JavaStreamingContext, String, Subscribe, Function<byte[][], Iterable<T>>, StorageLevel, SupervisorStrategy) - Static method in class org.apache.spark.streaming.zeromq.ZeroMQUtils
Create an input stream that receives messages pushed by a ZeroMQ publisher.
createStream(JavaStreamingContext, String, Subscribe, Function<byte[][], Iterable<T>>, StorageLevel) - Static method in class org.apache.spark.streaming.zeromq.ZeroMQUtils
Create an input stream that receives messages pushed by a ZeroMQ publisher.
createStream(JavaStreamingContext, String, Subscribe, Function<byte[][], Iterable<T>>) - Static method in class org.apache.spark.streaming.zeromq.ZeroMQUtils
Create an input stream that receives messages pushed by a ZeroMQ publisher.
createStructField(String, DataType, boolean, Metadata) - Static method in class org.apache.spark.sql.api.java.DataType
Creates a StructField by specifying the name (name), data type (dataType) and whether values of this field can be null values (nullable).
createStructField(String, DataType, boolean) - Static method in class org.apache.spark.sql.api.java.DataType
Creates a StructField with empty metadata.
createStructType(List<StructField>) - Static method in class org.apache.spark.sql.api.java.DataType
Creates a StructType with the given list of StructFields (fields).
createStructType(StructField[]) - Static method in class org.apache.spark.sql.api.java.DataType
Creates a StructType with the given StructField array (fields).
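A brief sketch combining these factory methods (field names are hypothetical):

    import org.apache.spark.sql.api.java.DataType

    val fields = Array(
      DataType.createStructField("name", DataType.StringType, false),
      DataType.createStructField("scores", DataType.createArrayType(DataType.DoubleType), true))
    val schema = DataType.createStructType(fields)
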
createTable(String, boolean, TypeTags.TypeTag<A>) - Method in class org.apache.spark.sql.hive.HiveContext
Creates a table using the schema of the given class.
createTable(String, String, Seq<Attribute>, boolean, Option<CreateTableDesc>) - Method in class org.apache.spark.sql.hive.HiveMetastoreCatalog
Create a table with the specified database, table name, table description, and schema.
CreateTableAsSelect - Class in org.apache.spark.sql.hive.execution
:: Experimental :: Create table and insert the query result into it.
CreateTableAsSelect(String, String, LogicalPlan, boolean, Option<CreateTableDesc>) - Constructor for class org.apache.spark.sql.hive.execution.CreateTableAsSelect
 
CreateTables() - Method in class org.apache.spark.sql.hive.HiveMetastoreCatalog
 
CreateTableUsing - Class in org.apache.spark.sql.sources
 
CreateTableUsing(String, String, Map<String, String>) - Constructor for class org.apache.spark.sql.sources.CreateTableUsing
 
createTempDir(String, String) - Static method in class org.apache.spark.util.Utils
Create a temporary directory inside the given parent directory.
createTempLocalBlock() - Method in class org.apache.spark.storage.DiskBlockManager
Produces a unique block id and File suitable for storing local intermediate results.
createTempShuffleBlock() - Method in class org.apache.spark.storage.DiskBlockManager
Produces a unique block id and File suitable for storing shuffled intermediate results.
createTime() - Method in class org.apache.spark.scheduler.cluster.CoarseGrainedSchedulerBackend
 
createUsingIndex(Iterator<Product2<Object, VD2>>, ClassTag<VD2>) - Method in class org.apache.spark.graphx.impl.VertexPartitionBaseOps
Similar in effect to aggregateUsingIndex((a, b) => a).
createWorkspace(int) - Static method in class org.apache.spark.mllib.optimization.NNLS
 
creationSite() - Method in class org.apache.spark.rdd.RDD
User code that created this RDD (e.g. textFile, parallelize).
creationSite() - Method in class org.apache.spark.streaming.dstream.DStream
 
credentialsProvider() - Method in class org.apache.spark.streaming.kinesis.KinesisReceiver
 
CrossValidator - Class in org.apache.spark.ml.tuning
:: AlphaComponent :: K-fold cross validation.
CrossValidator() - Constructor for class org.apache.spark.ml.tuning.CrossValidator
 
CrossValidatorModel - Class in org.apache.spark.ml.tuning
:: AlphaComponent :: Model from k-fold cross validation.
CrossValidatorModel(CrossValidator, ParamMap, Model<?>) - Constructor for class org.apache.spark.ml.tuning.CrossValidatorModel
 
CrossValidatorParams - Interface in org.apache.spark.ml.tuning
 
CSV_DEFAULT_DIR() - Method in class org.apache.spark.metrics.sink.CsvSink
 
CSV_DEFAULT_PERIOD() - Method in class org.apache.spark.metrics.sink.CsvSink
 
CSV_DEFAULT_UNIT() - Method in class org.apache.spark.metrics.sink.CsvSink
 
CSV_KEY_DIR() - Method in class org.apache.spark.metrics.sink.CsvSink
 
CSV_KEY_PERIOD() - Method in class org.apache.spark.metrics.sink.CsvSink
 
CSV_KEY_UNIT() - Method in class org.apache.spark.metrics.sink.CsvSink
 
CsvSink - Class in org.apache.spark.metrics.sink
 
CsvSink(Properties, MetricRegistry, SecurityManager) - Constructor for class org.apache.spark.metrics.sink.CsvSink
 
currentAttemptId() - Method in interface org.apache.spark.SparkStageInfo
 
currentAttemptId() - Method in class org.apache.spark.SparkStageInfoImpl
 
currentInterval(Duration) - Static method in class org.apache.spark.streaming.Interval
 
currentLocalityIndex() - Method in class org.apache.spark.scheduler.TaskSetManager
 
currentResult() - Method in interface org.apache.spark.partial.ApproximateEvaluator
 
currentResult() - Method in class org.apache.spark.partial.CountEvaluator
 
currentResult() - Method in class org.apache.spark.partial.GroupedCountEvaluator
 
currentResult() - Method in class org.apache.spark.partial.GroupedMeanEvaluator
 
currentResult() - Method in class org.apache.spark.partial.GroupedSumEvaluator
 
currentResult() - Method in class org.apache.spark.partial.MeanEvaluator
 
currentResult() - Method in class org.apache.spark.partial.SumEvaluator
 
currentTime() - Method in interface org.apache.spark.streaming.util.Clock
 
currentTime() - Method in class org.apache.spark.streaming.util.ManualClock
 
currentTime() - Method in class org.apache.spark.streaming.util.SystemClock
 
currentUnrollMemory() - Method in class org.apache.spark.storage.MemoryStore
Return the amount of memory currently occupied for unrolling blocks across all threads.
currentUnrollMemoryForThisThread() - Method in class org.apache.spark.storage.MemoryStore
Return the amount of memory currently occupied for unrolling blocks by this thread.
currPrefLocs(Partition) - Method in class org.apache.spark.rdd.PartitionCoalescer
 

D

DAGScheduler - Class in org.apache.spark.scheduler
The high-level scheduling layer that implements stage-oriented scheduling.
DAGScheduler(SparkContext, TaskScheduler, LiveListenerBus, MapOutputTrackerMaster, BlockManagerMaster, SparkEnv, Clock) - Constructor for class org.apache.spark.scheduler.DAGScheduler
 
DAGScheduler(SparkContext, TaskScheduler) - Constructor for class org.apache.spark.scheduler.DAGScheduler
 
DAGScheduler(SparkContext) - Constructor for class org.apache.spark.scheduler.DAGScheduler
 
dagScheduler() - Method in class org.apache.spark.scheduler.DAGSchedulerSource
 
dagScheduler() - Method in class org.apache.spark.scheduler.TaskSchedulerImpl
 
dagScheduler() - Method in class org.apache.spark.SparkContext
 
DAGSchedulerActorSupervisor - Class in org.apache.spark.scheduler
 
DAGSchedulerActorSupervisor(DAGScheduler) - Constructor for class org.apache.spark.scheduler.DAGSchedulerActorSupervisor
 
DAGSchedulerEvent - Interface in org.apache.spark.scheduler
Types of events that can be handled by the DAGScheduler.
DAGSchedulerEventProcessActor - Class in org.apache.spark.scheduler
 
DAGSchedulerEventProcessActor(DAGScheduler) - Constructor for class org.apache.spark.scheduler.DAGSchedulerEventProcessActor
 
DAGSchedulerSource - Class in org.apache.spark.scheduler
 
DAGSchedulerSource(DAGScheduler) - Constructor for class org.apache.spark.scheduler.DAGSchedulerSource
 
data() - Method in class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.LaunchTask
 
data() - Method in class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.StatusUpdate
 
data() - Method in class org.apache.spark.storage.BlockResult
 
data() - Method in class org.apache.spark.storage.PutResult
 
data() - Method in class org.apache.spark.util.Distribution
 
data() - Method in class org.apache.spark.util.random.GapSamplingIterator
 
data() - Method in class org.apache.spark.util.random.GapSamplingReplacementIterator
 
database() - Method in class org.apache.spark.sql.hive.execution.CreateTableAsSelect
 
databaseName() - Method in class org.apache.spark.sql.hive.MetastoreRelation
 
dataDeserialize(BlockId, ByteBuffer, Serializer) - Method in class org.apache.spark.storage.BlockManager
Deserializes a ByteBuffer into an iterator of values and disposes of it when the end of the iterator is reached.
dataIncludesKey() - Method in class org.apache.spark.sql.parquet.ParquetRelation2
 
dataSchema() - Method in class org.apache.spark.sql.parquet.ParquetRelation2
 
dataSerialize(BlockId, Iterator<Object>, Serializer) - Method in class org.apache.spark.storage.BlockManager
Serializes into a byte buffer.
dataSerializeStream(BlockId, OutputStream, Iterator<Object>, Serializer) - Method in class org.apache.spark.storage.BlockManager
Serializes into a stream.
DataSinks() - Method in interface org.apache.spark.sql.hive.HiveStrategies
 
DataSourceStrategy - Class in org.apache.spark.sql.sources
A Strategy for planning scans over data sources defined using the sources API.
DataSourceStrategy() - Constructor for class org.apache.spark.sql.sources.DataSourceStrategy
 
DataType - Class in org.apache.spark.sql.api.java
The base type of all Spark SQL data types.
DataType() - Constructor for class org.apache.spark.sql.api.java.DataType
 
dataType() - Method in class org.apache.spark.sql.columnar.NativeColumnType
 
dataType() - Method in class org.apache.spark.sql.execution.PythonUDF
 
dataType() - Method in class org.apache.spark.sql.hive.HiveGenericUdaf
 
dataType() - Method in class org.apache.spark.sql.hive.HiveGenericUdf
 
dataType() - Method in class org.apache.spark.sql.hive.HiveSimpleUdf
 
dataType() - Method in class org.apache.spark.sql.hive.HiveUdaf
 
DataTypeConversions - Class in org.apache.spark.sql.types.util
 
DataTypeConversions() - Constructor for class org.apache.spark.sql.types.util.DataTypeConversions
 
DataValidators - Class in org.apache.spark.mllib.util
:: DeveloperApi :: A collection of methods used to validate data before applying ML algorithms.
DataValidators() - Constructor for class org.apache.spark.mllib.util.DataValidators
 
DATE - Class in org.apache.spark.sql.columnar
 
DATE() - Constructor for class org.apache.spark.sql.columnar.DATE
 
DateColumnAccessor - Class in org.apache.spark.sql.columnar
 
DateColumnAccessor(ByteBuffer) - Constructor for class org.apache.spark.sql.columnar.DateColumnAccessor
 
DateColumnBuilder - Class in org.apache.spark.sql.columnar
 
DateColumnBuilder() - Constructor for class org.apache.spark.sql.columnar.DateColumnBuilder
 
DateColumnStats - Class in org.apache.spark.sql.columnar
 
DateColumnStats() - Constructor for class org.apache.spark.sql.columnar.DateColumnStats
 
DateType - Static variable in class org.apache.spark.sql.api.java.DataType
Gets the DateType object.
DateType - Class in org.apache.spark.sql.api.java
The data type representing java.sql.Date values.
datum() - Method in class org.apache.spark.mllib.tree.impl.BaggedPoint
 
DDLParser - Class in org.apache.spark.sql.sources
A parser for foreign DDL commands.
DDLParser() - Constructor for class org.apache.spark.sql.sources.DDLParser
 
dead(String) - Method in class org.apache.spark.scheduler.cluster.SparkDeploySchedulerBackend
 
decayFactor() - Method in class org.apache.spark.mllib.clustering.StreamingKMeans
 
decimalMetadata() - Method in class org.apache.spark.sql.parquet.ParquetTypeInfo
 
decimalMetastoreString(DecimalType) - Static method in class org.apache.spark.sql.hive.HiveShim
 
DecimalType - Class in org.apache.spark.sql.api.java
The data type representing java.math.BigDecimal values.
DecimalType(int, int) - Constructor for class org.apache.spark.sql.api.java.DecimalType
 
DecimalType() - Constructor for class org.apache.spark.sql.api.java.DecimalType
 
decimalTypeInfo(DecimalType) - Static method in class org.apache.spark.sql.hive.HiveShim
 
decimalTypeInfoToCatalyst(PrimitiveObjectInspector) - Static method in class org.apache.spark.sql.hive.HiveShim
 
DecisionTree - Class in org.apache.spark.mllib.tree
:: Experimental :: A class which implements a decision tree learning algorithm for classification and regression.
DecisionTree(Strategy) - Constructor for class org.apache.spark.mllib.tree.DecisionTree
 
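As a quick illustration, a hedged sketch of training a classifier with DecisionTree.trainClassifier; the data path is a placeholder and sc is an existing SparkContext:

    import org.apache.spark.mllib.tree.DecisionTree
    import org.apache.spark.mllib.util.MLUtils

    val data = MLUtils.loadLibSVMFile(sc, "data/sample_libsvm_data.txt") // placeholder path
    val model = DecisionTree.trainClassifier(data, numClasses = 2,
      categoricalFeaturesInfo = Map[Int, Int](), // no categorical features
      impurity = "gini", maxDepth = 5, maxBins = 32)
    println(model.predict(data.first().features))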
DecisionTreeMetadata - Class in org.apache.spark.mllib.tree.impl
Learning and dataset metadata for DecisionTree.
DecisionTreeMetadata(int, long, int, int, Map<Object, Object>, Set<Object>, int[], Impurity, Enumeration.Value, int, int, double, int, int) - Constructor for class org.apache.spark.mllib.tree.impl.DecisionTreeMetadata
 
DecisionTreeModel - Class in org.apache.spark.mllib.tree.model
:: Experimental :: Decision tree model for classification or regression.
DecisionTreeModel(Node, Enumeration.Value) - Constructor for class org.apache.spark.mllib.tree.model.DecisionTreeModel
 
decoder(ByteBuffer, NativeColumnType<T>) - Static method in class org.apache.spark.sql.columnar.compression.BooleanBitSet
 
decoder() - Method in interface org.apache.spark.sql.columnar.compression.CompressibleColumnAccessor
 
decoder(ByteBuffer, NativeColumnType<T>) - Method in interface org.apache.spark.sql.columnar.compression.CompressionScheme
 
Decoder<T extends org.apache.spark.sql.catalyst.types.NativeType> - Interface in org.apache.spark.sql.columnar.compression
 
decoder(ByteBuffer, NativeColumnType<T>) - Static method in class org.apache.spark.sql.columnar.compression.DictionaryEncoding
 
decoder(ByteBuffer, NativeColumnType<T>) - Static method in class org.apache.spark.sql.columnar.compression.IntDelta
 
decoder(ByteBuffer, NativeColumnType<T>) - Static method in class org.apache.spark.sql.columnar.compression.LongDelta
 
decoder(ByteBuffer, NativeColumnType<T>) - Static method in class org.apache.spark.sql.columnar.compression.PassThrough
 
decoder(ByteBuffer, NativeColumnType<T>) - Static method in class org.apache.spark.sql.columnar.compression.RunLengthEncoding
 
decreaseRunningTasks(int) - Method in class org.apache.spark.scheduler.Pool
 
deepCopy() - Method in class org.apache.spark.mllib.tree.model.Node
Returns a deep copy of the subtree rooted at this node.
DEFAULT_BUFFER_SIZE() - Static method in class org.apache.spark.util.logging.RollingFileAppender
 
DEFAULT_CLEANER_TTL() - Static method in class org.apache.spark.streaming.StreamingContext
 
DEFAULT_LOG_DIR() - Static method in class org.apache.spark.scheduler.EventLoggingListener
 
DEFAULT_MINIMUM_SHARE() - Method in class org.apache.spark.scheduler.FairSchedulableBuilder
 
DEFAULT_POOL_NAME() - Method in class org.apache.spark.scheduler.FairSchedulableBuilder
 
DEFAULT_POOL_NAME() - Static method in class org.apache.spark.ui.jobs.JobProgressListener
 
DEFAULT_PORT() - Static method in class org.apache.spark.ui.SparkUI
 
DEFAULT_PREFIX() - Method in class org.apache.spark.metrics.MetricsConfig
 
DEFAULT_RETAINED_JOBS() - Static method in class org.apache.spark.ui.jobs.JobProgressListener
 
DEFAULT_RETAINED_STAGES() - Static method in class org.apache.spark.ui.jobs.JobProgressListener
 
DEFAULT_SCHEDULER_FILE() - Method in class org.apache.spark.scheduler.FairSchedulableBuilder
 
DEFAULT_SCHEDULING_MODE() - Method in class org.apache.spark.scheduler.FairSchedulableBuilder
 
DEFAULT_WEIGHT() - Method in class org.apache.spark.scheduler.FairSchedulableBuilder
 
defaultCorrName() - Static method in class org.apache.spark.mllib.stat.correlation.CorrelationNames
 
defaultFilter(Path) - Static method in class org.apache.spark.streaming.dstream.FileInputDStream
 
defaultMinPartitions() - Method in class org.apache.spark.api.java.JavaSparkContext
Default min number of partitions for Hadoop RDDs when not given by user
defaultMinPartitions() - Method in class org.apache.spark.SparkContext
Default min number of partitions for Hadoop RDDs when not given by user
defaultMinSplits() - Method in class org.apache.spark.api.java.JavaSparkContext
Deprecated.
As of Spark 1.0.0, defaultMinSplits is deprecated; use JavaSparkContext.defaultMinPartitions() instead.
defaultMinSplits() - Method in class org.apache.spark.SparkContext
Default min number of partitions for Hadoop RDDs when not given by user
defaultParallelism() - Method in class org.apache.spark.api.java.JavaSparkContext
Default level of parallelism to use when not given by user (e.g. parallelize and makeRDD).
defaultParallelism() - Method in class org.apache.spark.scheduler.cluster.CoarseGrainedSchedulerBackend
 
defaultParallelism() - Method in class org.apache.spark.scheduler.cluster.mesos.MesosSchedulerBackend
 
defaultParallelism() - Method in class org.apache.spark.scheduler.local.LocalBackend
 
defaultParallelism() - Method in interface org.apache.spark.scheduler.SchedulerBackend
 
defaultParallelism() - Method in interface org.apache.spark.scheduler.TaskScheduler
 
defaultParallelism() - Method in class org.apache.spark.scheduler.TaskSchedulerImpl
 
defaultParallelism() - Method in class org.apache.spark.SparkContext
Default level of parallelism to use when not given by user (e.g. parallelize and makeRDD).
defaultParams(String) - Static method in class org.apache.spark.mllib.tree.configuration.BoostingStrategy
Returns default configuration for the boosting algorithm
defaultPartitioner(RDD<?>, Seq<RDD<?>>) - Static method in class org.apache.spark.Partitioner
Choose a partitioner to use for a cogroup-like operation between a number of RDDs.
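To illustrate the rule, a small sketch (assuming an existing SparkContext sc): if any input already has a partitioner, the one with the most partitions is reused; otherwise a new HashPartitioner is created:

    import org.apache.spark.{HashPartitioner, Partitioner}

    val a = sc.parallelize(Seq((1, "a"), (2, "b"))).partitionBy(new HashPartitioner(4))
    val b = sc.parallelize(Seq((1, 1.0), (3, 2.0)))
    val p = Partitioner.defaultPartitioner(a, b) // reuses a's HashPartitioner(4)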
defaultPartitioner(int) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
 
defaultProbabilities() - Method in class org.apache.spark.util.Distribution
 
defaultSize() - Method in class org.apache.spark.sql.columnar.ColumnType
 
defaultSizeInBytes() - Method in interface org.apache.spark.sql.SQLConf
The default size in bytes to assign to a logical operator's estimation statistics.
DefaultSource - Class in org.apache.spark.sql.json
 
DefaultSource() - Constructor for class org.apache.spark.sql.json.DefaultSource
 
DefaultSource - Class in org.apache.spark.sql.parquet
Allows creation of parquet based tables using the syntax CREATE TEMPORARY TABLE ...
DefaultSource() - Constructor for class org.apache.spark.sql.parquet.DefaultSource
 
defaultStrategy(String) - Static method in class org.apache.spark.mllib.tree.configuration.Strategy
Construct a default set of parameters for DecisionTree
defaultStrategy() - Static method in class org.apache.spark.streaming.receiver.ActorSupervisorStrategy
 
defaultValue() - Method in class org.apache.spark.ml.param.Param
 
DeferredObjectAdapter - Class in org.apache.spark.sql.hive
 
DeferredObjectAdapter(ObjectInspector) - Constructor for class org.apache.spark.sql.hive.DeferredObjectAdapter
 
degrees() - Method in class org.apache.spark.graphx.GraphOps
The degree of each vertex in the graph.
degreesOfFreedom() - Method in class org.apache.spark.mllib.stat.test.ChiSqTestResult
 
degreesOfFreedom() - Method in interface org.apache.spark.mllib.stat.test.TestResult
Returns the degree(s) of freedom of the hypothesis test.
delaySeconds() - Method in class org.apache.spark.streaming.Checkpoint
 
delegate() - Method in class org.apache.spark.InterruptibleIterator
 
deleteAllCheckpoints() - Method in class org.apache.spark.mllib.tree.impl.NodeIdCache
Call this after training is finished to delete any remaining checkpoints.
deleteOldFiles() - Method in class org.apache.spark.util.logging.RollingFileAppender
Retain only the last few files.
deleteRecursively(File) - Static method in class org.apache.spark.util.Utils
Delete a file or directory and its contents recursively.
deleteRecursively(TachyonFile, TachyonFS) - Static method in class org.apache.spark.util.Utils
Delete a file or directory and its contents recursively.
dense(int, int, double[]) - Static method in class org.apache.spark.mllib.linalg.Matrices
Creates a column-major dense matrix.
dense(double, double...) - Static method in class org.apache.spark.mllib.linalg.Vectors
Creates a dense vector from its values.
dense(double, Seq<Object>) - Static method in class org.apache.spark.mllib.linalg.Vectors
Creates a dense vector from its values.
dense(double[]) - Static method in class org.apache.spark.mllib.linalg.Vectors
Creates a dense vector from a double array.
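A brief sketch of the dense factory methods; note that Matrices.dense expects the values in column-major order:

    import org.apache.spark.mllib.linalg.{Matrices, Vectors}

    val v = Vectors.dense(1.0, 2.0, 3.0) // dense vector [1.0, 2.0, 3.0]
    // 2x3 matrix, column-major:
    //   1.0  3.0  5.0
    //   2.0  4.0  6.0
    val m = Matrices.dense(2, 3, Array(1.0, 2.0, 3.0, 4.0, 5.0, 6.0))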
DenseMatrix - Class in org.apache.spark.mllib.linalg
Column-major dense matrix.
DenseMatrix(int, int, double[]) - Constructor for class org.apache.spark.mllib.linalg.DenseMatrix
 
DenseVector - Class in org.apache.spark.mllib.linalg
A dense vector represented by a value array.
DenseVector(double[]) - Constructor for class org.apache.spark.mllib.linalg.DenseVector
 
dependencies() - Method in class org.apache.spark.rdd.RDD
Get the list of dependencies of this RDD, taking into account whether the RDD is checkpointed or not.
dependencies() - Method in class org.apache.spark.streaming.dstream.DStream
List of parent DStreams on which this DStream depends.
dependencies() - Method in class org.apache.spark.streaming.dstream.FilteredDStream
 
dependencies() - Method in class org.apache.spark.streaming.dstream.FlatMappedDStream
 
dependencies() - Method in class org.apache.spark.streaming.dstream.FlatMapValuedDStream
 
dependencies() - Method in class org.apache.spark.streaming.dstream.ForEachDStream
 
dependencies() - Method in class org.apache.spark.streaming.dstream.GlommedDStream
 
dependencies() - Method in class org.apache.spark.streaming.dstream.InputDStream
 
dependencies() - Method in class org.apache.spark.streaming.dstream.MapPartitionedDStream
 
dependencies() - Method in class org.apache.spark.streaming.dstream.MappedDStream
 
dependencies() - Method in class org.apache.spark.streaming.dstream.MapValuedDStream
 
dependencies() - Method in class org.apache.spark.streaming.dstream.ReducedWindowedDStream
 
dependencies() - Method in class org.apache.spark.streaming.dstream.ShuffledDStream
 
dependencies() - Method in class org.apache.spark.streaming.dstream.StateDStream
 
dependencies() - Method in class org.apache.spark.streaming.dstream.TransformedDStream
 
dependencies() - Method in class org.apache.spark.streaming.dstream.UnionDStream
 
dependencies() - Method in class org.apache.spark.streaming.dstream.WindowedDStream
 
Dependency<T> - Class in org.apache.spark
:: DeveloperApi :: Base class for dependencies.
Dependency() - Constructor for class org.apache.spark.Dependency
 
deps() - Method in class org.apache.spark.rdd.CoGroupPartition
 
depth() - Method in class org.apache.spark.mllib.tree.model.DecisionTreeModel
Get depth of tree.
DeregisterReceiver - Class in org.apache.spark.streaming.scheduler
 
DeregisterReceiver(int, String, String) - Constructor for class org.apache.spark.streaming.scheduler.DeregisterReceiver
 
desc() - Method in class org.apache.spark.sql.hive.execution.CreateTableAsSelect
 
DescribeCommand - Class in org.apache.spark.sql.execution
:: DeveloperApi ::
DescribeCommand(SparkPlan, Seq<Attribute>, SQLContext) - Constructor for class org.apache.spark.sql.execution.DescribeCommand
 
describedTable() - Method in class org.apache.spark.sql.hive.test.TestHiveContext
 
DescribeHiveTableCommand - Class in org.apache.spark.sql.hive.execution
Implementation for "describe [extended] table".
DescribeHiveTableCommand(MetastoreRelation, Seq<Attribute>, boolean, HiveContext) - Constructor for class org.apache.spark.sql.hive.execution.DescribeHiveTableCommand
 
description() - Method in class org.apache.spark.ExceptionFailure
 
description() - Method in class org.apache.spark.storage.StorageLevel
 
description() - Method in class org.apache.spark.ui.jobs.UIData.StageUIData
 
DeserializationStream - Class in org.apache.spark.serializer
:: DeveloperApi :: A stream for reading serialized objects.
DeserializationStream() - Constructor for class org.apache.spark.serializer.DeserializationStream
 
deserialize(Object) - Method in class org.apache.spark.mllib.linalg.VectorUDT
 
deserialize(ByteBuffer, ClassTag<T>) - Method in class org.apache.spark.serializer.JavaSerializerInstance
 
deserialize(ByteBuffer, ClassLoader, ClassTag<T>) - Method in class org.apache.spark.serializer.JavaSerializerInstance
 
deserialize(ByteBuffer, ClassTag<T>) - Method in class org.apache.spark.serializer.KryoSerializerInstance
 
deserialize(ByteBuffer, ClassLoader, ClassTag<T>) - Method in class org.apache.spark.serializer.KryoSerializerInstance
 
deserialize(ByteBuffer, ClassTag<T>) - Method in class org.apache.spark.serializer.SerializerInstance
 
deserialize(ByteBuffer, ClassLoader, ClassTag<T>) - Method in class org.apache.spark.serializer.SerializerInstance
 
deserialize(Object) - Method in class org.apache.spark.sql.api.java.JavaToScalaUDTWrapper
Convert a SQL datum to the user type
deserialize(Object) - Method in class org.apache.spark.sql.api.java.ScalaToJavaUDTWrapper
Convert a SQL datum to the user type
deserialize(Object) - Method in class org.apache.spark.sql.api.java.UserDefinedType
Convert a SQL datum to the user type
deserialize(byte[], ClassTag<T>) - Static method in class org.apache.spark.sql.execution.SparkSqlSerializer
 
deserialize(Writable) - Method in class org.apache.spark.sql.hive.parquet.FakeParquetSerDe
 
deserialize(Object) - Method in class org.apache.spark.sql.test.ExamplePointUDT
 
deserialize(byte[]) - Static method in class org.apache.spark.util.Utils
Deserialize an object using Java serialization
deserialize(byte[], ClassLoader) - Static method in class org.apache.spark.util.Utils
Deserialize an object using Java serialization and the given ClassLoader
deserialized() - Method in class org.apache.spark.storage.MemoryEntry
 
deserialized() - Method in class org.apache.spark.storage.StorageLevel
 
deserializeFilterExpressions(Configuration) - Static method in class org.apache.spark.sql.parquet.ParquetFilters
Note: Inside the Hadoop API we only have access to Configuration, not to SparkContext, so we cannot use broadcasts to convey the actual filter predicate.
deserializeLongValue(byte[]) - Static method in class org.apache.spark.util.Utils
Deserialize a Long value (used for PythonPartitioner)
deserializeMapStatuses(byte[]) - Static method in class org.apache.spark.MapOutputTracker
 
deserializePlan(InputStream, Class<?>) - Method in class org.apache.spark.sql.hive.HiveFunctionWrapper
 
deserializeStream(InputStream) - Method in class org.apache.spark.serializer.JavaSerializerInstance
 
deserializeStream(InputStream, ClassLoader) - Method in class org.apache.spark.serializer.JavaSerializerInstance
 
deserializeStream(InputStream) - Method in class org.apache.spark.serializer.KryoSerializerInstance
 
deserializeStream(InputStream) - Method in class org.apache.spark.serializer.SerializerInstance
 
deserializeViaNestedStream(InputStream, SerializerInstance, Function1<DeserializationStream, BoxedUnit>) - Static method in class org.apache.spark.util.Utils
Deserialize via nested stream using specific serializer
deserializeWithDependencies(ByteBuffer) - Static method in class org.apache.spark.scheduler.Task
Deserialize the list of dependencies in a task serialized with serializeWithDependencies, and return the task itself as a serialized ByteBuffer.
destinationToken() - Static method in class org.apache.spark.sql.hive.HiveQl
 
destroy() - Method in class org.apache.spark.broadcast.Broadcast
Destroy all data and metadata related to this broadcast variable.
destroy(boolean) - Method in class org.apache.spark.broadcast.Broadcast
Destroy all data and metadata related to this broadcast variable.
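A minimal lifecycle sketch (assuming an existing SparkContext sc); once destroy is called, the broadcast variable cannot be used again:

    val words = sc.parallelize(Seq("a", "b", "c"))
    val lookup = sc.broadcast(Map("a" -> 1, "b" -> 2)) // shipped once per executor
    val mapped = words.map(w => lookup.value.getOrElse(w, 0)).collect()
    lookup.destroy() // removes all data and metadata for the broadcast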
destroyPythonWorker(String, Map<String, String>, Socket) - Method in class org.apache.spark.SparkEnv
 
destTableId() - Method in class org.apache.spark.sql.hive.ShimFileSinkDesc
 
details() - Method in class org.apache.spark.scheduler.Stage
 
details() - Method in class org.apache.spark.scheduler.StageInfo
 
determineBounds(ArrayBuffer<Tuple2<K, Object>>, int, Ordering<K>, ClassTag<K>) - Static method in class org.apache.spark.RangePartitioner
Determines the bounds for range partitioning from candidates with weights indicating how many items each represents.
DeveloperApi - Annotation Type in org.apache.spark.annotation
A lower-level, unstable API intended for developers.
diag(Vector) - Static method in class org.apache.spark.mllib.linalg.Matrices
Generate a diagonal matrix in DenseMatrix format from the supplied values.
dialect() - Method in class org.apache.spark.sql.hive.HiveContext
 
dialect() - Method in interface org.apache.spark.sql.SQLConf
The SQL dialect that is used when parsing queries.
DictionaryEncoding - Class in org.apache.spark.sql.columnar.compression
 
DictionaryEncoding() - Constructor for class org.apache.spark.sql.columnar.compression.DictionaryEncoding
 
DictionaryEncoding.Decoder<T extends org.apache.spark.sql.catalyst.types.NativeType> - Class in org.apache.spark.sql.columnar.compression
 
DictionaryEncoding.Decoder(ByteBuffer, NativeColumnType<T>) - Constructor for class org.apache.spark.sql.columnar.compression.DictionaryEncoding.Decoder
 
DictionaryEncoding.Encoder<T extends org.apache.spark.sql.catalyst.types.NativeType> - Class in org.apache.spark.sql.columnar.compression
 
DictionaryEncoding.Encoder(NativeColumnType<T>) - Constructor for class org.apache.spark.sql.columnar.compression.DictionaryEncoding.Encoder
 
diff(Self) - Method in class org.apache.spark.graphx.impl.VertexPartitionBaseOps
Hides vertices that are the same between this and other.
diff(VertexRDD<VD>) - Method in class org.apache.spark.graphx.impl.VertexRDDImpl
 
diff(VertexRDD<VD>) - Method in class org.apache.spark.graphx.VertexRDD
Hides vertices that are the same between this and other; for vertices that are different, keeps the values from other.
dir() - Method in class org.apache.spark.mllib.optimization.NNLS.Workspace
 
dir() - Method in class org.apache.spark.sql.hive.ShimFileSinkDesc
 
DirectTaskResult<T> - Class in org.apache.spark.scheduler
A TaskResult that contains the task's return value and accumulator updates.
DirectTaskResult(ByteBuffer, Map<Object, Object>, TaskMetrics) - Constructor for class org.apache.spark.scheduler.DirectTaskResult
 
DirectTaskResult() - Constructor for class org.apache.spark.scheduler.DirectTaskResult
 
disableOutputSpecValidation() - Static method in class org.apache.spark.rdd.PairRDDFunctions
Allows for the spark.hadoop.validateOutputSpecs checks to be disabled on a case-by-case basis; see SPARK-4835 for more details.
disconnected(SchedulerDriver) - Method in class org.apache.spark.scheduler.cluster.mesos.CoarseMesosSchedulerBackend
 
disconnected(SchedulerDriver) - Method in class org.apache.spark.scheduler.cluster.mesos.MesosSchedulerBackend
 
disconnected() - Method in class org.apache.spark.scheduler.cluster.SparkDeploySchedulerBackend
 
DISK_ONLY - Static variable in class org.apache.spark.api.java.StorageLevels
 
DISK_ONLY() - Static method in class org.apache.spark.storage.StorageLevel
 
DISK_ONLY_2 - Static variable in class org.apache.spark.api.java.StorageLevels
 
DISK_ONLY_2() - Static method in class org.apache.spark.storage.StorageLevel
 
diskBlockManager() - Method in class org.apache.spark.storage.BlockManager
 
DiskBlockManager - Class in org.apache.spark.storage
Creates and maintains the logical mapping between logical blocks and physical on-disk locations.
DiskBlockManager(BlockManager, SparkConf) - Constructor for class org.apache.spark.storage.DiskBlockManager
 
DiskBlockObjectWriter - Class in org.apache.spark.storage
BlockObjectWriter which writes directly to a file on disk.
DiskBlockObjectWriter(BlockId, File, Serializer, int, Function1<OutputStream, OutputStream>, boolean, ShuffleWriteMetrics) - Constructor for class org.apache.spark.storage.DiskBlockObjectWriter
 
diskBytesSpilled() - Method in class org.apache.spark.ui.jobs.UIData.ExecutorSummary
 
diskBytesSpilled() - Method in class org.apache.spark.ui.jobs.UIData.StageUIData
 
diskSize() - Method in class org.apache.spark.storage.BlockManagerMessages.UpdateBlockInfo
 
diskSize() - Method in class org.apache.spark.storage.BlockStatus
 
diskSize() - Method in class org.apache.spark.storage.RDDInfo
 
diskStore() - Method in class org.apache.spark.storage.BlockManager
 
DiskStore - Class in org.apache.spark.storage
Stores BlockManager blocks on disk.
DiskStore(BlockManager, DiskBlockManager) - Constructor for class org.apache.spark.storage.DiskStore
 
diskUsed() - Method in class org.apache.spark.storage.StorageStatus
Return the disk space used by this block manager.
diskUsed() - Method in class org.apache.spark.ui.exec.ExecutorSummaryInfo
 
diskUsedByRdd(int) - Method in class org.apache.spark.storage.StorageStatus
Return the disk space used by the given RDD in this block manager in O(1) time.
dispose(ByteBuffer) - Static method in class org.apache.spark.storage.BlockManager
Attempt to clean up a ByteBuffer if it is memory-mapped.
dist(Vector) - Method in class org.apache.spark.util.Vector
 
distinct() - Method in class org.apache.spark.api.java.JavaDoubleRDD
Return a new RDD containing the distinct elements in this RDD.
distinct(int) - Method in class org.apache.spark.api.java.JavaDoubleRDD
Return a new RDD containing the distinct elements in this RDD.
distinct() - Method in class org.apache.spark.api.java.JavaPairRDD
Return a new RDD containing the distinct elements in this RDD.
distinct(int) - Method in class org.apache.spark.api.java.JavaPairRDD
Return a new RDD containing the distinct elements in this RDD.
distinct() - Method in class org.apache.spark.api.java.JavaRDD
Return a new RDD containing the distinct elements in this RDD.
distinct(int) - Method in class org.apache.spark.api.java.JavaRDD
Return a new RDD containing the distinct elements in this RDD.
distinct(int, Ordering<T>) - Method in class org.apache.spark.rdd.RDD
Return a new RDD containing the distinct elements in this RDD.
distinct() - Method in class org.apache.spark.rdd.RDD
Return a new RDD containing the distinct elements in this RDD.
distinct() - Method in class org.apache.spark.sql.api.java.JavaSchemaRDD
Return a new RDD containing the distinct elements in this RDD.
distinct(int) - Method in class org.apache.spark.sql.api.java.JavaSchemaRDD
Return a new RDD containing the distinct elements in this RDD.
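A one-liner sketch of the distinct variants (assuming an existing SparkContext sc):

    val rdd = sc.parallelize(Seq(1, 1, 2, 3, 3))
    rdd.distinct().collect() // Array(1, 2, 3), element order not guaranteed
    rdd.distinct(2)          // same elements, shuffled into 2 partitions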
Distinct - Class in org.apache.spark.sql.execution
:: DeveloperApi :: Computes the set of distinct input rows using a HashSet.
Distinct(boolean, SparkPlan) - Constructor for class org.apache.spark.sql.execution.Distinct
 
distinct() - Method in class org.apache.spark.sql.SchemaRDD
 
distinct(int, Ordering<Row>) - Method in class org.apache.spark.sql.SchemaRDD
 
DistributedMatrix - Interface in org.apache.spark.mllib.linalg.distributed
Represents a distributively stored matrix backed by one or more RDDs.
Distribution - Class in org.apache.spark.util
Util for getting some stats from a small sample of numeric values, with some handy summary functions.
Distribution(double[], int, int) - Constructor for class org.apache.spark.util.Distribution
 
Distribution(Traversable<Object>) - Constructor for class org.apache.spark.util.Distribution
 
DIV() - Static method in class org.apache.spark.sql.hive.HiveQl
 
div(Duration) - Method in class org.apache.spark.streaming.Duration
 
divide(double) - Method in class org.apache.spark.util.Vector
 
doc() - Method in class org.apache.spark.ml.param.Param
 
doCancelAllJobs() - Method in class org.apache.spark.scheduler.DAGScheduler
 
doCheckpoint() - Method in class org.apache.spark.rdd.RDD
Performs the checkpointing of this RDD by saving this.
doCheckpoint() - Method in class org.apache.spark.rdd.RDDCheckpointData
 
DoCheckpoint - Class in org.apache.spark.streaming.scheduler
 
DoCheckpoint(Time) - Constructor for class org.apache.spark.streaming.scheduler.DoCheckpoint
 
doCleanupBroadcast(long, boolean) - Method in class org.apache.spark.ContextCleaner
Perform broadcast cleanup.
doCleanupRDD(int, boolean) - Method in class org.apache.spark.ContextCleaner
Perform RDD cleanup.
doCleanupShuffle(int, boolean) - Method in class org.apache.spark.ContextCleaner
Perform shuffle cleanup, asynchronously.
doesDirectoryContainAnyNewFiles(File, long) - Static method in class org.apache.spark.util.Utils
Determines if a directory contains any files newer than cutoff seconds.
doKillExecutors(Seq<String>) - Method in class org.apache.spark.scheduler.cluster.YarnSchedulerBackend
Request that the ApplicationMaster kill the specified executors.
doRequestTotalExecutors(int) - Method in class org.apache.spark.scheduler.cluster.YarnSchedulerBackend
Request executors from the ApplicationMaster by specifying the total number desired.
dot(Vector, Vector) - Static method in class org.apache.spark.mllib.linalg.BLAS
dot(x, y)
dot(Vector) - Method in class org.apache.spark.util.Vector
 
DOUBLE - Class in org.apache.spark.sql.columnar
 
DOUBLE() - Constructor for class org.apache.spark.sql.columnar.DOUBLE
 
doubleAccumulator(double) - Method in class org.apache.spark.api.java.JavaSparkContext
Create an Accumulator double variable, which tasks can "add" values to using the add method.
doubleAccumulator(double, String) - Method in class org.apache.spark.api.java.JavaSparkContext
Create an Accumulator double variable, which tasks can "add" values to using the add method.
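For Scala users, the analogous pattern on SparkContext is a named double accumulator; this sketch assumes an existing SparkContext sc, and the parse check is purely illustrative:

    val errors = sc.accumulator(0.0, "bad records") // named accumulators appear in the web UI
    sc.parallelize(Seq("1", "x", "3")).foreach { s =>
      if (scala.util.Try(s.toInt).isFailure) errors += 1.0 // tasks may only add
    }
    println(errors.value) // the value is readable only on the driver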
DoubleColumnAccessor - Class in org.apache.spark.sql.columnar
 
DoubleColumnAccessor(ByteBuffer) - Constructor for class org.apache.spark.sql.columnar.DoubleColumnAccessor
 
DoubleColumnBuilder - Class in org.apache.spark.sql.columnar
 
DoubleColumnBuilder() - Constructor for class org.apache.spark.sql.columnar.DoubleColumnBuilder
 
DoubleColumnStats - Class in org.apache.spark.sql.columnar
 
DoubleColumnStats() - Constructor for class org.apache.spark.sql.columnar.DoubleColumnStats
 
DoubleFlatMapFunction<T> - Interface in org.apache.spark.api.java.function
A function that returns zero or more records of type Double from each input record.
DoubleFunction<T> - Interface in org.apache.spark.api.java.function
A function that returns Doubles, and can be used to construct DoubleRDDs.
DoubleParam - Class in org.apache.spark.ml.param
Specialized version of Param[Double] for Java.
DoubleParam(Params, String, String, Option<Object>) - Constructor for class org.apache.spark.ml.param.DoubleParam
 
DoubleRDDFunctions - Class in org.apache.spark.rdd
Extra functions available on RDDs of Doubles through an implicit conversion.
DoubleRDDFunctions(RDD<Object>) - Constructor for class org.apache.spark.rdd.DoubleRDDFunctions
 
doubleRDDToDoubleRDDFunctions(RDD<Object>) - Static method in class org.apache.spark.SparkContext
 
doubleToDoubleWritable(double) - Static method in class org.apache.spark.SparkContext
 
doubleToMultiplier(double) - Static method in class org.apache.spark.util.Vector
 
DoubleType - Static variable in class org.apache.spark.sql.api.java.DataType
Gets the DoubleType object.
DoubleType - Class in org.apache.spark.sql.api.java
The data type representing double and Double values.
doubleWritableConverter() - Static method in class org.apache.spark.SparkContext
 
driver() - Method in class org.apache.spark.scheduler.cluster.mesos.CoarseMesosSchedulerBackend
 
driver() - Method in class org.apache.spark.scheduler.cluster.mesos.MesosSchedulerBackend
 
DRIVER_AKKA_ACTOR_NAME() - Method in class org.apache.spark.storage.BlockManagerMaster
 
DRIVER_IDENTIFIER() - Static method in class org.apache.spark.SparkContext
 
driverActor() - Method in class org.apache.spark.scheduler.cluster.CoarseGrainedSchedulerBackend
 
driverActor() - Method in class org.apache.spark.storage.BlockManagerMaster
 
driverActorSystemName() - Static method in class org.apache.spark.SparkEnv
 
driverSideSetup() - Method in class org.apache.spark.sql.hive.SparkHiveWriterContainer
 
dropFromMemory(BlockId, Either<Object[], ByteBuffer>) - Method in class org.apache.spark.storage.BlockManager
Drop a block from memory, possibly putting it on disk if applicable.
droppedBlocks() - Method in class org.apache.spark.storage.PutResult
 
droppedBlocks() - Method in class org.apache.spark.storage.ResultWithDroppedBlocks
 
DropTable - Class in org.apache.spark.sql.hive
 
DropTable(String, boolean) - Constructor for class org.apache.spark.sql.hive.DropTable
 
DropTable - Class in org.apache.spark.sql.hive.execution
:: DeveloperApi :: Drops a table from the metastore and removes it if it is cached.
DropTable(String, boolean) - Constructor for class org.apache.spark.sql.hive.execution.DropTable
 
dropTempTable(String) - Method in class org.apache.spark.sql.SQLContext
Drops the temporary table with the given table name in the catalog.
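A short round-trip sketch (the JSON path is a placeholder and sqlContext an existing SQLContext):

    val people = sqlContext.jsonFile("examples/people.json") // placeholder path
    people.registerTempTable("people")
    sqlContext.sql("SELECT name FROM people").collect()
    sqlContext.dropTempTable("people") // "people" is no longer resolvable in SQL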
Dst - Static variable in class org.apache.spark.graphx.TripletFields
Expose the destination and edge fields but not the source field.
dstAttr() - Method in class org.apache.spark.graphx.EdgeContext
The vertex attribute of the edge's destination vertex.
dstAttr() - Method in class org.apache.spark.graphx.EdgeTriplet
The destination vertex attribute
dstAttr() - Method in class org.apache.spark.graphx.impl.AggregatingEdgeContext
 
dstId() - Method in class org.apache.spark.graphx.Edge
 
dstId() - Method in class org.apache.spark.graphx.EdgeContext
The vertex id of the edge's destination vertex.
dstId() - Method in class org.apache.spark.graphx.impl.AggregatingEdgeContext
 
dstId() - Method in class org.apache.spark.graphx.impl.EdgeWithLocalIds
 
dstream() - Method in class org.apache.spark.streaming.api.java.JavaDStream
 
dstream() - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
 
dstream() - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
 
DStream<T> - Class in org.apache.spark.streaming.dstream
A Discretized Stream (DStream), the basic abstraction in Spark Streaming, is a continuous sequence of RDDs (of the same type) representing a continuous stream of data (see org.apache.spark.rdd.RDD in the Spark core documentation for more details on RDDs).
DStream(StreamingContext, ClassTag<T>) - Constructor for class org.apache.spark.streaming.dstream.DStream
 
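To make the abstraction concrete, a hedged word-count sketch over a socket source (host and port are placeholders; sc is an existing SparkContext):

    import org.apache.spark.streaming.{Seconds, StreamingContext}

    val ssc = new StreamingContext(sc, Seconds(1))      // 1-second batches
    val lines = ssc.socketTextStream("localhost", 9999) // placeholder source
    val counts = lines.flatMap(_.split(" ")).map((_, 1)).reduceByKey(_ + _)
    counts.print()
    ssc.start()
    ssc.awaitTermination()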
DStreamCheckpointData<T> - Class in org.apache.spark.streaming.dstream
 
DStreamCheckpointData(DStream<T>, ClassTag<T>) - Constructor for class org.apache.spark.streaming.dstream.DStreamCheckpointData
 
DStreamGraph - Class in org.apache.spark.streaming
 
DStreamGraph() - Constructor for class org.apache.spark.streaming.DStreamGraph
 
DTStatsAggregator - Class in org.apache.spark.mllib.tree.impl
DecisionTree statistics aggregator for a node.
DTStatsAggregator(DecisionTreeMetadata, Option<int[]>) - Constructor for class org.apache.spark.mllib.tree.impl.DTStatsAggregator
 
DummyCategoricalSplit - Class in org.apache.spark.mllib.tree.model
Split with no acceptable feature values for categorical features.
DummyCategoricalSplit(int, Enumeration.Value) - Constructor for class org.apache.spark.mllib.tree.model.DummyCategoricalSplit
 
DummyHighSplit - Class in org.apache.spark.mllib.tree.model
Split with maximum threshold for continuous features.
DummyHighSplit(int, Enumeration.Value) - Constructor for class org.apache.spark.mllib.tree.model.DummyHighSplit
 
DummyLowSplit - Class in org.apache.spark.mllib.tree.model
Split with minimum threshold for continuous features.
DummyLowSplit(int, Enumeration.Value) - Constructor for class org.apache.spark.mllib.tree.model.DummyLowSplit
 
dumpTree(Node, StringBuilder, int) - Static method in class org.apache.spark.sql.hive.HiveQl
 
duration() - Method in class org.apache.spark.scheduler.TaskInfo
 
Duration - Class in org.apache.spark.streaming
 
Duration(long) - Constructor for class org.apache.spark.streaming.Duration
 
duration() - Method in class org.apache.spark.streaming.Interval
 
Durations - Class in org.apache.spark.streaming
 
Durations() - Constructor for class org.apache.spark.streaming.Durations
 

E

e() - Method in class org.apache.spark.storage.ShuffleBlockFetcherIterator.FailureFetchResult
 
e() - Method in class org.apache.spark.streaming.scheduler.ErrorReported
 
Edge<ED> - Class in org.apache.spark.graphx
A single directed edge consisting of a source id, target id, and the data associated with the edge.
Edge(long, long, ED) - Constructor for class org.apache.spark.graphx.Edge
 
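A tiny construction sketch (assuming an existing SparkContext sc); vertex ids are Longs and the third field is the edge attribute:

    import org.apache.spark.graphx.{Edge, Graph}

    val vertices = sc.parallelize(Seq((1L, "alice"), (2L, "bob")))
    val edges = sc.parallelize(Seq(Edge(1L, 2L, "follows"))) // srcId, dstId, attr
    val graph = Graph(vertices, edges)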
EdgeActiveness - Enum in org.apache.spark.graphx.impl
Criteria for filtering edges based on activeness.
edgeArraySortDataFormat() - Static method in class org.apache.spark.graphx.Edge
 
edgeArraySortDataFormat() - Static method in class org.apache.spark.graphx.impl.EdgeWithLocalIds
 
EdgeContext<VD,ED,A> - Class in org.apache.spark.graphx
Represents an edge along with its neighboring vertices and allows sending messages along the edge.
EdgeContext() - Constructor for class org.apache.spark.graphx.EdgeContext
 
EdgeDirection - Class in org.apache.spark.graphx
The direction of a directed edge relative to a vertex.
edgeListFile(SparkContext, String, boolean, int, StorageLevel, StorageLevel) - Static method in class org.apache.spark.graphx.GraphLoader
Loads a graph from an edge list formatted file where each line contains two integers: a source id and a target id.
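A minimal sketch with the default arguments (the file path is a placeholder; sc is an existing SparkContext):

    import org.apache.spark.graphx.GraphLoader

    // Each data line holds "<srcId> <dstId>"; lines starting with '#' are comments.
    val graph = GraphLoader.edgeListFile(sc, "data/followers.txt") // placeholder path
    println(graph.numEdges)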
EdgeOnly - Static variable in class org.apache.spark.graphx.TripletFields
Expose only the edge field and not the source or destination field.
EdgePartition<ED,VD> - Class in org.apache.spark.graphx.impl
A collection of edges, along with referenced vertex attributes and an optional active vertex set for filtering computation on the edges.
EdgePartition(int[], int[], Object, GraphXPrimitiveKeyOpenHashMap<Object, Object>, GraphXPrimitiveKeyOpenHashMap<Object, Object>, long[], Object, Option<OpenHashSet<Object>>, ClassTag<ED>, ClassTag<VD>) - Constructor for class org.apache.spark.graphx.impl.EdgePartition
 
EdgePartitionBuilder<ED,VD> - Class in org.apache.spark.graphx.impl
Constructs an EdgePartition from scratch.
EdgePartitionBuilder(int, ClassTag<ED>, ClassTag<VD>) - Constructor for class org.apache.spark.graphx.impl.EdgePartitionBuilder
 
edgePartitionToMsgs(int, EdgePartition<?, ?>) - Static method in class org.apache.spark.graphx.impl.RoutingTablePartition
Generate a `RoutingTableMessage` for each vertex referenced in `edgePartition`.
EdgeRDD<ED> - Class in org.apache.spark.graphx
EdgeRDD[ED, VD] extends RDD[Edge[ED]] by storing the edges in columnar format on each partition for performance.
EdgeRDD(SparkContext, Seq<Dependency<?>>) - Constructor for class org.apache.spark.graphx.EdgeRDD
 
EdgeRDDImpl<ED,VD> - Class in org.apache.spark.graphx.impl
 
EdgeRDDImpl(RDD<Tuple2<Object, EdgePartition<ED, VD>>>, StorageLevel, ClassTag<ED>, ClassTag<VD>) - Constructor for class org.apache.spark.graphx.impl.EdgeRDDImpl
 
edges() - Method in class org.apache.spark.graphx.Graph
An RDD containing the edges and their associated attributes.
edges() - Method in class org.apache.spark.graphx.impl.GraphImpl
 
edges() - Method in class org.apache.spark.graphx.impl.ReplicatedVertexView
 
EdgeTriplet<VD,ED> - Class in org.apache.spark.graphx
An edge triplet represents an edge along with the vertex attributes of its neighboring vertices.
EdgeTriplet() - Constructor for class org.apache.spark.graphx.EdgeTriplet
 
EdgeWithLocalIds<ED> - Class in org.apache.spark.graphx.impl
Add a new edge to the partition.
EdgeWithLocalIds(long, long, int, int, ED) - Constructor for class org.apache.spark.graphx.impl.EdgeWithLocalIds
 
EigenValueDecomposition - Class in org.apache.spark.mllib.linalg
:: Experimental :: Compute eigen-decomposition.
EigenValueDecomposition() - Constructor for class org.apache.spark.mllib.linalg.EigenValueDecomposition
 
Either() - Static method in class org.apache.spark.graphx.EdgeDirection
Edges originating from *or* arriving at a vertex of interest.
elementClassTag() - Method in class org.apache.spark.rdd.RDD
 
elementIds() - Method in class org.apache.spark.mllib.recommendation.InLinkBlock
 
elementIds() - Method in class org.apache.spark.mllib.recommendation.OutLinkBlock
 
elements() - Method in class org.apache.spark.util.Vector
 
elementType() - Method in class org.apache.spark.sql.parquet.CatalystArrayContainsNullConverter
 
elementType() - Method in class org.apache.spark.sql.parquet.CatalystArrayConverter
 
elementType() - Method in class org.apache.spark.sql.parquet.CatalystNativeArrayConverter
 
emittedTaskSizeWarning() - Method in class org.apache.spark.scheduler.TaskSetManager
 
empty() - Static method in class org.apache.spark.graphx.impl.RoutingTablePartition
 
empty() - Static method in class org.apache.spark.ml.param.ParamMap
Returns an empty param map.
empty() - Static method in class org.apache.spark.scheduler.EventLoggingInfo
 
empty() - Static method in class org.apache.spark.storage.BlockStatus
 
empty() - Method in class org.apache.spark.util.TimeStampedHashMap
 
empty() - Method in class org.apache.spark.util.TimeStampedHashSet
 
empty() - Method in class org.apache.spark.util.TimeStampedWeakValueHashMap
 
emptyJson() - Static method in class org.apache.spark.util.Utils
Return an empty JSON object
emptyNode(int) - Static method in class org.apache.spark.mllib.tree.model.Node
Return a node with the given node id (but nothing else set).
emptyRDD() - Method in class org.apache.spark.api.java.JavaSparkContext
Get an RDD that has no partitions or elements.
EmptyRDD<T> - Class in org.apache.spark.rdd
An RDD that has no partitions and no elements.
EmptyRDD(SparkContext, ClassTag<T>) - Constructor for class org.apache.spark.rdd.EmptyRDD
 
emptyRDD(ClassTag<T>) - Method in class org.apache.spark.SparkContext
Get an RDD that has no partitions or elements.
enableLogForwarding() - Static method in class org.apache.spark.sql.parquet.ParquetRelation
 
encoder(NativeColumnType<T>) - Static method in class org.apache.spark.sql.columnar.compression.BooleanBitSet
 
encoder(NativeColumnType<T>) - Method in interface org.apache.spark.sql.columnar.compression.CompressionScheme
 
encoder(NativeColumnType<T>) - Static method in class org.apache.spark.sql.columnar.compression.DictionaryEncoding
 
Encoder<T extends org.apache.spark.sql.catalyst.types.NativeType> - Interface in org.apache.spark.sql.columnar.compression
 
encoder(NativeColumnType<T>) - Static method in class org.apache.spark.sql.columnar.compression.IntDelta
 
encoder(NativeColumnType<T>) - Static method in class org.apache.spark.sql.columnar.compression.LongDelta
 
encoder(NativeColumnType<T>) - Static method in class org.apache.spark.sql.columnar.compression.PassThrough
 
encoder(NativeColumnType<T>) - Static method in class org.apache.spark.sql.columnar.compression.RunLengthEncoding
 
end() - Method in class org.apache.spark.sql.parquet.CatalystArrayContainsNullConverter
 
end() - Method in class org.apache.spark.sql.parquet.CatalystArrayConverter
 
end() - Method in class org.apache.spark.sql.parquet.CatalystGroupConverter
 
end() - Method in class org.apache.spark.sql.parquet.CatalystMapConverter
 
end() - Method in class org.apache.spark.sql.parquet.CatalystNativeArrayConverter
 
end() - Method in class org.apache.spark.sql.parquet.CatalystPrimitiveRowConverter
 
end() - Method in class org.apache.spark.sql.parquet.CatalystStructConverter
 
endIdx() - Method in class org.apache.spark.util.Distribution
 
endTime() - Method in class org.apache.spark.scheduler.ApplicationEventListener
 
endTime() - Method in class org.apache.spark.streaming.Interval
 
endTime() - Method in class org.apache.spark.streaming.util.WriteAheadLogManager.LogInfo
 
endTime() - Method in class org.apache.spark.ui.jobs.UIData.JobUIData
 
enforceCorrectType(Object, DataType) - Static method in class org.apache.spark.sql.json.JsonRDD
 
enqueueFailedTask(TaskSetManager, long, Enumeration.Value, ByteBuffer) - Method in class org.apache.spark.scheduler.TaskResultGetter
 
enqueueSuccessfulTask(TaskSetManager, long, ByteBuffer) - Method in class org.apache.spark.scheduler.TaskResultGetter
 
EnsembleCombiningStrategy - Class in org.apache.spark.mllib.tree.configuration
Enum to select ensemble combining strategy for base learners
EnsembleCombiningStrategy() - Constructor for class org.apache.spark.mllib.tree.configuration.EnsembleCombiningStrategy
 
entries() - Method in class org.apache.spark.mllib.linalg.distributed.CoordinateMatrix
 
Entropy - Class in org.apache.spark.mllib.tree.impurity
:: Experimental :: Class for calculating entropy during binary classification.
Entropy() - Constructor for class org.apache.spark.mllib.tree.impurity.Entropy
 
EntropyAggregator - Class in org.apache.spark.mllib.tree.impurity
Class for updating views of a vector of sufficient statistics, in order to compute impurity from a sample.
EntropyAggregator(int) - Constructor for class org.apache.spark.mllib.tree.impurity.EntropyAggregator
 
EntropyCalculator - Class in org.apache.spark.mllib.tree.impurity
Stores statistics for one (node, feature, bin) for calculating impurity.
EntropyCalculator(double[]) - Constructor for class org.apache.spark.mllib.tree.impurity.EntropyCalculator
 
entrySet() - Method in class org.apache.spark.api.java.JavaUtils.SerializableMapWrapper
 
env() - Method in class org.apache.spark.api.java.JavaSparkContext
 
env() - Method in class org.apache.spark.scheduler.TaskSetManager
 
env() - Method in class org.apache.spark.SparkContext
 
env() - Method in class org.apache.spark.streaming.scheduler.ReceiverTracker.ReceiverLauncher
 
env() - Method in class org.apache.spark.streaming.StreamingContext
 
environmentDetails() - Method in class org.apache.spark.scheduler.SparkListenerEnvironmentUpdate
 
environmentDetails(SparkConf, String, Seq<String>, Seq<String>) - Static method in class org.apache.spark.SparkEnv
Return a map representation of jvm information, Spark properties, system properties, and class paths.
EnvironmentListener - Class in org.apache.spark.ui.env
:: DeveloperApi :: A SparkListener that prepares information to be displayed on the EnvironmentTab
EnvironmentListener() - Constructor for class org.apache.spark.ui.env.EnvironmentListener
 
environmentListener() - Method in class org.apache.spark.ui.SparkUI
 
EnvironmentPage - Class in org.apache.spark.ui.env
 
EnvironmentPage(EnvironmentTab) - Constructor for class org.apache.spark.ui.env.EnvironmentPage
 
EnvironmentTab - Class in org.apache.spark.ui.env
 
EnvironmentTab(SparkUI) - Constructor for class org.apache.spark.ui.env.EnvironmentTab
 
environmentUpdateFromJson(JsonAST.JValue) - Static method in class org.apache.spark.util.JsonProtocol
 
environmentUpdateToJson(SparkListenerEnvironmentUpdate) - Static method in class org.apache.spark.util.JsonProtocol
 
envVars() - Method in class org.apache.spark.sql.execution.PythonUDF
 
epoch() - Method in class org.apache.spark.scheduler.Task
 
epoch() - Method in class org.apache.spark.scheduler.TaskSetManager
 
EPSILON() - Static method in class org.apache.spark.mllib.util.MLUtils
 
equals(Object) - Method in class org.apache.spark.graphx.EdgeDirection
 
equals(Object) - Method in class org.apache.spark.HashPartitioner
 
equals(Object) - Method in class org.apache.spark.mllib.linalg.DenseMatrix
 
equals(Object) - Method in interface org.apache.spark.mllib.linalg.Vector
 
equals(IndexedSeq<Object>, double[], IndexedSeq<Object>, double[]) - Static method in class org.apache.spark.mllib.linalg.Vectors
Check equality between sparse/dense vectors
equals(Object) - Method in class org.apache.spark.mllib.recommendation.ALSPartitioner
 
equals(Object) - Method in class org.apache.spark.mllib.tree.model.InformationGainStats
 
equals(Object) - Method in class org.apache.spark.RangePartitioner
 
equals(Object) - Method in class org.apache.spark.rdd.ParallelCollectionPartition
 
equals(Object) - Method in class org.apache.spark.scheduler.AccumulableInfo
 
equals(Object) - Method in class org.apache.spark.scheduler.InputFormatInfo
 
equals(Object) - Method in class org.apache.spark.scheduler.SplitInfo
 
equals(Object) - Method in class org.apache.spark.scheduler.Stage
 
equals(Object) - Method in class org.apache.spark.sql.api.java.ArrayType
 
equals(Object) - Method in class org.apache.spark.sql.api.java.DecimalType
 
equals(Object) - Method in class org.apache.spark.sql.api.java.MapType
 
equals(Object) - Method in class org.apache.spark.sql.api.java.Row
 
equals(Object) - Method in class org.apache.spark.sql.api.java.StructField
 
equals(Object) - Method in class org.apache.spark.sql.api.java.StructType
 
equals(Object) - Method in class org.apache.spark.sql.api.java.UserDefinedType
 
equals(Object) - Method in class org.apache.spark.sql.parquet.ParquetRelation
 
equals(Object) - Method in class org.apache.spark.sql.sources.LogicalRelation
 
equals(Object) - Method in class org.apache.spark.storage.BlockId
 
equals(Object) - Method in class org.apache.spark.storage.BlockManagerId
 
equals(Object) - Method in class org.apache.spark.storage.StorageLevel
 
EqualTo - Class in org.apache.spark.sql.sources
 
EqualTo(String, Object) - Constructor for class org.apache.spark.sql.sources.EqualTo
 
error(SchedulerDriver, String) - Method in class org.apache.spark.scheduler.cluster.mesos.CoarseMesosSchedulerBackend
 
error(SchedulerDriver, String) - Method in class org.apache.spark.scheduler.cluster.mesos.MesosSchedulerBackend
 
error(String) - Method in class org.apache.spark.scheduler.TaskSchedulerImpl
 
error() - Method in class org.apache.spark.streaming.scheduler.DeregisterReceiver
 
error() - Method in class org.apache.spark.streaming.scheduler.ReportError
 
errorMessage() - Method in class org.apache.spark.ui.jobs.UIData.TaskUIData
 
ErrorReported - Class in org.apache.spark.streaming.scheduler
 
ErrorReported(String, Throwable) - Constructor for class org.apache.spark.streaming.scheduler.ErrorReported
 
estimate(Object) - Static method in class org.apache.spark.util.SizeEstimator
 
Estimator<M extends Model<M>> - Class in org.apache.spark.ml
:: AlphaComponent :: Abstract class for estimators that fit models to data.
Estimator() - Constructor for class org.apache.spark.ml.Estimator
 
estimator() - Method in interface org.apache.spark.ml.tuning.CrossValidatorParams
param for the estimator to be cross-validated
estimatorParamMaps() - Method in interface org.apache.spark.ml.tuning.CrossValidatorParams
param for estimator param maps
eval(Row) - Method in class org.apache.spark.sql.execution.PythonUDF
 
eval(Row) - Method in class org.apache.spark.sql.hive.HiveGenericUdf
 
eval(Row) - Method in class org.apache.spark.sql.hive.HiveGenericUdtf
 
eval(Row) - Method in class org.apache.spark.sql.hive.HiveSimpleUdf
 
eval(Row) - Method in class org.apache.spark.sql.hive.HiveUdafFunction
 
evaluate(SchemaRDD, ParamMap) - Method in class org.apache.spark.ml.evaluation.BinaryClassificationEvaluator
 
evaluate(SchemaRDD, ParamMap) - Method in class org.apache.spark.ml.Evaluator
Evaluates the output.
EvaluatePython - Class in org.apache.spark.sql.execution
:: DeveloperApi :: Evaluates a PythonUDF, appending the result to the end of the input tuple.
EvaluatePython(PythonUDF, LogicalPlan, AttributeReference) - Constructor for class org.apache.spark.sql.execution.EvaluatePython
 
Evaluator - Class in org.apache.spark.ml
:: AlphaComponent :: Abstract class for evaluators that compute metrics from predictions.
Evaluator() - Constructor for class org.apache.spark.ml.Evaluator
 
evaluator() - Method in interface org.apache.spark.ml.tuning.CrossValidatorParams
param for the evaluator for selection
event() - Method in class org.apache.spark.streaming.flume.SparkFlumeEvent
 
eventLogDir() - Method in class org.apache.spark.SparkContext
 
eventLogger() - Method in class org.apache.spark.SparkContext
 
EventLoggingInfo - Class in org.apache.spark.scheduler
Information needed to process the event logs associated with an application.
EventLoggingInfo(Seq<Path>, String, Option<CompressionCodec>, boolean) - Constructor for class org.apache.spark.scheduler.EventLoggingInfo
 
EventLoggingListener - Class in org.apache.spark.scheduler
A SparkListener that logs events to persistent storage.
EventLoggingListener(String, String, SparkConf, Configuration) - Constructor for class org.apache.spark.scheduler.EventLoggingListener
 
EventLoggingListener(String, String, SparkConf) - Constructor for class org.apache.spark.scheduler.EventLoggingListener
 
eventProcessActor() - Method in class org.apache.spark.scheduler.DAGScheduler
 
EventTransformer - Class in org.apache.spark.streaming.flume
A simple object that provides the implementation of readExternal and writeExternal for both the wrapper classes for Flume-style Events.
EventTransformer() - Constructor for class org.apache.spark.streaming.flume.EventTransformer
 
ExamplePoint - Class in org.apache.spark.sql.test
An example class to demonstrate UDT in Scala, Java, and Python.
ExamplePoint(double, double) - Constructor for class org.apache.spark.sql.test.ExamplePoint
 
ExamplePointUDT - Class in org.apache.spark.sql.test
User-defined type for ExamplePoint.
ExamplePointUDT() - Constructor for class org.apache.spark.sql.test.ExamplePointUDT
 
Except - Class in org.apache.spark.sql.execution
:: DeveloperApi :: Returns a table with the elements from left that are not in right, using the built-in Spark subtract function.
Except(SparkPlan, SparkPlan) - Constructor for class org.apache.spark.sql.execution.Except
 
except(SchemaRDD) - Method in class org.apache.spark.sql.SchemaRDD
Performs a relational except on two SchemaRDDs
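A hedged sketch, assuming two compatible SchemaRDDs obtained from an existing sqlContext:

    val all = sqlContext.sql("SELECT * FROM logs")       // placeholder table names
    val seen = sqlContext.sql("SELECT * FROM logs_seen")
    val unseen = all.except(seen) // rows present in all but absent from seen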
exception() - Method in class org.apache.spark.scheduler.JobFailed
 
EXCEPTION_PRINT_INTERVAL() - Method in class org.apache.spark.scheduler.TaskSetManager
 
ExceptionFailure - Class in org.apache.spark
:: DeveloperApi :: Task failed due to a runtime exception.
ExceptionFailure(String, String, StackTraceElement[], String, Option<TaskMetrics>) - Constructor for class org.apache.spark.ExceptionFailure
 
ExceptionFailure(Throwable, Option<TaskMetrics>) - Constructor for class org.apache.spark.ExceptionFailure
 
exceptionFromJson(JsonAST.JValue) - Static method in class org.apache.spark.util.JsonProtocol
 
exceptionString(Throwable) - Static method in class org.apache.spark.util.Utils
Return a nice string representation of the exception.
exceptionToJson(Exception) - Static method in class org.apache.spark.util.JsonProtocol
 
Exchange - Class in org.apache.spark.sql.execution
:: DeveloperApi ::
Exchange(Partitioning, SparkPlan) - Constructor for class org.apache.spark.sql.execution.Exchange
 
execArgs() - Method in class org.apache.spark.scheduler.cluster.mesos.MesosSchedulerBackend
 
execId() - Method in class org.apache.spark.ExecutorLostFailure
 
execId() - Method in class org.apache.spark.scheduler.ExecutorAdded
 
execId() - Method in class org.apache.spark.scheduler.ExecutorLost
 
execId() - Method in class org.apache.spark.scheduler.SparkListenerExecutorMetricsUpdate
 
execId() - Method in class org.apache.spark.storage.BlockManagerMessages.RemoveExecutor
 
execute() - Method in class org.apache.spark.sql.columnar.InMemoryColumnarTableScan
 
execute() - Method in class org.apache.spark.sql.execution.Aggregate
Substituted version of the aggregate expressions, used to compute final output rows given a group and the result of all aggregate computations.
execute() - Method in class org.apache.spark.sql.execution.BatchPythonEvaluation
 
execute() - Method in interface org.apache.spark.sql.execution.Command
 
execute() - Method in class org.apache.spark.sql.execution.Distinct
 
execute() - Method in class org.apache.spark.sql.execution.Except
 
execute() - Method in class org.apache.spark.sql.execution.Exchange
 
execute() - Method in class org.apache.spark.sql.execution.ExecutedCommand
 
execute() - Method in class org.apache.spark.sql.execution.ExistingRdd
 
execute() - Method in class org.apache.spark.sql.execution.ExternalSort
 
execute() - Method in class org.apache.spark.sql.execution.Filter
 
execute() - Method in class org.apache.spark.sql.execution.Generate
 
execute() - Method in class org.apache.spark.sql.execution.GeneratedAggregate
 
execute() - Method in class org.apache.spark.sql.execution.Intersect
 
execute() - Method in class org.apache.spark.sql.execution.joins.BroadcastHashJoin
 
execute() - Method in class org.apache.spark.sql.execution.joins.BroadcastNestedLoopJoin
 
execute() - Method in class org.apache.spark.sql.execution.joins.CartesianProduct
 
execute() - Method in class org.apache.spark.sql.execution.joins.HashOuterJoin
 
execute() - Method in class org.apache.spark.sql.execution.joins.LeftSemiJoinBNL
 
execute() - Method in class org.apache.spark.sql.execution.joins.LeftSemiJoinHash
 
execute() - Method in class org.apache.spark.sql.execution.joins.ShuffledHashJoin
 
execute() - Method in class org.apache.spark.sql.execution.Limit
 
execute() - Method in class org.apache.spark.sql.execution.OutputFaker
 
execute() - Method in class org.apache.spark.sql.execution.PhysicalRDD
 
execute() - Method in class org.apache.spark.sql.execution.Project
 
execute() - Method in class org.apache.spark.sql.execution.Sample
 
execute() - Method in class org.apache.spark.sql.execution.Sort
 
execute() - Method in class org.apache.spark.sql.execution.SparkPlan
Runs this query returning the result as an RDD.
execute() - Method in class org.apache.spark.sql.execution.TakeOrdered
 
execute() - Method in class org.apache.spark.sql.execution.Union
 
execute() - Method in class org.apache.spark.sql.hive.execution.CreateTableAsSelect
 
execute() - Method in class org.apache.spark.sql.hive.execution.HiveTableScan
 
execute() - Method in class org.apache.spark.sql.hive.execution.ScriptTransformation
 
execute() - Method in class org.apache.spark.sql.parquet.InsertIntoParquetTable
Inserts all rows into the Parquet file.
execute() - Method in class org.apache.spark.sql.parquet.ParquetTableScan
 
execute(Seq<String>, File) - Static method in class org.apache.spark.util.Utils
Execute a command in the given working directory, throwing an exception if it completes with an exit code other than 0.
executeAndGetOutput(Seq<String>, File, Map<String, String>) - Static method in class org.apache.spark.util.Utils
Execute a command and get its output, throwing an exception if it yields a code other than 0.
executeCollect() - Method in interface org.apache.spark.sql.execution.Command
 
executeCollect() - Method in class org.apache.spark.sql.execution.ExecutedCommand
 
executeCollect() - Method in class org.apache.spark.sql.execution.Limit
A custom implementation modeled after the take function on RDDs but which never runs any job locally.
executeCollect() - Method in class org.apache.spark.sql.execution.SparkPlan
Runs this query returning the result as an array.
executeCollect() - Method in class org.apache.spark.sql.execution.TakeOrdered
 
ExecutedCommand - Class in org.apache.spark.sql.execution
 
ExecutedCommand(RunnableCommand) - Constructor for class org.apache.spark.sql.execution.ExecutedCommand
 
executePlan(LogicalPlan) - Method in class org.apache.spark.sql.hive.test.TestHiveContext
 
executor() - Method in class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.KillTask
 
executor() - Method in class org.apache.spark.scheduler.local.LocalActor
 
executor() - Method in class org.apache.spark.streaming.CheckpointWriter
 
executor_() - Method in class org.apache.spark.streaming.receiver.Receiver
Handler object that runs the receiver.
executorActor() - Method in class org.apache.spark.scheduler.cluster.ExecutorData
 
executorActorSystemName() - Static method in class org.apache.spark.SparkEnv
 
executorAdded(String, String, String, int, int) - Method in class org.apache.spark.scheduler.cluster.SparkDeploySchedulerBackend
 
executorAdded(String, String) - Method in class org.apache.spark.scheduler.DAGScheduler
 
ExecutorAdded - Class in org.apache.spark.scheduler
 
ExecutorAdded(String, String) - Constructor for class org.apache.spark.scheduler.ExecutorAdded
 
executorAdded(String, String) - Method in class org.apache.spark.scheduler.TaskSchedulerImpl
 
executorAdded() - Method in class org.apache.spark.scheduler.TaskSetManager
 
executorAddress() - Method in class org.apache.spark.scheduler.cluster.ExecutorData
 
ExecutorAllocationClient - Interface in org.apache.spark
A client that communicates with the cluster manager to request or kill executors.
ExecutorAllocationManager - Class in org.apache.spark
An agent that dynamically allocates and removes executors based on the workload.
ExecutorAllocationManager(ExecutorAllocationClient, LiveListenerBus, SparkConf) - Constructor for class org.apache.spark.ExecutorAllocationManager
 
executorAllocationManager() - Method in class org.apache.spark.SparkContext
 
ExecutorCacheTaskLocation - Class in org.apache.spark.scheduler
A location that includes both a host and an executor id on that host.
ExecutorCacheTaskLocation(String, String) - Constructor for class org.apache.spark.scheduler.ExecutorCacheTaskLocation
 
ExecutorData - Class in org.apache.spark.scheduler.cluster
Grouping of data for an executor used by CoarseGrainedSchedulerBackend.
ExecutorData(ActorRef, Address, String, int, int) - Constructor for class org.apache.spark.scheduler.cluster.ExecutorData
 
executorEnvs() - Method in class org.apache.spark.SparkContext
 
ExecutorExited - Class in org.apache.spark.scheduler
 
ExecutorExited(int) - Constructor for class org.apache.spark.scheduler.ExecutorExited
 
executorHeartbeatReceived(String, Tuple4<Object, Object, Object, TaskMetrics>[], BlockManagerId) - Method in class org.apache.spark.scheduler.DAGScheduler
Update metrics for in-progress tasks and let the master know that the BlockManager is still alive.
executorHeartbeatReceived(String, Tuple2<Object, TaskMetrics>[], BlockManagerId) - Method in interface org.apache.spark.scheduler.TaskScheduler
Update metrics for in-progress tasks and let the master know that the BlockManager is still alive.
executorHeartbeatReceived(String, Tuple2<Object, TaskMetrics>[], BlockManagerId) - Method in class org.apache.spark.scheduler.TaskSchedulerImpl
Update metrics for in-progress tasks and let the master know that the BlockManager is still alive.
executorHost() - Method in class org.apache.spark.scheduler.cluster.ExecutorData
 
executorId() - Method in class org.apache.spark.Heartbeat
 
executorId() - Method in class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.RegisterExecutor
 
executorId() - Method in class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.RemoveExecutor
 
executorId() - Method in class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.StatusUpdate
 
executorId() - Method in class org.apache.spark.scheduler.ExecutorCacheTaskLocation
 
executorId() - Method in class org.apache.spark.scheduler.TaskDescription
 
executorId() - Method in class org.apache.spark.scheduler.TaskInfo
 
executorId() - Method in class org.apache.spark.scheduler.WorkerOffer
 
executorId() - Method in class org.apache.spark.SparkEnv
 
executorId() - Method in class org.apache.spark.storage.BlockManagerId
 
executorId() - Method in class org.apache.spark.storage.BlockManagerMessages.GetActorSystemHostPortForExecutor
 
executorIds() - Method in class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.KillExecutors
 
executorIdToBlockManagerId() - Method in class org.apache.spark.ui.jobs.JobProgressListener
 
executorIdToStorageStatus() - Method in class org.apache.spark.storage.StorageStatusListener
 
ExecutorLossReason - Class in org.apache.spark.scheduler
Represents an explanation for an executor or whole slave failing or exiting.
ExecutorLossReason(String) - Constructor for class org.apache.spark.scheduler.ExecutorLossReason
 
executorLost(SchedulerDriver, Protos.ExecutorID, Protos.SlaveID, int) - Method in class org.apache.spark.scheduler.cluster.mesos.CoarseMesosSchedulerBackend
 
executorLost(SchedulerDriver, Protos.ExecutorID, Protos.SlaveID, int) - Method in class org.apache.spark.scheduler.cluster.mesos.MesosSchedulerBackend
 
executorLost(String) - Method in class org.apache.spark.scheduler.DAGScheduler
 
ExecutorLost - Class in org.apache.spark.scheduler
 
ExecutorLost(String) - Constructor for class org.apache.spark.scheduler.ExecutorLost
 
executorLost(String, String) - Method in class org.apache.spark.scheduler.Pool
 
executorLost(String, String) - Method in interface org.apache.spark.scheduler.Schedulable
 
executorLost(String, ExecutorLossReason) - Method in class org.apache.spark.scheduler.TaskSchedulerImpl
 
executorLost(String, String) - Method in class org.apache.spark.scheduler.TaskSetManager
Called by TaskScheduler when an executor is lost so we can re-enqueue our tasks
ExecutorLostFailure - Class in org.apache.spark
:: DeveloperApi :: The task failed because the executor that it was running on was lost.
ExecutorLostFailure(String) - Constructor for class org.apache.spark.ExecutorLostFailure
 
executorMemory() - Method in class org.apache.spark.SparkContext
 
executorPct() - Method in class org.apache.spark.scheduler.RuntimePercentage
 
executorRemoved(String, String, Option<Object>) - Method in class org.apache.spark.scheduler.cluster.SparkDeploySchedulerBackend
 
executorRunTime() - Method in class org.apache.spark.ui.jobs.UIData.StageUIData
 
executorSideSetup(int, int, int) - Method in class org.apache.spark.sql.hive.SparkHiveWriterContainer
 
ExecutorsListener - Class in org.apache.spark.ui.exec
:: DeveloperApi :: A SparkListener that prepares information to be displayed on the ExecutorsTab
ExecutorsListener(StorageStatusListener) - Constructor for class org.apache.spark.ui.exec.ExecutorsListener
 
executorsListener() - Method in class org.apache.spark.ui.SparkUI
 
ExecutorsPage - Class in org.apache.spark.ui.exec
 
ExecutorsPage(ExecutorsTab, boolean) - Constructor for class org.apache.spark.ui.exec.ExecutorsPage
 
ExecutorsTab - Class in org.apache.spark.ui.exec
 
ExecutorsTab(SparkUI) - Constructor for class org.apache.spark.ui.exec.ExecutorsTab
 
executorSummary() - Method in class org.apache.spark.ui.jobs.UIData.StageUIData
 
ExecutorSummaryInfo - Class in org.apache.spark.ui.exec
Summary information about an executor to display in the UI.
ExecutorSummaryInfo(String, String, int, long, long, int, int, int, int, long, long, long, long, long) - Constructor for class org.apache.spark.ui.exec.ExecutorSummaryInfo
 
ExecutorTable - Class in org.apache.spark.ui.jobs
Stage summary grouped by executors.
ExecutorTable(int, int, StagesTab) - Constructor for class org.apache.spark.ui.jobs.ExecutorTable
 
ExecutorThreadDumpPage - Class in org.apache.spark.ui.exec
 
ExecutorThreadDumpPage(ExecutorsTab) - Constructor for class org.apache.spark.ui.exec.ExecutorThreadDumpPage
 
executorToDuration() - Method in class org.apache.spark.ui.exec.ExecutorsListener
 
executorToInputBytes() - Method in class org.apache.spark.ui.exec.ExecutorsListener
 
executorToOutputBytes() - Method in class org.apache.spark.ui.exec.ExecutorsListener
 
executorToShuffleRead() - Method in class org.apache.spark.ui.exec.ExecutorsListener
 
executorToShuffleWrite() - Method in class org.apache.spark.ui.exec.ExecutorsListener
 
executorToTasksActive() - Method in class org.apache.spark.ui.exec.ExecutorsListener
 
executorToTasksComplete() - Method in class org.apache.spark.ui.exec.ExecutorsListener
 
executorToTasksFailed() - Method in class org.apache.spark.ui.exec.ExecutorsListener
 
ExistingEdgePartitionBuilder<ED,VD> - Class in org.apache.spark.graphx.impl
Constructs an EdgePartition from an existing EdgePartition with the same vertex set.
ExistingEdgePartitionBuilder(GraphXPrimitiveKeyOpenHashMap<Object, Object>, long[], Object, Option<OpenHashSet<Object>>, int, ClassTag<ED>, ClassTag<VD>) - Constructor for class org.apache.spark.graphx.impl.ExistingEdgePartitionBuilder
 
ExistingRdd - Class in org.apache.spark.sql.execution
 
ExistingRdd(Seq<Attribute>, RDD<Row>) - Constructor for class org.apache.spark.sql.execution.ExistingRdd
 
exitCode() - Method in class org.apache.spark.scheduler.ExecutorExited
 
Experimental - Annotation Type in org.apache.spark.annotation
An experimental user-facing API.
ExplainCommand - Class in org.apache.spark.sql.execution
An explain command for users to see how a command will be executed.
ExplainCommand(LogicalPlan, Seq<Attribute>, boolean, SQLContext) - Constructor for class org.apache.spark.sql.execution.ExplainCommand
 
explainedVariance() - Method in class org.apache.spark.mllib.evaluation.RegressionMetrics
Returns the explained variance regression score.
explainParams() - Method in interface org.apache.spark.ml.param.Params
Returns the documentation of all params.
explode() - Static method in class org.apache.spark.sql.hive.HiveQl
 
exprs() - Method in class org.apache.spark.sql.hive.HiveUdafFunction
 
extended() - Method in class org.apache.spark.sql.execution.ExplainCommand
 
ExtendedHiveQlParser - Class in org.apache.spark.sql.hive
A parser that recognizes all HiveQL constructs together with Spark SQL specific extensions.
ExtendedHiveQlParser() - Constructor for class org.apache.spark.sql.hive.ExtendedHiveQlParser
 
externalShuffleServiceEnabled() - Method in class org.apache.spark.storage.BlockManager
 
ExternalSort - Class in org.apache.spark.sql.execution
:: DeveloperApi :: Performs a sort, spilling to disk as needed.
ExternalSort(Seq<SortOrder>, boolean, SparkPlan) - Constructor for class org.apache.spark.sql.execution.ExternalSort
 
externalSortEnabled() - Method in interface org.apache.spark.sql.SQLConf
When true, the planner will use the external sort, which may spill to disk.
extraCoresPerSlave() - Method in class org.apache.spark.scheduler.cluster.mesos.CoarseMesosSchedulerBackend
 
extract(ByteBuffer) - Static method in class org.apache.spark.sql.columnar.BOOLEAN
 
extract(ByteBuffer, MutableRow, int) - Static method in class org.apache.spark.sql.columnar.BOOLEAN
 
extract(ByteBuffer) - Static method in class org.apache.spark.sql.columnar.BYTE
 
extract(ByteBuffer, MutableRow, int) - Static method in class org.apache.spark.sql.columnar.BYTE
 
extract(ByteBuffer) - Method in class org.apache.spark.sql.columnar.ByteArrayColumnType
 
extract(ByteBuffer) - Method in class org.apache.spark.sql.columnar.ColumnType
Extracts a value out of the buffer at the buffer's current position.
extract(ByteBuffer, MutableRow, int) - Method in class org.apache.spark.sql.columnar.ColumnType
Extracts a value out of the buffer at the buffer's current position and stores in row(ordinal).
extract(ByteBuffer) - Static method in class org.apache.spark.sql.columnar.DATE
 
extract(ByteBuffer) - Static method in class org.apache.spark.sql.columnar.DOUBLE
 
extract(ByteBuffer, MutableRow, int) - Static method in class org.apache.spark.sql.columnar.DOUBLE
 
extract(ByteBuffer) - Static method in class org.apache.spark.sql.columnar.FLOAT
 
extract(ByteBuffer, MutableRow, int) - Static method in class org.apache.spark.sql.columnar.FLOAT
 
extract(ByteBuffer) - Static method in class org.apache.spark.sql.columnar.INT
 
extract(ByteBuffer, MutableRow, int) - Static method in class org.apache.spark.sql.columnar.INT
 
extract(ByteBuffer) - Static method in class org.apache.spark.sql.columnar.LONG
 
extract(ByteBuffer, MutableRow, int) - Static method in class org.apache.spark.sql.columnar.LONG
 
extract(ByteBuffer) - Static method in class org.apache.spark.sql.columnar.SHORT
 
extract(ByteBuffer, MutableRow, int) - Static method in class org.apache.spark.sql.columnar.SHORT
 
extract(ByteBuffer) - Static method in class org.apache.spark.sql.columnar.STRING
 
extract(ByteBuffer) - Static method in class org.apache.spark.sql.columnar.TIMESTAMP
 
extractDistribution(Function1<BatchInfo, Option<Object>>) - Method in class org.apache.spark.streaming.scheduler.StatsReportListener
 
extractDoubleDistribution(Seq<Tuple2<TaskInfo, TaskMetrics>>, Function2<TaskInfo, TaskMetrics, Option<Object>>) - Static method in class org.apache.spark.scheduler.StatsReportListener
 
extractFn() - Method in class org.apache.spark.ui.JettyUtils.ServletParams
 
extractLongDistribution(Seq<Tuple2<TaskInfo, TaskMetrics>>, Function2<TaskInfo, TaskMetrics, Option<Object>>) - Static method in class org.apache.spark.scheduler.StatsReportListener
 
extractMultiClassCategories(int, int) - Static method in class org.apache.spark.mllib.tree.DecisionTree
Nested method to extract list of eligible categories given an index.
ExtractPythonUdfs - Class in org.apache.spark.sql.execution
Extracts PythonUDFs from operators, rewriting the query plan so that the UDF can be evaluated alone in a batch.
ExtractPythonUdfs() - Constructor for class org.apache.spark.sql.execution.ExtractPythonUdfs
 
extractSingle(MutableRow, int) - Method in class org.apache.spark.sql.columnar.BasicColumnAccessor
 
extractSingle(MutableRow, int) - Method in interface org.apache.spark.sql.columnar.compression.CompressibleColumnAccessor
 
extractTo(MutableRow, int) - Method in class org.apache.spark.sql.columnar.BasicColumnAccessor
 
extractTo(MutableRow, int) - Method in interface org.apache.spark.sql.columnar.ColumnAccessor
 
extractTo(MutableRow, int) - Method in interface org.apache.spark.sql.columnar.NullableColumnAccessor
 
extraStrategies() - Method in class org.apache.spark.sql.SQLContext
:: DeveloperApi :: Allows extra strategies to be injected into the query planner at runtime.
eye(int) - Static method in class org.apache.spark.mllib.linalg.Matrices
Generate an identity matrix in DenseMatrix format.
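A quick usage sketch:

    import org.apache.spark.mllib.linalg.Matrices

    // 3 x 3 identity matrix, returned as a DenseMatrix.
    val identity = Matrices.eye(3)
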

F

f() - Method in class org.apache.spark.rdd.ZippedPartitionsRDD2
 
f() - Method in class org.apache.spark.rdd.ZippedPartitionsRDD3
 
f() - Method in class org.apache.spark.rdd.ZippedPartitionsRDD4
 
f1Measure() - Method in class org.apache.spark.mllib.evaluation.MultilabelMetrics
Returns the document-based f1-measure, averaged over all documents.
f1Measure(double) - Method in class org.apache.spark.mllib.evaluation.MultilabelMetrics
Returns f1-measure for a given label (category)
failed() - Method in class org.apache.spark.scheduler.TaskInfo
 
FAILED() - Static method in class org.apache.spark.TaskState
 
failedJobs() - Method in class org.apache.spark.ui.jobs.JobProgressListener
 
failedStages() - Method in class org.apache.spark.scheduler.DAGScheduler
 
failedStages() - Method in class org.apache.spark.ui.jobs.JobProgressListener
 
FailedStageTable - Class in org.apache.spark.ui.jobs
 
FailedStageTable(Seq<StageInfo>, String, JobProgressListener, boolean) - Constructor for class org.apache.spark.ui.jobs.FailedStageTable
 
failedTasks() - Method in class org.apache.spark.ui.exec.ExecutorSummaryInfo
 
failedTasks() - Method in class org.apache.spark.ui.jobs.UIData.ExecutorSummary
 
failure() - Method in class org.apache.spark.partial.ApproximateActionListener
 
failureReason() - Method in class org.apache.spark.scheduler.StageInfo
If the stage failed, the reason why.
failuresBySlaveId() - Method in class org.apache.spark.scheduler.cluster.mesos.CoarseMesosSchedulerBackend
 
FAIR() - Static method in class org.apache.spark.scheduler.SchedulingMode
 
FAIR_SCHEDULER_PROPERTIES() - Method in class org.apache.spark.scheduler.FairSchedulableBuilder
 
FairSchedulableBuilder - Class in org.apache.spark.scheduler
 
FairSchedulableBuilder(Pool, SparkConf) - Constructor for class org.apache.spark.scheduler.FairSchedulableBuilder
 
FairSchedulingAlgorithm - Class in org.apache.spark.scheduler
 
FairSchedulingAlgorithm() - Constructor for class org.apache.spark.scheduler.FairSchedulingAlgorithm
 
fakeClassTag() - Static method in class org.apache.spark.api.java.JavaSparkContext
Produces a ClassTag[T], which is actually just a casted ClassTag[AnyRef].
fakeOutput(Seq<Attribute>) - Method in class org.apache.spark.sql.hive.HiveStrategies.ParquetConversion.PhysicalPlanHacks
 
FakeParquetSerDe - Class in org.apache.spark.sql.hive.parquet
A placeholder that allows Spark SQL users to create metastore tables that are stored as parquet files.
FakeParquetSerDe() - Constructor for class org.apache.spark.sql.hive.parquet.FakeParquetSerDe
 
FALSE() - Static method in class org.apache.spark.sql.hive.HiveQl
 
FalsePositiveRate - Class in org.apache.spark.mllib.evaluation.binary
False positive rate.
FalsePositiveRate() - Constructor for class org.apache.spark.mllib.evaluation.binary.FalsePositiveRate
 
falsePositiveRate(double) - Method in class org.apache.spark.mllib.evaluation.MulticlassMetrics
Returns false positive rate for a given label (category)
fastSquaredDistance(VectorWithNorm, VectorWithNorm) - Static method in class org.apache.spark.mllib.clustering.KMeans
fastSquaredDistance(Vector, double, Vector, double, double) - Static method in class org.apache.spark.mllib.util.MLUtils
Returns the squared Euclidean distance between two vectors.
feature() - Method in class org.apache.spark.mllib.tree.model.Split
 
featureArity() - Method in class org.apache.spark.mllib.tree.impl.DecisionTreeMetadata
 
features() - Method in class org.apache.spark.mllib.regression.LabeledPoint
 
featuresCol() - Method in interface org.apache.spark.ml.param.HasFeaturesCol
param for features column name
featureSubset() - Method in class org.apache.spark.mllib.tree.RandomForest.NodeIndexInfo
 
FeatureType - Class in org.apache.spark.mllib.tree.configuration
:: Experimental :: Enum to describe whether a feature is "continuous" or "categorical"
FeatureType() - Constructor for class org.apache.spark.mllib.tree.configuration.FeatureType
 
featureType() - Method in class org.apache.spark.mllib.tree.model.Bin
 
featureType() - Method in class org.apache.spark.mllib.tree.model.Split
 
featureUpdate(int, int, double, double) - Method in class org.apache.spark.mllib.tree.impl.DTStatsAggregator
Faster version of update.
FetchFailed - Class in org.apache.spark
:: DeveloperApi :: Task failed to fetch shuffle data from a remote node.
FetchFailed(BlockManagerId, int, int, int, String) - Constructor for class org.apache.spark.FetchFailed
 
fetchFile(String, File, SparkConf, SecurityManager, Configuration, long, boolean) - Static method in class org.apache.spark.util.Utils
Download a file to the target directory.
fetchPct() - Method in class org.apache.spark.scheduler.RuntimePercentage
 
field() - Method in class org.apache.spark.storage.BroadcastBlockId
 
FieldAccessFinder - Class in org.apache.spark.util
 
FieldAccessFinder(Map<Class<?>, Set<String>>) - Constructor for class org.apache.spark.util.FieldAccessFinder
 
FIFO() - Static method in class org.apache.spark.scheduler.SchedulingMode
 
FIFOSchedulableBuilder - Class in org.apache.spark.scheduler
 
FIFOSchedulableBuilder(Pool) - Constructor for class org.apache.spark.scheduler.FIFOSchedulableBuilder
 
FIFOSchedulingAlgorithm - Class in org.apache.spark.scheduler
 
FIFOSchedulingAlgorithm() - Constructor for class org.apache.spark.scheduler.FIFOSchedulingAlgorithm
 
file() - Method in class org.apache.spark.storage.FileSegment
 
file() - Method in class org.apache.spark.storage.TachyonFileSegment
 
FileAppender - Class in org.apache.spark.util.logging
Continuously appends the data from an input stream into the given file.
FileAppender(InputStream, File, int) - Constructor for class org.apache.spark.util.logging.FileAppender
 
fileDir() - Method in class org.apache.spark.HttpFileServer
 
fileExists(TachyonFile) - Method in class org.apache.spark.storage.TachyonBlockManager
 
fileIndex() - Method in class org.apache.spark.util.FileLogger
 
FileInputDStream<K,V,F extends org.apache.hadoop.mapreduce.InputFormat<K,V>> - Class in org.apache.spark.streaming.dstream
This class represents an input stream that monitors a Hadoop-compatible filesystem for new files and creates a stream out of them.
FileInputDStream(StreamingContext, String, Function1<Path, Object>, boolean, ClassTag<K>, ClassTag<V>, ClassTag<F>) - Constructor for class org.apache.spark.streaming.dstream.FileInputDStream
 
FileInputDStream.FileInputDStreamCheckpointData - Class in org.apache.spark.streaming.dstream
A custom version of the DStreamCheckpointData that stores names of Hadoop files as checkpoint data.
FileInputDStream.FileInputDStreamCheckpointData() - Constructor for class org.apache.spark.streaming.dstream.FileInputDStream.FileInputDStreamCheckpointData
 
FileLogger - Class in org.apache.spark.util
A generic class for logging information to file.
FileLogger(String, SparkConf, Configuration, int, boolean, boolean, Option<FsPermission>) - Constructor for class org.apache.spark.util.FileLogger
 
FileLogger(String, SparkConf, boolean, boolean) - Constructor for class org.apache.spark.util.FileLogger
 
FileLogger(String, SparkConf, boolean) - Constructor for class org.apache.spark.util.FileLogger
 
FileLogger(String, SparkConf) - Constructor for class org.apache.spark.util.FileLogger
 
fileName() - Method in class org.apache.spark.sql.json.JSONRelation
 
filePath() - Method in class org.apache.spark.scheduler.cluster.SimrSchedulerBackend
 
filePath() - Method in class org.apache.spark.sql.hive.AddFile
 
files() - Method in class org.apache.spark.SparkContext
 
files() - Method in class org.apache.spark.sql.parquet.Partition
 
fileSegment() - Method in class org.apache.spark.storage.BlockObjectWriter
Returns the file segment of committed data that this Writer has written.
fileSegment() - Method in class org.apache.spark.storage.DiskBlockObjectWriter
 
FileSegment - Class in org.apache.spark.storage
References a particular segment of a file (potentially the entire file), based off an offset and a length.
FileSegment(File, long, long) - Constructor for class org.apache.spark.storage.FileSegment
 
fileStream(String) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
Create an input stream that monitors a Hadoop-compatible filesystem for new files and reads them using the given key-value types and input format.
fileStream(String, ClassTag<K>, ClassTag<V>, ClassTag<F>) - Method in class org.apache.spark.streaming.StreamingContext
Create an input stream that monitors a Hadoop-compatible filesystem for new files and reads them using the given key-value types and input format.
fileStream(String, Function1<Path, Object>, boolean, ClassTag<K>, ClassTag<V>, ClassTag<F>) - Method in class org.apache.spark.streaming.StreamingContext
Create an input stream that monitors a Hadoop-compatible filesystem for new files and reads them using the given key-value types and input format.
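A minimal sketch of the Scala variant, assuming an existing StreamingContext `ssc` and a hypothetical monitored directory:

    import org.apache.hadoop.io.{LongWritable, Text}
    import org.apache.hadoop.mapreduce.lib.input.TextInputFormat

    // Reads newly arrived files under the directory as (LongWritable, Text) records.
    val lines = ssc.fileStream[LongWritable, Text, TextInputFormat]("hdfs:///incoming")
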
FileSystemHelper - Class in org.apache.spark.sql.parquet
 
FileSystemHelper() - Constructor for class org.apache.spark.sql.parquet.FileSystemHelper
 
fillObject(Iterator<Writable>, Deserializer, Seq<Tuple2<Attribute, Object>>, MutableRow) - Static method in class org.apache.spark.sql.hive.HadoopTableReader
Transform all given raw Writables into Rows.
filter(Function<Double, Boolean>) - Method in class org.apache.spark.api.java.JavaDoubleRDD
Return a new RDD containing only the elements that satisfy a predicate.
filter(Function<Tuple2<K, V>, Boolean>) - Method in class org.apache.spark.api.java.JavaPairRDD
Return a new RDD containing only the elements that satisfy a predicate.
filter(Function<T, Boolean>) - Method in class org.apache.spark.api.java.JavaRDD
Return a new RDD containing only the elements that satisfy a predicate.
filter(Function1<Graph<VD, ED>, Graph<VD2, ED2>>, Function1<EdgeTriplet<VD2, ED2>, Object>, Function2<Object, VD2, Object>, ClassTag<VD2>, ClassTag<ED2>) - Method in class org.apache.spark.graphx.GraphOps
Filter the graph by computing some values to filter on, and applying the predicates.
filter(Function1<EdgeTriplet<VD, ED>, Object>, Function2<Object, VD, Object>) - Method in class org.apache.spark.graphx.impl.EdgePartition
Construct a new edge partition containing only the edges matching epred and where both vertices match vpred.
filter(Function1<EdgeTriplet<VD, ED>, Object>, Function2<Object, VD, Object>) - Method in class org.apache.spark.graphx.impl.EdgeRDDImpl
 
filter(Function2<Object, VD, Object>) - Method in class org.apache.spark.graphx.impl.VertexPartitionBaseOps
Restrict the vertex set to the set of vertices satisfying the given predicate.
filter(Function1<Tuple2<Object, VD>, Object>) - Method in class org.apache.spark.graphx.VertexRDD
Restricts the vertex set to the set of vertices satisfying the given predicate.
filter(Params) - Method in class org.apache.spark.ml.param.ParamMap
Filters this param map for the given parent.
filter(Function1<T, Object>) - Method in class org.apache.spark.rdd.RDD
Return a new RDD containing only the elements that satisfy a predicate.
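For example, assuming an existing SparkContext `sc`:

    // Keep only the even numbers; filter never changes the partitioning.
    val evens = sc.parallelize(1 to 10).filter(_ % 2 == 0)
    evens.collect() // Array(2, 4, 6, 8, 10)
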
filter(Function<Row, Boolean>) - Method in class org.apache.spark.sql.api.java.JavaSchemaRDD
Return a new RDD containing only the elements that satisfy a predicate.
Filter - Class in org.apache.spark.sql.execution
:: DeveloperApi ::
Filter(Expression, SparkPlan) - Constructor for class org.apache.spark.sql.execution.Filter
 
filter(Function1<Row, Object>) - Method in class org.apache.spark.sql.SchemaRDD
 
Filter - Class in org.apache.spark.sql.sources
 
Filter() - Constructor for class org.apache.spark.sql.sources.Filter
 
filter() - Method in class org.apache.spark.storage.BlockManagerMessages.GetMatchingBlockIds
 
filter(Function<T, Boolean>) - Method in class org.apache.spark.streaming.api.java.JavaDStream
Return a new DStream containing only the elements that satisfy a predicate.
filter(Function<Tuple2<K, V>, Boolean>) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
Return a new DStream containing only the elements that satisfy a predicate.
filter(Function1<T, Object>) - Method in class org.apache.spark.streaming.dstream.DStream
Return a new DStream containing only the elements that satisfy a predicate.
filter(Function1<Tuple2<A, B>, Object>) - Method in class org.apache.spark.util.TimeStampedHashMap
 
filter(Function1<Tuple2<A, B>, Object>) - Method in class org.apache.spark.util.TimeStampedWeakValueHashMap
 
FilteredDStream<T> - Class in org.apache.spark.streaming.dstream
 
FilteredDStream(DStream<T>, Function1<T, Object>, ClassTag<T>) - Constructor for class org.apache.spark.streaming.dstream.FilteredDStream
 
FilteredRDD<T> - Class in org.apache.spark.rdd
 
FilteredRDD(RDD<T>, Function1<T, Object>, ClassTag<T>) - Constructor for class org.apache.spark.rdd.FilteredRDD
 
FilteringParquetRowInputFormat - Class in org.apache.spark.sql.parquet
We extend ParquetInputFormat in order to have more control over which RecordFilter we want to use.
FilteringParquetRowInputFormat() - Constructor for class org.apache.spark.sql.parquet.FilteringParquetRowInputFormat
 
filterName() - Method in class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.AddWebUIFilter
 
filterParams() - Method in class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.AddWebUIFilter
 
filterWith(Function1<Object, A>, Function2<T, A, Object>) - Method in class org.apache.spark.rdd.RDD
Filters this RDD with p, where p takes an additional parameter of type A.
finalRDD() - Method in class org.apache.spark.scheduler.JobSubmitted
 
finalStage() - Method in class org.apache.spark.scheduler.ActiveJob
 
findBestSplits(RDD<BaggedPoint<TreePoint>>, DecisionTreeMetadata, Node[], Map<Object, Node[]>, Map<Object, Map<Object, RandomForest.NodeIndexInfo>>, Split[][], Bin[][], Queue<Tuple2<Object, Node>>, TimeTracker, Option<NodeIdCache>) - Static method in class org.apache.spark.mllib.tree.DecisionTree
Given a group of nodes, this finds the best split for each node.
findClass(String) - Method in class org.apache.spark.util.ParentClassLoader
 
findClosest(TraversableOnce<VectorWithNorm>, VectorWithNorm) - Static method in class org.apache.spark.mllib.clustering.KMeans
Returns the index of the closest center to the given point, as well as the squared distance.
findMaxTaskId(String, Configuration) - Static method in class org.apache.spark.sql.parquet.FileSystemHelper
Finds the maximum taskid in the output file names at the given path.
findSplitsForContinuousFeature(double[], DecisionTreeMetadata, int) - Static method in class org.apache.spark.mllib.tree.DecisionTree
Find splits for a continuous feature. NOTE: the number of splits returned is based on featureSamples and may differ from the specified numSplits.
findSynonyms(String, int) - Method in class org.apache.spark.mllib.feature.Word2VecModel
Find synonyms of a word
findSynonyms(Vector, int) - Method in class org.apache.spark.mllib.feature.Word2VecModel
Find synonyms of the vector representation of a word
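A usage sketch, assuming an existing SparkContext `sc` and a hypothetical RDD[Seq[String]] `corpus` of tokenized sentences:

    import org.apache.spark.mllib.feature.Word2Vec

    val model = new Word2Vec().fit(corpus)
    // Up to five (word, cosine similarity) pairs closest to "spark".
    model.findSynonyms("spark", 5).foreach { case (word, sim) =>
      println(s"$word: $sim")
    }
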
finishAll() - Method in class org.apache.spark.ui.ConsoleProgressBar
Mark all the stages as finished and clear the progress bar if it is shown, so that progress output does not interleave with job output.
finished() - Method in class org.apache.spark.scheduler.ActiveJob
 
finished() - Method in class org.apache.spark.scheduler.TaskInfo
 
FINISHED() - Static method in class org.apache.spark.TaskState
 
FINISHED_STATES() - Static method in class org.apache.spark.TaskState
 
finishedTasks() - Method in class org.apache.spark.partial.ApproximateActionListener
 
finishTime() - Method in class org.apache.spark.scheduler.TaskInfo
The time when the task has completed successfully (including the time to remotely fetch results, if necessary).
first() - Method in class org.apache.spark.api.java.JavaDoubleRDD
 
first() - Method in class org.apache.spark.api.java.JavaPairRDD
 
first() - Method in interface org.apache.spark.api.java.JavaRDDLike
Return the first element in this RDD.
first() - Method in class org.apache.spark.rdd.RDD
Return the first element in this RDD.
FIRST_DELAY() - Method in class org.apache.spark.ui.ConsoleProgressBar
 
firstAvailableClass(String, String) - Method in interface org.apache.spark.mapred.SparkHadoopMapRedUtil
 
firstAvailableClass(String, String) - Method in interface org.apache.spark.mapreduce.SparkHadoopMapReduceUtil
 
fit(SchemaRDD, ParamMap) - Method in class org.apache.spark.ml.classification.LogisticRegression
 
fit(SchemaRDD, ParamPair<?>...) - Method in class org.apache.spark.ml.Estimator
Fits a single model to the input data with optional parameters.
fit(JavaSchemaRDD, ParamPair<?>...) - Method in class org.apache.spark.ml.Estimator
Fits a single model to the input data with optional parameters.
fit(SchemaRDD, Seq<ParamPair<?>>) - Method in class org.apache.spark.ml.Estimator
Fits a single model to the input data with optional parameters.
fit(SchemaRDD, ParamMap) - Method in class org.apache.spark.ml.Estimator
Fits a single model to the input data with provided parameter map.
fit(SchemaRDD, ParamMap[]) - Method in class org.apache.spark.ml.Estimator
Fits multiple models to the input data with multiple sets of parameters.
fit(JavaSchemaRDD, Seq<ParamPair<?>>) - Method in class org.apache.spark.ml.Estimator
Fits a single model to the input data with optional parameters.
fit(JavaSchemaRDD, ParamMap) - Method in class org.apache.spark.ml.Estimator
Fits a single model to the input data with provided parameter map.
fit(JavaSchemaRDD, ParamMap[]) - Method in class org.apache.spark.ml.Estimator
Fits multiple models to the input data with multiple sets of parameters.
fit(SchemaRDD, ParamMap) - Method in class org.apache.spark.ml.feature.StandardScaler
 
fit(SchemaRDD, ParamMap) - Method in class org.apache.spark.ml.Pipeline
Fits the pipeline to the input dataset with additional parameters.
fit(SchemaRDD, ParamMap) - Method in class org.apache.spark.ml.tuning.CrossValidator
 
fit(RDD<Vector>) - Method in class org.apache.spark.mllib.feature.IDF
Computes the inverse document frequency.
fit(JavaRDD<Vector>) - Method in class org.apache.spark.mllib.feature.IDF
Computes the inverse document frequency.
fit(RDD<Vector>) - Method in class org.apache.spark.mllib.feature.StandardScaler
Computes the mean and variance and stores them as a model to be used for later scaling.
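A minimal sketch, assuming an existing SparkContext `sc`:

    import org.apache.spark.mllib.feature.StandardScaler
    import org.apache.spark.mllib.linalg.Vectors

    val data = sc.parallelize(Seq(Vectors.dense(1.0, 10.0), Vectors.dense(3.0, 30.0)))
    // fit computes the column means and variances; transform applies them.
    val scalerModel = new StandardScaler(withMean = true, withStd = true).fit(data)
    val scaled = scalerModel.transform(data)
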
fit(RDD<S>) - Method in class org.apache.spark.mllib.feature.Word2Vec
Computes the vector representation of each word in vocabulary.
fit(JavaRDD<S>) - Method in class org.apache.spark.mllib.feature.Word2Vec
Computes the vector representation of each word in vocabulary (Java version).
fittingParamMap() - Method in class org.apache.spark.ml.classification.LogisticRegressionModel
 
fittingParamMap() - Method in class org.apache.spark.ml.feature.StandardScalerModel
 
fittingParamMap() - Method in class org.apache.spark.ml.Model
Fitting parameters, such that parent.fit(..., fittingParamMap) could reproduce the model.
fittingParamMap() - Method in class org.apache.spark.ml.PipelineModel
 
fittingParamMap() - Method in class org.apache.spark.ml.tuning.CrossValidatorModel
 
FixedLengthBinaryInputFormat - Class in org.apache.spark.input
 
FixedLengthBinaryInputFormat() - Constructor for class org.apache.spark.input.FixedLengthBinaryInputFormat
 
FixedLengthBinaryRecordReader - Class in org.apache.spark.input
FixedLengthBinaryRecordReader is returned by FixedLengthBinaryInputFormat.
FixedLengthBinaryRecordReader() - Constructor for class org.apache.spark.input.FixedLengthBinaryRecordReader
 
flatMap(FlatMapFunction<T, U>) - Method in interface org.apache.spark.api.java.JavaRDDLike
Return a new RDD by first applying a function to all elements of this RDD, and then flattening the results.
flatMap(Function1<T, TraversableOnce<U>>, ClassTag<U>) - Method in class org.apache.spark.rdd.RDD
Return a new RDD by first applying a function to all elements of this RDD, and then flattening the results.
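For example, assuming an existing SparkContext `sc`:

    // Each line maps to several words; the per-line results are flattened.
    val words = sc.parallelize(Seq("to be", "or not to be")).flatMap(_.split(" "))
    words.collect() // Array(to, be, or, not, to, be)
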
flatMap(FlatMapFunction<T, U>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
Return a new DStream by applying a function to all elements of this DStream, and then flattening the results
flatMap(Function1<T, Traversable<U>>, ClassTag<U>) - Method in class org.apache.spark.streaming.dstream.DStream
Return a new DStream by applying a function to all elements of this DStream, and then flattening the results
FlatMapFunction<T,R> - Interface in org.apache.spark.api.java.function
A function that returns zero or more output records from each input record.
FlatMapFunction2<T1,T2,R> - Interface in org.apache.spark.api.java.function
A function that takes two inputs and returns zero or more output records.
FlatMappedDStream<T,U> - Class in org.apache.spark.streaming.dstream
 
FlatMappedDStream(DStream<T>, Function1<T, Traversable<U>>, ClassTag<T>, ClassTag<U>) - Constructor for class org.apache.spark.streaming.dstream.FlatMappedDStream
 
FlatMappedRDD<U,T> - Class in org.apache.spark.rdd
 
FlatMappedRDD(RDD<T>, Function1<T, TraversableOnce<U>>, ClassTag<U>, ClassTag<T>) - Constructor for class org.apache.spark.rdd.FlatMappedRDD
 
FlatMappedValuesRDD<K,V,U> - Class in org.apache.spark.rdd
 
FlatMappedValuesRDD(RDD<? extends Product2<K, V>>, Function1<V, TraversableOnce<U>>) - Constructor for class org.apache.spark.rdd.FlatMappedValuesRDD
 
flatMapToDouble(DoubleFlatMapFunction<T>) - Method in interface org.apache.spark.api.java.JavaRDDLike
Return a new RDD by first applying a function to all elements of this RDD, and then flattening the results.
flatMapToPair(PairFlatMapFunction<T, K2, V2>) - Method in interface org.apache.spark.api.java.JavaRDDLike
Return a new RDD by first applying a function to all elements of this RDD, and then flattening the results.
flatMapToPair(PairFlatMapFunction<T, K2, V2>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
Return a new DStream by applying a function to all elements of this DStream, and then flattening the results
FlatMapValuedDStream<K,V,U> - Class in org.apache.spark.streaming.dstream
 
FlatMapValuedDStream(DStream<Tuple2<K, V>>, Function1<V, TraversableOnce<U>>, ClassTag<K>, ClassTag<V>, ClassTag<U>) - Constructor for class org.apache.spark.streaming.dstream.FlatMapValuedDStream
 
flatMapValues(Function<V, Iterable<U>>) - Method in class org.apache.spark.api.java.JavaPairRDD
Pass each value in the key-value pair RDD through a flatMap function without changing the keys; this also retains the original RDD's partitioning.
flatMapValues(Function1<V, TraversableOnce<U>>) - Method in class org.apache.spark.rdd.PairRDDFunctions
Pass each value in the key-value pair RDD through a flatMap function without changing the keys; this also retains the original RDD's partitioning.
flatMapValues(Function<V, Iterable<U>>) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
Return a new DStream by applying a flatmap function to the value of each key-value pair in 'this' DStream without changing the key.
flatMapValues(Function1<V, TraversableOnce<U>>, ClassTag<U>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
Return a new DStream by applying a flatmap function to the value of each key-value pair in 'this' DStream without changing the key.
flatMapWith(Function1<Object, A>, boolean, Function2<T, A, Seq<U>>, ClassTag<U>) - Method in class org.apache.spark.rdd.RDD
FlatMaps f over this RDD, where f takes an additional parameter of type A.
FLOAT - Class in org.apache.spark.sql.columnar
 
FLOAT() - Constructor for class org.apache.spark.sql.columnar.FLOAT
 
FloatColumnAccessor - Class in org.apache.spark.sql.columnar
 
FloatColumnAccessor(ByteBuffer) - Constructor for class org.apache.spark.sql.columnar.FloatColumnAccessor
 
FloatColumnBuilder - Class in org.apache.spark.sql.columnar
 
FloatColumnBuilder() - Constructor for class org.apache.spark.sql.columnar.FloatColumnBuilder
 
FloatColumnStats - Class in org.apache.spark.sql.columnar
 
FloatColumnStats() - Constructor for class org.apache.spark.sql.columnar.FloatColumnStats
 
FloatParam - Class in org.apache.spark.ml.param
Specialized version of Param[Float] for Java.
FloatParam(Params, String, String, Option<Object>) - Constructor for class org.apache.spark.ml.param.FloatParam
 
floatToFloatWritable(float) - Static method in class org.apache.spark.SparkContext
 
FloatType - Static variable in class org.apache.spark.sql.api.java.DataType
Gets the FloatType object.
FloatType - Class in org.apache.spark.sql.api.java
The data type representing float and Float values.
floatWritableConverter() - Static method in class org.apache.spark.SparkContext
 
floor(Duration) - Method in class org.apache.spark.streaming.Time
 
FlumeBatchFetcher - Class in org.apache.spark.streaming.flume
This class implements the core functionality of FlumePollingReceiver.
FlumeBatchFetcher(FlumePollingReceiver) - Constructor for class org.apache.spark.streaming.flume.FlumeBatchFetcher
 
FlumeConnection - Class in org.apache.spark.streaming.flume
A wrapper around the transceiver and the Avro IPC API.
FlumeConnection(NettyTransceiver, SparkFlumeProtocol.Callback) - Constructor for class org.apache.spark.streaming.flume.FlumeConnection
 
FlumeEventServer - Class in org.apache.spark.streaming.flume
A simple server that implements Flume's Avro protocol.
FlumeEventServer(FlumeReceiver) - Constructor for class org.apache.spark.streaming.flume.FlumeEventServer
 
FlumeInputDStream<T> - Class in org.apache.spark.streaming.flume
 
FlumeInputDStream(StreamingContext, String, int, StorageLevel, boolean, ClassTag<T>) - Constructor for class org.apache.spark.streaming.flume.FlumeInputDStream
 
FlumePollingInputDStream<T> - Class in org.apache.spark.streaming.flume
A ReceiverInputDStream that can be used to read data from several Flume agents running SparkSinks.
FlumePollingInputDStream(StreamingContext, Seq<InetSocketAddress>, int, int, StorageLevel, ClassTag<T>) - Constructor for class org.apache.spark.streaming.flume.FlumePollingInputDStream
 
FlumePollingReceiver - Class in org.apache.spark.streaming.flume
 
FlumePollingReceiver(Seq<InetSocketAddress>, int, int, StorageLevel) - Constructor for class org.apache.spark.streaming.flume.FlumePollingReceiver
 
FlumeReceiver - Class in org.apache.spark.streaming.flume
A NetworkReceiver which listens for events using the Flume Avro interface.
FlumeReceiver(String, int, StorageLevel, boolean) - Constructor for class org.apache.spark.streaming.flume.FlumeReceiver
 
FlumeReceiver.CompressionChannelPipelineFactory - Class in org.apache.spark.streaming.flume
A Netty pipeline factory that decompresses incoming data from the Netty client and compresses data going back to the client.
FlumeReceiver.CompressionChannelPipelineFactory() - Constructor for class org.apache.spark.streaming.flume.FlumeReceiver.CompressionChannelPipelineFactory
 
FlumeUtils - Class in org.apache.spark.streaming.flume
 
FlumeUtils() - Constructor for class org.apache.spark.streaming.flume.FlumeUtils
 
flush() - Method in class org.apache.spark.serializer.JavaSerializationStream
 
flush() - Method in class org.apache.spark.serializer.KryoSerializationStream
 
flush() - Method in class org.apache.spark.serializer.SerializationStream
 
flush() - Method in class org.apache.spark.storage.DiskBlockObjectWriter
 
flush() - Method in class org.apache.spark.streaming.util.RateLimitedOutputStream
 
flush() - Method in class org.apache.spark.util.FileLogger
Flush the writer to disk manually.
FMeasure - Class in org.apache.spark.mllib.evaluation.binary
F-Measure.
FMeasure(double) - Constructor for class org.apache.spark.mllib.evaluation.binary.FMeasure
 
fMeasure(double, double) - Method in class org.apache.spark.mllib.evaluation.MulticlassMetrics
Returns f-measure for a given label (category)
fMeasure(double) - Method in class org.apache.spark.mllib.evaluation.MulticlassMetrics
Returns f1-measure for a given label (category)
fMeasure() - Method in class org.apache.spark.mllib.evaluation.MulticlassMetrics
Returns f-measure (equal to precision and recall, because precision equals recall).
fMeasureByThreshold(double) - Method in class org.apache.spark.mllib.evaluation.BinaryClassificationMetrics
Returns the (threshold, F-Measure) curve.
fMeasureByThreshold() - Method in class org.apache.spark.mllib.evaluation.BinaryClassificationMetrics
Returns the (threshold, F-Measure) curve with beta = 1.0.
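A usage sketch, assuming an existing SparkContext `sc` and hypothetical (score, label) pairs:

    import org.apache.spark.mllib.evaluation.BinaryClassificationMetrics

    val scoreAndLabels = sc.parallelize(Seq((0.9, 1.0), (0.6, 0.0), (0.4, 1.0)))
    val metrics = new BinaryClassificationMetrics(scoreAndLabels)
    // One (threshold, F1) point per distinct score.
    metrics.fMeasureByThreshold().collect().foreach(println)
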
fold(T, Function2<T, T, T>) - Method in interface org.apache.spark.api.java.JavaRDDLike
Aggregate the elements of each partition, and then the results for all the partitions, using a given associative function and a neutral "zero value".
fold(T, Function2<T, T, T>) - Method in class org.apache.spark.rdd.RDD
Aggregate the elements of each partition, and then the results for all the partitions, using a given associative function and a neutral "zero value".
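For example, assuming an existing SparkContext `sc`:

    // The zero value 0 is neutral for addition, so per-partition results combine safely.
    val total = sc.parallelize(1 to 4).fold(0)(_ + _) // 10
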
foldable() - Method in class org.apache.spark.sql.hive.HiveGenericUdf
 
foldable() - Method in class org.apache.spark.sql.hive.HiveSimpleUdf
 
foldByKey(V, Partitioner, Function2<V, V, V>) - Method in class org.apache.spark.api.java.JavaPairRDD
Merge the values for each key using an associative function and a neutral "zero value" which may be added to the result an arbitrary number of times, and must not change the result (e.g., Nil for list concatenation, 0 for addition, or 1 for multiplication).
foldByKey(V, int, Function2<V, V, V>) - Method in class org.apache.spark.api.java.JavaPairRDD
Merge the values for each key using an associative function and a neutral "zero value" which may be added to the result an arbitrary number of times, and must not change the result (e.g., Nil for list concatenation, 0 for addition, or 1 for multiplication).
foldByKey(V, Function2<V, V, V>) - Method in class org.apache.spark.api.java.JavaPairRDD
Merge the values for each key using an associative function and a neutral "zero value" which may be added to the result an arbitrary number of times, and must not change the result (e.g., Nil for list concatenation, 0 for addition, or 1 for multiplication).
foldByKey(V, Partitioner, Function2<V, V, V>) - Method in class org.apache.spark.rdd.PairRDDFunctions
Merge the values for each key using an associative function and a neutral "zero value" which may be added to the result an arbitrary number of times, and must not change the result (e.g., Nil for list concatenation, 0 for addition, or 1 for multiplication).
foldByKey(V, int, Function2<V, V, V>) - Method in class org.apache.spark.rdd.PairRDDFunctions
Merge the values for each key using an associative function and a neutral "zero value" which may be added to the result an arbitrary number of times, and must not change the result (e.g., Nil for list concatenation, 0 for addition, or 1 for multiplication).
foldByKey(V, Function2<V, V, V>) - Method in class org.apache.spark.rdd.PairRDDFunctions
Merge the values for each key using an associative function and a neutral "zero value" which may be added to the result an arbitrary number of times, and must not change the result (e.g., Nil for list concatenation, 0 for addition, or 1 for multiplication).
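For example, assuming an existing SparkContext `sc`:

    // Sum the values per key; 0 is the neutral zero value for addition.
    val sums = sc.parallelize(Seq(("a", 1), ("a", 2), ("b", 3))).foldByKey(0)(_ + _)
    sums.collect() // Array((a,3), (b,3))
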
forAttribute() - Method in class org.apache.spark.sql.columnar.PartitionStatistics
 
foreach(VoidFunction<T>) - Method in interface org.apache.spark.api.java.JavaRDDLike
Applies a function f to all elements of this RDD.
foreach(Function1<Edge<ED>, BoxedUnit>) - Method in class org.apache.spark.graphx.impl.EdgePartition
Apply the function f to all edges in this partition.
foreach(Function1<T, BoxedUnit>) - Method in class org.apache.spark.rdd.RDD
Applies a function f to all elements of this RDD.
foreach(Function<R, Void>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
Deprecated.
As of release 0.9.0, replaced by foreachRDD
foreach(Function2<R, Time, Void>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
Deprecated.
As of release 0.9.0, replaced by foreachRDD
foreach(Function1<RDD<T>, BoxedUnit>) - Method in class org.apache.spark.streaming.dstream.DStream
Apply a function to each RDD in this DStream.
foreach(Function2<RDD<T>, Time, BoxedUnit>) - Method in class org.apache.spark.streaming.dstream.DStream
Apply a function to each RDD in this DStream.
foreach(Function1<Tuple2<A, B>, U>) - Method in class org.apache.spark.util.TimeStampedHashMap
 
foreach(Function1<A, U>) - Method in class org.apache.spark.util.TimeStampedHashSet
 
foreach(Function1<Tuple2<A, B>, U>) - Method in class org.apache.spark.util.TimeStampedWeakValueHashMap
 
foreachActive(Function2<Object, Object, BoxedUnit>) - Method in class org.apache.spark.mllib.linalg.DenseVector
 
foreachActive(Function2<Object, Object, BoxedUnit>) - Method in class org.apache.spark.mllib.linalg.SparseVector
 
foreachActive(Function2<Object, Object, BoxedUnit>) - Method in interface org.apache.spark.mllib.linalg.Vector
Applies a function f to all the active elements of a dense or sparse vector.
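A quick sketch; on a sparse vector only the stored entries are visited:

    import org.apache.spark.mllib.linalg.Vectors

    Vectors.sparse(5, Array(1, 3), Array(10.0, 20.0)).foreachActive { (i, v) =>
      println(s"index $i -> $v") // prints indices 1 and 3 only
    }
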
foreachAsync(VoidFunction<T>) - Method in interface org.apache.spark.api.java.JavaRDDLike
The asynchronous version of the foreach action, which applies a function f to all the elements of this RDD.
foreachAsync(Function1<T, BoxedUnit>) - Method in class org.apache.spark.rdd.AsyncRDDActions
Applies a function f to all elements of this RDD.
ForEachDStream<T> - Class in org.apache.spark.streaming.dstream
 
ForEachDStream(DStream<T>, Function2<RDD<T>, Time, BoxedUnit>, ClassTag<T>) - Constructor for class org.apache.spark.streaming.dstream.ForEachDStream
 
foreachListener(Function1<SparkListener, BoxedUnit>) - Method in interface org.apache.spark.scheduler.SparkListenerBus
Apply the given function to all attached listeners, catching and logging any exception.
foreachPartition(VoidFunction<Iterator<T>>) - Method in interface org.apache.spark.api.java.JavaRDDLike
Applies a function f to each partition of this RDD.
foreachPartition(Function1<Iterator<T>, BoxedUnit>) - Method in class org.apache.spark.rdd.RDD
Applies a function f to each partition of this RDD.
foreachPartitionAsync(VoidFunction<Iterator<T>>) - Method in interface org.apache.spark.api.java.JavaRDDLike
The asynchronous version of the foreachPartition action, which applies a function f to each partition of this RDD.
foreachPartitionAsync(Function1<Iterator<T>, BoxedUnit>) - Method in class org.apache.spark.rdd.AsyncRDDActions
Applies a function f to each partition of this RDD.
foreachRDD(Function<R, Void>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
Apply a function to each RDD in this DStream.
foreachRDD(Function2<R, Time, Void>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
Apply a function to each RDD in this DStream.
foreachRDD(Function1<RDD<T>, BoxedUnit>) - Method in class org.apache.spark.streaming.dstream.DStream
Apply a function to each RDD in this DStream.
foreachRDD(Function2<RDD<T>, Time, BoxedUnit>) - Method in class org.apache.spark.streaming.dstream.DStream
Apply a function to each RDD in this DStream.
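For example, assuming `stream` is an existing DStream[String]:

    // Runs once per batch interval with the RDD generated for that batch.
    stream.foreachRDD { (rdd, time) =>
      println(s"Batch at $time contained ${rdd.count()} records")
    }
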
foreachWith(Function1<Object, A>, Function2<T, A, BoxedUnit>) - Method in class org.apache.spark.rdd.RDD
Applies f to each element of this RDD, where f takes an additional parameter of type A.
foreachWithinEdgePartition(int, boolean, boolean, Function1<Object, BoxedUnit>) - Method in class org.apache.spark.graphx.impl.RoutingTablePartition
Runs f on each vertex id to be sent to the specified edge partition.
formatDate(Date) - Static method in class org.apache.spark.ui.UIUtils
 
formatDate(long) - Static method in class org.apache.spark.ui.UIUtils
 
formatDuration(long) - Static method in class org.apache.spark.ui.UIUtils
 
formatDurationVerbose(long) - Static method in class org.apache.spark.ui.UIUtils
Generate a verbose human-readable string representing a duration such as "5 second 35 ms"
formatNumber(double) - Static method in class org.apache.spark.ui.UIUtils
Generate a human-readable string representing a number (e.g.
formatter() - Method in class org.apache.spark.util.logging.SizeBasedRollingPolicy
 
formatWindowsPath(String) - Static method in class org.apache.spark.util.Utils
Format a Windows path such that it can be safely passed to a URI.
fraction() - Method in class org.apache.spark.sql.execution.Sample
 
framework() - Method in class org.apache.spark.streaming.Checkpoint
 
frameworkMessage(SchedulerDriver, Protos.ExecutorID, Protos.SlaveID, byte[]) - Method in class org.apache.spark.scheduler.cluster.mesos.CoarseMesosSchedulerBackend
 
frameworkMessage(SchedulerDriver, Protos.ExecutorID, Protos.SlaveID, byte[]) - Method in class org.apache.spark.scheduler.cluster.mesos.MesosSchedulerBackend
 
freeCores() - Method in class org.apache.spark.scheduler.cluster.ExecutorData
 
freeMemory() - Method in class org.apache.spark.storage.MemoryStore
Free memory not occupied by existing blocks.
fromAvroFlumeEvent(AvroFlumeEvent) - Static method in class org.apache.spark.streaming.flume.SparkFlumeEvent
 
fromBreeze(Matrix<Object>) - Static method in class org.apache.spark.mllib.linalg.Matrices
Creates a Matrix instance from a breeze matrix.
fromBreeze(Vector<Object>) - Static method in class org.apache.spark.mllib.linalg.Vectors
Creates a vector instance from a breeze vector.
fromDataType(DataType, String, boolean, boolean) - Static method in class org.apache.spark.sql.parquet.ParquetTypesConverter
Converts a given Catalyst DataType into the corresponding Parquet Type.
fromDStream(DStream<T>, ClassTag<T>) - Static method in class org.apache.spark.streaming.api.java.JavaDStream
Convert a scala DStream to a Java-friendly JavaDStream.
fromEdgePartitions(RDD<Tuple2<Object, EdgePartition<ED, VD>>>, ClassTag<ED>, ClassTag<VD>) - Static method in class org.apache.spark.graphx.EdgeRDD
Creates an EdgeRDD from already-constructed edge partitions.
fromEdgePartitions(RDD<Tuple2<Object, EdgePartition<ED, VD>>>, VD, StorageLevel, StorageLevel, ClassTag<VD>, ClassTag<ED>) - Static method in class org.apache.spark.graphx.impl.GraphImpl
Create a graph from EdgePartitions, setting referenced vertices to `defaultVertexAttr`.
fromEdges(RDD<Edge<ED>>, ClassTag<ED>, ClassTag<VD>) - Static method in class org.apache.spark.graphx.EdgeRDD
Creates an EdgeRDD from a set of edges.
fromEdges(RDD<Edge<ED>>, VD, StorageLevel, StorageLevel, ClassTag<VD>, ClassTag<ED>) - Static method in class org.apache.spark.graphx.Graph
Construct a graph from a collection of edges.
fromEdges(EdgeRDD<?>, int, VD, ClassTag<VD>) - Static method in class org.apache.spark.graphx.VertexRDD
Constructs a VertexRDD containing all vertices referred to in edges.
fromEdgeTuples(RDD<Tuple2<Object, Object>>, VD, Option<PartitionStrategy>, StorageLevel, StorageLevel, ClassTag<VD>) - Static method in class org.apache.spark.graphx.Graph
Construct a graph from a collection of edges encoded as vertex id pairs.
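A minimal sketch of building a graph from edges, assuming an existing SparkContext `sc`:

    import org.apache.spark.graphx.{Edge, Graph}

    val edges = sc.parallelize(Seq(Edge(1L, 2L, "follows"), Edge(2L, 3L, "follows")))
    // Every vertex referenced by an edge receives the default attribute 1.
    val graph = Graph.fromEdges(edges, defaultValue = 1)
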
fromExistingRDDs(VertexRDD<VD>, EdgeRDD<ED>, ClassTag<VD>, ClassTag<ED>) - Static method in class org.apache.spark.graphx.impl.GraphImpl
Create a graph from a VertexRDD and an EdgeRDD with the same replicated vertex type as the vertices.
fromInputDStream(InputDStream<T>, ClassTag<T>) - Static method in class org.apache.spark.streaming.api.java.JavaInputDStream
Convert a scala InputDStream to a Java-friendly JavaInputDStream.
fromInputDStream(InputDStream<Tuple2<K, V>>, ClassTag<K>, ClassTag<V>) - Static method in class org.apache.spark.streaming.api.java.JavaPairInputDStream
Convert a scala InputDStream of pairs to a Java-friendly JavaPairInputDStream.
fromJava(Object, DataType) - Static method in class org.apache.spark.sql.execution.EvaluatePython
 
fromJavaDStream(JavaDStream<Tuple2<K, V>>) - Static method in class org.apache.spark.streaming.api.java.JavaPairDStream
 
fromJavaRDD(JavaRDD<Tuple2<K, V>>) - Static method in class org.apache.spark.api.java.JavaPairRDD
Convert a JavaRDD of key-value pairs to JavaPairRDD.
fromMesos(Protos.TaskState) - Static method in class org.apache.spark.TaskState
 
fromMsgs(int, Iterator<Tuple2<Object, Object>>) - Static method in class org.apache.spark.graphx.impl.RoutingTablePartition
Build a `RoutingTablePartition` from `RoutingTableMessage`s.
fromPairDStream(DStream<Tuple2<K, V>>, ClassTag<K>, ClassTag<V>) - Static method in class org.apache.spark.streaming.api.java.JavaPairDStream
 
fromPrimitiveDataType(DataType) - Static method in class org.apache.spark.sql.parquet.ParquetTypesConverter
For a given Catalyst DataType return the name of the corresponding Parquet primitive type or None if the given type is not primitive.
fromRDD(RDD<Object>) - Static method in class org.apache.spark.api.java.JavaDoubleRDD
 
fromRDD(RDD<Tuple2<K, V>>, ClassTag<K>, ClassTag<V>) - Static method in class org.apache.spark.api.java.JavaPairRDD
 
fromRDD(RDD<T>, ClassTag<T>) - Static method in class org.apache.spark.api.java.JavaRDD
 
fromRDD(RDD<T>, ClassTag<T>) - Static method in class org.apache.spark.mllib.rdd.RDDFunctions
Implicit conversion from an RDD to RDDFunctions.
fromRdd(RDD<?>) - Static method in class org.apache.spark.storage.RDDInfo
 
fromReceiverInputDStream(ReceiverInputDStream<Tuple2<K, V>>, ClassTag<K>, ClassTag<V>) - Static method in class org.apache.spark.streaming.api.java.JavaPairReceiverInputDStream
Convert a scala ReceiverInputDStream to a Java-friendly JavaReceiverInputDStream.
fromReceiverInputDStream(ReceiverInputDStream<T>, ClassTag<T>) - Static method in class org.apache.spark.streaming.api.java.JavaReceiverInputDStream
Convert a scala ReceiverInputDStream to a Java-friendly JavaReceiverInputDStream.
fromSparkContext(SparkContext) - Static method in class org.apache.spark.api.java.JavaSparkContext
 
fromStage(Stage, Option<Object>) - Static method in class org.apache.spark.scheduler.StageInfo
Construct a StageInfo from a Stage.
fromString(String) - Static method in class org.apache.spark.mllib.tree.configuration.Algo
 
fromString(String) - Static method in class org.apache.spark.mllib.tree.impurity.Impurities
 
fromString(String) - Static method in class org.apache.spark.mllib.tree.loss.Losses
 
fromString(String) - Static method in class org.apache.spark.storage.StorageLevel
:: DeveloperApi :: Return the StorageLevel object with the specified name.
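As a brief illustration, the name passed to fromString is the storage level constant's name; a sketch assuming an existing RDD named rdd:

    import org.apache.spark.storage.StorageLevel

    // Resolve a storage level from its name and apply it.
    val level = StorageLevel.fromString("MEMORY_AND_DISK_SER")
    rdd.persist(level)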
fromWeakReference(WeakReference<V>) - Static method in class org.apache.spark.util.TimeStampedWeakValueHashMap
 
fromWeakReferenceIterator(Iterator<Tuple2<K, WeakReference<V>>>) - Static method in class org.apache.spark.util.TimeStampedWeakValueHashMap
 
fromWeakReferenceMap(Map<K, WeakReference<V>>) - Static method in class org.apache.spark.util.TimeStampedWeakValueHashMap
 
fromWeakReferenceOption(Option<WeakReference<V>>) - Static method in class org.apache.spark.util.TimeStampedWeakValueHashMap
 
fromWeakReferenceTuple(Tuple2<K, WeakReference<V>>) - Static method in class org.apache.spark.util.TimeStampedWeakValueHashMap
 
fs() - Method in class org.apache.spark.rdd.CheckpointRDD
 
fullOuterJoin(JavaPairRDD<K, W>, Partitioner) - Method in class org.apache.spark.api.java.JavaPairRDD
Perform a full outer join of this and other.
fullOuterJoin(JavaPairRDD<K, W>) - Method in class org.apache.spark.api.java.JavaPairRDD
Perform a full outer join of this and other.
fullOuterJoin(JavaPairRDD<K, W>, int) - Method in class org.apache.spark.api.java.JavaPairRDD
Perform a full outer join of this and other.
fullOuterJoin(RDD<Tuple2<K, W>>, Partitioner) - Method in class org.apache.spark.rdd.PairRDDFunctions
Perform a full outer join of this and other.
fullOuterJoin(RDD<Tuple2<K, W>>) - Method in class org.apache.spark.rdd.PairRDDFunctions
Perform a full outer join of this and other.
fullOuterJoin(RDD<Tuple2<K, W>>, int) - Method in class org.apache.spark.rdd.PairRDDFunctions
Perform a full outer join of this and other.
fullOuterJoin(JavaPairDStream<K, W>) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
Return a new DStream by applying 'full outer join' between RDDs of this DStream and other DStream.
fullOuterJoin(JavaPairDStream<K, W>, int) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
Return a new DStream by applying 'full outer join' between RDDs of this DStream and other DStream.
fullOuterJoin(JavaPairDStream<K, W>, Partitioner) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
Return a new DStream by applying 'full outer join' between RDDs of this DStream and other DStream.
fullOuterJoin(DStream<Tuple2<K, W>>, ClassTag<W>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
Return a new DStream by applying 'full outer join' between RDDs of this DStream and other DStream.
fullOuterJoin(DStream<Tuple2<K, W>>, int, ClassTag<W>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
Return a new DStream by applying 'full outer join' between RDDs of this DStream and other DStream.
fullOuterJoin(DStream<Tuple2<K, W>>, Partitioner, ClassTag<W>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
Return a new DStream by applying 'full outer join' between RDDs of this DStream and other DStream.
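A minimal sketch of the pair-RDD variant, assuming an existing SparkContext named sc; keys present on either side appear in the result, with None marking the missing side:

    import org.apache.spark.SparkContext._  // brings pair-RDD operations into scope

    val left  = sc.parallelize(Seq(("a", 1), ("b", 2)))
    val right = sc.parallelize(Seq(("b", 20), ("c", 30)))

    // RDD[(String, (Option[Int], Option[Int]))]; output ordering may vary.
    left.fullOuterJoin(right).collect()
    // => Array((a,(Some(1),None)), (b,(Some(2),Some(20))), (c,(None,Some(30))))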
fullStackTrace() - Method in class org.apache.spark.ExceptionFailure
 
func() - Method in class org.apache.spark.scheduler.ActiveJob
 
func() - Method in class org.apache.spark.scheduler.JobSubmitted
 
Function<T1,R> - Interface in org.apache.spark.api.java.function
Base interface for functions whose return types do not create special RDDs.
function() - Method in class org.apache.spark.sql.hive.HiveGenericUdf
 
function() - Method in class org.apache.spark.sql.hive.HiveSimpleUdf
 
Function2<T1,T2,R> - Interface in org.apache.spark.api.java.function
A two-argument function that takes arguments of type T1 and T2 and returns an R.
Function3<T1,T2,T3,R> - Interface in org.apache.spark.api.java.function
A three-argument function that takes arguments of type T1, T2 and T3 and returns an R.
functionClassName() - Method in class org.apache.spark.sql.hive.HiveFunctionWrapper
 
funcWrapper() - Method in class org.apache.spark.sql.hive.HiveGenericUdaf
 
funcWrapper() - Method in class org.apache.spark.sql.hive.HiveGenericUdf
 
funcWrapper() - Method in class org.apache.spark.sql.hive.HiveGenericUdtf
 
funcWrapper() - Method in class org.apache.spark.sql.hive.HiveSimpleUdf
 
funcWrapper() - Method in class org.apache.spark.sql.hive.HiveUdaf
 
funcWrapper() - Method in class org.apache.spark.sql.hive.HiveUdafFunction
 
FutureAction<T> - Interface in org.apache.spark
A future for the result of an action to support cancellation.

G

gain() - Method in class org.apache.spark.mllib.tree.model.InformationGainStats
 
gamma1() - Method in class org.apache.spark.graphx.lib.SVDPlusPlus.Conf
 
gamma2() - Method in class org.apache.spark.graphx.lib.SVDPlusPlus.Conf
 
gamma6() - Method in class org.apache.spark.graphx.lib.SVDPlusPlus.Conf
 
gamma7() - Method in class org.apache.spark.graphx.lib.SVDPlusPlus.Conf
 
GapSamplingIterator<T> - Class in org.apache.spark.util.random
 
GapSamplingIterator(Iterator<T>, double, Random, double, ClassTag<T>) - Constructor for class org.apache.spark.util.random.GapSamplingIterator
 
GapSamplingReplacementIterator<T> - Class in org.apache.spark.util.random
Advances to the first sample as part of object construction.
GapSamplingReplacementIterator(Iterator<T>, double, Random, double, ClassTag<T>) - Constructor for class org.apache.spark.util.random.GapSamplingReplacementIterator
 
gatherCompressibilityStats(Row, int) - Method in class org.apache.spark.sql.columnar.compression.BooleanBitSet.Encoder
 
gatherCompressibilityStats(Row, int) - Method in interface org.apache.spark.sql.columnar.compression.CompressibleColumnBuilder
 
gatherCompressibilityStats(Row, int) - Method in class org.apache.spark.sql.columnar.compression.DictionaryEncoding.Encoder
 
gatherCompressibilityStats(Row, int) - Method in interface org.apache.spark.sql.columnar.compression.Encoder
 
gatherCompressibilityStats(Row, int) - Method in class org.apache.spark.sql.columnar.compression.IntDelta.Encoder
 
gatherCompressibilityStats(Row, int) - Method in class org.apache.spark.sql.columnar.compression.LongDelta.Encoder
 
gatherCompressibilityStats(Row, int) - Method in class org.apache.spark.sql.columnar.compression.RunLengthEncoding.Encoder
 
gatherStats(Row, int) - Method in class org.apache.spark.sql.columnar.BinaryColumnStats
 
gatherStats(Row, int) - Method in class org.apache.spark.sql.columnar.BooleanColumnStats
 
gatherStats(Row, int) - Method in class org.apache.spark.sql.columnar.ByteColumnStats
 
gatherStats(Row, int) - Method in interface org.apache.spark.sql.columnar.ColumnStats
Gathers statistics information from row(ordinal).
gatherStats(Row, int) - Method in class org.apache.spark.sql.columnar.DateColumnStats
 
gatherStats(Row, int) - Method in class org.apache.spark.sql.columnar.DoubleColumnStats
 
gatherStats(Row, int) - Method in class org.apache.spark.sql.columnar.FloatColumnStats
 
gatherStats(Row, int) - Method in class org.apache.spark.sql.columnar.GenericColumnStats
 
gatherStats(Row, int) - Method in class org.apache.spark.sql.columnar.IntColumnStats
 
gatherStats(Row, int) - Method in class org.apache.spark.sql.columnar.LongColumnStats
 
gatherStats(Row, int) - Method in class org.apache.spark.sql.columnar.NoopColumnStats
 
gatherStats(Row, int) - Method in class org.apache.spark.sql.columnar.ShortColumnStats
 
gatherStats(Row, int) - Method in class org.apache.spark.sql.columnar.StringColumnStats
 
gatherStats(Row, int) - Method in class org.apache.spark.sql.columnar.TimestampColumnStats
 
GC_TIME() - Static method in class org.apache.spark.ui.ToolTips
 
gemm(boolean, boolean, double, Matrix, DenseMatrix, double, DenseMatrix) - Static method in class org.apache.spark.mllib.linalg.BLAS
C := alpha * A * B + beta * C
gemm(double, Matrix, DenseMatrix, double, DenseMatrix) - Static method in class org.apache.spark.mllib.linalg.BLAS
C := alpha * A * B + beta * C
gemv(boolean, double, Matrix, DenseVector, double, DenseVector) - Static method in class org.apache.spark.mllib.linalg.BLAS
y := alpha * A * x + beta * y
gemv(double, Matrix, DenseVector, double, DenseVector) - Static method in class org.apache.spark.mllib.linalg.BLAS
y := alpha * A * x + beta * y
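The gemv convention above can be written out directly. The following self-contained sketch computes y := alpha * A * x + beta * y for a dense m x n matrix stored row-major; it is purely illustrative and does not call the internal BLAS class:

    // y := alpha * A * x + beta * y, with A stored row-major as an m x n array.
    def gemv(alpha: Double, a: Array[Double], m: Int, n: Int,
             x: Array[Double], beta: Double, y: Array[Double]): Unit = {
      var i = 0
      while (i < m) {
        var sum = 0.0
        var j = 0
        while (j < n) { sum += a(i * n + j) * x(j); j += 1 }
        y(i) = alpha * sum + beta * y(i)
        i += 1
      }
    }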
GeneralHashedRelation - Class in org.apache.spark.sql.execution.joins
A general HashedRelation backed by a hash map that maps the key into a sequence of values.
GeneralHashedRelation(HashMap<Row, CompactBuffer<Row>>) - Constructor for class org.apache.spark.sql.execution.joins.GeneralHashedRelation
 
GeneralizedLinearAlgorithm<M extends GeneralizedLinearModel> - Class in org.apache.spark.mllib.regression
:: DeveloperApi :: GeneralizedLinearAlgorithm implements methods to train a Generalized Linear Model (GLM).
GeneralizedLinearAlgorithm() - Constructor for class org.apache.spark.mllib.regression.GeneralizedLinearAlgorithm
 
GeneralizedLinearModel - Class in org.apache.spark.mllib.regression
:: DeveloperApi :: GeneralizedLinearModel (GLM) represents a model trained using GeneralizedLinearAlgorithm.
GeneralizedLinearModel(Vector, double) - Constructor for class org.apache.spark.mllib.regression.GeneralizedLinearModel
 
generate(String, String, int, int) - Static method in class org.apache.spark.examples.streaming.KinesisWordCountProducerASL
 
Generate - Class in org.apache.spark.sql.execution
:: DeveloperApi :: Applies a Generator to a stream of input rows, combining the output of each into a new stream of rows.
Generate(Generator, boolean, boolean, SparkPlan) - Constructor for class org.apache.spark.sql.execution.Generate
 
generate(Generator, boolean, boolean, Option<String>) - Method in class org.apache.spark.sql.SchemaRDD
:: Experimental :: Applies the given Generator, or table generating function, to this relation.
GeneratedAggregate - Class in org.apache.spark.sql.execution
:: DeveloperApi :: Alternate version of aggregation that leverages projection and thus code generation.
GeneratedAggregate(boolean, Seq<Expression>, Seq<NamedExpression>, SparkPlan) - Constructor for class org.apache.spark.sql.execution.GeneratedAggregate
 
generatedRDDs() - Method in class org.apache.spark.streaming.dstream.DStream
 
generateJob(Time) - Method in class org.apache.spark.streaming.dstream.DStream
Generate a Spark Streaming job for the given time.
generateJob(Time) - Method in class org.apache.spark.streaming.dstream.ForEachDStream
 
generateJobs(Time) - Method in class org.apache.spark.streaming.DStreamGraph
 
GenerateJobs - Class in org.apache.spark.streaming.scheduler
 
GenerateJobs(Time) - Constructor for class org.apache.spark.streaming.scheduler.GenerateJobs
 
generateKMeansRDD(SparkContext, int, int, int, double, int) - Static method in class org.apache.spark.mllib.util.KMeansDataGenerator
Generate an RDD containing test data for KMeans.
generateLinearInput(double, double[], int, int, double) - Static method in class org.apache.spark.mllib.util.LinearDataGenerator
 
generateLinearInputAsList(double, double[], int, int, double) - Static method in class org.apache.spark.mllib.util.LinearDataGenerator
Return a Java List of synthetic data randomly generated according to a multicollinear model.
generateLinearRDD(SparkContext, int, int, double, int, double) - Static method in class org.apache.spark.mllib.util.LinearDataGenerator
Generate an RDD containing sample data for Linear Regression models, including Ridge, Lasso, and unregularized variants.
generateLogisticRDD(SparkContext, int, int, double, int, double) - Static method in class org.apache.spark.mllib.util.LogisticRegressionDataGenerator
Generate an RDD containing test data for LogisticRegression.
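A sketch of driving one of these generators, assuming an existing SparkContext named sc; the sizes are arbitrary:

    import org.apache.spark.mllib.util.KMeansDataGenerator

    // 1000 points in 3 dimensions around 5 centers, scaling factor 1.0,
    // spread across 2 partitions; yields an RDD[Array[Double]].
    val points = KMeansDataGenerator.generateKMeansRDD(sc, 1000, 5, 3, 1.0, 2)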
generateRandomEdges(int, int, int, long) - Static method in class org.apache.spark.graphx.util.GraphGenerators
 
generateRolledOverFileSuffix() - Method in interface org.apache.spark.util.logging.RollingPolicy
Get the desired name of the rollover file
generateRolledOverFileSuffix() - Method in class org.apache.spark.util.logging.SizeBasedRollingPolicy
Get the desired name of the rollover file
generateRolledOverFileSuffix() - Method in class org.apache.spark.util.logging.TimeBasedRollingPolicy
 
generator() - Method in class org.apache.spark.mllib.rdd.RandomRDDPartition
 
generator() - Method in class org.apache.spark.sql.execution.Generate
 
GENERIC - Class in org.apache.spark.sql.columnar
 
GENERIC() - Constructor for class org.apache.spark.sql.columnar.GENERIC
 
GenericColumnAccessor - Class in org.apache.spark.sql.columnar
 
GenericColumnAccessor(ByteBuffer) - Constructor for class org.apache.spark.sql.columnar.GenericColumnAccessor
 
GenericColumnBuilder - Class in org.apache.spark.sql.columnar
 
GenericColumnBuilder() - Constructor for class org.apache.spark.sql.columnar.GenericColumnBuilder
 
GenericColumnStats - Class in org.apache.spark.sql.columnar
 
GenericColumnStats() - Constructor for class org.apache.spark.sql.columnar.GenericColumnStats
 
get(Object) - Method in class org.apache.spark.api.java.JavaUtils.SerializableMapWrapper
 
get() - Method in interface org.apache.spark.FutureAction
Blocks and returns the result of this job.
get() - Method in class org.apache.spark.JavaFutureActionWrapper
 
get(long, TimeUnit) - Method in class org.apache.spark.JavaFutureActionWrapper
 
get(Param<T>) - Method in class org.apache.spark.ml.param.ParamMap
Optionally returns the value associated with a param or its default.
get(Param<T>) - Method in interface org.apache.spark.ml.param.Params
Gets the value of a parameter in the embedded param map.
get(long) - Method in class org.apache.spark.partial.StudentTCacher
 
get(String) - Method in class org.apache.spark.SparkConf
Get a parameter; throws a NoSuchElementException if it's not set
get(String, String) - Method in class org.apache.spark.SparkConf
Get a parameter, falling back to a default if not set
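A minimal sketch of the SparkConf lookup variants (getOption is listed further below); the key names are illustrative:

    import org.apache.spark.SparkConf

    val conf = new SparkConf().set("spark.app.name", "demo")
    conf.get("spark.app.name")            // "demo"
    conf.get("spark.master", "local[*]")  // falls back to the supplied default
    conf.getOption("spark.not.set")       // None rather than an exception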
get() - Static method in class org.apache.spark.SparkEnv
Returns the SparkEnv.
get(String) - Static method in class org.apache.spark.SparkFiles
Get the absolute path of a file added through SparkContext.addFile().
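Files shipped with SparkContext.addFile() are resolved on executors through this method; a sketch assuming an existing SparkContext named sc and a local file data.txt:

    import org.apache.spark.SparkFiles

    sc.addFile("data.txt")
    sc.parallelize(1 to 2).map { _ =>
      SparkFiles.get("data.txt")  // absolute path of the executor-local copy
    }.collect()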
get(int) - Method in class org.apache.spark.sql.api.java.Row
Returns the value of column `i`.
get(Row) - Method in class org.apache.spark.sql.execution.joins.GeneralHashedRelation
 
get(Row) - Method in interface org.apache.spark.sql.execution.joins.HashedRelation
 
get(Row) - Method in class org.apache.spark.sql.execution.joins.UniqueKeyHashedRelation
 
get() - Method in class org.apache.spark.sql.hive.DeferredObjectAdapter
 
get(BlockId) - Method in class org.apache.spark.storage.BlockManager
Get a block from the block manager (either local or remote).
get() - Static method in class org.apache.spark.TaskContext
Return the currently active TaskContext.
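A sketch of reading the active TaskContext inside a running task, assuming an existing SparkContext named sc:

    import org.apache.spark.TaskContext

    sc.parallelize(1 to 10, 2).mapPartitions { iter =>
      val ctx = TaskContext.get()
      // partitionId identifies the partition this task is computing.
      Iterator(s"partition ${ctx.partitionId()}: ${iter.size} elements")
    }.collect()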
get(A) - Method in class org.apache.spark.util.TimeStampedHashMap
 
get(A) - Method in class org.apache.spark.util.TimeStampedWeakValueHashMap
 
getAcceptanceResults(RDD<Tuple2<K, V>>, boolean, Map<K, Object>, Option<Map<K, Object>>, long) - Static method in class org.apache.spark.util.random.StratifiedSamplingUtils
Count the number of items instantly accepted and generate the waitlist for each stratum.
getActiveJobIds() - Method in class org.apache.spark.api.java.JavaSparkStatusTracker
Returns an array containing the ids of all active jobs.
getActiveJobIds() - Method in class org.apache.spark.SparkStatusTracker
Returns an array containing the ids of all active jobs.
getActiveStageIds() - Method in class org.apache.spark.api.java.JavaSparkStatusTracker
Returns an array containing the ids of all active stages.
getActiveStageIds() - Method in class org.apache.spark.SparkStatusTracker
Returns an array containing the ids of all active stages.
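A sketch polling the Scala tracker, assuming an existing SparkContext named sc; getJobInfo (listed further below) returns None once the info has been garbage collected:

    val tracker = sc.statusTracker
    for (jobId <- tracker.getActiveJobIds(); info <- tracker.getJobInfo(jobId)) {
      println(s"job $jobId: ${info.status}")
    }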
getActorSystemHostPortForExecutor(String) - Method in class org.apache.spark.storage.BlockManagerMaster
 
getAddressHostName(String) - Static method in class org.apache.spark.util.Utils
 
getAkkaConf() - Method in class org.apache.spark.SparkConf
Get all akka conf variables set on this SparkConf
getAlgo() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
 
getAll() - Method in class org.apache.spark.SparkConf
Get all parameters as a list of pairs
getAllBlocks() - Method in class org.apache.spark.storage.DiskBlockManager
List all the blocks currently stored on disk by the disk manager.
getAllConfs() - Method in interface org.apache.spark.sql.SQLConf
Return all the configuration properties that have been set (i.e. not the defaults).
getAllFiles() - Method in class org.apache.spark.storage.DiskBlockManager
List all the files currently stored on disk by the disk manager.
getAllPartitionsOf(Hive, Table) - Static method in class org.apache.spark.sql.hive.HiveShim
 
getAllPools() - Method in class org.apache.spark.SparkContext
:: DeveloperApi :: Return pools for fair scheduler
getAppId() - Method in class org.apache.spark.SparkConf
Returns the Spark application id, valid in the Driver after TaskScheduler registration and from the start in the Executor.
getAppName() - Method in class org.apache.spark.ui.SparkUI
 
getAst(String) - Static method in class org.apache.spark.sql.hive.HiveQl
Returns the AST for the given SQL string.
getBasePath() - Method in class org.apache.spark.ui.WebUI
 
getBernoulliSamplingFunction(RDD<Tuple2<K, V>>, Map<K, Object>, boolean, long) - Static method in class org.apache.spark.util.random.StratifiedSamplingUtils
Return the per partition sampling function used for sampling without replacement.
getBinaryWritableConstantObjectInspector(Object) - Static method in class org.apache.spark.sql.hive.HiveShim
 
getBlock(BlockId) - Method in class org.apache.spark.storage.StorageStatus
Return the given block stored in this block manager in O(1) time.
getBlockData(BlockId) - Method in class org.apache.spark.storage.BlockManager
Interface to get local block data.
getBlocksOfBatch(Time) - Method in class org.apache.spark.streaming.scheduler.ReceivedBlockTracker
Get the blocks allocated to the given batch.
getBlocksOfBatch(Time) - Method in class org.apache.spark.streaming.scheduler.ReceiverTracker
Get the blocks for the given batch and all input streams.
getBlocksOfBatchAndStream(Time, int) - Method in class org.apache.spark.streaming.scheduler.ReceivedBlockTracker
Get the blocks allocated to the given batch and stream.
getBlocksOfBatchAndStream(Time, int) - Method in class org.apache.spark.streaming.scheduler.ReceiverTracker
Get the blocks allocated to the given batch and stream.
getBlocksOfStream(int) - Method in class org.apache.spark.streaming.scheduler.AllocatedBlocks
 
getBlockStatus(BlockId, boolean) - Method in class org.apache.spark.storage.BlockManagerMaster
Return the block's status on all block managers, if any.
getBoolean(String, boolean) - Method in class org.apache.spark.SparkConf
Get a parameter as a boolean, falling back to a default if not set
getBoolean(int) - Method in class org.apache.spark.sql.api.java.Row
Returns the value of column i as a boolean.
getBooleanWritableConstantObjectInspector(Object) - Static method in class org.apache.spark.sql.hive.HiveShim
 
getByte(int) - Method in class org.apache.spark.sql.api.java.Row
Returns the value of column i as a byte.
getBytes(BlockId) - Method in class org.apache.spark.storage.BlockStore
 
getBytes(BlockId) - Method in class org.apache.spark.storage.DiskStore
 
getBytes(FileSegment) - Method in class org.apache.spark.storage.DiskStore
 
getBytes(BlockId) - Method in class org.apache.spark.storage.MemoryStore
 
getBytes(BlockId) - Method in class org.apache.spark.storage.TachyonStore
 
getByteWritableConstantObjectInspector(Object) - Static method in class org.apache.spark.sql.hive.HiveShim
 
getCachedBlockManagerId(BlockManagerId) - Static method in class org.apache.spark.storage.BlockManagerId
 
getCachedMetadata(String) - Static method in class org.apache.spark.rdd.HadoopRDD
The three methods below are helpers for accessing the local map, a property of the SparkEnv of the local process.
getCachedStorageLevel(StorageLevel) - Static method in class org.apache.spark.storage.StorageLevel
 
getCalculator(double[], int) - Method in class org.apache.spark.mllib.tree.impurity.EntropyAggregator
Get an ImpurityCalculator for a (node, feature, bin).
getCalculator(double[], int) - Method in class org.apache.spark.mllib.tree.impurity.GiniAggregator
Get an ImpurityCalculator for a (node, feature, bin).
getCalculator(double[], int) - Method in class org.apache.spark.mllib.tree.impurity.ImpurityAggregator
Get an ImpurityCalculator for a (node, feature, bin).
getCalculator(double[], int) - Method in class org.apache.spark.mllib.tree.impurity.VarianceAggregator
Get an ImpurityCalculator for a (node, feature, bin).
getCallSite() - Method in class org.apache.spark.SparkContext
Capture the current user callsite and return a formatted version for printing.
getCallSite(Function1<String, Object>) - Static method in class org.apache.spark.util.Utils
When called inside a class in the spark package, returns the name of the user code class (outside the spark package) that called into Spark, as well as which Spark method they called.
getCategoricalFeaturesInfo() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
 
getCheckpointDir() - Method in class org.apache.spark.api.java.JavaSparkContext
 
getCheckpointDir() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
 
getCheckpointDir() - Method in class org.apache.spark.SparkContext
 
getCheckpointFile() - Method in interface org.apache.spark.api.java.JavaRDDLike
Gets the name of the file to which this RDD was checkpointed
getCheckpointFile() - Method in class org.apache.spark.rdd.RDD
Gets the name of the file to which this RDD was checkpointed
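A sketch tying checkpointing and this accessor together, assuming an existing SparkContext named sc and a writable checkpoint directory (the path is illustrative):

    sc.setCheckpointDir("/tmp/spark-checkpoints")
    val rdd = sc.parallelize(1 to 100).map(_ * 2)
    rdd.checkpoint()
    rdd.count()            // materializes the RDD and writes the checkpoint
    rdd.getCheckpointFile  // Some(<checkpoint path>) once written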
getCheckpointFile() - Method in class org.apache.spark.rdd.RDDCheckpointData
 
getCheckpointFiles(String, FileSystem) - Static method in class org.apache.spark.streaming.Checkpoint
Get checkpoint files present in the given directory, ordered oldest-first.
getCheckpointInterval() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
 
getClause(String, Seq<Node>) - Static method in class org.apache.spark.sql.hive.HiveQl
 
getClauseOption(String, Seq<Node>) - Static method in class org.apache.spark.sql.hive.HiveQl
 
getClientSideSplits(Configuration, List<Footer>, Long, Long, ReadSupport.ReadContext) - Method in class org.apache.spark.sql.parquet.FilteringParquetRowInputFormat
 
getCombOp() - Static method in class org.apache.spark.util.random.StratifiedSamplingUtils
Returns the function used combine results returned by seqOp from different partitions.
getCommandProcessor(String[], HiveConf) - Static method in class org.apache.spark.sql.hive.HiveShim
 
getConf() - Method in class org.apache.spark.api.java.JavaSparkContext
Return a copy of this JavaSparkContext's configuration.
getConf() - Method in class org.apache.spark.input.WholeCombineFileRecordReader
 
getConf() - Method in class org.apache.spark.input.WholeTextFileInputFormat
 
getConf() - Method in class org.apache.spark.input.WholeTextFileRecordReader
 
getConf() - Method in class org.apache.spark.rdd.HadoopRDD
 
getConf() - Method in class org.apache.spark.rdd.NewHadoopRDD
 
getConf() - Method in class org.apache.spark.SparkContext
Return a copy of this SparkContext's configuration.
getConf(String) - Method in interface org.apache.spark.sql.SQLConf
Return the value of Spark SQL configuration property for the given key.
getConf(String, String) - Method in interface org.apache.spark.sql.SQLConf
Return the value of Spark SQL configuration property for the given key.
getConnection() - Method in interface org.apache.spark.rdd.JdbcRDD.ConnectionFactory
 
getConnections() - Method in class org.apache.spark.streaming.flume.FlumePollingReceiver
 
getContextOrSparkClassLoader() - Static method in class org.apache.spark.util.Utils
Get the Context ClassLoader on this thread or, if not present, the ClassLoader that loaded Spark.
getConverter(int) - Method in class org.apache.spark.sql.parquet.CatalystArrayContainsNullConverter
 
getConverter(int) - Method in class org.apache.spark.sql.parquet.CatalystArrayConverter
 
getConverter(int) - Method in class org.apache.spark.sql.parquet.CatalystGroupConverter
 
getConverter(int) - Method in class org.apache.spark.sql.parquet.CatalystMapConverter
 
getConverter(int) - Method in class org.apache.spark.sql.parquet.CatalystNativeArrayConverter
 
getConverter(int) - Method in class org.apache.spark.sql.parquet.CatalystPrimitiveRowConverter
 
getCorrelationFromName(String) - Static method in class org.apache.spark.mllib.stat.correlation.Correlations
 
getCreationSite() - Method in class org.apache.spark.rdd.RDD
 
getCreationSite() - Static method in class org.apache.spark.streaming.dstream.DStream
Get the creation site of a DStream from the stack trace of when the DStream is created.
getCurrentKey() - Method in class org.apache.spark.input.FixedLengthBinaryRecordReader
 
getCurrentKey() - Method in class org.apache.spark.input.StreamBasedRecordReader
 
getCurrentKey() - Method in class org.apache.spark.input.WholeTextFileRecordReader
 
getCurrentRecord() - Method in class org.apache.spark.sql.parquet.CatalystConverter
Should only be called in the root (group) converter!
getCurrentRecord() - Method in class org.apache.spark.sql.parquet.CatalystGroupConverter
 
getCurrentRecord() - Method in class org.apache.spark.sql.parquet.CatalystPrimitiveRowConverter
 
getCurrentRecord() - Method in class org.apache.spark.sql.parquet.RowRecordMaterializer
 
getCurrentValue() - Method in class org.apache.spark.input.FixedLengthBinaryRecordReader
 
getCurrentValue() - Method in class org.apache.spark.input.StreamBasedRecordReader
 
getCurrentValue() - Method in class org.apache.spark.input.WholeTextFileRecordReader
 
getDataLocationPath(Partition) - Static method in class org.apache.spark.sql.hive.HiveShim
 
getDataType() - Method in class org.apache.spark.sql.api.java.StructField
 
getDateWritableConstantObjectInspector(Object) - Static method in class org.apache.spark.sql.hive.HiveShim
 
getDecimalWritableConstantObjectInspector(Object) - Static method in class org.apache.spark.sql.hive.HiveShim
 
getDefaultPropertiesFile(Map<String, String>) - Static method in class org.apache.spark.util.Utils
Return the path of the default Spark properties file.
getDefaultWorkFile(TaskAttemptContext, String) - Method in class org.apache.spark.sql.parquet.AppendingParquetOutputFormat
 
getDelaySeconds(SparkConf) - Static method in class org.apache.spark.util.MetadataCleaner
 
getDelaySeconds(SparkConf, Enumeration.Value) - Static method in class org.apache.spark.util.MetadataCleaner
 
getDependencies() - Method in class org.apache.spark.rdd.CartesianRDD
 
getDependencies() - Method in class org.apache.spark.rdd.CoalescedRDD
 
getDependencies() - Method in class org.apache.spark.rdd.CoGroupedRDD
 
getDependencies() - Method in class org.apache.spark.rdd.ShuffledRDD
 
getDependencies() - Method in class org.apache.spark.rdd.SubtractedRDD
 
getDependencies() - Method in class org.apache.spark.rdd.UnionRDD
 
getDirName() - Method in class org.apache.spark.sql.hive.ShimFileSinkDesc
 
getDiskWriter(BlockId, File, Serializer, int, ShuffleWriteMetrics) - Method in class org.apache.spark.storage.BlockManager
A short circuited method to get a block writer that can write data directly to disk.
getDouble(String, double) - Method in class org.apache.spark.SparkConf
Get a parameter as a double, falling back to a default if not set
getDouble(int) - Method in class org.apache.spark.sql.api.java.Row
Returns the value of column i as a double.
getDoubleWritableConstantObjectInspector(Object) - Static method in class org.apache.spark.sql.hive.HiveShim
 
getElementType() - Method in class org.apache.spark.sql.api.java.ArrayType
 
getEntrySet() - Method in class org.apache.spark.util.TimeStampedHashMap
 
getenv(String) - Method in class org.apache.spark.SparkConf
By using this instead of System.getenv(), environment variables can be mocked in unit tests.
getEpoch() - Method in class org.apache.spark.MapOutputTracker
Called to get current epoch number.
getEstimator() - Method in interface org.apache.spark.ml.tuning.CrossValidatorParams
 
getEstimatorParamMaps() - Method in interface org.apache.spark.ml.tuning.CrossValidatorParams
 
getEvaluator() - Method in interface org.apache.spark.ml.tuning.CrossValidatorParams
 
getExecutorEnv() - Method in class org.apache.spark.SparkConf
Get all executor environment variables set on this SparkConf
getExecutorMemoryStatus() - Method in class org.apache.spark.SparkContext
Return a map from the slave to the max memory available for caching and the remaining memory available for caching.
getExecutorsAliveOnHost(String) - Method in class org.apache.spark.scheduler.TaskSchedulerImpl
 
getExecutorStorageStatus() - Method in class org.apache.spark.SparkContext
:: DeveloperApi :: Return information about blocks stored in all of the slaves
getExecutorThreadDump(String) - Method in class org.apache.spark.SparkContext
Called by the web UI to obtain executor thread dumps.
getExternalTmpPath(Context, Path) - Static method in class org.apache.spark.sql.hive.HiveShim
 
getFeatureOffset(int) - Method in class org.apache.spark.mllib.tree.impl.DTStatsAggregator
Pre-compute feature offset for use with featureUpdate.
getFeaturesCol() - Method in interface org.apache.spark.ml.param.HasFeaturesCol
 
getField(Row, int) - Static method in class org.apache.spark.sql.columnar.BINARY
 
getField(Row, int) - Static method in class org.apache.spark.sql.columnar.BOOLEAN
 
getField(Row, int) - Static method in class org.apache.spark.sql.columnar.BYTE
 
getField(Row, int) - Method in class org.apache.spark.sql.columnar.ColumnType
Returns row(ordinal).
getField(Row, int) - Static method in class org.apache.spark.sql.columnar.DATE
 
getField(Row, int) - Static method in class org.apache.spark.sql.columnar.DOUBLE
 
getField(Row, int) - Static method in class org.apache.spark.sql.columnar.FLOAT
 
getField(Row, int) - Static method in class org.apache.spark.sql.columnar.GENERIC
 
getField(Row, int) - Static method in class org.apache.spark.sql.columnar.INT
 
getField(Row, int) - Static method in class org.apache.spark.sql.columnar.LONG
 
getField(Row, int) - Static method in class org.apache.spark.sql.columnar.SHORT
 
getField(Row, int) - Static method in class org.apache.spark.sql.columnar.STRING
 
getField(Row, int) - Static method in class org.apache.spark.sql.columnar.TIMESTAMP
 
getFields() - Method in class org.apache.spark.sql.api.java.StructType
 
getFile(long) - Static method in class org.apache.spark.broadcast.HttpBroadcast
 
getFile(String) - Method in class org.apache.spark.storage.DiskBlockManager
Looks up a file by hashing it into one of our local subdirectories.
getFile(BlockId) - Method in class org.apache.spark.storage.DiskBlockManager
 
getFile(String) - Method in class org.apache.spark.storage.TachyonBlockManager
 
getFile(BlockId) - Method in class org.apache.spark.storage.TachyonBlockManager
 
getFilePath(File, String) - Static method in class org.apache.spark.util.Utils
Return the absolute path of a file in the given directory.
getFileSegmentLocations(String, long, long, Configuration) - Static method in class org.apache.spark.streaming.util.HdfsUtils
Get the locations of the HDFS blocks containing the given file segment.
getFileSystemForPath(Path, Configuration) - Static method in class org.apache.spark.streaming.util.HdfsUtils
 
getFinalValue() - Method in class org.apache.spark.partial.PartialResult
Blocking method to wait for and return the final value.
getFloat(int) - Method in class org.apache.spark.sql.api.java.Row
Returns the value of column i as a float.
getFloatWritableConstantObjectInspector(Object) - Static method in class org.apache.spark.sql.hive.HiveShim
 
getFooters(JobContext) - Method in class org.apache.spark.sql.parquet.FilteringParquetRowInputFormat
 
getFormattedClassName(Object) - Static method in class org.apache.spark.util.Utils
Return the class name of the given object, removing all dollar signs
getFunctionInfo(String) - Method in class org.apache.spark.sql.hive.HiveFunctionRegistry
 
getHadoopFileSystem(URI, Configuration) - Static method in class org.apache.spark.util.Utils
Return a Hadoop FileSystem with the scheme encoded in the given path.
getHadoopFileSystem(String, Configuration) - Static method in class org.apache.spark.util.Utils
Return a Hadoop FileSystem with the scheme encoded in the given path.
getHandlers() - Method in class org.apache.spark.metrics.sink.MetricsServlet
 
getHandlers() - Method in class org.apache.spark.ui.WebUI
 
getHiveFile(String) - Method in class org.apache.spark.sql.hive.test.TestHiveContext
 
getHttpUser() - Method in class org.apache.spark.SecurityManager
Gets the user used for authenticating HTTP connections.
getImpurity() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
 
getImpurityCalculator(int, int) - Method in class org.apache.spark.mllib.tree.impl.DTStatsAggregator
Get an ImpurityCalculator for a given (node, feature, bin).
getInputCol() - Method in interface org.apache.spark.ml.param.HasInputCol
 
getInputStream(String, Configuration) - Static method in class org.apache.spark.streaming.util.HdfsUtils
 
getInputStreams() - Method in class org.apache.spark.streaming.DStreamGraph
 
getInstance(String) - Method in class org.apache.spark.metrics.MetricsConfig
 
getInt(String, int) - Method in class org.apache.spark.SparkConf
Get a parameter as an integer, falling back to a default if not set
getInt(int) - Method in class org.apache.spark.sql.api.java.Row
Returns the value of column i as an int.
getIntWritableConstantObjectInspector(Object) - Static method in class org.apache.spark.sql.hive.HiveShim
 
getIteratorSize(Iterator<T>) - Static method in class org.apache.spark.util.Utils
Counts the number of elements of an iterator using a while loop rather than calling TraversableOnce.size() because it uses a for loop, which is slightly slower in the current version of Scala.
getJobIdsForGroup(String) - Method in class org.apache.spark.api.java.JavaSparkStatusTracker
Return a list of all known jobs in a particular job group.
getJobIdsForGroup(String) - Method in class org.apache.spark.SparkStatusTracker
Return a list of all known jobs in a particular job group.
getJobInfo(int) - Method in class org.apache.spark.api.java.JavaSparkStatusTracker
Returns job information, or null if the job info could not be found or was garbage collected.
getJobInfo(int) - Method in class org.apache.spark.SparkStatusTracker
Returns job information, or None if the job info could not be found or was garbage collected.
getKeyType() - Method in class org.apache.spark.sql.api.java.MapType
 
getLabelCol() - Method in interface org.apache.spark.ml.param.HasLabelCol
 
getLearningRate() - Method in class org.apache.spark.mllib.tree.configuration.BoostingStrategy
 
getLeastGroupHash(String) - Method in class org.apache.spark.rdd.PartitionCoalescer
Sorts and gets the least element of the list associated with key in groupHash. The returned PartitionGroup is the least loaded of all groups that represent the machine "key".
getLeftRightFeatureOffsets(int) - Method in class org.apache.spark.mllib.tree.impl.DTStatsAggregator
Pre-compute feature offset for use with featureUpdate.
getLocal(BlockId) - Method in class org.apache.spark.storage.BlockManager
Get block from local block manager.
getLocalBytes(BlockId) - Method in class org.apache.spark.storage.BlockManager
Get block from the local block manager as serialized bytes.
getLocalDir(SparkConf) - Static method in class org.apache.spark.util.Utils
Get the path of a temporary directory.
getLocalFileWriter(Row) - Method in class org.apache.spark.sql.hive.SparkHiveDynamicPartitionWriterContainer
 
getLocalFileWriter(Row) - Method in class org.apache.spark.sql.hive.SparkHiveWriterContainer
 
getLocalityIndex(Enumeration.Value) - Method in class org.apache.spark.scheduler.TaskSetManager
Find the index in myLocalityLevels for a given locality.
getLocalProperties() - Method in class org.apache.spark.SparkContext
 
getLocalProperty(String) - Method in class org.apache.spark.api.java.JavaSparkContext
Get a local property set in this thread, or null if it is missing.
getLocalProperty(String) - Method in class org.apache.spark.SparkContext
Get a local property set in this thread, or null if it is missing.
getLocation() - Method in class org.apache.spark.rdd.HadoopRDD.SplitInfoReflections
 
getLocationInfo() - Method in class org.apache.spark.rdd.HadoopRDD.SplitInfoReflections
 
getLocations(BlockId) - Method in class org.apache.spark.storage.BlockManagerMaster
Get locations of the blockId from the driver
getLocations(BlockId[]) - Method in class org.apache.spark.storage.BlockManagerMaster
Get locations of multiple blockIds from the driver
getLogDirPath(String, String) - Static method in class org.apache.spark.scheduler.EventLoggingListener
Return a file-system-safe path to the log directory for the given application.
getLong(String, long) - Method in class org.apache.spark.SparkConf
Get a parameter as a long, falling back to a default if not set
getLong(int) - Method in class org.apache.spark.sql.api.java.Row
Returns the value of column i as a long.
getLongWritableConstantObjectInspector(Object) - Static method in class org.apache.spark.sql.hive.HiveShim
 
getLoss() - Method in class org.apache.spark.mllib.tree.configuration.BoostingStrategy
 
getLowerBound(double, long, double) - Static method in class org.apache.spark.util.random.BinomialBounds
Returns a threshold p such that if we conduct n Bernoulli trials with success rate = p, it is very unlikely to have more than fraction * n successes.
getLowerBound(double) - Static method in class org.apache.spark.util.random.PoissonBounds
Returns a lambda such that Pr[X > s] is very small, where X ~ Pois(lambda).
GetMapOutputStatuses - Class in org.apache.spark
 
GetMapOutputStatuses(int) - Constructor for class org.apache.spark.GetMapOutputStatuses
 
getMatchingBlockIds(Function1<BlockId, Object>) - Method in class org.apache.spark.storage.BlockManager
Get the ids of existing blocks that match the given filter.
getMatchingBlockIds(Function1<BlockId, Object>, boolean) - Method in class org.apache.spark.storage.BlockManagerMaster
Return a list of ids of existing blocks such that the ids match the given filter.
getMaxBatchSize() - Method in class org.apache.spark.streaming.flume.FlumePollingReceiver
 
getMaxBins() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
 
getMaxDepth() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
 
getMaxInputStreamRememberDuration() - Method in class org.apache.spark.streaming.DStreamGraph
Get the maximum remember duration across all the input streams.
getMaxIter() - Method in interface org.apache.spark.ml.param.HasMaxIter
 
getMaxMemoryInMB() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
 
getMaxResultSize(SparkConf) - Static method in class org.apache.spark.util.Utils
 
getMemoryStatus() - Method in class org.apache.spark.storage.BlockManagerMaster
Return the memory status for each block manager, in the form of a map from the block manager's id to two long values.
getMessage() - Method in exception org.apache.spark.util.TaskCompletionListenerException
 
getMetadata() - Method in class org.apache.spark.sql.api.java.StructField
 
getMetricName() - Method in class org.apache.spark.ml.evaluation.BinaryClassificationEvaluator
 
getMetricsSnapshot(HttpServletRequest) - Method in class org.apache.spark.metrics.sink.MetricsServlet
 
getMinInfoGain() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
 
getMinInstancesPerNode() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
 
getModel(Estimator<M>) - Method in class org.apache.spark.ml.PipelineModel
Gets the model produced by the input estimator.
getModifyAcls() - Method in class org.apache.spark.SecurityManager
 
getName() - Method in class org.apache.spark.sql.api.java.StructField
 
getNarrowAncestors() - Method in class org.apache.spark.rdd.RDD
Return the ancestors of the given RDD that are related to it only through a sequence of narrow dependencies.
getNewReceiverStreamId() - Method in class org.apache.spark.streaming.StreamingContext
 
getNode(int, Node) - Static method in class org.apache.spark.mllib.tree.model.Node
Traces down from a root node to get the node with the given node index.
getNumClasses() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
 
getNumFeatures() - Method in class org.apache.spark.ml.feature.HashingTF
 
getNumFolds() - Method in interface org.apache.spark.ml.tuning.CrossValidatorParams
 
getNumIterations() - Method in class org.apache.spark.mllib.tree.configuration.BoostingStrategy
 
getObjectInspector() - Method in class org.apache.spark.sql.hive.parquet.FakeParquetSerDe
 
getOption(String) - Method in class org.apache.spark.SparkConf
Get a parameter as an Option
getOrCompute(RDD<T>, Partition, TaskContext, StorageLevel) - Method in class org.apache.spark.CacheManager
Gets or computes an RDD partition.
getOrCompute(Time) - Method in class org.apache.spark.streaming.dstream.DStream
Get the RDD corresponding to the given time; either retrieve it from cache or compute-and-cache it.
getOrCreate(String, JavaStreamingContextFactory) - Static method in class org.apache.spark.streaming.api.java.JavaStreamingContext
Either recreate a StreamingContext from checkpoint data or create a new StreamingContext.
getOrCreate(String, Configuration, JavaStreamingContextFactory) - Static method in class org.apache.spark.streaming.api.java.JavaStreamingContext
Either recreate a StreamingContext from checkpoint data or create a new StreamingContext.
getOrCreate(String, Configuration, JavaStreamingContextFactory, boolean) - Static method in class org.apache.spark.streaming.api.java.JavaStreamingContext
Either recreate a StreamingContext from checkpoint data or create a new StreamingContext.
getOrCreate(String, Function0<StreamingContext>, Configuration, boolean) - Static method in class org.apache.spark.streaming.StreamingContext
Either recreate a StreamingContext from checkpoint data or create a new StreamingContext.
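A sketch of the recover-or-create pattern for the Scala variant, assuming a SparkConf named conf; the batch interval and checkpoint path are illustrative:

    import org.apache.spark.streaming.{Seconds, StreamingContext}

    val checkpointDir = "/tmp/streaming-checkpoint"

    def createContext(): StreamingContext = {
      val ssc = new StreamingContext(conf, Seconds(10))
      ssc.checkpoint(checkpointDir)
      // ... define the DStream graph here ...
      ssc
    }

    // Recovers from checkpoint data if present, otherwise calls createContext.
    val ssc = StreamingContext.getOrCreate(checkpointDir, createContext _)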
getOrCreateLocalRootDirs(SparkConf) - Static method in class org.apache.spark.util.Utils
Gets or creates the directories listed in spark.local.dir or SPARK_LOCAL_DIRS, and returns only the directories that exist or could be created.
getOutputCol() - Method in interface org.apache.spark.ml.param.HasOutputCol
 
getOutputStream(String, Configuration) - Static method in class org.apache.spark.streaming.util.HdfsUtils
 
getOutputStreams() - Method in class org.apache.spark.streaming.DStreamGraph
 
getParam(String) - Method in interface org.apache.spark.ml.param.Params
Gets a param by its name.
getParents(int) - Method in class org.apache.spark.NarrowDependency
Get the parent partitions for a child partition.
getParents(int) - Method in class org.apache.spark.OneToOneDependency
 
getParents(int) - Method in class org.apache.spark.RangeDependency
 
getParents(int) - Method in class org.apache.spark.rdd.PruneDependency
 
getPartition(long, long, int) - Method in class org.apache.spark.graphx.PartitionStrategy.CanonicalRandomVertexCut$
 
getPartition(long, long, int) - Method in class org.apache.spark.graphx.PartitionStrategy.EdgePartition1D$
 
getPartition(long, long, int) - Method in class org.apache.spark.graphx.PartitionStrategy.EdgePartition2D$
 
getPartition(long, long, int) - Method in interface org.apache.spark.graphx.PartitionStrategy
Returns the partition number for a given edge.
getPartition(long, long, int) - Method in class org.apache.spark.graphx.PartitionStrategy.RandomVertexCut$
 
getPartition(Object) - Method in class org.apache.spark.HashPartitioner
 
getPartition(Object) - Method in class org.apache.spark.mllib.recommendation.ALSPartitioner
 
getPartition(Object) - Method in class org.apache.spark.Partitioner
 
getPartition(Object) - Method in class org.apache.spark.RangePartitioner
 
getPartitions() - Method in class org.apache.spark.mllib.rdd.RandomRDD
 
getPartitions() - Method in class org.apache.spark.mllib.rdd.SlidingRDD
 
getPartitions() - Method in class org.apache.spark.rdd.BinaryFileRDD
 
getPartitions() - Method in class org.apache.spark.rdd.BlockRDD
 
getPartitions() - Method in class org.apache.spark.rdd.CartesianRDD
 
getPartitions() - Method in class org.apache.spark.rdd.CheckpointRDD
 
getPartitions() - Method in class org.apache.spark.rdd.CoalescedRDD
 
getPartitions() - Method in class org.apache.spark.rdd.CoGroupedRDD
 
getPartitions() - Method in class org.apache.spark.rdd.EmptyRDD
 
getPartitions() - Method in class org.apache.spark.rdd.FilteredRDD
 
getPartitions() - Method in class org.apache.spark.rdd.FlatMappedRDD
 
getPartitions() - Method in class org.apache.spark.rdd.FlatMappedValuesRDD
 
getPartitions() - Method in class org.apache.spark.rdd.GlommedRDD
 
getPartitions() - Method in class org.apache.spark.rdd.HadoopRDD
 
getPartitions() - Method in class org.apache.spark.rdd.HadoopRDD.HadoopMapPartitionsWithSplitRDD
 
getPartitions() - Method in class org.apache.spark.rdd.JdbcRDD
 
getPartitions() - Method in class org.apache.spark.rdd.MapPartitionsRDD
 
getPartitions() - Method in class org.apache.spark.rdd.MappedRDD
 
getPartitions() - Method in class org.apache.spark.rdd.MappedValuesRDD
 
getPartitions() - Method in class org.apache.spark.rdd.NewHadoopRDD
 
getPartitions() - Method in class org.apache.spark.rdd.NewHadoopRDD.NewHadoopMapPartitionsWithSplitRDD
 
getPartitions() - Method in class org.apache.spark.rdd.ParallelCollectionRDD
 
getPartitions() - Method in class org.apache.spark.rdd.PartitionCoalescer
 
getPartitions() - Method in class org.apache.spark.rdd.PartitionerAwareUnionRDD
 
getPartitions() - Method in class org.apache.spark.rdd.PartitionwiseSampledRDD
 
getPartitions() - Method in class org.apache.spark.rdd.PipedRDD
 
getPartitions() - Method in class org.apache.spark.rdd.RDDCheckpointData
 
getPartitions() - Method in class org.apache.spark.rdd.SampledRDD
 
getPartitions() - Method in class org.apache.spark.rdd.ShuffledRDD
 
getPartitions() - Method in class org.apache.spark.rdd.SubtractedRDD
 
getPartitions() - Method in class org.apache.spark.rdd.UnionRDD
 
getPartitions() - Method in class org.apache.spark.rdd.WholeTextFileRDD
 
getPartitions() - Method in class org.apache.spark.rdd.ZippedPartitionsBaseRDD
 
getPartitions() - Method in class org.apache.spark.rdd.ZippedWithIndexRDD
 
getPartitions() - Method in class org.apache.spark.sql.SchemaRDD
 
getPartitions() - Method in class org.apache.spark.streaming.rdd.WriteAheadLogBackedBlockRDD
 
getPath() - Method in class org.apache.spark.input.PortableDataStream
 
getPeers(BlockManagerId) - Method in class org.apache.spark.storage.BlockManagerMaster
Get ids of other nodes in the cluster from the driver
getPendingTimes() - Method in class org.apache.spark.streaming.scheduler.JobScheduler
 
getPersistentRDDs() - Method in class org.apache.spark.SparkContext
Returns an immutable map of RDDs that have marked themselves as persistent via a cache() call.
getPipeEnvVars() - Method in class org.apache.spark.rdd.HadoopPartition
Get any environment variables that should be added to the user's environment when running pipes.
getPipeline() - Method in class org.apache.spark.streaming.flume.FlumeReceiver.CompressionChannelPipelineFactory
 
getPointIterator(RandomRDDPartition<T>, ClassTag<T>) - Static method in class org.apache.spark.mllib.rdd.RandomRDD
 
getPoissonSamplingFunction(RDD<Tuple2<K, V>>, Map<K, Object>, boolean, long, ClassTag<K>, ClassTag<V>) - Static method in class org.apache.spark.util.random.StratifiedSamplingUtils
Return the per partition sampling function used for sampling with replacement.
getPoolForName(String) - Method in class org.apache.spark.SparkContext
:: DeveloperApi :: Return the pool associated with the given name, if one exists
getPrecision() - Method in class org.apache.spark.sql.api.java.DecimalType
Return the precision, or -1 if no precision is set
getPredictionCol() - Method in interface org.apache.spark.ml.param.HasPredictionCol
 
getPreferredLocations(Partition) - Method in class org.apache.spark.mllib.rdd.SlidingRDD
 
getPreferredLocations(Partition) - Method in class org.apache.spark.rdd.BlockRDD
 
getPreferredLocations(Partition) - Method in class org.apache.spark.rdd.CartesianRDD
 
getPreferredLocations(Partition) - Method in class org.apache.spark.rdd.CheckpointRDD
 
getPreferredLocations(Partition) - Method in class org.apache.spark.rdd.CoalescedRDD
Returns the preferred machine for the partition.
getPreferredLocations(Partition) - Method in class org.apache.spark.rdd.HadoopRDD
 
getPreferredLocations(Partition) - Method in class org.apache.spark.rdd.NewHadoopRDD
 
getPreferredLocations(Partition) - Method in class org.apache.spark.rdd.ParallelCollectionRDD
 
getPreferredLocations(Partition) - Method in class org.apache.spark.rdd.PartitionerAwareUnionRDD
 
getPreferredLocations(Partition) - Method in class org.apache.spark.rdd.PartitionwiseSampledRDD
 
getPreferredLocations(Partition) - Method in class org.apache.spark.rdd.RDDCheckpointData
 
getPreferredLocations(Partition) - Method in class org.apache.spark.rdd.SampledRDD
 
getPreferredLocations(Partition) - Method in class org.apache.spark.rdd.UnionRDD
 
getPreferredLocations(Partition) - Method in class org.apache.spark.rdd.ZippedPartitionsBaseRDD
 
getPreferredLocations(Partition) - Method in class org.apache.spark.rdd.ZippedWithIndexRDD
 
getPreferredLocations(Partition) - Method in class org.apache.spark.streaming.rdd.WriteAheadLogBackedBlockRDD
Get the preferred location of the partition.
getPreferredLocs(RDD<?>, int) - Method in class org.apache.spark.scheduler.DAGScheduler
Synchronized method that might be called from other threads.
getPreferredLocs(RDD<?>, int) - Method in class org.apache.spark.SparkContext
Gets the locality information associated with the partition in a particular rdd
getPrimitiveNullWritableConstantObjectInspector() - Static method in class org.apache.spark.sql.hive.HiveShim
 
getProgress() - Method in class org.apache.spark.input.FixedLengthBinaryRecordReader
 
getProgress() - Method in class org.apache.spark.input.StreamBasedRecordReader
 
getProgress() - Method in class org.apache.spark.input.WholeTextFileRecordReader
 
getPropertiesFromFile(String) - Static method in class org.apache.spark.util.Utils
Load properties present in the given file.
getQuantileCalculationStrategy() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
 
getQuantiles(Traversable<Object>) - Method in class org.apache.spark.util.Distribution
Get the value of the distribution at the given probabilities.
getRackForHost(String) - Method in class org.apache.spark.scheduler.TaskSchedulerImpl
 
getRddBlockLocations(int, Seq<StorageStatus>) - Static method in class org.apache.spark.storage.StorageUtils
Return a mapping from block ID to its locations for each block that belongs to the given RDD.
getRDDStorageInfo() - Method in class org.apache.spark.SparkContext
:: DeveloperApi :: Return information about what RDDs are cached, whether they are in memory or on disk, how much space they take, etc.
getReceiver() - Method in class org.apache.spark.streaming.dstream.PluggableInputDStream
 
getReceiver() - Method in class org.apache.spark.streaming.dstream.RawInputDStream
 
getReceiver() - Method in class org.apache.spark.streaming.dstream.ReceiverInputDStream
Gets the receiver object that will be sent to the worker nodes to receive data.
getReceiver() - Method in class org.apache.spark.streaming.dstream.SocketInputDStream
 
getReceiver() - Method in class org.apache.spark.streaming.flume.FlumeInputDStream
 
getReceiver() - Method in class org.apache.spark.streaming.flume.FlumePollingInputDStream
 
getReceiver() - Method in class org.apache.spark.streaming.kafka.KafkaInputDStream
 
getReceiver() - Method in class org.apache.spark.streaming.mqtt.MQTTInputDStream
 
getReceiver() - Method in class org.apache.spark.streaming.twitter.TwitterInputDStream
 
getReceiverInputStreams() - Method in class org.apache.spark.streaming.DStreamGraph
 
getRecordLength(JobContext) - Static method in class org.apache.spark.input.FixedLengthBinaryInputFormat
Retrieves the record length property from a Hadoop configuration
getReference(A) - Method in class org.apache.spark.util.TimeStampedWeakValueHashMap
 
getRegParam() - Method in interface org.apache.spark.ml.param.HasRegParam
 
getRemote(BlockId) - Method in class org.apache.spark.storage.BlockManager
Get block from remote block managers.
getRemoteBytes(BlockId) - Method in class org.apache.spark.storage.BlockManager
Get block from remote block managers as serialized bytes.
getResource(List<Protos.Resource>, String) - Method in class org.apache.spark.scheduler.cluster.mesos.MesosSchedulerBackend
Helper function to pull out a resource from a Mesos Resources protobuf
getRestartTime(long) - Method in class org.apache.spark.streaming.util.RecurringTimer
Get the time when the timer will fire if it is restarted right now.
getRootConverter() - Method in class org.apache.spark.sql.parquet.RowRecordMaterializer
 
getRootDirectory() - Static method in class org.apache.spark.SparkFiles
Get the root directory that contains files added through SparkContext.addFile().
getSaslUser() - Method in class org.apache.spark.SecurityManager
Gets the user used for authenticating SASL connections.
getSaslUser(String) - Method in class org.apache.spark.SecurityManager
 
getScale() - Method in class org.apache.spark.sql.api.java.DecimalType
Return the scale, or -1 if no scale is set
getSchedulableByName(String) - Method in class org.apache.spark.scheduler.Pool
 
getSchedulableByName(String) - Method in interface org.apache.spark.scheduler.Schedulable
 
getSchedulableByName(String) - Method in class org.apache.spark.scheduler.TaskSetManager
 
getSchedulingMode() - Method in class org.apache.spark.SparkContext
Return current scheduling mode
getSchema(Configuration) - Static method in class org.apache.spark.sql.parquet.RowWriteSupport
 
getScoreCol() - Method in interface org.apache.spark.ml.param.HasScoreCol
 
getSecretKey() - Method in class org.apache.spark.SecurityManager
Gets the secret key.
getSecretKey(String) - Method in class org.apache.spark.SecurityManager
 
getSecurityManager() - Method in class org.apache.spark.ui.WebUI
 
getSeqOp(boolean, Map<K, Object>, StratifiedSamplingUtils.RandomDataGenerator, Option<Map<K, Object>>) - Static method in class org.apache.spark.util.random.StratifiedSamplingUtils
Returns the function used by aggregate to collect sampling statistics for each partition.
getSerDeStats() - Method in class org.apache.spark.sql.hive.parquet.FakeParquetSerDe
 
getSerializedClass() - Method in class org.apache.spark.sql.hive.parquet.FakeParquetSerDe
 
getSerializedMapOutputStatuses(int) - Method in class org.apache.spark.MapOutputTrackerMaster
 
getSerializer(Serializer) - Static method in class org.apache.spark.serializer.Serializer
 
getSerializer(Option<Serializer>) - Static method in class org.apache.spark.serializer.Serializer
 
getServerStatuses(int, int) - Method in class org.apache.spark.MapOutputTracker
Called from executors to get the server URIs and output sizes of the map outputs of a given shuffle.
getServletHandlers() - Method in class org.apache.spark.metrics.MetricsSystem
Get any UI handlers used by this metrics system; can only be called after start().
getShort(int) - Method in class org.apache.spark.sql.api.java.Row
Returns the value of column i as a short.
getShortWritableConstantObjectInspector(Object) - Static method in class org.apache.spark.sql.hive.HiveShim
 
getSingle(BlockId) - Method in class org.apache.spark.storage.BlockManager
Read a block consisting of a single object.
getSize(BlockId) - Method in class org.apache.spark.storage.BlockStore
Return the size of a block in bytes.
getSize(BlockId) - Method in class org.apache.spark.storage.DiskStore
 
getSize(BlockId) - Method in class org.apache.spark.storage.MemoryStore
 
getSize(BlockId) - Method in class org.apache.spark.storage.TachyonStore
 
getSizeForBlock(int) - Method in class org.apache.spark.scheduler.CompressedMapStatus
 
getSizeForBlock(int) - Method in class org.apache.spark.scheduler.HighlyCompressedMapStatus
 
getSizeForBlock(int) - Method in interface org.apache.spark.scheduler.MapStatus
Estimated size for the reduce block, in bytes.
getSizesOfActiveStateTrackingCollections() - Method in class org.apache.spark.ui.jobs.JobProgressListener
 
getSizesOfHardSizeLimitedCollections() - Method in class org.apache.spark.ui.jobs.JobProgressListener
 
getSizesOfSoftSizeLimitedCollections() - Method in class org.apache.spark.ui.jobs.JobProgressListener
 
getSortedRolledOverFiles(String, String) - Static method in class org.apache.spark.util.logging.RollingFileAppender
Get the sorted list of rolled over files.
getSortedTaskSetQueue() - Method in class org.apache.spark.scheduler.Pool
 
getSortedTaskSetQueue() - Method in interface org.apache.spark.scheduler.Schedulable
 
getSortedTaskSetQueue() - Method in class org.apache.spark.scheduler.TaskSetManager
 
getSparkClassLoader() - Static method in class org.apache.spark.util.Utils
Get the ClassLoader which loaded Spark.
getSparkHome() - Method in class org.apache.spark.api.java.JavaSparkContext
Get Spark's home location from either a value set through the constructor, or the spark.home Java property, or the SPARK_HOME environment variable (in that order of preference).
getSparkHome() - Method in class org.apache.spark.SparkContext
Get Spark's home location from either a value set through the constructor, or the spark.home Java property, or the SPARK_HOME environment variable (in that order of preference).
getSparkOrYarnConfig(SparkConf, String, String) - Static method in class org.apache.spark.util.Utils
Return the value of a config either through the SparkConf or the Hadoop configuration if this is Yarn mode.
getSparkUI(StreamingContext) - Static method in class org.apache.spark.streaming.ui.StreamingTab
 
getSplits(Configuration, List<Footer>) - Method in class org.apache.spark.sql.parquet.FilteringParquetRowInputFormat
 
getStageInfo(int) - Method in class org.apache.spark.api.java.JavaSparkStatusTracker
Returns stage information, or null if the stage info could not be found or was garbage collected.
getStageInfo(int) - Method in class org.apache.spark.SparkStatusTracker
Returns stage information, or None if the stage info could not be found or was garbage collected.
getStages() - Method in class org.apache.spark.ml.Pipeline
 
getStartTime() - Method in class org.apache.spark.streaming.util.RecurringTimer
Get the time when this timer will fire if it is started right now.
getStatsSetupConstRawDataSize() - Static method in class org.apache.spark.sql.hive.HiveShim
 
getStatsSetupConstTotalSize() - Static method in class org.apache.spark.sql.hive.HiveShim
 
getStatus(BlockId) - Method in class org.apache.spark.storage.BlockManager
Get the BlockStatus for the block identified by the given ID, if it exists.
getStatus(BlockId) - Method in class org.apache.spark.storage.BlockManagerInfo
 
getStderr(Process, long) - Static method in class org.apache.spark.util.Utils
Return the stderr of a process after waiting for the process to terminate.
getStorageLevel() - Method in interface org.apache.spark.api.java.JavaRDDLike
Get the RDD's current storage level, or StorageLevel.NONE if none is set.
getStorageLevel() - Method in class org.apache.spark.rdd.RDD
Get the RDD's current storage level, or StorageLevel.NONE if none is set.
getStorageStatus() - Method in class org.apache.spark.storage.BlockManagerMaster
 
getString(int) - Method in class org.apache.spark.sql.api.java.Row
Returns the value of column i as a String.
getStringWritableConstantObjectInspector(Object) - Static method in class org.apache.spark.sql.hive.HiveShim
 
getSubsamplingRate() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
 
getSystemProperties() - Static method in class org.apache.spark.util.Utils
Returns the system properties map, which is thread-safe to iterate over.
getTableDesc(Class<? extends Deserializer>, Class<? extends InputFormat<?, ?>>, Class<?>, Properties) - Static method in class org.apache.spark.sql.hive.HiveShim
 
getTabs() - Method in class org.apache.spark.ui.WebUI
 
getTaskSideSplits(Configuration, List<Footer>, Long, Long, ReadSupport.ReadContext) - Method in class org.apache.spark.sql.parquet.FilteringParquetRowInputFormat
 
getThreadDump() - Static method in class org.apache.spark.util.Utils
Return a thread dump of all threads' stacktraces.
getThreadLocal() - Static method in class org.apache.spark.SparkEnv
Returns the ThreadLocal SparkEnv.
getThreshold() - Method in interface org.apache.spark.ml.param.HasThreshold
 
getTime() - Method in interface org.apache.spark.util.Clock
 
getTime() - Static method in class org.apache.spark.util.SystemClock
 
getTimeMillis() - Method in interface org.apache.spark.Clock
 
getTimeMillis() - Method in class org.apache.spark.RealClock
 
getTimeMillis() - Method in class org.apache.spark.TestClock
 
getTimestamp(A) - Method in class org.apache.spark.util.TimeStampedHashMap
 
getTimestamp(A) - Method in class org.apache.spark.util.TimeStampedWeakValueHashMap
 
getTimeStampedValue(A) - Method in class org.apache.spark.util.TimeStampedHashMap
 
getTimestampWritableConstantObjectInspector(Object) - Static method in class org.apache.spark.sql.hive.HiveShim
 
GETTING_RESULT_TIME() - Static method in class org.apache.spark.ui.jobs.TaskDetailsClassNames
 
GETTING_RESULT_TIME() - Static method in class org.apache.spark.ui.ToolTips
 
gettingResult() - Method in class org.apache.spark.scheduler.TaskInfo
 
GettingResultEvent - Class in org.apache.spark.scheduler
 
GettingResultEvent(TaskInfo) - Constructor for class org.apache.spark.scheduler.GettingResultEvent
 
gettingResultTime() - Method in class org.apache.spark.scheduler.TaskInfo
The time when the task started remotely getting the result.
getTreeStrategy() - Method in class org.apache.spark.mllib.tree.configuration.BoostingStrategy
 
getUIPort(SparkConf) - Static method in class org.apache.spark.ui.SparkUI
 
getUnallocatedBlocks(int) - Method in class org.apache.spark.streaming.scheduler.ReceivedBlockTracker
Get blocks that have been added but not yet allocated to any batch.
getUpperBound(double, long, double) - Static method in class org.apache.spark.util.random.BinomialBounds
Returns a threshold p such that if we conduct n Bernoulli trials with success rate = p, it is very unlikely to have fewer than fraction * n successes.
getUpperBound(double) - Static method in class org.apache.spark.util.random.PoissonBounds
Returns a lambda such that Pr[X < s] is very small, where X ~ Pois(lambda).
getUsedTimeMs(long) - Static method in class org.apache.spark.util.Utils
Return a string describing how much time has passed, in milliseconds.
getUseNodeIdCache() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
 
getValue(Row) - Method in class org.apache.spark.sql.execution.joins.UniqueKeyHashedRelation
 
getValues(BlockId) - Method in class org.apache.spark.storage.BlockStore
 
getValues(BlockId) - Method in class org.apache.spark.storage.DiskStore
 
getValues(BlockId, Serializer) - Method in class org.apache.spark.storage.DiskStore
A version of getValues that allows a custom serializer.
getValues(BlockId) - Method in class org.apache.spark.storage.MemoryStore
 
getValues(BlockId) - Method in class org.apache.spark.storage.TachyonStore
 
getValueType() - Method in class org.apache.spark.sql.api.java.MapType
 
getVectorIterator(RandomRDDPartition<Object>, int) - Static method in class org.apache.spark.mllib.rdd.RandomRDD
 
getVectors() - Method in class org.apache.spark.mllib.feature.Word2VecModel
Returns a map of words to their vector representations.
getViewAcls() - Method in class org.apache.spark.SecurityManager
 
Gini - Class in org.apache.spark.mllib.tree.impurity
:: Experimental :: Class for calculating the Gini impurity during binary classification.
Gini() - Constructor for class org.apache.spark.mllib.tree.impurity.Gini
 
GiniAggregator - Class in org.apache.spark.mllib.tree.impurity
Class for updating views of a vector of sufficient statistics, in order to compute impurity from a sample.
GiniAggregator(int) - Constructor for class org.apache.spark.mllib.tree.impurity.GiniAggregator
 
GiniCalculator - Class in org.apache.spark.mllib.tree.impurity
Stores statistics for one (node, feature, bin) for calculating impurity.
GiniCalculator(double[]) - Constructor for class org.apache.spark.mllib.tree.impurity.GiniCalculator
 
global() - Method in class org.apache.spark.sql.execution.ExternalSort
 
global() - Method in class org.apache.spark.sql.execution.Sort
 
glom() - Method in interface org.apache.spark.api.java.JavaRDDLike
Return an RDD created by coalescing all elements within each partition into an array.
glom() - Method in class org.apache.spark.rdd.RDD
Return an RDD created by coalescing all elements within each partition into an array.
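For illustration, a minimal Scala sketch (assuming a SparkContext named sc) of what glom() produces:

    val rdd = sc.parallelize(1 to 6, 2)   // two partitions: (1,2,3) and (4,5,6)
    val arrays = rdd.glom().collect()     // Array(Array(1, 2, 3), Array(4, 5, 6))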
glom() - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
Return a new DStream in which each RDD is generated by applying glom() to each RDD of this DStream.
glom() - Method in class org.apache.spark.streaming.dstream.DStream
Return a new DStream in which each RDD is generated by applying glom() to each RDD of this DStream.
GlommedDStream<T> - Class in org.apache.spark.streaming.dstream
 
GlommedDStream(DStream<T>, ClassTag<T>) - Constructor for class org.apache.spark.streaming.dstream.GlommedDStream
 
GlommedRDD<T> - Class in org.apache.spark.rdd
 
GlommedRDD(RDD<T>, ClassTag<T>) - Constructor for class org.apache.spark.rdd.GlommedRDD
 
goodnessOfFit() - Method in class org.apache.spark.mllib.stat.test.ChiSqTest.NullHypothesis$
 
grad() - Method in class org.apache.spark.mllib.optimization.NNLS.Workspace
 
Gradient - Class in org.apache.spark.mllib.optimization
:: DeveloperApi :: Class used to compute the gradient for a loss function, given a single data point.
Gradient() - Constructor for class org.apache.spark.mllib.optimization.Gradient
 
gradient(TreeEnsembleModel, LabeledPoint) - Static method in class org.apache.spark.mllib.tree.loss.AbsoluteError
Method to calculate the gradients for the gradient boosting calculation, for least absolute error.
gradient(TreeEnsembleModel, LabeledPoint) - Static method in class org.apache.spark.mllib.tree.loss.LogLoss
Method to calculate the loss gradients for the gradient boosting calculation for binary classification. The gradient with respect to F(x) is: -4y / (1 + exp(2yF(x))).
gradient(TreeEnsembleModel, LabeledPoint) - Method in interface org.apache.spark.mllib.tree.loss.Loss
Method to calculate the gradients for the gradient boosting calculation.
gradient(TreeEnsembleModel, LabeledPoint) - Static method in class org.apache.spark.mllib.tree.loss.SquaredError
Method to calculate the gradients for the gradient boosting calculation, for least squares error.
GradientBoostedTrees - Class in org.apache.spark.mllib.tree
:: Experimental :: A class that implements Stochastic Gradient Boosting for regression and binary classification.
GradientBoostedTrees(BoostingStrategy) - Constructor for class org.apache.spark.mllib.tree.GradientBoostedTrees
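A hedged usage sketch (trainingData: RDD[LabeledPoint] is assumed, not defined here), using the defaultParams helper on BoostingStrategy:

    import org.apache.spark.mllib.tree.GradientBoostedTrees
    import org.apache.spark.mllib.tree.configuration.BoostingStrategy

    val boostingStrategy = BoostingStrategy.defaultParams("Regression")
    boostingStrategy.numIterations = 10   // number of trees in the ensemble
    val model = GradientBoostedTrees.train(trainingData, boostingStrategy)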
 
GradientBoostedTreesModel - Class in org.apache.spark.mllib.tree.model
:: Experimental :: Represents a gradient boosted trees model.
GradientBoostedTreesModel(Enumeration.Value, DecisionTreeModel[], double[]) - Constructor for class org.apache.spark.mllib.tree.model.GradientBoostedTreesModel
 
GradientDescent - Class in org.apache.spark.mllib.optimization
Class used to solve an optimization problem using Gradient Descent.
GradientDescent(Gradient, Updater) - Constructor for class org.apache.spark.mllib.optimization.GradientDescent
 
Graph<VD,ED> - Class in org.apache.spark.graphx
The Graph abstractly represents a graph with arbitrary objects associated with vertices and edges.
graph() - Method in class org.apache.spark.streaming.Checkpoint
 
graph() - Method in class org.apache.spark.streaming.dstream.DStream
 
graph() - Method in class org.apache.spark.streaming.StreamingContext
 
GraphGenerators - Class in org.apache.spark.graphx.util
A collection of graph generating functions.
GraphGenerators() - Constructor for class org.apache.spark.graphx.util.GraphGenerators
 
GraphImpl<VD,ED> - Class in org.apache.spark.graphx.impl
An implementation of Graph to support computation on graphs.
graphite() - Method in class org.apache.spark.metrics.sink.GraphiteSink
 
GRAPHITE_DEFAULT_PERIOD() - Method in class org.apache.spark.metrics.sink.GraphiteSink
 
GRAPHITE_DEFAULT_PREFIX() - Method in class org.apache.spark.metrics.sink.GraphiteSink
 
GRAPHITE_DEFAULT_UNIT() - Method in class org.apache.spark.metrics.sink.GraphiteSink
 
GRAPHITE_KEY_HOST() - Method in class org.apache.spark.metrics.sink.GraphiteSink
 
GRAPHITE_KEY_PERIOD() - Method in class org.apache.spark.metrics.sink.GraphiteSink
 
GRAPHITE_KEY_PORT() - Method in class org.apache.spark.metrics.sink.GraphiteSink
 
GRAPHITE_KEY_PREFIX() - Method in class org.apache.spark.metrics.sink.GraphiteSink
 
GRAPHITE_KEY_UNIT() - Method in class org.apache.spark.metrics.sink.GraphiteSink
 
GraphiteSink - Class in org.apache.spark.metrics.sink
 
GraphiteSink(Properties, MetricRegistry, SecurityManager) - Constructor for class org.apache.spark.metrics.sink.GraphiteSink
 
GraphKryoRegistrator - Class in org.apache.spark.graphx
Registers GraphX classes with Kryo for improved performance.
GraphKryoRegistrator() - Constructor for class org.apache.spark.graphx.GraphKryoRegistrator
 
GraphLoader - Class in org.apache.spark.graphx
Provides utilities for loading Graphs from files.
GraphLoader() - Constructor for class org.apache.spark.graphx.GraphLoader
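A minimal sketch (assuming sc and a hypothetical edge-list file of "srcId dstId" pairs, one per line):

    import org.apache.spark.graphx.GraphLoader
    // The path below is a placeholder, not a real file.
    val graph = GraphLoader.edgeListFile(sc, "hdfs://.../edges.txt")
    println(graph.numVertices)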
 
GraphOps<VD,ED> - Class in org.apache.spark.graphx
Contains additional functionality for Graph.
GraphOps(Graph<VD, ED>, ClassTag<VD>, ClassTag<ED>) - Constructor for class org.apache.spark.graphx.GraphOps
 
graphToGraphOps(Graph<VD, ED>, ClassTag<VD>, ClassTag<ED>) - Static method in class org.apache.spark.graphx.Graph
Implicitly extracts the GraphOps member from a graph.
GraphXUtils - Class in org.apache.spark.graphx
 
GraphXUtils() - Constructor for class org.apache.spark.graphx.GraphXUtils
 
greater(Duration) - Method in class org.apache.spark.streaming.Duration
 
greater(Time) - Method in class org.apache.spark.streaming.Time
 
greaterEq(Duration) - Method in class org.apache.spark.streaming.Duration
 
greaterEq(Time) - Method in class org.apache.spark.streaming.Time
 
GreaterThan - Class in org.apache.spark.sql.sources
 
GreaterThan(String, Object) - Constructor for class org.apache.spark.sql.sources.GreaterThan
 
GreaterThanOrEqual - Class in org.apache.spark.sql.sources
 
GreaterThanOrEqual(String, Object) - Constructor for class org.apache.spark.sql.sources.GreaterThanOrEqual
 
gridGraph(SparkContext, int, int) - Static method in class org.apache.spark.graphx.util.GraphGenerators
Create a rows x cols grid graph in which each vertex is connected to its row+1 and col+1 neighbors.
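For example (assuming sc), a 5 x 5 grid has 25 vertices and 40 edges:

    import org.apache.spark.graphx.util.GraphGenerators
    val grid = GraphGenerators.gridGraph(sc, 5, 5)
    // edges = rows*(cols-1) + cols*(rows-1) = 20 + 20 = 40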
groupArr() - Method in class org.apache.spark.rdd.PartitionCoalescer
 
groupBy(Function<T, U>) - Method in interface org.apache.spark.api.java.JavaRDDLike
Return an RDD of grouped elements.
groupBy(Function<T, U>, int) - Method in interface org.apache.spark.api.java.JavaRDDLike
Return an RDD of grouped elements.
groupBy(Function1<T, K>, ClassTag<K>) - Method in class org.apache.spark.rdd.RDD
Return an RDD of grouped items.
groupBy(Function1<T, K>, int, ClassTag<K>) - Method in class org.apache.spark.rdd.RDD
Return an RDD of grouped elements.
groupBy(Function1<T, K>, Partitioner, ClassTag<K>, Ordering<K>) - Method in class org.apache.spark.rdd.RDD
Return an RDD of grouped items.
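A short sketch (assuming sc) of the Scala overload:

    val byParity = sc.parallelize(1 to 10).groupBy(x => x % 2)
    // RDD[(Int, Iterable[Int])]: (0, evens) and (1, odds)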
groupBy(Seq<Expression>, Seq<Expression>) - Method in class org.apache.spark.sql.SchemaRDD
Performs a grouping followed by an aggregation.
groupByKey(Partitioner) - Method in class org.apache.spark.api.java.JavaPairRDD
Group the values for each key in the RDD into a single sequence.
groupByKey(int) - Method in class org.apache.spark.api.java.JavaPairRDD
Group the values for each key in the RDD into a single sequence.
groupByKey() - Method in class org.apache.spark.api.java.JavaPairRDD
Group the values for each key in the RDD into a single sequence.
groupByKey(Partitioner) - Method in class org.apache.spark.rdd.PairRDDFunctions
Group the values for each key in the RDD into a single sequence.
groupByKey(int) - Method in class org.apache.spark.rdd.PairRDDFunctions
Group the values for each key in the RDD into a single sequence.
groupByKey() - Method in class org.apache.spark.rdd.PairRDDFunctions
Group the values for each key in the RDD into a single sequence.
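A minimal sketch (assuming sc); note that for aggregations such as sums, reduceByKey is usually preferable, since it combines values map-side before shuffling:

    import org.apache.spark.SparkContext._   // pair RDD implicits

    val pairs = sc.parallelize(Seq(("a", 1), ("a", 2), ("b", 3)))
    val grouped = pairs.groupByKey()         // RDD[(String, Iterable[Int])]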
groupByKey() - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
Return a new DStream by applying groupByKey to each RDD.
groupByKey(int) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
Return a new DStream by applying groupByKey to each RDD.
groupByKey(Partitioner) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
Return a new DStream by applying groupByKey on each RDD of this DStream.
groupByKey() - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
Return a new DStream by applying groupByKey to each RDD.
groupByKey(int) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
Return a new DStream by applying groupByKey to each RDD.
groupByKey(Partitioner) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
Return a new DStream by applying groupByKey on each RDD.
groupByKeyAndWindow(Duration) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
Return a new DStream by applying groupByKey over a sliding window.
groupByKeyAndWindow(Duration, Duration) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
Return a new DStream by applying groupByKey over a sliding window.
groupByKeyAndWindow(Duration, Duration, int) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
Return a new DStream by applying groupByKey over a sliding window on this DStream.
groupByKeyAndWindow(Duration, Duration, Partitioner) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
Return a new DStream by applying groupByKey over a sliding window on this DStream.
groupByKeyAndWindow(Duration) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
Return a new DStream by applying groupByKey over a sliding window.
groupByKeyAndWindow(Duration, Duration) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
Return a new DStream by applying groupByKey over a sliding window.
groupByKeyAndWindow(Duration, Duration, int) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
Return a new DStream by applying groupByKey over a sliding window on this DStream.
groupByKeyAndWindow(Duration, Duration, Partitioner) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
Return a new DStream by applying groupByKey over a sliding window on this DStream.
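A hedged sketch (pairStream: DStream[(String, Int)] is assumed; both durations must be multiples of the batch interval):

    import org.apache.spark.streaming.Seconds
    import org.apache.spark.streaming.StreamingContext._   // pair DStream implicits

    // 30-second windows, recomputed every 10 seconds
    val windowed = pairStream.groupByKeyAndWindow(Seconds(30), Seconds(10))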
groupByResultToJava(RDD<Tuple2<K, Iterable<T>>>, ClassTag<K>) - Static method in class org.apache.spark.api.java.JavaPairRDD
 
GroupedCountEvaluator<T> - Class in org.apache.spark.partial
An ApproximateEvaluator for counts by key.
GroupedCountEvaluator(int, double, ClassTag<T>) - Constructor for class org.apache.spark.partial.GroupedCountEvaluator
 
groupEdges(Function2<ED, ED, ED>) - Method in class org.apache.spark.graphx.Graph
Merges multiple edges between two vertices into a single edge.
groupEdges(Function2<ED, ED, ED>) - Method in class org.apache.spark.graphx.impl.EdgePartition
Merge all the edges with the same src and dest IDs into a single edge using the merge function.
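A hedged sketch (graph: Graph[Int, Double] is assumed); groupEdges expects parallel edges to be co-located, so partition the graph first:

    import org.apache.spark.graphx.PartitionStrategy
    val merged = graph.partitionBy(PartitionStrategy.EdgePartition2D)
                      .groupEdges((a, b) => a + b)   // sum duplicate edge weights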
groupEdges(Function2<ED, ED, ED>) - Method in class org.apache.spark.graphx.impl.GraphImpl
 
GroupedMeanEvaluator<T> - Class in org.apache.spark.partial
An ApproximateEvaluator for means by key.
GroupedMeanEvaluator(int, double) - Constructor for class org.apache.spark.partial.GroupedMeanEvaluator
 
GroupedSumEvaluator<T> - Class in org.apache.spark.partial
An ApproximateEvaluator for sums by key.
GroupedSumEvaluator(int, double) - Constructor for class org.apache.spark.partial.GroupedSumEvaluator
 
groupHash() - Method in class org.apache.spark.rdd.PartitionCoalescer
 
groupId() - Method in class org.apache.spark.scheduler.JobGroupCancelled
 
groupingExpressions() - Method in class org.apache.spark.sql.execution.Aggregate
 
groupingExpressions() - Method in class org.apache.spark.sql.execution.GeneratedAggregate
 
groupWith(JavaPairRDD<K, W>) - Method in class org.apache.spark.api.java.JavaPairRDD
Alias for cogroup.
groupWith(JavaPairRDD<K, W1>, JavaPairRDD<K, W2>) - Method in class org.apache.spark.api.java.JavaPairRDD
Alias for cogroup.
groupWith(JavaPairRDD<K, W1>, JavaPairRDD<K, W2>, JavaPairRDD<K, W3>) - Method in class org.apache.spark.api.java.JavaPairRDD
Alias for cogroup.
groupWith(RDD<Tuple2<K, W>>) - Method in class org.apache.spark.rdd.PairRDDFunctions
Alias for cogroup.
groupWith(RDD<Tuple2<K, W1>>, RDD<Tuple2<K, W2>>) - Method in class org.apache.spark.rdd.PairRDDFunctions
Alias for cogroup.
groupWith(RDD<Tuple2<K, W1>>, RDD<Tuple2<K, W2>>, RDD<Tuple2<K, W3>>) - Method in class org.apache.spark.rdd.PairRDDFunctions
Alias for cogroup.
groupWriter() - Method in class org.apache.spark.sql.parquet.TestGroupWriteSupport
 
GrowableAccumulableParam<R,T> - Class in org.apache.spark
 
GrowableAccumulableParam(Function1<R, Growable<T>>, ClassTag<R>) - Constructor for class org.apache.spark.GrowableAccumulableParam
 

H

hadoopConfiguration() - Method in class org.apache.spark.api.java.JavaSparkContext
Returns the Hadoop configuration used for the Hadoop code (e.g. file systems) that we reuse.
hadoopConfiguration() - Method in class org.apache.spark.SparkContext
A default Hadoop Configuration for the Hadoop code (e.g. file systems) that we reuse.
hadoopFile(String, Class<F>, Class<K>, Class<V>, int) - Method in class org.apache.spark.api.java.JavaSparkContext
Get an RDD for a Hadoop file with an arbitrary InputFormat.
hadoopFile(String, Class<F>, Class<K>, Class<V>) - Method in class org.apache.spark.api.java.JavaSparkContext
Get an RDD for a Hadoop file with an arbitrary InputFormat.
hadoopFile(String, Class<? extends InputFormat<K, V>>, Class<K>, Class<V>, int) - Method in class org.apache.spark.SparkContext
Get an RDD for a Hadoop file with an arbitrary InputFormat.
hadoopFile(String, int, ClassTag<K>, ClassTag<V>, ClassTag<F>) - Method in class org.apache.spark.SparkContext
Smarter version of hadoopFile() that uses class tags to figure out the classes of keys, values and the InputFormat so that users don't need to pass them directly.
hadoopFile(String, ClassTag<K>, ClassTag<V>, ClassTag<F>) - Method in class org.apache.spark.SparkContext
Smarter version of hadoopFile() that uses class tags to figure out the classes of keys, values and the InputFormat so that users don't need to pass them directly.
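A minimal sketch (assuming sc) of the class-tag overload reading a plain text file through the old MapReduce API:

    import org.apache.hadoop.io.{LongWritable, Text}
    import org.apache.hadoop.mapred.TextInputFormat

    // The path below is a placeholder, not a real file.
    val lines = sc
      .hadoopFile[LongWritable, Text, TextInputFormat]("hdfs://.../input.txt")
      .map(_._2.toString)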
hadoopFiles() - Method in class org.apache.spark.streaming.dstream.FileInputDStream.FileInputDStreamCheckpointData
 
hadoopJobMetadata() - Method in class org.apache.spark.SparkEnv
 
HadoopPartition - Class in org.apache.spark.rdd
A Spark split class that wraps around a Hadoop InputSplit.
HadoopPartition(int, int, InputSplit) - Constructor for class org.apache.spark.rdd.HadoopPartition
 
hadoopRDD(JobConf, Class<F>, Class<K>, Class<V>, int) - Method in class org.apache.spark.api.java.JavaSparkContext
Get an RDD for a Hadoop-readable dataset from a Hadoop JobConf giving its InputFormat and any other necessary info (e.g. file name for a filesystem-based dataset, table name for HyperTable).
hadoopRDD(JobConf, Class<F>, Class<K>, Class<V>) - Method in class org.apache.spark.api.java.JavaSparkContext
Get an RDD for a Hadoop-readable dataset from a Hadoop JobConf giving its InputFormat and any other necessary info (e.g. file name for a filesystem-based dataset, table name for HyperTable).
HadoopRDD<K,V> - Class in org.apache.spark.rdd
:: DeveloperApi :: An RDD that provides core functionality for reading data stored in Hadoop (e.g., files in HDFS, sources in HBase, or S3), using the older MapReduce API (org.apache.hadoop.mapred).
HadoopRDD(SparkContext, Broadcast<SerializableWritable<Configuration>>, Option<Function1<JobConf, BoxedUnit>>, Class<? extends InputFormat<K, V>>, Class<K>, Class<V>, int) - Constructor for class org.apache.spark.rdd.HadoopRDD
 
HadoopRDD(SparkContext, JobConf, Class<? extends InputFormat<K, V>>, Class<K>, Class<V>, int) - Constructor for class org.apache.spark.rdd.HadoopRDD
 
hadoopRDD(JobConf, Class<? extends InputFormat<K, V>>, Class<K>, Class<V>, int) - Method in class org.apache.spark.SparkContext
Get an RDD for a Hadoop-readable dataset from a Hadoop JobConf given its InputFormat and other necessary info (e.g. file name for a filesystem-based dataset, table name for HyperTable).
HadoopRDD.HadoopMapPartitionsWithSplitRDD<U,T> - Class in org.apache.spark.rdd
Analogous to MapPartitionsRDD, but passes in an InputSplit to the given function rather than the index of the partition.
HadoopRDD.HadoopMapPartitionsWithSplitRDD(RDD<T>, Function2<InputSplit, Iterator<T>, Iterator<U>>, boolean, ClassTag<U>, ClassTag<T>) - Constructor for class org.apache.spark.rdd.HadoopRDD.HadoopMapPartitionsWithSplitRDD
 
HadoopRDD.HadoopMapPartitionsWithSplitRDD$ - Class in org.apache.spark.rdd
 
HadoopRDD.HadoopMapPartitionsWithSplitRDD$() - Constructor for class org.apache.spark.rdd.HadoopRDD.HadoopMapPartitionsWithSplitRDD$
 
HadoopRDD.SplitInfoReflections - Class in org.apache.spark.rdd
 
HadoopRDD.SplitInfoReflections() - Constructor for class org.apache.spark.rdd.HadoopRDD.SplitInfoReflections
 
HadoopTableReader - Class in org.apache.spark.sql.hive
Helper class for scanning tables stored in Hadoop - e.g., to read Hive tables that reside in the data warehouse directory.
HadoopTableReader(Seq<Attribute>, MetastoreRelation, HiveContext, HiveConf) - Constructor for class org.apache.spark.sql.hive.HadoopTableReader
 
hammingLoss() - Method in class org.apache.spark.mllib.evaluation.MultilabelMetrics
Returns the Hamming loss.
handle() - Method in class org.apache.spark.rdd.ShuffleCoGroupSplitDep
 
handle(Signal) - Method in class org.apache.spark.util.SignalLoggerHandler
 
handleBeginEvent(Task<?>, TaskInfo) - Method in class org.apache.spark.scheduler.DAGScheduler
 
handleExecutorAdded(String, String) - Method in class org.apache.spark.scheduler.DAGScheduler
 
handleExecutorLost(String, boolean, Option<Object>) - Method in class org.apache.spark.scheduler.DAGScheduler
Responds to an executor being lost.
handleFailedTask(TaskSetManager, long, Enumeration.Value, TaskEndReason) - Method in class org.apache.spark.scheduler.TaskSchedulerImpl
 
handleFailedTask(long, Enumeration.Value, TaskEndReason) - Method in class org.apache.spark.scheduler.TaskSetManager
Marks the task as failed, re-adds it to the list of pending tasks, and notifies the DAG Scheduler.
handleGetTaskResult(TaskInfo) - Method in class org.apache.spark.scheduler.DAGScheduler
 
handleJobCancellation(int, String) - Method in class org.apache.spark.scheduler.DAGScheduler
 
handleJobCompletion(Job) - Method in class org.apache.spark.streaming.scheduler.JobSet
 
handleJobGroupCancelled(String) - Method in class org.apache.spark.scheduler.DAGScheduler
 
handleJobStart(Job) - Method in class org.apache.spark.streaming.scheduler.JobSet
 
handleJobSubmitted(int, RDD<?>, Function2<TaskContext, Iterator<Object>, ?>, int[], boolean, CallSite, JobListener, Properties) - Method in class org.apache.spark.scheduler.DAGScheduler
 
handleKillRequest(HttpServletRequest) - Method in class org.apache.spark.ui.jobs.StagesTab
 
handleStageCancellation(int) - Method in class org.apache.spark.scheduler.DAGScheduler
 
handleSuccessfulTask(TaskSetManager, long, DirectTaskResult<?>) - Method in class org.apache.spark.scheduler.TaskSchedulerImpl
 
handleSuccessfulTask(long, DirectTaskResult<?>) - Method in class org.apache.spark.scheduler.TaskSetManager
Marks the task as successful and notifies the DAGScheduler that a task has ended.
handleTaskCompletion(CompletionEvent) - Method in class org.apache.spark.scheduler.DAGScheduler
Responds to a task finishing.
handleTaskGettingResult(TaskSetManager, long) - Method in class org.apache.spark.scheduler.TaskSchedulerImpl
 
handleTaskGettingResult(long) - Method in class org.apache.spark.scheduler.TaskSetManager
Marks the task as getting its result and notifies the DAG scheduler.
handleTaskSetFailed(TaskSet, String) - Method in class org.apache.spark.scheduler.DAGScheduler
 
hasCompleted() - Method in class org.apache.spark.streaming.scheduler.JobSet
 
hasDstId() - Method in class org.apache.spark.graphx.impl.ReplicatedVertexView
 
hasExecutorsAliveOnHost(String) - Method in class org.apache.spark.scheduler.TaskSchedulerImpl
 
HasFeaturesCol - Interface in org.apache.spark.ml.param
 
HashAggregation() - Method in class org.apache.spark.sql.execution.SparkStrategies
 
hashCode() - Method in class org.apache.spark.graphx.EdgeDirection
 
hashCode() - Method in class org.apache.spark.HashPartitioner
 
hashCode() - Method in interface org.apache.spark.mllib.linalg.Vector
 
hashCode() - Method in interface org.apache.spark.Partition
 
hashCode() - Method in class org.apache.spark.RangePartitioner
 
hashCode() - Method in class org.apache.spark.rdd.CoGroupPartition
 
hashCode() - Method in class org.apache.spark.rdd.HadoopPartition
 
hashCode() - Method in class org.apache.spark.rdd.NewHadoopPartition
 
hashCode() - Method in class org.apache.spark.rdd.ParallelCollectionPartition
 
hashCode() - Method in class org.apache.spark.rdd.PartitionerAwareUnionRDDPartition
 
hashCode() - Method in class org.apache.spark.rdd.ShuffledRDDPartition
 
hashCode() - Method in class org.apache.spark.scheduler.InputFormatInfo
 
hashCode() - Method in class org.apache.spark.scheduler.SplitInfo
 
hashCode() - Method in class org.apache.spark.scheduler.Stage
 
hashCode() - Method in class org.apache.spark.sql.api.java.ArrayType
 
hashCode() - Method in class org.apache.spark.sql.api.java.DecimalType
 
hashCode() - Method in class org.apache.spark.sql.api.java.MapType
 
hashCode() - Method in class org.apache.spark.sql.api.java.Row
 
hashCode() - Method in class org.apache.spark.sql.api.java.StructField
 
hashCode() - Method in class org.apache.spark.sql.api.java.StructType
 
hashCode() - Method in class org.apache.spark.storage.BlockId
 
hashCode() - Method in class org.apache.spark.storage.BlockManagerId
 
hashCode() - Method in class org.apache.spark.storage.StorageLevel
 
HashedRelation - Interface in org.apache.spark.sql.execution.joins
Interface for a hashed relation by some key.
HashingTF - Class in org.apache.spark.ml.feature
:: AlphaComponent :: Maps a sequence of terms to their term frequencies using the hashing trick.
HashingTF() - Constructor for class org.apache.spark.ml.feature.HashingTF
 
HashingTF - Class in org.apache.spark.mllib.feature
:: Experimental :: Maps a sequence of terms to their term frequencies using the hashing trick.
HashingTF(int) - Constructor for class org.apache.spark.mllib.feature.HashingTF
 
HashingTF() - Constructor for class org.apache.spark.mllib.feature.HashingTF
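A short sketch of the mllib variant on a single document:

    import org.apache.spark.mllib.feature.HashingTF

    val tf = new HashingTF(1000)                  // 1000 feature buckets
    val vec = tf.transform(Seq("spark", "hashing", "trick", "spark"))
    val bucket = tf.indexOf("spark")              // hash bucket for a term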
 
HashJoin - Interface in org.apache.spark.sql.execution.joins
 
hashJoin(Iterator<Row>, HashedRelation) - Method in interface org.apache.spark.sql.execution.joins.HashJoin
 
HashJoin() - Method in class org.apache.spark.sql.execution.SparkStrategies
 
hasHostAliveOnRack(String) - Method in class org.apache.spark.scheduler.TaskSchedulerImpl
 
HashOuterJoin - Class in org.apache.spark.sql.execution.joins
:: DeveloperApi :: Performs a hash based outer join for two child relations by shuffling the data using the join keys.
HashOuterJoin(Seq<Expression>, Seq<Expression>, JoinType, Option<Expression>, SparkPlan, SparkPlan) - Constructor for class org.apache.spark.sql.execution.joins.HashOuterJoin
 
HashPartitioner - Class in org.apache.spark
A Partitioner that implements hash-based partitioning using Java's Object.hashCode.
HashPartitioner(int) - Constructor for class org.apache.spark.HashPartitioner
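For example (pairs: RDD[(String, Int)] is assumed), each key k lands in partition k.hashCode mod numPartitions, made non-negative:

    import org.apache.spark.HashPartitioner
    import org.apache.spark.SparkContext._   // pair RDD implicits

    val partitioned = pairs.partitionBy(new HashPartitioner(8))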
 
HasInputCol - Interface in org.apache.spark.ml.param
 
HasLabelCol - Interface in org.apache.spark.ml.param
 
HasMaxIter - Interface in org.apache.spark.ml.param
 
hasNext() - Method in class org.apache.spark.InterruptibleIterator
 
hasNext() - Method in class org.apache.spark.rdd.PartitionCoalescer.LocationIterator
 
hasNext() - Method in class org.apache.spark.sql.columnar.BasicColumnAccessor
 
hasNext() - Method in interface org.apache.spark.sql.columnar.ColumnAccessor
 
hasNext() - Method in class org.apache.spark.sql.columnar.compression.BooleanBitSet.Decoder
 
hasNext() - Method in interface org.apache.spark.sql.columnar.compression.CompressibleColumnAccessor
 
hasNext() - Method in interface org.apache.spark.sql.columnar.compression.Decoder
 
hasNext() - Method in class org.apache.spark.sql.columnar.compression.DictionaryEncoding.Decoder
 
hasNext() - Method in class org.apache.spark.sql.columnar.compression.IntDelta.Decoder
 
hasNext() - Method in class org.apache.spark.sql.columnar.compression.LongDelta.Decoder
 
hasNext() - Method in class org.apache.spark.sql.columnar.compression.PassThrough.Decoder
 
hasNext() - Method in class org.apache.spark.sql.columnar.compression.RunLengthEncoding.Decoder
 
hasNext() - Method in interface org.apache.spark.sql.columnar.NullableColumnAccessor
 
hasNext() - Method in class org.apache.spark.storage.ShuffleBlockFetcherIterator
 
hasNext() - Method in class org.apache.spark.streaming.util.WriteAheadLogReader
 
hasNext() - Method in class org.apache.spark.util.CompletionIterator
 
hasNext() - Method in class org.apache.spark.util.NextIterator
 
hasNext() - Method in class org.apache.spark.util.random.GapSamplingIterator
 
hasNext() - Method in class org.apache.spark.util.random.GapSamplingReplacementIterator
 
HasOutputCol - Interface in org.apache.spark.ml.param
 
HasPredictionCol - Interface in org.apache.spark.ml.param
 
HasRegParam - Interface in org.apache.spark.ml.param
 
hasRootAsShutdownDeleteDir(File) - Static method in class org.apache.spark.util.Utils
 
hasRootAsShutdownDeleteDir(TachyonFile) - Static method in class org.apache.spark.util.Utils
 
HasScoreCol - Interface in org.apache.spark.ml.param
 
hasShutdownDeleteDir(File) - Static method in class org.apache.spark.util.Utils
 
hasShutdownDeleteTachyonDir(TachyonFile) - Static method in class org.apache.spark.util.Utils
 
hasSrcId() - Method in class org.apache.spark.graphx.impl.ReplicatedVertexView
 
hasStarted() - Method in class org.apache.spark.streaming.scheduler.JobSet
 
HasThreshold - Interface in org.apache.spark.ml.param
 
hasUnallocatedBlocks() - Method in class org.apache.spark.streaming.scheduler.ReceiverTracker
Check if any blocks are left to be processed.
hasUnallocatedReceivedBlocks() - Method in class org.apache.spark.streaming.scheduler.ReceivedBlockTracker
Check if any blocks are left to be allocated to batches.
HDFSCacheTaskLocation - Class in org.apache.spark.scheduler
A location on a host that is cached by HDFS.
HDFSCacheTaskLocation(String) - Constructor for class org.apache.spark.scheduler.HDFSCacheTaskLocation
 
HdfsUtils - Class in org.apache.spark.streaming.util
 
HdfsUtils() - Constructor for class org.apache.spark.streaming.util.HdfsUtils
 
headerSparkPage(String, Function0<Seq<Node>>, SparkUITab, Option<Object>, Option<String>) - Static method in class org.apache.spark.ui.UIUtils
Returns a Spark page with correctly formatted headers.
headerTabs() - Method in class org.apache.spark.ui.WebUITab
Get a list of header tabs from the parent UI.
Heartbeat - Class in org.apache.spark
A heartbeat from executors to the driver.
Heartbeat(String, Tuple2<Object, TaskMetrics>[], BlockManagerId) - Constructor for class org.apache.spark.Heartbeat
 
HeartbeatReceiver - Class in org.apache.spark
Lives in the driver to receive heartbeats from executors.
HeartbeatReceiver(TaskScheduler) - Constructor for class org.apache.spark.HeartbeatReceiver
 
HeartbeatResponse - Class in org.apache.spark
 
HeartbeatResponse(boolean) - Constructor for class org.apache.spark.HeartbeatResponse
 
hiccups() - Method in class org.apache.spark.streaming.receiver.ActorReceiver.Supervisor
 
high() - Method in class org.apache.spark.partial.BoundedDouble
 
HighlyCompressedMapStatus - Class in org.apache.spark.scheduler
A MapStatus implementation that only stores the average size of non-empty blocks, plus a bitmap for tracking which blocks are empty.
highSplit() - Method in class org.apache.spark.mllib.tree.model.Bin
 
HingeGradient - Class in org.apache.spark.mllib.optimization
:: DeveloperApi :: Compute gradient and loss for a Hinge loss function, as used in SVM binary classification.
HingeGradient() - Constructor for class org.apache.spark.mllib.optimization.HingeGradient
 
histogram(int) - Method in class org.apache.spark.api.java.JavaDoubleRDD
Compute a histogram of the data using bucketCount number of buckets evenly spaced between the minimum and maximum of the RDD.
histogram(double[]) - Method in class org.apache.spark.api.java.JavaDoubleRDD
Compute a histogram using the provided buckets.
histogram(Double[], boolean) - Method in class org.apache.spark.api.java.JavaDoubleRDD
 
histogram(int) - Method in class org.apache.spark.rdd.DoubleRDDFunctions
Compute a histogram of the data using bucketCount number of buckets evenly spaced between the minimum and maximum of the RDD.
histogram(double[], boolean) - Method in class org.apache.spark.rdd.DoubleRDDFunctions
Compute a histogram using the provided buckets.
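A worked sketch (assuming sc): with two even buckets over [min, max], the last bucket is inclusive on the right:

    import org.apache.spark.SparkContext._   // double RDD implicits

    val values = sc.parallelize(Seq(1.0, 2.5, 3.5, 4.0))
    val (buckets, counts) = values.histogram(2)
    // buckets: Array(1.0, 2.5, 4.0); counts: Array(1, 3)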
hiveContext() - Method in class org.apache.spark.sql.hive.execution.AddFile
 
hiveContext() - Method in class org.apache.spark.sql.hive.execution.AddJar
 
hiveContext() - Method in class org.apache.spark.sql.hive.execution.AnalyzeTable
 
hiveContext() - Method in class org.apache.spark.sql.hive.execution.DropTable
 
HiveContext - Class in org.apache.spark.sql.hive
An instance of the Spark SQL execution engine that integrates with data stored in Hive.
HiveContext(SparkContext) - Constructor for class org.apache.spark.sql.hive.HiveContext
 
hiveContext() - Method in interface org.apache.spark.sql.hive.HiveStrategies
 
hiveDevHome() - Method in class org.apache.spark.sql.hive.test.TestHiveContext
The location of the Hive source code.
hiveFilesTemp() - Method in class org.apache.spark.sql.hive.test.TestHiveContext
 
HiveFunctionRegistry - Class in org.apache.spark.sql.hive
 
HiveFunctionRegistry() - Constructor for class org.apache.spark.sql.hive.HiveFunctionRegistry
 
HiveFunctionWrapper - Class in org.apache.spark.sql.hive
This class provides UDF creation, and also UDF instance serialization and deserialization across process boundaries.
HiveFunctionWrapper(String) - Constructor for class org.apache.spark.sql.hive.HiveFunctionWrapper
 
HiveFunctionWrapper() - Constructor for class org.apache.spark.sql.hive.HiveFunctionWrapper
 
HiveGenericUdaf - Class in org.apache.spark.sql.hive
 
HiveGenericUdaf(HiveFunctionWrapper, Seq<Expression>) - Constructor for class org.apache.spark.sql.hive.HiveGenericUdaf
 
HiveGenericUdf - Class in org.apache.spark.sql.hive
 
HiveGenericUdf(HiveFunctionWrapper, Seq<Expression>) - Constructor for class org.apache.spark.sql.hive.HiveGenericUdf
 
HiveGenericUdtf - Class in org.apache.spark.sql.hive
Converts a Hive Generic User Defined Table Generating Function (UDTF) to a Generator.
HiveGenericUdtf(HiveFunctionWrapper, Seq<String>, Seq<Expression>) - Constructor for class org.apache.spark.sql.hive.HiveGenericUdtf
 
hiveHome() - Method in class org.apache.spark.sql.hive.test.TestHiveContext
The location of the compiled Hive distribution.
HiveInspectors - Interface in org.apache.spark.sql.hive
 
HiveInspectors.typeInfoConversions - Class in org.apache.spark.sql.hive
 
HiveInspectors.typeInfoConversions(DataType) - Constructor for class org.apache.spark.sql.hive.HiveInspectors.typeInfoConversions
 
HiveMetastoreCatalog - Class in org.apache.spark.sql.hive
 
HiveMetastoreCatalog(HiveContext) - Constructor for class org.apache.spark.sql.hive.HiveMetastoreCatalog
 
HiveMetastoreCatalog.CreateTables - Class in org.apache.spark.sql.hive
 
HiveMetastoreCatalog.CreateTables() - Constructor for class org.apache.spark.sql.hive.HiveMetastoreCatalog.CreateTables
Creates any tables required for query execution.
HiveMetastoreCatalog.PreInsertionCasts - Class in org.apache.spark.sql.hive
 
HiveMetastoreCatalog.PreInsertionCasts() - Constructor for class org.apache.spark.sql.hive.HiveMetastoreCatalog.PreInsertionCasts
Casts input data to correct data types according to table definition before inserting into that table.
HiveMetastoreTypes - Class in org.apache.spark.sql.hive
:: DeveloperApi :: Provides conversions between Spark SQL data types and Hive Metastore types.
HiveMetastoreTypes() - Constructor for class org.apache.spark.sql.hive.HiveMetastoreTypes
 
hivePlanner() - Method in class org.apache.spark.sql.hive.HiveContext
 
hiveql(String) - Method in class org.apache.spark.sql.hive.HiveContext
 
HiveQl - Class in org.apache.spark.sql.hive
Provides a mapping from HiveQL statements to catalyst logical plans and expression trees.
HiveQl() - Constructor for class org.apache.spark.sql.hive.HiveQl
 
HiveQl.ParseException - Exception in org.apache.spark.sql.hive
Throws an error if this is not equal to other.
HiveQl.ParseException(String, Throwable) - Constructor for exception org.apache.spark.sql.hive.HiveQl.ParseException
 
HiveQl.SemanticException - Exception in org.apache.spark.sql.hive
 
HiveQl.SemanticException(String) - Constructor for exception org.apache.spark.sql.hive.HiveQl.SemanticException
 
HiveQl.Token$ - Class in org.apache.spark.sql.hive
Extractor for matching Hive's AST Tokens.
HiveQl.Token$() - Constructor for class org.apache.spark.sql.hive.HiveQl.Token$
 
HiveQl.TransformableNode - Class in org.apache.spark.sql.hive
A set of implicit transformations that allow Hive ASTNodes to be rewritten by transformations similar to catalyst.trees.TreeNode.
HiveQl.TransformableNode(ASTNode) - Constructor for class org.apache.spark.sql.hive.HiveQl.TransformableNode
 
hiveQlPartitions() - Method in class org.apache.spark.sql.hive.MetastoreRelation
 
hiveQlTable() - Method in class org.apache.spark.sql.hive.MetastoreRelation
 
hiveQTestUtilTables() - Method in class org.apache.spark.sql.hive.test.TestHiveContext
 
HiveShim - Class in org.apache.spark.sql.hive
A compatibility layer for interacting with Hive version 0.13.1.
HiveShim() - Constructor for class org.apache.spark.sql.hive.HiveShim
 
HiveSimpleUdf - Class in org.apache.spark.sql.hive
 
HiveSimpleUdf(HiveFunctionWrapper, Seq<Expression>) - Constructor for class org.apache.spark.sql.hive.HiveSimpleUdf
 
HiveStrategies - Interface in org.apache.spark.sql.hive
 
HiveStrategies.DataSinks - Class in org.apache.spark.sql.hive
 
HiveStrategies.DataSinks() - Constructor for class org.apache.spark.sql.hive.HiveStrategies.DataSinks
 
HiveStrategies.HiveCommandStrategy - Class in org.apache.spark.sql.hive
 
HiveStrategies.HiveCommandStrategy(HiveContext) - Constructor for class org.apache.spark.sql.hive.HiveStrategies.HiveCommandStrategy
 
HiveStrategies.HiveTableScans - Class in org.apache.spark.sql.hive
 
HiveStrategies.HiveTableScans() - Constructor for class org.apache.spark.sql.hive.HiveStrategies.HiveTableScans
Retrieves data using a HiveTableScan.
HiveStrategies.ParquetConversion - Class in org.apache.spark.sql.hive
 
HiveStrategies.ParquetConversion() - Constructor for class org.apache.spark.sql.hive.HiveStrategies.ParquetConversion
:: Experimental :: Finds table scans that would use the Hive SerDe and replaces them with our own native parquet table scan operator.
HiveStrategies.ParquetConversion.LogicalPlanHacks - Class in org.apache.spark.sql.hive
 
HiveStrategies.ParquetConversion.LogicalPlanHacks(SchemaRDD) - Constructor for class org.apache.spark.sql.hive.HiveStrategies.ParquetConversion.LogicalPlanHacks
 
HiveStrategies.ParquetConversion.PhysicalPlanHacks - Class in org.apache.spark.sql.hive
 
HiveStrategies.ParquetConversion.PhysicalPlanHacks(SparkPlan) - Constructor for class org.apache.spark.sql.hive.HiveStrategies.ParquetConversion.PhysicalPlanHacks
 
HiveStrategies.Scripts - Class in org.apache.spark.sql.hive
 
HiveStrategies.Scripts() - Constructor for class org.apache.spark.sql.hive.HiveStrategies.Scripts
 
hiveString() - Method in class org.apache.spark.sql.hive.execution.DescribeHiveTableCommand
 
HiveTableScan - Class in org.apache.spark.sql.hive.execution
:: DeveloperApi :: The Hive table scan operator.
HiveTableScan(Seq<Attribute>, MetastoreRelation, Option<Expression>, HiveContext) - Constructor for class org.apache.spark.sql.hive.execution.HiveTableScan
 
HiveTableScans() - Method in interface org.apache.spark.sql.hive.HiveStrategies
 
HiveUdaf - Class in org.apache.spark.sql.hive
A wrapper for Hive functions that use the UDAF interface.
HiveUdaf(HiveFunctionWrapper, Seq<Expression>) - Constructor for class org.apache.spark.sql.hive.HiveUdaf
 
HiveUdafFunction - Class in org.apache.spark.sql.hive
 
HiveUdafFunction(HiveFunctionWrapper, Seq<Expression>, AggregateExpression, boolean) - Constructor for class org.apache.spark.sql.hive.HiveUdafFunction
 
HiveUdafFunction() - Constructor for class org.apache.spark.sql.hive.HiveUdafFunction
 
host() - Method in class org.apache.spark.metrics.sink.GraphiteSink
 
host() - Method in class org.apache.spark.scheduler.ExecutorAdded
 
host() - Method in class org.apache.spark.scheduler.ExecutorCacheTaskLocation
 
host() - Method in class org.apache.spark.scheduler.HDFSCacheTaskLocation
 
host() - Method in class org.apache.spark.scheduler.HostTaskLocation
 
host() - Method in class org.apache.spark.scheduler.TaskInfo
 
host() - Method in interface org.apache.spark.scheduler.TaskLocation
 
host() - Method in class org.apache.spark.scheduler.WorkerOffer
 
host() - Method in class org.apache.spark.storage.BlockManagerId
 
host() - Method in class org.apache.spark.streaming.scheduler.RegisterReceiver
 
hostLocation() - Method in class org.apache.spark.scheduler.SplitInfo
 
hostPort() - Method in class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.RegisterExecutor
 
hostPort() - Method in class org.apache.spark.storage.BlockManagerId
 
hostPort() - Method in class org.apache.spark.ui.exec.ExecutorSummaryInfo
 
HostTaskLocation - Class in org.apache.spark.scheduler
A location on a host.
HostTaskLocation(String) - Constructor for class org.apache.spark.scheduler.HostTaskLocation
 
hours() - Static method in class org.apache.spark.scheduler.StatsReportListener
 
hql(String) - Method in class org.apache.spark.sql.hive.api.java.JavaHiveContext
DEPRECATED: Use sql(...) instead.
hql(String) - Method in class org.apache.spark.sql.hive.HiveContext
 
htmlResponderToServlet(Function1<HttpServletRequest, Seq<Node>>) - Static method in class org.apache.spark.ui.JettyUtils
 
HTTP_BROADCAST() - Static method in class org.apache.spark.util.MetadataCleanerType
 
HttpBroadcast<T> - Class in org.apache.spark.broadcast
A Broadcast implementation that uses an HTTP server as the broadcast mechanism.
HttpBroadcast(T, boolean, long, ClassTag<T>) - Constructor for class org.apache.spark.broadcast.HttpBroadcast
 
HttpBroadcastFactory - Class in org.apache.spark.broadcast
A BroadcastFactory implementation that uses an HTTP server as the broadcast mechanism.
HttpBroadcastFactory() - Constructor for class org.apache.spark.broadcast.HttpBroadcastFactory
 
HttpFileServer - Class in org.apache.spark
 
HttpFileServer(SparkConf, SecurityManager, int) - Constructor for class org.apache.spark.HttpFileServer
 
httpFileServer() - Method in class org.apache.spark.SparkEnv
 
httpServer() - Method in class org.apache.spark.HttpFileServer
 
HttpServer - Class in org.apache.spark
An HTTP server for static content used to allow worker nodes to access JARs added to SparkContext as well as classes created by the interpreter when the user types in code.
HttpServer(SparkConf, File, SecurityManager, int, String) - Constructor for class org.apache.spark.HttpServer
 
HyperLogLogSerializer - Class in org.apache.spark.sql.execution
 
HyperLogLogSerializer() - Constructor for class org.apache.spark.sql.execution.HyperLogLogSerializer
 

I

i() - Method in class org.apache.spark.mllib.linalg.distributed.MatrixEntry
 
id() - Method in class org.apache.spark.Accumulable
 
id() - Method in interface org.apache.spark.api.java.JavaRDDLike
A unique ID for this RDD (within its SparkContext).
id() - Method in class org.apache.spark.broadcast.Broadcast
 
id() - Method in class org.apache.spark.mllib.tree.model.Node
 
id() - Method in class org.apache.spark.rdd.RDD
A unique ID for this RDD (within its SparkContext).
id() - Method in class org.apache.spark.scheduler.AccumulableInfo
 
id() - Method in class org.apache.spark.scheduler.Stage
 
id() - Method in class org.apache.spark.scheduler.TaskInfo
 
id() - Method in class org.apache.spark.scheduler.TaskSet
 
id() - Method in class org.apache.spark.storage.RDDInfo
 
id() - Method in class org.apache.spark.storage.TempLocalBlockId
 
id() - Method in class org.apache.spark.storage.TempShuffleBlockId
 
id() - Method in class org.apache.spark.storage.TestBlockId
 
id() - Method in class org.apache.spark.streaming.dstream.ReceiverInputDStream
This is a unique identifier for the receiver input stream.
id() - Method in class org.apache.spark.streaming.scheduler.Job
 
id() - Method in class org.apache.spark.ui.exec.ExecutorSummaryInfo
 
Identifiable - Interface in org.apache.spark.ml
Object with a unique id.
IDF - Class in org.apache.spark.mllib.feature
:: Experimental :: Inverse document frequency (IDF).
IDF(int) - Constructor for class org.apache.spark.mllib.feature.IDF
 
IDF() - Constructor for class org.apache.spark.mllib.feature.IDF
 
idf() - Method in class org.apache.spark.mllib.feature.IDF.DocumentFrequencyAggregator
Returns the current IDF vector.
idf() - Method in class org.apache.spark.mllib.feature.IDFModel
 
IDF.DocumentFrequencyAggregator - Class in org.apache.spark.mllib.feature
Document frequency aggregator.
IDF.DocumentFrequencyAggregator(int) - Constructor for class org.apache.spark.mllib.feature.IDF.DocumentFrequencyAggregator
 
IDF.DocumentFrequencyAggregator() - Constructor for class org.apache.spark.mllib.feature.IDF.DocumentFrequencyAggregator
 
IDFModel - Class in org.apache.spark.mllib.feature
:: Experimental :: Represents an IDF model that can transform term frequency vectors.
IDFModel(Vector) - Constructor for class org.apache.spark.mllib.feature.IDFModel
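A minimal TF-IDF sketch (termFreqs: RDD[Vector] is assumed, e.g. produced by HashingTF.transform):

    import org.apache.spark.mllib.feature.IDF

    val idfModel = new IDF(minDocFreq = 2).fit(termFreqs)
    val tfidf = idfModel.transform(termFreqs)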
 
IdGenerator - Class in org.apache.spark.util
A utility used to get a unique generation ID.
IdGenerator() - Constructor for class org.apache.spark.util.IdGenerator
 
idx() - Method in class org.apache.spark.mllib.rdd.SlidingRDDPartition
 
idx() - Method in class org.apache.spark.rdd.PartitionerAwareUnionRDDPartition
 
idx() - Method in class org.apache.spark.rdd.ShuffledRDDPartition
 
ifExists() - Method in class org.apache.spark.sql.hive.DropTable
 
ifExists() - Method in class org.apache.spark.sql.hive.execution.DropTable
 
Impurities - Class in org.apache.spark.mllib.tree.impurity
Factory for Impurity instances.
Impurities() - Constructor for class org.apache.spark.mllib.tree.impurity.Impurities
 
impurity() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
 
impurity() - Method in class org.apache.spark.mllib.tree.impl.DecisionTreeMetadata
 
Impurity - Interface in org.apache.spark.mllib.tree.impurity
:: Experimental :: Trait for calculating information gain.
impurity() - Method in class org.apache.spark.mllib.tree.model.InformationGainStats
 
impurity() - Method in class org.apache.spark.mllib.tree.model.Node
 
impurityAggregator() - Method in class org.apache.spark.mllib.tree.impl.DTStatsAggregator
ImpurityAggregator instance specifying the impurity type.
ImpurityAggregator - Class in org.apache.spark.mllib.tree.impurity
Interface for updating views of a vector of sufficient statistics, in order to compute impurity from a sample.
ImpurityAggregator(int) - Constructor for class org.apache.spark.mllib.tree.impurity.ImpurityAggregator
 
ImpurityCalculator - Class in org.apache.spark.mllib.tree.impurity
Stores statistics for one (node, feature, bin) for calculating impurity.
ImpurityCalculator(double[]) - Constructor for class org.apache.spark.mllib.tree.impurity.ImpurityCalculator
 
In() - Static method in class org.apache.spark.graphx.EdgeDirection
Edges arriving at a vertex.
IN() - Static method in class org.apache.spark.sql.hive.HiveQl
 
In - Class in org.apache.spark.sql.sources
 
In(String, Object[]) - Constructor for class org.apache.spark.sql.sources.In
 
increaseRunningTasks(int) - Method in class org.apache.spark.scheduler.Pool
 
incrementEpoch() - Method in class org.apache.spark.MapOutputTrackerMaster
 
inDegrees() - Method in class org.apache.spark.graphx.GraphOps
The in-degree of each vertex in the graph.
independence() - Method in class org.apache.spark.mllib.stat.test.ChiSqTest.NullHypothesis$
 
index() - Method in class org.apache.spark.graphx.impl.ShippableVertexPartition
 
index() - Method in class org.apache.spark.graphx.impl.VertexPartition
 
index() - Method in class org.apache.spark.graphx.impl.VertexPartitionBase
 
index(int, int) - Method in class org.apache.spark.mllib.linalg.DenseMatrix
 
index() - Method in class org.apache.spark.mllib.linalg.distributed.IndexedRow
 
index(int, int) - Method in interface org.apache.spark.mllib.linalg.Matrix
Return the index for the (i, j)-th element in the backing array.
index(int, int) - Method in class org.apache.spark.mllib.linalg.SparseMatrix
 
index() - Method in class org.apache.spark.mllib.rdd.RandomRDDPartition
 
index() - Method in class org.apache.spark.mllib.rdd.SlidingRDDPartition
 
index() - Method in class org.apache.spark.mllib.recommendation.ALS.BlockStats
 
index() - Method in interface org.apache.spark.Partition
Get the partition's index within its parent RDD
index() - Method in class org.apache.spark.rdd.BlockRDDPartition
 
index() - Method in class org.apache.spark.rdd.CartesianPartition
 
index() - Method in class org.apache.spark.rdd.CheckpointRDDPartition
 
index() - Method in class org.apache.spark.rdd.CoalescedRDDPartition
 
index() - Method in class org.apache.spark.rdd.CoGroupPartition
 
index() - Method in class org.apache.spark.rdd.HadoopPartition
 
index() - Method in class org.apache.spark.rdd.JdbcPartition
 
index() - Method in class org.apache.spark.rdd.NewHadoopPartition
 
index() - Method in class org.apache.spark.rdd.ParallelCollectionPartition
 
index() - Method in class org.apache.spark.rdd.PartitionerAwareUnionRDDPartition
 
index() - Method in class org.apache.spark.rdd.PartitionPruningRDDPartition
 
index() - Method in class org.apache.spark.rdd.PartitionwiseSampledRDDPartition
 
index() - Method in class org.apache.spark.rdd.SampledRDDPartition
 
index() - Method in class org.apache.spark.rdd.ShuffledRDDPartition
 
index() - Method in class org.apache.spark.rdd.UnionPartition
 
index() - Method in class org.apache.spark.rdd.ZippedPartitionsPartition
 
index() - Method in class org.apache.spark.rdd.ZippedWithIndexRDDPartition
 
index() - Method in class org.apache.spark.scheduler.TaskDescription
 
index() - Method in class org.apache.spark.scheduler.TaskInfo
 
index() - Method in class org.apache.spark.sql.parquet.CatalystArrayContainsNullConverter
 
index() - Method in class org.apache.spark.sql.parquet.CatalystArrayConverter
 
index() - Method in class org.apache.spark.sql.parquet.CatalystNativeArrayConverter
 
index() - Method in class org.apache.spark.sql.parquet.CatalystPrimitiveRowConverter
 
index() - Method in class org.apache.spark.streaming.rdd.WriteAheadLogBackedBlockRDDPartition
 
IndexedRow - Class in org.apache.spark.mllib.linalg.distributed
:: Experimental :: Represents a row of IndexedRowMatrix.
IndexedRow(long, Vector) - Constructor for class org.apache.spark.mllib.linalg.distributed.IndexedRow
 
IndexedRowMatrix - Class in org.apache.spark.mllib.linalg.distributed
:: Experimental :: Represents a row-oriented DistributedMatrix with indexed rows.
IndexedRowMatrix(RDD<IndexedRow>, long, int) - Constructor for class org.apache.spark.mllib.linalg.distributed.IndexedRowMatrix
 
IndexedRowMatrix(RDD<IndexedRow>) - Constructor for class org.apache.spark.mllib.linalg.distributed.IndexedRowMatrix
Alternative constructor leaving matrix dimensions to be determined automatically.
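A hedged construction sketch (row indices and values are illustrative; assumes an active SparkContext sc):

    import org.apache.spark.mllib.linalg.Vectors
    import org.apache.spark.mllib.linalg.distributed.{IndexedRow, IndexedRowMatrix}

    val rows = sc.parallelize(Seq(
      IndexedRow(0L, Vectors.dense(1.0, 2.0)),
      IndexedRow(5L, Vectors.dense(3.0, 4.0))))
    val mat = new IndexedRowMatrix(rows)  // dimensions determined automatically
    mat.numRows()                         // 6: the largest row index + 1
    mat.numCols()                         // 2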
indexOf(Object) - Method in class org.apache.spark.mllib.feature.HashingTF
Returns the index of the input term.
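A small usage sketch (feature count and terms are arbitrary):

    import org.apache.spark.mllib.feature.HashingTF

    val tf = new HashingTF(numFeatures = 1 << 10)
    tf.indexOf("spark")                        // hashed index in [0, 1024)
    tf.transform(Seq("spark", "rdd", "spark")) // sparse term-frequency vector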
indexSize() - Method in class org.apache.spark.graphx.impl.EdgePartition
The number of unique source vertices in the partition.
indexToLevel(int) - Static method in class org.apache.spark.mllib.tree.model.Node
Return the level of the tree that the given node is in.
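A quick sketch of the numbering scheme these helpers assume (heap-style and 1-based: root = 1, children of node i at 2i and 2i + 1, so the level is floor(log2(index))):

    import org.apache.spark.mllib.tree.model.Node

    Node.indexToLevel(1)  // 0: the root
    Node.indexToLevel(3)  // 1: right child of the root
    Node.indexToLevel(6)  // 2
    Node.isLeftChild(6)   // true: left children have even indices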
indices() - Method in class org.apache.spark.mllib.linalg.SparseVector
 
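A minimal sketch of a sparse vector and its index array (size and values are illustrative):

    import org.apache.spark.mllib.linalg.SparseVector

    // size 5, with non-zeros at positions 1 and 3
    val sv = new SparseVector(5, Array(1, 3), Array(0.5, 2.0))
    sv.indices  // Array(1, 3)
    sv.values   // Array(0.5, 2.0)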
IndirectTaskResult<T> - Class in org.apache.spark.scheduler
A reference to a DirectTaskResult that has been stored in the worker's BlockManager.
IndirectTaskResult(BlockId, int) - Constructor for class org.apache.spark.scheduler.IndirectTaskResult
 
inferSchema(RDD<String>, double, String) - Static method in class org.apache.spark.sql.json.JsonRDD
 
InformationGainStats - Class in org.apache.spark.mllib.tree.model
:: DeveloperApi :: Information gain statistics for each split
InformationGainStats(double, double, double, double, Predict, Predict) - Constructor for class org.apache.spark.mllib.tree.model.InformationGainStats
 
init(RDD<BaggedPoint<TreePoint>>, int, Option<String>, int, int) - Static method in class org.apache.spark.mllib.tree.impl.NodeIdCache
Initialize the node Id cache with initial node Id values.
init(Configuration, Map<String, String>, MessageType) - Method in class org.apache.spark.sql.parquet.RowReadSupport
 
init(Configuration) - Method in class org.apache.spark.sql.parquet.RowWriteSupport
 
init(Configuration) - Method in class org.apache.spark.sql.parquet.TestGroupWriteSupport
 
initFrom(Iterator<Tuple2<Object, VD>>, ClassTag<VD>) - Static method in class org.apache.spark.graphx.impl.VertexPartitionBase
Construct the constituents of a VertexPartitionBase from the given vertices, merging duplicate entries arbitrarily.
initFrom(Iterator<Tuple2<Object, VD>>, Function2<VD, VD, VD>, ClassTag<VD>) - Static method in class org.apache.spark.graphx.impl.VertexPartitionBase
Construct the constituents of a VertexPartitionBase from the given vertices, merging duplicate entries using mergeFunc.
INITIAL_ARRAY_SIZE() - Static method in class org.apache.spark.sql.parquet.CatalystArrayConverter
 
initialCheckpoint() - Method in class org.apache.spark.streaming.StreamingContext
 
initialHash() - Method in class org.apache.spark.rdd.PartitionCoalescer
 
initialize(boolean, SparkConf, SecurityManager) - Method in interface org.apache.spark.broadcast.BroadcastFactory
 
initialize(boolean, SparkConf, SecurityManager) - Static method in class org.apache.spark.broadcast.HttpBroadcast
 
initialize(boolean, SparkConf, SecurityManager) - Method in class org.apache.spark.broadcast.HttpBroadcastFactory
 
initialize(boolean, SparkConf, SecurityManager) - Method in class org.apache.spark.broadcast.TorrentBroadcastFactory
 
initialize() - Method in class org.apache.spark.HttpFileServer
 
initialize(InputSplit, TaskAttemptContext) - Method in class org.apache.spark.input.FixedLengthBinaryRecordReader
 
initialize(InputSplit, TaskAttemptContext) - Method in class org.apache.spark.input.StreamBasedRecordReader
 
initialize(InputSplit, TaskAttemptContext) - Method in class org.apache.spark.input.WholeTextFileRecordReader
 
initialize() - Method in class org.apache.spark.metrics.MetricsConfig
 
initialize(SchedulerBackend) - Method in class org.apache.spark.scheduler.TaskSchedulerImpl
 
initialize(int, String, boolean) - Method in class org.apache.spark.sql.columnar.BasicColumnBuilder
 
initialize() - Method in interface org.apache.spark.sql.columnar.ColumnAccessor
 
initialize(int, String, boolean) - Method in interface org.apache.spark.sql.columnar.ColumnBuilder
Initializes with an approximate lower bound on the expected number of elements in this column.
initialize() - Method in interface org.apache.spark.sql.columnar.compression.CompressibleColumnAccessor
 
initialize(int, String, boolean) - Method in interface org.apache.spark.sql.columnar.compression.CompressibleColumnBuilder
 
initialize() - Method in interface org.apache.spark.sql.columnar.NullableColumnAccessor
 
initialize(int, String, boolean) - Method in interface org.apache.spark.sql.columnar.NullableColumnBuilder
 
initialize(Configuration, Properties) - Method in class org.apache.spark.sql.hive.parquet.FakeParquetSerDe
 
initialize(String) - Method in class org.apache.spark.storage.BlockManager
Initializes the BlockManager with the given appId.
initialize(Time) - Method in class org.apache.spark.streaming.dstream.DStream
Initialize the DStream by setting the "zero" time, based on which the validity of future times is calculated.
initialize(String) - Method in class org.apache.spark.streaming.kinesis.KinesisRecordProcessor
The Kinesis Client Library calls this method during IRecordProcessor initialization.
initialize() - Method in class org.apache.spark.ui.SparkUI
Initialize all components of the server.
initialize() - Method in class org.apache.spark.ui.WebUI
Initialize all components of the server.
Initialized() - Static method in class org.apache.spark.rdd.CheckpointState
 
Initialized() - Method in class org.apache.spark.streaming.receiver.ReceiverSupervisor.ReceiverState
 
Initialized() - Method in class org.apache.spark.streaming.StreamingContext.StreamingContextState$
 
initializeIfNecessary() - Method in interface org.apache.spark.Logging
 
initializeLocalJobConfFunc(String, TableDesc, JobConf) - Static method in class org.apache.spark.sql.hive.HadoopTableReader
Curried.
initializeLogging() - Method in interface org.apache.spark.Logging
 
initialValue() - Method in class org.apache.spark.partial.PartialResult
 
initialValues() - Method in class org.apache.spark.sql.execution.AggregateEvaluation
 
initLocalProperties() - Method in class org.apache.spark.SparkContext
 
initNextRecordReader() - Method in class org.apache.spark.input.WholeCombineFileRecordReader
 
InLinkBlock - Class in org.apache.spark.mllib.recommendation
In-link information for a user (or product) block.
InLinkBlock(int[], Tuple2<int[], double[]>[][]) - Constructor for class org.apache.spark.mllib.recommendation.InLinkBlock
 
InMemoryColumnarTableScan - Class in org.apache.spark.sql.columnar
 
InMemoryColumnarTableScan(Seq<Attribute>, Seq<Expression>, InMemoryRelation) - Constructor for class org.apache.spark.sql.columnar.InMemoryColumnarTableScan
 
inMemoryPartitionPruning() - Method in interface org.apache.spark.sql.SQLConf
When set to true, partition pruning for in-memory columnar tables is enabled.
InMemoryRelation - Class in org.apache.spark.sql.columnar
 
InMemoryRelation(Seq<Attribute>, boolean, int, StorageLevel, SparkPlan, Option<String>, RDD<CachedBatch>, Statistics) - Constructor for class org.apache.spark.sql.columnar.InMemoryRelation
 
InMemoryScans() - Method in class org.apache.spark.sql.execution.SparkStrategies
 
InnerClosureFinder - Class in org.apache.spark.util
 
InnerClosureFinder(Set<Class<?>>) - Constructor for class org.apache.spark.util.InnerClosureFinder
 
innerJoin(EdgeRDD<ED2>, Function4<Object, Object, ED, ED2, ED3>, ClassTag<ED2>, ClassTag<ED3>) - Method in class org.apache.spark.graphx.EdgeRDD
Inner joins this EdgeRDD with another EdgeRDD, assuming both are partitioned using the same PartitionStrategy.
innerJoin(EdgePartition<ED2, ?>, Function4<Object, Object, ED, ED2, ED3>, ClassTag<ED2>, ClassTag<ED3>) - Method in class org.apache.spark.graphx.impl.EdgePartition
Apply f to all edges present in both this and other and return a new EdgePartition containing the resulting edges.
innerJoin(EdgeRDD<ED2>, Function4<Object, Object, ED, ED2, ED3>, ClassTag<ED2>, ClassTag<ED3>) - Method in class org.apache.spark.graphx.impl.EdgeRDDImpl
 
innerJoin(Self, Function3<Object, VD, U, VD2>, ClassTag<U>, ClassTag<VD2>) - Method in class org.apache.spark.graphx.impl.VertexPartitionBaseOps
Inner join another VertexPartition.
innerJoin(Iterator<Product2<Object, U>>, Function3<Object, VD, U, VD2>, ClassTag<U>, ClassTag<VD2>) - Method in class org.apache.spark.graphx.impl.VertexPartitionBaseOps
Inner join an iterator of messages.
innerJoin(RDD<Tuple2<Object, U>>, Function3<Object, VD, U, VD2>, ClassTag<U>, ClassTag<VD2>) - Method in class org.apache.spark.graphx.impl.VertexRDDImpl
 
innerJoin(RDD<Tuple2<Object, U>>, Function3<Object, VD, U, VD2>, ClassTag<U>, ClassTag<VD2>) - Method in class org.apache.spark.graphx.VertexRDD
Inner joins this VertexRDD with an RDD containing vertex attribute pairs.
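A hedged sketch of the curried Scala form (vertex data is made up; assumes an active SparkContext sc):

    import org.apache.spark.graphx._

    val verts  = VertexRDD(sc.parallelize(Seq((1L, "alice"), (2L, "bob"))))
    val scores = sc.parallelize(Seq((1L, 10), (3L, 30)))

    // Only vertex 1 appears on both sides, so only it survives the join.
    val joined = verts.innerJoin(scores) { (vid, name, score) => (name, score) }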
innerJoinKeepLeft(Iterator<Product2<Object, VD>>) - Method in class org.apache.spark.graphx.impl.VertexPartitionBaseOps
Similar to innerJoin, but vertices from the left side that don't appear in iter will remain in the partition, hidden by the bitmask.
innerZipJoin(VertexRDD<U>, Function3<Object, VD, U, VD2>, ClassTag<U>, ClassTag<VD2>) - Method in class org.apache.spark.graphx.impl.VertexRDDImpl
 
innerZipJoin(VertexRDD<U>, Function3<Object, VD, U, VD2>, ClassTag<U>, ClassTag<VD2>) - Method in class org.apache.spark.graphx.VertexRDD
Efficiently inner joins this VertexRDD with another VertexRDD sharing the same index.
input() - Method in class org.apache.spark.sql.hive.execution.ScriptTransformation
 
INPUT() - Static method in class org.apache.spark.ui.ToolTips
 
inputBytes() - Method in class org.apache.spark.ui.jobs.UIData.ExecutorSummary
 
inputBytes() - Method in class org.apache.spark.ui.jobs.UIData.StageUIData
 
inputCol() - Method in interface org.apache.spark.ml.param.HasInputCol
param for input column name
inputDStream() - Method in class org.apache.spark.streaming.api.java.JavaInputDStream
 
inputDStream() - Method in class org.apache.spark.streaming.api.java.JavaPairInputDStream
 
InputDStream<T> - Class in org.apache.spark.streaming.dstream
This is the abstract base class for all input streams.
InputDStream(StreamingContext, ClassTag<T>) - Constructor for class org.apache.spark.streaming.dstream.InputDStream
 
inputFormatClazz() - Method in class org.apache.spark.scheduler.InputFormatInfo
 
inputFormatClazz() - Method in class org.apache.spark.scheduler.SplitInfo
 
InputFormatInfo - Class in org.apache.spark.scheduler
:: DeveloperApi :: Parses and holds information about inputFormat (and files) specified as a parameter.
InputFormatInfo(Configuration, Class<?>, String) - Constructor for class org.apache.spark.scheduler.InputFormatInfo
 
inputMetrics() - Method in class org.apache.spark.storage.BlockResult
 
inputMetricsFromJson(JsonAST.JValue) - Static method in class org.apache.spark.util.JsonProtocol
 
inputMetricsToJson(InputMetrics) - Static method in class org.apache.spark.util.JsonProtocol
 
inputProjection() - Method in class org.apache.spark.sql.hive.HiveUdafFunction
 
inputSplit() - Method in class org.apache.spark.rdd.HadoopPartition
 
inputSplitWithLocationInfo() - Method in class org.apache.spark.rdd.HadoopRDD.SplitInfoReflections
 
inRepoTests() - Method in class org.apache.spark.sql.hive.test.TestHiveContext
 
insertInto(String, boolean) - Method in interface org.apache.spark.sql.SchemaRDDLike
:: Experimental :: Adds the rows from this RDD to the specified table, optionally overwriting the existing data.
insertInto(String) - Method in interface org.apache.spark.sql.SchemaRDDLike
:: Experimental :: Appends the rows from this RDD to the specified table.
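A sketch of the append/overwrite pair (table names are hypothetical; assumes a HiveContext over an active SparkContext sc and an existing target table):

    import org.apache.spark.sql.hive.HiveContext

    val hiveContext = new HiveContext(sc)
    val fresh = hiveContext.sql("SELECT * FROM staging_events")  // hypothetical source table

    fresh.insertInto("events")        // append rows to the existing table
    fresh.insertInto("events", true)  // overwrite the existing data instead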
InsertIntoHiveTable - Class in org.apache.spark.sql.hive.execution
:: DeveloperApi ::
InsertIntoHiveTable(MetastoreRelation, Map<String, Option<String>>, SparkPlan, boolean, HiveContext) - Constructor for class org.apache.spark.sql.hive.execution.InsertIntoHiveTable
 
InsertIntoHiveTable - Class in org.apache.spark.sql.hive
A logical plan representing insertion into Hive table.
InsertIntoHiveTable(LogicalPlan, Map<String, Option<String>>, LogicalPlan, boolean) - Constructor for class org.apache.spark.sql.hive.InsertIntoHiveTable
 
InsertIntoParquetTable - Class in org.apache.spark.sql.parquet
:: DeveloperApi :: Operator that acts as a sink for queries on RDDs and can be used to store the output inside a directory of Parquet files.
InsertIntoParquetTable(ParquetRelation, SparkPlan, boolean) - Constructor for class org.apache.spark.sql.parquet.InsertIntoParquetTable
 
inShutdown() - Static method in class org.apache.spark.util.Utils
Detect whether this thread might be executing a shutdown hook.
inspectorToDataType(ObjectInspector) - Method in interface org.apache.spark.sql.hive.HiveInspectors
 
instance() - Method in class org.apache.spark.metrics.MetricsSystem
 
instance() - Static method in class org.apache.spark.mllib.tree.impurity.Entropy
Get this impurity instance.
instance() - Static method in class org.apache.spark.mllib.tree.impurity.Gini
Get this impurity instance.
instance() - Static method in class org.apache.spark.mllib.tree.impurity.Variance
Get this impurity instance.
INSTANCE_REGEX() - Method in class org.apache.spark.metrics.MetricsConfig
 
INT - Class in org.apache.spark.sql.columnar
 
INT() - Constructor for class org.apache.spark.sql.columnar.INT
 
intAccumulator(int) - Method in class org.apache.spark.api.java.JavaSparkContext
Create an Accumulator integer variable, which tasks can "add" values to using the add method.
intAccumulator(int, String) - Method in class org.apache.spark.api.java.JavaSparkContext
Create an Accumulator integer variable, which tasks can "add" values to using the add method.
IntColumnAccessor - Class in org.apache.spark.sql.columnar
 
IntColumnAccessor(ByteBuffer) - Constructor for class org.apache.spark.sql.columnar.IntColumnAccessor
 
IntColumnBuilder - Class in org.apache.spark.sql.columnar
 
IntColumnBuilder() - Constructor for class org.apache.spark.sql.columnar.IntColumnBuilder
 
IntColumnStats - Class in org.apache.spark.sql.columnar
 
IntColumnStats() - Constructor for class org.apache.spark.sql.columnar.IntColumnStats
 
IntDelta - Class in org.apache.spark.sql.columnar.compression
 
IntDelta() - Constructor for class org.apache.spark.sql.columnar.compression.IntDelta
 
IntDelta.Decoder - Class in org.apache.spark.sql.columnar.compression
 
IntDelta.Decoder(ByteBuffer, NativeColumnType<IntegerType$>) - Constructor for class org.apache.spark.sql.columnar.compression.IntDelta.Decoder
 
IntDelta.Encoder - Class in org.apache.spark.sql.columnar.compression
 
IntDelta.Encoder() - Constructor for class org.apache.spark.sql.columnar.compression.IntDelta.Encoder
 
IntegerHashSetSerializer - Class in org.apache.spark.sql.execution
 
IntegerHashSetSerializer() - Constructor for class org.apache.spark.sql.execution.IntegerHashSetSerializer
 
IntegerType - Static variable in class org.apache.spark.sql.api.java.DataType
Gets the IntegerType object.
IntegerType - Class in org.apache.spark.sql.api.java
The data type representing int and Integer values.
INTER_JOB_WAIT_MS() - Static method in class org.apache.spark.ui.UIWorkloadGenerator
 
intercept() - Method in class org.apache.spark.mllib.classification.LogisticRegressionModel
 
intercept() - Method in class org.apache.spark.mllib.classification.SVMModel
 
intercept() - Method in class org.apache.spark.mllib.regression.GeneralizedLinearModel
 
intercept() - Method in class org.apache.spark.mllib.regression.LassoModel
 
intercept() - Method in class org.apache.spark.mllib.regression.LinearRegressionModel
 
intercept() - Method in class org.apache.spark.mllib.regression.RidgeRegressionModel
 
internalMap() - Method in class org.apache.spark.util.TimeStampedHashSet
 
InterruptibleIterator<T> - Class in org.apache.spark
:: DeveloperApi :: An iterator that wraps around an existing iterator to provide task killing functionality.
InterruptibleIterator(TaskContext, Iterator<T>) - Constructor for class org.apache.spark.InterruptibleIterator
 
interruptThread() - Method in class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.KillTask
 
interruptThread() - Method in class org.apache.spark.scheduler.local.KillTask
 
Intersect - Class in org.apache.spark.sql.execution
:: DeveloperApi :: Returns the rows in left that also appear in right, using the built-in Spark intersection function.
Intersect(SparkPlan, SparkPlan) - Constructor for class org.apache.spark.sql.execution.Intersect
 
intersect(SchemaRDD) - Method in class org.apache.spark.sql.SchemaRDD
Performs a relational intersect on two SchemaRDDs
intersection(JavaDoubleRDD) - Method in class org.apache.spark.api.java.JavaDoubleRDD
Return the intersection of this RDD and another one.
intersection(JavaPairRDD<K, V>) - Method in class org.apache.spark.api.java.JavaPairRDD
Return the intersection of this RDD and another one.
intersection(JavaRDD<T>) - Method in class org.apache.spark.api.java.JavaRDD
Return the intersection of this RDD and another one.
intersection(RDD<T>) - Method in class org.apache.spark.rdd.RDD
Return the intersection of this RDD and another one.
intersection(RDD<T>, Partitioner, Ordering<T>) - Method in class org.apache.spark.rdd.RDD
Return the intersection of this RDD and another one.
intersection(RDD<T>, int) - Method in class org.apache.spark.rdd.RDD
Return the intersection of this RDD and another one.
intersection(JavaSchemaRDD) - Method in class org.apache.spark.sql.api.java.JavaSchemaRDD
Return the intersection of this RDD and another one.
intersection(JavaSchemaRDD, Partitioner) - Method in class org.apache.spark.sql.api.java.JavaSchemaRDD
Return the intersection of this RDD and another one.
intersection(JavaSchemaRDD, int) - Method in class org.apache.spark.sql.api.java.JavaSchemaRDD
Return the intersection of this RDD and another one.
intersection(RDD<Row>) - Method in class org.apache.spark.sql.SchemaRDD
 
intersection(RDD<Row>, Partitioner, Ordering<Row>) - Method in class org.apache.spark.sql.SchemaRDD
 
intersection(RDD<Row>, int) - Method in class org.apache.spark.sql.SchemaRDD
 
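The RDD variants in a minimal sketch (assumes an active SparkContext sc):

    val a = sc.parallelize(Seq(1, 2, 3, 4))
    val b = sc.parallelize(Seq(3, 4, 5))

    a.intersection(b).collect()  // Array(3, 4): duplicates removed, order not guaranteed
    a.intersection(b, 4)         // same result, computed across 4 partitions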
Interval - Class in org.apache.spark.streaming
 
Interval(Time, Time) - Constructor for class org.apache.spark.streaming.Interval
 
Interval(long, long) - Constructor for class org.apache.spark.streaming.Interval
 
INTERVAL_DEFAULT() - Static method in class org.apache.spark.util.logging.RollingFileAppender
 
INTERVAL_PROPERTY() - Static method in class org.apache.spark.util.logging.RollingFileAppender
 
IntParam - Class in org.apache.spark.ml.param
Specialized version of Param[Int] for Java.
IntParam(Params, String, String, Option<Object>) - Constructor for class org.apache.spark.ml.param.IntParam
 
IntParam - Class in org.apache.spark.util
An extractor object for parsing strings into integers.
IntParam() - Constructor for class org.apache.spark.util.IntParam
 
intToIntWritable(int) - Static method in class org.apache.spark.SparkContext
 
intWritableConverter() - Static method in class org.apache.spark.SparkContext
 
invalidateCache(LogicalPlan) - Method in interface org.apache.spark.sql.CacheManager
Invalidates the cache of any data that contains plan.
invalidInformationGainStats() - Static method in class org.apache.spark.mllib.tree.model.InformationGainStats
An InformationGainStats object used to denote that the current split doesn't satisfy the minimum info gain or the minimum number of instances per node.
invoke(Class<?>, Object, String, Seq<Tuple2<Class<?>, Object>>) - Static method in class org.apache.spark.util.Utils
 
invokedMethod(Object, Class<?>, String) - Static method in class org.apache.spark.graphx.util.BytecodeUtils
Test whether the given closure invokes the specified method in the specified class.
isActive(long) - Method in class org.apache.spark.graphx.impl.EdgePartition
Look up vid in activeSet, throwing an exception if it is None.
isAkkaConf(String) - Static method in class org.apache.spark.SparkConf
Return whether the given config is an akka config (e.g. akka.actor.timeout).
isAllowed(Enumeration.Value, Enumeration.Value) - Static method in class org.apache.spark.scheduler.TaskLocality
 
isApplicationCompleteFile(String) - Static method in class org.apache.spark.scheduler.EventLoggingListener
 
isAuthenticationEnabled() - Method in class org.apache.spark.SecurityManager
Check to see if authentication for the Spark communication protocols is enabled
isAvailable() - Method in class org.apache.spark.scheduler.Stage
 
isBindCollision(Throwable) - Static method in class org.apache.spark.util.Utils
Return whether the exception is caused by an address-port collision when binding.
isBroadcast() - Method in class org.apache.spark.storage.BlockId
 
isCached(String) - Method in interface org.apache.spark.sql.CacheManager
Returns true if the table is currently cached in-memory.
isCached() - Method in class org.apache.spark.storage.BlockStatus
 
isCached() - Method in class org.apache.spark.storage.RDDInfo
 
isCancelled() - Method in class org.apache.spark.ComplexFutureAction
 
isCancelled() - Method in interface org.apache.spark.FutureAction
Returns whether the action has been cancelled.
isCancelled() - Method in class org.apache.spark.JavaFutureActionWrapper
 
isCancelled() - Method in class org.apache.spark.SimpleFutureAction
 
isCategorical(int) - Method in class org.apache.spark.mllib.tree.impl.DecisionTreeMetadata
 
isCheckpointed() - Method in interface org.apache.spark.api.java.JavaRDDLike
Return whether this RDD has been checkpointed or not
isCheckpointed() - Method in class org.apache.spark.rdd.RDD
Return whether this RDD has been checkpointed or not
isCheckpointed() - Method in class org.apache.spark.rdd.RDDCheckpointData
 
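A small sketch of the checkpoint lifecycle this flag tracks (the directory is illustrative; assumes an active SparkContext sc):

    sc.setCheckpointDir("/tmp/spark-checkpoints")

    val rdd = sc.parallelize(1 to 100).map(_ * 2)
    rdd.checkpoint()    // only marks the RDD; nothing is written yet
    rdd.isCheckpointed  // false
    rdd.count()         // running an action materializes the checkpoint
    rdd.isCheckpointed  // true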
isCheckpointPresent() - Method in class org.apache.spark.streaming.StreamingContext
 
isClassification() - Method in class org.apache.spark.mllib.tree.impl.DecisionTreeMetadata
 
isCompleted() - Method in class org.apache.spark.ComplexFutureAction
 
isCompleted() - Method in interface org.apache.spark.FutureAction
Returns whether the action has already been completed with a value or an exception.
isCompleted() - Method in class org.apache.spark.SimpleFutureAction
 
isCompleted() - Method in class org.apache.spark.TaskContext
Whether the task has completed.
isCompleted() - Method in class org.apache.spark.TaskContextImpl
 
isCompressionCodecFile(String) - Static method in class org.apache.spark.scheduler.EventLoggingListener
 
isContainsNull() - Method in class org.apache.spark.sql.api.java.ArrayType
 
isContinuous(int) - Method in class org.apache.spark.mllib.tree.impl.DecisionTreeMetadata
 
isDefined(long) - Method in class org.apache.spark.graphx.impl.VertexPartitionBase
 
isDone() - Method in class org.apache.spark.JavaFutureActionWrapper
 
isDriver() - Method in class org.apache.spark.broadcast.BroadcastManager
 
isDriver() - Method in class org.apache.spark.storage.BlockManagerId
 
isEmpty() - Method in class org.apache.spark.rdd.PartitionCoalescer.LocationIterator
 
isEventLogEnabled() - Method in class org.apache.spark.SparkContext
 
isEventLogFile(String) - Static method in class org.apache.spark.scheduler.EventLoggingListener
 
isExecutorAlive(String) - Method in class org.apache.spark.scheduler.TaskSchedulerImpl
 
isExecutorStartupConf(String) - Static method in class org.apache.spark.SparkConf
Return whether the given config should be passed to an executor on start-up.
isExtended() - Method in class org.apache.spark.sql.hive.execution.DescribeHiveTableCommand
 
isFairScheduler() - Method in class org.apache.spark.ui.jobs.JobsTab
 
isFairScheduler() - Method in class org.apache.spark.ui.jobs.StagesTab
 
isFatalError(Throwable) - Static method in class org.apache.spark.util.Utils
Returns true if the given exception was fatal.
isFinished(Protos.TaskState) - Method in class org.apache.spark.scheduler.cluster.mesos.MesosSchedulerBackend
Check whether a Mesos task state represents a finished task
isFinished(Enumeration.Value) - Static method in class org.apache.spark.TaskState
 
isFixed() - Method in class org.apache.spark.sql.api.java.DecimalType
 
isInitialized() - Method in class org.apache.spark.streaming.dstream.DStream
 
isInitialValueFinal() - Method in class org.apache.spark.partial.PartialResult
 
isInMemory() - Method in class org.apache.spark.rdd.HadoopRDD.SplitInfoReflections
 
isInterrupted() - Method in class org.apache.spark.TaskContext
Whether the task has been killed.
isInterrupted() - Method in class org.apache.spark.TaskContextImpl
 
isLazy() - Method in class org.apache.spark.sql.execution.CacheTableCommand
 
isLeaf() - Method in class org.apache.spark.mllib.tree.model.Node
 
isLeftChild(int) - Static method in class org.apache.spark.mllib.tree.model.Node
Returns true if this is a left child.
isLocal() - Method in class org.apache.spark.api.java.JavaSparkContext
 
isLocal() - Method in class org.apache.spark.SparkContext
 
isLocal() - Method in class org.apache.spark.storage.BlockManagerMasterActor
 
isLogManagerEnabled() - Method in class org.apache.spark.streaming.scheduler.ReceivedBlockTracker
Check if the log manager is enabled.
isMac() - Static method in class org.apache.spark.util.Utils
Whether the underlying operating system is Mac OS X.
isMulticlass() - Method in class org.apache.spark.mllib.tree.impl.DecisionTreeMetadata
 
isMulticlassClassification() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
 
isMulticlassWithCategoricalFeatures() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
 
isMulticlassWithCategoricalFeatures() - Method in class org.apache.spark.mllib.tree.impl.DecisionTreeMetadata
 
isMultipleOf(Duration) - Method in class org.apache.spark.streaming.Duration
 
isMultipleOf(Duration) - Method in class org.apache.spark.streaming.Time
 
isNullable() - Method in class org.apache.spark.sql.api.java.StructField
 
isNullAt(int) - Method in class org.apache.spark.sql.api.java.Row
Returns true if value at column `i` is NULL.
isOpen() - Method in class org.apache.spark.storage.BlockObjectWriter
 
isOpen() - Method in class org.apache.spark.storage.DiskBlockObjectWriter
 
isParquetBinaryAsString() - Method in interface org.apache.spark.sql.SQLConf
When set to true, we always treat byte arrays in Parquet files as strings.
isPrimitiveType(DataType) - Static method in class org.apache.spark.sql.parquet.ParquetTypesConverter
 
isRDD() - Method in class org.apache.spark.storage.BlockId
 
isReady() - Method in class org.apache.spark.scheduler.cluster.CoarseGrainedSchedulerBackend
 
isReady() - Method in interface org.apache.spark.scheduler.SchedulerBackend
 
isReceiverStarted() - Method in class org.apache.spark.streaming.receiver.ReceiverSupervisor
Check if the receiver has been started.
isReceiverStopped() - Method in class org.apache.spark.streaming.receiver.ReceiverSupervisor
Check if the receiver has been marked for stopping.
isRegistered() - Method in class org.apache.spark.scheduler.cluster.mesos.CoarseMesosSchedulerBackend
 
isRegistered() - Method in class org.apache.spark.scheduler.cluster.mesos.MesosSchedulerBackend
 
isRunningInYarnContainer(SparkConf) - Static method in class org.apache.spark.util.Utils
 
isRunningLocally() - Method in class org.apache.spark.TaskContext
 
isRunningLocally() - Method in class org.apache.spark.TaskContextImpl
 
isSet(Param<?>) - Method in interface org.apache.spark.ml.param.Params
Checks whether a param is explicitly set.
isShuffle() - Method in class org.apache.spark.storage.BlockId
 
isShuffleMap() - Method in class org.apache.spark.scheduler.Stage
 
isSparkPortConf(String) - Static method in class org.apache.spark.SparkConf
Return true if the given config matches either spark.*.port or spark.port.*.
isSparkVersionFile(String) - Static method in class org.apache.spark.scheduler.EventLoggingListener
 
isSplitable(JobContext, Path) - Method in class org.apache.spark.input.FixedLengthBinaryInputFormat
Override of isSplitable to ensure initial computation of the record length
isStarted() - Method in class org.apache.spark.streaming.receiver.Receiver
Check if the receiver has started or not.
isStopped() - Method in class org.apache.spark.SparkEnv
 
isStopped() - Method in class org.apache.spark.streaming.receiver.Receiver
Check if receiver has been marked for stopping.
isSymlink(File) - Static method in class org.apache.spark.util.Utils
Check to see if file is a symbolic link.
isTesting() - Static method in class org.apache.spark.util.Utils
Indicates whether Spark is currently running unit tests.
isTimeValid(Time) - Method in class org.apache.spark.streaming.dstream.DStream
Checks whether the given 'time' is valid with respect to slideDuration for generating an RDD.
isTimeValid(Time) - Method in class org.apache.spark.streaming.dstream.InputDStream
Checks whether the given 'time' is valid with respect to slideDuration for generating an RDD.
isTraceEnabled() - Method in interface org.apache.spark.Logging
 
isUDAFBridgeRequired() - Method in class org.apache.spark.sql.hive.HiveUdafFunction
 
isUnlimited() - Method in class org.apache.spark.sql.api.java.DecimalType
 
isUnordered(int) - Method in class org.apache.spark.mllib.tree.impl.DecisionTreeMetadata
 
isValid() - Method in class org.apache.spark.broadcast.Broadcast
Whether this Broadcast is actually usable.
isValid() - Method in class org.apache.spark.rdd.BlockRDD
Whether this BlockRDD is actually usable.
isValid() - Method in class org.apache.spark.storage.StorageLevel
 
isValueContainsNull() - Method in class org.apache.spark.sql.api.java.MapType
 
isWindows() - Static method in class org.apache.spark.util.Utils
Whether the underlying operating system is Windows.
isWorthCompressing(Encoder<T>) - Method in interface org.apache.spark.sql.columnar.compression.CompressibleColumnBuilder
 
isZero() - Method in class org.apache.spark.streaming.Duration
 
isZombie() - Method in class org.apache.spark.scheduler.TaskSetManager
 
it() - Method in class org.apache.spark.rdd.PartitionCoalescer.LocationIterator
 
item() - Method in class org.apache.spark.streaming.receiver.SingleItemData
 
iterator(Partition, TaskContext) - Method in interface org.apache.spark.api.java.JavaRDDLike
Internal method to this RDD; will read from cache if applicable, or otherwise compute it.
iterator() - Method in class org.apache.spark.graphx.impl.EdgePartition
Get an iterator over the edges in this partition.
iterator() - Method in class org.apache.spark.graphx.impl.RoutingTablePartition
Returns an iterator over all vertex ids stored in this `RoutingTablePartition`.
iterator() - Method in class org.apache.spark.graphx.impl.VertexAttributeBlock
 
iterator() - Method in class org.apache.spark.graphx.impl.VertexPartitionBase
 
iterator() - Method in class org.apache.spark.rdd.ParallelCollectionPartition
 
iterator(Partition, TaskContext) - Method in class org.apache.spark.rdd.RDD
Internal method to this RDD; will read from cache if applicable, or otherwise compute it.
iterator() - Method in class org.apache.spark.storage.IteratorValues
 
iterator() - Method in class org.apache.spark.streaming.receiver.IteratorBlock
 
iterator() - Method in class org.apache.spark.streaming.receiver.IteratorData
 
iterator() - Method in class org.apache.spark.util.BoundedPriorityQueue
 
iterator() - Method in class org.apache.spark.util.TimeStampedHashMap
 
iterator() - Method in class org.apache.spark.util.TimeStampedHashSet
 
iterator() - Method in class org.apache.spark.util.TimeStampedWeakValueHashMap
 
IteratorBlock - Class in org.apache.spark.streaming.receiver
Class representing a block received as an Iterator.
IteratorBlock(Iterator<Object>) - Constructor for class org.apache.spark.streaming.receiver.IteratorBlock
 
IteratorData<T> - Class in org.apache.spark.streaming.receiver
 
IteratorData(Iterator<T>) - Constructor for class org.apache.spark.streaming.receiver.IteratorData
 
IteratorValues - Class in org.apache.spark.storage
 
IteratorValues(Iterator<Object>) - Constructor for class org.apache.spark.storage.IteratorValues
 

J

j() - Method in class org.apache.spark.mllib.linalg.distributed.MatrixEntry
 
jarDir() - Method in class org.apache.spark.HttpFileServer
 
jarOfClass(Class<?>) - Static method in class org.apache.spark.api.java.JavaSparkContext
Find the JAR from which a given class was loaded, to make it easy for users to pass their JARs to SparkContext.
jarOfClass(Class<?>) - Static method in class org.apache.spark.SparkContext
Find the JAR from which a given class was loaded, to make it easy for users to pass their JARs to SparkContext.
jarOfClass(Class<?>) - Static method in class org.apache.spark.streaming.api.java.JavaStreamingContext
Find the JAR from which a given class was loaded, to make it easy for users to pass their JARs to StreamingContext.
jarOfClass(Class<?>) - Static method in class org.apache.spark.streaming.StreamingContext
Find the JAR from which a given class was loaded, to make it easy for users to pass their JARs to StreamingContext.
jarOfObject(Object) - Static method in class org.apache.spark.api.java.JavaSparkContext
Find the JAR that contains the class of a particular object, to make it easy for users to pass their JARs to SparkContext.
jarOfObject(Object) - Static method in class org.apache.spark.SparkContext
Find the JAR that contains the class of a particular object, to make it easy for users to pass their JARs to SparkContext.
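A common pattern these helpers support, as a sketch (the application name is hypothetical; assumes the code runs inside a class or object so this.getClass resolves):

    import org.apache.spark.{SparkConf, SparkContext}

    val conf = new SparkConf()
      .setAppName("MyApp")
      .setJars(SparkContext.jarOfClass(this.getClass).toSeq)
    val sc = new SparkContext(conf)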
jars() - Method in class org.apache.spark.api.java.JavaSparkContext
 
jars() - Method in class org.apache.spark.SparkContext
 
jars() - Method in class org.apache.spark.streaming.Checkpoint
 
javaClassToDataType(Class<?>) - Method in interface org.apache.spark.sql.hive.HiveInspectors
 
JavaDeserializationStream - Class in org.apache.spark.serializer
 
JavaDeserializationStream(InputStream, ClassLoader) - Constructor for class org.apache.spark.serializer.JavaDeserializationStream
 
JavaDoubleRDD - Class in org.apache.spark.api.java
 
JavaDoubleRDD(RDD<Object>) - Constructor for class org.apache.spark.api.java.JavaDoubleRDD
 
JavaDStream<T> - Class in org.apache.spark.streaming.api.java
A Java-friendly interface to DStream, the basic abstraction in Spark Streaming that represents a continuous stream of data.
JavaDStream(DStream<T>, ClassTag<T>) - Constructor for class org.apache.spark.streaming.api.java.JavaDStream
 
JavaDStreamLike<T,This extends JavaDStreamLike<T,This,R>,R extends JavaRDDLike<T,R>> - Interface in org.apache.spark.streaming.api.java
 
JavaFutureAction<T> - Interface in org.apache.spark.api.java
 
JavaFutureActionWrapper<S,T> - Class in org.apache.spark
 
JavaFutureActionWrapper(FutureAction<S>, Function1<S, T>) - Constructor for class org.apache.spark.JavaFutureActionWrapper
 
JavaHadoopRDD<K,V> - Class in org.apache.spark.api.java
 
JavaHadoopRDD(HadoopRDD<K, V>, ClassTag<K>, ClassTag<V>) - Constructor for class org.apache.spark.api.java.JavaHadoopRDD
 
JavaHiveContext - Class in org.apache.spark.sql.hive.api.java
The entry point for executing Spark SQL queries from a Java program.
JavaHiveContext(SQLContext) - Constructor for class org.apache.spark.sql.hive.api.java.JavaHiveContext
 
JavaHiveContext(JavaSparkContext) - Constructor for class org.apache.spark.sql.hive.api.java.JavaHiveContext
 
JavaInputDStream<T> - Class in org.apache.spark.streaming.api.java
A Java-friendly interface to InputDStream.
JavaInputDStream(InputDStream<T>, ClassTag<T>) - Constructor for class org.apache.spark.streaming.api.java.JavaInputDStream
 
JavaIterableWrapperSerializer - Class in org.apache.spark.serializer
A Kryo serializer for serializing results returned by asJavaIterable.
JavaIterableWrapperSerializer() - Constructor for class org.apache.spark.serializer.JavaIterableWrapperSerializer
 
JavaKinesisWordCountASL - Class in org.apache.spark.examples.streaming
Java-friendly Kinesis Spark Streaming WordCount example. See http://spark.apache.org/docs/latest/streaming-kinesis.html for more details on the Kinesis Spark Streaming integration.
JavaNewHadoopRDD<K,V> - Class in org.apache.spark.api.java
 
JavaNewHadoopRDD(NewHadoopRDD<K, V>, ClassTag<K>, ClassTag<V>) - Constructor for class org.apache.spark.api.java.JavaNewHadoopRDD
 
JavaPairDStream<K,V> - Class in org.apache.spark.streaming.api.java
A Java-friendly interface to a DStream of key-value pairs, which provides extra methods like reduceByKey and join.
JavaPairDStream(DStream<Tuple2<K, V>>, ClassTag<K>, ClassTag<V>) - Constructor for class org.apache.spark.streaming.api.java.JavaPairDStream
 
JavaPairInputDStream<K,V> - Class in org.apache.spark.streaming.api.java
A Java-friendly interface to InputDStream of key-value pairs.
JavaPairInputDStream(InputDStream<Tuple2<K, V>>, ClassTag<K>, ClassTag<V>) - Constructor for class org.apache.spark.streaming.api.java.JavaPairInputDStream
 
JavaPairRDD<K,V> - Class in org.apache.spark.api.java
 
JavaPairRDD(RDD<Tuple2<K, V>>, ClassTag<K>, ClassTag<V>) - Constructor for class org.apache.spark.api.java.JavaPairRDD
 
JavaPairReceiverInputDStream<K,V> - Class in org.apache.spark.streaming.api.java
A Java-friendly interface to ReceiverInputDStream, the abstract class for defining any input stream that receives data over the network.
JavaPairReceiverInputDStream(ReceiverInputDStream<Tuple2<K, V>>, ClassTag<K>, ClassTag<V>) - Constructor for class org.apache.spark.streaming.api.java.JavaPairReceiverInputDStream
 
JavaRDD<T> - Class in org.apache.spark.api.java
 
JavaRDD(RDD<T>, ClassTag<T>) - Constructor for class org.apache.spark.api.java.JavaRDD
 
JavaRDDLike<T,This extends JavaRDDLike<T,This>> - Interface in org.apache.spark.api.java
Defines operations common to several Java RDD implementations.
JavaReceiverInputDStream<T> - Class in org.apache.spark.streaming.api.java
A Java-friendly interface to ReceiverInputDStream, the abstract class for defining any input stream that receives data over the network.
JavaReceiverInputDStream(ReceiverInputDStream<T>, ClassTag<T>) - Constructor for class org.apache.spark.streaming.api.java.JavaReceiverInputDStream
 
JavaSchemaRDD - Class in org.apache.spark.sql.api.java
An RDD of Row objects that is returned as the result of a Spark SQL query.
JavaSchemaRDD(SQLContext, LogicalPlan) - Constructor for class org.apache.spark.sql.api.java.JavaSchemaRDD
 
JavaSerializationStream - Class in org.apache.spark.serializer
 
JavaSerializationStream(OutputStream, int) - Constructor for class org.apache.spark.serializer.JavaSerializationStream
 
JavaSerializer - Class in org.apache.spark.serializer
:: DeveloperApi :: A Spark serializer that uses Java's built-in serialization.
JavaSerializer(SparkConf) - Constructor for class org.apache.spark.serializer.JavaSerializer
 
JavaSerializerInstance - Class in org.apache.spark.serializer
 
JavaSerializerInstance(int, ClassLoader) - Constructor for class org.apache.spark.serializer.JavaSerializerInstance
 
JavaSparkContext - Class in org.apache.spark.api.java
A Java-friendly version of SparkContext that returns JavaRDDs and works with Java collections instead of Scala ones.
JavaSparkContext(SparkContext) - Constructor for class org.apache.spark.api.java.JavaSparkContext
 
JavaSparkContext() - Constructor for class org.apache.spark.api.java.JavaSparkContext
Create a JavaSparkContext that loads settings from system properties (for instance, when launching with ./bin/spark-submit).
JavaSparkContext(SparkConf) - Constructor for class org.apache.spark.api.java.JavaSparkContext
 
JavaSparkContext(String, String) - Constructor for class org.apache.spark.api.java.JavaSparkContext
 
JavaSparkContext(String, String, SparkConf) - Constructor for class org.apache.spark.api.java.JavaSparkContext
 
JavaSparkContext(String, String, String, String) - Constructor for class org.apache.spark.api.java.JavaSparkContext
 
JavaSparkContext(String, String, String, String[]) - Constructor for class org.apache.spark.api.java.JavaSparkContext
 
JavaSparkContext(String, String, String, String[], Map<String, String>) - Constructor for class org.apache.spark.api.java.JavaSparkContext
 
JavaSparkStatusTracker - Class in org.apache.spark.api.java
Low-level status reporting APIs for monitoring job and stage progress.
JavaSparkStatusTracker(SparkContext) - Constructor for class org.apache.spark.api.java.JavaSparkStatusTracker
 
JavaSQLContext - Class in org.apache.spark.sql.api.java
The entry point for executing Spark SQL queries from a Java program.
JavaSQLContext(SQLContext) - Constructor for class org.apache.spark.sql.api.java.JavaSQLContext
 
JavaSQLContext(JavaSparkContext) - Constructor for class org.apache.spark.sql.api.java.JavaSQLContext
 
JavaStreamingContext - Class in org.apache.spark.streaming.api.java
A Java-friendly version of StreamingContext which is the main entry point for Spark Streaming functionality.
JavaStreamingContext(StreamingContext) - Constructor for class org.apache.spark.streaming.api.java.JavaStreamingContext
 
JavaStreamingContext(String, String, Duration) - Constructor for class org.apache.spark.streaming.api.java.JavaStreamingContext
Create a StreamingContext.
JavaStreamingContext(String, String, Duration, String, String) - Constructor for class org.apache.spark.streaming.api.java.JavaStreamingContext
Create a StreamingContext.
JavaStreamingContext(String, String, Duration, String, String[]) - Constructor for class org.apache.spark.streaming.api.java.JavaStreamingContext
Create a StreamingContext.
JavaStreamingContext(String, String, Duration, String, String[], Map<String, String>) - Constructor for class org.apache.spark.streaming.api.java.JavaStreamingContext
Create a StreamingContext.
JavaStreamingContext(JavaSparkContext, Duration) - Constructor for class org.apache.spark.streaming.api.java.JavaStreamingContext
Create a JavaStreamingContext using an existing JavaSparkContext.
JavaStreamingContext(SparkConf, Duration) - Constructor for class org.apache.spark.streaming.api.java.JavaStreamingContext
Create a JavaStreamingContext using a SparkConf configuration.
JavaStreamingContext(String) - Constructor for class org.apache.spark.streaming.api.java.JavaStreamingContext
Recreate a JavaStreamingContext from a checkpoint file.
JavaStreamingContext(String, Configuration) - Constructor for class org.apache.spark.streaming.api.java.JavaStreamingContext
Re-creates a JavaStreamingContext from a checkpoint file.
JavaStreamingContextFactory - Interface in org.apache.spark.streaming.api.java
Factory interface for creating a new JavaStreamingContext
javaToPython() - Method in class org.apache.spark.sql.SchemaRDD
Converts a JavaRDD to a PythonRDD.
JavaToScalaUDTWrapper<UserType> - Class in org.apache.spark.sql.api.java
Scala wrapper for a Java UserDefinedType
JavaToScalaUDTWrapper(UserDefinedType<UserType>) - Constructor for class org.apache.spark.sql.api.java.JavaToScalaUDTWrapper
 
javaUDT() - Method in class org.apache.spark.sql.api.java.JavaToScalaUDTWrapper
 
JavaUtils - Class in org.apache.spark.api.java
 
JavaUtils() - Constructor for class org.apache.spark.api.java.JavaUtils
 
JavaUtils.SerializableMapWrapper<A,B> - Class in org.apache.spark.api.java
 
JavaUtils.SerializableMapWrapper(Map<A, B>) - Constructor for class org.apache.spark.api.java.JavaUtils.SerializableMapWrapper
 
JdbcPartition - Class in org.apache.spark.rdd
 
JdbcPartition(int, long, long) - Constructor for class org.apache.spark.rdd.JdbcPartition
 
JdbcRDD<T> - Class in org.apache.spark.rdd
An RDD that executes an SQL query on a JDBC connection and reads results.
JdbcRDD(SparkContext, Function0<Connection>, String, long, long, int, Function1<ResultSet, T>, ClassTag<T>) - Constructor for class org.apache.spark.rdd.JdbcRDD
 
JdbcRDD.ConnectionFactory - Interface in org.apache.spark.rdd
 
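A hedged construction sketch (URL, query, and bounds are illustrative; assumes an active SparkContext sc and a reachable database):

    import java.sql.{DriverManager, ResultSet}
    import org.apache.spark.rdd.JdbcRDD

    // The query must carry the two '?' placeholders that JdbcRDD binds
    // to each partition's slice of the [lowerBound, upperBound] key range.
    val rdd = new JdbcRDD(sc,
      () => DriverManager.getConnection("jdbc:h2:mem:testdb"),
      "SELECT id, name FROM users WHERE ? <= id AND id <= ?",
      1L, 1000L, 3,  // lower bound, upper bound, number of partitions
      (rs: ResultSet) => (rs.getLong(1), rs.getString(2)))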
JettyUtils - Class in org.apache.spark.ui
Utilities for launching a web server using Jetty's HTTP Server class
JettyUtils() - Constructor for class org.apache.spark.ui.JettyUtils
 
JettyUtils.ServletParams<T> - Class in org.apache.spark.ui
 
JettyUtils.ServletParams(Function1<HttpServletRequest, T>, String, Function1<T, String>, Function1<T, Object>) - Constructor for class org.apache.spark.ui.JettyUtils.ServletParams
 
JettyUtils.ServletParams$ - Class in org.apache.spark.ui
 
JettyUtils.ServletParams$() - Constructor for class org.apache.spark.ui.JettyUtils.ServletParams$
 
JmxSink - Class in org.apache.spark.metrics.sink
 
JmxSink(Properties, MetricRegistry, SecurityManager) - Constructor for class org.apache.spark.metrics.sink.JmxSink
 
Job - Class in org.apache.spark.streaming.scheduler
Class representing a Spark computation.
Job(Time, Function0<?>) - Constructor for class org.apache.spark.streaming.scheduler.Job
 
job() - Method in class org.apache.spark.streaming.scheduler.JobCompleted
 
job() - Method in class org.apache.spark.streaming.scheduler.JobStarted
 
JobCancelled - Class in org.apache.spark.scheduler
 
JobCancelled(int) - Constructor for class org.apache.spark.scheduler.JobCancelled
 
JobCompleted - Class in org.apache.spark.streaming.scheduler
 
JobCompleted(Job) - Constructor for class org.apache.spark.streaming.scheduler.JobCompleted
 
jobEndFromJson(JsonAST.JValue) - Static method in class org.apache.spark.util.JsonProtocol
 
jobEndToJson(SparkListenerJobEnd) - Static method in class org.apache.spark.util.JsonProtocol
 
JobExecutionStatus - Enum in org.apache.spark
 
jobFailed(Exception) - Method in class org.apache.spark.partial.ApproximateActionListener
 
JobFailed - Class in org.apache.spark.scheduler
 
JobFailed(Exception) - Constructor for class org.apache.spark.scheduler.JobFailed
 
jobFailed(Exception) - Method in interface org.apache.spark.scheduler.JobListener
 
jobFailed(Exception) - Method in class org.apache.spark.scheduler.JobWaiter
 
jobFinished() - Method in class org.apache.spark.scheduler.JobWaiter
 
JobGenerator - Class in org.apache.spark.streaming.scheduler
This class generates jobs from DStreams as well as drives checkpointing and cleaning up DStream metadata.
JobGenerator(JobScheduler) - Constructor for class org.apache.spark.streaming.scheduler.JobGenerator
 
JobGeneratorEvent - Interface in org.apache.spark.streaming.scheduler
Event classes for JobGenerator
jobGroup() - Method in class org.apache.spark.ui.jobs.UIData.JobUIData
 
JobGroupCancelled - Class in org.apache.spark.scheduler
 
JobGroupCancelled(String) - Constructor for class org.apache.spark.scheduler.JobGroupCancelled
 
jobId() - Method in class org.apache.spark.scheduler.ActiveJob
 
jobId() - Method in class org.apache.spark.scheduler.JobCancelled
 
jobId() - Method in class org.apache.spark.scheduler.JobSubmitted
 
jobId() - Method in class org.apache.spark.scheduler.JobWaiter
 
jobId() - Method in class org.apache.spark.scheduler.SparkListenerJobEnd
 
jobId() - Method in class org.apache.spark.scheduler.SparkListenerJobStart
 
jobId() - Method in class org.apache.spark.scheduler.Stage
 
jobId() - Method in interface org.apache.spark.SparkJobInfo
 
jobId() - Method in class org.apache.spark.SparkJobInfoImpl
 
jobId() - Method in class org.apache.spark.ui.jobs.UIData.JobUIData
 
jobIds() - Method in interface org.apache.spark.api.java.JavaFutureAction
Returns the job IDs run by the underlying async operation.
jobIds() - Method in class org.apache.spark.ComplexFutureAction
 
jobIds() - Method in interface org.apache.spark.FutureAction
Returns the job IDs run by the underlying async operation.
jobIds() - Method in class org.apache.spark.JavaFutureActionWrapper
 
jobIds() - Method in class org.apache.spark.scheduler.Stage
Set of jobs that this stage belongs to.
jobIds() - Method in class org.apache.spark.SimpleFutureAction
 
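A minimal async-action sketch showing where these job IDs come from (assumes an active SparkContext sc):

    import org.apache.spark.SparkContext._  // brings the async RDD actions into scope

    val f = sc.parallelize(1 to 1000).countAsync()
    f.jobIds  // ids of the Spark jobs launched for this action
    f.value   // Some(Success(1000)) once the count finishes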
jobIdToActiveJob() - Method in class org.apache.spark.scheduler.DAGScheduler
 
jobIdToData() - Method in class org.apache.spark.ui.jobs.JobProgressListener
 
jobIdToStageIds() - Method in class org.apache.spark.scheduler.DAGScheduler
 
JobListener - Interface in org.apache.spark.scheduler
Interface used to listen for job completion or failure events after submitting a job to the DAGScheduler.
JobLogger - Class in org.apache.spark.scheduler
:: DeveloperApi :: A logger class to record runtime information for jobs in Spark.
JobLogger(String, String) - Constructor for class org.apache.spark.scheduler.JobLogger
 
JobLogger() - Constructor for class org.apache.spark.scheduler.JobLogger
 
JobPage - Class in org.apache.spark.ui.jobs
Page showing statistics and stage list for a given job
JobPage(JobsTab) - Constructor for class org.apache.spark.ui.jobs.JobPage
 
jobProgressListener() - Method in class org.apache.spark.SparkContext
 
JobProgressListener - Class in org.apache.spark.ui.jobs
:: DeveloperApi :: Tracks task-level information to be displayed in the UI.
JobProgressListener(SparkConf) - Constructor for class org.apache.spark.ui.jobs.JobProgressListener
 
jobProgressListener() - Method in class org.apache.spark.ui.SparkUI
 
JobResult - Interface in org.apache.spark.scheduler
:: DeveloperApi :: A result of a job in the DAGScheduler.
jobResult() - Method in class org.apache.spark.scheduler.SparkListenerJobEnd
 
jobResultFromJson(JsonAST.JValue) - Static method in class org.apache.spark.util.JsonProtocol
 
jobResultToJson(JobResult) - Static method in class org.apache.spark.util.JsonProtocol
 
jobs() - Method in class org.apache.spark.streaming.scheduler.JobSet
 
JobScheduler - Class in org.apache.spark.streaming.scheduler
This class schedules jobs to be run on Spark.
JobScheduler(StreamingContext) - Constructor for class org.apache.spark.streaming.scheduler.JobScheduler
 
JobSchedulerEvent - Interface in org.apache.spark.streaming.scheduler
 
JobSet - Class in org.apache.spark.streaming.scheduler
Class representing a set of Jobs that belong to the same batch.
JobSet(Time, Seq<Job>, Map<Object, ReceivedBlockInfo[]>) - Constructor for class org.apache.spark.streaming.scheduler.JobSet
 
JobsTab - Class in org.apache.spark.ui.jobs
Web UI showing progress status of all jobs in the given SparkContext.
JobsTab(SparkUI) - Constructor for class org.apache.spark.ui.jobs.JobsTab
 
JobStarted - Class in org.apache.spark.streaming.scheduler
 
JobStarted(Job) - Constructor for class org.apache.spark.streaming.scheduler.JobStarted
 
jobStartFromJson(JsonAST.JValue) - Static method in class org.apache.spark.util.JsonProtocol
 
jobStartToJson(SparkListenerJobStart) - Static method in class org.apache.spark.util.JsonProtocol
 
JobSubmitted - Class in org.apache.spark.scheduler
 
JobSubmitted(int, RDD<?>, Function2<TaskContext, Iterator<Object>, ?>, int[], boolean, CallSite, JobListener, Properties) - Constructor for class org.apache.spark.scheduler.JobSubmitted
 
JobSucceeded - Class in org.apache.spark.scheduler
 
JobSucceeded() - Constructor for class org.apache.spark.scheduler.JobSucceeded
 
JobWaiter<T> - Class in org.apache.spark.scheduler
An object that waits for a DAGScheduler job to complete.
JobWaiter(DAGScheduler, int, int, Function2<Object, T, BoxedUnit>) - Constructor for class org.apache.spark.scheduler.JobWaiter
 
join(JavaPairRDD<K, W>, Partitioner) - Method in class org.apache.spark.api.java.JavaPairRDD
Return an RDD containing all pairs of elements with matching keys in this and other.
join(JavaPairRDD<K, W>) - Method in class org.apache.spark.api.java.JavaPairRDD
Return an RDD containing all pairs of elements with matching keys in this and other.
join(JavaPairRDD<K, W>, int) - Method in class org.apache.spark.api.java.JavaPairRDD
Return an RDD containing all pairs of elements with matching keys in this and other.
join(RDD<Tuple2<K, W>>, Partitioner) - Method in class org.apache.spark.rdd.PairRDDFunctions
Return an RDD containing all pairs of elements with matching keys in this and other.
join(RDD<Tuple2<K, W>>) - Method in class org.apache.spark.rdd.PairRDDFunctions
Return an RDD containing all pairs of elements with matching keys in this and other.
join(RDD<Tuple2<K, W>>, int) - Method in class org.apache.spark.rdd.PairRDDFunctions
Return an RDD containing all pairs of elements with matching keys in this and other.
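A minimal sketch of the pair-RDD form (data is made up; assumes an active SparkContext sc):

    import org.apache.spark.SparkContext._  // implicit conversion to PairRDDFunctions

    val users  = sc.parallelize(Seq((1, "alice"), (2, "bob")))
    val orders = sc.parallelize(Seq((1, 9.99), (1, 4.50), (3, 7.25)))

    users.join(orders).collect()
    // Array((1, ("alice", 9.99)), (1, ("alice", 4.50))): keys 2 and 3 have no match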
join() - Method in class org.apache.spark.sql.execution.Generate
 
join(SchemaRDD, JoinType, Option<Expression>) - Method in class org.apache.spark.sql.SchemaRDD
Performs a relational join on two SchemaRDDs
join(JavaPairDStream<K, W>) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
Return a new DStream by applying 'join' between RDDs of this DStream and other DStream.
join(JavaPairDStream<K, W>, int) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
Return a new DStream by applying 'join' between RDDs of this DStream and other DStream.
join(JavaPairDStream<K, W>, Partitioner) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
Return a new DStream by applying 'join' between RDDs of this DStream and other DStream.
join(DStream<Tuple2<K, W>>, ClassTag<W>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
Return a new DStream by applying 'join' between RDDs of this DStream and other DStream.
join(DStream<Tuple2<K, W>>, int, ClassTag<W>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
Return a new DStream by applying 'join' between RDDs of this DStream and other DStream.
join(DStream<Tuple2<K, W>>, Partitioner, ClassTag<W>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
Return a new DStream by applying 'join' between RDDs of this DStream and other DStream.
joinType() - Method in class org.apache.spark.sql.execution.joins.BroadcastNestedLoopJoin
 
joinType() - Method in class org.apache.spark.sql.execution.joins.HashOuterJoin
 
joinVertices(RDD<Tuple2<Object, U>>, Function3<Object, VD, U, VD>, ClassTag<U>) - Method in class org.apache.spark.graphx.GraphOps
Join the vertices with an RDD and then apply a function from the vertex and RDD entry to a new vertex value.
jsonFile(String) - Method in class org.apache.spark.sql.api.java.JavaSQLContext
Loads a JSON file (one object per line), returning the result as a JavaSchemaRDD.
jsonFile(String, StructType) - Method in class org.apache.spark.sql.api.java.JavaSQLContext
:: Experimental :: Loads a JSON file (one object per line) and applies the given schema, returning the result as a JavaSchemaRDD.
jsonFile(String) - Method in class org.apache.spark.sql.SQLContext
Loads a JSON file (one object per line), returning the result as a SchemaRDD.
jsonFile(String, StructType) - Method in class org.apache.spark.sql.SQLContext
:: Experimental :: Loads a JSON file (one object per line) and applies the given schema, returning the result as a SchemaRDD.
jsonFile(String, double) - Method in class org.apache.spark.sql.SQLContext
:: Experimental ::
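A usage sketch (the path is illustrative and must point to a file with one JSON object per line; assumes an active SparkContext sc):

    val sqlContext = new org.apache.spark.sql.SQLContext(sc)

    val people = sqlContext.jsonFile("examples/src/main/resources/people.json")
    people.printSchema()                 // schema is inferred from the JSON records
    people.registerTempTable("people")
    sqlContext.sql("SELECT name FROM people WHERE age > 13")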
jsonOption(JsonAST.JValue) - Static method in class org.apache.spark.util.Utils
Return an option that translates JNothing to None
JsonProtocol - Class in org.apache.spark.util
Serializes SparkListener events to/from JSON.
JsonProtocol() - Constructor for class org.apache.spark.util.JsonProtocol
 
jsonRDD(JavaRDD<String>) - Method in class org.apache.spark.sql.api.java.JavaSQLContext
Loads an RDD[String] storing JSON objects (one object per record), returning the result as a JavaSchemaRDD.
jsonRDD(JavaRDD<String>, StructType) - Method in class org.apache.spark.sql.api.java.JavaSQLContext
:: Experimental :: Loads an RDD[String] storing JSON objects (one object per record) and applies the given schema, returning the result as a JavaSchemaRDD.
JsonRDD - Class in org.apache.spark.sql.json
 
JsonRDD() - Constructor for class org.apache.spark.sql.json.JsonRDD
 
jsonRDD(RDD<String>) - Method in class org.apache.spark.sql.SQLContext
Loads an RDD[String] storing JSON objects (one object per record), returning the result as a SchemaRDD.
jsonRDD(RDD<String>, StructType) - Method in class org.apache.spark.sql.SQLContext
:: Experimental :: Loads an RDD[String] storing JSON objects (one object per record) and applies the given schema, returning the result as a SchemaRDD.
jsonRDD(RDD<String>, double) - Method in class org.apache.spark.sql.SQLContext
:: Experimental ::
JSONRelation - Class in org.apache.spark.sql.json
 
JSONRelation(String, double, SQLContext) - Constructor for class org.apache.spark.sql.json.JSONRelation
 
jsonResponderToServlet(Function1<HttpServletRequest, JsonAST.JValue>) - Static method in class org.apache.spark.ui.JettyUtils
 
jsonStringToRow(RDD<String>, StructType, String) - Static method in class org.apache.spark.sql.json.JsonRDD
 
jvmInformation() - Method in class org.apache.spark.ui.env.EnvironmentListener
 
JvmSource - Class in org.apache.spark.metrics.source
 
JvmSource() - Constructor for class org.apache.spark.metrics.source.JvmSource
 

K

k() - Method in class org.apache.spark.mllib.clustering.KMeansModel
Total number of clusters.
k() - Method in class org.apache.spark.mllib.clustering.StreamingKMeans
 
K_MEANS_PARALLEL() - Static method in class org.apache.spark.mllib.clustering.KMeans
 
KafkaInputDStream<K,V,U extends kafka.serializer.Decoder<?>,T extends kafka.serializer.Decoder<?>> - Class in org.apache.spark.streaming.kafka
Input stream that pulls messages from a Kafka Broker.
KafkaInputDStream(StreamingContext, Map<String, String>, Map<String, Object>, boolean, StorageLevel, ClassTag<K>, ClassTag<V>, ClassTag<U>, ClassTag<T>) - Constructor for class org.apache.spark.streaming.kafka.KafkaInputDStream
 
KafkaReceiver<K,V,U extends kafka.serializer.Decoder<?>,T extends kafka.serializer.Decoder<?>> - Class in org.apache.spark.streaming.kafka
 
KafkaReceiver(Map<String, String>, Map<String, Object>, StorageLevel, ClassTag<K>, ClassTag<V>, ClassTag<U>, ClassTag<T>) - Constructor for class org.apache.spark.streaming.kafka.KafkaReceiver
 
KafkaUtils - Class in org.apache.spark.streaming.kafka
 
KafkaUtils() - Constructor for class org.apache.spark.streaming.kafka.KafkaUtils
 
kClassTag() - Method in class org.apache.spark.api.java.JavaHadoopRDD
 
kClassTag() - Method in class org.apache.spark.api.java.JavaNewHadoopRDD
 
kClassTag() - Method in class org.apache.spark.api.java.JavaPairRDD
 
kClassTag() - Method in class org.apache.spark.streaming.api.java.JavaPairInputDStream
 
kClassTag() - Method in class org.apache.spark.streaming.api.java.JavaPairReceiverInputDStream
 
keyBy(Function<T, U>) - Method in interface org.apache.spark.api.java.JavaRDDLike
Creates tuples of the elements in this RDD by applying f.
keyBy(Function1<T, K>) - Method in class org.apache.spark.rdd.RDD
Creates tuples of the elements in this RDD by applying f.
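A minimal sketch of keyBy (data is illustrative, assuming an existing `sc: SparkContext`):

    val words = sc.parallelize(Seq("apple", "fig", "cherry"))
    // RDD[(Int, String)]: (5,"apple"), (3,"fig"), (6,"cherry")
    val byLength = words.keyBy(_.length)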
keyClass() - Method in class org.apache.spark.rdd.PairRDDFunctions
 
keyOrdering() - Method in class org.apache.spark.rdd.PairRDDFunctions
 
keyOrdering() - Method in class org.apache.spark.ShuffleDependency
 
keys() - Method in class org.apache.spark.api.java.JavaPairRDD
Return an RDD with the keys of each tuple.
keys() - Method in class org.apache.spark.rdd.PairRDDFunctions
Return an RDD with the keys of each tuple.
kFold(RDD<T>, int, int, ClassTag<T>) - Static method in class org.apache.spark.mllib.util.MLUtils
:: Experimental :: Return a k-element array of pairs of RDDs, where the first element of each pair contains the training data (the complement of the validation data) and the second element contains the validation data (a unique 1/kth of the data).
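A minimal sketch of driving cross-validation with this helper (assuming an existing `data: RDD[LabeledPoint]`; the fold count and seed are illustrative):

    import org.apache.spark.mllib.util.MLUtils

    val folds = MLUtils.kFold(data, numFolds = 3, seed = 11)
    folds.foreach { case (training, validation) =>
      // fit a model on `training`, score it on `validation`
    }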
kill(boolean) - Method in class org.apache.spark.scheduler.Task
Kills a task by setting the interrupted flag to true.
killed() - Method in class org.apache.spark.scheduler.Task
Whether the task has been killed.
KILLED() - Static method in class org.apache.spark.TaskState
 
killEnabled() - Method in class org.apache.spark.ui.jobs.JobsTab
 
killEnabled() - Method in class org.apache.spark.ui.jobs.StagesTab
 
killEnabled() - Method in class org.apache.spark.ui.SparkUI
 
killExecutor(String) - Method in interface org.apache.spark.ExecutorAllocationClient
Request that the cluster manager kill the specified executor.
killExecutor(String) - Method in class org.apache.spark.SparkContext
:: DeveloperApi :: Request that the cluster manager kill the specified executor.
killExecutors(Seq<String>) - Method in interface org.apache.spark.ExecutorAllocationClient
Request that the cluster manager kill the specified executors.
killExecutors(Seq<String>) - Method in class org.apache.spark.scheduler.cluster.CoarseGrainedSchedulerBackend
Request that the cluster manager kill the specified executors.
killExecutors(Seq<String>) - Method in class org.apache.spark.SparkContext
:: DeveloperApi :: Request that the cluster manager kill the specified executors.
killTask(long, String, boolean) - Method in class org.apache.spark.scheduler.cluster.CoarseGrainedSchedulerBackend
 
killTask(long, String, boolean) - Method in class org.apache.spark.scheduler.cluster.mesos.MesosSchedulerBackend
 
KillTask - Class in org.apache.spark.scheduler.local
 
KillTask(long, boolean) - Constructor for class org.apache.spark.scheduler.local.KillTask
 
killTask(long, String, boolean) - Method in class org.apache.spark.scheduler.local.LocalBackend
 
killTask(long, String, boolean) - Method in interface org.apache.spark.scheduler.SchedulerBackend
 
KinesisCheckpointState - Class in org.apache.spark.streaming.kinesis
This is a helper class for managing checkpoint clocks.
KinesisCheckpointState(Duration, Clock) - Constructor for class org.apache.spark.streaming.kinesis.KinesisCheckpointState
 
kinesisClientLibConfiguration() - Method in class org.apache.spark.streaming.kinesis.KinesisReceiver
 
KinesisReceiver - Class in org.apache.spark.streaming.kinesis
Custom AWS Kinesis-specific implementation of Spark Streaming's Receiver.
KinesisReceiver(String, String, String, Duration, InitialPositionInStream, StorageLevel) - Constructor for class org.apache.spark.streaming.kinesis.KinesisReceiver
 
KinesisRecordProcessor - Class in org.apache.spark.streaming.kinesis
Kinesis-specific implementation of the Kinesis Client Library (KCL) IRecordProcessor.
KinesisRecordProcessor(KinesisReceiver, String, KinesisCheckpointState) - Constructor for class org.apache.spark.streaming.kinesis.KinesisRecordProcessor
 
KinesisUtils - Class in org.apache.spark.streaming.kinesis
:: Experimental :: Helper class to create an Amazon Kinesis input stream.
KinesisUtils() - Constructor for class org.apache.spark.streaming.kinesis.KinesisUtils
 
KinesisWordCountASL - Class in org.apache.spark.examples.streaming
Kinesis Spark Streaming WordCount example.
KinesisWordCountASL() - Constructor for class org.apache.spark.examples.streaming.KinesisWordCountASL
 
KinesisWordCountProducerASL - Class in org.apache.spark.examples.streaming
Usage: KinesisWordCountProducerASL is the name of the Kinesis stream (ie.
KinesisWordCountProducerASL() - Constructor for class org.apache.spark.examples.streaming.KinesisWordCountProducerASL
 
kManifest() - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
 
KMeans - Class in org.apache.spark.mllib.clustering
K-means clustering with support for multiple parallel runs and a k-means++-like initialization mode (the k-means|| algorithm by Bahmani et al.).
KMeans() - Constructor for class org.apache.spark.mllib.clustering.KMeans
Constructs a KMeans instance with default parameters: {k: 2, maxIterations: 20, runs: 1, initializationMode: "k-means||", initializationSteps: 5, epsilon: 1e-4}.
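A minimal sketch of training a model with the companion object's train method (toy points are illustrative, assuming an existing `sc: SparkContext`):

    import org.apache.spark.mllib.clustering.KMeans
    import org.apache.spark.mllib.linalg.Vectors

    // Two obvious clusters of toy points.
    val points = sc.parallelize(Seq(
      Vectors.dense(0.0, 0.0), Vectors.dense(0.1, 0.1),
      Vectors.dense(9.0, 9.0), Vectors.dense(9.1, 9.1)))
    val model = KMeans.train(points, k = 2, maxIterations = 20)
    model.clusterCenters.foreach(println)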
KMeansDataGenerator - Class in org.apache.spark.mllib.util
:: DeveloperApi :: Generate test data for KMeans.
KMeansDataGenerator() - Constructor for class org.apache.spark.mllib.util.KMeansDataGenerator
 
KMeansModel - Class in org.apache.spark.mllib.clustering
A clustering model for K-means.
KMeansModel(Vector[]) - Constructor for class org.apache.spark.mllib.clustering.KMeansModel
 
kMeansPlusPlus(int, VectorWithNorm[], double[], int, int) - Static method in class org.apache.spark.mllib.clustering.LocalKMeans
Run K-means++ on the weighted point set points.
KryoDeserializationStream - Class in org.apache.spark.serializer
 
KryoDeserializationStream(Kryo, InputStream) - Constructor for class org.apache.spark.serializer.KryoDeserializationStream
 
KryoRegistrator - Interface in org.apache.spark.serializer
Interface implemented by clients to register their classes with Kryo when using Kryo serialization.
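A minimal sketch of implementing this interface (`MyClass` is a placeholder application type; the registrator is enabled via the spark.serializer and spark.kryo.registrator configuration properties):

    import com.esotericsoftware.kryo.Kryo
    import org.apache.spark.serializer.KryoRegistrator

    case class MyClass(x: Int)   // placeholder application type

    class MyRegistrator extends KryoRegistrator {
      override def registerClasses(kryo: Kryo) {
        kryo.register(classOf[MyClass])
      }
    }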
KryoResourcePool - Class in org.apache.spark.sql.execution
 
KryoResourcePool(int) - Constructor for class org.apache.spark.sql.execution.KryoResourcePool
 
KryoSerializationStream - Class in org.apache.spark.serializer
 
KryoSerializationStream(Kryo, OutputStream) - Constructor for class org.apache.spark.serializer.KryoSerializationStream
 
KryoSerializer - Class in org.apache.spark.serializer
A Spark serializer that uses the Kryo serialization library.
KryoSerializer(SparkConf) - Constructor for class org.apache.spark.serializer.KryoSerializer
 
KryoSerializerInstance - Class in org.apache.spark.serializer
 
KryoSerializerInstance(KryoSerializer) - Constructor for class org.apache.spark.serializer.KryoSerializerInstance
 
kv() - Method in class org.apache.spark.sql.execution.SetCommand
 

L

L1Updater - Class in org.apache.spark.mllib.optimization
:: DeveloperApi :: Updater for L1 regularized problems.
L1Updater() - Constructor for class org.apache.spark.mllib.optimization.L1Updater
 
label() - Method in class org.apache.spark.mllib.regression.LabeledPoint
 
label() - Method in class org.apache.spark.mllib.tree.impl.TreePoint
 
labelCol() - Method in interface org.apache.spark.ml.param.HasLabelCol
param for label column name
LabeledPoint - Class in org.apache.spark.mllib.regression
Class that represents the features and labels of a data point.
LabeledPoint(double, Vector) - Constructor for class org.apache.spark.mllib.regression.LabeledPoint
 
LabelPropagation - Class in org.apache.spark.graphx.lib
Label Propagation algorithm.
LabelPropagation() - Constructor for class org.apache.spark.graphx.lib.LabelPropagation
 
labels() - Method in class org.apache.spark.mllib.classification.NaiveBayesModel
 
labels() - Method in class org.apache.spark.mllib.evaluation.MulticlassMetrics
Returns the sequence of labels in ascending order.
labels() - Method in class org.apache.spark.mllib.evaluation.MultilabelMetrics
Returns the sequence of labels in ascending order.
LassoModel - Class in org.apache.spark.mllib.regression
Regression model trained using Lasso.
LassoModel(Vector, double) - Constructor for class org.apache.spark.mllib.regression.LassoModel
 
LassoWithSGD - Class in org.apache.spark.mllib.regression
Train a regression model with L1-regularization using Stochastic Gradient Descent.
LassoWithSGD() - Constructor for class org.apache.spark.mllib.regression.LassoWithSGD
Construct a Lasso object with default parameters: {stepSize: 1.0, numIterations: 100, regParam: 0.01, miniBatchFraction: 1.0}.
lastCompletedBatch() - Method in class org.apache.spark.streaming.ui.StreamingJobProgressListener
 
lastDir() - Method in class org.apache.spark.mllib.optimization.NNLS.Workspace
 
lastError() - Method in class org.apache.spark.streaming.scheduler.ReceiverInfo
 
lastErrorMessage() - Method in class org.apache.spark.streaming.scheduler.ReceiverInfo
 
lastFinishTime() - Method in class org.apache.spark.ui.ConsoleProgressBar
 
lastId() - Static method in class org.apache.spark.Accumulators
 
lastLaunchTime() - Method in class org.apache.spark.scheduler.TaskSetManager
 
lastProgressBar() - Method in class org.apache.spark.ui.ConsoleProgressBar
 
lastReceivedBatch() - Method in class org.apache.spark.streaming.ui.StreamingJobProgressListener
 
lastReceivedBatchRecords() - Method in class org.apache.spark.streaming.ui.StreamingJobProgressListener
 
lastSeenMs() - Method in class org.apache.spark.storage.BlockManagerInfo
 
lastUpdateTime() - Method in class org.apache.spark.ui.ConsoleProgressBar
 
lastValidTime() - Method in class org.apache.spark.streaming.dstream.InputDStream
 
laterViewToken() - Static method in class org.apache.spark.sql.hive.HiveQl
 
latestInfo() - Method in class org.apache.spark.scheduler.Stage
Pointer to the latest [StageInfo] object, set by DAGScheduler.
latestModel() - Method in class org.apache.spark.mllib.clustering.StreamingKMeans
Return the latest model.
latestModel() - Method in class org.apache.spark.mllib.regression.StreamingLinearAlgorithm
Return the latest model.
LAUNCHING() - Static method in class org.apache.spark.TaskState
 
launchTasks(Seq<Seq<TaskDescription>>) - Method in class org.apache.spark.scheduler.cluster.CoarseGrainedSchedulerBackend.DriverActor
 
launchTime() - Method in class org.apache.spark.scheduler.TaskInfo
 
LBFGS - Class in org.apache.spark.mllib.optimization
:: DeveloperApi :: Class used to solve an optimization problem using Limited-memory BFGS.
LBFGS(Gradient, Updater) - Constructor for class org.apache.spark.mllib.optimization.LBFGS
 
LeafNode - Interface in org.apache.spark.sql.execution
 
learningRate() - Method in class org.apache.spark.mllib.tree.configuration.BoostingStrategy
 
LeastSquaresGradient - Class in org.apache.spark.mllib.optimization
:: DeveloperApi :: Compute gradient and loss for a Least-squared loss function, as used in linear regression.
LeastSquaresGradient() - Constructor for class org.apache.spark.mllib.optimization.LeastSquaresGradient
 
left() - Method in class org.apache.spark.sql.execution.Except
 
left() - Method in class org.apache.spark.sql.execution.Intersect
 
left() - Method in class org.apache.spark.sql.execution.joins.BroadcastHashJoin
 
left() - Method in class org.apache.spark.sql.execution.joins.BroadcastNestedLoopJoin
 
left() - Method in class org.apache.spark.sql.execution.joins.CartesianProduct
 
left() - Method in interface org.apache.spark.sql.execution.joins.HashJoin
 
left() - Method in class org.apache.spark.sql.execution.joins.HashOuterJoin
 
left() - Method in class org.apache.spark.sql.execution.joins.LeftSemiJoinBNL
The Streamed Relation
left() - Method in class org.apache.spark.sql.execution.joins.LeftSemiJoinHash
 
left() - Method in class org.apache.spark.sql.execution.joins.ShuffledHashJoin
 
leftChildIndex(int) - Static method in class org.apache.spark.mllib.tree.model.Node
Return the index of the left child of this node.
leftImpurity() - Method in class org.apache.spark.mllib.tree.model.InformationGainStats
 
leftJoin(Self, Function3<Object, VD, Option<VD2>, VD3>, ClassTag<VD2>, ClassTag<VD3>) - Method in class org.apache.spark.graphx.impl.VertexPartitionBaseOps
Left outer join another VertexPartition.
leftJoin(Iterator<Tuple2<Object, VD2>>, Function3<Object, VD, Option<VD2>, VD3>, ClassTag<VD2>, ClassTag<VD3>) - Method in class org.apache.spark.graphx.impl.VertexPartitionBaseOps
Left outer join another iterator of messages.
leftJoin(RDD<Tuple2<Object, VD2>>, Function3<Object, VD, Option<VD2>, VD3>, ClassTag<VD2>, ClassTag<VD3>) - Method in class org.apache.spark.graphx.impl.VertexRDDImpl
 
leftJoin(RDD<Tuple2<Object, VD2>>, Function3<Object, VD, Option<VD2>, VD3>, ClassTag<VD2>, ClassTag<VD3>) - Method in class org.apache.spark.graphx.VertexRDD
Left joins this VertexRDD with an RDD containing vertex attribute pairs.
leftKeys() - Method in class org.apache.spark.sql.execution.joins.BroadcastHashJoin
 
leftKeys() - Method in interface org.apache.spark.sql.execution.joins.HashJoin
 
leftKeys() - Method in class org.apache.spark.sql.execution.joins.HashOuterJoin
 
leftKeys() - Method in class org.apache.spark.sql.execution.joins.LeftSemiJoinHash
 
leftKeys() - Method in class org.apache.spark.sql.execution.joins.ShuffledHashJoin
 
leftNode() - Method in class org.apache.spark.mllib.tree.model.Node
 
leftOuterJoin(JavaPairRDD<K, W>, Partitioner) - Method in class org.apache.spark.api.java.JavaPairRDD
Perform a left outer join of this and other.
leftOuterJoin(JavaPairRDD<K, W>) - Method in class org.apache.spark.api.java.JavaPairRDD
Perform a left outer join of this and other.
leftOuterJoin(JavaPairRDD<K, W>, int) - Method in class org.apache.spark.api.java.JavaPairRDD
Perform a left outer join of this and other.
leftOuterJoin(RDD<Tuple2<K, W>>, Partitioner) - Method in class org.apache.spark.rdd.PairRDDFunctions
Perform a left outer join of this and other.
leftOuterJoin(RDD<Tuple2<K, W>>) - Method in class org.apache.spark.rdd.PairRDDFunctions
Perform a left outer join of this and other.
leftOuterJoin(RDD<Tuple2<K, W>>, int) - Method in class org.apache.spark.rdd.PairRDDFunctions
Perform a left outer join of this and other.
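A minimal sketch of an RDD left outer join (data is illustrative, assuming an existing `sc: SparkContext`):

    val ratings = sc.parallelize(Seq(("a", 1), ("b", 2)))
    val names   = sc.parallelize(Seq(("a", "Ann")))
    // Every key of `ratings` survives; unmatched keys pair with None.
    val joined = ratings.leftOuterJoin(names)
    // ("a", (1, Some("Ann"))), ("b", (2, None))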
leftOuterJoin(JavaPairDStream<K, W>) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
Return a new DStream by applying 'left outer join' between RDDs of this DStream and other DStream.
leftOuterJoin(JavaPairDStream<K, W>, int) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
Return a new DStream by applying 'left outer join' between RDDs of this DStream and other DStream.
leftOuterJoin(JavaPairDStream<K, W>, Partitioner) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
Return a new DStream by applying 'left outer join' between RDDs of this DStream and other DStream.
leftOuterJoin(DStream<Tuple2<K, W>>, ClassTag<W>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
Return a new DStream by applying 'left outer join' between RDDs of this DStream and other DStream.
leftOuterJoin(DStream<Tuple2<K, W>>, int, ClassTag<W>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
Return a new DStream by applying 'left outer join' between RDDs of this DStream and other DStream.
leftOuterJoin(DStream<Tuple2<K, W>>, Partitioner, ClassTag<W>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
Return a new DStream by applying 'left outer join' between RDDs of this DStream and other DStream.
leftPredict() - Method in class org.apache.spark.mllib.tree.model.InformationGainStats
 
LeftSemiJoin() - Method in class org.apache.spark.sql.execution.SparkStrategies
 
LeftSemiJoinBNL - Class in org.apache.spark.sql.execution.joins
:: DeveloperApi :: Using BroadcastNestedLoopJoin to calculate the left semi-join result when there are no join keys for a hash join.
LeftSemiJoinBNL(SparkPlan, SparkPlan, Option<Expression>) - Constructor for class org.apache.spark.sql.execution.joins.LeftSemiJoinBNL
 
LeftSemiJoinHash - Class in org.apache.spark.sql.execution.joins
:: DeveloperApi :: Build the right table's join keys into a HashSet, then iterate through the left table to find whether the join keys are in the hash set.
LeftSemiJoinHash(Seq<Expression>, Seq<Expression>, SparkPlan, SparkPlan) - Constructor for class org.apache.spark.sql.execution.joins.LeftSemiJoinHash
 
leftZipJoin(VertexRDD<VD2>, Function3<Object, VD, Option<VD2>, VD3>, ClassTag<VD2>, ClassTag<VD3>) - Method in class org.apache.spark.graphx.impl.VertexRDDImpl
 
leftZipJoin(VertexRDD<VD2>, Function3<Object, VD, Option<VD2>, VD3>, ClassTag<VD2>, ClassTag<VD3>) - Method in class org.apache.spark.graphx.VertexRDD
Left joins this RDD with another VertexRDD with the same index.
length() - Method in class org.apache.spark.scheduler.SplitInfo
 
length() - Method in class org.apache.spark.sql.api.java.Row
Returns the number of columns present in this Row.
length() - Method in class org.apache.spark.sql.parquet.ParquetTypeInfo
 
length() - Method in class org.apache.spark.storage.FileSegment
 
length() - Method in class org.apache.spark.storage.TachyonFileSegment
 
length() - Method in class org.apache.spark.streaming.util.WriteAheadLogFileSegment
 
length() - Method in class org.apache.spark.util.Distribution
 
length() - Method in class org.apache.spark.util.Vector
 
less(Duration) - Method in class org.apache.spark.streaming.Duration
 
less(Time) - Method in class org.apache.spark.streaming.Time
 
lessEq(Duration) - Method in class org.apache.spark.streaming.Duration
 
lessEq(Time) - Method in class org.apache.spark.streaming.Time
 
LessThan - Class in org.apache.spark.sql.sources
 
LessThan(String, Object) - Constructor for class org.apache.spark.sql.sources.LessThan
 
LessThanOrEqual - Class in org.apache.spark.sql.sources
 
LessThanOrEqual(String, Object) - Constructor for class org.apache.spark.sql.sources.LessThanOrEqual
 
level() - Method in class org.apache.spark.storage.BlockInfo
 
lexical() - Method in class org.apache.spark.sql.hive.ExtendedHiveQlParser
 
lexical() - Method in class org.apache.spark.sql.sources.DDLParser
 
lexicographicOrdering() - Static method in class org.apache.spark.graphx.Edge
 
lexicographicOrdering() - Static method in class org.apache.spark.graphx.impl.EdgeWithLocalIds
 
libraryPathEnvName() - Static method in class org.apache.spark.util.Utils
Return the current system LD_LIBRARY_PATH name.
libraryPathEnvPrefix(Seq<String>) - Static method in class org.apache.spark.util.Utils
Return the prefix of a command that appends the given library paths to the system-specific library path environment variable.
LIKE() - Static method in class org.apache.spark.sql.hive.HiveQl
 
Limit - Class in org.apache.spark.sql.execution
:: DeveloperApi :: Take the first limit elements.
Limit(int, SparkPlan) - Constructor for class org.apache.spark.sql.execution.Limit
 
limit() - Method in class org.apache.spark.sql.execution.Limit
 
limit() - Method in class org.apache.spark.sql.execution.TakeOrdered
 
limit(Expression) - Method in class org.apache.spark.sql.SchemaRDD
 
limit(int) - Method in class org.apache.spark.sql.SchemaRDD
Limits the results by the given integer.
LinearDataGenerator - Class in org.apache.spark.mllib.util
:: DeveloperApi :: Generate sample data used for linear regression.
LinearDataGenerator() - Constructor for class org.apache.spark.mllib.util.LinearDataGenerator
 
LinearRegressionModel - Class in org.apache.spark.mllib.regression
Regression model trained using LinearRegression.
LinearRegressionModel(Vector, double) - Constructor for class org.apache.spark.mllib.regression.LinearRegressionModel
 
LinearRegressionWithSGD - Class in org.apache.spark.mllib.regression
Train a linear regression model with no regularization using Stochastic Gradient Descent.
LinearRegressionWithSGD(double, int, double) - Constructor for class org.apache.spark.mllib.regression.LinearRegressionWithSGD
 
LinearRegressionWithSGD() - Constructor for class org.apache.spark.mllib.regression.LinearRegressionWithSGD
Construct a LinearRegression object with default parameters: {stepSize: 1.0, numIterations: 100, miniBatchFraction: 1.0}.
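A minimal sketch of training via the companion object's train method (the tiny dataset is illustrative, assuming an existing `sc: SparkContext`):

    import org.apache.spark.mllib.linalg.Vectors
    import org.apache.spark.mllib.regression.{LabeledPoint, LinearRegressionWithSGD}

    val data = sc.parallelize(Seq(
      LabeledPoint(1.0, Vectors.dense(1.0)),
      LabeledPoint(2.0, Vectors.dense(2.0)),
      LabeledPoint(3.0, Vectors.dense(3.0))))
    val model = LinearRegressionWithSGD.train(data, numIterations = 100)
    println(model.predict(Vectors.dense(4.0)))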
listener() - Method in class org.apache.spark.scheduler.ActiveJob
 
listener() - Method in class org.apache.spark.scheduler.JobSubmitted
 
listener() - Method in class org.apache.spark.streaming.ui.StreamingTab
 
listener() - Method in class org.apache.spark.ui.env.EnvironmentTab
 
listener() - Method in class org.apache.spark.ui.exec.ExecutorsTab
 
listener() - Method in class org.apache.spark.ui.jobs.JobsTab
 
listener() - Method in class org.apache.spark.ui.jobs.StagesTab
 
listener() - Method in class org.apache.spark.ui.storage.StorageTab
 
listenerBus() - Method in class org.apache.spark.SparkContext
 
listenerBus() - Method in class org.apache.spark.streaming.scheduler.JobScheduler
 
listenerThread() - Method in class org.apache.spark.streaming.scheduler.StreamingListenerBus
 
listenerThreadIsAlive() - Method in class org.apache.spark.scheduler.LiveListenerBus
For testing only.
listFiles(String, Configuration) - Static method in class org.apache.spark.sql.parquet.FileSystemHelper
 
listingTable(Seq<String>, Function1<T, Seq<Node>>, Iterable<T>, boolean, Option<String>, Seq<String>, boolean) - Static method in class org.apache.spark.ui.UIUtils
Returns an HTML table constructed by generating a row for each object in a sequence.
LiveListenerBus - Class in org.apache.spark.scheduler
Asynchronously passes SparkListenerEvents to registered SparkListeners.
LiveListenerBus() - Constructor for class org.apache.spark.scheduler.LiveListenerBus
 
loadClass(String) - Method in class org.apache.spark.util.ParentClassLoader
 
loadDefaultSparkProperties(SparkConf, String) - Static method in class org.apache.spark.util.Utils
Load default Spark properties from the given file.
loadLabeledData(SparkContext, String) - Static method in class org.apache.spark.mllib.util.MLUtils
loadLabeledPoints(SparkContext, String, int) - Static method in class org.apache.spark.mllib.util.MLUtils
Loads labeled points saved using RDD[LabeledPoint].saveAsTextFile.
loadLabeledPoints(SparkContext, String) - Static method in class org.apache.spark.mllib.util.MLUtils
Loads labeled points saved using RDD[LabeledPoint].saveAsTextFile with the default number of partitions.
loadLibSVMFile(SparkContext, String, int, int) - Static method in class org.apache.spark.mllib.util.MLUtils
Loads labeled data in the LIBSVM format into an RDD[LabeledPoint].
loadLibSVMFile(SparkContext, String, boolean, int, int) - Static method in class org.apache.spark.mllib.util.MLUtils
 
loadLibSVMFile(SparkContext, String, int) - Static method in class org.apache.spark.mllib.util.MLUtils
Loads labeled data in the LIBSVM format into an RDD[LabeledPoint], with the default number of partitions.
loadLibSVMFile(SparkContext, String, boolean, int) - Static method in class org.apache.spark.mllib.util.MLUtils
 
loadLibSVMFile(SparkContext, String, boolean) - Static method in class org.apache.spark.mllib.util.MLUtils
 
loadLibSVMFile(SparkContext, String) - Static method in class org.apache.spark.mllib.util.MLUtils
Loads binary labeled data in the LIBSVM format into an RDD[LabeledPoint], with number of features determined automatically and the default number of partitions.
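A minimal sketch of loading LIBSVM data (the path is illustrative, assuming an existing `sc: SparkContext`):

    import org.apache.spark.mllib.util.MLUtils

    // examples: RDD[LabeledPoint]; the feature count is inferred from the file.
    val examples = MLUtils.loadLibSVMFile(sc, "data/mllib/sample_libsvm_data.txt")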
loadTestTable(String) - Method in class org.apache.spark.sql.hive.test.TestHiveContext
 
loadVectors(SparkContext, String, int) - Static method in class org.apache.spark.mllib.util.MLUtils
Loads vectors saved using RDD[Vector].saveAsTextFile.
loadVectors(SparkContext, String) - Static method in class org.apache.spark.mllib.util.MLUtils
Loads vectors saved using RDD[Vector].saveAsTextFile with the default number of partitions.
localAccums() - Static method in class org.apache.spark.Accumulators
 
LocalActor - Class in org.apache.spark.scheduler.local
Calls to LocalBackend are all serialized through LocalActor.
LocalActor(TaskSchedulerImpl, LocalBackend, int) - Constructor for class org.apache.spark.scheduler.local.LocalActor
 
localActor() - Method in class org.apache.spark.scheduler.local.LocalBackend
 
LocalBackend - Class in org.apache.spark.scheduler.local
LocalBackend is used when running a local version of Spark where the executor, backend, and master all run in the same JVM.
LocalBackend(TaskSchedulerImpl, int) - Constructor for class org.apache.spark.scheduler.local.LocalBackend
 
localDirs() - Method in class org.apache.spark.storage.DiskBlockManager
 
localDstId() - Method in class org.apache.spark.graphx.impl.EdgeWithLocalIds
 
localFraction() - Method in class org.apache.spark.rdd.CoalescedRDDPartition
Computes the fraction of the parents' partitions containing preferredLocation within their getPreferredLocs.
LocalHiveContext - Class in org.apache.spark.sql.hive
DEPRECATED: Use HiveContext instead.
LocalHiveContext(SparkContext) - Constructor for class org.apache.spark.sql.hive.LocalHiveContext
 
localHostName() - Static method in class org.apache.spark.util.Utils
Get the local machine's hostname.
localIpAddress() - Static method in class org.apache.spark.util.Utils
Get the local host's IP address in dotted-quad format (e.g.
localIpAddressHostname() - Static method in class org.apache.spark.util.Utils
 
localityWaits() - Method in class org.apache.spark.scheduler.TaskSetManager
 
LocalKMeans - Class in org.apache.spark.mllib.clustering
A utility object to run K-means locally.
LocalKMeans() - Constructor for class org.apache.spark.mllib.clustering.LocalKMeans
 
localSrcId() - Method in class org.apache.spark.graphx.impl.EdgeWithLocalIds
 
localValue() - Method in class org.apache.spark.Accumulable
Get the current value of this accumulator from within a task.
location() - Method in class org.apache.spark.scheduler.CompressedMapStatus
 
location() - Method in class org.apache.spark.scheduler.HighlyCompressedMapStatus
 
location() - Method in interface org.apache.spark.scheduler.MapStatus
Location where this task was run.
location() - Method in class org.apache.spark.streaming.scheduler.ReceiverInfo
 
locations_() - Method in class org.apache.spark.rdd.BlockRDD
 
log() - Method in interface org.apache.spark.Logging
 
log() - Method in interface org.apache.spark.util.ActorLogReceive
 
log(String, boolean) - Method in class org.apache.spark.util.FileLogger
Log the message to the given writer.
log2(double) - Static method in class org.apache.spark.mllib.tree.impurity.Entropy
 
log_() - Method in interface org.apache.spark.Logging
 
LOG_FILE_PERMISSIONS() - Static method in class org.apache.spark.scheduler.EventLoggingListener
 
LOG_PREFIX() - Static method in class org.apache.spark.scheduler.EventLoggingListener
 
logDebug(Function0<String>) - Method in interface org.apache.spark.Logging
 
logDebug(Function0<String>, Throwable) - Method in interface org.apache.spark.Logging
 
logDir() - Method in class org.apache.spark.scheduler.EventLoggingListener
 
logDirName() - Method in class org.apache.spark.scheduler.EventLoggingListener
 
logDirName() - Method in class org.apache.spark.scheduler.JobLogger
 
logError(Function0<String>) - Method in interface org.apache.spark.Logging
 
logError(Function0<String>, Throwable) - Method in interface org.apache.spark.Logging
 
logFileRegex() - Static method in class org.apache.spark.streaming.util.WriteAheadLogManager
 
logFilesTologInfo(Seq<Path>) - Static method in class org.apache.spark.streaming.util.WriteAheadLogManager
Convert a sequence of files to a sequence of sorted LogInfo objects.
loggedEvents() - Method in class org.apache.spark.scheduler.EventLoggingListener
 
Logging - Interface in org.apache.spark
:: DeveloperApi :: Utility trait for classes that want to log data.
logicalPlan() - Method in class org.apache.spark.sql.execution.ExplainCommand
 
logicalPlan() - Method in interface org.apache.spark.sql.SchemaRDDLike
 
logicalPlanToSparkQuery(LogicalPlan) - Method in class org.apache.spark.sql.SQLContext
:: DeveloperApi :: Allows catalyst LogicalPlans to be executed as a SchemaRDD.
LogicalRDD - Class in org.apache.spark.sql.execution
 
LogicalRDD(Seq<Attribute>, RDD<Row>, SQLContext) - Constructor for class org.apache.spark.sql.execution.LogicalRDD
 
LogicalRelation - Class in org.apache.spark.sql.sources
Used to link a BaseRelation into a logical query plan.
LogicalRelation(BaseRelation) - Constructor for class org.apache.spark.sql.sources.LogicalRelation
 
logInfo(Function0<String>) - Method in interface org.apache.spark.Logging
 
logInfo(Function0<String>, Throwable) - Method in interface org.apache.spark.Logging
 
LogisticGradient - Class in org.apache.spark.mllib.optimization
:: DeveloperApi :: Compute gradient and loss for a logistic loss function, as used in binary classification.
LogisticGradient() - Constructor for class org.apache.spark.mllib.optimization.LogisticGradient
 
LogisticRegression - Class in org.apache.spark.ml.classification
Logistic regression.
LogisticRegression() - Constructor for class org.apache.spark.ml.classification.LogisticRegression
 
LogisticRegressionDataGenerator - Class in org.apache.spark.mllib.util
:: DeveloperApi :: Generate test data for LogisticRegression.
LogisticRegressionDataGenerator() - Constructor for class org.apache.spark.mllib.util.LogisticRegressionDataGenerator
 
LogisticRegressionModel - Class in org.apache.spark.ml.classification
:: AlphaComponent :: Model produced by LogisticRegression.
LogisticRegressionModel(LogisticRegression, ParamMap, Vector) - Constructor for class org.apache.spark.ml.classification.LogisticRegressionModel
 
LogisticRegressionModel - Class in org.apache.spark.mllib.classification
Classification model trained using Logistic Regression.
LogisticRegressionModel(Vector, double) - Constructor for class org.apache.spark.mllib.classification.LogisticRegressionModel
 
LogisticRegressionParams - Interface in org.apache.spark.ml.classification
:: AlphaComponent :: Params for logistic regression.
LogisticRegressionWithLBFGS - Class in org.apache.spark.mllib.classification
Train a classification model for Logistic Regression using Limited-memory BFGS.
LogisticRegressionWithLBFGS() - Constructor for class org.apache.spark.mllib.classification.LogisticRegressionWithLBFGS
 
LogisticRegressionWithSGD - Class in org.apache.spark.mllib.classification
Train a classification model for Logistic Regression using Stochastic Gradient Descent.
LogisticRegressionWithSGD() - Constructor for class org.apache.spark.mllib.classification.LogisticRegressionWithSGD
Construct a LogisticRegression object with default parameters: {stepSize: 1.0, numIterations: 100, regParam: 0.01, miniBatchFraction: 1.0}.
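A minimal sketch of training via the companion object's train method (assuming an existing `training: RDD[LabeledPoint]` with 0/1 labels; the iteration count is illustrative):

    import org.apache.spark.mllib.classification.LogisticRegressionWithSGD

    val model = LogisticRegressionWithSGD.train(training, numIterations = 100)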
logLine(String, boolean) - Method in class org.apache.spark.util.FileLogger
Log the message to the given writer as a new line.
LogLoss - Class in org.apache.spark.mllib.tree.loss
:: DeveloperApi :: Class for log loss calculation (for classification).
LogLoss() - Constructor for class org.apache.spark.mllib.tree.loss.LogLoss
 
logMemoryUsage() - Method in class org.apache.spark.storage.MemoryStore
Log information about current memory usage.
logName() - Method in interface org.apache.spark.Logging
 
logNormalGraph(SparkContext, int, int, double, double, long) - Static method in class org.apache.spark.graphx.util.GraphGenerators
Generate a graph whose vertex out degree distribution is log normal.
logPaths() - Method in class org.apache.spark.scheduler.EventLoggingInfo
 
logTrace(Function0<String>) - Method in interface org.apache.spark.Logging
 
logTrace(Function0<String>, Throwable) - Method in interface org.apache.spark.Logging
 
logUncaughtExceptions(Function0<T>) - Static method in class org.apache.spark.util.Utils
Execute the given block, logging and re-throwing any uncaught exception.
logUnrollFailureMessage(BlockId, long) - Method in class org.apache.spark.storage.MemoryStore
Log a warning for failing to unroll a block.
logWarning(Function0<String>) - Method in interface org.apache.spark.Logging
 
logWarning(Function0<String>, Throwable) - Method in interface org.apache.spark.Logging
 
LONG - Class in org.apache.spark.sql.columnar
 
LONG() - Constructor for class org.apache.spark.sql.columnar.LONG
 
LONG_FORM() - Static method in class org.apache.spark.util.CallSite
 
LongColumnAccessor - Class in org.apache.spark.sql.columnar
 
LongColumnAccessor(ByteBuffer) - Constructor for class org.apache.spark.sql.columnar.LongColumnAccessor
 
LongColumnBuilder - Class in org.apache.spark.sql.columnar
 
LongColumnBuilder() - Constructor for class org.apache.spark.sql.columnar.LongColumnBuilder
 
LongColumnStats - Class in org.apache.spark.sql.columnar
 
LongColumnStats() - Constructor for class org.apache.spark.sql.columnar.LongColumnStats
 
LongDelta - Class in org.apache.spark.sql.columnar.compression
 
LongDelta() - Constructor for class org.apache.spark.sql.columnar.compression.LongDelta
 
LongDelta.Decoder - Class in org.apache.spark.sql.columnar.compression
 
LongDelta.Decoder(ByteBuffer, NativeColumnType<LongType$>) - Constructor for class org.apache.spark.sql.columnar.compression.LongDelta.Decoder
 
LongDelta.Encoder - Class in org.apache.spark.sql.columnar.compression
 
LongDelta.Encoder() - Constructor for class org.apache.spark.sql.columnar.compression.LongDelta.Encoder
 
longForm() - Method in class org.apache.spark.util.CallSite
 
LongHashSetSerializer - Class in org.apache.spark.sql.execution
 
LongHashSetSerializer() - Constructor for class org.apache.spark.sql.execution.LongHashSetSerializer
 
LongParam - Class in org.apache.spark.ml.param
Specialized version of Param[Long] for Java.
LongParam(Params, String, String, Option<Object>) - Constructor for class org.apache.spark.ml.param.LongParam
 
longToLongWritable(long) - Static method in class org.apache.spark.SparkContext
 
LongType - Static variable in class org.apache.spark.sql.api.java.DataType
Gets the LongType object.
LongType - Class in org.apache.spark.sql.api.java
The data type representing long and Long values.
longWritableConverter() - Static method in class org.apache.spark.SparkContext
 
lookup(K) - Method in class org.apache.spark.api.java.JavaPairRDD
Return the list of values in the RDD for key key.
lookup(K) - Method in class org.apache.spark.rdd.PairRDDFunctions
Return the list of values in the RDD for key key.
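A minimal sketch of lookup (data is illustrative, assuming an existing `sc: SparkContext`):

    val pairs = sc.parallelize(Seq(("a", 1), ("a", 2), ("b", 3)))
    val as = pairs.lookup("a")   // Seq(1, 2)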
lookupCachedData(SchemaRDD) - Method in interface org.apache.spark.sql.CacheManager
Optionally returns cached data for the given SchemaRDD
lookupCachedData(LogicalPlan) - Method in interface org.apache.spark.sql.CacheManager
Optionally returns cached data for the given LogicalPlan.
lookupFunction(String, Seq<Expression>) - Method in class org.apache.spark.sql.hive.HiveFunctionRegistry
 
lookupRelation(Seq<String>, Option<String>) - Method in class org.apache.spark.sql.hive.HiveMetastoreCatalog
 
lookupTimeout(SparkConf) - Static method in class org.apache.spark.util.AkkaUtils
Returns the default Spark timeout to use for Akka remote actor lookup.
loss() - Method in class org.apache.spark.mllib.tree.configuration.BoostingStrategy
 
Loss - Interface in org.apache.spark.mllib.tree.loss
:: DeveloperApi :: Trait for adding "pluggable" loss functions for the gradient boosting algorithm.
Losses - Class in org.apache.spark.mllib.tree.loss
 
Losses() - Constructor for class org.apache.spark.mllib.tree.loss.Losses
 
LOST() - Static method in class org.apache.spark.TaskState
 
low() - Method in class org.apache.spark.partial.BoundedDouble
 
lower() - Method in class org.apache.spark.rdd.JdbcPartition
 
LOWER() - Static method in class org.apache.spark.sql.hive.HiveQl
 
lowerBound() - Method in class org.apache.spark.sql.columnar.ColumnStatisticsSchema
 
lowerCase() - Method in class org.apache.spark.sql.hive.HiveStrategies.ParquetConversion.LogicalPlanHacks
 
lowSplit() - Method in class org.apache.spark.mllib.tree.model.Bin
 
LZ4CompressionCodec - Class in org.apache.spark.io
:: DeveloperApi :: LZ4 implementation of CompressionCodec.
LZ4CompressionCodec(SparkConf) - Constructor for class org.apache.spark.io.LZ4CompressionCodec
 
LZFCompressionCodec - Class in org.apache.spark.io
:: DeveloperApi :: LZF implementation of CompressionCodec.
LZFCompressionCodec(SparkConf) - Constructor for class org.apache.spark.io.LZFCompressionCodec
 

M

main(String[]) - Static method in class org.apache.spark.examples.streaming.JavaKinesisWordCountASL
 
main(String[]) - Static method in class org.apache.spark.examples.streaming.KinesisWordCountASL
 
main(String[]) - Static method in class org.apache.spark.examples.streaming.KinesisWordCountProducerASL
 
main(String[]) - Static method in class org.apache.spark.mllib.util.KMeansDataGenerator
 
main(String[]) - Static method in class org.apache.spark.mllib.util.LinearDataGenerator
 
main(String[]) - Static method in class org.apache.spark.mllib.util.LogisticRegressionDataGenerator
 
main(String[]) - Static method in class org.apache.spark.mllib.util.MFDataGenerator
 
main(String[]) - Static method in class org.apache.spark.mllib.util.SVMDataGenerator
 
main(String[]) - Static method in class org.apache.spark.rdd.CheckpointRDD
 
main(String[]) - Static method in class org.apache.spark.streaming.util.RawTextSender
 
main(String[]) - Static method in class org.apache.spark.streaming.util.RecurringTimer
 
main(String[]) - Static method in class org.apache.spark.ui.UIWorkloadGenerator
 
main(String[]) - Static method in class org.apache.spark.util.random.XORShiftRandom
Main method for running benchmark
makeBinarySearch(Ordering<K>, ClassTag<K>) - Static method in class org.apache.spark.util.CollectionsUtils
 
makeCopy(Object[]) - Method in class org.apache.spark.sql.execution.SparkPlan
Overridden makeCopy also propagates sqlContext to the copied plan.
makeDriverRef(String, SparkConf, ActorSystem) - Static method in class org.apache.spark.util.AkkaUtils
 
makeExecutorRef(String, SparkConf, String, int, ActorSystem) - Static method in class org.apache.spark.util.AkkaUtils
 
makeOffers() - Method in class org.apache.spark.scheduler.cluster.CoarseGrainedSchedulerBackend.DriverActor
 
makeOffers(String) - Method in class org.apache.spark.scheduler.cluster.CoarseGrainedSchedulerBackend.DriverActor
 
makeProgressBar(int, int, int, int, int) - Static method in class org.apache.spark.ui.UIUtils
 
makeRDD(Seq<T>, int, ClassTag<T>) - Method in class org.apache.spark.SparkContext
Distribute a local Scala collection to form an RDD.
makeRDD(Seq<Tuple2<T, Seq<String>>>, ClassTag<T>) - Method in class org.apache.spark.SparkContext
Distribute a local Scala collection to form an RDD, with one or more location preferences (hostnames of Spark nodes) for each object.
makeRDDForPartitionedTable(Seq<Partition>) - Method in class org.apache.spark.sql.hive.HadoopTableReader
 
makeRDDForPartitionedTable(Map<Partition, Class<? extends Deserializer>>, Option<PathFilter>) - Method in class org.apache.spark.sql.hive.HadoopTableReader
Create a HadoopRDD for every partition key specified in the query.
makeRDDForPartitionedTable(Seq<Partition>) - Method in interface org.apache.spark.sql.hive.TableReader
 
makeRDDForTable(Table) - Method in class org.apache.spark.sql.hive.HadoopTableReader
 
makeRDDForTable(Table, Class<? extends Deserializer>, Option<PathFilter>) - Method in class org.apache.spark.sql.hive.HadoopTableReader
Creates a Hadoop RDD to read data from the target table's data directory.
makeRDDForTable(Table) - Method in interface org.apache.spark.sql.hive.TableReader
 
ManualClock - Class in org.apache.spark.streaming.util
 
ManualClock() - Constructor for class org.apache.spark.streaming.util.ManualClock
 
map(Function<T, R>) - Method in interface org.apache.spark.api.java.JavaRDDLike
Return a new RDD by applying a function to all elements of this RDD.
map(Function1<Edge<ED>, ED2>, ClassTag<ED2>) - Method in class org.apache.spark.graphx.impl.EdgePartition
Construct a new edge partition by applying the function f to all edges in this partition.
map(Iterator<ED2>, ClassTag<ED2>) - Method in class org.apache.spark.graphx.impl.EdgePartition
Construct a new edge partition by using the edge attributes contained in the iterator.
map(Function2<Object, VD, VD2>, ClassTag<VD2>) - Method in class org.apache.spark.graphx.impl.VertexPartitionBaseOps
Pass each vertex attribute along with the vertex id through a map function and retain the original RDD's partitioning and index.
map(Function1<R, T>) - Method in class org.apache.spark.partial.PartialResult
Transform this PartialResult into a PartialResult of type T.
map(Function1<T, U>, ClassTag<U>) - Method in class org.apache.spark.rdd.RDD
Return a new RDD by applying a function to all elements of this RDD.
map(Function<T, R>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
Return a new DStream by applying a function to all elements of this DStream.
map(Function1<T, U>, ClassTag<U>) - Method in class org.apache.spark.streaming.dstream.DStream
Return a new DStream by applying a function to all elements of this DStream.
MAP_KEY_SCHEMA_NAME() - Static method in class org.apache.spark.sql.parquet.CatalystConverter
 
MAP_OUTPUT_TRACKER() - Static method in class org.apache.spark.util.MetadataCleanerType
 
MAP_SCHEMA_NAME() - Static method in class org.apache.spark.sql.parquet.CatalystConverter
 
MAP_VALUE_SCHEMA_NAME() - Static method in class org.apache.spark.sql.parquet.CatalystConverter
 
mapAsSerializableJavaMap(Map<A, B>) - Static method in class org.apache.spark.api.java.JavaUtils
 
mapEdgePartitions(Function2<Object, EdgePartition<ED, VD>, EdgePartition<ED2, VD2>>, ClassTag<ED2>, ClassTag<VD2>) - Method in class org.apache.spark.graphx.impl.EdgeRDDImpl
 
mapEdges(Function1<Edge<ED>, ED2>, ClassTag<ED2>) - Method in class org.apache.spark.graphx.Graph
Transforms each edge attribute in the graph using the map function.
mapEdges(Function2<Object, Iterator<Edge<ED>>, Iterator<ED2>>, ClassTag<ED2>) - Method in class org.apache.spark.graphx.Graph
Transforms each edge attribute using the map function, passing it a whole partition at a time.
mapEdges(Function2<Object, Iterator<Edge<ED>>, Iterator<ED2>>, ClassTag<ED2>) - Method in class org.apache.spark.graphx.impl.GraphImpl
 
mapFromJson(JsonAST.JValue) - Static method in class org.apache.spark.util.JsonProtocol
Utility methods for JSON deserialization.
mapId() - Method in class org.apache.spark.FetchFailed
 
mapId() - Method in class org.apache.spark.storage.ShuffleBlockId
 
mapId() - Method in class org.apache.spark.storage.ShuffleDataBlockId
 
mapId() - Method in class org.apache.spark.storage.ShuffleIndexBlockId
 
MapOutputTracker - Class in org.apache.spark
Class that keeps track of the location of the map output of a stage.
MapOutputTracker(SparkConf) - Constructor for class org.apache.spark.MapOutputTracker
 
mapOutputTracker() - Method in class org.apache.spark.scheduler.TaskSchedulerImpl
 
mapOutputTracker() - Method in class org.apache.spark.SparkEnv
 
MapOutputTrackerMaster - Class in org.apache.spark
MapOutputTracker for the driver.
MapOutputTrackerMaster(SparkConf) - Constructor for class org.apache.spark.MapOutputTrackerMaster
 
MapOutputTrackerMasterActor - Class in org.apache.spark
Actor class for MapOutputTrackerMaster.
MapOutputTrackerMasterActor(MapOutputTrackerMaster, SparkConf) - Constructor for class org.apache.spark.MapOutputTrackerMasterActor
 
MapOutputTrackerMessage - Interface in org.apache.spark
 
MapOutputTrackerWorker - Class in org.apache.spark
MapOutputTracker for the executors, which fetches map output information from the driver's MapOutputTrackerMaster.
MapOutputTrackerWorker(SparkConf) - Constructor for class org.apache.spark.MapOutputTrackerWorker
 
MapPartitionedDStream<T,U> - Class in org.apache.spark.streaming.dstream
 
MapPartitionedDStream(DStream<T>, Function1<Iterator<T>, Iterator<U>>, boolean, ClassTag<T>, ClassTag<U>) - Constructor for class org.apache.spark.streaming.dstream.MapPartitionedDStream
 
mapPartitions(FlatMapFunction<Iterator<T>, U>) - Method in interface org.apache.spark.api.java.JavaRDDLike
Return a new RDD by applying a function to each partition of this RDD.
mapPartitions(FlatMapFunction<Iterator<T>, U>, boolean) - Method in interface org.apache.spark.api.java.JavaRDDLike
Return a new RDD by applying a function to each partition of this RDD.
mapPartitions(Function1<Iterator<T>, Iterator<U>>, boolean, ClassTag<U>) - Method in class org.apache.spark.rdd.RDD
Return a new RDD by applying a function to each partition of this RDD.
mapPartitions(FlatMapFunction<Iterator<T>, U>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
Return a new DStream in which each RDD is generated by applying mapPartitions() to each RDD of this DStream.
mapPartitions(Function1<Iterator<T>, Iterator<U>>, boolean, ClassTag<U>) - Method in class org.apache.spark.streaming.dstream.DStream
Return a new DStream in which each RDD is generated by applying mapPartitions() to each RDD of this DStream.
MapPartitionsRDD<U,T> - Class in org.apache.spark.rdd
 
MapPartitionsRDD(RDD<T>, Function3<TaskContext, Object, Iterator<T>, Iterator<U>>, boolean, ClassTag<U>, ClassTag<T>) - Constructor for class org.apache.spark.rdd.MapPartitionsRDD
 
mapPartitionsToDouble(DoubleFlatMapFunction<Iterator<T>>) - Method in interface org.apache.spark.api.java.JavaRDDLike
Return a new RDD by applying a function to each partition of this RDD.
mapPartitionsToDouble(DoubleFlatMapFunction<Iterator<T>>, boolean) - Method in interface org.apache.spark.api.java.JavaRDDLike
Return a new RDD by applying a function to each partition of this RDD.
mapPartitionsToPair(PairFlatMapFunction<Iterator<T>, K2, V2>) - Method in interface org.apache.spark.api.java.JavaRDDLike
Return a new RDD by applying a function to each partition of this RDD.
mapPartitionsToPair(PairFlatMapFunction<Iterator<T>, K2, V2>, boolean) - Method in interface org.apache.spark.api.java.JavaRDDLike
Return a new RDD by applying a function to each partition of this RDD.
mapPartitionsToPair(PairFlatMapFunction<Iterator<T>, K2, V2>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
Return a new DStream in which each RDD is generated by applying mapPartitions() to each RDD of this DStream.
mapPartitionsWithContext(Function2<TaskContext, Iterator<T>, Iterator<U>>, boolean, ClassTag<U>) - Method in class org.apache.spark.rdd.RDD
:: DeveloperApi :: Return a new RDD by applying a function to each partition of this RDD.
mapPartitionsWithIndex(Function2<Integer, Iterator<T>, Iterator<R>>, boolean) - Method in interface org.apache.spark.api.java.JavaRDDLike
Return a new RDD by applying a function to each partition of this RDD, while tracking the index of the original partition.
mapPartitionsWithIndex(Function2<Object, Iterator<T>, Iterator<U>>, boolean, ClassTag<U>) - Method in class org.apache.spark.rdd.RDD
Return a new RDD by applying a function to each partition of this RDD, while tracking the index of the original partition.
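A minimal sketch of mapPartitionsWithIndex (data and partition count are illustrative, assuming an existing `sc: SparkContext`):

    val rdd = sc.parallelize(1 to 6, 3)   // three partitions
    val tagged = rdd.mapPartitionsWithIndex { (idx, iter) =>
      iter.map(x => s"partition $idx: $x")
    }
    tagged.collect().foreach(println)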
mapPartitionsWithInputSplit(Function2<InputSplit, Iterator<Tuple2<K, V>>, Iterator<R>>, boolean) - Method in class org.apache.spark.api.java.JavaHadoopRDD
Maps over a partition, providing the InputSplit that was used as the base of the partition.
mapPartitionsWithInputSplit(Function2<InputSplit, Iterator<Tuple2<K, V>>, Iterator<R>>, boolean) - Method in class org.apache.spark.api.java.JavaNewHadoopRDD
Maps over a partition, providing the InputSplit that was used as the base of the partition.
mapPartitionsWithInputSplit(Function2<InputSplit, Iterator<Tuple2<K, V>>, Iterator<U>>, boolean, ClassTag<U>) - Method in class org.apache.spark.rdd.HadoopRDD
Maps over a partition, providing the InputSplit that was used as the base of the partition.
mapPartitionsWithInputSplit(Function2<InputSplit, Iterator<Tuple2<K, V>>, Iterator<U>>, boolean, ClassTag<U>) - Method in class org.apache.spark.rdd.NewHadoopRDD
Maps over a partition, providing the InputSplit that was used as the base of the partition.
mapPartitionsWithSplit(Function2<Object, Iterator<T>, Iterator<U>>, boolean, ClassTag<U>) - Method in class org.apache.spark.rdd.RDD
Return a new RDD by applying a function to each partition of this RDD, while tracking the index of the original partition.
MappedDStream<T,U> - Class in org.apache.spark.streaming.dstream
 
MappedDStream(DStream<T>, Function1<T, U>, ClassTag<T>, ClassTag<U>) - Constructor for class org.apache.spark.streaming.dstream.MappedDStream
 
MappedRDD<U,T> - Class in org.apache.spark.rdd
 
MappedRDD(RDD<T>, Function1<T, U>, ClassTag<U>, ClassTag<T>) - Constructor for class org.apache.spark.rdd.MappedRDD
 
MappedValuesRDD<K,V,U> - Class in org.apache.spark.rdd
 
MappedValuesRDD(RDD<? extends Product2<K, V>>, Function1<V, U>) - Constructor for class org.apache.spark.rdd.MappedValuesRDD
 
mapper() - Method in class org.apache.spark.metrics.sink.MetricsServlet
 
MAPRED_REDUCE_TASKS() - Method in class org.apache.spark.sql.SQLConf.Deprecated$
 
mapredInputFormat() - Method in class org.apache.spark.scheduler.InputFormatInfo
 
mapreduceInputFormat() - Method in class org.apache.spark.scheduler.InputFormatInfo
 
mapReduceTriplets(Function1<EdgeTriplet<VD, ED>, Iterator<Tuple2<Object, A>>>, Function2<A, A, A>, Option<Tuple2<VertexRDD<?>, EdgeDirection>>, ClassTag<A>) - Method in class org.apache.spark.graphx.Graph
Aggregates values from the neighboring edges and vertices of each vertex.
mapReduceTriplets(Function1<EdgeTriplet<VD, ED>, Iterator<Tuple2<Object, A>>>, Function2<A, A, A>, Option<Tuple2<VertexRDD<?>, EdgeDirection>>, ClassTag<A>) - Method in class org.apache.spark.graphx.impl.GraphImpl
 
mapSideCombine() - Method in class org.apache.spark.ShuffleDependency
 
MapStatus - Interface in org.apache.spark.scheduler
Result returned by a ShuffleMapTask to a scheduler.
mapToDouble(DoubleFunction<T>) - Method in interface org.apache.spark.api.java.JavaRDDLike
Return a new RDD by applying a function to all elements of this RDD.
mapToJson(Map<String, String>) - Static method in class org.apache.spark.util.JsonProtocol
Utility methods for JSON serialization.
mapToPair(PairFunction<T, K2, V2>) - Method in interface org.apache.spark.api.java.JavaRDDLike
Return a new RDD by applying a function to all elements of this RDD.
mapToPair(PairFunction<T, K2, V2>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
Return a new DStream by applying a function to all elements of this DStream.
mapTriplets(Function1<EdgeTriplet<VD, ED>, ED2>, ClassTag<ED2>) - Method in class org.apache.spark.graphx.Graph
Transforms each edge attribute using the map function, passing it the adjacent vertex attributes as well.
mapTriplets(Function1<EdgeTriplet<VD, ED>, ED2>, TripletFields, ClassTag<ED2>) - Method in class org.apache.spark.graphx.Graph
Transforms each edge attribute using the map function, passing it the adjacent vertex attributes as well.
mapTriplets(Function2<Object, Iterator<EdgeTriplet<VD, ED>>, Iterator<ED2>>, TripletFields, ClassTag<ED2>) - Method in class org.apache.spark.graphx.Graph
Transforms each edge attribute a partition at a time using the map function, passing it the adjacent vertex attributes as well.
mapTriplets(Function2<Object, Iterator<EdgeTriplet<VD, ED>>, Iterator<ED2>>, TripletFields, ClassTag<ED2>) - Method in class org.apache.spark.graphx.impl.GraphImpl
 
MapType - Class in org.apache.spark.sql.api.java
The data type representing Maps.
MapValuedDStream<K,V,U> - Class in org.apache.spark.streaming.dstream
 
MapValuedDStream(DStream<Tuple2<K, V>>, Function1<V, U>, ClassTag<K>, ClassTag<V>, ClassTag<U>) - Constructor for class org.apache.spark.streaming.dstream.MapValuedDStream
 
mapValues(Function<V, U>) - Method in class org.apache.spark.api.java.JavaPairRDD
Pass each value in the key-value pair RDD through a map function without changing the keys; this also retains the original RDD's partitioning.
mapValues(Function1<Edge<ED>, ED2>, ClassTag<ED2>) - Method in class org.apache.spark.graphx.EdgeRDD
Map the values in each edge partition, preserving the structure but changing the values.
mapValues(Function1<Edge<ED>, ED2>, ClassTag<ED2>) - Method in class org.apache.spark.graphx.impl.EdgeRDDImpl
 
mapValues(Function1<VD, VD2>, ClassTag<VD2>) - Method in class org.apache.spark.graphx.impl.VertexRDDImpl
 
mapValues(Function2<Object, VD, VD2>, ClassTag<VD2>) - Method in class org.apache.spark.graphx.impl.VertexRDDImpl
 
mapValues(Function1<VD, VD2>, ClassTag<VD2>) - Method in class org.apache.spark.graphx.VertexRDD
Maps each vertex attribute, preserving the index.
mapValues(Function2<Object, VD, VD2>, ClassTag<VD2>) - Method in class org.apache.spark.graphx.VertexRDD
Maps each vertex attribute, additionally supplying the vertex ID.
mapValues(Function1<V, U>) - Method in class org.apache.spark.rdd.PairRDDFunctions
Pass each value in the key-value pair RDD through a map function without changing the keys; this also retains the original RDD's partitioning.
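A minimal sketch of mapValues on a pair RDD (data is illustrative, assuming an existing `sc: SparkContext`):

    val pairs = sc.parallelize(Seq(("a", 1), ("b", 2)))
    // Keys and partitioning are untouched; only the values change.
    val doubled = pairs.mapValues(_ * 2)   // ("a", 2), ("b", 4)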
mapValues(Function<V, U>) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
Return a new DStream by applying a map function to the value of each key-value pairs in 'this' DStream without changing the key.
mapValues(Function1<V, U>, ClassTag<U>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
Return a new DStream by applying a map function to the value of each key-value pairs in 'this' DStream without changing the key.
mapVertexPartitions(Function1<ShippableVertexPartition<VD>, ShippableVertexPartition<VD2>>, ClassTag<VD2>) - Method in class org.apache.spark.graphx.impl.VertexRDDImpl
 
mapVertexPartitions(Function1<ShippableVertexPartition<VD>, ShippableVertexPartition<VD2>>, ClassTag<VD2>) - Method in class org.apache.spark.graphx.VertexRDD
Applies a function to each VertexPartition of this RDD and returns a new VertexRDD.
mapVertices(Function2<Object, VD, VD2>, ClassTag<VD2>, Predef.$eq$colon$eq<VD, VD2>) - Method in class org.apache.spark.graphx.Graph
Transforms each vertex attribute in the graph using the map function.
mapVertices(Function2<Object, VD, VD2>, ClassTag<VD2>, Predef.$eq$colon$eq<VD, VD2>) - Method in class org.apache.spark.graphx.impl.GraphImpl
 
mapWith(Function1<Object, A>, boolean, Function2<T, A, U>, ClassTag<U>) - Method in class org.apache.spark.rdd.RDD
Maps f over this RDD, where f takes an additional parameter of type A.
markCheckpointed(RDD<?>) - Method in class org.apache.spark.rdd.RDD
Changes the dependencies of this RDD from its original parents to a new RDD (newRDD) created from the checkpoint file, and forgets its old dependencies and partitions.
MarkedForCheckpoint() - Static method in class org.apache.spark.rdd.CheckpointState
 
markFailed(long) - Method in class org.apache.spark.scheduler.TaskInfo
 
markFailure() - Method in class org.apache.spark.storage.BlockInfo
Mark this BlockInfo as ready but failed
markForCheckpoint() - Method in class org.apache.spark.rdd.RDDCheckpointData
 
markGettingResult(long) - Method in class org.apache.spark.scheduler.TaskInfo
 
markInterrupted() - Method in class org.apache.spark.TaskContextImpl
Marks the task for interruption, i.e. cancellation.
markPartiallyConstructed(SparkContext, boolean) - Static method in class org.apache.spark.SparkContext
Called at the beginning of the SparkContext constructor to ensure that no SparkContext is running.
markReady(long) - Method in class org.apache.spark.storage.BlockInfo
Mark this BlockInfo as ready (i.e. the block was successfully put).
markSuccessful(long) - Method in class org.apache.spark.scheduler.TaskInfo
 
markTaskCompleted() - Method in class org.apache.spark.TaskContextImpl
Marks the task as completed and triggers the listeners.
mask(Graph<VD2, ED2>, ClassTag<VD2>, ClassTag<ED2>) - Method in class org.apache.spark.graphx.Graph
Restricts the graph to only the vertices and edges that are also in other, but keeps the attributes from this graph.
mask(Graph<VD2, ED2>, ClassTag<VD2>, ClassTag<ED2>) - Method in class org.apache.spark.graphx.impl.GraphImpl
 
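mask is the usual second half of a restrict-then-project pattern: compute a cheaper graph with the structure you want, then project this graph's attributes onto it. A hedged sketch, where graph is an assumed Graph[Int, Int]:

    import org.apache.spark.graphx.Graph

    // Restrict by predicate, then keep the original attributes on
    // whatever vertices and edges survive in both graphs.
    val evens = graph.subgraph(vpred = (id, _) => id % 2 == 0)
    val restricted: Graph[Int, Int] = graph.mask(evens)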
mask() - Method in class org.apache.spark.graphx.impl.ShippableVertexPartition
 
mask() - Method in class org.apache.spark.graphx.impl.VertexPartition
 
mask() - Method in class org.apache.spark.graphx.impl.VertexPartitionBase
 
master() - Method in class org.apache.spark.api.java.JavaSparkContext
 
master() - Method in class org.apache.spark.SparkContext
 
master() - Method in class org.apache.spark.storage.BlockManager
 
master() - Method in class org.apache.spark.storage.TachyonBlockManager
 
master() - Method in class org.apache.spark.streaming.Checkpoint
 
Matrices - Class in org.apache.spark.mllib.linalg
Factory methods for Matrix.
Matrices() - Constructor for class org.apache.spark.mllib.linalg.Matrices
 
Matrix - Interface in org.apache.spark.mllib.linalg
Trait for a local matrix.
MatrixEntry - Class in org.apache.spark.mllib.linalg.distributed
:: Experimental :: Represents an entry in a distributed matrix.
MatrixEntry(long, long, double) - Constructor for class org.apache.spark.mllib.linalg.distributed.MatrixEntry
 
MatrixFactorizationModel - Class in org.apache.spark.mllib.recommendation
Model representing the result of matrix factorization.
MatrixFactorizationModel(int, RDD<Tuple2<Object, double[]>>, RDD<Tuple2<Object, double[]>>) - Constructor for class org.apache.spark.mllib.recommendation.MatrixFactorizationModel
 
max(Comparator<T>) - Method in interface org.apache.spark.api.java.JavaRDDLike
Returns the maximum element from this RDD as defined by the specified Comparator[T].
max() - Method in class org.apache.spark.mllib.stat.MultivariateOnlineSummarizer
 
max() - Method in interface org.apache.spark.mllib.stat.MultivariateStatisticalSummary
Maximum value of each column.
max(Ordering<T>) - Method in class org.apache.spark.rdd.RDD
Returns the max of this RDD as defined by the implicit Ordering[T].
MAX() - Static method in class org.apache.spark.sql.hive.HiveQl
 
max(Duration) - Method in class org.apache.spark.streaming.Duration
 
max(Time) - Method in class org.apache.spark.streaming.Time
 
max(long, long) - Static method in class org.apache.spark.streaming.util.RawTextHelper
 
max() - Method in class org.apache.spark.util.StatCounter
 
MAX_ATTEMPTS() - Method in class org.apache.spark.streaming.CheckpointWriter
 
MAX_DICT_SIZE() - Static method in class org.apache.spark.sql.columnar.compression.DictionaryEncoding
 
MAX_SLAVE_FAILURES() - Method in class org.apache.spark.scheduler.cluster.mesos.CoarseMesosSchedulerBackend
 
maxAkkaFrameSize() - Method in class org.apache.spark.MapOutputTrackerMasterActor
 
maxBatchSize() - Method in class org.apache.spark.streaming.flume.FlumePollingInputDStream
 
maxBins() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
 
maxBins() - Method in class org.apache.spark.mllib.tree.impl.DecisionTreeMetadata
 
maxCores() - Method in class org.apache.spark.scheduler.cluster.mesos.CoarseMesosSchedulerBackend
 
maxCores() - Method in class org.apache.spark.scheduler.cluster.SimrSchedulerBackend
 
maxCores() - Method in class org.apache.spark.scheduler.cluster.SparkDeploySchedulerBackend
 
maxDepth() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
 
maxDepth() - Method in class org.apache.spark.mllib.tree.impl.DecisionTreeMetadata
 
maxFrameSizeBytes(SparkConf) - Static method in class org.apache.spark.util.AkkaUtils
Returns the configured max frame size for Akka messages in bytes.
maxIter() - Method in interface org.apache.spark.ml.param.HasMaxIter
param for max number of iterations
maxIters() - Method in class org.apache.spark.graphx.lib.SVDPlusPlus.Conf
 
maxMem() - Method in class org.apache.spark.scheduler.SparkListenerBlockManagerAdded
 
maxMem() - Method in class org.apache.spark.storage.BlockManagerInfo
 
maxMem() - Method in class org.apache.spark.storage.StorageStatus
 
maxMemory() - Method in class org.apache.spark.ui.exec.ExecutorSummaryInfo
 
maxMemoryInMB() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
 
maxMemSize() - Method in class org.apache.spark.storage.BlockManagerMessages.RegisterBlockManager
 
maxNodesInLevel(int) - Static method in class org.apache.spark.mllib.tree.model.Node
Return the maximum number of nodes which can be in the given level of the tree.
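Since the trees here are binary, level l can hold at most 2^l nodes, so (as a hedged reading of the contract) the result is a power of two:

    import org.apache.spark.mllib.tree.model.Node

    Node.maxNodesInLevel(0)  // 1: just the root
    Node.maxNodesInLevel(3)  // 8, i.e. 2^3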
maxRegisteredWaitingTime() - Method in class org.apache.spark.scheduler.cluster.CoarseGrainedSchedulerBackend
 
maxResultSize() - Method in class org.apache.spark.scheduler.TaskSetManager
 
maxTaskFailures() - Method in class org.apache.spark.scheduler.TaskSchedulerImpl
 
maxTaskFailures() - Method in class org.apache.spark.scheduler.TaskSetManager
 
maxVal() - Method in class org.apache.spark.graphx.lib.SVDPlusPlus.Conf
 
mean() - Method in class org.apache.spark.api.java.JavaDoubleRDD
Compute the mean of this RDD's elements.
mean() - Method in class org.apache.spark.mllib.feature.StandardScalerModel
 
mean() - Method in class org.apache.spark.mllib.random.PoissonGenerator
 
mean() - Method in class org.apache.spark.mllib.stat.MultivariateOnlineSummarizer
 
mean() - Method in interface org.apache.spark.mllib.stat.MultivariateStatisticalSummary
Sample mean vector.
mean() - Method in class org.apache.spark.partial.BoundedDouble
 
mean() - Method in class org.apache.spark.rdd.DoubleRDDFunctions
Compute the mean of this RDD's elements.
mean() - Method in class org.apache.spark.util.StatCounter
 
meanAbsoluteError() - Method in class org.apache.spark.mllib.evaluation.RegressionMetrics
Returns the mean absolute error, which is a risk function corresponding to the expected value of the absolute error loss or l1-norm loss.
meanApprox(long, Double) - Method in class org.apache.spark.api.java.JavaDoubleRDD
Return the approximate mean of the elements in this RDD.
meanApprox(long) - Method in class org.apache.spark.api.java.JavaDoubleRDD
:: Experimental :: Approximate operation to return the mean within a timeout.
meanApprox(long, double) - Method in class org.apache.spark.rdd.DoubleRDDFunctions
:: Experimental :: Approximate operation to return the mean within a timeout.
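The approximate variants run the job for at most the given timeout (in milliseconds) and hand back a PartialResult whose BoundedDouble carries the estimate together with its confidence bounds. A hedged sketch, assuming an existing SparkContext sc:

    import org.apache.spark.SparkContext._

    val data = sc.parallelize(1 to 1000000).map(_.toDouble)
    // Best estimate available within 5 seconds, at 95% confidence.
    val approx = data.meanApprox(timeout = 5000, confidence = 0.95)
    val estimate = approx.initialValue  // BoundedDouble: mean with [low, high] bounds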
meanAveragePrecision() - Method in class org.apache.spark.mllib.evaluation.RankingMetrics
Returns the mean average precision (MAP) of all the queries.
MeanEvaluator - Class in org.apache.spark.partial
An ApproximateEvaluator for means.
MeanEvaluator(int, double) - Constructor for class org.apache.spark.partial.MeanEvaluator
 
meanSquaredError() - Method in class org.apache.spark.mllib.evaluation.RegressionMetrics
Returns the mean squared error, which is a risk function corresponding to the expected value of the squared error loss or quadratic loss.
megabytesToString(long) - Static method in class org.apache.spark.util.Utils
Convert a quantity in megabytes to a human-readable string such as "4.0 MB".
MEMORY_AND_DISK - Static variable in class org.apache.spark.api.java.StorageLevels
 
MEMORY_AND_DISK() - Static method in class org.apache.spark.storage.StorageLevel
 
MEMORY_AND_DISK_2 - Static variable in class org.apache.spark.api.java.StorageLevels
 
MEMORY_AND_DISK_2() - Static method in class org.apache.spark.storage.StorageLevel
 
MEMORY_AND_DISK_SER - Static variable in class org.apache.spark.api.java.StorageLevels
 
MEMORY_AND_DISK_SER() - Static method in class org.apache.spark.storage.StorageLevel
 
MEMORY_AND_DISK_SER_2 - Static variable in class org.apache.spark.api.java.StorageLevels
 
MEMORY_AND_DISK_SER_2() - Static method in class org.apache.spark.storage.StorageLevel
 
MEMORY_ONLY - Static variable in class org.apache.spark.api.java.StorageLevels
 
MEMORY_ONLY() - Static method in class org.apache.spark.storage.StorageLevel
 
MEMORY_ONLY_2 - Static variable in class org.apache.spark.api.java.StorageLevels
 
MEMORY_ONLY_2() - Static method in class org.apache.spark.storage.StorageLevel
 
MEMORY_ONLY_SER - Static variable in class org.apache.spark.api.java.StorageLevels
 
MEMORY_ONLY_SER() - Static method in class org.apache.spark.storage.StorageLevel
 
MEMORY_ONLY_SER_2 - Static variable in class org.apache.spark.api.java.StorageLevels
 
MEMORY_ONLY_SER_2() - Static method in class org.apache.spark.storage.StorageLevel
 
memoryBytesSpilled() - Method in class org.apache.spark.ui.jobs.UIData.ExecutorSummary
 
memoryBytesSpilled() - Method in class org.apache.spark.ui.jobs.UIData.StageUIData
 
MemoryEntry - Class in org.apache.spark.storage
 
MemoryEntry(Object, long, boolean) - Constructor for class org.apache.spark.storage.MemoryEntry
 
MemoryParam - Class in org.apache.spark.util
An extractor object for parsing JVM memory strings, such as "10g", into an Int representing the number of megabytes.
MemoryParam() - Constructor for class org.apache.spark.util.MemoryParam
 
memoryStore() - Method in class org.apache.spark.storage.BlockManager
 
MemoryStore - Class in org.apache.spark.storage
Stores blocks in memory, either as Arrays of deserialized Java objects or as serialized ByteBuffers.
MemoryStore(BlockManager, long) - Constructor for class org.apache.spark.storage.MemoryStore
 
memoryStringToMb(String) - Static method in class org.apache.spark.util.Utils
Convert a Java memory parameter passed to -Xmx (such as 300m or 1g) to a number of megabytes.
memoryUsed() - Method in class org.apache.spark.ui.exec.ExecutorSummaryInfo
 
MemoryUtils - Class in org.apache.spark.scheduler.cluster.mesos
 
MemoryUtils() - Constructor for class org.apache.spark.scheduler.cluster.mesos.MemoryUtils
 
memRemaining() - Method in class org.apache.spark.storage.StorageStatus
Return the memory remaining in this block manager.
memSize() - Method in class org.apache.spark.storage.BlockManagerMessages.UpdateBlockInfo
 
memSize() - Method in class org.apache.spark.storage.BlockStatus
 
memSize() - Method in class org.apache.spark.storage.RDDInfo
 
memUsed() - Method in class org.apache.spark.storage.StorageStatus
Return the memory used by this block manager.
memUsedByRdd(int) - Method in class org.apache.spark.storage.StorageStatus
Return the memory used by the given RDD in this block manager in O(1) time.
merge(R) - Method in class org.apache.spark.Accumulable
Merge two accumulable objects together
merge(IDF.DocumentFrequencyAggregator) - Method in class org.apache.spark.mllib.feature.IDF.DocumentFrequencyAggregator
Merges another DocumentFrequencyAggregator into this one.
merge(MultivariateOnlineSummarizer) - Method in class org.apache.spark.mllib.stat.MultivariateOnlineSummarizer
Merge another MultivariateOnlineSummarizer, and update the statistical summary.
merge(DTStatsAggregator) - Method in class org.apache.spark.mllib.tree.impl.DTStatsAggregator
Merge this aggregator with another, and return this aggregator.
merge(double[], int, int) - Method in class org.apache.spark.mllib.tree.impurity.ImpurityAggregator
Merge the stats from one bin into another.
merge(int, U) - Method in interface org.apache.spark.partial.ApproximateEvaluator
 
merge(int, long) - Method in class org.apache.spark.partial.CountEvaluator
 
merge(int, OpenHashMap<T, Object>) - Method in class org.apache.spark.partial.GroupedCountEvaluator
 
merge(int, HashMap<T, StatCounter>) - Method in class org.apache.spark.partial.GroupedMeanEvaluator
 
merge(int, HashMap<T, StatCounter>) - Method in class org.apache.spark.partial.GroupedSumEvaluator
 
merge(int, StatCounter) - Method in class org.apache.spark.partial.MeanEvaluator
 
merge(int, StatCounter) - Method in class org.apache.spark.partial.SumEvaluator
 
merge(Option<AcceptanceResult>) - Method in class org.apache.spark.util.random.AcceptanceResult
 
merge(double) - Method in class org.apache.spark.util.StatCounter
Add a value into this StatCounter, updating the internal statistics.
merge(TraversableOnce<Object>) - Method in class org.apache.spark.util.StatCounter
Add multiple values into this StatCounter, updating the internal statistics.
merge(StatCounter) - Method in class org.apache.spark.util.StatCounter
Merge another StatCounter into this one, adding up the internal statistics.
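StatCounter is a plain local class, which is also why it serves as the merge state for the grouped evaluators above. A hedged sketch of combining two counters:

    import org.apache.spark.util.StatCounter

    val a = StatCounter(Seq(1.0, 2.0, 3.0))
    val b = StatCounter(Seq(10.0, 20.0))
    // Folds b's count, mean, variance, min and max into a, in one pass.
    val merged = a.merge(b)
    println(s"n=${merged.count} mean=${merged.mean} max=${merged.max}")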
mergeCombiners() - Method in class org.apache.spark.Aggregator
 
mergeForFeature(int, int, int) - Method in class org.apache.spark.mllib.tree.impl.DTStatsAggregator
For a given feature, merge the stats for two bins.
mergeValue() - Method in class org.apache.spark.Aggregator
 
MesosSchedulerBackend - Class in org.apache.spark.scheduler.cluster.mesos
A SchedulerBackend for running fine-grained tasks on Mesos.
MesosSchedulerBackend(TaskSchedulerImpl, SparkContext, String) - Constructor for class org.apache.spark.scheduler.cluster.mesos.MesosSchedulerBackend
 
message() - Method in class org.apache.spark.FetchFailed
 
message() - Method in class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.RegisterExecutorFailed
 
message() - Method in class org.apache.spark.scheduler.ExecutorLossReason
 
message() - Method in exception org.apache.spark.storage.BlockException
 
message() - Method in class org.apache.spark.streaming.scheduler.ReportError
 
metadata() - Method in class org.apache.spark.mllib.tree.impl.DTStatsAggregator
 
MetadataBuilder - Class in org.apache.spark.sql.api.java
Builder for Metadata.
MetadataBuilder() - Constructor for class org.apache.spark.sql.api.java.MetadataBuilder
 
metadataCleaner() - Method in class org.apache.spark.SparkContext
 
MetadataCleaner - Class in org.apache.spark.util
Runs a timer task to periodically clean up metadata (e.g. old files or hashtable entries).
MetadataCleaner(Enumeration.Value, Function1<Object, BoxedUnit>, SparkConf) - Constructor for class org.apache.spark.util.MetadataCleaner
 
MetadataCleanerType - Class in org.apache.spark.util
 
MetadataCleanerType() - Constructor for class org.apache.spark.util.MetadataCleanerType
 
metastorePath() - Method in class org.apache.spark.sql.hive.LocalHiveContext
 
metastorePath() - Method in class org.apache.spark.sql.hive.test.TestHiveContext
 
MetastoreRelation - Class in org.apache.spark.sql.hive
 
MetastoreRelation(String, String, Option<String>, Table, Seq<Partition>, SQLContext) - Constructor for class org.apache.spark.sql.hive.MetastoreRelation
 
MetastoreRelation.SchemaAttribute - Class in org.apache.spark.sql.hive
 
MetastoreRelation.SchemaAttribute(FieldSchema) - Constructor for class org.apache.spark.sql.hive.MetastoreRelation.SchemaAttribute
 
method() - Method in class org.apache.spark.mllib.stat.test.ChiSqTestResult
 
metricName() - Method in class org.apache.spark.ml.evaluation.BinaryClassificationEvaluator
param for metric name in evaluation
metricRegistry() - Method in class org.apache.spark.metrics.source.JvmSource
 
metricRegistry() - Method in interface org.apache.spark.metrics.source.Source
 
metricRegistry() - Method in class org.apache.spark.scheduler.DAGSchedulerSource
 
metricRegistry() - Method in class org.apache.spark.storage.BlockManagerSource
 
metricRegistry() - Method in class org.apache.spark.streaming.StreamingSource
 
metrics() - Method in class org.apache.spark.ExceptionFailure
 
metrics() - Method in class org.apache.spark.scheduler.DirectTaskResult
 
metrics() - Method in class org.apache.spark.scheduler.Task
 
METRICS_CONF() - Method in class org.apache.spark.metrics.MetricsConfig
 
MetricsConfig - Class in org.apache.spark.metrics
 
MetricsConfig(Option<String>) - Constructor for class org.apache.spark.metrics.MetricsConfig
 
MetricsServlet - Class in org.apache.spark.metrics.sink
 
MetricsServlet(Properties, MetricRegistry, SecurityManager) - Constructor for class org.apache.spark.metrics.sink.MetricsServlet
 
MetricsSystem - Class in org.apache.spark.metrics
Spark Metrics System, created for a specific "instance" and composed of sources and sinks; it periodically polls metrics data from the sources and delivers it to the sink destinations.
metricsSystem() - Method in class org.apache.spark.SparkContext
 
metricsSystem() - Method in class org.apache.spark.SparkEnv
 
MFDataGenerator - Class in org.apache.spark.mllib.util
:: DeveloperApi :: Generate RDD(s) containing data for Matrix Factorization.
MFDataGenerator() - Constructor for class org.apache.spark.mllib.util.MFDataGenerator
 
microF1Measure() - Method in class org.apache.spark.mllib.evaluation.MultilabelMetrics
Returns the micro-averaged label-based f1-measure (equal to the micro-averaged document-based f1-measure).
microPrecision() - Method in class org.apache.spark.mllib.evaluation.MultilabelMetrics
Returns the micro-averaged label-based precision (equal to the micro-averaged document-based precision).
microRecall() - Method in class org.apache.spark.mllib.evaluation.MultilabelMetrics
Returns the micro-averaged label-based recall (equal to the micro-averaged document-based recall).
milliseconds() - Method in class org.apache.spark.streaming.Duration
 
milliseconds(long) - Static method in class org.apache.spark.streaming.Durations
 
Milliseconds - Class in org.apache.spark.streaming
Helper object that creates an instance of Duration representing a given number of milliseconds.
Milliseconds() - Constructor for class org.apache.spark.streaming.Milliseconds
 
milliseconds() - Method in class org.apache.spark.streaming.Time
 
millisToString(long) - Static method in class org.apache.spark.scheduler.StatsReportListener
Reformat a time interval in milliseconds to a prettier format for output
min(Comparator<T>) - Method in interface org.apache.spark.api.java.JavaRDDLike
Returns the minimum element from this RDD as defined by the specified Comparator[T].
min() - Method in class org.apache.spark.mllib.stat.MultivariateOnlineSummarizer
 
min() - Method in interface org.apache.spark.mllib.stat.MultivariateStatisticalSummary
Minimum value of each column.
min(Ordering<T>) - Method in class org.apache.spark.rdd.RDD
Returns the min of this RDD as defined by the implicit Ordering[T].
MIN() - Static method in class org.apache.spark.sql.hive.HiveQl
 
min(Duration) - Method in class org.apache.spark.streaming.Duration
 
min(Time) - Method in class org.apache.spark.streaming.Time
 
min() - Method in class org.apache.spark.util.StatCounter
 
minDocFreq() - Method in class org.apache.spark.mllib.feature.IDF.DocumentFrequencyAggregator
 
minDocFreq() - Method in class org.apache.spark.mllib.feature.IDF
 
MINIMUM_INTERVAL_SECONDS() - Static method in class org.apache.spark.util.logging.TimeBasedRollingPolicy
 
MINIMUM_SHARES_PROPERTY() - Method in class org.apache.spark.scheduler.FairSchedulableBuilder
 
MINIMUM_SIZE_BYTES() - Static method in class org.apache.spark.util.logging.SizeBasedRollingPolicy
 
minInfoGain() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
 
minInfoGain() - Method in class org.apache.spark.mllib.tree.impl.DecisionTreeMetadata
 
minInstancesPerNode() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
 
minInstancesPerNode() - Method in class org.apache.spark.mllib.tree.impl.DecisionTreeMetadata
 
MinMax() - Static method in class org.apache.spark.mllib.tree.configuration.QuantileStrategy
 
minMemoryMapBytes() - Method in class org.apache.spark.storage.DiskStore
 
minPollTime() - Method in class org.apache.spark.streaming.util.SystemClock
 
minRegisteredRatio() - Method in class org.apache.spark.scheduler.cluster.CoarseGrainedSchedulerBackend
 
minSamplingRate() - Static method in class org.apache.spark.util.random.BinomialBounds
 
minShare() - Method in class org.apache.spark.scheduler.Pool
 
minShare() - Method in interface org.apache.spark.scheduler.Schedulable
 
minShare() - Method in class org.apache.spark.scheduler.TaskSetManager
 
minus(Duration) - Method in class org.apache.spark.streaming.Duration
 
minus(Time) - Method in class org.apache.spark.streaming.Time
 
minus(Duration) - Method in class org.apache.spark.streaming.Time
 
minutes() - Static method in class org.apache.spark.scheduler.StatsReportListener
 
minutes(long) - Static method in class org.apache.spark.streaming.Durations
 
Minutes - Class in org.apache.spark.streaming
Helper object that creates an instance of Duration representing a given number of minutes.
Minutes() - Constructor for class org.apache.spark.streaming.Minutes
 
minVal() - Method in class org.apache.spark.graphx.lib.SVDPlusPlus.Conf
 
MLUtils - Class in org.apache.spark.mllib.util
Helper methods to load, save and pre-process data used in MLlib.
MLUtils() - Constructor for class org.apache.spark.mllib.util.MLUtils
 
Model<M extends Model<M>> - Class in org.apache.spark.ml
:: AlphaComponent :: A fitted model, i.e., a Transformer produced by an Estimator.
Model() - Constructor for class org.apache.spark.ml.Model
 
model() - Method in class org.apache.spark.mllib.regression.StreamingLinearRegressionWithSGD
 
MODULE$ - Static variable in class org.apache.spark.graphx.impl.ShippableVertexPartition.ShippableVertexPartitionOpsConstructor$
Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.graphx.impl.VertexPartition.VertexPartitionOpsConstructor$
Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.graphx.PartitionStrategy.CanonicalRandomVertexCut$
Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.graphx.PartitionStrategy.EdgePartition1D$
Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.graphx.PartitionStrategy.EdgePartition2D$
Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.graphx.PartitionStrategy.RandomVertexCut$
Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.mllib.recommendation.ALS.BlockStats$
Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.mllib.stat.test.ChiSqTest.Method$
Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.mllib.stat.test.ChiSqTest.NullHypothesis$
Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.rdd.HadoopRDD.HadoopMapPartitionsWithSplitRDD$
Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.rdd.NewHadoopRDD.NewHadoopMapPartitionsWithSplitRDD$
Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.AddWebUIFilter$
Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.KillExecutors$
Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.KillTask$
Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.LaunchTask$
Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.RegisterClusterManager$
Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.RegisteredExecutor$
Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.RegisterExecutor$
Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.RegisterExecutorFailed$
Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.RemoveExecutor$
Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.RequestExecutors$
Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.RetrieveSparkProps$
Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.ReviveOffers$
Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.StatusUpdate$
Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.StopDriver$
Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.StopExecutor$
Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.StopExecutors$
Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.SparkContext.DoubleAccumulatorParam$
Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.SparkContext.FloatAccumulatorParam$
Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.SparkContext.IntAccumulatorParam$
Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.SparkContext.LongAccumulatorParam$
Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.sql.hive.HiveQl.Token$
Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.sql.SQLConf.Deprecated$
Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.storage.BlockManagerMessages.BlockManagerHeartbeat$
Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.storage.BlockManagerMessages.ExpireDeadHosts$
Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.storage.BlockManagerMessages.GetActorSystemHostPortForExecutor$
Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.storage.BlockManagerMessages.GetBlockStatus$
Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.storage.BlockManagerMessages.GetLocations$
Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.storage.BlockManagerMessages.GetLocationsMultipleBlockIds$
Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.storage.BlockManagerMessages.GetMatchingBlockIds$
Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.storage.BlockManagerMessages.GetMemoryStatus$
Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.storage.BlockManagerMessages.GetPeers$
Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.storage.BlockManagerMessages.GetStorageStatus$
Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.storage.BlockManagerMessages.RegisterBlockManager$
Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.storage.BlockManagerMessages.RemoveBlock$
Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.storage.BlockManagerMessages.RemoveBroadcast$
Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.storage.BlockManagerMessages.RemoveExecutor$
Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.storage.BlockManagerMessages.RemoveRdd$
Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.storage.BlockManagerMessages.RemoveShuffle$
Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.storage.BlockManagerMessages.StopBlockManagerMaster$
Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.storage.BlockManagerMessages.UpdateBlockInfo$
Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.storage.ShuffleBlockFetcherIterator.FailureFetchResult$
Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.storage.ShuffleBlockFetcherIterator.FetchRequest$
Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.storage.ShuffleBlockFetcherIterator.SuccessFetchResult$
Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.streaming.util.WriteAheadLogManager.LogInfo$
Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.ui.JettyUtils.ServletParams$
Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.ui.jobs.UIData.JobUIData$
Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.ui.jobs.UIData.TaskUIData$
Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.util.Vector.VectorAccumParam$
Static reference to the singleton instance of this Scala object.
MQTTInputDStream - Class in org.apache.spark.streaming.mqtt
Input stream that subscribes to messages from an MQTT broker.
MQTTInputDStream(StreamingContext, String, String, StorageLevel) - Constructor for class org.apache.spark.streaming.mqtt.MQTTInputDStream
 
MQTTReceiver - Class in org.apache.spark.streaming.mqtt
 
MQTTReceiver(String, String, StorageLevel) - Constructor for class org.apache.spark.streaming.mqtt.MQTTReceiver
 
MQTTUtils - Class in org.apache.spark.streaming.mqtt
 
MQTTUtils() - Constructor for class org.apache.spark.streaming.mqtt.MQTTUtils
 
msDurationToString(long) - Static method in class org.apache.spark.util.Utils
Returns a human-readable string representing a duration such as "35ms"
msg() - Method in class org.apache.spark.streaming.scheduler.DeregisterReceiver
 
msg() - Method in class org.apache.spark.streaming.scheduler.ErrorReported
 
MulticlassMetrics - Class in org.apache.spark.mllib.evaluation
:: Experimental :: Evaluator for multiclass classification.
MulticlassMetrics(RDD<Tuple2<Object, Object>>) - Constructor for class org.apache.spark.mllib.evaluation.MulticlassMetrics
 
MultilabelMetrics - Class in org.apache.spark.mllib.evaluation
Evaluator for multilabel classification.
MultilabelMetrics(RDD<Tuple2<double[], double[]>>) - Constructor for class org.apache.spark.mllib.evaluation.MultilabelMetrics
 
multiply(Matrix) - Method in class org.apache.spark.mllib.linalg.distributed.IndexedRowMatrix
Multiply this matrix by a local matrix on the right.
multiply(Matrix) - Method in class org.apache.spark.mllib.linalg.distributed.RowMatrix
Multiply this matrix by a local matrix on the right.
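Both distributed variants take a local Matrix on the right, so the result keeps the same row distribution. A hedged sketch for RowMatrix, assuming an existing SparkContext sc; note that Matrices.dense expects values in column-major order:

    import org.apache.spark.mllib.linalg.{Matrices, Vectors}
    import org.apache.spark.mllib.linalg.distributed.RowMatrix

    val rows = sc.parallelize(Seq(Vectors.dense(1.0, 2.0),
                                  Vectors.dense(3.0, 4.0)))
    val mat = new RowMatrix(rows)                              // 2 x 2, distributed
    val id2 = Matrices.dense(2, 2, Array(1.0, 0.0, 0.0, 1.0))  // identity
    val product: RowMatrix = mat.multiply(id2)                 // rows unchanged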
multiply(DenseMatrix) - Method in interface org.apache.spark.mllib.linalg.Matrix
Convenience method for `Matrix`-`DenseMatrix` multiplication.
multiply(DenseVector) - Method in interface org.apache.spark.mllib.linalg.Matrix
Convenience method for `Matrix`-`DenseVector` multiplication.
multiply(double) - Method in class org.apache.spark.util.Vector
 
multiplyGramianMatrixBy(DenseVector<Object>) - Method in class org.apache.spark.mllib.linalg.distributed.RowMatrix
Multiplies the Gramian matrix A^T A by a dense vector on the right without computing A^T A.
MultivariateOnlineSummarizer - Class in org.apache.spark.mllib.stat
:: DeveloperApi :: MultivariateOnlineSummarizer implements MultivariateStatisticalSummary to compute the mean, variance, minimum, maximum, counts, and nonzero counts for samples in sparse or dense vector format in an online fashion.
MultivariateOnlineSummarizer() - Constructor for class org.apache.spark.mllib.stat.MultivariateOnlineSummarizer
 
MultivariateStatisticalSummary - Interface in org.apache.spark.mllib.stat
Trait for multivariate statistical summary of a data matrix.
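A hedged local sketch of the online summarizer; in distributed use one instance is typically built per partition and the partials are combined with merge:

    import org.apache.spark.mllib.linalg.Vectors
    import org.apache.spark.mllib.stat.MultivariateOnlineSummarizer

    val s = new MultivariateOnlineSummarizer()
    s.add(Vectors.dense(1.0, 10.0))
    s.add(Vectors.dense(3.0, 30.0))
    println(s.mean)  // per-column means:  [2.0, 20.0]
    println(s.max)   // per-column maxima: [3.0, 30.0]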
mustCheckpoint() - Method in class org.apache.spark.streaming.dstream.DStream
 
mustCheckpoint() - Method in class org.apache.spark.streaming.dstream.ReducedWindowedDStream
 
mustCheckpoint() - Method in class org.apache.spark.streaming.dstream.StateDStream
 
MutablePair<T1,T2> - Class in org.apache.spark.util
:: DeveloperApi :: A tuple of 2 elements.
MutablePair(T1, T2) - Constructor for class org.apache.spark.util.MutablePair
 
MutablePair() - Constructor for class org.apache.spark.util.MutablePair
No-arg constructor for serialization
MutableRowWriteSupport - Class in org.apache.spark.sql.parquet
 
MutableRowWriteSupport() - Constructor for class org.apache.spark.sql.parquet.MutableRowWriteSupport
 
myLocalityLevels() - Method in class org.apache.spark.scheduler.TaskSetManager
 
myName() - Method in class org.apache.spark.util.InnerClosureFinder
 

N

n() - Method in class org.apache.spark.mllib.optimization.NNLS.Workspace
 
n() - Method in class org.apache.spark.streaming.receiver.ActorReceiver.Supervisor
 
NaiveBayes - Class in org.apache.spark.mllib.classification
Trains a Naive Bayes model given an RDD of (label, features) pairs.
NaiveBayes() - Constructor for class org.apache.spark.mllib.classification.NaiveBayes
 
NaiveBayesModel - Class in org.apache.spark.mllib.classification
Model for Naive Bayes Classifiers.
NaiveBayesModel(double[], double[], double[][]) - Constructor for class org.apache.spark.mllib.classification.NaiveBayesModel
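A hedged end-to-end sketch, assuming an existing SparkContext sc; lambda is the additive (Laplace) smoothing parameter:

    import org.apache.spark.mllib.classification.NaiveBayes
    import org.apache.spark.mllib.linalg.Vectors
    import org.apache.spark.mllib.regression.LabeledPoint

    val training = sc.parallelize(Seq(
      LabeledPoint(0.0, Vectors.dense(1.0, 0.0)),
      LabeledPoint(1.0, Vectors.dense(0.0, 1.0))))
    val model = NaiveBayes.train(training, lambda = 1.0)
    model.predict(Vectors.dense(0.0, 1.0))  // expected: 1.0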
 
name() - Method in class org.apache.spark.Accumulable
 
name() - Method in interface org.apache.spark.api.java.JavaRDDLike
 
name() - Method in class org.apache.spark.ml.param.Param
 
name() - Method in class org.apache.spark.mllib.stat.test.ChiSqTest.Method
 
name() - Method in class org.apache.spark.rdd.RDD
A friendly name for this RDD
name() - Method in class org.apache.spark.scheduler.AccumulableInfo
 
name() - Method in class org.apache.spark.scheduler.Pool
 
name() - Method in interface org.apache.spark.scheduler.Schedulable
 
name() - Method in class org.apache.spark.scheduler.Stage
 
name() - Method in class org.apache.spark.scheduler.StageInfo
 
name() - Method in class org.apache.spark.scheduler.TaskDescription
 
name() - Method in class org.apache.spark.scheduler.TaskSetManager
 
name() - Method in interface org.apache.spark.SparkStageInfo
 
name() - Method in class org.apache.spark.SparkStageInfoImpl
 
name() - Method in class org.apache.spark.sql.execution.PythonUDF
 
name() - Method in class org.apache.spark.sql.hive.test.TestHiveContext.TestTable
 
name() - Method in class org.apache.spark.storage.BlockId
A globally unique identifier for this Block.
name() - Method in class org.apache.spark.storage.BroadcastBlockId
 
name() - Method in class org.apache.spark.storage.RDDBlockId
 
name() - Method in class org.apache.spark.storage.RDDInfo
 
name() - Method in class org.apache.spark.storage.ShuffleBlockId
 
name() - Method in class org.apache.spark.storage.ShuffleDataBlockId
 
name() - Method in class org.apache.spark.storage.ShuffleIndexBlockId
 
name() - Method in class org.apache.spark.storage.StreamBlockId
 
name() - Method in class org.apache.spark.storage.TaskResultBlockId
 
name() - Method in class org.apache.spark.storage.TempLocalBlockId
 
name() - Method in class org.apache.spark.storage.TempShuffleBlockId
 
name() - Method in class org.apache.spark.storage.TestBlockId
 
name() - Method in class org.apache.spark.streaming.scheduler.ReceiverInfo
 
name() - Method in class org.apache.spark.ui.WebUITab
 
name() - Method in class org.apache.spark.util.MetadataCleaner
 
namedThreadFactory(String) - Static method in class org.apache.spark.util.Utils
Create a thread factory that names threads with a prefix and also sets the threads to daemon.
nameToObjectMap() - Static method in class org.apache.spark.mllib.stat.correlation.CorrelationNames
 
NarrowCoGroupSplitDep - Class in org.apache.spark.rdd
 
NarrowCoGroupSplitDep(RDD<?>, int, Partition) - Constructor for class org.apache.spark.rdd.NarrowCoGroupSplitDep
 
NarrowDependency<T> - Class in org.apache.spark
:: DeveloperApi :: Base class for dependencies where each partition of the child RDD depends on a small number of partitions of the parent RDD.
NarrowDependency(RDD<T>) - Constructor for class org.apache.spark.NarrowDependency
 
NativeColumnAccessor<T extends org.apache.spark.sql.catalyst.types.NativeType> - Class in org.apache.spark.sql.columnar
 
NativeColumnAccessor(ByteBuffer, NativeColumnType<T>) - Constructor for class org.apache.spark.sql.columnar.NativeColumnAccessor
 
NativeColumnBuilder<T extends org.apache.spark.sql.catalyst.types.NativeType> - Class in org.apache.spark.sql.columnar
 
NativeColumnBuilder(ColumnStats, NativeColumnType<T>) - Constructor for class org.apache.spark.sql.columnar.NativeColumnBuilder
 
NativeColumnType<T extends org.apache.spark.sql.catalyst.types.NativeType> - Class in org.apache.spark.sql.columnar
 
NativeColumnType(T, int, int) - Constructor for class org.apache.spark.sql.columnar.NativeColumnType
 
NativeCommand - Class in org.apache.spark.sql.hive.execution
:: DeveloperApi ::
NativeCommand(String, Seq<Attribute>, HiveContext) - Constructor for class org.apache.spark.sql.hive.execution.NativeCommand
 
NativePlaceholder - Class in org.apache.spark.sql.hive
Used when we need to start parsing the AST before deciding that we are going to pass the command back for Hive to execute natively.
NativePlaceholder() - Constructor for class org.apache.spark.sql.hive.NativePlaceholder
 
ndcgAt(int) - Method in class org.apache.spark.mllib.evaluation.RankingMetrics
Compute the average NDCG value of all the queries, truncated at ranking position k.
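A hedged sketch, assuming an existing SparkContext sc; each element pairs one query's predicted ranking with its set of relevant document ids:

    import org.apache.spark.mllib.evaluation.RankingMetrics

    val predictionAndLabels = sc.parallelize(Seq(
      (Array(1, 2, 3), Array(1, 3)),  // query 1
      (Array(4, 5), Array(5))))       // query 2
    val metrics = new RankingMetrics(predictionAndLabels)
    metrics.ndcgAt(3)             // average NDCG, truncated at position 3
    metrics.meanAveragePrecision  // MAP over both queries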
networkStream(Receiver<T>, ClassTag<T>) - Method in class org.apache.spark.streaming.StreamingContext
Create an input stream with any arbitrary user implemented receiver.
newAPIHadoopFile(String, Class<F>, Class<K>, Class<V>, Configuration) - Method in class org.apache.spark.api.java.JavaSparkContext
Get an RDD for a given Hadoop file with an arbitrary new API InputFormat and extra configuration options to pass to the input format.
newAPIHadoopFile(String, ClassTag<K>, ClassTag<V>, ClassTag<F>) - Method in class org.apache.spark.SparkContext
Get an RDD for a Hadoop file with an arbitrary new API InputFormat.
newAPIHadoopFile(String, Class<F>, Class<K>, Class<V>, Configuration) - Method in class org.apache.spark.SparkContext
Get an RDD for a given Hadoop file with an arbitrary new API InputFormat and extra configuration options to pass to the input format.
newAPIHadoopRDD(Configuration, Class<F>, Class<K>, Class<V>) - Method in class org.apache.spark.api.java.JavaSparkContext
Get an RDD for a given Hadoop file with an arbitrary new API InputFormat and extra configuration options to pass to the input format.
newAPIHadoopRDD(Configuration, Class<F>, Class<K>, Class<V>) - Method in class org.apache.spark.SparkContext
Get an RDD for a given Hadoop file with an arbitrary new API InputFormat and extra configuration options to pass to the input format.
newAttemptId() - Method in class org.apache.spark.scheduler.Stage
Return a new attempt id, starting with 0.
newBroadcast(T, boolean, long, ClassTag<T>) - Method in interface org.apache.spark.broadcast.BroadcastFactory
Creates a new broadcast variable.
newBroadcast(T, boolean, ClassTag<T>) - Method in class org.apache.spark.broadcast.BroadcastManager
 
newBroadcast(T, boolean, long, ClassTag<T>) - Method in class org.apache.spark.broadcast.HttpBroadcastFactory
 
newBroadcast(T, boolean, long, ClassTag<T>) - Method in class org.apache.spark.broadcast.TorrentBroadcastFactory
 
newDaemonCachedThreadPool(String) - Static method in class org.apache.spark.util.Utils
Wrapper over newCachedThreadPool.
newDaemonFixedThreadPool(int, String) - Static method in class org.apache.spark.util.Utils
Wrapper over newFixedThreadPool.
newFile(String, Option<FsPermission>) - Method in class org.apache.spark.util.FileLogger
Start a writer for a new file, closing the existing one if it exists.
newGetLocationInfo() - Method in class org.apache.spark.rdd.HadoopRDD.SplitInfoReflections
 
NewHadoopPartition - Class in org.apache.spark.rdd
 
NewHadoopPartition(int, int, InputSplit) - Constructor for class org.apache.spark.rdd.NewHadoopPartition
 
NewHadoopRDD<K,V> - Class in org.apache.spark.rdd
:: DeveloperApi :: An RDD that provides core functionality for reading data stored in Hadoop (e.g., files in HDFS, sources in HBase, or S3), using the new MapReduce API (org.apache.hadoop.mapreduce).
NewHadoopRDD(SparkContext, Class<? extends InputFormat<K, V>>, Class<K>, Class<V>, Configuration) - Constructor for class org.apache.spark.rdd.NewHadoopRDD
 
NewHadoopRDD.NewHadoopMapPartitionsWithSplitRDD<U,T> - Class in org.apache.spark.rdd
Analogous to MapPartitionsRDD, but passes in an InputSplit to the given function rather than the index of the partition.
NewHadoopRDD.NewHadoopMapPartitionsWithSplitRDD(RDD<T>, Function2<InputSplit, Iterator<T>, Iterator<U>>, boolean, ClassTag<U>, ClassTag<T>) - Constructor for class org.apache.spark.rdd.NewHadoopRDD.NewHadoopMapPartitionsWithSplitRDD
 
NewHadoopRDD.NewHadoopMapPartitionsWithSplitRDD$ - Class in org.apache.spark.rdd
 
NewHadoopRDD.NewHadoopMapPartitionsWithSplitRDD$() - Constructor for class org.apache.spark.rdd.NewHadoopRDD.NewHadoopMapPartitionsWithSplitRDD$
 
newId() - Static method in class org.apache.spark.Accumulators
 
newInputSplit() - Method in class org.apache.spark.rdd.HadoopRDD.SplitInfoReflections
 
newInstance() - Method in class org.apache.spark.serializer.JavaSerializer
 
newInstance() - Method in class org.apache.spark.serializer.KryoSerializer
 
newInstance() - Method in class org.apache.spark.serializer.Serializer
Creates a new SerializerInstance.
newInstance() - Method in class org.apache.spark.sql.columnar.InMemoryRelation
 
newInstance() - Method in class org.apache.spark.sql.execution.KryoResourcePool
 
newInstance() - Method in class org.apache.spark.sql.execution.LogicalRDD
 
newInstance() - Method in class org.apache.spark.sql.execution.SparkLogicalPlan
 
newInstance() - Method in class org.apache.spark.sql.hive.HiveGenericUdaf
 
newInstance() - Method in class org.apache.spark.sql.hive.HiveUdaf
 
newInstance() - Method in class org.apache.spark.sql.parquet.ParquetRelation
 
newInstance() - Method in class org.apache.spark.sql.sources.LogicalRelation
 
newJobContext(JobConf, JobID) - Method in interface org.apache.spark.mapred.SparkHadoopMapRedUtil
 
newJobContext(Configuration, JobID) - Method in interface org.apache.spark.mapreduce.SparkHadoopMapReduceUtil
 
newKryo() - Method in class org.apache.spark.serializer.KryoSerializer
 
newKryo() - Method in class org.apache.spark.sql.execution.SparkSqlSerializer
 
newKryoOutput() - Method in class org.apache.spark.serializer.KryoSerializer
 
newMesosTaskId() - Method in class org.apache.spark.scheduler.cluster.mesos.CoarseMesosSchedulerBackend
 
newPartitioning() - Method in class org.apache.spark.sql.execution.Exchange
 
newRddId() - Method in class org.apache.spark.SparkContext
Register a new RDD, returning its RDD ID
newShuffleId() - Method in class org.apache.spark.SparkContext
 
newTaskAttemptContext(JobConf, TaskAttemptID) - Method in interface org.apache.spark.mapred.SparkHadoopMapRedUtil
 
newTaskAttemptContext(Configuration, TaskAttemptID) - Method in interface org.apache.spark.mapreduce.SparkHadoopMapReduceUtil
 
newTaskAttemptID(String, int, boolean, int, int) - Method in interface org.apache.spark.mapred.SparkHadoopMapRedUtil
 
newTaskAttemptID(String, int, boolean, int, int) - Method in interface org.apache.spark.mapreduce.SparkHadoopMapReduceUtil
 
newTaskId() - Method in class org.apache.spark.scheduler.TaskSchedulerImpl
 
next() - Method in class org.apache.spark.InterruptibleIterator
 
next() - Method in class org.apache.spark.rdd.PartitionCoalescer.LocationIterator
 
next(MutableRow, int) - Method in class org.apache.spark.sql.columnar.compression.BooleanBitSet.Decoder
 
next(MutableRow, int) - Method in interface org.apache.spark.sql.columnar.compression.Decoder
 
next(MutableRow, int) - Method in class org.apache.spark.sql.columnar.compression.DictionaryEncoding.Decoder
 
next(MutableRow, int) - Method in class org.apache.spark.sql.columnar.compression.IntDelta.Decoder
 
next(MutableRow, int) - Method in class org.apache.spark.sql.columnar.compression.LongDelta.Decoder
 
next(MutableRow, int) - Method in class org.apache.spark.sql.columnar.compression.PassThrough.Decoder
 
next(MutableRow, int) - Method in class org.apache.spark.sql.columnar.compression.RunLengthEncoding.Decoder
 
next() - Method in class org.apache.spark.storage.ShuffleBlockFetcherIterator
 
next() - Method in class org.apache.spark.streaming.util.WriteAheadLogReader
 
next() - Method in class org.apache.spark.util.CompletionIterator
 
next() - Method in class org.apache.spark.util.IdGenerator
 
next() - Method in class org.apache.spark.util.NextIterator
 
next() - Method in class org.apache.spark.util.random.GapSamplingIterator
 
next() - Method in class org.apache.spark.util.random.GapSamplingReplacementIterator
 
NextIterator<U> - Class in org.apache.spark.util
Provides a basic/boilerplate Iterator implementation.
NextIterator() - Constructor for class org.apache.spark.util.NextIterator
 
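A hedged sketch of the intended subclassing pattern: getNext() either produces an element or sets finished, and close() runs exactly once afterwards. NextIterator is a private[spark] utility, so this only compiles from code placed inside an org.apache.spark package:

    import org.apache.spark.util.NextIterator

    // Wraps a BufferedSource as an iterator of lines that closes itself.
    class LineIterator(source: scala.io.BufferedSource) extends NextIterator[String] {
      private val lines = source.getLines()
      override protected def getNext(): String =
        if (lines.hasNext) lines.next()
        else { finished = true; null }
      override protected def close(): Unit = source.close()
    }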
nextJobId() - Method in class org.apache.spark.scheduler.DAGScheduler
 
nextKeyValue() - Method in class org.apache.spark.input.FixedLengthBinaryRecordReader
 
nextKeyValue() - Method in class org.apache.spark.input.StreamBasedRecordReader
 
nextKeyValue() - Method in class org.apache.spark.input.WholeTextFileRecordReader
 
nextMesosTaskId() - Method in class org.apache.spark.scheduler.cluster.mesos.CoarseMesosSchedulerBackend
 
nextNullIndex() - Method in interface org.apache.spark.sql.columnar.NullableColumnAccessor
 
nextTaskId() - Method in class org.apache.spark.scheduler.TaskSchedulerImpl
 
nextValue() - Method in class org.apache.spark.mllib.random.PoissonGenerator
 
nextValue() - Method in interface org.apache.spark.mllib.random.RandomDataGenerator
Returns an i.i.d. sample as a Double from an underlying distribution.
nextValue() - Method in class org.apache.spark.mllib.random.StandardNormalGenerator
 
nextValue() - Method in class org.apache.spark.mllib.random.UniformGenerator
 
NNLS - Class in org.apache.spark.mllib.optimization
Object used to solve nonnegative least squares problems using a modified projected gradient method.
NNLS() - Constructor for class org.apache.spark.mllib.optimization.NNLS
 
NNLS.Workspace - Class in org.apache.spark.mllib.optimization
 
NNLS.Workspace(int) - Constructor for class org.apache.spark.mllib.optimization.NNLS.Workspace
 
NO_PREF() - Static method in class org.apache.spark.scheduler.TaskLocality
 
Node - Class in org.apache.spark.mllib.tree.model
:: DeveloperApi :: Node in a decision tree.
Node(int, Predict, double, boolean, Option<Split>, Option<Node>, Option<Node>, Option<InformationGainStats>) - Constructor for class org.apache.spark.mllib.tree.model.Node
 
NODE_LOCAL() - Static method in class org.apache.spark.scheduler.TaskLocality
 
NodeIdCache - Class in org.apache.spark.mllib.tree.impl
:: DeveloperApi :: Node id cache for DecisionTree training: a given TreePoint belongs to a particular node in each tree.
NodeIdCache(RDD<int[]>, Option<String>, int) - Constructor for class org.apache.spark.mllib.tree.impl.NodeIdCache
 
nodeIdsForInstances() - Method in class org.apache.spark.mllib.tree.impl.NodeIdCache
 
nodeIndex() - Method in class org.apache.spark.mllib.tree.impl.NodeIndexUpdater
 
nodeIndexInGroup() - Method in class org.apache.spark.mllib.tree.RandomForest.NodeIndexInfo
 
NodeIndexUpdater - Class in org.apache.spark.mllib.tree.impl
:: DeveloperApi :: This is used by the node id cache to find the child id that a data point would belong to.
NodeIndexUpdater(Split, int) - Constructor for class org.apache.spark.mllib.tree.impl.NodeIndexUpdater
 
nodesToGenerator(Seq<Node>) - Static method in class org.apache.spark.sql.hive.HiveQl
 
nodeToRelation(Node) - Static method in class org.apache.spark.sql.hive.HiveQl
 
nodeToSortOrder(Node) - Static method in class org.apache.spark.sql.hive.HiveQl
 
noLocality() - Method in class org.apache.spark.rdd.PartitionCoalescer
 
NONE - Static variable in class org.apache.spark.api.java.StorageLevels
 
None - Static variable in class org.apache.spark.graphx.TripletFields
None of the triplet fields are exposed.
NONE() - Static method in class org.apache.spark.scheduler.SchedulingMode
 
NONE() - Static method in class org.apache.spark.storage.StorageLevel
 
nonLocalPaths(String, boolean) - Static method in class org.apache.spark.util.Utils
Return all non-local paths from a comma-separated list of paths.
nonNegativeHash(Object) - Static method in class org.apache.spark.util.Utils
 
nonNegativeMod(int, int) - Static method in class org.apache.spark.util.Utils
 
NoopColumnStats - Class in org.apache.spark.sql.columnar
A no-op ColumnStats only used for testing purposes.
NoopColumnStats() - Constructor for class org.apache.spark.sql.columnar.NoopColumnStats
 
norm() - Method in class org.apache.spark.mllib.clustering.VectorWithNorm
 
norm(Vector, double) - Static method in class org.apache.spark.mllib.linalg.Vectors
Returns the p-norm of this vector.
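A hedged sketch; p must be at least 1.0:

    import org.apache.spark.mllib.linalg.Vectors

    val v = Vectors.dense(3.0, -4.0)
    Vectors.norm(v, 1.0)  // 7.0  (sum of absolute values)
    Vectors.norm(v, 2.0)  // 5.0  (Euclidean length)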
NORMAL_APPROX_SAMPLE_SIZE() - Method in class org.apache.spark.partial.StudentTCacher
 
normalApprox() - Method in class org.apache.spark.partial.StudentTCacher
 
Normalizer - Class in org.apache.spark.mllib.feature
:: Experimental :: Normalizes samples individually to unit L^p norm.
Normalizer(double) - Constructor for class org.apache.spark.mllib.feature.Normalizer
 
Normalizer() - Constructor for class org.apache.spark.mllib.feature.Normalizer
 
normalJavaRDD(JavaSparkContext, long, int, long) - Static method in class org.apache.spark.mllib.random.RandomRDDs
normalJavaRDD(JavaSparkContext, long, int) - Static method in class org.apache.spark.mllib.random.RandomRDDs
normalJavaRDD(JavaSparkContext, long) - Static method in class org.apache.spark.mllib.random.RandomRDDs
RandomRDDs.normalJavaRDD(org.apache.spark.api.java.JavaSparkContext, long, int, long) with the default number of partitions and the default seed.
normalJavaVectorRDD(JavaSparkContext, long, int, int, long) - Static method in class org.apache.spark.mllib.random.RandomRDDs
normalJavaVectorRDD(JavaSparkContext, long, int, int) - Static method in class org.apache.spark.mllib.random.RandomRDDs
normalJavaVectorRDD(JavaSparkContext, long, int) - Static method in class org.apache.spark.mllib.random.RandomRDDs
normalRDD(SparkContext, long, int, long) - Static method in class org.apache.spark.mllib.random.RandomRDDs
Generates an RDD comprised of i.i.d. samples from the standard normal distribution.
normalVectorRDD(SparkContext, long, int, int, long) - Static method in class org.apache.spark.mllib.random.RandomRDDs
Generates an RDD[Vector] with vectors containing i.i.d. samples drawn from the standard normal distribution.
normL1() - Method in class org.apache.spark.mllib.stat.MultivariateOnlineSummarizer
 
normL1() - Method in interface org.apache.spark.mllib.stat.MultivariateStatisticalSummary
L1 norm of each column
normL2() - Method in class org.apache.spark.mllib.stat.MultivariateOnlineSummarizer
 
normL2() - Method in interface org.apache.spark.mllib.stat.MultivariateStatisticalSummary
Euclidean magnitude of each column
NOT() - Static method in class org.apache.spark.sql.hive.HiveQl
 
NOT_SET() - Static method in class org.apache.spark.ExecutorAllocationManager
 
notifyError(Throwable) - Method in class org.apache.spark.streaming.ContextWaiter
 
notifyStop() - Method in class org.apache.spark.streaming.ContextWaiter
 
nullable() - Method in class org.apache.spark.sql.execution.PythonUDF
 
nullable() - Method in class org.apache.spark.sql.hive.HiveGenericUdaf
 
nullable() - Method in class org.apache.spark.sql.hive.HiveGenericUdf
 
nullable() - Method in class org.apache.spark.sql.hive.HiveSimpleUdf
 
nullable() - Method in class org.apache.spark.sql.hive.HiveUdaf
 
NullableColumnAccessor - Interface in org.apache.spark.sql.columnar
 
NullableColumnBuilder - Interface in org.apache.spark.sql.columnar
A stackable trait used for building a byte buffer for a column containing null values.
nullCount() - Method in class org.apache.spark.sql.columnar.ColumnStatisticsSchema
 
nullCount() - Method in interface org.apache.spark.sql.columnar.ColumnStats
 
nullCount() - Method in interface org.apache.spark.sql.columnar.NullableColumnAccessor
 
nullCount() - Method in interface org.apache.spark.sql.columnar.NullableColumnBuilder
 
nullHypothesis() - Method in class org.apache.spark.mllib.stat.test.ChiSqTestResult
 
nullHypothesis() - Method in interface org.apache.spark.mllib.stat.test.TestResult
Null hypothesis of the test.
nulls() - Method in interface org.apache.spark.sql.columnar.NullableColumnBuilder
 
nullsBuffer() - Method in interface org.apache.spark.sql.columnar.NullableColumnAccessor
 
NullType - Static variable in class org.apache.spark.sql.api.java.DataType
Gets the NullType object.
NullType - Class in org.apache.spark.sql.api.java
The data type representing null and NULL values.
nullTypeToStringType(StructType) - Static method in class org.apache.spark.sql.json.JsonRDD
 
NUM_PARTITIONS() - Static method in class org.apache.spark.ui.UIWorkloadGenerator
 
numAccepted() - Method in class org.apache.spark.util.random.AcceptanceResult
 
numActives() - Method in class org.apache.spark.graphx.impl.EdgePartition
The number of active vertices, if any exist.
numActiveStages() - Method in class org.apache.spark.ui.jobs.UIData.JobUIData
 
numActiveTasks() - Method in interface org.apache.spark.SparkStageInfo
 
numActiveTasks() - Method in class org.apache.spark.SparkStageInfoImpl
 
numActiveTasks() - Method in class org.apache.spark.ui.jobs.UIData.JobUIData
 
numActiveTasks() - Method in class org.apache.spark.ui.jobs.UIData.StageUIData
 
numAvailableOutputs() - Method in class org.apache.spark.scheduler.Stage
 
numberOfHiccups() - Method in class org.apache.spark.streaming.receiver.Statistics
 
numberOfMsgs() - Method in class org.apache.spark.streaming.receiver.Statistics
 
numberOfWorkers() - Method in class org.apache.spark.streaming.receiver.Statistics
 
numBins() - Method in class org.apache.spark.mllib.tree.impl.DecisionTreeMetadata
 
numBlocks() - Method in class org.apache.spark.storage.StorageStatus
Return the number of blocks stored in this block manager in O(RDDs) time.
numCachedPartitions() - Method in class org.apache.spark.storage.RDDInfo
 
numClasses() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
 
numClasses() - Method in class org.apache.spark.mllib.tree.impl.DecisionTreeMetadata
 
numCols() - Method in class org.apache.spark.mllib.linalg.DenseMatrix
 
numCols() - Method in class org.apache.spark.mllib.linalg.distributed.CoordinateMatrix
Gets or computes the number of columns.
numCols() - Method in interface org.apache.spark.mllib.linalg.distributed.DistributedMatrix
Gets or computes the number of columns.
numCols() - Method in class org.apache.spark.mllib.linalg.distributed.IndexedRowMatrix
 
numCols() - Method in class org.apache.spark.mllib.linalg.distributed.RowMatrix
Gets or computes the number of columns.
numCols() - Method in interface org.apache.spark.mllib.linalg.Matrix
Number of columns.
numCols() - Method in class org.apache.spark.mllib.linalg.SparseMatrix
 
numCompletedStages() - Method in class org.apache.spark.ui.jobs.JobProgressListener
 
numCompletedTasks() - Method in interface org.apache.spark.SparkStageInfo
 
numCompletedTasks() - Method in class org.apache.spark.SparkStageInfoImpl
 
numCompletedTasks() - Method in class org.apache.spark.ui.jobs.UIData.JobUIData
 
numCompleteTasks() - Method in class org.apache.spark.ui.jobs.UIData.StageUIData
 
numDescendants() - Method in class org.apache.spark.mllib.tree.model.Node
Get the number of nodes in tree below this node, including leaf nodes.
numEdgePartitions() - Method in class org.apache.spark.graphx.impl.RoutingTablePartition
The maximum number of edge partitions this `RoutingTablePartition` is built to join with.
numEdges() - Method in class org.apache.spark.graphx.GraphOps
The number of edges in the graph.
numericAstTypes() - Static method in class org.apache.spark.sql.hive.HiveQl
 
NumericParser - Class in org.apache.spark.mllib.util
Simple parser for a numeric structure consisting of three types: numbers, arrays of numbers, and tuples of numbers, arrays, or tuples.
NumericParser() - Constructor for class org.apache.spark.mllib.util.NumericParser
 
numericRDDToDoubleRDDFunctions(RDD<T>, Numeric<T>) - Static method in class org.apache.spark.SparkContext
 
numExamples() - Method in class org.apache.spark.mllib.tree.impl.DecisionTreeMetadata
 
numExistingExecutors() - Method in class org.apache.spark.scheduler.cluster.CoarseGrainedSchedulerBackend
Return the number of executors currently registered with this backend.
numFailedStages() - Method in class org.apache.spark.ui.jobs.JobProgressListener
 
numFailedStages() - Method in class org.apache.spark.ui.jobs.UIData.JobUIData
 
numFailedTasks() - Method in interface org.apache.spark.SparkStageInfo
 
numFailedTasks() - Method in class org.apache.spark.SparkStageInfoImpl
 
numFailedTasks() - Method in class org.apache.spark.ui.jobs.UIData.JobUIData
 
numFailedTasks() - Method in class org.apache.spark.ui.jobs.UIData.StageUIData
 
numFalseNegatives() - Method in interface org.apache.spark.mllib.evaluation.binary.BinaryConfusionMatrix
number of false negatives
numFalseNegatives() - Method in class org.apache.spark.mllib.evaluation.binary.BinaryConfusionMatrixImpl
number of false negatives
numFalsePositives() - Method in interface org.apache.spark.mllib.evaluation.binary.BinaryConfusionMatrix
number of false positives
numFalsePositives() - Method in class org.apache.spark.mllib.evaluation.binary.BinaryConfusionMatrixImpl
number of false positives
numFeatures() - Method in class org.apache.spark.ml.feature.HashingTF
number of features
numFeatures() - Method in class org.apache.spark.mllib.feature.HashingTF
 
numFeatures() - Method in class org.apache.spark.mllib.tree.impl.DecisionTreeMetadata
 
numFeaturesPerNode() - Method in class org.apache.spark.mllib.tree.impl.DecisionTreeMetadata
 
numFinished() - Method in class org.apache.spark.scheduler.ActiveJob
 
numFolds() - Method in interface org.apache.spark.ml.tuning.CrossValidatorParams
param for number of folds for cross validation
numInLinks() - Method in class org.apache.spark.mllib.recommendation.ALS.BlockStats
 
numItems() - Method in class org.apache.spark.util.random.AcceptanceResult
 
numIterations() - Method in class org.apache.spark.mllib.tree.configuration.BoostingStrategy
 
numNegatives() - Method in interface org.apache.spark.mllib.evaluation.binary.BinaryConfusionMatrix
number of negatives
numNegatives() - Method in class org.apache.spark.mllib.evaluation.binary.BinaryConfusionMatrixImpl
number of negatives
numNegatives() - Method in class org.apache.spark.mllib.evaluation.binary.BinaryLabelCounter
 
numNodes() - Method in class org.apache.spark.mllib.tree.model.DecisionTreeModel
Get number of nodes in tree, including leaf nodes.
numNonzeros() - Method in class org.apache.spark.mllib.stat.MultivariateOnlineSummarizer
 
numNonzeros() - Method in interface org.apache.spark.mllib.stat.MultivariateStatisticalSummary
Number of nonzero elements (including explicitly presented zero values) in each column.
numOutLinks() - Method in class org.apache.spark.mllib.recommendation.ALS.BlockStats
 
numPartitions() - Method in class org.apache.spark.HashPartitioner
 
numPartitions() - Method in class org.apache.spark.mllib.recommendation.ALSPartitioner
 
numPartitions() - Method in class org.apache.spark.Partitioner
 
numPartitions() - Method in class org.apache.spark.RangePartitioner
 
numPartitions() - Method in class org.apache.spark.scheduler.ActiveJob
 
numPartitions() - Method in class org.apache.spark.scheduler.Stage
 
numPartitions() - Method in class org.apache.spark.sql.execution.AddExchange
 
numPartitions() - Method in class org.apache.spark.sql.execution.SparkStrategies.BasicOperators
 
numPartitions() - Method in class org.apache.spark.storage.RDDInfo
 
numPartitionsInRdd2() - Method in class org.apache.spark.rdd.CartesianRDD
 
numPositives() - Method in interface org.apache.spark.mllib.evaluation.binary.BinaryConfusionMatrix
number of positives
numPositives() - Method in class org.apache.spark.mllib.evaluation.binary.BinaryConfusionMatrixImpl
number of positives
numPositives() - Method in class org.apache.spark.mllib.evaluation.binary.BinaryLabelCounter
 
numRatings() - Method in class org.apache.spark.mllib.recommendation.ALS.BlockStats
 
numRddBlocks() - Method in class org.apache.spark.storage.StorageStatus
Return the number of RDD blocks stored in this block manager in O(RDDs) time.
numRddBlocksById(int) - Method in class org.apache.spark.storage.StorageStatus
Return the number of blocks that belong to the given RDD in O(1) time.
numReceivers() - Method in class org.apache.spark.streaming.ui.StreamingJobProgressListener
 
numRecords() - Method in class org.apache.spark.streaming.scheduler.ReceivedBlockInfo
 
numRetries(SparkConf) - Static method in class org.apache.spark.util.AkkaUtils
Returns the configured number of times to retry connecting
numRows() - Method in class org.apache.spark.mllib.linalg.DenseMatrix
 
numRows() - Method in class org.apache.spark.mllib.linalg.distributed.CoordinateMatrix
Gets or computes the number of rows.
numRows() - Method in interface org.apache.spark.mllib.linalg.distributed.DistributedMatrix
Gets or computes the number of rows.
numRows() - Method in class org.apache.spark.mllib.linalg.distributed.IndexedRowMatrix
 
numRows() - Method in class org.apache.spark.mllib.linalg.distributed.RowMatrix
Gets or computes the number of rows.
numRows() - Method in interface org.apache.spark.mllib.linalg.Matrix
Number of rows.
numRows() - Method in class org.apache.spark.mllib.linalg.SparseMatrix
 
numShufflePartitions() - Method in class org.apache.spark.sql.hive.test.TestHiveContext
Fewer partitions to speed up testing.
numShufflePartitions() - Method in interface org.apache.spark.sql.SQLConf
Number of partitions to use for shuffle operators.
numShufflePartitions() - Static method in class org.apache.spark.sql.test.TestSQLContext
Fewer partitions to speed up testing.
numSkippedStages() - Method in class org.apache.spark.ui.jobs.UIData.JobUIData
 
numSkippedTasks() - Method in class org.apache.spark.ui.jobs.UIData.JobUIData
 
numSplits(int) - Method in class org.apache.spark.mllib.tree.impl.DecisionTreeMetadata
Number of splits for the given feature.
numTasks() - Method in class org.apache.spark.scheduler.Stage
 
numTasks() - Method in class org.apache.spark.scheduler.StageInfo
 
numTasks() - Method in class org.apache.spark.scheduler.TaskSetManager
 
numTasks() - Method in interface org.apache.spark.SparkStageInfo
 
numTasks() - Method in class org.apache.spark.SparkStageInfoImpl
 
numTasks() - Method in class org.apache.spark.ui.jobs.UIData.JobUIData
 
numThreadsUnrolling() - Method in class org.apache.spark.storage.MemoryStore
Return the number of threads currently unrolling blocks.
numTotalCompletedBatches() - Method in class org.apache.spark.streaming.ui.StreamingJobProgressListener
 
numTotalJobs() - Method in class org.apache.spark.scheduler.DAGScheduler
 
numTotalProcessedRecords() - Method in class org.apache.spark.streaming.ui.StreamingJobProgressListener
 
numTotalReceivedRecords() - Method in class org.apache.spark.streaming.ui.StreamingJobProgressListener
 
numTrees() - Method in class org.apache.spark.mllib.tree.impl.DecisionTreeMetadata
 
numTrees() - Method in class org.apache.spark.mllib.tree.model.TreeEnsembleModel
Get number of trees in forest.
numTrueNegatives() - Method in interface org.apache.spark.mllib.evaluation.binary.BinaryConfusionMatrix
number of true negatives
numTrueNegatives() - Method in class org.apache.spark.mllib.evaluation.binary.BinaryConfusionMatrixImpl
number of true negatives
numTruePositives() - Method in interface org.apache.spark.mllib.evaluation.binary.BinaryConfusionMatrix
number of true positives
numTruePositives() - Method in class org.apache.spark.mllib.evaluation.binary.BinaryConfusionMatrixImpl
number of true positives
numUnorderedBins(int) - Static method in class org.apache.spark.mllib.tree.impl.DecisionTreeMetadata
Given the arity of a categorical feature (arity = number of categories), return the number of bins for the feature if it is to be treated as an unordered feature.
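The count behind this is combinatorial. A sketch of the standard derivation, assuming the usual 2^(arity - 1) - 1 count of two-way category partitions (the exact bins-per-split bookkeeping may differ between Spark versions):

    // Sketch, not the Spark source: each unordered split assigns the
    // categories of a feature to one of two nonempty sides. Fixing one
    // category's side to break the symmetry leaves 2^(arity - 1) - 1
    // distinct splits.
    def numUnorderedSplits(arity: Int): Int = (1 << (arity - 1)) - 1

    numUnorderedSplits(3) // 3 splits: {a | bc}, {b | ac}, {c | ab}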
numUnprocessedBatches() - Method in class org.apache.spark.streaming.ui.StreamingJobProgressListener
 
numVertices() - Method in class org.apache.spark.graphx.GraphOps
The number of vertices in the graph.

O

objectFile(String, int) - Method in class org.apache.spark.api.java.JavaSparkContext
Load an RDD saved as a SequenceFile containing serialized objects, with NullWritable keys and BytesWritable values that contain a serialized partition.
objectFile(String) - Method in class org.apache.spark.api.java.JavaSparkContext
Load an RDD saved as a SequenceFile containing serialized objects, with NullWritable keys and BytesWritable values that contain a serialized partition.
objectFile(String, int, ClassTag<T>) - Method in class org.apache.spark.SparkContext
Load an RDD saved as a SequenceFile containing serialized objects, with NullWritable keys and BytesWritable values that contain a serialized partition.
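A minimal round-trip sketch (the path is a placeholder), pairing objectFile with RDD.saveAsObjectFile, which writes the SequenceFile format described above:

    val data = sc.parallelize(1 to 100)
    data.saveAsObjectFile("/tmp/ints")                // serialized partitions in a SequenceFile
    val restored = sc.objectFile[Int]("/tmp/ints", 4) // read back with at least 4 partitions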
ObjectInputStreamWithLoader - Class in org.apache.spark.streaming
 
ObjectInputStreamWithLoader(InputStream, ClassLoader) - Constructor for class org.apache.spark.streaming.ObjectInputStreamWithLoader
 
of(RDD<Tuple2<Object, Object>>) - Static method in class org.apache.spark.mllib.evaluation.AreaUnderCurve
Returns the area under the given curve.
of(Iterable<Tuple2<Object, Object>>) - Static method in class org.apache.spark.mllib.evaluation.AreaUnderCurve
Returns the area under the given curve.
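Conceptually this is the trapezoidal rule over a curve given as (x, y) points, as used by BinaryClassificationMetrics for ROC and precision-recall curves. A self-contained sketch of the computation (not the Spark source, which also handles the RDD overload):

    // Trapezoidal rule over consecutive points of a curve.
    def trapezoidArea(curve: Seq[(Double, Double)]): Double =
      curve.sliding(2).collect { case Seq((x1, y1), (x2, y2)) =>
        (x2 - x1) * (y1 + y2) / 2.0
      }.sum

    trapezoidArea(Seq((0.0, 0.0), (0.5, 0.5), (1.0, 1.0))) // 0.5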
OFF_HEAP - Static variable in class org.apache.spark.api.java.StorageLevels
 
OFF_HEAP() - Static method in class org.apache.spark.storage.StorageLevel
 
offerRescinded(SchedulerDriver, Protos.OfferID) - Method in class org.apache.spark.scheduler.cluster.mesos.CoarseMesosSchedulerBackend
 
offerRescinded(SchedulerDriver, Protos.OfferID) - Method in class org.apache.spark.scheduler.cluster.mesos.MesosSchedulerBackend
 
offHeapUsed() - Method in class org.apache.spark.storage.StorageStatus
Return the off-heap space used by this block manager.
offHeapUsedByRdd(int) - Method in class org.apache.spark.storage.StorageStatus
Return the off-heap space used by the given RDD in this block manager in O(1) time.
offset() - Method in class org.apache.spark.storage.FileSegment
 
offset() - Method in class org.apache.spark.storage.TachyonFileSegment
 
offset() - Method in class org.apache.spark.streaming.util.WriteAheadLogFileSegment
 
offsetBytes(String, long, long) - Static method in class org.apache.spark.util.Utils
Return a string containing part of a file from byte 'start' to 'end'.
offsetBytes(Seq<File>, long, long) - Static method in class org.apache.spark.util.Utils
Return a string containing data across a set of files.
onAddData(Object, Object) - Method in interface org.apache.spark.streaming.receiver.BlockGeneratorListener
Called after a data item is added into the BlockGenerator.
onApplicationEnd(SparkListenerApplicationEnd) - Method in class org.apache.spark.scheduler.ApplicationEventListener
 
onApplicationEnd(SparkListenerApplicationEnd) - Method in class org.apache.spark.scheduler.EventLoggingListener
 
onApplicationEnd(SparkListenerApplicationEnd) - Method in interface org.apache.spark.scheduler.SparkListener
Called when the application ends
onApplicationStart(SparkListenerApplicationStart) - Method in class org.apache.spark.scheduler.ApplicationEventListener
 
onApplicationStart(SparkListenerApplicationStart) - Method in class org.apache.spark.scheduler.EventLoggingListener
 
onApplicationStart(SparkListenerApplicationStart) - Method in interface org.apache.spark.scheduler.SparkListener
Called when the application starts
onBatchCompleted(StreamingListenerBatchCompleted) - Method in class org.apache.spark.streaming.scheduler.StatsReportListener
 
onBatchCompleted(StreamingListenerBatchCompleted) - Method in interface org.apache.spark.streaming.scheduler.StreamingListener
Called when processing of a batch of jobs has completed.
onBatchCompleted(StreamingListenerBatchCompleted) - Method in class org.apache.spark.streaming.ui.StreamingJobProgressListener
 
onBatchCompletion(Time) - Method in class org.apache.spark.streaming.scheduler.JobGenerator
Callback called when a batch has been completely processed.
onBatchStarted(StreamingListenerBatchStarted) - Method in interface org.apache.spark.streaming.scheduler.StreamingListener
Called when processing of a batch of jobs has started.
onBatchStarted(StreamingListenerBatchStarted) - Method in class org.apache.spark.streaming.ui.StreamingJobProgressListener
 
onBatchSubmitted(StreamingListenerBatchSubmitted) - Method in interface org.apache.spark.streaming.scheduler.StreamingListener
Called when a batch of jobs has been submitted for processing.
onBatchSubmitted(StreamingListenerBatchSubmitted) - Method in class org.apache.spark.streaming.ui.StreamingJobProgressListener
 
onBlockManagerAdded(SparkListenerBlockManagerAdded) - Method in class org.apache.spark.scheduler.EventLoggingListener
 
onBlockManagerAdded(SparkListenerBlockManagerAdded) - Method in interface org.apache.spark.scheduler.SparkListener
Called when a new block manager has joined
onBlockManagerAdded(SparkListenerBlockManagerAdded) - Method in class org.apache.spark.storage.StorageStatusListener
 
onBlockManagerAdded(SparkListenerBlockManagerAdded) - Method in class org.apache.spark.ui.jobs.JobProgressListener
 
onBlockManagerRemoved(SparkListenerBlockManagerRemoved) - Method in class org.apache.spark.scheduler.EventLoggingListener
 
onBlockManagerRemoved(SparkListenerBlockManagerRemoved) - Method in interface org.apache.spark.scheduler.SparkListener
Called when an existing block manager has been removed
onBlockManagerRemoved(SparkListenerBlockManagerRemoved) - Method in class org.apache.spark.storage.StorageStatusListener
 
onBlockManagerRemoved(SparkListenerBlockManagerRemoved) - Method in class org.apache.spark.ui.jobs.JobProgressListener
 
onCheckpointCompletion(Time) - Method in class org.apache.spark.streaming.scheduler.JobGenerator
Callback called when the checkpoint of a batch has been written.
onComplete(Function1<Try<T>, U>, ExecutionContext) - Method in class org.apache.spark.ComplexFutureAction
 
onComplete(Function1<Try<T>, U>, ExecutionContext) - Method in interface org.apache.spark.FutureAction
When this action is completed, either through an exception, or a value, applies the provided function.
onComplete(Function1<R, BoxedUnit>) - Method in class org.apache.spark.partial.PartialResult
Set a handler to be called when this PartialResult completes.
onComplete(Function1<Try<T>, U>, ExecutionContext) - Method in class org.apache.spark.SimpleFutureAction
 
onEnvironmentUpdate(SparkListenerEnvironmentUpdate) - Method in class org.apache.spark.scheduler.ApplicationEventListener
 
onEnvironmentUpdate(SparkListenerEnvironmentUpdate) - Method in class org.apache.spark.scheduler.EventLoggingListener
 
onEnvironmentUpdate(SparkListenerEnvironmentUpdate) - Method in interface org.apache.spark.scheduler.SparkListener
Called when environment properties have been updated
onEnvironmentUpdate(SparkListenerEnvironmentUpdate) - Method in class org.apache.spark.ui.env.EnvironmentListener
 
onEnvironmentUpdate(SparkListenerEnvironmentUpdate) - Method in class org.apache.spark.ui.jobs.JobProgressListener
 
onError(String, Throwable) - Method in interface org.apache.spark.streaming.receiver.BlockGeneratorListener
Called when an error has occurred in the BlockGenerator.
ones(int, int) - Static method in class org.apache.spark.mllib.linalg.Matrices
Generate a DenseMatrix consisting of ones.
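A small usage sketch (Matrices.zeros and Matrices.eye follow the same factory pattern):

    import org.apache.spark.mllib.linalg.Matrices

    val m = Matrices.ones(2, 3) // 2x3 dense matrix filled with 1.0
    m.numRows                   // 2
    m.numCols                   // 3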
ones(int) - Static method in class org.apache.spark.util.Vector
 
OneToOneDependency<T> - Class in org.apache.spark
:: DeveloperApi :: Represents a one-to-one dependency between partitions of the parent and child RDDs.
OneToOneDependency(RDD<T>) - Constructor for class org.apache.spark.OneToOneDependency
 
onExecutorMetricsUpdate(SparkListenerExecutorMetricsUpdate) - Method in class org.apache.spark.scheduler.EventLoggingListener
 
onExecutorMetricsUpdate(SparkListenerExecutorMetricsUpdate) - Method in interface org.apache.spark.scheduler.SparkListener
Called when the driver receives task metrics from an executor in a heartbeat.
onExecutorMetricsUpdate(SparkListenerExecutorMetricsUpdate) - Method in class org.apache.spark.ui.jobs.JobProgressListener
 
onFail(Function1<Exception, BoxedUnit>) - Method in class org.apache.spark.partial.PartialResult
Set a handler to be called if this PartialResult's job fails.
onGenerateBlock(StreamBlockId) - Method in interface org.apache.spark.streaming.receiver.BlockGeneratorListener
Called when a new block of data is generated by the block generator.
onJobEnd(SparkListenerJobEnd) - Method in class org.apache.spark.scheduler.EventLoggingListener
 
onJobEnd(SparkListenerJobEnd) - Method in class org.apache.spark.scheduler.JobLogger
When a job ends, record its completion status and close the log file.
onJobEnd(SparkListenerJobEnd) - Method in interface org.apache.spark.scheduler.SparkListener
Called when a job ends
onJobEnd(SparkListenerJobEnd) - Method in class org.apache.spark.ui.jobs.JobProgressListener
 
onJobStart(SparkListenerJobStart) - Method in class org.apache.spark.scheduler.EventLoggingListener
 
onJobStart(SparkListenerJobStart) - Method in class org.apache.spark.scheduler.JobLogger
When a job starts, record its properties and stage graph.
onJobStart(SparkListenerJobStart) - Method in interface org.apache.spark.scheduler.SparkListener
Called when a job starts
onJobStart(SparkListenerJobStart) - Method in class org.apache.spark.ui.jobs.JobProgressListener
 
onPushBlock(StreamBlockId, ArrayBuffer<?>) - Method in interface org.apache.spark.streaming.receiver.BlockGeneratorListener
Called when a new block is ready to be pushed.
onReceiverError(StreamingListenerReceiverError) - Method in interface org.apache.spark.streaming.scheduler.StreamingListener
Called when a receiver has reported an error
onReceiverError(StreamingListenerReceiverError) - Method in class org.apache.spark.streaming.ui.StreamingJobProgressListener
 
onReceiverStarted(StreamingListenerReceiverStarted) - Method in interface org.apache.spark.streaming.scheduler.StreamingListener
Called when a receiver has been started
onReceiverStarted(StreamingListenerReceiverStarted) - Method in class org.apache.spark.streaming.ui.StreamingJobProgressListener
 
onReceiverStopped(StreamingListenerReceiverStopped) - Method in interface org.apache.spark.streaming.scheduler.StreamingListener
Called when a receiver has been stopped
onReceiverStopped(StreamingListenerReceiverStopped) - Method in class org.apache.spark.streaming.ui.StreamingJobProgressListener
 
onStageCompleted(SparkListenerStageCompleted) - Method in class org.apache.spark.scheduler.EventLoggingListener
 
onStageCompleted(SparkListenerStageCompleted) - Method in class org.apache.spark.scheduler.JobLogger
When a stage completes, record its completion status.
onStageCompleted(SparkListenerStageCompleted) - Method in interface org.apache.spark.scheduler.SparkListener
Called when a stage completes successfully or fails, with information on the completed stage.
onStageCompleted(SparkListenerStageCompleted) - Method in class org.apache.spark.scheduler.StatsReportListener
 
onStageCompleted(SparkListenerStageCompleted) - Method in class org.apache.spark.ui.jobs.JobProgressListener
 
onStageCompleted(SparkListenerStageCompleted) - Method in class org.apache.spark.ui.storage.StorageListener
 
onStageSubmitted(SparkListenerStageSubmitted) - Method in class org.apache.spark.scheduler.EventLoggingListener
 
onStageSubmitted(SparkListenerStageSubmitted) - Method in class org.apache.spark.scheduler.JobLogger
When a stage is submitted, record its submission info.
onStageSubmitted(SparkListenerStageSubmitted) - Method in interface org.apache.spark.scheduler.SparkListener
Called when a stage is submitted
onStageSubmitted(SparkListenerStageSubmitted) - Method in class org.apache.spark.ui.jobs.JobProgressListener
For FIFO scheduling, all stages are contained in the "default" pool, but the "default" pool here is meaningless.
onStageSubmitted(SparkListenerStageSubmitted) - Method in class org.apache.spark.ui.storage.StorageListener
 
onStart() - Method in class org.apache.spark.streaming.dstream.RawNetworkReceiver
 
onStart() - Method in class org.apache.spark.streaming.dstream.SocketReceiver
 
onStart() - Method in class org.apache.spark.streaming.flume.FlumePollingReceiver
 
onStart() - Method in class org.apache.spark.streaming.flume.FlumeReceiver
 
onStart() - Method in class org.apache.spark.streaming.kafka.KafkaReceiver
 
onStart() - Method in class org.apache.spark.streaming.kafka.ReliableKafkaReceiver
 
onStart() - Method in class org.apache.spark.streaming.kinesis.KinesisReceiver
This is called when the KinesisReceiver starts and must be non-blocking.
onStart() - Method in class org.apache.spark.streaming.mqtt.MQTTReceiver
 
onStart() - Method in class org.apache.spark.streaming.receiver.ActorReceiver
 
onStart() - Method in class org.apache.spark.streaming.receiver.Receiver
This method is called by the system when the receiver is started.
onStart() - Method in class org.apache.spark.streaming.twitter.TwitterReceiver
 
onStop() - Method in class org.apache.spark.streaming.dstream.RawNetworkReceiver
 
onStop() - Method in class org.apache.spark.streaming.dstream.SocketReceiver
 
onStop() - Method in class org.apache.spark.streaming.flume.FlumePollingReceiver
 
onStop() - Method in class org.apache.spark.streaming.flume.FlumeReceiver
 
onStop() - Method in class org.apache.spark.streaming.kafka.KafkaReceiver
 
onStop() - Method in class org.apache.spark.streaming.kafka.ReliableKafkaReceiver
 
onStop() - Method in class org.apache.spark.streaming.kinesis.KinesisReceiver
This is called when the KinesisReceiver stops.
onStop() - Method in class org.apache.spark.streaming.mqtt.MQTTReceiver
 
onStop() - Method in class org.apache.spark.streaming.receiver.ActorReceiver
 
onStop() - Method in class org.apache.spark.streaming.receiver.Receiver
This method is called by the system when the receiver is stopped.
onStop() - Method in class org.apache.spark.streaming.twitter.TwitterReceiver
 
onTaskCompletion(TaskContext) - Method in interface org.apache.spark.util.TaskCompletionListener
 
onTaskEnd(SparkListenerTaskEnd) - Method in class org.apache.spark.scheduler.EventLoggingListener
 
onTaskEnd(SparkListenerTaskEnd) - Method in class org.apache.spark.scheduler.JobLogger
When task ends, record task completion status and metrics
onTaskEnd(SparkListenerTaskEnd) - Method in interface org.apache.spark.scheduler.SparkListener
Called when a task ends
onTaskEnd(SparkListenerTaskEnd) - Method in class org.apache.spark.scheduler.StatsReportListener
 
onTaskEnd(SparkListenerTaskEnd) - Method in class org.apache.spark.storage.StorageStatusListener
 
onTaskEnd(SparkListenerTaskEnd) - Method in class org.apache.spark.ui.exec.ExecutorsListener
 
onTaskEnd(SparkListenerTaskEnd) - Method in class org.apache.spark.ui.jobs.JobProgressListener
 
onTaskEnd(SparkListenerTaskEnd) - Method in class org.apache.spark.ui.storage.StorageListener
Assumes the storage status list is fully up-to-date.
onTaskGettingResult(SparkListenerTaskGettingResult) - Method in class org.apache.spark.scheduler.EventLoggingListener
 
onTaskGettingResult(SparkListenerTaskGettingResult) - Method in interface org.apache.spark.scheduler.SparkListener
Called when a task begins remotely fetching its result (will not be called for tasks that do not need to fetch the result remotely).
onTaskGettingResult(SparkListenerTaskGettingResult) - Method in class org.apache.spark.ui.jobs.JobProgressListener
 
onTaskStart(SparkListenerTaskStart) - Method in class org.apache.spark.scheduler.EventLoggingListener
 
onTaskStart(SparkListenerTaskStart) - Method in interface org.apache.spark.scheduler.SparkListener
Called when a task starts
onTaskStart(SparkListenerTaskStart) - Method in class org.apache.spark.ui.exec.ExecutorsListener
 
onTaskStart(SparkListenerTaskStart) - Method in class org.apache.spark.ui.jobs.JobProgressListener
 
onUnpersistRDD(SparkListenerUnpersistRDD) - Method in class org.apache.spark.scheduler.EventLoggingListener
 
onUnpersistRDD(SparkListenerUnpersistRDD) - Method in interface org.apache.spark.scheduler.SparkListener
Called when an RDD is manually unpersisted by the application
onUnpersistRDD(SparkListenerUnpersistRDD) - Method in class org.apache.spark.storage.StorageStatusListener
 
onUnpersistRDD(SparkListenerUnpersistRDD) - Method in class org.apache.spark.ui.storage.StorageListener
 
OOM() - Static method in class org.apache.spark.util.SparkExitCode
The default uncaught exception handler was reached, and the uncaught exception was an OutOfMemoryError.
open() - Method in class org.apache.spark.input.PortableDataStream
Create a new DataInputStream from the split and context
open() - Method in class org.apache.spark.SparkHadoopWriter
 
open() - Method in class org.apache.spark.storage.BlockObjectWriter
 
open() - Method in class org.apache.spark.storage.DiskBlockObjectWriter
 
OpenHashSetSerializer - Class in org.apache.spark.sql.execution
 
OpenHashSetSerializer() - Constructor for class org.apache.spark.sql.execution.OpenHashSetSerializer
 
ops() - Method in class org.apache.spark.graphx.Graph
The associated GraphOps object.
optimize(RDD<Tuple2<Object, Vector>>, Vector) - Method in class org.apache.spark.mllib.optimization.GradientDescent
:: DeveloperApi :: Runs gradient descent on the given training data.
optimize(RDD<Tuple2<Object, Vector>>, Vector) - Method in class org.apache.spark.mllib.optimization.LBFGS
 
optimize(RDD<Tuple2<Object, Vector>>, Vector) - Method in interface org.apache.spark.mllib.optimization.Optimizer
Solve the provided convex optimization problem.
optimizer() - Method in class org.apache.spark.mllib.classification.LogisticRegressionWithLBFGS
 
optimizer() - Method in class org.apache.spark.mllib.classification.LogisticRegressionWithSGD
 
optimizer() - Method in class org.apache.spark.mllib.classification.SVMWithSGD
 
Optimizer - Interface in org.apache.spark.mllib.optimization
:: DeveloperApi :: Trait for optimization problem solvers.
optimizer() - Method in class org.apache.spark.mllib.regression.GeneralizedLinearAlgorithm
The optimizer to solve the problem.
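The optimizer() accessor above is the usual way to tune these solvers before training. A hedged sketch, assuming trainingData: RDD[LabeledPoint] is already in scope:

    import org.apache.spark.mllib.regression.LinearRegressionWithSGD

    val lr = new LinearRegressionWithSGD()
    lr.optimizer                   // a GradientDescent instance
      .setNumIterations(200)
      .setStepSize(0.1)
      .setMiniBatchFraction(0.5)
    val model = lr.run(trainingData)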
optimizer() - Method in class org.apache.spark.mllib.regression.LassoWithSGD
 
optimizer() - Method in class org.apache.spark.mllib.regression.LinearRegressionWithSGD
 
optimizer() - Method in class org.apache.spark.mllib.regression.RidgeRegressionWithSGD
 
options() - Method in class org.apache.spark.sql.sources.CreateTableUsing
 
optionToOptional(Option<T>) - Static method in class org.apache.spark.api.java.JavaUtils
 
OR() - Static method in class org.apache.spark.sql.hive.HiveQl
 
ord() - Method in class org.apache.spark.sql.execution.TakeOrdered
 
orderBy(Seq<SortOrder>) - Method in class org.apache.spark.sql.SchemaRDD
Sorts the results by the given expressions.
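A hedged sketch using the Scala symbol DSL, assuming a SQLContext named sqlContext with its implicits imported so that 'age resolves to an attribute reference:

    val people = sqlContext.sql("SELECT name, age FROM people")
    val byAgeDesc = people.orderBy('age.desc) // SortOrder built via the DSL
    byAgeDesc.collect().foreach(println)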
OrderedRDDFunctions<K,V,P extends scala.Product2<K,V>> - Class in org.apache.spark.rdd
Extra functions available on RDDs of (key, value) pairs where the key is sortable through an implicit conversion.
OrderedRDDFunctions(RDD<P>, Ordering<K>, ClassTag<K>, ClassTag<V>, ClassTag<P>) - Constructor for class org.apache.spark.rdd.OrderedRDDFunctions
 
ordering() - Static method in class org.apache.spark.streaming.Time
 
org.apache.spark - package org.apache.spark
Core Spark classes in Scala.
org.apache.spark.annotation - package org.apache.spark.annotation
Spark annotations to mark an API experimental or intended only for advanced usages by developers.
org.apache.spark.api.java - package org.apache.spark.api.java
Spark Java programming APIs.
org.apache.spark.api.java.function - package org.apache.spark.api.java.function
Set of interfaces to represent functions in Spark's Java API.
org.apache.spark.broadcast - package org.apache.spark.broadcast
Spark's broadcast variables, used to broadcast immutable datasets to all nodes.
org.apache.spark.examples.streaming - package org.apache.spark.examples.streaming
 
org.apache.spark.graphx - package org.apache.spark.graphx
:: AlphaComponent :: GraphX is a graph processing framework built on top of Spark.
org.apache.spark.graphx.impl - package org.apache.spark.graphx.impl
 
org.apache.spark.graphx.lib - package org.apache.spark.graphx.lib
Various analytics functions for graphs.
org.apache.spark.graphx.util - package org.apache.spark.graphx.util
Collections of utilities used by graphx.
org.apache.spark.input - package org.apache.spark.input
 
org.apache.spark.io - package org.apache.spark.io
IO codecs used for compression.
org.apache.spark.mapred - package org.apache.spark.mapred
 
org.apache.spark.mapreduce - package org.apache.spark.mapreduce
 
org.apache.spark.metrics - package org.apache.spark.metrics
 
org.apache.spark.metrics.sink - package org.apache.spark.metrics.sink
 
org.apache.spark.metrics.source - package org.apache.spark.metrics.source
 
org.apache.spark.ml - package org.apache.spark.ml
Spark ML is an ALPHA component that adds a new set of machine learning APIs to let users quickly assemble and configure practical machine learning pipelines.
org.apache.spark.ml.classification - package org.apache.spark.ml.classification
 
org.apache.spark.ml.evaluation - package org.apache.spark.ml.evaluation
 
org.apache.spark.ml.feature - package org.apache.spark.ml.feature
 
org.apache.spark.ml.param - package org.apache.spark.ml.param
 
org.apache.spark.ml.tuning - package org.apache.spark.ml.tuning
 
org.apache.spark.mllib.classification - package org.apache.spark.mllib.classification
 
org.apache.spark.mllib.clustering - package org.apache.spark.mllib.clustering
 
org.apache.spark.mllib.evaluation - package org.apache.spark.mllib.evaluation
 
org.apache.spark.mllib.evaluation.binary - package org.apache.spark.mllib.evaluation.binary
 
org.apache.spark.mllib.feature - package org.apache.spark.mllib.feature
 
org.apache.spark.mllib.linalg - package org.apache.spark.mllib.linalg
 
org.apache.spark.mllib.linalg.distributed - package org.apache.spark.mllib.linalg.distributed
 
org.apache.spark.mllib.optimization - package org.apache.spark.mllib.optimization
 
org.apache.spark.mllib.random - package org.apache.spark.mllib.random
 
org.apache.spark.mllib.rdd - package org.apache.spark.mllib.rdd
 
org.apache.spark.mllib.recommendation - package org.apache.spark.mllib.recommendation
 
org.apache.spark.mllib.regression - package org.apache.spark.mllib.regression
 
org.apache.spark.mllib.stat - package org.apache.spark.mllib.stat
 
org.apache.spark.mllib.stat.correlation - package org.apache.spark.mllib.stat.correlation
 
org.apache.spark.mllib.stat.test - package org.apache.spark.mllib.stat.test
 
org.apache.spark.mllib.tree - package org.apache.spark.mllib.tree
 
org.apache.spark.mllib.tree.configuration - package org.apache.spark.mllib.tree.configuration
 
org.apache.spark.mllib.tree.impl - package org.apache.spark.mllib.tree.impl
 
org.apache.spark.mllib.tree.impurity - package org.apache.spark.mllib.tree.impurity
 
org.apache.spark.mllib.tree.loss - package org.apache.spark.mllib.tree.loss
 
org.apache.spark.mllib.tree.model - package org.apache.spark.mllib.tree.model
 
org.apache.spark.mllib.util - package org.apache.spark.mllib.util
 
org.apache.spark.partial - package org.apache.spark.partial
 
org.apache.spark.rdd - package org.apache.spark.rdd
Provides implementations of various RDDs.
org.apache.spark.scheduler - package org.apache.spark.scheduler
Spark's DAG scheduler.
org.apache.spark.scheduler.cluster - package org.apache.spark.scheduler.cluster
 
org.apache.spark.scheduler.cluster.mesos - package org.apache.spark.scheduler.cluster.mesos
 
org.apache.spark.scheduler.local - package org.apache.spark.scheduler.local
 
org.apache.spark.serializer - package org.apache.spark.serializer
Pluggable serializers for RDD and shuffle data.
org.apache.spark.sql - package org.apache.spark.sql
 
org.apache.spark.sql.api.java - package org.apache.spark.sql.api.java
Allows the execution of relational queries, including those expressed in SQL using Spark.
org.apache.spark.sql.columnar - package org.apache.spark.sql.columnar
 
org.apache.spark.sql.columnar.compression - package org.apache.spark.sql.columnar.compression
 
org.apache.spark.sql.execution - package org.apache.spark.sql.execution
 
org.apache.spark.sql.execution.joins - package org.apache.spark.sql.execution.joins
 
org.apache.spark.sql.hive - package org.apache.spark.sql.hive
 
org.apache.spark.sql.hive.api.java - package org.apache.spark.sql.hive.api.java
 
org.apache.spark.sql.hive.execution - package org.apache.spark.sql.hive.execution
 
org.apache.spark.sql.hive.parquet - package org.apache.spark.sql.hive.parquet
 
org.apache.spark.sql.hive.test - package org.apache.spark.sql.hive.test
 
org.apache.spark.sql.json - package org.apache.spark.sql.json
 
org.apache.spark.sql.parquet - package org.apache.spark.sql.parquet
 
org.apache.spark.sql.sources - package org.apache.spark.sql.sources
 
org.apache.spark.sql.test - package org.apache.spark.sql.test
 
org.apache.spark.sql.types.util - package org.apache.spark.sql.types.util
 
org.apache.spark.storage - package org.apache.spark.storage
 
org.apache.spark.streaming - package org.apache.spark.streaming
 
org.apache.spark.streaming.api.java - package org.apache.spark.streaming.api.java
Java APIs for Spark Streaming.
org.apache.spark.streaming.dstream - package org.apache.spark.streaming.dstream
Various implementations of DStreams.
org.apache.spark.streaming.flume - package org.apache.spark.streaming.flume
Spark streaming receiver for Flume.
org.apache.spark.streaming.kafka - package org.apache.spark.streaming.kafka
Kafka receiver for Spark Streaming.
org.apache.spark.streaming.kinesis - package org.apache.spark.streaming.kinesis
 
org.apache.spark.streaming.mqtt - package org.apache.spark.streaming.mqtt
MQTT receiver for Spark Streaming.
org.apache.spark.streaming.rdd - package org.apache.spark.streaming.rdd
 
org.apache.spark.streaming.receiver - package org.apache.spark.streaming.receiver
 
org.apache.spark.streaming.scheduler - package org.apache.spark.streaming.scheduler
 
org.apache.spark.streaming.twitter - package org.apache.spark.streaming.twitter
Twitter feed receiver for Spark Streaming.
org.apache.spark.streaming.ui - package org.apache.spark.streaming.ui
 
org.apache.spark.streaming.util - package org.apache.spark.streaming.util
 
org.apache.spark.streaming.zeromq - package org.apache.spark.streaming.zeromq
ZeroMQ receiver for Spark Streaming.
org.apache.spark.ui - package org.apache.spark.ui
 
org.apache.spark.ui.env - package org.apache.spark.ui.env
 
org.apache.spark.ui.exec - package org.apache.spark.ui.exec
 
org.apache.spark.ui.jobs - package org.apache.spark.ui.jobs
 
org.apache.spark.ui.storage - package org.apache.spark.ui.storage
 
org.apache.spark.util - package org.apache.spark.util
Spark utilities.
org.apache.spark.util.io - package org.apache.spark.util.io
 
org.apache.spark.util.logging - package org.apache.spark.util.logging
 
org.apache.spark.util.random - package org.apache.spark.util.random
Utilities for random number generation.
originals() - Static method in class org.apache.spark.Accumulators
 
originalType() - Method in class org.apache.spark.sql.parquet.ParquetTypeInfo
 
other() - Method in class org.apache.spark.scheduler.RuntimePercentage
 
otherCopyArgs() - Method in class org.apache.spark.sql.execution.ExplainCommand
 
otherCopyArgs() - Method in class org.apache.spark.sql.execution.SetCommand
 
otherCopyArgs() - Method in class org.apache.spark.sql.hive.execution.DescribeHiveTableCommand
 
otherCopyArgs() - Method in class org.apache.spark.sql.hive.execution.InsertIntoHiveTable
 
otherCopyArgs() - Method in class org.apache.spark.sql.hive.execution.NativeCommand
 
otherCopyArgs() - Method in class org.apache.spark.sql.hive.execution.ScriptTransformation
 
otherInfo() - Method in class org.apache.spark.streaming.receiver.Statistics
 
otherVertexAttr(long) - Method in class org.apache.spark.graphx.EdgeTriplet
Given one vertex in the edge, return the other vertex.
otherVertexId(long) - Method in class org.apache.spark.graphx.Edge
Given one vertex in the edge, return the other vertex.
Out() - Static method in class org.apache.spark.graphx.EdgeDirection
Edges originating from a vertex.
outDegrees() - Method in class org.apache.spark.graphx.GraphOps
The out-degree of each vertex in the graph.
outer() - Method in class org.apache.spark.sql.execution.Generate
 
outerJoinVertices(RDD<Tuple2<Object, U>>, Function3<Object, VD, Option<U>, VD2>, ClassTag<U>, ClassTag<VD2>, Predef.$eq$colon$eq<VD, VD2>) - Method in class org.apache.spark.graphx.Graph
Joins the vertices with entries in the table RDD and merges the results using mapFunc.
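The typical use is attaching per-vertex data from a side table. A sketch assuming graph: Graph[VD, ED] is in scope, defaulting to 0 for vertices absent from the degree table:

    // Replace each vertex attribute with its out-degree (0 if missing).
    val degrees = graph.outDegrees
    val withDeg = graph.outerJoinVertices(degrees) { (vid, oldAttr, degOpt) =>
      degOpt.getOrElse(0)
    }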
outerJoinVertices(RDD<Tuple2<Object, U>>, Function3<Object, VD, Option<U>, VD2>, ClassTag<U>, ClassTag<VD2>, Predef.$eq$colon$eq<VD, VD2>) - Method in class org.apache.spark.graphx.impl.GraphImpl
 
OutLinkBlock - Class in org.apache.spark.mllib.recommendation
Out-link information for a user or product block.
OutLinkBlock(int[], BitSet[]) - Constructor for class org.apache.spark.mllib.recommendation.OutLinkBlock
 
output() - Method in class org.apache.spark.serializer.KryoSerializationStream
 
output() - Method in class org.apache.spark.sql.columnar.InMemoryColumnarTableScan
 
output() - Method in class org.apache.spark.sql.columnar.InMemoryRelation
 
output() - Method in class org.apache.spark.sql.execution.Aggregate
 
output() - Method in class org.apache.spark.sql.execution.BatchPythonEvaluation
 
output() - Method in class org.apache.spark.sql.execution.CacheTableCommand
 
output() - Method in class org.apache.spark.sql.execution.DescribeCommand
 
output() - Method in class org.apache.spark.sql.execution.Distinct
 
output() - Method in class org.apache.spark.sql.execution.EvaluatePython
 
output() - Method in class org.apache.spark.sql.execution.Except
 
output() - Method in class org.apache.spark.sql.execution.Exchange
 
output() - Method in class org.apache.spark.sql.execution.ExecutedCommand
 
output() - Method in class org.apache.spark.sql.execution.ExistingRdd
 
output() - Method in class org.apache.spark.sql.execution.ExplainCommand
 
output() - Method in class org.apache.spark.sql.execution.ExternalSort
 
output() - Method in class org.apache.spark.sql.execution.Filter
 
output() - Method in class org.apache.spark.sql.execution.Generate
 
output() - Method in class org.apache.spark.sql.execution.GeneratedAggregate
 
output() - Method in class org.apache.spark.sql.execution.Intersect
 
output() - Method in class org.apache.spark.sql.execution.joins.BroadcastNestedLoopJoin
 
output() - Method in class org.apache.spark.sql.execution.joins.CartesianProduct
 
output() - Method in interface org.apache.spark.sql.execution.joins.HashJoin
 
output() - Method in class org.apache.spark.sql.execution.joins.HashOuterJoin
 
output() - Method in class org.apache.spark.sql.execution.joins.LeftSemiJoinBNL
 
output() - Method in class org.apache.spark.sql.execution.joins.LeftSemiJoinHash
 
output() - Method in class org.apache.spark.sql.execution.Limit
 
output() - Method in class org.apache.spark.sql.execution.LogicalRDD
 
output() - Method in class org.apache.spark.sql.execution.OutputFaker
 
output() - Method in class org.apache.spark.sql.execution.PhysicalRDD
 
output() - Method in class org.apache.spark.sql.execution.Project
 
output() - Method in interface org.apache.spark.sql.execution.RunnableCommand
 
output() - Method in class org.apache.spark.sql.execution.Sample
 
output() - Method in class org.apache.spark.sql.execution.SetCommand
 
output() - Method in class org.apache.spark.sql.execution.Sort
 
output() - Method in class org.apache.spark.sql.execution.SparkLogicalPlan
 
output() - Method in class org.apache.spark.sql.execution.TakeOrdered
 
output() - Method in class org.apache.spark.sql.execution.UncacheTableCommand
 
output() - Method in class org.apache.spark.sql.execution.Union
 
output() - Method in class org.apache.spark.sql.hive.execution.AddFile
 
output() - Method in class org.apache.spark.sql.hive.execution.AddJar
 
output() - Method in class org.apache.spark.sql.hive.execution.AnalyzeTable
 
output() - Method in class org.apache.spark.sql.hive.execution.CreateTableAsSelect
 
output() - Method in class org.apache.spark.sql.hive.execution.DescribeHiveTableCommand
 
output() - Method in class org.apache.spark.sql.hive.execution.DropTable
 
output() - Method in class org.apache.spark.sql.hive.execution.HiveTableScan
 
output() - Method in class org.apache.spark.sql.hive.execution.InsertIntoHiveTable
 
output() - Method in class org.apache.spark.sql.hive.execution.NativeCommand
 
output() - Method in class org.apache.spark.sql.hive.execution.ScriptTransformation
 
output() - Method in class org.apache.spark.sql.hive.InsertIntoHiveTable
 
output() - Method in class org.apache.spark.sql.hive.MetastoreRelation
 
output() - Method in class org.apache.spark.sql.parquet.InsertIntoParquetTable
 
output() - Method in class org.apache.spark.sql.parquet.ParquetRelation
Attributes
output() - Method in class org.apache.spark.sql.parquet.ParquetTableScan
 
output() - Method in class org.apache.spark.sql.sources.LogicalRelation
 
OUTPUT() - Static method in class org.apache.spark.ui.ToolTips
 
outputBytes() - Method in class org.apache.spark.ui.jobs.UIData.ExecutorSummary
 
outputBytes() - Method in class org.apache.spark.ui.jobs.UIData.StageUIData
 
outputClass() - Method in class org.apache.spark.sql.hive.execution.InsertIntoHiveTable
 
outputCol() - Method in interface org.apache.spark.ml.param.HasOutputCol
param for output column name
OutputFaker - Class in org.apache.spark.sql.execution
:: DeveloperApi :: A plan node that does nothing but lie about the output of its child.
OutputFaker(Seq<Attribute>, SparkPlan) - Constructor for class org.apache.spark.sql.execution.OutputFaker
 
outputId() - Method in class org.apache.spark.scheduler.ResultTask
 
outputLocs() - Method in class org.apache.spark.scheduler.Stage
 
outputMetricsFromJson(JsonAST.JValue) - Static method in class org.apache.spark.util.JsonProtocol
 
outputMetricsToJson(OutputMetrics) - Static method in class org.apache.spark.util.JsonProtocol
 
outputPartitioning() - Method in class org.apache.spark.sql.execution.Exchange
 
outputPartitioning() - Method in class org.apache.spark.sql.execution.joins.BroadcastHashJoin
 
outputPartitioning() - Method in class org.apache.spark.sql.execution.joins.BroadcastNestedLoopJoin
 
outputPartitioning() - Method in class org.apache.spark.sql.execution.joins.HashOuterJoin
 
outputPartitioning() - Method in class org.apache.spark.sql.execution.joins.LeftSemiJoinBNL
 
outputPartitioning() - Method in class org.apache.spark.sql.execution.joins.ShuffledHashJoin
 
outputPartitioning() - Method in class org.apache.spark.sql.execution.Limit
 
outputPartitioning() - Method in class org.apache.spark.sql.execution.SparkPlan
Specifies how data is partitioned across different nodes in the cluster.
outputPartitioning() - Method in class org.apache.spark.sql.execution.TakeOrdered
 
outputPartitioning() - Method in interface org.apache.spark.sql.execution.UnaryNode
 
outputsMerged() - Method in class org.apache.spark.partial.CountEvaluator
 
outputsMerged() - Method in class org.apache.spark.partial.GroupedCountEvaluator
 
outputsMerged() - Method in class org.apache.spark.partial.GroupedMeanEvaluator
 
outputsMerged() - Method in class org.apache.spark.partial.GroupedSumEvaluator
 
outputsMerged() - Method in class org.apache.spark.partial.MeanEvaluator
 
outputsMerged() - Method in class org.apache.spark.partial.SumEvaluator
 
OVERHEAD_FRACTION() - Static method in class org.apache.spark.scheduler.cluster.mesos.MemoryUtils
 
OVERHEAD_MINIMUM() - Static method in class org.apache.spark.scheduler.cluster.mesos.MemoryUtils
 
overwrite() - Method in class org.apache.spark.sql.hive.execution.InsertIntoHiveTable
 
overwrite() - Method in class org.apache.spark.sql.hive.InsertIntoHiveTable
 
overwrite() - Method in class org.apache.spark.sql.parquet.InsertIntoParquetTable
 

P

pageRank(double, double) - Method in class org.apache.spark.graphx.GraphOps
Run a dynamic version of PageRank returning a graph with vertex attributes containing the PageRank and edge attributes containing the normalized edge weight.
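A usage sketch with tolerance 0.001 and the conventional reset probability 0.15, assuming graph is in scope:

    val ranks = graph.pageRank(0.001, 0.15).vertices      // VertexRDD[Double]
    val top5 = ranks.sortBy(_._2, ascending = false).take(5)
    top5.foreach { case (id, rank) => println(s"$id: $rank") }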
PageRank - Class in org.apache.spark.graphx.lib
PageRank algorithm implementation.
PageRank() - Constructor for class org.apache.spark.graphx.lib.PageRank
 
pages() - Method in class org.apache.spark.ui.WebUITab
 
PairDStreamFunctions<K,V> - Class in org.apache.spark.streaming.dstream
Extra functions available on DStream of (key, value) pairs through an implicit conversion.
PairDStreamFunctions(DStream<Tuple2<K, V>>, ClassTag<K>, ClassTag<V>, Ordering<K>) - Constructor for class org.apache.spark.streaming.dstream.PairDStreamFunctions
 
PairFlatMapFunction<T,K,V> - Interface in org.apache.spark.api.java.function
A function that returns zero or more key-value pair records from each input record.
PairFunction<T,K,V> - Interface in org.apache.spark.api.java.function
A function that returns key-value pairs (Tuple2<K, V>), and can be used to construct PairRDDs.
pairFunToScalaFun(PairFunction<A, B, C>) - Static method in class org.apache.spark.api.java.JavaPairRDD
 
PairRDDFunctions<K,V> - Class in org.apache.spark.rdd
Extra functions available on RDDs of (key, value) pairs through an implicit conversion.
PairRDDFunctions(RDD<Tuple2<K, V>>, ClassTag<K>, ClassTag<V>, Ordering<K>) - Constructor for class org.apache.spark.rdd.PairRDDFunctions
 
ParallelCollectionPartition<T> - Class in org.apache.spark.rdd
 
ParallelCollectionPartition(long, int, Seq<T>, ClassTag<T>) - Constructor for class org.apache.spark.rdd.ParallelCollectionPartition
 
ParallelCollectionRDD<T> - Class in org.apache.spark.rdd
 
ParallelCollectionRDD(SparkContext, Seq<T>, int, Map<Object, Seq<String>>, ClassTag<T>) - Constructor for class org.apache.spark.rdd.ParallelCollectionRDD
 
parallelism() - Method in class org.apache.spark.streaming.flume.FlumePollingInputDStream
 
parallelize(List<T>, int) - Method in class org.apache.spark.api.java.JavaSparkContext
Distribute a local Scala collection to form an RDD.
parallelize(List<T>) - Method in class org.apache.spark.api.java.JavaSparkContext
Distribute a local Scala collection to form an RDD.
parallelize(Seq<T>, int, ClassTag<T>) - Method in class org.apache.spark.SparkContext
Distribute a local Scala collection to form an RDD.
parallelizeDoubles(List<Double>, int) - Method in class org.apache.spark.api.java.JavaSparkContext
Distribute a local Scala collection to form an RDD.
parallelizeDoubles(List<Double>) - Method in class org.apache.spark.api.java.JavaSparkContext
Distribute a local Scala collection to form an RDD.
parallelizePairs(List<Tuple2<K, V>>, int) - Method in class org.apache.spark.api.java.JavaSparkContext
Distribute a local Scala collection to form an RDD.
parallelizePairs(List<Tuple2<K, V>>) - Method in class org.apache.spark.api.java.JavaSparkContext
Distribute a local Scala collection to form an RDD.
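A minimal sketch; the second argument sets the number of partitions:

    val rdd = sc.parallelize(1 to 1000, 4)
    rdd.partitions.length // 4
    rdd.sum()             // 500500.0, via the implicit conversion to DoubleRDDFunctions
                          // (import org.apache.spark.SparkContext._)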
Param<T> - Class in org.apache.spark.ml.param
:: AlphaComponent :: A param with self-contained documentation and an optional default value.
Param(Params, String, String, Option<T>) - Constructor for class org.apache.spark.ml.param.Param
 
param() - Method in class org.apache.spark.ml.param.ParamPair
 
ParamGridBuilder - Class in org.apache.spark.ml.tuning
:: AlphaComponent :: Builder for a param grid used in grid search-based model selection.
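A grid-building sketch in the style of the spark.ml pipeline examples; hashingTF and lr stand for an already-constructed HashingTF and LogisticRegression:

    val paramGrid = new ParamGridBuilder()
      .addGrid(hashingTF.numFeatures, Array(10, 100, 1000))
      .addGrid(lr.regParam, Array(0.1, 0.01))
      .build() // Array[ParamMap]: one map per combination, 6 here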
ParamGridBuilder() - Constructor for class org.apache.spark.ml.tuning.ParamGridBuilder
 
ParamMap - Class in org.apache.spark.ml.param
:: AlphaComponent :: A param to value map.
ParamMap(Map<Param<Object>, Object>) - Constructor for class org.apache.spark.ml.param.ParamMap
 
ParamMap() - Constructor for class org.apache.spark.ml.param.ParamMap
Creates an empty param map.
paramMap() - Method in interface org.apache.spark.ml.param.Params
Internal param map.
ParamPair<T> - Class in org.apache.spark.ml.param
A param and its value.
ParamPair(Param<T>, T) - Constructor for class org.apache.spark.ml.param.ParamPair
 
Params - Interface in org.apache.spark.ml.param
:: AlphaComponent :: Trait for components that take parameters.
params() - Method in interface org.apache.spark.ml.param.Params
Returns all params.
parent() - Method in class org.apache.spark.ml.classification.LogisticRegressionModel
 
parent() - Method in class org.apache.spark.ml.feature.StandardScalerModel
 
parent() - Method in class org.apache.spark.ml.Model
The parent estimator that produced this model.
parent() - Method in class org.apache.spark.ml.param.Param
 
parent() - Method in class org.apache.spark.ml.PipelineModel
 
parent() - Method in class org.apache.spark.ml.tuning.CrossValidatorModel
 
parent() - Method in class org.apache.spark.mllib.rdd.SlidingRDD
 
parent() - Method in class org.apache.spark.scheduler.Pool
 
parent() - Method in interface org.apache.spark.scheduler.Schedulable
 
parent() - Method in class org.apache.spark.scheduler.TaskSetManager
 
parent() - Method in class org.apache.spark.sql.parquet.CatalystPrimitiveRowConverter
 
parent() - Method in class org.apache.spark.streaming.ui.StreamingTab
 
ParentClassLoader - Class in org.apache.spark.util
A class loader which makes findClass accessible to the child.
ParentClassLoader(ClassLoader) - Constructor for class org.apache.spark.util.ParentClassLoader
 
parentIndex(int) - Static method in class org.apache.spark.mllib.tree.model.Node
Get the parent index of the given node, or 0 if it is the root.
parentPartition() - Method in class org.apache.spark.rdd.UnionPartition
 
parentRddIndex() - Method in class org.apache.spark.rdd.UnionPartition
 
parentRememberDuration() - Method in class org.apache.spark.streaming.dstream.DStream
 
parentRememberDuration() - Method in class org.apache.spark.streaming.dstream.ReducedWindowedDStream
 
parentRememberDuration() - Method in class org.apache.spark.streaming.dstream.WindowedDStream
 
parents() - Method in class org.apache.spark.rdd.CoalescedRDDPartition
 
parents() - Method in class org.apache.spark.rdd.PartitionerAwareUnionRDDPartition
 
parents() - Method in class org.apache.spark.scheduler.Stage
 
parentsIndices() - Method in class org.apache.spark.rdd.CoalescedRDDPartition
 
parentSplit() - Method in class org.apache.spark.rdd.PartitionPruningRDDPartition
 
PARQUET_FILTER_DATA() - Static method in class org.apache.spark.sql.parquet.ParquetFilters
 
parquetCompressionCodec() - Method in interface org.apache.spark.sql.SQLConf
The compression codec for writing to a Parquet file.
ParquetConversion() - Method in interface org.apache.spark.sql.hive.HiveStrategies
 
parquetFile(String) - Method in class org.apache.spark.sql.api.java.JavaSQLContext
Loads a Parquet file, returning the result as a JavaSchemaRDD.
parquetFile(String) - Method in class org.apache.spark.sql.SQLContext
Loads a Parquet file, returning the result as a SchemaRDD.
parquetFilterPushDown() - Method in interface org.apache.spark.sql.SQLConf
When true, predicates will be passed to the Parquet record reader when possible.
ParquetFilters - Class in org.apache.spark.sql.parquet
 
ParquetFilters() - Constructor for class org.apache.spark.sql.parquet.ParquetFilters
 
ParquetOperations() - Method in class org.apache.spark.sql.execution.SparkStrategies
 
ParquetRelation - Class in org.apache.spark.sql.parquet
Relation that consists of data stored in a Parquet columnar format.
ParquetRelation(String, Option<Configuration>, SQLContext, Seq<Attribute>) - Constructor for class org.apache.spark.sql.parquet.ParquetRelation
 
ParquetRelation2 - Class in org.apache.spark.sql.parquet
An alternative to ParquetRelation that plugs in using the data sources API.
ParquetRelation2(String, SQLContext) - Constructor for class org.apache.spark.sql.parquet.ParquetRelation2
 
parquetSchema() - Method in class org.apache.spark.sql.parquet.ParquetRelation
Schema derived from the Parquet file.
ParquetTableScan - Class in org.apache.spark.sql.parquet
:: DeveloperApi :: Parquet table scan operator.
ParquetTableScan(Seq<Attribute>, ParquetRelation, Seq<Expression>) - Constructor for class org.apache.spark.sql.parquet.ParquetTableScan
 
ParquetTestData - Class in org.apache.spark.sql.parquet
 
ParquetTestData() - Constructor for class org.apache.spark.sql.parquet.ParquetTestData
 
ParquetTypeInfo - Class in org.apache.spark.sql.parquet
A class representing Parquet info fields we care about, for passing back to Parquet
ParquetTypeInfo(PrimitiveType.PrimitiveTypeName, Option<OriginalType>, Option<DecimalMetadata>, Option<Object>) - Constructor for class org.apache.spark.sql.parquet.ParquetTypeInfo
 
ParquetTypesConverter - Class in org.apache.spark.sql.parquet
 
ParquetTypesConverter() - Constructor for class org.apache.spark.sql.parquet.ParquetTypesConverter
 
parse(String) - Static method in class org.apache.spark.mllib.linalg.Vectors
Parses a string produced by Vector#toString into a Vector.
parse(String) - Static method in class org.apache.spark.mllib.regression.LabeledPoint
Parses a string produced by LabeledPoint#toString into a LabeledPoint.
parse(String) - Static method in class org.apache.spark.mllib.util.NumericParser
Parses a string into a Double, an Array[Double], or a Seq[Any].
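These parsers invert the corresponding toString formats. A round-trip sketch:

    import org.apache.spark.mllib.linalg.Vectors

    val v = Vectors.dense(1.0, 0.0, 3.0)
    val s = v.toString          // "[1.0,0.0,3.0]"
    val back = Vectors.parse(s) // equal to v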
parseCompressionCodec(String) - Static method in class org.apache.spark.scheduler.EventLoggingListener
 
parseDataType(String) - Method in class org.apache.spark.sql.SQLContext
Parses the data type in our internal string representation.
parseDdl(String) - Static method in class org.apache.spark.sql.hive.HiveQl
 
parseHostPort(String) - Static method in class org.apache.spark.util.Utils
 
parseLoggingInfo(Path, FileSystem) - Static method in class org.apache.spark.scheduler.EventLoggingListener
Parse the event logging information associated with the logs in the given directory.
parseLoggingInfo(String, FileSystem) - Static method in class org.apache.spark.scheduler.EventLoggingListener
Parse the event logging information associated with the logs in the given directory.
parseNumeric(Object) - Static method in class org.apache.spark.mllib.linalg.Vectors
 
parseSparkVersion(String) - Static method in class org.apache.spark.scheduler.EventLoggingListener
 
parseSql(String) - Static method in class org.apache.spark.sql.hive.HiveQl
Returns a LogicalPlan for a given HiveQL string.
parseStream(PortableDataStream) - Method in class org.apache.spark.input.StreamBasedRecordReader
Parse the stream (and close it afterwards) and return the value as type T.
parseStream(PortableDataStream) - Method in class org.apache.spark.input.StreamRecordReader
 
partial() - Method in class org.apache.spark.sql.execution.Aggregate
 
partial() - Method in class org.apache.spark.sql.execution.Distinct
 
partial() - Method in class org.apache.spark.sql.execution.GeneratedAggregate
 
PartialResult<R> - Class in org.apache.spark.partial
 
PartialResult(R, boolean) - Constructor for class org.apache.spark.partial.PartialResult
 
Partition - Interface in org.apache.spark
An identifier for a partition in an RDD.
partition() - Method in class org.apache.spark.sql.hive.execution.InsertIntoHiveTable
 
partition() - Method in class org.apache.spark.sql.hive.InsertIntoHiveTable
 
Partition - Class in org.apache.spark.sql.parquet
 
Partition(Map<String, Object>, Seq<FileStatus>) - Constructor for class org.apache.spark.sql.parquet.Partition
 
partitionBy(Partitioner) - Method in class org.apache.spark.api.java.JavaPairRDD
Return a copy of the RDD partitioned using the specified partitioner.
partitionBy(PartitionStrategy) - Method in class org.apache.spark.graphx.Graph
Repartitions the edges in the graph according to partitionStrategy.
partitionBy(PartitionStrategy, int) - Method in class org.apache.spark.graphx.Graph
Repartitions the edges in the graph according to partitionStrategy.
partitionBy(PartitionStrategy) - Method in class org.apache.spark.graphx.impl.GraphImpl
 
partitionBy(PartitionStrategy, int) - Method in class org.apache.spark.graphx.impl.GraphImpl
 
partitionBy(Partitioner) - Method in class org.apache.spark.rdd.PairRDDFunctions
Return a copy of the RDD partitioned using the specified partitioner.
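Example (a minimal sketch; the local master, app name, and sample data are illustrative):

    import org.apache.spark.SparkContext
    import org.apache.spark.SparkContext._ // pair-RDD implicits
    import org.apache.spark.HashPartitioner

    val sc = new SparkContext("local[2]", "partitionBy-example")
    val pairs = sc.parallelize(Seq(("a", 1), ("b", 2), ("a", 3)))
    // Hash-partition by key; RDDs sharing a partitioner can be joined without a shuffle.
    val partitioned = pairs.partitionBy(new HashPartitioner(4))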
PartitionCoalescer - Class in org.apache.spark.rdd
Coalesce the partitions of a parent RDD (prev) into fewer partitions, so that each partition of this RDD computes one or more of the parent ones.
PartitionCoalescer(int, RDD<?>, double) - Constructor for class org.apache.spark.rdd.PartitionCoalescer
 
PartitionCoalescer.LocationIterator - Class in org.apache.spark.rdd
 
PartitionCoalescer.LocationIterator(RDD<?>) - Constructor for class org.apache.spark.rdd.PartitionCoalescer.LocationIterator
 
partitioner() - Method in class org.apache.spark.graphx.impl.EdgeRDDImpl
If partitionsRDD already has a partitioner, use it.
partitioner() - Method in class org.apache.spark.graphx.impl.VertexRDDImpl
 
Partitioner - Class in org.apache.spark
An object that defines how the elements in a key-value pair RDD are partitioned by key.
Partitioner() - Constructor for class org.apache.spark.Partitioner
 
partitioner() - Method in class org.apache.spark.rdd.CoGroupedRDD
 
partitioner() - Method in class org.apache.spark.rdd.FilteredRDD
 
partitioner() - Method in class org.apache.spark.rdd.FlatMappedValuesRDD
 
partitioner() - Method in class org.apache.spark.rdd.HadoopRDD.HadoopMapPartitionsWithSplitRDD
 
partitioner() - Method in class org.apache.spark.rdd.MapPartitionsRDD
 
partitioner() - Method in class org.apache.spark.rdd.MappedValuesRDD
 
partitioner() - Method in class org.apache.spark.rdd.NewHadoopRDD.NewHadoopMapPartitionsWithSplitRDD
 
partitioner() - Method in class org.apache.spark.rdd.PartitionerAwareUnionRDD
 
partitioner() - Method in class org.apache.spark.rdd.PartitionwiseSampledRDD
 
partitioner() - Method in class org.apache.spark.rdd.RDD
Optionally overridden by subclasses to specify how they are partitioned.
partitioner() - Method in class org.apache.spark.rdd.ShuffledRDD
 
partitioner() - Method in class org.apache.spark.rdd.SubtractedRDD
 
partitioner() - Method in class org.apache.spark.rdd.ZippedPartitionsBaseRDD
 
partitioner() - Method in class org.apache.spark.ShuffleDependency
 
PartitionerAwareUnionRDD<T> - Class in org.apache.spark.rdd
Class representing an RDD that can take multiple RDDs partitioned by the same partitioner and unify them into a single RDD while preserving the partitioner.
PartitionerAwareUnionRDD(SparkContext, Seq<RDD<T>>, ClassTag<T>) - Constructor for class org.apache.spark.rdd.PartitionerAwareUnionRDD
 
PartitionerAwareUnionRDDPartition - Class in org.apache.spark.rdd
Class representing partitions of PartitionerAwareUnionRDD, which maintains the list of corresponding partitions of parent RDDs.
PartitionerAwareUnionRDDPartition(Seq<RDD<?>>, int) - Constructor for class org.apache.spark.rdd.PartitionerAwareUnionRDDPartition
 
partitionFilters() - Method in class org.apache.spark.sql.columnar.InMemoryColumnarTableScan
 
PartitionGroup - Class in org.apache.spark.rdd
 
PartitionGroup(Option<String>) - Constructor for class org.apache.spark.rdd.PartitionGroup
 
partitionId() - Method in class org.apache.spark.scheduler.Task
 
partitionId() - Method in class org.apache.spark.TaskContext
 
partitionId() - Method in class org.apache.spark.TaskContextImpl
 
partitioningAttributes() - Method in class org.apache.spark.sql.parquet.ParquetRelation
 
partitionKeys() - Method in class org.apache.spark.sql.hive.MetastoreRelation
 
partitionPruningPred() - Method in class org.apache.spark.sql.hive.execution.HiveTableScan
 
PartitionPruningRDD<T> - Class in org.apache.spark.rdd
:: DeveloperApi :: An RDD used to prune RDD partitions so we can avoid launching tasks on all partitions.
PartitionPruningRDD(RDD<T>, Function1<Object, Object>, ClassTag<T>) - Constructor for class org.apache.spark.rdd.PartitionPruningRDD
 
PartitionPruningRDDPartition - Class in org.apache.spark.rdd
 
PartitionPruningRDDPartition(int, Partition) - Constructor for class org.apache.spark.rdd.PartitionPruningRDDPartition
 
partitions() - Method in interface org.apache.spark.api.java.JavaRDDLike
Set of partitions in this RDD.
partitions() - Method in class org.apache.spark.rdd.PruneDependency
 
partitions() - Method in class org.apache.spark.rdd.RDD
Get the array of partitions of this RDD, taking into account whether the RDD is checkpointed or not.
partitions() - Method in class org.apache.spark.rdd.ZippedPartitionsPartition
 
partitions() - Method in class org.apache.spark.scheduler.ActiveJob
 
partitions() - Method in class org.apache.spark.scheduler.JobSubmitted
 
partitions() - Method in class org.apache.spark.sql.hive.MetastoreRelation
 
partitionSize(int) - Method in class org.apache.spark.graphx.impl.RoutingTablePartition
Returns the number of vertices that will be sent to the specified edge partition.
partitionsRDD() - Method in class org.apache.spark.graphx.EdgeRDD
 
partitionsRDD() - Method in class org.apache.spark.graphx.impl.EdgeRDDImpl
 
partitionsRDD() - Method in class org.apache.spark.graphx.impl.VertexRDDImpl
 
partitionsRDD() - Method in class org.apache.spark.graphx.VertexRDD
 
partitionStatistics() - Method in class org.apache.spark.sql.columnar.InMemoryRelation
 
PartitionStatistics - Class in org.apache.spark.sql.columnar
 
PartitionStatistics(Seq<Attribute>) - Constructor for class org.apache.spark.sql.columnar.PartitionStatistics
 
PartitionStrategy - Interface in org.apache.spark.graphx
Represents the way edges are assigned to edge partitions based on their source and destination vertex IDs.
PartitionStrategy.CanonicalRandomVertexCut$ - Class in org.apache.spark.graphx
Assigns edges to partitions by hashing the source and destination vertex IDs in a canonical direction, resulting in a random vertex cut that colocates all edges between two vertices, regardless of direction.
PartitionStrategy.CanonicalRandomVertexCut$() - Constructor for class org.apache.spark.graphx.PartitionStrategy.CanonicalRandomVertexCut$
 
PartitionStrategy.EdgePartition1D$ - Class in org.apache.spark.graphx
Assigns edges to partitions using only the source vertex ID, colocating edges with the same source.
PartitionStrategy.EdgePartition1D$() - Constructor for class org.apache.spark.graphx.PartitionStrategy.EdgePartition1D$
 
PartitionStrategy.EdgePartition2D$ - Class in org.apache.spark.graphx
Assigns edges to partitions using a 2D partitioning of the sparse edge adjacency matrix, guaranteeing a 2 * sqrt(numParts) bound on vertex replication.
PartitionStrategy.EdgePartition2D$() - Constructor for class org.apache.spark.graphx.PartitionStrategy.EdgePartition2D$
 
PartitionStrategy.RandomVertexCut$ - Class in org.apache.spark.graphx
Assigns edges to partitions by hashing the source and destination vertex IDs, resulting in a random vertex cut that colocates all same-direction edges between two vertices.
PartitionStrategy.RandomVertexCut$() - Constructor for class org.apache.spark.graphx.PartitionStrategy.RandomVertexCut$
 
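Example (a minimal sketch of choosing a partition strategy; the tiny graph is illustrative):

    import org.apache.spark.SparkContext
    import org.apache.spark.graphx.{Edge, Graph, PartitionStrategy}

    val sc = new SparkContext("local[2]", "partitionStrategy-example")
    val graph = Graph.fromEdges(
      sc.parallelize(Seq(Edge(1L, 2L, 1), Edge(2L, 3L, 1), Edge(3L, 1L, 1))), 0)
    // EdgePartition2D bounds vertex replication by 2 * sqrt(numParts).
    val repartitioned = graph.partitionBy(PartitionStrategy.EdgePartition2D)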
partitionToOps(VertexPartition<VD>, ClassTag<VD>) - Static method in class org.apache.spark.graphx.impl.VertexPartition
Implicit conversion to allow invoking VertexPartitionBase operations directly on a VertexPartition.
partitionValues() - Method in class org.apache.spark.rdd.ZippedPartitionsPartition
 
partitionValues() - Method in class org.apache.spark.sql.parquet.Partition
 
PartitionwiseSampledRDD<T,U> - Class in org.apache.spark.rdd
An RDD sampled from its parent RDD partition-wise.
PartitionwiseSampledRDD(RDD<T>, RandomSampler<T, U>, boolean, long, ClassTag<T>, ClassTag<U>) - Constructor for class org.apache.spark.rdd.PartitionwiseSampledRDD
 
PartitionwiseSampledRDDPartition - Class in org.apache.spark.rdd
 
PartitionwiseSampledRDDPartition(Partition, long) - Constructor for class org.apache.spark.rdd.PartitionwiseSampledRDDPartition
 
PassThrough - Class in org.apache.spark.sql.columnar.compression
 
PassThrough() - Constructor for class org.apache.spark.sql.columnar.compression.PassThrough
 
PassThrough.Decoder<T extends org.apache.spark.sql.catalyst.types.NativeType> - Class in org.apache.spark.sql.columnar.compression
 
PassThrough.Decoder(ByteBuffer, NativeColumnType<T>) - Constructor for class org.apache.spark.sql.columnar.compression.PassThrough.Decoder
 
PassThrough.Encoder<T extends org.apache.spark.sql.catalyst.types.NativeType> - Class in org.apache.spark.sql.columnar.compression
 
PassThrough.Encoder(NativeColumnType<T>) - Constructor for class org.apache.spark.sql.columnar.compression.PassThrough.Encoder
 
path() - Method in class org.apache.spark.scheduler.InputFormatInfo
 
path() - Method in class org.apache.spark.scheduler.SplitInfo
 
path() - Method in class org.apache.spark.sql.hive.AddJar
 
path() - Method in class org.apache.spark.sql.hive.execution.AddFile
 
path() - Method in class org.apache.spark.sql.hive.execution.AddJar
 
path() - Method in class org.apache.spark.sql.parquet.ParquetRelation
 
path() - Method in class org.apache.spark.sql.parquet.ParquetRelation2
 
path() - Method in class org.apache.spark.streaming.util.WriteAheadLogFileSegment
 
path() - Method in class org.apache.spark.streaming.util.WriteAheadLogManager.LogInfo
 
PEARSON() - Static method in class org.apache.spark.mllib.stat.test.ChiSqTest
 
PearsonCorrelation - Class in org.apache.spark.mllib.stat.correlation
Compute Pearson correlation for two RDDs of the type RDD[Double] or the correlation matrix for an RDD of the type RDD[Vector].
PearsonCorrelation() - Constructor for class org.apache.spark.mllib.stat.correlation.PearsonCorrelation
 
pendingTasks() - Method in class org.apache.spark.scheduler.Stage
 
pendingTasksWithNoPrefs() - Method in class org.apache.spark.scheduler.TaskSetManager
 
pendingTimes() - Method in class org.apache.spark.streaming.Checkpoint
 
percentiles() - Static method in class org.apache.spark.scheduler.StatsReportListener
 
percentilesHeader() - Static method in class org.apache.spark.scheduler.StatsReportListener
 
persist(StorageLevel) - Method in class org.apache.spark.api.java.JavaDoubleRDD
Set this RDD's storage level to persist its values across operations after the first time it is computed.
persist(StorageLevel) - Method in class org.apache.spark.api.java.JavaPairRDD
Set this RDD's storage level to persist its values across operations after the first time it is computed.
persist(StorageLevel) - Method in class org.apache.spark.api.java.JavaRDD
Set this RDD's storage level to persist its values across operations after the first time it is computed.
persist(StorageLevel) - Method in class org.apache.spark.graphx.Graph
Caches the vertices and edges associated with this graph at the specified storage level, ignoring any target storage levels previously set.
persist(StorageLevel) - Method in class org.apache.spark.graphx.impl.EdgeRDDImpl
Persists the edge partitions at the specified storage level, ignoring any existing target storage level.
persist(StorageLevel) - Method in class org.apache.spark.graphx.impl.GraphImpl
 
persist(StorageLevel) - Method in class org.apache.spark.graphx.impl.VertexRDDImpl
Persists the vertex partitions at the specified storage level, ignoring any existing target storage level.
persist(StorageLevel) - Method in class org.apache.spark.rdd.RDD
Set this RDD's storage level to persist its values across operations after the first time it is computed.
persist() - Method in class org.apache.spark.rdd.RDD
Persist this RDD with the default storage level (`MEMORY_ONLY`).
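Example (a minimal sketch; the storage level choice and data are illustrative):

    import org.apache.spark.SparkContext
    import org.apache.spark.storage.StorageLevel

    val sc = new SparkContext("local[2]", "persist-example")
    val data = sc.parallelize(1 to 1000000).persist(StorageLevel.MEMORY_AND_DISK)
    data.count() // the first action materializes and caches the partitions
    data.count() // later actions reuse the cached data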
persist() - Method in class org.apache.spark.sql.api.java.JavaSchemaRDD
Persist this RDD with the default storage level (`MEMORY_ONLY`).
persist(StorageLevel) - Method in class org.apache.spark.sql.api.java.JavaSchemaRDD
Set this RDD's storage level to persist its values across operations after the first time it is computed.
persist(StorageLevel) - Method in class org.apache.spark.sql.SchemaRDD
 
persist() - Method in class org.apache.spark.streaming.api.java.JavaDStream
Persist RDDs of this DStream with the default storage level (MEMORY_ONLY_SER)
persist(StorageLevel) - Method in class org.apache.spark.streaming.api.java.JavaDStream
Persist the RDDs of this DStream with the given storage level
persist() - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
Persist RDDs of this DStream with the default storage level (MEMORY_ONLY_SER)
persist(StorageLevel) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
Persist the RDDs of this DStream with the given storage level
persist(StorageLevel) - Method in class org.apache.spark.streaming.dstream.DStream
Persist the RDDs of this DStream with the given storage level
persist() - Method in class org.apache.spark.streaming.dstream.DStream
Persist RDDs of this DStream with the default storage level (MEMORY_ONLY_SER)
persist(StorageLevel) - Method in class org.apache.spark.streaming.dstream.ReducedWindowedDStream
 
persist(StorageLevel) - Method in class org.apache.spark.streaming.dstream.WindowedDStream
 
persistentRdds() - Method in class org.apache.spark.SparkContext
 
persistRDD(RDD<?>) - Method in class org.apache.spark.SparkContext
Register an RDD to be persisted in memory and/or disk storage
PhysicalRDD - Class in org.apache.spark.sql.execution
 
PhysicalRDD(Seq<Attribute>, RDD<Row>) - Constructor for class org.apache.spark.sql.execution.PhysicalRDD
 
pi() - Method in class org.apache.spark.mllib.classification.NaiveBayesModel
 
pickBin(Partition) - Method in class org.apache.spark.rdd.PartitionCoalescer
Takes a parent RDD partition and decides which partition group to put it in. Takes locality into account, but also uses the power-of-two-choices technique to load balance. It strikes a balance between the two using the balanceSlack variable.
pickRandomVertex() - Method in class org.apache.spark.graphx.GraphOps
Picks a random vertex from the graph and returns its ID.
pipe(String) - Method in interface org.apache.spark.api.java.JavaRDDLike
Return an RDD created by piping elements to a forked external process.
pipe(List<String>) - Method in interface org.apache.spark.api.java.JavaRDDLike
Return an RDD created by piping elements to a forked external process.
pipe(List<String>, Map<String, String>) - Method in interface org.apache.spark.api.java.JavaRDDLike
Return an RDD created by piping elements to a forked external process.
pipe(String) - Method in class org.apache.spark.rdd.RDD
Return an RDD created by piping elements to a forked external process.
pipe(String, Map<String, String>) - Method in class org.apache.spark.rdd.RDD
Return an RDD created by piping elements to a forked external process.
pipe(Seq<String>, Map<String, String>, Function1<Function1<String, BoxedUnit>, BoxedUnit>, Function2<T, Function1<String, BoxedUnit>, BoxedUnit>, boolean) - Method in class org.apache.spark.rdd.RDD
Return an RDD created by piping elements to a forked external process.
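Example (a minimal sketch; assumes the external command tr is available on each worker):

    import org.apache.spark.SparkContext

    val sc = new SparkContext("local[2]", "pipe-example")
    // Elements are written to the process's stdin one per line; each line of
    // output becomes an element of the resulting RDD.
    val upper = sc.parallelize(Seq("hello", "world")).pipe("tr a-z A-Z")
    upper.collect() // Array("HELLO", "WORLD")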
PipedRDD<T> - Class in org.apache.spark.rdd
An RDD that pipes the contents of each parent partition through an external command (printing them one per line) and returns the output as a collection of strings.
PipedRDD(RDD<T>, Seq<String>, Map<String, String>, Function1<Function1<String, BoxedUnit>, BoxedUnit>, Function2<T, Function1<String, BoxedUnit>, BoxedUnit>, boolean, ClassTag<T>) - Constructor for class org.apache.spark.rdd.PipedRDD
 
PipedRDD(RDD<T>, String, Map<String, String>, Function1<Function1<String, BoxedUnit>, BoxedUnit>, Function2<T, Function1<String, BoxedUnit>, BoxedUnit>, boolean, ClassTag<T>) - Constructor for class org.apache.spark.rdd.PipedRDD
 
PipedRDD.NotEqualsFileNameFilter - Class in org.apache.spark.rdd
A FilenameFilter that accepts anything that isn't equal to the name passed in.
PipedRDD.NotEqualsFileNameFilter(String) - Constructor for class org.apache.spark.rdd.PipedRDD.NotEqualsFileNameFilter
 
Pipeline - Class in org.apache.spark.ml
:: AlphaComponent :: A simple pipeline, which acts as an estimator.
Pipeline() - Constructor for class org.apache.spark.ml.Pipeline
 
PipelineModel - Class in org.apache.spark.ml
:: AlphaComponent :: Represents a compiled pipeline.
PipelineModel(Pipeline, ParamMap, Transformer[]) - Constructor for class org.apache.spark.ml.PipelineModel
 
PipelineStage - Class in org.apache.spark.ml
:: AlphaComponent :: A stage in a pipeline, either an Estimator or a Transformer.
PipelineStage() - Constructor for class org.apache.spark.ml.PipelineStage
 
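Example (a minimal sketch of assembling a pipeline; the column names and parameter values are illustrative):

    import org.apache.spark.ml.Pipeline
    import org.apache.spark.ml.classification.LogisticRegression
    import org.apache.spark.ml.feature.{HashingTF, Tokenizer}

    val tokenizer = new Tokenizer().setInputCol("text").setOutputCol("words")
    val hashingTF = new HashingTF().setInputCol("words").setOutputCol("features")
    val lr = new LogisticRegression().setMaxIter(10)
    // fit() runs the stages in order on a training dataset and returns a PipelineModel.
    val pipeline = new Pipeline().setStages(Array(tokenizer, hashingTF, lr))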
plan() - Method in class org.apache.spark.sql.CachedData
 
plan() - Method in class org.apache.spark.sql.execution.CacheTableCommand
 
PluggableInputDStream<T> - Class in org.apache.spark.streaming.dstream
 
PluggableInputDStream(StreamingContext, Receiver<T>, ClassTag<T>) - Constructor for class org.apache.spark.streaming.dstream.PluggableInputDStream
 
plus(Duration) - Method in class org.apache.spark.streaming.Duration
 
plus(Duration) - Method in class org.apache.spark.streaming.Time
 
plusDot(Vector, Vector) - Method in class org.apache.spark.util.Vector
Return (this + plus) dot other, without creating any intermediate storage.
point() - Method in class org.apache.spark.mllib.feature.VocabWord
 
pointCost(TraversableOnce<VectorWithNorm>, VectorWithNorm) - Static method in class org.apache.spark.mllib.clustering.KMeans
Returns the K-means cost of a given point against the given cluster centers.
POINTS() - Static method in class org.apache.spark.mllib.clustering.StreamingKMeans
 
PoissonBounds - Class in org.apache.spark.util.random
Utility functions that help us determine bounds on adjusted sampling rate to guarantee exact sample sizes with high confidence when sampling with replacement.
PoissonBounds() - Constructor for class org.apache.spark.util.random.PoissonBounds
 
PoissonGenerator - Class in org.apache.spark.mllib.random
:: DeveloperApi :: Generates i.i.d. samples from the Poisson distribution with the given mean.
PoissonGenerator(double) - Constructor for class org.apache.spark.mllib.random.PoissonGenerator
 
poissonJavaRDD(JavaSparkContext, double, long, int, long) - Static method in class org.apache.spark.mllib.random.RandomRDDs
Java-friendly version of RandomRDDs.poissonRDD.
poissonJavaRDD(JavaSparkContext, double, long, int) - Static method in class org.apache.spark.mllib.random.RandomRDDs
Java-friendly version of RandomRDDs.poissonRDD.
poissonJavaRDD(JavaSparkContext, double, long) - Static method in class org.apache.spark.mllib.random.RandomRDDs
Java-friendly version of RandomRDDs.poissonRDD.
poissonJavaVectorRDD(JavaSparkContext, double, long, int, int, long) - Static method in class org.apache.spark.mllib.random.RandomRDDs
Java-friendly version of RandomRDDs.poissonVectorRDD.
poissonJavaVectorRDD(JavaSparkContext, double, long, int, int) - Static method in class org.apache.spark.mllib.random.RandomRDDs
Java-friendly version of RandomRDDs.poissonVectorRDD.
poissonJavaVectorRDD(JavaSparkContext, double, long, int) - Static method in class org.apache.spark.mllib.random.RandomRDDs
Java-friendly version of RandomRDDs.poissonVectorRDD.
poissonRDD(SparkContext, double, long, int, long) - Static method in class org.apache.spark.mllib.random.RandomRDDs
Generates an RDD comprised of i.i.d. samples from the Poisson distribution with the input mean.
PoissonSampler<T> - Class in org.apache.spark.util.random
:: DeveloperApi :: A sampler for sampling with replacement, based on values drawn from Poisson distribution.
PoissonSampler(double, ClassTag<T>) - Constructor for class org.apache.spark.util.random.PoissonSampler
 
poissonVectorRDD(SparkContext, double, long, int, int, long) - Static method in class org.apache.spark.mllib.random.RandomRDDs
Generates an RDD[Vector] with vectors containing i.i.d. samples drawn from the Poisson distribution with the input mean.
POLL_TIMEOUT() - Static method in class org.apache.spark.scheduler.DAGScheduler
 
pollDir() - Method in class org.apache.spark.metrics.sink.CsvSink
 
pollPeriod() - Method in class org.apache.spark.metrics.sink.ConsoleSink
 
pollPeriod() - Method in class org.apache.spark.metrics.sink.CsvSink
 
pollPeriod() - Method in class org.apache.spark.metrics.sink.GraphiteSink
 
pollUnit() - Method in class org.apache.spark.metrics.sink.ConsoleSink
 
pollUnit() - Method in class org.apache.spark.metrics.sink.CsvSink
 
pollUnit() - Method in class org.apache.spark.metrics.sink.GraphiteSink
 
Pool - Class in org.apache.spark.scheduler
A Schedulable entity that represents a collection of Pools or TaskSetManagers.
Pool(String, Enumeration.Value, int, int) - Constructor for class org.apache.spark.scheduler.Pool
 
POOL_NAME_PROPERTY() - Method in class org.apache.spark.scheduler.FairSchedulableBuilder
 
poolName() - Method in class org.apache.spark.scheduler.Pool
 
PoolPage - Class in org.apache.spark.ui.jobs
Page showing specific pool details
PoolPage(StagesTab) - Constructor for class org.apache.spark.ui.jobs.PoolPage
 
POOLS_PROPERTY() - Method in class org.apache.spark.scheduler.FairSchedulableBuilder
 
PoolTable - Class in org.apache.spark.ui.jobs
Table showing list of pools
PoolTable(Seq<Schedulable>, StagesTab) - Constructor for class org.apache.spark.ui.jobs.PoolTable
 
poolToActiveStages() - Method in class org.apache.spark.ui.jobs.JobProgressListener
 
port() - Method in class org.apache.spark.metrics.sink.GraphiteSink
 
port() - Method in class org.apache.spark.storage.BlockManagerId
 
PortableDataStream - Class in org.apache.spark.input
A class that allows DataStreams to be serialized and moved around by not creating them until they need to be read
PortableDataStream(CombineFileSplit, TaskAttemptContext, Integer) - Constructor for class org.apache.spark.input.PortableDataStream
 
portMaxRetries(SparkConf) - Static method in class org.apache.spark.util.Utils
Maximum number of retries when binding to a port before giving up.
pos() - Method in interface org.apache.spark.sql.columnar.NullableColumnAccessor
 
pos() - Method in interface org.apache.spark.sql.columnar.NullableColumnBuilder
 
post(SparkListenerEvent) - Method in class org.apache.spark.scheduler.LiveListenerBus
 
post(StreamingListenerEvent) - Method in class org.apache.spark.streaming.scheduler.StreamingListenerBus
 
postStartHook() - Method in interface org.apache.spark.scheduler.TaskScheduler
 
postStartHook() - Method in class org.apache.spark.scheduler.TaskSchedulerImpl
 
postStop() - Method in class org.apache.spark.scheduler.DAGSchedulerEventProcessActor
 
postToAll(SparkListenerEvent) - Method in interface org.apache.spark.scheduler.SparkListenerBus
Post an event to all attached listeners.
pr() - Method in class org.apache.spark.mllib.evaluation.BinaryClassificationMetrics
Returns the precision-recall curve, which is an RDD of (recall, precision), NOT (precision, recall), with (0.0, 1.0) prepended to it.
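Example (a minimal sketch; the (score, label) pairs are invented for illustration):

    import org.apache.spark.SparkContext
    import org.apache.spark.mllib.evaluation.BinaryClassificationMetrics

    val sc = new SparkContext("local[2]", "pr-example")
    val scoreAndLabels = sc.parallelize(Seq((0.9, 1.0), (0.6, 1.0), (0.4, 0.0), (0.1, 0.0)))
    val metrics = new BinaryClassificationMetrics(scoreAndLabels)
    val prCurve = metrics.pr()       // RDD of (recall, precision) points
    val auPR = metrics.areaUnderPR() // scalar summary of the curve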
Precision - Class in org.apache.spark.mllib.evaluation.binary
Precision.
Precision() - Constructor for class org.apache.spark.mllib.evaluation.binary.Precision
 
precision(double) - Method in class org.apache.spark.mllib.evaluation.MulticlassMetrics
Returns precision for a given label (category)
precision() - Method in class org.apache.spark.mllib.evaluation.MulticlassMetrics
Returns precision
precision() - Method in class org.apache.spark.mllib.evaluation.MultilabelMetrics
Returns document-based precision averaged by the number of documents
precision(double) - Method in class org.apache.spark.mllib.evaluation.MultilabelMetrics
Returns precision for a given label (category)
precisionAt(int) - Method in class org.apache.spark.mllib.evaluation.RankingMetrics
Compute the average precision of all the queries, truncated at ranking position k.
precisionByThreshold() - Method in class org.apache.spark.mllib.evaluation.BinaryClassificationMetrics
Returns the (threshold, precision) curve.
predicates() - Method in class org.apache.spark.sql.columnar.InMemoryColumnarTableScan
 
predict(RDD<Vector>) - Method in interface org.apache.spark.mllib.classification.ClassificationModel
Predict values for the given data set using the model trained.
predict(Vector) - Method in interface org.apache.spark.mllib.classification.ClassificationModel
Predict values for a single data point using the model trained.
predict(JavaRDD<Vector>) - Method in interface org.apache.spark.mllib.classification.ClassificationModel
Predict values for examples stored in a JavaRDD.
predict(RDD<Vector>) - Method in class org.apache.spark.mllib.classification.NaiveBayesModel
 
predict(Vector) - Method in class org.apache.spark.mllib.classification.NaiveBayesModel
 
predict(Vector) - Method in class org.apache.spark.mllib.clustering.KMeansModel
Returns the cluster index that a given point belongs to.
predict(RDD<Vector>) - Method in class org.apache.spark.mllib.clustering.KMeansModel
Maps given points to their cluster indices.
predict(JavaRDD<Vector>) - Method in class org.apache.spark.mllib.clustering.KMeansModel
Maps given points to their cluster indices.
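Example (a minimal sketch; k, the iteration count, and the points are illustrative):

    import org.apache.spark.SparkContext
    import org.apache.spark.mllib.clustering.KMeans
    import org.apache.spark.mllib.linalg.Vectors

    val sc = new SparkContext("local[2]", "kmeans-example")
    val points = sc.parallelize(Seq(Vectors.dense(0.0, 0.0), Vectors.dense(0.1, 0.1),
      Vectors.dense(9.0, 9.0), Vectors.dense(9.1, 9.1)))
    val model = KMeans.train(points, 2, 10)              // k = 2, maxIterations = 10
    val cluster = model.predict(Vectors.dense(8.9, 9.0)) // index of the nearest center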
predict(int, int) - Method in class org.apache.spark.mllib.recommendation.MatrixFactorizationModel
Predict the rating of one user for one product.
predict(RDD<Tuple2<Object, Object>>) - Method in class org.apache.spark.mllib.recommendation.MatrixFactorizationModel
Predict the rating of many users for many products.
predict(JavaPairRDD<Integer, Integer>) - Method in class org.apache.spark.mllib.recommendation.MatrixFactorizationModel
Java-friendly version of MatrixFactorizationModel.predict.
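Example (a minimal sketch; the ratings, rank, and iteration count are illustrative):

    import org.apache.spark.SparkContext
    import org.apache.spark.mllib.recommendation.{ALS, Rating}

    val sc = new SparkContext("local[2]", "als-example")
    val ratings = sc.parallelize(Seq(Rating(1, 10, 4.0), Rating(1, 20, 1.0), Rating(2, 10, 5.0)))
    val model = ALS.train(ratings, 5, 10) // rank = 5, iterations = 10
    val score = model.predict(2, 20)      // predicted rating of product 20 by user 2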
predict(RDD<Vector>) - Method in class org.apache.spark.mllib.regression.GeneralizedLinearModel
Predict values for the given data set using the model trained.
predict(Vector) - Method in class org.apache.spark.mllib.regression.GeneralizedLinearModel
Predict values for a single data point using the model trained.
predict(RDD<Vector>) - Method in interface org.apache.spark.mllib.regression.RegressionModel
Predict values for the given data set using the model trained.
predict(Vector) - Method in interface org.apache.spark.mllib.regression.RegressionModel
Predict values for a single data point using the model trained.
predict(JavaRDD<Vector>) - Method in interface org.apache.spark.mllib.regression.RegressionModel
Predict values for examples stored in a JavaRDD.
predict() - Method in class org.apache.spark.mllib.tree.impurity.EntropyCalculator
Prediction which should be made based on the sufficient statistics.
predict() - Method in class org.apache.spark.mllib.tree.impurity.GiniCalculator
Prediction which should be made based on the sufficient statistics.
predict() - Method in class org.apache.spark.mllib.tree.impurity.ImpurityCalculator
Prediction which should be made based on the sufficient statistics.
predict() - Method in class org.apache.spark.mllib.tree.impurity.VarianceCalculator
Prediction which should be made based on the sufficient statistics.
predict(Vector) - Method in class org.apache.spark.mllib.tree.model.DecisionTreeModel
Predict values for a single data point using the model trained.
predict(RDD<Vector>) - Method in class org.apache.spark.mllib.tree.model.DecisionTreeModel
Predict values for the given data set using the model trained.
predict(JavaRDD<Vector>) - Method in class org.apache.spark.mllib.tree.model.DecisionTreeModel
Predict values for the given data set using the model trained.
predict() - Method in class org.apache.spark.mllib.tree.model.Node
 
predict(Vector) - Method in class org.apache.spark.mllib.tree.model.Node
Predict the value if the node is not a leaf.
Predict - Class in org.apache.spark.mllib.tree.model
Predicted value for a node
Predict(double, double) - Constructor for class org.apache.spark.mllib.tree.model.Predict
 
predict() - Method in class org.apache.spark.mllib.tree.model.Predict
 
predict(Vector) - Method in class org.apache.spark.mllib.tree.model.TreeEnsembleModel
Predict values for a single data point using the model trained.
predict(RDD<Vector>) - Method in class org.apache.spark.mllib.tree.model.TreeEnsembleModel
Predict values for the given data set.
predict(JavaRDD<Vector>) - Method in class org.apache.spark.mllib.tree.model.TreeEnsembleModel
predictionCol() - Method in interface org.apache.spark.ml.param.HasPredictionCol
Param for the prediction column name.
predictOn(DStream<Vector>) - Method in class org.apache.spark.mllib.clustering.StreamingKMeans
Use the clustering model to make predictions on batches of data from a DStream.
predictOn(DStream<Vector>) - Method in class org.apache.spark.mllib.regression.StreamingLinearAlgorithm
Use the model to make predictions on batches of data from a DStream
predictOnValues(DStream<Tuple2<K, Vector>>, ClassTag<K>) - Method in class org.apache.spark.mllib.clustering.StreamingKMeans
Use the model to make predictions on the values of a DStream and carry over its keys.
predictOnValues(DStream<Tuple2<K, Vector>>, ClassTag<K>) - Method in class org.apache.spark.mllib.regression.StreamingLinearAlgorithm
Use the model to make predictions on the values of a DStream and carry over its keys.
preferredLocation() - Method in class org.apache.spark.rdd.CoalescedRDDPartition
 
preferredLocation() - Method in class org.apache.spark.streaming.flume.FlumeReceiver
 
preferredLocation() - Method in class org.apache.spark.streaming.receiver.Receiver
Override this to specify a preferred location (hostname).
preferredLocations(Partition) - Method in class org.apache.spark.rdd.RDD
Get the preferred locations of a partition, taking into account whether the RDD is checkpointed.
preferredLocations() - Method in class org.apache.spark.rdd.UnionPartition
 
preferredLocations() - Method in class org.apache.spark.rdd.ZippedPartitionsPartition
 
preferredLocations() - Method in class org.apache.spark.scheduler.ResultTask
 
preferredLocations() - Method in class org.apache.spark.scheduler.ShuffleMapTask
 
preferredLocations() - Method in class org.apache.spark.scheduler.Task
 
preferredNodeLocationData() - Method in class org.apache.spark.SparkContext
 
prefix() - Method in class org.apache.spark.metrics.sink.GraphiteSink
 
PREFIX() - Static method in class org.apache.spark.streaming.Checkpoint
 
prefix() - Method in class org.apache.spark.ui.WebUIPage
 
prefix() - Method in class org.apache.spark.ui.WebUITab
 
prefLoc() - Method in class org.apache.spark.rdd.PartitionGroup
 
pregel(A, int, EdgeDirection, Function3<Object, VD, A, VD>, Function1<EdgeTriplet<VD, ED>, Iterator<Tuple2<Object, A>>>, Function2<A, A, A>, ClassTag<A>) - Method in class org.apache.spark.graphx.GraphOps
Execute a Pregel-like iterative vertex-parallel abstraction.
Pregel - Class in org.apache.spark.graphx
Implements a Pregel-like bulk-synchronous message-passing API.
Pregel() - Constructor for class org.apache.spark.graphx.Pregel
 
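Example (a minimal single-source shortest-paths sketch, a standard use of Pregel; the graph is illustrative):

    import org.apache.spark.SparkContext
    import org.apache.spark.graphx._

    val sc = new SparkContext("local[2]", "pregel-example")
    val graph = Graph.fromEdges(
      sc.parallelize(Seq(Edge(1L, 2L, 1.0), Edge(2L, 3L, 2.0), Edge(1L, 3L, 5.0))), 0.0)
    val sourceId: VertexId = 1L
    // Start with distance 0 at the source and infinity everywhere else.
    val init = graph.mapVertices((id, _) => if (id == sourceId) 0.0 else Double.PositiveInfinity)
    val sssp = Pregel(init, Double.PositiveInfinity)(
      (id, dist, msg) => math.min(dist, msg), // vertex program: keep the shorter distance
      t =>                                    // send a message when a shorter path is found
        if (t.srcAttr + t.attr < t.dstAttr) Iterator((t.dstId, t.srcAttr + t.attr))
        else Iterator.empty,
      (a, b) => math.min(a, b))               // merge messages: take the minimum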
PreInsertionCasts() - Method in class org.apache.spark.sql.hive.HiveMetastoreCatalog
 
prepare(int) - Method in class org.apache.spark.sql.hive.DeferredObjectAdapter
 
prepareForRead(Configuration, Map<String, String>, MessageType, ReadSupport.ReadContext) - Method in class org.apache.spark.sql.parquet.RowReadSupport
 
prepareForWrite(RecordConsumer) - Method in class org.apache.spark.sql.parquet.RowWriteSupport
 
prepareForWrite(RecordConsumer) - Method in class org.apache.spark.sql.parquet.TestGroupWriteSupport
 
prependBaseUri(String, String) - Static method in class org.apache.spark.ui.UIUtils
 
preSetup() - Method in class org.apache.spark.SparkHadoopWriter
 
preStart() - Method in class org.apache.spark.scheduler.cluster.CoarseGrainedSchedulerBackend.DriverActor
 
preStart() - Method in class org.apache.spark.scheduler.DAGSchedulerEventProcessActor
 
preStart() - Method in class org.apache.spark.storage.BlockManagerMasterActor
 
preStart() - Method in class org.apache.spark.streaming.zeromq.ZeroMQReceiver
 
prettyPrint() - Method in class org.apache.spark.streaming.Duration
 
prev() - Method in class org.apache.spark.mllib.rdd.SlidingRDDPartition
 
prev() - Method in class org.apache.spark.rdd.CoalescedRDD
 
prev() - Method in class org.apache.spark.rdd.PartitionwiseSampledRDDPartition
 
prev() - Method in class org.apache.spark.rdd.SampledRDDPartition
 
prev() - Method in class org.apache.spark.rdd.ShuffledRDD
 
prev() - Method in class org.apache.spark.rdd.ZippedWithIndexRDDPartition
 
prevHandler() - Method in class org.apache.spark.util.SignalLoggerHandler
 
primitiveType() - Method in class org.apache.spark.sql.parquet.ParquetTypeInfo
 
print() - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
Print the first ten elements of each RDD generated in this DStream.
print() - Method in class org.apache.spark.streaming.dstream.DStream
Print the first ten elements of each RDD generated in this DStream.
printSchema() - Method in interface org.apache.spark.sql.SchemaRDDLike
Prints out the schema.
printStats() - Method in class org.apache.spark.streaming.scheduler.StatsReportListener
 
prioritizeContainers(HashMap<K, ArrayBuffer<T>>) - Static method in class org.apache.spark.scheduler.TaskSchedulerImpl
Used to balance containers across hosts.
priority() - Method in class org.apache.spark.scheduler.Pool
 
priority() - Method in interface org.apache.spark.scheduler.Schedulable
 
priority() - Method in class org.apache.spark.scheduler.TaskSet
 
priority() - Method in class org.apache.spark.scheduler.TaskSetManager
 
prob(double) - Method in class org.apache.spark.mllib.tree.impurity.EntropyCalculator
Probability of the label given by predict.
prob(double) - Method in class org.apache.spark.mllib.tree.impurity.GiniCalculator
Probability of the label given by predict.
prob(double) - Method in class org.apache.spark.mllib.tree.impurity.ImpurityCalculator
Probability of the label given by predict, or -1 if no probability is available.
prob() - Method in class org.apache.spark.mllib.tree.model.Predict
 
probabilities() - Static method in class org.apache.spark.scheduler.StatsReportListener
 
PROCESS_LOCAL() - Static method in class org.apache.spark.scheduler.TaskLocality
 
processingDelay() - Method in class org.apache.spark.streaming.scheduler.BatchInfo
Time taken for all the jobs of this batch to finish processing, from the time they started processing.
processingDelay() - Method in class org.apache.spark.streaming.scheduler.JobSet
 
processingDelayDistribution() - Method in class org.apache.spark.streaming.ui.StreamingJobProgressListener
 
processingEndTime() - Method in class org.apache.spark.streaming.scheduler.BatchInfo
 
processingStartTime() - Method in class org.apache.spark.streaming.scheduler.BatchInfo
 
processRecords(List<Record>, IRecordProcessorCheckpointer) - Method in class org.apache.spark.streaming.kinesis.KinesisRecordProcessor
This method is called by the KCL when a batch of records is pulled from the Kinesis stream.
processResults(ArrayList<Object>) - Static method in class org.apache.spark.sql.hive.HiveShim
 
product() - Method in class org.apache.spark.mllib.recommendation.Rating
 
productFeatures() - Method in class org.apache.spark.mllib.recommendation.MatrixFactorizationModel
 
productToRowRdd(RDD<A>, StructType) - Static method in class org.apache.spark.sql.execution.RDDConversions
 
progressBar() - Method in class org.apache.spark.SparkContext
 
progressListener() - Method in class org.apache.spark.streaming.StreamingContext
 
Project - Class in org.apache.spark.sql.execution
:: DeveloperApi ::
Project(Seq<NamedExpression>, SparkPlan) - Constructor for class org.apache.spark.sql.execution.Project
 
projectList() - Method in class org.apache.spark.sql.execution.Project
 
properties() - Method in class org.apache.spark.metrics.MetricsConfig
 
properties() - Method in class org.apache.spark.scheduler.ActiveJob
 
properties() - Method in class org.apache.spark.scheduler.JobSubmitted
 
properties() - Method in class org.apache.spark.scheduler.SparkListenerJobStart
 
properties() - Method in class org.apache.spark.scheduler.SparkListenerStageSubmitted
 
properties() - Method in class org.apache.spark.scheduler.TaskSet
 
propertiesFromJson(JsonAST.JValue) - Static method in class org.apache.spark.util.JsonProtocol
 
propertiesToJson(Properties) - Static method in class org.apache.spark.util.JsonProtocol
 
property() - Method in class org.apache.spark.metrics.sink.ConsoleSink
 
property() - Method in class org.apache.spark.metrics.sink.CsvSink
 
property() - Method in class org.apache.spark.metrics.sink.GraphiteSink
 
property() - Method in class org.apache.spark.metrics.sink.JmxSink
 
property() - Method in class org.apache.spark.metrics.sink.MetricsServlet
 
propertyCategories() - Method in class org.apache.spark.metrics.MetricsConfig
 
propertyToOption(String) - Method in class org.apache.spark.metrics.sink.GraphiteSink
 
provider() - Method in class org.apache.spark.sql.sources.CreateTableUsing
 
proxyBase() - Method in class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.AddWebUIFilter
 
pruneColumns(Seq<Attribute>) - Method in class org.apache.spark.sql.parquet.ParquetTableScan
Applies a (candidate) projection.
PruneDependency<T> - Class in org.apache.spark.rdd
Represents a dependency between the PartitionPruningRDD and its parent.
PruneDependency(RDD<T>, Function1<Object, Object>) - Constructor for class org.apache.spark.rdd.PruneDependency
 
PrunedFilteredScan - Class in org.apache.spark.sql.sources
:: DeveloperApi :: A BaseRelation that can eliminate unneeded columns and filter using selected predicates before producing an RDD containing all matching tuples as Row objects.
PrunedFilteredScan() - Constructor for class org.apache.spark.sql.sources.PrunedFilteredScan
 
PrunedScan - Class in org.apache.spark.sql.sources
:: DeveloperApi :: A BaseRelation that can eliminate unneeded columns before producing an RDD containing all of its tuples as Row objects.
PrunedScan() - Constructor for class org.apache.spark.sql.sources.PrunedScan
 
prunePartitions(Seq<Partition>) - Method in class org.apache.spark.sql.hive.execution.HiveTableScan
Prunes partitions not involved in the query plan.
Pseudorandom - Interface in org.apache.spark.util.random
:: DeveloperApi :: A class with pseudorandom behavior.
pushAndReportBlock(ReceivedBlock, Option<Object>, Option<StreamBlockId>) - Method in class org.apache.spark.streaming.receiver.ReceiverSupervisorImpl
Store a block and report it to the driver.
pushArrayBuffer(ArrayBuffer<?>, Option<Object>, Option<StreamBlockId>) - Method in class org.apache.spark.streaming.receiver.ReceiverSupervisor
Store an ArrayBuffer of received data as a data block into Spark's memory.
pushArrayBuffer(ArrayBuffer<?>, Option<Object>, Option<StreamBlockId>) - Method in class org.apache.spark.streaming.receiver.ReceiverSupervisorImpl
Store an ArrayBuffer of received data as a data block into Spark's memory.
pushBytes(ByteBuffer, Option<Object>, Option<StreamBlockId>) - Method in class org.apache.spark.streaming.receiver.ReceiverSupervisor
Store the bytes of received data as a data block into Spark's memory.
pushBytes(ByteBuffer, Option<Object>, Option<StreamBlockId>) - Method in class org.apache.spark.streaming.receiver.ReceiverSupervisorImpl
Store the bytes of received data as a data block into Spark's memory.
pushIterator(Iterator<Object>, Option<Object>, Option<StreamBlockId>) - Method in class org.apache.spark.streaming.receiver.ReceiverSupervisor
Store an iterator of received data as a data block into Spark's memory.
pushIterator(Iterator<Object>, Option<Object>, Option<StreamBlockId>) - Method in class org.apache.spark.streaming.receiver.ReceiverSupervisorImpl
Store an iterator of received data as a data block into Spark's memory.
pushSingle(Object) - Method in class org.apache.spark.streaming.receiver.ReceiverSupervisor
Push a single data item to the backend data store.
pushSingle(Object) - Method in class org.apache.spark.streaming.receiver.ReceiverSupervisorImpl
Push a single record of received data into block generator.
put(ParamPair<?>...) - Method in class org.apache.spark.ml.param.ParamMap
Puts a list of param pairs (overwrites if the input params exist).
put(Param<T>, T) - Method in class org.apache.spark.ml.param.ParamMap
Puts a (param, value) pair (overwrites if the input param exists).
put(Seq<ParamPair<?>>) - Method in class org.apache.spark.ml.param.ParamMap
Puts a list of param pairs (overwrites if the input params exist).
putAll(Map<A, B>) - Method in class org.apache.spark.util.TimeStampedHashMap
 
putArray(BlockId, Object[], StorageLevel, boolean, Option<StorageLevel>) - Method in class org.apache.spark.storage.BlockManager
Put a new block of values to the block manager.
putArray(BlockId, Object[], StorageLevel, boolean) - Method in class org.apache.spark.storage.BlockStore
 
putArray(BlockId, Object[], StorageLevel, boolean) - Method in class org.apache.spark.storage.DiskStore
 
putArray(BlockId, Object[], StorageLevel, boolean) - Method in class org.apache.spark.storage.MemoryStore
 
putArray(BlockId, Object[], StorageLevel, boolean) - Method in class org.apache.spark.storage.TachyonStore
 
putBlockData(BlockId, ManagedBuffer, StorageLevel) - Method in class org.apache.spark.storage.BlockManager
Put the block locally, using the given storage level.
putBytes(BlockId, ByteBuffer, StorageLevel, boolean, Option<StorageLevel>) - Method in class org.apache.spark.storage.BlockManager
Put a new block of serialized bytes to the block manager.
putBytes(BlockId, ByteBuffer, StorageLevel) - Method in class org.apache.spark.storage.BlockStore
 
putBytes(BlockId, ByteBuffer, StorageLevel) - Method in class org.apache.spark.storage.DiskStore
 
putBytes(BlockId, ByteBuffer, StorageLevel) - Method in class org.apache.spark.storage.MemoryStore
 
putBytes(BlockId, ByteBuffer, StorageLevel) - Method in class org.apache.spark.storage.TachyonStore
 
putCachedMetadata(String, Object) - Static method in class org.apache.spark.rdd.HadoopRDD
 
putIfAbsent(A, B) - Method in class org.apache.spark.util.TimeStampedHashMap
 
putIfAbsent(A, B) - Method in class org.apache.spark.util.TimeStampedWeakValueHashMap
 
putIterator(BlockId, Iterator<Object>, StorageLevel, boolean, Option<StorageLevel>) - Method in class org.apache.spark.storage.BlockManager
 
putIterator(BlockId, Iterator<Object>, StorageLevel, boolean) - Method in class org.apache.spark.storage.BlockStore
Put in a block and, possibly, also return its content as either bytes or another Iterator.
putIterator(BlockId, Iterator<Object>, StorageLevel, boolean) - Method in class org.apache.spark.storage.DiskStore
 
putIterator(BlockId, Iterator<Object>, StorageLevel, boolean) - Method in class org.apache.spark.storage.MemoryStore
 
putIterator(BlockId, Iterator<Object>, StorageLevel, boolean, boolean) - Method in class org.apache.spark.storage.MemoryStore
Attempt to put the given block in the memory store.
putIterator(BlockId, Iterator<Object>, StorageLevel, boolean) - Method in class org.apache.spark.storage.TachyonStore
 
PutResult - Class in org.apache.spark.storage
Result of adding a block into a BlockStore.
PutResult(long, Either<Iterator<Object>, ByteBuffer>, Seq<Tuple2<BlockId, BlockStatus>>) - Constructor for class org.apache.spark.storage.PutResult
 
putSingle(BlockId, Object, StorageLevel, boolean) - Method in class org.apache.spark.storage.BlockManager
Write a block consisting of a single object.
pValue() - Method in class org.apache.spark.mllib.stat.test.ChiSqTestResult
 
pValue() - Method in interface org.apache.spark.mllib.stat.test.TestResult
The probability of obtaining a test statistic result at least as extreme as the one that was actually observed, assuming that the null hypothesis is true.
pythonExec() - Method in class org.apache.spark.sql.execution.PythonUDF
 
pythonIncludes() - Method in class org.apache.spark.sql.execution.PythonUDF
 
PythonUDF - Class in org.apache.spark.sql.execution
A serialized version of a Python lambda function.
PythonUDF(String, byte[], Map<String, String>, List<String>, String, List<Broadcast<PythonBroadcast>>, Accumulator<List<byte[]>>, DataType, Seq<Expression>) - Constructor for class org.apache.spark.sql.execution.PythonUDF
 
pyUDT() - Method in class org.apache.spark.mllib.linalg.VectorUDT
 
pyUDT() - Method in class org.apache.spark.sql.test.ExamplePointUDT
 

Q

quantileCalculationStrategy() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
 
QuantileStrategy - Class in org.apache.spark.mllib.tree.configuration
:: Experimental :: Enum for selecting the quantile calculation strategy
QuantileStrategy() - Constructor for class org.apache.spark.mllib.tree.configuration.QuantileStrategy
 
quantileStrategy() - Method in class org.apache.spark.mllib.tree.impl.DecisionTreeMetadata
 
query() - Method in class org.apache.spark.sql.hive.execution.CreateTableAsSelect
 
queryExecution() - Method in interface org.apache.spark.sql.SchemaRDDLike
:: DeveloperApi :: A lazily computed query execution workflow.
QueryExecutionException - Exception in org.apache.spark.sql.execution
 
QueryExecutionException(String) - Constructor for exception org.apache.spark.sql.execution.QueryExecutionException
 
queue() - Method in class org.apache.spark.streaming.dstream.QueueInputDStream
 
QueueInputDStream<T> - Class in org.apache.spark.streaming.dstream
 
QueueInputDStream(StreamingContext, Queue<RDD<T>>, boolean, RDD<T>, ClassTag<T>) - Constructor for class org.apache.spark.streaming.dstream.QueueInputDStream
 
queueIsEmpty() - Method in class org.apache.spark.scheduler.LiveListenerBus
Return whether the event queue is empty.
queueStream(Queue<JavaRDD<T>>) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
Create an input stream from a queue of RDDs.
queueStream(Queue<JavaRDD<T>>, boolean) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
Create an input stream from a queue of RDDs.
queueStream(Queue<JavaRDD<T>>, boolean, JavaRDD<T>) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
Create an input stream from a queue of RDDs.
queueStream(Queue<RDD<T>>, boolean, ClassTag<T>) - Method in class org.apache.spark.streaming.StreamingContext
Create an input stream from a queue of RDDs.
queueStream(Queue<RDD<T>>, boolean, RDD<T>, ClassTag<T>) - Method in class org.apache.spark.streaming.StreamingContext
Create an input stream from a queue of RDDs.
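Example (a minimal sketch, often used for testing; the batch interval and data are illustrative):

    import scala.collection.mutable
    import org.apache.spark.SparkContext
    import org.apache.spark.rdd.RDD
    import org.apache.spark.streaming.{Seconds, StreamingContext}

    val sc = new SparkContext("local[2]", "queueStream-example")
    val ssc = new StreamingContext(sc, Seconds(1))
    val queue = new mutable.Queue[RDD[Int]]()
    val stream = ssc.queueStream(queue, true) // oneAtATime: one queued RDD per batch
    stream.print()
    queue += sc.parallelize(1 to 100)         // enqueue test data
    // ssc.start(); ssc.awaitTermination()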

R

r2() - Method in class org.apache.spark.mllib.evaluation.RegressionMetrics
Returns R², the coefficient of determination.
RACK_LOCAL() - Static method in class org.apache.spark.scheduler.TaskLocality
 
rand(int, int, Random) - Static method in class org.apache.spark.mllib.linalg.Matrices
Generate a DenseMatrix consisting of i.i.d. uniform random numbers.
RAND() - Static method in class org.apache.spark.sql.hive.HiveQl
 
randn(int, int, Random) - Static method in class org.apache.spark.mllib.linalg.Matrices
Generate a DenseMatrix consisting of i.i.d. gaussian random numbers.
RANDOM() - Static method in class org.apache.spark.mllib.clustering.KMeans
 
random() - Static method in class org.apache.spark.util.Utils
 
random(int, Random) - Static method in class org.apache.spark.util.Vector
Creates a Vector of the given length containing random numbers between 0.0 and 1.0.
RandomDataGenerator<T> - Interface in org.apache.spark.mllib.random
:: DeveloperApi :: Trait for random data generators that generate i.i.d. data.
RandomForest - Class in org.apache.spark.mllib.tree
:: Experimental :: A class that implements a Random Forest learning algorithm for classification and regression.
RandomForest(Strategy, int, String, int) - Constructor for class org.apache.spark.mllib.tree.RandomForest
 
RandomForest.NodeIndexInfo - Class in org.apache.spark.mllib.tree
 
RandomForest.NodeIndexInfo(int, Option<int[]>) - Constructor for class org.apache.spark.mllib.tree.RandomForest.NodeIndexInfo
 
RandomForestModel - Class in org.apache.spark.mllib.tree.model
:: Experimental :: Represents a random forest model.
RandomForestModel(Enumeration.Value, DecisionTreeModel[]) - Constructor for class org.apache.spark.mllib.tree.model.RandomForestModel
 
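Example (a minimal sketch; the LIBSVM path is a placeholder and the parameters are illustrative):

    import org.apache.spark.SparkContext
    import org.apache.spark.mllib.tree.RandomForest
    import org.apache.spark.mllib.util.MLUtils

    val sc = new SparkContext("local[2]", "forest-example")
    val data = MLUtils.loadLibSVMFile(sc, "data/sample_libsvm_data.txt")
    // numClasses = 2, no categorical features, 10 trees, "auto" feature subsets,
    // gini impurity, maxDepth = 4, maxBins = 32, seed = 42.
    val model = RandomForest.trainClassifier(
      data, 2, Map[Int, Int](), 10, "auto", "gini", 4, 32, 42)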
randomize(TraversableOnce<T>, ClassTag<T>) - Static method in class org.apache.spark.util.Utils
Shuffle the elements of a collection into a random order, returning the result in a new collection.
randomizeInPlace(Object, Random) - Static method in class org.apache.spark.util.Utils
Shuffle the elements of an array into a random order, modifying the original array.
randomRDD(SparkContext, RandomDataGenerator<T>, long, int, long, ClassTag<T>) - Static method in class org.apache.spark.mllib.random.RandomRDDs
:: DeveloperApi :: Generates an RDD comprised of i.i.d. samples produced by the input RandomDataGenerator.
RandomRDD<T> - Class in org.apache.spark.mllib.rdd
 
RandomRDD(SparkContext, long, int, RandomDataGenerator<T>, long, ClassTag<T>) - Constructor for class org.apache.spark.mllib.rdd.RandomRDD
 
RandomRDDPartition<T> - Class in org.apache.spark.mllib.rdd
 
RandomRDDPartition(int, int, RandomDataGenerator<T>, long) - Constructor for class org.apache.spark.mllib.rdd.RandomRDDPartition
 
RandomRDDs - Class in org.apache.spark.mllib.random
:: Experimental :: Generator methods for creating RDDs comprised of i.i.d. samples from some distribution.
RandomRDDs() - Constructor for class org.apache.spark.mllib.random.RandomRDDs
 
RandomSampler<T,U> - Interface in org.apache.spark.util.random
:: DeveloperApi :: A pseudorandom sampler.
randomSplit(double[]) - Method in class org.apache.spark.api.java.JavaRDD
Randomly splits this RDD with the provided weights.
randomSplit(double[], long) - Method in class org.apache.spark.api.java.JavaRDD
Randomly splits this RDD with the provided weights.
randomSplit(double[], long) - Method in class org.apache.spark.rdd.RDD
Randomly splits this RDD with the provided weights.
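Example (a minimal train/test split sketch; the weights and seed are illustrative):

    import org.apache.spark.SparkContext

    val sc = new SparkContext("local[2]", "split-example")
    // Weights are normalized if they do not sum to 1; a fixed seed makes the split reproducible.
    val Array(train, test) = sc.parallelize(1 to 1000).randomSplit(Array(0.7, 0.3), 11L)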
randomVectorRDD(SparkContext, RandomDataGenerator<Object>, long, int, int, long) - Static method in class org.apache.spark.mllib.random.RandomRDDs
:: DeveloperApi :: Generates an RDD[Vector] with vectors containing i.i.d. samples produced by the input RandomDataGenerator.
RandomVectorRDD - Class in org.apache.spark.mllib.rdd
 
RandomVectorRDD(SparkContext, long, int, int, RandomDataGenerator<Object>, long) - Constructor for class org.apache.spark.mllib.rdd.RandomVectorRDD
 
RangeDependency<T> - Class in org.apache.spark
:: DeveloperApi :: Represents a one-to-one dependency between ranges of partitions in the parent and child RDDs.
RangeDependency(RDD<T>, int, int, int) - Constructor for class org.apache.spark.RangeDependency
 
RangePartitioner<K,V> - Class in org.apache.spark
A Partitioner that partitions sortable records by range into roughly equal ranges.
RangePartitioner(int, RDD<? extends Product2<K, V>>, boolean, Ordering<K>, ClassTag<K>) - Constructor for class org.apache.spark.RangePartitioner
 
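Example (a minimal sketch; the partition count and data are illustrative):

    import org.apache.spark.SparkContext
    import org.apache.spark.SparkContext._ // pair-RDD implicits
    import org.apache.spark.RangePartitioner

    val sc = new SparkContext("local[2]", "range-example")
    val pairs = sc.parallelize(Seq((5, "e"), (1, "a"), (3, "c"), (2, "b")))
    // Samples the keys to compute range bounds, then assigns keys to ranges,
    // producing partitions of roughly equal size with ordered key ranges.
    val ranged = pairs.partitionBy(new RangePartitioner(2, pairs))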
rank() - Method in class org.apache.spark.graphx.lib.SVDPlusPlus.Conf
 
rank() - Method in class org.apache.spark.mllib.recommendation.MatrixFactorizationModel
 
RankingMetrics<T> - Class in org.apache.spark.mllib.evaluation
:: Experimental :: Evaluator for ranking algorithms.
RankingMetrics(RDD<Tuple2<Object, Object>>, ClassTag<T>) - Constructor for class org.apache.spark.mllib.evaluation.RankingMetrics
 
RateLimitedOutputStream - Class in org.apache.spark.streaming.util
 
RateLimitedOutputStream(OutputStream, int) - Constructor for class org.apache.spark.streaming.util.RateLimitedOutputStream
 
RateLimiter - Class in org.apache.spark.streaming.receiver
Provides the waitToPush() method to limit the rate at which receivers consume data.
RateLimiter(SparkConf) - Constructor for class org.apache.spark.streaming.receiver.RateLimiter
 
Rating - Class in org.apache.spark.mllib.recommendation
:: Experimental :: A more compact class to represent a rating than Tuple3[Int, Int, Double].
Rating(int, int, double) - Constructor for class org.apache.spark.mllib.recommendation.Rating
 
rating() - Method in class org.apache.spark.mllib.recommendation.Rating
 
ratingsForBlock() - Method in class org.apache.spark.mllib.recommendation.InLinkBlock
 
RawInputDStream<T> - Class in org.apache.spark.streaming.dstream
An input stream that reads blocks of serialized objects from a given network address.
RawInputDStream(StreamingContext, String, int, StorageLevel, ClassTag<T>) - Constructor for class org.apache.spark.streaming.dstream.RawInputDStream
 
RawNetworkReceiver - Class in org.apache.spark.streaming.dstream
 
RawNetworkReceiver(String, int, StorageLevel) - Constructor for class org.apache.spark.streaming.dstream.RawNetworkReceiver
 
rawSocketStream(String, int, StorageLevel) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
Create an input stream from network source hostname:port, where data is received as serialized blocks (serialized using Spark's serializer) that can be directly pushed into the block manager without deserializing them.
rawSocketStream(String, int) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
Create an input stream from network source hostname:port, where data is received as serialized blocks (serialized using Spark's serializer) that can be directly pushed into the block manager without deserializing them.
rawSocketStream(String, int, StorageLevel, ClassTag<T>) - Method in class org.apache.spark.streaming.StreamingContext
Create an input stream from network source hostname:port, where data is received as serialized blocks (serialized using Spark's serializer) that can be directly pushed into the block manager without deserializing them.
RawTextHelper - Class in org.apache.spark.streaming.util
 
RawTextHelper() - Constructor for class org.apache.spark.streaming.util.RawTextHelper
 
RawTextSender - Class in org.apache.spark.streaming.util
A helper program that sends blocks of Kryo-serialized text strings out on a socket at a specified rate.
RawTextSender() - Constructor for class org.apache.spark.streaming.util.RawTextSender
 
rdd() - Method in class org.apache.spark.api.java.JavaDoubleRDD
 
rdd() - Method in class org.apache.spark.api.java.JavaPairRDD
 
rdd() - Method in class org.apache.spark.api.java.JavaRDD
 
rdd() - Method in interface org.apache.spark.api.java.JavaRDDLike
 
rdd() - Method in class org.apache.spark.Dependency
 
rdd() - Method in class org.apache.spark.NarrowDependency
 
rdd() - Method in class org.apache.spark.rdd.CoalescedRDDPartition
 
rdd() - Method in class org.apache.spark.rdd.NarrowCoGroupSplitDep
 
RDD<T> - Class in org.apache.spark.rdd
A Resilient Distributed Dataset (RDD), the basic abstraction in Spark.
RDD(SparkContext, Seq<Dependency<?>>, ClassTag<T>) - Constructor for class org.apache.spark.rdd.RDD
 
RDD(RDD<?>, ClassTag<T>) - Constructor for class org.apache.spark.rdd.RDD
Construct an RDD with just a one-to-one dependency on one parent
rdd() - Method in class org.apache.spark.scheduler.Stage
 
rdd() - Method in class org.apache.spark.ShuffleDependency
 
rdd() - Method in class org.apache.spark.sql.api.java.JavaSchemaRDD
 
rdd() - Method in class org.apache.spark.sql.execution.ExistingRdd
 
rdd() - Method in class org.apache.spark.sql.execution.LogicalRDD
 
rdd() - Method in class org.apache.spark.sql.execution.PhysicalRDD
 
RDD() - Static method in class org.apache.spark.storage.BlockId
 
rdd1() - Method in class org.apache.spark.rdd.CartesianRDD
 
rdd1() - Method in class org.apache.spark.rdd.SubtractedRDD
 
rdd1() - Method in class org.apache.spark.rdd.ZippedPartitionsRDD2
 
rdd1() - Method in class org.apache.spark.rdd.ZippedPartitionsRDD3
 
rdd1() - Method in class org.apache.spark.rdd.ZippedPartitionsRDD4
 
rdd2() - Method in class org.apache.spark.rdd.CartesianRDD
 
rdd2() - Method in class org.apache.spark.rdd.SubtractedRDD
 
rdd2() - Method in class org.apache.spark.rdd.ZippedPartitionsRDD2
 
rdd2() - Method in class org.apache.spark.rdd.ZippedPartitionsRDD3
 
rdd2() - Method in class org.apache.spark.rdd.ZippedPartitionsRDD4
 
rdd3() - Method in class org.apache.spark.rdd.ZippedPartitionsRDD3
 
rdd3() - Method in class org.apache.spark.rdd.ZippedPartitionsRDD4
 
rdd4() - Method in class org.apache.spark.rdd.ZippedPartitionsRDD4
 
RDDBlockId - Class in org.apache.spark.storage
 
RDDBlockId(int, int) - Constructor for class org.apache.spark.storage.RDDBlockId
 
rddBlocks() - Method in class org.apache.spark.storage.StorageStatus
Return the RDD blocks stored in this block manager.
rddBlocks() - Method in class org.apache.spark.ui.exec.ExecutorSummaryInfo
 
rddBlocksById(int) - Method in class org.apache.spark.storage.StorageStatus
Return the blocks that belong to the given RDD stored in this block manager.
RDDCheckpointData<T> - Class in org.apache.spark.rdd
This class contains all the information related to RDD checkpointing.
RDDCheckpointData(RDD<T>, ClassTag<T>) - Constructor for class org.apache.spark.rdd.RDDCheckpointData
 
rddCleaned(int) - Method in interface org.apache.spark.CleanerListener
 
RDDConversions - Class in org.apache.spark.sql.execution
:: DeveloperApi ::
RDDConversions() - Constructor for class org.apache.spark.sql.execution.RDDConversions
 
RDDFunctions<T> - Class in org.apache.spark.mllib.rdd
Machine learning specific RDD functions.
RDDFunctions(RDD<T>, ClassTag<T>) - Constructor for class org.apache.spark.mllib.rdd.RDDFunctions
 
rddId() - Method in class org.apache.spark.CleanRDD
 
rddId() - Method in class org.apache.spark.rdd.ParallelCollectionPartition
 
rddId() - Method in class org.apache.spark.scheduler.SparkListenerUnpersistRDD
 
rddId() - Method in class org.apache.spark.storage.BlockManagerMessages.RemoveRdd
 
rddId() - Method in class org.apache.spark.storage.RDDBlockId
 
RDDInfo - Class in org.apache.spark.storage
 
RDDInfo(int, String, int, StorageLevel) - Constructor for class org.apache.spark.storage.RDDInfo
 
rddInfoFromJson(JsonAST.JValue) - Static method in class org.apache.spark.util.JsonProtocol
 
rddInfoList() - Method in class org.apache.spark.ui.storage.StorageListener
Filter RDD info to include only those with cached partitions.
rddInfos() - Method in class org.apache.spark.scheduler.StageInfo
 
rddInfoToJson(RDDInfo) - Static method in class org.apache.spark.util.JsonProtocol
 
RDDPage - Class in org.apache.spark.ui.storage
Page showing storage details for a given RDD.
RDDPage(StorageTab) - Constructor for class org.apache.spark.ui.storage.RDDPage
 
rdds() - Method in class org.apache.spark.rdd.CoGroupedRDD
 
rdds() - Method in class org.apache.spark.rdd.PartitionerAwareUnionRDD
 
rdds() - Method in class org.apache.spark.rdd.PartitionerAwareUnionRDDPartition
 
rdds() - Method in class org.apache.spark.rdd.UnionRDD
 
rdds() - Method in class org.apache.spark.rdd.ZippedPartitionsBaseRDD
 
rddStorageLevel(int) - Method in class org.apache.spark.storage.StorageStatus
Return the storage level, if any, used by the given RDD in this block manager.
rddToAsyncRDDActions(RDD<T>, ClassTag<T>) - Static method in class org.apache.spark.SparkContext
 
rddToFileName(String, String, Time) - Static method in class org.apache.spark.streaming.StreamingContext
 
rddToOrderedRDDFunctions(RDD<Tuple2<K, V>>, Ordering<K>, ClassTag<K>, ClassTag<V>) - Static method in class org.apache.spark.SparkContext
 
rddToPairRDDFunctions(RDD<Tuple2<K, V>>, ClassTag<K>, ClassTag<V>, Ordering<K>) - Static method in class org.apache.spark.SparkContext
 
rddToSequenceFileRDDFunctions(RDD<Tuple2<K, V>>, Function1<K, Writable>, ClassTag<K>, Function1<V, Writable>, ClassTag<V>) - Static method in class org.apache.spark.SparkContext
 
read(Kryo, Input, Class<Iterable<?>>) - Method in class org.apache.spark.serializer.JavaIterableWrapperSerializer
 
read(Kryo, Input, Class<BigDecimal>) - Method in class org.apache.spark.sql.execution.BigDecimalSerializer
 
read(Kryo, Input, Class<HyperLogLog>) - Method in class org.apache.spark.sql.execution.HyperLogLogSerializer
 
read(Kryo, Input, Class<IntegerHashSet>) - Method in class org.apache.spark.sql.execution.IntegerHashSetSerializer
 
read(Kryo, Input, Class<LongHashSet>) - Method in class org.apache.spark.sql.execution.LongHashSetSerializer
 
read(Kryo, Input, Class<OpenHashSet<?>>) - Method in class org.apache.spark.sql.execution.OpenHashSetSerializer
 
read(String, SparkConf, Configuration) - Static method in class org.apache.spark.streaming.CheckpointReader
 
read(WriteAheadLogFileSegment) - Method in class org.apache.spark.streaming.util.WriteAheadLogRandomReader
 
read() - Method in class org.apache.spark.util.ByteBufferInputStream
 
read(byte[]) - Method in class org.apache.spark.util.ByteBufferInputStream
 
read(byte[], int, int) - Method in class org.apache.spark.util.ByteBufferInputStream
 
readBatches() - Method in class org.apache.spark.sql.columnar.InMemoryColumnarTableScan
 
readExternal(ObjectInput) - Method in class org.apache.spark.scheduler.CompressedMapStatus
 
readExternal(ObjectInput) - Method in class org.apache.spark.scheduler.DirectTaskResult
 
readExternal(ObjectInput) - Method in class org.apache.spark.scheduler.HighlyCompressedMapStatus
 
readExternal(ObjectInput) - Method in class org.apache.spark.serializer.JavaSerializer
 
readExternal(ObjectInput) - Method in class org.apache.spark.sql.hive.HiveFunctionWrapper
 
readExternal(ObjectInput) - Method in class org.apache.spark.storage.BlockManagerId
 
readExternal(ObjectInput) - Method in class org.apache.spark.storage.BlockManagerMessages.UpdateBlockInfo
 
readExternal(ObjectInput) - Method in class org.apache.spark.storage.StorageLevel
 
readExternal(ObjectInput) - Static method in class org.apache.spark.streaming.flume.EventTransformer
 
readExternal(ObjectInput) - Method in class org.apache.spark.streaming.flume.SparkFlumeEvent
 
readFromFile(Path, Broadcast<SerializableWritable<Configuration>>, TaskContext) - Static method in class org.apache.spark.rdd.CheckpointRDD
 
readFromLog() - Method in class org.apache.spark.streaming.util.WriteAheadLogManager
Read all the existing logs from the log directory.
readLock(Function0<A>) - Method in interface org.apache.spark.sql.CacheManager
Acquires a read lock on the cache for the duration of `f`.
readMetaData(Path, Option<Configuration>) - Static method in class org.apache.spark.sql.parquet.ParquetTypesConverter
Try to read Parquet metadata at the given Path.
readObject(ClassTag<T>) - Method in class org.apache.spark.serializer.DeserializationStream
 
readObject(ClassTag<T>) - Method in class org.apache.spark.serializer.JavaDeserializationStream
 
readObject(ClassTag<T>) - Method in class org.apache.spark.serializer.KryoDeserializationStream
 
readPartitions() - Method in class org.apache.spark.sql.columnar.InMemoryColumnarTableScan
 
readSchemaFromFile(Path, Option<Configuration>, boolean) - Static method in class org.apache.spark.sql.parquet.ParquetTypesConverter
Reads in Parquet Metadata from the given path and tries to extract the schema (Catalyst attributes) from the application-specific key-value map.
ready(Duration, CanAwait) - Method in class org.apache.spark.ComplexFutureAction
 
ready(Duration, CanAwait) - Method in interface org.apache.spark.FutureAction
Blocks until this action completes.
ready(Duration, CanAwait) - Method in class org.apache.spark.SimpleFutureAction
 
RealClock - Class in org.apache.spark
A clock backed by a monotonically increasing time source.
RealClock() - Constructor for class org.apache.spark.RealClock
 
reason() - Method in class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.RemoveExecutor
 
reason() - Method in class org.apache.spark.scheduler.CompletionEvent
 
reason() - Method in class org.apache.spark.scheduler.SparkListenerTaskEnd
 
reason() - Method in class org.apache.spark.scheduler.TaskSetFailed
 
recache() - Method in class org.apache.spark.sql.columnar.InMemoryRelation
 
Recall - Class in org.apache.spark.mllib.evaluation.binary
Recall.
Recall() - Constructor for class org.apache.spark.mllib.evaluation.binary.Recall
 
recall(double) - Method in class org.apache.spark.mllib.evaluation.MulticlassMetrics
Returns recall for a given label (category).
recall() - Method in class org.apache.spark.mllib.evaluation.MulticlassMetrics
Returns recall (equal to precision for a multiclass classifier, because the sum of all false positives equals the sum of all false negatives).
recall() - Method in class org.apache.spark.mllib.evaluation.MultilabelMetrics
Returns document-based recall averaged by the number of documents.
recall(double) - Method in class org.apache.spark.mllib.evaluation.MultilabelMetrics
Returns recall for a given label (category).
recallByThreshold() - Method in class org.apache.spark.mllib.evaluation.BinaryClassificationMetrics
Returns the (threshold, recall) curve.
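For orientation, a minimal sketch of the recall calls above, assuming a SparkContext `sc` is in scope (as in spark-shell) and using made-up (prediction, label) pairs:

```scala
import org.apache.spark.mllib.evaluation.MulticlassMetrics

// Toy (prediction, label) pairs for illustration only.
val predictionAndLabels = sc.parallelize(
  Seq((0.0, 0.0), (1.0, 1.0), (1.0, 0.0), (0.0, 0.0)))
val metrics = new MulticlassMetrics(predictionAndLabels)
metrics.recall      // overall recall across all labels
metrics.recall(0.0) // recall for label 0.0 only
```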
receive() - Method in class org.apache.spark.scheduler.DAGSchedulerActorSupervisor
 
receive() - Method in class org.apache.spark.scheduler.DAGSchedulerEventProcessActor
The main event loop of the DAG scheduler.
receive() - Method in class org.apache.spark.streaming.dstream.SocketReceiver
Create a socket connection and receive data until the receiver is stopped.
receive() - Method in class org.apache.spark.streaming.receiver.ActorReceiver.Supervisor
 
receive() - Method in class org.apache.spark.streaming.zeromq.ZeroMQReceiver
 
receive() - Method in interface org.apache.spark.util.ActorLogReceive
 
ReceivedBlock - Interface in org.apache.spark.streaming.receiver
Trait representing a received block.
ReceivedBlockHandler - Interface in org.apache.spark.streaming.receiver
Trait that represents a class that handles the storage of blocks received by a receiver.
receivedBlockInfo() - Method in class org.apache.spark.streaming.scheduler.AddBlock
 
receivedBlockInfo() - Method in class org.apache.spark.streaming.scheduler.BatchInfo
 
receivedBlockInfo() - Method in class org.apache.spark.streaming.scheduler.BlockAdditionEvent
 
receivedBlockInfo() - Method in class org.apache.spark.streaming.scheduler.JobSet
 
ReceivedBlockInfo - Class in org.apache.spark.streaming.scheduler
Information about blocks received by the receiver.
ReceivedBlockInfo(int, long, ReceivedBlockStoreResult) - Constructor for class org.apache.spark.streaming.scheduler.ReceivedBlockInfo
 
ReceivedBlockStoreResult - Interface in org.apache.spark.streaming.receiver
Trait that represents the metadata related to storage of blocks
ReceivedBlockTracker - Class in org.apache.spark.streaming.scheduler
Class that keeps track of all the received blocks and allocates them to batches when required.
ReceivedBlockTracker(SparkConf, Configuration, Seq<Object>, Clock, Option<String>) - Constructor for class org.apache.spark.streaming.scheduler.ReceivedBlockTracker
 
ReceivedBlockTrackerLogEvent - Interface in org.apache.spark.streaming.scheduler
Trait representing any event in the ReceivedBlockTracker that updates its state.
receivedRecordsDistributions() - Method in class org.apache.spark.streaming.ui.StreamingJobProgressListener
 
Receiver<T> - Class in org.apache.spark.streaming.receiver
:: DeveloperApi :: Abstract class of a receiver that can be run on worker nodes to receive external data.
Receiver(StorageLevel) - Constructor for class org.apache.spark.streaming.receiver.Receiver
 
receiverActor() - Method in class org.apache.spark.streaming.scheduler.RegisterReceiver
 
receiverExecutor() - Method in class org.apache.spark.streaming.flume.FlumePollingReceiver
 
ReceiverInfo - Class in org.apache.spark.streaming.scheduler
:: DeveloperApi :: Class holding information about a receiver.
ReceiverInfo(int, String, ActorRef, boolean, String, String, String) - Constructor for class org.apache.spark.streaming.scheduler.ReceiverInfo
 
receiverInfo() - Method in class org.apache.spark.streaming.scheduler.StreamingListenerReceiverError
 
receiverInfo() - Method in class org.apache.spark.streaming.scheduler.StreamingListenerReceiverStarted
 
receiverInfo() - Method in class org.apache.spark.streaming.scheduler.StreamingListenerReceiverStopped
 
receiverInfo(int) - Method in class org.apache.spark.streaming.ui.StreamingJobProgressListener
 
receiverInputDStream() - Method in class org.apache.spark.streaming.api.java.JavaPairReceiverInputDStream
 
receiverInputDStream() - Method in class org.apache.spark.streaming.api.java.JavaReceiverInputDStream
 
ReceiverInputDStream<T> - Class in org.apache.spark.streaming.dstream
Abstract class for defining any InputDStream that has to start a receiver on worker nodes to receive external data.
ReceiverInputDStream(StreamingContext, ClassTag<T>) - Constructor for class org.apache.spark.streaming.dstream.ReceiverInputDStream
 
ReceiverMessage - Interface in org.apache.spark.streaming.receiver
Messages sent to the Receiver.
ReceiverState() - Method in class org.apache.spark.streaming.receiver.ReceiverSupervisor
 
receiverState() - Method in class org.apache.spark.streaming.receiver.ReceiverSupervisor
State of the receiver.
receiverStream(Receiver<T>) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
Create an input stream with an arbitrary user-implemented receiver.
receiverStream(Receiver<T>, ClassTag<T>) - Method in class org.apache.spark.streaming.StreamingContext
Create an input stream with an arbitrary user-implemented receiver.
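A minimal custom-receiver sketch for the receiverStream entries above; the class name, host, and port are assumptions, and error handling is elided:

```scala
import java.net.Socket
import scala.io.Source
import org.apache.spark.storage.StorageLevel
import org.apache.spark.streaming.receiver.Receiver

// Reads lines from a socket and hands them to Spark via store().
class LineReceiver(host: String, port: Int)
  extends Receiver[String](StorageLevel.MEMORY_ONLY) {

  def onStart(): Unit = {
    // Receive on a separate thread so that onStart() returns promptly.
    new Thread("LineReceiver") {
      override def run(): Unit = {
        val socket = new Socket(host, port)
        Source.fromInputStream(socket.getInputStream)
          .getLines().foreach(line => store(line))
        restart("Connection closed, retrying")
      }
    }.start()
  }

  def onStop(): Unit = {} // nothing to clean up; the thread exits on restart/stop
}

// With a StreamingContext `ssc` in scope:
// val lines = ssc.receiverStream(new LineReceiver("localhost", 9999))
```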
ReceiverSupervisor - Class in org.apache.spark.streaming.receiver
Abstract class that is responsible for supervising a Receiver in the worker.
ReceiverSupervisor(Receiver<?>, SparkConf) - Constructor for class org.apache.spark.streaming.receiver.ReceiverSupervisor
 
ReceiverSupervisor.ReceiverState - Class in org.apache.spark.streaming.receiver
 
ReceiverSupervisor.ReceiverState() - Constructor for class org.apache.spark.streaming.receiver.ReceiverSupervisor.ReceiverState
Enumeration to identify the current state of the receiver.
ReceiverSupervisorImpl - Class in org.apache.spark.streaming.receiver
Concrete implementation of ReceiverSupervisor which provides all the necessary functionality for handling the data received by the receiver.
ReceiverSupervisorImpl(Receiver<?>, SparkEnv, Configuration, Option<String>) - Constructor for class org.apache.spark.streaming.receiver.ReceiverSupervisorImpl
 
receiverTracker() - Method in class org.apache.spark.streaming.scheduler.JobScheduler
 
ReceiverTracker - Class in org.apache.spark.streaming.scheduler
This class manages the execution of the receivers of ReceiverInputDStreams.
ReceiverTracker(StreamingContext, boolean) - Constructor for class org.apache.spark.streaming.scheduler.ReceiverTracker
 
ReceiverTracker.ReceiverLauncher - Class in org.apache.spark.streaming.scheduler
This thread class runs all the receivers on the cluster.
ReceiverTracker.ReceiverLauncher() - Constructor for class org.apache.spark.streaming.scheduler.ReceiverTracker.ReceiverLauncher
 
ReceiverTrackerMessage - Interface in org.apache.spark.streaming.scheduler
Messages used by the NetworkReceiver and the ReceiverTracker to communicate with each other.
receiveWithLogging() - Method in class org.apache.spark.HeartbeatReceiver
 
receiveWithLogging() - Method in class org.apache.spark.MapOutputTrackerMasterActor
 
receiveWithLogging() - Method in class org.apache.spark.scheduler.cluster.CoarseGrainedSchedulerBackend.DriverActor
 
receiveWithLogging() - Method in class org.apache.spark.scheduler.local.LocalActor
 
receiveWithLogging() - Method in class org.apache.spark.storage.BlockManagerMasterActor
 
receiveWithLogging() - Method in class org.apache.spark.storage.BlockManagerSlaveActor
 
receiveWithLogging() - Method in interface org.apache.spark.util.ActorLogReceive
 
recentExceptions() - Method in class org.apache.spark.scheduler.TaskSetManager
 
recommendProducts(int, int) - Method in class org.apache.spark.mllib.recommendation.MatrixFactorizationModel
Recommends products to a user.
recommendUsers(int, int) - Method in class org.apache.spark.mllib.recommendation.MatrixFactorizationModel
Recommends users to a product.
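A hedged sketch of the recommendation calls, assuming a SparkContext `sc` and toy ratings (real inputs would be far larger):

```scala
import org.apache.spark.mllib.recommendation.{ALS, Rating}

// (user, product, rating) triples.
val ratings = sc.parallelize(Seq(
  Rating(1, 10, 5.0), Rating(1, 20, 1.0), Rating(2, 10, 4.0)))
val model = ALS.train(ratings, 8, 10) // rank = 8, iterations = 10
val top2 = model.recommendProducts(1, 2) // Array[Rating], highest score first
```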
recomputeLocality() - Method in class org.apache.spark.scheduler.TaskSetManager
 
RECORD_LENGTH_PROPERTY() - Static method in class org.apache.spark.input.FixedLengthBinaryInputFormat
Property name to set in Hadoop JobConfs for record length.
recordProcessorFactory() - Method in class org.apache.spark.streaming.kinesis.KinesisReceiver
 
RECORDS_BETWEEN_BYTES_READ_METRIC_UPDATES() - Static method in class org.apache.spark.rdd.HadoopRDD
Update the input bytes read metric each time this number of records has been read.
RECORDS_BETWEEN_BYTES_WRITTEN_METRIC_UPDATES() - Static method in class org.apache.spark.rdd.PairRDDFunctions
 
RecurringTimer - Class in org.apache.spark.streaming.util
 
RecurringTimer(Clock, long, Function1<Object, BoxedUnit>, String) - Constructor for class org.apache.spark.streaming.util.RecurringTimer
 
RedirectThread - Class in org.apache.spark.util
A utility class to redirect the child process's stdout or stderr.
RedirectThread(InputStream, OutputStream, String, boolean) - Constructor for class org.apache.spark.util.RedirectThread
 
reduce(Function2<T, T, T>) - Method in interface org.apache.spark.api.java.JavaRDDLike
Reduces the elements of this RDD using the specified commutative and associative binary operator.
reduce(Function2<T, T, T>) - Method in class org.apache.spark.rdd.RDD
Reduces the elements of this RDD using the specified commutative and associative binary operator.
reduce(Function2<T, T, T>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
Return a new DStream in which each RDD has a single element generated by reducing each RDD of this DStream.
reduce(Function2<T, T, T>) - Method in class org.apache.spark.streaming.dstream.DStream
Return a new DStream in which each RDD has a single element generated by reducing each RDD of this DStream.
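A one-line sketch of reduce, assuming a SparkContext `sc` (as in spark-shell):

```scala
val rdd = sc.parallelize(1 to 4)
// The operator must be commutative and associative, since partial results
// are combined across partitions in no fixed order.
val sum = rdd.reduce((a, b) => a + b) // 10
```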
reduceByKey(Partitioner, Function2<V, V, V>) - Method in class org.apache.spark.api.java.JavaPairRDD
Merge the values for each key using an associative reduce function.
reduceByKey(Function2<V, V, V>, int) - Method in class org.apache.spark.api.java.JavaPairRDD
Merge the values for each key using an associative reduce function.
reduceByKey(Function2<V, V, V>) - Method in class org.apache.spark.api.java.JavaPairRDD
Merge the values for each key using an associative reduce function.
reduceByKey(Partitioner, Function2<V, V, V>) - Method in class org.apache.spark.rdd.PairRDDFunctions
Merge the values for each key using an associative reduce function.
reduceByKey(Function2<V, V, V>, int) - Method in class org.apache.spark.rdd.PairRDDFunctions
Merge the values for each key using an associative reduce function.
reduceByKey(Function2<V, V, V>) - Method in class org.apache.spark.rdd.PairRDDFunctions
Merge the values for each key using an associative reduce function.
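A minimal reduceByKey sketch; assumes a SparkContext `sc`, and note that in this API version the pair functions are supplied by the rddToPairRDDFunctions implicit:

```scala
import org.apache.spark.SparkContext._ // brings in rddToPairRDDFunctions

val pairs = sc.parallelize(Seq(("a", 1), ("b", 1), ("a", 1)))
// Values are pre-aggregated on the map side before the shuffle.
val counts = pairs.reduceByKey(_ + _) // ("a", 2), ("b", 1)
```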
reduceByKey(Function2<V, V, V>) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
Return a new DStream by applying reduceByKey to each RDD.
reduceByKey(Function2<V, V, V>, int) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
Return a new DStream by applying reduceByKey to each RDD.
reduceByKey(Function2<V, V, V>, Partitioner) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
Return a new DStream by applying reduceByKey to each RDD.
reduceByKey(Function2<V, V, V>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
Return a new DStream by applying reduceByKey to each RDD.
reduceByKey(Function2<V, V, V>, int) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
Return a new DStream by applying reduceByKey to each RDD.
reduceByKey(Function2<V, V, V>, Partitioner) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
Return a new DStream by applying reduceByKey to each RDD.
reduceByKeyAndWindow(Function2<V, V, V>, Duration) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
Create a new DStream by applying reduceByKey over a sliding window on this DStream.
reduceByKeyAndWindow(Function2<V, V, V>, Duration, Duration) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
Return a new DStream by applying reduceByKey over a sliding window.
reduceByKeyAndWindow(Function2<V, V, V>, Duration, Duration, int) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
Return a new DStream by applying reduceByKey over a sliding window.
reduceByKeyAndWindow(Function2<V, V, V>, Duration, Duration, Partitioner) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
Return a new DStream by applying reduceByKey over a sliding window.
reduceByKeyAndWindow(Function2<V, V, V>, Function2<V, V, V>, Duration, Duration) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
Return a new DStream by reducing over a sliding window using incremental computation.
reduceByKeyAndWindow(Function2<V, V, V>, Function2<V, V, V>, Duration, Duration, int, Function<Tuple2<K, V>, Boolean>) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
Return a new DStream by applying incremental reduceByKey over a sliding window.
reduceByKeyAndWindow(Function2<V, V, V>, Function2<V, V, V>, Duration, Duration, Partitioner, Function<Tuple2<K, V>, Boolean>) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
Return a new DStream by applying incremental reduceByKey over a sliding window.
reduceByKeyAndWindow(Function2<V, V, V>, Duration) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
Return a new DStream by applying reduceByKey over a sliding window on this DStream.
reduceByKeyAndWindow(Function2<V, V, V>, Duration, Duration) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
Return a new DStream by applying reduceByKey over a sliding window.
reduceByKeyAndWindow(Function2<V, V, V>, Duration, Duration, int) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
Return a new DStream by applying reduceByKey over a sliding window.
reduceByKeyAndWindow(Function2<V, V, V>, Duration, Duration, Partitioner) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
Return a new DStream by applying reduceByKey over a sliding window.
reduceByKeyAndWindow(Function2<V, V, V>, Function2<V, V, V>, Duration, Duration, int, Function1<Tuple2<K, V>, Object>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
Return a new DStream by applying incremental reduceByKey over a sliding window.
reduceByKeyAndWindow(Function2<V, V, V>, Function2<V, V, V>, Duration, Duration, Partitioner, Function1<Tuple2<K, V>, Object>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
Return a new DStream by applying incremental reduceByKey over a sliding window.
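A sketch of the incremental (inverse-function) variant; `wordPairs` is an assumed DStream[(String, Int)], and checkpointing must be enabled for this form:

```scala
import org.apache.spark.streaming.Seconds

val windowed = wordPairs.reduceByKeyAndWindow(
  (a: Int, b: Int) => a + b, // fold in batches entering the window
  (a: Int, b: Int) => a - b, // subtract batches leaving the window
  Seconds(30),               // window duration
  Seconds(10))               // slide duration
```

Only the batches entering and leaving the window are reduced on each slide, rather than the whole window, which is why the inverse function is required.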
reduceByKeyLocally(Function2<V, V, V>) - Method in class org.apache.spark.api.java.JavaPairRDD
Merge the values for each key using an associative reduce function, but return the results immediately to the master as a Map.
reduceByKeyLocally(Function2<V, V, V>) - Method in class org.apache.spark.rdd.PairRDDFunctions
Merge the values for each key using an associative reduce function, but return the results immediately to the master as a Map.
reduceByKeyToDriver(Function2<V, V, V>) - Method in class org.apache.spark.rdd.PairRDDFunctions
Alias for reduceByKeyLocally.
reduceByWindow(Function2<T, T, T>, Duration, Duration) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
Return a new DStream in which each RDD has a single element generated by reducing all elements in a sliding window over this DStream.
reduceByWindow(Function2<T, T, T>, Function2<T, T, T>, Duration, Duration) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
Return a new DStream in which each RDD has a single element generated by reducing all elements in a sliding window over this DStream.
reduceByWindow(Function2<T, T, T>, Duration, Duration) - Method in class org.apache.spark.streaming.dstream.DStream
Return a new DStream in which each RDD has a single element generated by reducing all elements in a sliding window over this DStream.
reduceByWindow(Function2<T, T, T>, Function2<T, T, T>, Duration, Duration) - Method in class org.apache.spark.streaming.dstream.DStream
Return a new DStream in which each RDD has a single element generated by reducing all elements in a sliding window over this DStream.
reducedStream() - Method in class org.apache.spark.streaming.dstream.ReducedWindowedDStream
 
ReducedWindowedDStream<K,V> - Class in org.apache.spark.streaming.dstream
 
ReducedWindowedDStream(DStream<Tuple2<K, V>>, Function2<V, V, V>, Function2<V, V, V>, Option<Function1<Tuple2<K, V>, Object>>, Duration, Duration, Partitioner, ClassTag<K>, ClassTag<V>) - Constructor for class org.apache.spark.streaming.dstream.ReducedWindowedDStream
 
reduceId() - Method in class org.apache.spark.FetchFailed
 
reduceId() - Method in class org.apache.spark.storage.ShuffleBlockId
 
reduceId() - Method in class org.apache.spark.storage.ShuffleDataBlockId
 
reduceId() - Method in class org.apache.spark.storage.ShuffleIndexBlockId
 
REGEX() - Static method in class org.apache.spark.streaming.Checkpoint
 
REGEXP() - Static method in class org.apache.spark.sql.hive.HiveQl
 
register(Accumulable<?, ?>, boolean) - Static method in class org.apache.spark.Accumulators
 
register() - Method in class org.apache.spark.streaming.dstream.DStream
Register this DStream as an output stream.
register(Logger) - Static method in class org.apache.spark.util.SignalLogger
Register a signal handler to log signals on UNIX-like systems.
registerAsTable(String) - Method in interface org.apache.spark.sql.SchemaRDDLike
 
registerBlockManager(BlockManagerId, long, ActorRef) - Method in class org.apache.spark.storage.BlockManagerMaster
Register the BlockManager's id with the driver.
registerBroadcastForCleanup(Broadcast<T>) - Method in class org.apache.spark.ContextCleaner
Register a Broadcast for cleanup when it is garbage collected.
registerClasses(Kryo) - Method in class org.apache.spark.graphx.GraphKryoRegistrator
 
registerClasses(Kryo) - Method in interface org.apache.spark.serializer.KryoRegistrator
 
registered(SchedulerDriver, Protos.FrameworkID, Protos.MasterInfo) - Method in class org.apache.spark.scheduler.cluster.mesos.CoarseMesosSchedulerBackend
 
registered(SchedulerDriver, Protos.FrameworkID, Protos.MasterInfo) - Method in class org.apache.spark.scheduler.cluster.mesos.MesosSchedulerBackend
 
registeredLock() - Method in class org.apache.spark.scheduler.cluster.mesos.CoarseMesosSchedulerBackend
 
registeredLock() - Method in class org.apache.spark.scheduler.cluster.mesos.MesosSchedulerBackend
 
registerFunction(String, UDF1<?, ?>, DataType) - Method in interface org.apache.spark.sql.api.java.UDFRegistration
 
registerFunction(String, UDF2<?, ?, ?>, DataType) - Method in interface org.apache.spark.sql.api.java.UDFRegistration
 
registerFunction(String, UDF3<?, ?, ?, ?>, DataType) - Method in interface org.apache.spark.sql.api.java.UDFRegistration
 
registerFunction(String, UDF4<?, ?, ?, ?, ?>, DataType) - Method in interface org.apache.spark.sql.api.java.UDFRegistration
 
registerFunction(String, UDF5<?, ?, ?, ?, ?, ?>, DataType) - Method in interface org.apache.spark.sql.api.java.UDFRegistration
 
registerFunction(String, UDF6<?, ?, ?, ?, ?, ?, ?>, DataType) - Method in interface org.apache.spark.sql.api.java.UDFRegistration
 
registerFunction(String, UDF7<?, ?, ?, ?, ?, ?, ?, ?>, DataType) - Method in interface org.apache.spark.sql.api.java.UDFRegistration
 
registerFunction(String, UDF8<?, ?, ?, ?, ?, ?, ?, ?, ?>, DataType) - Method in interface org.apache.spark.sql.api.java.UDFRegistration
 
registerFunction(String, UDF9<?, ?, ?, ?, ?, ?, ?, ?, ?, ?>, DataType) - Method in interface org.apache.spark.sql.api.java.UDFRegistration
 
registerFunction(String, UDF10<?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?>, DataType) - Method in interface org.apache.spark.sql.api.java.UDFRegistration
 
registerFunction(String, UDF11<?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?>, DataType) - Method in interface org.apache.spark.sql.api.java.UDFRegistration
 
registerFunction(String, UDF12<?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?>, DataType) - Method in interface org.apache.spark.sql.api.java.UDFRegistration
 
registerFunction(String, UDF13<?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?>, DataType) - Method in interface org.apache.spark.sql.api.java.UDFRegistration
 
registerFunction(String, UDF14<?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?>, DataType) - Method in interface org.apache.spark.sql.api.java.UDFRegistration
 
registerFunction(String, UDF15<?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?>, DataType) - Method in interface org.apache.spark.sql.api.java.UDFRegistration
 
registerFunction(String, UDF16<?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?>, DataType) - Method in interface org.apache.spark.sql.api.java.UDFRegistration
 
registerFunction(String, UDF17<?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?>, DataType) - Method in interface org.apache.spark.sql.api.java.UDFRegistration
 
registerFunction(String, UDF18<?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?>, DataType) - Method in interface org.apache.spark.sql.api.java.UDFRegistration
 
registerFunction(String, UDF19<?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?>, DataType) - Method in interface org.apache.spark.sql.api.java.UDFRegistration
 
registerFunction(String, UDF20<?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?>, DataType) - Method in interface org.apache.spark.sql.api.java.UDFRegistration
 
registerFunction(String, UDF21<?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?>, DataType) - Method in interface org.apache.spark.sql.api.java.UDFRegistration
 
registerFunction(String, UDF22<?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?>, DataType) - Method in interface org.apache.spark.sql.api.java.UDFRegistration
 
registerFunction(String, Function1<?, T>, TypeTags.TypeTag<T>) - Method in interface org.apache.spark.sql.UDFRegistration
The registerFunction variants for 1 to 22 arguments were generated by a script.
registerFunction(String, Function2<?, ?, T>, TypeTags.TypeTag<T>) - Method in interface org.apache.spark.sql.UDFRegistration
 
registerFunction(String, Function3<?, ?, ?, T>, TypeTags.TypeTag<T>) - Method in interface org.apache.spark.sql.UDFRegistration
 
registerFunction(String, Function4<?, ?, ?, ?, T>, TypeTags.TypeTag<T>) - Method in interface org.apache.spark.sql.UDFRegistration
 
registerFunction(String, Function5<?, ?, ?, ?, ?, T>, TypeTags.TypeTag<T>) - Method in interface org.apache.spark.sql.UDFRegistration
 
registerFunction(String, Function6<?, ?, ?, ?, ?, ?, T>, TypeTags.TypeTag<T>) - Method in interface org.apache.spark.sql.UDFRegistration
 
registerFunction(String, Function7<?, ?, ?, ?, ?, ?, ?, T>, TypeTags.TypeTag<T>) - Method in interface org.apache.spark.sql.UDFRegistration
 
registerFunction(String, Function8<?, ?, ?, ?, ?, ?, ?, ?, T>, TypeTags.TypeTag<T>) - Method in interface org.apache.spark.sql.UDFRegistration
 
registerFunction(String, Function9<?, ?, ?, ?, ?, ?, ?, ?, ?, T>, TypeTags.TypeTag<T>) - Method in interface org.apache.spark.sql.UDFRegistration
 
registerFunction(String, Function10<?, ?, ?, ?, ?, ?, ?, ?, ?, ?, T>, TypeTags.TypeTag<T>) - Method in interface org.apache.spark.sql.UDFRegistration
 
registerFunction(String, Function11<?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, T>, TypeTags.TypeTag<T>) - Method in interface org.apache.spark.sql.UDFRegistration
 
registerFunction(String, Function12<?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, T>, TypeTags.TypeTag<T>) - Method in interface org.apache.spark.sql.UDFRegistration
 
registerFunction(String, Function13<?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, T>, TypeTags.TypeTag<T>) - Method in interface org.apache.spark.sql.UDFRegistration
 
registerFunction(String, Function14<?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, T>, TypeTags.TypeTag<T>) - Method in interface org.apache.spark.sql.UDFRegistration
 
registerFunction(String, Function15<?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, T>, TypeTags.TypeTag<T>) - Method in interface org.apache.spark.sql.UDFRegistration
 
registerFunction(String, Function16<?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, T>, TypeTags.TypeTag<T>) - Method in interface org.apache.spark.sql.UDFRegistration
 
registerFunction(String, Function17<?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, T>, TypeTags.TypeTag<T>) - Method in interface org.apache.spark.sql.UDFRegistration
 
registerFunction(String, Function18<?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, T>, TypeTags.TypeTag<T>) - Method in interface org.apache.spark.sql.UDFRegistration
 
registerFunction(String, Function19<?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, T>, TypeTags.TypeTag<T>) - Method in interface org.apache.spark.sql.UDFRegistration
 
registerFunction(String, Function20<?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, T>, TypeTags.TypeTag<T>) - Method in interface org.apache.spark.sql.UDFRegistration
 
registerFunction(String, Function21<?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, T>, TypeTags.TypeTag<T>) - Method in interface org.apache.spark.sql.UDFRegistration
 
registerFunction(String, Function22<?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, T>, TypeTags.TypeTag<T>) - Method in interface org.apache.spark.sql.UDFRegistration
 
registerKryoClasses(SparkConf) - Static method in class org.apache.spark.graphx.GraphXUtils
Registers classes that GraphX uses with Kryo.
registerKryoClasses(Class<?>[]) - Method in class org.apache.spark.SparkConf
Use Kryo serialization and register the given set of classes with Kryo.
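A short sketch; MyRecord is a hypothetical user class:

```scala
import org.apache.spark.SparkConf

val conf = new SparkConf()
  .setAppName("kryo-example")
  // Registering avoids storing full class names with every serialized object.
  .registerKryoClasses(Array(classOf[MyRecord]))
// This call also switches spark.serializer to the Kryo serializer.
```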
registerMapOutput(int, int, MapStatus) - Method in class org.apache.spark.MapOutputTrackerMaster
 
registerMapOutputs(int, MapStatus[], boolean) - Method in class org.apache.spark.MapOutputTrackerMaster
Register map output information for multiple map tasks of the given shuffle.
registerPython(String, byte[], Map<String, String>, List<String>, String, List<Broadcast<PythonBroadcast>>, Accumulator<List<byte[]>>, String) - Method in interface org.apache.spark.sql.UDFRegistration
 
registerRDDAsTable(JavaSchemaRDD, String) - Method in class org.apache.spark.sql.api.java.JavaSQLContext
Registers the given RDD as a temporary table in the catalog.
registerRDDAsTable(SchemaRDD, String) - Method in class org.apache.spark.sql.SQLContext
Registers the given RDD as a temporary table in the catalog.
registerRDDForCleanup(RDD<?>) - Method in class org.apache.spark.ContextCleaner
Register an RDD for cleanup when it is garbage collected.
RegisterReceiver - Class in org.apache.spark.streaming.scheduler
 
RegisterReceiver(int, String, String, ActorRef) - Constructor for class org.apache.spark.streaming.scheduler.RegisterReceiver
 
registerShuffle(int, int) - Method in class org.apache.spark.MapOutputTrackerMaster
 
registerShuffleForCleanup(ShuffleDependency<?, ?, ?>) - Method in class org.apache.spark.ContextCleaner
Register a ShuffleDependency for cleanup when it is garbage collected.
registerShutdownDeleteDir(File) - Static method in class org.apache.spark.util.Utils
 
registerShutdownDeleteDir(TachyonFile) - Static method in class org.apache.spark.util.Utils
 
registerSource(Source) - Method in class org.apache.spark.metrics.MetricsSystem
 
registerTable(Seq<String>, LogicalPlan) - Method in class org.apache.spark.sql.hive.HiveMetastoreCatalog
UNIMPLEMENTED: It needs to be decided how we will persist in-memory tables to the metastore.
registerTempTable(String) - Method in interface org.apache.spark.sql.SchemaRDDLike
Registers this RDD as a temporary table using the given name.
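A sketch of temp-table registration in this era of the SQL API, assuming a SQLContext `sqlContext` and a SparkContext `sc`:

```scala
// The case class must be defined at top level for schema inference.
case class Person(name: String, age: Int)
import sqlContext.createSchemaRDD // implicit RDD[Person] -> SchemaRDD

val people = sc.parallelize(Seq(Person("Alice", 30), Person("Bob", 15)))
people.registerTempTable("people")
val adults = sqlContext.sql("SELECT name FROM people WHERE age >= 18")
```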
registerTestTable(TestHiveContext.TestTable) - Method in class org.apache.spark.sql.hive.test.TestHiveContext
 
registrationDone() - Method in class org.apache.spark.scheduler.cluster.SparkDeploySchedulerBackend
 
registrationLock() - Method in class org.apache.spark.scheduler.cluster.SparkDeploySchedulerBackend
 
registry() - Method in class org.apache.spark.metrics.sink.ConsoleSink
 
registry() - Method in class org.apache.spark.metrics.sink.CsvSink
 
registry() - Method in class org.apache.spark.metrics.sink.GraphiteSink
 
registry() - Method in class org.apache.spark.metrics.sink.JmxSink
 
registry() - Method in class org.apache.spark.metrics.sink.MetricsServlet
 
regParam() - Method in interface org.apache.spark.ml.param.HasRegParam
Param for the regularization parameter.
Regression() - Static method in class org.apache.spark.mllib.tree.configuration.Algo
 
RegressionMetrics - Class in org.apache.spark.mllib.evaluation
:: Experimental :: Evaluator for regression.
RegressionMetrics(RDD<Tuple2<Object, Object>>) - Constructor for class org.apache.spark.mllib.evaluation.RegressionMetrics
 
RegressionModel - Interface in org.apache.spark.mllib.regression
 
reindex() - Method in class org.apache.spark.graphx.impl.VertexPartitionBaseOps
Construct a new VertexPartition whose index contains only the vertices in the mask.
reindex() - Method in class org.apache.spark.graphx.impl.VertexRDDImpl
 
reindex() - Method in class org.apache.spark.graphx.VertexRDD
Construct a new VertexRDD that is indexed by only the visible vertices.
relation() - Method in class org.apache.spark.sql.columnar.InMemoryColumnarTableScan
 
relation() - Method in class org.apache.spark.sql.hive.execution.HiveTableScan
 
relation() - Method in class org.apache.spark.sql.parquet.InsertIntoParquetTable
 
relation() - Method in class org.apache.spark.sql.parquet.ParquetTableScan
 
relation() - Method in class org.apache.spark.sql.sources.LogicalRelation
 
RelationProvider - Interface in org.apache.spark.sql.sources
:: DeveloperApi :: Implemented by objects that produce relations for a specific kind of data source.
relativeDirection(long) - Method in class org.apache.spark.graphx.Edge
Return the relative direction of the edge to the corresponding vertex.
releasePythonWorker(String, Map<String, String>, Socket) - Method in class org.apache.spark.SparkEnv
 
releaseUnrollMemoryForThisThread(long) - Method in class org.apache.spark.storage.MemoryStore
Release memory used by this thread for unrolling blocks.
ReliableKafkaReceiver<K,V,U extends kafka.serializer.Decoder<?>,T extends kafka.serializer.Decoder<?>> - Class in org.apache.spark.streaming.kafka
ReliableKafkaReceiver offers the ability to reliably store data into BlockManager without loss.
ReliableKafkaReceiver(Map<String, String>, Map<String, Object>, StorageLevel, ClassTag<K>, ClassTag<V>, ClassTag<U>, ClassTag<T>) - Constructor for class org.apache.spark.streaming.kafka.ReliableKafkaReceiver
 
remainingMem() - Method in class org.apache.spark.storage.BlockManagerInfo
 
remember(Duration) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
Sets each DStream in this context to remember the RDDs it generated in the last given duration.
remember(Duration) - Method in class org.apache.spark.streaming.dstream.DStream
 
remember(Duration) - Method in class org.apache.spark.streaming.DStreamGraph
 
remember(Duration) - Method in class org.apache.spark.streaming.StreamingContext
Set each DStream in this context to remember the RDDs it generated in the last given duration.
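A one-liner, assuming a StreamingContext `ssc`:

```scala
import org.apache.spark.streaming.Minutes

// Keep each DStream's generated RDDs for at least a minute, e.g. so they
// can still be queried after their batch has completed.
ssc.remember(Minutes(1))
```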
rememberDuration() - Method in class org.apache.spark.streaming.dstream.DStream
 
rememberDuration() - Method in class org.apache.spark.streaming.DStreamGraph
 
remove(String) - Method in class org.apache.spark.SparkConf
Remove a parameter from the configuration.
remove(BlockId) - Method in class org.apache.spark.storage.BlockStore
Remove a block, if it exists.
remove(BlockId) - Method in class org.apache.spark.storage.DiskStore
 
remove(BlockId) - Method in class org.apache.spark.storage.MemoryStore
 
remove(BlockId) - Method in class org.apache.spark.storage.TachyonStore
 
removeBlock(BlockId, boolean) - Method in class org.apache.spark.storage.BlockManager
Remove a block from both memory and disk.
removeBlock(BlockId) - Method in class org.apache.spark.storage.BlockManagerInfo
 
removeBlock(BlockId) - Method in class org.apache.spark.storage.BlockManagerMaster
Remove a block from the slaves that have it.
removeBlock(BlockId) - Method in class org.apache.spark.storage.StorageStatus
Remove the given block from this storage status.
removeBlocks() - Method in class org.apache.spark.rdd.BlockRDD
Remove the data blocks that this BlockRDD is made from.
removeBroadcast(long, boolean) - Method in class org.apache.spark.storage.BlockManager
Remove all blocks belonging to the given broadcast.
removeBroadcast(long, boolean, boolean) - Method in class org.apache.spark.storage.BlockManagerMaster
Remove all blocks belonging to the given broadcast.
removeExecutor(String, String) - Method in class org.apache.spark.scheduler.cluster.CoarseGrainedSchedulerBackend.DriverActor
 
removeExecutor(String, String) - Method in class org.apache.spark.scheduler.cluster.CoarseGrainedSchedulerBackend
 
removeExecutor(String) - Method in class org.apache.spark.storage.BlockManagerMaster
Remove a dead executor from the driver actor.
removeFile(TachyonFile) - Method in class org.apache.spark.storage.TachyonBlockManager
 
removeFromDriver() - Method in class org.apache.spark.storage.BlockManagerMessages.RemoveBroadcast
 
removeOutputLoc(int, BlockManagerId) - Method in class org.apache.spark.scheduler.Stage
 
removeOutputsOnExecutor(String) - Method in class org.apache.spark.scheduler.Stage
Removes all shuffle outputs associated with this executor.
removeRdd(int) - Method in class org.apache.spark.storage.BlockManager
Remove all blocks belonging to the given RDD.
removeRdd(int, boolean) - Method in class org.apache.spark.storage.BlockManagerMaster
Remove all blocks belonging to the given RDD.
removeRunningTask(long) - Method in class org.apache.spark.scheduler.TaskSetManager
If the given task ID is in the set of running tasks, removes it.
removeSchedulable(Schedulable) - Method in class org.apache.spark.scheduler.Pool
 
removeSchedulable(Schedulable) - Method in interface org.apache.spark.scheduler.Schedulable
 
removeSchedulable(Schedulable) - Method in class org.apache.spark.scheduler.TaskSetManager
 
removeShuffle(int, boolean) - Method in class org.apache.spark.storage.BlockManagerMaster
Remove all blocks belonging to the given shuffle.
removeSource(Source) - Method in class org.apache.spark.metrics.MetricsSystem
 
render(HttpServletRequest) - Method in class org.apache.spark.streaming.ui.StreamingPage
Render the page.
render(HttpServletRequest) - Method in class org.apache.spark.ui.env.EnvironmentPage
 
render(HttpServletRequest) - Method in class org.apache.spark.ui.exec.ExecutorsPage
 
render(HttpServletRequest) - Method in class org.apache.spark.ui.exec.ExecutorThreadDumpPage
 
render(HttpServletRequest) - Method in class org.apache.spark.ui.jobs.AllJobsPage
 
render(HttpServletRequest) - Method in class org.apache.spark.ui.jobs.AllStagesPage
 
render(HttpServletRequest) - Method in class org.apache.spark.ui.jobs.JobPage
 
render(HttpServletRequest) - Method in class org.apache.spark.ui.jobs.PoolPage
 
render(HttpServletRequest) - Method in class org.apache.spark.ui.jobs.StagePage
 
render(HttpServletRequest) - Method in class org.apache.spark.ui.storage.RDDPage
 
render(HttpServletRequest) - Method in class org.apache.spark.ui.storage.StoragePage
 
render(HttpServletRequest) - Method in class org.apache.spark.ui.WebUIPage
 
renderJson(HttpServletRequest) - Method in class org.apache.spark.ui.WebUIPage
 
repartition(int) - Method in class org.apache.spark.api.java.JavaDoubleRDD
Return a new RDD that has exactly numPartitions partitions.
repartition(int) - Method in class org.apache.spark.api.java.JavaPairRDD
Return a new RDD that has exactly numPartitions partitions.
repartition(int) - Method in class org.apache.spark.api.java.JavaRDD
Return a new RDD that has exactly numPartitions partitions.
repartition(int, Ordering<T>) - Method in class org.apache.spark.rdd.RDD
Return a new RDD that has exactly numPartitions partitions.
repartition(int) - Method in class org.apache.spark.sql.api.java.JavaSchemaRDD
Return a new RDD that has exactly numPartitions partitions.
repartition(int, Ordering<Row>) - Method in class org.apache.spark.sql.SchemaRDD
 
repartition(int) - Method in class org.apache.spark.streaming.api.java.JavaDStream
Return a new DStream with an increased or decreased level of parallelism.
repartition(int) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
Return a new DStream with an increased or decreased level of parallelism.
repartition(int) - Method in class org.apache.spark.streaming.dstream.DStream
Return a new DStream with an increased or decreased level of parallelism.
repartitionAndSortWithinPartitions(Partitioner) - Method in class org.apache.spark.api.java.JavaPairRDD
Repartition the RDD according to the given partitioner and, within each resulting partition, sort records by their keys.
repartitionAndSortWithinPartitions(Partitioner, Comparator<K>) - Method in class org.apache.spark.api.java.JavaPairRDD
Repartition the RDD according to the given partitioner and, within each resulting partition, sort records by their keys.
repartitionAndSortWithinPartitions(Partitioner) - Method in class org.apache.spark.rdd.OrderedRDDFunctions
Repartition the RDD according to the given partitioner and, within each resulting partition, sort records by their keys.
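A sketch, assuming a SparkContext `sc`; in this API version the method comes in via the rddToOrderedRDDFunctions implicit:

```scala
import org.apache.spark.HashPartitioner
import org.apache.spark.SparkContext._ // brings in rddToOrderedRDDFunctions

val pairs = sc.parallelize(Seq((3, "c"), (1, "a"), (2, "b")))
// Cheaper than repartition() followed by sortByKey(): each output
// partition is written in key order directly during the shuffle.
val sorted = pairs.repartitionAndSortWithinPartitions(new HashPartitioner(2))
```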
replay() - Method in class org.apache.spark.scheduler.ReplayListenerBus
Replay each event in the order maintained in the given logs.
ReplayListenerBus - Class in org.apache.spark.scheduler
A SparkListenerBus that replays logged events from persisted storage.
ReplayListenerBus(Seq<Path>, FileSystem, Option<CompressionCodec>) - Constructor for class org.apache.spark.scheduler.ReplayListenerBus
 
replicatedVertexView() - Method in class org.apache.spark.graphx.impl.GraphImpl
 
ReplicatedVertexView<VD,ED> - Class in org.apache.spark.graphx.impl
Manages shipping vertex attributes to the edge partitions of an EdgeRDD.
ReplicatedVertexView(EdgeRDDImpl<ED, VD>, boolean, boolean, ClassTag<VD>, ClassTag<ED>) - Constructor for class org.apache.spark.graphx.impl.ReplicatedVertexView
 
replication() - Method in class org.apache.spark.storage.StorageLevel
 
report() - Method in class org.apache.spark.metrics.MetricsSystem
 
report() - Method in class org.apache.spark.metrics.sink.ConsoleSink
 
report() - Method in class org.apache.spark.metrics.sink.CsvSink
 
report() - Method in class org.apache.spark.metrics.sink.GraphiteSink
 
report() - Method in class org.apache.spark.metrics.sink.JmxSink
 
report() - Method in class org.apache.spark.metrics.sink.MetricsServlet
 
report() - Method in interface org.apache.spark.metrics.sink.Sink
 
reporter() - Method in class org.apache.spark.metrics.sink.ConsoleSink
 
reporter() - Method in class org.apache.spark.metrics.sink.CsvSink
 
reporter() - Method in class org.apache.spark.metrics.sink.GraphiteSink
 
reporter() - Method in class org.apache.spark.metrics.sink.JmxSink
 
reportError(String, Throwable) - Method in class org.apache.spark.streaming.receiver.Receiver
Report exceptions in receiving data.
reportError(String, Throwable) - Method in class org.apache.spark.streaming.receiver.ReceiverSupervisor
Report errors.
reportError(String, Throwable) - Method in class org.apache.spark.streaming.receiver.ReceiverSupervisorImpl
Report an error to the receiver tracker.
reportError(String, Throwable) - Method in class org.apache.spark.streaming.scheduler.JobScheduler
 
ReportError - Class in org.apache.spark.streaming.scheduler
 
ReportError(int, String, String) - Constructor for class org.apache.spark.streaming.scheduler.ReportError
 
requestedAttributes() - Method in class org.apache.spark.sql.hive.execution.HiveTableScan
 
requestedPartitionOrdinals() - Method in class org.apache.spark.sql.parquet.ParquetTableScan
 
requestedTotal() - Method in class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.RequestExecutors
 
requestExecutors(int) - Method in interface org.apache.spark.ExecutorAllocationClient
Request an additional number of executors from the cluster manager.
requestExecutors(int) - Method in class org.apache.spark.scheduler.cluster.CoarseGrainedSchedulerBackend
Request an additional number of executors from the cluster manager.
requestExecutors(int) - Method in class org.apache.spark.SparkContext
:: DeveloperApi :: Request an additional number of executors from the cluster manager.
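A sketch, assuming a SparkContext `sc` running under a coarse-grained cluster manager (e.g. YARN); this is a developer API and may have no effect elsewhere:

```scala
// Returns true if the request was forwarded to the cluster manager.
val ok = sc.requestExecutors(2)
```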
requiredChildDistribution() - Method in class org.apache.spark.sql.execution.Aggregate
 
requiredChildDistribution() - Method in class org.apache.spark.sql.execution.Distinct
 
requiredChildDistribution() - Method in class org.apache.spark.sql.execution.ExternalSort
 
requiredChildDistribution() - Method in class org.apache.spark.sql.execution.GeneratedAggregate
 
requiredChildDistribution() - Method in class org.apache.spark.sql.execution.joins.BroadcastHashJoin
 
requiredChildDistribution() - Method in class org.apache.spark.sql.execution.joins.HashOuterJoin
 
requiredChildDistribution() - Method in class org.apache.spark.sql.execution.joins.LeftSemiJoinHash
 
requiredChildDistribution() - Method in class org.apache.spark.sql.execution.joins.ShuffledHashJoin
 
requiredChildDistribution() - Method in class org.apache.spark.sql.execution.Sort
 
requiredChildDistribution() - Method in class org.apache.spark.sql.execution.SparkPlan
Specifies any partition requirements on the input data for this operator.
reregister() - Method in class org.apache.spark.storage.BlockManager
Re-register with the master and report all blocks to it.
reregisterBlockManager() - Method in class org.apache.spark.HeartbeatResponse
 
reregistered(SchedulerDriver, Protos.MasterInfo) - Method in class org.apache.spark.scheduler.cluster.mesos.CoarseMesosSchedulerBackend
 
reregistered(SchedulerDriver, Protos.MasterInfo) - Method in class org.apache.spark.scheduler.cluster.mesos.MesosSchedulerBackend
 
res() - Method in class org.apache.spark.mllib.optimization.NNLS.Workspace
 
reservedSizeBytes() - Static method in class org.apache.spark.util.AkkaUtils
Space reserved for extra data in an Akka message besides serialized task or task result.
reserveUnrollMemoryForThisThread(long) - Method in class org.apache.spark.storage.MemoryStore
Reserve additional memory for unrolling blocks used by this thread.
reservoirSampleAndCount(Iterator<T>, int, long, ClassTag<T>) - Static method in class org.apache.spark.util.random.SamplingUtils
Reservoir sampling implementation that also returns the input size.
reset() - Method in class org.apache.spark.sql.hive.test.TestHiveContext
Resets the test instance by deleting any tables that have been created.
resetIterator() - Method in class org.apache.spark.rdd.PartitionCoalescer.LocationIterator
 
resolveClass(ObjectStreamClass) - Method in class org.apache.spark.streaming.ObjectInputStreamWithLoader
 
resolved() - Method in class org.apache.spark.sql.hive.InsertIntoHiveTable
 
resolveURI(String, boolean) - Static method in class org.apache.spark.util.Utils
Return a well-formed URI for the file described by a user input string.
resolveURIs(String, boolean) - Static method in class org.apache.spark.util.Utils
Resolve a comma-separated list of paths.
resourceOffer(String, String, Enumeration.Value) - Method in class org.apache.spark.scheduler.TaskSetManager
Respond to an offer of a single executor from the scheduler by finding a task.
resourceOffers(SchedulerDriver, List<Protos.Offer>) - Method in class org.apache.spark.scheduler.cluster.mesos.CoarseMesosSchedulerBackend
Method called by Mesos to offer resources on slaves.
resourceOffers(SchedulerDriver, List<Protos.Offer>) - Method in class org.apache.spark.scheduler.cluster.mesos.MesosSchedulerBackend
Method called by Mesos to offer resources on slaves.
resourceOffers(Seq<WorkerOffer>) - Method in class org.apache.spark.scheduler.TaskSchedulerImpl
Called by the cluster manager to offer resources on slaves.
resourcePool() - Static method in class org.apache.spark.sql.execution.SparkSqlSerializer
 
responder() - Method in class org.apache.spark.streaming.flume.FlumeReceiver
 
responder() - Method in class org.apache.spark.ui.JettyUtils.ServletParams
 
restart(Time) - Method in class org.apache.spark.streaming.DStreamGraph
 
restart(String) - Method in class org.apache.spark.streaming.receiver.Receiver
Restart the receiver.
restart(String, Throwable) - Method in class org.apache.spark.streaming.receiver.Receiver
Restart the receiver.
restart(String, Throwable, int) - Method in class org.apache.spark.streaming.receiver.Receiver
Restart the receiver.
restartReceiver(String, Option<Throwable>) - Method in class org.apache.spark.streaming.receiver.ReceiverSupervisor
Restart the receiver with a delay.
restartReceiver(String, Option<Throwable>, int) - Method in class org.apache.spark.streaming.receiver.ReceiverSupervisor
Restart the receiver with a delay.
restore() - Method in class org.apache.spark.streaming.dstream.DStreamCheckpointData
Restore the checkpoint data.
restore() - Method in class org.apache.spark.streaming.dstream.FileInputDStream.FileInputDStreamCheckpointData
 
restoreCheckpointData() - Method in class org.apache.spark.streaming.dstream.DStream
Restore the RDDs in generatedRDDs from the checkpointData.
restoreCheckpointData() - Method in class org.apache.spark.streaming.DStreamGraph
 
RESUBMIT_TIMEOUT() - Static method in class org.apache.spark.scheduler.DAGScheduler
 
resubmitFailedStages() - Method in class org.apache.spark.scheduler.DAGScheduler
Resubmit any failed stages.
ResubmitFailedStages - Class in org.apache.spark.scheduler
 
ResubmitFailedStages() - Constructor for class org.apache.spark.scheduler.ResubmitFailedStages
 
Resubmitted - Class in org.apache.spark
:: DeveloperApi :: A ShuffleMapTask that completed successfully earlier, but we lost the executor before the stage completed.
Resubmitted() - Constructor for class org.apache.spark.Resubmitted
 
result(Duration, CanAwait) - Method in class org.apache.spark.ComplexFutureAction
 
result(Duration, CanAwait) - Method in interface org.apache.spark.FutureAction
Awaits and returns the result (of type T) of this action.
result() - Method in class org.apache.spark.scheduler.CompletionEvent
 
result(Duration, CanAwait) - Method in class org.apache.spark.SimpleFutureAction
 
result() - Method in class org.apache.spark.sql.execution.AggregateEvaluation
 
result() - Method in class org.apache.spark.streaming.scheduler.Job
 
RESULT_SERIALIZATION_TIME() - Static method in class org.apache.spark.ui.jobs.TaskDetailsClassNames
 
RESULT_SERIALIZATION_TIME() - Static method in class org.apache.spark.ui.ToolTips
 
resultAttribute() - Method in class org.apache.spark.sql.execution.Aggregate.ComputedAggregate
 
resultAttribute() - Method in class org.apache.spark.sql.execution.EvaluatePython
 
resultObject() - Method in class org.apache.spark.partial.ApproximateActionListener
 
resultOfJob() - Method in class org.apache.spark.scheduler.Stage
For stages that are final (consisting only of ResultTasks), the link to the ActiveJob.
resultSetToObjectArray(ResultSet) - Static method in class org.apache.spark.rdd.JdbcRDD
 
ResultTask<T,U> - Class in org.apache.spark.scheduler
A task that sends back the output to the driver application.
ResultTask(int, Broadcast<byte[]>, Partition, Seq<TaskLocation>, int) - Constructor for class org.apache.spark.scheduler.ResultTask
 
ResultWithDroppedBlocks - Class in org.apache.spark.storage
 
ResultWithDroppedBlocks(boolean, Seq<Tuple2<BlockId, BlockStatus>>) - Constructor for class org.apache.spark.storage.ResultWithDroppedBlocks
 
retag(Class<T>) - Method in class org.apache.spark.rdd.RDD
Private API for changing an RDD's ClassTag.
retag(ClassTag<T>) - Method in class org.apache.spark.rdd.RDD
Private API for changing an RDD's ClassTag.
RETAINED_FILES_PROPERTY() - Static method in class org.apache.spark.util.logging.RollingFileAppender
 
retainedCompletedBatches() - Method in class org.apache.spark.streaming.ui.StreamingJobProgressListener
 
retainedJobs() - Method in class org.apache.spark.ui.jobs.JobProgressListener
 
retainedStages() - Method in class org.apache.spark.ui.jobs.JobProgressListener
 
retryRandom(Function0<T>, int, int) - Static method in class org.apache.spark.streaming.kinesis.KinesisRecordProcessor
Retry the given number of times with a random backoff time (in milliseconds) less than the given maxBackOffMillis.
retryWaitMs(SparkConf) - Static method in class org.apache.spark.util.AkkaUtils
Returns the configured number of milliseconds to wait on each retry.
returnInspector() - Method in class org.apache.spark.sql.hive.HiveSimpleUdf
 
ReturnStatementFinder - Class in org.apache.spark.util
 
ReturnStatementFinder() - Constructor for class org.apache.spark.util.ReturnStatementFinder
 
reverse() - Method in class org.apache.spark.graphx.EdgeDirection
Reverse the direction of an edge.
reverse() - Method in class org.apache.spark.graphx.EdgeRDD
Reverse all the edges in this RDD.
reverse() - Method in class org.apache.spark.graphx.Graph
Reverses all edges in the graph.
reverse() - Method in class org.apache.spark.graphx.impl.EdgePartition
Reverse all the edges in this partition.
reverse() - Method in class org.apache.spark.graphx.impl.EdgeRDDImpl
 
reverse() - Method in class org.apache.spark.graphx.impl.GraphImpl
 
reverse() - Method in class org.apache.spark.graphx.impl.ReplicatedVertexView
Return a new ReplicatedVertexView where edges are reversed and shipping levels are swapped to match.
reverse() - Method in class org.apache.spark.graphx.impl.RoutingTablePartition
Returns a new RoutingTablePartition reflecting a reversal of all edge directions.
reverseRoutingTables() - Method in class org.apache.spark.graphx.impl.VertexRDDImpl
 
reverseRoutingTables() - Method in class org.apache.spark.graphx.VertexRDD
Returns a new VertexRDD reflecting a reversal of all edge directions in the corresponding EdgeRDD.
revertPartialWritesAndClose() - Method in class org.apache.spark.storage.BlockObjectWriter
Reverts writes that haven't been flushed yet.
revertPartialWritesAndClose() - Method in class org.apache.spark.storage.DiskBlockObjectWriter
 
reviveOffers() - Method in class org.apache.spark.scheduler.cluster.CoarseGrainedSchedulerBackend
 
reviveOffers() - Method in class org.apache.spark.scheduler.cluster.mesos.MesosSchedulerBackend
 
reviveOffers() - Method in class org.apache.spark.scheduler.local.LocalActor
 
reviveOffers() - Method in class org.apache.spark.scheduler.local.LocalBackend
 
ReviveOffers - Class in org.apache.spark.scheduler.local
 
ReviveOffers() - Constructor for class org.apache.spark.scheduler.local.ReviveOffers
 
reviveOffers() - Method in interface org.apache.spark.scheduler.SchedulerBackend
 
RidgeRegressionModel - Class in org.apache.spark.mllib.regression
Regression model trained using RidgeRegression.
RidgeRegressionModel(Vector, double) - Constructor for class org.apache.spark.mllib.regression.RidgeRegressionModel
 
RidgeRegressionWithSGD - Class in org.apache.spark.mllib.regression
Train a regression model with L2-regularization using Stochastic Gradient Descent.
RidgeRegressionWithSGD() - Constructor for class org.apache.spark.mllib.regression.RidgeRegressionWithSGD
Construct a RidgeRegression object with default parameters: {stepSize: 1.0, numIterations: 100, regParam: 0.01, miniBatchFraction: 1.0}.
right() - Method in class org.apache.spark.sql.execution.Except
 
right() - Method in class org.apache.spark.sql.execution.Intersect
 
right() - Method in class org.apache.spark.sql.execution.joins.BroadcastHashJoin
 
right() - Method in class org.apache.spark.sql.execution.joins.BroadcastNestedLoopJoin
 
right() - Method in class org.apache.spark.sql.execution.joins.CartesianProduct
 
right() - Method in interface org.apache.spark.sql.execution.joins.HashJoin
 
right() - Method in class org.apache.spark.sql.execution.joins.HashOuterJoin
 
right() - Method in class org.apache.spark.sql.execution.joins.LeftSemiJoinBNL
The Broadcast relation
right() - Method in class org.apache.spark.sql.execution.joins.LeftSemiJoinHash
 
right() - Method in class org.apache.spark.sql.execution.joins.ShuffledHashJoin
 
rightChildIndex(int) - Static method in class org.apache.spark.mllib.tree.model.Node
Return the index of the right child of this node.
rightImpurity() - Method in class org.apache.spark.mllib.tree.model.InformationGainStats
 
rightKeys() - Method in class org.apache.spark.sql.execution.joins.BroadcastHashJoin
 
rightKeys() - Method in interface org.apache.spark.sql.execution.joins.HashJoin
 
rightKeys() - Method in class org.apache.spark.sql.execution.joins.HashOuterJoin
 
rightKeys() - Method in class org.apache.spark.sql.execution.joins.LeftSemiJoinHash
 
rightKeys() - Method in class org.apache.spark.sql.execution.joins.ShuffledHashJoin
 
rightNode() - Method in class org.apache.spark.mllib.tree.model.Node
 
rightOuterJoin(JavaPairRDD<K, W>, Partitioner) - Method in class org.apache.spark.api.java.JavaPairRDD
Perform a right outer join of this and other.
rightOuterJoin(JavaPairRDD<K, W>) - Method in class org.apache.spark.api.java.JavaPairRDD
Perform a right outer join of this and other.
rightOuterJoin(JavaPairRDD<K, W>, int) - Method in class org.apache.spark.api.java.JavaPairRDD
Perform a right outer join of this and other.
rightOuterJoin(RDD<Tuple2<K, W>>, Partitioner) - Method in class org.apache.spark.rdd.PairRDDFunctions
Perform a right outer join of this and other.
rightOuterJoin(RDD<Tuple2<K, W>>) - Method in class org.apache.spark.rdd.PairRDDFunctions
Perform a right outer join of this and other.
rightOuterJoin(RDD<Tuple2<K, W>>, int) - Method in class org.apache.spark.rdd.PairRDDFunctions
Perform a right outer join of this and other.
rightOuterJoin(JavaPairDStream<K, W>) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
Return a new DStream by applying 'right outer join' between RDDs of this DStream and other DStream.
rightOuterJoin(JavaPairDStream<K, W>, int) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
Return a new DStream by applying 'right outer join' between RDDs of this DStream and other DStream.
rightOuterJoin(JavaPairDStream<K, W>, Partitioner) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
Return a new DStream by applying 'right outer join' between RDDs of this DStream and other DStream.
rightOuterJoin(DStream<Tuple2<K, W>>, ClassTag<W>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
Return a new DStream by applying 'right outer join' between RDDs of this DStream and other DStream.
rightOuterJoin(DStream<Tuple2<K, W>>, int, ClassTag<W>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
Return a new DStream by applying 'right outer join' between RDDs of this DStream and other DStream.
rightOuterJoin(DStream<Tuple2<K, W>>, Partitioner, ClassTag<W>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
Return a new DStream by applying 'right outer join' between RDDs of this DStream and other DStream.
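A minimal Scala sketch of right outer join semantics on RDDs, assuming an existing SparkContext sc and hypothetical data (in Spark 1.x the pair-RDD functions come in via import org.apache.spark.SparkContext._):

    import org.apache.spark.SparkContext._
    val orders = sc.parallelize(Seq(("u1", 10), ("u2", 20)))
    val users  = sc.parallelize(Seq(("u2", "Bob"), ("u3", "Eve")))
    // Every key of the right side (`users`) is kept; missing left values become None.
    val joined = orders.rightOuterJoin(users)
    // => ("u2", (Some(20), "Bob")), ("u3", (None, "Eve"))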
rightPredict() - Method in class org.apache.spark.mllib.tree.model.InformationGainStats
 
RLIKE() - Static method in class org.apache.spark.sql.hive.HiveQl
 
RMATa() - Static method in class org.apache.spark.graphx.util.GraphGenerators
 
RMATb() - Static method in class org.apache.spark.graphx.util.GraphGenerators
 
RMATc() - Static method in class org.apache.spark.graphx.util.GraphGenerators
 
RMATd() - Static method in class org.apache.spark.graphx.util.GraphGenerators
 
rmatGraph(SparkContext, int, int) - Static method in class org.apache.spark.graphx.util.GraphGenerators
A random graph generator using the R-MAT model, proposed in "R-MAT: A Recursive Model for Graph Mining" by Chakrabarti et al.
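A minimal Scala sketch, assuming an existing SparkContext sc; the sizes are illustrative:

    import org.apache.spark.graphx.util.GraphGenerators
    // Request ~1000 vertices and 5000 edges; the vertex count is rounded up to a power of 2.
    val rmat = GraphGenerators.rmatGraph(sc, requestedNumVertices = 1000, numEdges = 5000)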
rnd() - Method in class org.apache.spark.rdd.PartitionCoalescer
 
roc() - Method in class org.apache.spark.mllib.evaluation.BinaryClassificationMetrics
Returns the receiver operating characteristic (ROC) curve, which is an RDD of (false positive rate, true positive rate) with (0.0, 0.0) prepended and (1.0, 1.0) appended to it.
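A minimal Scala sketch, assuming a hypothetical scoreAndLabels RDD of (score, label) pairs:

    import org.apache.spark.mllib.evaluation.BinaryClassificationMetrics
    val metrics  = new BinaryClassificationMetrics(scoreAndLabels)
    val rocCurve = metrics.roc()          // RDD of (false positive rate, true positive rate)
    val auc      = metrics.areaUnderROC() // scalar summary of the curve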
rolledOver() - Method in interface org.apache.spark.util.logging.RollingPolicy
Notify that rollover has occurred
rolledOver() - Method in class org.apache.spark.util.logging.SizeBasedRollingPolicy
Rollover has occurred, so reset the counter
rolledOver() - Method in class org.apache.spark.util.logging.TimeBasedRollingPolicy
Rollover has occurred, so find the next time to rollover
RollingFileAppender - Class in org.apache.spark.util.logging
Continuously appends data from the input stream into the given file, and rolls over the file after the given interval.
RollingFileAppender(InputStream, File, RollingPolicy, SparkConf, int) - Constructor for class org.apache.spark.util.logging.RollingFileAppender
 
rollingPolicy() - Method in class org.apache.spark.util.logging.RollingFileAppender
 
RollingPolicy - Interface in org.apache.spark.util.logging
Defines the policy based on which RollingFileAppender will generate rolling files.
rolloverIntervalMillis() - Method in class org.apache.spark.util.logging.TimeBasedRollingPolicy
 
rolloverSizeBytes() - Method in class org.apache.spark.util.logging.SizeBasedRollingPolicy
 
rootHandler() - Method in class org.apache.spark.ui.ServerInfo
 
rootMeanSquaredError() - Method in class org.apache.spark.mllib.evaluation.RegressionMetrics
Returns the root mean squared error, which is defined as the square root of the mean squared error.
rootPool() - Method in class org.apache.spark.scheduler.FairSchedulableBuilder
 
rootPool() - Method in class org.apache.spark.scheduler.FIFOSchedulableBuilder
 
rootPool() - Method in interface org.apache.spark.scheduler.SchedulableBuilder
 
rootPool() - Method in interface org.apache.spark.scheduler.TaskScheduler
 
rootPool() - Method in class org.apache.spark.scheduler.TaskSchedulerImpl
 
routingTable() - Method in class org.apache.spark.graphx.impl.ShippableVertexPartition
 
RoutingTablePartition - Class in org.apache.spark.graphx.impl
Stores the locations of edge-partition join sites for each vertex attribute in a particular vertex partition.
RoutingTablePartition(Tuple3<long[], BitSet, BitSet>[]) - Constructor for class org.apache.spark.graphx.impl.RoutingTablePartition
 
Row - Class in org.apache.spark.sql.api.java
A result row from a Spark SQL query.
Row(Row) - Constructor for class org.apache.spark.sql.api.java.Row
 
row() - Method in class org.apache.spark.sql.api.java.Row
 
rowIndices() - Method in class org.apache.spark.mllib.linalg.SparseMatrix
 
RowMatrix - Class in org.apache.spark.mllib.linalg.distributed
:: Experimental :: Represents a row-oriented distributed Matrix with no meaningful row indices.
RowMatrix(RDD<Vector>, long, int) - Constructor for class org.apache.spark.mllib.linalg.distributed.RowMatrix
 
RowMatrix(RDD<Vector>) - Constructor for class org.apache.spark.mllib.linalg.distributed.RowMatrix
Alternative constructor leaving matrix dimensions to be determined automatically.
RowReadSupport - Class in org.apache.spark.sql.parquet
A parquet.hadoop.api.ReadSupport for Row objects.
RowReadSupport() - Constructor for class org.apache.spark.sql.parquet.RowReadSupport
 
RowRecordMaterializer - Class in org.apache.spark.sql.parquet
A parquet.io.api.RecordMaterializer for Rows.
RowRecordMaterializer(CatalystConverter) - Constructor for class org.apache.spark.sql.parquet.RowRecordMaterializer
 
RowRecordMaterializer(MessageType, Seq<Attribute>) - Constructor for class org.apache.spark.sql.parquet.RowRecordMaterializer
 
rows() - Method in class org.apache.spark.mllib.linalg.distributed.IndexedRowMatrix
 
rows() - Method in class org.apache.spark.mllib.linalg.distributed.RowMatrix
 
rowToArray(Row, Seq<DataType>) - Static method in class org.apache.spark.sql.execution.EvaluatePython
Convert Row into Java Array (for pickled into Python)
rowToJSON(StructType, JsonFactory, Row) - Static method in class org.apache.spark.sql.json.JsonRDD
Transforms a single Row to JSON using Jackson
RowWriteSupport - Class in org.apache.spark.sql.parquet
A parquet.hadoop.api.WriteSupport for Row objects.
RowWriteSupport() - Constructor for class org.apache.spark.sql.parquet.RowWriteSupport
 
run(Function0<T>, ExecutionContext) - Method in class org.apache.spark.ComplexFutureAction
Executes some action enclosed in the closure.
run(Graph<VD, ED>, ClassTag<VD>, ClassTag<ED>) - Static method in class org.apache.spark.graphx.lib.ConnectedComponents
Compute the connected component membership of each vertex and return a graph with the vertex value containing the lowest vertex id in the connected component containing that vertex.
run(Graph<VD, ED>, int, ClassTag<ED>) - Static method in class org.apache.spark.graphx.lib.LabelPropagation
Run static Label Propagation for detecting communities in networks.
run(Graph<VD, ED>, int, double, ClassTag<VD>, ClassTag<ED>) - Static method in class org.apache.spark.graphx.lib.PageRank
Run PageRank for a fixed number of iterations returning a graph with vertex attributes containing the PageRank and edge attributes the normalized edge weight.
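A minimal Scala sketch of running PageRank on a graph loaded from a hypothetical edge-list file, assuming an existing SparkContext sc:

    import org.apache.spark.graphx.GraphLoader
    import org.apache.spark.graphx.lib.PageRank
    val graph = GraphLoader.edgeListFile(sc, "data/edges.txt")
    val ranks = PageRank.run(graph, numIter = 10).vertices // (vertexId, rank) pairs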
run(Graph<VD, ED>, Seq<Object>, ClassTag<ED>) - Static method in class org.apache.spark.graphx.lib.ShortestPaths
Computes shortest paths to the given set of landmark vertices.
run(Graph<VD, ED>, int, ClassTag<VD>, ClassTag<ED>) - Static method in class org.apache.spark.graphx.lib.StronglyConnectedComponents
Compute the strongly connected component (SCC) of each vertex and return a graph with the vertex value containing the lowest vertex id in the SCC containing that vertex.
run(RDD<Edge<Object>>, SVDPlusPlus.Conf) - Static method in class org.apache.spark.graphx.lib.SVDPlusPlus
Implement SVD++ based on "Factorization Meets the Neighborhood: a Multifaceted Collaborative Filtering Model", available at http://public.research.att.com/~volinsky/netflix/kdd08koren.pdf.
run(Graph<VD, ED>, ClassTag<VD>, ClassTag<ED>) - Static method in class org.apache.spark.graphx.lib.TriangleCount
 
run(RDD<LabeledPoint>) - Method in class org.apache.spark.mllib.classification.NaiveBayes
Run the algorithm with the configured parameters on an input RDD of LabeledPoint entries.
run(RDD<Vector>) - Method in class org.apache.spark.mllib.clustering.KMeans
Train a K-means model on the given set of points; data should be cached for high performance, because this is an iterative algorithm.
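A minimal Scala sketch on toy vectors, assuming an existing SparkContext sc:

    import org.apache.spark.mllib.clustering.KMeans
    import org.apache.spark.mllib.linalg.Vectors
    // Cache the input: the algorithm is iterative and rescans the data each pass.
    val points = sc.parallelize(Seq(
      Vectors.dense(0.0, 0.0), Vectors.dense(0.1, 0.1),
      Vectors.dense(9.0, 9.0), Vectors.dense(9.1, 9.1))).cache()
    val model = new KMeans().setK(2).setMaxIterations(20).run(points)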
run(RDD<Rating>) - Method in class org.apache.spark.mllib.recommendation.ALS
Run ALS with the configured parameters on an input RDD of (user, product, rating) triples.
run(JavaRDD<Rating>) - Method in class org.apache.spark.mllib.recommendation.ALS
Java-friendly version of ALS.run.
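A minimal Scala sketch of ALS.run on hypothetical (user, product, rating) triples, assuming an existing SparkContext sc:

    import org.apache.spark.mllib.recommendation.{ALS, Rating}
    val ratings = sc.parallelize(Seq(Rating(1, 10, 4.0), Rating(1, 20, 1.0), Rating(2, 10, 5.0)))
    val model = new ALS().setRank(8).setIterations(10).setLambda(0.01).run(ratings)
    val predicted = model.predict(2, 20) // predicted rating of product 20 by user 2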
run(RDD<LabeledPoint>) - Method in class org.apache.spark.mllib.regression.GeneralizedLinearAlgorithm
Run the algorithm with the configured parameters on an input RDD of LabeledPoint entries.
run(RDD<LabeledPoint>, Vector) - Method in class org.apache.spark.mllib.regression.GeneralizedLinearAlgorithm
Run the algorithm with the configured parameters on an input RDD of LabeledPoint entries starting from the initial weights provided.
run(RDD<LabeledPoint>) - Method in class org.apache.spark.mllib.tree.DecisionTree
Method to train a decision tree model over an RDD
run(RDD<LabeledPoint>) - Method in class org.apache.spark.mllib.tree.GradientBoostedTrees
Method to train a gradient boosting model
run(JavaRDD<LabeledPoint>) - Method in class org.apache.spark.mllib.tree.GradientBoostedTrees
Java-friendly API for org.apache.spark.mllib.tree.GradientBoostedTrees#run.
run(RDD<LabeledPoint>) - Method in class org.apache.spark.mllib.tree.RandomForest
Method to train a decision tree model over an RDD
run() - Method in class org.apache.spark.rdd.PartitionCoalescer
Runs the packing algorithm and returns an array of PartitionGroups that, if possible, are load-balanced and grouped by locality.
run(long) - Method in class org.apache.spark.scheduler.Task
 
run(SQLContext) - Method in interface org.apache.spark.sql.execution.RunnableCommand
 
run(SQLContext) - Method in class org.apache.spark.sql.sources.CreateTableUsing
 
run() - Method in class org.apache.spark.streaming.CheckpointWriter.CheckpointWriteHandler
 
run() - Method in class org.apache.spark.streaming.flume.FlumeBatchFetcher
 
run() - Method in class org.apache.spark.streaming.scheduler.Job
 
run() - Method in class org.apache.spark.util.RedirectThread
 
runApproximateJob(RDD<T>, Function2<TaskContext, Iterator<T>, U>, ApproximateEvaluator<U, R>, CallSite, long, Properties) - Method in class org.apache.spark.scheduler.DAGScheduler
 
runApproximateJob(RDD<T>, Function2<TaskContext, Iterator<T>, U>, ApproximateEvaluator<U, R>, long) - Method in class org.apache.spark.SparkContext
:: DeveloperApi :: Run a job that can return approximate results.
runJob(RDD<T>, Function1<Iterator<T>, U>, Seq<Object>, Function2<Object, U, BoxedUnit>, Function0<R>) - Method in class org.apache.spark.ComplexFutureAction
Runs a Spark job.
runJob(RDD<T>, Function2<TaskContext, Iterator<T>, U>, Seq<Object>, CallSite, boolean, Function2<Object, U, BoxedUnit>, Properties, ClassTag<U>) - Method in class org.apache.spark.scheduler.DAGScheduler
 
runJob(RDD<T>, Function2<TaskContext, Iterator<T>, U>, Seq<Object>, boolean, Function2<Object, U, BoxedUnit>, ClassTag<U>) - Method in class org.apache.spark.SparkContext
Run a function on a given set of partitions in an RDD and pass the results to the given handler function.
runJob(RDD<T>, Function2<TaskContext, Iterator<T>, U>, Seq<Object>, boolean, ClassTag<U>) - Method in class org.apache.spark.SparkContext
Run a function on a given set of partitions in an RDD and return the results as an array.
runJob(RDD<T>, Function1<Iterator<T>, U>, Seq<Object>, boolean, ClassTag<U>) - Method in class org.apache.spark.SparkContext
Run a job on a given set of partitions of an RDD, but take a function of type Iterator[T] => U instead of (TaskContext, Iterator[T]) => U.
runJob(RDD<T>, Function2<TaskContext, Iterator<T>, U>, ClassTag<U>) - Method in class org.apache.spark.SparkContext
Run a job on all partitions in an RDD and return the results in an array.
runJob(RDD<T>, Function1<Iterator<T>, U>, ClassTag<U>) - Method in class org.apache.spark.SparkContext
Run a job on all partitions in an RDD and return the results in an array.
runJob(RDD<T>, Function2<TaskContext, Iterator<T>, U>, Function2<Object, U, BoxedUnit>, ClassTag<U>) - Method in class org.apache.spark.SparkContext
Run a job on all partitions in an RDD and pass the results to a handler function.
runJob(RDD<T>, Function1<Iterator<T>, U>, Function2<Object, U, BoxedUnit>, ClassTag<U>) - Method in class org.apache.spark.SparkContext
Run a job on all partitions in an RDD and pass the results to a handler function.
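A minimal Scala sketch of the simplest runJob overload, counting elements per partition, assuming an existing SparkContext sc:

    val rdd = sc.parallelize(1 to 100, numSlices = 4)
    // One result element per partition, returned in partition order.
    val sizes: Array[Int] = sc.runJob(rdd, (it: Iterator[Int]) => it.size)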
runLBFGS(RDD<Tuple2<Object, Vector>>, Gradient, Updater, int, double, int, double, Vector) - Static method in class org.apache.spark.mllib.optimization.LBFGS
Run Limited-memory BFGS (L-BFGS) in parallel.
RunLengthEncoding - Class in org.apache.spark.sql.columnar.compression
 
RunLengthEncoding() - Constructor for class org.apache.spark.sql.columnar.compression.RunLengthEncoding
 
RunLengthEncoding.Decoder<T extends org.apache.spark.sql.catalyst.types.NativeType> - Class in org.apache.spark.sql.columnar.compression
 
RunLengthEncoding.Decoder(ByteBuffer, NativeColumnType<T>) - Constructor for class org.apache.spark.sql.columnar.compression.RunLengthEncoding.Decoder
 
RunLengthEncoding.Encoder<T extends org.apache.spark.sql.catalyst.types.NativeType> - Class in org.apache.spark.sql.columnar.compression
 
RunLengthEncoding.Encoder(NativeColumnType<T>) - Constructor for class org.apache.spark.sql.columnar.compression.RunLengthEncoding.Encoder
 
runMiniBatchSGD(RDD<Tuple2<Object, Vector>>, Gradient, Updater, double, int, double, double, Vector) - Static method in class org.apache.spark.mllib.optimization.GradientDescent
Run stochastic gradient descent (SGD) in parallel using mini batches.
RunnableCommand - Interface in org.apache.spark.sql.execution
 
running() - Method in class org.apache.spark.scheduler.TaskInfo
 
RUNNING() - Static method in class org.apache.spark.TaskState
 
runningBatches() - Method in class org.apache.spark.streaming.ui.StreamingJobProgressListener
 
runningLocally() - Method in class org.apache.spark.TaskContext
Deprecated.
runningLocally() - Method in class org.apache.spark.TaskContextImpl
 
runningStages() - Method in class org.apache.spark.scheduler.DAGScheduler
 
runningTasks() - Method in class org.apache.spark.scheduler.Pool
 
runningTasks() - Method in interface org.apache.spark.scheduler.Schedulable
 
runningTasks() - Method in class org.apache.spark.scheduler.TaskSetManager
 
runningTasksSet() - Method in class org.apache.spark.scheduler.TaskSetManager
 
runSqlHive(String) - Method in class org.apache.spark.sql.hive.test.TestHiveContext
 
runTask(TaskContext) - Method in class org.apache.spark.scheduler.ResultTask
 
runTask(TaskContext) - Method in class org.apache.spark.scheduler.ShuffleMapTask
 
runTask(TaskContext) - Method in class org.apache.spark.scheduler.Task
 
RuntimePercentage - Class in org.apache.spark.scheduler
 
RuntimePercentage(double, Option<Object>, double) - Constructor for class org.apache.spark.scheduler.RuntimePercentage
 
runUntilConvergence(Graph<VD, ED>, double, double, ClassTag<VD>, ClassTag<ED>) - Static method in class org.apache.spark.graphx.lib.PageRank
Run a dynamic version of PageRank returning a graph with vertex attributes containing the PageRank and edge attributes containing the normalized edge weight.

S

s() - Method in class org.apache.spark.mllib.linalg.SingularValueDecomposition
 
s1() - Method in class org.apache.spark.rdd.CartesianPartition
 
s2() - Method in class org.apache.spark.rdd.CartesianPartition
 
sameResult(LogicalPlan) - Method in class org.apache.spark.sql.execution.LogicalRDD
 
sameResult(LogicalPlan) - Method in class org.apache.spark.sql.execution.SparkLogicalPlan
 
sameResult(LogicalPlan) - Method in class org.apache.spark.sql.hive.MetastoreRelation
Only compare database and tablename, not alias.
sameResult(LogicalPlan) - Method in class org.apache.spark.sql.sources.LogicalRelation
 
sample(boolean, Double) - Method in class org.apache.spark.api.java.JavaDoubleRDD
Return a sampled subset of this RDD.
sample(boolean, Double, long) - Method in class org.apache.spark.api.java.JavaDoubleRDD
Return a sampled subset of this RDD.
sample(boolean, double) - Method in class org.apache.spark.api.java.JavaPairRDD
Return a sampled subset of this RDD.
sample(boolean, double, long) - Method in class org.apache.spark.api.java.JavaPairRDD
Return a sampled subset of this RDD.
sample(boolean, double) - Method in class org.apache.spark.api.java.JavaRDD
Return a sampled subset of this RDD.
sample(boolean, double, long) - Method in class org.apache.spark.api.java.JavaRDD
Return a sampled subset of this RDD.
sample(boolean, double, long) - Method in class org.apache.spark.rdd.RDD
Return a sampled subset of this RDD.
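A minimal Scala sketch, assuming an existing SparkContext sc; fixing the seed makes the sample reproducible:

    val rdd = sc.parallelize(1 to 1000)
    // Expect roughly 10% of the elements, sampled without replacement.
    val sampled = rdd.sample(withReplacement = false, fraction = 0.1, seed = 42L)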
Sample - Class in org.apache.spark.sql.execution
:: DeveloperApi ::
Sample(double, boolean, long, SparkPlan) - Constructor for class org.apache.spark.sql.execution.Sample
 
sample(boolean, double, long) - Method in class org.apache.spark.sql.SchemaRDD
:: Experimental :: Returns a sampled version of the underlying dataset.
sample(Iterator<T>) - Method in class org.apache.spark.util.random.BernoulliCellSampler
 
sample(Iterator<T>) - Method in class org.apache.spark.util.random.BernoulliSampler
 
sample(Iterator<T>) - Method in class org.apache.spark.util.random.PoissonSampler
 
sample(Iterator<T>) - Method in interface org.apache.spark.util.random.RandomSampler
take a random sample
sampleByKey(boolean, Map<K, Object>, long) - Method in class org.apache.spark.api.java.JavaPairRDD
Return a subset of this RDD sampled by key (via stratified sampling).
sampleByKey(boolean, Map<K, Object>) - Method in class org.apache.spark.api.java.JavaPairRDD
Return a subset of this RDD sampled by key (via stratified sampling).
sampleByKey(boolean, Map<K, Object>, long) - Method in class org.apache.spark.rdd.PairRDDFunctions
Return a subset of this RDD sampled by key (via stratified sampling).
sampleByKeyExact(boolean, Map<K, Object>, long) - Method in class org.apache.spark.api.java.JavaPairRDD
:: Experimental :: Return a subset of this RDD sampled by key (via stratified sampling) containing exactly math.ceil(numItems * samplingRate) items for each stratum (group of pairs with the same key).
sampleByKeyExact(boolean, Map<K, Object>) - Method in class org.apache.spark.api.java.JavaPairRDD
:: Experimental :: Return a subset of this RDD sampled by key (via stratified sampling) containing exactly math.ceil(numItems * samplingRate) items for each stratum (group of pairs with the same key).
sampleByKeyExact(boolean, Map<K, Object>, long) - Method in class org.apache.spark.rdd.PairRDDFunctions
:: Experimental :: Return a subset of this RDD sampled by key (via stratified sampling) containing exactly math.ceil(numItems * samplingRate) items for each stratum (group of pairs with the same key).
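A minimal Scala sketch contrasting the approximate and exact variants, assuming an existing SparkContext sc (in Spark 1.x the pair-RDD functions come in via import org.apache.spark.SparkContext._):

    import org.apache.spark.SparkContext._
    val pairs = sc.parallelize(Seq(("a", 1), ("a", 2), ("a", 3), ("b", 4)))
    val fractions = Map("a" -> 0.5, "b" -> 1.0) // per-stratum sampling rates
    val approx = pairs.sampleByKey(withReplacement = false, fractions = fractions)      // one pass, approximate counts
    val exact  = pairs.sampleByKeyExact(withReplacement = false, fractions = fractions) // extra passes, exact counts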
SampledRDD<T> - Class in org.apache.spark.rdd
 
SampledRDD(RDD<T>, boolean, double, int, ClassTag<T>) - Constructor for class org.apache.spark.rdd.SampledRDD
 
SampledRDDPartition - Class in org.apache.spark.rdd
 
SampledRDDPartition(Partition, int) - Constructor for class org.apache.spark.rdd.SampledRDDPartition
 
sampleLogNormal(double, double, int, long) - Static method in class org.apache.spark.graphx.util.GraphGenerators
Randomly samples from a log normal distribution whose corresponding normal distribution has the given mean and standard deviation.
sampleStdev() - Method in class org.apache.spark.api.java.JavaDoubleRDD
Compute the sample standard deviation of this RDD's elements (which corrects for bias in estimating the standard deviation by dividing by N-1 instead of N).
sampleStdev() - Method in class org.apache.spark.rdd.DoubleRDDFunctions
Compute the sample standard deviation of this RDD's elements (which corrects for bias in estimating the standard deviation by dividing by N-1 instead of N).
sampleStdev() - Method in class org.apache.spark.util.StatCounter
Return the sample standard deviation of the values, which corrects for bias in estimating the variance by dividing by N-1 instead of N.
sampleVariance() - Method in class org.apache.spark.api.java.JavaDoubleRDD
Compute the sample variance of this RDD's elements (which corrects for bias in estimating the variance by dividing by N-1 instead of N).
sampleVariance() - Method in class org.apache.spark.rdd.DoubleRDDFunctions
Compute the sample variance of this RDD's elements (which corrects for bias in estimating the variance by dividing by N-1 instead of N).
sampleVariance() - Method in class org.apache.spark.util.StatCounter
Return the sample variance, which corrects for bias in estimating the variance by dividing by N-1 instead of N.
samplingRatio() - Method in class org.apache.spark.sql.json.JSONRelation
 
SamplingUtils - Class in org.apache.spark.util.random
 
SamplingUtils() - Constructor for class org.apache.spark.util.random.SamplingUtils
 
saveAsHadoopDataset(JobConf) - Method in class org.apache.spark.api.java.JavaPairRDD
Output the RDD to any Hadoop-supported storage system, using a Hadoop JobConf object for that storage system.
saveAsHadoopDataset(JobConf) - Method in class org.apache.spark.rdd.PairRDDFunctions
Output the RDD to any Hadoop-supported storage system, using a Hadoop JobConf object for that storage system.
saveAsHadoopFile(String, Class<?>, Class<?>, Class<F>, JobConf) - Method in class org.apache.spark.api.java.JavaPairRDD
Output the RDD to any Hadoop-supported file system.
saveAsHadoopFile(String, Class<?>, Class<?>, Class<F>) - Method in class org.apache.spark.api.java.JavaPairRDD
Output the RDD to any Hadoop-supported file system.
saveAsHadoopFile(String, Class<?>, Class<?>, Class<F>, Class<? extends CompressionCodec>) - Method in class org.apache.spark.api.java.JavaPairRDD
Output the RDD to any Hadoop-supported file system, compressing with the supplied codec.
saveAsHadoopFile(String, ClassTag<F>) - Method in class org.apache.spark.rdd.PairRDDFunctions
Output the RDD to any Hadoop-supported file system, using a Hadoop OutputFormat class supporting the key and value types K and V in this RDD.
saveAsHadoopFile(String, Class<? extends CompressionCodec>, ClassTag<F>) - Method in class org.apache.spark.rdd.PairRDDFunctions
Output the RDD to any Hadoop-supported file system, using a Hadoop OutputFormat class supporting the key and value types K and V in this RDD.
saveAsHadoopFile(String, Class<?>, Class<?>, Class<? extends OutputFormat<?, ?>>, Class<? extends CompressionCodec>) - Method in class org.apache.spark.rdd.PairRDDFunctions
Output the RDD to any Hadoop-supported file system, using a Hadoop OutputFormat class supporting the key and value types K and V in this RDD.
saveAsHadoopFile(String, Class<?>, Class<?>, Class<? extends OutputFormat<?, ?>>, JobConf, Option<Class<? extends CompressionCodec>>) - Method in class org.apache.spark.rdd.PairRDDFunctions
Output the RDD to any Hadoop-supported file system, using a Hadoop OutputFormat class supporting the key and value types K and V in this RDD.
saveAsHadoopFiles(String, String) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
Save each RDD in this DStream as a Hadoop file.
saveAsHadoopFiles(String, String, Class<?>, Class<?>, Class<? extends OutputFormat<?, ?>>) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
Save each RDD in this DStream as a Hadoop file.
saveAsHadoopFiles(String, String, Class<?>, Class<?>, Class<? extends OutputFormat<?, ?>>, JobConf) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
Save each RDD in this DStream as a Hadoop file.
saveAsHadoopFiles(String, String, ClassTag<F>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
Save each RDD in this DStream as a Hadoop file.
saveAsHadoopFiles(String, String, Class<?>, Class<?>, Class<? extends OutputFormat<?, ?>>, JobConf) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
Save each RDD in this DStream as a Hadoop file.
saveAsHiveFile(RDD<Row>, Class<?>, ShimFileSinkDesc, SerializableWritable<JobConf>, SparkHiveWriterContainer) - Method in class org.apache.spark.sql.hive.execution.InsertIntoHiveTable
 
saveAsLibSVMFile(RDD<LabeledPoint>, String) - Static method in class org.apache.spark.mllib.util.MLUtils
Save labeled data in LIBSVM format.
saveAsNewAPIHadoopDataset(Configuration) - Method in class org.apache.spark.api.java.JavaPairRDD
Output the RDD to any Hadoop-supported storage system, using a Configuration object for that storage system.
saveAsNewAPIHadoopDataset(Configuration) - Method in class org.apache.spark.rdd.PairRDDFunctions
Output the RDD to any Hadoop-supported storage system with new Hadoop API, using a Hadoop Configuration object for that storage system.
saveAsNewAPIHadoopFile(String, Class<?>, Class<?>, Class<F>, Configuration) - Method in class org.apache.spark.api.java.JavaPairRDD
Output the RDD to any Hadoop-supported file system.
saveAsNewAPIHadoopFile(String, Class<?>, Class<?>, Class<F>) - Method in class org.apache.spark.api.java.JavaPairRDD
Output the RDD to any Hadoop-supported file system.
saveAsNewAPIHadoopFile(String, ClassTag<F>) - Method in class org.apache.spark.rdd.PairRDDFunctions
Output the RDD to any Hadoop-supported file system, using a new Hadoop API OutputFormat (mapreduce.OutputFormat) object supporting the key and value types K and V in this RDD.
saveAsNewAPIHadoopFile(String, Class<?>, Class<?>, Class<? extends OutputFormat<?, ?>>, Configuration) - Method in class org.apache.spark.rdd.PairRDDFunctions
Output the RDD to any Hadoop-supported file system, using a new Hadoop API OutputFormat (mapreduce.OutputFormat) object supporting the key and value types K and V in this RDD.
saveAsNewAPIHadoopFiles(String, String) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
Save each RDD in this DStream as a Hadoop file.
saveAsNewAPIHadoopFiles(String, String, Class<?>, Class<?>, Class<? extends OutputFormat<?, ?>>) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
Save each RDD in this DStream as a Hadoop file.
saveAsNewAPIHadoopFiles(String, String, Class<?>, Class<?>, Class<? extends OutputFormat<?, ?>>, Configuration) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
Save each RDD in this DStream as a Hadoop file.
saveAsNewAPIHadoopFiles(String, String, ClassTag<F>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
Save each RDD in this DStream as a Hadoop file.
saveAsNewAPIHadoopFiles(String, String, Class<?>, Class<?>, Class<? extends OutputFormat<?, ?>>, Configuration) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
Save each RDD in this DStream as a Hadoop file.
saveAsObjectFile(String) - Method in interface org.apache.spark.api.java.JavaRDDLike
Save this RDD as a SequenceFile of serialized objects.
saveAsObjectFile(String) - Method in class org.apache.spark.rdd.RDD
Save this RDD as a SequenceFile of serialized objects.
saveAsObjectFiles(String, String) - Method in class org.apache.spark.streaming.dstream.DStream
Save each RDD in this DStream as a Sequence file of serialized objects.
saveAsParquetFile(String) - Method in interface org.apache.spark.sql.SchemaRDDLike
Saves the contents of this SchemaRDD as a parquet file, preserving the schema.
saveAsSequenceFile(String, Option<Class<? extends CompressionCodec>>) - Method in class org.apache.spark.rdd.SequenceFileRDDFunctions
Output the RDD as a Hadoop SequenceFile using the Writable types we infer from the RDD's key and value types.
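A minimal Scala sketch, assuming an existing SparkContext sc and a hypothetical output path; the conversion to SequenceFileRDDFunctions is implicit:

    import org.apache.spark.SparkContext._
    // IntWritable/Text are inferred from the (Int, String) element type.
    sc.parallelize(Seq((1, "a"), (2, "b"))).saveAsSequenceFile("out/seq")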
saveAsTable(String) - Method in interface org.apache.spark.sql.SchemaRDDLike
:: Experimental :: Creates a table from the contents of this SchemaRDD.
saveAsTextFile(String) - Method in interface org.apache.spark.api.java.JavaRDDLike
Save this RDD as a text file, using string representations of elements.
saveAsTextFile(String, Class<? extends CompressionCodec>) - Method in interface org.apache.spark.api.java.JavaRDDLike
Save this RDD as a compressed text file, using string representations of elements.
saveAsTextFile(String) - Method in class org.apache.spark.rdd.RDD
Save this RDD as a text file, using string representations of elements.
saveAsTextFile(String, Class<? extends CompressionCodec>) - Method in class org.apache.spark.rdd.RDD
Save this RDD as a compressed text file, using string representations of elements.
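A minimal Scala sketch of both overloads, assuming an existing SparkContext sc and hypothetical output paths:

    import org.apache.hadoop.io.compress.GzipCodec
    val lines = sc.parallelize(Seq("one", "two", "three"))
    lines.saveAsTextFile("out/plain")                  // one part file per partition
    lines.saveAsTextFile("out/gz", classOf[GzipCodec]) // gzip-compressed part files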
saveAsTextFiles(String, String) - Method in class org.apache.spark.streaming.dstream.DStream
Save each RDD in this DStream as a text file, using string representations of elements.
saveLabeledData(RDD<LabeledPoint>, String) - Static method in class org.apache.spark.mllib.util.MLUtils
sc() - Method in class org.apache.spark.api.java.JavaSparkContext
 
sc() - Method in class org.apache.spark.scheduler.DAGScheduler
 
sc() - Method in class org.apache.spark.scheduler.TaskSchedulerImpl
 
sc() - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
 
sc() - Method in class org.apache.spark.streaming.StreamingContext
 
sc() - Method in class org.apache.spark.ui.exec.ExecutorsTab
 
sc() - Method in class org.apache.spark.ui.jobs.JobsTab
 
sc() - Method in class org.apache.spark.ui.jobs.StagesTab
 
sc() - Method in class org.apache.spark.ui.SparkUI
 
scal(double, Vector) - Static method in class org.apache.spark.mllib.linalg.BLAS
x = a * x
scalaIntToJavaLong(DStream<Object>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
 
scalaTag() - Method in class org.apache.spark.sql.columnar.NativeColumnType
Scala TypeTag.
scalaToJavaLong(JavaPairDStream<K, Object>, ClassTag<K>) - Static method in class org.apache.spark.streaming.api.java.JavaPairDStream
 
ScalaToJavaUDTWrapper<UserType> - Class in org.apache.spark.sql.api.java
Java wrapper for a Scala UserDefinedType
ScalaToJavaUDTWrapper(UserDefinedType<UserType>) - Constructor for class org.apache.spark.sql.api.java.ScalaToJavaUDTWrapper
 
scalaUDT() - Method in class org.apache.spark.sql.api.java.ScalaToJavaUDTWrapper
 
Schedulable - Interface in org.apache.spark.scheduler
An interface for schedulable entities.
SchedulableBuilder - Interface in org.apache.spark.scheduler
An interface to build a Schedulable tree; buildPools builds the tree nodes (pools) and addTaskSetManager builds the leaf nodes (TaskSetManagers).
schedulableBuilder() - Method in class org.apache.spark.scheduler.TaskSchedulerImpl
 
schedulableNameToSchedulable() - Method in class org.apache.spark.scheduler.Pool
 
schedulableQueue() - Method in class org.apache.spark.scheduler.Pool
 
schedulableQueue() - Method in interface org.apache.spark.scheduler.Schedulable
 
schedulableQueue() - Method in class org.apache.spark.scheduler.TaskSetManager
 
scheduler() - Method in class org.apache.spark.streaming.StreamingContext
 
SCHEDULER_DELAY() - Static method in class org.apache.spark.ui.jobs.TaskDetailsClassNames
 
SCHEDULER_DELAY() - Static method in class org.apache.spark.ui.ToolTips
 
schedulerAllocFile() - Method in class org.apache.spark.scheduler.FairSchedulableBuilder
 
SchedulerBackend - Interface in org.apache.spark.scheduler
A backend interface for scheduling systems that allows plugging in different ones under TaskSchedulerImpl.
schedulerBackend() - Method in class org.apache.spark.SparkContext
 
SCHEDULING_MODE_PROPERTY() - Method in class org.apache.spark.scheduler.FairSchedulableBuilder
 
SchedulingAlgorithm - Interface in org.apache.spark.scheduler
An interface for sort algorithms; FIFO: FIFO algorithm between TaskSetManagers; FS: FS algorithm between Pools, with FIFO or FS within Pools.
schedulingDelay() - Method in class org.apache.spark.streaming.scheduler.BatchInfo
Time taken for the first job of this batch to start processing from the time this batch was submitted to the streaming scheduler.
schedulingDelayDistribution() - Method in class org.apache.spark.streaming.ui.StreamingJobProgressListener
 
schedulingMode() - Method in class org.apache.spark.scheduler.Pool
 
schedulingMode() - Method in interface org.apache.spark.scheduler.Schedulable
 
SchedulingMode - Class in org.apache.spark.scheduler
"FAIR" and "FIFO" determines which policy is used to order tasks amongst a Schedulable's sub-queues "NONE" is used when the a Schedulable has no sub-queues.
SchedulingMode() - Constructor for class org.apache.spark.scheduler.SchedulingMode
 
schedulingMode() - Method in interface org.apache.spark.scheduler.TaskScheduler
 
schedulingMode() - Method in class org.apache.spark.scheduler.TaskSchedulerImpl
 
schedulingMode() - Method in class org.apache.spark.scheduler.TaskSetManager
 
schedulingMode() - Method in class org.apache.spark.ui.jobs.JobProgressListener
 
schedulingPool() - Method in class org.apache.spark.ui.jobs.UIData.StageUIData
 
schema() - Method in class org.apache.spark.sql.api.java.JavaSchemaRDD
Returns the schema of this JavaSchemaRDD (represented by a StructType).
schema() - Method in class org.apache.spark.sql.columnar.ColumnStatisticsSchema
 
schema() - Method in class org.apache.spark.sql.columnar.PartitionStatistics
 
schema() - Method in class org.apache.spark.sql.execution.AggregateEvaluation
 
schema() - Method in class org.apache.spark.sql.json.JSONRelation
 
schema() - Method in class org.apache.spark.sql.parquet.ParquetRelation2
 
schema() - Method in class org.apache.spark.sql.SchemaRDD
Returns the schema of this SchemaRDD (represented by a StructType).
schema() - Method in class org.apache.spark.sql.sources.BaseRelation
 
schemaRDD() - Method in class org.apache.spark.sql.api.java.JavaSchemaRDD
Returns the underlying Scala SchemaRDD.
SchemaRDD - Class in org.apache.spark.sql
:: AlphaComponent :: An RDD of Row objects that has an associated schema.
SchemaRDD(SQLContext, LogicalPlan) - Constructor for class org.apache.spark.sql.SchemaRDD
 
SchemaRDDLike - Interface in org.apache.spark.sql
Contains functions that are shared between all SchemaRDD types (i.e., Scala, Java)
schemaString() - Method in interface org.apache.spark.sql.SchemaRDDLike
Returns the schema as a string in the tree format.
schemes() - Method in interface org.apache.spark.sql.columnar.compression.AllCompressionSchemes
 
schemes() - Method in interface org.apache.spark.sql.columnar.compression.WithCompressionSchemes
 
scoreCol() - Method in interface org.apache.spark.ml.param.HasScoreCol
param for score column name
scratch() - Method in class org.apache.spark.mllib.optimization.NNLS.Workspace
 
script() - Method in class org.apache.spark.sql.hive.execution.ScriptTransformation
 
Scripts() - Method in interface org.apache.spark.sql.hive.HiveStrategies
 
ScriptTransformation - Class in org.apache.spark.sql.hive.execution
:: DeveloperApi :: Transforms the input by forking and running the specified script.
ScriptTransformation(Seq<Expression>, String, Seq<Attribute>, SparkPlan, HiveContext) - Constructor for class org.apache.spark.sql.hive.execution.ScriptTransformation
 
seconds() - Static method in class org.apache.spark.scheduler.StatsReportListener
 
seconds(long) - Static method in class org.apache.spark.streaming.Durations
 
Seconds - Class in org.apache.spark.streaming
Helper object that creates instances of Duration representing a given number of seconds.
Seconds() - Constructor for class org.apache.spark.streaming.Seconds
 
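A minimal Scala sketch, assuming a hypothetical SparkConf conf; batches are formed every 10 seconds:

    import org.apache.spark.streaming.{Seconds, StreamingContext}
    val ssc = new StreamingContext(conf, Seconds(10))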
SecurityManager - Class in org.apache.spark
Spark class responsible for security.
SecurityManager(SparkConf) - Constructor for class org.apache.spark.SecurityManager
 
securityManager() - Method in class org.apache.spark.SparkEnv
 
securityManager() - Method in class org.apache.spark.ui.SparkUI
 
seed() - Method in class org.apache.spark.mllib.rdd.RandomRDDPartition
 
seed() - Method in class org.apache.spark.rdd.PartitionwiseSampledRDDPartition
 
seed() - Method in class org.apache.spark.rdd.SampledRDDPartition
 
seed() - Method in class org.apache.spark.sql.execution.Sample
 
seenNulls() - Method in interface org.apache.spark.sql.columnar.NullableColumnAccessor
 
segment() - Method in class org.apache.spark.streaming.rdd.WriteAheadLogBackedBlockRDDPartition
 
segment() - Method in class org.apache.spark.streaming.receiver.WriteAheadLogBasedStoreResult
 
select(Seq<Expression>) - Method in class org.apache.spark.sql.SchemaRDD
Changes the output of this relation to the given expressions, similar to the SELECT clause in SQL.
selectNodesToSplit(Queue<Tuple2<Object, Node>>, long, DecisionTreeMetadata, Random) - Static method in class org.apache.spark.mllib.tree.RandomForest
Pull nodes off of the queue, and collect a group of nodes to be split on this iteration.
sender() - Method in class org.apache.spark.storage.BlockManagerMessages.RegisterBlockManager
 
sendToDst(A) - Method in class org.apache.spark.graphx.EdgeContext
Sends a message to the destination vertex.
sendToDst(A) - Method in class org.apache.spark.graphx.impl.AggregatingEdgeContext
 
sendToSrc(A) - Method in class org.apache.spark.graphx.EdgeContext
Sends a message to the source vertex.
sendToSrc(A) - Method in class org.apache.spark.graphx.impl.AggregatingEdgeContext
 
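Both methods are normally called from the sendMsg function passed to Graph.aggregateMessages; a minimal in-degree Scala sketch, assuming an existing Graph graph:

    // Send 1 to each edge's destination vertex, then sum the messages per vertex.
    val inDegrees = graph.aggregateMessages[Int](ctx => ctx.sendToDst(1), _ + _)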
sequenceFile(String, Class<K>, Class<V>, int) - Method in class org.apache.spark.api.java.JavaSparkContext
Get an RDD for a Hadoop SequenceFile with given key and value types.
sequenceFile(String, Class<K>, Class<V>) - Method in class org.apache.spark.api.java.JavaSparkContext
Get an RDD for a Hadoop SequenceFile.
sequenceFile(String, Class<K>, Class<V>, int) - Method in class org.apache.spark.SparkContext
Get an RDD for a Hadoop SequenceFile with given key and value types.
sequenceFile(String, Class<K>, Class<V>) - Method in class org.apache.spark.SparkContext
Get an RDD for a Hadoop SequenceFile with given key and value types.
sequenceFile(String, int, ClassTag<K>, ClassTag<V>, Function0<WritableConverter<K>>, Function0<WritableConverter<V>>) - Method in class org.apache.spark.SparkContext
Version of sequenceFile() for types implicitly convertible to Writables through a WritableConverter.
SequenceFileRDDFunctions<K,V> - Class in org.apache.spark.rdd
Extra functions available on RDDs of (key, value) pairs to create a Hadoop SequenceFile, through an implicit conversion.
SequenceFileRDDFunctions(RDD<Tuple2<K, V>>, Function1<K, Writable>, ClassTag<K>, Function1<V, Writable>, ClassTag<V>) - Constructor for class org.apache.spark.rdd.SequenceFileRDDFunctions
 
ser() - Method in class org.apache.spark.scheduler.TaskSetManager
 
ser() - Method in class org.apache.spark.sql.execution.KryoResourcePool
 
SerializableBuffer - Class in org.apache.spark.util
A wrapper around a java.nio.ByteBuffer that is serializable through Java serialization, to make it easier to pass ByteBuffers in case class messages.
SerializableBuffer(ByteBuffer) - Constructor for class org.apache.spark.util.SerializableBuffer
 
serializableHadoopSplit() - Method in class org.apache.spark.rdd.NewHadoopPartition
 
SerializableWritable<T extends org.apache.hadoop.io.Writable> - Class in org.apache.spark
 
SerializableWritable(T) - Constructor for class org.apache.spark.SerializableWritable
 
SerializationStream - Class in org.apache.spark.serializer
:: DeveloperApi :: A stream for writing serialized objects.
SerializationStream() - Constructor for class org.apache.spark.serializer.SerializationStream
 
serialize(Object) - Method in class org.apache.spark.mllib.linalg.VectorUDT
 
serialize(T, ClassTag<T>) - Method in class org.apache.spark.serializer.JavaSerializerInstance
 
serialize(T, ClassTag<T>) - Method in class org.apache.spark.serializer.KryoSerializerInstance
 
serialize(T, ClassTag<T>) - Method in class org.apache.spark.serializer.SerializerInstance
 
serialize(Object) - Method in class org.apache.spark.sql.api.java.JavaToScalaUDTWrapper
Convert the user type to a SQL datum
serialize(Object) - Method in class org.apache.spark.sql.api.java.ScalaToJavaUDTWrapper
Convert the user type to a SQL datum
serialize(Object) - Method in class org.apache.spark.sql.api.java.UserDefinedType
Convert the user type to a SQL datum
serialize(T, ClassTag<T>) - Static method in class org.apache.spark.sql.execution.SparkSqlSerializer
 
serialize(Object, ObjectInspector) - Method in class org.apache.spark.sql.hive.parquet.FakeParquetSerDe
 
serialize(Object) - Method in class org.apache.spark.sql.test.ExamplePointUDT
 
serialize(T) - Static method in class org.apache.spark.util.Utils
Serialize an object using Java serialization
serializedData() - Method in class org.apache.spark.scheduler.local.StatusUpdate
 
serializedTask() - Method in class org.apache.spark.scheduler.TaskDescription
 
serializeFilterExpressions(Seq<Expression>, Configuration) - Static method in class org.apache.spark.sql.parquet.ParquetFilters
Note: Inside the Hadoop API we only have access to Configuration, not to SparkContext, so we cannot use broadcasts to convey the actual filter predicate.
serializeMapStatuses(MapStatus[]) - Static method in class org.apache.spark.MapOutputTracker
 
serializePlan(Object, OutputStream) - Method in class org.apache.spark.sql.hive.HiveFunctionWrapper
 
Serializer - Class in org.apache.spark.serializer
:: DeveloperApi :: A serializer.
Serializer() - Constructor for class org.apache.spark.serializer.Serializer
 
serializer() - Method in class org.apache.spark.ShuffleDependency
 
serializer() - Method in class org.apache.spark.SparkEnv
 
SerializerInstance - Class in org.apache.spark.serializer
:: DeveloperApi :: An instance of a serializer, for use by one thread at a time.
SerializerInstance() - Constructor for class org.apache.spark.serializer.SerializerInstance
 
serializeStream(OutputStream) - Method in class org.apache.spark.serializer.JavaSerializerInstance
 
serializeStream(OutputStream) - Method in class org.apache.spark.serializer.KryoSerializerInstance
 
serializeStream(OutputStream) - Method in class org.apache.spark.serializer.SerializerInstance
 
serializeViaNestedStream(OutputStream, SerializerInstance, Function1<SerializationStream, BoxedUnit>) - Static method in class org.apache.spark.util.Utils
Serialize via nested stream using specific serializer
serializeWithDependencies(Task<?>, HashMap<String, Object>, HashMap<String, Object>, SerializerInstance) - Static method in class org.apache.spark.scheduler.Task
Serialize a task and the current app dependencies (files and JARs added to the SparkContext)
server() - Method in class org.apache.spark.streaming.flume.FlumeReceiver
 
server() - Method in class org.apache.spark.ui.ServerInfo
 
ServerInfo - Class in org.apache.spark.ui
 
ServerInfo(Server, int, ContextHandlerCollection) - Constructor for class org.apache.spark.ui.ServerInfo
 
ServerStateException - Exception in org.apache.spark
Exception type thrown by HttpServer when it is in the wrong state for an operation.
ServerStateException(String) - Constructor for exception org.apache.spark.ServerStateException
 
serverUri() - Method in class org.apache.spark.HttpFileServer
 
SERVLET_DEFAULT_SAMPLE() - Method in class org.apache.spark.metrics.sink.MetricsServlet
 
SERVLET_KEY_PATH() - Method in class org.apache.spark.metrics.sink.MetricsServlet
 
SERVLET_KEY_SAMPLE() - Method in class org.apache.spark.metrics.sink.MetricsServlet
 
servletPath() - Method in class org.apache.spark.metrics.sink.MetricsServlet
 
servletShowSample() - Method in class org.apache.spark.metrics.sink.MetricsServlet
 
set(long, long, int, int, VD, VD, ED) - Method in class org.apache.spark.graphx.impl.AggregatingEdgeContext
 
set(Param<T>, T) - Method in interface org.apache.spark.ml.param.Params
Sets a parameter in the embedded param map.
set(String, String) - Method in class org.apache.spark.SparkConf
Set a configuration variable.
set(SparkEnv) - Static method in class org.apache.spark.SparkEnv
 
set(Function0<Object>) - Method in class org.apache.spark.sql.hive.DeferredObjectAdapter
 
setAcls(boolean) - Method in class org.apache.spark.SecurityManager
 
setActiveContext(SparkContext, boolean) - Static method in class org.apache.spark.SparkContext
Called at the end of the SparkContext constructor to ensure that no other SparkContext has raced with this constructor and started.
setAdminAcls(String) - Method in class org.apache.spark.SecurityManager
Admin acls should be set before the view or modify acls.
setAggregator(Aggregator<K, V, C>) - Method in class org.apache.spark.rdd.ShuffledRDD
Set aggregator for RDD's shuffle.
setAlgo(String) - Method in class org.apache.spark.mllib.tree.configuration.Strategy
Sets Algorithm using a String.
setAll(Traversable<Tuple2<String, String>>) - Method in class org.apache.spark.SparkConf
Set multiple parameters together
setAlpha(double) - Method in class org.apache.spark.mllib.recommendation.ALS
:: Experimental :: Sets the constant used in computing confidence in implicit ALS.
setAppName(String) - Method in class org.apache.spark.SparkConf
Set a name for your application.
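A minimal Scala sketch chaining several SparkConf setters, with hypothetical values:

    import org.apache.spark.{SparkConf, SparkContext}
    val conf = new SparkConf()
      .setAppName("index-example")        // displayed in the Spark UI
      .setMaster("local[2]")              // hypothetical master URL
      .set("spark.executor.memory", "1g")
    val sc = new SparkContext(conf)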
setAppName(String) - Method in class org.apache.spark.ui.SparkUI
Set the app name for this UI.
setBatchDuration(Duration) - Method in class org.apache.spark.streaming.DStreamGraph
 
setBlocks(int) - Method in class org.apache.spark.mllib.recommendation.ALS
Set the number of blocks for both user blocks and product blocks to parallelize the computation into; pass -1 for an auto-configured number of blocks.
setCallSite(String) - Method in class org.apache.spark.api.java.JavaSparkContext
Pass-through to SparkContext.setCallSite.
setCallSite(String) - Method in class org.apache.spark.SparkContext
Set the thread-local property for overriding the call sites of actions and RDDs.
setCallSite(CallSite) - Method in class org.apache.spark.SparkContext
Set the thread-local property for overriding the call sites of actions and RDDs.
setCategoricalFeaturesInfo(Map<Integer, Integer>) - Method in class org.apache.spark.mllib.tree.configuration.Strategy
Sets categoricalFeaturesInfo using a Java Map.
setCheckpointDir(String) - Method in class org.apache.spark.api.java.JavaSparkContext
Set the directory under which RDDs are going to be checkpointed.
setCheckpointDir(Option<String>) - Method in class org.apache.spark.mllib.tree.configuration.Strategy
 
setCheckpointDir(String) - Method in class org.apache.spark.SparkContext
Set the directory under which RDDs are going to be checkpointed.
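A minimal Scala sketch, assuming an existing SparkContext sc and a hypothetical directory:

    sc.setCheckpointDir("hdfs:///tmp/checkpoints")
    val rdd = sc.parallelize(1 to 100).map(_ * 2)
    rdd.checkpoint() // materialized to the checkpoint dir on the next action
    rdd.count()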
setCheckpointInterval(int) - Method in class org.apache.spark.mllib.tree.configuration.Strategy
 
setClock(Clock) - Method in class org.apache.spark.ExecutorAllocationManager
Use a different clock for this allocation manager.
SetCommand - Class in org.apache.spark.sql.execution
:: DeveloperApi ::
SetCommand(Option<Tuple2<String, Option<String>>>, Seq<Attribute>, SQLContext) - Constructor for class org.apache.spark.sql.execution.SetCommand
 
setCompressCodec(String) - Method in class org.apache.spark.sql.hive.ShimFileSinkDesc
 
setCompressed(boolean) - Method in class org.apache.spark.sql.hive.ShimFileSinkDesc
 
setCompressType(String) - Method in class org.apache.spark.sql.hive.ShimFileSinkDesc
 
setConf(Configuration) - Method in class org.apache.spark.input.WholeCombineFileRecordReader
 
setConf(Configuration) - Method in class org.apache.spark.input.WholeTextFileInputFormat
 
setConf(Configuration) - Method in class org.apache.spark.input.WholeTextFileRecordReader
 
setConf(String, String) - Method in class org.apache.spark.sql.hive.HiveContext
 
setConf(Properties) - Method in interface org.apache.spark.sql.SQLConf
Set Spark SQL configuration properties.
setConf(String, String) - Method in interface org.apache.spark.sql.SQLConf
Set the given Spark SQL configuration property.
setContext(StreamingContext) - Method in class org.apache.spark.streaming.dstream.DStream
 
setContext(StreamingContext) - Method in class org.apache.spark.streaming.DStreamGraph
 
setConvergenceTol(double) - Method in class org.apache.spark.mllib.optimization.LBFGS
Set the convergence tolerance of iterations for L-BFGS.
setCustomHostname(String) - Static method in class org.apache.spark.util.Utils
Allow setting a custom host name because when we run on Mesos we need to use the same hostname it reports to the master.
setDAGScheduler(DAGScheduler) - Method in interface org.apache.spark.scheduler.TaskScheduler
 
setDAGScheduler(DAGScheduler) - Method in class org.apache.spark.scheduler.TaskSchedulerImpl
 
setDecayFactor(double) - Method in class org.apache.spark.mllib.clustering.StreamingKMeans
Set the decay factor directly (for forgetful algorithms).
setDefaultClassLoader(ClassLoader) - Method in class org.apache.spark.serializer.Serializer
Sets a class loader for the serializer to use in deserialization.
setDelaySeconds(SparkConf, Enumeration.Value, int) - Static method in class org.apache.spark.util.MetadataCleaner
 
setDelaySeconds(SparkConf, int, boolean) - Static method in class org.apache.spark.util.MetadataCleaner
Set the default delay time (in seconds).
setDestTableId(int) - Method in class org.apache.spark.sql.hive.ShimFileSinkDesc
 
setEpsilon(double) - Method in class org.apache.spark.mllib.clustering.KMeans
Set the distance threshold within which we consider centers to have converged.
setEstimator(Estimator<?>) - Method in class org.apache.spark.ml.tuning.CrossValidator
 
setEstimatorParamMaps(ParamMap[]) - Method in class org.apache.spark.ml.tuning.CrossValidator
 
setEvaluator(Evaluator) - Method in class org.apache.spark.ml.tuning.CrossValidator
 
setExecutorEnv(String, String) - Method in class org.apache.spark.SparkConf
Set an environment variable to be used when launching executors for this application.
setExecutorEnv(Seq<Tuple2<String, String>>) - Method in class org.apache.spark.SparkConf
Set multiple environment variables to be used when launching executors.
setExecutorEnv(Tuple2<String, String>[]) - Method in class org.apache.spark.SparkConf
Set multiple environment variables to be used when launching executors.
setFailure(Exception) - Method in class org.apache.spark.partial.PartialResult
 
setFeatureScaling(boolean) - Method in class org.apache.spark.mllib.regression.GeneralizedLinearAlgorithm
Set if the algorithm should use feature scaling to improve the convergence during optimization.
setFeaturesCol(String) - Method in class org.apache.spark.ml.classification.LogisticRegression
 
setFeaturesCol(String) - Method in class org.apache.spark.ml.classification.LogisticRegressionModel
 
setField(MutableRow, int, byte[]) - Static method in class org.apache.spark.sql.columnar.BINARY
 
setField(MutableRow, int, boolean) - Static method in class org.apache.spark.sql.columnar.BOOLEAN
 
setField(MutableRow, int, byte) - Static method in class org.apache.spark.sql.columnar.BYTE
 
setField(MutableRow, int, JvmType) - Method in class org.apache.spark.sql.columnar.ColumnType
Sets row(ordinal) to field.
setField(MutableRow, int, Date) - Static method in class org.apache.spark.sql.columnar.DATE
 
setField(MutableRow, int, double) - Static method in class org.apache.spark.sql.columnar.DOUBLE
 
setField(MutableRow, int, float) - Static method in class org.apache.spark.sql.columnar.FLOAT
 
setField(MutableRow, int, byte[]) - Static method in class org.apache.spark.sql.columnar.GENERIC
 
setField(MutableRow, int, int) - Static method in class org.apache.spark.sql.columnar.INT
 
setField(MutableRow, int, long) - Static method in class org.apache.spark.sql.columnar.LONG
 
setField(MutableRow, int, short) - Static method in class org.apache.spark.sql.columnar.SHORT
 
setField(MutableRow, int, String) - Static method in class org.apache.spark.sql.columnar.STRING
 
setField(MutableRow, int, Timestamp) - Static method in class org.apache.spark.sql.columnar.TIMESTAMP
 
setFinalValue(R) - Method in class org.apache.spark.partial.PartialResult
 
setGradient(Gradient) - Method in class org.apache.spark.mllib.optimization.GradientDescent
Set the gradient function (of the loss function of one single data example) to be used for SGD.
setGradient(Gradient) - Method in class org.apache.spark.mllib.optimization.LBFGS
Set the gradient function (of the loss function of one single data example) to be used for L-BFGS.
setGraph(DStreamGraph) - Method in class org.apache.spark.streaming.dstream.DStream
 
setHalfLife(double, String) - Method in class org.apache.spark.mllib.clustering.StreamingKMeans
Set the half life and time unit ("batches" or "points") for forgetful algorithms.
setId(int) - Method in class org.apache.spark.streaming.scheduler.Job
 
setIfMissing(String, String) - Method in class org.apache.spark.SparkConf
Set a parameter if it isn't already configured.
setImplicitPrefs(boolean) - Method in class org.apache.spark.mllib.recommendation.ALS
Sets whether to use implicit preference.
setImpurity(Impurity) - Method in class org.apache.spark.mllib.tree.configuration.Strategy
 
setInitialCenters(Vector[], double[]) - Method in class org.apache.spark.mllib.clustering.StreamingKMeans
Specify initial centers directly.
setInitializationMode(String) - Method in class org.apache.spark.mllib.clustering.KMeans
Set the initialization algorithm.
setInitializationSteps(int) - Method in class org.apache.spark.mllib.clustering.KMeans
Set the number of steps for the k-means|| initialization mode.
setInitialWeights(Vector) - Method in class org.apache.spark.mllib.regression.StreamingLinearRegressionWithSGD
Set the initial weights.
setInputCol(String) - Method in class org.apache.spark.ml.feature.StandardScaler
 
setInputCol(String) - Method in class org.apache.spark.ml.feature.StandardScalerModel
 
setInputCol(String) - Method in class org.apache.spark.ml.UnaryTransformer
 
setIntercept(boolean) - Method in class org.apache.spark.mllib.regression.GeneralizedLinearAlgorithm
Set if the algorithm should add an intercept.
setIntermediateRDDStorageLevel(StorageLevel) - Method in class org.apache.spark.mllib.recommendation.ALS
:: DeveloperApi :: Sets storage level for intermediate RDDs (user/product in/out links).
setIterations(int) - Method in class org.apache.spark.mllib.recommendation.ALS
Set the number of iterations to run.
setJars(Seq<String>) - Method in class org.apache.spark.SparkConf
Set JAR files to distribute to the cluster.
setJars(String[]) - Method in class org.apache.spark.SparkConf
Set JAR files to distribute to the cluster.
setJobDescription(String) - Method in class org.apache.spark.SparkContext
Set a human-readable description of the current job.
setJobGroup(String, String, boolean) - Method in class org.apache.spark.api.java.JavaSparkContext
Assigns a group ID to all the jobs started by this thread until the group ID is set to a different value or cleared.
setJobGroup(String, String) - Method in class org.apache.spark.api.java.JavaSparkContext
Assigns a group ID to all the jobs started by this thread until the group ID is set to a different value or cleared.
setJobGroup(String, String, boolean) - Method in class org.apache.spark.SparkContext
Assigns a group ID to all the jobs started by this thread until the group ID is set to a different value or cleared.
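For example, a minimal sketch of job-group usage (assuming an existing SparkContext sc; the group ID and description are illustrative):

    // Tag all jobs submitted from this thread with a group ID.
    sc.setJobGroup("nightly-etl", "Nightly ETL jobs", interruptOnCancel = true)
    sc.parallelize(1 to 1000000).count()
    // Later, possibly from another thread, cancel every job in the group.
    sc.cancelJobGroup("nightly-etl")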
setK(int) - Method in class org.apache.spark.mllib.clustering.KMeans
Set the number of clusters to create (k).
setK(int) - Method in class org.apache.spark.mllib.clustering.StreamingKMeans
Set the number of clusters.
setKeyOrdering(Ordering<K>) - Method in class org.apache.spark.rdd.ShuffledRDD
Set key ordering for RDD's shuffle.
setLabelCol(String) - Method in class org.apache.spark.ml.classification.LogisticRegression
 
setLabelCol(String) - Method in class org.apache.spark.ml.evaluation.BinaryClassificationEvaluator
 
setLambda(double) - Method in class org.apache.spark.mllib.classification.NaiveBayes
Set the smoothing parameter.
setLambda(double) - Method in class org.apache.spark.mllib.recommendation.ALS
Set the regularization parameter, lambda.
setLearningRate(double) - Method in class org.apache.spark.mllib.feature.Word2Vec
Sets initial learning rate (default: 0.025).
setLearningRate(double) - Method in class org.apache.spark.mllib.tree.configuration.BoostingStrategy
 
setLocalProperties(Properties) - Method in class org.apache.spark.SparkContext
 
setLocalProperty(String, String) - Method in class org.apache.spark.api.java.JavaSparkContext
Set a local property that affects jobs submitted from this thread, such as the Spark fair scheduler pool.
setLocalProperty(String, String) - Method in class org.apache.spark.SparkContext
Set a local property that affects jobs submitted from this thread, such as the Spark fair scheduler pool.
setLocation(Table, CreateTableDesc) - Static method in class org.apache.spark.sql.hive.HiveShim
 
setLoss(Loss) - Method in class org.apache.spark.mllib.tree.configuration.BoostingStrategy
 
setMapSideCombine(boolean) - Method in class org.apache.spark.rdd.ShuffledRDD
Set mapSideCombine flag for RDD's shuffle.
setMaster(String) - Method in class org.apache.spark.SparkConf
The master URL to connect to, such as "local" to run locally with one thread, "local[4]" to run locally with 4 cores, or "spark://master:7077" to run on a Spark standalone cluster.
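A minimal SparkConf sketch tying the setters above together (the property values are illustrative):

    import org.apache.spark.{SparkConf, SparkContext}

    val conf = new SparkConf()
      .setMaster("local[4]")                  // four local worker threads
      .setAppName("ConfExample")
      .setExecutorEnv("MY_ENV_KEY", "value")  // environment for executors
      .setIfMissing("spark.ui.port", "4040")  // applied only if not already set
    val sc = new SparkContext(conf)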
setMaxBins(int) - Method in class org.apache.spark.mllib.tree.configuration.Strategy
 
setMaxDepth(int) - Method in class org.apache.spark.mllib.tree.configuration.Strategy
 
setMaxIter(int) - Method in class org.apache.spark.ml.classification.LogisticRegression
 
setMaxIterations(int) - Method in class org.apache.spark.mllib.clustering.KMeans
Set maximum number of iterations to run.
setMaxMemoryInMB(int) - Method in class org.apache.spark.mllib.tree.configuration.Strategy
 
setMaxNumIterations(int) - Method in class org.apache.spark.mllib.optimization.LBFGS
Deprecated.
setMetricName(String) - Method in class org.apache.spark.ml.evaluation.BinaryClassificationEvaluator
 
setMiniBatchFraction(double) - Method in class org.apache.spark.mllib.optimization.GradientDescent
:: Experimental :: Set fraction of data to be used for each SGD iteration.
setMiniBatchFraction(double) - Method in class org.apache.spark.mllib.regression.StreamingLinearRegressionWithSGD
Set the fraction of each batch to use for updates.
setMinInfoGain(double) - Method in class org.apache.spark.mllib.tree.configuration.Strategy
 
setMinInstancesPerNode(int) - Method in class org.apache.spark.mllib.tree.configuration.Strategy
 
setMinPartitions(JobContext, int) - Method in class org.apache.spark.input.StreamFileInputFormat
Allow minPartitions set by the end user, in order to keep compatibility with the old Hadoop API, which is set through setMaxSplitSize.
setMinPartitions(JobContext, int) - Method in class org.apache.spark.input.WholeTextFileInputFormat
Allow minPartitions set by the end user, in order to keep compatibility with the old Hadoop API, which is set through setMaxSplitSize.
setModifyAcls(Set<String>, String) - Method in class org.apache.spark.SecurityManager
Admin acls should be set before the view or modify acls.
setName(String) - Method in class org.apache.spark.api.java.JavaDoubleRDD
Assign a name to this RDD.
setName(String) - Method in class org.apache.spark.api.java.JavaPairRDD
Assign a name to this RDD.
setName(String) - Method in class org.apache.spark.api.java.JavaRDD
Assign a name to this RDD.
setName(String) - Method in class org.apache.spark.graphx.impl.EdgeRDDImpl
 
setName(String) - Method in class org.apache.spark.graphx.impl.VertexRDDImpl
 
setName(String) - Method in class org.apache.spark.rdd.RDD
Assign a name to this RDD.
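For example (a minimal sketch, assuming an existing SparkContext sc):

    val numbers = sc.parallelize(1 to 100).setName("numbers").cache()
    // Once cached, the name appears on the Storage tab of the web UI.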
setName(String) - Method in class org.apache.spark.sql.api.java.JavaSchemaRDD
Assign a name to this RDD.
setNonnegative(boolean) - Method in class org.apache.spark.mllib.recommendation.ALS
Set whether the least-squares problems solved at each iteration should have nonnegativity constraints.
setNumClasses(int) - Method in class org.apache.spark.mllib.tree.configuration.Strategy
 
setNumCorrections(int) - Method in class org.apache.spark.mllib.optimization.LBFGS
Set the number of corrections used in the LBFGS update.
setNumFeatures(int) - Method in class org.apache.spark.ml.feature.HashingTF
 
setNumFolds(int) - Method in class org.apache.spark.ml.tuning.CrossValidator
 
setNumIterations(int) - Method in class org.apache.spark.mllib.feature.Word2Vec
Sets number of iterations (default: 1), which should be smaller than or equal to number of partitions.
setNumIterations(int) - Method in class org.apache.spark.mllib.optimization.GradientDescent
Set the number of iterations for SGD.
setNumIterations(int) - Method in class org.apache.spark.mllib.optimization.LBFGS
Set the maximal number of iterations for L-BFGS.
setNumIterations(int) - Method in class org.apache.spark.mllib.regression.StreamingLinearRegressionWithSGD
Set the number of iterations of gradient descent to run per update.
setNumIterations(int) - Method in class org.apache.spark.mllib.tree.configuration.BoostingStrategy
 
setNumPartitions(int) - Method in class org.apache.spark.mllib.feature.Word2Vec
Sets number of partitions (default: 1).
setNumSplits(int, int) - Method in class org.apache.spark.mllib.tree.impl.DecisionTreeMetadata
Set number of splits for a continuous feature.
setOutputCol(String) - Method in class org.apache.spark.ml.feature.StandardScaler
 
setOutputCol(String) - Method in class org.apache.spark.ml.feature.StandardScalerModel
 
setOutputCol(String) - Method in class org.apache.spark.ml.UnaryTransformer
 
setPredictionCol(String) - Method in class org.apache.spark.ml.classification.LogisticRegression
 
setPredictionCol(String) - Method in class org.apache.spark.ml.classification.LogisticRegressionModel
 
setProductBlocks(int) - Method in class org.apache.spark.mllib.recommendation.ALS
Set the number of product blocks to parallelize the computation.
setQuantileCalculationStrategy(Enumeration.Value) - Method in class org.apache.spark.mllib.tree.configuration.Strategy
 
setRandomCenters(int, double, long) - Method in class org.apache.spark.mllib.clustering.StreamingKMeans
Initialize random centers, requiring only the number of dimensions.
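A minimal StreamingKMeans sketch combining the setters above (assumes a DStream[Vector] called trainingStream; all values are illustrative):

    import org.apache.spark.mllib.clustering.StreamingKMeans

    val model = new StreamingKMeans()
      .setK(2)
      .setHalfLife(5.0, "batches")    // forget old data with a 5-batch half-life
      .setRandomCenters(3, 0.0, 42L)  // 3 dimensions, zero initial weight, fixed seed
    model.trainOn(trainingStream)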
setRank(int) - Method in class org.apache.spark.mllib.recommendation.ALS
Set the rank of the feature matrices computed (number of features).
setReceiverId(int) - Method in class org.apache.spark.streaming.receiver.Receiver
Set the ID of the DStream that this receiver is associated with.
setRegParam(double) - Method in class org.apache.spark.ml.classification.LogisticRegression
 
setRegParam(double) - Method in class org.apache.spark.mllib.optimization.GradientDescent
Set the regularization parameter.
setRegParam(double) - Method in class org.apache.spark.mllib.optimization.LBFGS
Set the regularization parameter.
setRest(long, int, VD, ED) - Method in class org.apache.spark.graphx.impl.AggregatingEdgeContext
 
setRuns(int) - Method in class org.apache.spark.mllib.clustering.KMeans
:: Experimental :: Set the number of runs of the algorithm to execute in parallel.
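A minimal KMeans configuration sketch pulling the scattered setters together (assumes an RDD[Vector] called data; the values are illustrative):

    import org.apache.spark.mllib.clustering.KMeans

    val model = new KMeans()
      .setK(3)
      .setMaxIterations(20)
      .setInitializationMode(KMeans.K_MEANS_PARALLEL) // "k-means||"
      .setEpsilon(1e-4)
      .run(data)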
setSchema(Seq<Attribute>, Configuration) - Static method in class org.apache.spark.sql.parquet.RowWriteSupport
 
setScoreCol(String) - Method in class org.apache.spark.ml.classification.LogisticRegression
 
setScoreCol(String) - Method in class org.apache.spark.ml.classification.LogisticRegressionModel
 
setScoreCol(String) - Method in class org.apache.spark.ml.evaluation.BinaryClassificationEvaluator
 
setSeed(long) - Method in class org.apache.spark.mllib.feature.Word2Vec
Sets random seed (default: a random long integer).
setSeed(long) - Method in class org.apache.spark.mllib.random.PoissonGenerator
 
setSeed(long) - Method in class org.apache.spark.mllib.random.StandardNormalGenerator
 
setSeed(long) - Method in class org.apache.spark.mllib.random.UniformGenerator
 
setSeed(long) - Method in class org.apache.spark.mllib.recommendation.ALS
Sets a random seed to have deterministic results.
setSeed(long) - Method in class org.apache.spark.util.random.BernoulliCellSampler
 
setSeed(long) - Method in class org.apache.spark.util.random.BernoulliSampler
 
setSeed(long) - Method in class org.apache.spark.util.random.PoissonSampler
 
setSeed(long) - Method in interface org.apache.spark.util.random.Pseudorandom
Set random seed.
setSeed(long) - Method in class org.apache.spark.util.random.XORShiftRandom
 
setSerializer(Serializer) - Method in class org.apache.spark.rdd.CoGroupedRDD
Set a serializer for this RDD's shuffle, or null to use the default (spark.serializer).
setSerializer(Serializer) - Method in class org.apache.spark.rdd.ShuffledRDD
Set a serializer for this RDD's shuffle, or null to use the default (spark.serializer).
setSerializer(Serializer) - Method in class org.apache.spark.rdd.SubtractedRDD
Set a serializer for this RDD's shuffle, or null to use the default (spark.serializer).
setSparkHome(String) - Method in class org.apache.spark.SparkConf
Set the location where Spark is installed on worker nodes.
setSrcOnly(long, int, VD) - Method in class org.apache.spark.graphx.impl.AggregatingEdgeContext
 
setStages(PipelineStage[]) - Method in class org.apache.spark.ml.Pipeline
 
setStepSize(double) - Method in class org.apache.spark.mllib.optimization.GradientDescent
Set the initial step size of SGD for the first step.
setStepSize(double) - Method in class org.apache.spark.mllib.regression.StreamingLinearRegressionWithSGD
Set the step size for gradient descent.
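A minimal sketch of the streaming regression setters above (assumes a DStream[LabeledPoint] called trainingStream; the values are illustrative):

    import org.apache.spark.mllib.linalg.Vectors
    import org.apache.spark.mllib.regression.StreamingLinearRegressionWithSGD

    val model = new StreamingLinearRegressionWithSGD()
      .setStepSize(0.5)
      .setNumIterations(10)
      .setInitialWeights(Vectors.dense(0.0, 0.0)) // two features
    model.trainOn(trainingStream)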
setStreamingLogLevels() - Static method in class org.apache.spark.examples.streaming.StreamingExamples
Set reasonable logging levels for streaming if the user has not configured log4j.
setSubsamplingRate(double) - Method in class org.apache.spark.mllib.tree.configuration.Strategy
 
setTableInfo(TableDesc) - Method in class org.apache.spark.sql.hive.ShimFileSinkDesc
 
setTaskContext(TaskContext) - Static method in class org.apache.spark.TaskContextHelper
 
setThreshold(double) - Method in class org.apache.spark.ml.classification.LogisticRegression
 
setThreshold(double) - Method in class org.apache.spark.ml.classification.LogisticRegressionModel
 
setThreshold(double) - Method in class org.apache.spark.mllib.classification.LogisticRegressionModel
:: Experimental :: Sets the threshold that separates positive predictions from negative predictions.
setThreshold(double) - Method in class org.apache.spark.mllib.classification.SVMModel
:: Experimental :: Sets the threshold that separates positive predictions from negative predictions.
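For example, a minimal sketch (assuming a trained SVMModel or mllib LogisticRegressionModel called model):

    model.setThreshold(0.5)  // scores above 0.5 are classified as positive
    // Or drop the threshold entirely so predict() returns raw scores:
    model.clearThreshold()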
setTime(long) - Method in class org.apache.spark.streaming.util.ManualClock
 
settings() - Method in interface org.apache.spark.sql.SQLConf
Only a low degree of contention is expected for conf, so a ConcurrentHashMap is NOT used.
setTreeStrategy(Strategy) - Method in class org.apache.spark.mllib.tree.configuration.BoostingStrategy
 
setup(int, int, int) - Method in class org.apache.spark.SparkHadoopWriter
 
setUpdater(Updater) - Method in class org.apache.spark.mllib.optimization.GradientDescent
Set the updater function to actually perform a gradient step in a given direction.
setUpdater(Updater) - Method in class org.apache.spark.mllib.optimization.LBFGS
Set the updater function to actually perform a gradient step in a given direction.
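A minimal L-BFGS configuration sketch combining the optimizer setters above (assumes an RDD[(Double, Vector)] called data and an initial weight Vector called initialWeights):

    import org.apache.spark.mllib.optimization.{LBFGS, LogisticGradient, SquaredL2Updater}

    val optimizer = new LBFGS(new LogisticGradient(), new SquaredL2Updater())
      .setNumIterations(50)
      .setNumCorrections(10)
      .setRegParam(0.1)
    val weights = optimizer.optimize(data, initialWeights)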
setupGroups(int) - Method in class org.apache.spark.rdd.PartitionCoalescer
Initializes targetLen partition groups and assigns a preferredLocation to each. This uses the coupon collector problem to estimate how many preferredLocations it must rotate through until it has seen most of the preferred locations (2 * n log(n)).
setUseNodeIdCache(boolean) - Method in class org.apache.spark.mllib.tree.configuration.Strategy
 
setUserBlocks(int) - Method in class org.apache.spark.mllib.recommendation.ALS
Set the number of user blocks to parallelize the computation.
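A minimal ALS configuration sketch pulling the setters above together (assumes an RDD[Rating] called ratings; the values are illustrative):

    import org.apache.spark.mllib.recommendation.ALS

    val model = new ALS()
      .setRank(10)
      .setIterations(10)
      .setLambda(0.01)
      .setImplicitPrefs(false)
      .setSeed(42L)
      .run(ratings)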
setValidateData(boolean) - Method in class org.apache.spark.mllib.regression.GeneralizedLinearAlgorithm
Set if the algorithm should validate data before training.
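A minimal sketch of the GeneralizedLinearAlgorithm setters (assumes an RDD[LabeledPoint] called training; LinearRegressionWithSGD is just one concrete subclass):

    import org.apache.spark.mllib.regression.LinearRegressionWithSGD

    val alg = new LinearRegressionWithSGD()
    alg.setIntercept(true).setValidateData(true)
    alg.optimizer.setNumIterations(100).setStepSize(0.1) // the underlying GradientDescent
    val model = alg.run(training)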
setValue(R) - Method in class org.apache.spark.Accumulable
Set the accumulator's value; only allowed on master
setVectorSize(int) - Method in class org.apache.spark.mllib.feature.Word2Vec
Sets vector size (default: 100).
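A minimal Word2Vec sketch combining the setters above (assumes an RDD[Seq[String]] called corpus; the values shown are the documented defaults plus a fixed seed):

    import org.apache.spark.mllib.feature.Word2Vec

    val model = new Word2Vec()
      .setVectorSize(100)
      .setLearningRate(0.025)
      .setNumIterations(1)
      .setSeed(42L)
      .fit(corpus)
    model.findSynonyms("spark", 5) // five nearest words, with similarity scores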
setViewAcls(Set<String>, String) - Method in class org.apache.spark.SecurityManager
Admin acls should be set before the view or modify acls.
setViewAcls(String, String) - Method in class org.apache.spark.SecurityManager
 
shardId() - Method in class org.apache.spark.streaming.kinesis.KinesisRecordProcessor
 
ShimFileSinkDesc - Class in org.apache.spark.sql.hive
 
ShimFileSinkDesc(String, TableDesc, boolean) - Constructor for class org.apache.spark.sql.hive.ShimFileSinkDesc
 
shippablePartitionToOps(ShippableVertexPartition<VD>, ClassTag<VD>) - Static method in class org.apache.spark.graphx.impl.ShippableVertexPartition
Implicit conversion to allow invoking VertexPartitionBase operations directly on a ShippableVertexPartition.
ShippableVertexPartition<VD> - Class in org.apache.spark.graphx.impl
A map from vertex id to vertex attribute that additionally stores edge partition join sites for each vertex attribute, enabling joining with an EdgeRDD.
ShippableVertexPartition(OpenHashSet<Object>, Object, BitSet, RoutingTablePartition, ClassTag<VD>) - Constructor for class org.apache.spark.graphx.impl.ShippableVertexPartition
 
ShippableVertexPartition.ShippableVertexPartitionOpsConstructor$ - Class in org.apache.spark.graphx.impl
Implicit evidence that ShippableVertexPartition is a member of the VertexPartitionBaseOpsConstructor typeclass.
ShippableVertexPartition.ShippableVertexPartitionOpsConstructor$() - Constructor for class org.apache.spark.graphx.impl.ShippableVertexPartition.ShippableVertexPartitionOpsConstructor$
 
ShippableVertexPartitionOps<VD> - Class in org.apache.spark.graphx.impl
 
ShippableVertexPartitionOps(ShippableVertexPartition<VD>, ClassTag<VD>) - Constructor for class org.apache.spark.graphx.impl.ShippableVertexPartitionOps
 
shipVertexAttributes(boolean, boolean) - Method in class org.apache.spark.graphx.impl.ShippableVertexPartition
Generate a VertexAttributeBlock for each edge partition keyed on the edge partition ID.
shipVertexAttributes(boolean, boolean) - Method in class org.apache.spark.graphx.impl.VertexRDDImpl
 
shipVertexAttributes(boolean, boolean) - Method in class org.apache.spark.graphx.VertexRDD
Generates an RDD of vertex attributes suitable for shipping to the edge partitions.
shipVertexIds() - Method in class org.apache.spark.graphx.impl.ShippableVertexPartition
Generate a VertexId array for each edge partition keyed on the edge partition ID.
shipVertexIds() - Method in class org.apache.spark.graphx.impl.VertexRDDImpl
 
shipVertexIds() - Method in class org.apache.spark.graphx.VertexRDD
Generates an RDD of vertex IDs suitable for shipping to the edge partitions.
SHORT - Class in org.apache.spark.sql.columnar
 
SHORT() - Constructor for class org.apache.spark.sql.columnar.SHORT
 
SHORT_FORM() - Static method in class org.apache.spark.util.CallSite
 
ShortColumnAccessor - Class in org.apache.spark.sql.columnar
 
ShortColumnAccessor(ByteBuffer) - Constructor for class org.apache.spark.sql.columnar.ShortColumnAccessor
 
ShortColumnBuilder - Class in org.apache.spark.sql.columnar
 
ShortColumnBuilder() - Constructor for class org.apache.spark.sql.columnar.ShortColumnBuilder
 
ShortColumnStats - Class in org.apache.spark.sql.columnar
 
ShortColumnStats() - Constructor for class org.apache.spark.sql.columnar.ShortColumnStats
 
ShortestPaths - Class in org.apache.spark.graphx.lib
Computes shortest paths to the given set of landmark vertices, returning a graph where each vertex attribute is a map containing the shortest-path distance to each reachable landmark.
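For example, a minimal sketch (assuming an existing Graph called graph, with vertices 1 and 4 as the landmarks):

    import org.apache.spark.graphx.lib.ShortestPaths

    val result = ShortestPaths.run(graph, Seq(1L, 4L))
    // Each vertex attribute is now a map from reachable landmark ID to hop count.
    result.vertices.collect()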
ShortestPaths() - Constructor for class org.apache.spark.graphx.lib.ShortestPaths
 
shortForm() - Method in class org.apache.spark.util.CallSite
 
shortParquetCompressionCodecNames() - Static method in class org.apache.spark.sql.parquet.ParquetRelation
 
ShortType - Static variable in class org.apache.spark.sql.api.java.DataType
Gets the ShortType object.
ShortType - Class in org.apache.spark.sql.api.java
The data type representing short and Short values.
shouldCheckpoint() - Method in class org.apache.spark.streaming.kinesis.KinesisCheckpointState
Check if it's time to checkpoint based on the current time and the derived time for the next checkpoint.
shouldRollover(long) - Method in interface org.apache.spark.util.logging.RollingPolicy
Whether rollover should be initiated at this moment.
shouldRollover(long) - Method in class org.apache.spark.util.logging.SizeBasedRollingPolicy
Should roll over if the next set of bytes is going to exceed the size limit.
shouldRollover(long) - Method in class org.apache.spark.util.logging.TimeBasedRollingPolicy
Should roll over if the current time has exceeded the next rollover time.
shouldSend() - Method in class org.apache.spark.mllib.recommendation.OutLinkBlock
 
showBytesDistribution(String, Function2<TaskInfo, TaskMetrics, Option<Object>>, Seq<Tuple2<TaskInfo, TaskMetrics>>) - Static method in class org.apache.spark.scheduler.StatsReportListener
 
showBytesDistribution(String, Option<Distribution>) - Static method in class org.apache.spark.scheduler.StatsReportListener
 
showBytesDistribution(String, Distribution) - Static method in class org.apache.spark.scheduler.StatsReportListener
 
showDistribution(String, Distribution, Function1<Object, String>) - Static method in class org.apache.spark.scheduler.StatsReportListener
 
showDistribution(String, Option<Distribution>, Function1<Object, String>) - Static method in class org.apache.spark.scheduler.StatsReportListener
 
showDistribution(String, Option<Distribution>, String) - Static method in class org.apache.spark.scheduler.StatsReportListener
 
showDistribution(String, String, Function2<TaskInfo, TaskMetrics, Option<Object>>, Seq<Tuple2<TaskInfo, TaskMetrics>>) - Static method in class org.apache.spark.scheduler.StatsReportListener
 
showMillisDistribution(String, Option<Distribution>) - Static method in class org.apache.spark.scheduler.StatsReportListener
 
showMillisDistribution(String, Function2<TaskInfo, TaskMetrics, Option<Object>>, Seq<Tuple2<TaskInfo, TaskMetrics>>) - Static method in class org.apache.spark.scheduler.StatsReportListener
 
showMillisDistribution(String, Function1<BatchInfo, Option<Object>>) - Method in class org.apache.spark.streaming.scheduler.StatsReportListener
 
showQuantiles(PrintStream) - Method in class org.apache.spark.util.Distribution
 
SHUFFLE() - Static method in class org.apache.spark.storage.BlockId
 
SHUFFLE_BLOCK_MANAGER() - Static method in class org.apache.spark.util.MetadataCleanerType
 
SHUFFLE_DATA() - Static method in class org.apache.spark.storage.BlockId
 
SHUFFLE_INDEX() - Static method in class org.apache.spark.storage.BlockId
 
SHUFFLE_READ() - Static method in class org.apache.spark.ui.ToolTips
 
SHUFFLE_WRITE() - Static method in class org.apache.spark.ui.ToolTips
 
ShuffleBlockFetcherIterator - Class in org.apache.spark.storage
An iterator that fetches multiple blocks.
ShuffleBlockFetcherIterator(TaskContext, ShuffleClient, BlockManager, Seq<Tuple2<BlockManagerId, Seq<Tuple2<BlockId, Object>>>>, Serializer, long) - Constructor for class org.apache.spark.storage.ShuffleBlockFetcherIterator
 
ShuffleBlockFetcherIterator.FailureFetchResult - Class in org.apache.spark.storage
Result of an unsuccessful fetch from a remote block.
ShuffleBlockFetcherIterator.FailureFetchResult(BlockId, Throwable) - Constructor for class org.apache.spark.storage.ShuffleBlockFetcherIterator.FailureFetchResult
 
ShuffleBlockFetcherIterator.FailureFetchResult$ - Class in org.apache.spark.storage
 
ShuffleBlockFetcherIterator.FailureFetchResult$() - Constructor for class org.apache.spark.storage.ShuffleBlockFetcherIterator.FailureFetchResult$
 
ShuffleBlockFetcherIterator.FetchRequest - Class in org.apache.spark.storage
A request to fetch blocks from a remote BlockManager.
ShuffleBlockFetcherIterator.FetchRequest(BlockManagerId, Seq<Tuple2<BlockId, Object>>) - Constructor for class org.apache.spark.storage.ShuffleBlockFetcherIterator.FetchRequest
 
ShuffleBlockFetcherIterator.FetchRequest$ - Class in org.apache.spark.storage
 
ShuffleBlockFetcherIterator.FetchRequest$() - Constructor for class org.apache.spark.storage.ShuffleBlockFetcherIterator.FetchRequest$
 
ShuffleBlockFetcherIterator.FetchResult - Interface in org.apache.spark.storage
Result of a fetch from a remote block.
ShuffleBlockFetcherIterator.SuccessFetchResult - Class in org.apache.spark.storage
Result of a successful fetch from a remote block.
ShuffleBlockFetcherIterator.SuccessFetchResult(BlockId, long, ManagedBuffer) - Constructor for class org.apache.spark.storage.ShuffleBlockFetcherIterator.SuccessFetchResult
 
ShuffleBlockFetcherIterator.SuccessFetchResult$ - Class in org.apache.spark.storage
 
ShuffleBlockFetcherIterator.SuccessFetchResult$() - Constructor for class org.apache.spark.storage.ShuffleBlockFetcherIterator.SuccessFetchResult$
 
ShuffleBlockId - Class in org.apache.spark.storage
 
ShuffleBlockId(int, int, int) - Constructor for class org.apache.spark.storage.ShuffleBlockId
 
shuffleCleaned(int) - Method in interface org.apache.spark.CleanerListener
 
shuffleClient() - Method in class org.apache.spark.storage.BlockManager
 
ShuffleCoGroupSplitDep - Class in org.apache.spark.rdd
 
ShuffleCoGroupSplitDep(ShuffleHandle) - Constructor for class org.apache.spark.rdd.ShuffleCoGroupSplitDep
 
ShuffleDataBlockId - Class in org.apache.spark.storage
 
ShuffleDataBlockId(int, int, int) - Constructor for class org.apache.spark.storage.ShuffleDataBlockId
 
ShuffledDStream<K,V,C> - Class in org.apache.spark.streaming.dstream
 
ShuffledDStream(DStream<Tuple2<K, V>>, Function1<V, C>, Function2<C, V, C>, Function2<C, C, C>, Partitioner, boolean, ClassTag<K>, ClassTag<V>, ClassTag<C>) - Constructor for class org.apache.spark.streaming.dstream.ShuffledDStream
 
shuffleDep() - Method in class org.apache.spark.scheduler.Stage
 
ShuffleDependency<K,V,C> - Class in org.apache.spark
:: DeveloperApi :: Represents a dependency on the output of a shuffle stage.
ShuffleDependency(RDD<? extends Product2<K, V>>, Partitioner, Option<Serializer>, Option<Ordering<K>>, Option<Aggregator<K, V, C>>, boolean) - Constructor for class org.apache.spark.ShuffleDependency
 
ShuffledHashJoin - Class in org.apache.spark.sql.execution.joins
:: DeveloperApi :: Performs an inner hash join of two child relations by first shuffling the data using the join keys.
ShuffledHashJoin(Seq<Expression>, Seq<Expression>, org.apache.spark.sql.execution.joins.BuildSide, SparkPlan, SparkPlan) - Constructor for class org.apache.spark.sql.execution.joins.ShuffledHashJoin
 
ShuffledRDD<K,V,C> - Class in org.apache.spark.rdd
:: DeveloperApi :: The resulting RDD from a shuffle (e.g. repartitioning of data).
ShuffledRDD(RDD<? extends Product2<K, V>>, Partitioner) - Constructor for class org.apache.spark.rdd.ShuffledRDD
 
ShuffledRDDPartition - Class in org.apache.spark.rdd
 
ShuffledRDDPartition(int) - Constructor for class org.apache.spark.rdd.ShuffledRDDPartition
 
shuffleHandle() - Method in class org.apache.spark.ShuffleDependency
 
shuffleId() - Method in class org.apache.spark.CleanShuffle
 
shuffleId() - Method in class org.apache.spark.FetchFailed
 
shuffleId() - Method in class org.apache.spark.GetMapOutputStatuses
 
shuffleId() - Method in class org.apache.spark.ShuffleDependency
 
shuffleId() - Method in class org.apache.spark.storage.BlockManagerMessages.RemoveShuffle
 
shuffleId() - Method in class org.apache.spark.storage.ShuffleBlockId
 
shuffleId() - Method in class org.apache.spark.storage.ShuffleDataBlockId
 
shuffleId() - Method in class org.apache.spark.storage.ShuffleIndexBlockId
 
ShuffleIndexBlockId - Class in org.apache.spark.storage
 
ShuffleIndexBlockId(int, int, int) - Constructor for class org.apache.spark.storage.ShuffleIndexBlockId
 
shuffleManager() - Method in class org.apache.spark.SparkEnv
 
ShuffleMapTask - Class in org.apache.spark.scheduler
A ShuffleMapTask divides the elements of an RDD into multiple buckets (based on a partitioner specified in the ShuffleDependency).
ShuffleMapTask(int, Broadcast<byte[]>, Partition, Seq<TaskLocation>) - Constructor for class org.apache.spark.scheduler.ShuffleMapTask
 
ShuffleMapTask(int) - Constructor for class org.apache.spark.scheduler.ShuffleMapTask
A constructor used only in test suites.
shuffleMemoryManager() - Method in class org.apache.spark.SparkEnv
 
shuffleRead() - Method in class org.apache.spark.ui.jobs.UIData.ExecutorSummary
 
shuffleReadBytes() - Method in class org.apache.spark.ui.jobs.UIData.StageUIData
 
shuffleReadMetricsFromJson(JsonAST.JValue) - Static method in class org.apache.spark.util.JsonProtocol
 
shuffleReadMetricsToJson(ShuffleReadMetrics) - Static method in class org.apache.spark.util.JsonProtocol
 
shuffleServerId() - Method in class org.apache.spark.storage.BlockManager
 
shuffleToMapStage() - Method in class org.apache.spark.scheduler.DAGScheduler
 
shuffleWrite() - Method in class org.apache.spark.ui.jobs.UIData.ExecutorSummary
 
shuffleWriteBytes() - Method in class org.apache.spark.ui.jobs.UIData.StageUIData
 
shuffleWriteMetricsFromJson(JsonAST.JValue) - Static method in class org.apache.spark.util.JsonProtocol
 
shuffleWriteMetricsToJson(ShuffleWriteMetrics) - Static method in class org.apache.spark.util.JsonProtocol
 
shutdown(IRecordProcessorCheckpointer, ShutdownReason) - Method in class org.apache.spark.streaming.kinesis.KinesisRecordProcessor
The Kinesis Client Library is shutting down this Worker for one of two reasons: 1) the stream is resharding by splitting or merging adjacent shards (ShutdownReason.TERMINATE), or 2) the failed or latent Worker has stopped sending heartbeats for whatever reason (ShutdownReason.ZOMBIE).
shutdownCallback() - Method in class org.apache.spark.scheduler.cluster.SparkDeploySchedulerBackend
 
sideEffectResult() - Method in interface org.apache.spark.sql.execution.Command
A concrete command should override this lazy field to wrap up any side effects caused by the command or any other computation that should be evaluated exactly once.
SignalLogger - Class in org.apache.spark.util
Used to log signals received.
SignalLogger() - Constructor for class org.apache.spark.util.SignalLogger
 
SignalLoggerHandler - Class in org.apache.spark.util
 
SignalLoggerHandler(String, Logger) - Constructor for class org.apache.spark.util.SignalLoggerHandler
 
SimpleFutureAction<T> - Class in org.apache.spark
A FutureAction holding the result of an action that triggers a single job.
SimpleFutureAction(JobWaiter<?>, Function0<T>) - Constructor for class org.apache.spark.SimpleFutureAction
 
simpleString() - Method in class org.apache.spark.sql.sources.LogicalRelation
 
SimpleUpdater - Class in org.apache.spark.mllib.optimization
:: DeveloperApi :: A simple updater for gradient descent *without* any regularization.
SimpleUpdater() - Constructor for class org.apache.spark.mllib.optimization.SimpleUpdater
 
SimrSchedulerBackend - Class in org.apache.spark.scheduler.cluster
 
SimrSchedulerBackend(TaskSchedulerImpl, SparkContext, String) - Constructor for class org.apache.spark.scheduler.cluster.SimrSchedulerBackend
 
SingleItemData<T> - Class in org.apache.spark.streaming.receiver
 
SingleItemData(T) - Constructor for class org.apache.spark.streaming.receiver.SingleItemData
 
SingularValueDecomposition<UType,VType> - Class in org.apache.spark.mllib.linalg
:: Experimental :: Represents singular value decomposition (SVD) factors.
SingularValueDecomposition(UType, Vector, VType) - Constructor for class org.apache.spark.mllib.linalg.SingularValueDecomposition
 
Sink - Interface in org.apache.spark.metrics.sink
 
SINK_REGEX() - Static method in class org.apache.spark.metrics.MetricsSystem
 
size() - Method in class org.apache.spark.api.java.JavaUtils.SerializableMapWrapper
 
size() - Method in class org.apache.spark.graphx.impl.EdgePartition
The number of edges in this partition.
size() - Method in class org.apache.spark.graphx.impl.VertexPartitionBase
 
size() - Method in class org.apache.spark.mllib.linalg.DenseVector
 
size() - Method in class org.apache.spark.mllib.linalg.SparseVector
 
size() - Method in interface org.apache.spark.mllib.linalg.Vector
Size of the vector.
size() - Method in class org.apache.spark.mllib.rdd.RandomRDDPartition
 
size() - Method in class org.apache.spark.rdd.PartitionGroup
 
size() - Method in class org.apache.spark.scheduler.IndirectTaskResult
 
size() - Method in class org.apache.spark.sql.parquet.CatalystArrayContainsNullConverter
 
size() - Method in class org.apache.spark.sql.parquet.CatalystArrayConverter
 
size() - Method in class org.apache.spark.sql.parquet.CatalystGroupConverter
 
size() - Method in class org.apache.spark.sql.parquet.CatalystNativeArrayConverter
 
size() - Method in class org.apache.spark.sql.parquet.CatalystPrimitiveRowConverter
 
size() - Method in class org.apache.spark.storage.BlockInfo
 
size() - Method in class org.apache.spark.storage.MemoryEntry
 
size() - Method in class org.apache.spark.storage.PutResult
 
size() - Method in class org.apache.spark.storage.ShuffleBlockFetcherIterator.FetchRequest
 
size() - Method in class org.apache.spark.storage.ShuffleBlockFetcherIterator.SuccessFetchResult
 
size() - Method in class org.apache.spark.util.BoundedPriorityQueue
 
size() - Method in class org.apache.spark.util.TimeStampedHashMap
 
size() - Method in class org.apache.spark.util.TimeStampedHashSet
 
size() - Method in class org.apache.spark.util.TimeStampedWeakValueHashMap
 
SIZE_DEFAULT() - Static method in class org.apache.spark.util.logging.RollingFileAppender
 
SIZE_PROPERTY() - Static method in class org.apache.spark.util.logging.RollingFileAppender
 
SizeBasedRollingPolicy - Class in org.apache.spark.util.logging
Defines a RollingPolicy by which files will be rolled over after reaching a particular size.
SizeBasedRollingPolicy(long, boolean) - Constructor for class org.apache.spark.util.logging.SizeBasedRollingPolicy
 
SizeEstimator - Class in org.apache.spark.util
Estimates the sizes of Java objects (number of bytes of memory they occupy), for use in memory-aware caches.
SizeEstimator() - Constructor for class org.apache.spark.util.SizeEstimator
 
sizeInBytes() - Method in class org.apache.spark.sql.columnar.ColumnStatisticsSchema
 
sizeInBytes() - Method in interface org.apache.spark.sql.columnar.ColumnStats
 
sizeInBytes() - Method in class org.apache.spark.sql.parquet.ParquetRelation2
 
sizeInBytes() - Method in class org.apache.spark.sql.sources.BaseRelation
Returns an estimated size of this relation in bytes.
sketch(RDD<K>, int, ClassTag<K>) - Static method in class org.apache.spark.RangePartitioner
Sketches the input RDD via reservoir sampling on each partition.
skip(long) - Method in class org.apache.spark.util.ByteBufferInputStream
 
skippedStages() - Method in class org.apache.spark.ui.jobs.JobProgressListener
 
slack() - Method in class org.apache.spark.rdd.PartitionCoalescer
 
slaveActor() - Method in class org.apache.spark.storage.BlockManagerInfo
 
slaveIdsWithExecutors() - Method in class org.apache.spark.scheduler.cluster.mesos.CoarseMesosSchedulerBackend
 
slaveIdsWithExecutors() - Method in class org.apache.spark.scheduler.cluster.mesos.MesosSchedulerBackend
 
slaveLost(SchedulerDriver, Protos.SlaveID) - Method in class org.apache.spark.scheduler.cluster.mesos.CoarseMesosSchedulerBackend
 
slaveLost(SchedulerDriver, Protos.SlaveID) - Method in class org.apache.spark.scheduler.cluster.mesos.MesosSchedulerBackend
 
SlaveLost - Class in org.apache.spark.scheduler
 
SlaveLost(String) - Constructor for class org.apache.spark.scheduler.SlaveLost
 
slaveTimeout() - Method in class org.apache.spark.storage.BlockManagerMasterActor
 
slice() - Method in class org.apache.spark.rdd.ParallelCollectionPartition
 
slice(Seq<T>, int, ClassTag<T>) - Static method in class org.apache.spark.rdd.ParallelCollectionRDD
Slice a collection into numSlices sub-collections.
slice(Time, Time) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
Return all the RDDs between 'fromDuration' and 'toDuration' (both included).
slice(Interval) - Method in class org.apache.spark.streaming.dstream.DStream
Return all the RDDs defined by the Interval object (both end times included)
slice(Time, Time) - Method in class org.apache.spark.streaming.dstream.DStream
Return all the RDDs between 'fromTime' and 'toTime' (both included).
slideDuration() - Method in class org.apache.spark.streaming.dstream.DStream
Time interval after which the DStream generates an RDD.
slideDuration() - Method in class org.apache.spark.streaming.dstream.FilteredDStream
 
slideDuration() - Method in class org.apache.spark.streaming.dstream.FlatMappedDStream
 
slideDuration() - Method in class org.apache.spark.streaming.dstream.FlatMapValuedDStream
 
slideDuration() - Method in class org.apache.spark.streaming.dstream.ForEachDStream
 
slideDuration() - Method in class org.apache.spark.streaming.dstream.GlommedDStream
 
slideDuration() - Method in class org.apache.spark.streaming.dstream.InputDStream
 
slideDuration() - Method in class org.apache.spark.streaming.dstream.MapPartitionedDStream
 
slideDuration() - Method in class org.apache.spark.streaming.dstream.MappedDStream
 
slideDuration() - Method in class org.apache.spark.streaming.dstream.MapValuedDStream
 
slideDuration() - Method in class org.apache.spark.streaming.dstream.ReducedWindowedDStream
 
slideDuration() - Method in class org.apache.spark.streaming.dstream.ShuffledDStream
 
slideDuration() - Method in class org.apache.spark.streaming.dstream.StateDStream
 
slideDuration() - Method in class org.apache.spark.streaming.dstream.TransformedDStream
 
slideDuration() - Method in class org.apache.spark.streaming.dstream.UnionDStream
 
slideDuration() - Method in class org.apache.spark.streaming.dstream.WindowedDStream
 
sliding(int) - Method in class org.apache.spark.mllib.rdd.RDDFunctions
Returns an RDD from grouping items of its parent RDD in fixed-size blocks by passing a sliding window over them.
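For example, a minimal sketch (assuming an existing SparkContext sc; the implicit conversion comes from the RDDFunctions import):

    import org.apache.spark.mllib.rdd.RDDFunctions._

    sc.parallelize(1 to 5, 2).sliding(3).collect()
    // Array(Array(1, 2, 3), Array(2, 3, 4), Array(3, 4, 5))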
SlidingRDD<T> - Class in org.apache.spark.mllib.rdd
Represents an RDD from grouping items of its parent RDD in fixed-size blocks by passing a sliding window over them.
SlidingRDD(RDD<T>, int, ClassTag<T>) - Constructor for class org.apache.spark.mllib.rdd.SlidingRDD
 
SlidingRDDPartition<T> - Class in org.apache.spark.mllib.rdd
 
SlidingRDDPartition(int, Partition, Seq<T>) - Constructor for class org.apache.spark.mllib.rdd.SlidingRDDPartition
 
SnappyCompressionCodec - Class in org.apache.spark.io
:: DeveloperApi :: Snappy implementation of CompressionCodec.
SnappyCompressionCodec(SparkConf) - Constructor for class org.apache.spark.io.SnappyCompressionCodec
 
SocketInputDStream<T> - Class in org.apache.spark.streaming.dstream
 
SocketInputDStream(StreamingContext, String, int, Function1<InputStream, Iterator<T>>, StorageLevel, ClassTag<T>) - Constructor for class org.apache.spark.streaming.dstream.SocketInputDStream
 
SocketReceiver<T> - Class in org.apache.spark.streaming.dstream
 
SocketReceiver(String, int, Function1<InputStream, Iterator<T>>, StorageLevel, ClassTag<T>) - Constructor for class org.apache.spark.streaming.dstream.SocketReceiver
 
socketStream(String, int, Function<InputStream, Iterable<T>>, StorageLevel) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
Create an input stream from a network source hostname:port.
socketStream(String, int, Function1<InputStream, Iterator<T>>, StorageLevel, ClassTag<T>) - Method in class org.apache.spark.streaming.StreamingContext
Create an input stream from a TCP source hostname:port.
socketTextStream(String, int, StorageLevel) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
Create an input stream from a network source hostname:port.
socketTextStream(String, int) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
Create an input stream from a network source hostname:port.
socketTextStream(String, int, StorageLevel) - Method in class org.apache.spark.streaming.StreamingContext
Create an input stream from a TCP source hostname:port.
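A minimal streaming word-count sketch built on socketTextStream (assumes a SparkConf called conf and a text server listening on localhost:9999):

    import org.apache.spark.streaming.{Seconds, StreamingContext}

    val ssc = new StreamingContext(conf, Seconds(1))
    val lines = ssc.socketTextStream("localhost", 9999)
    lines.flatMap(_.split(" ")).map(word => (word, 1)).reduceByKey(_ + _).print()
    ssc.start()
    ssc.awaitTermination()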
solve(DoubleMatrix, DoubleMatrix, NNLS.Workspace) - Static method in class org.apache.spark.mllib.optimization.NNLS
Solve a least squares problem, possibly with nonnegativity constraints, by a modified projected gradient method.
solveLeastSquares(DoubleMatrix, DoubleMatrix, NNLS.Workspace) - Method in class org.apache.spark.mllib.recommendation.ALS
Given A^T A and A^T b, find the x minimizing ||Ax - b||_2, possibly subject to nonnegativity constraints if nonnegative is true.
Sort() - Static method in class org.apache.spark.mllib.tree.configuration.QuantileStrategy
 
Sort - Class in org.apache.spark.sql.execution
:: DeveloperApi :: Performs a sort on-heap.
Sort(Seq<SortOrder>, boolean, SparkPlan) - Constructor for class org.apache.spark.sql.execution.Sort
 
sortBy(Function<T, S>, boolean, int) - Method in class org.apache.spark.api.java.JavaRDD
Return this RDD sorted by the given key function.
sortBy(Function1<T, K>, boolean, int, Ordering<K>, ClassTag<K>) - Method in class org.apache.spark.rdd.RDD
Return this RDD sorted by the given key function.
sortByKey() - Method in class org.apache.spark.api.java.JavaPairRDD
Sort the RDD by key, so that each partition contains a sorted range of the elements in ascending order.
sortByKey(boolean) - Method in class org.apache.spark.api.java.JavaPairRDD
Sort the RDD by key, so that each partition contains a sorted range of the elements.
sortByKey(boolean, int) - Method in class org.apache.spark.api.java.JavaPairRDD
Sort the RDD by key, so that each partition contains a sorted range of the elements.
sortByKey(Comparator<K>) - Method in class org.apache.spark.api.java.JavaPairRDD
Sort the RDD by key, so that each partition contains a sorted range of the elements.
sortByKey(Comparator<K>, boolean) - Method in class org.apache.spark.api.java.JavaPairRDD
Sort the RDD by key, so that each partition contains a sorted range of the elements.
sortByKey(Comparator<K>, boolean, int) - Method in class org.apache.spark.api.java.JavaPairRDD
Sort the RDD by key, so that each partition contains a sorted range of the elements.
sortByKey(boolean, int) - Method in class org.apache.spark.rdd.OrderedRDDFunctions
Sort the RDD by key, so that each partition contains a sorted range of the elements.
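For example, a minimal sketch of sortBy and sortByKey (assuming an existing SparkContext sc):

    val pairs = sc.parallelize(Seq(("b", 2), ("a", 1), ("c", 3)))
    pairs.sortByKey().collect()         // (a,1), (b,2), (c,3)
    pairs.sortByKey(false, 2).collect() // descending, in 2 partitions

    sc.parallelize(Seq(3, 1, 2)).sortBy(x => x, ascending = false).collect() // 3, 2, 1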
sortOrder() - Method in class org.apache.spark.sql.execution.ExternalSort
 
sortOrder() - Method in class org.apache.spark.sql.execution.Sort
 
sortOrder() - Method in class org.apache.spark.sql.execution.TakeOrdered
 
Source - Interface in org.apache.spark.metrics.source
 
SOURCE_REGEX() - Static method in class org.apache.spark.metrics.MetricsSystem
 
sourceName() - Method in class org.apache.spark.metrics.source.JvmSource
 
sourceName() - Method in interface org.apache.spark.metrics.source.Source
 
sourceName() - Method in class org.apache.spark.scheduler.DAGSchedulerSource
 
sourceName() - Method in class org.apache.spark.storage.BlockManagerSource
 
sourceName() - Method in class org.apache.spark.streaming.StreamingSource
 
SPARK_CONTEXT() - Static method in class org.apache.spark.util.MetadataCleanerType
 
SPARK_JOB_DESCRIPTION() - Static method in class org.apache.spark.SparkContext
 
SPARK_JOB_GROUP_ID() - Static method in class org.apache.spark.SparkContext
 
SPARK_JOB_INTERRUPT_ON_CANCEL() - Static method in class org.apache.spark.SparkContext
 
SPARK_METADATA_KEY() - Static method in class org.apache.spark.sql.parquet.RowReadSupport
 
SPARK_ROW_REQUESTED_SCHEMA() - Static method in class org.apache.spark.sql.parquet.RowReadSupport
 
SPARK_ROW_SCHEMA() - Static method in class org.apache.spark.sql.parquet.RowWriteSupport
 
SPARK_UNKNOWN_USER() - Static method in class org.apache.spark.SparkContext
 
SPARK_VERSION_PREFIX() - Static method in class org.apache.spark.scheduler.EventLoggingListener
 
SparkConf - Class in org.apache.spark
Configuration for a Spark application.
SparkConf(boolean) - Constructor for class org.apache.spark.SparkConf
 
SparkConf() - Constructor for class org.apache.spark.SparkConf
Create a SparkConf that loads defaults from system properties and the classpath.
sparkConf() - Method in class org.apache.spark.streaming.Checkpoint
 
sparkConfPairs() - Method in class org.apache.spark.streaming.Checkpoint
 
sparkContext() - Method in class org.apache.spark.rdd.RDD
The SparkContext that created this RDD.
SparkContext - Class in org.apache.spark
Main entry point for Spark functionality.
SparkContext(SparkConf) - Constructor for class org.apache.spark.SparkContext
 
SparkContext() - Constructor for class org.apache.spark.SparkContext
Create a SparkContext that loads settings from system properties (for instance, when launching with ./bin/spark-submit).
SparkContext(SparkConf, Map<String, Set<SplitInfo>>) - Constructor for class org.apache.spark.SparkContext
:: DeveloperApi :: Alternative constructor for setting preferred locations where Spark will create executors.
SparkContext(String, String, SparkConf) - Constructor for class org.apache.spark.SparkContext
Alternative constructor that allows setting common Spark properties directly
SparkContext(String, String, String, Seq<String>, Map<String, String>, Map<String, Set<SplitInfo>>) - Constructor for class org.apache.spark.SparkContext
Alternative constructor that allows setting common Spark properties directly
SparkContext(String, String) - Constructor for class org.apache.spark.SparkContext
Alternative constructor that allows setting common Spark properties directly
SparkContext(String, String, String) - Constructor for class org.apache.spark.SparkContext
Alternative constructor that allows setting common Spark properties directly
SparkContext(String, String, String, Seq<String>) - Constructor for class org.apache.spark.SparkContext
Alternative constructor that allows setting common Spark properties directly
sparkContext() - Method in class org.apache.spark.sql.parquet.ParquetRelation2
 
sparkContext() - Method in class org.apache.spark.sql.SQLContext
 
sparkContext() - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
The underlying SparkContext.
sparkContext() - Method in class org.apache.spark.streaming.StreamingContext
Return the associated Spark context
SparkContext.DoubleAccumulatorParam$ - Class in org.apache.spark
 
SparkContext.DoubleAccumulatorParam$() - Constructor for class org.apache.spark.SparkContext.DoubleAccumulatorParam$
 
SparkContext.FloatAccumulatorParam$ - Class in org.apache.spark
 
SparkContext.FloatAccumulatorParam$() - Constructor for class org.apache.spark.SparkContext.FloatAccumulatorParam$
 
SparkContext.IntAccumulatorParam$ - Class in org.apache.spark
 
SparkContext.IntAccumulatorParam$() - Constructor for class org.apache.spark.SparkContext.IntAccumulatorParam$
 
SparkContext.LongAccumulatorParam$ - Class in org.apache.spark
 
SparkContext.LongAccumulatorParam$() - Constructor for class org.apache.spark.SparkContext.LongAccumulatorParam$
 
SparkDeploySchedulerBackend - Class in org.apache.spark.scheduler.cluster
 
SparkDeploySchedulerBackend(TaskSchedulerImpl, SparkContext, String[]) - Constructor for class org.apache.spark.scheduler.cluster.SparkDeploySchedulerBackend
 
SparkDriverExecutionException - Exception in org.apache.spark
Exception thrown when execution of some user code in the driver process fails, e.g.
SparkDriverExecutionException(Throwable) - Constructor for exception org.apache.spark.SparkDriverExecutionException
 
SparkEnv - Class in org.apache.spark
:: DeveloperApi :: Holds all the runtime environment objects for a running Spark instance (either master or worker), including the serializer, Akka actor system, block manager, map output tracker, etc.
SparkEnv(String, ActorSystem, Serializer, Serializer, CacheManager, MapOutputTracker, ShuffleManager, BroadcastManager, BlockTransferService, BlockManager, SecurityManager, HttpFileServer, String, MetricsSystem, ShuffleMemoryManager, SparkConf) - Constructor for class org.apache.spark.SparkEnv
 
sparkEventFromJson(JsonAST.JValue) - Static method in class org.apache.spark.util.JsonProtocol
JSON deserialization methods for SparkListenerEvents.
sparkEventToJson(SparkListenerEvent) - Static method in class org.apache.spark.util.JsonProtocol
JSON serialization methods for SparkListenerEvents.
SparkException - Exception in org.apache.spark
 
SparkException(String, Throwable) - Constructor for exception org.apache.spark.SparkException
 
SparkException(String) - Constructor for exception org.apache.spark.SparkException
 
SparkExitCode - Class in org.apache.spark.util
 
SparkExitCode() - Constructor for class org.apache.spark.util.SparkExitCode
 
SparkFiles - Class in org.apache.spark
Resolves paths to files added through SparkContext.addFile().
SparkFiles() - Constructor for class org.apache.spark.SparkFiles
 
sparkFilesDir() - Method in class org.apache.spark.SparkEnv
 
SparkFlumeEvent - Class in org.apache.spark.streaming.flume
A wrapper class for AvroFlumeEvents with a custom serialization format.
SparkFlumeEvent() - Constructor for class org.apache.spark.streaming.flume.SparkFlumeEvent
 
SparkHadoopMapReduceUtil - Interface in org.apache.spark.mapreduce
 
SparkHadoopMapRedUtil - Interface in org.apache.spark.mapred
 
SparkHadoopWriter - Class in org.apache.spark
Internal helper class that saves an RDD using a Hadoop OutputFormat.
SparkHadoopWriter(JobConf) - Constructor for class org.apache.spark.SparkHadoopWriter
 
SparkHiveDynamicPartitionWriterContainer - Class in org.apache.spark.sql.hive
 
SparkHiveDynamicPartitionWriterContainer(JobConf, ShimFileSinkDesc, String[]) - Constructor for class org.apache.spark.sql.hive.SparkHiveDynamicPartitionWriterContainer
 
SparkHiveWriterContainer - Class in org.apache.spark.sql.hive
Internal helper class that saves an RDD using a Hive OutputFormat.
SparkHiveWriterContainer(JobConf, ShimFileSinkDesc) - Constructor for class org.apache.spark.sql.hive.SparkHiveWriterContainer
 
sparkJavaOpts(SparkConf, Function1<String, Object>) - Static method in class org.apache.spark.util.Utils
Convert all Spark properties set in the given SparkConf to a sequence of Java options.
SparkJobInfo - Interface in org.apache.spark
Exposes information about Spark Jobs.
SparkJobInfoImpl - Class in org.apache.spark
 
SparkJobInfoImpl(int, int[], JobExecutionStatus) - Constructor for class org.apache.spark.SparkJobInfoImpl
 
SparkListener - Interface in org.apache.spark.scheduler
:: DeveloperApi :: Interface for listening to events from the Spark scheduler.
SparkListenerApplicationEnd - Class in org.apache.spark.scheduler
 
SparkListenerApplicationEnd(long) - Constructor for class org.apache.spark.scheduler.SparkListenerApplicationEnd
 
SparkListenerApplicationStart - Class in org.apache.spark.scheduler
 
SparkListenerApplicationStart(String, Option<String>, long, String) - Constructor for class org.apache.spark.scheduler.SparkListenerApplicationStart
 
SparkListenerBlockManagerAdded - Class in org.apache.spark.scheduler
 
SparkListenerBlockManagerAdded(long, BlockManagerId, long) - Constructor for class org.apache.spark.scheduler.SparkListenerBlockManagerAdded
 
SparkListenerBlockManagerRemoved - Class in org.apache.spark.scheduler
 
SparkListenerBlockManagerRemoved(long, BlockManagerId) - Constructor for class org.apache.spark.scheduler.SparkListenerBlockManagerRemoved
 
SparkListenerBus - Interface in org.apache.spark.scheduler
A SparkListenerEvent bus that relays events to its listeners.
SparkListenerEnvironmentUpdate - Class in org.apache.spark.scheduler
 
SparkListenerEnvironmentUpdate(Map<String, Seq<Tuple2<String, String>>>) - Constructor for class org.apache.spark.scheduler.SparkListenerEnvironmentUpdate
 
SparkListenerEvent - Interface in org.apache.spark.scheduler
 
SparkListenerExecutorMetricsUpdate - Class in org.apache.spark.scheduler
Periodic updates from executors.
SparkListenerExecutorMetricsUpdate(String, Seq<Tuple4<Object, Object, Object, TaskMetrics>>) - Constructor for class org.apache.spark.scheduler.SparkListenerExecutorMetricsUpdate
 
SparkListenerJobEnd - Class in org.apache.spark.scheduler
 
SparkListenerJobEnd(int, JobResult) - Constructor for class org.apache.spark.scheduler.SparkListenerJobEnd
 
SparkListenerJobStart - Class in org.apache.spark.scheduler
 
SparkListenerJobStart(int, Seq<StageInfo>, Properties) - Constructor for class org.apache.spark.scheduler.SparkListenerJobStart
 
sparkListeners() - Method in interface org.apache.spark.scheduler.SparkListenerBus
 
SparkListenerShutdown - Class in org.apache.spark.scheduler
An event used in the listener to shut down the listener daemon thread.
SparkListenerShutdown() - Constructor for class org.apache.spark.scheduler.SparkListenerShutdown
 
SparkListenerStageCompleted - Class in org.apache.spark.scheduler
 
SparkListenerStageCompleted(StageInfo) - Constructor for class org.apache.spark.scheduler.SparkListenerStageCompleted
 
SparkListenerStageSubmitted - Class in org.apache.spark.scheduler
 
SparkListenerStageSubmitted(StageInfo, Properties) - Constructor for class org.apache.spark.scheduler.SparkListenerStageSubmitted
 
SparkListenerTaskEnd - Class in org.apache.spark.scheduler
 
SparkListenerTaskEnd(int, int, String, TaskEndReason, TaskInfo, TaskMetrics) - Constructor for class org.apache.spark.scheduler.SparkListenerTaskEnd
 
SparkListenerTaskGettingResult - Class in org.apache.spark.scheduler
 
SparkListenerTaskGettingResult(TaskInfo) - Constructor for class org.apache.spark.scheduler.SparkListenerTaskGettingResult
 
SparkListenerTaskStart - Class in org.apache.spark.scheduler
 
SparkListenerTaskStart(int, int, TaskInfo) - Constructor for class org.apache.spark.scheduler.SparkListenerTaskStart
 
SparkListenerUnpersistRDD - Class in org.apache.spark.scheduler
 
SparkListenerUnpersistRDD(int) - Constructor for class org.apache.spark.scheduler.SparkListenerUnpersistRDD
 
SparkLogicalPlan - Class in org.apache.spark.sql.execution
 
SparkLogicalPlan(SparkPlan, SQLContext) - Constructor for class org.apache.spark.sql.execution.SparkLogicalPlan
 
SparkPlan - Class in org.apache.spark.sql.execution
:: DeveloperApi ::
SparkPlan() - Constructor for class org.apache.spark.sql.execution.SparkPlan
 
sparkProperties() - Method in class org.apache.spark.ui.env.EnvironmentListener
 
SparkSqlSerializer - Class in org.apache.spark.sql.execution
 
SparkSqlSerializer(SparkConf) - Constructor for class org.apache.spark.sql.execution.SparkSqlSerializer
 
SparkStageInfo - Interface in org.apache.spark
Exposes information about Spark Stages.
SparkStageInfoImpl - Class in org.apache.spark
 
SparkStageInfoImpl(int, int, long, String, int, int, int, int) - Constructor for class org.apache.spark.SparkStageInfoImpl
 
SparkStatusTracker - Class in org.apache.spark
Low-level status reporting APIs for monitoring job and stage progress.
SparkStatusTracker(SparkContext) - Constructor for class org.apache.spark.SparkStatusTracker
 
SparkStrategies - Class in org.apache.spark.sql.execution
 
SparkStrategies() - Constructor for class org.apache.spark.sql.execution.SparkStrategies
 
SparkStrategies.BasicOperators - Class in org.apache.spark.sql.execution
 
SparkStrategies.BasicOperators() - Constructor for class org.apache.spark.sql.execution.SparkStrategies.BasicOperators
 
SparkStrategies.BroadcastNestedLoopJoin - Class in org.apache.spark.sql.execution
 
SparkStrategies.BroadcastNestedLoopJoin() - Constructor for class org.apache.spark.sql.execution.SparkStrategies.BroadcastNestedLoopJoin
 
SparkStrategies.CartesianProduct - Class in org.apache.spark.sql.execution
 
SparkStrategies.CartesianProduct() - Constructor for class org.apache.spark.sql.execution.SparkStrategies.CartesianProduct
 
SparkStrategies.CommandStrategy - Class in org.apache.spark.sql.execution
 
SparkStrategies.CommandStrategy(SQLContext) - Constructor for class org.apache.spark.sql.execution.SparkStrategies.CommandStrategy
 
SparkStrategies.HashAggregation - Class in org.apache.spark.sql.execution
 
SparkStrategies.HashAggregation() - Constructor for class org.apache.spark.sql.execution.SparkStrategies.HashAggregation
 
SparkStrategies.HashJoin - Class in org.apache.spark.sql.execution
 
SparkStrategies.HashJoin() - Constructor for class org.apache.spark.sql.execution.SparkStrategies.HashJoin
Uses the ExtractEquiJoinKeys pattern to find joins where at least some of the predicates can be evaluated by matching hash keys.
SparkStrategies.InMemoryScans - Class in org.apache.spark.sql.execution
 
SparkStrategies.InMemoryScans() - Constructor for class org.apache.spark.sql.execution.SparkStrategies.InMemoryScans
 
SparkStrategies.LeftSemiJoin - Class in org.apache.spark.sql.execution
 
SparkStrategies.LeftSemiJoin() - Constructor for class org.apache.spark.sql.execution.SparkStrategies.LeftSemiJoin
 
SparkStrategies.ParquetOperations - Class in org.apache.spark.sql.execution
 
SparkStrategies.ParquetOperations() - Constructor for class org.apache.spark.sql.execution.SparkStrategies.ParquetOperations
 
SparkStrategies.TakeOrdered - Class in org.apache.spark.sql.execution
 
SparkStrategies.TakeOrdered() - Constructor for class org.apache.spark.sql.execution.SparkStrategies.TakeOrdered
 
SparkUI - Class in org.apache.spark.ui
Top level user interface for a Spark application.
SparkUITab - Class in org.apache.spark.ui
 
SparkUITab(SparkUI, String) - Constructor for class org.apache.spark.ui.SparkUITab
 
SparkUncaughtExceptionHandler - Class in org.apache.spark.util
The default uncaught exception handler for Executors terminates the whole process, to avoid getting into a bad state indefinitely.
SparkUncaughtExceptionHandler() - Constructor for class org.apache.spark.util.SparkUncaughtExceptionHandler
 
sparkUser() - Method in class org.apache.spark.api.java.JavaSparkContext
 
sparkUser() - Method in class org.apache.spark.scheduler.ApplicationEventListener
 
sparkUser() - Method in class org.apache.spark.scheduler.SparkListenerApplicationStart
 
sparkUser() - Method in class org.apache.spark.SparkContext
 
sparkVersion() - Method in class org.apache.spark.scheduler.EventLoggingInfo
 
sparse(int, int, int[], int[], double[]) - Static method in class org.apache.spark.mllib.linalg.Matrices
Creates a column-major sparse matrix in Compressed Sparse Column (CSC) format.
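For illustration, a minimal sketch of the CSC layout this factory expects (the 3x2 matrix below is an arbitrary example):

    import org.apache.spark.mllib.linalg.Matrices

    // Dense form:     CSC form (entries listed column by column):
    //   9.0  0.0        values     = [9.0, 8.0, 6.0]
    //   0.0  8.0        rowIndices = [0, 1, 2]
    //   0.0  6.0        colPtrs    = [0, 1, 3]  (column j spans indices colPtrs(j) until colPtrs(j + 1))
    val m = Matrices.sparse(3, 2, Array(0, 1, 3), Array(0, 1, 2), Array(9.0, 8.0, 6.0))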
sparse(int, int[], double[]) - Static method in class org.apache.spark.mllib.linalg.Vectors
Creates a sparse vector providing its index array and value array.
sparse(int, Seq<Tuple2<Object, Object>>) - Static method in class org.apache.spark.mllib.linalg.Vectors
Creates a sparse vector using unordered (index, value) pairs.
sparse(int, Iterable<Tuple2<Integer, Double>>) - Static method in class org.apache.spark.mllib.linalg.Vectors
Creates a sparse vector using unordered (index, value) pairs in a Java-friendly way.
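A short sketch of two of these overloads building the same 4-element vector (0.0, 3.0, 0.0, 5.0); the values are arbitrary:

    import org.apache.spark.mllib.linalg.Vectors

    val v1 = Vectors.sparse(4, Array(1, 3), Array(3.0, 5.0))   // index array + value array
    val v2 = Vectors.sparse(4, Seq((3, 5.0), (1, 3.0)))        // unordered (index, value) pairs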
SparseMatrix - Class in org.apache.spark.mllib.linalg
Column-major sparse matrix.
SparseMatrix(int, int, int[], int[], double[]) - Constructor for class org.apache.spark.mllib.linalg.SparseMatrix
 
SparseVector - Class in org.apache.spark.mllib.linalg
A sparse vector represented by an index array and a value array.
SparseVector(int, int[], double[]) - Constructor for class org.apache.spark.mllib.linalg.SparseVector
 
SpearmanCorrelation - Class in org.apache.spark.mllib.stat.correlation
Compute Spearman's correlation for two RDDs of the type RDD[Double] or the correlation matrix for an RDD of the type RDD[Vector].
SpearmanCorrelation() - Constructor for class org.apache.spark.mllib.stat.correlation.SpearmanCorrelation
 
speculatableTasks() - Method in class org.apache.spark.scheduler.TaskSetManager
 
SPECULATION_INTERVAL() - Method in class org.apache.spark.scheduler.TaskSchedulerImpl
 
SPECULATION_MULTIPLIER() - Method in class org.apache.spark.scheduler.TaskSetManager
 
SPECULATION_QUANTILE() - Method in class org.apache.spark.scheduler.TaskSetManager
 
speculative() - Method in class org.apache.spark.scheduler.TaskInfo
 
split() - Method in class org.apache.spark.mllib.tree.impl.NodeIndexUpdater
 
split() - Method in class org.apache.spark.mllib.tree.model.Node
 
Split - Class in org.apache.spark.mllib.tree.model
:: DeveloperApi :: Split applied to a feature
Split(int, double, Enumeration.Value, List<Object>) - Constructor for class org.apache.spark.mllib.tree.model.Split
 
split() - Method in class org.apache.spark.rdd.NarrowCoGroupSplitDep
 
SPLIT_INFO_REFLECTIONS() - Static method in class org.apache.spark.rdd.HadoopRDD
 
splitAndCountPartitions(Iterator<String>) - Static method in class org.apache.spark.streaming.util.RawTextHelper
Splits lines and counts the words.
splitCommandString(String) - Static method in class org.apache.spark.util.Utils
Split a string of potentially quoted arguments from the command line the way that a shell would do it to determine arguments to a command.
splitIdToFile(int) - Static method in class org.apache.spark.rdd.CheckpointRDD
 
splitIndex() - Method in class org.apache.spark.rdd.NarrowCoGroupSplitDep
 
splitIndex() - Method in class org.apache.spark.storage.RDDBlockId
 
SplitInfo - Class in org.apache.spark.scheduler
 
SplitInfo(Class<?>, String, String, long, Object) - Constructor for class org.apache.spark.scheduler.SplitInfo
 
splitLocationInfo() - Method in class org.apache.spark.rdd.HadoopRDD.SplitInfoReflections
 
splits() - Method in interface org.apache.spark.api.java.JavaRDDLike
 
sql(String) - Method in class org.apache.spark.sql.api.java.JavaSQLContext
Executes a SQL query using Spark, returning the result as a SchemaRDD.
sql(String) - Method in class org.apache.spark.sql.hive.api.java.JavaHiveContext
 
sql() - Method in class org.apache.spark.sql.hive.execution.NativeCommand
 
sql(String) - Method in class org.apache.spark.sql.hive.HiveContext
 
sql(String) - Method in class org.apache.spark.sql.SQLContext
Executes a SQL query using Spark, returning the result as a SchemaRDD.
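A hedged sketch of typical Spark 1.2-era usage; the Person case class and sample data are hypothetical, and sc is an existing SparkContext:

    import org.apache.spark.sql.SQLContext

    case class Person(name: String, age: Int)

    val sqlContext = new SQLContext(sc)
    import sqlContext.createSchemaRDD   // implicit conversion RDD -> SchemaRDD

    val people = sc.parallelize(Seq(Person("Alice", 30), Person("Bob", 12)))
    people.registerTempTable("people")
    val teens = sqlContext.sql("SELECT name FROM people WHERE age >= 13")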
SQLConf - Interface in org.apache.spark.sql
A trait that enables the setting and getting of mutable config parameters/hints.
SQLConf.Deprecated$ - Class in org.apache.spark.sql
 
SQLConf.Deprecated$() - Constructor for class org.apache.spark.sql.SQLConf.Deprecated$
 
sqlContext() - Method in class org.apache.spark.sql.api.java.JavaSchemaRDD
 
sqlContext() - Method in class org.apache.spark.sql.api.java.JavaSQLContext
 
sqlContext() - Method in class org.apache.spark.sql.columnar.InMemoryColumnarTableScan
 
sqlContext() - Method in class org.apache.spark.sql.execution.AddExchange
 
sqlContext() - Method in class org.apache.spark.sql.json.JSONRelation
 
sqlContext() - Method in class org.apache.spark.sql.parquet.ParquetRelation
 
sqlContext() - Method in class org.apache.spark.sql.parquet.ParquetRelation2
 
sqlContext() - Method in class org.apache.spark.sql.SchemaRDD
 
sqlContext() - Method in interface org.apache.spark.sql.SchemaRDDLike
 
sqlContext() - Method in class org.apache.spark.sql.sources.BaseRelation
 
SQLContext - Class in org.apache.spark.sql
:: AlphaComponent :: The entry point for running relational queries using Spark.
SQLContext(SparkContext) - Constructor for class org.apache.spark.sql.SQLContext
 
sqlType() - Method in class org.apache.spark.mllib.linalg.VectorUDT
 
sqlType() - Method in class org.apache.spark.sql.api.java.JavaToScalaUDTWrapper
Underlying storage type for this UDT
sqlType() - Method in class org.apache.spark.sql.api.java.ScalaToJavaUDTWrapper
Underlying storage type for this UDT
sqlType() - Method in class org.apache.spark.sql.api.java.UserDefinedType
Underlying storage type for this UDT
sqlType() - Method in class org.apache.spark.sql.test.ExamplePointUDT
 
SQRT() - Static method in class org.apache.spark.sql.hive.HiveQl
 
squaredDist(Vector) - Method in class org.apache.spark.util.Vector
 
SquaredError - Class in org.apache.spark.mllib.tree.loss
:: DeveloperApi :: Class for squared error loss calculation.
SquaredError() - Constructor for class org.apache.spark.mllib.tree.loss.SquaredError
 
SquaredL2Updater - Class in org.apache.spark.mllib.optimization
:: DeveloperApi :: Updater for L2 regularized problems.
SquaredL2Updater() - Constructor for class org.apache.spark.mllib.optimization.SquaredL2Updater
 
Src - Static variable in class org.apache.spark.graphx.TripletFields
Expose the source and edge fields but not the destination field.
srcAttr() - Method in class org.apache.spark.graphx.EdgeContext
The vertex attribute of the edge's source vertex.
srcAttr() - Method in class org.apache.spark.graphx.EdgeTriplet
The source vertex attribute
srcAttr() - Method in class org.apache.spark.graphx.impl.AggregatingEdgeContext
 
srcId() - Method in class org.apache.spark.graphx.Edge
 
srcId() - Method in class org.apache.spark.graphx.EdgeContext
The vertex id of the edge's source vertex.
srcId() - Method in class org.apache.spark.graphx.impl.AggregatingEdgeContext
 
srcId() - Method in class org.apache.spark.graphx.impl.EdgeWithLocalIds
 
srdd() - Method in class org.apache.spark.api.java.JavaDoubleRDD
 
ssc() - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
 
ssc() - Method in class org.apache.spark.streaming.dstream.DStream
 
ssc() - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
 
ssc() - Method in class org.apache.spark.streaming.scheduler.JobScheduler
 
stackTrace() - Method in class org.apache.spark.ExceptionFailure
 
stackTrace() - Method in class org.apache.spark.util.ThreadStackTrace
 
stackTraceFromJson(JsonAST.JValue) - Static method in class org.apache.spark.util.JsonProtocol
 
stackTraceToJson(StackTraceElement[]) - Static method in class org.apache.spark.util.JsonProtocol
 
Stage - Class in org.apache.spark.scheduler
A stage is a set of independent tasks all computing the same function that need to run as part of a Spark job, where all the tasks have the same shuffle dependencies.
Stage(int, RDD<?>, int, Option<ShuffleDependency<?, ?, ?>>, List<Stage>, int, CallSite) - Constructor for class org.apache.spark.scheduler.Stage
 
stageAttemptId() - Method in class org.apache.spark.scheduler.SparkListenerTaskEnd
 
stageAttemptId() - Method in class org.apache.spark.scheduler.SparkListenerTaskStart
 
StageCancelled - Class in org.apache.spark.scheduler
 
StageCancelled(int) - Constructor for class org.apache.spark.scheduler.StageCancelled
 
stageCompletedFromJson(JsonAST.JValue) - Static method in class org.apache.spark.util.JsonProtocol
 
stageCompletedToJson(SparkListenerStageCompleted) - Static method in class org.apache.spark.util.JsonProtocol
 
stageFailed(String) - Method in class org.apache.spark.scheduler.StageInfo
 
stageId() - Method in class org.apache.spark.scheduler.Pool
 
stageId() - Method in interface org.apache.spark.scheduler.Schedulable
 
stageId() - Method in class org.apache.spark.scheduler.SparkListenerTaskEnd
 
stageId() - Method in class org.apache.spark.scheduler.SparkListenerTaskStart
 
stageId() - Method in class org.apache.spark.scheduler.StageCancelled
 
stageId() - Method in class org.apache.spark.scheduler.StageInfo
 
stageId() - Method in class org.apache.spark.scheduler.Task
 
stageId() - Method in class org.apache.spark.scheduler.TaskSet
 
stageId() - Method in class org.apache.spark.scheduler.TaskSetManager
 
stageId() - Method in interface org.apache.spark.SparkStageInfo
 
stageId() - Method in class org.apache.spark.SparkStageInfoImpl
 
stageId() - Method in class org.apache.spark.TaskContext
 
stageId() - Method in class org.apache.spark.TaskContextImpl
 
stageIds() - Method in class org.apache.spark.scheduler.SparkListenerJobStart
 
stageIds() - Method in interface org.apache.spark.SparkJobInfo
 
stageIds() - Method in class org.apache.spark.SparkJobInfoImpl
 
stageIds() - Method in class org.apache.spark.ui.jobs.UIData.JobUIData
 
stageIdToActiveJobIds() - Method in class org.apache.spark.ui.jobs.JobProgressListener
 
stageIdToData() - Method in class org.apache.spark.ui.jobs.JobProgressListener
 
stageIdToInfo() - Method in class org.apache.spark.ui.jobs.JobProgressListener
 
stageIdToStage() - Method in class org.apache.spark.scheduler.DAGScheduler
 
stageInfo() - Method in class org.apache.spark.scheduler.SparkListenerStageCompleted
 
stageInfo() - Method in class org.apache.spark.scheduler.SparkListenerStageSubmitted
 
StageInfo - Class in org.apache.spark.scheduler
:: DeveloperApi :: Stores information about a stage to pass from the scheduler to SparkListeners.
StageInfo(int, int, String, int, Seq<RDDInfo>, String) - Constructor for class org.apache.spark.scheduler.StageInfo
 
stageInfoFromJson(JsonAST.JValue) - Static method in class org.apache.spark.util.JsonProtocol
JSON deserialization methods for classes that SparkListenerEvents depend on.
stageInfos() - Method in class org.apache.spark.scheduler.SparkListenerJobStart
 
stageInfoToJson(StageInfo) - Static method in class org.apache.spark.util.JsonProtocol
JSON serialization methods for classes that SparkListenerEvents depend on.
StagePage - Class in org.apache.spark.ui.jobs
Page showing statistics and task list for a given stage
StagePage(StagesTab) - Constructor for class org.apache.spark.ui.jobs.StagePage
 
stages() - Method in class org.apache.spark.ml.Pipeline
param for pipeline stages
stages() - Method in class org.apache.spark.ml.PipelineModel
 
StagesTab - Class in org.apache.spark.ui.jobs
Web UI showing progress status of all stages in the given SparkContext.
StagesTab(SparkUI) - Constructor for class org.apache.spark.ui.jobs.StagesTab
 
stageSubmittedFromJson(JsonAST.JValue) - Static method in class org.apache.spark.util.JsonProtocol
 
stageSubmittedToJson(SparkListenerStageSubmitted) - Static method in class org.apache.spark.util.JsonProtocol
 
StageTableBase - Class in org.apache.spark.ui.jobs
Page showing list of all ongoing and recently finished stages
StageTableBase(Seq<StageInfo>, String, JobProgressListener, boolean, boolean) - Constructor for class org.apache.spark.ui.jobs.StageTableBase
 
StandardNormalGenerator - Class in org.apache.spark.mllib.random
:: DeveloperApi :: Generates i.i.d. samples from the standard normal distribution.
StandardNormalGenerator() - Constructor for class org.apache.spark.mllib.random.StandardNormalGenerator
 
StandardScaler - Class in org.apache.spark.ml.feature
:: AlphaComponent :: Standardizes features by removing the mean and scaling to unit variance using column summary statistics on the samples in the training set.
StandardScaler() - Constructor for class org.apache.spark.ml.feature.StandardScaler
 
StandardScaler - Class in org.apache.spark.mllib.feature
:: Experimental :: Standardizes features by removing the mean and scaling to unit variance using column summary statistics on the samples in the training set.
StandardScaler(boolean, boolean) - Constructor for class org.apache.spark.mllib.feature.StandardScaler
 
StandardScaler() - Constructor for class org.apache.spark.mllib.feature.StandardScaler
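A minimal sketch of the fit/transform flow for the mllib variant, assuming an existing SparkContext sc (the sample vectors are arbitrary):

    import org.apache.spark.mllib.feature.StandardScaler
    import org.apache.spark.mllib.linalg.Vectors

    val data = sc.parallelize(Seq(Vectors.dense(1.0, 2.0), Vectors.dense(3.0, 4.0)))
    val scaler = new StandardScaler(true, true).fit(data)  // withMean, withStd
    val scaled = data.map(v => scaler.transform(v))        // zero-mean, unit-variance vectors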
 
StandardScalerModel - Class in org.apache.spark.ml.feature
:: AlphaComponent :: Model fitted by StandardScaler.
StandardScalerModel(StandardScaler, ParamMap, StandardScalerModel) - Constructor for class org.apache.spark.ml.feature.StandardScalerModel
 
StandardScalerModel - Class in org.apache.spark.mllib.feature
:: Experimental :: Represents a StandardScaler model that can transform vectors.
StandardScalerModel(boolean, boolean, Vector, Vector) - Constructor for class org.apache.spark.mllib.feature.StandardScalerModel
 
StandardScalerParams - Interface in org.apache.spark.ml.feature
starGraph(SparkContext, int) - Static method in class org.apache.spark.graphx.util.GraphGenerators
Create a star graph with vertex 0 being the center.
start() - Method in class org.apache.spark.ContextCleaner
Start the cleaner.
start() - Method in class org.apache.spark.ExecutorAllocationManager
Register for scheduler callbacks to decide when to add and remove executors.
start() - Method in class org.apache.spark.HttpServer
 
start() - Method in class org.apache.spark.metrics.MetricsSystem
 
start() - Method in class org.apache.spark.metrics.sink.ConsoleSink
 
start() - Method in class org.apache.spark.metrics.sink.CsvSink
 
start() - Method in class org.apache.spark.metrics.sink.GraphiteSink
 
start() - Method in class org.apache.spark.metrics.sink.JmxSink
 
start() - Method in class org.apache.spark.metrics.sink.MetricsServlet
 
start() - Method in interface org.apache.spark.metrics.sink.Sink
 
start(String) - Method in class org.apache.spark.mllib.tree.impl.TimeTracker
Starts a new timer, or re-starts a stopped timer.
start() - Method in class org.apache.spark.scheduler.cluster.CoarseGrainedSchedulerBackend
 
start() - Method in class org.apache.spark.scheduler.cluster.mesos.CoarseMesosSchedulerBackend
 
start() - Method in class org.apache.spark.scheduler.cluster.mesos.MesosSchedulerBackend
 
start() - Method in class org.apache.spark.scheduler.cluster.SimrSchedulerBackend
 
start() - Method in class org.apache.spark.scheduler.cluster.SparkDeploySchedulerBackend
 
start() - Method in class org.apache.spark.scheduler.EventLoggingListener
Begin logging events.
start() - Method in class org.apache.spark.scheduler.LiveListenerBus
Start sending events to attached listeners.
start() - Method in class org.apache.spark.scheduler.local.LocalBackend
 
start() - Method in interface org.apache.spark.scheduler.SchedulerBackend
 
start() - Method in interface org.apache.spark.scheduler.TaskScheduler
 
start() - Method in class org.apache.spark.scheduler.TaskSchedulerImpl
 
start() - Method in class org.apache.spark.sql.parquet.CatalystArrayContainsNullConverter
 
start() - Method in class org.apache.spark.sql.parquet.CatalystArrayConverter
 
start() - Method in class org.apache.spark.sql.parquet.CatalystGroupConverter
 
start() - Method in class org.apache.spark.sql.parquet.CatalystMapConverter
 
start() - Method in class org.apache.spark.sql.parquet.CatalystNativeArrayConverter
 
start() - Method in class org.apache.spark.sql.parquet.CatalystPrimitiveRowConverter
 
start() - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
Start the execution of the streams.
start() - Method in class org.apache.spark.streaming.dstream.ConstantInputDStream
 
start() - Method in class org.apache.spark.streaming.dstream.FileInputDStream
 
start() - Method in class org.apache.spark.streaming.dstream.InputDStream
Method called to start receiving data.
start() - Method in class org.apache.spark.streaming.dstream.QueueInputDStream
 
start() - Method in class org.apache.spark.streaming.dstream.ReceiverInputDStream
 
start(Time) - Method in class org.apache.spark.streaming.DStreamGraph
 
start() - Method in class org.apache.spark.streaming.receiver.BlockGenerator
Start block generating and pushing threads.
start() - Method in class org.apache.spark.streaming.receiver.ReceiverSupervisor
Start the supervisor
start() - Method in class org.apache.spark.streaming.scheduler.JobGenerator
Start generation of jobs
start() - Method in class org.apache.spark.streaming.scheduler.JobScheduler
 
start() - Method in class org.apache.spark.streaming.scheduler.ReceiverTracker.ReceiverLauncher
 
start() - Method in class org.apache.spark.streaming.scheduler.ReceiverTracker
Start the actor and receiver execution thread.
start() - Method in class org.apache.spark.streaming.scheduler.StreamingListenerBus
 
start() - Method in class org.apache.spark.streaming.StreamingContext
Start the execution of the streams.
start(long) - Method in class org.apache.spark.streaming.util.RecurringTimer
Start at the given start time.
start() - Method in class org.apache.spark.streaming.util.RecurringTimer
Start at the earliest time it can start based on the period.
start() - Method in class org.apache.spark.util.FileLogger
Start this logger by creating the logging directory.
Started() - Method in class org.apache.spark.streaming.receiver.ReceiverSupervisor.ReceiverState
 
Started() - Method in class org.apache.spark.streaming.StreamingContext.StreamingContextState$
 
startIdx() - Method in class org.apache.spark.util.Distribution
 
startIndex() - Method in class org.apache.spark.rdd.ZippedWithIndexRDDPartition
 
startIndexInLevel(int) - Static method in class org.apache.spark.mllib.tree.model.Node
Return the index of the first node in the given level.
startJettyServer(String, int, Seq<ServletContextHandler>, SparkConf, String) - Static method in class org.apache.spark.ui.JettyUtils
Attempt to start a Jetty server bound to the supplied hostName:port using the given context handlers.
startReceiver() - Method in class org.apache.spark.streaming.receiver.ReceiverSupervisor
Start receiver
startServiceOnPort(int, Function1<Object, Tuple2<T, Object>>, SparkConf, String) - Static method in class org.apache.spark.util.Utils
Attempt to start a service on the given port, or fail after a number of attempts.
startTime() - Method in class org.apache.spark.api.java.JavaSparkContext
 
startTime() - Method in class org.apache.spark.partial.ApproximateActionListener
 
startTime() - Method in class org.apache.spark.scheduler.ApplicationEventListener
 
startTime() - Method in class org.apache.spark.SparkContext
 
startTime() - Method in class org.apache.spark.streaming.DStreamGraph
 
startTime() - Method in class org.apache.spark.streaming.util.WriteAheadLogManager.LogInfo
 
startTime() - Method in class org.apache.spark.ui.jobs.UIData.JobUIData
 
STARVATION_TIMEOUT() - Method in class org.apache.spark.scheduler.TaskSchedulerImpl
 
statCounter() - Method in class org.apache.spark.util.Distribution
 
StatCounter - Class in org.apache.spark.util
A class for tracking the statistics of a set of numbers (count, mean and variance) in a numerically robust way.
StatCounter(TraversableOnce<Object>) - Constructor for class org.apache.spark.util.StatCounter
 
StatCounter() - Constructor for class org.apache.spark.util.StatCounter
Initialize the StatCounter with no values.
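A small usage sketch (the numbers are arbitrary):

    import org.apache.spark.util.StatCounter

    val stats = new StatCounter()                  // start with no values
    stats.merge(Seq(1.0, 2.0, 4.0))                // accumulate count, mean, variance in one pass
    println(stats.count, stats.mean, stats.stdev)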
state() - Method in class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.StatusUpdate
 
state() - Method in class org.apache.spark.scheduler.local.StatusUpdate
 
state() - Method in class org.apache.spark.streaming.StreamingContext
 
StateDStream<K,V,S> - Class in org.apache.spark.streaming.dstream
 
StateDStream(DStream<Tuple2<K, V>>, Function1<Iterator<Tuple3<K, Seq<V>, Option<S>>>, Iterator<Tuple2<K, S>>>, Partitioner, boolean, ClassTag<K>, ClassTag<V>, ClassTag<S>) - Constructor for class org.apache.spark.streaming.dstream.StateDStream
 
STATIC_RESOURCE_DIR() - Static method in class org.apache.spark.ui.SparkUI
 
staticPageRank(int, double) - Method in class org.apache.spark.graphx.GraphOps
Run PageRank for a fixed number of iterations returning a graph with vertex attributes containing the PageRank and edge attributes the normalized edge weight.
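A one-line sketch, assuming an existing GraphX Graph named graph:

    // 10 iterations with the default reset probability of 0.15
    val ranks = graph.staticPageRank(10).vertices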
statistic() - Method in class org.apache.spark.mllib.stat.test.ChiSqTestResult
 
statistic() - Method in interface org.apache.spark.mllib.stat.test.TestResult
Test statistic.
Statistics - Class in org.apache.spark.mllib.stat
API for statistical functions in MLlib.
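A brief sketch of requesting Spearman's correlation through this entry point, assuming an existing SparkContext sc:

    import org.apache.spark.mllib.stat.Statistics

    val x = sc.parallelize(Seq(1.0, 2.0, 3.0, 4.0))
    val y = sc.parallelize(Seq(10.0, 21.0, 29.0, 41.0))
    val rho = Statistics.corr(x, y, "spearman")   // "pearson" is the default method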
Statistics() - Constructor for class org.apache.spark.mllib.stat.Statistics
 
statistics() - Method in class org.apache.spark.sql.columnar.InMemoryRelation
 
statistics() - Method in class org.apache.spark.sql.execution.LogicalRDD
 
statistics() - Method in class org.apache.spark.sql.execution.SparkLogicalPlan
 
statistics() - Method in class org.apache.spark.sql.hive.MetastoreRelation
 
statistics() - Method in class org.apache.spark.sql.parquet.ParquetRelation
 
statistics() - Method in class org.apache.spark.sql.sources.LogicalRelation
 
Statistics - Class in org.apache.spark.streaming.receiver
:: DeveloperApi :: Statistics for querying the supervisor about the state of workers.
Statistics(int, int, int, String) - Constructor for class org.apache.spark.streaming.receiver.Statistics
 
stats() - Method in class org.apache.spark.api.java.JavaDoubleRDD
Return a StatCounter object that captures the mean, variance and count of the RDD's elements in one operation.
stats() - Method in class org.apache.spark.mllib.tree.impurity.ImpurityCalculator
 
stats() - Method in class org.apache.spark.mllib.tree.model.Node
 
stats() - Method in class org.apache.spark.rdd.DoubleRDDFunctions
Return a StatCounter object that captures the mean, variance and count of the RDD's elements in one operation.
stats() - Method in class org.apache.spark.sql.columnar.CachedBatch
 
StatsReportListener - Class in org.apache.spark.scheduler
:: DeveloperApi :: Simple SparkListener that logs a few summary statistics when each stage completes
StatsReportListener() - Constructor for class org.apache.spark.scheduler.StatsReportListener
 
StatsReportListener - Class in org.apache.spark.streaming.scheduler
:: DeveloperApi :: A simple StreamingListener that logs summary statistics across Spark Streaming batches
StatsReportListener(int) - Constructor for class org.apache.spark.streaming.scheduler.StatsReportListener
 
statsSize() - Method in class org.apache.spark.mllib.tree.impurity.ImpurityAggregator
 
status() - Method in class org.apache.spark.scheduler.TaskInfo
 
status() - Method in interface org.apache.spark.SparkJobInfo
 
status() - Method in class org.apache.spark.SparkJobInfoImpl
 
status() - Method in class org.apache.spark.ui.jobs.UIData.JobUIData
 
statusTracker() - Method in class org.apache.spark.api.java.JavaSparkContext
 
statusTracker() - Method in class org.apache.spark.SparkContext
 
statusUpdate(SchedulerDriver, Protos.TaskStatus) - Method in class org.apache.spark.scheduler.cluster.mesos.CoarseMesosSchedulerBackend
 
statusUpdate(SchedulerDriver, Protos.TaskStatus) - Method in class org.apache.spark.scheduler.cluster.mesos.MesosSchedulerBackend
 
statusUpdate(long, Enumeration.Value, ByteBuffer) - Method in class org.apache.spark.scheduler.local.LocalBackend
 
StatusUpdate - Class in org.apache.spark.scheduler.local
 
StatusUpdate(long, Enumeration.Value, ByteBuffer) - Constructor for class org.apache.spark.scheduler.local.StatusUpdate
 
statusUpdate(long, Enumeration.Value, ByteBuffer) - Method in class org.apache.spark.scheduler.TaskSchedulerImpl
 
stdev() - Method in class org.apache.spark.api.java.JavaDoubleRDD
Compute the standard deviation of this RDD's elements.
stdev() - Method in class org.apache.spark.rdd.DoubleRDDFunctions
Compute the standard deviation of this RDD's elements.
stdev() - Method in class org.apache.spark.util.StatCounter
Return the standard deviation of the values.
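A short sketch of both calls on an RDD of doubles, assuming an existing SparkContext sc (in the 1.x API, the implicit conversion to DoubleRDDFunctions comes from SparkContext._):

    import org.apache.spark.SparkContext._

    val rdd = sc.parallelize(Seq(1.0, 2.0, 3.0, 4.0))
    val s = rdd.stats()           // count, mean, variance, ... in one pass
    println(s.mean, s.stdev)
    println(rdd.stdev())          // same value, computed directly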
stop() - Method in class org.apache.spark.api.java.JavaSparkContext
Shut down the SparkContext.
stop() - Method in interface org.apache.spark.broadcast.BroadcastFactory
 
stop() - Method in class org.apache.spark.broadcast.BroadcastManager
 
stop() - Static method in class org.apache.spark.broadcast.HttpBroadcast
 
stop() - Method in class org.apache.spark.broadcast.HttpBroadcastFactory
 
stop() - Method in class org.apache.spark.broadcast.TorrentBroadcastFactory
 
stop() - Method in class org.apache.spark.ContextCleaner
Stop the cleaning thread and wait until the thread has finished running its current task.
stop() - Method in class org.apache.spark.HttpFileServer
 
stop() - Method in class org.apache.spark.HttpServer
 
stop() - Method in class org.apache.spark.MapOutputTracker
Stop the tracker.
stop() - Method in class org.apache.spark.MapOutputTrackerMaster
 
stop() - Method in class org.apache.spark.metrics.MetricsSystem
 
stop() - Method in class org.apache.spark.metrics.sink.ConsoleSink
 
stop() - Method in class org.apache.spark.metrics.sink.CsvSink
 
stop() - Method in class org.apache.spark.metrics.sink.GraphiteSink
 
stop() - Method in class org.apache.spark.metrics.sink.JmxSink
 
stop() - Method in class org.apache.spark.metrics.sink.MetricsServlet
 
stop() - Method in interface org.apache.spark.metrics.sink.Sink
 
stop(String) - Method in class org.apache.spark.mllib.tree.impl.TimeTracker
Stops a timer and returns the elapsed time in seconds.
stop() - Method in class org.apache.spark.scheduler.cluster.CoarseGrainedSchedulerBackend
 
stop() - Method in class org.apache.spark.scheduler.cluster.mesos.CoarseMesosSchedulerBackend
 
stop() - Method in class org.apache.spark.scheduler.cluster.mesos.MesosSchedulerBackend
 
stop() - Method in class org.apache.spark.scheduler.cluster.SimrSchedulerBackend
 
stop() - Method in class org.apache.spark.scheduler.cluster.SparkDeploySchedulerBackend
 
stop() - Method in class org.apache.spark.scheduler.DAGScheduler
 
stop() - Method in class org.apache.spark.scheduler.EventLoggingListener
Stop logging events.
stop() - Method in class org.apache.spark.scheduler.LiveListenerBus
 
stop() - Method in class org.apache.spark.scheduler.local.LocalBackend
 
stop() - Method in interface org.apache.spark.scheduler.SchedulerBackend
 
stop() - Method in class org.apache.spark.scheduler.TaskResultGetter
 
stop() - Method in interface org.apache.spark.scheduler.TaskScheduler
 
stop() - Method in class org.apache.spark.scheduler.TaskSchedulerImpl
 
stop() - Method in class org.apache.spark.SparkContext
Shut down the SparkContext.
stop() - Method in class org.apache.spark.SparkEnv
 
stop() - Method in class org.apache.spark.storage.BlockManager
 
stop() - Method in class org.apache.spark.storage.BlockManagerMaster
Stop the driver actor, called only on the Spark driver node
stop() - Method in class org.apache.spark.storage.DiskBlockManager
Cleanup local dirs and stop shuffle sender.
stop() - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
Stop the execution of the streams.
stop(boolean) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
Stop the execution of the streams.
stop(boolean, boolean) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
Stop the execution of the streams.
stop() - Method in class org.apache.spark.streaming.CheckpointWriter
 
stop() - Method in class org.apache.spark.streaming.dstream.ConstantInputDStream
 
stop() - Method in class org.apache.spark.streaming.dstream.FileInputDStream
 
stop() - Method in class org.apache.spark.streaming.dstream.InputDStream
Method called to stop receiving data.
stop() - Method in class org.apache.spark.streaming.dstream.QueueInputDStream
 
stop() - Method in class org.apache.spark.streaming.dstream.ReceiverInputDStream
 
stop() - Method in class org.apache.spark.streaming.DStreamGraph
 
stop() - Method in class org.apache.spark.streaming.receiver.BlockGenerator
Stop all threads.
stop(String) - Method in class org.apache.spark.streaming.receiver.Receiver
Stop the receiver completely.
stop(String, Throwable) - Method in class org.apache.spark.streaming.receiver.Receiver
Stop the receiver completely due to an exception
stop(String, Option<Throwable>) - Method in class org.apache.spark.streaming.receiver.ReceiverSupervisor
Mark the supervisor and the receiver for stopping
stop() - Method in class org.apache.spark.streaming.receiver.WriteAheadLogBasedBlockHandler
 
stop(boolean) - Method in class org.apache.spark.streaming.scheduler.JobGenerator
Stop generation of jobs.
stop(boolean) - Method in class org.apache.spark.streaming.scheduler.JobScheduler
 
stop() - Method in class org.apache.spark.streaming.scheduler.ReceivedBlockTracker
Stop the block tracker.
stop(boolean) - Method in class org.apache.spark.streaming.scheduler.ReceiverTracker.ReceiverLauncher
 
stop(boolean) - Method in class org.apache.spark.streaming.scheduler.ReceiverTracker
Stop the receiver execution thread.
stop() - Method in class org.apache.spark.streaming.scheduler.StreamingListenerBus
 
stop(boolean) - Method in class org.apache.spark.streaming.StreamingContext
Stop the execution of the streams immediately (does not wait for all received data to be processed).
stop(boolean, boolean) - Method in class org.apache.spark.streaming.StreamingContext
Stop the execution of the streams, with the option of ensuring all received data has been processed.
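A hedged sketch of this two-argument overload, assuming an existing StreamingContext ssc:

    // stop the streaming computation gracefully, but keep the SparkContext alive
    ssc.stop(false, true)   // (stopSparkContext, stopGracefully)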
stop(boolean) - Method in class org.apache.spark.streaming.util.RecurringTimer
Stop the timer, and return the last time the callback was made.
stop() - Method in class org.apache.spark.streaming.util.WriteAheadLogManager
Stop the manager, close any open log writer
stop() - Method in class org.apache.spark.ui.SparkUI
Stop the server behind this web interface.
stop() - Method in class org.apache.spark.ui.WebUI
Stop the server behind this web interface.
stop() - Method in class org.apache.spark.util.FileLogger
Close all open writers, streams, and file systems.
stop() - Method in class org.apache.spark.util.logging.FileAppender
Stop the appender
stop() - Method in class org.apache.spark.util.logging.RollingFileAppender
Stop the appender
StopExecutor - Class in org.apache.spark.scheduler.local
 
StopExecutor() - Constructor for class org.apache.spark.scheduler.local.StopExecutor
 
stopExecutors() - Method in class org.apache.spark.scheduler.cluster.CoarseGrainedSchedulerBackend
 
StopMapOutputTracker - Class in org.apache.spark
 
StopMapOutputTracker() - Constructor for class org.apache.spark.StopMapOutputTracker
 
Stopped() - Method in class org.apache.spark.streaming.receiver.ReceiverSupervisor.ReceiverState
 
Stopped() - Method in class org.apache.spark.streaming.StreamingContext.StreamingContextState$
 
stopping() - Method in class org.apache.spark.scheduler.cluster.SparkDeploySchedulerBackend
 
stopReceiver(String, Option<Throwable>) - Method in class org.apache.spark.streaming.receiver.ReceiverSupervisor
Stop receiver
StopReceiver - Class in org.apache.spark.streaming.receiver
 
StopReceiver() - Constructor for class org.apache.spark.streaming.receiver.StopReceiver
 
storageLevel() - Method in class org.apache.spark.sql.columnar.InMemoryRelation
 
storageLevel() - Method in class org.apache.spark.storage.BlockManagerMessages.UpdateBlockInfo
 
storageLevel() - Method in class org.apache.spark.storage.BlockStatus
 
storageLevel() - Method in class org.apache.spark.storage.RDDInfo
 
StorageLevel - Class in org.apache.spark.storage
:: DeveloperApi :: Flags for controlling the storage of an RDD.
StorageLevel() - Constructor for class org.apache.spark.storage.StorageLevel
 
storageLevel() - Method in class org.apache.spark.streaming.dstream.DStream
 
storageLevel() - Method in class org.apache.spark.streaming.receiver.Receiver
 
storageLevelCache() - Static method in class org.apache.spark.storage.StorageLevel
:: DeveloperApi :: Read StorageLevel object from ObjectInput stream.
storageLevelFromJson(JsonAST.JValue) - Static method in class org.apache.spark.util.JsonProtocol
 
StorageLevels - Class in org.apache.spark.api.java
Expose some commonly useful storage level constants.
StorageLevels() - Constructor for class org.apache.spark.api.java.StorageLevels
 
storageLevelToJson(StorageLevel) - Static method in class org.apache.spark.util.JsonProtocol
 
storageListener() - Method in class org.apache.spark.ui.SparkUI
 
StorageListener - Class in org.apache.spark.ui.storage
:: DeveloperApi :: A SparkListener that prepares information to be displayed on the BlockManagerUI.
StorageListener(StorageStatusListener) - Constructor for class org.apache.spark.ui.storage.StorageListener
 
StoragePage - Class in org.apache.spark.ui.storage
Page showing list of RDDs currently stored in the cluster
StoragePage(StorageTab) - Constructor for class org.apache.spark.ui.storage.StoragePage
 
StorageStatus - Class in org.apache.spark.storage
:: DeveloperApi :: Storage information for each BlockManager.
StorageStatus(BlockManagerId, long) - Constructor for class org.apache.spark.storage.StorageStatus
 
StorageStatus(BlockManagerId, long, Map<BlockId, BlockStatus>) - Constructor for class org.apache.spark.storage.StorageStatus
Create a storage status with an initial set of blocks, leaving the source unmodified.
storageStatusList() - Method in class org.apache.spark.storage.StorageStatusListener
 
storageStatusList() - Method in class org.apache.spark.ui.exec.ExecutorsListener
 
storageStatusList() - Method in class org.apache.spark.ui.storage.StorageListener
 
StorageStatusListener - Class in org.apache.spark.storage
:: DeveloperApi :: A SparkListener that maintains executor storage status.
StorageStatusListener() - Constructor for class org.apache.spark.storage.StorageStatusListener
 
storageStatusListener() - Method in class org.apache.spark.ui.SparkUI
 
StorageTab - Class in org.apache.spark.ui.storage
Web UI showing storage status of all RDDs in the given SparkContext.
StorageTab(SparkUI) - Constructor for class org.apache.spark.ui.storage.StorageTab
 
StorageUtils - Class in org.apache.spark.storage
Helper methods for storage-related objects.
StorageUtils() - Constructor for class org.apache.spark.storage.StorageUtils
 
store(Iterator<T>) - Method in interface org.apache.spark.streaming.receiver.ActorHelper
Store an iterator of received data as a data block into Spark's memory.
store(ByteBuffer) - Method in interface org.apache.spark.streaming.receiver.ActorHelper
Store the bytes of received data as a data block into Spark's memory.
store(T) - Method in interface org.apache.spark.streaming.receiver.ActorHelper
Store a single item of received data to Spark's memory.
store(T) - Method in class org.apache.spark.streaming.receiver.Receiver
Store a single item of received data to Spark's memory.
store(ArrayBuffer<T>) - Method in class org.apache.spark.streaming.receiver.Receiver
Store an ArrayBuffer of received data as a data block into Spark's memory.
store(ArrayBuffer<T>, Object) - Method in class org.apache.spark.streaming.receiver.Receiver
Store an ArrayBuffer of received data as a data block into Spark's memory.
store(Iterator<T>) - Method in class org.apache.spark.streaming.receiver.Receiver
Store an iterator of received data as a data block into Spark's memory.
store(Iterator<T>, Object) - Method in class org.apache.spark.streaming.receiver.Receiver
Store an iterator of received data as a data block into Spark's memory.
store(java.util.Iterator<T>) - Method in class org.apache.spark.streaming.receiver.Receiver
Store an iterator of received data as a data block into Spark's memory (Java-friendly overload).
store(java.util.Iterator<T>, Object) - Method in class org.apache.spark.streaming.receiver.Receiver
Store an iterator of received data as a data block into Spark's memory (Java-friendly overload).
store(ByteBuffer) - Method in class org.apache.spark.streaming.receiver.Receiver
Store the bytes of received data as a data block into Spark's memory.
store(ByteBuffer, Object) - Method in class org.apache.spark.streaming.receiver.Receiver
Store the bytes of received data as a data block into Spark's memory.
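A minimal custom-receiver sketch showing where store() is called; the host/port and line-based framing are hypothetical:

    import java.net.Socket
    import scala.io.Source
    import org.apache.spark.storage.StorageLevel
    import org.apache.spark.streaming.receiver.Receiver

    class SocketLineReceiver(host: String, port: Int)
        extends Receiver[String](StorageLevel.MEMORY_AND_DISK_2) {

      def onStart(): Unit = {
        new Thread("Socket Receiver") {
          override def run(): Unit = receive()
        }.start()
      }

      def onStop(): Unit = { }   // the reading thread exits once isStopped() is true

      private def receive(): Unit = {
        try {
          val socket = new Socket(host, port)
          val lines = Source.fromInputStream(socket.getInputStream).getLines()
          while (!isStopped && lines.hasNext) {
            store(lines.next())  // hand one received item to Spark
          }
          socket.close()
          restart("Trying to connect again")
        } catch {
          case e: java.io.IOException => restart("Error receiving data", e)
        }
      }
    }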
storeBlock(StreamBlockId, ReceivedBlock) - Method in class org.apache.spark.streaming.receiver.BlockManagerBasedBlockHandler
 
storeBlock(StreamBlockId, ReceivedBlock) - Method in interface org.apache.spark.streaming.receiver.ReceivedBlockHandler
Store a received block with the given block id and return related metadata
storeBlock(StreamBlockId, ReceivedBlock) - Method in class org.apache.spark.streaming.receiver.WriteAheadLogBasedBlockHandler
This implementation stores the block into the block manager as well as a write ahead log.
Strategy - Class in org.apache.spark.mllib.tree.configuration
:: Experimental :: Stores all the configuration options for tree construction
Strategy(Enumeration.Value, Impurity, int, int, int, Enumeration.Value, Map<Object, Object>, int, double, int, double, boolean, Option<String>, int) - Constructor for class org.apache.spark.mllib.tree.configuration.Strategy
 
Strategy(Enumeration.Value, Impurity, int, int, int, Map<Integer, Integer>) - Constructor for class org.apache.spark.mllib.tree.configuration.Strategy
Java-friendly constructor for Strategy
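A hedged construction sketch; the first five positional arguments are algo, impurity, maxDepth, numClasses, and maxBins, with the remaining options left at their defaults:

    import org.apache.spark.mllib.tree.configuration.{Algo, Strategy}
    import org.apache.spark.mllib.tree.impurity.Gini

    val strategy = new Strategy(Algo.Classification, Gini, 5, 2, 32)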
STRATEGY_DEFAULT() - Static method in class org.apache.spark.util.logging.RollingFileAppender
 
STRATEGY_PROPERTY() - Static method in class org.apache.spark.util.logging.RollingFileAppender
 
StratifiedSamplingUtils - Class in org.apache.spark.util.random
Auxiliary functions and data structures for the sampleByKey method in PairRDDFunctions.
StratifiedSamplingUtils() - Constructor for class org.apache.spark.util.random.StratifiedSamplingUtils
 
STREAM() - Static method in class org.apache.spark.storage.BlockId
 
StreamBasedRecordReader<T> - Class in org.apache.spark.input
An abstract class of RecordReader for reading files out as streams
StreamBasedRecordReader(CombineFileSplit, TaskAttemptContext, Integer) - Constructor for class org.apache.spark.input.StreamBasedRecordReader
 
StreamBlockId - Class in org.apache.spark.storage
 
StreamBlockId(int, long) - Constructor for class org.apache.spark.storage.StreamBlockId
 
streamed() - Method in class org.apache.spark.sql.execution.joins.LeftSemiJoinBNL
 
streamedKeys() - Method in interface org.apache.spark.sql.execution.joins.HashJoin
 
streamedPlan() - Method in interface org.apache.spark.sql.execution.joins.HashJoin
 
StreamFileInputFormat<T> - Class in org.apache.spark.input
A general format for reading whole files in as streams, byte arrays, or other formats to be added
StreamFileInputFormat() - Constructor for class org.apache.spark.input.StreamFileInputFormat
 
streamId() - Method in class org.apache.spark.storage.StreamBlockId
 
streamId() - Method in class org.apache.spark.streaming.receiver.Receiver
Get the unique identifier of the receiver input stream that this receiver is associated with.
streamId() - Method in class org.apache.spark.streaming.scheduler.DeregisterReceiver
 
streamId() - Method in class org.apache.spark.streaming.scheduler.ReceivedBlockInfo
 
streamId() - Method in class org.apache.spark.streaming.scheduler.ReceiverInfo
 
streamId() - Method in class org.apache.spark.streaming.scheduler.RegisterReceiver
 
streamId() - Method in class org.apache.spark.streaming.scheduler.ReportError
 
streamIdToAllocatedBlocks() - Method in class org.apache.spark.streaming.scheduler.AllocatedBlocks
 
StreamingContext - Class in org.apache.spark.streaming
Main entry point for Spark Streaming functionality.
StreamingContext(SparkContext, Checkpoint, Duration) - Constructor for class org.apache.spark.streaming.StreamingContext
 
StreamingContext(SparkContext, Duration) - Constructor for class org.apache.spark.streaming.StreamingContext
Create a StreamingContext using an existing SparkContext.
StreamingContext(SparkConf, Duration) - Constructor for class org.apache.spark.streaming.StreamingContext
Create a StreamingContext by providing the configuration necessary for a new SparkContext.
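A minimal sketch of the SparkConf-based constructor (the app name and master are placeholders):

    import org.apache.spark.SparkConf
    import org.apache.spark.streaming.{Seconds, StreamingContext}

    val conf = new SparkConf().setAppName("Example").setMaster("local[2]")
    val ssc = new StreamingContext(conf, Seconds(1))  // 1-second batch interval
    // ... define input DStreams and transformations, then:
    ssc.start()
    ssc.awaitTermination()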
StreamingContext(String, String, Duration, String, Seq<String>, Map<String, String>) - Constructor for class org.apache.spark.streaming.StreamingContext
Create a StreamingContext by providing the details necessary for creating a new SparkContext.
StreamingContext(String, Configuration) - Constructor for class org.apache.spark.streaming.StreamingContext
Recreate a StreamingContext from a checkpoint file.
StreamingContext(String) - Constructor for class org.apache.spark.streaming.StreamingContext
Recreate a StreamingContext from a checkpoint file.
StreamingContext.StreamingContextState$ - Class in org.apache.spark.streaming
Enumeration to identify current state of the StreamingContext
StreamingContext.StreamingContextState$() - Constructor for class org.apache.spark.streaming.StreamingContext.StreamingContextState$
 
StreamingContextState() - Method in class org.apache.spark.streaming.StreamingContext
Accessor for nested Scala object
StreamingExamples - Class in org.apache.spark.examples.streaming
Utility functions for Spark Streaming examples.
StreamingExamples() - Constructor for class org.apache.spark.examples.streaming.StreamingExamples
 
StreamingJobProgressListener - Class in org.apache.spark.streaming.ui
 
StreamingJobProgressListener(StreamingContext) - Constructor for class org.apache.spark.streaming.ui.StreamingJobProgressListener
 
StreamingKMeans - Class in org.apache.spark.mllib.clustering
:: DeveloperApi :: StreamingKMeans provides methods for configuring a streaming k-means analysis, training the model on streaming data, and using the model to make predictions on streaming data.
StreamingKMeans(int, double, String) - Constructor for class org.apache.spark.mllib.clustering.StreamingKMeans
 
StreamingKMeans() - Constructor for class org.apache.spark.mllib.clustering.StreamingKMeans
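A hedged configuration sketch; trainingData is an assumed DStream[Vector]:

    import org.apache.spark.mllib.clustering.StreamingKMeans

    val model = new StreamingKMeans()
      .setK(2)
      .setDecayFactor(1.0)
      .setRandomCenters(3, 0.0)   // dimension 3, initial weight 0.0
    model.trainOn(trainingData)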
 
StreamingKMeansModel - Class in org.apache.spark.mllib.clustering
:: DeveloperApi :: StreamingKMeansModel extends MLlib's KMeansModel for streaming algorithms, so it can keep track of a continuously updated weight associated with each cluster, and also update the model by doing a single iteration of the standard k-means algorithm.
StreamingKMeansModel(Vector[], double[]) - Constructor for class org.apache.spark.mllib.clustering.StreamingKMeansModel
 
StreamingLinearAlgorithm<M extends GeneralizedLinearModel,A extends GeneralizedLinearAlgorithm<M>> - Class in org.apache.spark.mllib.regression
:: DeveloperApi :: StreamingLinearAlgorithm implements methods for continuously training a generalized linear model on streaming data, and using it for prediction on (possibly different) streaming data.
StreamingLinearAlgorithm() - Constructor for class org.apache.spark.mllib.regression.StreamingLinearAlgorithm
 
StreamingLinearRegressionWithSGD - Class in org.apache.spark.mllib.regression
Train or predict a linear regression model on streaming data.
StreamingLinearRegressionWithSGD(double, int, double, Vector) - Constructor for class org.apache.spark.mllib.regression.StreamingLinearRegressionWithSGD
 
StreamingLinearRegressionWithSGD() - Constructor for class org.apache.spark.mllib.regression.StreamingLinearRegressionWithSGD
Construct a StreamingLinearRegressionWithSGD object with default parameters: {stepSize: 0.1, numIterations: 50, miniBatchFraction: 1.0}.
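A brief sketch using the default-parameter constructor; labeledStream is an assumed DStream[LabeledPoint] with 3 features:

    import org.apache.spark.mllib.linalg.Vectors
    import org.apache.spark.mllib.regression.StreamingLinearRegressionWithSGD

    val model = new StreamingLinearRegressionWithSGD()
      .setInitialWeights(Vectors.zeros(3))
    model.trainOn(labeledStream)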
StreamingListener - Interface in org.apache.spark.streaming.scheduler
:: DeveloperApi :: A listener interface for receiving information about an ongoing streaming computation.
StreamingListenerBatchCompleted - Class in org.apache.spark.streaming.scheduler
 
StreamingListenerBatchCompleted(BatchInfo) - Constructor for class org.apache.spark.streaming.scheduler.StreamingListenerBatchCompleted
 
StreamingListenerBatchStarted - Class in org.apache.spark.streaming.scheduler
 
StreamingListenerBatchStarted(BatchInfo) - Constructor for class org.apache.spark.streaming.scheduler.StreamingListenerBatchStarted
 
StreamingListenerBatchSubmitted - Class in org.apache.spark.streaming.scheduler
 
StreamingListenerBatchSubmitted(BatchInfo) - Constructor for class org.apache.spark.streaming.scheduler.StreamingListenerBatchSubmitted
 
StreamingListenerBus - Class in org.apache.spark.streaming.scheduler
Asynchronously passes StreamingListenerEvents to registered StreamingListeners.
StreamingListenerBus() - Constructor for class org.apache.spark.streaming.scheduler.StreamingListenerBus
 
StreamingListenerEvent - Interface in org.apache.spark.streaming.scheduler
:: DeveloperApi :: Base trait for events related to StreamingListener
StreamingListenerReceiverError - Class in org.apache.spark.streaming.scheduler
 
StreamingListenerReceiverError(ReceiverInfo) - Constructor for class org.apache.spark.streaming.scheduler.StreamingListenerReceiverError
 
StreamingListenerReceiverStarted - Class in org.apache.spark.streaming.scheduler
 
StreamingListenerReceiverStarted(ReceiverInfo) - Constructor for class org.apache.spark.streaming.scheduler.StreamingListenerReceiverStarted
 
StreamingListenerReceiverStopped - Class in org.apache.spark.streaming.scheduler
 
StreamingListenerReceiverStopped(ReceiverInfo) - Constructor for class org.apache.spark.streaming.scheduler.StreamingListenerReceiverStopped
 
StreamingListenerShutdown - Class in org.apache.spark.streaming.scheduler
An event used in the listener to shut down the listener daemon thread.
StreamingListenerShutdown() - Constructor for class org.apache.spark.streaming.scheduler.StreamingListenerShutdown
 
StreamingPage - Class in org.apache.spark.streaming.ui
Page for Spark Web UI that shows statistics of a streaming job
StreamingPage(StreamingTab) - Constructor for class org.apache.spark.streaming.ui.StreamingPage
 
StreamingSource - Class in org.apache.spark.streaming
 
StreamingSource(StreamingContext) - Constructor for class org.apache.spark.streaming.StreamingSource
 
StreamingTab - Class in org.apache.spark.streaming.ui
Spark Web UI tab that shows statistics of a streaming job.
StreamingTab(StreamingContext) - Constructor for class org.apache.spark.streaming.ui.StreamingTab
 
StreamInputFormat - Class in org.apache.spark.input
The format for the PortableDataStream files
StreamInputFormat() - Constructor for class org.apache.spark.input.StreamInputFormat
 
StreamRecordReader - Class in org.apache.spark.input
Reads the record in directly as a stream for other objects to manipulate and handle
StreamRecordReader(CombineFileSplit, TaskAttemptContext, Integer) - Constructor for class org.apache.spark.input.StreamRecordReader
 
streamSideKeyGenerator() - Method in interface org.apache.spark.sql.execution.joins.HashJoin
 
STRING - Class in org.apache.spark.sql.columnar
 
STRING() - Constructor for class org.apache.spark.sql.columnar.STRING
 
StringColumnAccessor - Class in org.apache.spark.sql.columnar
 
StringColumnAccessor(ByteBuffer) - Constructor for class org.apache.spark.sql.columnar.StringColumnAccessor
 
StringColumnBuilder - Class in org.apache.spark.sql.columnar
 
StringColumnBuilder() - Constructor for class org.apache.spark.sql.columnar.StringColumnBuilder
 
StringColumnStats - Class in org.apache.spark.sql.columnar
 
StringColumnStats() - Constructor for class org.apache.spark.sql.columnar.StringColumnStats
 
stringifyPartialValue(Object) - Static method in class org.apache.spark.Accumulators
 
stringifyValue(Object) - Static method in class org.apache.spark.Accumulators
 
stringToText(String) - Static method in class org.apache.spark.SparkContext
 
stringToTime(String) - Static method in class org.apache.spark.sql.types.util.DataTypeConversions
 
StringType - Static variable in class org.apache.spark.sql.api.java.DataType
Gets the StringType object.
StringType - Class in org.apache.spark.sql.api.java
The data type representing String values.
stringWritableConverter() - Static method in class org.apache.spark.SparkContext
 
stripDirectory(String) - Static method in class org.apache.spark.util.Utils
Strip the directory from a path name
stronglyConnectedComponents(int) - Method in class org.apache.spark.graphx.GraphOps
Compute the strongly connected component (SCC) of each vertex and return a graph with the vertex value containing the lowest vertex id in the SCC containing that vertex.
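A short sketch, assuming an edge-list file at a hypothetical path and an existing SparkContext sc:

    import org.apache.spark.graphx.GraphLoader

    val graph = GraphLoader.edgeListFile(sc, "data/edges.txt")
    val scc = graph.stronglyConnectedComponents(5).vertices   // 5 iterations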
StronglyConnectedComponents - Class in org.apache.spark.graphx.lib
Strongly connected components algorithm implementation.
StronglyConnectedComponents() - Constructor for class org.apache.spark.graphx.lib.StronglyConnectedComponents
 
StructField - Class in org.apache.spark.sql.api.java
A StructField object represents a field in a StructType object.
StructType - Class in org.apache.spark.sql.api.java
The data type representing Rows.
StudentTCacher - Class in org.apache.spark.partial
A utility class for caching Student's T distribution values for a given confidence level and various sample sizes.
StudentTCacher(double) - Constructor for class org.apache.spark.partial.StudentTCacher
 
subDirsPerLocalDir() - Method in class org.apache.spark.storage.DiskBlockManager
 
subgraph(Function1<EdgeTriplet<VD, ED>, Object>, Function2<Object, VD, Object>) - Method in class org.apache.spark.graphx.Graph
Restricts the graph to only the vertices and edges satisfying the predicates.
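A hedged sketch, assuming an existing graph with Int edge attributes and reference-typed vertex attributes:

    val sub = graph.subgraph(
      epred = triplet => triplet.attr > 0,    // keep positive-weight edges
      vpred = (id, attr) => attr != null)     // keep vertices with an attribute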
subgraph(Function1<EdgeTriplet<VD, ED>, Object>, Function2<Object, VD, Object>) - Method in class org.apache.spark.graphx.impl.GraphImpl
 
submissionTime() - Method in class org.apache.spark.scheduler.StageInfo
When this stage was submitted from the DAGScheduler to a TaskScheduler.
submissionTime() - Method in interface org.apache.spark.SparkStageInfo
 
submissionTime() - Method in class org.apache.spark.SparkStageInfoImpl
 
submissionTime() - Method in class org.apache.spark.streaming.scheduler.BatchInfo
 
submitJob(RDD<T>, Function2<TaskContext, Iterator<T>, U>, Seq<Object>, CallSite, boolean, Function2<Object, U, BoxedUnit>, Properties) - Method in class org.apache.spark.scheduler.DAGScheduler
Submit a job to the job scheduler and get a JobWaiter object back.
submitJob(RDD<T>, Function1<Iterator<T>, U>, Seq<Object>, Function2<Object, U, BoxedUnit>, Function0<R>) - Method in class org.apache.spark.SparkContext
:: Experimental :: Submit a job for execution and return a FutureJob holding the result.
submitJobSet(JobSet) - Method in class org.apache.spark.streaming.scheduler.JobScheduler
 
submitTasks(TaskSet) - Method in interface org.apache.spark.scheduler.TaskScheduler
 
submitTasks(TaskSet) - Method in class org.apache.spark.scheduler.TaskSchedulerImpl
 
subProperties(Properties, Regex) - Method in class org.apache.spark.metrics.MetricsConfig
 
subsampleWeights() - Method in class org.apache.spark.mllib.tree.impl.BaggedPoint
 
subsamplingFeatures() - Method in class org.apache.spark.mllib.tree.impl.DecisionTreeMetadata
Indicates if feature subsampling is being used.
subsamplingRate() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
 
subsetAccuracy() - Method in class org.apache.spark.mllib.evaluation.MultilabelMetrics
Returns subset accuracy (for equal sets of labels)
SUBSTR() - Static method in class org.apache.spark.sql.hive.HiveQl
 
subTestSchema() - Static method in class org.apache.spark.sql.parquet.ParquetTestData
 
subTestSchemaFieldNames() - Static method in class org.apache.spark.sql.parquet.ParquetTestData
 
subtract(JavaDoubleRDD) - Method in class org.apache.spark.api.java.JavaDoubleRDD
Return an RDD with the elements from this that are not in other.
subtract(JavaDoubleRDD, int) - Method in class org.apache.spark.api.java.JavaDoubleRDD
Return an RDD with the elements from this that are not in other.
subtract(JavaDoubleRDD, Partitioner) - Method in class org.apache.spark.api.java.JavaDoubleRDD
Return an RDD with the elements from this that are not in other.
subtract(JavaPairRDD<K, V>) - Method in class org.apache.spark.api.java.JavaPairRDD
Return an RDD with the elements from this that are not in other.
subtract(JavaPairRDD<K, V>, int) - Method in class org.apache.spark.api.java.JavaPairRDD
Return an RDD with the elements from this that are not in other.
subtract(JavaPairRDD<K, V>, Partitioner) - Method in class org.apache.spark.api.java.JavaPairRDD
Return an RDD with the elements from this that are not in other.
subtract(JavaRDD<T>) - Method in class org.apache.spark.api.java.JavaRDD
Return an RDD with the elements from this that are not in other.
subtract(JavaRDD<T>, int) - Method in class org.apache.spark.api.java.JavaRDD
Return an RDD with the elements from this that are not in other.
subtract(JavaRDD<T>, Partitioner) - Method in class org.apache.spark.api.java.JavaRDD
Return an RDD with the elements from this that are not in other.
subtract(ImpurityCalculator) - Method in class org.apache.spark.mllib.tree.impurity.ImpurityCalculator
Subtract the stats from another calculator from this one, modifying and returning this calculator.
subtract(RDD<T>) - Method in class org.apache.spark.rdd.RDD
Return an RDD with the elements from this that are not in other.
subtract(RDD<T>, int) - Method in class org.apache.spark.rdd.RDD
Return an RDD with the elements from this that are not in other.
subtract(RDD<T>, Partitioner, Ordering<T>) - Method in class org.apache.spark.rdd.RDD
Return an RDD with the elements from this that are not in other.
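A small sketch, assuming an existing SparkContext sc:

    val a = sc.parallelize(Seq(1, 2, 3, 4))
    val b = sc.parallelize(Seq(3, 4, 5))
    a.subtract(b).collect()   // Array(1, 2)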
subtract(JavaSchemaRDD) - Method in class org.apache.spark.sql.api.java.JavaSchemaRDD
Return an RDD with the elements from this that are not in other.
subtract(JavaSchemaRDD, int) - Method in class org.apache.spark.sql.api.java.JavaSchemaRDD
Return an RDD with the elements from this that are not in other.
subtract(JavaSchemaRDD, Partitioner) - Method in class org.apache.spark.sql.api.java.JavaSchemaRDD
Return an RDD with the elements from this that are not in other.
subtract(RDD<Row>) - Method in class org.apache.spark.sql.SchemaRDD
 
subtract(RDD<Row>, int) - Method in class org.apache.spark.sql.SchemaRDD
 
subtract(RDD<Row>, Partitioner, Ordering<Row>) - Method in class org.apache.spark.sql.SchemaRDD
 
subtract(long, long) - Static method in class org.apache.spark.streaming.util.RawTextHelper
 
subtract(Vector) - Method in class org.apache.spark.util.Vector
 
subtractByKey(JavaPairRDD<K, W>) - Method in class org.apache.spark.api.java.JavaPairRDD
Return an RDD with the pairs from this whose keys are not in other.
subtractByKey(JavaPairRDD<K, W>, int) - Method in class org.apache.spark.api.java.JavaPairRDD
Return an RDD with the pairs from `this` whose keys are not in `other`.
subtractByKey(JavaPairRDD<K, W>, Partitioner) - Method in class org.apache.spark.api.java.JavaPairRDD
Return an RDD with the pairs from `this` whose keys are not in `other`.
subtractByKey(RDD<Tuple2<K, W>>, ClassTag<W>) - Method in class org.apache.spark.rdd.PairRDDFunctions
Return an RDD with the pairs from this whose keys are not in other.
subtractByKey(RDD<Tuple2<K, W>>, int, ClassTag<W>) - Method in class org.apache.spark.rdd.PairRDDFunctions
Return an RDD with the pairs from `this` whose keys are not in `other`.
subtractByKey(RDD<Tuple2<K, W>>, Partitioner, ClassTag<W>) - Method in class org.apache.spark.rdd.PairRDDFunctions
Return an RDD with the pairs from `this` whose keys are not in `other`.
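A small sketch, assuming an existing SparkContext sc (the implicit conversion to PairRDDFunctions comes from SparkContext._ in the 1.x API):

    import org.apache.spark.SparkContext._

    val pairs = sc.parallelize(Seq(("a", 1), ("b", 2), ("c", 3)))
    val other = sc.parallelize(Seq(("b", 99)))
    pairs.subtractByKey(other).collect()   // Array((a,1), (c,3))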
SubtractedRDD<K,V,W> - Class in org.apache.spark.rdd
An optimized version of cogroup for set difference/subtraction.
SubtractedRDD(RDD<? extends Product2<K, V>>, RDD<? extends Product2<K, W>>, Partitioner, ClassTag<K>, ClassTag<V>, ClassTag<W>) - Constructor for class org.apache.spark.rdd.SubtractedRDD
 
subtreeDepth() - Method in class org.apache.spark.mllib.tree.model.Node
Get depth of tree from this node.
subtreeToString(int) - Method in class org.apache.spark.mllib.tree.model.Node
Recursive print function.
succeededTasks() - Method in class org.apache.spark.ui.jobs.UIData.ExecutorSummary
 
success() - Method in class org.apache.spark.storage.ResultWithDroppedBlocks
 
Success - Class in org.apache.spark
:: DeveloperApi :: Task succeeded.
Success() - Constructor for class org.apache.spark.Success
 
successful() - Method in class org.apache.spark.scheduler.TaskInfo
 
successful() - Method in class org.apache.spark.scheduler.TaskSetManager
 
SUCCESSFUL_JOB_OUTPUT_DIR_MARKER() - Static method in class org.apache.spark.sql.hive.SparkHiveDynamicPartitionWriterContainer
 
sufficientResourcesRegistered() - Method in class org.apache.spark.scheduler.cluster.CoarseGrainedSchedulerBackend
 
sufficientResourcesRegistered() - Method in class org.apache.spark.scheduler.cluster.SparkDeploySchedulerBackend
 
sufficientResourcesRegistered() - Method in class org.apache.spark.scheduler.cluster.YarnSchedulerBackend
 
sum() - Method in class org.apache.spark.api.java.JavaDoubleRDD
Add up the elements in this RDD.
Sum() - Static method in class org.apache.spark.mllib.tree.configuration.EnsembleCombiningStrategy
 
sum() - Method in class org.apache.spark.partial.CountEvaluator
 
sum() - Method in class org.apache.spark.rdd.DoubleRDDFunctions
Add up the elements in this RDD.
SUM() - Static method in class org.apache.spark.sql.hive.HiveQl
 
sum() - Method in class org.apache.spark.util.StatCounter
 
sum() - Method in class org.apache.spark.util.Vector
 
sumApprox(long, Double) - Method in class org.apache.spark.api.java.JavaDoubleRDD
:: Experimental :: Approximate operation to return the sum within a timeout.
sumApprox(long) - Method in class org.apache.spark.api.java.JavaDoubleRDD
:: Experimental :: Approximate operation to return the sum within a timeout.
sumApprox(long, double) - Method in class org.apache.spark.rdd.DoubleRDDFunctions
:: Experimental :: Approximate operation to return the sum within a timeout.
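A short Scala sketch contrasting sum with sumApprox on an RDD of doubles; the data, timeout, and confidence are illustrative. sumApprox blocks for at most the given timeout (in milliseconds) and returns the best estimate available at that point.

  import org.apache.spark.SparkContext._
  val nums = sc.parallelize(1 to 1000000).map(_.toDouble)
  nums.sum()                                             // exact: 5.000005E11
  val approx = nums.sumApprox(timeout = 1000, confidence = 0.95)
  println(approx.initialValue)                           // a BoundedDouble with a [low, high] interval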
SumEvaluator - Class in org.apache.spark.partial
An ApproximateEvaluator for sums.
SumEvaluator(int, double) - Constructor for class org.apache.spark.partial.SumEvaluator
 
summary(PrintStream) - Method in class org.apache.spark.util.Distribution
Print a summary of this distribution to the given PrintStream.
sums() - Method in class org.apache.spark.partial.GroupedCountEvaluator
 
sums() - Method in class org.apache.spark.partial.GroupedMeanEvaluator
 
sums() - Method in class org.apache.spark.partial.GroupedSumEvaluator
 
supervisorStrategy() - Method in class org.apache.spark.scheduler.DAGSchedulerActorSupervisor
 
supervisorStrategy() - Method in class org.apache.spark.streaming.receiver.ActorReceiver.Supervisor
 
supportedFeatureSubsetStrategies() - Static method in class org.apache.spark.mllib.tree.RandomForest
List of supported feature subset sampling strategies.
supports(ColumnType<?, ?>) - Static method in class org.apache.spark.sql.columnar.compression.BooleanBitSet
 
supports(ColumnType<?, ?>) - Method in interface org.apache.spark.sql.columnar.compression.CompressionScheme
 
supports(ColumnType<?, ?>) - Static method in class org.apache.spark.sql.columnar.compression.DictionaryEncoding
 
supports(ColumnType<?, ?>) - Static method in class org.apache.spark.sql.columnar.compression.IntDelta
 
supports(ColumnType<?, ?>) - Static method in class org.apache.spark.sql.columnar.compression.LongDelta
 
supports(ColumnType<?, ?>) - Static method in class org.apache.spark.sql.columnar.compression.PassThrough
 
supports(ColumnType<?, ?>) - Static method in class org.apache.spark.sql.columnar.compression.RunLengthEncoding
 
SVDPlusPlus - Class in org.apache.spark.graphx.lib
Implementation of SVD++ algorithm.
SVDPlusPlus() - Constructor for class org.apache.spark.graphx.lib.SVDPlusPlus
 
SVDPlusPlus.Conf - Class in org.apache.spark.graphx.lib
Configuration parameters for SVDPlusPlus.
SVDPlusPlus.Conf(int, int, double, double, double, double, double, double) - Constructor for class org.apache.spark.graphx.lib.SVDPlusPlus.Conf
 
SVMDataGenerator - Class in org.apache.spark.mllib.util
:: DeveloperApi :: Generate sample data used for SVM.
SVMDataGenerator() - Constructor for class org.apache.spark.mllib.util.SVMDataGenerator
 
SVMModel - Class in org.apache.spark.mllib.classification
Model for Support Vector Machines (SVMs).
SVMModel(Vector, double) - Constructor for class org.apache.spark.mllib.classification.SVMModel
 
SVMWithSGD - Class in org.apache.spark.mllib.classification
Train a Support Vector Machine (SVM) using Stochastic Gradient Descent.
SVMWithSGD() - Constructor for class org.apache.spark.mllib.classification.SVMWithSGD
Construct an SVM object with default parameters: {stepSize: 1.0, numIterations: 100, regParam: 0.01, miniBatchFraction: 1.0}.
symlink(File, File) - Static method in class org.apache.spark.util.Utils
Creates a symlink.
symmetricEigs(Function1<DenseVector<Object>, DenseVector<Object>>, int, int, double, int) - Static method in class org.apache.spark.mllib.linalg.EigenValueDecomposition
Compute the leading k eigenvalues and eigenvectors on a symmetric square matrix using ARPACK.
SystemClock - Class in org.apache.spark.streaming.util
 
SystemClock() - Constructor for class org.apache.spark.streaming.util.SystemClock
 
SystemClock - Class in org.apache.spark.util
 
SystemClock() - Constructor for class org.apache.spark.util.SystemClock
 
systemProperties() - Method in class org.apache.spark.ui.env.EnvironmentListener
 
systemProperty(Enumeration.Value) - Static method in class org.apache.spark.util.MetadataCleanerType
 

T

t() - Method in class org.apache.spark.SerializableWritable
 
table() - Method in class org.apache.spark.sql.hive.execution.DescribeHiveTableCommand
 
table() - Method in class org.apache.spark.sql.hive.execution.InsertIntoHiveTable
 
table() - Method in class org.apache.spark.sql.hive.InsertIntoHiveTable
 
table() - Method in class org.apache.spark.sql.hive.MetastoreRelation
 
table(String) - Method in class org.apache.spark.sql.SQLContext
Returns the specified table as a SchemaRDD.
TABLE_CLASS_NOT_STRIPED() - Static method in class org.apache.spark.ui.UIUtils
 
TABLE_CLASS_STRIPED() - Static method in class org.apache.spark.ui.UIUtils
 
tableDesc() - Method in class org.apache.spark.sql.hive.MetastoreRelation
 
tableExists(Seq<String>) - Method in class org.apache.spark.sql.hive.HiveMetastoreCatalog
 
tableInfo() - Method in class org.apache.spark.sql.hive.ShimFileSinkDesc
 
tableName() - Method in class org.apache.spark.sql.columnar.InMemoryRelation
 
tableName() - Method in class org.apache.spark.sql.execution.CacheTableCommand
 
tableName() - Method in class org.apache.spark.sql.execution.UncacheTableCommand
 
tableName() - Method in class org.apache.spark.sql.hive.AnalyzeTable
 
tableName() - Method in class org.apache.spark.sql.hive.DropTable
 
tableName() - Method in class org.apache.spark.sql.hive.execution.AnalyzeTable
 
tableName() - Method in class org.apache.spark.sql.hive.execution.CreateTableAsSelect
 
tableName() - Method in class org.apache.spark.sql.hive.execution.DropTable
 
tableName() - Method in class org.apache.spark.sql.hive.MetastoreRelation
 
tableName() - Method in class org.apache.spark.sql.sources.CreateTableUsing
 
TableReader - Interface in org.apache.spark.sql.hive
A trait for subclasses that handle table scans.
TableScan - Class in org.apache.spark.sql.sources
::DeveloperApi:: A BaseRelation that can produce all of its tuples as an RDD of Row objects.
TableScan() - Constructor for class org.apache.spark.sql.sources.TableScan
 
TachyonBlockManager - Class in org.apache.spark.storage
Creates and maintains the logical mapping between logical blocks and Tachyon file system locations.
TachyonBlockManager(BlockManager, String, String) - Constructor for class org.apache.spark.storage.TachyonBlockManager
 
TachyonFileSegment - Class in org.apache.spark.storage
References a particular segment of a file (potentially the entire file), based on an offset and a length.
TachyonFileSegment(TachyonFile, long, long) - Constructor for class org.apache.spark.storage.TachyonFileSegment
 
tachyonFolderName() - Method in class org.apache.spark.SparkContext
 
tachyonSize() - Method in class org.apache.spark.storage.BlockManagerMessages.UpdateBlockInfo
 
tachyonSize() - Method in class org.apache.spark.storage.BlockStatus
 
tachyonSize() - Method in class org.apache.spark.storage.RDDInfo
 
tachyonStore() - Method in class org.apache.spark.storage.BlockManager
 
TachyonStore - Class in org.apache.spark.storage
Stores BlockManager blocks on Tachyon.
TachyonStore(BlockManager, TachyonBlockManager) - Constructor for class org.apache.spark.storage.TachyonStore
 
tail() - Method in class org.apache.spark.mllib.rdd.SlidingRDDPartition
 
take(int) - Method in interface org.apache.spark.api.java.JavaRDDLike
Take the first num elements of the RDD.
take(int) - Method in class org.apache.spark.rdd.RDD
Take the first num elements of the RDD.
take(int) - Method in class org.apache.spark.sql.api.java.JavaSchemaRDD
 
take(int) - Method in class org.apache.spark.sql.SchemaRDD
 
takeAsync(int) - Method in interface org.apache.spark.api.java.JavaRDDLike
The asynchronous version of the take action, which returns a future for retrieving the first num elements of this RDD.
takeAsync(int) - Method in class org.apache.spark.rdd.AsyncRDDActions
Returns a future for retrieving the first num elements of the RDD.
takeOrdered(int, Comparator<T>) - Method in interface org.apache.spark.api.java.JavaRDDLike
Returns the first k (smallest) elements from this RDD as defined by the specified Comparator[T] and maintains the order.
takeOrdered(int) - Method in interface org.apache.spark.api.java.JavaRDDLike
Returns the first k (smallest) elements from this RDD using the natural ordering for T while maintaining the order.
takeOrdered(int, Ordering<T>) - Method in class org.apache.spark.rdd.RDD
Returns the first k (smallest) elements from this RDD as defined by the specified implicit Ordering[T] and maintains the ordering.
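A small Scala sketch of takeOrdered and its descending counterpart top; the values are illustrative:

  val nums = sc.parallelize(Seq(10, 4, 2, 12, 3))
  nums.takeOrdered(3)   // Array(2, 3, 4): smallest three, ascending
  nums.top(3)           // Array(12, 10, 4): largest three, descending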
TakeOrdered() - Method in class org.apache.spark.sql.execution.SparkStrategies
 
TakeOrdered - Class in org.apache.spark.sql.execution
:: DeveloperApi :: Take the first limit elements as defined by the sortOrder.
TakeOrdered(int, Seq<SortOrder>, SparkPlan) - Constructor for class org.apache.spark.sql.execution.TakeOrdered
 
takeSample(boolean, int) - Method in interface org.apache.spark.api.java.JavaRDDLike
 
takeSample(boolean, int, long) - Method in interface org.apache.spark.api.java.JavaRDDLike
 
takeSample(boolean, int, long) - Method in class org.apache.spark.rdd.RDD
Return a fixed-size sampled subset of this RDD in an array.
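A minimal Scala sketch of takeSample; the sample size and seed are illustrative, and passing a fixed seed makes the draw reproducible. Unlike sample(), takeSample returns an array of exactly num elements to the driver, so num must fit in driver memory.

  val rdd = sc.parallelize(1 to 100)
  val sample = rdd.takeSample(false, 5, seed = 42L)   // without replacement, 5 elements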
targetStorageLevel() - Method in class org.apache.spark.graphx.impl.EdgeRDDImpl
 
targetStorageLevel() - Method in class org.apache.spark.graphx.impl.VertexRDDImpl
 
task() - Method in class org.apache.spark.CleanupTaskWeakReference
 
task() - Method in class org.apache.spark.scheduler.BeginEvent
 
task() - Method in class org.apache.spark.scheduler.CompletionEvent
 
Task<T> - Class in org.apache.spark.scheduler
A unit of execution.
Task(int, int) - Constructor for class org.apache.spark.scheduler.Task
 
TASK_DESERIALIZATION_TIME() - Static method in class org.apache.spark.ui.jobs.TaskDetailsClassNames
 
TASK_DESERIALIZATION_TIME() - Static method in class org.apache.spark.ui.ToolTips
 
TASK_SIZE_TO_WARN_KB() - Static method in class org.apache.spark.scheduler.TaskSetManager
 
taskAttempts() - Method in class org.apache.spark.scheduler.TaskSetManager
 
TaskCompletionListener - Interface in org.apache.spark.util
:: DeveloperApi :: Listener providing a callback function to invoke when a task's execution completes.
TaskCompletionListenerException - Exception in org.apache.spark.util
Exception thrown when there is an exception in executing the callback in TaskCompletionListener.
TaskCompletionListenerException(Seq<String>) - Constructor for exception org.apache.spark.util.TaskCompletionListenerException
 
TaskContext - Class in org.apache.spark
Contextual information about a task which can be read or mutated during execution.
TaskContext() - Constructor for class org.apache.spark.TaskContext
 
TaskContextHelper - Class in org.apache.spark
This class exists to restrict the visibility of TaskContext setters.
TaskContextHelper() - Constructor for class org.apache.spark.TaskContextHelper
 
TaskContextImpl - Class in org.apache.spark
 
TaskContextImpl(int, int, long, boolean, TaskMetrics) - Constructor for class org.apache.spark.TaskContextImpl
 
taskData() - Method in class org.apache.spark.ui.jobs.UIData.StageUIData
 
TaskDescription - Class in org.apache.spark.scheduler
Description of a task that gets passed onto executors to be executed, usually created by TaskSetManager.resourceOffer.
TaskDescription(long, String, String, int, ByteBuffer) - Constructor for class org.apache.spark.scheduler.TaskDescription
 
TaskDetailsClassNames - Class in org.apache.spark.ui.jobs
Names of the CSS classes corresponding to each type of task detail.
TaskDetailsClassNames() - Constructor for class org.apache.spark.ui.jobs.TaskDetailsClassNames
 
taskEnded(Task<?>, TaskEndReason, Object, Map<Object, Object>, TaskInfo, TaskMetrics) - Method in class org.apache.spark.scheduler.DAGScheduler
 
taskEndFromJson(JsonAST.JValue) - Static method in class org.apache.spark.util.JsonProtocol
 
TaskEndReason - Interface in org.apache.spark
:: DeveloperApi :: Various possible reasons why a task ended.
taskEndReasonFromJson(JsonAST.JValue) - Static method in class org.apache.spark.util.JsonProtocol
 
taskEndReasonToJson(TaskEndReason) - Static method in class org.apache.spark.util.JsonProtocol
 
taskEndToJson(SparkListenerTaskEnd) - Static method in class org.apache.spark.util.JsonProtocol
 
TaskFailedReason - Interface in org.apache.spark
:: DeveloperApi :: Various possible reasons why a task failed.
taskGettingResult(TaskInfo) - Method in class org.apache.spark.scheduler.DAGScheduler
 
taskGettingResultFromJson(JsonAST.JValue) - Static method in class org.apache.spark.util.JsonProtocol
 
taskGettingResultToJson(SparkListenerTaskGettingResult) - Static method in class org.apache.spark.util.JsonProtocol
 
taskId() - Method in class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.KillTask
 
taskId() - Method in class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.StatusUpdate
 
taskId() - Method in class org.apache.spark.scheduler.local.KillTask
 
taskId() - Method in class org.apache.spark.scheduler.local.StatusUpdate
 
taskId() - Method in class org.apache.spark.scheduler.TaskDescription
 
taskId() - Method in class org.apache.spark.scheduler.TaskInfo
 
taskId() - Method in class org.apache.spark.storage.TaskResultBlockId
 
taskIdsOnSlave() - Method in class org.apache.spark.scheduler.cluster.CoarseGrainedSchedulerBackend
 
taskIdToExecutorId() - Method in class org.apache.spark.scheduler.TaskSchedulerImpl
 
taskIdToSlaveId() - Method in class org.apache.spark.scheduler.cluster.mesos.CoarseMesosSchedulerBackend
 
taskIdToSlaveId() - Method in class org.apache.spark.scheduler.cluster.mesos.MesosSchedulerBackend
 
taskIdToTaskSetId() - Method in class org.apache.spark.scheduler.TaskSchedulerImpl
 
taskInfo() - Method in class org.apache.spark.scheduler.BeginEvent
 
taskInfo() - Method in class org.apache.spark.scheduler.CompletionEvent
 
taskInfo() - Method in class org.apache.spark.scheduler.GettingResultEvent
 
taskInfo() - Method in class org.apache.spark.scheduler.SparkListenerTaskEnd
 
taskInfo() - Method in class org.apache.spark.scheduler.SparkListenerTaskGettingResult
 
taskInfo() - Method in class org.apache.spark.scheduler.SparkListenerTaskStart
 
TaskInfo - Class in org.apache.spark.scheduler
:: DeveloperApi :: Information about a running task attempt inside a TaskSet.
TaskInfo(long, int, int, long, String, String, Enumeration.Value, boolean) - Constructor for class org.apache.spark.scheduler.TaskInfo
 
taskInfo() - Method in class org.apache.spark.ui.jobs.UIData.TaskUIData
 
taskInfoFromJson(JsonAST.JValue) - Static method in class org.apache.spark.util.JsonProtocol
 
taskInfos() - Method in class org.apache.spark.scheduler.TaskSetManager
 
taskInfoToJson(TaskInfo) - Static method in class org.apache.spark.util.JsonProtocol
 
TaskKilled - Class in org.apache.spark
:: DeveloperApi :: Task was killed intentionally and needs to be rescheduled.
TaskKilled() - Constructor for class org.apache.spark.TaskKilled
 
TaskKilledException - Exception in org.apache.spark
:: DeveloperApi :: Exception thrown when a task is explicitly killed (i.e., task failure is expected).
TaskKilledException() - Constructor for exception org.apache.spark.TaskKilledException
 
taskLocality() - Method in class org.apache.spark.scheduler.TaskInfo
 
TaskLocality - Class in org.apache.spark.scheduler
 
TaskLocality() - Constructor for class org.apache.spark.scheduler.TaskLocality
 
TaskLocation - Interface in org.apache.spark.scheduler
A location where a task should run.
taskMetrics() - Method in class org.apache.spark.Heartbeat
 
taskMetrics() - Method in class org.apache.spark.scheduler.CompletionEvent
 
taskMetrics() - Method in class org.apache.spark.scheduler.SparkListenerExecutorMetricsUpdate
 
taskMetrics() - Method in class org.apache.spark.scheduler.SparkListenerTaskEnd
 
taskMetrics() - Method in class org.apache.spark.TaskContext
::DeveloperApi::
taskMetrics() - Method in class org.apache.spark.TaskContextImpl
 
taskMetrics() - Method in class org.apache.spark.ui.jobs.UIData.TaskUIData
 
taskMetricsFromJson(JsonAST.JValue) - Static method in class org.apache.spark.util.JsonProtocol
 
taskMetricsToJson(TaskMetrics) - Static method in class org.apache.spark.util.JsonProtocol
 
TaskResult<T> - Interface in org.apache.spark.scheduler
 
TASKRESULT() - Static method in class org.apache.spark.storage.BlockId
 
TaskResultBlockId - Class in org.apache.spark.storage
 
TaskResultBlockId(long) - Constructor for class org.apache.spark.storage.TaskResultBlockId
 
TaskResultGetter - Class in org.apache.spark.scheduler
Runs a thread pool that deserializes and remotely fetches (if necessary) task results.
TaskResultGetter(SparkEnv, TaskSchedulerImpl) - Constructor for class org.apache.spark.scheduler.TaskResultGetter
 
taskResultGetter() - Method in class org.apache.spark.scheduler.TaskSchedulerImpl
 
TaskResultLost - Class in org.apache.spark
:: DeveloperApi :: The task finished successfully, but the result was lost from the executor's block manager before it was fetched.
TaskResultLost() - Constructor for class org.apache.spark.TaskResultLost
 
taskRow(boolean, boolean, boolean, boolean, boolean, boolean, UIData.TaskUIData) - Method in class org.apache.spark.ui.jobs.StagePage
 
tasks() - Method in class org.apache.spark.scheduler.TaskSet
 
tasks() - Method in class org.apache.spark.scheduler.TaskSetManager
 
taskScheduler() - Method in class org.apache.spark.scheduler.DAGScheduler
 
TaskScheduler - Interface in org.apache.spark.scheduler
Low-level task scheduler interface, currently implemented exclusively by TaskSchedulerImpl.
taskScheduler() - Method in class org.apache.spark.SparkContext
 
TaskSchedulerImpl - Class in org.apache.spark.scheduler
Schedules tasks for multiple types of clusters by acting through a SchedulerBackend.
TaskSchedulerImpl(SparkContext, int, boolean) - Constructor for class org.apache.spark.scheduler.TaskSchedulerImpl
 
TaskSchedulerImpl(SparkContext) - Constructor for class org.apache.spark.scheduler.TaskSchedulerImpl
 
TaskSet - Class in org.apache.spark.scheduler
A set of tasks submitted together to the low-level TaskScheduler, usually representing missing partitions of a particular stage.
TaskSet(Task<?>[], int, int, int, Properties) - Constructor for class org.apache.spark.scheduler.TaskSet
 
taskSet() - Method in class org.apache.spark.scheduler.TaskSetFailed
 
taskSet() - Method in class org.apache.spark.scheduler.TaskSetManager
 
taskSetFailed(TaskSet, String) - Method in class org.apache.spark.scheduler.DAGScheduler
 
TaskSetFailed - Class in org.apache.spark.scheduler
 
TaskSetFailed(TaskSet, String) - Constructor for class org.apache.spark.scheduler.TaskSetFailed
 
taskSetFinished(TaskSetManager) - Method in class org.apache.spark.scheduler.TaskSchedulerImpl
Called to indicate that all task attempts (including speculated tasks) associated with the given TaskSetManager have completed, so state associated with the TaskSetManager should be cleaned up.
TaskSetManager - Class in org.apache.spark.scheduler
Schedules the tasks within a single TaskSet in the TaskSchedulerImpl.
TaskSetManager(TaskSchedulerImpl, TaskSet, int, Clock) - Constructor for class org.apache.spark.scheduler.TaskSetManager
 
taskSetSchedulingAlgorithm() - Method in class org.apache.spark.scheduler.Pool
 
tasksSuccessful() - Method in class org.apache.spark.scheduler.TaskSetManager
 
taskStarted(Task<?>, TaskInfo) - Method in class org.apache.spark.scheduler.DAGScheduler
 
taskStartFromJson(JsonAST.JValue) - Static method in class org.apache.spark.util.JsonProtocol
 
taskStartToJson(SparkListenerTaskStart) - Static method in class org.apache.spark.util.JsonProtocol
 
TaskState - Class in org.apache.spark
 
TaskState() - Constructor for class org.apache.spark.TaskState
 
taskSucceeded(int, Object) - Method in class org.apache.spark.partial.ApproximateActionListener
 
taskSucceeded(int, Object) - Method in interface org.apache.spark.scheduler.JobListener
 
taskSucceeded(int, Object) - Method in class org.apache.spark.scheduler.JobWaiter
 
taskTime() - Method in class org.apache.spark.ui.jobs.UIData.ExecutorSummary
 
taskType() - Method in class org.apache.spark.scheduler.SparkListenerTaskEnd
 
tellMaster() - Method in class org.apache.spark.storage.BlockInfo
 
TempLocalBlockId - Class in org.apache.spark.storage
Id associated with temporary local data managed as blocks.
TempLocalBlockId(UUID) - Constructor for class org.apache.spark.storage.TempLocalBlockId
 
TempShuffleBlockId - Class in org.apache.spark.storage
Id associated with temporary shuffle data managed as blocks.
TempShuffleBlockId(UUID) - Constructor for class org.apache.spark.storage.TempShuffleBlockId
 
TerminalWidth() - Method in class org.apache.spark.ui.ConsoleProgressBar
 
TEST() - Static method in class org.apache.spark.storage.BlockId
 
TestBlockId - Class in org.apache.spark.storage
 
TestBlockId(String) - Constructor for class org.apache.spark.storage.TestBlockId
 
TestClock - Class in org.apache.spark
A clock that allows the caller to customize the time.
TestClock(long) - Constructor for class org.apache.spark.TestClock
 
testData() - Static method in class org.apache.spark.sql.parquet.ParquetTestData
 
testDir() - Static method in class org.apache.spark.sql.parquet.ParquetTestData
 
testFilterDir() - Static method in class org.apache.spark.sql.parquet.ParquetTestData
 
testFilterSchema() - Static method in class org.apache.spark.sql.parquet.ParquetTestData
 
TestGroupWriteSupport - Class in org.apache.spark.sql.parquet
 
TestGroupWriteSupport(MessageType) - Constructor for class org.apache.spark.sql.parquet.TestGroupWriteSupport
 
TestHive - Class in org.apache.spark.sql.hive.test
 
TestHive() - Constructor for class org.apache.spark.sql.hive.test.TestHive
 
TestHiveContext - Class in org.apache.spark.sql.hive.test
A locally running test instance of Spark's Hive execution engine.
TestHiveContext(SparkContext) - Constructor for class org.apache.spark.sql.hive.test.TestHiveContext
 
TestHiveContext.QueryExecution - Class in org.apache.spark.sql.hive.test
Override QueryExecution with special debug workflow.
TestHiveContext.QueryExecution() - Constructor for class org.apache.spark.sql.hive.test.TestHiveContext.QueryExecution
 
TestHiveContext.TestTable - Class in org.apache.spark.sql.hive.test
 
TestHiveContext.TestTable(String, Seq<Function0<BoxedUnit>>) - Constructor for class org.apache.spark.sql.hive.test.TestHiveContext.TestTable
 
testNestedData1() - Static method in class org.apache.spark.sql.parquet.ParquetTestData
 
testNestedData2() - Static method in class org.apache.spark.sql.parquet.ParquetTestData
 
testNestedDir1() - Static method in class org.apache.spark.sql.parquet.ParquetTestData
 
testNestedDir2() - Static method in class org.apache.spark.sql.parquet.ParquetTestData
 
testNestedDir3() - Static method in class org.apache.spark.sql.parquet.ParquetTestData
 
testNestedDir4() - Static method in class org.apache.spark.sql.parquet.ParquetTestData
 
testNestedSchema1() - Static method in class org.apache.spark.sql.parquet.ParquetTestData
 
testNestedSchema2() - Static method in class org.apache.spark.sql.parquet.ParquetTestData
 
testNestedSchema3() - Static method in class org.apache.spark.sql.parquet.ParquetTestData
 
testNestedSchema4() - Static method in class org.apache.spark.sql.parquet.ParquetTestData
 
TestResult<DF> - Interface in org.apache.spark.mllib.stat.test
:: Experimental :: Trait for hypothesis test results.
testSchema() - Static method in class org.apache.spark.sql.parquet.ParquetTestData
 
testSchemaFieldNames() - Static method in class org.apache.spark.sql.parquet.ParquetTestData
 
TestSQLContext - Class in org.apache.spark.sql.test
A SQLContext that can be used for local testing.
TestSQLContext() - Constructor for class org.apache.spark.sql.test.TestSQLContext
 
testTables() - Method in class org.apache.spark.sql.hive.test.TestHiveContext
A list of test tables and the DDL required to initialize them.
testTempDir() - Method in class org.apache.spark.sql.hive.test.TestHiveContext
 
TestUtils - Class in org.apache.spark
Utilities for tests.
TestUtils() - Constructor for class org.apache.spark.TestUtils
 
textFile(String) - Method in class org.apache.spark.api.java.JavaSparkContext
Read a text file from HDFS, a local file system (available on all nodes), or any Hadoop-supported file system URI, and return it as an RDD of Strings.
textFile(String, int) - Method in class org.apache.spark.api.java.JavaSparkContext
Read a text file from HDFS, a local file system (available on all nodes), or any Hadoop-supported file system URI, and return it as an RDD of Strings.
textFile(String, int) - Method in class org.apache.spark.SparkContext
Read a text file from HDFS, a local file system (available on all nodes), or any Hadoop-supported file system URI, and return it as an RDD of Strings.
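A minimal Scala sketch of SparkContext.textFile; the path is illustrative, and the second argument is a hint for the minimum number of partitions:

  val lines = sc.textFile("hdfs://namenode:8020/data/input.txt", 4)
  println(lines.count())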
textFileStream(String) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
Create an input stream that monitors a Hadoop-compatible filesystem for new files and reads them as text files (using key as LongWritable, value as Text and input format as TextInputFormat).
textFileStream(String) - Method in class org.apache.spark.streaming.StreamingContext
Create an input stream that monitors a Hadoop-compatible filesystem for new files and reads them as text files (using key as LongWritable, value as Text and input format as TextInputFormat).
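A minimal Scala sketch of textFileStream; the batch interval and directory are illustrative, and only files newly moved into the directory are picked up as batches:

  import org.apache.spark.streaming.{Seconds, StreamingContext}
  val ssc = new StreamingContext(sc, Seconds(10))
  val lines = ssc.textFileStream("hdfs://namenode:8020/incoming")
  lines.count().print()
  ssc.start()
  ssc.awaitTermination()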
textResponderToServlet(Function1<HttpServletRequest, String>) - Static method in class org.apache.spark.ui.JettyUtils
 
theta() - Method in class org.apache.spark.mllib.classification.NaiveBayesModel
 
thread() - Method in class org.apache.spark.streaming.scheduler.ReceiverTracker.ReceiverLauncher
 
threadDumpEnabled() - Method in class org.apache.spark.ui.exec.ExecutorsTab
 
threadId() - Method in class org.apache.spark.util.ThreadStackTrace
 
threadName() - Method in class org.apache.spark.util.ThreadStackTrace
 
ThreadStackTrace - Class in org.apache.spark.util
Used for shipping per-thread stacktraces from the executors to the driver.
ThreadStackTrace(long, String, Thread.State, String) - Constructor for class org.apache.spark.util.ThreadStackTrace
 
threadState() - Method in class org.apache.spark.util.ThreadStackTrace
 
threshold() - Method in interface org.apache.spark.ml.param.HasThreshold
Param for threshold in (binary) prediction.
threshold() - Method in class org.apache.spark.mllib.tree.model.Split
 
thresholds() - Method in class org.apache.spark.mllib.evaluation.BinaryClassificationMetrics
Returns thresholds in descending order.
threshTime() - Method in class org.apache.spark.streaming.receiver.CleanupOldBlocks
 
throwBalls() - Method in class org.apache.spark.rdd.PartitionCoalescer
 
tick(long) - Method in class org.apache.spark.TestClock
 
time() - Method in class org.apache.spark.scheduler.SparkListenerApplicationEnd
 
time() - Method in class org.apache.spark.scheduler.SparkListenerApplicationStart
 
time() - Method in class org.apache.spark.scheduler.SparkListenerBlockManagerAdded
 
time() - Method in class org.apache.spark.scheduler.SparkListenerBlockManagerRemoved
 
time() - Method in class org.apache.spark.streaming.scheduler.BatchAllocationEvent
 
time() - Method in class org.apache.spark.streaming.scheduler.ClearCheckpointData
 
time() - Method in class org.apache.spark.streaming.scheduler.ClearMetadata
 
time() - Method in class org.apache.spark.streaming.scheduler.DoCheckpoint
 
time() - Method in class org.apache.spark.streaming.scheduler.GenerateJobs
 
time() - Method in class org.apache.spark.streaming.scheduler.Job
 
time() - Method in class org.apache.spark.streaming.scheduler.JobSet
 
Time - Class in org.apache.spark.streaming
This is a simple class that represents an absolute instant of time.
Time(long) - Constructor for class org.apache.spark.streaming.Time
 
TimeBasedRollingPolicy - Class in org.apache.spark.util.logging
Defines a RollingPolicy by which files will be rolled over at a fixed interval.
TimeBasedRollingPolicy(long, String, boolean) - Constructor for class org.apache.spark.util.logging.TimeBasedRollingPolicy
 
timeIt(int, Function0<BoxedUnit>, Option<Function0<BoxedUnit>>) - Static method in class org.apache.spark.util.Utils
Timing method based on iterations that permit JVM JIT optimization.
timeout() - Method in class org.apache.spark.storage.BlockManagerMaster
 
timeoutCheckingTask() - Method in class org.apache.spark.storage.BlockManagerMasterActor
 
timeRunning(long) - Method in class org.apache.spark.scheduler.TaskInfo
 
times(int) - Method in class org.apache.spark.streaming.Duration
 
times() - Method in class org.apache.spark.streaming.scheduler.BatchCleanupEvent
 
times(int, Function0<BoxedUnit>) - Static method in class org.apache.spark.util.Utils
Method executed for repeating a task for side effects.
TIMESTAMP - Class in org.apache.spark.sql.columnar
 
TIMESTAMP() - Constructor for class org.apache.spark.sql.columnar.TIMESTAMP
 
timestamp() - Method in class org.apache.spark.util.TimeStampedValue
 
TimestampColumnAccessor - Class in org.apache.spark.sql.columnar
 
TimestampColumnAccessor(ByteBuffer) - Constructor for class org.apache.spark.sql.columnar.TimestampColumnAccessor
 
TimestampColumnBuilder - Class in org.apache.spark.sql.columnar
 
TimestampColumnBuilder() - Constructor for class org.apache.spark.sql.columnar.TimestampColumnBuilder
 
TimestampColumnStats - Class in org.apache.spark.sql.columnar
 
TimestampColumnStats() - Constructor for class org.apache.spark.sql.columnar.TimestampColumnStats
 
TimeStampedHashMap<A,B> - Class in org.apache.spark.util
This is a custom implementation of scala.collection.mutable.Map which stores the insertion timestamp along with each key-value pair.
TimeStampedHashMap(boolean) - Constructor for class org.apache.spark.util.TimeStampedHashMap
 
TimeStampedHashSet<A> - Class in org.apache.spark.util
 
TimeStampedHashSet() - Constructor for class org.apache.spark.util.TimeStampedHashSet
 
TimeStampedValue<V> - Class in org.apache.spark.util
 
TimeStampedValue(V, long) - Constructor for class org.apache.spark.util.TimeStampedValue
 
TimeStampedWeakValueHashMap<A,B> - Class in org.apache.spark.util
A wrapper of TimeStampedHashMap that ensures the values are weakly referenced and timestamped.
TimeStampedWeakValueHashMap(boolean) - Constructor for class org.apache.spark.util.TimeStampedWeakValueHashMap
 
TimestampType - Static variable in class org.apache.spark.sql.api.java.DataType
Gets the TimestampType object.
TimestampType - Class in org.apache.spark.sql.api.java
The data type representing java.sql.Timestamp values.
timeToLogFile(long, long) - Static method in class org.apache.spark.streaming.util.WriteAheadLogManager
 
TimeTracker - Class in org.apache.spark.mllib.tree.impl
Time tracker implementation which holds labeled timers.
TimeTracker() - Constructor for class org.apache.spark.mllib.tree.impl.TimeTracker
 
timeUnit() - Method in class org.apache.spark.mllib.clustering.StreamingKMeans
 
tmpPath() - Method in class org.apache.spark.scheduler.cluster.SimrSchedulerBackend
 
to(Time, Duration) - Method in class org.apache.spark.streaming.Time
 
toArray() - Method in interface org.apache.spark.api.java.JavaRDDLike
Deprecated.
As of Spark 1.0.0, toArray() is deprecated; use JavaRDDLike.collect() instead.
toArray() - Method in class org.apache.spark.input.PortableDataStream
Read the file as a byte array
toArray() - Method in class org.apache.spark.mllib.linalg.DenseMatrix
 
toArray() - Method in class org.apache.spark.mllib.linalg.DenseVector
 
toArray() - Method in interface org.apache.spark.mllib.linalg.Matrix
Converts to a dense array in column-major order.
toArray() - Method in class org.apache.spark.mllib.linalg.SparseMatrix
 
toArray() - Method in class org.apache.spark.mllib.linalg.SparseVector
 
toArray() - Method in interface org.apache.spark.mllib.linalg.Vector
Converts the instance to a double array.
toArray() - Method in class org.apache.spark.rdd.RDD
Return an array that contains all of the elements in this RDD.
toArrays() - Method in class org.apache.spark.util.io.ByteArrayChunkOutputStream
 
toAttribute() - Method in class org.apache.spark.sql.hive.MetastoreRelation.SchemaAttribute
 
toBatchInfo() - Method in class org.apache.spark.streaming.scheduler.JobSet
 
toBreeze() - Method in class org.apache.spark.mllib.linalg.DenseMatrix
 
toBreeze() - Method in class org.apache.spark.mllib.linalg.DenseVector
 
toBreeze() - Method in class org.apache.spark.mllib.linalg.distributed.CoordinateMatrix
Collects data and assembles a local matrix.
toBreeze() - Method in interface org.apache.spark.mllib.linalg.distributed.DistributedMatrix
Collects data and assembles a local dense breeze matrix (for test only).
toBreeze() - Method in class org.apache.spark.mllib.linalg.distributed.IndexedRowMatrix
 
toBreeze() - Method in class org.apache.spark.mllib.linalg.distributed.RowMatrix
 
toBreeze() - Method in interface org.apache.spark.mllib.linalg.Matrix
Converts to a breeze matrix.
toBreeze() - Method in class org.apache.spark.mllib.linalg.SparseMatrix
 
toBreeze() - Method in class org.apache.spark.mllib.linalg.SparseVector
 
toBreeze() - Method in interface org.apache.spark.mllib.linalg.Vector
Converts the instance to a breeze vector.
toCatalystDecimal(HiveDecimalObjectInspector, Object) - Static method in class org.apache.spark.sql.hive.HiveShim
 
toDataType(String) - Static method in class org.apache.spark.sql.hive.HiveMetastoreTypes
 
toDataType(Type, boolean) - Static method in class org.apache.spark.sql.parquet.ParquetTypesConverter
Converts a given Parquet Type into the corresponding DataType.
toDebugString() - Method in interface org.apache.spark.api.java.JavaRDDLike
A description of this RDD and its recursive dependencies for debugging.
toDebugString() - Method in class org.apache.spark.mllib.tree.model.DecisionTreeModel
Print the full model to a string.
toDebugString() - Method in class org.apache.spark.mllib.tree.model.TreeEnsembleModel
Print the full model to a string.
toDebugString() - Method in class org.apache.spark.rdd.RDD
A description of this RDD and its recursive dependencies for debugging.
toDebugString() - Method in class org.apache.spark.SparkConf
Return a string listing all keys and values, one per line.
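A short Scala sketch of RDD.toDebugString for inspecting lineage; the input path is illustrative:

  import org.apache.spark.SparkContext._
  val counts = sc.textFile("input.txt")
    .flatMap(_.split(" "))
    .map((_, 1))
    .reduceByKey(_ + _)
  println(counts.toDebugString)   // prints the recursive dependency chain, one RDD per line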
toDense() - Method in class org.apache.spark.mllib.clustering.VectorWithNorm
Converts the vector to a dense vector.
toEdgePartition() - Method in class org.apache.spark.graphx.impl.EdgePartitionBuilder
 
toEdgePartition() - Method in class org.apache.spark.graphx.impl.ExistingEdgePartitionBuilder
 
toEdgeTriplet() - Method in class org.apache.spark.graphx.EdgeContext
Converts the edge and vertex properties into an EdgeTriplet for convenience.
toErrorString() - Method in class org.apache.spark.ExceptionFailure
 
toErrorString() - Method in class org.apache.spark.ExecutorLostFailure
 
toErrorString() - Method in class org.apache.spark.FetchFailed
 
toErrorString() - Static method in class org.apache.spark.Resubmitted
 
toErrorString() - Method in interface org.apache.spark.TaskFailedReason
Error message displayed in the web UI.
toErrorString() - Static method in class org.apache.spark.TaskKilled
 
toErrorString() - Static method in class org.apache.spark.TaskResultLost
 
toErrorString() - Static method in class org.apache.spark.UnknownReason
 
toFormattedString() - Method in class org.apache.spark.streaming.Duration
 
toIndexedRowMatrix() - Method in class org.apache.spark.mllib.linalg.distributed.CoordinateMatrix
Converts to IndexedRowMatrix.
toInspector(DataType) - Method in interface org.apache.spark.sql.hive.HiveInspectors
 
toInspector(Expression) - Method in interface org.apache.spark.sql.hive.HiveInspectors
 
toInt() - Method in class org.apache.spark.storage.StorageLevel
 
toJava(Object, DataType) - Static method in class org.apache.spark.sql.execution.EvaluatePython
Helper for converting a Scala object to a Java object suitable for PySpark serialization.
toJavaDStream() - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
Converts to a JavaDStream.
toJavaRDD() - Method in class org.apache.spark.rdd.RDD
 
toJavaSchemaRDD() - Method in class org.apache.spark.sql.SchemaRDD
Returns this RDD as a JavaSchemaRDD.
toJSON() - Method in class org.apache.spark.sql.api.java.JavaSchemaRDD
Returns a new RDD with each row transformed to a JSON string.
toJSON() - Method in class org.apache.spark.sql.SchemaRDD
Returns a new RDD with each row transformed to a JSON string.
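A minimal Scala sketch of SchemaRDD.toJSON; the case class and data are illustrative, and the createSchemaRDD import supplies the implicit conversion from an RDD of Products:

  import org.apache.spark.sql.SQLContext
  case class Person(name: String, age: Int)
  val sqlContext = new SQLContext(sc)
  import sqlContext.createSchemaRDD
  val people = sc.parallelize(Seq(Person("Alice", 30), Person("Bob", 25)))
  people.toJSON.collect().foreach(println)   // e.g. {"name":"Alice","age":30}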
tokenize(String) - Static method in class org.apache.spark.rdd.PipedRDD
 
Tokenizer - Class in org.apache.spark.ml.feature
:: AlphaComponent :: A tokenizer that converts the input string to lowercase and then splits it by whitespace.
Tokenizer() - Constructor for class org.apache.spark.ml.feature.Tokenizer
 
toLocalIterator() - Method in interface org.apache.spark.api.java.JavaRDDLike
Return an iterator that contains all of the elements in this RDD.
toLocalIterator() - Method in class org.apache.spark.rdd.RDD
Return an iterator that contains all of the elements in this RDD.
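A minimal Scala sketch of toLocalIterator; unlike collect(), it fetches one partition at a time, so the driver only needs memory for the largest partition:

  val it = sc.parallelize(1 to 100, numSlices = 4).toLocalIterator
  it.take(3).foreach(println)   // 1, 2, 3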
toMap() - Method in class org.apache.spark.util.TimeStampedHashMap
 
toMap() - Method in class org.apache.spark.util.TimeStampedWeakValueHashMap
 
toMesos(Enumeration.Value) - Static method in class org.apache.spark.TaskState
 
toMetastoreType(DataType) - Static method in class org.apache.spark.sql.hive.HiveMetastoreTypes
 
toNodeSeq() - Method in class org.apache.spark.ui.jobs.ExecutorTable
 
toNodeSeq() - Method in class org.apache.spark.ui.jobs.PoolTable
 
toNodeSeq() - Method in class org.apache.spark.ui.jobs.StageTableBase
 
ToolTips - Class in org.apache.spark.ui
 
ToolTips() - Constructor for class org.apache.spark.ui.ToolTips
 
toOps(ShippableVertexPartition<VD>, ClassTag<VD>) - Method in class org.apache.spark.graphx.impl.ShippableVertexPartition.ShippableVertexPartitionOpsConstructor$
 
toOps(VertexPartition<VD>, ClassTag<VD>) - Method in class org.apache.spark.graphx.impl.VertexPartition.VertexPartitionOpsConstructor$
 
toOps(T, ClassTag<VD>) - Method in interface org.apache.spark.graphx.impl.VertexPartitionBaseOpsConstructor
 
top(int, Comparator<T>) - Method in interface org.apache.spark.api.java.JavaRDDLike
Returns the top k (largest) elements from this RDD as defined by the specified Comparator[T].
top(int) - Method in interface org.apache.spark.api.java.JavaRDDLike
Returns the top k (largest) elements from this RDD using the natural ordering for T.
top(int, Ordering<T>) - Method in class org.apache.spark.rdd.RDD
 
toPairDStreamFunctions(DStream<Tuple2<K, V>>, ClassTag<K>, ClassTag<V>, Ordering<K>) - Static method in class org.apache.spark.streaming.StreamingContext
 
topK(Iterator<Tuple2<String, Object>>, int) - Static method in class org.apache.spark.streaming.util.RawTextHelper
Gets the top k words in terms of word counts.
topNode() - Method in class org.apache.spark.mllib.tree.model.DecisionTreeModel
 
toPrimitiveDataType(PrimitiveType, boolean) - Static method in class org.apache.spark.sql.parquet.ParquetTypesConverter
 
toRDD(JavaDoubleRDD) - Static method in class org.apache.spark.api.java.JavaDoubleRDD
 
toRDD(JavaPairRDD<K, V>) - Static method in class org.apache.spark.api.java.JavaPairRDD
 
toRDD(JavaRDD<T>) - Static method in class org.apache.spark.api.java.JavaRDD
 
toRowMatrix() - Method in class org.apache.spark.mllib.linalg.distributed.CoordinateMatrix
Converts to RowMatrix, dropping row indices after grouping by row index.
toRowMatrix() - Method in class org.apache.spark.mllib.linalg.distributed.IndexedRowMatrix
Drops row indices and converts this matrix to a RowMatrix.
TorrentBroadcast<T> - Class in org.apache.spark.broadcast
A BitTorrent-like implementation of Broadcast.
TorrentBroadcast(T, long, ClassTag<T>) - Constructor for class org.apache.spark.broadcast.TorrentBroadcast
 
TorrentBroadcastFactory - Class in org.apache.spark.broadcast
A Broadcast implementation that uses a BitTorrent-like protocol to do a distributed transfer of the broadcasted data to the executors.
TorrentBroadcastFactory() - Constructor for class org.apache.spark.broadcast.TorrentBroadcastFactory
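Torrent broadcast is Spark's default broadcast implementation, so it is normally used indirectly through SparkContext.broadcast; a minimal Scala sketch with illustrative data:

  val lookup = sc.broadcast(Map("a" -> 1, "b" -> 2))
  val keys = sc.parallelize(Seq("a", "b", "a"))
  keys.map(k => lookup.value.getOrElse(k, 0)).collect()   // Array(1, 2, 1)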
 
toScalaFunction(Function<T, R>) - Static method in class org.apache.spark.api.java.JavaPairRDD
 
toScalaFunction2(Function2<T1, T2, R>) - Static method in class org.apache.spark.api.java.JavaPairRDD
 
toSchemaRDD() - Method in class org.apache.spark.sql.SchemaRDD
Returns this RDD as a SchemaRDD.
toSeq() - Method in class org.apache.spark.ml.param.ParamMap
Converts this param map to a sequence of param pairs.
toSparkContext(JavaSparkContext) - Static method in class org.apache.spark.api.java.JavaSparkContext
 
toSplitInfo(Class<?>, String, InputSplit) - Static method in class org.apache.spark.scheduler.SplitInfo
 
toSplitInfo(Class<?>, String, InputSplit) - Static method in class org.apache.spark.scheduler.SplitInfo
 
toString() - Method in class org.apache.spark.Accumulable
 
toString() - Method in class org.apache.spark.api.java.JavaRDD
 
toString() - Method in class org.apache.spark.broadcast.Broadcast
 
toString() - Method in class org.apache.spark.graphx.EdgeDirection
 
toString() - Method in class org.apache.spark.graphx.EdgeTriplet
 
toString() - Method in class org.apache.spark.ml.param.Param
 
toString() - Method in class org.apache.spark.ml.param.ParamMap
 
toString() - Method in class org.apache.spark.mllib.evaluation.binary.BinaryLabelCounter
 
toString() - Method in class org.apache.spark.mllib.linalg.DenseVector
 
toString() - Method in interface org.apache.spark.mllib.linalg.Matrix
A human-readable representation of the matrix.
toString() - Method in class org.apache.spark.mllib.linalg.SparseVector
 
toString() - Method in class org.apache.spark.mllib.regression.GeneralizedLinearModel
 
toString() - Method in class org.apache.spark.mllib.regression.LabeledPoint
 
toString() - Method in class org.apache.spark.mllib.stat.test.ChiSqTestResult
 
toString() - Method in interface org.apache.spark.mllib.stat.test.TestResult
String explaining the hypothesis test result.
toString() - Method in class org.apache.spark.mllib.tree.impl.TimeTracker
Print all timing results in seconds.
toString() - Method in class org.apache.spark.mllib.tree.impurity.EntropyCalculator
 
toString() - Method in class org.apache.spark.mllib.tree.impurity.GiniCalculator
 
toString() - Method in class org.apache.spark.mllib.tree.impurity.VarianceCalculator
 
toString() - Method in class org.apache.spark.mllib.tree.model.DecisionTreeModel
Print a summary of the model.
toString() - Method in class org.apache.spark.mllib.tree.model.InformationGainStats
 
toString() - Method in class org.apache.spark.mllib.tree.model.Node
 
toString() - Method in class org.apache.spark.mllib.tree.model.Predict
 
toString() - Method in class org.apache.spark.mllib.tree.model.Split
 
toString() - Method in class org.apache.spark.mllib.tree.model.TreeEnsembleModel
Print a summary of the model.
toString() - Method in class org.apache.spark.partial.BoundedDouble
 
toString() - Method in class org.apache.spark.partial.PartialResult
 
toString() - Method in class org.apache.spark.rdd.RDD
 
toString() - Method in class org.apache.spark.scheduler.ExecutorLossReason
 
toString() - Method in class org.apache.spark.scheduler.HDFSCacheTaskLocation
 
toString() - Method in class org.apache.spark.scheduler.HostTaskLocation
 
toString() - Method in class org.apache.spark.scheduler.InputFormatInfo
 
toString() - Method in class org.apache.spark.scheduler.ResultTask
 
toString() - Method in class org.apache.spark.scheduler.ShuffleMapTask
 
toString() - Method in class org.apache.spark.scheduler.SplitInfo
 
toString() - Method in class org.apache.spark.scheduler.Stage
 
toString() - Method in class org.apache.spark.scheduler.TaskDescription
 
toString() - Method in class org.apache.spark.scheduler.TaskSet
 
toString() - Method in class org.apache.spark.SerializableWritable
 
toString() - Method in class org.apache.spark.sql.api.java.JavaSchemaRDD
 
toString() - Method in class org.apache.spark.sql.api.java.Row
 
toString() - Method in class org.apache.spark.sql.columnar.ColumnType
 
toString() - Method in class org.apache.spark.sql.execution.PythonUDF
 
toString() - Method in class org.apache.spark.sql.hive.HiveGenericUdaf
 
toString() - Method in class org.apache.spark.sql.hive.HiveGenericUdf
 
toString() - Method in class org.apache.spark.sql.hive.HiveGenericUdtf
 
toString() - Method in class org.apache.spark.sql.hive.HiveSimpleUdf
 
toString() - Method in class org.apache.spark.sql.hive.HiveUdaf
 
toString() - Method in interface org.apache.spark.sql.SchemaRDDLike
 
toString() - Method in class org.apache.spark.storage.BlockId
 
toString() - Method in class org.apache.spark.storage.BlockManagerId
 
toString() - Method in class org.apache.spark.storage.BlockManagerInfo
 
toString() - Method in class org.apache.spark.storage.FileSegment
 
toString() - Method in class org.apache.spark.storage.RDDInfo
 
toString() - Method in class org.apache.spark.storage.StorageLevel
 
toString() - Method in class org.apache.spark.storage.TachyonFileSegment
 
toString() - Method in class org.apache.spark.streaming.dstream.DStreamCheckpointData
 
toString() - Method in class org.apache.spark.streaming.dstream.FileInputDStream.FileInputDStreamCheckpointData
 
toString() - Method in class org.apache.spark.streaming.Duration
 
toString() - Method in class org.apache.spark.streaming.Interval
 
toString() - Method in class org.apache.spark.streaming.scheduler.Job
 
toString() - Method in class org.apache.spark.streaming.Time
 
toString() - Method in class org.apache.spark.util.MutablePair
 
toString() - Method in class org.apache.spark.util.StatCounter
 
toString() - Method in class org.apache.spark.util.Vector
 
totalCoreCount() - Method in class org.apache.spark.scheduler.cluster.CoarseGrainedSchedulerBackend
 
totalCores() - Method in class org.apache.spark.scheduler.cluster.ExecutorData
 
totalCores() - Method in class org.apache.spark.scheduler.local.LocalBackend
 
totalCoresAcquired() - Method in class org.apache.spark.scheduler.cluster.mesos.CoarseMesosSchedulerBackend
 
totalCount() - Method in class org.apache.spark.mllib.evaluation.binary.BinaryConfusionMatrixImpl
 
totalDelay() - Method in class org.apache.spark.streaming.scheduler.BatchInfo
Time taken for all the jobs of this batch to finish processing from the time they were submitted.
totalDelay() - Method in class org.apache.spark.streaming.scheduler.JobSet
 
totalDelayDistribution() - Method in class org.apache.spark.streaming.ui.StreamingJobProgressListener
 
totalDuration() - Method in class org.apache.spark.ui.exec.ExecutorSummaryInfo
 
totalExpectedCores() - Method in class org.apache.spark.scheduler.cluster.SparkDeploySchedulerBackend
 
totalInputBytes() - Method in class org.apache.spark.ui.exec.ExecutorSummaryInfo
 
totalNumNodes() - Method in class org.apache.spark.mllib.tree.model.TreeEnsembleModel
Get total number of nodes, summed over all trees in the forest.
totalRegisteredExecutors() - Method in class org.apache.spark.scheduler.cluster.CoarseGrainedSchedulerBackend
 
totalResultSize() - Method in class org.apache.spark.scheduler.TaskSetManager
 
totalShuffleRead() - Method in class org.apache.spark.ui.exec.ExecutorSummaryInfo
 
totalShuffleWrite() - Method in class org.apache.spark.ui.exec.ExecutorSummaryInfo
 
totalTasks() - Method in class org.apache.spark.partial.ApproximateActionListener
 
totalTasks() - Method in class org.apache.spark.ui.exec.ExecutorSummaryInfo
 
toTuple() - Method in class org.apache.spark.graphx.EdgeTriplet
 
toTypeInfo() - Method in class org.apache.spark.sql.hive.HiveInspectors.typeInfoConversions
 
toWeakReference(V) - Static method in class org.apache.spark.util.TimeStampedWeakValueHashMap
 
toWeakReferenceFunction(Function1<Tuple2<K, V>, R>) - Static method in class org.apache.spark.util.TimeStampedWeakValueHashMap
 
toWeakReferenceTuple(Tuple2<K, V>) - Static method in class org.apache.spark.util.TimeStampedWeakValueHashMap
 
trackerActor() - Method in class org.apache.spark.MapOutputTracker
Set to the MapOutputTrackerActor living on the driver.
train(RDD<LabeledPoint>, int, double, double, Vector) - Static method in class org.apache.spark.mllib.classification.LogisticRegressionWithSGD
Train a logistic regression model given an RDD of (label, features) pairs.
train(RDD<LabeledPoint>, int, double, double) - Static method in class org.apache.spark.mllib.classification.LogisticRegressionWithSGD
Train a logistic regression model given an RDD of (label, features) pairs.
train(RDD<LabeledPoint>, int, double) - Static method in class org.apache.spark.mllib.classification.LogisticRegressionWithSGD
Train a logistic regression model given an RDD of (label, features) pairs.
train(RDD<LabeledPoint>, int) - Static method in class org.apache.spark.mllib.classification.LogisticRegressionWithSGD
Train a logistic regression model given an RDD of (label, features) pairs.
train(RDD<LabeledPoint>) - Static method in class org.apache.spark.mllib.classification.NaiveBayes
Trains a Naive Bayes model given an RDD of (label, features) pairs.
train(RDD<LabeledPoint>, double) - Static method in class org.apache.spark.mllib.classification.NaiveBayes
Trains a Naive Bayes model given an RDD of (label, features) pairs.
train(RDD<LabeledPoint>, int, double, double, double, Vector) - Static method in class org.apache.spark.mllib.classification.SVMWithSGD
Train an SVM model given an RDD of (label, features) pairs.
train(RDD<LabeledPoint>, int, double, double, double) - Static method in class org.apache.spark.mllib.classification.SVMWithSGD
Train an SVM model given an RDD of (label, features) pairs.
train(RDD<LabeledPoint>, int, double, double) - Static method in class org.apache.spark.mllib.classification.SVMWithSGD
Train an SVM model given an RDD of (label, features) pairs.
train(RDD<LabeledPoint>, int) - Static method in class org.apache.spark.mllib.classification.SVMWithSGD
Train an SVM model given an RDD of (label, features) pairs.
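A minimal Scala sketch of SVMWithSGD.train; the LibSVM path is illustrative:

  import org.apache.spark.mllib.classification.SVMWithSGD
  import org.apache.spark.mllib.util.MLUtils
  val data = MLUtils.loadLibSVMFile(sc, "data/sample_libsvm_data.txt")   // RDD[LabeledPoint]
  val model = SVMWithSGD.train(data, numIterations = 100)
  val label = model.predict(data.first().features)   // predicted class for one point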
train(RDD<Vector>, int, int, int, String) - Static method in class org.apache.spark.mllib.clustering.KMeans
Trains a k-means model using the given set of parameters.
train(RDD<Vector>, int, int) - Static method in class org.apache.spark.mllib.clustering.KMeans
Trains a k-means model using the specified parameters, with default values for any left unspecified.
train(RDD<Vector>, int, int, int) - Static method in class org.apache.spark.mllib.clustering.KMeans
Trains a k-means model using the specified parameters, with default values for any left unspecified.
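A minimal Scala sketch of KMeans.train on toy 2-D points; the data and parameters are illustrative:

  import org.apache.spark.mllib.clustering.KMeans
  import org.apache.spark.mllib.linalg.Vectors
  val points = sc.parallelize(Seq(
    Vectors.dense(0.0, 0.0), Vectors.dense(0.1, 0.1),
    Vectors.dense(9.0, 9.0), Vectors.dense(9.1, 9.1)))
  val model = KMeans.train(points, k = 2, maxIterations = 20)
  model.clusterCenters.foreach(println)   // two centers, near (0.05, 0.05) and (9.05, 9.05)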
train(RDD<Rating>, int, int, double, int, long) - Static method in class org.apache.spark.mllib.recommendation.ALS
Train a matrix factorization model given an RDD of ratings given by users to some products, in the form of (userID, productID, rating) pairs.
train(RDD<Rating>, int, int, double, int) - Static method in class org.apache.spark.mllib.recommendation.ALS
Train a matrix factorization model given an RDD of ratings given by users to some products, in the form of (userID, productID, rating) pairs.
train(RDD<Rating>, int, int, double) - Static method in class org.apache.spark.mllib.recommendation.ALS
Train a matrix factorization model given an RDD of ratings given by users to some products, in the form of (userID, productID, rating) pairs.
train(RDD<Rating>, int, int) - Static method in class org.apache.spark.mllib.recommendation.ALS
Train a matrix factorization model given an RDD of ratings given by users to some products, in the form of (userID, productID, rating) pairs.
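A minimal Scala sketch of ALS.train; the ratings and hyperparameters are illustrative:

  import org.apache.spark.mllib.recommendation.{ALS, Rating}
  val ratings = sc.parallelize(Seq(
    Rating(1, 10, 4.0), Rating(1, 20, 1.0), Rating(2, 10, 5.0)))
  val model = ALS.train(ratings, rank = 10, iterations = 10, lambda = 0.01)
  model.predict(2, 20)   // predicted rating of product 20 by user 2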
train(RDD<LabeledPoint>, int, double, double, double, Vector) - Static method in class org.apache.spark.mllib.regression.LassoWithSGD
Train a Lasso model given an RDD of (label, features) pairs.
train(RDD<LabeledPoint>, int, double, double, double) - Static method in class org.apache.spark.mllib.regression.LassoWithSGD
Train a Lasso model given an RDD of (label, features) pairs.
train(RDD<LabeledPoint>, int, double, double) - Static method in class org.apache.spark.mllib.regression.LassoWithSGD
Train a Lasso model given an RDD of (label, features) pairs.
train(RDD<LabeledPoint>, int) - Static method in class org.apache.spark.mllib.regression.LassoWithSGD
Train a Lasso model given an RDD of (label, features) pairs.
train(RDD<LabeledPoint>, int, double, double, Vector) - Static method in class org.apache.spark.mllib.regression.LinearRegressionWithSGD
Train a Linear Regression model given an RDD of (label, features) pairs.
train(RDD<LabeledPoint>, int, double, double) - Static method in class org.apache.spark.mllib.regression.LinearRegressionWithSGD
Train a LinearRegression model given an RDD of (label, features) pairs.
train(RDD<LabeledPoint>, int, double) - Static method in class org.apache.spark.mllib.regression.LinearRegressionWithSGD
Train a LinearRegression model given an RDD of (label, features) pairs.
train(RDD<LabeledPoint>, int) - Static method in class org.apache.spark.mllib.regression.LinearRegressionWithSGD
Train a LinearRegression model given an RDD of (label, features) pairs.
train(RDD<LabeledPoint>, int, double, double, double, Vector) - Static method in class org.apache.spark.mllib.regression.RidgeRegressionWithSGD
Train a RidgeRegression model given an RDD of (label, features) pairs.
train(RDD<LabeledPoint>, int, double, double, double) - Static method in class org.apache.spark.mllib.regression.RidgeRegressionWithSGD
Train a RidgeRegression model given an RDD of (label, features) pairs.
train(RDD<LabeledPoint>, int, double, double) - Static method in class org.apache.spark.mllib.regression.RidgeRegressionWithSGD
Train a RidgeRegression model given an RDD of (label, features) pairs.
train(RDD<LabeledPoint>, int) - Static method in class org.apache.spark.mllib.regression.RidgeRegressionWithSGD
Train a RidgeRegression model given an RDD of (label, features) pairs.
train(RDD<LabeledPoint>) - Method in class org.apache.spark.mllib.tree.DecisionTree
Trains a decision tree model over an RDD.
train(RDD<LabeledPoint>, BoostingStrategy) - Static method in class org.apache.spark.mllib.tree.GradientBoostedTrees
Method to train a gradient boosting model.
train(JavaRDD<LabeledPoint>, BoostingStrategy) - Static method in class org.apache.spark.mllib.tree.GradientBoostedTrees
Java-friendly API for GradientBoostedTrees$.train(org.apache.spark.rdd.RDD<org.apache.spark.mllib.regression.LabeledPoint>, org.apache.spark.mllib.tree.configuration.BoostingStrategy)
trainClassifier(RDD<LabeledPoint>, int, Map<Object, Object>, String, int, int) - Static method in class org.apache.spark.mllib.tree.DecisionTree
Method to train a decision tree model for binary or multiclass classification.
trainClassifier(JavaRDD<LabeledPoint>, int, Map<Integer, Integer>, String, int, int) - Static method in class org.apache.spark.mllib.tree.DecisionTree
Java-friendly API for DecisionTree$.trainClassifier(org.apache.spark.rdd.RDD<org.apache.spark.mllib.regression.LabeledPoint>, int, scala.collection.immutable.Map<java.lang.Object, java.lang.Object>, java.lang.String, int, int)
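A minimal Scala sketch of DecisionTree.trainClassifier for a binary problem; the path and parameters are illustrative:

  import org.apache.spark.mllib.tree.DecisionTree
  import org.apache.spark.mllib.util.MLUtils
  val data = MLUtils.loadLibSVMFile(sc, "data/sample_libsvm_data.txt")
  val model = DecisionTree.trainClassifier(
    data, 2,           // numClasses
    Map[Int, Int](),   // categoricalFeaturesInfo: empty means all features continuous
    "gini", 5, 32)     // impurity, maxDepth, maxBins
  println(model.toDebugString)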
trainClassifier(RDD<LabeledPoint>, Strategy, int, String, int) - Static method in class org.apache.spark.mllib.tree.RandomForest
Method to train a decision tree model for binary or multiclass classification.
trainClassifier(RDD<LabeledPoint>, int, Map<Object, Object>, int, String, String, int, int, int) - Static method in class org.apache.spark.mllib.tree.RandomForest
Method to train a decision tree model for binary or multiclass classification.
trainClassifier(JavaRDD<LabeledPoint>, int, Map<Integer, Integer>, int, String, String, int, int, int) - Static method in class org.apache.spark.mllib.tree.RandomForest
Java-friendly API for RandomForest$.trainClassifier(org.apache.spark.rdd.RDD<org.apache.spark.mllib.regression.LabeledPoint>, org.apache.spark.mllib.tree.configuration.Strategy, int, java.lang.String, int)
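
A hedged sketch of trainClassifier; `training` stands in for an RDD[LabeledPoint] prepared elsewhere:

    import org.apache.spark.mllib.regression.LabeledPoint
    import org.apache.spark.mllib.tree.DecisionTree
    import org.apache.spark.rdd.RDD

    def fitTree(training: RDD[LabeledPoint]) = {
      val numClasses = 2
      // Empty map: treat every feature as continuous; otherwise map a
      // categorical feature's index to its number of categories.
      val categoricalFeaturesInfo = Map[Int, Int]()
      DecisionTree.trainClassifier(training, numClasses, categoricalFeaturesInfo,
        "gini", 5 /* maxDepth */, 32 /* maxBins */)
    }
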
trainImplicit(RDD<Rating>, int, int, double, int, double, long) - Static method in class org.apache.spark.mllib.recommendation.ALS
Train a matrix factorization model given an RDD of 'implicit preferences' given by users to some products, in the form of (userID, productID, preference) pairs.
trainImplicit(RDD<Rating>, int, int, double, int, double) - Static method in class org.apache.spark.mllib.recommendation.ALS
Train a matrix factorization model given an RDD of 'implicit preferences' given by users to some products, in the form of (userID, productID, preference) pairs.
trainImplicit(RDD<Rating>, int, int, double, double) - Static method in class org.apache.spark.mllib.recommendation.ALS
Train a matrix factorization model given an RDD of 'implicit preferences' given by users to some products, in the form of (userID, productID, preference) pairs.
trainImplicit(RDD<Rating>, int, int) - Static method in class org.apache.spark.mllib.recommendation.ALS
Train a matrix factorization model given an RDD of 'implicit preferences' given by users to some products, in the form of (userID, productID, preference) pairs.
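
A short sketch of trainImplicit, assuming `sc` and an illustrative CSV of play counts (implicit feedback: the third field is a confidence-weighted count, not an explicit rating):

    import org.apache.spark.mllib.recommendation.{ALS, Rating}

    val prefs = sc.textFile("data/plays.csv").map { line =>
      val Array(user, product, count) = line.split(',')
      Rating(user.toInt, product.toInt, count.toDouble)
    }
    // rank = 10, iterations = 10, lambda = 0.01, alpha = 1.0
    val model = ALS.trainImplicit(prefs, 10, 10, 0.01, 1.0)
    val score = model.predict(42, 17)   // user 42's predicted preference for product 17
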
trainOn(DStream<Vector>) - Method in class org.apache.spark.mllib.clustering.StreamingKMeans
Update the clustering model by training on batches of data from a DStream.
trainOn(DStream<LabeledPoint>) - Method in class org.apache.spark.mllib.regression.StreamingLinearAlgorithm
Update the model by training on batches of data from a DStream.
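
A sketch of streaming training, assuming an existing StreamingContext `ssc` and an illustrative input directory; trainOn on StreamingLinearAlgorithm subclasses follows the same pattern with a DStream[LabeledPoint]:

    import org.apache.spark.mllib.clustering.StreamingKMeans
    import org.apache.spark.mllib.linalg.Vectors

    val trainingData = ssc.textFileStream("data/train")
      .map(line => Vectors.dense(line.split(' ').map(_.toDouble)))

    val model = new StreamingKMeans()
      .setK(3)
      .setDecayFactor(1.0)
      .setRandomCenters(2, 0.0, 42L)   // dimension, initial weight, seed

    model.trainOn(trainingData)   // cluster centers update on every batch
    ssc.start()
    ssc.awaitTermination()
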
trainRegressor(RDD<LabeledPoint>, Map<Object, Object>, String, int, int) - Static method in class org.apache.spark.mllib.tree.DecisionTree
Method to train a decision tree model for regression.
trainRegressor(JavaRDD<LabeledPoint>, Map<Integer, Integer>, String, int, int) - Static method in class org.apache.spark.mllib.tree.DecisionTree
Java-friendly API for DecisionTree$.trainRegressor(org.apache.spark.rdd.RDD<org.apache.spark.mllib.regression.LabeledPoint>, scala.collection.immutable.Map<java.lang.Object, java.lang.Object>, java.lang.String, int, int)
trainRegressor(RDD<LabeledPoint>, Strategy, int, String, int) - Static method in class org.apache.spark.mllib.tree.RandomForest
Method to train a decision tree model for regression.
trainRegressor(RDD<LabeledPoint>, Map<Object, Object>, int, String, String, int, int, int) - Static method in class org.apache.spark.mllib.tree.RandomForest
Method to train a decision tree model for regression.
trainRegressor(JavaRDD<LabeledPoint>, Map<Integer, Integer>, int, String, String, int, int, int) - Static method in class org.apache.spark.mllib.tree.RandomForest
Java-friendly API for RandomForest$.trainRegressor(org.apache.spark.rdd.RDD<org.apache.spark.mllib.regression.LabeledPoint>, org.apache.spark.mllib.tree.configuration.Strategy, int, java.lang.String, int)
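
The regression variant, sketched under the same assumption of a prepared RDD[LabeledPoint]; the positional arguments mirror the signature above:

    import org.apache.spark.mllib.regression.LabeledPoint
    import org.apache.spark.mllib.tree.RandomForest
    import org.apache.spark.rdd.RDD

    def fitForest(training: RDD[LabeledPoint]) =
      RandomForest.trainRegressor(training,
        Map[Int, Int](),   // categoricalFeaturesInfo: all features continuous
        50,                // numTrees
        "auto",            // featureSubsetStrategy: let the algorithm choose
        "variance",        // impurity (the regression criterion)
        4,                 // maxDepth
        32,                // maxBins
        42)                // seed
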
transceiver() - Method in class org.apache.spark.streaming.flume.FlumeConnection
 
transform(SchemaRDD, ParamMap) - Method in class org.apache.spark.ml.classification.LogisticRegressionModel
 
transform(SchemaRDD, ParamMap) - Method in class org.apache.spark.ml.feature.StandardScalerModel
 
transform(SchemaRDD, ParamMap) - Method in class org.apache.spark.ml.PipelineModel
 
transform(SchemaRDD, ParamPair<?>...) - Method in class org.apache.spark.ml.Transformer
Transforms the dataset with optional parameters.
transform(JavaSchemaRDD, ParamPair<?>...) - Method in class org.apache.spark.ml.Transformer
Transforms the dataset with optional parameters.
transform(SchemaRDD, Seq<ParamPair<?>>) - Method in class org.apache.spark.ml.Transformer
Transforms the dataset with optional parameters.
transform(SchemaRDD, ParamMap) - Method in class org.apache.spark.ml.Transformer
Transforms the dataset with provided parameter map as additional parameters.
transform(JavaSchemaRDD, Seq<ParamPair<?>>) - Method in class org.apache.spark.ml.Transformer
Transforms the dataset with optional parameters.
transform(JavaSchemaRDD, ParamMap) - Method in class org.apache.spark.ml.Transformer
Transforms the dataset with provided parameter map as additional parameters.
transform(SchemaRDD, ParamMap) - Method in class org.apache.spark.ml.tuning.CrossValidatorModel
 
transform(SchemaRDD, ParamMap) - Method in class org.apache.spark.ml.UnaryTransformer
 
transform(Iterable<Object>) - Method in class org.apache.spark.mllib.feature.HashingTF
Transforms the input document into a sparse term frequency vector.
transform(Iterable<?>) - Method in class org.apache.spark.mllib.feature.HashingTF
Transforms the input document into a sparse term frequency vector (Java version).
transform(RDD<D>) - Method in class org.apache.spark.mllib.feature.HashingTF
Transforms the input documents to term frequency vectors.
transform(JavaRDD<D>) - Method in class org.apache.spark.mllib.feature.HashingTF
Transforms the input documents to term frequency vectors (Java version).
transform(RDD<Vector>) - Method in class org.apache.spark.mllib.feature.IDFModel
Transforms term frequency (TF) vectors to TF-IDF vectors.
transform(JavaRDD<Vector>) - Method in class org.apache.spark.mllib.feature.IDFModel
Transforms term frequency (TF) vectors to TF-IDF vectors (Java version).
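
A compact sketch of the TF-IDF pipeline these transform methods form, assuming `sc` and an illustrative corpus file with one whitespace-tokenized document per line:

    import org.apache.spark.mllib.feature.{HashingTF, IDF}

    val docs = sc.textFile("data/corpus.txt").map(_.split(" ").toSeq)

    val hashingTF = new HashingTF()
    val tf = hashingTF.transform(docs)   // RDD[Vector] of term frequencies
    tf.cache()                           // reused by both fit and transform
    val idfModel = new IDF().fit(tf)
    val tfidf = idfModel.transform(tf)   // RDD[Vector] of TF-IDF vectors
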
transform(Vector) - Method in class org.apache.spark.mllib.feature.Normalizer
Applies unit length normalization on a vector.
transform(Vector) - Method in class org.apache.spark.mllib.feature.StandardScalerModel
Applies standardization transformation on a vector.
transform(Vector) - Method in interface org.apache.spark.mllib.feature.VectorTransformer
Applies transformation on a vector.
transform(RDD<Vector>) - Method in interface org.apache.spark.mllib.feature.VectorTransformer
Applies transformation on an RDD[Vector].
transform(JavaRDD<Vector>) - Method in interface org.apache.spark.mllib.feature.VectorTransformer
Applies transformation on a JavaRDD[Vector].
transform(String) - Method in class org.apache.spark.mllib.feature.Word2VecModel
Transforms a word to its vector representation.
transform(PartialFunction<ASTNode, ASTNode>) - Method in class org.apache.spark.sql.hive.HiveQl.TransformableNode
Returns a copy of this node where rule has been recursively applied to it and all of its children.
transform(Function<R, JavaRDD<U>>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
Return a new DStream in which each RDD is generated by applying a function on each RDD of 'this' DStream.
transform(Function2<R, Time, JavaRDD<U>>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
Return a new DStream in which each RDD is generated by applying a function on each RDD of 'this' DStream.
transform(List<JavaDStream<?>>, Function2<List<JavaRDD<?>>, Time, JavaRDD<T>>) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
Create a new DStream in which each RDD is generated by applying a function on RDDs of the DStreams.
transform(Function1<RDD<T>, RDD<U>>, ClassTag<U>) - Method in class org.apache.spark.streaming.dstream.DStream
Return a new DStream in which each RDD is generated by applying a function on each RDD of 'this' DStream.
transform(Function2<RDD<T>, Time, RDD<U>>, ClassTag<U>) - Method in class org.apache.spark.streaming.dstream.DStream
Return a new DStream in which each RDD is generated by applying a function on each RDD of 'this' DStream.
transform(Seq<DStream<?>>, Function2<Seq<RDD<?>>, Time, RDD<T>>, ClassTag<T>) - Method in class org.apache.spark.streaming.StreamingContext
Create a new DStream in which each RDD is generated by applying a function on RDDs of the DStreams.
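
A sketch of transform, which exposes each batch as a plain RDD so any RDD operation can be applied; `requests` (a hypothetical DStream of (url, count) pairs) and `blacklist` (a static RDD[(String, Boolean)]) are assumed to exist:

    import org.apache.spark.SparkContext._   // pair-RDD implicits (Spark 1.x)

    // Drop any URL present in the blacklist, batch by batch.
    val cleaned = requests.transform { rdd =>
      rdd.leftOuterJoin(blacklist)
         .filter { case (_, (_, flagged)) => flagged.isEmpty }
         .mapValues { case (count, _) => count }
    }
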
TransformedDStream<U> - Class in org.apache.spark.streaming.dstream
 
TransformedDStream(Seq<DStream<?>>, Function2<Seq<RDD<?>>, Time, RDD<U>>, ClassTag<U>) - Constructor for class org.apache.spark.streaming.dstream.TransformedDStream
 
Transformer - Class in org.apache.spark.ml
:: AlphaComponent :: Abstract class for transformers that transform one dataset into another.
Transformer() - Constructor for class org.apache.spark.ml.Transformer
 
transformSchema(StructType, ParamMap) - Method in class org.apache.spark.ml.classification.LogisticRegression
 
transformSchema(StructType, ParamMap) - Method in class org.apache.spark.ml.classification.LogisticRegressionModel
 
transformSchema(StructType, ParamMap) - Method in class org.apache.spark.ml.feature.StandardScaler
 
transformSchema(StructType, ParamMap) - Method in class org.apache.spark.ml.feature.StandardScalerModel
 
transformSchema(StructType, ParamMap) - Method in class org.apache.spark.ml.Pipeline
 
transformSchema(StructType, ParamMap) - Method in class org.apache.spark.ml.PipelineModel
 
transformSchema(StructType, ParamMap) - Method in class org.apache.spark.ml.PipelineStage
Derives the output schema from the input schema and parameters.
transformSchema(StructType, ParamMap) - Method in class org.apache.spark.ml.tuning.CrossValidator
 
transformSchema(StructType, ParamMap) - Method in class org.apache.spark.ml.tuning.CrossValidatorModel
 
transformSchema(StructType, ParamMap) - Method in class org.apache.spark.ml.UnaryTransformer
 
transformToPair(Function<R, JavaPairRDD<K2, V2>>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
Return a new DStream in which each RDD is generated by applying a function on each RDD of 'this' DStream.
transformToPair(Function2<R, Time, JavaPairRDD<K2, V2>>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
Return a new DStream in which each RDD is generated by applying a function on each RDD of 'this' DStream.
transformToPair(List<JavaDStream<?>>, Function2<List<JavaRDD<?>>, Time, JavaPairRDD<K, V>>) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
Create a new DStream in which each RDD is generated by applying a function on RDDs of the DStreams.
transformWith(JavaDStream<U>, Function3<R, JavaRDD<U>, Time, JavaRDD<W>>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
Return a new DStream in which each RDD is generated by applying a function on each RDD of 'this' DStream and 'other' DStream.
transformWith(JavaPairDStream<K2, V2>, Function3<R, JavaPairRDD<K2, V2>, Time, JavaRDD<W>>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
Return a new DStream in which each RDD is generated by applying a function on each RDD of 'this' DStream and 'other' DStream.
transformWith(DStream<U>, Function2<RDD<T>, RDD<U>, RDD<V>>, ClassTag<U>, ClassTag<V>) - Method in class org.apache.spark.streaming.dstream.DStream
Return a new DStream in which each RDD is generated by applying a function on each RDD of 'this' DStream and 'other' DStream.
transformWith(DStream<U>, Function3<RDD<T>, RDD<U>, Time, RDD<V>>, ClassTag<U>, ClassTag<V>) - Method in class org.apache.spark.streaming.dstream.DStream
Return a new DStream in which each RDD is generated by applying a function on each RDD of 'this' DStream and 'other' DStream.
transformWithToPair(JavaDStream<U>, Function3<R, JavaRDD<U>, Time, JavaPairRDD<K2, V2>>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
Return a new DStream in which each RDD is generated by applying a function on each RDD of 'this' DStream and 'other' DStream.
transformWithToPair(JavaPairDStream<K2, V2>, Function3<R, JavaPairRDD<K2, V2>, Time, JavaPairRDD<K3, V3>>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
Return a new DStream in which each RDD is generated by applying a function on each RDD of 'this' DStream and 'other' DStream.
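
A sketch of transformWith, assuming two hypothetical pair DStreams `clicks` and `impressions` keyed by ad id; corresponding batches are joined with an ordinary RDD function:

    import org.apache.spark.SparkContext._
    import org.apache.spark.rdd.RDD

    val ctr = clicks.transformWith(impressions,
      (c: RDD[(String, Long)], i: RDD[(String, Long)]) =>
        c.join(i).mapValues { case (cl, im) => cl.toDouble / im })
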
transposeMultiply(DenseMatrix) - Method in interface org.apache.spark.mllib.linalg.Matrix
Convenience method for `Matrix`^T * `DenseMatrix` multiplication.
transposeMultiply(DenseVector) - Method in interface org.apache.spark.mllib.linalg.Matrix
Convenience method for `Matrix`^T * `DenseVector` multiplication.
treeAggregate(U, Function2<U, T, U>, Function2<U, U, U>, int, ClassTag<U>) - Method in class org.apache.spark.mllib.rdd.RDDFunctions
Aggregates the elements of this RDD in a multi-level tree pattern.
TreeEnsembleModel - Class in org.apache.spark.mllib.tree.model
Represents a tree ensemble model.
TreeEnsembleModel(Enumeration.Value, DecisionTreeModel[], double[], Enumeration.Value) - Constructor for class org.apache.spark.mllib.tree.model.TreeEnsembleModel
 
TreePoint - Class in org.apache.spark.mllib.tree.impl
Internal representation of LabeledPoint for DecisionTree.
TreePoint(double, int[]) - Constructor for class org.apache.spark.mllib.tree.impl.TreePoint
 
treeReduce(Function2<T, T, T>, int) - Method in class org.apache.spark.mllib.rdd.RDDFunctions
Reduces the elements of this RDD in a multi-level tree pattern.
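
A sketch of the tree-pattern combiners, which merge partial results over `depth` levels of intermediate reducers instead of sending every partition's result straight to the driver; the implicit conversion comes from spark.mllib's RDDFunctions:

    import org.apache.spark.mllib.rdd.RDDFunctions._

    val data = sc.parallelize(1 to 1000000, 100)
    val sum   = data.treeReduce(_ + _, 2)   // depth-2 reduction tree
    val sumSq = data.treeAggregate(0L)(
      (acc, x) => acc + x.toLong * x,       // fold within a partition
      _ + _,                                // combine partials tree-wise
      2)                                    // depth
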
trees() - Method in class org.apache.spark.mllib.tree.model.GradientBoostedTreesModel
 
trees() - Method in class org.apache.spark.mllib.tree.model.RandomForestModel
 
treeStrategy() - Method in class org.apache.spark.mllib.tree.configuration.BoostingStrategy
 
treeWeights() - Method in class org.apache.spark.mllib.tree.model.GradientBoostedTreesModel
 
triangleCount() - Method in class org.apache.spark.graphx.GraphOps
Compute the number of triangles passing through each vertex.
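
A sketch of triangle counting; the edge-list path is illustrative, and the algorithm expects the graph in canonical orientation (srcId < dstId):

    import org.apache.spark.graphx.GraphLoader

    val graph = GraphLoader.edgeListFile(sc, "data/followers.txt",
      canonicalOrientation = true)
    // One Int per vertex: the number of triangles passing through it.
    val triangles = graph.triangleCount().vertices
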
TriangleCount - Class in org.apache.spark.graphx.lib
Compute the number of triangles passing through each vertex.
TriangleCount() - Constructor for class org.apache.spark.graphx.lib.TriangleCount
 
TripletFields - Class in org.apache.spark.graphx
Represents a subset of the fields of an [[EdgeTriplet]] or [[EdgeContext]].
TripletFields() - Constructor for class org.apache.spark.graphx.TripletFields
Constructs a default TripletFields in which all fields are included.
TripletFields(boolean, boolean, boolean) - Constructor for class org.apache.spark.graphx.TripletFields
 
tripletIterator(boolean, boolean) - Method in class org.apache.spark.graphx.impl.EdgePartition
Get an iterator over the edge triplets in this partition.
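
A sketch of how TripletFields lets aggregateMessages ship only the attributes a send function actually reads; `graph` stands in for any existing Graph:

    import org.apache.spark.graphx._

    // In-degree: the message function reads no vertex or edge attributes,
    // so TripletFields.None avoids shipping any vertex data to the edges.
    def inDegrees[VD, ED](graph: Graph[VD, ED]): VertexRDD[Int] =
      graph.aggregateMessages[Int](
        ctx => ctx.sendToDst(1),   // send 1 along each edge
        _ + _,                     // sum messages per vertex
        TripletFields.None)
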
triplets() - Method in class org.apache.spark.graphx.Graph
An RDD containing the edge triplets, which are edges along with the vertex data associated with the adjacent vertices.
triplets() - Method in class org.apache.spark.graphx.impl.GraphImpl
Return an RDD that brings edges together with their source and destination vertices.
TRUE() - Static method in class org.apache.spark.sql.hive.HiveQl
 
truePositiveRate(double) - Method in class org.apache.spark.mllib.evaluation.MulticlassMetrics
Returns the true positive rate for a given label (category).
tryLog(Function0<T>) - Static method in class org.apache.spark.util.Utils
Executes the given block in a Try, logging any uncaught exceptions.
tryOrExit(Function0<BoxedUnit>) - Static method in class org.apache.spark.util.Utils
Execute a block of code that evaluates to Unit, forwarding any uncaught exceptions to the default UncaughtExceptionHandler
tryOrIOException(Function0<BoxedUnit>) - Static method in class org.apache.spark.util.Utils
Execute a block of code that evaluates to Unit, re-throwing any non-fatal uncaught exceptions as IOException.
tryOrIOException(Function0<T>) - Static method in class org.apache.spark.util.Utils
Execute a block of code that returns a value, re-throwing any non-fatal uncaught exceptions as IOException.
tryUncacheQuery(SchemaRDD, boolean) - Method in interface org.apache.spark.sql.CacheManager
Tries to remove the data for the given SchemaRDD from the cache if it's cached
TwitterInputDStream - Class in org.apache.spark.streaming.twitter
 
TwitterInputDStream(StreamingContext, Option<Authorization>, Seq<String>, StorageLevel) - Constructor for class org.apache.spark.streaming.twitter.TwitterInputDStream
 
TwitterReceiver - Class in org.apache.spark.streaming.twitter
 
TwitterReceiver(Authorization, Seq<String>, StorageLevel) - Constructor for class org.apache.spark.streaming.twitter.TwitterReceiver
 
TwitterUtils - Class in org.apache.spark.streaming.twitter
 
TwitterUtils() - Constructor for class org.apache.spark.streaming.twitter.TwitterUtils
 
typ() - Method in class org.apache.spark.streaming.scheduler.RegisterReceiver
 
typeId() - Method in class org.apache.spark.sql.columnar.ColumnType
 
typeId() - Static method in class org.apache.spark.sql.columnar.compression.BooleanBitSet
 
typeId() - Method in interface org.apache.spark.sql.columnar.compression.CompressionScheme
 
typeId() - Static method in class org.apache.spark.sql.columnar.compression.DictionaryEncoding
 
typeId() - Static method in class org.apache.spark.sql.columnar.compression.IntDelta
 
typeId() - Static method in class org.apache.spark.sql.columnar.compression.LongDelta
 
typeId() - Static method in class org.apache.spark.sql.columnar.compression.PassThrough
 
typeId() - Static method in class org.apache.spark.sql.columnar.compression.RunLengthEncoding
 

U

U() - Method in class org.apache.spark.mllib.linalg.SingularValueDecomposition
 
udf() - Method in class org.apache.spark.sql.execution.BatchPythonEvaluation
 
udf() - Method in class org.apache.spark.sql.execution.EvaluatePython
 
UDF1<T1,R> - Interface in org.apache.spark.sql.api.java
A Spark SQL UDF that has 1 argument.
UDF10<T1,T2,T3,T4,T5,T6,T7,T8,T9,T10,R> - Interface in org.apache.spark.sql.api.java
A Spark SQL UDF that has 10 arguments.
UDF11<T1,T2,T3,T4,T5,T6,T7,T8,T9,T10,T11,R> - Interface in org.apache.spark.sql.api.java
A Spark SQL UDF that has 11 arguments.
UDF12<T1,T2,T3,T4,T5,T6,T7,T8,T9,T10,T11,T12,R> - Interface in org.apache.spark.sql.api.java
A Spark SQL UDF that has 12 arguments.
UDF13<T1,T2,T3,T4,T5,T6,T7,T8,T9,T10,T11,T12,T13,R> - Interface in org.apache.spark.sql.api.java
A Spark SQL UDF that has 13 arguments.
UDF14<T1,T2,T3,T4,T5,T6,T7,T8,T9,T10,T11,T12,T13,T14,R> - Interface in org.apache.spark.sql.api.java
A Spark SQL UDF that has 14 arguments.
UDF15<T1,T2,T3,T4,T5,T6,T7,T8,T9,T10,T11,T12,T13,T14,T15,R> - Interface in org.apache.spark.sql.api.java
A Spark SQL UDF that has 15 arguments.
UDF16<T1,T2,T3,T4,T5,T6,T7,T8,T9,T10,T11,T12,T13,T14,T15,T16,R> - Interface in org.apache.spark.sql.api.java
A Spark SQL UDF that has 16 arguments.
UDF17<T1,T2,T3,T4,T5,T6,T7,T8,T9,T10,T11,T12,T13,T14,T15,T16,T17,R> - Interface in org.apache.spark.sql.api.java
A Spark SQL UDF that has 17 arguments.
UDF18<T1,T2,T3,T4,T5,T6,T7,T8,T9,T10,T11,T12,T13,T14,T15,T16,T17,T18,R> - Interface in org.apache.spark.sql.api.java
A Spark SQL UDF that has 18 arguments.
UDF19<T1,T2,T3,T4,T5,T6,T7,T8,T9,T10,T11,T12,T13,T14,T15,T16,T17,T18,T19,R> - Interface in org.apache.spark.sql.api.java
A Spark SQL UDF that has 19 arguments.
UDF2<T1,T2,R> - Interface in org.apache.spark.sql.api.java
A Spark SQL UDF that has 2 arguments.
UDF20<T1,T2,T3,T4,T5,T6,T7,T8,T9,T10,T11,T12,T13,T14,T15,T16,T17,T18,T19,T20,R> - Interface in org.apache.spark.sql.api.java
A Spark SQL UDF that has 20 arguments.
UDF21<T1,T2,T3,T4,T5,T6,T7,T8,T9,T10,T11,T12,T13,T14,T15,T16,T17,T18,T19,T20,T21,R> - Interface in org.apache.spark.sql.api.java
A Spark SQL UDF that has 21 arguments.
UDF22<T1,T2,T3,T4,T5,T6,T7,T8,T9,T10,T11,T12,T13,T14,T15,T16,T17,T18,T19,T20,T21,T22,R> - Interface in org.apache.spark.sql.api.java
A Spark SQL UDF that has 22 arguments.
UDF3<T1,T2,T3,R> - Interface in org.apache.spark.sql.api.java
A Spark SQL UDF that has 3 arguments.
UDF4<T1,T2,T3,T4,R> - Interface in org.apache.spark.sql.api.java
A Spark SQL UDF that has 4 arguments.
UDF5<T1,T2,T3,T4,T5,R> - Interface in org.apache.spark.sql.api.java
A Spark SQL UDF that has 5 arguments.
UDF6<T1,T2,T3,T4,T5,T6,R> - Interface in org.apache.spark.sql.api.java
A Spark SQL UDF that has 6 arguments.
UDF7<T1,T2,T3,T4,T5,T6,T7,R> - Interface in org.apache.spark.sql.api.java
A Spark SQL UDF that has 7 arguments.
UDF8<T1,T2,T3,T4,T5,T6,T7,T8,R> - Interface in org.apache.spark.sql.api.java
A Spark SQL UDF that has 8 arguments.
UDF9<T1,T2,T3,T4,T5,T6,T7,T8,T9,R> - Interface in org.apache.spark.sql.api.java
A Spark SQL UDF that has 9 arguments.
UDFRegistration - Interface in org.apache.spark.sql.api.java
A collection of functions that allow Java users to register UDFs.
UDFRegistration - Interface in org.apache.spark.sql
Functions for registering scala lambda functions as UDFs in a SQLContext.
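
A sketch of Scala-side registration (registerFunction is the 1.x name); the UDF1 through UDF22 interfaces above play the same role for Java callers, and the `people` table is assumed registered elsewhere:

    val sqlContext = new org.apache.spark.sql.SQLContext(sc)
    sqlContext.registerFunction("strLen", (s: String) => s.length)
    sqlContext.sql("SELECT strLen(name) FROM people")
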
UDTWrappers - Class in org.apache.spark.sql.api.java
 
UDTWrappers() - Constructor for class org.apache.spark.sql.api.java.UDTWrappers
 
ui() - Method in class org.apache.spark.SparkContext
 
uid() - Method in interface org.apache.spark.ml.Identifiable
A unique id for the object.
UIData - Class in org.apache.spark.ui.jobs
 
UIData() - Constructor for class org.apache.spark.ui.jobs.UIData
 
UIData.ExecutorSummary - Class in org.apache.spark.ui.jobs
 
UIData.ExecutorSummary() - Constructor for class org.apache.spark.ui.jobs.UIData.ExecutorSummary
 
UIData.JobUIData - Class in org.apache.spark.ui.jobs
 
UIData.JobUIData(int, Option<Object>, Option<Object>, Seq<Object>, Option<String>, JobExecutionStatus, int, int, int, int, int, int, OpenHashSet<Object>, int, int) - Constructor for class org.apache.spark.ui.jobs.UIData.JobUIData
 
UIData.JobUIData$ - Class in org.apache.spark.ui.jobs
 
UIData.JobUIData$() - Constructor for class org.apache.spark.ui.jobs.UIData.JobUIData$
 
UIData.StageUIData - Class in org.apache.spark.ui.jobs
 
UIData.StageUIData() - Constructor for class org.apache.spark.ui.jobs.UIData.StageUIData
 
UIData.TaskUIData - Class in org.apache.spark.ui.jobs
These are kept mutable and reused throughout a task's lifetime to avoid excessive reallocation.
UIData.TaskUIData(TaskInfo, Option<TaskMetrics>, Option<String>) - Constructor for class org.apache.spark.ui.jobs.UIData.TaskUIData
 
UIData.TaskUIData$ - Class in org.apache.spark.ui.jobs
 
UIData.TaskUIData$() - Constructor for class org.apache.spark.ui.jobs.UIData.TaskUIData$
 
uiRoot() - Static method in class org.apache.spark.ui.UIUtils
 
uiTab() - Method in class org.apache.spark.streaming.StreamingContext
 
UIUtils - Class in org.apache.spark.ui
Utility functions for generating XML pages with spark content.
UIUtils() - Constructor for class org.apache.spark.ui.UIUtils
 
UIWorkloadGenerator - Class in org.apache.spark.ui
Continuously generates jobs that expose various features of the WebUI (internal testing tool).
UIWorkloadGenerator() - Constructor for class org.apache.spark.ui.UIWorkloadGenerator
 
unapply(Object) - Method in class org.apache.spark.sql.hive.HiveQl.Token$
 
unapply(String) - Static method in class org.apache.spark.util.IntParam
 
unapply(String) - Static method in class org.apache.spark.util.MemoryParam
 
UnaryNode - Interface in org.apache.spark.sql.execution
 
UnaryTransformer<IN,OUT,T extends UnaryTransformer<IN,OUT,T>> - Class in org.apache.spark.ml
Abstract class for transformers that take one input column, apply transformation, and output the result as a new column.
UnaryTransformer() - Constructor for class org.apache.spark.ml.UnaryTransformer
 
unBlockifyObject(ByteBuffer[], Serializer, Option<CompressionCodec>, ClassTag<T>) - Static method in class org.apache.spark.broadcast.TorrentBroadcast
 
unbound() - Method in class org.apache.spark.sql.execution.Aggregate.ComputedAggregate
 
unbroadcast(long, boolean, boolean) - Method in interface org.apache.spark.broadcast.BroadcastFactory
 
unbroadcast(long, boolean, boolean) - Method in class org.apache.spark.broadcast.BroadcastManager
 
unbroadcast(long, boolean, boolean) - Method in class org.apache.spark.broadcast.HttpBroadcastFactory
Remove all persisted state associated with the HTTP broadcast with the given ID.
unbroadcast(long, boolean, boolean) - Method in class org.apache.spark.broadcast.TorrentBroadcastFactory
Remove all persisted state associated with the torrent broadcast with the given ID.
uncacheQuery(SchemaRDD, boolean) - Method in interface org.apache.spark.sql.CacheManager
Removes the data for the given SchemaRDD from the cache
uncacheTable(String) - Method in interface org.apache.spark.sql.CacheManager
Removes the specified table from the in-memory cache.
UncacheTableCommand - Class in org.apache.spark.sql.execution
:: DeveloperApi ::
UncacheTableCommand(String) - Constructor for class org.apache.spark.sql.execution.UncacheTableCommand
 
UNCAUGHT_EXCEPTION() - Static method in class org.apache.spark.util.SparkExitCode
The default uncaught exception handler was reached.
UNCAUGHT_EXCEPTION_TWICE() - Static method in class org.apache.spark.util.SparkExitCode
The default uncaught exception handler was called and an exception was encountered while logging the exception.
uncaughtException(Thread, Throwable) - Static method in class org.apache.spark.util.SparkUncaughtExceptionHandler
 
uncaughtException(Throwable) - Static method in class org.apache.spark.util.SparkUncaughtExceptionHandler
 
uncompressedSize() - Method in class org.apache.spark.sql.columnar.compression.BooleanBitSet.Encoder
 
uncompressedSize() - Method in class org.apache.spark.sql.columnar.compression.DictionaryEncoding.Encoder
 
uncompressedSize() - Method in interface org.apache.spark.sql.columnar.compression.Encoder
 
uncompressedSize() - Method in class org.apache.spark.sql.columnar.compression.IntDelta.Encoder
 
uncompressedSize() - Method in class org.apache.spark.sql.columnar.compression.LongDelta.Encoder
 
uncompressedSize() - Method in class org.apache.spark.sql.columnar.compression.PassThrough.Encoder
 
uncompressedSize() - Method in class org.apache.spark.sql.columnar.compression.RunLengthEncoding.Encoder
 
underlyingBuffer() - Method in interface org.apache.spark.sql.columnar.ColumnAccessor
 
underlyingSplit() - Method in class org.apache.spark.scheduler.SplitInfo
 
UniformGenerator - Class in org.apache.spark.mllib.random
:: DeveloperApi :: Generates i.i.d. samples from the uniform distribution U[0.0, 1.0].
UniformGenerator() - Constructor for class org.apache.spark.mllib.random.UniformGenerator
 
uniformJavaRDD(JavaSparkContext, long, int, long) - Static method in class org.apache.spark.mllib.random.RandomRDDs
Java-friendly version of RandomRDDs.uniformRDD(org.apache.spark.SparkContext, long, int, long).
uniformJavaRDD(JavaSparkContext, long, int) - Static method in class org.apache.spark.mllib.random.RandomRDDs
RandomRDDs.uniformJavaRDD(org.apache.spark.api.java.JavaSparkContext, long, int, long) with the default seed.
uniformJavaRDD(JavaSparkContext, long) - Static method in class org.apache.spark.mllib.random.RandomRDDs
RandomRDDs.uniformJavaRDD(org.apache.spark.api.java.JavaSparkContext, long, int, long) with the default number of partitions and the default seed.
uniformJavaVectorRDD(JavaSparkContext, long, int, int, long) - Static method in class org.apache.spark.mllib.random.RandomRDDs
Java-friendly version of RandomRDDs.uniformVectorRDD(org.apache.spark.SparkContext, long, int, int, long).
uniformJavaVectorRDD(JavaSparkContext, long, int, int) - Static method in class org.apache.spark.mllib.random.RandomRDDs
RandomRDDs.uniformJavaVectorRDD(org.apache.spark.api.java.JavaSparkContext, long, int, int, long) with the default seed.
uniformJavaVectorRDD(JavaSparkContext, long, int) - Static method in class org.apache.spark.mllib.random.RandomRDDs
RandomRDDs.uniformJavaVectorRDD(org.apache.spark.api.java.JavaSparkContext, long, int, int, long) with the default number of partitions and the default seed.
uniformRDD(SparkContext, long, int, long) - Static method in class org.apache.spark.mllib.random.RandomRDDs
Generates an RDD comprised of i.i.d. samples from the uniform distribution U(0.0, 1.0).
uniformVectorRDD(SparkContext, long, int, int, long) - Static method in class org.apache.spark.mllib.random.RandomRDDs
Generates an RDD[Vector] with vectors containing i.i.d. samples drawn from the uniform distribution U(0.0, 1.0).
union(JavaDoubleRDD) - Method in class org.apache.spark.api.java.JavaDoubleRDD
Return the union of this RDD and another one.
union(JavaPairRDD<K, V>) - Method in class org.apache.spark.api.java.JavaPairRDD
Return the union of this RDD and another one.
union(JavaRDD<T>) - Method in class org.apache.spark.api.java.JavaRDD
Return the union of this RDD and another one.
union(JavaRDD<T>, List<JavaRDD<T>>) - Method in class org.apache.spark.api.java.JavaSparkContext
Build the union of two or more RDDs.
union(JavaPairRDD<K, V>, List<JavaPairRDD<K, V>>) - Method in class org.apache.spark.api.java.JavaSparkContext
Build the union of two or more RDDs.
union(JavaDoubleRDD, List<JavaDoubleRDD>) - Method in class org.apache.spark.api.java.JavaSparkContext
Build the union of two or more RDDs.
union(RDD<T>) - Method in class org.apache.spark.rdd.RDD
Return the union of this RDD and another one.
union(Seq<RDD<T>>, ClassTag<T>) - Method in class org.apache.spark.SparkContext
Build the union of a list of RDDs.
union(RDD<T>, Seq<RDD<T>>, ClassTag<T>) - Method in class org.apache.spark.SparkContext
Build the union of a list of RDDs passed as variable-length arguments.
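
A sketch of multi-RDD union; sc.union builds a single UnionRDD over all inputs, which is cheaper than chaining pairwise rdd1.union(rdd2).union(rdd3) calls:

    import org.apache.spark.rdd.RDD

    val parts: Seq[RDD[Int]] = (1 to 4).map(i => sc.parallelize(Seq(i, i * 10)))
    val all = sc.union(parts)   // one RDD holding all eight elements
    all.count()                 // 8
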
Union - Class in org.apache.spark.sql.execution
:: DeveloperApi ::
Union(Seq<SparkPlan>) - Constructor for class org.apache.spark.sql.execution.Union
 
union(JavaDStream<T>) - Method in class org.apache.spark.streaming.api.java.JavaDStream
Return a new DStream by unifying data of another DStream with this DStream.
union(JavaPairDStream<K, V>) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
Return a new DStream by unifying data of another DStream with this DStream.
union(JavaDStream<T>, List<JavaDStream<T>>) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
Create a unified DStream from multiple DStreams of the same type and same slide duration.
union(JavaPairDStream<K, V>, List<JavaPairDStream<K, V>>) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
Create a unified DStream from multiple DStreams of the same type and same slide duration.
union(DStream<T>) - Method in class org.apache.spark.streaming.dstream.DStream
Return a new DStream by unifying data of another DStream with this DStream.
union(Seq<DStream<T>>, ClassTag<T>) - Method in class org.apache.spark.streaming.StreamingContext
Create a unified DStream from multiple DStreams of the same type and same slide duration.
unionAll(SchemaRDD) - Method in class org.apache.spark.sql.SchemaRDD
Combines the tuples of two RDDs with the same schema, keeping duplicates.
UnionDStream<T> - Class in org.apache.spark.streaming.dstream
 
UnionDStream(DStream<T>[], ClassTag<T>) - Constructor for class org.apache.spark.streaming.dstream.UnionDStream
 
UnionPartition<T> - Class in org.apache.spark.rdd
Partition for UnionRDD.
UnionPartition(int, RDD<T>, int, int, ClassTag<T>) - Constructor for class org.apache.spark.rdd.UnionPartition
 
UnionRDD<T> - Class in org.apache.spark.rdd
 
UnionRDD(SparkContext, Seq<RDD<T>>, ClassTag<T>) - Constructor for class org.apache.spark.rdd.UnionRDD
 
uniqueId() - Method in class org.apache.spark.storage.StreamBlockId
 
UniqueKeyHashedRelation - Class in org.apache.spark.sql.execution.joins
A specialized HashedRelation that maps a key to a single value.
UniqueKeyHashedRelation(HashMap<Row, Row>) - Constructor for class org.apache.spark.sql.execution.joins.UniqueKeyHashedRelation
 
UnknownReason - Class in org.apache.spark
:: DeveloperApi :: We don't know why the task ended -- for example, because of a ClassNotFound exception when deserializing the task result.
UnknownReason() - Constructor for class org.apache.spark.UnknownReason
 
unorderedFeatures() - Method in class org.apache.spark.mllib.tree.impl.DecisionTreeMetadata
 
unpersist() - Method in class org.apache.spark.api.java.JavaDoubleRDD
Mark the RDD as non-persistent, and remove all blocks for it from memory and disk.
unpersist(boolean) - Method in class org.apache.spark.api.java.JavaDoubleRDD
Mark the RDD as non-persistent, and remove all blocks for it from memory and disk.
unpersist() - Method in class org.apache.spark.api.java.JavaPairRDD
Mark the RDD as non-persistent, and remove all blocks for it from memory and disk.
unpersist(boolean) - Method in class org.apache.spark.api.java.JavaPairRDD
Mark the RDD as non-persistent, and remove all blocks for it from memory and disk.
unpersist() - Method in class org.apache.spark.api.java.JavaRDD
Mark the RDD as non-persistent, and remove all blocks for it from memory and disk.
unpersist(boolean) - Method in class org.apache.spark.api.java.JavaRDD
Mark the RDD as non-persistent, and remove all blocks for it from memory and disk.
unpersist() - Method in class org.apache.spark.broadcast.Broadcast
Asynchronously delete cached copies of this broadcast on the executors.
unpersist(boolean) - Method in class org.apache.spark.broadcast.Broadcast
Delete cached copies of this broadcast on the executors.
unpersist(long, boolean, boolean) - Static method in class org.apache.spark.broadcast.HttpBroadcast
Remove all persisted blocks associated with this HTTP broadcast on the executors.
unpersist(long, boolean, boolean) - Static method in class org.apache.spark.broadcast.TorrentBroadcast
Remove all persisted blocks associated with this torrent broadcast on the executors.
unpersist(boolean) - Method in class org.apache.spark.graphx.Graph
Uncaches both vertices and edges of this graph.
unpersist(boolean) - Method in class org.apache.spark.graphx.impl.EdgeRDDImpl
 
unpersist(boolean) - Method in class org.apache.spark.graphx.impl.GraphImpl
 
unpersist(boolean) - Method in class org.apache.spark.graphx.impl.VertexRDDImpl
 
unpersist() - Method in class org.apache.spark.mllib.evaluation.BinaryClassificationMetrics
Unpersist intermediate RDDs used in the computation.
unpersist(boolean) - Method in class org.apache.spark.rdd.RDD
Mark the RDD as non-persistent, and remove all blocks for it from memory and disk.
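
A sketch of the cache lifecycle; the boolean controls whether the call blocks until every block is actually removed:

    val cached = sc.textFile("data/big.txt").cache()
    cached.count()                     // materializes the cached blocks
    cached.unpersist(blocking = true)  // synchronously free memory and disk
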
unpersist(boolean) - Method in class org.apache.spark.sql.api.java.JavaSchemaRDD
Mark the RDD as non-persistent, and remove all blocks for it from memory and disk.
unpersist(boolean) - Method in class org.apache.spark.sql.SchemaRDD
 
unpersistRDD(int, boolean) - Method in class org.apache.spark.SparkContext
Unpersist an RDD from memory and/or disk storage
unpersistRDDFromJson(JsonAST.JValue) - Static method in class org.apache.spark.util.JsonProtocol
 
unpersistRDDToJson(SparkListenerUnpersistRDD) - Static method in class org.apache.spark.util.JsonProtocol
 
unpersistVertices(boolean) - Method in class org.apache.spark.graphx.Graph
Uncaches only the vertices of this graph, leaving the edges alone.
unpersistVertices(boolean) - Method in class org.apache.spark.graphx.impl.GraphImpl
 
unregisterAllTables() - Method in class org.apache.spark.sql.hive.HiveMetastoreCatalog
 
unregisterMapOutput(int, int, BlockManagerId) - Method in class org.apache.spark.MapOutputTrackerMaster
Unregister map output information of the given shuffle, mapper and block manager
unregisterShuffle(int) - Method in class org.apache.spark.MapOutputTracker
Unregister shuffle data.
unregisterShuffle(int) - Method in class org.apache.spark.MapOutputTrackerMaster
Unregister shuffle data
unregisterTable(Seq<String>) - Method in class org.apache.spark.sql.hive.HiveMetastoreCatalog
UNIMPLEMENTED: It needs to be decided how we will persist in-memory tables to the metastore.
unrollSafely(BlockId, Iterator<Object>, ArrayBuffer<Tuple2<BlockId, BlockStatus>>) - Method in class org.apache.spark.storage.MemoryStore
Unroll the given block in memory safely.
unset() - Static method in class org.apache.spark.TaskContextHelper
 
until(Time, Duration) - Method in class org.apache.spark.streaming.Time
 
unwrap(Object, ObjectInspector) - Method in interface org.apache.spark.sql.hive.HiveInspectors
Converts hive types to native catalyst types.
update(RDD<Vector>, double, String) - Method in class org.apache.spark.mllib.clustering.StreamingKMeansModel
Perform a k-means update on a batch of data.
update(int, int, double) - Method in class org.apache.spark.mllib.linalg.DenseMatrix
 
update(int, int, double) - Method in interface org.apache.spark.mllib.linalg.Matrix
Update the element at (i, j).
update(int, int, double) - Method in class org.apache.spark.mllib.linalg.SparseMatrix
 
update(int, int, double, double) - Method in class org.apache.spark.mllib.tree.impl.DTStatsAggregator
Update the stats for a given (feature, bin) for ordered features, using the given label.
update(double[], int, double, double) - Method in class org.apache.spark.mllib.tree.impurity.EntropyAggregator
Update stats for one (node, feature, bin) with the given label.
update(double[], int, double, double) - Method in class org.apache.spark.mllib.tree.impurity.GiniAggregator
Update stats for one (node, feature, bin) with the given label.
update(double[], int, double, double) - Method in class org.apache.spark.mllib.tree.impurity.ImpurityAggregator
Update stats for one (node, feature, bin) with the given label.
update(double[], int, double, double) - Method in class org.apache.spark.mllib.tree.impurity.VarianceAggregator
Update stats for one (node, feature, bin) with the given label.
update() - Method in class org.apache.spark.scheduler.AccumulableInfo
 
update() - Method in class org.apache.spark.sql.execution.AggregateEvaluation
 
update(Row) - Method in class org.apache.spark.sql.hive.HiveUdafFunction
 
update(Time) - Method in class org.apache.spark.streaming.dstream.DStreamCheckpointData
Updates the checkpoint data of the DStream.
update(Time) - Method in class org.apache.spark.streaming.dstream.FileInputDStream.FileInputDStreamCheckpointData
 
update(T1, T2) - Method in class org.apache.spark.util.MutablePair
Updates this pair with new values and returns itself
update(A, B) - Method in class org.apache.spark.util.TimeStampedHashMap
 
update(A, B) - Method in class org.apache.spark.util.TimeStampedWeakValueHashMap
 
UPDATE_PERIOD() - Method in class org.apache.spark.ui.ConsoleProgressBar
 
updateAggregateMetrics(UIData.StageUIData, String, TaskMetrics, Option<TaskMetrics>) - Method in class org.apache.spark.ui.jobs.JobProgressListener
Upon receiving new metrics for a task, updates the per-stage and per-executor-per-stage aggregate metrics by calculating deltas between the currently recorded metrics and the new metrics.
updateBlock(BlockId, BlockStatus) - Method in class org.apache.spark.storage.StorageStatus
Update the given block in this storage status.
updateBlockInfo(BlockId, StorageLevel, long, long, long) - Method in class org.apache.spark.storage.BlockManagerInfo
 
updateBlockInfo(BlockManagerId, BlockId, StorageLevel, long, long, long) - Method in class org.apache.spark.storage.BlockManagerMaster
 
updateCheckpointData(Time) - Method in class org.apache.spark.streaming.dstream.DStream
Refresh the list of checkpointed RDDs that will be saved along with checkpoint of this stream.
updateCheckpointData(Time) - Method in class org.apache.spark.streaming.DStreamGraph
 
updatedConf(SparkConf, String, String, String, Seq<String>, Map<String, String>) - Static method in class org.apache.spark.SparkContext
Creates a modified version of a SparkConf with the parameters that can be passed separately to SparkContext, to make it easier to write SparkContext's constructors.
updateEpoch(long) - Method in class org.apache.spark.MapOutputTracker
Called from executors to update the epoch number, potentially clearing old outputs because of a fetch failure.
updateLastSeenMs() - Method in class org.apache.spark.storage.BlockManagerInfo
 
updateNodeIndex(int[], Bin[][]) - Method in class org.apache.spark.mllib.tree.impl.NodeIndexUpdater
Determine a child node index based on the feature value and the split.
updateNodeIndices(RDD<BaggedPoint<TreePoint>>, Map<Object, NodeIndexUpdater>[], Bin[][]) - Method in class org.apache.spark.mllib.tree.impl.NodeIdCache
Update the node index values in the cache.
Updater - Class in org.apache.spark.mllib.optimization
:: DeveloperApi :: Class used to perform steps (weight update) using Gradient Descent methods.
Updater() - Constructor for class org.apache.spark.mllib.optimization.Updater
 
updateRddInfo(Seq<RDDInfo>, Seq<StorageStatus>) - Static method in class org.apache.spark.storage.StorageUtils
Update the given list of RDDInfo with the given list of storage statuses.
updateStateByKey(Function2<List<V>, Optional<S>, Optional<S>>) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
Return a new "state" DStream where the state for each key is updated by applying the given function on the previous state of the key and the new values of each key.
updateStateByKey(Function2<List<V>, Optional<S>, Optional<S>>, int) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
Return a new "state" DStream where the state for each key is updated by applying the given function on the previous state of the key and the new values of each key.
updateStateByKey(Function2<List<V>, Optional<S>, Optional<S>>, Partitioner) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
Return a new "state" DStream where the state for each key is updated by applying the given function on the previous state of the key and the new values of the key.
updateStateByKey(Function2<Seq<V>, Option<S>, Option<S>>, ClassTag<S>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
Return a new "state" DStream where the state for each key is updated by applying the given function on the previous state of the key and the new values of each key.
updateStateByKey(Function2<Seq<V>, Option<S>, Option<S>>, int, ClassTag<S>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
Return a new "state" DStream where the state for each key is updated by applying the given function on the previous state of the key and the new values of each key.
updateStateByKey(Function2<Seq<V>, Option<S>, Option<S>>, Partitioner, ClassTag<S>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
Return a new "state" DStream where the state for each key is updated by applying the given function on the previous state of the key and the new values of the key.
updateStateByKey(Function1<Iterator<Tuple3<K, Seq<V>, Option<S>>>, Iterator<Tuple2<K, S>>>, Partitioner, boolean, ClassTag<S>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
Return a new "state" DStream where the state for each key is updated by applying the given function on the previous state of the key and the new values of each key.
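
A sketch of stateful word counting with updateStateByKey, assuming `ssc` and a hypothetical DStream[String] `words`; stateful operators require a checkpoint directory:

    import org.apache.spark.streaming.StreamingContext._   // pair DStream implicits

    ssc.checkpoint("checkpoint/")
    val counts = words.map((_, 1)).updateStateByKey[Int] {
      (newValues: Seq[Int], state: Option[Int]) =>
        Some(state.getOrElse(0) + newValues.sum)
    }
    counts.print()
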
updateVertices(Iterator<Tuple2<Object, VD>>) - Method in class org.apache.spark.graphx.impl.EdgePartition
Return a new `EdgePartition` with updates to vertex attributes specified in `iter`.
updateVertices(VertexRDD<VD>) - Method in class org.apache.spark.graphx.impl.ReplicatedVertexView
Return a new ReplicatedVertexView where vertex attributes in edge partition are updated using updates.
upgrade(VertexRDD<VD>, boolean, boolean) - Method in class org.apache.spark.graphx.impl.ReplicatedVertexView
Upgrade the shipping level in-place to the specified levels by shipping vertex attributes from vertices.
upper() - Method in class org.apache.spark.rdd.JdbcPartition
 
UPPER() - Static method in class org.apache.spark.sql.hive.HiveQl
 
upperBound() - Method in class org.apache.spark.sql.columnar.ColumnStatisticsSchema
 
uri() - Method in class org.apache.spark.HttpServer
Get the URI of this HTTP server (http://host:port).
useCachedData(LogicalPlan) - Method in interface org.apache.spark.sql.CacheManager
Replaces segments of the given logical plan with cached versions where possible.
useCompression() - Method in class org.apache.spark.sql.columnar.InMemoryRelation
 
useCompression() - Method in interface org.apache.spark.sql.SQLConf
When true, tables cached using in-memory columnar caching will be compressed.
useDisk() - Method in class org.apache.spark.storage.StorageLevel
 
useDst - Variable in class org.apache.spark.graphx.TripletFields
Indicates whether the destination vertex attribute is included.
useEdge - Variable in class org.apache.spark.graphx.TripletFields
Indicates whether the edge attribute is included.
useMemory() - Method in class org.apache.spark.storage.StorageLevel
 
useNodeIdCache() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
 
useOffHeap() - Method in class org.apache.spark.storage.StorageLevel
 
user() - Method in class org.apache.spark.mllib.recommendation.Rating
 
user() - Method in class org.apache.spark.scheduler.JobLogger
 
userClass() - Method in class org.apache.spark.mllib.linalg.VectorUDT
 
userClass() - Method in class org.apache.spark.sql.api.java.JavaToScalaUDTWrapper
 
userClass() - Method in class org.apache.spark.sql.api.java.ScalaToJavaUDTWrapper
 
userClass() - Method in class org.apache.spark.sql.api.java.UserDefinedType
Class object for the UserType
userClass() - Method in class org.apache.spark.sql.test.ExamplePointUDT
 
UserDefinedType<UserType> - Class in org.apache.spark.sql.api.java
:: DeveloperApi :: The data type representing User-Defined Types (UDTs).
userFeatures() - Method in class org.apache.spark.mllib.recommendation.MatrixFactorizationModel
 
useSrc - Variable in class org.apache.spark.graphx.TripletFields
Indicates whether the source vertex attribute is included.
Utils - Class in org.apache.spark.util
Various utility methods used by Spark.
Utils() - Constructor for class org.apache.spark.util.Utils
 
UUIDFromJson(JsonAST.JValue) - Static method in class org.apache.spark.util.JsonProtocol
 
UUIDToJson(UUID) - Static method in class org.apache.spark.util.JsonProtocol
 

V

V() - Method in class org.apache.spark.mllib.linalg.SingularValueDecomposition
 
validate(ParamMap) - Method in interface org.apache.spark.ml.param.Params
Validates parameter values stored internally plus the input parameter map.
validate() - Method in interface org.apache.spark.ml.param.Params
Validates parameter values stored internally.
validate() - Method in class org.apache.spark.streaming.Checkpoint
 
validate() - Method in class org.apache.spark.streaming.dstream.DStream
 
validate() - Method in class org.apache.spark.streaming.DStreamGraph
 
validateAndTransformSchema(StructType, ParamMap, boolean) - Method in interface org.apache.spark.ml.classification.LogisticRegressionParams
Validates and transforms the input schema with the provided param map.
validateSettings() - Method in class org.apache.spark.SparkConf
Checks for illegal or deprecated config settings.
value() - Method in class org.apache.spark.Accumulable
Access the accumulator's current value; only allowed on master.
value() - Method in class org.apache.spark.broadcast.Broadcast
Get the broadcasted value.
value() - Method in class org.apache.spark.ComplexFutureAction
 
value() - Method in interface org.apache.spark.FutureAction
The value of this Future.
value() - Method in class org.apache.spark.ml.param.ParamPair
 
value() - Method in class org.apache.spark.mllib.linalg.distributed.MatrixEntry
 
value() - Method in class org.apache.spark.scheduler.AccumulableInfo
 
value() - Method in class org.apache.spark.scheduler.DirectTaskResult
 
value() - Method in class org.apache.spark.SerializableWritable
 
value() - Method in class org.apache.spark.SimpleFutureAction
 
value() - Method in class org.apache.spark.sql.sources.EqualTo
 
value() - Method in class org.apache.spark.sql.sources.GreaterThan
 
value() - Method in class org.apache.spark.sql.sources.GreaterThanOrEqual
 
value() - Method in class org.apache.spark.sql.sources.LessThan
 
value() - Method in class org.apache.spark.sql.sources.LessThanOrEqual
 
value() - Method in class org.apache.spark.storage.MemoryEntry
 
value() - Method in class org.apache.spark.util.SerializableBuffer
 
value() - Method in class org.apache.spark.util.TimeStampedValue
 
value_() - Method in class org.apache.spark.broadcast.HttpBroadcast
 
valueBytes() - Method in class org.apache.spark.scheduler.DirectTaskResult
 
valueClass() - Method in class org.apache.spark.rdd.PairRDDFunctions
 
valueOf(String) - Static method in enum org.apache.spark.graphx.impl.EdgeActiveness
Returns the enum constant of this type with the specified name.
valueOf(String) - Static method in enum org.apache.spark.JobExecutionStatus
Returns the enum constant of this type with the specified name.
values() - Static method in class org.apache.spark.Accumulators
 
values() - Method in class org.apache.spark.api.java.JavaPairRDD
Return an RDD with the values of each tuple.
values() - Static method in enum org.apache.spark.graphx.impl.EdgeActiveness
Returns an array containing the constants of this enum type, in the order they are declared.
values() - Method in class org.apache.spark.graphx.impl.ShippableVertexPartition
 
values() - Method in class org.apache.spark.graphx.impl.VertexPartition
 
values() - Method in class org.apache.spark.graphx.impl.VertexPartitionBase
 
values() - Static method in enum org.apache.spark.JobExecutionStatus
Returns an array containing the constants of this enum type, in the order they are declared.
values() - Method in class org.apache.spark.mllib.linalg.DenseMatrix
 
values() - Method in class org.apache.spark.mllib.linalg.DenseVector
 
values() - Method in class org.apache.spark.mllib.linalg.SparseMatrix
 
values() - Method in class org.apache.spark.mllib.linalg.SparseVector
 
values() - Method in class org.apache.spark.rdd.PairRDDFunctions
Return an RDD with the values of each tuple.
values() - Method in class org.apache.spark.rdd.ParallelCollectionPartition
 
values() - Method in class org.apache.spark.sql.sources.In
 
variance() - Method in class org.apache.spark.api.java.JavaDoubleRDD
Compute the variance of this RDD's elements.
variance() - Method in class org.apache.spark.mllib.feature.StandardScalerModel
 
variance() - Method in class org.apache.spark.mllib.stat.MultivariateOnlineSummarizer
 
variance() - Method in interface org.apache.spark.mllib.stat.MultivariateStatisticalSummary
Sample variance vector.
Variance - Class in org.apache.spark.mllib.tree.impurity
:: Experimental :: Class for calculating variance during regression.
Variance() - Constructor for class org.apache.spark.mllib.tree.impurity.Variance
 
variance() - Method in class org.apache.spark.rdd.DoubleRDDFunctions
Compute the variance of this RDD's elements.
variance() - Method in class org.apache.spark.util.StatCounter
Return the variance of the values.
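
A sketch of the one-pass statistics these methods expose; the conversion to DoubleRDDFunctions is implicit via SparkContext._ in Spark 1.x:

    import org.apache.spark.SparkContext._

    val xs = sc.parallelize(Seq(1.0, 2.0, 3.0, 4.0))
    xs.variance()           // population variance: 1.25
    val stats = xs.stats()  // full StatCounter in one pass
    stats.sampleVariance    // unbiased estimate: 5.0 / 3
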
VarianceAggregator - Class in org.apache.spark.mllib.tree.impurity
Class for updating views of a vector of sufficient statistics, in order to compute impurity from a sample.
VarianceAggregator() - Constructor for class org.apache.spark.mllib.tree.impurity.VarianceAggregator
 
VarianceCalculator - Class in org.apache.spark.mllib.tree.impurity
Stores statistics for one (node, feature, bin) for calculating impurity.
VarianceCalculator(double[]) - Constructor for class org.apache.spark.mllib.tree.impurity.VarianceCalculator
 
vClassTag() - Method in class org.apache.spark.api.java.JavaHadoopRDD
 
vClassTag() - Method in class org.apache.spark.api.java.JavaNewHadoopRDD
 
vClassTag() - Method in class org.apache.spark.api.java.JavaPairRDD
 
vClassTag() - Method in class org.apache.spark.streaming.api.java.JavaPairInputDStream
 
vClassTag() - Method in class org.apache.spark.streaming.api.java.JavaPairReceiverInputDStream
 
vector() - Method in class org.apache.spark.mllib.clustering.VectorWithNorm
 
vector() - Method in class org.apache.spark.mllib.linalg.distributed.IndexedRow
 
Vector - Interface in org.apache.spark.mllib.linalg
Represents a numeric vector, whose index type is Int and value type is Double.
Vector - Class in org.apache.spark.util
 
Vector(double[]) - Constructor for class org.apache.spark.util.Vector
 
Vector.Multiplier - Class in org.apache.spark.util
 
Vector.Multiplier(double) - Constructor for class org.apache.spark.util.Vector.Multiplier
 
Vector.VectorAccumParam$ - Class in org.apache.spark.util
 
Vector.VectorAccumParam$() - Constructor for class org.apache.spark.util.Vector.VectorAccumParam$
 
Vectors - Class in org.apache.spark.mllib.linalg
 
Vectors() - Constructor for class org.apache.spark.mllib.linalg.Vectors
 
VectorTransformer - Interface in org.apache.spark.mllib.feature
:: DeveloperApi :: Trait for transformation of a vector.
VectorUDT - Class in org.apache.spark.mllib.linalg
User-defined type for Vector which allows easy interaction with SQL via SchemaRDD.
VectorUDT() - Constructor for class org.apache.spark.mllib.linalg.VectorUDT
 
VectorWithNorm - Class in org.apache.spark.mllib.clustering
A vector with its norm for fast distance computation.
VectorWithNorm(Vector, double) - Constructor for class org.apache.spark.mllib.clustering.VectorWithNorm
 
VectorWithNorm(Vector) - Constructor for class org.apache.spark.mllib.clustering.VectorWithNorm
 
VectorWithNorm(double[]) - Constructor for class org.apache.spark.mllib.clustering.VectorWithNorm
 
version() - Method in class org.apache.spark.api.java.JavaSparkContext
The version of Spark on which this application is running.
version() - Method in class org.apache.spark.SparkContext
The version of Spark on which this application is running.
version() - Static method in class org.apache.spark.sql.hive.HiveShim
 
vertexAttr(long) - Method in class org.apache.spark.graphx.EdgeTriplet
Get the vertex object for the given vertex in the edge.
VertexAttributeBlock<VD> - Class in org.apache.spark.graphx.impl
Stores vertex attributes to ship to an edge partition.
VertexAttributeBlock(long[], Object, ClassTag<VD>) - Constructor for class org.apache.spark.graphx.impl.VertexAttributeBlock
 
VertexPartition<VD> - Class in org.apache.spark.graphx.impl
A map from vertex id to vertex attribute.
VertexPartition(OpenHashSet<Object>, Object, BitSet, ClassTag<VD>) - Constructor for class org.apache.spark.graphx.impl.VertexPartition
 
VertexPartition.VertexPartitionOpsConstructor$ - Class in org.apache.spark.graphx.impl
Implicit evidence that VertexPartition is a member of the VertexPartitionBaseOpsConstructor typeclass.
VertexPartition.VertexPartitionOpsConstructor$() - Constructor for class org.apache.spark.graphx.impl.VertexPartition.VertexPartitionOpsConstructor$
 
VertexPartitionBase<VD> - Class in org.apache.spark.graphx.impl
An abstract map from vertex id to vertex attribute.
VertexPartitionBase(ClassTag<VD>) - Constructor for class org.apache.spark.graphx.impl.VertexPartitionBase
 
VertexPartitionBaseOps<VD,Self extends VertexPartitionBase<Object>> - Class in org.apache.spark.graphx.impl
A class containing additional operations for subclasses of VertexPartitionBase that provide implicit evidence of membership in the VertexPartitionBaseOpsConstructor typeclass (for example, VertexPartition.VertexPartitionOpsConstructor).
VertexPartitionBaseOps(Self, ClassTag<VD>, VertexPartitionBaseOpsConstructor<Self>) - Constructor for class org.apache.spark.graphx.impl.VertexPartitionBaseOps
 
VertexPartitionBaseOpsConstructor<T extends VertexPartitionBase<Object>> - Interface in org.apache.spark.graphx.impl
A typeclass for subclasses of VertexPartitionBase representing the ability to wrap them in a VertexPartitionBaseOps.
VertexPartitionOps<VD> - Class in org.apache.spark.graphx.impl
 
VertexPartitionOps(VertexPartition<VD>, ClassTag<VD>) - Constructor for class org.apache.spark.graphx.impl.VertexPartitionOps
 
VertexRDD<VD> - Class in org.apache.spark.graphx
Extends RDD[(VertexId, VD)] by ensuring that there is only one entry for each vertex and by pre-indexing the entries for fast, efficient joins.
VertexRDD(SparkContext, Seq<Dependency<?>>) - Constructor for class org.apache.spark.graphx.VertexRDD
 
VertexRDDImpl<VD> - Class in org.apache.spark.graphx.impl
 
VertexRDDImpl(RDD<ShippableVertexPartition<VD>>, StorageLevel, ClassTag<VD>) - Constructor for class org.apache.spark.graphx.impl.VertexRDDImpl
 
vertices() - Method in class org.apache.spark.graphx.Graph
An RDD containing the vertices and their associated attributes.
vertices() - Method in class org.apache.spark.graphx.impl.GraphImpl
 
vids() - Method in class org.apache.spark.graphx.impl.VertexAttributeBlock
 
viewAcls() - Method in class org.apache.spark.scheduler.ApplicationEventListener
 
visit(int, int, String, String, String, String[]) - Method in class org.apache.spark.util.InnerClosureFinder
 
visitMethod(int, String, String, String, String[]) - Method in class org.apache.spark.util.FieldAccessFinder
 
visitMethod(int, String, String, String, String[]) - Method in class org.apache.spark.util.InnerClosureFinder
 
visitMethod(int, String, String, String, String[]) - Method in class org.apache.spark.util.ReturnStatementFinder
 
vManifest() - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
 
VocabWord - Class in org.apache.spark.mllib.feature
Entry in the vocabulary.
VocabWord(String, int, int[], int[], int) - Constructor for class org.apache.spark.mllib.feature.VocabWord
 
VoidFunction<T> - Interface in org.apache.spark.api.java.function
A function with no return value.
Vote() - Static method in class org.apache.spark.mllib.tree.configuration.EnsembleCombiningStrategy
 

W

w(boolean) - Method in class org.apache.spark.ml.param.BooleanParam
 
w(double) - Method in class org.apache.spark.ml.param.DoubleParam
 
w(float) - Method in class org.apache.spark.ml.param.FloatParam
 
w(int) - Method in class org.apache.spark.ml.param.IntParam
 
w(long) - Method in class org.apache.spark.ml.param.LongParam
 
w(T) - Method in class org.apache.spark.ml.param.Param
Creates a param pair with the given value (for Java).
waiter() - Method in class org.apache.spark.streaming.StreamingContext
 
waitForAsyncReregister() - Method in class org.apache.spark.storage.BlockManager
For testing.
waitForProcess(Process, long) - Static method in class org.apache.spark.util.Utils
Wait for a process to terminate for at most the specified duration.
waitForReady() - Method in class org.apache.spark.storage.BlockInfo
Wait for this BlockInfo to be marked as ready.
waitForRegister() - Method in class org.apache.spark.scheduler.cluster.mesos.CoarseMesosSchedulerBackend
 
waitForRegister() - Method in class org.apache.spark.scheduler.cluster.mesos.MesosSchedulerBackend
 
waitForStopOrError(long) - Method in class org.apache.spark.streaming.ContextWaiter
Returns true once the context has stopped, throws the reported error if notifyError has been called, or returns false if the waiting time elapsed before the method returned.
waitingBatches() - Method in class org.apache.spark.streaming.ui.StreamingJobProgressListener
 
waitingStages() - Method in class org.apache.spark.scheduler.DAGScheduler
 
waitList() - Method in class org.apache.spark.util.random.AcceptanceResult
 
waitListBound() - Method in class org.apache.spark.util.random.AcceptanceResult
 
waitTillTime(long) - Method in interface org.apache.spark.streaming.util.Clock
 
waitTillTime(long) - Method in class org.apache.spark.streaming.util.ManualClock
 
waitTillTime(long) - Method in class org.apache.spark.streaming.util.SystemClock
 
waitToPush() - Method in class org.apache.spark.streaming.receiver.RateLimiter
 
waitUntilEmpty(int) - Method in class org.apache.spark.scheduler.LiveListenerBus
For testing only.
waitUntilEmpty(int) - Method in class org.apache.spark.streaming.scheduler.StreamingListenerBus
Waits until there are no more events in the queue, or until the specified time has elapsed.
warehousePath() - Method in class org.apache.spark.sql.hive.LocalHiveContext
 
warehousePath() - Method in class org.apache.spark.sql.hive.test.TestHiveContext
 
warmUp(SparkContext) - Static method in class org.apache.spark.streaming.util.RawTextHelper
Warms up the SparkContext on the master and slaves by running tasks to force the JIT to kick in before the real workload starts.
WebUI - Class in org.apache.spark.ui
The top level component of the UI hierarchy that contains the server.
WebUI(SecurityManager, int, SparkConf, String, String) - Constructor for class org.apache.spark.ui.WebUI
 
WebUIPage - Class in org.apache.spark.ui
A page that represents the leaf node in the UI hierarchy.
WebUIPage(String) - Constructor for class org.apache.spark.ui.WebUIPage
 
WebUITab - Class in org.apache.spark.ui
A tab that represents a collection of pages.
WebUITab(WebUI, String) - Constructor for class org.apache.spark.ui.WebUITab
 
weight() - Method in class org.apache.spark.scheduler.Pool
 
weight() - Method in interface org.apache.spark.scheduler.Schedulable
 
weight() - Method in class org.apache.spark.scheduler.TaskSetManager
 
WEIGHT_PROPERTY() - Method in class org.apache.spark.scheduler.FairSchedulableBuilder
 
weightedFalsePositiveRate() - Method in class org.apache.spark.mllib.evaluation.MulticlassMetrics
Returns the weighted false positive rate.
weightedFMeasure(double) - Method in class org.apache.spark.mllib.evaluation.MulticlassMetrics
Returns the weighted averaged f-measure for the given beta.
weightedFMeasure() - Method in class org.apache.spark.mllib.evaluation.MulticlassMetrics
Returns the weighted averaged f1-measure.
weightedPrecision() - Method in class org.apache.spark.mllib.evaluation.MulticlassMetrics
Returns the weighted averaged precision.
weightedRecall() - Method in class org.apache.spark.mllib.evaluation.MulticlassMetrics
Returns the weighted averaged recall.
weightedTruePositiveRate() - Method in class org.apache.spark.mllib.evaluation.MulticlassMetrics
Returns the weighted true positive rate (identical to the weighted recall).
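
A minimal sketch of computing these weighted metrics from an RDD of (prediction, label) pairs; the pairs here are invented, and sc is an assumed SparkContext:

    import org.apache.spark.mllib.evaluation.MulticlassMetrics

    val predictionAndLabels = sc.parallelize(Seq((0.0, 0.0), (1.0, 1.0), (1.0, 0.0), (0.0, 1.0)))
    val metrics = new MulticlassMetrics(predictionAndLabels)
    println(metrics.weightedPrecision)
    println(metrics.weightedRecall)
    println(metrics.weightedFMeasure)
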
weights() - Method in class org.apache.spark.mllib.classification.LogisticRegressionModel
 
weights() - Method in class org.apache.spark.mllib.classification.SVMModel
 
weights() - Method in class org.apache.spark.mllib.regression.GeneralizedLinearModel
 
weights() - Method in class org.apache.spark.mllib.regression.LassoModel
 
weights() - Method in class org.apache.spark.mllib.regression.LinearRegressionModel
 
weights() - Method in class org.apache.spark.mllib.regression.RidgeRegressionModel
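
A sketch of reading weights() after training, assuming a pre-existing RDD of LabeledPoint training data:

    import org.apache.spark.mllib.regression.{LabeledPoint, LinearRegressionWithSGD}
    import org.apache.spark.rdd.RDD

    val training: RDD[LabeledPoint] = ??? // assumed to exist
    val model = LinearRegressionWithSGD.train(training, 100) // 100 iterations
    println(model.weights)   // the fitted weight vector
    println(model.intercept) // the fitted intercept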
 
WHEN() - Static method in class org.apache.spark.sql.hive.HiveQl
 
where(Expression) - Method in class org.apache.spark.sql.SchemaRDD
Filters the output, returning only those rows where the condition evaluates to true.
where(Symbol, Function1<T1, Object>) - Method in class org.apache.spark.sql.SchemaRDD
Filters tuples using a function over the value of the specified column.
where(Function1<DynamicRow, Object>) - Method in class org.apache.spark.sql.SchemaRDD
:: Experimental :: Filters tuples using a function over a Dynamic version of a given Row.
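
A sketch of the Symbol-based overload, assuming a SchemaRDD named people with an integer age column (both names are invented):

    // people is assumed to be an existing SchemaRDD.
    val adults = people.where('age)((age: Int) => age >= 18)
    adults.collect().foreach(println)
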
WholeCombineFileRecordReader - Class in org.apache.spark.input
A RecordReader for reading a whole text file as a single key-value pair, where the key is the file path and the value is the entire content of the file.
WholeCombineFileRecordReader(InputSplit, TaskAttemptContext) - Constructor for class org.apache.spark.input.WholeCombineFileRecordReader
 
WholeTextFileInputFormat - Class in org.apache.spark.input
A CombineFileInputFormat for reading whole text files.
WholeTextFileInputFormat() - Constructor for class org.apache.spark.input.WholeTextFileInputFormat
 
WholeTextFileRDD - Class in org.apache.spark.rdd
Analogous to MapPartitionsRDD, but passes in an InputSplit to the given function rather than the index of the partition.
WholeTextFileRDD(SparkContext, Class<? extends WholeTextFileInputFormat>, Class<String>, Class<String>, Configuration, int) - Constructor for class org.apache.spark.rdd.WholeTextFileRDD
 
WholeTextFileRecordReader - Class in org.apache.spark.input
A RecordReader for reading a whole text file as a single key-value pair, where the key is the file path and the value is the entire content of the file.
WholeTextFileRecordReader(CombineFileSplit, TaskAttemptContext, Integer) - Constructor for class org.apache.spark.input.WholeTextFileRecordReader
 
wholeTextFiles(String, int) - Method in class org.apache.spark.api.java.JavaSparkContext
Read a directory of text files from HDFS, a local file system (available on all nodes), or any Hadoop-supported file system URI.
wholeTextFiles(String) - Method in class org.apache.spark.api.java.JavaSparkContext
Read a directory of text files from HDFS, a local file system (available on all nodes), or any Hadoop-supported file system URI.
wholeTextFiles(String, int) - Method in class org.apache.spark.SparkContext
Read a directory of text files from HDFS, a local file system (available on all nodes), or any Hadoop-supported file system URI.
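
For example, assuming an existing SparkContext sc (the HDFS path below is a placeholder):

    // Each record is (filePath, fileContent).
    val files = sc.wholeTextFiles("hdfs://namenode:8020/data/logs")
    files.map { case (path, content) => (path, content.length) }
      .collect()
      .foreach(println)
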
window(Duration) - Method in class org.apache.spark.streaming.api.java.JavaDStream
Return a new DStream in which each RDD contains all the elements seen in a sliding window of time over this DStream.
window(Duration, Duration) - Method in class org.apache.spark.streaming.api.java.JavaDStream
Return a new DStream in which each RDD contains all the elements seen in a sliding window of time over this DStream.
window(Duration) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
Return a new DStream which is computed based on windowed batches of this DStream.
window(Duration, Duration) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
Return a new DStream which is computed based on windowed batches of this DStream.
window(Duration) - Method in class org.apache.spark.streaming.dstream.DStream
Return a new DStream in which each RDD contains all the elements seen in a sliding window of time over this DStream.
window(Duration, Duration) - Method in class org.apache.spark.streaming.dstream.DStream
Return a new DStream in which each RDD contains all the elements seen in a sliding window of time over this DStream.
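
A small sketch, assuming lines is an existing DStream[String]: counting elements over a 30-second window that slides every 10 seconds.

    import org.apache.spark.streaming.Seconds

    // Both durations must be multiples of the stream's batch interval.
    val windowed = lines.window(Seconds(30), Seconds(10))
    windowed.count().print()
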
windowDuration() - Method in class org.apache.spark.streaming.dstream.ReducedWindowedDStream
 
windowDuration() - Method in class org.apache.spark.streaming.dstream.WindowedDStream
 
WindowedDStream<T> - Class in org.apache.spark.streaming.dstream
 
WindowedDStream(DStream<T>, Duration, Duration, ClassTag<T>) - Constructor for class org.apache.spark.streaming.dstream.WindowedDStream
 
windowsDrive() - Static method in class org.apache.spark.util.Utils
Pattern for matching a Windows drive, which consists of a single alphabetic character.
windowSize() - Method in class org.apache.spark.mllib.rdd.SlidingRDD
 
wipe() - Method in class org.apache.spark.mllib.optimization.NNLS.Workspace
 
withActiveSet(Iterator<Object>) - Method in class org.apache.spark.graphx.impl.EdgePartition
Return a new `EdgePartition` with the specified active set, provided as an iterator.
withActiveSet(VertexRDD<?>) - Method in class org.apache.spark.graphx.impl.ReplicatedVertexView
Return a new ReplicatedVertexView where the activeSet in each edge partition contains only vertex ids present in actives.
withChildren(Seq<ASTNode>) - Method in class org.apache.spark.sql.hive.HiveQl.TransformableNode
Returns this ASTNode with the children changed to newChildren.
WithCompressionSchemes - Interface in org.apache.spark.sql.columnar.compression
 
withData(Object, ClassTag<ED2>) - Method in class org.apache.spark.graphx.impl.EdgePartition
Return a new `EdgePartition` with the specified edge data.
withEdges(EdgeRDDImpl<ED2, VD2>, ClassTag<VD2>, ClassTag<ED2>) - Method in class org.apache.spark.graphx.impl.ReplicatedVertexView
Return a new ReplicatedVertexView with the specified EdgeRDD, which must have the same shipping level.
withEdges(EdgeRDD<?>) - Method in class org.apache.spark.graphx.impl.VertexRDDImpl
 
withEdges(EdgeRDD<?>) - Method in class org.apache.spark.graphx.VertexRDD
Prepares this VertexRDD for efficient joins with the given EdgeRDD.
withIndex(OpenHashSet<Object>) - Method in class org.apache.spark.graphx.impl.ShippableVertexPartitionOps
 
withIndex(OpenHashSet<Object>) - Method in class org.apache.spark.graphx.impl.VertexPartitionBaseOps
 
withIndex(OpenHashSet<Object>) - Method in class org.apache.spark.graphx.impl.VertexPartitionOps
 
withMask(BitSet) - Method in class org.apache.spark.graphx.impl.ShippableVertexPartitionOps
 
withMask(BitSet) - Method in class org.apache.spark.graphx.impl.VertexPartitionBaseOps
 
withMask(BitSet) - Method in class org.apache.spark.graphx.impl.VertexPartitionOps
 
withMean() - Method in class org.apache.spark.mllib.feature.StandardScalerModel
 
withOutput(Seq<Attribute>) - Method in class org.apache.spark.sql.columnar.InMemoryRelation
 
withoutVertexAttributes(ClassTag<VD2>) - Method in class org.apache.spark.graphx.impl.EdgePartition
Return a new `EdgePartition` without any locally cached vertex attributes.
withPartitionsRDD(RDD<Tuple2<Object, EdgePartition<ED2, VD2>>>, ClassTag<ED2>, ClassTag<VD2>) - Method in class org.apache.spark.graphx.impl.EdgeRDDImpl
 
withPartitionsRDD(RDD<ShippableVertexPartition<VD2>>, ClassTag<VD2>) - Method in class org.apache.spark.graphx.impl.VertexRDDImpl
 
withPartitionsRDD(RDD<ShippableVertexPartition<VD2>>, ClassTag<VD2>) - Method in class org.apache.spark.graphx.VertexRDD
Replaces the vertex partitions while preserving all other properties of the VertexRDD.
withReplacement() - Method in class org.apache.spark.sql.execution.Sample
 
withRoutingTable(RoutingTablePartition) - Method in class org.apache.spark.graphx.impl.ShippableVertexPartition
Return a new ShippableVertexPartition with the specified routing table.
withStd() - Method in class org.apache.spark.mllib.feature.StandardScalerModel
 
withTargetStorageLevel(StorageLevel) - Method in class org.apache.spark.graphx.EdgeRDD
Changes the target storage level while preserving all other properties of the EdgeRDD.
withTargetStorageLevel(StorageLevel) - Method in class org.apache.spark.graphx.impl.EdgeRDDImpl
 
withTargetStorageLevel(StorageLevel) - Method in class org.apache.spark.graphx.impl.VertexRDDImpl
 
withTargetStorageLevel(StorageLevel) - Method in class org.apache.spark.graphx.VertexRDD
Changes the target storage level while preserving all other properties of the VertexRDD.
withText(String) - Method in class org.apache.spark.sql.hive.HiveQl.TransformableNode
Returns this ASTNode with the text changed to newText.
withValues(Object, ClassTag<VD2>) - Method in class org.apache.spark.graphx.impl.ShippableVertexPartitionOps
 
withValues(Object, ClassTag<VD2>) - Method in class org.apache.spark.graphx.impl.VertexPartitionBaseOps
 
withValues(Object, ClassTag<VD2>) - Method in class org.apache.spark.graphx.impl.VertexPartitionOps
 
word() - Method in class org.apache.spark.mllib.feature.VocabWord
 
Word2Vec - Class in org.apache.spark.mllib.feature
:: Experimental :: Word2Vec creates vector representations of words in a text corpus.
Word2Vec() - Constructor for class org.apache.spark.mllib.feature.Word2Vec
 
Word2VecModel - Class in org.apache.spark.mllib.feature
:: Experimental :: Word2Vec model
Word2VecModel(Map<String, float[]>) - Constructor for class org.apache.spark.mllib.feature.Word2VecModel
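
A minimal fitting sketch, assuming corpus is a pre-tokenized RDD of sentences:

    import org.apache.spark.mllib.feature.Word2Vec
    import org.apache.spark.rdd.RDD

    val corpus: RDD[Seq[String]] = ??? // tokenized sentences, assumed to exist
    val model = new Word2Vec().setVectorSize(50).fit(corpus)
    // The five words closest to "spark" in the learned vector space.
    model.findSynonyms("spark", 5).foreach { case (w, sim) => println(s"$w: $sim") }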
 
worker() - Method in class org.apache.spark.streaming.kinesis.KinesisReceiver
 
worker() - Method in class org.apache.spark.streaming.receiver.ActorReceiver.Supervisor
 
workerId() - Method in class org.apache.spark.streaming.kinesis.KinesisReceiver
 
WorkerOffer - Class in org.apache.spark.scheduler
Represents free resources available on an executor.
WorkerOffer(String, String, int) - Constructor for class org.apache.spark.scheduler.WorkerOffer
 
wrap(Object, ObjectInspector) - Method in interface org.apache.spark.sql.hive.HiveInspectors
Converts native Catalyst types to the types expected by Hive.
wrap(Seq<Object>, Seq<ObjectInspector>, Object[]) - Method in interface org.apache.spark.sql.hive.HiveInspectors
 
wrapAsJava(UserDefinedType<?>) - Static method in class org.apache.spark.sql.api.java.UDTWrappers
 
wrapAsScala(UserDefinedType<?>) - Static method in class org.apache.spark.sql.api.java.UDTWrappers
 
wrapForCompression(BlockId, OutputStream) - Method in class org.apache.spark.storage.BlockManager
Wrap an output stream for compression if block compression is enabled for its block type
wrapForCompression(BlockId, InputStream) - Method in class org.apache.spark.storage.BlockManager
Wrap an input stream for compression if block compression is enabled for its block type
wrapperClass() - Static method in class org.apache.spark.serializer.JavaIterableWrapperSerializer
 
wrapperFor(ObjectInspector) - Method in interface org.apache.spark.sql.hive.HiveInspectors
Wraps with Hive types based on object inspector.
wrapperToFileSinkDesc(ShimFileSinkDesc) - Static method in class org.apache.spark.sql.hive.HiveShim
 
wrapRDD(RDD<Double>) - Method in class org.apache.spark.api.java.JavaDoubleRDD
 
wrapRDD(RDD<Tuple2<K, V>>) - Method in class org.apache.spark.api.java.JavaPairRDD
 
wrapRDD(RDD<T>) - Method in class org.apache.spark.api.java.JavaRDD
 
wrapRDD(RDD<T>) - Method in interface org.apache.spark.api.java.JavaRDDLike
 
wrapRDD(RDD<Row>) - Method in class org.apache.spark.sql.api.java.JavaSchemaRDD
 
wrapRDD(RDD<T>) - Method in class org.apache.spark.streaming.api.java.JavaDStream
 
wrapRDD(RDD<T>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
 
wrapRDD(RDD<Tuple2<K, V>>) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
 
writableClass() - Method in class org.apache.spark.WritableConverter
 
WritableConverter<T> - Class in org.apache.spark
A class encapsulating how to convert some type T to Writable.
WritableConverter(Function1<ClassTag<T>, Class<? extends Writable>>, Function1<Writable, T>) - Constructor for class org.apache.spark.WritableConverter
 
writableWritableConverter() - Static method in class org.apache.spark.SparkContext
 
write(Kryo, Output, Iterable<?>) - Method in class org.apache.spark.serializer.JavaIterableWrapperSerializer
 
write(Object, Object) - Method in class org.apache.spark.SparkHadoopWriter
 
write(Kryo, Output, BigDecimal) - Method in class org.apache.spark.sql.execution.BigDecimalSerializer
 
write(Kryo, Output, HyperLogLog) - Method in class org.apache.spark.sql.execution.HyperLogLogSerializer
 
write(Kryo, Output, IntegerHashSet) - Method in class org.apache.spark.sql.execution.IntegerHashSetSerializer
 
write(Kryo, Output, LongHashSet) - Method in class org.apache.spark.sql.execution.LongHashSetSerializer
 
write(Kryo, Output, OpenHashSet<?>) - Method in class org.apache.spark.sql.execution.OpenHashSetSerializer
 
write(Row) - Method in class org.apache.spark.sql.parquet.MutableRowWriteSupport
 
write(Row) - Method in class org.apache.spark.sql.parquet.RowWriteSupport
 
write(Group) - Method in class org.apache.spark.sql.parquet.TestGroupWriteSupport
 
write(Object) - Method in class org.apache.spark.storage.BlockObjectWriter
Writes an object.
write(Object) - Method in class org.apache.spark.storage.DiskBlockObjectWriter
 
write(Checkpoint) - Method in class org.apache.spark.streaming.CheckpointWriter
 
write(int) - Method in class org.apache.spark.streaming.util.RateLimitedOutputStream
 
write(byte[]) - Method in class org.apache.spark.streaming.util.RateLimitedOutputStream
 
write(byte[], int, int) - Method in class org.apache.spark.streaming.util.RateLimitedOutputStream
 
write(ByteBuffer) - Method in class org.apache.spark.streaming.util.WriteAheadLogWriter
Write the ByteBuffer to the log file.
write(int) - Method in class org.apache.spark.util.io.ByteArrayChunkOutputStream
 
write(byte[], int, int) - Method in class org.apache.spark.util.io.ByteArrayChunkOutputStream
 
WriteAheadLogBackedBlockRDD<T> - Class in org.apache.spark.streaming.rdd
This class represents a special case of the BlockRDD where the data blocks in the block manager are also backed by segments in write ahead logs.
WriteAheadLogBackedBlockRDD(SparkContext, BlockId[], WriteAheadLogFileSegment[], boolean, StorageLevel, ClassTag<T>) - Constructor for class org.apache.spark.streaming.rdd.WriteAheadLogBackedBlockRDD
 
WriteAheadLogBackedBlockRDDPartition - Class in org.apache.spark.streaming.rdd
Partition class for WriteAheadLogBackedBlockRDD.
WriteAheadLogBackedBlockRDDPartition(int, BlockId, WriteAheadLogFileSegment) - Constructor for class org.apache.spark.streaming.rdd.WriteAheadLogBackedBlockRDDPartition
 
WriteAheadLogBasedBlockHandler - Class in org.apache.spark.streaming.receiver
Implementation of a ReceivedBlockHandler which stores the received blocks in both a write ahead log and a block manager.
WriteAheadLogBasedBlockHandler(BlockManager, int, StorageLevel, SparkConf, Configuration, String, Clock) - Constructor for class org.apache.spark.streaming.receiver.WriteAheadLogBasedBlockHandler
 
WriteAheadLogBasedStoreResult - Class in org.apache.spark.streaming.receiver
Implementation of ReceivedBlockStoreResult that stores the metadata related to storage of blocks using WriteAheadLogBasedBlockHandler
WriteAheadLogBasedStoreResult(StreamBlockId, WriteAheadLogFileSegment) - Constructor for class org.apache.spark.streaming.receiver.WriteAheadLogBasedStoreResult
 
WriteAheadLogFileSegment - Class in org.apache.spark.streaming.util
Class for representing a segment of data in a write ahead log file
WriteAheadLogFileSegment(String, long, int) - Constructor for class org.apache.spark.streaming.util.WriteAheadLogFileSegment
 
WriteAheadLogManager - Class in org.apache.spark.streaming.util
This class manages write ahead log files.
WriteAheadLogManager(String, Configuration, int, int, String, Clock) - Constructor for class org.apache.spark.streaming.util.WriteAheadLogManager
 
WriteAheadLogManager.LogInfo - Class in org.apache.spark.streaming.util
 
WriteAheadLogManager.LogInfo(long, long, String) - Constructor for class org.apache.spark.streaming.util.WriteAheadLogManager.LogInfo
 
WriteAheadLogManager.LogInfo$ - Class in org.apache.spark.streaming.util
 
WriteAheadLogManager.LogInfo$() - Constructor for class org.apache.spark.streaming.util.WriteAheadLogManager.LogInfo$
 
WriteAheadLogRandomReader - Class in org.apache.spark.streaming.util
A random access reader for reading write ahead log files written using WriteAheadLogWriter.
WriteAheadLogRandomReader(String, Configuration) - Constructor for class org.apache.spark.streaming.util.WriteAheadLogRandomReader
 
WriteAheadLogReader - Class in org.apache.spark.streaming.util
A reader for reading write ahead log files written using WriteAheadLogWriter.
WriteAheadLogReader(String, Configuration) - Constructor for class org.apache.spark.streaming.util.WriteAheadLogReader
 
WriteAheadLogWriter - Class in org.apache.spark.streaming.util
A writer for writing byte-buffers to a write ahead log file.
WriteAheadLogWriter(String, Configuration) - Constructor for class org.apache.spark.streaming.util.WriteAheadLogWriter
 
writeAll(Iterator<T>, ClassTag<T>) - Method in class org.apache.spark.serializer.SerializationStream
 
writeArray(ArrayType, Seq<Object>) - Method in class org.apache.spark.sql.parquet.RowWriteSupport
 
writeByteBuffer(ByteBuffer, ObjectOutput) - Static method in class org.apache.spark.util.Utils
A primitive helper, often used when writing a ByteBuffer to a DataOutput.
writeDecimal(Decimal, int) - Method in class org.apache.spark.sql.parquet.RowWriteSupport
 
writeExternal(ObjectOutput) - Method in class org.apache.spark.scheduler.CompressedMapStatus
 
writeExternal(ObjectOutput) - Method in class org.apache.spark.scheduler.DirectTaskResult
 
writeExternal(ObjectOutput) - Method in class org.apache.spark.scheduler.HighlyCompressedMapStatus
 
writeExternal(ObjectOutput) - Method in class org.apache.spark.serializer.JavaSerializer
 
writeExternal(ObjectOutput) - Method in class org.apache.spark.sql.hive.HiveFunctionWrapper
 
writeExternal(ObjectOutput) - Method in class org.apache.spark.storage.BlockManagerId
 
writeExternal(ObjectOutput) - Method in class org.apache.spark.storage.BlockManagerMessages.UpdateBlockInfo
 
writeExternal(ObjectOutput) - Method in class org.apache.spark.storage.StorageLevel
 
writeExternal(ObjectOutput, Map<CharSequence, CharSequence>, byte[]) - Static method in class org.apache.spark.streaming.flume.EventTransformer
 
writeExternal(ObjectOutput) - Method in class org.apache.spark.streaming.flume.SparkFlumeEvent
 
writeFile() - Static method in class org.apache.spark.sql.parquet.ParquetTestData
 
writeFilterFile(int) - Static method in class org.apache.spark.sql.parquet.ParquetTestData
 
writeLock(Function0<A>) - Method in interface org.apache.spark.sql.CacheManager
Acquires a write lock on the cache for the duration of `f`.
writeMap(MapType, Map<?, Object>) - Method in class org.apache.spark.sql.parquet.RowWriteSupport
 
writeMetaData(Seq<Attribute>, Path, Configuration) - Static method in class org.apache.spark.sql.parquet.ParquetTypesConverter
 
writeNestedFile1() - Static method in class org.apache.spark.sql.parquet.ParquetTestData
 
writeNestedFile2() - Static method in class org.apache.spark.sql.parquet.ParquetTestData
 
writeNestedFile3() - Static method in class org.apache.spark.sql.parquet.ParquetTestData
 
writeNestedFile4() - Static method in class org.apache.spark.sql.parquet.ParquetTestData
 
writeObject(T, ClassTag<T>) - Method in class org.apache.spark.serializer.JavaSerializationStream
Calls reset to avoid a memory leak (see http://stackoverflow.com/questions/1281549/memory-leak-traps-in-the-java-standard-api), but only on every 100th write to avoid bloated serialization streams, since each reset forces the object class descriptions to be re-written.
writeObject(T, ClassTag<T>) - Method in class org.apache.spark.serializer.KryoSerializationStream
 
writeObject(T, ClassTag<T>) - Method in class org.apache.spark.serializer.SerializationStream
 
writePrimitive(PrimitiveType, Object) - Method in class org.apache.spark.sql.parquet.RowWriteSupport
 
writer() - Method in class org.apache.spark.sql.parquet.RowWriteSupport
 
writeStruct(StructType, Seq<Object>) - Method in class org.apache.spark.sql.parquet.RowWriteSupport
 
writeToFile(String, Broadcast<SerializableWritable<Configuration>>, int, TaskContext, Iterator<T>, ClassTag<T>) - Static method in class org.apache.spark.rdd.CheckpointRDD
 
writeToLog(ByteBuffer) - Method in class org.apache.spark.streaming.util.WriteAheadLogManager
Write a byte buffer to the log file.
writeValue(DataType, Object) - Method in class org.apache.spark.sql.parquet.RowWriteSupport
 

X

x() - Method in class org.apache.spark.mllib.optimization.NNLS.Workspace
 
x() - Method in class org.apache.spark.sql.test.ExamplePoint
 
XORShiftRandom - Class in org.apache.spark.util.random
This class implements an XORShift random number generator (source: Marsaglia, G.).
XORShiftRandom(long) - Constructor for class org.apache.spark.util.random.XORShiftRandom
 
XORShiftRandom() - Constructor for class org.apache.spark.util.random.XORShiftRandom
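
Usage mirrors java.util.Random; a sketch with a fixed seed (note this is an internal utility, so its visibility may restrict direct use outside Spark):

    import org.apache.spark.util.random.XORShiftRandom

    val rng = new XORShiftRandom(42L) // fixed seed for reproducibility
    println(rng.nextInt())
    println(rng.nextDouble())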
 

Y

y() - Method in class org.apache.spark.sql.test.ExamplePoint
 
YarnSchedulerBackend - Class in org.apache.spark.scheduler.cluster
Abstract Yarn scheduler backend that contains common logic between the client and cluster Yarn scheduler backends.
YarnSchedulerBackend(TaskSchedulerImpl, SparkContext) - Constructor for class org.apache.spark.scheduler.cluster.YarnSchedulerBackend
 

Z

zero() - Method in class org.apache.spark.Accumulable
 
zero(R) - Method in interface org.apache.spark.AccumulableParam
Return the "zero" (identity) value for an accumulator type, given its initial value.
zero(R) - Method in class org.apache.spark.GrowableAccumulableParam
 
zero(double) - Method in class org.apache.spark.SparkContext.DoubleAccumulatorParam$
 
zero(float) - Method in class org.apache.spark.SparkContext.FloatAccumulatorParam$
 
zero(int) - Method in class org.apache.spark.SparkContext.IntAccumulatorParam$
 
zero(long) - Method in class org.apache.spark.SparkContext.LongAccumulatorParam$
 
zero(Vector) - Method in class org.apache.spark.util.Vector.VectorAccumParam$
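
To illustrate the role of zero(), a sketch of a custom AccumulatorParam whose identity value is the empty set; the param object and its name are invented, and sc is an assumed SparkContext:

    import org.apache.spark.AccumulatorParam

    // Hypothetical accumulator that collects distinct strings;
    // zero() supplies the identity value (the empty set).
    object StringSetParam extends AccumulatorParam[Set[String]] {
      def zero(initialValue: Set[String]): Set[String] = Set.empty
      def addInPlace(s1: Set[String], s2: Set[String]): Set[String] = s1 ++ s2
    }

    val seen = sc.accumulator(Set.empty[String])(StringSetParam)
    sc.parallelize(Seq("a", "b", "a")).foreach(word => seen += Set(word))
    println(seen.value) // Set(a, b)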
 
ZeroMQReceiver<T> - Class in org.apache.spark.streaming.zeromq
A receiver that subscribes to a ZeroMQ stream.
ZeroMQReceiver(String, Subscribe, Function1<Seq<ByteString>, Iterator<T>>, ClassTag<T>) - Constructor for class org.apache.spark.streaming.zeromq.ZeroMQReceiver
 
ZeroMQUtils - Class in org.apache.spark.streaming.zeromq
 
ZeroMQUtils() - Constructor for class org.apache.spark.streaming.zeromq.ZeroMQUtils
 
zeros(int, int) - Static method in class org.apache.spark.mllib.linalg.Matrices
Generate a DenseMatrix consisting of zeros.
zeros(int) - Static method in class org.apache.spark.mllib.linalg.Vectors
Creates a dense vector of all zeros.
zeros(int) - Static method in class org.apache.spark.util.Vector
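
For example, the MLlib factory methods above produce all-zero structures of a given shape:

    import org.apache.spark.mllib.linalg.{Matrices, Vectors}

    val m = Matrices.zeros(2, 3) // 2x3 dense matrix of zeros
    val v = Vectors.zeros(5)     // dense vector of five zeros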
 
zeroTime() - Method in class org.apache.spark.streaming.dstream.DStream
 
zeroTime() - Method in class org.apache.spark.streaming.DStreamGraph
 
zip(JavaRDDLike<U, ?>) - Method in interface org.apache.spark.api.java.JavaRDDLike
Zips this RDD with another one, returning key-value pairs with the first element in each RDD, second element in each RDD, etc.
zip(RDD<U>, ClassTag<U>) - Method in class org.apache.spark.rdd.RDD
Zips this RDD with another one, returning key-value pairs with the first element in each RDD, second element in each RDD, etc.
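
A quick sketch, assuming an existing SparkContext sc:

    // Both RDDs must have the same number of partitions and the same
    // number of elements in each partition.
    val names = sc.parallelize(Seq("a", "b", "c"), 2)
    val ids = sc.parallelize(Seq(1, 2, 3), 2)
    val pairs = names.zip(ids) // RDD[(String, Int)]
    pairs.collect().foreach(println)
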
zipPartitions(JavaRDDLike<U, ?>, FlatMapFunction2<Iterator<T>, Iterator<U>, V>) - Method in interface org.apache.spark.api.java.JavaRDDLike
Zip this RDD's partitions with one (or more) RDD(s) and return a new RDD by applying a function to the zipped partitions.
zipPartitions(RDD<B>, boolean, Function2<Iterator<T>, Iterator<B>, Iterator<V>>, ClassTag<B>, ClassTag<V>) - Method in class org.apache.spark.rdd.RDD
Zip this RDD's partitions with one (or more) RDD(s) and return a new RDD by applying a function to the zipped partitions.
zipPartitions(RDD<B>, Function2<Iterator<T>, Iterator<B>, Iterator<V>>, ClassTag<B>, ClassTag<V>) - Method in class org.apache.spark.rdd.RDD
 
zipPartitions(RDD<B>, RDD<C>, boolean, Function3<Iterator<T>, Iterator<B>, Iterator<C>, Iterator<V>>, ClassTag<B>, ClassTag<C>, ClassTag<V>) - Method in class org.apache.spark.rdd.RDD
 
zipPartitions(RDD<B>, RDD<C>, Function3<Iterator<T>, Iterator<B>, Iterator<C>, Iterator<V>>, ClassTag<B>, ClassTag<C>, ClassTag<V>) - Method in class org.apache.spark.rdd.RDD
 
zipPartitions(RDD<B>, RDD<C>, RDD<D>, boolean, Function4<Iterator<T>, Iterator<B>, Iterator<C>, Iterator<D>, Iterator<V>>, ClassTag<B>, ClassTag<C>, ClassTag<D>, ClassTag<V>) - Method in class org.apache.spark.rdd.RDD
 
zipPartitions(RDD<B>, RDD<C>, RDD<D>, Function4<Iterator<T>, Iterator<B>, Iterator<C>, Iterator<D>, Iterator<V>>, ClassTag<B>, ClassTag<C>, ClassTag<D>, ClassTag<V>) - Method in class org.apache.spark.rdd.RDD
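
A sketch of the curried Scala form, which applies a function to the co-partitioned iterators (the data is invented; sc is an assumed SparkContext):

    // Pair up elements partition-by-partition without a shuffle.
    val xs = sc.parallelize(1 to 6, 2)
    val ys = sc.parallelize(Seq("a", "b", "c", "d", "e", "f"), 2)
    val zipped = xs.zipPartitions(ys) { (it1, it2) => it1.zip(it2) }
    zipped.collect().foreach(println)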
 
ZippedPartitionsBaseRDD<V> - Class in org.apache.spark.rdd
 
ZippedPartitionsBaseRDD(SparkContext, Seq<RDD<?>>, boolean, ClassTag<V>) - Constructor for class org.apache.spark.rdd.ZippedPartitionsBaseRDD
 
ZippedPartitionsPartition - Class in org.apache.spark.rdd
 
ZippedPartitionsPartition(int, Seq<RDD<?>>, Seq<String>) - Constructor for class org.apache.spark.rdd.ZippedPartitionsPartition
 
ZippedPartitionsRDD2<A,B,V> - Class in org.apache.spark.rdd
 
ZippedPartitionsRDD2(SparkContext, Function2<Iterator<A>, Iterator<B>, Iterator<V>>, RDD<A>, RDD<B>, boolean, ClassTag<A>, ClassTag<B>, ClassTag<V>) - Constructor for class org.apache.spark.rdd.ZippedPartitionsRDD2
 
ZippedPartitionsRDD3<A,B,C,V> - Class in org.apache.spark.rdd
 
ZippedPartitionsRDD3(SparkContext, Function3<Iterator<A>, Iterator<B>, Iterator<C>, Iterator<V>>, RDD<A>, RDD<B>, RDD<C>, boolean, ClassTag<A>, ClassTag<B>, ClassTag<C>, ClassTag<V>) - Constructor for class org.apache.spark.rdd.ZippedPartitionsRDD3
 
ZippedPartitionsRDD4<A,B,C,D,V> - Class in org.apache.spark.rdd
 
ZippedPartitionsRDD4(SparkContext, Function4<Iterator<A>, Iterator<B>, Iterator<C>, Iterator<D>, Iterator<V>>, RDD<A>, RDD<B>, RDD<C>, RDD<D>, boolean, ClassTag<A>, ClassTag<B>, ClassTag<C>, ClassTag<D>, ClassTag<V>) - Constructor for class org.apache.spark.rdd.ZippedPartitionsRDD4
 
ZippedWithIndexRDD<T> - Class in org.apache.spark.rdd
Represents an RDD zipped with its element indices.
ZippedWithIndexRDD(RDD<T>, ClassTag<T>) - Constructor for class org.apache.spark.rdd.ZippedWithIndexRDD
 
ZippedWithIndexRDDPartition - Class in org.apache.spark.rdd
 
ZippedWithIndexRDDPartition(Partition, long) - Constructor for class org.apache.spark.rdd.ZippedWithIndexRDDPartition
 
zipWithIndex() - Method in interface org.apache.spark.api.java.JavaRDDLike
Zips this RDD with its element indices.
zipWithIndex() - Method in class org.apache.spark.rdd.RDD
Zips this RDD with its element indices.
zipWithUniqueId() - Method in interface org.apache.spark.api.java.JavaRDDLike
Zips this RDD with generated unique Long ids.
zipWithUniqueId() - Method in class org.apache.spark.rdd.RDD
Zips this RDD with generated unique Long ids.
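
A sketch contrasting the two indexing methods, assuming an existing SparkContext sc:

    val rdd = sc.parallelize(Seq("x", "y", "z"), 2)
    // Contiguous 0-based indices; needs a Spark job to compute partition
    // sizes when there is more than one partition.
    rdd.zipWithIndex().collect().foreach(println)
    // Unique but non-contiguous Long ids; no extra job required.
    rdd.zipWithUniqueId().collect().foreach(println)
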

_

_1() - Method in class org.apache.spark.util.MutablePair
 
_2() - Method in class org.apache.spark.util.MutablePair
 
_message() - Method in class org.apache.spark.scheduler.SlaveLost
 
_rddInfoMap() - Method in class org.apache.spark.ui.storage.StorageListener
 