- abort(String) - Method in class org.apache.spark.scheduler.TaskSetManager
-
- abortStage(Stage, String) - Method in class org.apache.spark.scheduler.DAGScheduler
-
Aborts all jobs depending on a particular Stage.
- AbsoluteError - Class in org.apache.spark.mllib.tree.loss
-
:: DeveloperApi ::
Class for absolute error loss calculation (for regression).
- AbsoluteError() - Constructor for class org.apache.spark.mllib.tree.loss.AbsoluteError
-
- accept(File, String) - Method in class org.apache.spark.rdd.PipedRDD.NotEqualsFileNameFilter
-
- AcceptanceResult - Class in org.apache.spark.util.random
-
Object used by seqOp to keep track of the number of items accepted and items waitlisted per
stratum, as well as the bounds for accepting and waitlisting items.
- AcceptanceResult(long, long) - Constructor for class org.apache.spark.util.random.AcceptanceResult
-
- acceptBound() - Method in class org.apache.spark.util.random.AcceptanceResult
-
- Accumulable<R,T> - Class in org.apache.spark
-
A data type that can be accumulated, i.e. has a commutative and associative "add" operation, but where the result type, R, may be different from the element type being added, T.
- Accumulable(R, AccumulableParam<R, T>, Option<String>) - Constructor for class org.apache.spark.Accumulable
-
- Accumulable(R, AccumulableParam<R, T>) - Constructor for class org.apache.spark.Accumulable
-
- accumulable(T, AccumulableParam<T, R>) - Method in class org.apache.spark.api.java.JavaSparkContext
-
Create an Accumulable shared variable of the given type, to which tasks can "add" values with add.
- accumulable(T, String, AccumulableParam<T, R>) - Method in class org.apache.spark.api.java.JavaSparkContext
-
Create an Accumulable shared variable of the given type, to which tasks can "add" values with add.
- accumulable(R, AccumulableParam<R, T>) - Method in class org.apache.spark.SparkContext
-
Create an Accumulable shared variable, to which tasks can add values with +=.
- accumulable(R, String, AccumulableParam<R, T>) - Method in class org.apache.spark.SparkContext
-
Create an Accumulable shared variable, with a name for display in the Spark UI.
- accumulableCollection(R, Function1<R, Growable<T>>, ClassTag<R>) - Method in class org.apache.spark.SparkContext
-
Create an accumulator from a "mutable collection" type.
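A minimal sketch of accumulableCollection, assuming an existing SparkContext named sc (Spark 1.x-era Scala API):

    import scala.collection.mutable

    // Accumulate arbitrary elements into a driver-side mutable collection.
    val names = sc.accumulableCollection(mutable.ArrayBuffer[String]())
    sc.parallelize(Seq("a", "b", "c")).foreach(x => names += x)
    println(names.value)   // all added elements; ordering is not guaranteed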
- AccumulableInfo - Class in org.apache.spark.scheduler
-
:: DeveloperApi ::
Information about an Accumulable modified during a task or stage.
- AccumulableInfo(long, String, Option<String>, String) - Constructor for class org.apache.spark.scheduler.AccumulableInfo
-
- accumulableInfoFromJson(JsonAST.JValue) - Static method in class org.apache.spark.util.JsonProtocol
-
- accumulableInfoToJson(AccumulableInfo) - Static method in class org.apache.spark.util.JsonProtocol
-
- AccumulableParam<R,T> - Interface in org.apache.spark
-
Helper object defining how to accumulate values of a particular type.
- accumulables() - Method in class org.apache.spark.scheduler.StageInfo
-
Terminal values of accumulables updated during this stage.
- accumulables() - Method in class org.apache.spark.scheduler.TaskInfo
-
Intermediate updates to accumulables during this task.
- accumulables() - Method in class org.apache.spark.ui.jobs.UIData.StageUIData
-
- Accumulator<T> - Class in org.apache.spark
-
A simpler value of Accumulable where the result type being accumulated is the same as the types of elements being merged.
- Accumulator(T, AccumulatorParam<T>, Option<String>) - Constructor for class org.apache.spark.Accumulator
-
- Accumulator(T, AccumulatorParam<T>) - Constructor for class org.apache.spark.Accumulator
-
- accumulator(int) - Method in class org.apache.spark.api.java.JavaSparkContext
-
Create an Accumulator integer variable, which tasks can "add" values to using the add method.
- accumulator(int, String) - Method in class org.apache.spark.api.java.JavaSparkContext
-
Create an Accumulator integer variable, which tasks can "add" values to using the add method.
- accumulator(double) - Method in class org.apache.spark.api.java.JavaSparkContext
-
Create an Accumulator double variable, which tasks can "add" values to using the add method.
- accumulator(double, String) - Method in class org.apache.spark.api.java.JavaSparkContext
-
Create an Accumulator double variable, which tasks can "add" values to using the add method.
- accumulator(T, AccumulatorParam<T>) - Method in class org.apache.spark.api.java.JavaSparkContext
-
Create an Accumulator variable of a given type, which tasks can "add" values to using the add method.
- accumulator(T, String, AccumulatorParam<T>) - Method in class org.apache.spark.api.java.JavaSparkContext
-
Create an Accumulator variable of a given type, which tasks can "add" values to using the add method.
- accumulator(T, AccumulatorParam<T>) - Method in class org.apache.spark.SparkContext
-
Create an Accumulator variable of a given type, which tasks can "add" values to using the += method.
- accumulator(T, String, AccumulatorParam<T>) - Method in class org.apache.spark.SparkContext
-
Create an Accumulator variable of a given type, with a name for display in the Spark UI.
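A minimal sketch of a named accumulator, assuming an existing SparkContext named sc (Spark 1.x-era API); the input data is made up:

    // Named accumulators appear in the Spark UI under the stages that update them.
    val errorCount = sc.accumulator(0, "errorCount")
    sc.parallelize(Seq("ok", "error", "ok", "error")).foreach { line =>
      if (line == "error") errorCount += 1   // tasks only "add"; the value is read on the driver
    }
    println(errorCount.value)                // 2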
- accumulator() - Method in class org.apache.spark.sql.execution.PythonUDF
-
- AccumulatorParam<T> - Interface in org.apache.spark
-
A simpler version of AccumulableParam where the only data type you can add in is the same type as the accumulated value.
- Accumulators - Class in org.apache.spark
-
- Accumulators() - Constructor for class org.apache.spark.Accumulators
-
- accumUpdates() - Method in class org.apache.spark.scheduler.CompletionEvent
-
- accumUpdates() - Method in class org.apache.spark.scheduler.DirectTaskResult
-
- accuracy() - Method in class org.apache.spark.mllib.evaluation.MultilabelMetrics
-
Returns the accuracy.
- aclsEnabled() - Method in class org.apache.spark.SecurityManager
-
Check whether ACLs for the UI are enabled.
- active() - Method in class org.apache.spark.streaming.scheduler.ReceiverInfo
-
- activeExecutorIds() - Method in class org.apache.spark.scheduler.TaskSchedulerImpl
-
- ActiveJob - Class in org.apache.spark.scheduler
-
Tracks information about an active job in the DAGScheduler.
- ActiveJob(int, Stage, Function2<TaskContext, Iterator<Object>, ?>, int[], CallSite, JobListener, Properties) - Constructor for class org.apache.spark.scheduler.ActiveJob
-
- activeJobs() - Method in class org.apache.spark.scheduler.DAGScheduler
-
- activeJobs() - Method in class org.apache.spark.ui.jobs.JobProgressListener
-
- activeStages() - Method in class org.apache.spark.ui.jobs.JobProgressListener
-
- activeTasks() - Method in class org.apache.spark.ui.exec.ExecutorSummaryInfo
-
- activeTaskSets() - Method in class org.apache.spark.scheduler.TaskSchedulerImpl
-
- actor() - Method in class org.apache.spark.streaming.scheduler.ReceiverInfo
-
- ACTOR_NAME() - Static method in class org.apache.spark.scheduler.cluster.CoarseGrainedSchedulerBackend
-
- ACTOR_NAME() - Static method in class org.apache.spark.scheduler.cluster.YarnSchedulerBackend
-
- ActorHelper - Interface in org.apache.spark.streaming.receiver
-
:: DeveloperApi ::
A receiver trait to be mixed in with your Actor to gain access to the API for pushing received data into Spark Streaming so that it can be processed.
- ActorLogReceive - Interface in org.apache.spark.util
-
A trait to enable logging all Akka actor messages.
- ActorReceiver<T> - Class in org.apache.spark.streaming.receiver
-
Provides Actors as receivers for receiving streams.
- ActorReceiver(Props, String, StorageLevel, SupervisorStrategy, ClassTag<T>) - Constructor for class org.apache.spark.streaming.receiver.ActorReceiver
-
- ActorReceiver.Supervisor - Class in org.apache.spark.streaming.receiver
-
- ActorReceiver.Supervisor() - Constructor for class org.apache.spark.streaming.receiver.ActorReceiver.Supervisor
-
- ActorReceiverData - Interface in org.apache.spark.streaming.receiver
-
Case class to receive data sent by child actors
- actorStream(Props, String, StorageLevel, SupervisorStrategy) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
-
Create an input stream with any arbitrary user implemented actor receiver.
- actorStream(Props, String, StorageLevel) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
-
Create an input stream with any arbitrary user implemented actor receiver.
- actorStream(Props, String) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
-
Create an input stream with any arbitrary user implemented actor receiver.
- actorStream(Props, String, StorageLevel, SupervisorStrategy, ClassTag<T>) - Method in class org.apache.spark.streaming.StreamingContext
-
Create an input stream with any arbitrary user implemented actor receiver.
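A minimal sketch of a custom actor receiver, assuming an existing StreamingContext named ssc (Spark 1.x Akka-based receiver API); the WordReceiver class is hypothetical:

    import akka.actor.{Actor, Props}
    import org.apache.spark.streaming.receiver.ActorHelper

    // ActorHelper provides store(), which pushes received data into Spark Streaming.
    class WordReceiver extends Actor with ActorHelper {
      def receive = {
        case s: String => store(s)
      }
    }

    val lines = ssc.actorStream[String](Props[WordReceiver], "WordReceiver")
    lines.print()
    ssc.start()
    ssc.awaitTermination()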
- ActorSupervisorStrategy - Class in org.apache.spark.streaming.receiver
-
:: DeveloperApi ::
A helper with a set of defaults for the supervisor strategy.
- ActorSupervisorStrategy() - Constructor for class org.apache.spark.streaming.receiver.ActorSupervisorStrategy
-
- actorSystem() - Method in class org.apache.spark.scheduler.cluster.CoarseGrainedSchedulerBackend
-
- actorSystem() - Method in class org.apache.spark.SparkEnv
-
- actualSize(Row, int) - Method in class org.apache.spark.sql.columnar.ByteArrayColumnType
-
- actualSize(Row, int) - Method in class org.apache.spark.sql.columnar.ColumnType
-
Returns the size of the value row(ordinal).
- actualSize(Row, int) - Static method in class org.apache.spark.sql.columnar.STRING
-
- add(T) - Method in class org.apache.spark.Accumulable
-
Add more data to this accumulator / accumulable
- add(Map<Object, Object>) - Static method in class org.apache.spark.Accumulators
-
- add(long, long, ED) - Method in class org.apache.spark.graphx.impl.EdgePartitionBuilder
-
Add a new edge to the partition.
- add(long, long, int, int, ED) - Method in class org.apache.spark.graphx.impl.ExistingEdgePartitionBuilder
-
Add a new edge to the partition.
- add(Vector) - Method in class org.apache.spark.mllib.feature.IDF.DocumentFrequencyAggregator
-
Adds a new document.
- add(Vector) - Method in class org.apache.spark.mllib.stat.MultivariateOnlineSummarizer
-
Add a new sample to this summarizer, and update the statistical summary.
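A minimal sketch of building a summary with add and merge over an RDD of vectors, assuming an existing SparkContext named sc:

    import org.apache.spark.mllib.linalg.Vectors
    import org.apache.spark.mllib.stat.MultivariateOnlineSummarizer

    val vectors = sc.parallelize(Seq(Vectors.dense(1.0, 2.0), Vectors.dense(3.0, 4.0)))
    val summary = vectors.aggregate(new MultivariateOnlineSummarizer)(
      (summ, v) => summ.add(v),     // fold one sample into a partition-local summarizer
      (s1, s2) => s1.merge(s2))     // merge summarizers across partitions
    println(summary.mean)           // column-wise mean
    println(summary.variance)       // column-wise variance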
- add(ImpurityCalculator) - Method in class org.apache.spark.mllib.tree.impurity.ImpurityCalculator
-
Add the stats from another calculator into this one, modifying and returning this calculator.
- add(long, long) - Static method in class org.apache.spark.streaming.util.RawTextHelper
-
- add(Vector) - Method in class org.apache.spark.util.Vector
-
- addAccumulator(R, T) - Method in interface org.apache.spark.AccumulableParam
-
Add additional data to the accumulator value.
- addAccumulator(T, T) - Method in interface org.apache.spark.AccumulatorParam
-
- addAccumulator(R, T) - Method in class org.apache.spark.GrowableAccumulableParam
-
- addBinary(Binary) - Method in class org.apache.spark.sql.parquet.CatalystPrimitiveConverter
-
- addBlock(BlockId, BlockStatus) - Method in class org.apache.spark.storage.StorageStatus
-
Add the given block to this storage status.
- AddBlock - Class in org.apache.spark.streaming.scheduler
-
- AddBlock(ReceivedBlockInfo) - Constructor for class org.apache.spark.streaming.scheduler.AddBlock
-
- addBlock(ReceivedBlockInfo) - Method in class org.apache.spark.streaming.scheduler.ReceivedBlockTracker
-
Add received block.
- addBoolean(boolean) - Method in class org.apache.spark.sql.parquet.CatalystPrimitiveConverter
-
- addData(Object) - Method in class org.apache.spark.streaming.receiver.BlockGenerator
-
Push a single data item into the buffer.
- addDataWithCallback(Object, Object) - Method in class org.apache.spark.streaming.receiver.BlockGenerator
-
Push a single data item into the buffer.
- addDouble(double) - Method in class org.apache.spark.sql.parquet.CatalystPrimitiveConverter
-
- addedFiles() - Method in class org.apache.spark.SparkContext
-
- addedJars() - Method in class org.apache.spark.SparkContext
-
- AddExchange - Class in org.apache.spark.sql.execution
-
Ensures that the Partitioning of input data meets the Distribution requirements for each operator by inserting Exchange operators where required.
- AddExchange(SQLContext) - Constructor for class org.apache.spark.sql.execution.AddExchange
-
- addFile(String) - Method in class org.apache.spark.api.java.JavaSparkContext
-
Add a file to be downloaded with this Spark job on every node.
- addFile(File) - Method in class org.apache.spark.HttpFileServer
-
- addFile(String) - Method in class org.apache.spark.SparkContext
-
Add a file to be downloaded with this Spark job on every node.
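A minimal sketch of addFile together with SparkFiles.get, assuming an existing SparkContext named sc; the file path is hypothetical:

    import org.apache.spark.SparkFiles

    sc.addFile("/tmp/lookup.txt")                  // shipped to every node before tasks run
    val firstLines = sc.parallelize(1 to 2).map { _ =>
      val path = SparkFiles.get("lookup.txt")      // local path of the file on the executor
      scala.io.Source.fromFile(path).getLines().next()
    }
    println(firstLines.collect().toSeq)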
- AddFile - Class in org.apache.spark.sql.hive
-
- AddFile(String) - Constructor for class org.apache.spark.sql.hive.AddFile
-
- AddFile - Class in org.apache.spark.sql.hive.execution
-
:: DeveloperApi ::
- AddFile(String) - Constructor for class org.apache.spark.sql.hive.execution.AddFile
-
- addFileToDir(File, File) - Method in class org.apache.spark.HttpFileServer
-
- addFilters(Seq<ServletContextHandler>, SparkConf) - Static method in class org.apache.spark.ui.JettyUtils
-
Add filters, if any, to the given list of ServletContextHandlers
- addFloat(float) - Method in class org.apache.spark.sql.parquet.CatalystPrimitiveConverter
-
- addGrid(Param<T>, Iterable<T>) - Method in class org.apache.spark.ml.tuning.ParamGridBuilder
-
Adds a param with multiple values (overwrites if the input param exists).
- addGrid(DoubleParam, double[]) - Method in class org.apache.spark.ml.tuning.ParamGridBuilder
-
Adds a double param with multiple values.
- addGrid(IntParam, int[]) - Method in class org.apache.spark.ml.tuning.ParamGridBuilder
-
Adds an int param with multiple values.
- addGrid(FloatParam, float[]) - Method in class org.apache.spark.ml.tuning.ParamGridBuilder
-
Adds a float param with multiple values.
- addGrid(LongParam, long[]) - Method in class org.apache.spark.ml.tuning.ParamGridBuilder
-
Adds a long param with multiple values.
- addGrid(BooleanParam) - Method in class org.apache.spark.ml.tuning.ParamGridBuilder
-
Adds a boolean param with true and false.
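A minimal sketch of ParamGridBuilder, assuming an estimator named lr (e.g. a spark.ml LogisticRegression) whose regParam and maxIter params are being tuned:

    import org.apache.spark.ml.tuning.ParamGridBuilder

    val paramGrid = new ParamGridBuilder()
      .addGrid(lr.regParam, Array(0.01, 0.1))   // double param with multiple values
      .addGrid(lr.maxIter, Array(10, 50))       // int param with multiple values
      .build()                                  // one ParamMap per combination (4 here)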
- addInPlace(R, R) - Method in interface org.apache.spark.AccumulableParam
-
Merge two accumulated values together.
- addInPlace(R, R) - Method in class org.apache.spark.GrowableAccumulableParam
-
- addInPlace(double, double) - Method in class org.apache.spark.SparkContext.DoubleAccumulatorParam$
-
- addInPlace(float, float) - Method in class org.apache.spark.SparkContext.FloatAccumulatorParam$
-
- addInPlace(int, int) - Method in class org.apache.spark.SparkContext.IntAccumulatorParam$
-
- addInPlace(long, long) - Method in class org.apache.spark.SparkContext.LongAccumulatorParam$
-
- addInPlace(Vector) - Method in class org.apache.spark.util.Vector
-
- addInPlace(Vector, Vector) - Method in class org.apache.spark.util.Vector.VectorAccumParam$
-
- addInputStream(InputDStream<?>) - Method in class org.apache.spark.streaming.DStreamGraph
-
- addInt(int) - Method in class org.apache.spark.sql.parquet.CatalystPrimitiveConverter
-
- addJar(String) - Method in class org.apache.spark.api.java.JavaSparkContext
-
Adds a JAR dependency for all tasks to be executed on this SparkContext in the future.
- addJar(File) - Method in class org.apache.spark.HttpFileServer
-
- addJar(String) - Method in class org.apache.spark.SparkContext
-
Adds a JAR dependency for all tasks to be executed on this SparkContext in the future.
- AddJar - Class in org.apache.spark.sql.hive
-
- AddJar(String) - Constructor for class org.apache.spark.sql.hive.AddJar
-
- AddJar - Class in org.apache.spark.sql.hive.execution
-
:: DeveloperApi ::
- AddJar(String) - Constructor for class org.apache.spark.sql.hive.execution.AddJar
-
- addListener(SparkListener) - Method in interface org.apache.spark.scheduler.SparkListenerBus
-
- addListener(StreamingListener) - Method in class org.apache.spark.streaming.scheduler.StreamingListenerBus
-
- addLocalConfiguration(String, int, int, int, JobConf) - Static method in class org.apache.spark.rdd.HadoopRDD
-
Add Hadoop configuration specific to a single partition and attempt.
- addLong(long) - Method in class org.apache.spark.sql.parquet.CatalystPrimitiveConverter
-
- addOnCompleteCallback(Function0<Unit>) - Method in class org.apache.spark.TaskContext
-
Deprecated.
- addOnCompleteCallback(Function0<BoxedUnit>) - Method in class org.apache.spark.TaskContextImpl
-
- addOutputLoc(int, MapStatus) - Method in class org.apache.spark.scheduler.Stage
-
- addOutputStream(DStream<?>) - Method in class org.apache.spark.streaming.DStreamGraph
-
- addPartitioningAttributes(Seq<Attribute>) - Method in class org.apache.spark.sql.hive.HiveStrategies.ParquetConversion.LogicalPlanHacks
-
- addPartToPGroup(Partition, PartitionGroup) - Method in class org.apache.spark.rdd.PartitionCoalescer
-
- address() - Method in class org.apache.spark.storage.ShuffleBlockFetcherIterator.FetchRequest
-
- addresses() - Method in class org.apache.spark.streaming.flume.FlumePollingInputDStream
-
- addRunningTask(long) - Method in class org.apache.spark.scheduler.TaskSetManager
-
If the given task ID is not in the set of running tasks, adds it.
- addSchedulable(Schedulable) - Method in class org.apache.spark.scheduler.Pool
-
- addSchedulable(Schedulable) - Method in interface org.apache.spark.scheduler.Schedulable
-
- addSchedulable(Schedulable) - Method in class org.apache.spark.scheduler.TaskSetManager
-
- addSparkListener(SparkListener) - Method in class org.apache.spark.SparkContext
-
:: DeveloperApi ::
Register a listener to receive up-calls from events that happen during execution.
- addStreamingListener(StreamingListener) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
-
- addStreamingListener(StreamingListener) - Method in class org.apache.spark.streaming.StreamingContext
-
- addTaskCompletionListener(TaskCompletionListener) - Method in class org.apache.spark.TaskContext
-
Add a (Java friendly) listener to be executed on task completion.
- addTaskCompletionListener(Function1<TaskContext, Unit>) - Method in class org.apache.spark.TaskContext
-
Add a listener in the form of a Scala closure to be executed on task completion.
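A minimal sketch of registering a completion callback from inside a task, assuming Spark 1.2+ where TaskContext.get() is available, an existing RDD named rdd, and hypothetical openSomeResource/process helpers:

    import org.apache.spark.TaskContext

    rdd.mapPartitions { iter =>
      val resource = openSomeResource()                   // hypothetical per-task resource
      TaskContext.get().addTaskCompletionListener { _ =>
        resource.close()                                  // runs when the task completes
      }
      iter.map(process)                                   // hypothetical per-element processing
    }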
- addTaskCompletionListener(TaskCompletionListener) - Method in class org.apache.spark.TaskContextImpl
-
- addTaskCompletionListener(Function1<TaskContext, BoxedUnit>) - Method in class org.apache.spark.TaskContextImpl
-
- addTaskSetManager(Schedulable, Properties) - Method in class org.apache.spark.scheduler.FairSchedulableBuilder
-
- addTaskSetManager(Schedulable, Properties) - Method in class org.apache.spark.scheduler.FIFOSchedulableBuilder
-
- addTaskSetManager(Schedulable, Properties) - Method in interface org.apache.spark.scheduler.SchedulableBuilder
-
- addToTime(long) - Method in class org.apache.spark.streaming.util.ManualClock
-
- adminAcls() - Method in class org.apache.spark.scheduler.ApplicationEventListener
-
- advanceCheckpoint() - Method in class org.apache.spark.streaming.kinesis.KinesisCheckpointState
-
Advance the checkpoint clock by the checkpoint interval.
- aggregate(U, Function2<U, T, U>, Function2<U, U, U>) - Method in interface org.apache.spark.api.java.JavaRDDLike
-
Aggregate the elements of each partition, and then the results for all the partitions, using
given combine functions and a neutral "zero value".
- aggregate(U, Function2<U, T, U>, Function2<U, U, U>, ClassTag<U>) - Method in class org.apache.spark.rdd.RDD
-
Aggregate the elements of each partition, and then the results for all the partitions, using
given combine functions and a neutral "zero value".
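A minimal sketch of RDD.aggregate computing a sum and a count in one pass, assuming an existing SparkContext named sc:

    val data = sc.parallelize(1 to 100)
    val (sum, count) = data.aggregate((0, 0))(
      (acc, x) => (acc._1 + x, acc._2 + 1),     // seqOp: fold an element into a partition-local pair
      (a, b)   => (a._1 + b._1, a._2 + b._2))   // combOp: merge the per-partition pairs
    println(sum.toDouble / count)               // mean of 1..100 = 50.5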
- Aggregate - Class in org.apache.spark.sql.execution
-
:: DeveloperApi ::
Groups input data by groupingExpressions and computes the aggregateExpressions for each group.
- Aggregate(boolean, Seq<Expression>, Seq<NamedExpression>, SparkPlan) - Constructor for class org.apache.spark.sql.execution.Aggregate
-
- aggregate() - Method in class org.apache.spark.sql.execution.Aggregate.ComputedAggregate
-
- aggregate(Seq<Expression>) - Method in class org.apache.spark.sql.SchemaRDD
-
Performs an aggregation over all Rows in this RDD.
- Aggregate.ComputedAggregate - Class in org.apache.spark.sql.execution
-
An aggregate that needs to be computed for each row in a group.
- Aggregate.ComputedAggregate(AggregateExpression, AggregateExpression, AttributeReference) - Constructor for class org.apache.spark.sql.execution.Aggregate.ComputedAggregate
-
- Aggregate.ComputedAggregate$ - Class in org.apache.spark.sql.execution
-
- Aggregate.ComputedAggregate$() - Constructor for class org.apache.spark.sql.execution.Aggregate.ComputedAggregate$
-
- aggregateByKey(U, Partitioner, Function2<U, V, U>, Function2<U, U, U>) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Aggregate the values of each key, using given combine functions and a neutral "zero value".
- aggregateByKey(U, int, Function2<U, V, U>, Function2<U, U, U>) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Aggregate the values of each key, using given combine functions and a neutral "zero value".
- aggregateByKey(U, Function2<U, V, U>, Function2<U, U, U>) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Aggregate the values of each key, using given combine functions and a neutral "zero value".
- aggregateByKey(U, Partitioner, Function2<U, V, U>, Function2<U, U, U>, ClassTag<U>) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
Aggregate the values of each key, using given combine functions and a neutral "zero value".
- aggregateByKey(U, int, Function2<U, V, U>, Function2<U, U, U>, ClassTag<U>) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
Aggregate the values of each key, using given combine functions and a neutral "zero value".
- aggregateByKey(U, Function2<U, V, U>, Function2<U, U, U>, ClassTag<U>) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
Aggregate the values of each key, using given combine functions and a neutral "zero value".
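A minimal sketch of aggregateByKey computing a per-key average, assuming an existing SparkContext named sc:

    val pairs = sc.parallelize(Seq(("a", 1), ("a", 3), ("b", 5)))
    val sumCount = pairs.aggregateByKey((0, 0))(
      (acc, v) => (acc._1 + v, acc._2 + 1),     // fold a value into the (sum, count) for its key
      (x, y)   => (x._1 + y._1, x._2 + y._2))   // merge (sum, count) pairs across partitions
    val averages = sumCount.mapValues { case (s, c) => s.toDouble / c }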
- AggregateEvaluation - Class in org.apache.spark.sql.execution
-
- AggregateEvaluation(Seq<Attribute>, Seq<Expression>, Seq<Expression>, Expression) - Constructor for class org.apache.spark.sql.execution.AggregateEvaluation
-
- aggregateExpressions() - Method in class org.apache.spark.sql.execution.Aggregate
-
- aggregateExpressions() - Method in class org.apache.spark.sql.execution.GeneratedAggregate
-
- aggregateMessages(Function1<EdgeContext<VD, ED, A>, BoxedUnit>, Function2<A, A, A>, TripletFields, ClassTag<A>) - Method in class org.apache.spark.graphx.Graph
-
Aggregates values from the neighboring edges and vertices of each vertex.
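A minimal sketch of aggregateMessages computing in-degrees, assuming an existing GraphX graph named graph:

    import org.apache.spark.graphx.TripletFields

    val inDegrees = graph.aggregateMessages[Int](
      ctx => ctx.sendToDst(1),   // send a 1 along every edge to its destination vertex
      _ + _,                     // sum the messages arriving at each vertex
      TripletFields.None)        // no source, edge, or destination attributes are needed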
- aggregateMessagesEdgeScan(Function1<EdgeContext<VD, ED, A>, BoxedUnit>, Function2<A, A, A>, TripletFields, EdgeActiveness, ClassTag<A>) - Method in class org.apache.spark.graphx.impl.EdgePartition
-
Send messages along edges and aggregate them at the receiving vertices.
- aggregateMessagesIndexScan(Function1<EdgeContext<VD, ED, A>, BoxedUnit>, Function2<A, A, A>, TripletFields, EdgeActiveness, ClassTag<A>) - Method in class org.apache.spark.graphx.impl.EdgePartition
-
Send messages along edges and aggregate them at the receiving vertices.
- aggregateMessagesWithActiveSet(Function1<EdgeContext<VD, ED, A>, BoxedUnit>, Function2<A, A, A>, TripletFields, Option<Tuple2<VertexRDD<?>, EdgeDirection>>, ClassTag<A>) - Method in class org.apache.spark.graphx.Graph
-
Aggregates values from the neighboring edges and vertices of each vertex.
- aggregateMessagesWithActiveSet(Function1<EdgeContext<VD, ED, A>, BoxedUnit>, Function2<A, A, A>, TripletFields, Option<Tuple2<VertexRDD<?>, EdgeDirection>>, ClassTag<A>) - Method in class org.apache.spark.graphx.impl.GraphImpl
-
- aggregateSizeForNode(DecisionTreeMetadata, Option<int[]>) - Static method in class org.apache.spark.mllib.tree.RandomForest
-
Get the number of values to be stored for this node in the bin aggregates.
- aggregateUsingIndex(Iterator<Product2<Object, VD2>>, Function2<VD2, VD2, VD2>, ClassTag<VD2>) - Method in class org.apache.spark.graphx.impl.VertexPartitionBaseOps
-
- aggregateUsingIndex(RDD<Tuple2<Object, VD2>>, Function2<VD2, VD2, VD2>, ClassTag<VD2>) - Method in class org.apache.spark.graphx.impl.VertexRDDImpl
-
- aggregateUsingIndex(RDD<Tuple2<Object, VD2>>, Function2<VD2, VD2, VD2>, ClassTag<VD2>) - Method in class org.apache.spark.graphx.VertexRDD
-
Aggregates vertices in messages that have the same ids using reduceFunc, returning a VertexRDD co-indexed with this.
- AggregatingEdgeContext<VD,ED,A> - Class in org.apache.spark.graphx.impl
-
- AggregatingEdgeContext(Function2<A, A, A>, Object, BitSet) - Constructor for class org.apache.spark.graphx.impl.AggregatingEdgeContext
-
- Aggregator<K,V,C> - Class in org.apache.spark
-
:: DeveloperApi ::
A set of functions used to aggregate data.
- Aggregator(Function1<V, C>, Function2<C, V, C>, Function2<C, C, C>) - Constructor for class org.apache.spark.Aggregator
-
- aggregator() - Method in class org.apache.spark.ShuffleDependency
-
- AkkaUtils - Class in org.apache.spark.util
-
Various utility classes for working with Akka.
- AkkaUtils() - Constructor for class org.apache.spark.util.AkkaUtils
-
- Algo - Class in org.apache.spark.mllib.tree.configuration
-
:: Experimental ::
Enum to select the algorithm for the decision tree
- Algo() - Constructor for class org.apache.spark.mllib.tree.configuration.Algo
-
- algo() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
-
- algo() - Method in class org.apache.spark.mllib.tree.model.DecisionTreeModel
-
- algo() - Method in class org.apache.spark.mllib.tree.model.GradientBoostedTreesModel
-
- algo() - Method in class org.apache.spark.mllib.tree.model.RandomForestModel
-
- algorithm() - Method in class org.apache.spark.mllib.regression.StreamingLinearRegressionWithSGD
-
- alias() - Method in class org.apache.spark.sql.hive.MetastoreRelation
-
- aliasNames() - Method in class org.apache.spark.sql.hive.HiveGenericUdtf
-
- All - Static variable in class org.apache.spark.graphx.TripletFields
-
Expose all the fields (source, edge, and destination).
- allAggregates(Seq<Expression>) - Method in class org.apache.spark.sql.execution.SparkStrategies.HashAggregation
-
- AllCompressionSchemes - Interface in org.apache.spark.sql.columnar.compression
-
- AllJobsCancelled - Class in org.apache.spark.scheduler
-
- AllJobsCancelled() - Constructor for class org.apache.spark.scheduler.AllJobsCancelled
-
- AllJobsPage - Class in org.apache.spark.ui.jobs
-
Page showing list of all ongoing and recently finished jobs
- AllJobsPage(JobsTab) - Constructor for class org.apache.spark.ui.jobs.AllJobsPage
-
- allJoinTokens() - Static method in class org.apache.spark.sql.hive.HiveQl
-
- allocateBlocksToBatch(Time) - Method in class org.apache.spark.streaming.scheduler.ReceivedBlockTracker
-
Allocate all unallocated blocks to the given batch.
- allocateBlocksToBatch(Time) - Method in class org.apache.spark.streaming.scheduler.ReceiverTracker
-
Allocate all unallocated blocks to the given batch.
- AllocatedBlocks - Class in org.apache.spark.streaming.scheduler
-
Class representing the blocks of all the streams allocated to a batch
- AllocatedBlocks(Map<Object, Seq<ReceivedBlockInfo>>) - Constructor for class org.apache.spark.streaming.scheduler.AllocatedBlocks
-
- allocatedBlocks() - Method in class org.apache.spark.streaming.scheduler.BatchAllocationEvent
-
- allowExisting() - Method in class org.apache.spark.sql.hive.execution.CreateTableAsSelect
-
- allowLocal() - Method in class org.apache.spark.scheduler.JobSubmitted
-
- allPendingTasks() - Method in class org.apache.spark.scheduler.TaskSetManager
-
- AllStagesPage - Class in org.apache.spark.ui.jobs
-
Page showing list of all ongoing and recently finished stages and pools
- AllStagesPage(StagesTab) - Constructor for class org.apache.spark.ui.jobs.AllStagesPage
-
- AlphaComponent - Annotation Type in org.apache.spark.annotation
-
A new component of Spark which may have unstable APIs.
- alreadyPlanned() - Method in class org.apache.spark.sql.execution.SparkLogicalPlan
-
- ALS - Class in org.apache.spark.mllib.recommendation
-
Alternating Least Squares matrix factorization.
- ALS() - Constructor for class org.apache.spark.mllib.recommendation.ALS
-
Constructs an ALS instance with default parameters: {numBlocks: -1, rank: 10, iterations: 10,
lambda: 0.01, implicitPrefs: false, alpha: 1.0}.
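A minimal sketch of training an ALS model (mllib.recommendation), assuming an existing SparkContext named sc; the ratings are made up:

    import org.apache.spark.mllib.recommendation.{ALS, Rating}

    val ratings = sc.parallelize(Seq(
      Rating(1, 10, 4.0), Rating(1, 20, 3.0), Rating(2, 10, 5.0)))
    val model = ALS.train(ratings, rank = 10, iterations = 10, lambda = 0.01)
    val predicted = model.predict(2, 20)   // predicted rating of product 20 by user 2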
- ALS.BlockStats - Class in org.apache.spark.mllib.recommendation
-
:: DeveloperApi ::
Statistics of a block in ALS computation.
- ALS.BlockStats(String, int, long, long, long, long) - Constructor for class org.apache.spark.mllib.recommendation.ALS.BlockStats
-
- ALS.BlockStats$ - Class in org.apache.spark.mllib.recommendation
-
- ALS.BlockStats$() - Constructor for class org.apache.spark.mllib.recommendation.ALS.BlockStats$
-
- ALSPartitioner - Class in org.apache.spark.mllib.recommendation
-
Partitioner for ALS.
- ALSPartitioner(int) - Constructor for class org.apache.spark.mllib.recommendation.ALSPartitioner
-
- analyze(String) - Method in class org.apache.spark.sql.hive.HiveContext
-
Analyzes the given table in the current database to generate statistics, which will be
used in query optimizations.
- analyzeBlocks(RDD<Rating>, int, int) - Static method in class org.apache.spark.mllib.recommendation.ALS
-
:: DeveloperApi ::
Given an RDD of ratings, number of user blocks, and number of product blocks, computes the
statistics of each block in ALS computation.
- analyzed() - Method in class org.apache.spark.sql.hive.test.TestHiveContext.QueryExecution
-
- AnalyzeTable - Class in org.apache.spark.sql.hive
-
- AnalyzeTable(String) - Constructor for class org.apache.spark.sql.hive.AnalyzeTable
-
- AnalyzeTable - Class in org.apache.spark.sql.hive.execution
-
:: DeveloperApi ::
Analyzes the given table in the current database to generate statistics, which will be
used in query optimizations.
- AnalyzeTable(String) - Constructor for class org.apache.spark.sql.hive.execution.AnalyzeTable
-
- AND() - Static method in class org.apache.spark.sql.hive.HiveQl
-
- ANY() - Static method in class org.apache.spark.scheduler.TaskLocality
-
- append(boolean, ByteBuffer) - Static method in class org.apache.spark.sql.columnar.BOOLEAN
-
- append(Row, int, ByteBuffer) - Static method in class org.apache.spark.sql.columnar.BOOLEAN
-
- append(byte, ByteBuffer) - Static method in class org.apache.spark.sql.columnar.BYTE
-
- append(Row, int, ByteBuffer) - Static method in class org.apache.spark.sql.columnar.BYTE
-
- append(byte[], ByteBuffer) - Method in class org.apache.spark.sql.columnar.ByteArrayColumnType
-
- append(JvmType, ByteBuffer) - Method in class org.apache.spark.sql.columnar.ColumnType
-
Appends the given value v of type T into the given ByteBuffer.
- append(Row, int, ByteBuffer) - Method in class org.apache.spark.sql.columnar.ColumnType
-
Appends row(ordinal) of type T into the given ByteBuffer.
- append(Date, ByteBuffer) - Static method in class org.apache.spark.sql.columnar.DATE
-
- append(double, ByteBuffer) - Static method in class org.apache.spark.sql.columnar.DOUBLE
-
- append(Row, int, ByteBuffer) - Static method in class org.apache.spark.sql.columnar.DOUBLE
-
- append(float, ByteBuffer) - Static method in class org.apache.spark.sql.columnar.FLOAT
-
- append(Row, int, ByteBuffer) - Static method in class org.apache.spark.sql.columnar.FLOAT
-
- append(int, ByteBuffer) - Static method in class org.apache.spark.sql.columnar.INT
-
- append(Row, int, ByteBuffer) - Static method in class org.apache.spark.sql.columnar.INT
-
- append(long, ByteBuffer) - Static method in class org.apache.spark.sql.columnar.LONG
-
- append(Row, int, ByteBuffer) - Static method in class org.apache.spark.sql.columnar.LONG
-
- append(short, ByteBuffer) - Static method in class org.apache.spark.sql.columnar.SHORT
-
- append(Row, int, ByteBuffer) - Static method in class org.apache.spark.sql.columnar.SHORT
-
- append(String, ByteBuffer) - Static method in class org.apache.spark.sql.columnar.STRING
-
- append(Timestamp, ByteBuffer) - Static method in class org.apache.spark.sql.columnar.TIMESTAMP
-
- append(AvroFlumeEvent) - Method in class org.apache.spark.streaming.flume.FlumeEventServer
-
- appendBatch(List<AvroFlumeEvent>) - Method in class org.apache.spark.streaming.flume.FlumeEventServer
-
- appendBias(Vector) - Static method in class org.apache.spark.mllib.util.MLUtils
-
Returns a new vector with 1.0 (bias) appended to the input vector.
- appendFrom(Row, int) - Method in class org.apache.spark.sql.columnar.BasicColumnBuilder
-
- appendFrom(Row, int) - Method in interface org.apache.spark.sql.columnar.ColumnBuilder
-
Appends row(ordinal) to the column builder.
- appendFrom(Row, int) - Method in interface org.apache.spark.sql.columnar.compression.CompressibleColumnBuilder
-
- appendFrom(Row, int) - Method in interface org.apache.spark.sql.columnar.NullableColumnBuilder
-
- AppendingParquetOutputFormat - Class in org.apache.spark.sql.parquet
-
TODO: this will be able to append to directories it created itself, not necessarily
to imported ones.
- AppendingParquetOutputFormat(int) - Constructor for class org.apache.spark.sql.parquet.AppendingParquetOutputFormat
-
- appendReadColumns(Configuration, Seq<Integer>, Seq<String>) - Static method in class org.apache.spark.sql.hive.HiveShim
-
- appId() - Method in class org.apache.spark.scheduler.ApplicationEventListener
-
- appId() - Method in class org.apache.spark.scheduler.cluster.mesos.CoarseMesosSchedulerBackend
-
- appId() - Method in class org.apache.spark.scheduler.cluster.mesos.MesosSchedulerBackend
-
- appId() - Method in class org.apache.spark.scheduler.cluster.SparkDeploySchedulerBackend
-
- appId() - Method in interface org.apache.spark.scheduler.SchedulerBackend
-
- appId() - Method in class org.apache.spark.scheduler.SparkListenerApplicationStart
-
- appId() - Method in interface org.apache.spark.scheduler.TaskScheduler
-
- APPLICATION_COMPLETE() - Static method in class org.apache.spark.scheduler.EventLoggingListener
-
- applicationComplete() - Method in class org.apache.spark.scheduler.EventLoggingInfo
-
- applicationEndFromJson(JsonAST.JValue) - Static method in class org.apache.spark.util.JsonProtocol
-
- applicationEndToJson(SparkListenerApplicationEnd) - Static method in class org.apache.spark.util.JsonProtocol
-
- ApplicationEventListener - Class in org.apache.spark.scheduler
-
A simple listener for application events.
- ApplicationEventListener() - Constructor for class org.apache.spark.scheduler.ApplicationEventListener
-
- applicationId() - Method in class org.apache.spark.scheduler.cluster.mesos.CoarseMesosSchedulerBackend
-
- applicationId() - Method in class org.apache.spark.scheduler.cluster.mesos.MesosSchedulerBackend
-
- applicationId() - Method in class org.apache.spark.scheduler.cluster.SparkDeploySchedulerBackend
-
- applicationId() - Method in class org.apache.spark.scheduler.local.LocalBackend
-
- applicationId() - Method in interface org.apache.spark.scheduler.SchedulerBackend
-
Get an application ID associated with the job.
- applicationId() - Method in interface org.apache.spark.scheduler.TaskScheduler
-
Get an application ID associated with the job.
- applicationId() - Method in class org.apache.spark.scheduler.TaskSchedulerImpl
-
- applicationId() - Method in class org.apache.spark.SparkContext
-
- applicationStartFromJson(JsonAST.JValue) - Static method in class org.apache.spark.util.JsonProtocol
-
- applicationStartToJson(SparkListenerApplicationStart) - Static method in class org.apache.spark.util.JsonProtocol
-
- apply(RDD<Tuple2<Object, VD>>, RDD<Edge<ED>>, VD, StorageLevel, StorageLevel, ClassTag<VD>, ClassTag<ED>) - Static method in class org.apache.spark.graphx.Graph
-
Construct a graph from a collection of vertices and
edges with attributes.
- apply(RDD<Edge<ED>>, VD, StorageLevel, StorageLevel, ClassTag<VD>, ClassTag<ED>) - Static method in class org.apache.spark.graphx.impl.GraphImpl
-
Create a graph from edges, setting referenced vertices to `defaultVertexAttr`.
- apply(RDD<Tuple2<Object, VD>>, RDD<Edge<ED>>, VD, StorageLevel, StorageLevel, ClassTag<VD>, ClassTag<ED>) - Static method in class org.apache.spark.graphx.impl.GraphImpl
-
Create a graph from vertices and edges, setting missing vertices to `defaultVertexAttr`.
- apply(VertexRDD<VD>, EdgeRDD<ED>, ClassTag<VD>, ClassTag<ED>) - Static method in class org.apache.spark.graphx.impl.GraphImpl
-
Create a graph from a VertexRDD and an EdgeRDD with arbitrary replicated vertices.
- apply(Iterator<Tuple2<Object, VD>>, ClassTag<VD>) - Static method in class org.apache.spark.graphx.impl.ShippableVertexPartition
-
Construct a `ShippableVertexPartition` from the given vertices without any routing table.
- apply(Iterator<Tuple2<Object, VD>>, RoutingTablePartition, VD, ClassTag<VD>) - Static method in class org.apache.spark.graphx.impl.ShippableVertexPartition
-
Construct a ShippableVertexPartition from the given vertices with the specified routing table, filling in missing vertices mentioned in the routing table using defaultVal.
- apply(Iterator<Tuple2<Object, VD>>, RoutingTablePartition, VD, Function2<VD, VD, VD>, ClassTag<VD>) - Static method in class org.apache.spark.graphx.impl.ShippableVertexPartition
-
Construct a ShippableVertexPartition from the given vertices with the specified routing table, filling in missing vertices mentioned in the routing table using defaultVal, and merging duplicate vertex attributes with mergeFunc.
- apply(Iterator<Tuple2<Object, VD>>, ClassTag<VD>) - Static method in class org.apache.spark.graphx.impl.VertexPartition
-
Construct a `VertexPartition` from the given vertices.
- apply(long) - Method in class org.apache.spark.graphx.impl.VertexPartitionBase
-
Return the vertex attribute for the given vertex ID.
- apply(Graph<VD, ED>, A, int, EdgeDirection, Function3<Object, VD, A, VD>, Function1<EdgeTriplet<VD, ED>, Iterator<Tuple2<Object, A>>>, Function2<A, A, A>, ClassTag<VD>, ClassTag<ED>, ClassTag<A>) - Static method in class org.apache.spark.graphx.Pregel
-
Execute a Pregel-like iterative vertex-parallel abstraction.
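A minimal sketch of Pregel computing single-source shortest paths, assuming a graph of type Graph[Long, Double] whose edge attributes are distances and an assumed source vertex id sourceId:

    import org.apache.spark.graphx._

    val initial = graph.mapVertices((id, _) => if (id == sourceId) 0.0 else Double.PositiveInfinity)
    val sssp = Pregel(initial, Double.PositiveInfinity)(
      (id, dist, newDist) => math.min(dist, newDist),            // vertex program
      triplet =>                                                 // send messages along improving edges
        if (triplet.srcAttr + triplet.attr < triplet.dstAttr)
          Iterator((triplet.dstId, triplet.srcAttr + triplet.attr))
        else Iterator.empty,
      (a, b) => math.min(a, b))                                  // merge incoming messages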
- apply(RDD<Tuple2<Object, VD>>, ClassTag<VD>) - Static method in class org.apache.spark.graphx.VertexRDD
-
Constructs a standalone VertexRDD (one that is not set up for efficient joins with an EdgeRDD) from an RDD of vertex-attribute pairs.
- apply(RDD<Tuple2<Object, VD>>, EdgeRDD<?>, VD, ClassTag<VD>) - Static method in class org.apache.spark.graphx.VertexRDD
-
Constructs a VertexRDD from an RDD of vertex-attribute pairs.
- apply(RDD<Tuple2<Object, VD>>, EdgeRDD<?>, VD, Function2<VD, VD, VD>, ClassTag<VD>) - Static method in class org.apache.spark.graphx.VertexRDD
-
Constructs a VertexRDD from an RDD of vertex-attribute pairs.
- apply(Param<T>) - Method in class org.apache.spark.ml.param.ParamMap
-
Gets the value of the input param or its default value if it does not exist.
- apply(BinaryConfusionMatrix) - Method in interface org.apache.spark.mllib.evaluation.binary.BinaryClassificationMetricComputer
-
- apply(BinaryConfusionMatrix) - Static method in class org.apache.spark.mllib.evaluation.binary.FalsePositiveRate
-
- apply(BinaryConfusionMatrix) - Method in class org.apache.spark.mllib.evaluation.binary.FMeasure
-
- apply(BinaryConfusionMatrix) - Static method in class org.apache.spark.mllib.evaluation.binary.Precision
-
- apply(BinaryConfusionMatrix) - Static method in class org.apache.spark.mllib.evaluation.binary.Recall
-
- apply(int) - Method in class org.apache.spark.mllib.linalg.DenseMatrix
-
- apply(int, int) - Method in class org.apache.spark.mllib.linalg.DenseMatrix
-
- apply(int) - Method in class org.apache.spark.mllib.linalg.DenseVector
-
- apply(int, int) - Method in interface org.apache.spark.mllib.linalg.Matrix
-
Gets the (i, j)-th element.
- apply(int, int) - Method in class org.apache.spark.mllib.linalg.SparseMatrix
-
- apply(int) - Method in interface org.apache.spark.mllib.linalg.Vector
-
Gets the value of the ith element.
- apply(int, Predict, double, boolean) - Static method in class org.apache.spark.mllib.tree.model.Node
-
Construct a node with nodeIndex, predict, impurity and isLeaf parameters.
- apply(String) - Static method in class org.apache.spark.rdd.PartitionGroup
-
- apply(long, String, Option<String>, String) - Static method in class org.apache.spark.scheduler.AccumulableInfo
-
- apply(long, String, String) - Static method in class org.apache.spark.scheduler.AccumulableInfo
-
- apply(String, long, Enumeration.Value, ByteBuffer) - Method in class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.StatusUpdate$
-
Alternate factory method that takes a ByteBuffer directly for the data field
- apply(BlockManagerId, long[]) - Static method in class org.apache.spark.scheduler.HighlyCompressedMapStatus
-
- apply(long, TaskMetrics) - Static method in class org.apache.spark.scheduler.RuntimePercentage
-
- apply(DataType) - Static method in class org.apache.spark.sql.columnar.ColumnType
-
- apply(boolean, int, StorageLevel, SparkPlan, Option<String>) - Static method in class org.apache.spark.sql.columnar.InMemoryRelation
-
- apply(SparkPlan) - Method in class org.apache.spark.sql.execution.AddExchange
-
- apply(PythonUDF, LogicalPlan) - Static method in class org.apache.spark.sql.execution.EvaluatePython
-
- apply(LogicalPlan) - Static method in class org.apache.spark.sql.execution.ExtractPythonUdfs
-
- apply(LogicalPlan) - Method in class org.apache.spark.sql.execution.SparkStrategies.BasicOperators
-
- apply(LogicalPlan) - Method in class org.apache.spark.sql.execution.SparkStrategies.BroadcastNestedLoopJoin
-
- apply(LogicalPlan) - Method in class org.apache.spark.sql.execution.SparkStrategies.CartesianProduct
-
- apply(LogicalPlan) - Method in class org.apache.spark.sql.execution.SparkStrategies.CommandStrategy
-
- apply(LogicalPlan) - Method in class org.apache.spark.sql.execution.SparkStrategies.HashAggregation
-
- apply(LogicalPlan) - Method in class org.apache.spark.sql.execution.SparkStrategies.HashJoin
-
- apply(LogicalPlan) - Method in class org.apache.spark.sql.execution.SparkStrategies.InMemoryScans
-
- apply(LogicalPlan) - Method in class org.apache.spark.sql.execution.SparkStrategies.LeftSemiJoin
-
- apply(LogicalPlan) - Method in class org.apache.spark.sql.execution.SparkStrategies.ParquetOperations
-
- apply(LogicalPlan) - Method in class org.apache.spark.sql.execution.SparkStrategies.TakeOrdered
-
- apply(LogicalPlan) - Method in class org.apache.spark.sql.hive.HiveMetastoreCatalog.CreateTables
-
- apply(LogicalPlan) - Method in class org.apache.spark.sql.hive.HiveMetastoreCatalog.PreInsertionCasts
-
- apply(LogicalPlan) - Method in class org.apache.spark.sql.hive.HiveStrategies.DataSinks
-
- apply(LogicalPlan) - Method in class org.apache.spark.sql.hive.HiveStrategies.HiveCommandStrategy
-
- apply(LogicalPlan) - Method in class org.apache.spark.sql.hive.HiveStrategies.HiveTableScans
-
- apply(LogicalPlan) - Method in class org.apache.spark.sql.hive.HiveStrategies.ParquetConversion
-
- apply(LogicalPlan) - Method in class org.apache.spark.sql.hive.HiveStrategies.Scripts
-
- apply(LogicalPlan) - Static method in class org.apache.spark.sql.sources.DataSourceStrategy
-
- apply(String) - Method in class org.apache.spark.sql.sources.DDLParser
-
- apply(String) - Static method in class org.apache.spark.storage.BlockId
-
Converts a BlockId "name" String back into a BlockId.
- apply(String, String, int) - Static method in class org.apache.spark.storage.BlockManagerId
-
- apply(ObjectInput) - Static method in class org.apache.spark.storage.BlockManagerId
-
- apply(boolean, boolean, boolean, boolean, int) - Static method in class org.apache.spark.storage.StorageLevel
-
:: DeveloperApi ::
Create a new StorageLevel object without setting useOffHeap.
- apply(boolean, boolean, boolean, int) - Static method in class org.apache.spark.storage.StorageLevel
-
:: DeveloperApi ::
Create a new StorageLevel object.
- apply(int, int) - Static method in class org.apache.spark.storage.StorageLevel
-
:: DeveloperApi ::
Create a new StorageLevel object from its integer representation.
- apply(ObjectInput) - Static method in class org.apache.spark.storage.StorageLevel
-
:: DeveloperApi ::
Read StorageLevel object from ObjectInput stream.
- apply(long) - Static method in class org.apache.spark.streaming.Milliseconds
-
- apply(long) - Static method in class org.apache.spark.streaming.Minutes
-
- apply(long) - Static method in class org.apache.spark.streaming.Seconds
-
- apply(I, Function0<BoxedUnit>) - Static method in class org.apache.spark.util.CompletionIterator
-
- apply(Traversable<Object>) - Static method in class org.apache.spark.util.Distribution
-
- apply(InputStream, File, SparkConf) - Static method in class org.apache.spark.util.logging.FileAppender
-
Create the right appender based on Spark configuration
- apply(TraversableOnce<Object>) - Static method in class org.apache.spark.util.StatCounter
-
Build a StatCounter from a list of values.
- apply(Seq<Object>) - Static method in class org.apache.spark.util.StatCounter
-
Build a StatCounter from a list of values passed as variable-length arguments.
- apply(A) - Method in class org.apache.spark.util.TimeStampedHashMap
-
- apply(A) - Method in class org.apache.spark.util.TimeStampedWeakValueHashMap
-
- apply(int) - Method in class org.apache.spark.util.Vector
-
- applySchema(JavaRDD<?>, Class<?>) - Method in class org.apache.spark.sql.api.java.JavaSQLContext
-
Applies a schema to an RDD of Java Beans.
- applySchema(JavaRDD<Row>, StructType) - Method in class org.apache.spark.sql.api.java.JavaSQLContext
-
:: DeveloperApi ::
Creates a JavaSchemaRDD from an RDD containing Rows by applying a schema to this RDD.
- applySchema(RDD<Row>, StructType) - Method in class org.apache.spark.sql.SQLContext
-
:: DeveloperApi ::
Creates a SchemaRDD from an RDD containing Rows by applying a schema to this RDD.
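A minimal sketch of applySchema, assuming an existing SQLContext named sqlContext and SparkContext named sc; the exact import paths for Row and the struct types differ across Spark 1.x versions:

    import org.apache.spark.sql._

    val rows = sc.parallelize(Seq(Row("alice", 30), Row("bob", 25)))
    val schema = StructType(Seq(
      StructField("name", StringType, nullable = true),
      StructField("age", IntegerType, nullable = true)))
    val people = sqlContext.applySchema(rows, schema)   // SchemaRDD with the given column names/types
    people.registerTempTable("people")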
- applySchemaToPythonRDD(RDD<Object[]>, String) - Method in class org.apache.spark.sql.SQLContext
-
Apply a schema defined by the schemaString to an RDD.
- applySchemaToPythonRDD(RDD<Object[]>, StructType) - Method in class org.apache.spark.sql.SQLContext
-
Apply a schema defined by the schema to an RDD.
- appName() - Method in class org.apache.spark.api.java.JavaSparkContext
-
- appName() - Method in class org.apache.spark.scheduler.ApplicationEventListener
-
- appName() - Method in class org.apache.spark.scheduler.SparkListenerApplicationStart
-
- appName() - Method in class org.apache.spark.SparkContext
-
- appName() - Method in class org.apache.spark.ui.SparkUI
-
- appName() - Method in class org.apache.spark.ui.SparkUITab
-
- ApproxHist() - Static method in class org.apache.spark.mllib.tree.configuration.QuantileStrategy
-
- ApproximateActionListener<T,U,R> - Class in org.apache.spark.partial
-
A JobListener for an approximate single-result action, such as count() or non-parallel reduce().
- ApproximateActionListener(RDD<T>, Function2<TaskContext, Iterator<T>, U>, ApproximateEvaluator<U, R>, long) - Constructor for class org.apache.spark.partial.ApproximateActionListener
-
- ApproximateEvaluator<U,R> - Interface in org.apache.spark.partial
-
An object that computes a function incrementally by merging in results of type U from multiple
tasks.
- appUIAddress() - Method in class org.apache.spark.ui.SparkUI
-
- appUIHostPort() - Method in class org.apache.spark.ui.SparkUI
-
Return the application UI host:port.
- AreaUnderCurve - Class in org.apache.spark.mllib.evaluation
-
Computes the area under the curve (AUC) using the trapezoidal rule.
- AreaUnderCurve() - Constructor for class org.apache.spark.mllib.evaluation.AreaUnderCurve
-
- areaUnderPR() - Method in class org.apache.spark.mllib.evaluation.BinaryClassificationMetrics
-
Computes the area under the precision-recall curve.
- areaUnderROC() - Method in class org.apache.spark.mllib.evaluation.BinaryClassificationMetrics
-
Computes the area under the receiver operating characteristic (ROC) curve.
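A minimal sketch of BinaryClassificationMetrics, assuming an existing SparkContext named sc; the (score, label) pairs are made up:

    import org.apache.spark.mllib.evaluation.BinaryClassificationMetrics

    val scoreAndLabels = sc.parallelize(Seq((0.9, 1.0), (0.7, 1.0), (0.4, 0.0), (0.2, 0.0)))
    val metrics = new BinaryClassificationMetrics(scoreAndLabels)
    println(metrics.areaUnderROC())   // area under the ROC curve
    println(metrics.areaUnderPR())    // area under the precision-recall curve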
- areBoundsEmpty() - Method in class org.apache.spark.util.random.AcceptanceResult
-
- argString() - Method in class org.apache.spark.sql.hive.execution.CreateTableAsSelect
-
- arr() - Method in class org.apache.spark.rdd.PartitionGroup
-
- ARRAY() - Static method in class org.apache.spark.sql.hive.HiveQl
-
- ARRAY_CONTAINS_NULL_BAG_SCHEMA_NAME() - Static method in class org.apache.spark.sql.parquet.CatalystConverter
-
- ARRAY_ELEMENTS_SCHEMA_NAME() - Static method in class org.apache.spark.sql.parquet.CatalystConverter
-
- arrayBuffer() - Method in class org.apache.spark.streaming.receiver.ArrayBufferBlock
-
- ArrayBufferBlock - Class in org.apache.spark.streaming.receiver
-
Class representing a block received as an ArrayBuffer.
- ArrayBufferBlock(ArrayBuffer<?>) - Constructor for class org.apache.spark.streaming.receiver.ArrayBufferBlock
-
- ArrayType - Class in org.apache.spark.sql.api.java
-
The data type representing Lists.
- ArrayValues - Class in org.apache.spark.storage
-
- ArrayValues(Object[]) - Constructor for class org.apache.spark.storage.ArrayValues
-
- as(Symbol) - Method in class org.apache.spark.sql.SchemaRDD
-
Applies a qualifier to the attributes of this relation.
- asIterator() - Method in class org.apache.spark.serializer.DeserializationStream
-
Read the elements of this stream through an iterator.
- asJavaDataType(DataType) - Static method in class org.apache.spark.sql.types.util.DataTypeConversions
-
Returns the equivalent DataType in Java for the given DataType in Scala.
- asJavaStructField(StructField) - Static method in class org.apache.spark.sql.types.util.DataTypeConversions
-
Returns the equivalent StructField in Java for the given StructField in Scala.
- askSlaves() - Method in class org.apache.spark.storage.BlockManagerMessages.GetBlockStatus
-
- askSlaves() - Method in class org.apache.spark.storage.BlockManagerMessages.GetMatchingBlockIds
-
- askTimeout(SparkConf) - Static method in class org.apache.spark.util.AkkaUtils
-
Returns the default Spark timeout to use for Akka ask operations.
- askWithReply(Object, ActorRef, FiniteDuration) - Static method in class org.apache.spark.util.AkkaUtils
-
Send a message to the given actor and get its result within a default timeout, or
throw a SparkException if this fails.
- askWithReply(Object, ActorRef, int, int, FiniteDuration) - Static method in class org.apache.spark.util.AkkaUtils
-
Send a message to the given actor and get its result within a default timeout, or
throw a SparkException if this fails even after the specified number of retries.
- asRDDId() - Method in class org.apache.spark.storage.BlockId
-
- asScalaDataType(DataType) - Static method in class org.apache.spark.sql.types.util.DataTypeConversions
-
Returns the equivalent DataType in Scala for the given DataType in Java.
- asScalaStructField(StructField) - Static method in class org.apache.spark.sql.types.util.DataTypeConversions
-
Returns the equivalent StructField in Scala for the given StructField in Java.
- assertValid() - Method in class org.apache.spark.mllib.tree.configuration.BoostingStrategy
-
Check validity of parameters.
- assertValid() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
-
Check validity of parameters.
- assertValid() - Method in class org.apache.spark.rdd.BlockRDD
-
Check if this BlockRDD is valid.
- AsyncRDDActions<T> - Class in org.apache.spark.rdd
-
A set of asynchronous RDD actions available through an implicit conversion.
- AsyncRDDActions(RDD<T>, ClassTag<T>) - Constructor for class org.apache.spark.rdd.AsyncRDDActions
-
- attachExecutor(ReceiverSupervisor) - Method in class org.apache.spark.streaming.receiver.Receiver
-
Attach Network Receiver executor to this receiver.
- attachHandler(ServletContextHandler) - Method in class org.apache.spark.ui.WebUI
-
Attach a handler to this UI.
- attachListener(CleanerListener) - Method in class org.apache.spark.ContextCleaner
-
Attach a listener object to get information of when objects are cleaned.
- attachPage(WebUIPage) - Method in class org.apache.spark.ui.WebUI
-
Attach a page to this UI.
- attachPage(WebUIPage) - Method in class org.apache.spark.ui.WebUITab
-
Attach a page to this tab.
- attachTab(WebUITab) - Method in class org.apache.spark.ui.WebUI
-
Attach a tab to this UI, along with all of its attached pages.
- attempt() - Method in class org.apache.spark.scheduler.TaskInfo
-
- attempt() - Method in class org.apache.spark.scheduler.TaskSet
-
- attemptId() - Method in class org.apache.spark.scheduler.Stage
-
- attemptId() - Method in class org.apache.spark.scheduler.StageInfo
-
- attemptId() - Method in class org.apache.spark.TaskContext
-
- attemptId() - Method in class org.apache.spark.TaskContextImpl
-
- attr() - Method in class org.apache.spark.graphx.Edge
-
- attr() - Method in class org.apache.spark.graphx.EdgeContext
-
The attribute associated with the edge.
- attr() - Method in class org.apache.spark.graphx.impl.AggregatingEdgeContext
-
- attr() - Method in class org.apache.spark.graphx.impl.EdgeWithLocalIds
-
- attribute() - Method in class org.apache.spark.sql.sources.EqualTo
-
- attribute() - Method in class org.apache.spark.sql.sources.GreaterThan
-
- attribute() - Method in class org.apache.spark.sql.sources.GreaterThanOrEqual
-
- attribute() - Method in class org.apache.spark.sql.sources.In
-
- attribute() - Method in class org.apache.spark.sql.sources.LessThan
-
- attribute() - Method in class org.apache.spark.sql.sources.LessThanOrEqual
-
- attributeMap() - Method in class org.apache.spark.sql.hive.MetastoreRelation
-
An attribute map that can be used to lookup original attributes based on expression id.
- attributeMap() - Method in class org.apache.spark.sql.parquet.ParquetRelation
-
- attributeMap() - Method in class org.apache.spark.sql.sources.LogicalRelation
-
Used to look up the original attribute capitalization.
- attributes() - Method in class org.apache.spark.sql.columnar.InMemoryColumnarTableScan
-
- attributes() - Method in class org.apache.spark.sql.hive.execution.HiveTableScan
-
- attributes() - Method in class org.apache.spark.sql.hive.MetastoreRelation
-
Non-partitionKey attributes
- attributes() - Method in class org.apache.spark.sql.parquet.ParquetTableScan
-
- attributes() - Method in class org.apache.spark.sql.parquet.RowWriteSupport
-
- attrs() - Method in class org.apache.spark.graphx.impl.VertexAttributeBlock
-
- autoBroadcastJoinThreshold() - Method in interface org.apache.spark.sql.SQLConf
-
Upper bound on the sizes (in bytes) of the tables qualified for the automatic conversion to a broadcast value during the physical execution of join operations.
- Average() - Static method in class org.apache.spark.mllib.tree.configuration.EnsembleCombiningStrategy
-
- AVG() - Static method in class org.apache.spark.sql.hive.HiveQl
-
- awaitResult() - Method in class org.apache.spark.partial.ApproximateActionListener
-
Waits for up to timeout milliseconds since the listener was created and then returns a
PartialResult with the result so far.
- awaitResult() - Method in class org.apache.spark.scheduler.JobWaiter
-
- awaitTermination() - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
-
Wait for the execution to stop.
- awaitTermination(long) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
-
Wait for the execution to stop.
- awaitTermination() - Method in class org.apache.spark.streaming.receiver.ReceiverSupervisor
-
Block the calling thread until the supervisor is stopped.
- awaitTermination() - Method in class org.apache.spark.streaming.StreamingContext
-
Wait for the execution to stop.
- awaitTermination(long) - Method in class org.apache.spark.streaming.StreamingContext
-
Wait for the execution to stop.
- awaitTermination() - Method in class org.apache.spark.util.logging.FileAppender
-
Wait for the appender to stop appending, either because the input stream is closed or because of an error while appending.
- axpy(double, Vector, Vector) - Static method in class org.apache.spark.mllib.linalg.BLAS
-
y += a * x
- cache() - Method in class org.apache.spark.api.java.JavaDoubleRDD
-
Persist this RDD with the default storage level (`MEMORY_ONLY`).
- cache() - Method in class org.apache.spark.api.java.JavaPairRDD
-
Persist this RDD with the default storage level (`MEMORY_ONLY`).
- cache() - Method in class org.apache.spark.api.java.JavaRDD
-
Persist this RDD with the default storage level (`MEMORY_ONLY`).
- cache() - Method in class org.apache.spark.graphx.Graph
-
Caches the vertices and edges associated with this graph at the previously-specified target storage levels, which default to MEMORY_ONLY.
- cache() - Method in class org.apache.spark.graphx.impl.EdgeRDDImpl
-
Persists the edge partitions using `targetStorageLevel`, which defaults to MEMORY_ONLY.
- cache() - Method in class org.apache.spark.graphx.impl.GraphImpl
-
- cache() - Method in class org.apache.spark.graphx.impl.VertexRDDImpl
-
Persists the vertex partitions at `targetStorageLevel`, which defaults to MEMORY_ONLY.
- cache() - Method in class org.apache.spark.partial.StudentTCacher
-
- cache() - Method in class org.apache.spark.rdd.RDD
-
Persist this RDD with the default storage level (`MEMORY_ONLY`).
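A short sketch of caching an RDD that feeds more than one action so it is computed only once; the input path is a placeholder.

```scala
// Cache an RDD reused by several actions; default storage level is MEMORY_ONLY.
val lines = sc.textFile("hdfs:///path/to/input")   // placeholder path
val words = lines.flatMap(_.split(" ")).cache()
val total    = words.count()
val distinct = words.distinct().count()
```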
- cache() - Method in class org.apache.spark.sql.api.java.JavaSchemaRDD
-
Persist this RDD with the default storage level (`MEMORY_ONLY`).
- cache() - Method in class org.apache.spark.sql.SchemaRDD
-
Overridden cache function will always use the in-memory columnar caching.
- cache() - Method in class org.apache.spark.streaming.api.java.JavaDStream
-
Persist RDDs of this DStream with the default storage level (MEMORY_ONLY_SER)
- cache() - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
-
Persist RDDs of this DStream with the default storage level (MEMORY_ONLY_SER)
- cache() - Method in class org.apache.spark.streaming.dstream.DStream
-
Persist RDDs of this DStream with the default storage level (MEMORY_ONLY_SER)
- CachedBatch - Class in org.apache.spark.sql.columnar
-
- CachedBatch(byte[][], Row) - Constructor for class org.apache.spark.sql.columnar.CachedBatch
-
- cachedColumnBuffers() - Method in class org.apache.spark.sql.columnar.InMemoryRelation
-
- CachedData - Class in org.apache.spark.sql
-
Holds a cached logical plan and its data
- CachedData(LogicalPlan, InMemoryRelation) - Constructor for class org.apache.spark.sql.CachedData
-
- cachedData() - Method in interface org.apache.spark.sql.CacheManager
-
- cachedRepresentation() - Method in class org.apache.spark.sql.CachedData
-
- cacheLock() - Method in interface org.apache.spark.sql.CacheManager
-
- CacheManager - Class in org.apache.spark
-
Spark class responsible for passing RDDs partition contents to the BlockManager and making
sure a node doesn't load two copies of an RDD at once.
- CacheManager(BlockManager) - Constructor for class org.apache.spark.CacheManager
-
- cacheManager() - Method in class org.apache.spark.SparkEnv
-
- CacheManager - Interface in org.apache.spark.sql
-
Provides support in a SQLContext for caching query results and automatically using these cached
results when subsequent queries are executed.
- cacheQuery(SchemaRDD, Option<String>, StorageLevel) - Method in interface org.apache.spark.sql.CacheManager
-
Caches the data produced by the logical representation of the given schema rdd.
- cacheTable(String) - Method in interface org.apache.spark.sql.CacheManager
-
Caches the specified table in-memory.
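A hedged sketch of caching a registered table in the in-memory columnar store; the file path and table name are placeholders.

```scala
// Sketch: register a SchemaRDD as a temporary table, then cache it.
val people = sqlContext.jsonFile("people.json")    // placeholder input
people.registerTempTable("people")
sqlContext.cacheTable("people")
val adults = sqlContext.sql("SELECT name FROM people WHERE age >= 18")
```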
- CacheTableCommand - Class in org.apache.spark.sql.execution
-
:: DeveloperApi ::
- CacheTableCommand(String, Option<LogicalPlan>, boolean) - Constructor for class org.apache.spark.sql.execution.CacheTableCommand
-
- cacheTables() - Method in class org.apache.spark.sql.hive.test.TestHiveContext
-
- calculate(double[], double) - Static method in class org.apache.spark.mllib.tree.impurity.Entropy
-
:: DeveloperApi ::
information calculation for multiclass classification
- calculate(double, double, double) - Static method in class org.apache.spark.mllib.tree.impurity.Entropy
-
:: DeveloperApi ::
variance calculation
- calculate() - Method in class org.apache.spark.mllib.tree.impurity.EntropyCalculator
-
Calculate the impurity from the stored sufficient statistics.
- calculate(double[], double) - Static method in class org.apache.spark.mllib.tree.impurity.Gini
-
:: DeveloperApi ::
information calculation for multiclass classification
- calculate(double, double, double) - Static method in class org.apache.spark.mllib.tree.impurity.Gini
-
:: DeveloperApi ::
variance calculation
- calculate() - Method in class org.apache.spark.mllib.tree.impurity.GiniCalculator
-
Calculate the impurity from the stored sufficient statistics.
- calculate(double[], double) - Method in interface org.apache.spark.mllib.tree.impurity.Impurity
-
:: DeveloperApi ::
information calculation for multiclass classification
- calculate(double, double, double) - Method in interface org.apache.spark.mllib.tree.impurity.Impurity
-
:: DeveloperApi ::
information calculation for regression
- calculate() - Method in class org.apache.spark.mllib.tree.impurity.ImpurityCalculator
-
Calculate the impurity from the stored sufficient statistics.
- calculate(double[], double) - Static method in class org.apache.spark.mllib.tree.impurity.Variance
-
:: DeveloperApi ::
information calculation for multiclass classification
- calculate(double, double, double) - Static method in class org.apache.spark.mllib.tree.impurity.Variance
-
:: DeveloperApi ::
variance calculation
- calculate() - Method in class org.apache.spark.mllib.tree.impurity.VarianceCalculator
-
Calculate the impurity from the stored sufficient statistics.
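For orientation, these calculators implement the standard impurity measures; writing p_i for the class proportions (classification) and the label mean for regression, they are, up to the choice of logarithm base, roughly:

\[
\mathrm{Entropy} = -\sum_{i=1}^{C} p_i \log_2 p_i, \qquad
\mathrm{Gini} = 1 - \sum_{i=1}^{C} p_i^2, \qquad
\mathrm{Variance} = \frac{1}{n} \sum_{j=1}^{n} \left(y_j - \bar{y}\right)^2
\]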
- calculatedTasks() - Method in class org.apache.spark.scheduler.TaskSetManager
-
- calculateNumBatchesToRemember(Duration) - Static method in class org.apache.spark.streaming.dstream.FileInputDStream
-
Calculate the number of recent batches to remember, such that all files selected within
at least the last MIN_REMEMBER_DURATION can be remembered.
- calculateTotalMemory(SparkContext) - Static method in class org.apache.spark.scheduler.cluster.mesos.MemoryUtils
-
- call(T) - Method in interface org.apache.spark.api.java.function.DoubleFlatMapFunction
-
- call(T) - Method in interface org.apache.spark.api.java.function.DoubleFunction
-
- call(T) - Method in interface org.apache.spark.api.java.function.FlatMapFunction
-
- call(T1, T2) - Method in interface org.apache.spark.api.java.function.FlatMapFunction2
-
- call(T1) - Method in interface org.apache.spark.api.java.function.Function
-
- call(T1, T2) - Method in interface org.apache.spark.api.java.function.Function2
-
- call(T1, T2, T3) - Method in interface org.apache.spark.api.java.function.Function3
-
- call(T) - Method in interface org.apache.spark.api.java.function.PairFlatMapFunction
-
- call(T) - Method in interface org.apache.spark.api.java.function.PairFunction
-
- call(T) - Method in interface org.apache.spark.api.java.function.VoidFunction
-
- call(T1) - Method in interface org.apache.spark.sql.api.java.UDF1
-
- call(T1, T2, T3, T4, T5, T6, T7, T8, T9, T10) - Method in interface org.apache.spark.sql.api.java.UDF10
-
- call(T1, T2, T3, T4, T5, T6, T7, T8, T9, T10, T11) - Method in interface org.apache.spark.sql.api.java.UDF11
-
- call(T1, T2, T3, T4, T5, T6, T7, T8, T9, T10, T11, T12) - Method in interface org.apache.spark.sql.api.java.UDF12
-
- call(T1, T2, T3, T4, T5, T6, T7, T8, T9, T10, T11, T12, T13) - Method in interface org.apache.spark.sql.api.java.UDF13
-
- call(T1, T2, T3, T4, T5, T6, T7, T8, T9, T10, T11, T12, T13, T14) - Method in interface org.apache.spark.sql.api.java.UDF14
-
- call(T1, T2, T3, T4, T5, T6, T7, T8, T9, T10, T11, T12, T13, T14, T15) - Method in interface org.apache.spark.sql.api.java.UDF15
-
- call(T1, T2, T3, T4, T5, T6, T7, T8, T9, T10, T11, T12, T13, T14, T15, T16) - Method in interface org.apache.spark.sql.api.java.UDF16
-
- call(T1, T2, T3, T4, T5, T6, T7, T8, T9, T10, T11, T12, T13, T14, T15, T16, T17) - Method in interface org.apache.spark.sql.api.java.UDF17
-
- call(T1, T2, T3, T4, T5, T6, T7, T8, T9, T10, T11, T12, T13, T14, T15, T16, T17, T18) - Method in interface org.apache.spark.sql.api.java.UDF18
-
- call(T1, T2, T3, T4, T5, T6, T7, T8, T9, T10, T11, T12, T13, T14, T15, T16, T17, T18, T19) - Method in interface org.apache.spark.sql.api.java.UDF19
-
- call(T1, T2) - Method in interface org.apache.spark.sql.api.java.UDF2
-
- call(T1, T2, T3, T4, T5, T6, T7, T8, T9, T10, T11, T12, T13, T14, T15, T16, T17, T18, T19, T20) - Method in interface org.apache.spark.sql.api.java.UDF20
-
- call(T1, T2, T3, T4, T5, T6, T7, T8, T9, T10, T11, T12, T13, T14, T15, T16, T17, T18, T19, T20, T21) - Method in interface org.apache.spark.sql.api.java.UDF21
-
- call(T1, T2, T3, T4, T5, T6, T7, T8, T9, T10, T11, T12, T13, T14, T15, T16, T17, T18, T19, T20, T21, T22) - Method in interface org.apache.spark.sql.api.java.UDF22
-
- call(T1, T2, T3) - Method in interface org.apache.spark.sql.api.java.UDF3
-
- call(T1, T2, T3, T4) - Method in interface org.apache.spark.sql.api.java.UDF4
-
- call(T1, T2, T3, T4, T5) - Method in interface org.apache.spark.sql.api.java.UDF5
-
- call(T1, T2, T3, T4, T5, T6) - Method in interface org.apache.spark.sql.api.java.UDF6
-
- call(T1, T2, T3, T4, T5, T6, T7) - Method in interface org.apache.spark.sql.api.java.UDF7
-
- call(T1, T2, T3, T4, T5, T6, T7, T8) - Method in interface org.apache.spark.sql.api.java.UDF8
-
- call(T1, T2, T3, T4, T5, T6, T7, T8, T9) - Method in interface org.apache.spark.sql.api.java.UDF9
-
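The UDF1 through UDF22 interfaces back Java user-defined functions of one to 22 arguments. A hedged sketch of the equivalent registration from Scala; the function name, table, and exact registration call available in this release are assumptions.

```scala
// Sketch: register a one-argument UDF and call it from SQL. "strLen" and the
// "people" table are placeholders; in the Java API the same function would be
// written by implementing the UDF1 interface.
sqlContext.registerFunction("strLen", (s: String) => s.length)
val lengths = sqlContext.sql("SELECT strLen(name) FROM people")
```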
- callSite() - Method in class org.apache.spark.scheduler.ActiveJob
-
- callSite() - Method in class org.apache.spark.scheduler.JobSubmitted
-
- callSite() - Method in class org.apache.spark.scheduler.Stage
-
- CallSite - Class in org.apache.spark.util
-
CallSite represents a place in user code.
- CallSite(String, String) - Constructor for class org.apache.spark.util.CallSite
-
- canBeCodeGened(Seq<AggregateExpression>) - Method in class org.apache.spark.sql.execution.SparkStrategies.HashAggregation
-
- cancel() - Method in class org.apache.spark.ComplexFutureAction
-
- cancel() - Method in interface org.apache.spark.FutureAction
-
Cancels the execution of this action.
- cancel(boolean) - Method in class org.apache.spark.JavaFutureActionWrapper
-
- cancel() - Method in class org.apache.spark.scheduler.JobWaiter
-
Sends a signal to the DAGScheduler to cancel the job.
- cancel() - Method in class org.apache.spark.SimpleFutureAction
-
- cancel() - Method in class org.apache.spark.util.MetadataCleaner
-
- cancelAllJobs() - Method in class org.apache.spark.api.java.JavaSparkContext
-
Cancel all jobs that have been scheduled or are running.
- cancelAllJobs() - Method in class org.apache.spark.scheduler.DAGScheduler
-
Cancel all jobs that are running or waiting in the queue.
- cancelAllJobs() - Method in class org.apache.spark.SparkContext
-
Cancel all jobs that have been scheduled or are running.
- cancelJob(int) - Method in class org.apache.spark.scheduler.DAGScheduler
-
Cancel a job that is running or waiting in the queue.
- cancelJob(int) - Method in class org.apache.spark.SparkContext
-
Cancel a given job if it's scheduled or running
- cancelJobGroup(String) - Method in class org.apache.spark.api.java.JavaSparkContext
-
Cancel active jobs for the specified group.
- cancelJobGroup(String) - Method in class org.apache.spark.scheduler.DAGScheduler
-
- cancelJobGroup(String) - Method in class org.apache.spark.SparkContext
-
Cancel active jobs for the specified group.
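A sketch of grouping jobs so they can be cancelled together; the group id and description are illustrative.

```scala
// Tag jobs submitted from this thread with a group id, then cancel the whole
// group (typically from another thread). "adhoc" is a placeholder id.
sc.setJobGroup("adhoc", "long-running ad-hoc queries", interruptOnCancel = true)
// ... trigger actions here ...
sc.cancelJobGroup("adhoc")
```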
- cancelStage(int) - Method in class org.apache.spark.scheduler.DAGScheduler
-
Cancel all jobs associated with a running or scheduled stage.
- cancelStage(int) - Method in class org.apache.spark.SparkContext
-
Cancel a given stage and all jobs associated with it
- cancelTasks(int, boolean) - Method in interface org.apache.spark.scheduler.TaskScheduler
-
- cancelTasks(int, boolean) - Method in class org.apache.spark.scheduler.TaskSchedulerImpl
-
- canEqual(Object) - Method in class org.apache.spark.sql.api.java.Row
-
- canEqual(Object) - Method in class org.apache.spark.util.MutablePair
-
- canFetchMoreResults(long) - Method in class org.apache.spark.scheduler.TaskSetManager
-
Check whether there is enough quota to fetch a result of `size` bytes.
- capacity() - Method in class org.apache.spark.graphx.impl.VertexPartitionBase
-
- cartesian(JavaRDDLike<U, ?>) - Method in interface org.apache.spark.api.java.JavaRDDLike
-
Return the Cartesian product of this RDD and another one, that is, the RDD of all pairs of
elements (a, b) where a is in `this` and b is in `other`.
- cartesian(RDD<U>, ClassTag<U>) - Method in class org.apache.spark.rdd.RDD
-
Return the Cartesian product of this RDD and another one, that is, the RDD of all pairs of
elements (a, b) where a is in `this` and b is in `other`.
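A small sketch of the Cartesian product of two RDDs; the data is made up.

```scala
// cartesian pairs every element of one RDD with every element of the other.
val letters = sc.parallelize(Seq("a", "b"))
val numbers = sc.parallelize(Seq(1, 2, 3))
val pairs   = letters.cartesian(numbers)   // 2 * 3 = 6 pairs
// pairs.collect() contains (a,1), (a,2), (a,3), (b,1), (b,2), (b,3) (order may vary)
```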
- CartesianPartition - Class in org.apache.spark.rdd
-
- CartesianPartition(int, RDD<?>, RDD<?>, int, int) - Constructor for class org.apache.spark.rdd.CartesianPartition
-
- CartesianProduct - Class in org.apache.spark.sql.execution.joins
-
:: DeveloperApi ::
- CartesianProduct(SparkPlan, SparkPlan) - Constructor for class org.apache.spark.sql.execution.joins.CartesianProduct
-
- CartesianProduct() - Method in class org.apache.spark.sql.execution.SparkStrategies
-
- CartesianRDD<T,U> - Class in org.apache.spark.rdd
-
- CartesianRDD(SparkContext, RDD<T>, RDD<U>, ClassTag<T>, ClassTag<U>) - Constructor for class org.apache.spark.rdd.CartesianRDD
-
- CASE() - Static method in class org.apache.spark.sql.hive.HiveQl
-
- caseSensitive() - Method in class org.apache.spark.sql.hive.HiveMetastoreCatalog
-
- castChildOutput(InsertIntoTable, MetastoreRelation, LogicalPlan) - Method in class org.apache.spark.sql.hive.HiveMetastoreCatalog.PreInsertionCasts
-
- CatalystArrayContainsNullConverter - Class in org.apache.spark.sql.parquet
-
A parquet.io.api.GroupConverter that converts single-element groups matching the
characteristics of an array containing null (see ParquetTypesConverter) into an ArrayType.
- CatalystArrayContainsNullConverter(DataType, int, CatalystConverter, Buffer<Object>) - Constructor for class org.apache.spark.sql.parquet.CatalystArrayContainsNullConverter
-
- CatalystArrayContainsNullConverter(DataType, int, CatalystConverter) - Constructor for class org.apache.spark.sql.parquet.CatalystArrayContainsNullConverter
-
- CatalystArrayConverter - Class in org.apache.spark.sql.parquet
-
A parquet.io.api.GroupConverter that converts single-element groups matching the
characteristics of an array (see ParquetTypesConverter) into an ArrayType.
- CatalystArrayConverter(DataType, int, CatalystConverter, Buffer<Object>) - Constructor for class org.apache.spark.sql.parquet.CatalystArrayConverter
-
- CatalystArrayConverter(DataType, int, CatalystConverter) - Constructor for class org.apache.spark.sql.parquet.CatalystArrayConverter
-
- CatalystConverter - Class in org.apache.spark.sql.parquet
-
- CatalystConverter() - Constructor for class org.apache.spark.sql.parquet.CatalystConverter
-
- CatalystGroupConverter - Class in org.apache.spark.sql.parquet
-
A parquet.io.api.GroupConverter that is able to convert a Parquet record to a Row object.
- CatalystGroupConverter(StructField[], int, CatalystConverter, ArrayBuffer<Object>, ArrayBuffer<Row>) - Constructor for class org.apache.spark.sql.parquet.CatalystGroupConverter
-
- CatalystGroupConverter(StructField[], int, CatalystConverter) - Constructor for class org.apache.spark.sql.parquet.CatalystGroupConverter
-
- CatalystGroupConverter(Attribute[]) - Constructor for class org.apache.spark.sql.parquet.CatalystGroupConverter
-
This constructor is used for the root converter only!
- CatalystMapConverter - Class in org.apache.spark.sql.parquet
-
A parquet.io.api.GroupConverter that converts two-element groups matching the
characteristics of a map (see ParquetTypesConverter) into a MapType.
- CatalystMapConverter(StructField[], int, CatalystConverter) - Constructor for class org.apache.spark.sql.parquet.CatalystMapConverter
-
- CatalystNativeArrayConverter - Class in org.apache.spark.sql.parquet
-
A parquet.io.api.GroupConverter that converts single-element groups matching the
characteristics of an array (see ParquetTypesConverter) into an ArrayType.
- CatalystNativeArrayConverter(NativeType, int, CatalystConverter, int) - Constructor for class org.apache.spark.sql.parquet.CatalystNativeArrayConverter
-
- CatalystPrimitiveConverter - Class in org.apache.spark.sql.parquet
-
A parquet.io.api.PrimitiveConverter that converts Parquet types to Catalyst types.
- CatalystPrimitiveConverter(CatalystConverter, int) - Constructor for class org.apache.spark.sql.parquet.CatalystPrimitiveConverter
-
- CatalystPrimitiveRowConverter - Class in org.apache.spark.sql.parquet
-
A parquet.io.api.GroupConverter that is able to convert a Parquet record to a Row object.
- CatalystPrimitiveRowConverter(StructField[], MutableRow) - Constructor for class org.apache.spark.sql.parquet.CatalystPrimitiveRowConverter
-
- CatalystPrimitiveRowConverter(Attribute[]) - Constructor for class org.apache.spark.sql.parquet.CatalystPrimitiveRowConverter
-
- CatalystScan - Class in org.apache.spark.sql.sources
-
::Experimental::
An interface for experimenting with a more direct connection to the query planner.
- CatalystScan() - Constructor for class org.apache.spark.sql.sources.CatalystScan
-
- CatalystStructConverter - Class in org.apache.spark.sql.parquet
-
This converter is for multi-element groups of primitive or complex types
that have repetition level optional or required (so struct fields).
- CatalystStructConverter(StructField[], int, CatalystConverter) - Constructor for class org.apache.spark.sql.parquet.CatalystStructConverter
-
- Categorical() - Static method in class org.apache.spark.mllib.tree.configuration.FeatureType
-
- categoricalFeaturesInfo() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
-
- categories() - Method in class org.apache.spark.mllib.tree.model.Split
-
- category() - Method in class org.apache.spark.mllib.recommendation.ALS.BlockStats
-
- category() - Method in class org.apache.spark.mllib.tree.model.Bin
-
- channelFactory() - Method in class org.apache.spark.streaming.flume.FlumePollingReceiver
-
- channelFactoryExecutor() - Method in class org.apache.spark.streaming.flume.FlumePollingReceiver
-
- checkEquals(ASTNode) - Method in class org.apache.spark.sql.hive.HiveQl.TransformableNode
-
Throws an error if this is not equal to other.
- checkHost(String, String) - Static method in class org.apache.spark.util.Utils
-
- checkHostPort(String, String) - Static method in class org.apache.spark.util.Utils
-
- checkMinimalPollingPeriod(TimeUnit, int) - Static method in class org.apache.spark.metrics.MetricsSystem
-
- checkModifyPermissions(String) - Method in class org.apache.spark.SecurityManager
-
Checks the given user against the modify acl list to see if they have
authorization to modify the application.
- checkOutputSpecs(JobContext) - Method in class org.apache.spark.sql.parquet.AppendingParquetOutputFormat
-
- checkpoint() - Method in interface org.apache.spark.api.java.JavaRDDLike
-
Mark this RDD for checkpointing.
- checkpoint() - Method in class org.apache.spark.graphx.Graph
-
Mark this Graph for checkpointing.
- checkpoint() - Method in class org.apache.spark.graphx.impl.EdgeRDDImpl
-
- checkpoint() - Method in class org.apache.spark.graphx.impl.GraphImpl
-
- checkpoint() - Method in class org.apache.spark.graphx.impl.VertexRDDImpl
-
- checkpoint() - Method in class org.apache.spark.rdd.CheckpointRDD
-
- checkpoint() - Method in class org.apache.spark.rdd.HadoopRDD
-
- checkpoint() - Method in class org.apache.spark.rdd.RDD
-
Mark this RDD for checkpointing.
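A sketch of RDD checkpointing; a checkpoint directory must be set first, and the path below is a placeholder on a fault-tolerant filesystem such as HDFS.

```scala
// Mark an RDD for checkpointing; it is materialized the next time an action runs.
sc.setCheckpointDir("hdfs:///tmp/spark-checkpoints")   // placeholder directory
val data = sc.parallelize(1 to 1000000).map(_ * 2)
data.checkpoint()
data.count()
```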
- checkpoint(Duration) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
-
Enable periodic checkpointing of RDDs of this DStream.
- checkpoint(String) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
-
Sets the context to periodically checkpoint the DStream operations for master
fault-tolerance.
- Checkpoint - Class in org.apache.spark.streaming
-
- Checkpoint(StreamingContext, Time) - Constructor for class org.apache.spark.streaming.Checkpoint
-
- checkpoint(Duration) - Method in class org.apache.spark.streaming.dstream.DStream
-
Enable periodic checkpointing of RDDs of this DStream
- checkpoint(Duration) - Method in class org.apache.spark.streaming.dstream.ReducedWindowedDStream
-
- checkpoint(String) - Method in class org.apache.spark.streaming.StreamingContext
-
Set the context to periodically checkpoint the DStream operations for driver
fault-tolerance.
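The streaming variant points the whole context at a checkpoint directory for driver fault-tolerance; the directory below is a placeholder.

```scala
// Sketch: enable driver fault-tolerance by checkpointing DStream metadata and data.
ssc.checkpoint("hdfs:///tmp/streaming-checkpoints")   // placeholder directory
```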
- checkpointBackupFile(String, Time) - Static method in class org.apache.spark.streaming.Checkpoint
-
Get the checkpoint backup file for the given checkpoint time
- checkpointClock() - Method in class org.apache.spark.streaming.kinesis.KinesisCheckpointState
-
- checkpointData() - Method in class org.apache.spark.rdd.RDD
-
- checkpointData() - Method in class org.apache.spark.streaming.dstream.DStream
-
- checkpointDir() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
-
- checkpointDir() - Method in class org.apache.spark.mllib.tree.impl.NodeIdCache
-
- checkpointDir() - Method in class org.apache.spark.SparkContext
-
- checkpointDir() - Method in class org.apache.spark.streaming.Checkpoint
-
- checkpointDir() - Method in class org.apache.spark.streaming.StreamingContext
-
- checkpointDirToLogDir(String, int) - Static method in class org.apache.spark.streaming.receiver.WriteAheadLogBasedBlockHandler
-
- checkpointDirToLogDir(String) - Static method in class org.apache.spark.streaming.scheduler.ReceivedBlockTracker
-
- checkpointDuration() - Method in class org.apache.spark.streaming.Checkpoint
-
- checkpointDuration() - Method in class org.apache.spark.streaming.dstream.DStream
-
- checkpointDuration() - Method in class org.apache.spark.streaming.StreamingContext
-
- Checkpointed() - Static method in class org.apache.spark.rdd.CheckpointState
-
- checkpointFile(String, Time) - Static method in class org.apache.spark.streaming.Checkpoint
-
Get the checkpoint file for the given checkpoint time
- CheckpointingInProgress() - Static method in class org.apache.spark.rdd.CheckpointState
-
- checkpointInProgress() - Method in class org.apache.spark.streaming.DStreamGraph
-
- checkpointInterval() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
-
- checkpointInterval() - Method in class org.apache.spark.mllib.tree.impl.NodeIdCache
-
- checkpointPath() - Method in class org.apache.spark.rdd.CheckpointRDD
-
- CheckpointRDD<T> - Class in org.apache.spark.rdd
-
This RDD represents an RDD checkpoint file (similar to HadoopRDD).
- CheckpointRDD(SparkContext, String, ClassTag<T>) - Constructor for class org.apache.spark.rdd.CheckpointRDD
-
- checkpointRDD() - Method in class org.apache.spark.rdd.RDDCheckpointData
-
- CheckpointRDDPartition - Class in org.apache.spark.rdd
-
- CheckpointRDDPartition(int) - Constructor for class org.apache.spark.rdd.CheckpointRDDPartition
-
- CheckpointReader - Class in org.apache.spark.streaming
-
- CheckpointReader() - Constructor for class org.apache.spark.streaming.CheckpointReader
-
- CheckpointState - Class in org.apache.spark.rdd
-
Enumeration to manage state transitions of an RDD through checkpointing
[ Initialized --> marked for checkpointing --> checkpointing in progress --> checkpointed ]
- CheckpointState() - Constructor for class org.apache.spark.rdd.CheckpointState
-
- checkpointTime() - Method in class org.apache.spark.streaming.Checkpoint
-
- CheckpointWriter - Class in org.apache.spark.streaming
-
Convenience class to handle writing the graph checkpoint to a file.
- CheckpointWriter(JobGenerator, SparkConf, String, Configuration) - Constructor for class org.apache.spark.streaming.CheckpointWriter
-
- CheckpointWriter.CheckpointWriteHandler - Class in org.apache.spark.streaming
-
- CheckpointWriter.CheckpointWriteHandler(Time, byte[]) - Constructor for class org.apache.spark.streaming.CheckpointWriter.CheckpointWriteHandler
-
- checkSpeculatableTasks() - Method in class org.apache.spark.scheduler.Pool
-
- checkSpeculatableTasks() - Method in interface org.apache.spark.scheduler.Schedulable
-
- checkSpeculatableTasks() - Method in class org.apache.spark.scheduler.TaskSchedulerImpl
-
- checkSpeculatableTasks() - Method in class org.apache.spark.scheduler.TaskSetManager
-
Check for tasks to be speculated and return true if there are any.
- checkState(boolean, Function0<String>) - Static method in class org.apache.spark.streaming.util.HdfsUtils
-
- checkTimeoutInterval() - Method in class org.apache.spark.storage.BlockManagerMasterActor
-
- checkUIViewPermissions(String) - Method in class org.apache.spark.SecurityManager
-
Checks the given user against the view acl list to see if they have
authorization to view the UI.
- child() - Method in class org.apache.spark.sql.columnar.InMemoryRelation
-
- child() - Method in class org.apache.spark.sql.execution.Aggregate
-
- child() - Method in class org.apache.spark.sql.execution.BatchPythonEvaluation
-
- child() - Method in class org.apache.spark.sql.execution.DescribeCommand
-
- child() - Method in class org.apache.spark.sql.execution.Distinct
-
- child() - Method in class org.apache.spark.sql.execution.EvaluatePython
-
- child() - Method in class org.apache.spark.sql.execution.Exchange
-
- child() - Method in class org.apache.spark.sql.execution.ExternalSort
-
- child() - Method in class org.apache.spark.sql.execution.Filter
-
- child() - Method in class org.apache.spark.sql.execution.Generate
-
- child() - Method in class org.apache.spark.sql.execution.GeneratedAggregate
-
- child() - Method in class org.apache.spark.sql.execution.Limit
-
- child() - Method in class org.apache.spark.sql.execution.OutputFaker
-
- child() - Method in class org.apache.spark.sql.execution.Project
-
- child() - Method in class org.apache.spark.sql.execution.Sample
-
- child() - Method in class org.apache.spark.sql.execution.Sort
-
- child() - Method in class org.apache.spark.sql.execution.TakeOrdered
-
- child() - Method in class org.apache.spark.sql.hive.execution.InsertIntoHiveTable
-
- child() - Method in class org.apache.spark.sql.hive.execution.ScriptTransformation
-
- child() - Method in class org.apache.spark.sql.hive.InsertIntoHiveTable
-
- child() - Method in class org.apache.spark.sql.parquet.InsertIntoParquetTable
-
- children() - Method in class org.apache.spark.sql.columnar.InMemoryRelation
-
- children() - Method in class org.apache.spark.sql.execution.BatchPythonEvaluation
-
- children() - Method in class org.apache.spark.sql.execution.ExecutedCommand
-
- children() - Method in class org.apache.spark.sql.execution.LogicalRDD
-
- children() - Method in class org.apache.spark.sql.execution.OutputFaker
-
- children() - Method in class org.apache.spark.sql.execution.PythonUDF
-
- children() - Method in class org.apache.spark.sql.execution.SparkLogicalPlan
-
- children() - Method in class org.apache.spark.sql.execution.Union
-
- children() - Method in class org.apache.spark.sql.hive.HiveGenericUdaf
-
- children() - Method in class org.apache.spark.sql.hive.HiveGenericUdf
-
- children() - Method in class org.apache.spark.sql.hive.HiveGenericUdtf
-
- children() - Method in class org.apache.spark.sql.hive.HiveSimpleUdf
-
- children() - Method in class org.apache.spark.sql.hive.HiveUdaf
-
- children() - Method in class org.apache.spark.sql.hive.InsertIntoHiveTable
-
- chiSqFunc() - Method in class org.apache.spark.mllib.stat.test.ChiSqTest.Method
-
- chiSqTest(Vector, Vector) - Static method in class org.apache.spark.mllib.stat.Statistics
-
:: Experimental ::
Conduct Pearson's chi-squared goodness of fit test of the observed data against the
expected distribution.
- chiSqTest(Vector) - Static method in class org.apache.spark.mllib.stat.Statistics
-
:: Experimental ::
Conduct Pearson's chi-squared goodness of fit test of the observed data against the uniform
distribution, with each category having an expected frequency of `1 / observed.size`.
- chiSqTest(Matrix) - Static method in class org.apache.spark.mllib.stat.Statistics
-
:: Experimental ::
Conduct Pearson's independence test on the input contingency matrix, which cannot contain
negative entries or columns or rows that sum up to 0.
- chiSqTest(RDD<LabeledPoint>) - Static method in class org.apache.spark.mllib.stat.Statistics
-
:: Experimental ::
Conduct Pearson's independence test for every feature against the label across the input RDD.
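A sketch of the goodness-of-fit variant; the observed counts below are made up.

```scala
import org.apache.spark.mllib.linalg.Vectors
import org.apache.spark.mllib.stat.Statistics

// Goodness-of-fit test of made-up observed counts against the uniform expected
// distribution (the default when no expected vector is supplied).
val observed = Vectors.dense(10.0, 20.0, 30.0)
val result = Statistics.chiSqTest(observed)
println(s"statistic=${result.statistic} pValue=${result.pValue}")
```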
- ChiSqTest - Class in org.apache.spark.mllib.stat.test
-
Conduct the chi-squared test for the input RDDs using the specified method.
- ChiSqTest() - Constructor for class org.apache.spark.mllib.stat.test.ChiSqTest
-
- ChiSqTest.Method - Class in org.apache.spark.mllib.stat.test
-
- ChiSqTest.Method(String, Function2<Object, Object, Object>) - Constructor for class org.apache.spark.mllib.stat.test.ChiSqTest.Method
-
- ChiSqTest.Method$ - Class in org.apache.spark.mllib.stat.test
-
- ChiSqTest.Method$() - Constructor for class org.apache.spark.mllib.stat.test.ChiSqTest.Method$
-
- ChiSqTest.NullHypothesis$ - Class in org.apache.spark.mllib.stat.test
-
- ChiSqTest.NullHypothesis$() - Constructor for class org.apache.spark.mllib.stat.test.ChiSqTest.NullHypothesis$
-
- ChiSqTestResult - Class in org.apache.spark.mllib.stat.test
-
:: Experimental ::
Object containing the test results for the chi-squared hypothesis test.
- ChiSqTestResult(double, int, double, String, String) - Constructor for class org.apache.spark.mllib.stat.test.ChiSqTestResult
-
- chiSquared(Vector, Vector, String) - Static method in class org.apache.spark.mllib.stat.test.ChiSqTest
-
- chiSquaredFeatures(RDD<LabeledPoint>, String) - Static method in class org.apache.spark.mllib.stat.test.ChiSqTest
-
Conduct Pearson's independence test for each feature against the label across the input RDD.
- chiSquaredMatrix(Matrix, String) - Static method in class org.apache.spark.mllib.stat.test.ChiSqTest
-
- chmod700(File) - Static method in class org.apache.spark.util.Utils
-
JDK equivalent of `chmod 700 file`.
- classForName(String) - Static method in class org.apache.spark.util.Utils
-
Preferred alternative to Class.forName(className)
- Classification() - Static method in class org.apache.spark.mllib.tree.configuration.Algo
-
- ClassificationModel - Interface in org.apache.spark.mllib.classification
-
:: Experimental ::
Represents a classification model that predicts to which of a set of categories an example
belongs.
- classIsLoadable(String) - Static method in class org.apache.spark.util.Utils
-
Determines whether the provided class is loadable in the current thread.
- classLoader() - Method in class org.apache.spark.scheduler.cluster.mesos.MesosSchedulerBackend
-
- className() - Method in class org.apache.spark.ExceptionFailure
-
- classpathEntries() - Method in class org.apache.spark.ui.env.EnvironmentListener
-
- classTag() - Method in class org.apache.spark.api.java.JavaDoubleRDD
-
- classTag() - Method in class org.apache.spark.api.java.JavaPairRDD
-
- classTag() - Method in class org.apache.spark.api.java.JavaRDD
-
- classTag() - Method in interface org.apache.spark.api.java.JavaRDDLike
-
- classTag() - Method in class org.apache.spark.sql.api.java.JavaSchemaRDD
-
- classTag() - Method in class org.apache.spark.streaming.api.java.JavaDStream
-
- classTag() - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
-
- classTag() - Method in class org.apache.spark.streaming.api.java.JavaInputDStream
-
- classTag() - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
-
- classTag() - Method in class org.apache.spark.streaming.api.java.JavaReceiverInputDStream
-
- clean(F, boolean) - Method in class org.apache.spark.SparkContext
-
Clean a closure to make it ready to be serialized and sent to tasks
(removes unreferenced variables in $outer's, updates REPL variables).
If checkSerializable is set, clean will also proactively check whether f is serializable
and throw a SparkException if it is not.
- clean(Object, boolean) - Static method in class org.apache.spark.util.ClosureCleaner
-
- CleanBroadcast - Class in org.apache.spark
-
- CleanBroadcast(long) - Constructor for class org.apache.spark.CleanBroadcast
-
- cleaner() - Method in class org.apache.spark.SparkContext
-
- CleanerListener - Interface in org.apache.spark
-
Listener class used for testing when any item has been cleaned by the Cleaner class.
- CleanRDD - Class in org.apache.spark
-
- CleanRDD(int) - Constructor for class org.apache.spark.CleanRDD
-
- CleanShuffle - Class in org.apache.spark
-
- CleanShuffle(int) - Constructor for class org.apache.spark.CleanShuffle
-
- cleanup(long) - Method in class org.apache.spark.SparkContext
-
Called by MetadataCleaner to clean up the persistentRdds map periodically
- cleanup(Time) - Method in class org.apache.spark.streaming.dstream.DStreamCheckpointData
-
Cleanup old checkpoint data.
- cleanup(Time) - Method in class org.apache.spark.streaming.dstream.FileInputDStream.FileInputDStreamCheckpointData
-
- cleanUpAfterSchedulerStop() - Method in class org.apache.spark.scheduler.DAGScheduler
-
- cleanupOldBatches(Time, boolean) - Method in class org.apache.spark.streaming.scheduler.ReceivedBlockTracker
-
Clean up block information of old batches.
- cleanupOldBlocks(long) - Method in class org.apache.spark.streaming.receiver.BlockManagerBasedBlockHandler
-
- CleanupOldBlocks - Class in org.apache.spark.streaming.receiver
-
- CleanupOldBlocks(Time) - Constructor for class org.apache.spark.streaming.receiver.CleanupOldBlocks
-
- cleanupOldBlocks(long) - Method in interface org.apache.spark.streaming.receiver.ReceivedBlockHandler
-
Clean up old blocks older than the given threshold time.
- cleanupOldBlocks(long) - Method in class org.apache.spark.streaming.receiver.WriteAheadLogBasedBlockHandler
-
- cleanupOldBlocksAndBatches(Time) - Method in class org.apache.spark.streaming.scheduler.ReceiverTracker
-
Clean up the data and metadata of blocks and batches that are strictly
older than the threshold time.
- cleanupOldLogs(long, boolean) - Method in class org.apache.spark.streaming.util.WriteAheadLogManager
-
Delete the log files that are older than the threshold time.
- CleanupTask - Interface in org.apache.spark
-
Classes that represent cleaning tasks.
- CleanupTaskWeakReference - Class in org.apache.spark
-
A WeakReference associated with a CleanupTask.
- CleanupTaskWeakReference(CleanupTask, Object, ReferenceQueue<Object>) - Constructor for class org.apache.spark.CleanupTaskWeakReference
-
- clear() - Static method in class org.apache.spark.Accumulators
-
- clear() - Method in interface org.apache.spark.sql.SQLConf
-
- clear() - Method in class org.apache.spark.storage.BlockManagerInfo
-
- clear() - Method in class org.apache.spark.storage.BlockStore
-
- clear() - Method in class org.apache.spark.storage.MemoryStore
-
- clear() - Method in class org.apache.spark.util.BoundedPriorityQueue
-
- CLEAR_NULL_VALUES_INTERVAL() - Static method in class org.apache.spark.util.TimeStampedWeakValueHashMap
-
- clearActiveContext() - Static method in class org.apache.spark.SparkContext
-
Clears the active SparkContext metadata.
- clearCache() - Method in interface org.apache.spark.sql.CacheManager
-
- clearCallSite() - Method in class org.apache.spark.api.java.JavaSparkContext
-
Pass-through to SparkContext.setCallSite.
- clearCallSite() - Method in class org.apache.spark.SparkContext
-
Clear the thread-local property for overriding the call sites
of actions and RDDs.
- clearCheckpointData(Time) - Method in class org.apache.spark.streaming.dstream.DStream
-
- clearCheckpointData(Time) - Method in class org.apache.spark.streaming.DStreamGraph
-
- ClearCheckpointData - Class in org.apache.spark.streaming.scheduler
-
- ClearCheckpointData(Time) - Constructor for class org.apache.spark.streaming.scheduler.ClearCheckpointData
-
- clearDependencies() - Method in class org.apache.spark.rdd.CartesianRDD
-
- clearDependencies() - Method in class org.apache.spark.rdd.CoalescedRDD
-
- clearDependencies() - Method in class org.apache.spark.rdd.CoGroupedRDD
-
- clearDependencies() - Method in class org.apache.spark.rdd.PartitionerAwareUnionRDD
-
- clearDependencies() - Method in class org.apache.spark.rdd.ShuffledRDD
-
- clearDependencies() - Method in class org.apache.spark.rdd.SubtractedRDD
-
- clearDependencies() - Method in class org.apache.spark.rdd.UnionRDD
-
- clearDependencies() - Method in class org.apache.spark.rdd.ZippedPartitionsBaseRDD
-
- clearDependencies() - Method in class org.apache.spark.rdd.ZippedPartitionsRDD2
-
- clearDependencies() - Method in class org.apache.spark.rdd.ZippedPartitionsRDD3
-
- clearDependencies() - Method in class org.apache.spark.rdd.ZippedPartitionsRDD4
-
- clearFiles() - Method in class org.apache.spark.api.java.JavaSparkContext
-
Clear the job's list of files added by addFile
so that they do not get downloaded to
any new nodes.
- clearFiles() - Method in class org.apache.spark.SparkContext
-
Clear the job's list of files added by addFile
so that they do not get downloaded to
any new nodes.
- clearJars() - Method in class org.apache.spark.api.java.JavaSparkContext
-
Clear the job's list of JARs added by addJar
so that they do not get downloaded to
any new nodes.
- clearJars() - Method in class org.apache.spark.SparkContext
-
Clear the job's list of JARs added by addJar
so that they do not get downloaded to
any new nodes.
- clearJobGroup() - Method in class org.apache.spark.api.java.JavaSparkContext
-
Clear the current thread's job group ID and its description.
- clearJobGroup() - Method in class org.apache.spark.SparkContext
-
Clear the current thread's job group ID and its description.
- clearMetadata(Time) - Method in class org.apache.spark.streaming.dstream.DStream
-
Clear metadata that are older than rememberDuration
of this DStream.
- clearMetadata(Time) - Method in class org.apache.spark.streaming.DStreamGraph
-
- ClearMetadata - Class in org.apache.spark.streaming.scheduler
-
- ClearMetadata(Time) - Constructor for class org.apache.spark.streaming.scheduler.ClearMetadata
-
- clearNullValues() - Method in class org.apache.spark.util.TimeStampedWeakValueHashMap
-
Remove entries with values that are no longer strongly reachable.
- clearOldValues(long, Function2<A, B, BoxedUnit>) - Method in class org.apache.spark.util.TimeStampedHashMap
-
- clearOldValues(long) - Method in class org.apache.spark.util.TimeStampedHashMap
-
Removes old key-value pairs that have timestamp earlier than `threshTime`.
- clearOldValues(long) - Method in class org.apache.spark.util.TimeStampedHashSet
-
Removes old values that have timestamp earlier than threshTime
- clearOldValues(long) - Method in class org.apache.spark.util.TimeStampedWeakValueHashMap
-
Remove old key-value pairs with timestamps earlier than `threshTime`.
- clearThreshold() - Method in class org.apache.spark.mllib.classification.LogisticRegressionModel
-
:: Experimental ::
Clears the threshold so that predict
will output raw prediction scores.
- clearThreshold() - Method in class org.apache.spark.mllib.classification.SVMModel
-
:: Experimental ::
Clears the threshold so that predict
will output raw prediction scores.
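A hedged sketch of clearing the threshold so `predict` returns raw margins instead of 0/1 labels; `training` and `test` are placeholder RDD[LabeledPoint] datasets.

```scala
import org.apache.spark.mllib.classification.SVMWithSGD

// After training, clear the threshold so predict() returns raw scores,
// e.g. for ranking or custom thresholding.
val model = SVMWithSGD.train(training, 100)
model.clearThreshold()
val scoreAndLabel = test.map(p => (model.predict(p.features), p.label))
```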
- client() - Method in class org.apache.spark.scheduler.cluster.SparkDeploySchedulerBackend
-
- client() - Method in class org.apache.spark.storage.TachyonBlockManager
-
- client() - Method in class org.apache.spark.streaming.flume.FlumeConnection
-
- Clock - Interface in org.apache.spark
-
An abstract clock for measuring elapsed time.
- clock() - Method in class org.apache.spark.streaming.scheduler.JobGenerator
-
- clock() - Method in class org.apache.spark.streaming.scheduler.JobScheduler
-
- Clock - Interface in org.apache.spark.streaming.util
-
- Clock - Interface in org.apache.spark.util
-
An interface to represent clocks, so that they can be mocked out in unit tests.
- clone() - Method in class org.apache.spark.mllib.evaluation.binary.BinaryLabelCounter
-
- clone() - Method in class org.apache.spark.SparkConf
-
Copy this object
- clone(JvmType) - Method in class org.apache.spark.sql.columnar.ColumnType
-
Creates a duplicated copy of the value.
- clone() - Method in class org.apache.spark.storage.StorageLevel
-
- clone() - Method in class org.apache.spark.util.random.BernoulliCellSampler
-
- clone() - Method in class org.apache.spark.util.random.BernoulliSampler
-
- clone() - Method in class org.apache.spark.util.random.PoissonSampler
-
- clone() - Method in interface org.apache.spark.util.random.RandomSampler
-
return a copy of the RandomSampler object
- clone(T, SerializerInstance, ClassTag<T>) - Static method in class org.apache.spark.util.Utils
-
Clone an object using a Spark serializer.
- cloneComplement() - Method in class org.apache.spark.util.random.BernoulliCellSampler
-
Return a sampler that is the complement of the range specified of the current sampler.
- close() - Method in class org.apache.spark.api.java.JavaSparkContext
-
- close() - Method in class org.apache.spark.input.FixedLengthBinaryRecordReader
-
- close() - Method in class org.apache.spark.input.PortableDataStream
-
Close the file (if it is currently open)
- close() - Method in class org.apache.spark.input.StreamBasedRecordReader
-
- close() - Method in class org.apache.spark.input.WholeTextFileRecordReader
-
- close() - Method in class org.apache.spark.serializer.DeserializationStream
-
- close() - Method in class org.apache.spark.serializer.JavaDeserializationStream
-
- close() - Method in class org.apache.spark.serializer.JavaSerializationStream
-
- close() - Method in class org.apache.spark.serializer.KryoDeserializationStream
-
- close() - Method in class org.apache.spark.serializer.KryoSerializationStream
-
- close() - Method in class org.apache.spark.serializer.SerializationStream
-
- close() - Method in class org.apache.spark.SparkHadoopWriter
-
- close() - Method in class org.apache.spark.sql.hive.SparkHiveDynamicPartitionWriterContainer
-
- close() - Method in class org.apache.spark.sql.hive.SparkHiveWriterContainer
-
- close() - Method in class org.apache.spark.storage.BlockObjectWriter
-
- close() - Method in class org.apache.spark.storage.DiskBlockObjectWriter
-
- close() - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
-
- close() - Method in class org.apache.spark.streaming.util.RateLimitedOutputStream
-
- close() - Method in class org.apache.spark.streaming.util.WriteAheadLogRandomReader
-
- close() - Method in class org.apache.spark.streaming.util.WriteAheadLogReader
-
- close() - Method in class org.apache.spark.streaming.util.WriteAheadLogWriter
-
- close() - Method in class org.apache.spark.util.FileLogger
-
Close the writer.
- closeIfNeeded() - Method in class org.apache.spark.util.NextIterator
-
Calls the subclass-defined close method, but only once.
- ClosureCleaner - Class in org.apache.spark.util
-
- ClosureCleaner() - Constructor for class org.apache.spark.util.ClosureCleaner
-
- closureSerializer() - Method in class org.apache.spark.SparkEnv
-
- clusterCenters() - Method in class org.apache.spark.mllib.clustering.KMeansModel
-
- clusterCenters() - Method in class org.apache.spark.mllib.clustering.StreamingKMeansModel
-
- clusterWeights() - Method in class org.apache.spark.mllib.clustering.StreamingKMeansModel
-
- cmd() - Method in class org.apache.spark.sql.execution.ExecutedCommand
-
- cn() - Method in class org.apache.spark.mllib.feature.VocabWord
-
- coalesce(int) - Method in class org.apache.spark.api.java.JavaDoubleRDD
-
Return a new RDD that is reduced into numPartitions
partitions.
- coalesce(int, boolean) - Method in class org.apache.spark.api.java.JavaDoubleRDD
-
Return a new RDD that is reduced into numPartitions
partitions.
- coalesce(int) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Return a new RDD that is reduced into numPartitions
partitions.
- coalesce(int, boolean) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Return a new RDD that is reduced into numPartitions
partitions.
- coalesce(int) - Method in class org.apache.spark.api.java.JavaRDD
-
Return a new RDD that is reduced into numPartitions
partitions.
- coalesce(int, boolean) - Method in class org.apache.spark.api.java.JavaRDD
-
Return a new RDD that is reduced into numPartitions
partitions.
- coalesce(int, boolean, Ordering<T>) - Method in class org.apache.spark.rdd.RDD
-
Return a new RDD that is reduced into numPartitions
partitions.
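A sketch of shrinking the partition count, for example after a heavy filter; the data is made up.

```scala
// Without shuffle this is a narrow dependency; pass shuffle = true to rebalance
// the remaining data evenly across the new partitions.
val filtered = sc.parallelize(1 to 1000000, 100).filter(_ % 1000 == 0)
val compact  = filtered.coalesce(4)
val balanced = filtered.coalesce(4, shuffle = true)
```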
- coalesce(int, boolean) - Method in class org.apache.spark.sql.api.java.JavaSchemaRDD
-
Return a new RDD that is reduced into numPartitions
partitions.
- coalesce(int, boolean, Ordering<Row>) - Method in class org.apache.spark.sql.SchemaRDD
-
- CoalescedRDD<T> - Class in org.apache.spark.rdd
-
Represents a coalesced RDD that has fewer partitions than its parent RDD.
This class uses the PartitionCoalescer class to find a good partitioning of the parent RDD
so that each new partition has roughly the same number of parent partitions and so that
the preferred location of each new partition overlaps with as many preferred locations of its
parent partitions as possible.
- CoalescedRDD(RDD<T>, int, double, ClassTag<T>) - Constructor for class org.apache.spark.rdd.CoalescedRDD
-
- CoalescedRDDPartition - Class in org.apache.spark.rdd
-
Class that captures a coalesced RDD by essentially keeping track of parent partitions
- CoalescedRDDPartition(int, RDD<?>, int[], Option<String>) - Constructor for class org.apache.spark.rdd.CoalescedRDDPartition
-
- CoarseGrainedClusterMessage - Interface in org.apache.spark.scheduler.cluster
-
- CoarseGrainedClusterMessages - Class in org.apache.spark.scheduler.cluster
-
- CoarseGrainedClusterMessages() - Constructor for class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages
-
- CoarseGrainedClusterMessages.AddWebUIFilter - Class in org.apache.spark.scheduler.cluster
-
- CoarseGrainedClusterMessages.AddWebUIFilter(String, Map<String, String>, String) - Constructor for class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.AddWebUIFilter
-
- CoarseGrainedClusterMessages.AddWebUIFilter$ - Class in org.apache.spark.scheduler.cluster
-
- CoarseGrainedClusterMessages.AddWebUIFilter$() - Constructor for class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.AddWebUIFilter$
-
- CoarseGrainedClusterMessages.KillExecutors - Class in org.apache.spark.scheduler.cluster
-
- CoarseGrainedClusterMessages.KillExecutors(Seq<String>) - Constructor for class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.KillExecutors
-
- CoarseGrainedClusterMessages.KillExecutors$ - Class in org.apache.spark.scheduler.cluster
-
- CoarseGrainedClusterMessages.KillExecutors$() - Constructor for class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.KillExecutors$
-
- CoarseGrainedClusterMessages.KillTask - Class in org.apache.spark.scheduler.cluster
-
- CoarseGrainedClusterMessages.KillTask(long, String, boolean) - Constructor for class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.KillTask
-
- CoarseGrainedClusterMessages.KillTask$ - Class in org.apache.spark.scheduler.cluster
-
- CoarseGrainedClusterMessages.KillTask$() - Constructor for class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.KillTask$
-
- CoarseGrainedClusterMessages.LaunchTask - Class in org.apache.spark.scheduler.cluster
-
- CoarseGrainedClusterMessages.LaunchTask(SerializableBuffer) - Constructor for class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.LaunchTask
-
- CoarseGrainedClusterMessages.LaunchTask$ - Class in org.apache.spark.scheduler.cluster
-
- CoarseGrainedClusterMessages.LaunchTask$() - Constructor for class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.LaunchTask$
-
- CoarseGrainedClusterMessages.RegisterClusterManager$ - Class in org.apache.spark.scheduler.cluster
-
- CoarseGrainedClusterMessages.RegisterClusterManager$() - Constructor for class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.RegisterClusterManager$
-
- CoarseGrainedClusterMessages.RegisteredExecutor$ - Class in org.apache.spark.scheduler.cluster
-
- CoarseGrainedClusterMessages.RegisteredExecutor$() - Constructor for class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.RegisteredExecutor$
-
- CoarseGrainedClusterMessages.RegisterExecutor - Class in org.apache.spark.scheduler.cluster
-
- CoarseGrainedClusterMessages.RegisterExecutor(String, String, int) - Constructor for class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.RegisterExecutor
-
- CoarseGrainedClusterMessages.RegisterExecutor$ - Class in org.apache.spark.scheduler.cluster
-
- CoarseGrainedClusterMessages.RegisterExecutor$() - Constructor for class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.RegisterExecutor$
-
- CoarseGrainedClusterMessages.RegisterExecutorFailed - Class in org.apache.spark.scheduler.cluster
-
- CoarseGrainedClusterMessages.RegisterExecutorFailed(String) - Constructor for class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.RegisterExecutorFailed
-
- CoarseGrainedClusterMessages.RegisterExecutorFailed$ - Class in org.apache.spark.scheduler.cluster
-
- CoarseGrainedClusterMessages.RegisterExecutorFailed$() - Constructor for class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.RegisterExecutorFailed$
-
- CoarseGrainedClusterMessages.RemoveExecutor - Class in org.apache.spark.scheduler.cluster
-
- CoarseGrainedClusterMessages.RemoveExecutor(String, String) - Constructor for class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.RemoveExecutor
-
- CoarseGrainedClusterMessages.RemoveExecutor$ - Class in org.apache.spark.scheduler.cluster
-
- CoarseGrainedClusterMessages.RemoveExecutor$() - Constructor for class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.RemoveExecutor$
-
- CoarseGrainedClusterMessages.RequestExecutors - Class in org.apache.spark.scheduler.cluster
-
- CoarseGrainedClusterMessages.RequestExecutors(int) - Constructor for class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.RequestExecutors
-
- CoarseGrainedClusterMessages.RequestExecutors$ - Class in org.apache.spark.scheduler.cluster
-
- CoarseGrainedClusterMessages.RequestExecutors$() - Constructor for class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.RequestExecutors$
-
- CoarseGrainedClusterMessages.RetrieveSparkProps$ - Class in org.apache.spark.scheduler.cluster
-
- CoarseGrainedClusterMessages.RetrieveSparkProps$() - Constructor for class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.RetrieveSparkProps$
-
- CoarseGrainedClusterMessages.ReviveOffers$ - Class in org.apache.spark.scheduler.cluster
-
Alternate factory method that takes a ByteBuffer directly for the data field
- CoarseGrainedClusterMessages.ReviveOffers$() - Constructor for class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.ReviveOffers$
-
- CoarseGrainedClusterMessages.StatusUpdate - Class in org.apache.spark.scheduler.cluster
-
- CoarseGrainedClusterMessages.StatusUpdate(String, long, Enumeration.Value, SerializableBuffer) - Constructor for class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.StatusUpdate
-
- CoarseGrainedClusterMessages.StatusUpdate$ - Class in org.apache.spark.scheduler.cluster
-
- CoarseGrainedClusterMessages.StatusUpdate$() - Constructor for class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.StatusUpdate$
-
- CoarseGrainedClusterMessages.StopDriver$ - Class in org.apache.spark.scheduler.cluster
-
- CoarseGrainedClusterMessages.StopDriver$() - Constructor for class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.StopDriver$
-
- CoarseGrainedClusterMessages.StopExecutor$ - Class in org.apache.spark.scheduler.cluster
-
- CoarseGrainedClusterMessages.StopExecutor$() - Constructor for class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.StopExecutor$
-
- CoarseGrainedClusterMessages.StopExecutors$ - Class in org.apache.spark.scheduler.cluster
-
- CoarseGrainedClusterMessages.StopExecutors$() - Constructor for class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.StopExecutors$
-
- CoarseGrainedSchedulerBackend - Class in org.apache.spark.scheduler.cluster
-
A scheduler backend that waits for coarse grained executors to connect to it through Akka.
- CoarseGrainedSchedulerBackend(TaskSchedulerImpl, ActorSystem) - Constructor for class org.apache.spark.scheduler.cluster.CoarseGrainedSchedulerBackend
-
- CoarseGrainedSchedulerBackend.DriverActor - Class in org.apache.spark.scheduler.cluster
-
- CoarseGrainedSchedulerBackend.DriverActor(Seq<Tuple2<String, String>>) - Constructor for class org.apache.spark.scheduler.cluster.CoarseGrainedSchedulerBackend.DriverActor
-
- CoarseMesosSchedulerBackend - Class in org.apache.spark.scheduler.cluster.mesos
-
A SchedulerBackend that runs tasks on Mesos, but uses "coarse-grained" tasks, where it holds
onto each Mesos node for the duration of the Spark job instead of relinquishing cores whenever
a task is done.
- CoarseMesosSchedulerBackend(TaskSchedulerImpl, SparkContext, String) - Constructor for class org.apache.spark.scheduler.cluster.mesos.CoarseMesosSchedulerBackend
-
- code() - Method in class org.apache.spark.mllib.feature.VocabWord
-
- codegenEnabled() - Method in class org.apache.spark.sql.execution.SparkPlan
-
- codegenEnabled() - Method in interface org.apache.spark.sql.SQLConf
-
When set to true, Spark SQL will use the Scala compiler at runtime to generate custom bytecode
that evaluates expressions found in queries.
- codeLen() - Method in class org.apache.spark.mllib.feature.VocabWord
-
- cogroup(JavaPairRDD<K, W>, Partitioner) - Method in class org.apache.spark.api.java.JavaPairRDD
-
For each key k in `this` or `other`, return a resulting RDD that contains a tuple with the
list of values for that key in `this` as well as `other`.
- cogroup(JavaPairRDD<K, W1>, JavaPairRDD<K, W2>, Partitioner) - Method in class org.apache.spark.api.java.JavaPairRDD
-
For each key k in `this` or `other1` or `other2`, return a resulting RDD that contains a
tuple with the list of values for that key in `this`, `other1` and `other2`.
- cogroup(JavaPairRDD<K, W1>, JavaPairRDD<K, W2>, JavaPairRDD<K, W3>, Partitioner) - Method in class org.apache.spark.api.java.JavaPairRDD
-
For each key k in `this` or `other1` or `other2` or `other3`, return a resulting RDD that
contains a tuple with the list of values for that key in `this`, `other1`, `other2` and `other3`.
- cogroup(JavaPairRDD<K, W>) - Method in class org.apache.spark.api.java.JavaPairRDD
-
For each key k in `this` or `other`, return a resulting RDD that contains a tuple with the
list of values for that key in `this` as well as `other`.
- cogroup(JavaPairRDD<K, W1>, JavaPairRDD<K, W2>) - Method in class org.apache.spark.api.java.JavaPairRDD
-
For each key k in `this` or `other1` or `other2`, return a resulting RDD that contains a
tuple with the list of values for that key in `this`, `other1` and `other2`.
- cogroup(JavaPairRDD<K, W1>, JavaPairRDD<K, W2>, JavaPairRDD<K, W3>) - Method in class org.apache.spark.api.java.JavaPairRDD
-
For each key k in `this` or `other1` or `other2` or `other3`, return a resulting RDD that
contains a tuple with the list of values for that key in `this`, `other1`, `other2` and `other3`.
- cogroup(JavaPairRDD<K, W>, int) - Method in class org.apache.spark.api.java.JavaPairRDD
-
For each key k in `this` or `other`, return a resulting RDD that contains a tuple with the
list of values for that key in `this` as well as `other`.
- cogroup(JavaPairRDD<K, W1>, JavaPairRDD<K, W2>, int) - Method in class org.apache.spark.api.java.JavaPairRDD
-
For each key k in `this` or `other1` or `other2`, return a resulting RDD that contains a
tuple with the list of values for that key in `this`, `other1` and `other2`.
- cogroup(JavaPairRDD<K, W1>, JavaPairRDD<K, W2>, JavaPairRDD<K, W3>, int) - Method in class org.apache.spark.api.java.JavaPairRDD
-
For each key k in `this` or `other1` or `other2` or `other3`, return a resulting RDD that
contains a tuple with the list of values for that key in `this`, `other1`, `other2` and `other3`.
- cogroup(RDD<Tuple2<K, W1>>, RDD<Tuple2<K, W2>>, RDD<Tuple2<K, W3>>, Partitioner) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
For each key k in `this` or `other1` or `other2` or `other3`, return a resulting RDD that
contains a tuple with the list of values for that key in `this`, `other1`, `other2` and `other3`.
- cogroup(RDD<Tuple2<K, W>>, Partitioner) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
- cogroup(RDD<Tuple2<K, W1>>, RDD<Tuple2<K, W2>>, Partitioner) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
- cogroup(RDD<Tuple2<K, W1>>, RDD<Tuple2<K, W2>>, RDD<Tuple2<K, W3>>) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
- cogroup(RDD<Tuple2<K, W>>) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
For each key k in `this` or `other`, return a resulting RDD that contains a tuple with the
list of values for that key in `this` as well as `other`.
- cogroup(RDD<Tuple2<K, W1>>, RDD<Tuple2<K, W2>>) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
For each key k in `this` or `other1` or `other2`, return a resulting RDD that contains a
tuple with the list of values for that key in `this`, `other1` and `other2`.
- cogroup(RDD<Tuple2<K, W>>, int) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
For each key k in this
or other
, return a resulting RDD that contains a tuple with the
list of values for that key in this
as well as other
.
- cogroup(RDD<Tuple2<K, W1>>, RDD<Tuple2<K, W2>>, int) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
For each key k in this
or other1
or other2
, return a resulting RDD that contains a
tuple with the list of values for that key in this
, other1
and other2
.
- cogroup(RDD<Tuple2<K, W1>>, RDD<Tuple2<K, W2>>, RDD<Tuple2<K, W3>>, int) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
For each key k in this
or other1
or other2
or other3
,
return a resulting RDD that contains a tuple with the list of values
for that key in this
, other1
, other2
and other3
.
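As a rough sketch of how the two-RDD cogroup overload behaves in practice (assuming an existing SparkContext named sc, the pair-RDD implicits from org.apache.spark.SparkContext._ on older Spark versions, and purely illustrative data):

    import org.apache.spark.SparkContext._

    val scores = sc.parallelize(Seq(("alice", 1), ("bob", 2), ("alice", 3)))
    val ages   = sc.parallelize(Seq(("alice", 30), ("carol", 25)))
    // cogroup yields RDD[(K, (Iterable[V], Iterable[W]))]: every key present in either input,
    // paired with all of its values from each side (an empty Iterable if the key is absent there)
    val grouped = scores.cogroup(ages)
    grouped.collect().foreach { case (k, (vs, ws)) =>
      println(s"$k -> values=${vs.toList}, other=${ws.toList}")
    }

The multi-RDD and Partitioner/numPartitions overloads work the same way, only with more value iterables per key and explicit control over partitioning.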
- cogroup(JavaPairDStream<K, W>) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
-
Return a new DStream by applying 'cogroup' between RDDs of this DStream and other DStream.
- cogroup(JavaPairDStream<K, W>, int) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
-
Return a new DStream by applying 'cogroup' between RDDs of this DStream and other DStream.
- cogroup(JavaPairDStream<K, W>, Partitioner) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
-
Return a new DStream by applying 'cogroup' between RDDs of this DStream and other DStream.
- cogroup(DStream<Tuple2<K, W>>, ClassTag<W>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
-
Return a new DStream by applying 'cogroup' between RDDs of this DStream and other DStream.
- cogroup(DStream<Tuple2<K, W>>, int, ClassTag<W>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
-
Return a new DStream by applying 'cogroup' between RDDs of this DStream and other DStream.
- cogroup(DStream<Tuple2<K, W>>, Partitioner, ClassTag<W>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
-
Return a new DStream by applying 'cogroup' between RDDs of this DStream and other DStream.
- CoGroupedRDD<K> - Class in org.apache.spark.rdd
-
:: DeveloperApi ::
An RDD that cogroups its parents.
- CoGroupedRDD(Seq<RDD<? extends Product2<K, ?>>>, Partitioner) - Constructor for class org.apache.spark.rdd.CoGroupedRDD
-
- CoGroupPartition - Class in org.apache.spark.rdd
-
- CoGroupPartition(int, CoGroupSplitDep[]) - Constructor for class org.apache.spark.rdd.CoGroupPartition
-
- cogroupResult2ToJava(RDD<Tuple2<K, Tuple3<Iterable<V>, Iterable<W1>, Iterable<W2>>>>, ClassTag<K>) - Static method in class org.apache.spark.api.java.JavaPairRDD
-
- cogroupResult3ToJava(RDD<Tuple2<K, Tuple4<Iterable<V>, Iterable<W1>, Iterable<W2>, Iterable<W3>>>>, ClassTag<K>) - Static method in class org.apache.spark.api.java.JavaPairRDD
-
- cogroupResultToJava(RDD<Tuple2<K, Tuple2<Iterable<V>, Iterable<W>>>>, ClassTag<K>) - Static method in class org.apache.spark.api.java.JavaPairRDD
-
- CoGroupSplitDep - Interface in org.apache.spark.rdd
-
- collect() - Method in interface org.apache.spark.api.java.JavaRDDLike
-
Return an array that contains all of the elements in this RDD.
- collect() - Method in class org.apache.spark.graphx.impl.EdgeRDDImpl
-
- collect() - Method in class org.apache.spark.rdd.RDD
-
Return an array that contains all of the elements in this RDD.
- collect(PartialFunction<T, U>, ClassTag<U>) - Method in class org.apache.spark.rdd.RDD
-
Return an RDD that contains all matching values by applying f.
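A minimal sketch of the partial-function overload of collect (as opposed to the zero-argument action), assuming an existing SparkContext sc; the sample data is illustrative only:

    // keep only the Int elements; values the partial function does not match are dropped
    val mixed = sc.parallelize(Seq[Any](1, "two", 3, "four"))
    val ints  = mixed.collect { case i: Int => i }   // RDD[Int] containing 1 and 3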
- collect() - Method in class org.apache.spark.sql.api.java.JavaSchemaRDD
-
- collect() - Method in class org.apache.spark.sql.SchemaRDD
-
- collectAsMap() - Method in class org.apache.spark.api.java.JavaPairRDD
-
Return the key-value pairs in this RDD to the master as a Map.
- collectAsMap() - Method in class org.apache.spark.rdd.PairRDDFunctions
-
Return the key-value pairs in this RDD to the master as a Map.
- collectAsync() - Method in interface org.apache.spark.api.java.JavaRDDLike
-
The asynchronous version of collect, which returns a future for retrieving an array containing all of the elements in this RDD.
- collectAsync() - Method in class org.apache.spark.rdd.AsyncRDDActions
-
Returns a future for retrieving all elements of this RDD.
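A small sketch of the asynchronous variant, assuming an existing SparkContext sc; on older Spark versions the implicit conversion to AsyncRDDActions comes from org.apache.spark.SparkContext._:

    import org.apache.spark.SparkContext._
    import scala.concurrent.ExecutionContext.Implicits.global

    val rdd = sc.parallelize(1 to 1000)
    val futureResult = rdd.collectAsync()               // FutureAction[Seq[Int]]
    futureResult.onComplete(result => println(result))  // callback runs when the job finishes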
- collectEdges(EdgeDirection) - Method in class org.apache.spark.graphx.GraphOps
-
Returns an RDD that contains for each vertex v its local edges,
i.e., the edges that are incident on v, in the user-specified direction.
- collectedStatistics() - Method in class org.apache.spark.sql.columnar.BinaryColumnStats
-
- collectedStatistics() - Method in class org.apache.spark.sql.columnar.BooleanColumnStats
-
- collectedStatistics() - Method in class org.apache.spark.sql.columnar.ByteColumnStats
-
- collectedStatistics() - Method in interface org.apache.spark.sql.columnar.ColumnStats
-
Column statistics represented as a single row, currently including closed lower bound, closed
upper bound and null count.
- collectedStatistics() - Method in class org.apache.spark.sql.columnar.DateColumnStats
-
- collectedStatistics() - Method in class org.apache.spark.sql.columnar.DoubleColumnStats
-
- collectedStatistics() - Method in class org.apache.spark.sql.columnar.FloatColumnStats
-
- collectedStatistics() - Method in class org.apache.spark.sql.columnar.GenericColumnStats
-
- collectedStatistics() - Method in class org.apache.spark.sql.columnar.IntColumnStats
-
- collectedStatistics() - Method in class org.apache.spark.sql.columnar.LongColumnStats
-
- collectedStatistics() - Method in class org.apache.spark.sql.columnar.NoopColumnStats
-
- collectedStatistics() - Method in class org.apache.spark.sql.columnar.ShortColumnStats
-
- collectedStatistics() - Method in class org.apache.spark.sql.columnar.StringColumnStats
-
- collectedStatistics() - Method in class org.apache.spark.sql.columnar.TimestampColumnStats
-
- CollectionsUtils - Class in org.apache.spark.util
-
- CollectionsUtils() - Constructor for class org.apache.spark.util.CollectionsUtils
-
- collectNeighborIds(EdgeDirection) - Method in class org.apache.spark.graphx.GraphOps
-
Collect the neighbor vertex ids for each vertex.
- collectNeighbors(EdgeDirection) - Method in class org.apache.spark.graphx.GraphOps
-
Collect the neighbor vertex attributes for each vertex.
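For illustration, a tiny GraphX example of collecting neighbor ids (assuming an existing SparkContext sc; the vertices and edges are made up):

    import org.apache.spark.graphx.{Edge, EdgeDirection, Graph}

    val vertices = sc.parallelize(Seq((1L, "a"), (2L, "b"), (3L, "c")))
    val edges    = sc.parallelize(Seq(Edge(1L, 2L, 0), Edge(2L, 3L, 0)))
    val graph    = Graph(vertices, edges)
    // neighbor ids reachable over edges in either direction
    val neighborIds = graph.collectNeighborIds(EdgeDirection.Either)
    neighborIds.collect().foreach { case (id, ids) => println(s"$id -> ${ids.mkString(",")}") }

collectNeighbors works the same way but returns the neighbor attributes alongside their ids.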
- collectPartitions(int[]) - Method in interface org.apache.spark.api.java.JavaRDDLike
-
Return an array that contains all of the elements in a specific partition of this RDD.
- collectPartitions() - Method in class org.apache.spark.rdd.RDD
-
A private method for tests, to look at the contents of each partition
- collectToPython() - Method in class org.apache.spark.sql.SchemaRDD
-
Serializes the Array[Row] returned by SchemaRDD's optimized collect(), using the same
format as javaToPython.
- colPtrs() - Method in class org.apache.spark.mllib.linalg.SparseMatrix
-
- colStats(RDD<Vector>) - Static method in class org.apache.spark.mllib.stat.Statistics
-
:: Experimental ::
Computes column-wise summary statistics for the input RDD[Vector].
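A short sketch of computing column summaries with Statistics.colStats, assuming an existing SparkContext sc and illustrative data:

    import org.apache.spark.mllib.linalg.Vectors
    import org.apache.spark.mllib.stat.Statistics

    val observations = sc.parallelize(Seq(
      Vectors.dense(1.0, 10.0),
      Vectors.dense(2.0, 20.0),
      Vectors.dense(3.0, 30.0)))
    val summary = Statistics.colStats(observations)   // MultivariateStatisticalSummary
    println(summary.mean)        // per-column means
    println(summary.variance)    // per-column variances
    println(summary.numNonzeros) // per-column non-zero counts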
- ColumnAccessor - Interface in org.apache.spark.sql.columnar
-
An Iterator-like trait used to extract values from a columnar byte buffer.
- columnBatchSize() - Method in interface org.apache.spark.sql.SQLConf
-
The number of rows that will be
- ColumnBuilder - Interface in org.apache.spark.sql.columnar
-
- columnNameOfCorruptRecord() - Method in interface org.apache.spark.sql.SQLConf
-
- columnOrdinals() - Method in class org.apache.spark.sql.hive.MetastoreRelation
-
An attribute map for determining the ordinal for non-partition columns.
- columnPruningPred() - Method in class org.apache.spark.sql.parquet.ParquetTableScan
-
- columnSimilarities() - Method in class org.apache.spark.mllib.linalg.distributed.RowMatrix
-
Compute all cosine similarities between columns of this matrix using the brute-force
approach of computing normalized dot products.
- columnSimilarities(double) - Method in class org.apache.spark.mllib.linalg.distributed.RowMatrix
-
Compute similarities between columns of this matrix using a sampling approach.
- columnSimilaritiesDIMSUM(double[], double) - Method in class org.apache.spark.mllib.linalg.distributed.RowMatrix
-
Find all similar columns using the DIMSUM sampling algorithm, described in two papers
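As an illustrative sketch of the exact and sampled column-similarity methods on RowMatrix (assuming an existing SparkContext sc; the matrix values are made up):

    import org.apache.spark.mllib.linalg.Vectors
    import org.apache.spark.mllib.linalg.distributed.RowMatrix

    val rows = sc.parallelize(Seq(
      Vectors.dense(1.0, 2.0, 1.0),
      Vectors.dense(2.0, 4.0, 0.0),
      Vectors.dense(3.0, 6.0, 1.0)))
    val mat    = new RowMatrix(rows)
    val exact  = mat.columnSimilarities()      // brute-force cosine similarities, as a CoordinateMatrix
    val approx = mat.columnSimilarities(0.1)   // DIMSUM sampling; similarities below the threshold may be missed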
- ColumnStatisticsSchema - Class in org.apache.spark.sql.columnar
-
- ColumnStatisticsSchema(Attribute) - Constructor for class org.apache.spark.sql.columnar.ColumnStatisticsSchema
-
- columnStats() - Method in class org.apache.spark.sql.columnar.BasicColumnBuilder
-
- columnStats() - Method in interface org.apache.spark.sql.columnar.ColumnBuilder
-
Column statistics information
- ColumnStats - Interface in org.apache.spark.sql.columnar
-
Used to collect statistical information when building in-memory columns.
- columnStats() - Method in class org.apache.spark.sql.columnar.NativeColumnBuilder
-
- columnType() - Method in class org.apache.spark.sql.columnar.BasicColumnBuilder
-
- ColumnType<T extends org.apache.spark.sql.catalyst.types.DataType,JvmType> - Class in org.apache.spark.sql.columnar
-
An abstract class that represents type of a column.
- ColumnType(int, int) - Constructor for class org.apache.spark.sql.columnar.ColumnType
-
- columnType() - Method in class org.apache.spark.sql.columnar.NativeColumnBuilder
-
- combineByKey(Function<V, C>, Function2<C, V, C>, Function2<C, C, C>, Partitioner) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Generic function to combine the elements for each key using a custom set of aggregation
functions.
- combineByKey(Function<V, C>, Function2<C, V, C>, Function2<C, C, C>, int) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Simplified version of combineByKey that hash-partitions the output RDD.
- combineByKey(Function<V, C>, Function2<C, V, C>, Function2<C, C, C>) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Simplified version of combineByKey that hash-partitions the resulting RDD using the existing
partitioner/parallelism level.
- combineByKey(Function1<V, C>, Function2<C, V, C>, Function2<C, C, C>, Partitioner, boolean, Serializer) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
Generic function to combine the elements for each key using a custom set of aggregation
functions.
- combineByKey(Function1<V, C>, Function2<C, V, C>, Function2<C, C, C>, int) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
Simplified version of combineByKey that hash-partitions the output RDD.
- combineByKey(Function1<V, C>, Function2<C, V, C>, Function2<C, C, C>) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
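The classic use of combineByKey is a per-key aggregate such as a mean; a minimal sketch, assuming an existing SparkContext sc and, on older Spark versions, the pair-RDD implicits from org.apache.spark.SparkContext._:

    import org.apache.spark.SparkContext._

    val pairs = sc.parallelize(Seq(("a", 1.0), ("a", 3.0), ("b", 4.0)))
    val sumAndCount = pairs.combineByKey(
      (v: Double) => (v, 1),                                               // createCombiner
      (acc: (Double, Int), v: Double) => (acc._1 + v, acc._2 + 1),         // mergeValue
      (a: (Double, Int), b: (Double, Int)) => (a._1 + b._1, a._2 + b._2))  // mergeCombiners
    val meanByKey = sumAndCount.mapValues { case (sum, n) => sum / n }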
- combineByKey(Function<V, C>, Function2<C, V, C>, Function2<C, C, C>, Partitioner) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
-
Combine elements of each key in DStream's RDDs using custom function.
- combineByKey(Function<V, C>, Function2<C, V, C>, Function2<C, C, C>, Partitioner, boolean) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
-
Combine elements of each key in DStream's RDDs using custom function.
- combineByKey(Function1<V, C>, Function2<C, V, C>, Function2<C, C, C>, Partitioner, boolean, ClassTag<C>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
-
Combine elements of each key in DStream's RDDs using custom functions.
- combineCombinersByKey(Iterator<Product2<K, C>>) - Method in class org.apache.spark.Aggregator
-
- combineCombinersByKey(Iterator<Product2<K, C>>, TaskContext) - Method in class org.apache.spark.Aggregator
-
- combineValuesByKey(Iterator<Product2<K, V>>) - Method in class org.apache.spark.Aggregator
-
- combineValuesByKey(Iterator<Product2<K, V>>, TaskContext) - Method in class org.apache.spark.Aggregator
-
- Command - Interface in org.apache.spark.sql.execution
-
- command() - Method in class org.apache.spark.sql.execution.PythonUDF
-
- commands() - Method in class org.apache.spark.sql.hive.test.TestHiveContext.TestTable
-
- commit() - Method in class org.apache.spark.SparkHadoopWriter
-
- commitAndClose() - Method in class org.apache.spark.storage.BlockObjectWriter
-
Flush the partial writes and commit them as a single atomic block.
- commitAndClose() - Method in class org.apache.spark.storage.DiskBlockObjectWriter
-
- commitJob() - Method in class org.apache.spark.SparkHadoopWriter
-
- commitJob() - Method in class org.apache.spark.sql.hive.SparkHiveDynamicPartitionWriterContainer
-
- commitJob() - Method in class org.apache.spark.sql.hive.SparkHiveWriterContainer
-
- commonHeaderNodes() - Static method in class org.apache.spark.ui.UIUtils
-
- comparator(Schedulable, Schedulable) - Method in class org.apache.spark.scheduler.FairSchedulingAlgorithm
-
- comparator(Schedulable, Schedulable) - Method in class org.apache.spark.scheduler.FIFOSchedulingAlgorithm
-
- comparator(Schedulable, Schedulable) - Method in interface org.apache.spark.scheduler.SchedulingAlgorithm
-
- compare(PartitionGroup, PartitionGroup) - Method in class org.apache.spark.rdd.PartitionCoalescer
-
- compare(Option<PartitionGroup>, Option<PartitionGroup>) - Method in class org.apache.spark.rdd.PartitionCoalescer
-
- compare(RDDInfo) - Method in class org.apache.spark.storage.RDDInfo
-
- compatibilityBlackList() - Static method in class org.apache.spark.sql.hive.HiveShim
-
- compatibleType(DataType, DataType) - Static method in class org.apache.spark.sql.json.JsonRDD
-
Returns the most general data type for two given data types.
- completedIndices() - Method in class org.apache.spark.ui.jobs.UIData.StageUIData
-
- completedJobs() - Method in class org.apache.spark.ui.jobs.JobProgressListener
-
- completedStageIndices() - Method in class org.apache.spark.ui.jobs.UIData.JobUIData
-
- completedStages() - Method in class org.apache.spark.ui.jobs.JobProgressListener
-
- completedTasks() - Method in class org.apache.spark.ui.exec.ExecutorSummaryInfo
-
- completion() - Method in class org.apache.spark.util.CompletionIterator
-
- CompletionEvent - Class in org.apache.spark.scheduler
-
- CompletionEvent(Task<?>, TaskEndReason, Object, Map<Object, Object>, TaskInfo, TaskMetrics) - Constructor for class org.apache.spark.scheduler.CompletionEvent
-
- CompletionIterator<A,I extends scala.collection.Iterator<A>> - Class in org.apache.spark.util
-
Wrapper around an iterator which calls a completion method after it successfully iterates
through all the elements.
- CompletionIterator(I) - Constructor for class org.apache.spark.util.CompletionIterator
-
- completionTime() - Method in class org.apache.spark.scheduler.StageInfo
-
Time when all tasks in the stage completed or when the stage was cancelled.
- ComplexColumnBuilder<T extends org.apache.spark.sql.catalyst.types.DataType,JvmType> - Class in org.apache.spark.sql.columnar
-
- ComplexColumnBuilder(ColumnStats, ColumnType<T, JvmType>) - Constructor for class org.apache.spark.sql.columnar.ComplexColumnBuilder
-
- ComplexFutureAction<T> - Class in org.apache.spark
-
A FutureAction for actions that could trigger multiple Spark jobs.
- ComplexFutureAction() - Constructor for class org.apache.spark.ComplexFutureAction
-
- compress(ByteBuffer, ByteBuffer) - Method in class org.apache.spark.sql.columnar.compression.BooleanBitSet.Encoder
-
- compress(ByteBuffer, ByteBuffer) - Method in class org.apache.spark.sql.columnar.compression.DictionaryEncoding.Encoder
-
- compress(ByteBuffer, ByteBuffer) - Method in interface org.apache.spark.sql.columnar.compression.Encoder
-
- compress(ByteBuffer, ByteBuffer) - Method in class org.apache.spark.sql.columnar.compression.IntDelta.Encoder
-
- compress(ByteBuffer, ByteBuffer) - Method in class org.apache.spark.sql.columnar.compression.LongDelta.Encoder
-
- compress(ByteBuffer, ByteBuffer) - Method in class org.apache.spark.sql.columnar.compression.PassThrough.Encoder
-
- compress(ByteBuffer, ByteBuffer) - Method in class org.apache.spark.sql.columnar.compression.RunLengthEncoding.Encoder
-
- compressCodec() - Method in class org.apache.spark.sql.hive.ShimFileSinkDesc
-
- compressed() - Method in class org.apache.spark.sql.hive.ShimFileSinkDesc
-
- compressedInputStream(InputStream) - Method in interface org.apache.spark.io.CompressionCodec
-
- compressedInputStream(InputStream) - Method in class org.apache.spark.io.LZ4CompressionCodec
-
- compressedInputStream(InputStream) - Method in class org.apache.spark.io.LZFCompressionCodec
-
- compressedInputStream(InputStream) - Method in class org.apache.spark.io.SnappyCompressionCodec
-
- CompressedMapStatus - Class in org.apache.spark.scheduler
-
A MapStatus implementation that tracks the size of each block.
- CompressedMapStatus(BlockManagerId, byte[]) - Constructor for class org.apache.spark.scheduler.CompressedMapStatus
-
- CompressedMapStatus(BlockManagerId, long[]) - Constructor for class org.apache.spark.scheduler.CompressedMapStatus
-
- compressedOutputStream(OutputStream) - Method in interface org.apache.spark.io.CompressionCodec
-
- compressedOutputStream(OutputStream) - Method in class org.apache.spark.io.LZ4CompressionCodec
-
- compressedOutputStream(OutputStream) - Method in class org.apache.spark.io.LZFCompressionCodec
-
- compressedOutputStream(OutputStream) - Method in class org.apache.spark.io.SnappyCompressionCodec
-
- compressedSize() - Method in class org.apache.spark.sql.columnar.compression.BooleanBitSet.Encoder
-
- compressedSize() - Method in class org.apache.spark.sql.columnar.compression.DictionaryEncoding.Encoder
-
- compressedSize() - Method in interface org.apache.spark.sql.columnar.compression.Encoder
-
- compressedSize() - Method in class org.apache.spark.sql.columnar.compression.IntDelta.Encoder
-
- compressedSize() - Method in class org.apache.spark.sql.columnar.compression.LongDelta.Encoder
-
- compressedSize() - Method in class org.apache.spark.sql.columnar.compression.PassThrough.Encoder
-
- compressedSize() - Method in class org.apache.spark.sql.columnar.compression.RunLengthEncoding.Encoder
-
- CompressibleColumnAccessor<T extends org.apache.spark.sql.catalyst.types.NativeType> - Interface in org.apache.spark.sql.columnar.compression
-
- CompressibleColumnBuilder<T extends org.apache.spark.sql.catalyst.types.NativeType> - Interface in org.apache.spark.sql.columnar.compression
-
A stackable trait that builds an optionally compressed byte buffer for a column.
- COMPRESSION_CODEC_PREFIX() - Static method in class org.apache.spark.scheduler.EventLoggingListener
-
- CompressionCodec - Interface in org.apache.spark.io
-
:: DeveloperApi ::
CompressionCodec allows the customization of choosing different compression implementations
to be used in block storage.
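The codec is normally selected through configuration rather than instantiated directly; a hedged sketch (the application name is arbitrary, and the set of codec names may vary by Spark version):

    import org.apache.spark.{SparkConf, SparkContext}

    // choose the codec used for compressing blocks, shuffle outputs and event logs
    val conf = new SparkConf()
      .setAppName("codec-example")
      .set("spark.io.compression.codec", "lz4")   // other commonly available values: "lzf", "snappy"
    val sc = new SparkContext(conf)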
- compressionCodec() - Method in class org.apache.spark.scheduler.EventLoggingInfo
-
- compressionCodec() - Method in class org.apache.spark.streaming.CheckpointWriter
-
- compressionEncoders() - Method in interface org.apache.spark.sql.columnar.compression.CompressibleColumnBuilder
-
- compressionRatio() - Method in interface org.apache.spark.sql.columnar.compression.Encoder
-
- CompressionScheme - Interface in org.apache.spark.sql.columnar.compression
-
- compressType() - Method in class org.apache.spark.sql.hive.ShimFileSinkDesc
-
- compute(Partition, TaskContext) - Method in class org.apache.spark.graphx.EdgeRDD
-
- compute(Partition, TaskContext) - Method in class org.apache.spark.graphx.VertexRDD
-
Provides the RDD[(VertexId, VD)] equivalent output.
- compute(Vector, double, Vector) - Method in class org.apache.spark.mllib.optimization.Gradient
-
Compute the gradient and loss given the features of a single data point.
- compute(Vector, double, Vector, Vector) - Method in class org.apache.spark.mllib.optimization.Gradient
-
Compute the gradient and loss given the features of a single data point,
add the gradient to a provided vector to avoid creating new objects, and return loss.
- compute(Vector, double, Vector) - Method in class org.apache.spark.mllib.optimization.HingeGradient
-
- compute(Vector, double, Vector, Vector) - Method in class org.apache.spark.mllib.optimization.HingeGradient
-
- compute(Vector, Vector, double, int, double) - Method in class org.apache.spark.mllib.optimization.L1Updater
-
- compute(Vector, double, Vector) - Method in class org.apache.spark.mllib.optimization.LeastSquaresGradient
-
- compute(Vector, double, Vector, Vector) - Method in class org.apache.spark.mllib.optimization.LeastSquaresGradient
-
- compute(Vector, double, Vector) - Method in class org.apache.spark.mllib.optimization.LogisticGradient
-
- compute(Vector, double, Vector, Vector) - Method in class org.apache.spark.mllib.optimization.LogisticGradient
-
- compute(Vector, Vector, double, int, double) - Method in class org.apache.spark.mllib.optimization.SimpleUpdater
-
- compute(Vector, Vector, double, int, double) - Method in class org.apache.spark.mllib.optimization.SquaredL2Updater
-
- compute(Vector, Vector, double, int, double) - Method in class org.apache.spark.mllib.optimization.Updater
-
Compute an updated value for weights given the gradient, stepSize, iteration number and
regularization parameter.
- compute(Partition, TaskContext) - Method in class org.apache.spark.mllib.rdd.RandomRDD
-
- compute(Partition, TaskContext) - Method in class org.apache.spark.mllib.rdd.RandomVectorRDD
-
- compute(Partition, TaskContext) - Method in class org.apache.spark.mllib.rdd.SlidingRDD
-
- compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.BlockRDD
-
- compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.CartesianRDD
-
- compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.CheckpointRDD
-
- compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.CoalescedRDD
-
- compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.CoGroupedRDD
-
- compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.EmptyRDD
-
- compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.FilteredRDD
-
- compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.FlatMappedRDD
-
- compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.FlatMappedValuesRDD
-
- compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.GlommedRDD
-
- compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.HadoopRDD
-
- compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.HadoopRDD.HadoopMapPartitionsWithSplitRDD
-
- compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.JdbcRDD
-
- compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.MapPartitionsRDD
-
- compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.MappedRDD
-
- compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.MappedValuesRDD
-
- compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.NewHadoopRDD
-
- compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.NewHadoopRDD.NewHadoopMapPartitionsWithSplitRDD
-
- compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.ParallelCollectionRDD
-
- compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.PartitionerAwareUnionRDD
-
- compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.PartitionPruningRDD
-
- compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.PartitionwiseSampledRDD
-
- compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.PipedRDD
-
- compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.RDD
-
:: DeveloperApi ::
Implemented by subclasses to compute a given partition.
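For a sense of what subclasses provide, here is a toy custom RDD; it is only a sketch of the DeveloperApi contract (single partition, fixed data), not a pattern taken from the Spark sources:

    import org.apache.spark.{Partition, SparkContext, TaskContext}
    import org.apache.spark.rdd.RDD

    class SinglePartition(override val index: Int) extends Partition

    // an RDD with one partition that yields the numbers 1..n
    class TinyRangeRDD(sc: SparkContext, n: Int) extends RDD[Int](sc, Nil) {
      override protected def getPartitions: Array[Partition] = Array(new SinglePartition(0))
      override def compute(split: Partition, context: TaskContext): Iterator[Int] =
        (1 to n).iterator
    }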
- compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.SampledRDD
-
- compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.ShuffledRDD
-
- compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.SubtractedRDD
-
- compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.UnionRDD
-
- compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.ZippedPartitionsRDD2
-
- compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.ZippedPartitionsRDD3
-
- compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.ZippedPartitionsRDD4
-
- compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.ZippedWithIndexRDD
-
- compute(Partition, TaskContext) - Method in class org.apache.spark.sql.SchemaRDD
-
- compute(Time) - Method in class org.apache.spark.streaming.api.java.JavaDStream
-
Generate an RDD for the given duration
- compute(Time) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
-
Method that generates an RDD for the given Duration.
- compute(Time) - Method in class org.apache.spark.streaming.dstream.ConstantInputDStream
-
- compute(Time) - Method in class org.apache.spark.streaming.dstream.DStream
-
Method that generates an RDD for the given time.
- compute(Time) - Method in class org.apache.spark.streaming.dstream.FileInputDStream
-
Finds the files that were modified since the last time this method was called and makes
a union RDD out of them.
- compute(Time) - Method in class org.apache.spark.streaming.dstream.FilteredDStream
-
- compute(Time) - Method in class org.apache.spark.streaming.dstream.FlatMappedDStream
-
- compute(Time) - Method in class org.apache.spark.streaming.dstream.FlatMapValuedDStream
-
- compute(Time) - Method in class org.apache.spark.streaming.dstream.ForEachDStream
-
- compute(Time) - Method in class org.apache.spark.streaming.dstream.GlommedDStream
-
- compute(Time) - Method in class org.apache.spark.streaming.dstream.MapPartitionedDStream
-
- compute(Time) - Method in class org.apache.spark.streaming.dstream.MappedDStream
-
- compute(Time) - Method in class org.apache.spark.streaming.dstream.MapValuedDStream
-
- compute(Time) - Method in class org.apache.spark.streaming.dstream.QueueInputDStream
-
- compute(Time) - Method in class org.apache.spark.streaming.dstream.ReceiverInputDStream
-
Generates RDDs with blocks received by the receiver of this stream.
- compute(Time) - Method in class org.apache.spark.streaming.dstream.ReducedWindowedDStream
-
- compute(Time) - Method in class org.apache.spark.streaming.dstream.ShuffledDStream
-
- compute(Time) - Method in class org.apache.spark.streaming.dstream.StateDStream
-
- compute(Time) - Method in class org.apache.spark.streaming.dstream.TransformedDStream
-
- compute(Time) - Method in class org.apache.spark.streaming.dstream.UnionDStream
-
- compute(Time) - Method in class org.apache.spark.streaming.dstream.WindowedDStream
-
- compute(Partition, TaskContext) - Method in class org.apache.spark.streaming.rdd.WriteAheadLogBackedBlockRDD
-
Gets the partition data by getting the corresponding block from the block manager.
- computeColumnSummaryStatistics() - Method in class org.apache.spark.mllib.linalg.distributed.RowMatrix
-
Computes column-wise summary statistics.
- computeCorrelation(RDD<Object>, RDD<Object>) - Method in interface org.apache.spark.mllib.stat.correlation.Correlation
-
Compute correlation for two datasets.
- computeCorrelation(RDD<Object>, RDD<Object>) - Static method in class org.apache.spark.mllib.stat.correlation.PearsonCorrelation
-
Compute the Pearson correlation for two datasets.
- computeCorrelation(RDD<Object>, RDD<Object>) - Static method in class org.apache.spark.mllib.stat.correlation.SpearmanCorrelation
-
Compute Spearman's correlation for two datasets.
- computeCorrelationMatrix(RDD<Vector>) - Method in interface org.apache.spark.mllib.stat.correlation.Correlation
-
Compute the correlation matrix S, for the input matrix, where S(i, j) is the correlation
between column i and j.
- computeCorrelationMatrix(RDD<Vector>) - Static method in class org.apache.spark.mllib.stat.correlation.PearsonCorrelation
-
Compute the Pearson correlation matrix S, for the input matrix, where S(i, j) is the
correlation between column i and j.
- computeCorrelationMatrix(RDD<Vector>) - Static method in class org.apache.spark.mllib.stat.correlation.SpearmanCorrelation
-
Compute Spearman's correlation matrix S, for the input matrix, where S(i, j) is the
correlation between column i and j.
- computeCorrelationMatrixFromCovariance(Matrix) - Static method in class org.apache.spark.mllib.stat.correlation.PearsonCorrelation
-
Compute the Pearson correlation matrix from the covariance matrix.
- computeCorrelationWithMatrixImpl(RDD<Object>, RDD<Object>) - Method in interface org.apache.spark.mllib.stat.correlation.Correlation
-
Combine the two input RDD[Double]s into an RDD[Vector] and compute the correlation using the
correlation implementation for RDD[Vector].
- computeCost(RDD<Vector>) - Method in class org.apache.spark.mllib.clustering.KMeansModel
-
Return the K-means cost (sum of squared distances of points to their nearest center) for this
model on the given data.
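A brief sketch of evaluating a clustering with computeCost, assuming an existing SparkContext sc and toy data:

    import org.apache.spark.mllib.clustering.KMeans
    import org.apache.spark.mllib.linalg.Vectors

    val points = sc.parallelize(Seq(
      Vectors.dense(0.0, 0.0), Vectors.dense(0.1, 0.1),
      Vectors.dense(9.0, 9.0), Vectors.dense(9.1, 9.1)))
    val model = KMeans.train(points, 2, 20)   // k = 2, maxIterations = 20
    val wssse = model.computeCost(points)     // within-set sum of squared errors
    println(s"WSSSE = $wssse")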
- computeCovariance() - Method in class org.apache.spark.mllib.linalg.distributed.RowMatrix
-
Computes the covariance matrix, treating each row as an observation.
- computeError(TreeEnsembleModel, RDD<LabeledPoint>) - Static method in class org.apache.spark.mllib.tree.loss.AbsoluteError
-
Method to calculate loss of the base learner for the gradient boosting calculation.
- computeError(TreeEnsembleModel, RDD<LabeledPoint>) - Static method in class org.apache.spark.mllib.tree.loss.LogLoss
-
Method to calculate loss of the base learner for the gradient boosting calculation.
- computeError(TreeEnsembleModel, RDD<LabeledPoint>) - Method in interface org.apache.spark.mllib.tree.loss.Loss
-
Method to calculate error of the base learner for the gradient boosting calculation.
- computeError(TreeEnsembleModel, RDD<LabeledPoint>) - Static method in class org.apache.spark.mllib.tree.loss.SquaredError
-
Method to calculate loss of the base learner for the gradient boosting calculation.
- computeFractionForSampleSize(int, long, boolean) - Static method in class org.apache.spark.util.random.SamplingUtils
-
Returns a sampling rate that guarantees a sample of size >= sampleSizeLowerBound 99.99% of
the time.
- computeGramianMatrix() - Method in class org.apache.spark.mllib.linalg.distributed.IndexedRowMatrix
-
Computes the Gramian matrix A^T A.
- computeGramianMatrix() - Method in class org.apache.spark.mllib.linalg.distributed.RowMatrix
-
Computes the Gramian matrix A^T A.
- computeOrReadCheckpoint(Partition, TaskContext) - Method in class org.apache.spark.rdd.RDD
-
Compute an RDD partition or read it from a checkpoint if the RDD is checkpointing.
- computePreferredLocations(Seq<InputFormatInfo>) - Static method in class org.apache.spark.scheduler.InputFormatInfo
-
Computes the preferred locations based on input(s) and returns a location-to-block map.
- computePrincipalComponents(int) - Method in class org.apache.spark.mllib.linalg.distributed.RowMatrix
-
Computes the top k principal components.
- computeSplitSize(long, long, long) - Method in class org.apache.spark.input.FixedLengthBinaryInputFormat
-
This input format overrides computeSplitSize() to make sure that each split
only contains full records.
- computeSVD(int, boolean, double) - Method in class org.apache.spark.mllib.linalg.distributed.IndexedRowMatrix
-
Computes the singular value decomposition of this IndexedRowMatrix.
- computeSVD(int, boolean, double) - Method in class org.apache.spark.mllib.linalg.distributed.RowMatrix
-
Computes singular value decomposition of this matrix.
- computeSVD(int, boolean, double, int, double, String) - Method in class org.apache.spark.mllib.linalg.distributed.RowMatrix
-
The actual SVD implementation, visible for testing.
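An illustrative sketch of the SVD and PCA entry points on RowMatrix (assuming an existing SparkContext sc; the matrix is made up):

    import org.apache.spark.mllib.linalg.Vectors
    import org.apache.spark.mllib.linalg.distributed.RowMatrix

    val rows = sc.parallelize(Seq(
      Vectors.dense(1.0, 0.0, 0.0),
      Vectors.dense(0.0, 2.0, 0.0),
      Vectors.dense(0.0, 0.0, 3.0)))
    val mat = new RowMatrix(rows)
    val svd = mat.computeSVD(2, computeU = true)   // keep the top 2 singular values
    println(svd.s)                                 // singular values, as a local Vector
    val pcs = mat.computePrincipalComponents(2)    // top 2 principal components, as a local Matrix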
- computeThresholdByKey(Map<K, AcceptanceResult>, Map<K, Object>) - Static method in class org.apache.spark.util.random.StratifiedSamplingUtils
-
Given the result returned by getCounts, determine the threshold for accepting items to
generate exact sample size.
- condition() - Method in class org.apache.spark.sql.execution.Filter
-
- condition() - Method in class org.apache.spark.sql.execution.joins.BroadcastNestedLoopJoin
-
- condition() - Method in class org.apache.spark.sql.execution.joins.HashOuterJoin
-
- condition() - Method in class org.apache.spark.sql.execution.joins.LeftSemiJoinBNL
-
- conditionEvaluator() - Method in class org.apache.spark.sql.execution.Filter
-
- conf() - Method in class org.apache.spark.rdd.RDD
-
- conf() - Method in class org.apache.spark.scheduler.cluster.CoarseGrainedSchedulerBackend
-
- conf() - Method in class org.apache.spark.scheduler.TaskSchedulerImpl
-
- conf() - Method in class org.apache.spark.scheduler.TaskSetManager
-
- conf() - Method in class org.apache.spark.SparkContext
-
- conf() - Method in class org.apache.spark.SparkEnv
-
- conf() - Method in class org.apache.spark.sql.parquet.ParquetRelation
-
- conf() - Method in class org.apache.spark.storage.BlockManager
-
- conf() - Method in class org.apache.spark.streaming.StreamingContext
-
- conf() - Method in class org.apache.spark.ui.SparkUI
-
- confidence() - Method in class org.apache.spark.partial.BoundedDouble
-
- configFile() - Method in class org.apache.spark.metrics.MetricsConfig
-
- configTestLog4j(String) - Static method in class org.apache.spark.util.Utils
-
Configure log4j properties used for the test suite.
- configuration() - Method in class org.apache.spark.scheduler.InputFormatInfo
-
- CONFIGURATION_INSTANTIATION_LOCK() - Static method in class org.apache.spark.rdd.HadoopRDD
-
Configuration's constructor is not threadsafe (see SPARK-1097 and HADOOP-10456).
- confusionMatrix() - Method in class org.apache.spark.mllib.evaluation.MulticlassMetrics
-
Returns the confusion matrix: predicted classes are in columns, ordered by ascending class label, as in "labels".
- connected(String) - Method in class org.apache.spark.scheduler.cluster.SparkDeploySchedulerBackend
-
- connectedComponents() - Method in class org.apache.spark.graphx.GraphOps
-
Compute the connected component membership of each vertex and return a graph with the vertex
value containing the lowest vertex id in the connected component containing that vertex.
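A small sketch of connected components on a toy graph (assuming an existing SparkContext sc; the vertices and edges are illustrative):

    import org.apache.spark.graphx.{Edge, Graph}

    val vertices = sc.parallelize(Seq((1L, "a"), (2L, "b"), (3L, "c"), (4L, "d")))
    val edges    = sc.parallelize(Seq(Edge(1L, 2L, 1), Edge(3L, 4L, 1)))
    val graph    = Graph(vertices, edges)
    // each vertex is labelled with the smallest vertex id in its component
    val components = graph.connectedComponents().vertices
    components.collect().foreach(println)   // expected labels: (1,1), (2,1), (3,3), (4,3)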
- ConnectedComponents - Class in org.apache.spark.graphx.lib
-
Connected components algorithm.
- ConnectedComponents() - Constructor for class org.apache.spark.graphx.lib.ConnectedComponents
-
- CONSOLE_DEFAULT_PERIOD() - Method in class org.apache.spark.metrics.sink.ConsoleSink
-
- CONSOLE_DEFAULT_UNIT() - Method in class org.apache.spark.metrics.sink.ConsoleSink
-
- CONSOLE_KEY_PERIOD() - Method in class org.apache.spark.metrics.sink.ConsoleSink
-
- CONSOLE_KEY_UNIT() - Method in class org.apache.spark.metrics.sink.ConsoleSink
-
- ConsoleProgressBar - Class in org.apache.spark.ui
-
ConsoleProgressBar shows the progress of stages in the next line of the console.
- ConsoleProgressBar(SparkContext) - Constructor for class org.apache.spark.ui.ConsoleProgressBar
-
- ConsoleSink - Class in org.apache.spark.metrics.sink
-
- ConsoleSink(Properties, MetricRegistry, SecurityManager) - Constructor for class org.apache.spark.metrics.sink.ConsoleSink
-
- ConstantInputDStream<T> - Class in org.apache.spark.streaming.dstream
-
An input stream that always returns the same RDD on each timestep.
- ConstantInputDStream(StreamingContext, RDD<T>, ClassTag<T>) - Constructor for class org.apache.spark.streaming.dstream.ConstantInputDStream
-
- constructURIForAuthentication(URI, SecurityManager) - Static method in class org.apache.spark.util.Utils
-
Construct a URI containing information used for authentication.
- consumerConnector() - Method in class org.apache.spark.streaming.kafka.KafkaReceiver
-
- contains(Param<?>) - Method in class org.apache.spark.ml.param.ParamMap
-
Checks whether a parameter is explicitly specified.
- contains(String) - Method in class org.apache.spark.SparkConf
-
Does the configuration contain a given parameter?
- contains(BlockId) - Method in class org.apache.spark.storage.BlockManagerMaster
-
Check if block manager master has a block.
- contains(BlockId) - Method in class org.apache.spark.storage.BlockStore
-
- contains(BlockId) - Method in class org.apache.spark.storage.DiskStore
-
- contains(BlockId) - Method in class org.apache.spark.storage.MemoryStore
-
- contains(BlockId) - Method in class org.apache.spark.storage.TachyonStore
-
- contains(A) - Method in class org.apache.spark.util.TimeStampedHashSet
-
- containsBlock(BlockId) - Method in class org.apache.spark.storage.DiskBlockManager
-
Check if disk block manager has a block.
- containsBlock(BlockId) - Method in class org.apache.spark.storage.StorageStatus
-
Return whether the given block is stored in this block manager in O(1) time.
- containsCachedMetadata(String) - Static method in class org.apache.spark.rdd.HadoopRDD
-
- containsShuffle(int) - Method in class org.apache.spark.MapOutputTrackerMaster
-
Check if the given shuffle is being tracked
- contentType() - Method in class org.apache.spark.ui.JettyUtils.ServletParams
-
- context() - Method in interface org.apache.spark.api.java.JavaRDDLike
-
- context() - Method in class org.apache.spark.InterruptibleIterator
-
- context() - Method in class org.apache.spark.rdd.RDD
-
- context() - Method in class org.apache.spark.sql.execution.SparkStrategies.CommandStrategy
-
- context() - Method in class org.apache.spark.sql.hive.execution.HiveTableScan
-
- context() - Method in class org.apache.spark.sql.hive.HiveStrategies.HiveCommandStrategy
-
- context() - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
-
- context() - Method in class org.apache.spark.streaming.dstream.DStream
-
Return the StreamingContext associated with this DStream
- ContextCleaner - Class in org.apache.spark
-
An asynchronous cleaner for RDD, shuffle, and broadcast state.
- ContextCleaner(SparkContext) - Constructor for class org.apache.spark.ContextCleaner
-
- ContextWaiter - Class in org.apache.spark.streaming
-
- ContextWaiter() - Constructor for class org.apache.spark.streaming.ContextWaiter
-
- Continuous() - Static method in class org.apache.spark.mllib.tree.configuration.FeatureType
-
- convert() - Method in class org.apache.spark.WritableConverter
-
- convertCatalystToJava(Object) - Static method in class org.apache.spark.sql.types.util.DataTypeConversions
-
Converts Java objects to catalyst rows / types
- convertFromAttributes(Seq<Attribute>) - Static method in class org.apache.spark.sql.parquet.ParquetTypesConverter
-
- convertFromString(String) - Static method in class org.apache.spark.sql.parquet.ParquetTypesConverter
-
- convertJavaToCatalyst(Object, DataType) - Static method in class org.apache.spark.sql.types.util.DataTypeConversions
-
Converts Java objects to catalyst rows / types
- convertMetastoreParquet() - Method in class org.apache.spark.sql.hive.HiveContext
-
When true, enables an experimental feature where metastore tables that use the parquet SerDe
are automatically converted to use the Spark SQL parquet table scan, instead of the Hive
SerDe.
- convertSplitLocationInfo(Object[]) - Static method in class org.apache.spark.rdd.HadoopRDD
-
- convertToAttributes(Type, boolean) - Static method in class org.apache.spark.sql.parquet.ParquetTypesConverter
-
- convertToBaggedRDD(RDD<Datum>, double, int, boolean, int) - Static method in class org.apache.spark.mllib.tree.impl.BaggedPoint
-
Convert an input dataset into its BaggedPoint representation,
choosing subsamplingRate counts for each instance.
- convertToString(Seq<Attribute>) - Static method in class org.apache.spark.sql.parquet.ParquetTypesConverter
-
- convertToTreeRDD(RDD<LabeledPoint>, Bin[][], DecisionTreeMetadata) - Static method in class org.apache.spark.mllib.tree.impl.TreePoint
-
Convert an input dataset into its TreePoint representation,
binning feature values in preparation for DecisionTree training.
- CoordinateMatrix - Class in org.apache.spark.mllib.linalg.distributed
-
:: Experimental ::
Represents a matrix in coordinate format.
- CoordinateMatrix(RDD<MatrixEntry>, long, long) - Constructor for class org.apache.spark.mllib.linalg.distributed.CoordinateMatrix
-
- CoordinateMatrix(RDD<MatrixEntry>) - Constructor for class org.apache.spark.mllib.linalg.distributed.CoordinateMatrix
-
Alternative constructor leaving matrix dimensions to be determined automatically.
- copiesRunning() - Method in class org.apache.spark.scheduler.TaskSetManager
-
- copy() - Method in class org.apache.spark.ml.param.ParamMap
-
Make a copy of this param map.
- copy(Vector, Vector) - Static method in class org.apache.spark.mllib.linalg.BLAS
-
y = x
- copy() - Method in class org.apache.spark.mllib.linalg.DenseMatrix
-
- copy() - Method in class org.apache.spark.mllib.linalg.DenseVector
-
- copy() - Method in interface org.apache.spark.mllib.linalg.Matrix
-
Get a deep copy of the matrix.
- copy() - Method in class org.apache.spark.mllib.linalg.SparseMatrix
-
- copy() - Method in class org.apache.spark.mllib.linalg.SparseVector
-
- copy() - Method in interface org.apache.spark.mllib.linalg.Vector
-
Makes a deep copy of this vector.
- copy() - Method in class org.apache.spark.mllib.random.PoissonGenerator
-
- copy() - Method in interface org.apache.spark.mllib.random.RandomDataGenerator
-
Returns a copy of the RandomDataGenerator with a new instance of the rng object used in the
class when applicable for non-locking concurrent usage.
- copy() - Method in class org.apache.spark.mllib.random.StandardNormalGenerator
-
- copy() - Method in class org.apache.spark.mllib.random.UniformGenerator
-
- copy() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
-
Returns a shallow copy of this instance.
- copy() - Method in class org.apache.spark.mllib.tree.impurity.EntropyCalculator
-
- copy() - Method in class org.apache.spark.mllib.tree.impurity.GiniCalculator
-
- copy() - Method in class org.apache.spark.mllib.tree.impurity.ImpurityCalculator
-
- copy() - Method in class org.apache.spark.mllib.tree.impurity.VarianceCalculator
-
- copy() - Method in class org.apache.spark.util.StatCounter
-
Clone this StatCounter
- copyField(Row, int, MutableRow, int) - Static method in class org.apache.spark.sql.columnar.BOOLEAN
-
- copyField(Row, int, MutableRow, int) - Static method in class org.apache.spark.sql.columnar.BYTE
-
- copyField(Row, int, MutableRow, int) - Method in class org.apache.spark.sql.columnar.ColumnType
-
Copies from(fromOrdinal) to to(toOrdinal).
- copyField(Row, int, MutableRow, int) - Static method in class org.apache.spark.sql.columnar.DOUBLE
-
- copyField(Row, int, MutableRow, int) - Static method in class org.apache.spark.sql.columnar.FLOAT
-
- copyField(Row, int, MutableRow, int) - Static method in class org.apache.spark.sql.columnar.INT
-
- copyField(Row, int, MutableRow, int) - Static method in class org.apache.spark.sql.columnar.LONG
-
- copyField(Row, int, MutableRow, int) - Static method in class org.apache.spark.sql.columnar.SHORT
-
- copyField(Row, int, MutableRow, int) - Static method in class org.apache.spark.sql.columnar.STRING
-
- copyStream(InputStream, OutputStream, boolean, boolean) - Static method in class org.apache.spark.util.Utils
-
Copy all data from an InputStream to an OutputStream.
- cores() - Method in class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.RegisterExecutor
-
- cores() - Method in class org.apache.spark.scheduler.WorkerOffer
-
- coresByTaskId() - Method in class org.apache.spark.scheduler.cluster.mesos.CoarseMesosSchedulerBackend
-
- corr(RDD<Object>, RDD<Object>, String) - Static method in class org.apache.spark.mllib.stat.correlation.Correlations
-
- corr(RDD<Vector>) - Static method in class org.apache.spark.mllib.stat.Statistics
-
:: Experimental ::
Compute the Pearson correlation matrix for the input RDD of Vectors.
- corr(RDD<Vector>, String) - Static method in class org.apache.spark.mllib.stat.Statistics
-
:: Experimental ::
Compute the correlation matrix for the input RDD of Vectors using the specified method.
- corr(RDD<Object>, RDD<Object>) - Static method in class org.apache.spark.mllib.stat.Statistics
-
:: Experimental ::
Compute the Pearson correlation for the input RDDs.
- corr(RDD<Object>, RDD<Object>, String) - Static method in class org.apache.spark.mllib.stat.Statistics
-
:: Experimental ::
Compute the correlation for the input RDDs using the specified method.
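A compact sketch of the correlation entry points, assuming an existing SparkContext sc and illustrative series:

    import org.apache.spark.mllib.linalg.Vectors
    import org.apache.spark.mllib.stat.Statistics

    val seriesX = sc.parallelize(Seq(1.0, 2.0, 3.0, 4.0))
    val seriesY = sc.parallelize(Seq(2.0, 4.1, 6.2, 8.3))
    val pearson  = Statistics.corr(seriesX, seriesY)              // Pearson by default
    val spearman = Statistics.corr(seriesX, seriesY, "spearman")

    val vectors = sc.parallelize(Seq(
      Vectors.dense(1.0, 2.0), Vectors.dense(2.0, 4.1), Vectors.dense(3.0, 6.2)))
    val corrMatrix = Statistics.corr(vectors)                     // Pearson correlation matrix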
- Correlation - Interface in org.apache.spark.mllib.stat.correlation
-
Trait for correlation algorithms.
- CorrelationNames - Class in org.apache.spark.mllib.stat.correlation
-
Maintains supported and default correlation names.
- CorrelationNames() - Constructor for class org.apache.spark.mllib.stat.correlation.CorrelationNames
-
- Correlations - Class in org.apache.spark.mllib.stat.correlation
-
Delegates computation to the specific correlation object based on the input method name.
- Correlations() - Constructor for class org.apache.spark.mllib.stat.correlation.Correlations
-
- corrMatrix(RDD<Vector>, String) - Static method in class org.apache.spark.mllib.stat.correlation.Correlations
-
- count() - Method in interface org.apache.spark.api.java.JavaRDDLike
-
Return the number of elements in the RDD.
- count() - Method in class org.apache.spark.graphx.impl.EdgeRDDImpl
-
The number of edges in the RDD.
- count() - Method in class org.apache.spark.graphx.impl.VertexRDDImpl
-
The number of vertices in the RDD.
- count() - Method in class org.apache.spark.mllib.evaluation.binary.BinaryConfusionMatrixImpl
-
- count() - Method in class org.apache.spark.mllib.recommendation.ALS.BlockStats
-
- count() - Method in class org.apache.spark.mllib.stat.MultivariateOnlineSummarizer
-
- count() - Method in interface org.apache.spark.mllib.stat.MultivariateStatisticalSummary
-
Sample size.
- count() - Method in class org.apache.spark.mllib.tree.impurity.EntropyCalculator
-
Number of data points accounted for in the sufficient statistics.
- count() - Method in class org.apache.spark.mllib.tree.impurity.GiniCalculator
-
Number of data points accounted for in the sufficient statistics.
- count() - Method in class org.apache.spark.mllib.tree.impurity.ImpurityCalculator
-
Number of data points accounted for in the sufficient statistics.
- count() - Method in class org.apache.spark.mllib.tree.impurity.VarianceCalculator
-
Number of data points accounted for in the sufficient statistics.
- count() - Method in class org.apache.spark.rdd.RDD
-
Return the number of elements in the RDD.
- count() - Method in class org.apache.spark.sql.api.java.JavaSchemaRDD
-
- count() - Method in class org.apache.spark.sql.columnar.ColumnStatisticsSchema
-
- count() - Method in interface org.apache.spark.sql.columnar.ColumnStats
-
- COUNT() - Static method in class org.apache.spark.sql.hive.HiveQl
-
- count() - Method in class org.apache.spark.sql.SchemaRDD
-
:: Experimental ::
Return the number of elements in the RDD.
- count() - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
-
Return a new DStream in which each RDD has a single element generated by counting each RDD
of this DStream.
- count() - Method in class org.apache.spark.streaming.dstream.DStream
-
Return a new DStream in which each RDD has a single element generated by counting each RDD
of this DStream.
- count() - Method in class org.apache.spark.util.StatCounter
-
- countApprox(long, double) - Method in interface org.apache.spark.api.java.JavaRDDLike
-
:: Experimental ::
Approximate version of count() that returns a potentially incomplete result
within a timeout, even if not all tasks have finished.
- countApprox(long) - Method in interface org.apache.spark.api.java.JavaRDDLike
-
:: Experimental ::
Approximate version of count() that returns a potentially incomplete result
within a timeout, even if not all tasks have finished.
- countApprox(long, double) - Method in class org.apache.spark.rdd.RDD
-
:: Experimental ::
Approximate version of count() that returns a potentially incomplete result
within a timeout, even if not all tasks have finished.
- countApproxDistinct(double) - Method in interface org.apache.spark.api.java.JavaRDDLike
-
Return approximate number of distinct elements in the RDD.
- countApproxDistinct(int, int) - Method in class org.apache.spark.rdd.RDD
-
:: Experimental ::
Return approximate number of distinct elements in the RDD.
- countApproxDistinct(double) - Method in class org.apache.spark.rdd.RDD
-
Return approximate number of distinct elements in the RDD.
- countApproxDistinctByKey(double, Partitioner) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Return approximate number of distinct values for each key in this RDD.
- countApproxDistinctByKey(double, int) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Return approximate number of distinct values for each key in this RDD.
- countApproxDistinctByKey(double) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Return approximate number of distinct values for each key in this RDD.
- countApproxDistinctByKey(int, int, Partitioner) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
:: Experimental ::
- countApproxDistinctByKey(double, Partitioner) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
Return approximate number of distinct values for each key in this RDD.
- countApproxDistinctByKey(double, int) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
Return approximate number of distinct values for each key in this RDD.
- countApproxDistinctByKey(double) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
Return approximate number of distinct values for each key in this RDD.
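A hedged sketch of the approximate counting family on a large pair RDD (assuming an existing SparkContext sc and, on older Spark versions, the implicits from org.apache.spark.SparkContext._; the data is synthetic):

    import org.apache.spark.SparkContext._

    val events = sc.parallelize(1 to 1000000, 100).map(i => (i % 10, i % 1000))

    // best-effort total count available within 500 ms, at 95% confidence
    val partial = events.countApprox(timeout = 500, confidence = 0.95)
    println(partial.initialValue)                               // a BoundedDouble with mean and bounds

    // approximate distinct counts, with roughly 5% relative standard deviation
    val distinctValues = events.values.countApproxDistinct(0.05)
    val distinctPerKey = events.countApproxDistinctByKey(0.05)   // RDD[(K, Long)]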
- countAsync() - Method in interface org.apache.spark.api.java.JavaRDDLike
-
The asynchronous version of count, which returns a future for counting the number of elements in this RDD.
- countAsync() - Method in class org.apache.spark.rdd.AsyncRDDActions
-
Returns a future for counting the number of elements in the RDD.
- countByKey() - Method in class org.apache.spark.api.java.JavaPairRDD
-
Count the number of elements for each key, and return the result to the master as a Map.
- countByKey() - Method in class org.apache.spark.rdd.PairRDDFunctions
-
Count the number of elements for each key, collecting the results to a local Map.
- countByKeyApprox(long) - Method in class org.apache.spark.api.java.JavaPairRDD
-
:: Experimental ::
Approximate version of countByKey that can return a partial result if it does
not finish within a timeout.
- countByKeyApprox(long, double) - Method in class org.apache.spark.api.java.JavaPairRDD
-
:: Experimental ::
Approximate version of countByKey that can return a partial result if it does
not finish within a timeout.
- countByKeyApprox(long, double) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
:: Experimental ::
Approximate version of countByKey that can return a partial result if it does
not finish within a timeout.
- countByValue() - Method in interface org.apache.spark.api.java.JavaRDDLike
-
Return the count of each unique value in this RDD as a map of (value, count) pairs.
- countByValue(Ordering<T>) - Method in class org.apache.spark.rdd.RDD
-
Return the count of each unique value in this RDD as a local map of (value, count) pairs.
- countByValue() - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
-
Return a new DStream in which each RDD contains the counts of each distinct value in
each RDD of this DStream.
- countByValue(int) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
-
Return a new DStream in which each RDD contains the counts of each distinct value in
each RDD of this DStream.
- countByValue(int, Ordering<T>) - Method in class org.apache.spark.streaming.dstream.DStream
-
Return a new DStream in which each RDD contains the counts of each distinct value in
each RDD of this DStream.
- countByValueAndWindow(Duration, Duration) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
-
Return a new DStream in which each RDD contains the count of distinct elements in
RDDs in a sliding window over this DStream.
- countByValueAndWindow(Duration, Duration, int) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
-
Return a new DStream in which each RDD contains the count of distinct elements in
RDDs in a sliding window over this DStream.
- countByValueAndWindow(Duration, Duration, int, Ordering<T>) - Method in class org.apache.spark.streaming.dstream.DStream
-
Return a new DStream in which each RDD contains the count of distinct elements in
RDDs in a sliding window over this DStream.
- countByValueApprox(long, double) - Method in interface org.apache.spark.api.java.JavaRDDLike
-
(Experimental) Approximate version of countByValue().
- countByValueApprox(long) - Method in interface org.apache.spark.api.java.JavaRDDLike
-
(Experimental) Approximate version of countByValue().
- countByValueApprox(long, double, Ordering<T>) - Method in class org.apache.spark.rdd.RDD
-
:: Experimental ::
Approximate version of countByValue().
- countByWindow(Duration, Duration) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
-
Return a new DStream in which each RDD has a single element generated by counting the number
of elements in a window over this DStream.
- countByWindow(Duration, Duration) - Method in class org.apache.spark.streaming.dstream.DStream
-
Return a new DStream in which each RDD has a single element generated by counting the number
of elements in a sliding window over this DStream.
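A sketch of the windowed counting operations on a DStream; the socket source, port and checkpoint directory are placeholders, and an existing SparkContext sc is assumed:

    import org.apache.spark.streaming.{Seconds, StreamingContext}

    val ssc = new StreamingContext(sc, Seconds(1))
    ssc.checkpoint("/tmp/streaming-checkpoint")   // windowed counting keeps state, so checkpointing is required

    val words = ssc.socketTextStream("localhost", 9999).flatMap(_.split(" "))
    // counts of each distinct word over the last 30 seconds, recomputed every 10 seconds
    val windowedCounts = words.countByValueAndWindow(Seconds(30), Seconds(10))
    // a single element per RDD: the total number of words in the window
    val windowedTotal  = words.countByWindow(Seconds(30), Seconds(10))

    windowedCounts.print()
    windowedTotal.print()
    ssc.start()
    ssc.awaitTermination()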
- counter() - Method in class org.apache.spark.partial.MeanEvaluator
-
- counter() - Method in class org.apache.spark.partial.SumEvaluator
-
- CountEvaluator - Class in org.apache.spark.partial
-
An ApproximateEvaluator for counts.
- CountEvaluator(int, double) - Constructor for class org.apache.spark.partial.CountEvaluator
-
- cpFile() - Method in class org.apache.spark.rdd.RDDCheckpointData
-
- cpRDD() - Method in class org.apache.spark.rdd.RDDCheckpointData
-
- cpState() - Method in class org.apache.spark.rdd.RDDCheckpointData
-
- CPUS_PER_TASK() - Method in class org.apache.spark.scheduler.TaskSchedulerImpl
-
- CR() - Method in class org.apache.spark.ui.ConsoleProgressBar
-
- create(boolean, boolean, boolean, int) - Static method in class org.apache.spark.api.java.StorageLevels
-
Deprecated.
- create(boolean, boolean, boolean, boolean, int) - Static method in class org.apache.spark.api.java.StorageLevels
-
Create a new StorageLevel object.
- create(JavaSparkContext, JdbcRDD.ConnectionFactory, String, long, long, int, Function<ResultSet, T>) - Static method in class org.apache.spark.rdd.JdbcRDD
-
Create an RDD that executes an SQL query on a JDBC connection and reads results.
- create(JavaSparkContext, JdbcRDD.ConnectionFactory, String, long, long, int) - Static method in class org.apache.spark.rdd.JdbcRDD
-
Create an RDD that executes an SQL query on a JDBC connection and reads results.
- create(RDD<T>, Function1<Object, Object>) - Static method in class org.apache.spark.rdd.PartitionPruningRDD
-
Create a PartitionPruningRDD.
- create(Object...) - Static method in class org.apache.spark.sql.api.java.Row
-
Creates a Row with the given values.
- create(Seq<Object>) - Static method in class org.apache.spark.sql.api.java.Row
-
Creates a Row with the given values.
- create(String, LogicalPlan, Configuration, SQLContext) - Static method in class org.apache.spark.sql.parquet.ParquetRelation
-
Creates a new ParquetRelation and underlying Parquetfile for the given LogicalPlan.
- create() - Method in interface org.apache.spark.streaming.api.java.JavaStreamingContextFactory
-
- createActorSystem(String, String, int, SparkConf, SecurityManager) - Static method in class org.apache.spark.util.AkkaUtils
-
Creates an ActorSystem ready for remoting, with various Spark features.
- createArrayType(DataType) - Static method in class org.apache.spark.sql.api.java.DataType
-
Creates an ArrayType by specifying the data type of elements (elementType).
- createArrayType(DataType, boolean) - Static method in class org.apache.spark.sql.api.java.DataType
-
Creates an ArrayType by specifying the data type of elements (elementType) and whether the array contains null values (containsNull).
- createCombiner() - Method in class org.apache.spark.Aggregator
-
- createCommand(Protos.Offer, int) - Method in class org.apache.spark.scheduler.cluster.mesos.CoarseMesosSchedulerBackend
-
- createCompiledClass(String, File, String) - Static method in class org.apache.spark.TestUtils
-
Creates a compiled class with the given name.
- createDecimal(BigDecimal) - Static method in class org.apache.spark.sql.hive.HiveShim
-
- createDefaultDBIfNeeded(HiveContext) - Static method in class org.apache.spark.sql.hive.HiveShim
-
- createDirectory(String, String) - Static method in class org.apache.spark.util.Utils
-
Create a directory inside the given parent directory.
- createDriverEnv(SparkConf, boolean, LiveListenerBus) - Static method in class org.apache.spark.SparkEnv
-
Create a SparkEnv for the driver.
- createDriverResultsArray() - Static method in class org.apache.spark.sql.hive.HiveShim
-
- createEmpty(String, Seq<Attribute>, boolean, Configuration, SQLContext) - Static method in class org.apache.spark.sql.parquet.ParquetRelation
-
Creates an empty ParquetRelation and underlying Parquet file that only
consists of the Metadata for the given schema.
- createExecutorEnv(SparkConf, String, String, int, int, boolean, ActorSystem) - Static method in class org.apache.spark.SparkEnv
-
Create a SparkEnv for an executor.
- createExecutorInfo(String) - Method in class org.apache.spark.scheduler.cluster.mesos.MesosSchedulerBackend
-
- createFilter(Expression) - Static method in class org.apache.spark.sql.parquet.ParquetFilters
-
- createFunction() - Method in class org.apache.spark.sql.hive.HiveFunctionWrapper
-
- createHistoryUI(SparkConf, SparkListenerBus, SecurityManager, String, String) - Static method in class org.apache.spark.ui.SparkUI
-
- createJar(Seq<File>, File) - Static method in class org.apache.spark.TestUtils
-
Create a jar file that contains this set of files.
- createJarWithClasses(Seq<String>, String) - Static method in class org.apache.spark.TestUtils
-
Create a jar that defines classes with the given names.
- createJobID(Date, int) - Static method in class org.apache.spark.SparkHadoopWriter
-
- createLiveUI(SparkContext, SparkConf, SparkListenerBus, JobProgressListener, SecurityManager, String) - Static method in class org.apache.spark.ui.SparkUI
-
- createMapType(DataType, DataType) - Static method in class org.apache.spark.sql.api.java.DataType
-
Creates a MapType by specifying the data type of keys (keyType) and values (valueType).
- createMapType(DataType, DataType, boolean) - Static method in class org.apache.spark.sql.api.java.DataType
-
Creates a MapType by specifying the data type of keys (keyType), the data type of values (valueType), and whether values contain any null value (valueContainsNull).
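A minimal Scala sketch of both overloads; the key and value types are arbitrary choices:

    import org.apache.spark.sql.api.java.DataType

    // Map from string keys to double values
    val scores = DataType.createMapType(DataType.StringType, DataType.DoubleType)
    // Same map type, explicitly allowing null values
    val nullableScores = DataType.createMapType(DataType.StringType, DataType.DoubleType, true)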
- createMesosTask(TaskDescription, String) - Method in class org.apache.spark.scheduler.cluster.mesos.MesosSchedulerBackend
-
Turn a Spark TaskDescription into a Mesos task
- createMetricsSystem(String, SparkConf, SecurityManager) - Static method in class org.apache.spark.metrics.MetricsSystem
-
- createNewSparkContext(SparkConf) - Static method in class org.apache.spark.streaming.StreamingContext
-
- createNewSparkContext(String, String, String, Seq<String>, Map<String, String>) - Static method in class org.apache.spark.streaming.StreamingContext
-
- createParquetFile(Class<?>, String, boolean, Configuration) - Method in class org.apache.spark.sql.api.java.JavaSQLContext
-
:: Experimental ::
Creates an empty parquet file with the schema of class beanClass, which can be registered as a table.
- createParquetFile(String, boolean, Configuration, TypeTags.TypeTag<A>) - Method in class org.apache.spark.sql.SQLContext
-
:: Experimental ::
Creates an empty parquet file with the schema of class A, which can be registered as a table.
- createPathFromString(String, JobConf) - Static method in class org.apache.spark.SparkHadoopWriter
-
- createPathFromString(String, JobConf) - Static method in class org.apache.spark.sql.hive.SparkHiveWriterContainer
-
- createPlan(String) - Static method in class org.apache.spark.sql.hive.HiveQl
-
Creates LogicalPlan for a given HiveQL string.
- createPlanForView(Table, Option<String>) - Static method in class org.apache.spark.sql.hive.HiveQl
-
Creates LogicalPlan for a given VIEW
- createPollingStream(StreamingContext, String, int, StorageLevel) - Static method in class org.apache.spark.streaming.flume.FlumeUtils
-
Creates an input stream that is to be used with the Spark Sink deployed on a Flume agent.
- createPollingStream(StreamingContext, Seq<InetSocketAddress>, StorageLevel) - Static method in class org.apache.spark.streaming.flume.FlumeUtils
-
Creates an input stream that is to be used with the Spark Sink deployed on a Flume agent.
- createPollingStream(StreamingContext, Seq<InetSocketAddress>, StorageLevel, int, int) - Static method in class org.apache.spark.streaming.flume.FlumeUtils
-
Creates an input stream that is to be used with the Spark Sink deployed on a Flume agent.
- createPollingStream(JavaStreamingContext, String, int) - Static method in class org.apache.spark.streaming.flume.FlumeUtils
-
Creates an input stream that is to be used with the Spark Sink deployed on a Flume agent.
- createPollingStream(JavaStreamingContext, String, int, StorageLevel) - Static method in class org.apache.spark.streaming.flume.FlumeUtils
-
Creates an input stream that is to be used with the Spark Sink deployed on a Flume agent.
- createPollingStream(JavaStreamingContext, InetSocketAddress[], StorageLevel) - Static method in class org.apache.spark.streaming.flume.FlumeUtils
-
Creates an input stream that is to be used with the Spark Sink deployed on a Flume agent.
- createPollingStream(JavaStreamingContext, InetSocketAddress[], StorageLevel, int, int) - Static method in class org.apache.spark.streaming.flume.FlumeUtils
-
Creates an input stream that is to be used with the Spark Sink deployed on a Flume agent.
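A minimal Scala sketch of the pull-based (Spark Sink) variant; the host, port and batch interval are placeholders:

    import org.apache.spark.SparkConf
    import org.apache.spark.storage.StorageLevel
    import org.apache.spark.streaming.{Seconds, StreamingContext}
    import org.apache.spark.streaming.flume.FlumeUtils

    val ssc = new StreamingContext(
      new SparkConf().setAppName("flume-poll-example").setMaster("local[2]"), Seconds(10))
    // Pull events from the Spark Sink exposed by the Flume agent on sink-host:9999
    val events = FlumeUtils.createPollingStream(ssc, "sink-host", 9999, StorageLevel.MEMORY_AND_DISK_SER_2)
    events.map(e => new String(e.event.getBody.array(), "UTF-8")).print()
    ssc.start()
    ssc.awaitTermination()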
- createPythonWorker(String, Map<String, String>) - Method in class org.apache.spark.SparkEnv
-
- createRecordFilter(Seq<Expression>) - Static method in class org.apache.spark.sql.parquet.ParquetFilters
-
- createRecordReader(InputSplit, TaskAttemptContext) - Method in class org.apache.spark.input.FixedLengthBinaryInputFormat
-
Create a FixedLengthBinaryRecordReader
- createRecordReader(InputSplit, TaskAttemptContext) - Method in class org.apache.spark.input.StreamFileInputFormat
-
- createRecordReader(InputSplit, TaskAttemptContext) - Method in class org.apache.spark.input.StreamInputFormat
-
- createRecordReader(InputSplit, TaskAttemptContext) - Method in class org.apache.spark.input.WholeTextFileInputFormat
-
- createRecordReader(InputSplit, TaskAttemptContext) - Method in class org.apache.spark.sql.parquet.FilteringParquetRowInputFormat
-
- createRedirectHandler(String, String, Function1<HttpServletRequest, BoxedUnit>, String) - Static method in class org.apache.spark.ui.JettyUtils
-
Create a handler that always redirects the user to the given path
- createRelation(SQLContext, Map<String, String>) - Method in class org.apache.spark.sql.json.DefaultSource
-
Returns a new base relation with the given parameters.
- createRelation(SQLContext, Map<String, String>) - Method in class org.apache.spark.sql.parquet.DefaultSource
-
Returns a new base relation with the given parameters.
- createRelation(SQLContext, Map<String, String>) - Method in interface org.apache.spark.sql.sources.RelationProvider
-
Returns a new base relation with the given parameters.
- createRoutingTables(EdgeRDD<?>, Partitioner) - Static method in class org.apache.spark.graphx.VertexRDD
-
- createSchemaRDD(RDD<A>, TypeTags.TypeTag<A>) - Method in class org.apache.spark.sql.SQLContext
-
Creates a SchemaRDD from an RDD of case classes.
- createServlet(JettyUtils.ServletParams<T>, SecurityManager, Function1<T, Object>) - Static method in class org.apache.spark.ui.JettyUtils
-
- createServletHandler(String, JettyUtils.ServletParams<T>, SecurityManager, String, Function1<T, Object>) - Static method in class org.apache.spark.ui.JettyUtils
-
Create a context handler that responds to a request with the given path prefix
- createServletHandler(String, HttpServlet, String) - Static method in class org.apache.spark.ui.JettyUtils
-
Create a context handler that responds to a request with the given path prefix
- createStaticHandler(String, String) - Static method in class org.apache.spark.ui.JettyUtils
-
Create a handler for serving files from a static directory
- createStream(StreamingContext, String, int, StorageLevel) - Static method in class org.apache.spark.streaming.flume.FlumeUtils
-
Create an input stream from a Flume source.
- createStream(StreamingContext, String, int, StorageLevel, boolean) - Static method in class org.apache.spark.streaming.flume.FlumeUtils
-
Create an input stream from a Flume source.
- createStream(JavaStreamingContext, String, int) - Static method in class org.apache.spark.streaming.flume.FlumeUtils
-
Creates an input stream from a Flume source.
- createStream(JavaStreamingContext, String, int, StorageLevel) - Static method in class org.apache.spark.streaming.flume.FlumeUtils
-
Creates an input stream from a Flume source.
- createStream(JavaStreamingContext, String, int, StorageLevel, boolean) - Static method in class org.apache.spark.streaming.flume.FlumeUtils
-
Creates an input stream from a Flume source.
- createStream(StreamingContext, String, String, Map<String, Object>, StorageLevel) - Static method in class org.apache.spark.streaming.kafka.KafkaUtils
-
Create an input stream that pulls messages from a Kafka Broker.
- createStream(StreamingContext, Map<String, String>, Map<String, Object>, StorageLevel, ClassTag<K>, ClassTag<V>, ClassTag<U>, ClassTag<T>) - Static method in class org.apache.spark.streaming.kafka.KafkaUtils
-
Create an input stream that pulls messages from a Kafka Broker.
- createStream(JavaStreamingContext, String, String, Map<String, Integer>) - Static method in class org.apache.spark.streaming.kafka.KafkaUtils
-
Create an input stream that pulls messages from a Kafka Broker.
- createStream(JavaStreamingContext, String, String, Map<String, Integer>, StorageLevel) - Static method in class org.apache.spark.streaming.kafka.KafkaUtils
-
Create an input stream that pulls messages from a Kafka Broker.
- createStream(JavaStreamingContext, Class<K>, Class<V>, Class<U>, Class<T>, Map<String, String>, Map<String, Integer>, StorageLevel) - Static method in class org.apache.spark.streaming.kafka.KafkaUtils
-
Create an input stream that pulls messages from a Kafka Broker.
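A minimal Scala sketch of the receiver-based Kafka stream; the ZooKeeper quorum, consumer group and topic name are placeholders, and the map value is the number of receiver threads per topic:

    import org.apache.spark.SparkConf
    import org.apache.spark.streaming.{Seconds, StreamingContext}
    import org.apache.spark.streaming.kafka.KafkaUtils

    val ssc = new StreamingContext(
      new SparkConf().setAppName("kafka-example").setMaster("local[2]"), Seconds(5))
    val stream = KafkaUtils.createStream(ssc, "zk-host:2181", "example-consumer-group", Map("events" -> 1))
    stream.map(_._2).print()   // keep only the message payload, drop the key
    ssc.start()
    ssc.awaitTermination()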
- createStream(StreamingContext, String, String, Duration, InitialPositionInStream, StorageLevel) - Static method in class org.apache.spark.streaming.kinesis.KinesisUtils
-
Create an InputDStream that pulls messages from a Kinesis stream.
- createStream(JavaStreamingContext, String, String, Duration, InitialPositionInStream, StorageLevel) - Static method in class org.apache.spark.streaming.kinesis.KinesisUtils
-
Create a Java-friendly InputDStream that pulls messages from a Kinesis stream.
- createStream(StreamingContext, String, String, StorageLevel) - Static method in class org.apache.spark.streaming.mqtt.MQTTUtils
-
Create an input stream that receives messages pushed by an MQTT publisher.
- createStream(JavaStreamingContext, String, String) - Static method in class org.apache.spark.streaming.mqtt.MQTTUtils
-
Create an input stream that receives messages pushed by an MQTT publisher.
- createStream(JavaStreamingContext, String, String, StorageLevel) - Static method in class org.apache.spark.streaming.mqtt.MQTTUtils
-
Create an input stream that receives messages pushed by an MQTT publisher.
- createStream(StreamingContext, Option<Authorization>, Seq<String>, StorageLevel) - Static method in class org.apache.spark.streaming.twitter.TwitterUtils
-
Create an input stream that returns tweets received from Twitter.
- createStream(JavaStreamingContext) - Static method in class org.apache.spark.streaming.twitter.TwitterUtils
-
Create an input stream that returns tweets received from Twitter using Twitter4J's default
OAuth authentication; this requires the system properties twitter4j.oauth.consumerKey,
twitter4j.oauth.consumerSecret, twitter4j.oauth.accessToken and
twitter4j.oauth.accessTokenSecret.
- createStream(JavaStreamingContext, String[]) - Static method in class org.apache.spark.streaming.twitter.TwitterUtils
-
Create an input stream that returns tweets received from Twitter using Twitter4J's default
OAuth authentication; this requires the system properties twitter4j.oauth.consumerKey,
twitter4j.oauth.consumerSecret, twitter4j.oauth.accessToken and
twitter4j.oauth.accessTokenSecret.
- createStream(JavaStreamingContext, String[], StorageLevel) - Static method in class org.apache.spark.streaming.twitter.TwitterUtils
-
Create an input stream that returns tweets received from Twitter using Twitter4J's default
OAuth authentication; this requires the system properties twitter4j.oauth.consumerKey,
twitter4j.oauth.consumerSecret, twitter4j.oauth.accessToken and
twitter4j.oauth.accessTokenSecret.
- createStream(JavaStreamingContext, Authorization) - Static method in class org.apache.spark.streaming.twitter.TwitterUtils
-
Create an input stream that returns tweets received from Twitter.
- createStream(JavaStreamingContext, Authorization, String[]) - Static method in class org.apache.spark.streaming.twitter.TwitterUtils
-
Create an input stream that returns tweets received from Twitter.
- createStream(JavaStreamingContext, Authorization, String[], StorageLevel) - Static method in class org.apache.spark.streaming.twitter.TwitterUtils
-
Create an input stream that returns tweets received from Twitter.
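A minimal Scala sketch of the default-OAuth variant; it assumes the twitter4j.oauth.* system properties described above are already set, and the filter keyword is a placeholder:

    import org.apache.spark.SparkConf
    import org.apache.spark.streaming.{Seconds, StreamingContext}
    import org.apache.spark.streaming.twitter.TwitterUtils

    val ssc = new StreamingContext(
      new SparkConf().setAppName("twitter-example").setMaster("local[2]"), Seconds(10))
    val tweets = TwitterUtils.createStream(ssc, None, Seq("spark"))   // None => use OAuth from system properties
    tweets.map(_.getText).print()
    ssc.start()
    ssc.awaitTermination()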
- createStream(StreamingContext, String, Subscribe, Function1<Seq<ByteString>, Iterator<T>>, StorageLevel, SupervisorStrategy, ClassTag<T>) - Static method in class org.apache.spark.streaming.zeromq.ZeroMQUtils
-
Create an input stream that receives messages pushed by a zeromq publisher.
- createStream(JavaStreamingContext, String, Subscribe, Function<byte[][], Iterable<T>>, StorageLevel, SupervisorStrategy) - Static method in class org.apache.spark.streaming.zeromq.ZeroMQUtils
-
Create an input stream that receives messages pushed by a zeromq publisher.
- createStream(JavaStreamingContext, String, Subscribe, Function<byte[][], Iterable<T>>, StorageLevel) - Static method in class org.apache.spark.streaming.zeromq.ZeroMQUtils
-
Create an input stream that receives messages pushed by a zeromq publisher.
- createStream(JavaStreamingContext, String, Subscribe, Function<byte[][], Iterable<T>>) - Static method in class org.apache.spark.streaming.zeromq.ZeroMQUtils
-
Create an input stream that receives messages pushed by a zeromq publisher.
- createStructField(String, DataType, boolean, Metadata) - Static method in class org.apache.spark.sql.api.java.DataType
-
Creates a StructField by specifying the name (name), data type (dataType) and whether values of this field can be null values (nullable).
- createStructField(String, DataType, boolean) - Static method in class org.apache.spark.sql.api.java.DataType
-
Creates a StructField with empty metadata.
- createStructType(List<StructField>) - Static method in class org.apache.spark.sql.api.java.DataType
-
Creates a StructType with the given list of StructFields (fields).
- createStructType(StructField[]) - Static method in class org.apache.spark.sql.api.java.DataType
-
Creates a StructType with the given StructField array (fields).
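A short Scala sketch combining the StructField and StructType factories above; the field names and types are arbitrary choices:

    import java.util.Arrays
    import org.apache.spark.sql.api.java.{DataType, StructField}

    // A two-field schema: a non-nullable string "name" and a nullable integer "age"
    val fields: java.util.List[StructField] = Arrays.asList(
      DataType.createStructField("name", DataType.StringType, false),
      DataType.createStructField("age", DataType.IntegerType, true))
    val schema = DataType.createStructType(fields)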
- createTable(String, boolean, TypeTags.TypeTag<A>) - Method in class org.apache.spark.sql.hive.HiveContext
-
Creates a table using the schema of the given class.
- createTable(String, String, Seq<Attribute>, boolean, Option<CreateTableDesc>) - Method in class org.apache.spark.sql.hive.HiveMetastoreCatalog
-
Create table with specified database, table name, table description and schema
- CreateTableAsSelect - Class in org.apache.spark.sql.hive.execution
-
:: Experimental ::
Create table and insert the query result into it.
- CreateTableAsSelect(String, String, LogicalPlan, boolean, Option<CreateTableDesc>) - Constructor for class org.apache.spark.sql.hive.execution.CreateTableAsSelect
-
- CreateTables() - Method in class org.apache.spark.sql.hive.HiveMetastoreCatalog
-
- CreateTableUsing - Class in org.apache.spark.sql.sources
-
- CreateTableUsing(String, String, Map<String, String>) - Constructor for class org.apache.spark.sql.sources.CreateTableUsing
-
- createTempDir(String, String) - Static method in class org.apache.spark.util.Utils
-
Create a temporary directory inside the given parent directory.
- createTempLocalBlock() - Method in class org.apache.spark.storage.DiskBlockManager
-
Produces a unique block id and File suitable for storing local intermediate results.
- createTempShuffleBlock() - Method in class org.apache.spark.storage.DiskBlockManager
-
Produces a unique block id and File suitable for storing shuffled intermediate results.
- createTime() - Method in class org.apache.spark.scheduler.cluster.CoarseGrainedSchedulerBackend
-
- createUsingIndex(Iterator<Product2<Object, VD2>>, ClassTag<VD2>) - Method in class org.apache.spark.graphx.impl.VertexPartitionBaseOps
-
Similar effect as aggregateUsingIndex((a, b) => a)
- createWorkspace(int) - Static method in class org.apache.spark.mllib.optimization.NNLS
-
- creationSite() - Method in class org.apache.spark.rdd.RDD
-
User code that created this RDD (e.g.
- creationSite() - Method in class org.apache.spark.streaming.dstream.DStream
-
- credentialsProvider() - Method in class org.apache.spark.streaming.kinesis.KinesisReceiver
-
- CrossValidator - Class in org.apache.spark.ml.tuning
-
:: AlphaComponent ::
K-fold cross validation.
- CrossValidator() - Constructor for class org.apache.spark.ml.tuning.CrossValidator
-
- CrossValidatorModel - Class in org.apache.spark.ml.tuning
-
:: AlphaComponent ::
Model from k-fold cross validation.
- CrossValidatorModel(CrossValidator, ParamMap, Model<?>) - Constructor for class org.apache.spark.ml.tuning.CrossValidatorModel
-
- CrossValidatorParams - Interface in org.apache.spark.ml.tuning
-
- CSV_DEFAULT_DIR() - Method in class org.apache.spark.metrics.sink.CsvSink
-
- CSV_DEFAULT_PERIOD() - Method in class org.apache.spark.metrics.sink.CsvSink
-
- CSV_DEFAULT_UNIT() - Method in class org.apache.spark.metrics.sink.CsvSink
-
- CSV_KEY_DIR() - Method in class org.apache.spark.metrics.sink.CsvSink
-
- CSV_KEY_PERIOD() - Method in class org.apache.spark.metrics.sink.CsvSink
-
- CSV_KEY_UNIT() - Method in class org.apache.spark.metrics.sink.CsvSink
-
- CsvSink - Class in org.apache.spark.metrics.sink
-
- CsvSink(Properties, MetricRegistry, SecurityManager) - Constructor for class org.apache.spark.metrics.sink.CsvSink
-
- currentAttemptId() - Method in interface org.apache.spark.SparkStageInfo
-
- currentAttemptId() - Method in class org.apache.spark.SparkStageInfoImpl
-
- currentInterval(Duration) - Static method in class org.apache.spark.streaming.Interval
-
- currentLocalityIndex() - Method in class org.apache.spark.scheduler.TaskSetManager
-
- currentResult() - Method in interface org.apache.spark.partial.ApproximateEvaluator
-
- currentResult() - Method in class org.apache.spark.partial.CountEvaluator
-
- currentResult() - Method in class org.apache.spark.partial.GroupedCountEvaluator
-
- currentResult() - Method in class org.apache.spark.partial.GroupedMeanEvaluator
-
- currentResult() - Method in class org.apache.spark.partial.GroupedSumEvaluator
-
- currentResult() - Method in class org.apache.spark.partial.MeanEvaluator
-
- currentResult() - Method in class org.apache.spark.partial.SumEvaluator
-
- currentTime() - Method in interface org.apache.spark.streaming.util.Clock
-
- currentTime() - Method in class org.apache.spark.streaming.util.ManualClock
-
- currentTime() - Method in class org.apache.spark.streaming.util.SystemClock
-
- currentUnrollMemory() - Method in class org.apache.spark.storage.MemoryStore
-
Return the amount of memory currently occupied for unrolling blocks across all threads.
- currentUnrollMemoryForThisThread() - Method in class org.apache.spark.storage.MemoryStore
-
Return the amount of memory currently occupied for unrolling blocks by this thread.
- currPrefLocs(Partition) - Method in class org.apache.spark.rdd.PartitionCoalescer
-
- DAGScheduler - Class in org.apache.spark.scheduler
-
The high-level scheduling layer that implements stage-oriented scheduling.
- DAGScheduler(SparkContext, TaskScheduler, LiveListenerBus, MapOutputTrackerMaster, BlockManagerMaster, SparkEnv, Clock) - Constructor for class org.apache.spark.scheduler.DAGScheduler
-
- DAGScheduler(SparkContext, TaskScheduler) - Constructor for class org.apache.spark.scheduler.DAGScheduler
-
- DAGScheduler(SparkContext) - Constructor for class org.apache.spark.scheduler.DAGScheduler
-
- dagScheduler() - Method in class org.apache.spark.scheduler.DAGSchedulerSource
-
- dagScheduler() - Method in class org.apache.spark.scheduler.TaskSchedulerImpl
-
- dagScheduler() - Method in class org.apache.spark.SparkContext
-
- DAGSchedulerActorSupervisor - Class in org.apache.spark.scheduler
-
- DAGSchedulerActorSupervisor(DAGScheduler) - Constructor for class org.apache.spark.scheduler.DAGSchedulerActorSupervisor
-
- DAGSchedulerEvent - Interface in org.apache.spark.scheduler
-
Types of events that can be handled by the DAGScheduler.
- DAGSchedulerEventProcessActor - Class in org.apache.spark.scheduler
-
- DAGSchedulerEventProcessActor(DAGScheduler) - Constructor for class org.apache.spark.scheduler.DAGSchedulerEventProcessActor
-
- DAGSchedulerSource - Class in org.apache.spark.scheduler
-
- DAGSchedulerSource(DAGScheduler) - Constructor for class org.apache.spark.scheduler.DAGSchedulerSource
-
- data() - Method in class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.LaunchTask
-
- data() - Method in class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.StatusUpdate
-
- data() - Method in class org.apache.spark.storage.BlockResult
-
- data() - Method in class org.apache.spark.storage.PutResult
-
- data() - Method in class org.apache.spark.util.Distribution
-
- data() - Method in class org.apache.spark.util.random.GapSamplingIterator
-
- data() - Method in class org.apache.spark.util.random.GapSamplingReplacementIterator
-
- database() - Method in class org.apache.spark.sql.hive.execution.CreateTableAsSelect
-
- databaseName() - Method in class org.apache.spark.sql.hive.MetastoreRelation
-
- dataDeserialize(BlockId, ByteBuffer, Serializer) - Method in class org.apache.spark.storage.BlockManager
-
Deserializes a ByteBuffer into an iterator of values and disposes of it when the end of
the iterator is reached.
- dataIncludesKey() - Method in class org.apache.spark.sql.parquet.ParquetRelation2
-
- dataSchema() - Method in class org.apache.spark.sql.parquet.ParquetRelation2
-
- dataSerialize(BlockId, Iterator<Object>, Serializer) - Method in class org.apache.spark.storage.BlockManager
-
Serializes into a byte buffer.
- dataSerializeStream(BlockId, OutputStream, Iterator<Object>, Serializer) - Method in class org.apache.spark.storage.BlockManager
-
Serializes into a stream.
- DataSinks() - Method in interface org.apache.spark.sql.hive.HiveStrategies
-
- DataSourceStrategy - Class in org.apache.spark.sql.sources
-
A Strategy for planning scans over data sources defined using the sources API.
- DataSourceStrategy() - Constructor for class org.apache.spark.sql.sources.DataSourceStrategy
-
- DataType - Class in org.apache.spark.sql.api.java
-
The base type of all Spark SQL data types.
- DataType() - Constructor for class org.apache.spark.sql.api.java.DataType
-
- dataType() - Method in class org.apache.spark.sql.columnar.NativeColumnType
-
- dataType() - Method in class org.apache.spark.sql.execution.PythonUDF
-
- dataType() - Method in class org.apache.spark.sql.hive.HiveGenericUdaf
-
- dataType() - Method in class org.apache.spark.sql.hive.HiveGenericUdf
-
- dataType() - Method in class org.apache.spark.sql.hive.HiveSimpleUdf
-
- dataType() - Method in class org.apache.spark.sql.hive.HiveUdaf
-
- DataTypeConversions - Class in org.apache.spark.sql.types.util
-
- DataTypeConversions() - Constructor for class org.apache.spark.sql.types.util.DataTypeConversions
-
- DataValidators - Class in org.apache.spark.mllib.util
-
:: DeveloperApi ::
A collection of methods used to validate data before applying ML algorithms.
- DataValidators() - Constructor for class org.apache.spark.mllib.util.DataValidators
-
- DATE - Class in org.apache.spark.sql.columnar
-
- DATE() - Constructor for class org.apache.spark.sql.columnar.DATE
-
- DateColumnAccessor - Class in org.apache.spark.sql.columnar
-
- DateColumnAccessor(ByteBuffer) - Constructor for class org.apache.spark.sql.columnar.DateColumnAccessor
-
- DateColumnBuilder - Class in org.apache.spark.sql.columnar
-
- DateColumnBuilder() - Constructor for class org.apache.spark.sql.columnar.DateColumnBuilder
-
- DateColumnStats - Class in org.apache.spark.sql.columnar
-
- DateColumnStats() - Constructor for class org.apache.spark.sql.columnar.DateColumnStats
-
- DateType - Static variable in class org.apache.spark.sql.api.java.DataType
-
Gets the DateType object.
- DateType - Class in org.apache.spark.sql.api.java
-
The data type representing java.sql.Date values.
- datum() - Method in class org.apache.spark.mllib.tree.impl.BaggedPoint
-
- DDLParser - Class in org.apache.spark.sql.sources
-
A parser for foreign DDL commands.
- DDLParser() - Constructor for class org.apache.spark.sql.sources.DDLParser
-
- dead(String) - Method in class org.apache.spark.scheduler.cluster.SparkDeploySchedulerBackend
-
- decayFactor() - Method in class org.apache.spark.mllib.clustering.StreamingKMeans
-
- decimalMetadata() - Method in class org.apache.spark.sql.parquet.ParquetTypeInfo
-
- decimalMetastoreString(DecimalType) - Static method in class org.apache.spark.sql.hive.HiveShim
-
- DecimalType - Class in org.apache.spark.sql.api.java
-
The data type representing java.math.BigDecimal values.
- DecimalType(int, int) - Constructor for class org.apache.spark.sql.api.java.DecimalType
-
- DecimalType() - Constructor for class org.apache.spark.sql.api.java.DecimalType
-
- decimalTypeInfo(DecimalType) - Static method in class org.apache.spark.sql.hive.HiveShim
-
- decimalTypeInfoToCatalyst(PrimitiveObjectInspector) - Static method in class org.apache.spark.sql.hive.HiveShim
-
- DecisionTree - Class in org.apache.spark.mllib.tree
-
:: Experimental ::
A class which implements a decision tree learning algorithm for classification and regression.
- DecisionTree(Strategy) - Constructor for class org.apache.spark.mllib.tree.DecisionTree
-
- DecisionTreeMetadata - Class in org.apache.spark.mllib.tree.impl
-
Learning and dataset metadata for DecisionTree.
- DecisionTreeMetadata(int, long, int, int, Map<Object, Object>, Set<Object>, int[], Impurity, Enumeration.Value, int, int, double, int, int) - Constructor for class org.apache.spark.mllib.tree.impl.DecisionTreeMetadata
-
- DecisionTreeModel - Class in org.apache.spark.mllib.tree.model
-
:: Experimental ::
Decision tree model for classification or regression.
- DecisionTreeModel(Node, Enumeration.Value) - Constructor for class org.apache.spark.mllib.tree.model.DecisionTreeModel
-
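A minimal Scala training sketch for the DecisionTree/DecisionTreeModel pair above; the LibSVM path is a placeholder and the parameters are arbitrary but typical values:

    import org.apache.spark.{SparkConf, SparkContext}
    import org.apache.spark.mllib.tree.DecisionTree
    import org.apache.spark.mllib.util.MLUtils

    val sc = new SparkContext(new SparkConf().setAppName("dt-example").setMaster("local[2]"))
    val data = MLUtils.loadLibSVMFile(sc, "data/sample_libsvm_data.txt")   // RDD[LabeledPoint]
    val numClasses = 2
    val categoricalFeaturesInfo = Map[Int, Int]()   // treat all features as continuous
    val impurity = "gini"
    val maxDepth = 5
    val maxBins = 32
    val model = DecisionTree.trainClassifier(
      data, numClasses, categoricalFeaturesInfo, impurity, maxDepth, maxBins)
    println(s"Trained a tree of depth ${model.depth}")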
- decoder(ByteBuffer, NativeColumnType<T>) - Static method in class org.apache.spark.sql.columnar.compression.BooleanBitSet
-
- decoder() - Method in interface org.apache.spark.sql.columnar.compression.CompressibleColumnAccessor
-
- decoder(ByteBuffer, NativeColumnType<T>) - Method in interface org.apache.spark.sql.columnar.compression.CompressionScheme
-
- Decoder<T extends org.apache.spark.sql.catalyst.types.NativeType> - Interface in org.apache.spark.sql.columnar.compression
-
- decoder(ByteBuffer, NativeColumnType<T>) - Static method in class org.apache.spark.sql.columnar.compression.DictionaryEncoding
-
- decoder(ByteBuffer, NativeColumnType<T>) - Static method in class org.apache.spark.sql.columnar.compression.IntDelta
-
- decoder(ByteBuffer, NativeColumnType<T>) - Static method in class org.apache.spark.sql.columnar.compression.LongDelta
-
- decoder(ByteBuffer, NativeColumnType<T>) - Static method in class org.apache.spark.sql.columnar.compression.PassThrough
-
- decoder(ByteBuffer, NativeColumnType<T>) - Static method in class org.apache.spark.sql.columnar.compression.RunLengthEncoding
-
- decreaseRunningTasks(int) - Method in class org.apache.spark.scheduler.Pool
-
- deepCopy() - Method in class org.apache.spark.mllib.tree.model.Node
-
Returns a deep copy of the subtree rooted at this node.
- DEFAULT_BUFFER_SIZE() - Static method in class org.apache.spark.util.logging.RollingFileAppender
-
- DEFAULT_CLEANER_TTL() - Static method in class org.apache.spark.streaming.StreamingContext
-
- DEFAULT_LOG_DIR() - Static method in class org.apache.spark.scheduler.EventLoggingListener
-
- DEFAULT_MINIMUM_SHARE() - Method in class org.apache.spark.scheduler.FairSchedulableBuilder
-
- DEFAULT_POOL_NAME() - Method in class org.apache.spark.scheduler.FairSchedulableBuilder
-
- DEFAULT_POOL_NAME() - Static method in class org.apache.spark.ui.jobs.JobProgressListener
-
- DEFAULT_PORT() - Static method in class org.apache.spark.ui.SparkUI
-
- DEFAULT_PREFIX() - Method in class org.apache.spark.metrics.MetricsConfig
-
- DEFAULT_RETAINED_JOBS() - Static method in class org.apache.spark.ui.jobs.JobProgressListener
-
- DEFAULT_RETAINED_STAGES() - Static method in class org.apache.spark.ui.jobs.JobProgressListener
-
- DEFAULT_SCHEDULER_FILE() - Method in class org.apache.spark.scheduler.FairSchedulableBuilder
-
- DEFAULT_SCHEDULING_MODE() - Method in class org.apache.spark.scheduler.FairSchedulableBuilder
-
- DEFAULT_WEIGHT() - Method in class org.apache.spark.scheduler.FairSchedulableBuilder
-
- defaultCorrName() - Static method in class org.apache.spark.mllib.stat.correlation.CorrelationNames
-
- defaultFilter(Path) - Static method in class org.apache.spark.streaming.dstream.FileInputDStream
-
- defaultMinPartitions() - Method in class org.apache.spark.api.java.JavaSparkContext
-
Default min number of partitions for Hadoop RDDs when not given by user
- defaultMinPartitions() - Method in class org.apache.spark.SparkContext
-
Default min number of partitions for Hadoop RDDs when not given by user
- defaultMinSplits() - Method in class org.apache.spark.api.java.JavaSparkContext
-
- defaultMinSplits() - Method in class org.apache.spark.SparkContext
-
Default min number of partitions for Hadoop RDDs when not given by user
- defaultParallelism() - Method in class org.apache.spark.api.java.JavaSparkContext
-
Default level of parallelism to use when not given by user (e.g.
- defaultParallelism() - Method in class org.apache.spark.scheduler.cluster.CoarseGrainedSchedulerBackend
-
- defaultParallelism() - Method in class org.apache.spark.scheduler.cluster.mesos.MesosSchedulerBackend
-
- defaultParallelism() - Method in class org.apache.spark.scheduler.local.LocalBackend
-
- defaultParallelism() - Method in interface org.apache.spark.scheduler.SchedulerBackend
-
- defaultParallelism() - Method in interface org.apache.spark.scheduler.TaskScheduler
-
- defaultParallelism() - Method in class org.apache.spark.scheduler.TaskSchedulerImpl
-
- defaultParallelism() - Method in class org.apache.spark.SparkContext
-
Default level of parallelism to use when not given by user (e.g.
- defaultParams(String) - Static method in class org.apache.spark.mllib.tree.configuration.BoostingStrategy
-
Returns default configuration for the boosting algorithm
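A hedged Scala sketch of starting from the default configuration and handing it to the companion GradientBoostedTrees trainer (not indexed in this section); the input path and parameter values are placeholders:

    import org.apache.spark.{SparkConf, SparkContext}
    import org.apache.spark.mllib.tree.GradientBoostedTrees
    import org.apache.spark.mllib.tree.configuration.BoostingStrategy
    import org.apache.spark.mllib.util.MLUtils

    val sc = new SparkContext(new SparkConf().setAppName("gbt-example").setMaster("local[2]"))
    val data = MLUtils.loadLibSVMFile(sc, "data/sample_libsvm_data.txt")   // placeholder path
    // Start from the defaults and adjust a couple of fields
    val boostingStrategy = BoostingStrategy.defaultParams("Classification")
    boostingStrategy.numIterations = 10
    boostingStrategy.treeStrategy.maxDepth = 4
    val model = GradientBoostedTrees.train(data, boostingStrategy)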
- defaultPartitioner(RDD<?>, Seq<RDD<?>>) - Static method in class org.apache.spark.Partitioner
-
Choose a partitioner to use for a cogroup-like operation between a number of RDDs.
- defaultPartitioner(int) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
-
- defaultProbabilities() - Method in class org.apache.spark.util.Distribution
-
- defaultSize() - Method in class org.apache.spark.sql.columnar.ColumnType
-
- defaultSizeInBytes() - Method in interface org.apache.spark.sql.SQLConf
-
The default size in bytes to assign to a logical operator's estimation statistics.
- DefaultSource - Class in org.apache.spark.sql.json
-
- DefaultSource() - Constructor for class org.apache.spark.sql.json.DefaultSource
-
- DefaultSource - Class in org.apache.spark.sql.parquet
-
Allows creation of parquet based tables using the syntax
CREATE TEMPORARY TABLE ...
- DefaultSource() - Constructor for class org.apache.spark.sql.parquet.DefaultSource
-
- defaultStrategy(String) - Static method in class org.apache.spark.mllib.tree.configuration.Strategy
-
- defaultStrategy() - Static method in class org.apache.spark.streaming.receiver.ActorSupervisorStrategy
-
- defaultValue() - Method in class org.apache.spark.ml.param.Param
-
- DeferredObjectAdapter - Class in org.apache.spark.sql.hive
-
- DeferredObjectAdapter(ObjectInspector) - Constructor for class org.apache.spark.sql.hive.DeferredObjectAdapter
-
- degrees() - Method in class org.apache.spark.graphx.GraphOps
-
The degree of each vertex in the graph.
- degreesOfFreedom() - Method in class org.apache.spark.mllib.stat.test.ChiSqTestResult
-
- degreesOfFreedom() - Method in interface org.apache.spark.mllib.stat.test.TestResult
-
Returns the degree(s) of freedom of the hypothesis test.
- delaySeconds() - Method in class org.apache.spark.streaming.Checkpoint
-
- delegate() - Method in class org.apache.spark.InterruptibleIterator
-
- deleteAllCheckpoints() - Method in class org.apache.spark.mllib.tree.impl.NodeIdCache
-
Call this after training is finished to delete any remaining checkpoints.
- deleteOldFiles() - Method in class org.apache.spark.util.logging.RollingFileAppender
-
Retain only the last few files
- deleteRecursively(File) - Static method in class org.apache.spark.util.Utils
-
Delete a file or directory and its contents recursively.
- deleteRecursively(TachyonFile, TachyonFS) - Static method in class org.apache.spark.util.Utils
-
Delete a file or directory and its contents recursively.
- dense(int, int, double[]) - Static method in class org.apache.spark.mllib.linalg.Matrices
-
Creates a column-major dense matrix.
- dense(double, double...) - Static method in class org.apache.spark.mllib.linalg.Vectors
-
Creates a dense vector from its values.
- dense(double, Seq<Object>) - Static method in class org.apache.spark.mllib.linalg.Vectors
-
Creates a dense vector from its values.
- dense(double[]) - Static method in class org.apache.spark.mllib.linalg.Vectors
-
Creates a dense vector from a double array.
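A short Scala sketch of the dense factories above:

    import org.apache.spark.mllib.linalg.{Matrices, Vectors}

    val v1 = Vectors.dense(1.0, 0.0, 3.0)            // from individual values
    val v2 = Vectors.dense(Array(1.0, 0.0, 3.0))     // from an existing array
    // A 3x2 column-major matrix:
    //   1.0  4.0
    //   2.0  5.0
    //   3.0  6.0
    val m = Matrices.dense(3, 2, Array(1.0, 2.0, 3.0, 4.0, 5.0, 6.0))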
- DenseMatrix - Class in org.apache.spark.mllib.linalg
-
Column-major dense matrix.
- DenseMatrix(int, int, double[]) - Constructor for class org.apache.spark.mllib.linalg.DenseMatrix
-
- DenseVector - Class in org.apache.spark.mllib.linalg
-
A dense vector represented by a value array.
- DenseVector(double[]) - Constructor for class org.apache.spark.mllib.linalg.DenseVector
-
- dependencies() - Method in class org.apache.spark.rdd.RDD
-
Get the list of dependencies of this RDD, taking into account whether the
RDD is checkpointed or not.
- dependencies() - Method in class org.apache.spark.streaming.dstream.DStream
-
List of parent DStreams on which this DStream depends
- dependencies() - Method in class org.apache.spark.streaming.dstream.FilteredDStream
-
- dependencies() - Method in class org.apache.spark.streaming.dstream.FlatMappedDStream
-
- dependencies() - Method in class org.apache.spark.streaming.dstream.FlatMapValuedDStream
-
- dependencies() - Method in class org.apache.spark.streaming.dstream.ForEachDStream
-
- dependencies() - Method in class org.apache.spark.streaming.dstream.GlommedDStream
-
- dependencies() - Method in class org.apache.spark.streaming.dstream.InputDStream
-
- dependencies() - Method in class org.apache.spark.streaming.dstream.MapPartitionedDStream
-
- dependencies() - Method in class org.apache.spark.streaming.dstream.MappedDStream
-
- dependencies() - Method in class org.apache.spark.streaming.dstream.MapValuedDStream
-
- dependencies() - Method in class org.apache.spark.streaming.dstream.ReducedWindowedDStream
-
- dependencies() - Method in class org.apache.spark.streaming.dstream.ShuffledDStream
-
- dependencies() - Method in class org.apache.spark.streaming.dstream.StateDStream
-
- dependencies() - Method in class org.apache.spark.streaming.dstream.TransformedDStream
-
- dependencies() - Method in class org.apache.spark.streaming.dstream.UnionDStream
-
- dependencies() - Method in class org.apache.spark.streaming.dstream.WindowedDStream
-
- Dependency<T> - Class in org.apache.spark
-
:: DeveloperApi ::
Base class for dependencies.
- Dependency() - Constructor for class org.apache.spark.Dependency
-
- deps() - Method in class org.apache.spark.rdd.CoGroupPartition
-
- depth() - Method in class org.apache.spark.mllib.tree.model.DecisionTreeModel
-
Get depth of tree.
- DeregisterReceiver - Class in org.apache.spark.streaming.scheduler
-
- DeregisterReceiver(int, String, String) - Constructor for class org.apache.spark.streaming.scheduler.DeregisterReceiver
-
- desc() - Method in class org.apache.spark.sql.hive.execution.CreateTableAsSelect
-
- DescribeCommand - Class in org.apache.spark.sql.execution
-
:: DeveloperApi ::
- DescribeCommand(SparkPlan, Seq<Attribute>, SQLContext) - Constructor for class org.apache.spark.sql.execution.DescribeCommand
-
- describedTable() - Method in class org.apache.spark.sql.hive.test.TestHiveContext
-
- DescribeHiveTableCommand - Class in org.apache.spark.sql.hive.execution
-
Implementation for "describe [extended] table".
- DescribeHiveTableCommand(MetastoreRelation, Seq<Attribute>, boolean, HiveContext) - Constructor for class org.apache.spark.sql.hive.execution.DescribeHiveTableCommand
-
- description() - Method in class org.apache.spark.ExceptionFailure
-
- description() - Method in class org.apache.spark.storage.StorageLevel
-
- description() - Method in class org.apache.spark.ui.jobs.UIData.StageUIData
-
- DeserializationStream - Class in org.apache.spark.serializer
-
:: DeveloperApi ::
A stream for reading serialized objects.
- DeserializationStream() - Constructor for class org.apache.spark.serializer.DeserializationStream
-
- deserialize(Object) - Method in class org.apache.spark.mllib.linalg.VectorUDT
-
- deserialize(ByteBuffer, ClassTag<T>) - Method in class org.apache.spark.serializer.JavaSerializerInstance
-
- deserialize(ByteBuffer, ClassLoader, ClassTag<T>) - Method in class org.apache.spark.serializer.JavaSerializerInstance
-
- deserialize(ByteBuffer, ClassTag<T>) - Method in class org.apache.spark.serializer.KryoSerializerInstance
-
- deserialize(ByteBuffer, ClassLoader, ClassTag<T>) - Method in class org.apache.spark.serializer.KryoSerializerInstance
-
- deserialize(ByteBuffer, ClassTag<T>) - Method in class org.apache.spark.serializer.SerializerInstance
-
- deserialize(ByteBuffer, ClassLoader, ClassTag<T>) - Method in class org.apache.spark.serializer.SerializerInstance
-
- deserialize(Object) - Method in class org.apache.spark.sql.api.java.JavaToScalaUDTWrapper
-
Convert a SQL datum to the user type
- deserialize(Object) - Method in class org.apache.spark.sql.api.java.ScalaToJavaUDTWrapper
-
Convert a SQL datum to the user type
- deserialize(Object) - Method in class org.apache.spark.sql.api.java.UserDefinedType
-
Convert a SQL datum to the user type
- deserialize(byte[], ClassTag<T>) - Static method in class org.apache.spark.sql.execution.SparkSqlSerializer
-
- deserialize(Writable) - Method in class org.apache.spark.sql.hive.parquet.FakeParquetSerDe
-
- deserialize(Object) - Method in class org.apache.spark.sql.test.ExamplePointUDT
-
- deserialize(byte[]) - Static method in class org.apache.spark.util.Utils
-
Deserialize an object using Java serialization
- deserialize(byte[], ClassLoader) - Static method in class org.apache.spark.util.Utils
-
Deserialize an object using Java serialization and the given ClassLoader
- deserialized() - Method in class org.apache.spark.storage.MemoryEntry
-
- deserialized() - Method in class org.apache.spark.storage.StorageLevel
-
- deserializeFilterExpressions(Configuration) - Static method in class org.apache.spark.sql.parquet.ParquetFilters
-
Note: Inside the Hadoop API we only have access to Configuration, not to SparkContext, so we cannot use broadcasts to convey the actual filter predicate.
- deserializeLongValue(byte[]) - Static method in class org.apache.spark.util.Utils
-
Deserialize a Long value (used for PythonPartitioner)
- deserializeMapStatuses(byte[]) - Static method in class org.apache.spark.MapOutputTracker
-
- deserializePlan(InputStream, Class<?>) - Method in class org.apache.spark.sql.hive.HiveFunctionWrapper
-
- deserializeStream(InputStream) - Method in class org.apache.spark.serializer.JavaSerializerInstance
-
- deserializeStream(InputStream, ClassLoader) - Method in class org.apache.spark.serializer.JavaSerializerInstance
-
- deserializeStream(InputStream) - Method in class org.apache.spark.serializer.KryoSerializerInstance
-
- deserializeStream(InputStream) - Method in class org.apache.spark.serializer.SerializerInstance
-
- deserializeViaNestedStream(InputStream, SerializerInstance, Function1<DeserializationStream, BoxedUnit>) - Static method in class org.apache.spark.util.Utils
-
Deserialize via nested stream using specific serializer
- deserializeWithDependencies(ByteBuffer) - Static method in class org.apache.spark.scheduler.Task
-
Deserialize the list of dependencies in a task serialized with serializeWithDependencies,
and return the task itself as a serialized ByteBuffer.
- destinationToken() - Static method in class org.apache.spark.sql.hive.HiveQl
-
- destroy() - Method in class org.apache.spark.broadcast.Broadcast
-
Destroy all data and metadata related to this broadcast variable.
- destroy(boolean) - Method in class org.apache.spark.broadcast.Broadcast
-
Destroy all data and metadata related to this broadcast variable.
- destroyPythonWorker(String, Map<String, String>, Socket) - Method in class org.apache.spark.SparkEnv
-
- destTableId() - Method in class org.apache.spark.sql.hive.ShimFileSinkDesc
-
- details() - Method in class org.apache.spark.scheduler.Stage
-
- details() - Method in class org.apache.spark.scheduler.StageInfo
-
- determineBounds(ArrayBuffer<Tuple2<K, Object>>, int, Ordering<K>, ClassTag<K>) - Static method in class org.apache.spark.RangePartitioner
-
Determines the bounds for range partitioning from candidates with weights indicating how many
items each represents.
- DeveloperApi - Annotation Type in org.apache.spark.annotation
-
A lower-level, unstable API intended for developers.
- diag(Vector) - Static method in class org.apache.spark.mllib.linalg.Matrices
-
Generate a diagonal matrix in DenseMatrix
format from the supplied values.
- dialect() - Method in class org.apache.spark.sql.hive.HiveContext
-
- dialect() - Method in interface org.apache.spark.sql.SQLConf
-
The SQL dialect that is used when parsing queries.
- DictionaryEncoding - Class in org.apache.spark.sql.columnar.compression
-
- DictionaryEncoding() - Constructor for class org.apache.spark.sql.columnar.compression.DictionaryEncoding
-
- DictionaryEncoding.Decoder<T extends org.apache.spark.sql.catalyst.types.NativeType> - Class in org.apache.spark.sql.columnar.compression
-
- DictionaryEncoding.Decoder(ByteBuffer, NativeColumnType<T>) - Constructor for class org.apache.spark.sql.columnar.compression.DictionaryEncoding.Decoder
-
- DictionaryEncoding.Encoder<T extends org.apache.spark.sql.catalyst.types.NativeType> - Class in org.apache.spark.sql.columnar.compression
-
- DictionaryEncoding.Encoder(NativeColumnType<T>) - Constructor for class org.apache.spark.sql.columnar.compression.DictionaryEncoding.Encoder
-
- diff(Self) - Method in class org.apache.spark.graphx.impl.VertexPartitionBaseOps
-
Hides vertices that are the same between this and other.
- diff(VertexRDD<VD>) - Method in class org.apache.spark.graphx.impl.VertexRDDImpl
-
- diff(VertexRDD<VD>) - Method in class org.apache.spark.graphx.VertexRDD
-
Hides vertices that are the same between this and other; for vertices that are different, keeps the values from other.
- dir() - Method in class org.apache.spark.mllib.optimization.NNLS.Workspace
-
- dir() - Method in class org.apache.spark.sql.hive.ShimFileSinkDesc
-
- DirectTaskResult<T> - Class in org.apache.spark.scheduler
-
A TaskResult that contains the task's return value and accumulator updates.
- DirectTaskResult(ByteBuffer, Map<Object, Object>, TaskMetrics) - Constructor for class org.apache.spark.scheduler.DirectTaskResult
-
- DirectTaskResult() - Constructor for class org.apache.spark.scheduler.DirectTaskResult
-
- disableOutputSpecValidation() - Static method in class org.apache.spark.rdd.PairRDDFunctions
-
Allows for the spark.hadoop.validateOutputSpecs
checks to be disabled on a case-by-case
basis; see SPARK-4835 for more details.
- disconnected(SchedulerDriver) - Method in class org.apache.spark.scheduler.cluster.mesos.CoarseMesosSchedulerBackend
-
- disconnected(SchedulerDriver) - Method in class org.apache.spark.scheduler.cluster.mesos.MesosSchedulerBackend
-
- disconnected() - Method in class org.apache.spark.scheduler.cluster.SparkDeploySchedulerBackend
-
- DISK_ONLY - Static variable in class org.apache.spark.api.java.StorageLevels
-
- DISK_ONLY() - Static method in class org.apache.spark.storage.StorageLevel
-
- DISK_ONLY_2 - Static variable in class org.apache.spark.api.java.StorageLevels
-
- DISK_ONLY_2() - Static method in class org.apache.spark.storage.StorageLevel
-
- diskBlockManager() - Method in class org.apache.spark.storage.BlockManager
-
- DiskBlockManager - Class in org.apache.spark.storage
-
Creates and maintains the logical mapping between logical blocks and physical on-disk
locations.
- DiskBlockManager(BlockManager, SparkConf) - Constructor for class org.apache.spark.storage.DiskBlockManager
-
- DiskBlockObjectWriter - Class in org.apache.spark.storage
-
BlockObjectWriter which writes directly to a file on disk.
- DiskBlockObjectWriter(BlockId, File, Serializer, int, Function1<OutputStream, OutputStream>, boolean, ShuffleWriteMetrics) - Constructor for class org.apache.spark.storage.DiskBlockObjectWriter
-
- diskBytesSpilled() - Method in class org.apache.spark.ui.jobs.UIData.ExecutorSummary
-
- diskBytesSpilled() - Method in class org.apache.spark.ui.jobs.UIData.StageUIData
-
- diskSize() - Method in class org.apache.spark.storage.BlockManagerMessages.UpdateBlockInfo
-
- diskSize() - Method in class org.apache.spark.storage.BlockStatus
-
- diskSize() - Method in class org.apache.spark.storage.RDDInfo
-
- diskStore() - Method in class org.apache.spark.storage.BlockManager
-
- DiskStore - Class in org.apache.spark.storage
-
Stores BlockManager blocks on disk.
- DiskStore(BlockManager, DiskBlockManager) - Constructor for class org.apache.spark.storage.DiskStore
-
- diskUsed() - Method in class org.apache.spark.storage.StorageStatus
-
Return the disk space used by this block manager.
- diskUsed() - Method in class org.apache.spark.ui.exec.ExecutorSummaryInfo
-
- diskUsedByRdd(int) - Method in class org.apache.spark.storage.StorageStatus
-
Return the disk space used by the given RDD in this block manager in O(1) time.
- dispose(ByteBuffer) - Static method in class org.apache.spark.storage.BlockManager
-
Attempt to clean up a ByteBuffer if it is memory-mapped.
- dist(Vector) - Method in class org.apache.spark.util.Vector
-
- distinct() - Method in class org.apache.spark.api.java.JavaDoubleRDD
-
Return a new RDD containing the distinct elements in this RDD.
- distinct(int) - Method in class org.apache.spark.api.java.JavaDoubleRDD
-
Return a new RDD containing the distinct elements in this RDD.
- distinct() - Method in class org.apache.spark.api.java.JavaPairRDD
-
Return a new RDD containing the distinct elements in this RDD.
- distinct(int) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Return a new RDD containing the distinct elements in this RDD.
- distinct() - Method in class org.apache.spark.api.java.JavaRDD
-
Return a new RDD containing the distinct elements in this RDD.
- distinct(int) - Method in class org.apache.spark.api.java.JavaRDD
-
Return a new RDD containing the distinct elements in this RDD.
- distinct(int, Ordering<T>) - Method in class org.apache.spark.rdd.RDD
-
Return a new RDD containing the distinct elements in this RDD.
- distinct() - Method in class org.apache.spark.rdd.RDD
-
Return a new RDD containing the distinct elements in this RDD.
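A tiny Scala sketch, assuming sc is an existing SparkContext:

    val nums = sc.parallelize(Seq(1, 2, 2, 3, 3, 3))
    nums.distinct().collect()     // Array(1, 2, 3), in no particular order
    nums.distinct(2)              // same elements, shuffled into 2 partitions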
- distinct() - Method in class org.apache.spark.sql.api.java.JavaSchemaRDD
-
Return a new RDD containing the distinct elements in this RDD.
- distinct(int) - Method in class org.apache.spark.sql.api.java.JavaSchemaRDD
-
Return a new RDD containing the distinct elements in this RDD.
- Distinct - Class in org.apache.spark.sql.execution
-
:: DeveloperApi ::
Computes the set of distinct input rows using a HashSet.
- Distinct(boolean, SparkPlan) - Constructor for class org.apache.spark.sql.execution.Distinct
-
- distinct() - Method in class org.apache.spark.sql.SchemaRDD
-
- distinct(int, Ordering<Row>) - Method in class org.apache.spark.sql.SchemaRDD
-
- DistributedMatrix - Interface in org.apache.spark.mllib.linalg.distributed
-
Represents a distributively stored matrix backed by one or more RDDs.
- Distribution - Class in org.apache.spark.util
-
Util for getting some stats from a small sample of numeric values, with some handy
summary functions.
- Distribution(double[], int, int) - Constructor for class org.apache.spark.util.Distribution
-
- Distribution(Traversable<Object>) - Constructor for class org.apache.spark.util.Distribution
-
- DIV() - Static method in class org.apache.spark.sql.hive.HiveQl
-
- div(Duration) - Method in class org.apache.spark.streaming.Duration
-
- divide(double) - Method in class org.apache.spark.util.Vector
-
- doc() - Method in class org.apache.spark.ml.param.Param
-
- doCancelAllJobs() - Method in class org.apache.spark.scheduler.DAGScheduler
-
- doCheckpoint() - Method in class org.apache.spark.rdd.RDD
-
Performs the checkpointing of this RDD by saving this.
- doCheckpoint() - Method in class org.apache.spark.rdd.RDDCheckpointData
-
- DoCheckpoint - Class in org.apache.spark.streaming.scheduler
-
- DoCheckpoint(Time) - Constructor for class org.apache.spark.streaming.scheduler.DoCheckpoint
-
- doCleanupBroadcast(long, boolean) - Method in class org.apache.spark.ContextCleaner
-
Perform broadcast cleanup.
- doCleanupRDD(int, boolean) - Method in class org.apache.spark.ContextCleaner
-
Perform RDD cleanup.
- doCleanupShuffle(int, boolean) - Method in class org.apache.spark.ContextCleaner
-
Perform shuffle cleanup, asynchronously.
- doesDirectoryContainAnyNewFiles(File, long) - Static method in class org.apache.spark.util.Utils
-
Determines if a directory contains any files newer than cutoff seconds.
- doKillExecutors(Seq<String>) - Method in class org.apache.spark.scheduler.cluster.YarnSchedulerBackend
-
Request that the ApplicationMaster kill the specified executors.
- doRequestTotalExecutors(int) - Method in class org.apache.spark.scheduler.cluster.YarnSchedulerBackend
-
Request executors from the ApplicationMaster by specifying the total number desired.
- dot(Vector, Vector) - Static method in class org.apache.spark.mllib.linalg.BLAS
-
dot(x, y)
- dot(Vector) - Method in class org.apache.spark.util.Vector
-
- DOUBLE - Class in org.apache.spark.sql.columnar
-
- DOUBLE() - Constructor for class org.apache.spark.sql.columnar.DOUBLE
-
- doubleAccumulator(double) - Method in class org.apache.spark.api.java.JavaSparkContext
-
Create an Accumulator double variable, which tasks can "add" values to using the add method.
- doubleAccumulator(double, String) - Method in class org.apache.spark.api.java.JavaSparkContext
-
Create an Accumulator double variable, which tasks can "add" values to using the add method.
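A Scala-side counterpart sketch of the Java doubleAccumulator helpers above, using SparkContext.accumulator; it assumes sc is an existing SparkContext:

    val parseErrors = sc.accumulator(0.0, "parse errors")
    sc.parallelize(Seq("1.5", "oops", "2.5")).foreach { s =>
      try s.toDouble
      catch { case _: NumberFormatException => parseErrors += 1.0 }
    }
    println(parseErrors.value)   // read on the driver once the action has finished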
- DoubleColumnAccessor - Class in org.apache.spark.sql.columnar
-
- DoubleColumnAccessor(ByteBuffer) - Constructor for class org.apache.spark.sql.columnar.DoubleColumnAccessor
-
- DoubleColumnBuilder - Class in org.apache.spark.sql.columnar
-
- DoubleColumnBuilder() - Constructor for class org.apache.spark.sql.columnar.DoubleColumnBuilder
-
- DoubleColumnStats - Class in org.apache.spark.sql.columnar
-
- DoubleColumnStats() - Constructor for class org.apache.spark.sql.columnar.DoubleColumnStats
-
- DoubleFlatMapFunction<T> - Interface in org.apache.spark.api.java.function
-
A function that returns zero or more records of type Double from each input record.
- DoubleFunction<T> - Interface in org.apache.spark.api.java.function
-
A function that returns Doubles, and can be used to construct DoubleRDDs.
- DoubleParam - Class in org.apache.spark.ml.param
-
Specialized version of Param[Double] for Java.
- DoubleParam(Params, String, String, Option<Object>) - Constructor for class org.apache.spark.ml.param.DoubleParam
-
- DoubleRDDFunctions - Class in org.apache.spark.rdd
-
Extra functions available on RDDs of Doubles through an implicit conversion.
- DoubleRDDFunctions(RDD<Object>) - Constructor for class org.apache.spark.rdd.DoubleRDDFunctions
-
- doubleRDDToDoubleRDDFunctions(RDD<Object>) - Static method in class org.apache.spark.SparkContext
-
- doubleToDoubleWritable(double) - Static method in class org.apache.spark.SparkContext
-
- doubleToMultiplier(double) - Static method in class org.apache.spark.util.Vector
-
- DoubleType - Static variable in class org.apache.spark.sql.api.java.DataType
-
Gets the DoubleType object.
- DoubleType - Class in org.apache.spark.sql.api.java
-
The data type representing double and Double values.
- doubleWritableConverter() - Static method in class org.apache.spark.SparkContext
-
- driver() - Method in class org.apache.spark.scheduler.cluster.mesos.CoarseMesosSchedulerBackend
-
- driver() - Method in class org.apache.spark.scheduler.cluster.mesos.MesosSchedulerBackend
-
- DRIVER_AKKA_ACTOR_NAME() - Method in class org.apache.spark.storage.BlockManagerMaster
-
- DRIVER_IDENTIFIER() - Static method in class org.apache.spark.SparkContext
-
- driverActor() - Method in class org.apache.spark.scheduler.cluster.CoarseGrainedSchedulerBackend
-
- driverActor() - Method in class org.apache.spark.storage.BlockManagerMaster
-
- driverActorSystemName() - Static method in class org.apache.spark.SparkEnv
-
- driverSideSetup() - Method in class org.apache.spark.sql.hive.SparkHiveWriterContainer
-
- dropFromMemory(BlockId, Either<Object[], ByteBuffer>) - Method in class org.apache.spark.storage.BlockManager
-
Drop a block from memory, possibly putting it on disk if applicable.
- droppedBlocks() - Method in class org.apache.spark.storage.PutResult
-
- droppedBlocks() - Method in class org.apache.spark.storage.ResultWithDroppedBlocks
-
- DropTable - Class in org.apache.spark.sql.hive
-
- DropTable(String, boolean) - Constructor for class org.apache.spark.sql.hive.DropTable
-
- DropTable - Class in org.apache.spark.sql.hive.execution
-
:: DeveloperApi ::
Drops a table from the metastore and removes it if it is cached.
- DropTable(String, boolean) - Constructor for class org.apache.spark.sql.hive.execution.DropTable
-
- dropTempTable(String) - Method in class org.apache.spark.sql.SQLContext
-
Drops the temporary table with the given table name in the catalog.
- Dst - Static variable in class org.apache.spark.graphx.TripletFields
-
Expose the destination and edge fields but not the source field.
- dstAttr() - Method in class org.apache.spark.graphx.EdgeContext
-
The vertex attribute of the edge's destination vertex.
- dstAttr() - Method in class org.apache.spark.graphx.EdgeTriplet
-
The destination vertex attribute
- dstAttr() - Method in class org.apache.spark.graphx.impl.AggregatingEdgeContext
-
- dstId() - Method in class org.apache.spark.graphx.Edge
-
- dstId() - Method in class org.apache.spark.graphx.EdgeContext
-
The vertex id of the edge's destination vertex.
- dstId() - Method in class org.apache.spark.graphx.impl.AggregatingEdgeContext
-
- dstId() - Method in class org.apache.spark.graphx.impl.EdgeWithLocalIds
-
- dstream() - Method in class org.apache.spark.streaming.api.java.JavaDStream
-
- dstream() - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
-
- dstream() - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
-
- DStream<T> - Class in org.apache.spark.streaming.dstream
-
A Discretized Stream (DStream), the basic abstraction in Spark Streaming, is a continuous
sequence of RDDs (of the same type) representing a continuous stream of data (see
org.apache.spark.rdd.RDD in the Spark core documentation for more details on RDDs).
- DStream(StreamingContext, ClassTag<T>) - Constructor for class org.apache.spark.streaming.dstream.DStream
-
- DStreamCheckpointData<T> - Class in org.apache.spark.streaming.dstream
-
- DStreamCheckpointData(DStream<T>, ClassTag<T>) - Constructor for class org.apache.spark.streaming.dstream.DStreamCheckpointData
-
- DStreamGraph - Class in org.apache.spark.streaming
-
- DStreamGraph() - Constructor for class org.apache.spark.streaming.DStreamGraph
-
- DTStatsAggregator - Class in org.apache.spark.mllib.tree.impl
-
DecisionTree statistics aggregator for a node.
- DTStatsAggregator(DecisionTreeMetadata, Option<int[]>) - Constructor for class org.apache.spark.mllib.tree.impl.DTStatsAggregator
-
- DummyCategoricalSplit - Class in org.apache.spark.mllib.tree.model
-
Split with no acceptable feature values for categorical features.
- DummyCategoricalSplit(int, Enumeration.Value) - Constructor for class org.apache.spark.mllib.tree.model.DummyCategoricalSplit
-
- DummyHighSplit - Class in org.apache.spark.mllib.tree.model
-
Split with maximum threshold for continuous features.
- DummyHighSplit(int, Enumeration.Value) - Constructor for class org.apache.spark.mllib.tree.model.DummyHighSplit
-
- DummyLowSplit - Class in org.apache.spark.mllib.tree.model
-
Split with minimum threshold for continuous features.
- DummyLowSplit(int, Enumeration.Value) - Constructor for class org.apache.spark.mllib.tree.model.DummyLowSplit
-
- dumpTree(Node, StringBuilder, int) - Static method in class org.apache.spark.sql.hive.HiveQl
-
- duration() - Method in class org.apache.spark.scheduler.TaskInfo
-
- Duration - Class in org.apache.spark.streaming
-
- Duration(long) - Constructor for class org.apache.spark.streaming.Duration
-
- duration() - Method in class org.apache.spark.streaming.Interval
-
- Durations - Class in org.apache.spark.streaming
-
- Durations() - Constructor for class org.apache.spark.streaming.Durations
-