
A

abort(String) - Method in class org.apache.spark.scheduler.TaskSetManager
 
abortStage(Stage, String) - Method in class org.apache.spark.scheduler.DAGScheduler
Aborts all jobs depending on a particular Stage.
abs(Column) - Static method in class org.apache.spark.sql.functions
Computes the absolute value.
AbsoluteError - Class in org.apache.spark.mllib.tree.loss
:: DeveloperApi :: Class for absolute error loss calculation (for regression).
AbsoluteError() - Constructor for class org.apache.spark.mllib.tree.loss.AbsoluteError
 
accept(File, String) - Method in class org.apache.spark.rdd.PipedRDD.NotEqualsFileNameFilter
 
AcceptanceResult - Class in org.apache.spark.util.random
Object used by seqOp to keep track of the number of items accepted and items waitlisted per stratum, as well as the bounds for accepting and waitlisting items.
AcceptanceResult(long, long) - Constructor for class org.apache.spark.util.random.AcceptanceResult
 
acceptBound() - Method in class org.apache.spark.util.random.AcceptanceResult
 
Accumulable<R,T> - Class in org.apache.spark
A data type that can be accumulated, i.e. has a commutative and associative "add" operation, but where the result type, R, may be different from the element type being added, T.
Accumulable(R, AccumulableParam<R, T>, Option<String>) - Constructor for class org.apache.spark.Accumulable
 
Accumulable(R, AccumulableParam<R, T>) - Constructor for class org.apache.spark.Accumulable
 
accumulable(T, AccumulableParam<T, R>) - Method in class org.apache.spark.api.java.JavaSparkContext
Create an Accumulable shared variable of the given type, to which tasks can "add" values with add.
accumulable(T, String, AccumulableParam<T, R>) - Method in class org.apache.spark.api.java.JavaSparkContext
Create an Accumulable shared variable of the given type, to which tasks can "add" values with add.
accumulable(R, AccumulableParam<R, T>) - Method in class org.apache.spark.SparkContext
Create an Accumulable shared variable, to which tasks can add values with +=.
accumulable(R, String, AccumulableParam<R, T>) - Method in class org.apache.spark.SparkContext
Create an Accumulable shared variable, with a name for display in the Spark UI.
accumulableCollection(R, Function1<R, Growable<T>>, ClassTag<R>) - Method in class org.apache.spark.SparkContext
Create an accumulator from a "mutable collection" type.
AccumulableInfo - Class in org.apache.spark.scheduler
:: DeveloperApi :: Information about an Accumulable modified during a task or stage.
AccumulableInfo(long, String, Option<String>, String) - Constructor for class org.apache.spark.scheduler.AccumulableInfo
 
accumulableInfoFromJson(JsonAST.JValue) - Static method in class org.apache.spark.util.JsonProtocol
 
accumulableInfoToJson(AccumulableInfo) - Static method in class org.apache.spark.util.JsonProtocol
 
AccumulableParam<R,T> - Interface in org.apache.spark
Helper object defining how to accumulate values of a particular type.
accumulables() - Method in class org.apache.spark.scheduler.StageInfo
Terminal values of accumulables updated during this stage.
accumulables() - Method in class org.apache.spark.scheduler.TaskInfo
Intermediate updates to accumulables during this task.
accumulables() - Method in class org.apache.spark.ui.jobs.UIData.StageUIData
 
Accumulator<T> - Class in org.apache.spark
A simpler value of Accumulable where the result type being accumulated is the same as the type of the elements being merged.
Accumulator(T, AccumulatorParam<T>, Option<String>) - Constructor for class org.apache.spark.Accumulator
 
Accumulator(T, AccumulatorParam<T>) - Constructor for class org.apache.spark.Accumulator
 
accumulator(int) - Method in class org.apache.spark.api.java.JavaSparkContext
Create an Accumulator integer variable, which tasks can "add" values to using the add method.
accumulator(int, String) - Method in class org.apache.spark.api.java.JavaSparkContext
Create an Accumulator integer variable, which tasks can "add" values to using the add method.
accumulator(double) - Method in class org.apache.spark.api.java.JavaSparkContext
Create an Accumulator double variable, which tasks can "add" values to using the add method.
accumulator(double, String) - Method in class org.apache.spark.api.java.JavaSparkContext
Create an Accumulator double variable, which tasks can "add" values to using the add method.
accumulator(T, AccumulatorParam<T>) - Method in class org.apache.spark.api.java.JavaSparkContext
Create an Accumulator variable of a given type, which tasks can "add" values to using the add method.
accumulator(T, String, AccumulatorParam<T>) - Method in class org.apache.spark.api.java.JavaSparkContext
Create an Accumulator variable of a given type, which tasks can "add" values to using the add method.
accumulator(T, AccumulatorParam<T>) - Method in class org.apache.spark.SparkContext
Create an Accumulator variable of a given type, which tasks can "add" values to using the += method.
accumulator(T, String, AccumulatorParam<T>) - Method in class org.apache.spark.SparkContext
Create an Accumulator variable of a given type, with a name for display in the Spark UI.
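The accumulator entries above all create shared, add-only variables. A minimal sketch, assuming a hypothetical local master and made-up data purely for illustration:

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Hypothetical local context for the example.
val sc = new SparkContext(new SparkConf().setAppName("accumulator-demo").setMaster("local[2]"))

// Named accumulator: tasks add to it with +=, the driver reads .value.
val errorCount = sc.accumulator(0, "errorCount")

sc.parallelize(Seq(1, -2, 3, -4)).foreach { x =>
  if (x < 0) errorCount += 1
}

println(errorCount.value) // 2
```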
accumulator() - Method in class org.apache.spark.sql.UserDefinedPythonFunction
 
AccumulatorParam<T> - Interface in org.apache.spark
A simpler version of AccumulableParam where the only data type you can add in is the same type as the accumulated value.
AccumulatorParam.DoubleAccumulatorParam$ - Class in org.apache.spark
 
AccumulatorParam.DoubleAccumulatorParam$() - Constructor for class org.apache.spark.AccumulatorParam.DoubleAccumulatorParam$
 
AccumulatorParam.FloatAccumulatorParam$ - Class in org.apache.spark
 
AccumulatorParam.FloatAccumulatorParam$() - Constructor for class org.apache.spark.AccumulatorParam.FloatAccumulatorParam$
 
AccumulatorParam.IntAccumulatorParam$ - Class in org.apache.spark
 
AccumulatorParam.IntAccumulatorParam$() - Constructor for class org.apache.spark.AccumulatorParam.IntAccumulatorParam$
 
AccumulatorParam.LongAccumulatorParam$ - Class in org.apache.spark
 
AccumulatorParam.LongAccumulatorParam$() - Constructor for class org.apache.spark.AccumulatorParam.LongAccumulatorParam$
 
Accumulators - Class in org.apache.spark
 
Accumulators() - Constructor for class org.apache.spark.Accumulators
 
accumUpdates() - Method in class org.apache.spark.scheduler.CompletionEvent
 
accumUpdates() - Method in class org.apache.spark.scheduler.DirectTaskResult
 
accuracy() - Method in class org.apache.spark.mllib.evaluation.MultilabelMetrics
Returns accuracy
aclsEnabled() - Method in class org.apache.spark.SecurityManager
Check to see if ACLs for the UI are enabled.
active() - Method in class org.apache.spark.streaming.scheduler.ReceiverInfo
 
activeExecutorIds() - Method in class org.apache.spark.scheduler.TaskSchedulerImpl
 
ActiveJob - Class in org.apache.spark.scheduler
Tracks information about an active job in the DAGScheduler.
ActiveJob(int, Stage, Function2<TaskContext, Iterator<Object>, ?>, int[], CallSite, JobListener, Properties) - Constructor for class org.apache.spark.scheduler.ActiveJob
 
activeJobs() - Method in class org.apache.spark.scheduler.DAGScheduler
 
activeJobs() - Method in class org.apache.spark.ui.jobs.JobProgressListener
 
activeStages() - Method in class org.apache.spark.ui.jobs.JobProgressListener
 
activeTasks() - Method in class org.apache.spark.ui.exec.ExecutorSummaryInfo
 
activeTaskSets() - Method in class org.apache.spark.scheduler.TaskSchedulerImpl
 
actor() - Method in class org.apache.spark.streaming.scheduler.ReceiverInfo
 
ACTOR_NAME() - Static method in class org.apache.spark.scheduler.cluster.CoarseGrainedSchedulerBackend
 
ACTOR_NAME() - Static method in class org.apache.spark.scheduler.cluster.YarnSchedulerBackend
 
ActorHelper - Interface in org.apache.spark.streaming.receiver
:: DeveloperApi :: A receiver trait to be mixed in with your Actor to gain access to the API for pushing received data into Spark Streaming for processing.
ActorLogReceive - Interface in org.apache.spark.util
A trait to enable logging all Akka actor messages.
ActorReceiver<T> - Class in org.apache.spark.streaming.receiver
Provides Actors as receivers for receiving streams.
ActorReceiver(Props, String, StorageLevel, SupervisorStrategy, ClassTag<T>) - Constructor for class org.apache.spark.streaming.receiver.ActorReceiver
 
ActorReceiver.Supervisor - Class in org.apache.spark.streaming.receiver
 
ActorReceiver.Supervisor() - Constructor for class org.apache.spark.streaming.receiver.ActorReceiver.Supervisor
 
ActorReceiverData - Interface in org.apache.spark.streaming.receiver
Case class to receive data sent by child actors
actorStream(Props, String, StorageLevel, SupervisorStrategy) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
Create an input stream with any arbitrary user implemented actor receiver.
actorStream(Props, String, StorageLevel) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
Create an input stream with any arbitrary user implemented actor receiver.
actorStream(Props, String) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
Create an input stream with any arbitrary user implemented actor receiver.
actorStream(Props, String, StorageLevel, SupervisorStrategy, ClassTag<T>) - Method in class org.apache.spark.streaming.StreamingContext
Create an input stream with any arbitrary user implemented actor receiver.
ActorSupervisorStrategy - Class in org.apache.spark.streaming.receiver
:: DeveloperApi :: A helper with a set of defaults for the supervisor strategy
ActorSupervisorStrategy() - Constructor for class org.apache.spark.streaming.receiver.ActorSupervisorStrategy
 
actorSystem() - Method in class org.apache.spark.scheduler.cluster.CoarseGrainedSchedulerBackend
 
actorSystem() - Method in class org.apache.spark.SparkEnv
 
actualSize(Row, int) - Method in class org.apache.spark.sql.columnar.ByteArrayColumnType
 
actualSize(Row, int) - Method in class org.apache.spark.sql.columnar.ColumnType
Returns the size of the value row(ordinal).
actualSize(Row, int) - Static method in class org.apache.spark.sql.columnar.STRING
 
add(T) - Method in class org.apache.spark.Accumulable
Add more data to this accumulator / accumulable
add(Map<Object, Object>) - Static method in class org.apache.spark.Accumulators
 
add(long, long, ED) - Method in class org.apache.spark.graphx.impl.EdgePartitionBuilder
Add a new edge to the partition.
add(long, long, int, int, ED) - Method in class org.apache.spark.graphx.impl.ExistingEdgePartitionBuilder
Add a new edge to the partition.
add(float[], float) - Method in class org.apache.spark.ml.recommendation.ALS.NormalEquation
Adds an observation.
add(ALS.Rating<ID>) - Method in class org.apache.spark.ml.recommendation.ALS.RatingBlockBuilder
Adds a rating.
add(int, Object, int[], float[]) - Method in class org.apache.spark.ml.recommendation.ALS.UncompressedInBlockBuilder
Adds a dst block of (srcId, dstLocalIndex, rating) tuples.
add(double[], MultivariateGaussian[], ExpectationSum, Vector<Object>) - Static method in class org.apache.spark.mllib.clustering.ExpectationSum
 
add(Vector) - Method in class org.apache.spark.mllib.feature.IDF.DocumentFrequencyAggregator
Adds a new document.
add(Iterable<T>, long) - Method in class org.apache.spark.mllib.fpm.FPTree
Adds a transaction with count.
add(BlockMatrix) - Method in class org.apache.spark.mllib.linalg.distributed.BlockMatrix
Adds two block matrices together.
add(Vector) - Method in class org.apache.spark.mllib.stat.MultivariateOnlineSummarizer
Add a new sample to this summarizer, and update the statistical summary.
add(ImpurityCalculator) - Method in class org.apache.spark.mllib.tree.impurity.ImpurityCalculator
Add the stats from another calculator into this one, modifying and returning this calculator.
add(long, long) - Static method in class org.apache.spark.streaming.util.RawTextHelper
 
add(Vector) - Method in class org.apache.spark.util.Vector
 
addAccumulator(R, T) - Method in interface org.apache.spark.AccumulableParam
Add additional data to the accumulator value.
addAccumulator(T, T) - Method in interface org.apache.spark.AccumulatorParam
 
addAccumulator(R, T) - Method in class org.apache.spark.GrowableAccumulableParam
 
addBinary(Binary) - Method in class org.apache.spark.sql.parquet.CatalystPrimitiveConverter
 
addBinary(Binary) - Method in class org.apache.spark.sql.parquet.CatalystPrimitiveStringConverter
 
addBlock(BlockId, BlockStatus) - Method in class org.apache.spark.storage.StorageStatus
Add the given block to this storage status.
AddBlock - Class in org.apache.spark.streaming.scheduler
 
AddBlock(ReceivedBlockInfo) - Constructor for class org.apache.spark.streaming.scheduler.AddBlock
 
addBlock(ReceivedBlockInfo) - Method in class org.apache.spark.streaming.scheduler.ReceivedBlockTracker
Add received block.
addBoolean(boolean) - Method in class org.apache.spark.sql.parquet.CatalystPrimitiveConverter
 
addData(Object) - Method in class org.apache.spark.streaming.receiver.BlockGenerator
Push a single data item into the buffer.
addDataWithCallback(Object, Object) - Method in class org.apache.spark.streaming.receiver.BlockGenerator
Push a single data item into the buffer.
addDouble(double) - Method in class org.apache.spark.sql.parquet.CatalystPrimitiveConverter
 
addedFiles() - Method in class org.apache.spark.SparkContext
 
addedJars() - Method in class org.apache.spark.SparkContext
 
addFile(String) - Method in class org.apache.spark.api.java.JavaSparkContext
Add a file to be downloaded with this Spark job on every node.
addFile(File) - Method in class org.apache.spark.HttpFileServer
 
addFile(String) - Method in class org.apache.spark.SparkContext
Add a file to be downloaded with this Spark job on every node.
addFile(String, boolean) - Method in class org.apache.spark.SparkContext
Add a file to be downloaded with this Spark job on every node.
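As a minimal sketch of the addFile entries, assuming an existing SparkContext `sc` and a hypothetical local file path:

```scala
import org.apache.spark.SparkFiles

// Distribute a hypothetical file to every node of the job.
sc.addFile("/tmp/lookup.txt")

sc.parallelize(1 to 4).map { _ =>
  // On each executor, resolve the locally downloaded copy of the file.
  SparkFiles.get("lookup.txt")
}.collect()
```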
AddFile - Class in org.apache.spark.sql.hive.execution
 
AddFile(String) - Constructor for class org.apache.spark.sql.hive.execution.AddFile
 
addFileToDir(File, File) - Method in class org.apache.spark.HttpFileServer
 
addFilters(Seq<ServletContextHandler>, SparkConf) - Static method in class org.apache.spark.ui.JettyUtils
Add filters, if any, to the given list of ServletContextHandlers
addFloat(float) - Method in class org.apache.spark.sql.parquet.CatalystPrimitiveConverter
 
addGrid(Param<T>, Iterable<T>) - Method in class org.apache.spark.ml.tuning.ParamGridBuilder
Adds a param with multiple values (overwrites if the input param exists).
addGrid(DoubleParam, double[]) - Method in class org.apache.spark.ml.tuning.ParamGridBuilder
Adds a double param with multiple values.
addGrid(IntParam, int[]) - Method in class org.apache.spark.ml.tuning.ParamGridBuilder
Adds an int param with multiple values.
addGrid(FloatParam, float[]) - Method in class org.apache.spark.ml.tuning.ParamGridBuilder
Adds a float param with multiple values.
addGrid(LongParam, long[]) - Method in class org.apache.spark.ml.tuning.ParamGridBuilder
Adds a long param with multiple values.
addGrid(BooleanParam) - Method in class org.apache.spark.ml.tuning.ParamGridBuilder
Adds a boolean param with true and false.
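A minimal sketch of building a parameter grid with the addGrid variants above; the LogisticRegression estimator and the candidate values are assumptions for the example:

```scala
import org.apache.spark.ml.classification.LogisticRegression
import org.apache.spark.ml.tuning.ParamGridBuilder

// Any Params-bearing stage works; LogisticRegression is just an assumed example.
val lr = new LogisticRegression()

// Cross-product of candidate settings: 2 x 2 = 4 ParamMaps.
val paramGrid = new ParamGridBuilder()
  .addGrid(lr.regParam, Array(0.01, 0.1))
  .addGrid(lr.maxIter, Array(10, 50))
  .build()
```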
addImplicit(float[], float, double) - Method in class org.apache.spark.ml.recommendation.ALS.NormalEquation
Adds an observation with implicit feedback.
addInPlace(R, R) - Method in interface org.apache.spark.AccumulableParam
Merge two accumulated values together.
addInPlace(double, double) - Method in class org.apache.spark.AccumulatorParam.DoubleAccumulatorParam$
 
addInPlace(float, float) - Method in class org.apache.spark.AccumulatorParam.FloatAccumulatorParam$
 
addInPlace(int, int) - Method in class org.apache.spark.AccumulatorParam.IntAccumulatorParam$
 
addInPlace(long, long) - Method in class org.apache.spark.AccumulatorParam.LongAccumulatorParam$
 
addInPlace(R, R) - Method in class org.apache.spark.GrowableAccumulableParam
 
addInPlace(double, double) - Method in class org.apache.spark.SparkContext.DoubleAccumulatorParam$
 
addInPlace(float, float) - Method in class org.apache.spark.SparkContext.FloatAccumulatorParam$
 
addInPlace(int, int) - Method in class org.apache.spark.SparkContext.IntAccumulatorParam$
 
addInPlace(long, long) - Method in class org.apache.spark.SparkContext.LongAccumulatorParam$
 
addInPlace(Vector) - Method in class org.apache.spark.util.Vector
 
addInPlace(Vector, Vector) - Method in class org.apache.spark.util.Vector.VectorAccumParam$
 
addInputStream(InputDStream<?>) - Method in class org.apache.spark.streaming.DStreamGraph
 
addInt(int) - Method in class org.apache.spark.sql.parquet.CatalystPrimitiveConverter
 
addJar(String) - Method in class org.apache.spark.api.java.JavaSparkContext
Adds a JAR dependency for all tasks to be executed on this SparkContext in the future.
addJar(File) - Method in class org.apache.spark.HttpFileServer
 
addJar(String) - Method in class org.apache.spark.SparkContext
Adds a JAR dependency for all tasks to be executed on this SparkContext in the future.
AddJar - Class in org.apache.spark.sql.hive.execution
 
AddJar(String) - Constructor for class org.apache.spark.sql.hive.execution.AddJar
 
addListener(L) - Method in interface org.apache.spark.util.ListenerBus
Add a listener to listen to events.
addLocalConfiguration(String, int, int, int, JobConf) - Static method in class org.apache.spark.rdd.HadoopRDD
Add Hadoop configuration specific to a single partition and attempt.
addLong(long) - Method in class org.apache.spark.sql.parquet.CatalystPrimitiveConverter
 
addOnCompleteCallback(Function0<BoxedUnit>) - Method in class org.apache.spark.TaskContext
Adds a callback function to be executed on task completion.
addOnCompleteCallback(Function0<BoxedUnit>) - Method in class org.apache.spark.TaskContextImpl
 
addOutputColumn(StructType, String, DataType) - Method in interface org.apache.spark.ml.param.Params
 
addOutputLoc(int, MapStatus) - Method in class org.apache.spark.scheduler.Stage
 
addOutputStream(DStream<?>) - Method in class org.apache.spark.streaming.DStreamGraph
 
addPartitioningAttributes(Seq<Attribute>) - Method in class org.apache.spark.sql.hive.HiveStrategies.ParquetConversion.LogicalPlanHacks
 
addPartToPGroup(Partition, PartitionGroup) - Method in class org.apache.spark.rdd.PartitionCoalescer
 
address() - Method in class org.apache.spark.storage.ShuffleBlockFetcherIterator.FetchRequest
 
address(String, String, String, Object, String) - Static method in class org.apache.spark.util.AkkaUtils
 
addresses() - Method in class org.apache.spark.streaming.flume.FlumePollingInputDStream
 
addRunningTask(long) - Method in class org.apache.spark.scheduler.TaskSetManager
If the given task ID is not in the set of running tasks, adds it.
addSchedulable(Schedulable) - Method in class org.apache.spark.scheduler.Pool
 
addSchedulable(Schedulable) - Method in interface org.apache.spark.scheduler.Schedulable
 
addSchedulable(Schedulable) - Method in class org.apache.spark.scheduler.TaskSetManager
 
addSparkListener(SparkListener) - Method in class org.apache.spark.SparkContext
:: DeveloperApi :: Register a listener to receive up-calls from events that happen during execution.
addStreamingListener(StreamingListener) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
Add a StreamingListener object for receiving system events related to streaming.
addStreamingListener(StreamingListener) - Method in class org.apache.spark.streaming.StreamingContext
Add a StreamingListener object for receiving system events related to streaming.
addTaskCompletionListener(TaskCompletionListener) - Method in class org.apache.spark.TaskContext
Adds a (Java friendly) listener to be executed on task completion.
addTaskCompletionListener(Function1<TaskContext, BoxedUnit>) - Method in class org.apache.spark.TaskContext
Adds a listener in the form of a Scala closure to be executed on task completion.
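A minimal sketch of registering a completion callback from inside a task, assuming an existing SparkContext `sc`:

```scala
import org.apache.spark.TaskContext

sc.parallelize(1 to 2, 2).foreachPartition { iter =>
  // Register a Scala-closure listener; it runs when this task completes.
  TaskContext.get().addTaskCompletionListener { (ctx: TaskContext) =>
    println(s"task for partition ${ctx.partitionId()} finished")
  }
  iter.foreach(_ => ())
}
```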
addTaskCompletionListener(TaskCompletionListener) - Method in class org.apache.spark.TaskContextImpl
 
addTaskCompletionListener(Function1<TaskContext, BoxedUnit>) - Method in class org.apache.spark.TaskContextImpl
 
addTaskSetManager(Schedulable, Properties) - Method in class org.apache.spark.scheduler.FairSchedulableBuilder
 
addTaskSetManager(Schedulable, Properties) - Method in class org.apache.spark.scheduler.FIFOSchedulableBuilder
 
addTaskSetManager(Schedulable, Properties) - Method in interface org.apache.spark.scheduler.SchedulableBuilder
 
addURL(URL) - Method in class org.apache.spark.util.ChildFirstURLClassLoader
 
addURL(URL) - Method in class org.apache.spark.util.MutableURLClassLoader
 
addValueFromDictionary(int) - Method in class org.apache.spark.sql.parquet.CatalystPrimitiveStringConverter
 
adminAcls() - Method in class org.apache.spark.scheduler.ApplicationEventListener
 
advance(long) - Method in class org.apache.spark.util.ManualClock
 
advanceCheckpoint() - Method in class org.apache.spark.streaming.kinesis.KinesisCheckpointState
Advance the checkpoint clock by the checkpoint interval.
agg(Column, Column...) - Method in class org.apache.spark.sql.DataFrame
Aggregates on the entire DataFrame without groups.
agg(Tuple2<String, String>, Seq<Tuple2<String, String>>) - Method in class org.apache.spark.sql.DataFrame
(Scala-specific) Compute aggregates by specifying a map from column name to aggregate methods.
agg(Map<String, String>) - Method in class org.apache.spark.sql.DataFrame
(Scala-specific) Aggregates on the entire DataFrame without groups.
agg(Map<String, String>) - Method in class org.apache.spark.sql.DataFrame
(Java-specific) Aggregates on the entire DataFrame without groups.
agg(Column, Seq<Column>) - Method in class org.apache.spark.sql.DataFrame
Aggregates on the entire DataFrame without groups.
agg(Column, Column...) - Method in class org.apache.spark.sql.GroupedData
Compute aggregates by specifying a series of aggregate columns.
agg(Tuple2<String, String>, Seq<Tuple2<String, String>>) - Method in class org.apache.spark.sql.GroupedData
(Scala-specific) Compute aggregates by specifying a map from column name to aggregate methods.
agg(Map<String, String>) - Method in class org.apache.spark.sql.GroupedData
(Scala-specific) Compute aggregates by specifying a map from column name to aggregate methods.
agg(Map<String, String>) - Method in class org.apache.spark.sql.GroupedData
(Java-specific) Compute aggregates by specifying a map from column name to aggregate methods.
agg(Column, Seq<Column>) - Method in class org.apache.spark.sql.GroupedData
Compute aggregates by specifying a series of aggregate columns.
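A minimal sketch of the DataFrame/GroupedData agg variants above, assuming an existing SparkContext `sc`; the column names and rows are made up for illustration:

```scala
import org.apache.spark.sql.SQLContext
import org.apache.spark.sql.functions._

val sqlContext = new SQLContext(sc)
import sqlContext.implicits._

val df = sc.parallelize(Seq(("eng", 100.0), ("eng", 80.0), ("ops", 60.0))).toDF("dept", "salary")

// Whole-DataFrame aggregation with Columns, then per-group aggregation with name -> function pairs.
df.agg(max("salary"), avg("salary")).show()
df.groupBy("dept").agg("salary" -> "max", "salary" -> "avg").show()
```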
aggregate(U, Function2<U, T, U>, Function2<U, U, U>) - Method in interface org.apache.spark.api.java.JavaRDDLike
Aggregate the elements of each partition, and then the results for all the partitions, using given combine functions and a neutral "zero value".
aggregate(U, Function2<U, T, U>, Function2<U, U, U>, ClassTag<U>) - Method in class org.apache.spark.rdd.RDD
Aggregate the elements of each partition, and then the results for all the partitions, using given combine functions and a neutral "zero value".
aggregateByKey(U, Partitioner, Function2<U, V, U>, Function2<U, U, U>) - Method in class org.apache.spark.api.java.JavaPairRDD
Aggregate the values of each key, using given combine functions and a neutral "zero value".
aggregateByKey(U, int, Function2<U, V, U>, Function2<U, U, U>) - Method in class org.apache.spark.api.java.JavaPairRDD
Aggregate the values of each key, using given combine functions and a neutral "zero value".
aggregateByKey(U, Function2<U, V, U>, Function2<U, U, U>) - Method in class org.apache.spark.api.java.JavaPairRDD
Aggregate the values of each key, using given combine functions and a neutral "zero value".
aggregateByKey(U, Partitioner, Function2<U, V, U>, Function2<U, U, U>, ClassTag<U>) - Method in class org.apache.spark.rdd.PairRDDFunctions
Aggregate the values of each key, using given combine functions and a neutral "zero value".
aggregateByKey(U, int, Function2<U, V, U>, Function2<U, U, U>, ClassTag<U>) - Method in class org.apache.spark.rdd.PairRDDFunctions
Aggregate the values of each key, using given combine functions and a neutral "zero value".
aggregateByKey(U, Function2<U, V, U>, Function2<U, U, U>, ClassTag<U>) - Method in class org.apache.spark.rdd.PairRDDFunctions
Aggregate the values of each key, using given combine functions and a neutral "zero value".
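A minimal sketch of aggregate and aggregateByKey, assuming an existing SparkContext `sc`:

```scala
val nums = sc.parallelize(1 to 4)

// Zero value, per-partition seqOp, cross-partition combOp: here, a plain sum.
val total = nums.aggregate(0)(_ + _, _ + _)

val pairs = sc.parallelize(Seq(("a", 1), ("a", 3), ("b", 2)))
val sumsByKey = pairs.aggregateByKey(0)(_ + _, _ + _).collect() // e.g. Array(("a", 4), ("b", 2))
```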
aggregateMessages(Function1<EdgeContext<VD, ED, A>, BoxedUnit>, Function2<A, A, A>, TripletFields, ClassTag<A>) - Method in class org.apache.spark.graphx.Graph
Aggregates values from the neighboring edges and vertices of each vertex.
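A minimal sketch of Graph.aggregateMessages, assuming an existing SparkContext `sc`; the tiny vertex and edge sets are made up for illustration:

```scala
import org.apache.spark.graphx.{Edge, Graph, TripletFields}

val vertices = sc.parallelize(Seq((1L, "a"), (2L, "b"), (3L, "c")))
val edges = sc.parallelize(Seq(Edge(1L, 2L, 1), Edge(2L, 3L, 1)))
val graph = Graph(vertices, edges)

// In-degree per vertex: send 1 to each destination, then sum the messages.
val inDegrees = graph.aggregateMessages[Int](ctx => ctx.sendToDst(1), _ + _, TripletFields.None)
inDegrees.collect()
```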
aggregateMessagesEdgeScan(Function1<EdgeContext<VD, ED, A>, BoxedUnit>, Function2<A, A, A>, TripletFields, EdgeActiveness, ClassTag<A>) - Method in class org.apache.spark.graphx.impl.EdgePartition
Send messages along edges and aggregate them at the receiving vertices.
aggregateMessagesIndexScan(Function1<EdgeContext<VD, ED, A>, BoxedUnit>, Function2<A, A, A>, TripletFields, EdgeActiveness, ClassTag<A>) - Method in class org.apache.spark.graphx.impl.EdgePartition
Send messages along edges and aggregate them at the receiving vertices.
aggregateMessagesWithActiveSet(Function1<EdgeContext<VD, ED, A>, BoxedUnit>, Function2<A, A, A>, TripletFields, Option<Tuple2<VertexRDD<?>, EdgeDirection>>, ClassTag<A>) - Method in class org.apache.spark.graphx.Graph
Aggregates values from the neighboring edges and vertices of each vertex.
aggregateMessagesWithActiveSet(Function1<EdgeContext<VD, ED, A>, BoxedUnit>, Function2<A, A, A>, TripletFields, Option<Tuple2<VertexRDD<?>, EdgeDirection>>, ClassTag<A>) - Method in class org.apache.spark.graphx.impl.GraphImpl
 
aggregateSizeForNode(DecisionTreeMetadata, Option<int[]>) - Static method in class org.apache.spark.mllib.tree.RandomForest
Get the number of values to be stored for this node in the bin aggregates.
aggregateUsingIndex(Iterator<Product2<Object, VD2>>, Function2<VD2, VD2, VD2>, ClassTag<VD2>) - Method in class org.apache.spark.graphx.impl.VertexPartitionBaseOps
 
aggregateUsingIndex(RDD<Tuple2<Object, VD2>>, Function2<VD2, VD2, VD2>, ClassTag<VD2>) - Method in class org.apache.spark.graphx.impl.VertexRDDImpl
 
aggregateUsingIndex(RDD<Tuple2<Object, VD2>>, Function2<VD2, VD2, VD2>, ClassTag<VD2>) - Method in class org.apache.spark.graphx.VertexRDD
Aggregates vertices in messages that have the same ids using reduceFunc, returning a VertexRDD co-indexed with this.
AggregatingEdgeContext<VD,ED,A> - Class in org.apache.spark.graphx.impl
 
AggregatingEdgeContext(Function2<A, A, A>, Object, BitSet) - Constructor for class org.apache.spark.graphx.impl.AggregatingEdgeContext
 
Aggregator<K,V,C> - Class in org.apache.spark
:: DeveloperApi :: A set of functions used to aggregate data.
Aggregator(Function1<V, C>, Function2<C, V, C>, Function2<C, C, C>) - Constructor for class org.apache.spark.Aggregator
 
aggregator() - Method in class org.apache.spark.ShuffleDependency
 
akkaSSLOptions() - Method in class org.apache.spark.SecurityManager
 
AkkaUtils - Class in org.apache.spark.util
Various utility classes for working with Akka.
AkkaUtils() - Constructor for class org.apache.spark.util.AkkaUtils
 
Algo - Class in org.apache.spark.mllib.tree.configuration
:: Experimental :: Enum to select the algorithm for the decision tree
Algo() - Constructor for class org.apache.spark.mllib.tree.configuration.Algo
 
algo() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
 
algo() - Method in class org.apache.spark.mllib.tree.model.DecisionTreeModel
 
algo() - Method in class org.apache.spark.mllib.tree.model.GradientBoostedTreesModel
 
algo() - Method in class org.apache.spark.mllib.tree.model.RandomForestModel
 
algo() - Method in class org.apache.spark.mllib.tree.model.TreeEnsembleModel.SaveLoadV1_0$.Metadata
 
algorithm() - Method in class org.apache.spark.mllib.regression.StreamingLinearRegressionWithSGD
 
alias() - Method in class org.apache.spark.sql.hive.MetastoreRelation
 
aliasNames() - Method in class org.apache.spark.sql.hive.HiveGenericUdtf
 
All - Static variable in class org.apache.spark.graphx.TripletFields
Expose all the fields (source, edge, and destination).
AllCompressionSchemes - Interface in org.apache.spark.sql.columnar.compression
 
AllJobsCancelled - Class in org.apache.spark.scheduler
 
AllJobsCancelled() - Constructor for class org.apache.spark.scheduler.AllJobsCancelled
 
AllJobsPage - Class in org.apache.spark.ui.jobs
Page showing list of all ongoing and recently finished jobs
AllJobsPage(JobsTab) - Constructor for class org.apache.spark.ui.jobs.AllJobsPage
 
allJoinTokens() - Static method in class org.apache.spark.sql.hive.HiveQl
 
allocateBlocksToBatch(Time) - Method in class org.apache.spark.streaming.scheduler.ReceivedBlockTracker
Allocate all unallocated blocks to the given batch.
allocateBlocksToBatch(Time) - Method in class org.apache.spark.streaming.scheduler.ReceiverTracker
Allocate all unallocated blocks to the given batch.
AllocatedBlocks - Class in org.apache.spark.streaming.scheduler
Class representing the blocks of all the streams allocated to a batch
AllocatedBlocks(Map<Object, Seq<ReceivedBlockInfo>>) - Constructor for class org.apache.spark.streaming.scheduler.AllocatedBlocks
 
allocatedBlocks() - Method in class org.apache.spark.streaming.scheduler.BatchAllocationEvent
 
allowExisting() - Method in class org.apache.spark.sql.hive.execution.CreateMetastoreDataSource
 
allowExisting() - Method in class org.apache.spark.sql.hive.execution.CreateTableAsSelect
 
allowExisting() - Method in class org.apache.spark.sql.sources.CreateTableUsing
 
allowLocal() - Method in class org.apache.spark.scheduler.JobSubmitted
 
allPendingTasks() - Method in class org.apache.spark.scheduler.TaskSetManager
 
AllStagesPage - Class in org.apache.spark.ui.jobs
Page showing list of all ongoing and recently finished stages and pools
AllStagesPage(StagesTab) - Constructor for class org.apache.spark.ui.jobs.AllStagesPage
 
alpha() - Method in interface org.apache.spark.ml.recommendation.ALSParams
Param for the alpha parameter in the implicit preference formulation.
AlphaComponent - Annotation Type in org.apache.spark.annotation
A new component of Spark which may have unstable APIs.
ALS - Class in org.apache.spark.ml.recommendation
Alternating Least Squares (ALS) matrix factorization.
ALS() - Constructor for class org.apache.spark.ml.recommendation.ALS
 
ALS - Class in org.apache.spark.mllib.recommendation
Alternating Least Squares matrix factorization.
ALS() - Constructor for class org.apache.spark.mllib.recommendation.ALS
Constructs an ALS instance with default parameters: {numBlocks: -1, rank: 10, iterations: 10, lambda: 0.01, implicitPrefs: false, alpha: 1.0}.
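A minimal sketch of the mllib ALS entry point, assuming an existing SparkContext `sc`; the user/product IDs and ratings are made up:

```scala
import org.apache.spark.mllib.recommendation.{ALS, Rating}

val ratings = sc.parallelize(Seq(Rating(1, 10, 4.0), Rating(1, 20, 1.0), Rating(2, 10, 5.0)))

// rank = 10 latent factors, 10 iterations; returns a MatrixFactorizationModel.
val model = ALS.train(ratings, 10, 10)
val predicted = model.predict(2, 20)
```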
ALS.CholeskySolver - Class in org.apache.spark.ml.recommendation
Cholesky solver for least squares problems.
ALS.CholeskySolver() - Constructor for class org.apache.spark.ml.recommendation.ALS.CholeskySolver
 
ALS.InBlock<ID> - Class in org.apache.spark.ml.recommendation
In-link block for computing src (user/item) factors.
ALS.InBlock(Object, int[], int[], float[], ClassTag<ID>) - Constructor for class org.apache.spark.ml.recommendation.ALS.InBlock
 
ALS.InBlock$ - Class in org.apache.spark.ml.recommendation
 
ALS.InBlock$() - Constructor for class org.apache.spark.ml.recommendation.ALS.InBlock$
 
ALS.LeastSquaresNESolver - Interface in org.apache.spark.ml.recommendation
Trait for least squares solvers applied to the normal equation.
ALS.LocalIndexEncoder - Class in org.apache.spark.ml.recommendation
Encoder for storing (blockId, localIndex) into a single integer.
ALS.LocalIndexEncoder(int) - Constructor for class org.apache.spark.ml.recommendation.ALS.LocalIndexEncoder
 
ALS.NNLSSolver - Class in org.apache.spark.ml.recommendation
NNLS solver.
ALS.NNLSSolver() - Constructor for class org.apache.spark.ml.recommendation.ALS.NNLSSolver
 
ALS.NormalEquation - Class in org.apache.spark.ml.recommendation
Representing a normal equation (ALS' subproblem).
ALS.NormalEquation(int) - Constructor for class org.apache.spark.ml.recommendation.ALS.NormalEquation
 
ALS.Rating<ID> - Class in org.apache.spark.ml.recommendation
Rating class for better code readability.
ALS.Rating(ID, ID, float) - Constructor for class org.apache.spark.ml.recommendation.ALS.Rating
 
ALS.Rating$ - Class in org.apache.spark.ml.recommendation
 
ALS.Rating$() - Constructor for class org.apache.spark.ml.recommendation.ALS.Rating$
 
ALS.RatingBlock<ID> - Class in org.apache.spark.ml.recommendation
A rating block that contains src IDs, dst IDs, and ratings, stored in primitive arrays.
ALS.RatingBlock(Object, Object, float[], ClassTag<ID>) - Constructor for class org.apache.spark.ml.recommendation.ALS.RatingBlock
 
ALS.RatingBlock$ - Class in org.apache.spark.ml.recommendation
 
ALS.RatingBlock$() - Constructor for class org.apache.spark.ml.recommendation.ALS.RatingBlock$
 
ALS.RatingBlockBuilder<ID> - Class in org.apache.spark.ml.recommendation
Builder for ALS.RatingBlock.
ALS.RatingBlockBuilder(ClassTag<ID>) - Constructor for class org.apache.spark.ml.recommendation.ALS.RatingBlockBuilder
 
ALS.UncompressedInBlock<ID> - Class in org.apache.spark.ml.recommendation
A block of (srcId, dstEncodedIndex, rating) tuples stored in primitive arrays.
ALS.UncompressedInBlock(Object, int[], float[], ClassTag<ID>, Ordering<ID>) - Constructor for class org.apache.spark.ml.recommendation.ALS.UncompressedInBlock
 
ALS.UncompressedInBlockBuilder<ID> - Class in org.apache.spark.ml.recommendation
Builder for uncompressed in-blocks of (srcId, dstEncodedIndex, rating) tuples.
ALS.UncompressedInBlockBuilder(ALS.LocalIndexEncoder, ClassTag<ID>, Ordering<ID>) - Constructor for class org.apache.spark.ml.recommendation.ALS.UncompressedInBlockBuilder
 
ALSModel - Class in org.apache.spark.ml.recommendation
Model fitted by ALS.
ALSModel(ALS, ParamMap, int, RDD<Tuple2<Object, float[]>>, RDD<Tuple2<Object, float[]>>) - Constructor for class org.apache.spark.ml.recommendation.ALSModel
 
ALSParams - Interface in org.apache.spark.ml.recommendation
Common params for ALS.
analyze(String) - Method in class org.apache.spark.sql.hive.HiveContext
Analyzes the given table in the current database to generate statistics, which will be used in query optimizations.
AnalyzeTable - Class in org.apache.spark.sql.hive.execution
Analyzes the given table in the current database to generate statistics, which will be used in query optimizations.
AnalyzeTable(String) - Constructor for class org.apache.spark.sql.hive.execution.AnalyzeTable
 
and(Column) - Method in class org.apache.spark.sql.Column
Boolean AND.
AND() - Static method in class org.apache.spark.sql.hive.HiveQl
 
And - Class in org.apache.spark.sql.sources
 
And(Filter, Filter) - Constructor for class org.apache.spark.sql.sources.And
 
ANY() - Static method in class org.apache.spark.scheduler.TaskLocality
 
append(boolean, ByteBuffer) - Static method in class org.apache.spark.sql.columnar.BOOLEAN
 
append(Row, int, ByteBuffer) - Static method in class org.apache.spark.sql.columnar.BOOLEAN
 
append(byte, ByteBuffer) - Static method in class org.apache.spark.sql.columnar.BYTE
 
append(Row, int, ByteBuffer) - Static method in class org.apache.spark.sql.columnar.BYTE
 
append(byte[], ByteBuffer) - Method in class org.apache.spark.sql.columnar.ByteArrayColumnType
 
append(JvmType, ByteBuffer) - Method in class org.apache.spark.sql.columnar.ColumnType
Appends the given value v of type T into the given ByteBuffer.
append(Row, int, ByteBuffer) - Method in class org.apache.spark.sql.columnar.ColumnType
Appends row(ordinal) of type T into the given ByteBuffer.
append(int, ByteBuffer) - Static method in class org.apache.spark.sql.columnar.DATE
 
append(double, ByteBuffer) - Static method in class org.apache.spark.sql.columnar.DOUBLE
 
append(Row, int, ByteBuffer) - Static method in class org.apache.spark.sql.columnar.DOUBLE
 
append(float, ByteBuffer) - Static method in class org.apache.spark.sql.columnar.FLOAT
 
append(Row, int, ByteBuffer) - Static method in class org.apache.spark.sql.columnar.FLOAT
 
append(int, ByteBuffer) - Static method in class org.apache.spark.sql.columnar.INT
 
append(Row, int, ByteBuffer) - Static method in class org.apache.spark.sql.columnar.INT
 
append(long, ByteBuffer) - Static method in class org.apache.spark.sql.columnar.LONG
 
append(Row, int, ByteBuffer) - Static method in class org.apache.spark.sql.columnar.LONG
 
append(short, ByteBuffer) - Static method in class org.apache.spark.sql.columnar.SHORT
 
append(Row, int, ByteBuffer) - Static method in class org.apache.spark.sql.columnar.SHORT
 
append(String, ByteBuffer) - Static method in class org.apache.spark.sql.columnar.STRING
 
append(Timestamp, ByteBuffer) - Static method in class org.apache.spark.sql.columnar.TIMESTAMP
 
append(AvroFlumeEvent) - Method in class org.apache.spark.streaming.flume.FlumeEventServer
 
appendBatch(List<AvroFlumeEvent>) - Method in class org.apache.spark.streaming.flume.FlumeEventServer
 
appendBias(Vector) - Static method in class org.apache.spark.mllib.util.MLUtils
Returns a new vector with 1.0 (bias) appended to the input vector.
appendFrom(Row, int) - Method in class org.apache.spark.sql.columnar.BasicColumnBuilder
 
appendFrom(Row, int) - Method in interface org.apache.spark.sql.columnar.ColumnBuilder
Appends row(ordinal) to the column builder.
appendFrom(Row, int) - Method in interface org.apache.spark.sql.columnar.compression.CompressibleColumnBuilder
 
appendFrom(Row, int) - Method in interface org.apache.spark.sql.columnar.NullableColumnBuilder
 
AppendingParquetOutputFormat - Class in org.apache.spark.sql.parquet
TODO: this will be able to append to directories it created itself, not necessarily to imported ones.
AppendingParquetOutputFormat(int) - Constructor for class org.apache.spark.sql.parquet.AppendingParquetOutputFormat
 
appendReadColumns(Configuration, Seq<Integer>, Seq<String>) - Static method in class org.apache.spark.sql.hive.HiveShim
 
appId() - Method in class org.apache.spark.scheduler.ApplicationEventListener
 
appId() - Method in class org.apache.spark.scheduler.cluster.mesos.CoarseMesosSchedulerBackend
 
appId() - Method in class org.apache.spark.scheduler.cluster.mesos.MesosSchedulerBackend
 
appId() - Method in class org.apache.spark.scheduler.cluster.SparkDeploySchedulerBackend
 
appId() - Method in interface org.apache.spark.scheduler.SchedulerBackend
 
appId() - Method in class org.apache.spark.scheduler.SparkListenerApplicationStart
 
appId() - Method in interface org.apache.spark.scheduler.TaskScheduler
 
applicationEndFromJson(JsonAST.JValue) - Static method in class org.apache.spark.util.JsonProtocol
 
applicationEndToJson(SparkListenerApplicationEnd) - Static method in class org.apache.spark.util.JsonProtocol
 
ApplicationEventListener - Class in org.apache.spark.scheduler
A simple listener for application events.
ApplicationEventListener() - Constructor for class org.apache.spark.scheduler.ApplicationEventListener
 
applicationId() - Method in class org.apache.spark.scheduler.cluster.mesos.CoarseMesosSchedulerBackend
 
applicationId() - Method in class org.apache.spark.scheduler.cluster.mesos.MesosSchedulerBackend
 
applicationId() - Method in class org.apache.spark.scheduler.cluster.SparkDeploySchedulerBackend
 
applicationId() - Method in class org.apache.spark.scheduler.local.LocalBackend
 
applicationId() - Method in interface org.apache.spark.scheduler.SchedulerBackend
Get an application ID associated with the job.
applicationId() - Method in interface org.apache.spark.scheduler.TaskScheduler
Get an application ID associated with the job.
applicationId() - Method in class org.apache.spark.scheduler.TaskSchedulerImpl
 
applicationId() - Method in class org.apache.spark.SparkContext
 
applicationStartFromJson(JsonAST.JValue) - Static method in class org.apache.spark.util.JsonProtocol
 
applicationStartToJson(SparkListenerApplicationStart) - Static method in class org.apache.spark.util.JsonProtocol
 
apply(RDD<Tuple2<Object, VD>>, RDD<Edge<ED>>, VD, StorageLevel, StorageLevel, ClassTag<VD>, ClassTag<ED>) - Static method in class org.apache.spark.graphx.Graph
Construct a graph from a collection of vertices and edges with attributes.
apply(RDD<Edge<ED>>, VD, StorageLevel, StorageLevel, ClassTag<VD>, ClassTag<ED>) - Static method in class org.apache.spark.graphx.impl.GraphImpl
Create a graph from edges, setting referenced vertices to `defaultVertexAttr`.
apply(RDD<Tuple2<Object, VD>>, RDD<Edge<ED>>, VD, StorageLevel, StorageLevel, ClassTag<VD>, ClassTag<ED>) - Static method in class org.apache.spark.graphx.impl.GraphImpl
Create a graph from vertices and edges, setting missing vertices to `defaultVertexAttr`.
apply(VertexRDD<VD>, EdgeRDD<ED>, ClassTag<VD>, ClassTag<ED>) - Static method in class org.apache.spark.graphx.impl.GraphImpl
Create a graph from a VertexRDD and an EdgeRDD with arbitrary replicated vertices.
apply(Iterator<Tuple2<Object, VD>>, ClassTag<VD>) - Static method in class org.apache.spark.graphx.impl.ShippableVertexPartition
Construct a `ShippableVertexPartition` from the given vertices without any routing table.
apply(Iterator<Tuple2<Object, VD>>, RoutingTablePartition, VD, ClassTag<VD>) - Static method in class org.apache.spark.graphx.impl.ShippableVertexPartition
Construct a ShippableVertexPartition from the given vertices with the specified routing table, filling in missing vertices mentioned in the routing table using defaultVal.
apply(Iterator<Tuple2<Object, VD>>, RoutingTablePartition, VD, Function2<VD, VD, VD>, ClassTag<VD>) - Static method in class org.apache.spark.graphx.impl.ShippableVertexPartition
Construct a ShippableVertexPartition from the given vertices with the specified routing table, filling in missing vertices mentioned in the routing table using defaultVal, and merging duplicate vertex attributes with mergeFunc.
apply(Iterator<Tuple2<Object, VD>>, ClassTag<VD>) - Static method in class org.apache.spark.graphx.impl.VertexPartition
Construct a `VertexPartition` from the given vertices.
apply(long) - Method in class org.apache.spark.graphx.impl.VertexPartitionBase
Return the vertex attribute for the given vertex ID.
apply(Graph<VD, ED>, A, int, EdgeDirection, Function3<Object, VD, A, VD>, Function1<EdgeTriplet<VD, ED>, Iterator<Tuple2<Object, A>>>, Function2<A, A, A>, ClassTag<VD>, ClassTag<ED>, ClassTag<A>) - Static method in class org.apache.spark.graphx.Pregel
Execute a Pregel-like iterative vertex-parallel abstraction.
apply(RDD<Tuple2<Object, VD>>, ClassTag<VD>) - Static method in class org.apache.spark.graphx.VertexRDD
Constructs a standalone VertexRDD (one that is not set up for efficient joins with an EdgeRDD) from an RDD of vertex-attribute pairs.
apply(RDD<Tuple2<Object, VD>>, EdgeRDD<?>, VD, ClassTag<VD>) - Static method in class org.apache.spark.graphx.VertexRDD
Constructs a VertexRDD from an RDD of vertex-attribute pairs.
apply(RDD<Tuple2<Object, VD>>, EdgeRDD<?>, VD, Function2<VD, VD, VD>, ClassTag<VD>) - Static method in class org.apache.spark.graphx.VertexRDD
Constructs a VertexRDD from an RDD of vertex-attribute pairs.
apply(Param<T>) - Method in class org.apache.spark.ml.param.ParamMap
Gets the value of the input param or its default value if it does not exist.
apply(BinaryConfusionMatrix) - Method in interface org.apache.spark.mllib.evaluation.binary.BinaryClassificationMetricComputer
 
apply(BinaryConfusionMatrix) - Static method in class org.apache.spark.mllib.evaluation.binary.FalsePositiveRate
 
apply(BinaryConfusionMatrix) - Method in class org.apache.spark.mllib.evaluation.binary.FMeasure
 
apply(BinaryConfusionMatrix) - Static method in class org.apache.spark.mllib.evaluation.binary.Precision
 
apply(BinaryConfusionMatrix) - Static method in class org.apache.spark.mllib.evaluation.binary.Recall
 
apply(int) - Method in class org.apache.spark.mllib.linalg.DenseMatrix
 
apply(int, int) - Method in class org.apache.spark.mllib.linalg.DenseMatrix
 
apply(int) - Method in class org.apache.spark.mllib.linalg.DenseVector
 
apply(int, int, int, int) - Static method in class org.apache.spark.mllib.linalg.distributed.GridPartitioner
Creates a new GridPartitioner instance.
apply(int, int, int) - Static method in class org.apache.spark.mllib.linalg.distributed.GridPartitioner
Creates a new GridPartitioner instance with the input suggested number of partitions.
apply(int, int) - Method in interface org.apache.spark.mllib.linalg.Matrix
Gets the (i, j)-th element.
apply(int, int) - Method in class org.apache.spark.mllib.linalg.SparseMatrix
 
apply(int) - Method in interface org.apache.spark.mllib.linalg.Vector
Gets the value of the ith element.
apply(int, Predict, double, boolean) - Static method in class org.apache.spark.mllib.tree.model.Node
Construct a node with nodeIndex, predict, impurity and isLeaf parameters.
apply(String) - Static method in class org.apache.spark.rdd.PartitionGroup
 
apply(long, String, Option<String>, String) - Static method in class org.apache.spark.scheduler.AccumulableInfo
 
apply(long, String, String) - Static method in class org.apache.spark.scheduler.AccumulableInfo
 
apply(String, long, Enumeration.Value, ByteBuffer) - Method in class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.StatusUpdate$
Alternate factory method that takes a ByteBuffer directly for the data field
apply(BlockManagerId, long[]) - Static method in class org.apache.spark.scheduler.HighlyCompressedMapStatus
 
apply(long, TaskMetrics) - Static method in class org.apache.spark.scheduler.RuntimePercentage
 
apply(String) - Static method in class org.apache.spark.sql.Column
 
apply(Expression) - Static method in class org.apache.spark.sql.Column
 
apply(DataType) - Static method in class org.apache.spark.sql.columnar.ColumnType
 
apply(boolean, int, StorageLevel, SparkPlan, Option<String>) - Static method in class org.apache.spark.sql.columnar.InMemoryRelation
 
apply(String) - Method in class org.apache.spark.sql.DataFrame
Selects a column based on the column name and returns it as a Column.
apply(LogicalPlan) - Method in class org.apache.spark.sql.hive.HiveMetastoreCatalog.CreateTables
 
apply(LogicalPlan) - Method in class org.apache.spark.sql.hive.HiveMetastoreCatalog.ParquetConversions
 
apply(LogicalPlan) - Method in class org.apache.spark.sql.hive.HiveMetastoreCatalog.PreInsertionCasts
 
apply(LogicalPlan) - Method in class org.apache.spark.sql.hive.HiveStrategies.DataSinks
 
apply(LogicalPlan) - Method in class org.apache.spark.sql.hive.HiveStrategies.HiveCommandStrategy
 
apply(LogicalPlan) - Method in class org.apache.spark.sql.hive.HiveStrategies.HiveDDLStrategy
 
apply(LogicalPlan) - Method in class org.apache.spark.sql.hive.HiveStrategies.HiveTableScans
 
apply(LogicalPlan) - Method in class org.apache.spark.sql.hive.HiveStrategies.ParquetConversion
 
apply(LogicalPlan) - Method in class org.apache.spark.sql.hive.HiveStrategies.Scripts
 
apply(LogicalPlan) - Static method in class org.apache.spark.sql.hive.ResolveUdtfsAlias
 
apply(int, long) - Static method in class org.apache.spark.sql.parquet.timestamp.NanoTime
 
apply(LogicalPlan) - Static method in class org.apache.spark.sql.sources.DataSourceStrategy
 
apply(String, boolean) - Method in class org.apache.spark.sql.sources.DDLParser
 
apply(LogicalPlan) - Static method in class org.apache.spark.sql.sources.PreInsertCastAndRename
 
apply(LogicalPlan) - Method in class org.apache.spark.sql.sources.PreWriteCheck
 
apply(SQLContext, Option<StructType>, String, Map<String, String>) - Static method in class org.apache.spark.sql.sources.ResolvedDataSource
Create a ResolvedDataSource for reading data in.
apply(SQLContext, String, SaveMode, Map<String, String>, DataFrame) - Static method in class org.apache.spark.sql.sources.ResolvedDataSource
Create a ResolvedDataSource for saving the content of the given DataFrame.
apply(Seq<Column>) - Method in class org.apache.spark.sql.UserDefinedFunction
 
apply(Seq<Column>) - Method in class org.apache.spark.sql.UserDefinedPythonFunction
Returns a Column that will evaluate to calling this UDF with the given input.
apply(String) - Static method in class org.apache.spark.storage.BlockId
Converts a BlockId "name" String back into a BlockId.
apply(String, String, int) - Static method in class org.apache.spark.storage.BlockManagerId
Returns a BlockManagerId for the given configuration.
apply(ObjectInput) - Static method in class org.apache.spark.storage.BlockManagerId
 
apply(boolean, boolean, boolean, boolean, int) - Static method in class org.apache.spark.storage.StorageLevel
:: DeveloperApi :: Create a new StorageLevel object without setting useOffHeap.
apply(boolean, boolean, boolean, int) - Static method in class org.apache.spark.storage.StorageLevel
:: DeveloperApi :: Create a new StorageLevel object.
apply(int, int) - Static method in class org.apache.spark.storage.StorageLevel
:: DeveloperApi :: Create a new StorageLevel object from its integer representation.
apply(ObjectInput) - Static method in class org.apache.spark.storage.StorageLevel
:: DeveloperApi :: Read StorageLevel object from ObjectInput stream.
apply(String, int) - Static method in class org.apache.spark.streaming.kafka.Broker
 
apply(Map<String, String>) - Method in class org.apache.spark.streaming.kafka.KafkaCluster.SimpleConsumerConfig$
Make a consumer config without requiring group.id or zookeeper.connect, since communicating with brokers also needs common settings such as timeout
apply(SparkContext, Map<String, String>, Map<TopicAndPartition, Object>, Map<TopicAndPartition, KafkaCluster.LeaderOffset>, Function1<MessageAndMetadata<K, V>, R>, ClassTag<K>, ClassTag<V>, ClassTag<U>, ClassTag<T>, ClassTag<R>) - Static method in class org.apache.spark.streaming.kafka.KafkaRDD
 
apply(String, int, long, long) - Static method in class org.apache.spark.streaming.kafka.OffsetRange
 
apply(TopicAndPartition, long, long) - Static method in class org.apache.spark.streaming.kafka.OffsetRange
 
apply(Tuple4<String, Object, Object, Object>) - Static method in class org.apache.spark.streaming.kafka.OffsetRange
 
apply(long) - Static method in class org.apache.spark.streaming.Milliseconds
 
apply(long) - Static method in class org.apache.spark.streaming.Minutes
 
apply(long) - Static method in class org.apache.spark.streaming.Seconds
 
apply(I, Function0<BoxedUnit>) - Static method in class org.apache.spark.util.CompletionIterator
 
apply(Traversable<Object>) - Static method in class org.apache.spark.util.Distribution
 
apply(InputStream, File, SparkConf) - Static method in class org.apache.spark.util.logging.FileAppender
Create the right appender based on Spark configuration
apply(TraversableOnce<Object>) - Static method in class org.apache.spark.util.StatCounter
Build a StatCounter from a list of values.
apply(Seq<Object>) - Static method in class org.apache.spark.util.StatCounter
Build a StatCounter from a list of values passed as variable-length arguments.
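A minimal sketch of building a StatCounter directly from values and reading its summary statistics:

```scala
import org.apache.spark.util.StatCounter

val stats = StatCounter(1.0, 2.0, 3.0, 4.0)
println(stats.mean)   // 2.5
println(stats.stdev)
```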
apply(A) - Method in class org.apache.spark.util.TimeStampedHashMap
 
apply(A) - Method in class org.apache.spark.util.TimeStampedWeakValueHashMap
 
apply(int) - Method in class org.apache.spark.util.Vector
 
applySchema(RDD<Row>, StructType) - Method in class org.apache.spark.sql.SQLContext
:: DeveloperApi :: Creates a DataFrame from an RDD containing Rows by applying a schema to this RDD.
applySchema(JavaRDD<Row>, StructType) - Method in class org.apache.spark.sql.SQLContext
 
applySchema(RDD<?>, Class<?>) - Method in class org.apache.spark.sql.SQLContext
Applies a schema to an RDD of Java Beans.
applySchema(JavaRDD<?>, Class<?>) - Method in class org.apache.spark.sql.SQLContext
Applies a schema to an RDD of Java Beans.
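A minimal sketch of applySchema, assuming an existing SparkContext `sc`; the schema and rows are made up for illustration:

```scala
import org.apache.spark.sql.{Row, SQLContext}
import org.apache.spark.sql.types.{IntegerType, StringType, StructField, StructType}

val sqlContext = new SQLContext(sc)
val rows = sc.parallelize(Seq(Row(1, "alice"), Row(2, "bob")))
val schema = StructType(Seq(
  StructField("id", IntegerType, nullable = false),
  StructField("name", StringType, nullable = true)))

// Pair the RDD of Rows with an explicit schema to obtain a DataFrame.
val people = sqlContext.applySchema(rows, schema)
people.printSchema()
```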
appName() - Method in class org.apache.spark.api.java.JavaSparkContext
 
appName() - Method in class org.apache.spark.scheduler.ApplicationEventListener
 
appName() - Method in class org.apache.spark.scheduler.SparkListenerApplicationStart
 
appName() - Method in class org.apache.spark.SparkContext
 
appName() - Method in class org.apache.spark.ui.SparkUI
 
appName() - Method in class org.apache.spark.ui.SparkUITab
 
approxCountDistinct(Column) - Static method in class org.apache.spark.sql.functions
Aggregate function: returns the approximate number of distinct items in a group.
approxCountDistinct(String) - Static method in class org.apache.spark.sql.functions
Aggregate function: returns the approximate number of distinct items in a group.
approxCountDistinct(Column, double) - Static method in class org.apache.spark.sql.functions
Aggregate function: returns the approximate number of distinct items in a group.
approxCountDistinct(String, double) - Static method in class org.apache.spark.sql.functions
Aggregate function: returns the approximate number of distinct items in a group.
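A minimal sketch of the approxCountDistinct variants, assuming `df` is an existing DataFrame with a hypothetical "user" column:

```scala
import org.apache.spark.sql.functions._

df.agg(approxCountDistinct("user")).show()
// With an explicit relative standard deviation for the estimate:
df.agg(approxCountDistinct(col("user"), 0.05)).show()
```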
ApproxHist() - Static method in class org.apache.spark.mllib.tree.configuration.QuantileStrategy
 
ApproximateActionListener<T,U,R> - Class in org.apache.spark.partial
A JobListener for an approximate single-result action, such as count() or non-parallel reduce().
ApproximateActionListener(RDD<T>, Function2<TaskContext, Iterator<T>, U>, ApproximateEvaluator<U, R>, long) - Constructor for class org.apache.spark.partial.ApproximateActionListener
 
ApproximateEvaluator<U,R> - Interface in org.apache.spark.partial
An object that computes a function incrementally by merging in results of type U from multiple tasks.
appUIAddress() - Method in class org.apache.spark.ui.SparkUI
 
appUIHostPort() - Method in class org.apache.spark.ui.SparkUI
Return the application UI host:port.
AreaUnderCurve - Class in org.apache.spark.mllib.evaluation
Computes the area under the curve (AUC) using the trapezoidal rule.
AreaUnderCurve() - Constructor for class org.apache.spark.mllib.evaluation.AreaUnderCurve
 
areaUnderPR() - Method in class org.apache.spark.mllib.evaluation.BinaryClassificationMetrics
Computes the area under the precision-recall curve.
areaUnderROC() - Method in class org.apache.spark.mllib.evaluation.BinaryClassificationMetrics
Computes the area under the receiver operating characteristic (ROC) curve.
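A minimal sketch of computing AUC metrics, assuming an existing SparkContext `sc`; the (score, label) pairs are made up:

```scala
import org.apache.spark.mllib.evaluation.BinaryClassificationMetrics

val scoreAndLabels = sc.parallelize(Seq((0.9, 1.0), (0.6, 1.0), (0.4, 0.0), (0.1, 0.0)))

val metrics = new BinaryClassificationMetrics(scoreAndLabels)
println(metrics.areaUnderROC())
println(metrics.areaUnderPR())
```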
areBoundsEmpty() - Method in class org.apache.spark.util.random.AcceptanceResult
 
argString() - Method in class org.apache.spark.sql.hive.execution.CreateTableAsSelect
 
arr() - Method in class org.apache.spark.rdd.PartitionGroup
 
array(DataType) - Method in class org.apache.spark.sql.ColumnName
Creates a new AttributeReference of type array
ARRAY() - Static method in class org.apache.spark.sql.hive.HiveQl
 
ARRAY_CONTAINS_NULL_BAG_SCHEMA_NAME() - Static method in class org.apache.spark.sql.parquet.CatalystConverter
 
ARRAY_ELEMENTS_SCHEMA_NAME() - Static method in class org.apache.spark.sql.parquet.CatalystConverter
 
arrayBuffer() - Method in class org.apache.spark.streaming.receiver.ArrayBufferBlock
 
ArrayBufferBlock - Class in org.apache.spark.streaming.receiver
Class representing a block received as an ArrayBuffer
ArrayBufferBlock(ArrayBuffer<?>) - Constructor for class org.apache.spark.streaming.receiver.ArrayBufferBlock
 
ArrayValues - Class in org.apache.spark.storage
 
ArrayValues(Object[]) - Constructor for class org.apache.spark.storage.ArrayValues
 
as(String) - Method in class org.apache.spark.sql.Column
Gives the column an alias.
as(Symbol) - Method in class org.apache.spark.sql.Column
Gives the column an alias.
as(String) - Method in class org.apache.spark.sql.DataFrame
Returns a new DataFrame with an alias set.
as(Symbol) - Method in class org.apache.spark.sql.DataFrame
(Scala-specific) Returns a new DataFrame with an alias set.
asc() - Method in class org.apache.spark.sql.Column
Returns an ordering used in sorting.
asc(String) - Static method in class org.apache.spark.sql.functions
Returns a sort expression based on ascending order of the column.
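A minimal sketch of Column.as and functions.asc, assuming `df` is an existing DataFrame with hypothetical "name" and "age" columns:

```scala
import org.apache.spark.sql.functions._

// Alias a column, then sort ascending by another column.
df.select(col("name").as("username")).show()
df.sort(asc("age")).show()
```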
asIterator() - Method in class org.apache.spark.serializer.DeserializationStream
Read the elements of this stream through an iterator.
AskPermissionToCommitOutput - Class in org.apache.spark.scheduler
 
AskPermissionToCommitOutput(int, long, long) - Constructor for class org.apache.spark.scheduler.AskPermissionToCommitOutput
 
askSlaves() - Method in class org.apache.spark.storage.BlockManagerMessages.GetBlockStatus
 
askSlaves() - Method in class org.apache.spark.storage.BlockManagerMessages.GetMatchingBlockIds
 
askTimeout(SparkConf) - Static method in class org.apache.spark.util.AkkaUtils
Returns the default Spark timeout to use for Akka ask operations.
askWithReply(Object, ActorRef, FiniteDuration) - Static method in class org.apache.spark.util.AkkaUtils
Send a message to the given actor and get its result within a default timeout, or throw a SparkException if this fails.
askWithReply(Object, ActorRef, int, int, FiniteDuration) - Static method in class org.apache.spark.util.AkkaUtils
Send a message to the given actor and get its result within a default timeout, or throw a SparkException if this fails even after the specified number of retries.
asNullable() - Method in class org.apache.spark.mllib.linalg.VectorUDT
 
asNullable() - Method in class org.apache.spark.sql.test.ExamplePointUDT
 
asRDDId() - Method in class org.apache.spark.storage.BlockId
 
assertValid() - Method in class org.apache.spark.mllib.tree.configuration.BoostingStrategy
Check validity of parameters.
assertValid() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
Check validity of parameters.
assertValid() - Method in class org.apache.spark.rdd.BlockRDD
Check if this BlockRDD is valid.
assignments() - Method in class org.apache.spark.mllib.clustering.PowerIterationClusteringModel
 
AsynchronousListenerBus<L,E> - Class in org.apache.spark.util
Asynchronously passes events to registered listeners.
AsynchronousListenerBus(String) - Constructor for class org.apache.spark.util.AsynchronousListenerBus
 
AsyncRDDActions<T> - Class in org.apache.spark.rdd
A set of asynchronous RDD actions available through an implicit conversion.
AsyncRDDActions(RDD<T>, ClassTag<T>) - Constructor for class org.apache.spark.rdd.AsyncRDDActions
 
ata() - Method in class org.apache.spark.ml.recommendation.ALS.NormalEquation
A^T^ * A
atb() - Method in class org.apache.spark.ml.recommendation.ALS.NormalEquation
A^T^ * b
attachExecutor(ReceiverSupervisor) - Method in class org.apache.spark.streaming.receiver.Receiver
Attach a network receiver executor to this receiver.
attachHandler(ServletContextHandler) - Method in class org.apache.spark.ui.WebUI
Attach a handler to this UI.
attachListener(CleanerListener) - Method in class org.apache.spark.ContextCleaner
Attach a listener object to get information of when objects are cleaned.
attachPage(WebUIPage) - Method in class org.apache.spark.ui.WebUI
Attach a page to this UI.
attachPage(WebUIPage) - Method in class org.apache.spark.ui.WebUITab
Attach a page to this tab.
attachTab(WebUITab) - Method in class org.apache.spark.ui.WebUI
Attach a tab to this UI, along with all of its attached pages.
attempt() - Method in class org.apache.spark.scheduler.TaskInfo
 
attempt() - Method in class org.apache.spark.scheduler.TaskSet
 
attemptId() - Method in class org.apache.spark.scheduler.Stage
 
attemptId() - Method in class org.apache.spark.scheduler.StageInfo
 
attemptID() - Method in class org.apache.spark.TaskCommitDenied
 
attemptId() - Method in class org.apache.spark.TaskContext
 
attemptId() - Method in class org.apache.spark.TaskContextImpl
 
attemptNumber() - Method in class org.apache.spark.scheduler.cluster.mesos.MesosTaskLaunchData
 
attemptNumber() - Method in class org.apache.spark.scheduler.TaskDescription
 
attemptNumber() - Method in class org.apache.spark.TaskContext
How many times this task has been attempted.
attemptNumber() - Method in class org.apache.spark.TaskContextImpl
 
attr() - Method in class org.apache.spark.graphx.Edge
 
attr() - Method in class org.apache.spark.graphx.EdgeContext
The attribute associated with the edge.
attr() - Method in class org.apache.spark.graphx.impl.AggregatingEdgeContext
 
attr() - Method in class org.apache.spark.graphx.impl.EdgeWithLocalIds
 
attribute() - Method in class org.apache.spark.sql.sources.EqualTo
 
attribute() - Method in class org.apache.spark.sql.sources.GreaterThan
 
attribute() - Method in class org.apache.spark.sql.sources.GreaterThanOrEqual
 
attribute() - Method in class org.apache.spark.sql.sources.In
 
attribute() - Method in class org.apache.spark.sql.sources.IsNotNull
 
attribute() - Method in class org.apache.spark.sql.sources.IsNull
 
attribute() - Method in class org.apache.spark.sql.sources.LessThan
 
attribute() - Method in class org.apache.spark.sql.sources.LessThanOrEqual
 
attributeMap() - Method in class org.apache.spark.sql.hive.MetastoreRelation
An attribute map that can be used to lookup original attributes based on expression id.
attributeMap() - Method in class org.apache.spark.sql.parquet.ParquetRelation
 
attributeMap() - Method in class org.apache.spark.sql.sources.LogicalRelation
Used to look up the original attribute capitalization
attributes() - Method in class org.apache.spark.sql.columnar.InMemoryColumnarTableScan
 
attributes() - Method in class org.apache.spark.sql.hive.execution.HiveTableScan
 
attributes() - Method in class org.apache.spark.sql.hive.MetastoreRelation
Non-partitionKey attributes
attributes() - Method in class org.apache.spark.sql.parquet.ParquetTableScan
 
attributes() - Method in class org.apache.spark.sql.parquet.RowWriteSupport
 
attrs() - Method in class org.apache.spark.graphx.impl.VertexAttributeBlock
 
AUTO_BROADCASTJOIN_THRESHOLD() - Static method in class org.apache.spark.sql.SQLConf
 
autoBroadcastJoinThreshold() - Method in class org.apache.spark.sql.SQLConf
Upper bound on the size (in bytes) of tables that qualify for automatic conversion to a broadcast value during the physical execution of join operations.
Average() - Static method in class org.apache.spark.mllib.tree.configuration.EnsembleCombiningStrategy
 
avg(Column) - Static method in class org.apache.spark.sql.functions
Aggregate function: returns the average of the values in a group.
avg(String) - Static method in class org.apache.spark.sql.functions
Aggregate function: returns the average of the values in a group.
avg(String...) - Method in class org.apache.spark.sql.GroupedData
Compute the mean value for each numeric column for each group.
avg(Seq<String>) - Method in class org.apache.spark.sql.GroupedData
Compute the mean value for each numeric column for each group.
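A brief sketch of both avg variants, assuming a DataFrame named df with "department" and "salary" columns:

  import org.apache.spark.sql.functions.avg
  df.groupBy("department").agg(avg("salary"))   // aggregate expression form
  df.groupBy("department").avg("salary")        // GroupedData shorthand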
AVG() - Static method in class org.apache.spark.sql.hive.HiveQl
 
awaitResult() - Method in class org.apache.spark.partial.ApproximateActionListener
Waits for up to timeout milliseconds since the listener was created and then returns a PartialResult with the result so far.
awaitResult() - Method in class org.apache.spark.scheduler.JobWaiter
 
awaitTermination() - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
Wait for the execution to stop.
awaitTermination(long) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
Wait for the execution to stop.
awaitTermination() - Method in class org.apache.spark.streaming.receiver.ReceiverSupervisor
Block the calling thread until the supervisor is stopped.
awaitTermination() - Method in class org.apache.spark.streaming.StreamingContext
Wait for the execution to stop.
awaitTermination(long) - Method in class org.apache.spark.streaming.StreamingContext
Wait for the execution to stop.
awaitTermination() - Method in class org.apache.spark.util.logging.FileAppender
Wait for the appender to stop appending, either because the input stream is closed or because of an error while appending.
awaitTerminationOrTimeout(long) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
Wait for the execution to stop.
awaitTerminationOrTimeout(long) - Method in class org.apache.spark.streaming.StreamingContext
Wait for the execution to stop.
axpy(double, Vector, Vector) - Static method in class org.apache.spark.mllib.linalg.BLAS
y += a * x

B

backend() - Method in class org.apache.spark.scheduler.TaskSchedulerImpl
 
BaggedPoint<Datum> - Class in org.apache.spark.mllib.tree.impl
Internal representation of a datapoint which belongs to several subsamples of the same dataset, particularly for bagging (e.g., for random forests).
BaggedPoint(Datum, double[]) - Constructor for class org.apache.spark.mllib.tree.impl.BaggedPoint
 
base() - Method in class org.apache.spark.sql.hive.HiveUdafFunction
 
baseDir() - Method in class org.apache.spark.HttpFileServer
 
baseMap() - Method in class org.apache.spark.sql.sources.CaseInsensitiveMap
 
baseOn(ParamPair<?>...) - Method in class org.apache.spark.ml.tuning.ParamGridBuilder
Sets the given parameters in this grid to fixed values.
baseOn(ParamMap) - Method in class org.apache.spark.ml.tuning.ParamGridBuilder
Sets the given parameters in this grid to fixed values.
baseOn(Seq<ParamPair<?>>) - Method in class org.apache.spark.ml.tuning.ParamGridBuilder
Sets the given parameters in this grid to fixed values.
basePath() - Method in class org.apache.spark.ui.SparkUI
 
basePath() - Method in class org.apache.spark.ui.WebUITab
 
BaseRelation - Class in org.apache.spark.sql.sources
::DeveloperApi:: Represents a collection of tuples with a known schema.
BaseRelation() - Constructor for class org.apache.spark.sql.sources.BaseRelation
 
baseRelationToDataFrame(BaseRelation) - Method in class org.apache.spark.sql.SQLContext
Convert a BaseRelation created for external data sources into a DataFrame.
BasicColumnAccessor<T extends org.apache.spark.sql.types.DataType,JvmType> - Class in org.apache.spark.sql.columnar
 
BasicColumnAccessor(ByteBuffer, ColumnType<T, JvmType>) - Constructor for class org.apache.spark.sql.columnar.BasicColumnAccessor
 
BasicColumnBuilder<T extends org.apache.spark.sql.types.DataType,JvmType> - Class in org.apache.spark.sql.columnar
 
BasicColumnBuilder(ColumnStats, ColumnType<T, JvmType>) - Constructor for class org.apache.spark.sql.columnar.BasicColumnBuilder
 
basicSparkPage(Function0<Seq<Node>>, String) - Static method in class org.apache.spark.ui.UIUtils
Returns a page with the Spark CSS/JS and a simple format.
BatchAllocationEvent - Class in org.apache.spark.streaming.scheduler
 
BatchAllocationEvent(Time, AllocatedBlocks) - Constructor for class org.apache.spark.streaming.scheduler.BatchAllocationEvent
 
BatchCleanupEvent - Class in org.apache.spark.streaming.scheduler
 
BatchCleanupEvent(Seq<Time>) - Constructor for class org.apache.spark.streaming.scheduler.BatchCleanupEvent
 
batchDuration() - Method in class org.apache.spark.streaming.DStreamGraph
 
batchDuration() - Method in class org.apache.spark.streaming.ui.StreamingJobProgressListener
 
BATCHES() - Static method in class org.apache.spark.mllib.clustering.StreamingKMeans
 
batchForTime() - Method in class org.apache.spark.streaming.kafka.DirectKafkaInputDStream.DirectKafkaInputDStreamCheckpointData
 
BatchInfo - Class in org.apache.spark.streaming.scheduler
:: DeveloperApi :: Class having information on completed batches.
BatchInfo(Time, Map<Object, ReceivedBlockInfo[]>, long, Option<Object>, Option<Object>) - Constructor for class org.apache.spark.streaming.scheduler.BatchInfo
 
batchInfo() - Method in class org.apache.spark.streaming.scheduler.StreamingListenerBatchCompleted
 
batchInfo() - Method in class org.apache.spark.streaming.scheduler.StreamingListenerBatchStarted
 
batchInfo() - Method in class org.apache.spark.streaming.scheduler.StreamingListenerBatchSubmitted
 
batchInfos() - Method in class org.apache.spark.streaming.scheduler.StatsReportListener
 
batchSize() - Method in class org.apache.spark.sql.columnar.InMemoryRelation
 
batchTime() - Method in class org.apache.spark.streaming.scheduler.BatchInfo
 
batchTimeToSelectedFiles() - Method in class org.apache.spark.streaming.dstream.FileInputDStream
 
BeginEvent - Class in org.apache.spark.scheduler
 
BeginEvent(Task<?>, TaskInfo) - Constructor for class org.apache.spark.scheduler.BeginEvent
 
beginTime() - Method in class org.apache.spark.streaming.Interval
 
benchmark(int) - Static method in class org.apache.spark.util.random.XORShiftRandom
 
BernoulliCellSampler<T> - Class in org.apache.spark.util.random
:: DeveloperApi :: A sampler based on Bernoulli trials for partitioning a data sequence.
BernoulliCellSampler(double, double, boolean) - Constructor for class org.apache.spark.util.random.BernoulliCellSampler
 
BernoulliSampler<T> - Class in org.apache.spark.util.random
:: DeveloperApi :: A sampler based on Bernoulli trials.
BernoulliSampler(double, ClassTag<T>) - Constructor for class org.apache.spark.util.random.BernoulliSampler
 
bestModel() - Method in class org.apache.spark.ml.tuning.CrossValidatorModel
 
beta() - Method in class org.apache.spark.mllib.evaluation.binary.FMeasure
 
BETWEEN() - Static method in class org.apache.spark.sql.hive.HiveQl
 
Bin - Class in org.apache.spark.mllib.tree.model
Used for "binning" the feature values for faster best split calculation.
Bin(Split, Split, Enumeration.Value, double) - Constructor for class org.apache.spark.mllib.tree.model.Bin
 
BINARY - Class in org.apache.spark.sql.columnar
 
BINARY() - Constructor for class org.apache.spark.sql.columnar.BINARY
 
binary() - Method in class org.apache.spark.sql.ColumnName
Creates a new AttributeReference of type binary
BinaryClassificationEvaluator - Class in org.apache.spark.ml.evaluation
:: AlphaComponent ::
BinaryClassificationEvaluator() - Constructor for class org.apache.spark.ml.evaluation.BinaryClassificationEvaluator
 
BinaryClassificationMetricComputer - Interface in org.apache.spark.mllib.evaluation.binary
Trait for a binary classification evaluation metric computer.
BinaryClassificationMetrics - Class in org.apache.spark.mllib.evaluation
:: Experimental :: Evaluator for binary classification.
BinaryClassificationMetrics(RDD<Tuple2<Object, Object>>, int) - Constructor for class org.apache.spark.mllib.evaluation.BinaryClassificationMetrics
 
BinaryClassificationMetrics(RDD<Tuple2<Object, Object>>) - Constructor for class org.apache.spark.mllib.evaluation.BinaryClassificationMetrics
Defaults numBins to 0.
BinaryColumnAccessor - Class in org.apache.spark.sql.columnar
 
BinaryColumnAccessor(ByteBuffer) - Constructor for class org.apache.spark.sql.columnar.BinaryColumnAccessor
 
BinaryColumnBuilder - Class in org.apache.spark.sql.columnar
 
BinaryColumnBuilder() - Constructor for class org.apache.spark.sql.columnar.BinaryColumnBuilder
 
BinaryColumnStats - Class in org.apache.spark.sql.columnar
 
BinaryColumnStats() - Constructor for class org.apache.spark.sql.columnar.BinaryColumnStats
 
BinaryConfusionMatrix - Interface in org.apache.spark.mllib.evaluation.binary
Trait for a binary confusion matrix.
BinaryConfusionMatrixImpl - Class in org.apache.spark.mllib.evaluation.binary
Implementation of BinaryConfusionMatrix.
BinaryConfusionMatrixImpl(BinaryLabelCounter, BinaryLabelCounter) - Constructor for class org.apache.spark.mllib.evaluation.binary.BinaryConfusionMatrixImpl
 
BinaryConversion() - Method in class org.apache.spark.sql.jdbc.JDBCRDD
Accessor for nested Scala object
BinaryFileRDD<T> - Class in org.apache.spark.rdd
 
BinaryFileRDD(SparkContext, Class<? extends StreamFileInputFormat<T>>, Class<String>, Class<T>, Configuration, int) - Constructor for class org.apache.spark.rdd.BinaryFileRDD
 
binaryFiles(String, int) - Method in class org.apache.spark.api.java.JavaSparkContext
Read a directory of binary files from HDFS, a local file system (available on all nodes), or any Hadoop-supported file system URI as a byte array.
binaryFiles(String) - Method in class org.apache.spark.api.java.JavaSparkContext
:: Experimental ::
binaryFiles(String, int) - Method in class org.apache.spark.SparkContext
:: Experimental ::
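A hedged sketch of reading binary files (sc is an assumed SparkContext and the path is assumed to exist):

  // Each element is (path, PortableDataStream); the stream content is read lazily per file.
  val files = sc.binaryFiles("hdfs:///data/images", minPartitions = 8)
  val sizes = files.map { case (path, stream) => (path, stream.toArray().length) }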
BinaryLabelCounter - Class in org.apache.spark.mllib.evaluation.binary
A counter for positives and negatives.
BinaryLabelCounter(long, long) - Constructor for class org.apache.spark.mllib.evaluation.binary.BinaryLabelCounter
 
binaryLabelValidator() - Static method in class org.apache.spark.mllib.util.DataValidators
Function to check if labels used for classification are either zero or one.
BinaryLongConversion() - Method in class org.apache.spark.sql.jdbc.JDBCRDD
Accessor for nested Scala object
binaryRecords(String, int) - Method in class org.apache.spark.api.java.JavaSparkContext
:: Experimental ::
binaryRecords(String, int, Configuration) - Method in class org.apache.spark.SparkContext
:: Experimental ::
binaryRecordsStream(String, int) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
:: Experimental ::
binaryRecordsStream(String, int) - Method in class org.apache.spark.streaming.StreamingContext
:: Experimental ::
bind() - Method in class org.apache.spark.ui.WebUI
Bind to the HTTP server behind this web interface.
binnedFeatures() - Method in class org.apache.spark.mllib.tree.impl.TreePoint
 
BinomialBounds - Class in org.apache.spark.util.random
Utility functions that help us determine bounds on adjusted sampling rate to guarantee exact sample size with high confidence when sampling without replacement.
BinomialBounds() - Constructor for class org.apache.spark.util.random.BinomialBounds
 
BITS_PER_LONG() - Static method in class org.apache.spark.sql.columnar.compression.BooleanBitSet
 
BLAS - Class in org.apache.spark.mllib.linalg
BLAS routines for MLlib's vectors and matrices.
BLAS() - Constructor for class org.apache.spark.mllib.linalg.BLAS
 
BLOCK_MANAGER() - Static method in class org.apache.spark.util.MetadataCleanerType
 
BlockAdditionEvent - Class in org.apache.spark.streaming.scheduler
 
BlockAdditionEvent(ReceivedBlockInfo) - Constructor for class org.apache.spark.streaming.scheduler.BlockAdditionEvent
 
BlockException - Exception in org.apache.spark.storage
 
BlockException(BlockId, String) - Constructor for exception org.apache.spark.storage.BlockException
 
BlockGenerator - Class in org.apache.spark.streaming.receiver
Generates batches of objects received by a Receiver and puts them into appropriately named blocks at regular intervals.
BlockGenerator(BlockGeneratorListener, int, SparkConf) - Constructor for class org.apache.spark.streaming.receiver.BlockGenerator
 
BlockGeneratorListener - Interface in org.apache.spark.streaming.receiver
Listener object for BlockGenerator events
blockId(int) - Method in class org.apache.spark.ml.recommendation.ALS.LocalIndexEncoder
Gets the block id from an encoded index.
blockId() - Method in class org.apache.spark.rdd.BlockRDDPartition
 
blockId() - Method in class org.apache.spark.scheduler.IndirectTaskResult
 
blockId() - Method in exception org.apache.spark.storage.BlockException
 
BlockId - Class in org.apache.spark.storage
:: DeveloperApi :: Identifies a particular Block of data, usually associated with a single file.
BlockId() - Constructor for class org.apache.spark.storage.BlockId
 
blockId() - Method in class org.apache.spark.storage.BlockManagerMessages.GetBlockStatus
 
blockId() - Method in class org.apache.spark.storage.BlockManagerMessages.GetLocations
 
blockId() - Method in class org.apache.spark.storage.BlockManagerMessages.RemoveBlock
 
blockId() - Method in class org.apache.spark.storage.BlockManagerMessages.UpdateBlockInfo
 
blockId() - Method in class org.apache.spark.storage.BlockObjectWriter
 
blockId() - Method in class org.apache.spark.storage.ShuffleBlockFetcherIterator.FailureFetchResult
 
blockId() - Method in interface org.apache.spark.storage.ShuffleBlockFetcherIterator.FetchResult
 
blockId() - Method in class org.apache.spark.storage.ShuffleBlockFetcherIterator.SuccessFetchResult
 
blockId() - Method in class org.apache.spark.streaming.rdd.WriteAheadLogBackedBlockRDDPartition
 
blockId() - Method in class org.apache.spark.streaming.receiver.BlockManagerBasedStoreResult
 
blockId() - Method in interface org.apache.spark.streaming.receiver.ReceivedBlockStoreResult
 
blockId() - Method in class org.apache.spark.streaming.receiver.WriteAheadLogBasedStoreResult
 
blockIds() - Method in class org.apache.spark.rdd.BlockRDD
 
blockIds() - Method in class org.apache.spark.storage.BlockManagerMessages.GetLocationsMultipleBlockIds
 
blockIdsToBlockManagers(BlockId[], SparkEnv, BlockManagerMaster) - Static method in class org.apache.spark.storage.BlockManager
 
blockIdsToExecutorIds(BlockId[], SparkEnv, BlockManagerMaster) - Static method in class org.apache.spark.storage.BlockManager
 
blockIdsToHosts(BlockId[], SparkEnv, BlockManagerMaster) - Static method in class org.apache.spark.storage.BlockManager
 
blockifyObject(T, int, Serializer, Option<CompressionCodec>, ClassTag<T>) - Static method in class org.apache.spark.broadcast.TorrentBroadcast
 
BlockInfo - Class in org.apache.spark.storage
 
BlockInfo(StorageLevel, boolean) - Constructor for class org.apache.spark.storage.BlockInfo
 
blockManager() - Method in class org.apache.spark.SparkEnv
 
BlockManager - Class in org.apache.spark.storage
Manager running on every node (driver and executors) which provides interfaces for putting and retrieving blocks both locally and remotely into various stores (memory, disk, and off-heap).
BlockManager(String, ActorSystem, BlockManagerMaster, Serializer, long, SparkConf, MapOutputTracker, ShuffleManager, BlockTransferService, SecurityManager, int) - Constructor for class org.apache.spark.storage.BlockManager
 
BlockManager(String, ActorSystem, BlockManagerMaster, Serializer, SparkConf, MapOutputTracker, ShuffleManager, BlockTransferService, SecurityManager, int) - Constructor for class org.apache.spark.storage.BlockManager
Construct a BlockManager with a memory limit set based on system properties.
blockManager() - Method in class org.apache.spark.storage.BlockManagerSource
 
blockManager() - Method in class org.apache.spark.storage.BlockStore
 
blockManagerAddedFromJson(JsonAST.JValue) - Static method in class org.apache.spark.util.JsonProtocol
 
blockManagerAddedToJson(SparkListenerBlockManagerAdded) - Static method in class org.apache.spark.util.JsonProtocol
 
BlockManagerBasedBlockHandler - Class in org.apache.spark.streaming.receiver
Implementation of a ReceivedBlockHandler which stores the received blocks into a block manager with the specified storage level.
BlockManagerBasedBlockHandler(BlockManager, StorageLevel) - Constructor for class org.apache.spark.streaming.receiver.BlockManagerBasedBlockHandler
 
BlockManagerBasedStoreResult - Class in org.apache.spark.streaming.receiver
Implementation of ReceivedBlockStoreResult that stores the metadata related to storage of blocks using BlockManagerBasedBlockHandler
BlockManagerBasedStoreResult(StreamBlockId) - Constructor for class org.apache.spark.streaming.receiver.BlockManagerBasedStoreResult
 
blockManagerId() - Method in class org.apache.spark.Heartbeat
 
blockManagerId() - Method in class org.apache.spark.scheduler.SparkListenerBlockManagerAdded
 
blockManagerId() - Method in class org.apache.spark.scheduler.SparkListenerBlockManagerRemoved
 
blockManagerId() - Method in class org.apache.spark.storage.BlockManager
 
BlockManagerId - Class in org.apache.spark.storage
:: DeveloperApi :: This class represents a unique identifier for a BlockManager.
blockManagerId() - Method in class org.apache.spark.storage.BlockManagerInfo
 
blockManagerId() - Method in class org.apache.spark.storage.BlockManagerMessages.BlockManagerHeartbeat
 
blockManagerId() - Method in class org.apache.spark.storage.BlockManagerMessages.GetPeers
 
blockManagerId() - Method in class org.apache.spark.storage.BlockManagerMessages.RegisterBlockManager
 
blockManagerId() - Method in class org.apache.spark.storage.BlockManagerMessages.UpdateBlockInfo
 
blockManagerId() - Method in class org.apache.spark.storage.StorageStatus
 
blockManagerIdCache() - Static method in class org.apache.spark.storage.BlockManagerId
 
blockManagerIdFromJson(JsonAST.JValue) - Static method in class org.apache.spark.util.JsonProtocol
 
blockManagerIds() - Method in class org.apache.spark.ui.jobs.JobProgressListener
 
blockManagerIdToJson(BlockManagerId) - Static method in class org.apache.spark.util.JsonProtocol
 
BlockManagerInfo - Class in org.apache.spark.storage
 
BlockManagerInfo(BlockManagerId, long, long, ActorRef) - Constructor for class org.apache.spark.storage.BlockManagerInfo
 
BlockManagerMaster - Class in org.apache.spark.storage
 
BlockManagerMaster(ActorRef, SparkConf, boolean) - Constructor for class org.apache.spark.storage.BlockManagerMaster
 
BlockManagerMasterActor - Class in org.apache.spark.storage
BlockManagerMasterActor is an actor on the master node to track statuses of all slaves' block managers.
BlockManagerMasterActor(boolean, SparkConf, LiveListenerBus) - Constructor for class org.apache.spark.storage.BlockManagerMasterActor
 
BlockManagerMessages - Class in org.apache.spark.storage
 
BlockManagerMessages() - Constructor for class org.apache.spark.storage.BlockManagerMessages
 
BlockManagerMessages.BlockManagerHeartbeat - Class in org.apache.spark.storage
 
BlockManagerMessages.BlockManagerHeartbeat(BlockManagerId) - Constructor for class org.apache.spark.storage.BlockManagerMessages.BlockManagerHeartbeat
 
BlockManagerMessages.BlockManagerHeartbeat$ - Class in org.apache.spark.storage
 
BlockManagerMessages.BlockManagerHeartbeat$() - Constructor for class org.apache.spark.storage.BlockManagerMessages.BlockManagerHeartbeat$
 
BlockManagerMessages.ExpireDeadHosts$ - Class in org.apache.spark.storage
 
BlockManagerMessages.ExpireDeadHosts$() - Constructor for class org.apache.spark.storage.BlockManagerMessages.ExpireDeadHosts$
 
BlockManagerMessages.GetActorSystemHostPortForExecutor - Class in org.apache.spark.storage
 
BlockManagerMessages.GetActorSystemHostPortForExecutor(String) - Constructor for class org.apache.spark.storage.BlockManagerMessages.GetActorSystemHostPortForExecutor
 
BlockManagerMessages.GetActorSystemHostPortForExecutor$ - Class in org.apache.spark.storage
 
BlockManagerMessages.GetActorSystemHostPortForExecutor$() - Constructor for class org.apache.spark.storage.BlockManagerMessages.GetActorSystemHostPortForExecutor$
 
BlockManagerMessages.GetBlockStatus - Class in org.apache.spark.storage
 
BlockManagerMessages.GetBlockStatus(BlockId, boolean) - Constructor for class org.apache.spark.storage.BlockManagerMessages.GetBlockStatus
 
BlockManagerMessages.GetBlockStatus$ - Class in org.apache.spark.storage
 
BlockManagerMessages.GetBlockStatus$() - Constructor for class org.apache.spark.storage.BlockManagerMessages.GetBlockStatus$
 
BlockManagerMessages.GetLocations - Class in org.apache.spark.storage
 
BlockManagerMessages.GetLocations(BlockId) - Constructor for class org.apache.spark.storage.BlockManagerMessages.GetLocations
 
BlockManagerMessages.GetLocations$ - Class in org.apache.spark.storage
 
BlockManagerMessages.GetLocations$() - Constructor for class org.apache.spark.storage.BlockManagerMessages.GetLocations$
 
BlockManagerMessages.GetLocationsMultipleBlockIds - Class in org.apache.spark.storage
 
BlockManagerMessages.GetLocationsMultipleBlockIds(BlockId[]) - Constructor for class org.apache.spark.storage.BlockManagerMessages.GetLocationsMultipleBlockIds
 
BlockManagerMessages.GetLocationsMultipleBlockIds$ - Class in org.apache.spark.storage
 
BlockManagerMessages.GetLocationsMultipleBlockIds$() - Constructor for class org.apache.spark.storage.BlockManagerMessages.GetLocationsMultipleBlockIds$
 
BlockManagerMessages.GetMatchingBlockIds - Class in org.apache.spark.storage
 
BlockManagerMessages.GetMatchingBlockIds(Function1<BlockId, Object>, boolean) - Constructor for class org.apache.spark.storage.BlockManagerMessages.GetMatchingBlockIds
 
BlockManagerMessages.GetMatchingBlockIds$ - Class in org.apache.spark.storage
 
BlockManagerMessages.GetMatchingBlockIds$() - Constructor for class org.apache.spark.storage.BlockManagerMessages.GetMatchingBlockIds$
 
BlockManagerMessages.GetMemoryStatus$ - Class in org.apache.spark.storage
 
BlockManagerMessages.GetMemoryStatus$() - Constructor for class org.apache.spark.storage.BlockManagerMessages.GetMemoryStatus$
 
BlockManagerMessages.GetPeers - Class in org.apache.spark.storage
 
BlockManagerMessages.GetPeers(BlockManagerId) - Constructor for class org.apache.spark.storage.BlockManagerMessages.GetPeers
 
BlockManagerMessages.GetPeers$ - Class in org.apache.spark.storage
 
BlockManagerMessages.GetPeers$() - Constructor for class org.apache.spark.storage.BlockManagerMessages.GetPeers$
 
BlockManagerMessages.GetStorageStatus$ - Class in org.apache.spark.storage
 
BlockManagerMessages.GetStorageStatus$() - Constructor for class org.apache.spark.storage.BlockManagerMessages.GetStorageStatus$
 
BlockManagerMessages.RegisterBlockManager - Class in org.apache.spark.storage
 
BlockManagerMessages.RegisterBlockManager(BlockManagerId, long, ActorRef) - Constructor for class org.apache.spark.storage.BlockManagerMessages.RegisterBlockManager
 
BlockManagerMessages.RegisterBlockManager$ - Class in org.apache.spark.storage
 
BlockManagerMessages.RegisterBlockManager$() - Constructor for class org.apache.spark.storage.BlockManagerMessages.RegisterBlockManager$
 
BlockManagerMessages.RemoveBlock - Class in org.apache.spark.storage
 
BlockManagerMessages.RemoveBlock(BlockId) - Constructor for class org.apache.spark.storage.BlockManagerMessages.RemoveBlock
 
BlockManagerMessages.RemoveBlock$ - Class in org.apache.spark.storage
 
BlockManagerMessages.RemoveBlock$() - Constructor for class org.apache.spark.storage.BlockManagerMessages.RemoveBlock$
 
BlockManagerMessages.RemoveBroadcast - Class in org.apache.spark.storage
 
BlockManagerMessages.RemoveBroadcast(long, boolean) - Constructor for class org.apache.spark.storage.BlockManagerMessages.RemoveBroadcast
 
BlockManagerMessages.RemoveBroadcast$ - Class in org.apache.spark.storage
 
BlockManagerMessages.RemoveBroadcast$() - Constructor for class org.apache.spark.storage.BlockManagerMessages.RemoveBroadcast$
 
BlockManagerMessages.RemoveExecutor - Class in org.apache.spark.storage
 
BlockManagerMessages.RemoveExecutor(String) - Constructor for class org.apache.spark.storage.BlockManagerMessages.RemoveExecutor
 
BlockManagerMessages.RemoveExecutor$ - Class in org.apache.spark.storage
 
BlockManagerMessages.RemoveExecutor$() - Constructor for class org.apache.spark.storage.BlockManagerMessages.RemoveExecutor$
 
BlockManagerMessages.RemoveRdd - Class in org.apache.spark.storage
 
BlockManagerMessages.RemoveRdd(int) - Constructor for class org.apache.spark.storage.BlockManagerMessages.RemoveRdd
 
BlockManagerMessages.RemoveRdd$ - Class in org.apache.spark.storage
 
BlockManagerMessages.RemoveRdd$() - Constructor for class org.apache.spark.storage.BlockManagerMessages.RemoveRdd$
 
BlockManagerMessages.RemoveShuffle - Class in org.apache.spark.storage
 
BlockManagerMessages.RemoveShuffle(int) - Constructor for class org.apache.spark.storage.BlockManagerMessages.RemoveShuffle
 
BlockManagerMessages.RemoveShuffle$ - Class in org.apache.spark.storage
 
BlockManagerMessages.RemoveShuffle$() - Constructor for class org.apache.spark.storage.BlockManagerMessages.RemoveShuffle$
 
BlockManagerMessages.StopBlockManagerMaster$ - Class in org.apache.spark.storage
 
BlockManagerMessages.StopBlockManagerMaster$() - Constructor for class org.apache.spark.storage.BlockManagerMessages.StopBlockManagerMaster$
 
BlockManagerMessages.ToBlockManagerMaster - Interface in org.apache.spark.storage
 
BlockManagerMessages.ToBlockManagerSlave - Interface in org.apache.spark.storage
 
BlockManagerMessages.UpdateBlockInfo - Class in org.apache.spark.storage
 
BlockManagerMessages.UpdateBlockInfo(BlockManagerId, BlockId, StorageLevel, long, long, long) - Constructor for class org.apache.spark.storage.BlockManagerMessages.UpdateBlockInfo
 
BlockManagerMessages.UpdateBlockInfo() - Constructor for class org.apache.spark.storage.BlockManagerMessages.UpdateBlockInfo
 
BlockManagerMessages.UpdateBlockInfo$ - Class in org.apache.spark.storage
 
BlockManagerMessages.UpdateBlockInfo$() - Constructor for class org.apache.spark.storage.BlockManagerMessages.UpdateBlockInfo$
 
blockManagerRemovedFromJson(JsonAST.JValue) - Static method in class org.apache.spark.util.JsonProtocol
 
blockManagerRemovedToJson(SparkListenerBlockManagerRemoved) - Static method in class org.apache.spark.util.JsonProtocol
 
BlockManagerSlaveActor - Class in org.apache.spark.storage
An actor that takes commands from the master to execute operations.
BlockManagerSlaveActor(BlockManager, MapOutputTracker) - Constructor for class org.apache.spark.storage.BlockManagerSlaveActor
 
BlockManagerSource - Class in org.apache.spark.storage
 
BlockManagerSource(BlockManager) - Constructor for class org.apache.spark.storage.BlockManagerSource
 
BlockMatrix - Class in org.apache.spark.mllib.linalg.distributed
:: Experimental ::
BlockMatrix(RDD<Tuple2<Tuple2<Object, Object>, Matrix>>, int, int, long, long) - Constructor for class org.apache.spark.mllib.linalg.distributed.BlockMatrix
 
BlockMatrix(RDD<Tuple2<Tuple2<Object, Object>, Matrix>>, int, int) - Constructor for class org.apache.spark.mllib.linalg.distributed.BlockMatrix
Alternate constructor for BlockMatrix that does not require the number of rows and columns as input.
BlockNotFoundException - Exception in org.apache.spark.storage
 
BlockNotFoundException(String) - Constructor for exception org.apache.spark.storage.BlockNotFoundException
 
BlockObjectWriter - Class in org.apache.spark.storage
An interface for writing JVM objects to some underlying storage.
BlockObjectWriter(BlockId) - Constructor for class org.apache.spark.storage.BlockObjectWriter
 
blockPushingThread() - Method in class org.apache.spark.streaming.dstream.RawNetworkReceiver
 
BlockRDD<T> - Class in org.apache.spark.rdd
 
BlockRDD(SparkContext, BlockId[], ClassTag<T>) - Constructor for class org.apache.spark.rdd.BlockRDD
 
BlockRDDPartition - Class in org.apache.spark.rdd
 
BlockRDDPartition(BlockId, int) - Constructor for class org.apache.spark.rdd.BlockRDDPartition
 
BlockResult - Class in org.apache.spark.storage
 
BlockResult(Iterator<Object>, Enumeration.Value, long) - Constructor for class org.apache.spark.storage.BlockResult
 
blocks() - Method in class org.apache.spark.mllib.linalg.distributed.BlockMatrix
 
blocks() - Method in class org.apache.spark.storage.BlockManagerInfo
 
blocks() - Method in class org.apache.spark.storage.ShuffleBlockFetcherIterator.FetchRequest
 
blocks() - Method in class org.apache.spark.storage.StorageStatus
Return the blocks stored in this block manager.
BlockStatus - Class in org.apache.spark.storage
 
BlockStatus(StorageLevel, long, long, long) - Constructor for class org.apache.spark.storage.BlockStatus
 
blockStatusFromJson(JsonAST.JValue) - Static method in class org.apache.spark.util.JsonProtocol
 
blockStatusToJson(BlockStatus) - Static method in class org.apache.spark.util.JsonProtocol
 
BlockStore - Class in org.apache.spark.storage
Abstract class to store blocks.
BlockStore(BlockManager) - Constructor for class org.apache.spark.storage.BlockStore
 
blockStoreResult() - Method in class org.apache.spark.streaming.scheduler.ReceivedBlockInfo
 
blockTransferService() - Method in class org.apache.spark.SparkEnv
 
BlockValues - Interface in org.apache.spark.storage
 
bmAddress() - Method in class org.apache.spark.FetchFailed
 
BOOLEAN - Class in org.apache.spark.sql.columnar
 
BOOLEAN() - Constructor for class org.apache.spark.sql.columnar.BOOLEAN
 
BooleanBitSet - Class in org.apache.spark.sql.columnar.compression
 
BooleanBitSet() - Constructor for class org.apache.spark.sql.columnar.compression.BooleanBitSet
 
BooleanBitSet.Decoder - Class in org.apache.spark.sql.columnar.compression
 
BooleanBitSet.Decoder(ByteBuffer) - Constructor for class org.apache.spark.sql.columnar.compression.BooleanBitSet.Decoder
 
BooleanBitSet.Encoder - Class in org.apache.spark.sql.columnar.compression
 
BooleanBitSet.Encoder() - Constructor for class org.apache.spark.sql.columnar.compression.BooleanBitSet.Encoder
 
BooleanColumnAccessor - Class in org.apache.spark.sql.columnar
 
BooleanColumnAccessor(ByteBuffer) - Constructor for class org.apache.spark.sql.columnar.BooleanColumnAccessor
 
BooleanColumnBuilder - Class in org.apache.spark.sql.columnar
 
BooleanColumnBuilder() - Constructor for class org.apache.spark.sql.columnar.BooleanColumnBuilder
 
BooleanColumnStats - Class in org.apache.spark.sql.columnar
 
BooleanColumnStats() - Constructor for class org.apache.spark.sql.columnar.BooleanColumnStats
 
BooleanConversion() - Method in class org.apache.spark.sql.jdbc.JDBCRDD
Accessor for nested Scala object
BooleanParam - Class in org.apache.spark.ml.param
Specialized version of Param[Boolean] for Java.
BooleanParam(Params, String, String, Option<Object>) - Constructor for class org.apache.spark.ml.param.BooleanParam
 
BooleanParam(Params, String, String) - Constructor for class org.apache.spark.ml.param.BooleanParam
 
booleanWritableConverter() - Static method in class org.apache.spark.SparkContext
 
booleanWritableConverter() - Static method in class org.apache.spark.WritableConverter
 
booleanWritableFactory() - Static method in class org.apache.spark.WritableFactory
 
boolToBoolWritable(boolean) - Static method in class org.apache.spark.SparkContext
 
BoostingStrategy - Class in org.apache.spark.mllib.tree.configuration
:: Experimental :: Configuration options for GradientBoostedTrees.
BoostingStrategy(Strategy, Loss, int, double) - Constructor for class org.apache.spark.mllib.tree.configuration.BoostingStrategy
 
Both() - Static method in class org.apache.spark.graphx.EdgeDirection
Edges originating from *and* arriving at a vertex of interest.
boundaries() - Method in class org.apache.spark.mllib.regression.IsotonicRegressionModel
 
BoundedDouble - Class in org.apache.spark.partial
:: Experimental :: A Double value with error bars and associated confidence.
BoundedDouble(double, double, double, double) - Constructor for class org.apache.spark.partial.BoundedDouble
 
BoundedPriorityQueue<A> - Class in org.apache.spark.util
Bounded priority queue.
BoundedPriorityQueue(int, Ordering<A>) - Constructor for class org.apache.spark.util.BoundedPriorityQueue
 
boundPort() - Method in class org.apache.spark.ui.ServerInfo
 
boundPort() - Method in class org.apache.spark.ui.WebUI
Return the actual port to which this server is bound.
broadcast(T) - Method in class org.apache.spark.api.java.JavaSparkContext
Broadcast a read-only variable to the cluster, returning a Broadcast object for reading it in distributed functions.
Broadcast<T> - Class in org.apache.spark.broadcast
A broadcast variable.
Broadcast(long, ClassTag<T>) - Constructor for class org.apache.spark.broadcast.Broadcast
 
broadcast(T, ClassTag<T>) - Method in class org.apache.spark.SparkContext
Broadcast a read-only variable to the cluster, returning a Broadcast object for reading it in distributed functions.
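A minimal sketch (sc is an assumed SparkContext and rdd an assumed RDD[String]):

  val lookup = sc.broadcast(Map("a" -> 1, "b" -> 2))     // shipped once per executor
  rdd.map(word => lookup.value.getOrElse(word, 0))       // read-only access on workers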
BROADCAST() - Static method in class org.apache.spark.storage.BlockId
 
BROADCAST_TIMEOUT() - Static method in class org.apache.spark.sql.SQLConf
 
BROADCAST_VARS() - Static method in class org.apache.spark.util.MetadataCleanerType
 
BroadcastBlockId - Class in org.apache.spark.storage
 
BroadcastBlockId(long, String) - Constructor for class org.apache.spark.storage.BroadcastBlockId
 
broadcastCleaned(long) - Method in interface org.apache.spark.CleanerListener
 
broadcastedConf() - Method in class org.apache.spark.rdd.CheckpointRDD
 
BroadcastFactory - Interface in org.apache.spark.broadcast
:: DeveloperApi :: An interface for all the broadcast implementations in Spark (to allow multiple broadcast implementations).
broadcastId() - Method in class org.apache.spark.CleanBroadcast
 
broadcastId() - Method in class org.apache.spark.storage.BlockManagerMessages.RemoveBroadcast
 
broadcastId() - Method in class org.apache.spark.storage.BroadcastBlockId
 
BroadcastManager - Class in org.apache.spark.broadcast
 
BroadcastManager(boolean, SparkConf, SecurityManager) - Constructor for class org.apache.spark.broadcast.BroadcastManager
 
broadcastManager() - Method in class org.apache.spark.SparkEnv
 
broadcastTimeout() - Method in class org.apache.spark.sql.SQLConf
Timeout in seconds for the broadcast wait time in hash join
broadcastVars() - Method in class org.apache.spark.sql.UserDefinedPythonFunction
 
Broker - Class in org.apache.spark.streaming.kafka
:: Experimental :: Represents the host and port info for a Kafka broker.
buf() - Method in class org.apache.spark.storage.ShuffleBlockFetcherIterator.SuccessFetchResult
 
buffer() - Method in class org.apache.spark.storage.ArrayValues
 
buffer() - Method in class org.apache.spark.storage.ByteBufferValues
 
buffer() - Method in class org.apache.spark.util.SerializableBuffer
 
buffers() - Method in class org.apache.spark.sql.columnar.CachedBatch
 
build() - Method in class org.apache.spark.ml.recommendation.ALS.RatingBlockBuilder
Builds an ALS.RatingBlock.
build() - Method in class org.apache.spark.ml.recommendation.ALS.UncompressedInBlockBuilder
build() - Method in class org.apache.spark.ml.tuning.ParamGridBuilder
Builds and returns all combinations of parameters specified by the param grid.
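A hedged sketch of building a parameter grid; the addGrid call and the LogisticRegression params (maxIter, regParam) are assumptions not covered by the entries above:

  import org.apache.spark.ml.classification.LogisticRegression
  import org.apache.spark.ml.tuning.ParamGridBuilder
  val lr = new LogisticRegression()
  val grid = new ParamGridBuilder()
    .baseOn(lr.maxIter -> 10)                  // fixed in every combination
    .addGrid(lr.regParam, Array(0.01, 0.1))    // varied across combinations
    .build()                                   // Array[ParamMap] of size 2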
build(Node[]) - Method in class org.apache.spark.mllib.tree.model.Node
Build the left and right child nodes if this node is not a leaf.
build() - Method in class org.apache.spark.sql.columnar.BasicColumnBuilder
 
build() - Method in interface org.apache.spark.sql.columnar.ColumnBuilder
Returns the final columnar byte buffer.
build() - Method in interface org.apache.spark.sql.columnar.compression.CompressibleColumnBuilder
 
build() - Method in interface org.apache.spark.sql.columnar.NullableColumnBuilder
 
buildFilter() - Method in class org.apache.spark.sql.columnar.InMemoryColumnarTableScan
 
buildMetadata(RDD<LabeledPoint>, Strategy, int, String) - Static method in class org.apache.spark.mllib.tree.impl.DecisionTreeMetadata
Construct a DecisionTreeMetadata instance for this dataset and parameters.
buildMetadata(RDD<LabeledPoint>, Strategy) - Static method in class org.apache.spark.mllib.tree.impl.DecisionTreeMetadata
buildNonNulls() - Method in interface org.apache.spark.sql.columnar.NullableColumnBuilder
 
buildPools() - Method in class org.apache.spark.scheduler.FairSchedulableBuilder
 
buildPools() - Method in class org.apache.spark.scheduler.FIFOSchedulableBuilder
 
buildPools() - Method in interface org.apache.spark.scheduler.SchedulableBuilder
 
buildRegistryName(Source) - Method in class org.apache.spark.metrics.MetricsSystem
Build a name that uniquely identifies each metric source.
buildScan(String[], Filter[]) - Method in class org.apache.spark.sql.jdbc.JDBCRelation
 
buildScan() - Method in class org.apache.spark.sql.json.JSONRelation
 
buildScan(Seq<Attribute>, Seq<Expression>) - Method in class org.apache.spark.sql.parquet.ParquetRelation2
 
buildScan(Seq<Attribute>, Seq<Expression>) - Method in interface org.apache.spark.sql.sources.CatalystScan
 
buildScan(String[], Filter[]) - Method in interface org.apache.spark.sql.sources.PrunedFilteredScan
 
buildScan(String[]) - Method in interface org.apache.spark.sql.sources.PrunedScan
 
buildScan() - Method in interface org.apache.spark.sql.sources.TableScan
 
BYTE - Class in org.apache.spark.sql.columnar
 
BYTE() - Constructor for class org.apache.spark.sql.columnar.BYTE
 
ByteArrayChunkOutputStream - Class in org.apache.spark.util.io
An OutputStream that writes to fixed-size chunks of byte arrays.
ByteArrayChunkOutputStream(int) - Constructor for class org.apache.spark.util.io.ByteArrayChunkOutputStream
 
ByteArrayColumnType<T extends org.apache.spark.sql.types.DataType> - Class in org.apache.spark.sql.columnar
 
ByteArrayColumnType(int, int) - Constructor for class org.apache.spark.sql.columnar.ByteArrayColumnType
 
byteBuffer() - Method in class org.apache.spark.streaming.receiver.ByteBufferBlock
 
ByteBufferBlock - Class in org.apache.spark.streaming.receiver
Class representing a block received as a ByteBuffer.
ByteBufferBlock(ByteBuffer) - Constructor for class org.apache.spark.streaming.receiver.ByteBufferBlock
 
ByteBufferData - Class in org.apache.spark.streaming.receiver
 
ByteBufferData(ByteBuffer) - Constructor for class org.apache.spark.streaming.receiver.ByteBufferData
 
ByteBufferInputStream - Class in org.apache.spark.util
Reads data from a ByteBuffer, and optionally cleans it up using BlockManager.dispose() at the end of the stream (e.g.
ByteBufferInputStream(ByteBuffer, boolean) - Constructor for class org.apache.spark.util.ByteBufferInputStream
 
ByteBufferValues - Class in org.apache.spark.storage
 
ByteBufferValues(ByteBuffer) - Constructor for class org.apache.spark.storage.ByteBufferValues
 
BytecodeUtils - Class in org.apache.spark.graphx.util
Includes a utility function to test whether a function accesses a specific attribute of an object.
BytecodeUtils() - Constructor for class org.apache.spark.graphx.util.BytecodeUtils
 
ByteColumnAccessor - Class in org.apache.spark.sql.columnar
 
ByteColumnAccessor(ByteBuffer) - Constructor for class org.apache.spark.sql.columnar.ByteColumnAccessor
 
ByteColumnBuilder - Class in org.apache.spark.sql.columnar
 
ByteColumnBuilder() - Constructor for class org.apache.spark.sql.columnar.ByteColumnBuilder
 
ByteColumnStats - Class in org.apache.spark.sql.columnar
 
ByteColumnStats() - Constructor for class org.apache.spark.sql.columnar.ByteColumnStats
 
bytes() - Method in class org.apache.spark.streaming.receiver.ByteBufferData
 
BYTES_FOR_PRECISION() - Static method in class org.apache.spark.sql.parquet.ParquetTypesConverter
Compute the FIXED_LEN_BYTE_ARRAY length needed to represent a given DECIMAL precision.
bytesToBytesWritable(byte[]) - Static method in class org.apache.spark.SparkContext
 
bytesToLines(InputStream) - Static method in class org.apache.spark.streaming.dstream.SocketReceiver
This method translates the data from an input stream (say, from a socket) into '\n'-delimited strings and returns an iterator over those strings.
bytesToString(long) - Static method in class org.apache.spark.util.Utils
Convert a quantity in bytes to a human-readable string such as "4.0 MB".
bytesWritableConverter() - Static method in class org.apache.spark.SparkContext
 
bytesWritableConverter() - Static method in class org.apache.spark.WritableConverter
 
bytesWritableFactory() - Static method in class org.apache.spark.WritableFactory
 
bytesWritten(long) - Method in interface org.apache.spark.util.logging.RollingPolicy
Notify that bytes have been written
bytesWritten(long) - Method in class org.apache.spark.util.logging.SizeBasedRollingPolicy
Increment the bytes that have been written in the current file
bytesWritten(long) - Method in class org.apache.spark.util.logging.TimeBasedRollingPolicy
 

C

cache() - Method in class org.apache.spark.api.java.JavaDoubleRDD
Persist this RDD with the default storage level (`MEMORY_ONLY`).
cache() - Method in class org.apache.spark.api.java.JavaPairRDD
Persist this RDD with the default storage level (`MEMORY_ONLY`).
cache() - Method in class org.apache.spark.api.java.JavaRDD
Persist this RDD with the default storage level (`MEMORY_ONLY`).
cache() - Method in class org.apache.spark.graphx.Graph
Caches the vertices and edges associated with this graph at the previously-specified target storage levels, which default to MEMORY_ONLY.
cache() - Method in class org.apache.spark.graphx.impl.EdgeRDDImpl
Persists the edge partitions using `targetStorageLevel`, which defaults to MEMORY_ONLY.
cache() - Method in class org.apache.spark.graphx.impl.GraphImpl
 
cache() - Method in class org.apache.spark.graphx.impl.VertexRDDImpl
Persists the vertex partitions at `targetStorageLevel`, which defaults to MEMORY_ONLY.
cache() - Method in class org.apache.spark.mllib.linalg.distributed.BlockMatrix
Caches the underlying RDD.
cache() - Method in class org.apache.spark.partial.StudentTCacher
 
cache() - Method in class org.apache.spark.rdd.RDD
Persist this RDD with the default storage level (`MEMORY_ONLY`).
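A minimal caching sketch (sc is an assumed SparkContext and the input path is assumed to exist):

  val words = sc.textFile("hdfs:///logs").flatMap(_.split(" "))
  words.cache()                          // mark for MEMORY_ONLY persistence
  words.count()                          // first action materializes and caches the partitions
  words.filter(_ == "ERROR").count()     // subsequent actions reuse the cached blocks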
cache() - Method in class org.apache.spark.sql.DataFrame
 
cache() - Method in interface org.apache.spark.sql.RDDApi
 
cache() - Method in class org.apache.spark.streaming.api.java.JavaDStream
Persist RDDs of this DStream with the default storage level (MEMORY_ONLY_SER)
cache() - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
Persist RDDs of this DStream with the default storage level (MEMORY_ONLY_SER)
cache() - Method in class org.apache.spark.streaming.dstream.DStream
Persist RDDs of this DStream with the default storage level (MEMORY_ONLY_SER)
CachedBatch - Class in org.apache.spark.sql.columnar
 
CachedBatch(byte[][], Row) - Constructor for class org.apache.spark.sql.columnar.CachedBatch
 
cachedColumnBuffers() - Method in class org.apache.spark.sql.columnar.InMemoryRelation
 
CachedData - Class in org.apache.spark.sql
Holds a cached logical plan and its data
CachedData(LogicalPlan, InMemoryRelation) - Constructor for class org.apache.spark.sql.CachedData
 
cachedRepresentation() - Method in class org.apache.spark.sql.CachedData
 
CacheManager - Class in org.apache.spark
Spark class responsible for passing RDD partition contents to the BlockManager and making sure a node doesn't load two copies of an RDD at once.
CacheManager(BlockManager) - Constructor for class org.apache.spark.CacheManager
 
cacheManager() - Method in class org.apache.spark.SparkEnv
 
CacheManager - Class in org.apache.spark.sql
Provides support in a SQLContext for caching query results and automatically using these cached results when subsequent queries are executed.
CacheManager(SQLContext) - Constructor for class org.apache.spark.sql.CacheManager
 
cacheQuery(DataFrame, Option<String>, StorageLevel) - Method in class org.apache.spark.sql.CacheManager
Caches the data produced by the logical plan of the given DataFrame.
cacheTable(String) - Method in class org.apache.spark.sql.CacheManager
Caches the specified table in-memory.
cacheTable(String) - Method in class org.apache.spark.sql.SQLContext
Caches the specified table in-memory.
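A hedged sketch (sqlContext is an assumed SQLContext and df an assumed DataFrame; registerTempTable is an assumption not listed in the entries above):

  df.registerTempTable("people")
  sqlContext.cacheTable("people")                        // marks the table for in-memory columnar caching
  sqlContext.sql("SELECT COUNT(*) FROM people").show()   // subsequent scans read the cached columns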
calculate(double[], double) - Static method in class org.apache.spark.mllib.tree.impurity.Entropy
:: DeveloperApi :: information calculation for multiclass classification
calculate(double, double, double) - Static method in class org.apache.spark.mllib.tree.impurity.Entropy
:: DeveloperApi :: variance calculation
calculate() - Method in class org.apache.spark.mllib.tree.impurity.EntropyCalculator
Calculate the impurity from the stored sufficient statistics.
calculate(double[], double) - Static method in class org.apache.spark.mllib.tree.impurity.Gini
:: DeveloperApi :: information calculation for multiclass classification
calculate(double, double, double) - Static method in class org.apache.spark.mllib.tree.impurity.Gini
:: DeveloperApi :: variance calculation
calculate() - Method in class org.apache.spark.mllib.tree.impurity.GiniCalculator
Calculate the impurity from the stored sufficient statistics.
calculate(double[], double) - Method in interface org.apache.spark.mllib.tree.impurity.Impurity
:: DeveloperApi :: information calculation for multiclass classification
calculate(double, double, double) - Method in interface org.apache.spark.mllib.tree.impurity.Impurity
:: DeveloperApi :: information calculation for regression
calculate() - Method in class org.apache.spark.mllib.tree.impurity.ImpurityCalculator
Calculate the impurity from the stored sufficient statistics.
calculate(double[], double) - Static method in class org.apache.spark.mllib.tree.impurity.Variance
:: DeveloperApi :: information calculation for multiclass classification
calculate(double, double, double) - Static method in class org.apache.spark.mllib.tree.impurity.Variance
:: DeveloperApi :: variance calculation
calculate() - Method in class org.apache.spark.mllib.tree.impurity.VarianceCalculator
Calculate the impurity from the stored sufficient statistics.
calculatedTasks() - Method in class org.apache.spark.scheduler.TaskSetManager
 
calculateNumBatchesToRemember(Duration) - Static method in class org.apache.spark.streaming.dstream.FileInputDStream
Calculate the number of recent batches to remember, such that all files selected within at least the last MIN_REMEMBER_DURATION can be remembered.
calculateTotalMemory(SparkContext) - Static method in class org.apache.spark.scheduler.cluster.mesos.MemoryUtils
 
call(T) - Method in interface org.apache.spark.api.java.function.DoubleFlatMapFunction
 
call(T) - Method in interface org.apache.spark.api.java.function.DoubleFunction
 
call(T) - Method in interface org.apache.spark.api.java.function.FlatMapFunction
 
call(T1, T2) - Method in interface org.apache.spark.api.java.function.FlatMapFunction2
 
call(T1) - Method in interface org.apache.spark.api.java.function.Function
 
call(T1, T2) - Method in interface org.apache.spark.api.java.function.Function2
 
call(T1, T2, T3) - Method in interface org.apache.spark.api.java.function.Function3
 
call(T) - Method in interface org.apache.spark.api.java.function.PairFlatMapFunction
 
call(T) - Method in interface org.apache.spark.api.java.function.PairFunction
 
call(T) - Method in interface org.apache.spark.api.java.function.VoidFunction
 
call(T1) - Method in interface org.apache.spark.sql.api.java.UDF1
 
call(T1, T2, T3, T4, T5, T6, T7, T8, T9, T10) - Method in interface org.apache.spark.sql.api.java.UDF10
 
call(T1, T2, T3, T4, T5, T6, T7, T8, T9, T10, T11) - Method in interface org.apache.spark.sql.api.java.UDF11
 
call(T1, T2, T3, T4, T5, T6, T7, T8, T9, T10, T11, T12) - Method in interface org.apache.spark.sql.api.java.UDF12
 
call(T1, T2, T3, T4, T5, T6, T7, T8, T9, T10, T11, T12, T13) - Method in interface org.apache.spark.sql.api.java.UDF13
 
call(T1, T2, T3, T4, T5, T6, T7, T8, T9, T10, T11, T12, T13, T14) - Method in interface org.apache.spark.sql.api.java.UDF14
 
call(T1, T2, T3, T4, T5, T6, T7, T8, T9, T10, T11, T12, T13, T14, T15) - Method in interface org.apache.spark.sql.api.java.UDF15
 
call(T1, T2, T3, T4, T5, T6, T7, T8, T9, T10, T11, T12, T13, T14, T15, T16) - Method in interface org.apache.spark.sql.api.java.UDF16
 
call(T1, T2, T3, T4, T5, T6, T7, T8, T9, T10, T11, T12, T13, T14, T15, T16, T17) - Method in interface org.apache.spark.sql.api.java.UDF17
 
call(T1, T2, T3, T4, T5, T6, T7, T8, T9, T10, T11, T12, T13, T14, T15, T16, T17, T18) - Method in interface org.apache.spark.sql.api.java.UDF18
 
call(T1, T2, T3, T4, T5, T6, T7, T8, T9, T10, T11, T12, T13, T14, T15, T16, T17, T18, T19) - Method in interface org.apache.spark.sql.api.java.UDF19
 
call(T1, T2) - Method in interface org.apache.spark.sql.api.java.UDF2
 
call(T1, T2, T3, T4, T5, T6, T7, T8, T9, T10, T11, T12, T13, T14, T15, T16, T17, T18, T19, T20) - Method in interface org.apache.spark.sql.api.java.UDF20
 
call(T1, T2, T3, T4, T5, T6, T7, T8, T9, T10, T11, T12, T13, T14, T15, T16, T17, T18, T19, T20, T21) - Method in interface org.apache.spark.sql.api.java.UDF21
 
call(T1, T2, T3, T4, T5, T6, T7, T8, T9, T10, T11, T12, T13, T14, T15, T16, T17, T18, T19, T20, T21, T22) - Method in interface org.apache.spark.sql.api.java.UDF22
 
call(T1, T2, T3) - Method in interface org.apache.spark.sql.api.java.UDF3
 
call(T1, T2, T3, T4) - Method in interface org.apache.spark.sql.api.java.UDF4
 
call(T1, T2, T3, T4, T5) - Method in interface org.apache.spark.sql.api.java.UDF5
 
call(T1, T2, T3, T4, T5, T6) - Method in interface org.apache.spark.sql.api.java.UDF6
 
call(T1, T2, T3, T4, T5, T6, T7) - Method in interface org.apache.spark.sql.api.java.UDF7
 
call(T1, T2, T3, T4, T5, T6, T7, T8) - Method in interface org.apache.spark.sql.api.java.UDF8
 
call(T1, T2, T3, T4, T5, T6, T7, T8, T9) - Method in interface org.apache.spark.sql.api.java.UDF9
 
callSite() - Method in class org.apache.spark.scheduler.ActiveJob
 
callSite() - Method in class org.apache.spark.scheduler.JobSubmitted
 
callSite() - Method in class org.apache.spark.scheduler.Stage
 
CallSite - Class in org.apache.spark.util
CallSite represents a place in user code.
CallSite(String, String) - Constructor for class org.apache.spark.util.CallSite
 
callUDF(Function0<?>, DataType) - Static method in class org.apache.spark.sql.functions
Call a Scala function of 0 arguments as user-defined function (UDF).
callUDF(Function1<?, ?>, DataType, Column) - Static method in class org.apache.spark.sql.functions
Call a Scala function of 1 arguments as user-defined function (UDF).
callUDF(Function2<?, ?, ?>, DataType, Column, Column) - Static method in class org.apache.spark.sql.functions
Call a Scala function of 2 arguments as user-defined function (UDF).
callUDF(Function3<?, ?, ?, ?>, DataType, Column, Column, Column) - Static method in class org.apache.spark.sql.functions
Call a Scala function of 3 arguments as user-defined function (UDF).
callUDF(Function4<?, ?, ?, ?, ?>, DataType, Column, Column, Column, Column) - Static method in class org.apache.spark.sql.functions
Call a Scala function of 4 arguments as user-defined function (UDF).
callUDF(Function5<?, ?, ?, ?, ?, ?>, DataType, Column, Column, Column, Column, Column) - Static method in class org.apache.spark.sql.functions
Call a Scala function of 5 arguments as user-defined function (UDF).
callUDF(Function6<?, ?, ?, ?, ?, ?, ?>, DataType, Column, Column, Column, Column, Column, Column) - Static method in class org.apache.spark.sql.functions
Call a Scala function of 6 arguments as user-defined function (UDF).
callUDF(Function7<?, ?, ?, ?, ?, ?, ?, ?>, DataType, Column, Column, Column, Column, Column, Column, Column) - Static method in class org.apache.spark.sql.functions
Call a Scala function of 7 arguments as user-defined function (UDF).
callUDF(Function8<?, ?, ?, ?, ?, ?, ?, ?, ?>, DataType, Column, Column, Column, Column, Column, Column, Column, Column) - Static method in class org.apache.spark.sql.functions
Call a Scala function of 8 arguments as user-defined function (UDF).
callUDF(Function9<?, ?, ?, ?, ?, ?, ?, ?, ?, ?>, DataType, Column, Column, Column, Column, Column, Column, Column, Column, Column) - Static method in class org.apache.spark.sql.functions
Call a Scala function of 9 arguments as user-defined function (UDF).
callUDF(Function10<?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?>, DataType, Column, Column, Column, Column, Column, Column, Column, Column, Column, Column) - Static method in class org.apache.spark.sql.functions
Call a Scala function of 10 arguments as user-defined function (UDF).
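
For illustration, a minimal sketch of calling a Scala function as a UDF via callUDF, assuming a hypothetical DataFrame df with an integer column "value" (the DataFrame and column names are illustrative, not from this index):

    import org.apache.spark.sql.functions.callUDF
    import org.apache.spark.sql.types.IntegerType

    // Wrap an ordinary Scala function of 1 argument as a UDF and apply it to a column.
    val doubled = df.select(callUDF((x: Int) => x * 2, IntegerType, df("value")))
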
cancel() - Method in class org.apache.spark.ComplexFutureAction
 
cancel() - Method in interface org.apache.spark.FutureAction
Cancels the execution of this action.
cancel(boolean) - Method in class org.apache.spark.JavaFutureActionWrapper
 
cancel() - Method in class org.apache.spark.scheduler.JobWaiter
Sends a signal to the DAGScheduler to cancel the job.
cancel() - Method in class org.apache.spark.SimpleFutureAction
 
cancel() - Method in class org.apache.spark.util.MetadataCleaner
 
cancelAllJobs() - Method in class org.apache.spark.api.java.JavaSparkContext
Cancel all jobs that have been scheduled or are running.
cancelAllJobs() - Method in class org.apache.spark.scheduler.DAGScheduler
Cancel all jobs that are running or waiting in the queue.
cancelAllJobs() - Method in class org.apache.spark.SparkContext
Cancel all jobs that have been scheduled or are running.
cancelJob(int) - Method in class org.apache.spark.scheduler.DAGScheduler
Cancel a job that is running or waiting in the queue.
cancelJob(int) - Method in class org.apache.spark.SparkContext
Cancel a given job if it's scheduled or running
cancelJobGroup(String) - Method in class org.apache.spark.api.java.JavaSparkContext
Cancel active jobs for the specified group.
cancelJobGroup(String) - Method in class org.apache.spark.scheduler.DAGScheduler
 
cancelJobGroup(String) - Method in class org.apache.spark.SparkContext
Cancel active jobs for the specified group.
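
As a sketch of how job groups and cancellation fit together, assuming sc is an existing SparkContext and the group id is arbitrary:

    // Tag all jobs submitted from this thread with a group id.
    sc.setJobGroup("nightly-etl", "nightly ETL run", interruptOnCancel = true)
    // ... actions submitted here belong to the group "nightly-etl" ...

    // From any thread, cancel every active job in that group.
    sc.cancelJobGroup("nightly-etl")
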
cancelStage(int) - Method in class org.apache.spark.scheduler.DAGScheduler
Cancel all jobs associated with a running or scheduled stage.
cancelStage(int) - Method in class org.apache.spark.SparkContext
Cancel a given stage and all jobs associated with it
cancelTasks(int, boolean) - Method in interface org.apache.spark.scheduler.TaskScheduler
 
cancelTasks(int, boolean) - Method in class org.apache.spark.scheduler.TaskSchedulerImpl
 
canCommit(int, long, long) - Method in class org.apache.spark.scheduler.OutputCommitCoordinator
Called by tasks to ask whether they can commit their output to HDFS.
canEqual(Object) - Method in class org.apache.spark.scheduler.cluster.ExecutorInfo
 
canEqual(Object) - Method in class org.apache.spark.util.MutablePair
 
canFetchMoreResults(long) - Method in class org.apache.spark.scheduler.TaskSetManager
Check whether there is enough quota to fetch a result of the given size in bytes.
capacity() - Method in class org.apache.spark.graphx.impl.VertexPartitionBase
 
cartesian(JavaRDDLike<U, ?>) - Method in interface org.apache.spark.api.java.JavaRDDLike
Return the Cartesian product of this RDD and another one, that is, the RDD of all pairs of elements (a, b) where a is in this and b is in other.
cartesian(RDD<U>, ClassTag<U>) - Method in class org.apache.spark.rdd.RDD
Return the Cartesian product of this RDD and another one, that is, the RDD of all pairs of elements (a, b) where a is in this and b is in other.
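
A small cartesian example, assuming sc is an existing SparkContext:

    val colors = sc.parallelize(Seq("red", "blue"))
    val sizes  = sc.parallelize(Seq(1, 2, 3))
    // All (color, size) pairs: ("red",1), ("red",2), ("red",3), ("blue",1), ...
    val combos = colors.cartesian(sizes)
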
CartesianPartition - Class in org.apache.spark.rdd
 
CartesianPartition(int, RDD<?>, RDD<?>, int, int) - Constructor for class org.apache.spark.rdd.CartesianPartition
 
CartesianRDD<T,U> - Class in org.apache.spark.rdd
 
CartesianRDD(SparkContext, RDD<T>, RDD<U>, ClassTag<T>, ClassTag<U>) - Constructor for class org.apache.spark.rdd.CartesianRDD
 
CASE() - Static method in class org.apache.spark.sql.hive.HiveQl
 
CaseInsensitiveMap - Class in org.apache.spark.sql.sources
Builds a map in which keys are case insensitive
CaseInsensitiveMap(Map<String, String>) - Constructor for class org.apache.spark.sql.sources.CaseInsensitiveMap
 
caseSensitive() - Method in class org.apache.spark.sql.hive.HiveMetastoreCatalog
 
cast(DataType) - Method in class org.apache.spark.sql.Column
Casts the column to a different data type.
cast(String) - Method in class org.apache.spark.sql.Column
Casts the column to a different data type, using the canonical string representation of the type.
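
Both cast overloads in a short sketch, assuming a hypothetical DataFrame df with a string column "age":

    import org.apache.spark.sql.types.DoubleType

    val asInt    = df.select(df("age").cast("int"))       // canonical string form of the type
    val asDouble = df.select(df("age").cast(DoubleType))  // explicit DataType
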
castAndRenameChildOutput(InsertIntoTable, Seq<Attribute>, LogicalPlan) - Static method in class org.apache.spark.sql.sources.PreInsertCastAndRename
If necessary, cast data types and rename fields to the expected types and names.
castChildOutput(InsertIntoTable, MetastoreRelation, LogicalPlan) - Method in class org.apache.spark.sql.hive.HiveMetastoreCatalog.PreInsertionCasts
 
catalog() - Method in class org.apache.spark.sql.sources.PreWriteCheck
 
CatalystArrayContainsNullConverter - Class in org.apache.spark.sql.parquet
A parquet.io.api.GroupConverter that converts single-element groups that match the characteristics of an array containing nulls (see ParquetTypesConverter) into an ArrayType.
CatalystArrayContainsNullConverter(DataType, int, CatalystConverter, Buffer<Object>) - Constructor for class org.apache.spark.sql.parquet.CatalystArrayContainsNullConverter
 
CatalystArrayContainsNullConverter(DataType, int, CatalystConverter) - Constructor for class org.apache.spark.sql.parquet.CatalystArrayContainsNullConverter
 
CatalystArrayConverter - Class in org.apache.spark.sql.parquet
A parquet.io.api.GroupConverter that converts single-element groups that match the characteristics of an array (see ParquetTypesConverter) into an ArrayType.
CatalystArrayConverter(DataType, int, CatalystConverter, Buffer<Object>) - Constructor for class org.apache.spark.sql.parquet.CatalystArrayConverter
 
CatalystArrayConverter(DataType, int, CatalystConverter) - Constructor for class org.apache.spark.sql.parquet.CatalystArrayConverter
 
CatalystConverter - Class in org.apache.spark.sql.parquet
 
CatalystConverter() - Constructor for class org.apache.spark.sql.parquet.CatalystConverter
 
CatalystGroupConverter - Class in org.apache.spark.sql.parquet
A parquet.io.api.GroupConverter that is able to convert a Parquet record to a org.apache.spark.sql.catalyst.expressions.Row object.
CatalystGroupConverter(StructField[], int, CatalystConverter, ArrayBuffer<Object>, ArrayBuffer<Row>) - Constructor for class org.apache.spark.sql.parquet.CatalystGroupConverter
 
CatalystGroupConverter(StructField[], int, CatalystConverter) - Constructor for class org.apache.spark.sql.parquet.CatalystGroupConverter
 
CatalystGroupConverter(Attribute[]) - Constructor for class org.apache.spark.sql.parquet.CatalystGroupConverter
This constructor is used for the root converter only!
CatalystMapConverter - Class in org.apache.spark.sql.parquet
A parquet.io.api.GroupConverter that converts two-element groups that match the characteristics of a map (see ParquetTypesConverter) into a MapType.
CatalystMapConverter(StructField[], int, CatalystConverter) - Constructor for class org.apache.spark.sql.parquet.CatalystMapConverter
 
CatalystNativeArrayConverter - Class in org.apache.spark.sql.parquet
A parquet.io.api.GroupConverter that converts single-element groups that match the characteristics of an array (see ParquetTypesConverter) into an ArrayType.
CatalystNativeArrayConverter(NativeType, int, CatalystConverter, int) - Constructor for class org.apache.spark.sql.parquet.CatalystNativeArrayConverter
 
CatalystPrimitiveConverter - Class in org.apache.spark.sql.parquet
A parquet.io.api.PrimitiveConverter that converts Parquet types to Catalyst types.
CatalystPrimitiveConverter(CatalystConverter, int) - Constructor for class org.apache.spark.sql.parquet.CatalystPrimitiveConverter
 
CatalystPrimitiveRowConverter - Class in org.apache.spark.sql.parquet
A parquet.io.api.GroupConverter that is able to convert a Parquet record to a org.apache.spark.sql.catalyst.expressions.Row object.
CatalystPrimitiveRowConverter(StructField[], MutableRow) - Constructor for class org.apache.spark.sql.parquet.CatalystPrimitiveRowConverter
 
CatalystPrimitiveRowConverter(Attribute[]) - Constructor for class org.apache.spark.sql.parquet.CatalystPrimitiveRowConverter
 
CatalystPrimitiveStringConverter - Class in org.apache.spark.sql.parquet
A parquet.io.api.PrimitiveConverter that converts Parquet Binary to Catalyst String.
CatalystPrimitiveStringConverter(CatalystConverter, int) - Constructor for class org.apache.spark.sql.parquet.CatalystPrimitiveStringConverter
 
CatalystScan - Interface in org.apache.spark.sql.sources
::Experimental:: An interface for experimenting with a more direct connection to the query planner.
CatalystStructConverter - Class in org.apache.spark.sql.parquet
This converter is for multi-element groups of primitive or complex types that have repetition level optional or required (so struct fields).
CatalystStructConverter(StructField[], int, CatalystConverter) - Constructor for class org.apache.spark.sql.parquet.CatalystStructConverter
 
CatalystTimestampConverter - Class in org.apache.spark.sql.parquet
 
CatalystTimestampConverter() - Constructor for class org.apache.spark.sql.parquet.CatalystTimestampConverter
 
Categorical() - Static method in class org.apache.spark.mllib.tree.configuration.FeatureType
 
categoricalFeaturesInfo() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
 
categories() - Method in class org.apache.spark.mllib.tree.model.DecisionTreeModel.SaveLoadV1_0$.SplitData
 
categories() - Method in class org.apache.spark.mllib.tree.model.Split
 
category() - Method in class org.apache.spark.mllib.tree.model.Bin
 
channelFactory() - Method in class org.apache.spark.streaming.flume.FlumePollingReceiver
 
channelFactoryExecutor() - Method in class org.apache.spark.streaming.flume.FlumePollingReceiver
 
checkEquals(ASTNode) - Method in class org.apache.spark.sql.hive.HiveQl.TransformableNode
Throws an error if this is not equal to other.
checkHost(String, String) - Static method in class org.apache.spark.util.Utils
 
checkHostPort(String, String) - Static method in class org.apache.spark.util.Utils
 
checkInputColumn(StructType, String, DataType) - Method in interface org.apache.spark.ml.param.Params
Check whether the given schema contains an input column.
checkMinimalPollingPeriod(TimeUnit, int) - Static method in class org.apache.spark.metrics.MetricsSystem
 
checkModifyPermissions(String) - Method in class org.apache.spark.SecurityManager
Checks the given user against the modify acl list to see if they have authorization to modify the application.
checkOutputSpecs(JobContext) - Method in class org.apache.spark.sql.parquet.AppendingParquetOutputFormat
 
checkpoint() - Method in interface org.apache.spark.api.java.JavaRDDLike
Mark this RDD for checkpointing.
checkpoint() - Method in class org.apache.spark.graphx.Graph
Mark this Graph for checkpointing.
checkpoint() - Method in class org.apache.spark.graphx.impl.EdgeRDDImpl
 
checkpoint() - Method in class org.apache.spark.graphx.impl.GraphImpl
 
checkpoint() - Method in class org.apache.spark.graphx.impl.VertexRDDImpl
 
checkpoint() - Method in class org.apache.spark.rdd.CheckpointRDD
 
checkpoint() - Method in class org.apache.spark.rdd.HadoopRDD
 
checkpoint() - Method in class org.apache.spark.rdd.RDD
Mark this RDD for checkpointing.
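
A minimal RDD checkpointing sketch, assuming sc is an existing SparkContext and the directory is a placeholder:

    sc.setCheckpointDir("hdfs:///tmp/checkpoints")
    val cleaned = sc.parallelize(1 to 1000).map(_ * 2)
    cleaned.checkpoint()   // only marks the RDD; data is written on the next action
    cleaned.count()        // materializes and checkpoints the RDD
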
checkpoint(Duration) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
Enable periodic checkpointing of RDDs of this DStream.
checkpoint(String) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
Sets the context to periodically checkpoint the DStream operations for master fault-tolerance.
Checkpoint - Class in org.apache.spark.streaming
 
Checkpoint(StreamingContext, Time) - Constructor for class org.apache.spark.streaming.Checkpoint
 
checkpoint(Duration) - Method in class org.apache.spark.streaming.dstream.DStream
Enable periodic checkpointing of RDDs of this DStream
checkpoint(Duration) - Method in class org.apache.spark.streaming.dstream.ReducedWindowedDStream
 
checkpoint(String) - Method in class org.apache.spark.streaming.StreamingContext
Set the context to periodically checkpoint the DStream operations for driver fault-tolerance.
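
A sketch combining both streaming checkpoint calls, assuming sc is an existing SparkContext and the path and host are placeholders:

    import org.apache.spark.streaming.{Seconds, StreamingContext}

    val ssc = new StreamingContext(sc, Seconds(1))
    ssc.checkpoint("hdfs:///tmp/streaming-checkpoints")   // metadata checkpointing for driver fault-tolerance
    val lines = ssc.socketTextStream("localhost", 9999)
    lines.checkpoint(Seconds(10))                         // periodic checkpointing of this DStream's RDDs
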
checkpointBackupFile(String, Time) - Static method in class org.apache.spark.streaming.Checkpoint
Get the checkpoint backup file for the given checkpoint time
checkpointClock() - Method in class org.apache.spark.streaming.kinesis.KinesisCheckpointState
 
checkpointData() - Method in class org.apache.spark.rdd.RDD
 
checkpointData() - Method in class org.apache.spark.streaming.dstream.DStream
 
checkpointDir() - Method in class org.apache.spark.SparkContext
 
checkpointDir() - Method in class org.apache.spark.streaming.Checkpoint
 
checkpointDir() - Method in class org.apache.spark.streaming.StreamingContext
 
checkpointDirToLogDir(String, int) - Static method in class org.apache.spark.streaming.receiver.WriteAheadLogBasedBlockHandler
 
checkpointDirToLogDir(String) - Static method in class org.apache.spark.streaming.scheduler.ReceivedBlockTracker
 
checkpointDuration() - Method in class org.apache.spark.streaming.Checkpoint
 
checkpointDuration() - Method in class org.apache.spark.streaming.dstream.DStream
 
checkpointDuration() - Method in class org.apache.spark.streaming.StreamingContext
 
Checkpointed() - Static method in class org.apache.spark.rdd.CheckpointState
 
checkpointFile(String, Time) - Static method in class org.apache.spark.streaming.Checkpoint
Get the checkpoint file for the given checkpoint time
CheckpointingInProgress() - Static method in class org.apache.spark.rdd.CheckpointState
 
checkpointInProgress() - Method in class org.apache.spark.streaming.DStreamGraph
 
checkpointInterval() - Method in class org.apache.spark.mllib.impl.PeriodicGraphCheckpointer
 
checkpointInterval() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
 
checkpointInterval() - Method in class org.apache.spark.mllib.tree.impl.NodeIdCache
 
checkpointPath() - Method in class org.apache.spark.rdd.CheckpointRDD
 
CheckpointRDD<T> - Class in org.apache.spark.rdd
This RDD represents an RDD checkpoint file (similar to HadoopRDD).
CheckpointRDD(SparkContext, String, ClassTag<T>) - Constructor for class org.apache.spark.rdd.CheckpointRDD
 
checkpointRDD() - Method in class org.apache.spark.rdd.RDDCheckpointData
 
CheckpointRDDPartition - Class in org.apache.spark.rdd
 
CheckpointRDDPartition(int) - Constructor for class org.apache.spark.rdd.CheckpointRDDPartition
 
CheckpointReader - Class in org.apache.spark.streaming
 
CheckpointReader() - Constructor for class org.apache.spark.streaming.CheckpointReader
 
CheckpointState - Class in org.apache.spark.rdd
Enumeration to manage state transitions of an RDD through checkpointing [ Initialized --> marked for checkpointing --> checkpointing in progress --> checkpointed ]
CheckpointState() - Constructor for class org.apache.spark.rdd.CheckpointState
 
checkpointTime() - Method in class org.apache.spark.streaming.Checkpoint
 
CheckpointWriter - Class in org.apache.spark.streaming
Convenience class to handle writing graph checkpoints to files.
CheckpointWriter(JobGenerator, SparkConf, String, Configuration) - Constructor for class org.apache.spark.streaming.CheckpointWriter
 
CheckpointWriter.CheckpointWriteHandler - Class in org.apache.spark.streaming
 
CheckpointWriter.CheckpointWriteHandler(Time, byte[]) - Constructor for class org.apache.spark.streaming.CheckpointWriter.CheckpointWriteHandler
 
checkSpeculatableTasks() - Method in class org.apache.spark.scheduler.Pool
 
checkSpeculatableTasks() - Method in interface org.apache.spark.scheduler.Schedulable
 
checkSpeculatableTasks() - Method in class org.apache.spark.scheduler.TaskSchedulerImpl
 
checkSpeculatableTasks() - Method in class org.apache.spark.scheduler.TaskSetManager
Check for tasks to be speculated and return true if there are any.
checkState(boolean, Function0<String>) - Static method in class org.apache.spark.streaming.util.HdfsUtils
 
checkTimeoutInterval() - Method in class org.apache.spark.storage.BlockManagerMasterActor
 
checkUIViewPermissions(String) - Method in class org.apache.spark.SecurityManager
Checks the given user against the view acl list to see if they have authorization to view the UI.
child() - Method in class org.apache.spark.sql.columnar.InMemoryRelation
 
child() - Method in class org.apache.spark.sql.hive.execution.InsertIntoHiveTable
 
child() - Method in class org.apache.spark.sql.hive.execution.ScriptTransformation
 
child() - Method in class org.apache.spark.sql.hive.InsertIntoHiveTable
 
child() - Method in class org.apache.spark.sql.parquet.InsertIntoParquetTable
 
child() - Method in class org.apache.spark.sql.sources.CreateTableUsingAsSelect
 
child() - Method in class org.apache.spark.sql.sources.Not
 
ChildFirstURLClassLoader - Class in org.apache.spark.util
A mutable class loader that gives preference to its own URLs over the parent class loader when loading classes and resources.
ChildFirstURLClassLoader(URL[], ClassLoader) - Constructor for class org.apache.spark.util.ChildFirstURLClassLoader
 
children() - Method in class org.apache.spark.mllib.fpm.FPTree.Node
 
children() - Method in class org.apache.spark.sql.columnar.InMemoryRelation
 
children() - Method in class org.apache.spark.sql.hive.HiveGenericUdaf
 
children() - Method in class org.apache.spark.sql.hive.HiveGenericUdf
 
children() - Method in class org.apache.spark.sql.hive.HiveGenericUdtf
 
children() - Method in class org.apache.spark.sql.hive.HiveSimpleUdf
 
children() - Method in class org.apache.spark.sql.hive.HiveUdaf
 
children() - Method in class org.apache.spark.sql.hive.InsertIntoHiveTable
 
chiSqFunc() - Method in class org.apache.spark.mllib.stat.test.ChiSqTest.Method
 
ChiSqSelector - Class in org.apache.spark.mllib.feature
:: Experimental :: Creates a ChiSquared feature selector.
ChiSqSelector(int) - Constructor for class org.apache.spark.mllib.feature.ChiSqSelector
 
ChiSqSelectorModel - Class in org.apache.spark.mllib.feature
:: Experimental :: Chi Squared selector model.
ChiSqSelectorModel(int[]) - Constructor for class org.apache.spark.mllib.feature.ChiSqSelectorModel
 
chiSqTest(Vector, Vector) - Static method in class org.apache.spark.mllib.stat.Statistics
Conduct Pearson's chi-squared goodness of fit test of the observed data against the expected distribution.
chiSqTest(Vector) - Static method in class org.apache.spark.mllib.stat.Statistics
Conduct Pearson's chi-squared goodness of fit test of the observed data against the uniform distribution, with each category having an expected frequency of 1 / observed.size.
chiSqTest(Matrix) - Static method in class org.apache.spark.mllib.stat.Statistics
Conduct Pearson's independence test on the input contingency matrix, which cannot contain negative entries or columns or rows that sum up to 0.
chiSqTest(RDD<LabeledPoint>) - Static method in class org.apache.spark.mllib.stat.Statistics
Conduct Pearson's independence test for every feature against the label across the input RDD.
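
A small example of the goodness-of-fit overload of Statistics.chiSqTest (when no expected vector is given, the uniform distribution is assumed):

    import org.apache.spark.mllib.linalg.Vectors
    import org.apache.spark.mllib.stat.Statistics

    val observed = Vectors.dense(0.2, 0.3, 0.5)
    val result = Statistics.chiSqTest(observed)   // Pearson goodness of fit vs. uniform
    println(result.pValue)
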
ChiSqTest - Class in org.apache.spark.mllib.stat.test
Conduct the chi-squared test for the input RDDs using the specified method.
ChiSqTest() - Constructor for class org.apache.spark.mllib.stat.test.ChiSqTest
 
ChiSqTest.Method - Class in org.apache.spark.mllib.stat.test
 
ChiSqTest.Method(String, Function2<Object, Object, Object>) - Constructor for class org.apache.spark.mllib.stat.test.ChiSqTest.Method
 
ChiSqTest.Method$ - Class in org.apache.spark.mllib.stat.test
 
ChiSqTest.Method$() - Constructor for class org.apache.spark.mllib.stat.test.ChiSqTest.Method$
 
ChiSqTest.NullHypothesis$ - Class in org.apache.spark.mllib.stat.test
 
ChiSqTest.NullHypothesis$() - Constructor for class org.apache.spark.mllib.stat.test.ChiSqTest.NullHypothesis$
 
ChiSqTestResult - Class in org.apache.spark.mllib.stat.test
:: Experimental :: Object containing the test results for the chi-squared hypothesis test.
ChiSqTestResult(double, int, double, String, String) - Constructor for class org.apache.spark.mllib.stat.test.ChiSqTestResult
 
chiSquared(Vector, Vector, String) - Static method in class org.apache.spark.mllib.stat.test.ChiSqTest
 
chiSquaredFeatures(RDD<LabeledPoint>, String) - Static method in class org.apache.spark.mllib.stat.test.ChiSqTest
Conduct Pearson's independence test for each feature against the label across the input RDD.
chiSquaredMatrix(Matrix, String) - Static method in class org.apache.spark.mllib.stat.test.ChiSqTest
 
chmod700(File) - Static method in class org.apache.spark.util.Utils
JDK equivalent of chmod 700 file.
classForName(String) - Static method in class org.apache.spark.util.Utils
Preferred alternative to Class.forName(className)
Classification() - Static method in class org.apache.spark.mllib.tree.configuration.Algo
 
ClassificationModel<FeaturesType,M extends ClassificationModel<FeaturesType,M>> - Class in org.apache.spark.ml.classification
:: AlphaComponent :: Model produced by a Classifier.
ClassificationModel() - Constructor for class org.apache.spark.ml.classification.ClassificationModel
 
ClassificationModel - Interface in org.apache.spark.mllib.classification
:: Experimental :: Represents a classification model that predicts to which of a set of categories an example belongs.
Classifier<FeaturesType,E extends Classifier<FeaturesType,E,M>,M extends ClassificationModel<FeaturesType,M>> - Class in org.apache.spark.ml.classification
:: AlphaComponent :: Single-label binary or multiclass classification.
Classifier() - Constructor for class org.apache.spark.ml.classification.Classifier
 
ClassifierParams - Interface in org.apache.spark.ml.classification
:: DeveloperApi :: Params for classification.
classIsLoadable(String) - Static method in class org.apache.spark.util.Utils
Determines whether the provided class is loadable in the current thread.
classLoader() - Method in class org.apache.spark.scheduler.cluster.mesos.MesosSchedulerBackend
 
className() - Method in class org.apache.spark.ExceptionFailure
 
classpathEntries() - Method in class org.apache.spark.ui.env.EnvironmentListener
 
classTag() - Method in class org.apache.spark.api.java.JavaDoubleRDD
 
classTag() - Method in class org.apache.spark.api.java.JavaPairRDD
 
classTag() - Method in class org.apache.spark.api.java.JavaRDD
 
classTag() - Method in interface org.apache.spark.api.java.JavaRDDLike
 
classTag() - Method in class org.apache.spark.streaming.api.java.JavaDStream
 
classTag() - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
 
classTag() - Method in class org.apache.spark.streaming.api.java.JavaInputDStream
 
classTag() - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
 
classTag() - Method in class org.apache.spark.streaming.api.java.JavaReceiverInputDStream
 
clean(F, boolean) - Method in class org.apache.spark.SparkContext
Clean a closure to make it ready to be serialized and sent to tasks (removes unreferenced variables in $outer's, updates REPL variables). If checkSerializable is set, clean will also proactively check to see if f is serializable and throw a SparkException if not.
clean(Object, boolean) - Static method in class org.apache.spark.util.ClosureCleaner
 
CleanBroadcast - Class in org.apache.spark
 
CleanBroadcast(long) - Constructor for class org.apache.spark.CleanBroadcast
 
cleaner() - Method in class org.apache.spark.SparkContext
 
CleanerListener - Interface in org.apache.spark
Listener class used for testing when any item has been cleaned by the Cleaner class.
CleanRDD - Class in org.apache.spark
 
CleanRDD(int) - Constructor for class org.apache.spark.CleanRDD
 
CleanShuffle - Class in org.apache.spark
 
CleanShuffle(int) - Constructor for class org.apache.spark.CleanShuffle
 
cleanup(long) - Method in class org.apache.spark.SparkContext
Called by MetadataCleaner to clean up the persistentRdds map periodically
cleanup(Time) - Method in class org.apache.spark.streaming.dstream.DStreamCheckpointData
Clean up old checkpoint data.
cleanup(Time) - Method in class org.apache.spark.streaming.dstream.FileInputDStream.FileInputDStreamCheckpointData
 
cleanup(Time) - Method in class org.apache.spark.streaming.kafka.DirectKafkaInputDStream.DirectKafkaInputDStreamCheckpointData
 
cleanUpAfterSchedulerStop() - Method in class org.apache.spark.scheduler.DAGScheduler
 
cleanupOldBatches(Time, boolean) - Method in class org.apache.spark.streaming.scheduler.ReceivedBlockTracker
Clean up block information of old batches.
cleanupOldBlocks(long) - Method in class org.apache.spark.streaming.receiver.BlockManagerBasedBlockHandler
 
CleanupOldBlocks - Class in org.apache.spark.streaming.receiver
 
CleanupOldBlocks(Time) - Constructor for class org.apache.spark.streaming.receiver.CleanupOldBlocks
 
cleanupOldBlocks(long) - Method in interface org.apache.spark.streaming.receiver.ReceivedBlockHandler
Clean up blocks older than the given threshold time.
cleanupOldBlocks(long) - Method in class org.apache.spark.streaming.receiver.WriteAheadLogBasedBlockHandler
 
cleanupOldBlocksAndBatches(Time) - Method in class org.apache.spark.streaming.scheduler.ReceiverTracker
Clean up the data and metadata of blocks and batches that are strictly older than the threshold time.
cleanupOldLogs(long, boolean) - Method in class org.apache.spark.streaming.util.WriteAheadLogManager
Delete the log files that are older than the threshold time.
CleanupTask - Interface in org.apache.spark
Classes that represent cleaning tasks.
CleanupTaskWeakReference - Class in org.apache.spark
A WeakReference associated with a CleanupTask.
CleanupTaskWeakReference(CleanupTask, Object, ReferenceQueue<Object>) - Constructor for class org.apache.spark.CleanupTaskWeakReference
 
clear() - Static method in class org.apache.spark.Accumulators
 
clear() - Method in class org.apache.spark.sql.SQLConf
 
clear() - Method in class org.apache.spark.storage.BlockManagerInfo
 
clear() - Method in class org.apache.spark.storage.BlockStore
 
clear() - Method in class org.apache.spark.storage.MemoryStore
 
clear() - Method in class org.apache.spark.util.BoundedPriorityQueue
 
CLEAR_NULL_VALUES_INTERVAL() - Static method in class org.apache.spark.util.TimeStampedWeakValueHashMap
 
clearActiveContext() - Static method in class org.apache.spark.SparkContext
Clears the active SparkContext metadata.
clearCache() - Method in class org.apache.spark.sql.CacheManager
Clears all cached tables.
clearCache() - Method in class org.apache.spark.sql.SQLContext
Removes all cached tables from the in-memory cache.
clearCallSite() - Method in class org.apache.spark.api.java.JavaSparkContext
Pass-through to SparkContext.clearCallSite.
clearCallSite() - Method in class org.apache.spark.SparkContext
Clear the thread-local property for overriding the call sites of actions and RDDs.
clearCheckpointData(Time) - Method in class org.apache.spark.streaming.dstream.DStream
 
clearCheckpointData(Time) - Method in class org.apache.spark.streaming.DStreamGraph
 
ClearCheckpointData - Class in org.apache.spark.streaming.scheduler
 
ClearCheckpointData(Time) - Constructor for class org.apache.spark.streaming.scheduler.ClearCheckpointData
 
clearDependencies() - Method in class org.apache.spark.rdd.CartesianRDD
 
clearDependencies() - Method in class org.apache.spark.rdd.CoalescedRDD
 
clearDependencies() - Method in class org.apache.spark.rdd.CoGroupedRDD
 
clearDependencies() - Method in class org.apache.spark.rdd.PartitionerAwareUnionRDD
 
clearDependencies() - Method in class org.apache.spark.rdd.ShuffledRDD
 
clearDependencies() - Method in class org.apache.spark.rdd.SubtractedRDD
 
clearDependencies() - Method in class org.apache.spark.rdd.UnionRDD
 
clearDependencies() - Method in class org.apache.spark.rdd.ZippedPartitionsBaseRDD
 
clearDependencies() - Method in class org.apache.spark.rdd.ZippedPartitionsRDD2
 
clearDependencies() - Method in class org.apache.spark.rdd.ZippedPartitionsRDD3
 
clearDependencies() - Method in class org.apache.spark.rdd.ZippedPartitionsRDD4
 
clearFiles() - Method in class org.apache.spark.api.java.JavaSparkContext
Clear the job's list of files added by addFile so that they do not get downloaded to any new nodes.
clearFiles() - Method in class org.apache.spark.SparkContext
Clear the job's list of files added by addFile so that they do not get downloaded to any new nodes.
clearJars() - Method in class org.apache.spark.api.java.JavaSparkContext
Clear the job's list of JARs added by addJar so that they do not get downloaded to any new nodes.
clearJars() - Method in class org.apache.spark.SparkContext
Clear the job's list of JARs added by addJar so that they do not get downloaded to any new nodes.
clearJobGroup() - Method in class org.apache.spark.api.java.JavaSparkContext
Clear the current thread's job group ID and its description.
clearJobGroup() - Method in class org.apache.spark.SparkContext
Clear the current thread's job group ID and its description.
clearMetadata(Time) - Method in class org.apache.spark.streaming.dstream.DStream
Clear metadata that is older than the rememberDuration of this DStream.
clearMetadata(Time) - Method in class org.apache.spark.streaming.DStreamGraph
 
ClearMetadata - Class in org.apache.spark.streaming.scheduler
 
ClearMetadata(Time) - Constructor for class org.apache.spark.streaming.scheduler.ClearMetadata
 
clearNullValues() - Method in class org.apache.spark.util.TimeStampedWeakValueHashMap
Remove entries with values that are no longer strongly reachable.
clearOldValues(long, Function2<A, B, BoxedUnit>) - Method in class org.apache.spark.util.TimeStampedHashMap
 
clearOldValues(long) - Method in class org.apache.spark.util.TimeStampedHashMap
Removes old key-value pairs that have timestamp earlier than `threshTime`.
clearOldValues(long) - Method in class org.apache.spark.util.TimeStampedHashSet
Removes old values that have timestamp earlier than threshTime
clearOldValues(long) - Method in class org.apache.spark.util.TimeStampedWeakValueHashMap
Remove old key-value pairs with timestamps earlier than `threshTime`.
clearThreshold() - Method in class org.apache.spark.mllib.classification.LogisticRegressionModel
:: Experimental :: Clears the threshold so that predict will output raw prediction scores.
clearThreshold() - Method in class org.apache.spark.mllib.classification.SVMModel
:: Experimental :: Clears the threshold so that predict will output raw prediction scores.
client() - Method in class org.apache.spark.scheduler.cluster.SparkDeploySchedulerBackend
 
client() - Method in class org.apache.spark.storage.TachyonBlockManager
 
client() - Method in class org.apache.spark.streaming.flume.FlumeConnection
 
clock() - Method in class org.apache.spark.streaming.scheduler.JobGenerator
 
clock() - Method in class org.apache.spark.streaming.scheduler.JobScheduler
 
Clock - Interface in org.apache.spark.util
An interface to represent clocks, so that they can be mocked out in unit tests.
clone() - Method in class org.apache.spark.mllib.evaluation.binary.BinaryLabelCounter
 
clone() - Method in class org.apache.spark.SparkConf
Copy this object
clone(JvmType) - Method in class org.apache.spark.sql.columnar.ColumnType
Creates a duplicated copy of the value.
clone() - Method in class org.apache.spark.storage.StorageLevel
 
clone() - Method in class org.apache.spark.util.random.BernoulliCellSampler
 
clone() - Method in class org.apache.spark.util.random.BernoulliSampler
 
clone() - Method in class org.apache.spark.util.random.PoissonSampler
 
clone() - Method in interface org.apache.spark.util.random.RandomSampler
Return a copy of the RandomSampler object.
clone(T, SerializerInstance, ClassTag<T>) - Static method in class org.apache.spark.util.Utils
Clone an object using a Spark serializer.
cloneComplement() - Method in class org.apache.spark.util.random.BernoulliCellSampler
Return a sampler whose accepted range is the complement of the current sampler's range.
close() - Method in class org.apache.spark.api.java.JavaSparkContext
 
close() - Method in class org.apache.spark.input.FixedLengthBinaryRecordReader
 
close() - Method in class org.apache.spark.input.PortableDataStream
Close the file (if it is currently open)
close() - Method in class org.apache.spark.input.StreamBasedRecordReader
 
close() - Method in class org.apache.spark.input.WholeTextFileRecordReader
 
close() - Method in class org.apache.spark.serializer.DeserializationStream
 
close() - Method in class org.apache.spark.serializer.JavaDeserializationStream
 
close() - Method in class org.apache.spark.serializer.JavaSerializationStream
 
close() - Method in class org.apache.spark.serializer.KryoDeserializationStream
 
close() - Method in class org.apache.spark.serializer.KryoSerializationStream
 
close() - Method in class org.apache.spark.serializer.SerializationStream
 
close() - Method in class org.apache.spark.SparkHadoopWriter
 
close() - Method in class org.apache.spark.sql.hive.SparkHiveDynamicPartitionWriterContainer
 
close() - Method in class org.apache.spark.sql.hive.SparkHiveWriterContainer
 
close() - Method in class org.apache.spark.storage.BlockObjectWriter
 
close() - Method in class org.apache.spark.storage.DiskBlockObjectWriter
 
close() - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
 
close() - Method in class org.apache.spark.streaming.util.RateLimitedOutputStream
 
close() - Method in class org.apache.spark.streaming.util.WriteAheadLogRandomReader
 
close() - Method in class org.apache.spark.streaming.util.WriteAheadLogReader
 
close() - Method in class org.apache.spark.streaming.util.WriteAheadLogWriter
 
closeIfNeeded() - Method in class org.apache.spark.util.NextIterator
Calls the subclass-defined close method, but only once.
ClosureCleaner - Class in org.apache.spark.util
 
ClosureCleaner() - Constructor for class org.apache.spark.util.ClosureCleaner
 
closureSerializer() - Method in class org.apache.spark.SparkEnv
 
cluster() - Method in class org.apache.spark.mllib.clustering.PowerIterationClustering.Assignment
 
clusterCenters() - Method in class org.apache.spark.mllib.clustering.KMeansModel
 
clusterCenters() - Method in class org.apache.spark.mllib.clustering.StreamingKMeansModel
 
clusterWeights() - Method in class org.apache.spark.mllib.clustering.StreamingKMeansModel
 
cn() - Method in class org.apache.spark.mllib.feature.VocabWord
 
coalesce(int) - Method in class org.apache.spark.api.java.JavaDoubleRDD
Return a new RDD that is reduced into numPartitions partitions.
coalesce(int, boolean) - Method in class org.apache.spark.api.java.JavaDoubleRDD
Return a new RDD that is reduced into numPartitions partitions.
coalesce(int) - Method in class org.apache.spark.api.java.JavaPairRDD
Return a new RDD that is reduced into numPartitions partitions.
coalesce(int, boolean) - Method in class org.apache.spark.api.java.JavaPairRDD
Return a new RDD that is reduced into numPartitions partitions.
coalesce(int) - Method in class org.apache.spark.api.java.JavaRDD
Return a new RDD that is reduced into numPartitions partitions.
coalesce(int, boolean) - Method in class org.apache.spark.api.java.JavaRDD
Return a new RDD that is reduced into numPartitions partitions.
coalesce(int, boolean, Ordering<T>) - Method in class org.apache.spark.rdd.RDD
Return a new RDD that is reduced into numPartitions partitions.
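
A quick sketch of the two RDD coalesce modes, assuming rdd is an existing RDD with many small partitions:

    val fewer      = rdd.coalesce(8)                   // narrow dependency, no shuffle
    val rebalanced = rdd.coalesce(8, shuffle = true)   // shuffles to spread data evenly
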
coalesce(Column...) - Static method in class org.apache.spark.sql.functions
Returns the first column that is not null.
coalesce(Seq<Column>) - Static method in class org.apache.spark.sql.functions
Returns the first column that is not null.
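
The SQL coalesce function, sketched against a hypothetical DataFrame df with columns "nickname" and "name":

    import org.apache.spark.sql.functions.coalesce

    // For each row, the first non-null value among the given columns.
    val display = df.select(coalesce(df("nickname"), df("name")))
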
COALESCE() - Static method in class org.apache.spark.sql.hive.HiveQl
 
CoalescedRDD<T> - Class in org.apache.spark.rdd
Represents a coalesced RDD that has fewer partitions than its parent RDD. This class uses the PartitionCoalescer class to find a good partitioning of the parent RDD so that each new partition has roughly the same number of parent partitions and the preferred location of each new partition overlaps with as many preferred locations of its parent partitions as possible.
CoalescedRDD(RDD<T>, int, double, ClassTag<T>) - Constructor for class org.apache.spark.rdd.CoalescedRDD
 
CoalescedRDDPartition - Class in org.apache.spark.rdd
Class that captures a coalesced RDD by essentially keeping track of parent partitions
CoalescedRDDPartition(int, RDD<?>, int[], Option<String>) - Constructor for class org.apache.spark.rdd.CoalescedRDDPartition
 
CoarseGrainedClusterMessage - Interface in org.apache.spark.scheduler.cluster
 
CoarseGrainedClusterMessages - Class in org.apache.spark.scheduler.cluster
 
CoarseGrainedClusterMessages() - Constructor for class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages
 
CoarseGrainedClusterMessages.AddWebUIFilter - Class in org.apache.spark.scheduler.cluster
 
CoarseGrainedClusterMessages.AddWebUIFilter(String, Map<String, String>, String) - Constructor for class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.AddWebUIFilter
 
CoarseGrainedClusterMessages.AddWebUIFilter$ - Class in org.apache.spark.scheduler.cluster
 
CoarseGrainedClusterMessages.AddWebUIFilter$() - Constructor for class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.AddWebUIFilter$
 
CoarseGrainedClusterMessages.KillExecutors - Class in org.apache.spark.scheduler.cluster
 
CoarseGrainedClusterMessages.KillExecutors(Seq<String>) - Constructor for class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.KillExecutors
 
CoarseGrainedClusterMessages.KillExecutors$ - Class in org.apache.spark.scheduler.cluster
 
CoarseGrainedClusterMessages.KillExecutors$() - Constructor for class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.KillExecutors$
 
CoarseGrainedClusterMessages.KillTask - Class in org.apache.spark.scheduler.cluster
 
CoarseGrainedClusterMessages.KillTask(long, String, boolean) - Constructor for class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.KillTask
 
CoarseGrainedClusterMessages.KillTask$ - Class in org.apache.spark.scheduler.cluster
 
CoarseGrainedClusterMessages.KillTask$() - Constructor for class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.KillTask$
 
CoarseGrainedClusterMessages.LaunchTask - Class in org.apache.spark.scheduler.cluster
 
CoarseGrainedClusterMessages.LaunchTask(SerializableBuffer) - Constructor for class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.LaunchTask
 
CoarseGrainedClusterMessages.LaunchTask$ - Class in org.apache.spark.scheduler.cluster
 
CoarseGrainedClusterMessages.LaunchTask$() - Constructor for class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.LaunchTask$
 
CoarseGrainedClusterMessages.RegisterClusterManager$ - Class in org.apache.spark.scheduler.cluster
 
CoarseGrainedClusterMessages.RegisterClusterManager$() - Constructor for class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.RegisterClusterManager$
 
CoarseGrainedClusterMessages.RegisteredExecutor$ - Class in org.apache.spark.scheduler.cluster
 
CoarseGrainedClusterMessages.RegisteredExecutor$() - Constructor for class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.RegisteredExecutor$
 
CoarseGrainedClusterMessages.RegisterExecutor - Class in org.apache.spark.scheduler.cluster
 
CoarseGrainedClusterMessages.RegisterExecutor(String, String, int, Map<String, String>) - Constructor for class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.RegisterExecutor
 
CoarseGrainedClusterMessages.RegisterExecutor$ - Class in org.apache.spark.scheduler.cluster
 
CoarseGrainedClusterMessages.RegisterExecutor$() - Constructor for class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.RegisterExecutor$
 
CoarseGrainedClusterMessages.RegisterExecutorFailed - Class in org.apache.spark.scheduler.cluster
 
CoarseGrainedClusterMessages.RegisterExecutorFailed(String) - Constructor for class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.RegisterExecutorFailed
 
CoarseGrainedClusterMessages.RegisterExecutorFailed$ - Class in org.apache.spark.scheduler.cluster
 
CoarseGrainedClusterMessages.RegisterExecutorFailed$() - Constructor for class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.RegisterExecutorFailed$
 
CoarseGrainedClusterMessages.RemoveExecutor - Class in org.apache.spark.scheduler.cluster
 
CoarseGrainedClusterMessages.RemoveExecutor(String, String) - Constructor for class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.RemoveExecutor
 
CoarseGrainedClusterMessages.RemoveExecutor$ - Class in org.apache.spark.scheduler.cluster
 
CoarseGrainedClusterMessages.RemoveExecutor$() - Constructor for class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.RemoveExecutor$
 
CoarseGrainedClusterMessages.RequestExecutors - Class in org.apache.spark.scheduler.cluster
 
CoarseGrainedClusterMessages.RequestExecutors(int) - Constructor for class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.RequestExecutors
 
CoarseGrainedClusterMessages.RequestExecutors$ - Class in org.apache.spark.scheduler.cluster
 
CoarseGrainedClusterMessages.RequestExecutors$() - Constructor for class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.RequestExecutors$
 
CoarseGrainedClusterMessages.RetrieveSparkProps$ - Class in org.apache.spark.scheduler.cluster
 
CoarseGrainedClusterMessages.RetrieveSparkProps$() - Constructor for class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.RetrieveSparkProps$
 
CoarseGrainedClusterMessages.ReviveOffers$ - Class in org.apache.spark.scheduler.cluster
 
CoarseGrainedClusterMessages.ReviveOffers$() - Constructor for class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.ReviveOffers$
 
CoarseGrainedClusterMessages.StatusUpdate - Class in org.apache.spark.scheduler.cluster
 
CoarseGrainedClusterMessages.StatusUpdate(String, long, Enumeration.Value, SerializableBuffer) - Constructor for class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.StatusUpdate
 
CoarseGrainedClusterMessages.StatusUpdate$ - Class in org.apache.spark.scheduler.cluster
 
CoarseGrainedClusterMessages.StatusUpdate$() - Constructor for class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.StatusUpdate$
 
CoarseGrainedClusterMessages.StopDriver$ - Class in org.apache.spark.scheduler.cluster
 
CoarseGrainedClusterMessages.StopDriver$() - Constructor for class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.StopDriver$
 
CoarseGrainedClusterMessages.StopExecutor$ - Class in org.apache.spark.scheduler.cluster
 
CoarseGrainedClusterMessages.StopExecutor$() - Constructor for class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.StopExecutor$
 
CoarseGrainedClusterMessages.StopExecutors$ - Class in org.apache.spark.scheduler.cluster
 
CoarseGrainedClusterMessages.StopExecutors$() - Constructor for class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.StopExecutors$
 
CoarseGrainedSchedulerBackend - Class in org.apache.spark.scheduler.cluster
A scheduler backend that waits for coarse grained executors to connect to it through Akka.
CoarseGrainedSchedulerBackend(TaskSchedulerImpl, ActorSystem) - Constructor for class org.apache.spark.scheduler.cluster.CoarseGrainedSchedulerBackend
 
CoarseGrainedSchedulerBackend.DriverActor - Class in org.apache.spark.scheduler.cluster
 
CoarseGrainedSchedulerBackend.DriverActor(Seq<Tuple2<String, String>>) - Constructor for class org.apache.spark.scheduler.cluster.CoarseGrainedSchedulerBackend.DriverActor
 
CoarseMesosSchedulerBackend - Class in org.apache.spark.scheduler.cluster.mesos
A SchedulerBackend that runs tasks on Mesos, but uses "coarse-grained" tasks, where it holds onto each Mesos node for the duration of the Spark job instead of relinquishing cores whenever a task is done.
CoarseMesosSchedulerBackend(TaskSchedulerImpl, SparkContext, String) - Constructor for class org.apache.spark.scheduler.cluster.mesos.CoarseMesosSchedulerBackend
 
code() - Method in class org.apache.spark.mllib.feature.VocabWord
 
CODEGEN_ENABLED() - Static method in class org.apache.spark.sql.SQLConf
 
codegenEnabled() - Method in class org.apache.spark.sql.SQLConf
When set to true, Spark SQL will use the Scala compiler at runtime to generate custom bytecode that evaluates expressions found in queries.
codeLen() - Method in class org.apache.spark.mllib.feature.VocabWord
 
cogroup(JavaPairRDD<K, W>, Partitioner) - Method in class org.apache.spark.api.java.JavaPairRDD
For each key k in this or other, return a resulting RDD that contains a tuple with the list of values for that key in this as well as other.
cogroup(JavaPairRDD<K, W1>, JavaPairRDD<K, W2>, Partitioner) - Method in class org.apache.spark.api.java.JavaPairRDD
For each key k in this or other1 or other2, return a resulting RDD that contains a tuple with the list of values for that key in this, other1 and other2.
cogroup(JavaPairRDD<K, W1>, JavaPairRDD<K, W2>, JavaPairRDD<K, W3>, Partitioner) - Method in class org.apache.spark.api.java.JavaPairRDD
For each key k in this or other1 or other2 or other3, return a resulting RDD that contains a tuple with the list of values for that key in this, other1, other2 and other3.
cogroup(JavaPairRDD<K, W>) - Method in class org.apache.spark.api.java.JavaPairRDD
For each key k in this or other, return a resulting RDD that contains a tuple with the list of values for that key in this as well as other.
cogroup(JavaPairRDD<K, W1>, JavaPairRDD<K, W2>) - Method in class org.apache.spark.api.java.JavaPairRDD
For each key k in this or other1 or other2, return a resulting RDD that contains a tuple with the list of values for that key in this, other1 and other2.
cogroup(JavaPairRDD<K, W1>, JavaPairRDD<K, W2>, JavaPairRDD<K, W3>) - Method in class org.apache.spark.api.java.JavaPairRDD
For each key k in this or other1 or other2 or other3, return a resulting RDD that contains a tuple with the list of values for that key in this, other1, other2 and other3.
cogroup(JavaPairRDD<K, W>, int) - Method in class org.apache.spark.api.java.JavaPairRDD
For each key k in this or other, return a resulting RDD that contains a tuple with the list of values for that key in this as well as other.
cogroup(JavaPairRDD<K, W1>, JavaPairRDD<K, W2>, int) - Method in class org.apache.spark.api.java.JavaPairRDD
For each key k in this or other1 or other2, return a resulting RDD that contains a tuple with the list of values for that key in this, other1 and other2.
cogroup(JavaPairRDD<K, W1>, JavaPairRDD<K, W2>, JavaPairRDD<K, W3>, int) - Method in class org.apache.spark.api.java.JavaPairRDD
For each key k in this or other1 or other2 or other3, return a resulting RDD that contains a tuple with the list of values for that key in this, other1, other2 and other3.
cogroup(RDD<Tuple2<K, W1>>, RDD<Tuple2<K, W2>>, RDD<Tuple2<K, W3>>, Partitioner) - Method in class org.apache.spark.rdd.PairRDDFunctions
For each key k in this or other1 or other2 or other3, return a resulting RDD that contains a tuple with the list of values for that key in this, other1, other2 and other3.
cogroup(RDD<Tuple2<K, W>>, Partitioner) - Method in class org.apache.spark.rdd.PairRDDFunctions
 
cogroup(RDD<Tuple2<K, W1>>, RDD<Tuple2<K, W2>>, Partitioner) - Method in class org.apache.spark.rdd.PairRDDFunctions
 
cogroup(RDD<Tuple2<K, W1>>, RDD<Tuple2<K, W2>>, RDD<Tuple2<K, W3>>) - Method in class org.apache.spark.rdd.PairRDDFunctions
 
cogroup(RDD<Tuple2<K, W>>) - Method in class org.apache.spark.rdd.PairRDDFunctions
For each key k in this or other, return a resulting RDD that contains a tuple with the list of values for that key in this as well as other.
cogroup(RDD<Tuple2<K, W1>>, RDD<Tuple2<K, W2>>) - Method in class org.apache.spark.rdd.PairRDDFunctions
For each key k in this or other1 or other2, return a resulting RDD that contains a tuple with the list of values for that key in this, other1 and other2.
cogroup(RDD<Tuple2<K, W>>, int) - Method in class org.apache.spark.rdd.PairRDDFunctions
For each key k in this or other, return a resulting RDD that contains a tuple with the list of values for that key in this as well as other.
cogroup(RDD<Tuple2<K, W1>>, RDD<Tuple2<K, W2>>, int) - Method in class org.apache.spark.rdd.PairRDDFunctions
For each key k in this or other1 or other2, return a resulting RDD that contains a tuple with the list of values for that key in this, other1 and other2.
cogroup(RDD<Tuple2<K, W1>>, RDD<Tuple2<K, W2>>, RDD<Tuple2<K, W3>>, int) - Method in class org.apache.spark.rdd.PairRDDFunctions
For each key k in this or other1 or other2 or other3, return a resulting RDD that contains a tuple with the list of values for that key in this, other1, other2 and other3.
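
A short cogroup example on pair RDDs, assuming sc is an existing SparkContext:

    val purchases = sc.parallelize(Seq((1, "book"), (1, "pen"), (2, "laptop")))
    val visits    = sc.parallelize(Seq((1, "home"), (3, "cart")))
    // For every key, a pair of Iterables (one per side), either possibly empty:
    // (1, (Iterable(book, pen), Iterable(home))), (2, (Iterable(laptop), Iterable())), (3, (Iterable(), Iterable(cart)))
    val grouped = purchases.cogroup(visits)
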
cogroup(JavaPairDStream<K, W>) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
Return a new DStream by applying 'cogroup' between RDDs of this DStream and other DStream.
cogroup(JavaPairDStream<K, W>, int) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
Return a new DStream by applying 'cogroup' between RDDs of this DStream and other DStream.
cogroup(JavaPairDStream<K, W>, Partitioner) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
Return a new DStream by applying 'cogroup' between RDDs of this DStream and other DStream.
cogroup(DStream<Tuple2<K, W>>, ClassTag<W>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
Return a new DStream by applying 'cogroup' between RDDs of this DStream and other DStream.
cogroup(DStream<Tuple2<K, W>>, int, ClassTag<W>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
Return a new DStream by applying 'cogroup' between RDDs of this DStream and other DStream.
cogroup(DStream<Tuple2<K, W>>, Partitioner, ClassTag<W>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
Return a new DStream by applying 'cogroup' between RDDs of this DStream and other DStream.
CoGroupedRDD<K> - Class in org.apache.spark.rdd
:: DeveloperApi :: A RDD that cogroups its parents.
CoGroupedRDD(Seq<RDD<? extends Product2<K, ?>>>, Partitioner) - Constructor for class org.apache.spark.rdd.CoGroupedRDD
 
CoGroupPartition - Class in org.apache.spark.rdd
 
CoGroupPartition(int, CoGroupSplitDep[]) - Constructor for class org.apache.spark.rdd.CoGroupPartition
 
cogroupResult2ToJava(RDD<Tuple2<K, Tuple3<Iterable<V>, Iterable<W1>, Iterable<W2>>>>, ClassTag<K>) - Static method in class org.apache.spark.api.java.JavaPairRDD
 
cogroupResult3ToJava(RDD<Tuple2<K, Tuple4<Iterable<V>, Iterable<W1>, Iterable<W2>, Iterable<W3>>>>, ClassTag<K>) - Static method in class org.apache.spark.api.java.JavaPairRDD
 
cogroupResultToJava(RDD<Tuple2<K, Tuple2<Iterable<V>, Iterable<W>>>>, ClassTag<K>) - Static method in class org.apache.spark.api.java.JavaPairRDD
 
CoGroupSplitDep - Interface in org.apache.spark.rdd
 
col(String) - Method in class org.apache.spark.sql.DataFrame
Selects a column based on the column name and returns it as a Column.
col(String) - Static method in class org.apache.spark.sql.functions
Returns a Column based on the given column name.
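
The two equivalent ways to reference a column, assuming a hypothetical DataFrame df with a column "price":

    import org.apache.spark.sql.functions.col

    df.select(df.col("price"))   // resolved against df
    df.select(col("price"))      // free-standing column reference, bound during analysis
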
collect() - Method in interface org.apache.spark.api.java.JavaRDDLike
Return an array that contains all of the elements in this RDD.
collect() - Method in class org.apache.spark.graphx.impl.EdgeRDDImpl
 
collect() - Method in class org.apache.spark.rdd.RDD
Return an array that contains all of the elements in this RDD.
collect(PartialFunction<T, U>, ClassTag<U>) - Method in class org.apache.spark.rdd.RDD
Return an RDD that contains all matching values by applying f.
collect() - Method in class org.apache.spark.sql.DataFrame
Returns an array that contains all of the Rows in this DataFrame.
collect() - Method in interface org.apache.spark.sql.RDDApi
 
collectAsList() - Method in class org.apache.spark.sql.DataFrame
Returns a Java list that contains all of the Rows in this DataFrame.
collectAsList() - Method in interface org.apache.spark.sql.RDDApi
 
collectAsMap() - Method in class org.apache.spark.api.java.JavaPairRDD
Return the key-value pairs in this RDD to the master as a Map.
collectAsMap() - Method in class org.apache.spark.rdd.PairRDDFunctions
Return the key-value pairs in this RDD to the master as a Map.
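
A tiny collectAsMap example, assuming sc is an existing SparkContext (if a key appears more than once, only one of its values is kept):

    val pairs = sc.parallelize(Seq(("a", 1), ("b", 2)))
    val asMap = pairs.collectAsMap()   // scala.collection.Map(a -> 1, b -> 2) on the driver
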
collectAsync() - Method in interface org.apache.spark.api.java.JavaRDDLike
The asynchronous version of collect, which returns a future for retrieving an array containing all of the elements in this RDD.
collectAsync() - Method in class org.apache.spark.rdd.AsyncRDDActions
Returns a future for retrieving all elements of this RDD.
collectEdges(EdgeDirection) - Method in class org.apache.spark.graphx.GraphOps
Returns an RDD that contains for each vertex v its local edges, i.e., the edges that are incident on v, in the user-specified direction.
collectedStatistics() - Method in class org.apache.spark.sql.columnar.BinaryColumnStats
 
collectedStatistics() - Method in class org.apache.spark.sql.columnar.BooleanColumnStats
 
collectedStatistics() - Method in class org.apache.spark.sql.columnar.ByteColumnStats
 
collectedStatistics() - Method in interface org.apache.spark.sql.columnar.ColumnStats
Column statistics represented as a single row, currently including closed lower bound, closed upper bound and null count.
collectedStatistics() - Method in class org.apache.spark.sql.columnar.DoubleColumnStats
 
collectedStatistics() - Method in class org.apache.spark.sql.columnar.FloatColumnStats
 
collectedStatistics() - Method in class org.apache.spark.sql.columnar.GenericColumnStats
 
collectedStatistics() - Method in class org.apache.spark.sql.columnar.IntColumnStats
 
collectedStatistics() - Method in class org.apache.spark.sql.columnar.LongColumnStats
 
collectedStatistics() - Method in class org.apache.spark.sql.columnar.NoopColumnStats
 
collectedStatistics() - Method in class org.apache.spark.sql.columnar.ShortColumnStats
 
collectedStatistics() - Method in class org.apache.spark.sql.columnar.StringColumnStats
 
collectedStatistics() - Method in class org.apache.spark.sql.columnar.TimestampColumnStats
 
CollectionsUtils - Class in org.apache.spark.util
 
CollectionsUtils() - Constructor for class org.apache.spark.util.CollectionsUtils
 
collectNeighborIds(EdgeDirection) - Method in class org.apache.spark.graphx.GraphOps
Collect the neighbor vertex ids for each vertex.
collectNeighbors(EdgeDirection) - Method in class org.apache.spark.graphx.GraphOps
Collect the neighbor vertex attributes for each vertex.
collectPartitions(int[]) - Method in interface org.apache.spark.api.java.JavaRDDLike
Return an array that contains all of the elements in a specific partition of this RDD.
collectPartitions() - Method in class org.apache.spark.rdd.RDD
A private method for tests, to look at the contents of each partition
colPtrs() - Method in class org.apache.spark.mllib.linalg.SparseMatrix
 
cols() - Method in class org.apache.spark.mllib.linalg.distributed.GridPartitioner
 
colsPerBlock() - Method in class org.apache.spark.mllib.linalg.distributed.BlockMatrix
 
colsPerPart() - Method in class org.apache.spark.mllib.linalg.distributed.GridPartitioner
 
colStats(RDD<Vector>) - Static method in class org.apache.spark.mllib.stat.Statistics
Computes column-wise summary statistics for the input RDD[Vector].
Column - Class in org.apache.spark.sql
:: Experimental :: A column in a DataFrame.
Column(Expression) - Constructor for class org.apache.spark.sql.Column
 
Column(String) - Constructor for class org.apache.spark.sql.Column
 
column(String) - Static method in class org.apache.spark.sql.functions
Returns a Column based on the given column name.
column() - Method in class org.apache.spark.sql.jdbc.JDBCPartitioningInfo
 
COLUMN_BATCH_SIZE() - Static method in class org.apache.spark.sql.SQLConf
 
COLUMN_NAME_OF_CORRUPT_RECORD() - Static method in class org.apache.spark.sql.SQLConf
 
ColumnAccessor - Interface in org.apache.spark.sql.columnar
An Iterator-like trait used to extract values from a columnar byte buffer.
columnBatchSize() - Method in class org.apache.spark.sql.SQLConf
The number of rows that will be
ColumnBuilder - Interface in org.apache.spark.sql.columnar
 
ColumnName - Class in org.apache.spark.sql
:: Experimental :: A convenient class used for constructing schema.
ColumnName(String) - Constructor for class org.apache.spark.sql.ColumnName
 
columnNameOfCorruptRecord() - Method in class org.apache.spark.sql.SQLConf
 
columnNames() - Method in class org.apache.spark.sql.parquet.ParquetRelation2.PartitionValues
 
columnOrdinals() - Method in class org.apache.spark.sql.hive.MetastoreRelation
An attribute map for determining the ordinal for non-partition columns.
columnPartition(JDBCPartitioningInfo) - Static method in class org.apache.spark.sql.jdbc.JDBCRelation
Given a partitioning schematic (a column of integral type, a number of partitions, and upper and lower bounds on the column's value), generate WHERE clauses for each partition so that each row in the table appears exactly once.
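A minimal sketch of the partitioning idea described above, written for illustration only and not taken from JDBCRelation itself; the column name, bounds, and partition count are hypothetical parameters:

    // Split [lowerBound, upperBound) on an integral column into numPartitions
    // WHERE clauses so that every row of the table falls into exactly one clause.
    def wherePartitions(column: String, lowerBound: Long, upperBound: Long,
                        numPartitions: Int): Seq[String] = {
      val stride = (upperBound - lowerBound) / numPartitions
      (0 until numPartitions).map { i =>
        val lo = lowerBound + i * stride
        val hi = lo + stride
        if (i == 0) s"$column < $hi OR $column IS NULL"        // values below the range and NULLs
        else if (i == numPartitions - 1) s"$column >= $lo"     // values above the range
        else s"$column >= $lo AND $column < $hi"
      }
    }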
columnPruningPred() - Method in class org.apache.spark.sql.parquet.ParquetTableScan
 
columns() - Method in class org.apache.spark.sql.DataFrame
Returns all column names as an array.
columnSimilarities() - Method in class org.apache.spark.mllib.linalg.distributed.RowMatrix
Compute all cosine similarities between columns of this matrix using the brute-force approach of computing normalized dot products.
columnSimilarities(double) - Method in class org.apache.spark.mllib.linalg.distributed.RowMatrix
Compute similarities between columns of this matrix using a sampling approach.
columnSimilaritiesDIMSUM(double[], double) - Method in class org.apache.spark.mllib.linalg.distributed.RowMatrix
Find all similar columns using the DIMSUM sampling algorithm, described in two papers
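A brief usage sketch for the two columnSimilarities variants above, assuming an existing SparkContext named sc:

    import org.apache.spark.mllib.linalg.Vectors
    import org.apache.spark.mllib.linalg.distributed.RowMatrix

    val rows = sc.parallelize(Seq(
      Vectors.dense(1.0, 2.0, 3.0),
      Vectors.dense(4.0, 5.0, 6.0)))
    val mat = new RowMatrix(rows)

    val exact  = mat.columnSimilarities()      // brute-force cosine similarities
    val approx = mat.columnSimilarities(0.1)   // DIMSUM sampling, threshold 0.1

Both calls return a CoordinateMatrix of pairwise column similarities.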
ColumnStatisticsSchema - Class in org.apache.spark.sql.columnar
 
ColumnStatisticsSchema(Attribute) - Constructor for class org.apache.spark.sql.columnar.ColumnStatisticsSchema
 
columnStats() - Method in class org.apache.spark.sql.columnar.BasicColumnBuilder
 
columnStats() - Method in interface org.apache.spark.sql.columnar.ColumnBuilder
Column statistics information
ColumnStats - Interface in org.apache.spark.sql.columnar
Used to collect statistical information when building in-memory columns.
columnStats() - Method in class org.apache.spark.sql.columnar.NativeColumnBuilder
 
columnType() - Method in class org.apache.spark.sql.columnar.BasicColumnBuilder
 
ColumnType<T extends org.apache.spark.sql.types.DataType,JvmType> - Class in org.apache.spark.sql.columnar
An abstract class that represents the type of a column.
ColumnType(int, int) - Constructor for class org.apache.spark.sql.columnar.ColumnType
 
columnType() - Method in class org.apache.spark.sql.columnar.NativeColumnBuilder
 
combineByKey(Function<V, C>, Function2<C, V, C>, Function2<C, C, C>, Partitioner) - Method in class org.apache.spark.api.java.JavaPairRDD
Generic function to combine the elements for each key using a custom set of aggregation functions.
combineByKey(Function<V, C>, Function2<C, V, C>, Function2<C, C, C>, int) - Method in class org.apache.spark.api.java.JavaPairRDD
Simplified version of combineByKey that hash-partitions the output RDD.
combineByKey(Function<V, C>, Function2<C, V, C>, Function2<C, C, C>) - Method in class org.apache.spark.api.java.JavaPairRDD
Simplified version of combineByKey that hash-partitions the resulting RDD using the existing partitioner/parallelism level.
combineByKey(Function1<V, C>, Function2<C, V, C>, Function2<C, C, C>, Partitioner, boolean, Serializer) - Method in class org.apache.spark.rdd.PairRDDFunctions
Generic function to combine the elements for each key using a custom set of aggregation functions.
combineByKey(Function1<V, C>, Function2<C, V, C>, Function2<C, C, C>, int) - Method in class org.apache.spark.rdd.PairRDDFunctions
Simplified version of combineByKey that hash-partitions the output RDD.
combineByKey(Function1<V, C>, Function2<C, V, C>, Function2<C, C, C>) - Method in class org.apache.spark.rdd.PairRDDFunctions
 
combineByKey(Function<V, C>, Function2<C, V, C>, Function2<C, C, C>, Partitioner) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
Combine elements of each key in DStream's RDDs using custom function.
combineByKey(Function<V, C>, Function2<C, V, C>, Function2<C, C, C>, Partitioner, boolean) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
Combine elements of each key in DStream's RDDs using custom function.
combineByKey(Function1<V, C>, Function2<C, V, C>, Function2<C, C, C>, Partitioner, boolean, ClassTag<C>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
Combine elements of each key in DStream's RDDs using custom functions.
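A short example of the combineByKey pattern shared by the variants above (computing a per-key mean), assuming an existing SparkContext named sc:

    val scores = sc.parallelize(Seq(("a", 1), ("b", 2), ("a", 3)))

    // createCombiner, mergeValue, mergeCombiners: build a (sum, count) pair per key
    val sumCount = scores.combineByKey(
      (v: Int) => (v, 1),
      (acc: (Int, Int), v: Int) => (acc._1 + v, acc._2 + 1),
      (a: (Int, Int), b: (Int, Int)) => (a._1 + b._1, a._2 + b._2))

    val means = sumCount.mapValues { case (sum, count) => sum.toDouble / count }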
combineCombinersByKey(Iterator<Product2<K, C>>) - Method in class org.apache.spark.Aggregator
 
combineCombinersByKey(Iterator<Product2<K, C>>, TaskContext) - Method in class org.apache.spark.Aggregator
 
combineValuesByKey(Iterator<Product2<K, V>>) - Method in class org.apache.spark.Aggregator
 
combineValuesByKey(Iterator<Product2<K, V>>, TaskContext) - Method in class org.apache.spark.Aggregator
 
combiningStrategy() - Method in class org.apache.spark.mllib.tree.model.TreeEnsembleModel.SaveLoadV1_0$.Metadata
 
command() - Method in class org.apache.spark.sql.UserDefinedPythonFunction
 
commit() - Method in class org.apache.spark.SparkHadoopWriter
 
commitAndClose() - Method in class org.apache.spark.storage.BlockObjectWriter
Flush the partial writes and commit them as a single atomic block.
commitAndClose() - Method in class org.apache.spark.storage.DiskBlockObjectWriter
 
commitJob() - Method in class org.apache.spark.SparkHadoopWriter
 
commitJob() - Method in class org.apache.spark.sql.hive.SparkHiveDynamicPartitionWriterContainer
 
commitJob() - Method in class org.apache.spark.sql.hive.SparkHiveWriterContainer
 
commonHeaderNodes() - Static method in class org.apache.spark.ui.UIUtils
 
comparator(Schedulable, Schedulable) - Method in class org.apache.spark.scheduler.FairSchedulingAlgorithm
 
comparator(Schedulable, Schedulable) - Method in class org.apache.spark.scheduler.FIFOSchedulingAlgorithm
 
comparator(Schedulable, Schedulable) - Method in interface org.apache.spark.scheduler.SchedulingAlgorithm
 
compare(PartitionGroup, PartitionGroup) - Method in class org.apache.spark.rdd.PartitionCoalescer
 
compare(Option<PartitionGroup>, Option<PartitionGroup>) - Method in class org.apache.spark.rdd.PartitionCoalescer
 
compare(RDDInfo) - Method in class org.apache.spark.storage.RDDInfo
 
compatibilityBlackList() - Static method in class org.apache.spark.sql.hive.HiveShim
 
compatibleType(DataType, DataType) - Static method in class org.apache.spark.sql.json.JsonRDD
Returns the most general data type for two given data types.
completedIndices() - Method in class org.apache.spark.ui.jobs.UIData.StageUIData
 
completedJobs() - Method in class org.apache.spark.ui.jobs.JobProgressListener
 
completedStageIndices() - Method in class org.apache.spark.ui.jobs.UIData.JobUIData
 
completedStages() - Method in class org.apache.spark.ui.jobs.JobProgressListener
 
completedTasks() - Method in class org.apache.spark.ui.exec.ExecutorSummaryInfo
 
completion() - Method in class org.apache.spark.util.CompletionIterator
 
CompletionEvent - Class in org.apache.spark.scheduler
 
CompletionEvent(Task<?>, TaskEndReason, Object, Map<Object, Object>, TaskInfo, TaskMetrics) - Constructor for class org.apache.spark.scheduler.CompletionEvent
 
CompletionIterator<A,I extends scala.collection.Iterator<A>> - Class in org.apache.spark.util
Wrapper around an iterator which calls a completion method after it successfully iterates through all the elements.
CompletionIterator(I) - Constructor for class org.apache.spark.util.CompletionIterator
 
completionTime() - Method in class org.apache.spark.scheduler.StageInfo
Time when all tasks in the stage completed or when the stage was cancelled.
completionTime() - Method in class org.apache.spark.ui.jobs.UIData.JobUIData
 
ComplexColumnBuilder<T extends org.apache.spark.sql.types.DataType,JvmType> - Class in org.apache.spark.sql.columnar
 
ComplexColumnBuilder(ColumnStats, ColumnType<T, JvmType>) - Constructor for class org.apache.spark.sql.columnar.ComplexColumnBuilder
 
ComplexFutureAction<T> - Class in org.apache.spark
A FutureAction for actions that could trigger multiple Spark jobs.
ComplexFutureAction() - Constructor for class org.apache.spark.ComplexFutureAction
 
compress() - Method in class org.apache.spark.ml.recommendation.ALS.UncompressedInBlock
Compresses the block into an ALS.InBlock.
compress(ByteBuffer, ByteBuffer) - Method in class org.apache.spark.sql.columnar.compression.BooleanBitSet.Encoder
 
compress(ByteBuffer, ByteBuffer) - Method in class org.apache.spark.sql.columnar.compression.DictionaryEncoding.Encoder
 
compress(ByteBuffer, ByteBuffer) - Method in interface org.apache.spark.sql.columnar.compression.Encoder
 
compress(ByteBuffer, ByteBuffer) - Method in class org.apache.spark.sql.columnar.compression.IntDelta.Encoder
 
compress(ByteBuffer, ByteBuffer) - Method in class org.apache.spark.sql.columnar.compression.LongDelta.Encoder
 
compress(ByteBuffer, ByteBuffer) - Method in class org.apache.spark.sql.columnar.compression.PassThrough.Encoder
 
compress(ByteBuffer, ByteBuffer) - Method in class org.apache.spark.sql.columnar.compression.RunLengthEncoding.Encoder
 
COMPRESS_CACHED() - Static method in class org.apache.spark.sql.SQLConf
 
compressCodec() - Method in class org.apache.spark.sql.hive.ShimFileSinkDesc
 
compressed() - Method in class org.apache.spark.sql.hive.ShimFileSinkDesc
 
compressedInputStream(InputStream) - Method in interface org.apache.spark.io.CompressionCodec
 
compressedInputStream(InputStream) - Method in class org.apache.spark.io.LZ4CompressionCodec
 
compressedInputStream(InputStream) - Method in class org.apache.spark.io.LZFCompressionCodec
 
compressedInputStream(InputStream) - Method in class org.apache.spark.io.SnappyCompressionCodec
 
CompressedMapStatus - Class in org.apache.spark.scheduler
A MapStatus implementation that tracks the size of each block.
CompressedMapStatus(BlockManagerId, byte[]) - Constructor for class org.apache.spark.scheduler.CompressedMapStatus
 
CompressedMapStatus(BlockManagerId, long[]) - Constructor for class org.apache.spark.scheduler.CompressedMapStatus
 
compressedOutputStream(OutputStream) - Method in interface org.apache.spark.io.CompressionCodec
 
compressedOutputStream(OutputStream) - Method in class org.apache.spark.io.LZ4CompressionCodec
 
compressedOutputStream(OutputStream) - Method in class org.apache.spark.io.LZFCompressionCodec
 
compressedOutputStream(OutputStream) - Method in class org.apache.spark.io.SnappyCompressionCodec
 
compressedSize() - Method in class org.apache.spark.sql.columnar.compression.BooleanBitSet.Encoder
 
compressedSize() - Method in class org.apache.spark.sql.columnar.compression.DictionaryEncoding.Encoder
 
compressedSize() - Method in interface org.apache.spark.sql.columnar.compression.Encoder
 
compressedSize() - Method in class org.apache.spark.sql.columnar.compression.IntDelta.Encoder
 
compressedSize() - Method in class org.apache.spark.sql.columnar.compression.LongDelta.Encoder
 
compressedSize() - Method in class org.apache.spark.sql.columnar.compression.PassThrough.Encoder
 
compressedSize() - Method in class org.apache.spark.sql.columnar.compression.RunLengthEncoding.Encoder
 
CompressibleColumnAccessor<T extends org.apache.spark.sql.types.NativeType> - Interface in org.apache.spark.sql.columnar.compression
 
CompressibleColumnBuilder<T extends org.apache.spark.sql.types.NativeType> - Interface in org.apache.spark.sql.columnar.compression
A stackable trait that builds optionally compressed byte buffer for a column.
COMPRESSION_CODEC_KEY() - Static method in class org.apache.spark.scheduler.EventLoggingListener
 
CompressionCodec - Interface in org.apache.spark.io
:: DeveloperApi :: CompressionCodec allows choosing among different compression implementations to be used in block storage.
compressionCodec() - Method in class org.apache.spark.streaming.CheckpointWriter
 
compressionEncoders() - Method in interface org.apache.spark.sql.columnar.compression.CompressibleColumnBuilder
 
compressionRatio() - Method in interface org.apache.spark.sql.columnar.compression.Encoder
 
CompressionScheme - Interface in org.apache.spark.sql.columnar.compression
 
compressType() - Method in class org.apache.spark.sql.hive.ShimFileSinkDesc
 
compute(Partition, TaskContext) - Method in class org.apache.spark.graphx.EdgeRDD
 
compute(Partition, TaskContext) - Method in class org.apache.spark.graphx.VertexRDD
Provides the RDD[(VertexId, VD)] equivalent output.
compute(Vector, double, Vector) - Method in class org.apache.spark.mllib.optimization.Gradient
Compute the gradient and loss given the features of a single data point.
compute(Vector, double, Vector, Vector) - Method in class org.apache.spark.mllib.optimization.Gradient
Compute the gradient and loss given the features of a single data point, add the gradient to a provided vector to avoid creating new objects, and return loss.
compute(Vector, double, Vector) - Method in class org.apache.spark.mllib.optimization.HingeGradient
 
compute(Vector, double, Vector, Vector) - Method in class org.apache.spark.mllib.optimization.HingeGradient
 
compute(Vector, Vector, double, int, double) - Method in class org.apache.spark.mllib.optimization.L1Updater
 
compute(Vector, double, Vector) - Method in class org.apache.spark.mllib.optimization.LeastSquaresGradient
 
compute(Vector, double, Vector, Vector) - Method in class org.apache.spark.mllib.optimization.LeastSquaresGradient
 
compute(Vector, double, Vector) - Method in class org.apache.spark.mllib.optimization.LogisticGradient
 
compute(Vector, double, Vector, Vector) - Method in class org.apache.spark.mllib.optimization.LogisticGradient
 
compute(Vector, Vector, double, int, double) - Method in class org.apache.spark.mllib.optimization.SimpleUpdater
 
compute(Vector, Vector, double, int, double) - Method in class org.apache.spark.mllib.optimization.SquaredL2Updater
 
compute(Vector, Vector, double, int, double) - Method in class org.apache.spark.mllib.optimization.Updater
Compute an updated value for weights given the gradient, stepSize, iteration number and regularization parameter.
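As a rough illustration of this contract (not quoted from any particular Updater implementation), an L2-regularized gradient step of this shape typically has the form

    \[ w_{t+1} = w_t - \frac{\text{stepSize}}{\sqrt{t}} \left( \nabla L(w_t) + \text{regParam} \cdot w_t \right) \]

where t is the iteration number; the updater typically returns the new weights together with the regularization term's contribution to the objective.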
compute(Partition, TaskContext) - Method in class org.apache.spark.mllib.rdd.RandomRDD
 
compute(Partition, TaskContext) - Method in class org.apache.spark.mllib.rdd.RandomVectorRDD
 
compute(Partition, TaskContext) - Method in class org.apache.spark.mllib.rdd.SlidingRDD
 
compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.BlockRDD
 
compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.CartesianRDD
 
compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.CheckpointRDD
 
compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.CoalescedRDD
 
compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.CoGroupedRDD
 
compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.EmptyRDD
 
compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.HadoopRDD
 
compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.HadoopRDD.HadoopMapPartitionsWithSplitRDD
 
compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.JdbcRDD
 
compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.MapPartitionsRDD
 
compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.NewHadoopRDD
 
compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.NewHadoopRDD.NewHadoopMapPartitionsWithSplitRDD
 
compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.ParallelCollectionRDD
 
compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.PartitionerAwareUnionRDD
 
compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.PartitionPruningRDD
 
compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.PartitionwiseSampledRDD
 
compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.PipedRDD
 
compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.RDD
:: DeveloperApi :: Implemented by subclasses to compute a given partition.
compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.SampledRDD
 
compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.ShuffledRDD
 
compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.SubtractedRDD
 
compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.UnionRDD
 
compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.ZippedPartitionsRDD2
 
compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.ZippedPartitionsRDD3
 
compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.ZippedPartitionsRDD4
 
compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.ZippedWithIndexRDD
 
compute(Partition, TaskContext) - Method in class org.apache.spark.sql.jdbc.JDBCRDD
Runs the SQL query against the JDBC driver.
compute(Time) - Method in class org.apache.spark.streaming.api.java.JavaDStream
Generate an RDD for the given duration
compute(Time) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
Method that generates an RDD for the given Duration
compute(Time) - Method in class org.apache.spark.streaming.dstream.ConstantInputDStream
 
compute(Time) - Method in class org.apache.spark.streaming.dstream.DStream
Method that generates an RDD for the given time
compute(Time) - Method in class org.apache.spark.streaming.dstream.FileInputDStream
Finds the files that were modified since the last time this method was called and makes a union RDD out of them.
compute(Time) - Method in class org.apache.spark.streaming.dstream.FilteredDStream
 
compute(Time) - Method in class org.apache.spark.streaming.dstream.FlatMappedDStream
 
compute(Time) - Method in class org.apache.spark.streaming.dstream.FlatMapValuedDStream
 
compute(Time) - Method in class org.apache.spark.streaming.dstream.ForEachDStream
 
compute(Time) - Method in class org.apache.spark.streaming.dstream.GlommedDStream
 
compute(Time) - Method in class org.apache.spark.streaming.dstream.MapPartitionedDStream
 
compute(Time) - Method in class org.apache.spark.streaming.dstream.MappedDStream
 
compute(Time) - Method in class org.apache.spark.streaming.dstream.MapValuedDStream
 
compute(Time) - Method in class org.apache.spark.streaming.dstream.QueueInputDStream
 
compute(Time) - Method in class org.apache.spark.streaming.dstream.ReceiverInputDStream
Generates RDDs with blocks received by the receiver of this stream.
compute(Time) - Method in class org.apache.spark.streaming.dstream.ReducedWindowedDStream
 
compute(Time) - Method in class org.apache.spark.streaming.dstream.ShuffledDStream
 
compute(Time) - Method in class org.apache.spark.streaming.dstream.StateDStream
 
compute(Time) - Method in class org.apache.spark.streaming.dstream.TransformedDStream
 
compute(Time) - Method in class org.apache.spark.streaming.dstream.UnionDStream
 
compute(Time) - Method in class org.apache.spark.streaming.dstream.WindowedDStream
 
compute(Time) - Method in class org.apache.spark.streaming.kafka.DirectKafkaInputDStream
 
compute(Partition, TaskContext) - Method in class org.apache.spark.streaming.kafka.KafkaRDD
 
compute(Partition, TaskContext) - Method in class org.apache.spark.streaming.rdd.WriteAheadLogBackedBlockRDD
Gets the partition data by getting the corresponding block from the block manager.
computeColumnSummaryStatistics() - Method in class org.apache.spark.mllib.linalg.distributed.RowMatrix
Computes column-wise summary statistics.
computeCorrelation(RDD<Object>, RDD<Object>) - Method in interface org.apache.spark.mllib.stat.correlation.Correlation
Compute correlation for two datasets.
computeCorrelation(RDD<Object>, RDD<Object>) - Static method in class org.apache.spark.mllib.stat.correlation.PearsonCorrelation
Compute the Pearson correlation for two datasets.
computeCorrelation(RDD<Object>, RDD<Object>) - Static method in class org.apache.spark.mllib.stat.correlation.SpearmanCorrelation
Compute Spearman's correlation for two datasets.
computeCorrelationMatrix(RDD<Vector>) - Method in interface org.apache.spark.mllib.stat.correlation.Correlation
Compute the correlation matrix S, for the input matrix, where S(i, j) is the correlation between column i and j.
computeCorrelationMatrix(RDD<Vector>) - Static method in class org.apache.spark.mllib.stat.correlation.PearsonCorrelation
Compute the Pearson correlation matrix S, for the input matrix, where S(i, j) is the correlation between column i and j.
computeCorrelationMatrix(RDD<Vector>) - Static method in class org.apache.spark.mllib.stat.correlation.SpearmanCorrelation
Compute Spearman's correlation matrix S, for the input matrix, where S(i, j) is the correlation between column i and j.
computeCorrelationMatrixFromCovariance(Matrix) - Static method in class org.apache.spark.mllib.stat.correlation.PearsonCorrelation
Compute the Pearson correlation matrix from the covariance matrix.
computeCorrelationWithMatrixImpl(RDD<Object>, RDD<Object>) - Method in interface org.apache.spark.mllib.stat.correlation.Correlation
Combine the two input RDD[Double]s into an RDD[Vector] and compute the correlation using the correlation implementation for RDD[Vector].
computeCost(RDD<Vector>) - Method in class org.apache.spark.mllib.clustering.KMeansModel
Return the K-means cost (sum of squared distances of points to their nearest center) for this model on the given data.
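Written out, the cost described above is

    \[ \text{cost}(X, C) = \sum_{x \in X} \min_{c \in C} \lVert x - c \rVert^2 \]

where X is the input data and C is the set of cluster centers.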
computeCovariance() - Method in class org.apache.spark.mllib.linalg.distributed.RowMatrix
Computes the covariance matrix, treating each row as an observation.
computeError(TreeEnsembleModel, RDD<LabeledPoint>) - Static method in class org.apache.spark.mllib.tree.loss.AbsoluteError
Method to calculate loss of the base learner for the gradient boosting calculation.
computeError(TreeEnsembleModel, RDD<LabeledPoint>) - Static method in class org.apache.spark.mllib.tree.loss.LogLoss
Method to calculate loss of the base learner for the gradient boosting calculation.
computeError(TreeEnsembleModel, RDD<LabeledPoint>) - Method in interface org.apache.spark.mllib.tree.loss.Loss
Method to calculate error of the base learner for the gradient boosting calculation.
computeError(TreeEnsembleModel, RDD<LabeledPoint>) - Static method in class org.apache.spark.mllib.tree.loss.SquaredError
Method to calculate loss of the base learner for the gradient boosting calculation.
computeFractionForSampleSize(int, long, boolean) - Static method in class org.apache.spark.util.random.SamplingUtils
Returns a sampling rate that guarantees a sample of size >= sampleSizeLowerBound 99.99% of the time.
computeGramianMatrix() - Method in class org.apache.spark.mllib.linalg.distributed.IndexedRowMatrix
Computes the Gramian matrix A^T A.
computeGramianMatrix() - Method in class org.apache.spark.mllib.linalg.distributed.RowMatrix
Computes the Gramian matrix A^T A.
computeOrReadCheckpoint(Partition, TaskContext) - Method in class org.apache.spark.rdd.RDD
Compute an RDD partition or read it from a checkpoint if the RDD is checkpointing.
computePreferredLocations(Seq<InputFormatInfo>) - Static method in class org.apache.spark.scheduler.InputFormatInfo
Computes the preferred locations based on input(s) and returns a location-to-block map.
computePrincipalComponents(int) - Method in class org.apache.spark.mllib.linalg.distributed.RowMatrix
Computes the top k principal components.
computeSplitSize(long, long, long) - Method in class org.apache.spark.input.FixedLengthBinaryInputFormat
This input format overrides computeSplitSize() to make sure that each split only contains full records.
computeSVD(int, boolean, double) - Method in class org.apache.spark.mllib.linalg.distributed.IndexedRowMatrix
Computes the singular value decomposition of this IndexedRowMatrix.
computeSVD(int, boolean, double) - Method in class org.apache.spark.mllib.linalg.distributed.RowMatrix
Computes singular value decomposition of this matrix.
computeSVD(int, boolean, double, int, double, String) - Method in class org.apache.spark.mllib.linalg.distributed.RowMatrix
The actual SVD implementation, visible for testing.
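A brief usage sketch for RowMatrix.computeSVD, assuming an existing SparkContext named sc:

    import org.apache.spark.mllib.linalg.Vectors
    import org.apache.spark.mllib.linalg.distributed.RowMatrix

    val mat = new RowMatrix(sc.parallelize(Seq(
      Vectors.dense(1.0, 0.0, 0.0),
      Vectors.dense(0.0, 2.0, 0.0),
      Vectors.dense(0.0, 0.0, 3.0))))

    val svd = mat.computeSVD(2, computeU = true)  // keep the top 2 singular values
    val U = svd.U  // distributed left singular vectors (RowMatrix)
    val s = svd.s  // singular values (Vector)
    val V = svd.V  // local right singular vectors (Matrix)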
computeThresholdByKey(Map<K, AcceptanceResult>, Map<K, Object>) - Static method in class org.apache.spark.util.random.StratifiedSamplingUtils
Given the result returned by getCounts, determine the threshold for accepting items to generate exact sample size.
conf() - Method in interface org.apache.spark.input.Configurable
 
conf() - Method in class org.apache.spark.rdd.RDD
 
conf() - Method in class org.apache.spark.scheduler.cluster.CoarseGrainedSchedulerBackend
 
conf() - Method in class org.apache.spark.scheduler.TaskSchedulerImpl
 
conf() - Method in class org.apache.spark.scheduler.TaskSetManager
 
conf() - Method in class org.apache.spark.SparkContext
 
conf() - Method in class org.apache.spark.SparkEnv
 
conf() - Method in class org.apache.spark.sql.parquet.ParquetRelation
 
conf() - Method in class org.apache.spark.storage.BlockManager
 
conf() - Method in class org.apache.spark.streaming.StreamingContext
 
conf() - Method in class org.apache.spark.ui.SparkUI
 
confidence() - Method in class org.apache.spark.partial.BoundedDouble
 
config() - Method in class org.apache.spark.streaming.kafka.KafkaCluster
 
configFile() - Method in class org.apache.spark.metrics.MetricsConfig
 
configTestLog4j(String) - Static method in class org.apache.spark.util.Utils
Configures log4j properties used for the test suite.
Configurable - Interface in org.apache.spark.input
A trait to implement the Configurable interface.
ConfigurableCombineFileRecordReader<K,V> - Class in org.apache.spark.input
A CombineFileRecordReader that can pass Hadoop Configuration to Configurable RecordReaders.
ConfigurableCombineFileRecordReader(InputSplit, TaskAttemptContext, Class<? extends RecordReader<K, V>>) - Constructor for class org.apache.spark.input.ConfigurableCombineFileRecordReader
 
configuration() - Method in class org.apache.spark.scheduler.InputFormatInfo
 
configuration() - Method in interface org.apache.spark.sql.parquet.ParquetTest
 
CONFIGURATION_INSTANTIATION_LOCK() - Static method in class org.apache.spark.rdd.HadoopRDD
Configuration's constructor is not threadsafe (see SPARK-1097 and HADOOP-10456).
confusionMatrix() - Method in class org.apache.spark.mllib.evaluation.MulticlassMetrics
Returns the confusion matrix: predicted classes are in columns, ordered by ascending class label, as in "labels".
connect(String, int) - Method in class org.apache.spark.streaming.kafka.KafkaCluster
 
connected(String) - Method in class org.apache.spark.scheduler.cluster.SparkDeploySchedulerBackend
 
connectedComponents() - Method in class org.apache.spark.graphx.GraphOps
Compute the connected component membership of each vertex and return a graph with the vertex value containing the lowest vertex id in the connected component containing that vertex.
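A brief usage sketch for GraphOps.connectedComponents, assuming an existing SparkContext named sc:

    import org.apache.spark.graphx.{Edge, Graph}

    val edges = sc.parallelize(Seq(Edge(1L, 2L, 1), Edge(2L, 3L, 1), Edge(5L, 6L, 1)))
    val graph = Graph.fromEdges(edges, defaultValue = 0)

    // each vertex ends up tagged with the lowest vertex id in its component
    val components = graph.connectedComponents().vertices
    components.collect().foreach { case (id, cc) => println(s"$id -> $cc") }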
ConnectedComponents - Class in org.apache.spark.graphx.lib
Connected components algorithm.
ConnectedComponents() - Constructor for class org.apache.spark.graphx.lib.ConnectedComponents
 
connectLeader(String, int) - Method in class org.apache.spark.streaming.kafka.KafkaCluster
 
CONSOLE_DEFAULT_PERIOD() - Method in class org.apache.spark.metrics.sink.ConsoleSink
 
CONSOLE_DEFAULT_UNIT() - Method in class org.apache.spark.metrics.sink.ConsoleSink
 
CONSOLE_KEY_PERIOD() - Method in class org.apache.spark.metrics.sink.ConsoleSink
 
CONSOLE_KEY_UNIT() - Method in class org.apache.spark.metrics.sink.ConsoleSink
 
ConsoleProgressBar - Class in org.apache.spark.ui
ConsoleProgressBar shows the progress of stages in the next line of the console.
ConsoleProgressBar(SparkContext) - Constructor for class org.apache.spark.ui.ConsoleProgressBar
 
ConsoleSink - Class in org.apache.spark.metrics.sink
 
ConsoleSink(Properties, MetricRegistry, SecurityManager) - Constructor for class org.apache.spark.metrics.sink.ConsoleSink
 
ConstantInputDStream<T> - Class in org.apache.spark.streaming.dstream
An input stream that always returns the same RDD on each timestep.
ConstantInputDStream(StreamingContext, RDD<T>, ClassTag<T>) - Constructor for class org.apache.spark.streaming.dstream.ConstantInputDStream
 
constructTree(org.apache.spark.mllib.tree.model.DecisionTreeModel.SaveLoadV1_0.NodeData[]) - Method in class org.apache.spark.mllib.tree.model.DecisionTreeModel.SaveLoadV1_0$
Given a list of nodes from a tree, construct the tree.
constructTrees(RDD<org.apache.spark.mllib.tree.model.DecisionTreeModel.SaveLoadV1_0.NodeData>) - Method in class org.apache.spark.mllib.tree.model.DecisionTreeModel.SaveLoadV1_0$
 
constructURIForAuthentication(URI, SecurityManager) - Static method in class org.apache.spark.util.Utils
Construct a URI containing information used for authentication.
consumerConnector() - Method in class org.apache.spark.streaming.kafka.KafkaReceiver
 
contains(Param<?>) - Method in class org.apache.spark.ml.param.ParamMap
Checks whether a parameter is explicitly specified.
contains(String) - Method in class org.apache.spark.SparkConf
Does the configuration contain a given parameter?
contains(Object) - Method in class org.apache.spark.sql.Column
Contains the other element.
contains(BlockId) - Method in class org.apache.spark.storage.BlockManagerMaster
Check if block manager master has a block.
contains(BlockId) - Method in class org.apache.spark.storage.BlockStore
 
contains(BlockId) - Method in class org.apache.spark.storage.DiskStore
 
contains(BlockId) - Method in class org.apache.spark.storage.MemoryStore
 
contains(BlockId) - Method in class org.apache.spark.storage.TachyonStore
 
contains(A) - Method in class org.apache.spark.util.TimeStampedHashSet
 
containsBlock(BlockId) - Method in class org.apache.spark.storage.DiskBlockManager
Check if disk block manager has a block.
containsBlock(BlockId) - Method in class org.apache.spark.storage.StorageStatus
Return whether the given block is stored in this block manager in O(1) time.
containsCachedMetadata(String) - Static method in class org.apache.spark.rdd.HadoopRDD
 
containsShuffle(int) - Method in class org.apache.spark.MapOutputTrackerMaster
Check if the given shuffle is being tracked
contentType() - Method in class org.apache.spark.ui.JettyUtils.ServletParams
 
context() - Method in interface org.apache.spark.api.java.JavaRDDLike
The SparkContext that this RDD was created on.
context() - Method in class org.apache.spark.InterruptibleIterator
 
context() - Method in class org.apache.spark.rdd.RDD
The SparkContext that this RDD was created on.
context() - Method in class org.apache.spark.sql.hive.execution.HiveTableScan
 
context() - Method in class org.apache.spark.sql.hive.HiveStrategies.HiveCommandStrategy
 
context() - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
Return the StreamingContext associated with this DStream
context() - Method in class org.apache.spark.streaming.dstream.DStream
Return the StreamingContext associated with this DStream
ContextCleaner - Class in org.apache.spark
An asynchronous cleaner for RDD, shuffle, and broadcast state.
ContextCleaner(SparkContext) - Constructor for class org.apache.spark.ContextCleaner
 
ContextWaiter - Class in org.apache.spark.streaming
 
ContextWaiter() - Constructor for class org.apache.spark.streaming.ContextWaiter
 
Continuous() - Static method in class org.apache.spark.mllib.tree.configuration.FeatureType
 
convert() - Method in class org.apache.spark.WritableConverter
 
convert() - Method in class org.apache.spark.WritableFactory
 
convertFromAttributes(Seq<Attribute>, boolean) - Static method in class org.apache.spark.sql.parquet.ParquetTypesConverter
 
convertFromString(String) - Static method in class org.apache.spark.sql.parquet.ParquetTypesConverter
 
convertFromTimestamp(Timestamp) - Static method in class org.apache.spark.sql.parquet.CatalystTimestampConverter
 
convertSplitLocationInfo(Object[]) - Static method in class org.apache.spark.rdd.HadoopRDD
 
convertToAttributes(Type, boolean, boolean) - Static method in class org.apache.spark.sql.parquet.ParquetTypesConverter
 
convertToBaggedRDD(RDD<Datum>, double, int, boolean, int) - Static method in class org.apache.spark.mllib.tree.impl.BaggedPoint
Convert an input dataset into its BaggedPoint representation, choosing subsamplingRate counts for each instance.
convertToCanonicalEdges(Function2<ED, ED, ED>) - Method in class org.apache.spark.graphx.GraphOps
Convert bi-directional edges into uni-directional ones.
convertToString(Seq<Attribute>) - Static method in class org.apache.spark.sql.parquet.ParquetTypesConverter
 
convertToTimestamp(Binary) - Static method in class org.apache.spark.sql.parquet.CatalystTimestampConverter
 
convertToTreeRDD(RDD<LabeledPoint>, Bin[][], DecisionTreeMetadata) - Static method in class org.apache.spark.mllib.tree.impl.TreePoint
Convert an input dataset into its TreePoint representation, binning feature values in preparation for DecisionTree training.
CoordinateMatrix - Class in org.apache.spark.mllib.linalg.distributed
:: Experimental :: Represents a matrix in coordinate format.
CoordinateMatrix(RDD<MatrixEntry>, long, long) - Constructor for class org.apache.spark.mllib.linalg.distributed.CoordinateMatrix
 
CoordinateMatrix(RDD<MatrixEntry>) - Constructor for class org.apache.spark.mllib.linalg.distributed.CoordinateMatrix
Alternative constructor leaving matrix dimensions to be determined automatically.
coordinatorActor() - Method in class org.apache.spark.scheduler.OutputCommitCoordinator
 
copiesRunning() - Method in class org.apache.spark.scheduler.TaskSetManager
 
copy() - Method in class org.apache.spark.ml.param.ParamMap
Make a copy of this param map.
copy(Vector, Vector) - Static method in class org.apache.spark.mllib.linalg.BLAS
y = x
copy() - Method in class org.apache.spark.mllib.linalg.DenseMatrix
 
copy() - Method in class org.apache.spark.mllib.linalg.DenseVector
 
copy() - Method in interface org.apache.spark.mllib.linalg.Matrix
Get a deep copy of the matrix.
copy() - Method in class org.apache.spark.mllib.linalg.SparseMatrix
 
copy() - Method in class org.apache.spark.mllib.linalg.SparseVector
 
copy() - Method in interface org.apache.spark.mllib.linalg.Vector
Makes a deep copy of this vector.
copy() - Method in class org.apache.spark.mllib.random.ExponentialGenerator
 
copy() - Method in class org.apache.spark.mllib.random.GammaGenerator
 
copy() - Method in class org.apache.spark.mllib.random.LogNormalGenerator
 
copy() - Method in class org.apache.spark.mllib.random.PoissonGenerator
 
copy() - Method in interface org.apache.spark.mllib.random.RandomDataGenerator
Returns a copy of the RandomDataGenerator with a new instance of the rng object used in the class when applicable for non-locking concurrent usage.
copy() - Method in class org.apache.spark.mllib.random.StandardNormalGenerator
 
copy() - Method in class org.apache.spark.mllib.random.UniformGenerator
 
copy() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
Returns a shallow copy of this instance.
copy() - Method in class org.apache.spark.mllib.tree.impurity.EntropyCalculator
Make a deep copy of this ImpurityCalculator.
copy() - Method in class org.apache.spark.mllib.tree.impurity.GiniCalculator
Make a deep copy of this ImpurityCalculator.
copy() - Method in class org.apache.spark.mllib.tree.impurity.ImpurityCalculator
Make a deep copy of this ImpurityCalculator.
copy() - Method in class org.apache.spark.mllib.tree.impurity.VarianceCalculator
Make a deep copy of this ImpurityCalculator.
copy() - Method in class org.apache.spark.util.StatCounter
Clone this StatCounter
copyField(Row, int, MutableRow, int) - Static method in class org.apache.spark.sql.columnar.BOOLEAN
 
copyField(Row, int, MutableRow, int) - Static method in class org.apache.spark.sql.columnar.BYTE
 
copyField(Row, int, MutableRow, int) - Method in class org.apache.spark.sql.columnar.ColumnType
Copies from(fromOrdinal) to to(toOrdinal).
copyField(Row, int, MutableRow, int) - Static method in class org.apache.spark.sql.columnar.DOUBLE
 
copyField(Row, int, MutableRow, int) - Static method in class org.apache.spark.sql.columnar.FLOAT
 
copyField(Row, int, MutableRow, int) - Static method in class org.apache.spark.sql.columnar.INT
 
copyField(Row, int, MutableRow, int) - Static method in class org.apache.spark.sql.columnar.LONG
 
copyField(Row, int, MutableRow, int) - Static method in class org.apache.spark.sql.columnar.SHORT
 
copyField(Row, int, MutableRow, int) - Static method in class org.apache.spark.sql.columnar.STRING
 
copyStream(InputStream, OutputStream, boolean, boolean) - Static method in class org.apache.spark.util.Utils
Copy all data from an InputStream to an OutputStream.
cores() - Method in class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.RegisterExecutor
 
cores() - Method in class org.apache.spark.scheduler.WorkerOffer
 
coresByTaskId() - Method in class org.apache.spark.scheduler.cluster.mesos.CoarseMesosSchedulerBackend
 
corr(RDD<Object>, RDD<Object>, String) - Static method in class org.apache.spark.mllib.stat.correlation.Correlations
 
corr(RDD<Vector>) - Static method in class org.apache.spark.mllib.stat.Statistics
Compute the Pearson correlation matrix for the input RDD of Vectors.
corr(RDD<Vector>, String) - Static method in class org.apache.spark.mllib.stat.Statistics
Compute the correlation matrix for the input RDD of Vectors using the specified method.
corr(RDD<Object>, RDD<Object>) - Static method in class org.apache.spark.mllib.stat.Statistics
Compute the Pearson correlation for the input RDDs.
corr(RDD<Object>, RDD<Object>, String) - Static method in class org.apache.spark.mllib.stat.Statistics
Compute the correlation for the input RDDs using the specified method.
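A brief usage sketch for the Statistics.corr variants above, assuming an existing SparkContext named sc:

    import org.apache.spark.mllib.stat.Statistics

    val x = sc.parallelize(Seq(1.0, 2.0, 3.0, 4.0))
    val y = sc.parallelize(Seq(2.0, 4.0, 6.0, 8.0))

    val pearson  = Statistics.corr(x, y)               // Pearson by default
    val spearman = Statistics.corr(x, y, "spearman")   // Spearman's rank correlation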
Correlation - Interface in org.apache.spark.mllib.stat.correlation
Trait for correlation algorithms.
CorrelationNames - Class in org.apache.spark.mllib.stat.correlation
Maintains supported and default correlation names.
CorrelationNames() - Constructor for class org.apache.spark.mllib.stat.correlation.CorrelationNames
 
Correlations - Class in org.apache.spark.mllib.stat.correlation
Delegates computation to the specific correlation object based on the input method name.
Correlations() - Constructor for class org.apache.spark.mllib.stat.correlation.Correlations
 
corrMatrix(RDD<Vector>, String) - Static method in class org.apache.spark.mllib.stat.correlation.Correlations
 
count() - Method in interface org.apache.spark.api.java.JavaRDDLike
Return the number of elements in the RDD.
count() - Method in class org.apache.spark.graphx.impl.EdgeRDDImpl
The number of edges in the RDD.
count() - Method in class org.apache.spark.graphx.impl.VertexRDDImpl
The number of vertices in the RDD.
count() - Method in class org.apache.spark.mllib.evaluation.binary.BinaryConfusionMatrixImpl
 
count() - Method in class org.apache.spark.mllib.fpm.FPTree.Node
 
count() - Method in class org.apache.spark.mllib.stat.MultivariateOnlineSummarizer
 
count() - Method in interface org.apache.spark.mllib.stat.MultivariateStatisticalSummary
Sample size.
count() - Method in class org.apache.spark.mllib.tree.impurity.EntropyCalculator
Number of data points accounted for in the sufficient statistics.
count() - Method in class org.apache.spark.mllib.tree.impurity.GiniCalculator
Number of data points accounted for in the sufficient statistics.
count() - Method in class org.apache.spark.mllib.tree.impurity.ImpurityCalculator
Number of data points accounted for in the sufficient statistics.
count() - Method in class org.apache.spark.mllib.tree.impurity.VarianceCalculator
Number of data points accounted for in the sufficient statistics.
count() - Method in class org.apache.spark.rdd.RDD
Return the number of elements in the RDD.
count() - Method in class org.apache.spark.sql.columnar.ColumnStatisticsSchema
 
count() - Method in interface org.apache.spark.sql.columnar.ColumnStats
 
count() - Method in class org.apache.spark.sql.DataFrame
Returns the number of rows in the DataFrame.
count(Column) - Static method in class org.apache.spark.sql.functions
Aggregate function: returns the number of items in a group.
count(String) - Static method in class org.apache.spark.sql.functions
Aggregate function: returns the number of items in a group.
count() - Method in class org.apache.spark.sql.GroupedData
Count the number of rows for each group.
COUNT() - Static method in class org.apache.spark.sql.hive.HiveQl
 
count() - Method in interface org.apache.spark.sql.RDDApi
 
count() - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
Return a new DStream in which each RDD has a single element generated by counting each RDD of this DStream.
count() - Method in class org.apache.spark.streaming.dstream.DStream
Return a new DStream in which each RDD has a single element generated by counting each RDD of this DStream.
count() - Method in class org.apache.spark.util.StatCounter
 
countApprox(long, double) - Method in interface org.apache.spark.api.java.JavaRDDLike
:: Experimental :: Approximate version of count() that returns a potentially incomplete result within a timeout, even if not all tasks have finished.
countApprox(long) - Method in interface org.apache.spark.api.java.JavaRDDLike
:: Experimental :: Approximate version of count() that returns a potentially incomplete result within a timeout, even if not all tasks have finished.
countApprox(long, double) - Method in class org.apache.spark.rdd.RDD
:: Experimental :: Approximate version of count() that returns a potentially incomplete result within a timeout, even if not all tasks have finished.
countApproxDistinct(double) - Method in interface org.apache.spark.api.java.JavaRDDLike
Return approximate number of distinct elements in the RDD.
countApproxDistinct(int, int) - Method in class org.apache.spark.rdd.RDD
:: Experimental :: Return approximate number of distinct elements in the RDD.
countApproxDistinct(double) - Method in class org.apache.spark.rdd.RDD
Return approximate number of distinct elements in the RDD.
countApproxDistinctByKey(double, Partitioner) - Method in class org.apache.spark.api.java.JavaPairRDD
Return approximate number of distinct values for each key in this RDD.
countApproxDistinctByKey(double, int) - Method in class org.apache.spark.api.java.JavaPairRDD
Return approximate number of distinct values for each key in this RDD.
countApproxDistinctByKey(double) - Method in class org.apache.spark.api.java.JavaPairRDD
Return approximate number of distinct values for each key in this RDD.
countApproxDistinctByKey(int, int, Partitioner) - Method in class org.apache.spark.rdd.PairRDDFunctions
:: Experimental ::
countApproxDistinctByKey(double, Partitioner) - Method in class org.apache.spark.rdd.PairRDDFunctions
Return approximate number of distinct values for each key in this RDD.
countApproxDistinctByKey(double, int) - Method in class org.apache.spark.rdd.PairRDDFunctions
Return approximate number of distinct values for each key in this RDD.
countApproxDistinctByKey(double) - Method in class org.apache.spark.rdd.PairRDDFunctions
Return approximate number of distinct values for each key in this RDD.
countAsync() - Method in interface org.apache.spark.api.java.JavaRDDLike
The asynchronous version of count, which returns a future for counting the number of elements in this RDD.
countAsync() - Method in class org.apache.spark.rdd.AsyncRDDActions
Returns a future for counting the number of elements in the RDD.
countByKey() - Method in class org.apache.spark.api.java.JavaPairRDD
Count the number of elements for each key, and return the result to the master as a Map.
countByKey() - Method in class org.apache.spark.rdd.PairRDDFunctions
Count the number of elements for each key, collecting the results to a local Map.
countByKeyApprox(long) - Method in class org.apache.spark.api.java.JavaPairRDD
:: Experimental :: Approximate version of countByKey that can return a partial result if it does not finish within a timeout.
countByKeyApprox(long, double) - Method in class org.apache.spark.api.java.JavaPairRDD
:: Experimental :: Approximate version of countByKey that can return a partial result if it does not finish within a timeout.
countByKeyApprox(long, double) - Method in class org.apache.spark.rdd.PairRDDFunctions
:: Experimental :: Approximate version of countByKey that can return a partial result if it does not finish within a timeout.
countByValue() - Method in interface org.apache.spark.api.java.JavaRDDLike
Return the count of each unique value in this RDD as a map of (value, count) pairs.
countByValue(Ordering<T>) - Method in class org.apache.spark.rdd.RDD
Return the count of each unique value in this RDD as a local map of (value, count) pairs.
countByValue() - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
Return a new DStream in which each RDD contains the counts of each distinct value in each RDD of this DStream.
countByValue(int) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
Return a new DStream in which each RDD contains the counts of each distinct value in each RDD of this DStream.
countByValue(int, Ordering<T>) - Method in class org.apache.spark.streaming.dstream.DStream
Return a new DStream in which each RDD contains the counts of each distinct value in each RDD of this DStream.
countByValueAndWindow(Duration, Duration) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
Return a new DStream in which each RDD contains the count of distinct elements in RDDs in a sliding window over this DStream.
countByValueAndWindow(Duration, Duration, int) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
Return a new DStream in which each RDD contains the count of distinct elements in RDDs in a sliding window over this DStream.
countByValueAndWindow(Duration, Duration, int, Ordering<T>) - Method in class org.apache.spark.streaming.dstream.DStream
Return a new DStream in which each RDD contains the count of distinct elements in RDDs in a sliding window over this DStream.
countByValueApprox(long, double) - Method in interface org.apache.spark.api.java.JavaRDDLike
(Experimental) Approximate version of countByValue().
countByValueApprox(long) - Method in interface org.apache.spark.api.java.JavaRDDLike
(Experimental) Approximate version of countByValue().
countByValueApprox(long, double, Ordering<T>) - Method in class org.apache.spark.rdd.RDD
:: Experimental :: Approximate version of countByValue().
countByWindow(Duration, Duration) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
Return a new DStream in which each RDD has a single element generated by counting the number of elements in a window over this DStream.
countByWindow(Duration, Duration) - Method in class org.apache.spark.streaming.dstream.DStream
Return a new DStream in which each RDD has a single element generated by counting the number of elements in a sliding window over this DStream.
countDistinct(Column, Column...) - Static method in class org.apache.spark.sql.functions
Aggregate function: returns the number of distinct items in a group.
countDistinct(String, String...) - Static method in class org.apache.spark.sql.functions
Aggregate function: returns the number of distinct items in a group.
countDistinct(Column, Seq<Column>) - Static method in class org.apache.spark.sql.functions
Aggregate function: returns the number of distinct items in a group.
countDistinct(String, Seq<String>) - Static method in class org.apache.spark.sql.functions
Aggregate function: returns the number of distinct items in a group.
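A brief usage sketch for countDistinct in an aggregation, assuming an existing DataFrame df with hypothetical columns "department" and "name":

    import org.apache.spark.sql.functions._

    val distinctNamesPerDept = df.groupBy("department").agg(countDistinct("name"))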
counter() - Method in class org.apache.spark.partial.MeanEvaluator
 
counter() - Method in class org.apache.spark.partial.SumEvaluator
 
CountEvaluator - Class in org.apache.spark.partial
An ApproximateEvaluator for counts.
CountEvaluator(int, double) - Constructor for class org.apache.spark.partial.CountEvaluator
 
cpFile() - Method in class org.apache.spark.rdd.RDDCheckpointData
 
cpRDD() - Method in class org.apache.spark.rdd.RDDCheckpointData
 
cpState() - Method in class org.apache.spark.rdd.RDDCheckpointData
 
CPUS_PER_TASK() - Method in class org.apache.spark.scheduler.TaskSchedulerImpl
 
CR() - Method in class org.apache.spark.ui.ConsoleProgressBar
 
CreatableRelationProvider - Interface in org.apache.spark.sql.sources
 
create(boolean, boolean, boolean, int) - Static method in class org.apache.spark.api.java.StorageLevels
Deprecated.
create(boolean, boolean, boolean, boolean, int) - Static method in class org.apache.spark.api.java.StorageLevels
Create a new StorageLevel object.
create(JavaSparkContext, JdbcRDD.ConnectionFactory, String, long, long, int, Function<ResultSet, T>) - Static method in class org.apache.spark.rdd.JdbcRDD
Create an RDD that executes an SQL query on a JDBC connection and reads results.
create(JavaSparkContext, JdbcRDD.ConnectionFactory, String, long, long, int) - Static method in class org.apache.spark.rdd.JdbcRDD
Create an RDD that executes an SQL query on a JDBC connection and reads results.
create(RDD<T>, Function1<Object, Object>) - Static method in class org.apache.spark.rdd.PartitionPruningRDD
Create a PartitionPruningRDD.
create(String, LogicalPlan, Configuration, SQLContext) - Static method in class org.apache.spark.sql.parquet.ParquetRelation
Creates a new ParquetRelation and underlying Parquet file for the given LogicalPlan.
create() - Method in interface org.apache.spark.streaming.api.java.JavaStreamingContextFactory
 
create(String, int) - Static method in class org.apache.spark.streaming.kafka.Broker
 
create(String, int, long, long) - Static method in class org.apache.spark.streaming.kafka.OffsetRange
 
create(TopicAndPartition, long, long) - Static method in class org.apache.spark.streaming.kafka.OffsetRange
 
createActorSystem(String, String, int, SparkConf, SecurityManager) - Static method in class org.apache.spark.util.AkkaUtils
Creates an ActorSystem ready for remoting, with various Spark features.
createAkkaConfig() - Method in class org.apache.spark.SSLOptions
Creates an Akka configuration object which contains all the SSL settings represented by this object.
createCombiner() - Method in class org.apache.spark.Aggregator
 
createCommand(Protos.Offer, int) - Method in class org.apache.spark.scheduler.cluster.mesos.CoarseMesosSchedulerBackend
 
createCompiledClass(String, File, String, String, Seq<URL>) - Static method in class org.apache.spark.TestUtils
Creates a compiled class with the given name.
createDataFrame(RDD<A>, TypeTags.TypeTag<A>) - Method in class org.apache.spark.sql.SQLContext
:: Experimental :: Creates a DataFrame from an RDD of case classes.
createDataFrame(Seq<A>, TypeTags.TypeTag<A>) - Method in class org.apache.spark.sql.SQLContext
:: Experimental :: Creates a DataFrame from a local Seq of Product.
createDataFrame(RDD<Row>, StructType) - Method in class org.apache.spark.sql.SQLContext
:: DeveloperApi :: Creates a DataFrame from an RDD containing Rows using the given schema.
createDataFrame(JavaRDD<Row>, StructType) - Method in class org.apache.spark.sql.SQLContext
:: DeveloperApi :: Creates a DataFrame from a JavaRDD containing Rows using the given schema.
createDataFrame(JavaRDD<Row>, List<String>) - Method in class org.apache.spark.sql.SQLContext
Creates a DataFrame from a JavaRDD containing Rows by applying a sequence of column names to this RDD; the data type for each column will be inferred from the first row.
createDataFrame(RDD<?>, Class<?>) - Method in class org.apache.spark.sql.SQLContext
Applies a schema to an RDD of Java Beans.
createDataFrame(JavaRDD<?>, Class<?>) - Method in class org.apache.spark.sql.SQLContext
Applies a schema to an RDD of Java Beans.
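A brief usage sketch of createDataFrame with an explicit schema, assuming existing SparkContext sc and SQLContext sqlContext:

    import org.apache.spark.sql.Row
    import org.apache.spark.sql.types.{IntegerType, StringType, StructField, StructType}

    val schema = StructType(Seq(
      StructField("name", StringType, nullable = false),
      StructField("age", IntegerType, nullable = false)))

    val rowRDD = sc.parallelize(Seq(Row("alice", 30), Row("bob", 25)))
    val people = sqlContext.createDataFrame(rowRDD, schema)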
createDataSourceTable(String, Option<StructType>, String, Map<String, String>, boolean) - Method in class org.apache.spark.sql.hive.HiveMetastoreCatalog
Creates a data source table (a table created with USING clause) in Hive's metastore.
createDecimal(BigDecimal) - Static method in class org.apache.spark.sql.hive.HiveShim
 
createDefaultDBIfNeeded(HiveContext) - Static method in class org.apache.spark.sql.hive.HiveShim
 
createDirectory(String, String) - Static method in class org.apache.spark.util.Utils
Create a directory inside the given parent directory.
createDirectStream(StreamingContext, Map<String, String>, Map<TopicAndPartition, Object>, Function1<MessageAndMetadata<K, V>, R>, ClassTag<K>, ClassTag<V>, ClassTag<KD>, ClassTag<VD>, ClassTag<R>) - Static method in class org.apache.spark.streaming.kafka.KafkaUtils
:: Experimental :: Create an input stream that directly pulls messages from Kafka Brokers without using any receiver.
createDirectStream(StreamingContext, Map<String, String>, Set<String>, ClassTag<K>, ClassTag<V>, ClassTag<KD>, ClassTag<VD>) - Static method in class org.apache.spark.streaming.kafka.KafkaUtils
:: Experimental :: Create an input stream that directly pulls messages from Kafka Brokers without using any receiver.
createDirectStream(JavaStreamingContext, Class<K>, Class<V>, Class<KD>, Class<VD>, Class<R>, Map<String, String>, Map<TopicAndPartition, Long>, Function<MessageAndMetadata<K, V>, R>) - Static method in class org.apache.spark.streaming.kafka.KafkaUtils
:: Experimental :: Create an input stream that directly pulls messages from Kafka Brokers without using any receiver.
createDirectStream(JavaStreamingContext, Class<K>, Class<V>, Class<KD>, Class<VD>, Map<String, String>, Set<String>) - Static method in class org.apache.spark.streaming.kafka.KafkaUtils
:: Experimental :: Create an input stream that directly pulls messages from Kafka Brokers without using any receiver.
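A brief usage sketch of the receiver-less direct Kafka stream, assuming an existing StreamingContext ssc; the broker addresses and topic name are placeholders:

    import kafka.serializer.StringDecoder
    import org.apache.spark.streaming.kafka.KafkaUtils

    val kafkaParams = Map("metadata.broker.list" -> "broker1:9092,broker2:9092")
    val topics = Set("events")

    val stream = KafkaUtils.createDirectStream[String, String, StringDecoder, StringDecoder](
      ssc, kafkaParams, topics)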
createDriverEnv(SparkConf, boolean, LiveListenerBus, Option<OutputCommitCoordinator>) - Static method in class org.apache.spark.SparkEnv
Create a SparkEnv for the driver.
createDriverResultsArray() - Static method in class org.apache.spark.sql.hive.HiveShim
 
createEmpty(String, Seq<Attribute>, boolean, Configuration, SQLContext) - Static method in class org.apache.spark.sql.parquet.ParquetRelation
Creates an empty ParquetRelation and underlying Parquet file that consists only of the metadata for the given schema.
createExecutorEnv(SparkConf, String, String, int, int, boolean) - Static method in class org.apache.spark.SparkEnv
Create a SparkEnv for an executor.
createExecutorInfo(String) - Method in class org.apache.spark.scheduler.cluster.mesos.MesosSchedulerBackend
 
createExternalTable(String, String) - Method in class org.apache.spark.sql.SQLContext
:: Experimental :: Creates an external table from the given path and returns the corresponding DataFrame.
createExternalTable(String, String, String) - Method in class org.apache.spark.sql.SQLContext
:: Experimental :: Creates an external table from the given path based on a data source and returns the corresponding DataFrame.
createExternalTable(String, String, Map<String, String>) - Method in class org.apache.spark.sql.SQLContext
:: Experimental :: Creates an external table from the given path based on a data source and a set of options.
createExternalTable(String, String, Map<String, String>) - Method in class org.apache.spark.sql.SQLContext
:: Experimental :: (Scala-specific) Creates an external table from the given path based on a data source and a set of options.
createExternalTable(String, String, StructType, Map<String, String>) - Method in class org.apache.spark.sql.SQLContext
:: Experimental :: Create an external table from the given path based on a data source, a schema and a set of options.
createExternalTable(String, String, StructType, Map<String, String>) - Method in class org.apache.spark.sql.SQLContext
:: Experimental :: (Scala-specific) Create an external table from the given path based on a data source, a schema and a set of options.
createFilter(Expression) - Static method in class org.apache.spark.sql.parquet.ParquetFilters
 
createFunction() - Method in class org.apache.spark.sql.hive.HiveFunctionWrapper
 
createHistoryUI(SparkConf, SparkListenerBus, SecurityManager, String, String) - Static method in class org.apache.spark.ui.SparkUI
 
createJar(Seq<File>, File) - Static method in class org.apache.spark.TestUtils
Create a jar file that contains this set of files.
createJarWithClasses(Seq<String>, String, Seq<Tuple2<String, String>>, Seq<URL>) - Static method in class org.apache.spark.TestUtils
Create a jar that defines classes with the given names.
createJarWithFiles(Map<String, String>, File) - Static method in class org.apache.spark.TestUtils
Create a jar file containing multiple files.
createJDBCTable(String, String, boolean) - Method in class org.apache.spark.sql.DataFrame
Save this RDD to a JDBC database at url under the table name table.
createJettySslContextFactory() - Method in class org.apache.spark.SSLOptions
Creates a Jetty SSL context factory according to the SSL settings represented by this object.
createJobID(Date, int) - Static method in class org.apache.spark.SparkHadoopWriter
 
createLiveUI(SparkContext, SparkConf, SparkListenerBus, JobProgressListener, SecurityManager, String) - Static method in class org.apache.spark.ui.SparkUI
 
createMesosTask(TaskDescription, String) - Method in class org.apache.spark.scheduler.cluster.mesos.MesosSchedulerBackend
Turn a Spark TaskDescription into a Mesos task
CreateMetastoreDataSource - Class in org.apache.spark.sql.hive.execution
 
CreateMetastoreDataSource(String, Option<StructType>, String, Map<String, String>, boolean, boolean) - Constructor for class org.apache.spark.sql.hive.execution.CreateMetastoreDataSource
 
CreateMetastoreDataSourceAsSelect - Class in org.apache.spark.sql.hive.execution
 
CreateMetastoreDataSourceAsSelect(String, String, SaveMode, Map<String, String>, LogicalPlan) - Constructor for class org.apache.spark.sql.hive.execution.CreateMetastoreDataSourceAsSelect
 
createMetricsSystem(String, SparkConf, SecurityManager) - Static method in class org.apache.spark.metrics.MetricsSystem
 
createNewSparkContext(SparkConf) - Static method in class org.apache.spark.streaming.StreamingContext
 
createNewSparkContext(String, String, String, Seq<String>, Map<String, String>) - Static method in class org.apache.spark.streaming.StreamingContext
 
createPartitioner() - Method in class org.apache.spark.mllib.linalg.distributed.BlockMatrix
 
createPathFromString(String, JobConf) - Static method in class org.apache.spark.SparkHadoopWriter
 
createPathFromString(String, JobConf) - Static method in class org.apache.spark.sql.hive.SparkHiveWriterContainer
 
createPlan(String) - Static method in class org.apache.spark.sql.hive.HiveQl
Creates a LogicalPlan for a given HiveQL string.
createPlanForView(Table, Option<String>) - Static method in class org.apache.spark.sql.hive.HiveQl
 
createPollingStream(StreamingContext, String, int, StorageLevel) - Static method in class org.apache.spark.streaming.flume.FlumeUtils
Creates an input stream that is to be used with the Spark Sink deployed on a Flume agent.
createPollingStream(StreamingContext, Seq<InetSocketAddress>, StorageLevel) - Static method in class org.apache.spark.streaming.flume.FlumeUtils
Creates an input stream that is to be used with the Spark Sink deployed on a Flume agent.
createPollingStream(StreamingContext, Seq<InetSocketAddress>, StorageLevel, int, int) - Static method in class org.apache.spark.streaming.flume.FlumeUtils
Creates an input stream that is to be used with the Spark Sink deployed on a Flume agent.
createPollingStream(JavaStreamingContext, String, int) - Static method in class org.apache.spark.streaming.flume.FlumeUtils
Creates an input stream that is to be used with the Spark Sink deployed on a Flume agent.
createPollingStream(JavaStreamingContext, String, int, StorageLevel) - Static method in class org.apache.spark.streaming.flume.FlumeUtils
Creates an input stream that is to be used with the Spark Sink deployed on a Flume agent.
createPollingStream(JavaStreamingContext, InetSocketAddress[], StorageLevel) - Static method in class org.apache.spark.streaming.flume.FlumeUtils
Creates an input stream that is to be used with the Spark Sink deployed on a Flume agent.
createPollingStream(JavaStreamingContext, InetSocketAddress[], StorageLevel, int, int) - Static method in class org.apache.spark.streaming.flume.FlumeUtils
Creates an input stream that is to be used with the Spark Sink deployed on a Flume agent.
createPythonWorker(String, Map<String, String>) - Method in class org.apache.spark.SparkEnv
 
createRDD(SparkContext, Map<String, String>, OffsetRange[], ClassTag<K>, ClassTag<V>, ClassTag<KD>, ClassTag<VD>) - Static method in class org.apache.spark.streaming.kafka.KafkaUtils
Create an RDD from Kafka using offset ranges for each topic and partition.
createRDD(SparkContext, Map<String, String>, OffsetRange[], Map<TopicAndPartition, Broker>, Function1<MessageAndMetadata<K, V>, R>, ClassTag<K>, ClassTag<V>, ClassTag<KD>, ClassTag<VD>, ClassTag<R>) - Static method in class org.apache.spark.streaming.kafka.KafkaUtils
:: Experimental :: Create an RDD from Kafka using offset ranges for each topic and partition.
createRDD(JavaSparkContext, Class<K>, Class<V>, Class<KD>, Class<VD>, Map<String, String>, OffsetRange[]) - Static method in class org.apache.spark.streaming.kafka.KafkaUtils
Create an RDD from Kafka using offset ranges for each topic and partition.
createRDD(JavaSparkContext, Class<K>, Class<V>, Class<KD>, Class<VD>, Class<R>, Map<String, String>, OffsetRange[], Map<TopicAndPartition, Broker>, Function<MessageAndMetadata<K, V>, R>) - Static method in class org.apache.spark.streaming.kafka.KafkaUtils
:: Experimental :: Create an RDD from Kafka using offset ranges for each topic and partition.
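A sketch of the batch variant, assuming an existing SparkContext named sc; the broker address and offsets are illustrative:

    import kafka.serializer.StringDecoder
    import org.apache.spark.streaming.kafka.{KafkaUtils, OffsetRange}

    val kafkaParams = Map("metadata.broker.list" -> "broker1:9092")
    // topic, partition, fromOffset (inclusive), untilOffset (exclusive)
    val ranges = Array(OffsetRange("events", 0, 0L, 1000L))
    val rdd = KafkaUtils.createRDD[String, String, StringDecoder, StringDecoder](
      sc, kafkaParams, ranges)
    rdd.map(_._2).take(5).foreach(println)   // first few message values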
createRecordFilter(Seq<Expression>) - Static method in class org.apache.spark.sql.parquet.ParquetFilters
 
createRecordReader(InputSplit, TaskAttemptContext) - Method in class org.apache.spark.input.FixedLengthBinaryInputFormat
Create a FixedLengthBinaryRecordReader
createRecordReader(InputSplit, TaskAttemptContext) - Method in class org.apache.spark.input.StreamFileInputFormat
 
createRecordReader(InputSplit, TaskAttemptContext) - Method in class org.apache.spark.input.StreamInputFormat
 
createRecordReader(InputSplit, TaskAttemptContext) - Method in class org.apache.spark.input.WholeTextFileInputFormat
 
createRecordReader(InputSplit, TaskAttemptContext) - Method in class org.apache.spark.sql.parquet.FilteringParquetRowInputFormat
 
createRedirectHandler(String, String, Function1<HttpServletRequest, BoxedUnit>, String) - Static method in class org.apache.spark.ui.JettyUtils
Create a handler that always redirects the user to the given path
createRelation(SQLContext, Map<String, String>) - Method in class org.apache.spark.sql.jdbc.DefaultSource
Returns a new base relation with the given parameters.
createRelation(SQLContext, Map<String, String>) - Method in class org.apache.spark.sql.json.DefaultSource
Returns a new base relation with the given parameters.
createRelation(SQLContext, Map<String, String>, StructType) - Method in class org.apache.spark.sql.json.DefaultSource
Returns a new base relation with the given schema and parameters.
createRelation(SQLContext, SaveMode, Map<String, String>, DataFrame) - Method in class org.apache.spark.sql.json.DefaultSource
 
createRelation(SQLContext, Map<String, String>) - Method in class org.apache.spark.sql.parquet.DefaultSource
Returns a new base relation with the given parameters.
createRelation(SQLContext, Map<String, String>, StructType) - Method in class org.apache.spark.sql.parquet.DefaultSource
Returns a new base relation with the given parameters and schema.
createRelation(SQLContext, SaveMode, Map<String, String>, DataFrame) - Method in class org.apache.spark.sql.parquet.DefaultSource
Returns a new base relation with the given parameters and save given data into it.
createRelation(SQLContext, SaveMode, Map<String, String>, DataFrame) - Method in interface org.apache.spark.sql.sources.CreatableRelationProvider
Creates a relation with the given parameters based on the contents of the given DataFrame.
createRelation(SQLContext, Map<String, String>) - Method in interface org.apache.spark.sql.sources.RelationProvider
Returns a new base relation with the given parameters.
createRelation(SQLContext, Map<String, String>, StructType) - Method in interface org.apache.spark.sql.sources.SchemaRelationProvider
Returns a new base relation with the given parameters and user defined schema.
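A hedged sketch of a minimal data source built on this interface, assuming the 1.3-era sources API; the class name, option name, and column name are illustrative:

    import org.apache.spark.rdd.RDD
    import org.apache.spark.sql.{Row, SQLContext}
    import org.apache.spark.sql.sources.{BaseRelation, RelationProvider, TableScan}
    import org.apache.spark.sql.types.{StringType, StructField, StructType}

    // A one-column relation whose single row echoes the "value" option, if any.
    class EchoRelation(override val sqlContext: SQLContext, value: String)
        extends BaseRelation with TableScan {
      override def schema: StructType = StructType(Seq(StructField("value", StringType)))
      override def buildScan(): RDD[Row] = sqlContext.sparkContext.parallelize(Seq(Row(value)))
    }

    class DefaultSource extends RelationProvider {
      override def createRelation(
          sqlContext: SQLContext,
          parameters: Map[String, String]): BaseRelation =
        new EchoRelation(sqlContext, parameters.getOrElse("value", ""))
    }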
createRoutingTables(EdgeRDD<?>, Partitioner) - Static method in class org.apache.spark.graphx.VertexRDD
 
createServlet(JettyUtils.ServletParams<T>, SecurityManager, Function1<T, Object>) - Static method in class org.apache.spark.ui.JettyUtils
 
createServletHandler(String, JettyUtils.ServletParams<T>, SecurityManager, String, Function1<T, Object>) - Static method in class org.apache.spark.ui.JettyUtils
Create a context handler that responds to a request with the given path prefix
createServletHandler(String, HttpServlet, String) - Static method in class org.apache.spark.ui.JettyUtils
Create a context handler that responds to a request with the given path prefix
createSparkEnv(SparkConf, boolean, LiveListenerBus) - Method in class org.apache.spark.SparkContext
 
createStaticHandler(String, String) - Static method in class org.apache.spark.ui.JettyUtils
Create a handler for serving files from a static directory
createStream(StreamingContext, String, int, StorageLevel) - Static method in class org.apache.spark.streaming.flume.FlumeUtils
Create an input stream from a Flume source.
createStream(StreamingContext, String, int, StorageLevel, boolean) - Static method in class org.apache.spark.streaming.flume.FlumeUtils
Create an input stream from a Flume source.
createStream(JavaStreamingContext, String, int) - Static method in class org.apache.spark.streaming.flume.FlumeUtils
Creates an input stream from a Flume source.
createStream(JavaStreamingContext, String, int, StorageLevel) - Static method in class org.apache.spark.streaming.flume.FlumeUtils
Creates an input stream from a Flume source.
createStream(JavaStreamingContext, String, int, StorageLevel, boolean) - Static method in class org.apache.spark.streaming.flume.FlumeUtils
Creates an input stream from a Flume source.
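A brief sketch of the push-based variant, assuming an existing StreamingContext named ssc; the host and port are whatever the Flume agent's Avro sink is configured to push to (illustrative here):

    import org.apache.spark.streaming.flume.FlumeUtils

    // Receives Avro events pushed by Flume to this host/port.
    val flumeEvents = FlumeUtils.createStream(ssc, "0.0.0.0", 41414)
    flumeEvents.count().map(c => "Received " + c + " Flume events").print()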
createStream(StreamingContext, String, String, Map<String, Object>, StorageLevel) - Static method in class org.apache.spark.streaming.kafka.KafkaUtils
Create an input stream that pulls messages from Kafka Brokers.
createStream(StreamingContext, Map<String, String>, Map<String, Object>, StorageLevel, ClassTag<K>, ClassTag<V>, ClassTag<U>, ClassTag<T>) - Static method in class org.apache.spark.streaming.kafka.KafkaUtils
Create an input stream that pulls messages from Kafka Brokers.
createStream(JavaStreamingContext, String, String, Map<String, Integer>) - Static method in class org.apache.spark.streaming.kafka.KafkaUtils
Create an input stream that pulls messages from Kafka Brokers.
createStream(JavaStreamingContext, String, String, Map<String, Integer>, StorageLevel) - Static method in class org.apache.spark.streaming.kafka.KafkaUtils
Create an input stream that pulls messages from Kafka Brokers.
createStream(JavaStreamingContext, Class<K>, Class<V>, Class<U>, Class<T>, Map<String, String>, Map<String, Integer>, StorageLevel) - Static method in class org.apache.spark.streaming.kafka.KafkaUtils
Create an input stream that pulls messages from Kafka Brokers.
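A sketch of the receiver-based (ZooKeeper) variant, assuming an existing StreamingContext named ssc; the quorum, group id, and topic map are illustrative:

    import org.apache.spark.streaming.kafka.KafkaUtils

    // topic -> number of consumer threads used by the receiver
    val topicMap = Map("events" -> 2)
    val kafkaStream = KafkaUtils.createStream(ssc, "zk1:2181,zk2:2181", "my-consumer-group", topicMap)
    kafkaStream.map(_._2).print()   // message values per batch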
createStream(JavaStreamingContext, Map<String, String>, Map<String, Integer>, StorageLevel) - Method in class org.apache.spark.streaming.kafka.KafkaUtilsPythonHelper
 
createStream(StreamingContext, String, String, Duration, InitialPositionInStream, StorageLevel) - Static method in class org.apache.spark.streaming.kinesis.KinesisUtils
Create an InputDStream that pulls messages from a Kinesis stream.
createStream(JavaStreamingContext, String, String, Duration, InitialPositionInStream, StorageLevel) - Static method in class org.apache.spark.streaming.kinesis.KinesisUtils
Create a Java-friendly InputDStream that pulls messages from a Kinesis stream.
createStream(StreamingContext, String, String, StorageLevel) - Static method in class org.apache.spark.streaming.mqtt.MQTTUtils
Create an input stream that receives messages pushed by an MQTT publisher.
createStream(JavaStreamingContext, String, String) - Static method in class org.apache.spark.streaming.mqtt.MQTTUtils
Create an input stream that receives messages pushed by an MQTT publisher.
createStream(JavaStreamingContext, String, String, StorageLevel) - Static method in class org.apache.spark.streaming.mqtt.MQTTUtils
Create an input stream that receives messages pushed by an MQTT publisher.
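A sketch of subscribing to an MQTT topic, assuming an existing StreamingContext named ssc; the broker URL and topic are illustrative:

    import org.apache.spark.storage.StorageLevel
    import org.apache.spark.streaming.mqtt.MQTTUtils

    val messages = MQTTUtils.createStream(
      ssc, "tcp://broker.example.com:1883", "sensors/temperature", StorageLevel.MEMORY_AND_DISK_SER_2)
    messages.print()   // each record is the message payload as a String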
createStream(StreamingContext, Option<Authorization>, Seq<String>, StorageLevel) - Static method in class org.apache.spark.streaming.twitter.TwitterUtils
Create an input stream that returns tweets received from Twitter.
createStream(JavaStreamingContext) - Static method in class org.apache.spark.streaming.twitter.TwitterUtils
Create an input stream that returns tweets received from Twitter using Twitter4J's default OAuth authentication; this requires the system properties twitter4j.oauth.consumerKey, twitter4j.oauth.consumerSecret, twitter4j.oauth.accessToken and twitter4j.oauth.accessTokenSecret.
createStream(JavaStreamingContext, String[]) - Static method in class org.apache.spark.streaming.twitter.TwitterUtils
Create an input stream that returns tweets received from Twitter using Twitter4J's default OAuth authentication; this requires the system properties twitter4j.oauth.consumerKey, twitter4j.oauth.consumerSecret, twitter4j.oauth.accessToken and twitter4j.oauth.accessTokenSecret.
createStream(JavaStreamingContext, String[], StorageLevel) - Static method in class org.apache.spark.streaming.twitter.TwitterUtils
Create an input stream that returns tweets received from Twitter using Twitter4J's default OAuth authentication; this requires the system properties twitter4j.oauth.consumerKey, twitter4j.oauth.consumerSecret, twitter4j.oauth.accessToken and twitter4j.oauth.accessTokenSecret.
createStream(JavaStreamingContext, Authorization) - Static method in class org.apache.spark.streaming.twitter.TwitterUtils
Create an input stream that returns tweets received from Twitter.
createStream(JavaStreamingContext, Authorization, String[]) - Static method in class org.apache.spark.streaming.twitter.TwitterUtils
Create an input stream that returns tweets received from Twitter.
createStream(JavaStreamingContext, Authorization, String[], StorageLevel) - Static method in class org.apache.spark.streaming.twitter.TwitterUtils
Create an input stream that returns tweets received from Twitter.
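A sketch of the Scala default-OAuth form, assuming an existing StreamingContext named ssc and the four twitter4j.oauth.* system properties already set; the filter terms are illustrative:

    import org.apache.spark.streaming.twitter.TwitterUtils

    // Passing None lets Twitter4J read credentials from the system properties listed above.
    val tweets = TwitterUtils.createStream(ssc, None, Seq("spark", "bigdata"))
    tweets.map(_.getText).print()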
createStream(StreamingContext, String, Subscribe, Function1<Seq<ByteString>, Iterator<T>>, StorageLevel, SupervisorStrategy, ClassTag<T>) - Static method in class org.apache.spark.streaming.zeromq.ZeroMQUtils
Create an input stream that receives messages pushed by a ZeroMQ publisher.
createStream(JavaStreamingContext, String, Subscribe, Function<byte[][], Iterable<T>>, StorageLevel, SupervisorStrategy) - Static method in class org.apache.spark.streaming.zeromq.ZeroMQUtils
Create an input stream that receives messages pushed by a ZeroMQ publisher.
createStream(JavaStreamingContext, String, Subscribe, Function<byte[][], Iterable<T>>, StorageLevel) - Static method in class org.apache.spark.streaming.zeromq.ZeroMQUtils
Create an input stream that receives messages pushed by a ZeroMQ publisher.
createStream(JavaStreamingContext, String, Subscribe, Function<byte[][], Iterable<T>>) - Static method in class org.apache.spark.streaming.zeromq.ZeroMQUtils
Create an input stream that receives messages pushed by a ZeroMQ publisher.
createTable(String, String, Seq<Attribute>, boolean, Option<CreateTableDesc>) - Method in class org.apache.spark.sql.hive.HiveMetastoreCatalog
Create a table with the specified database, table name, table description and schema.
CreateTableAsSelect - Class in org.apache.spark.sql.hive.execution
Create table and insert the query result into it.
CreateTableAsSelect(String, String, LogicalPlan, boolean, Option<CreateTableDesc>) - Constructor for class org.apache.spark.sql.hive.execution.CreateTableAsSelect
 
CreateTables() - Method in class org.apache.spark.sql.hive.HiveMetastoreCatalog
 
CreateTableUsing - Class in org.apache.spark.sql.sources
Used to represent the operation of creating a table using a data source.
CreateTableUsing(String, Option<StructType>, String, boolean, Map<String, String>, boolean, boolean) - Constructor for class org.apache.spark.sql.sources.CreateTableUsing
 
CreateTableUsingAsSelect - Class in org.apache.spark.sql.sources
A node used to support CTAS statements and saveAsTable for the data source API.
CreateTableUsingAsSelect(String, String, boolean, SaveMode, Map<String, String>, LogicalPlan) - Constructor for class org.apache.spark.sql.sources.CreateTableUsingAsSelect
 
createTaskSetManager(TaskSet, int) - Method in class org.apache.spark.scheduler.TaskSchedulerImpl
 
createTempDir(String, String) - Static method in class org.apache.spark.util.Utils
Create a temporary directory inside the given parent directory.
createTempLocalBlock() - Method in class org.apache.spark.storage.DiskBlockManager
Produces a unique block id and File suitable for storing local intermediate results.
createTempShuffleBlock() - Method in class org.apache.spark.storage.DiskBlockManager
Produces a unique block id and File suitable for storing shuffled intermediate results.
CreateTempTableUsing - Class in org.apache.spark.sql.sources
 
CreateTempTableUsing(String, Option<StructType>, String, Map<String, String>) - Constructor for class org.apache.spark.sql.sources.CreateTempTableUsing
 
CreateTempTableUsingAsSelect - Class in org.apache.spark.sql.sources
 
CreateTempTableUsingAsSelect(String, String, SaveMode, Map<String, String>, LogicalPlan) - Constructor for class org.apache.spark.sql.sources.CreateTempTableUsingAsSelect
 
createTime() - Method in class org.apache.spark.scheduler.cluster.CoarseGrainedSchedulerBackend
 
createUsingIndex(Iterator<Product2<Object, VD2>>, ClassTag<VD2>) - Method in class org.apache.spark.graphx.impl.VertexPartitionBaseOps
Has a similar effect to aggregateUsingIndex((a, b) => a).
createWorkspace(int) - Static method in class org.apache.spark.mllib.optimization.NNLS
 
creationSite() - Method in class org.apache.spark.rdd.RDD
User code that created this RDD (e.g.
creationSite() - Method in class org.apache.spark.streaming.dstream.DStream
 
credentialsProvider() - Method in class org.apache.spark.streaming.kinesis.KinesisReceiver
 
CrossValidator - Class in org.apache.spark.ml.tuning
:: AlphaComponent :: K-fold cross validation.
CrossValidator() - Constructor for class org.apache.spark.ml.tuning.CrossValidator
 
CrossValidatorModel - Class in org.apache.spark.ml.tuning
:: AlphaComponent :: Model from k-fold cross validation.
CrossValidatorModel(CrossValidator, ParamMap, Model<?>) - Constructor for class org.apache.spark.ml.tuning.CrossValidatorModel
 
CrossValidatorParams - Interface in org.apache.spark.ml.tuning
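A hedged sketch of k-fold tuning with the alpha spark.ml API, assuming a DataFrame named training with the default label and features columns; the estimator and grid values are illustrative:

    import org.apache.spark.ml.classification.LogisticRegression
    import org.apache.spark.ml.evaluation.BinaryClassificationEvaluator
    import org.apache.spark.ml.tuning.{CrossValidator, ParamGridBuilder}

    val lr = new LogisticRegression()
    val grid = new ParamGridBuilder()
      .addGrid(lr.regParam, Array(0.01, 0.1))
      .build()
    val cv = new CrossValidator()
      .setEstimator(lr)
      .setEvaluator(new BinaryClassificationEvaluator)
      .setEstimatorParamMaps(grid)
      .setNumFolds(3)
    val cvModel = cv.fit(training)   // selects the best ParamMap by cross validation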
CSV_DEFAULT_DIR() - Method in class org.apache.spark.metrics.sink.CsvSink
 
CSV_DEFAULT_PERIOD() - Method in class org.apache.spark.metrics.sink.CsvSink
 
CSV_DEFAULT_UNIT() - Method in class org.apache.spark.metrics.sink.CsvSink
 
CSV_KEY_DIR() - Method in class org.apache.spark.metrics.sink.CsvSink
 
CSV_KEY_PERIOD() - Method in class org.apache.spark.metrics.sink.CsvSink
 
CSV_KEY_UNIT() - Method in class org.apache.spark.metrics.sink.CsvSink
 
CsvSink - Class in org.apache.spark.metrics.sink
 
CsvSink(Properties, MetricRegistry, SecurityManager) - Constructor for class org.apache.spark.metrics.sink.CsvSink
 
currentAttemptId() - Method in interface org.apache.spark.SparkStageInfo
 
currentAttemptId() - Method in class org.apache.spark.SparkStageInfoImpl
 
currentGraph() - Method in class org.apache.spark.mllib.impl.PeriodicGraphCheckpointer
 
currentInterval(Duration) - Static method in class org.apache.spark.streaming.Interval
 
currentLocalityIndex() - Method in class org.apache.spark.scheduler.TaskSetManager
 
currentResult() - Method in interface org.apache.spark.partial.ApproximateEvaluator
 
currentResult() - Method in class org.apache.spark.partial.CountEvaluator
 
currentResult() - Method in class org.apache.spark.partial.GroupedCountEvaluator
 
currentResult() - Method in class org.apache.spark.partial.GroupedMeanEvaluator
 
currentResult() - Method in class org.apache.spark.partial.GroupedSumEvaluator
 
currentResult() - Method in class org.apache.spark.partial.MeanEvaluator
 
currentResult() - Method in class org.apache.spark.partial.SumEvaluator
 
currentUnrollMemory() - Method in class org.apache.spark.storage.MemoryStore
Return the amount of memory currently occupied for unrolling blocks across all threads.
currentUnrollMemoryForThisThread() - Method in class org.apache.spark.storage.MemoryStore
Return the amount of memory currently occupied for unrolling blocks by this thread.
currPrefLocs(Partition) - Method in class org.apache.spark.rdd.PartitionCoalescer
 

D

DAGScheduler - Class in org.apache.spark.scheduler
The high-level scheduling layer that implements stage-oriented scheduling.
DAGScheduler(SparkContext, TaskScheduler, LiveListenerBus, MapOutputTrackerMaster, BlockManagerMaster, SparkEnv, Clock) - Constructor for class org.apache.spark.scheduler.DAGScheduler
 
DAGScheduler(SparkContext, TaskScheduler) - Constructor for class org.apache.spark.scheduler.DAGScheduler
 
DAGScheduler(SparkContext) - Constructor for class org.apache.spark.scheduler.DAGScheduler
 
dagScheduler() - Method in class org.apache.spark.scheduler.DAGSchedulerSource
 
dagScheduler() - Method in class org.apache.spark.scheduler.TaskSchedulerImpl
 
dagScheduler() - Method in class org.apache.spark.SparkContext
 
DAGSchedulerEvent - Interface in org.apache.spark.scheduler
Types of events that can be handled by the DAGScheduler.
DAGSchedulerEventProcessLoop - Class in org.apache.spark.scheduler
 
DAGSchedulerEventProcessLoop(DAGScheduler) - Constructor for class org.apache.spark.scheduler.DAGSchedulerEventProcessLoop
 
DAGSchedulerSource - Class in org.apache.spark.scheduler
 
DAGSchedulerSource(DAGScheduler) - Constructor for class org.apache.spark.scheduler.DAGSchedulerSource
 
data() - Method in class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.LaunchTask
 
data() - Method in class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.StatusUpdate
 
data() - Method in class org.apache.spark.storage.BlockResult
 
data() - Method in class org.apache.spark.storage.PutResult
 
data() - Method in class org.apache.spark.util.Distribution
 
data() - Method in class org.apache.spark.util.random.GapSamplingIterator
 
data() - Method in class org.apache.spark.util.random.GapSamplingReplacementIterator
 
database() - Method in class org.apache.spark.sql.hive.execution.CreateTableAsSelect
 
database() - Method in class org.apache.spark.sql.hive.HiveMetastoreCatalog.QualifiedTableName
 
databaseName() - Method in class org.apache.spark.sql.hive.MetastoreRelation
 
databaseName() - Method in class org.apache.spark.sql.sources.RefreshTable
 
dataDeserialize(BlockId, ByteBuffer, Serializer) - Method in class org.apache.spark.storage.BlockManager
Deserializes a ByteBuffer into an iterator of values and disposes of it when the end of the iterator is reached.
DataFrame - Class in org.apache.spark.sql
:: Experimental :: A distributed collection of data organized into named columns.
DataFrame(SQLContext, SQLContext.QueryExecution) - Constructor for class org.apache.spark.sql.DataFrame
 
DataFrame(SQLContext, LogicalPlan) - Constructor for class org.apache.spark.sql.DataFrame
A constructor that automatically analyzes the logical plan.
DATAFRAME_EAGER_ANALYSIS() - Static method in class org.apache.spark.sql.SQLConf
 
dataFrameEagerAnalysis() - Method in class org.apache.spark.sql.SQLConf
 
DataFrameHolder - Class in org.apache.spark.sql
A container for a DataFrame, used for implicit conversions.
DataFrameHolder(DataFrame) - Constructor for class org.apache.spark.sql.DataFrameHolder
 
dataSerialize(BlockId, Iterator<Object>, Serializer) - Method in class org.apache.spark.storage.BlockManager
Serializes into a byte buffer.
dataSerializeStream(BlockId, OutputStream, Iterator<Object>, Serializer) - Method in class org.apache.spark.storage.BlockManager
Serializes into a stream.
DataSinks() - Method in interface org.apache.spark.sql.hive.HiveStrategies
 
DataSourceStrategy - Class in org.apache.spark.sql.sources
A Strategy for planning scans over data sources defined using the sources API.
DataSourceStrategy() - Constructor for class org.apache.spark.sql.sources.DataSourceStrategy
 
dataType() - Method in class org.apache.spark.sql.columnar.NativeColumnType
 
dataType() - Method in class org.apache.spark.sql.hive.HiveGenericUdaf
 
dataType() - Method in class org.apache.spark.sql.hive.HiveGenericUdf
 
dataType() - Method in class org.apache.spark.sql.hive.HiveSimpleUdf
 
dataType() - Method in class org.apache.spark.sql.hive.HiveUdaf
 
dataType() - Method in class org.apache.spark.sql.sources.DDLParser
 
dataType() - Method in class org.apache.spark.sql.UserDefinedFunction
 
dataType() - Method in class org.apache.spark.sql.UserDefinedPythonFunction
 
DataValidators - Class in org.apache.spark.mllib.util
:: DeveloperApi :: A collection of methods used to validate data before applying ML algorithms.
DataValidators() - Constructor for class org.apache.spark.mllib.util.DataValidators
 
DATE - Class in org.apache.spark.sql.columnar
 
DATE() - Constructor for class org.apache.spark.sql.columnar.DATE
 
date() - Method in class org.apache.spark.sql.ColumnName
Creates a new AttributeReference of type date
DateColumnAccessor - Class in org.apache.spark.sql.columnar
 
DateColumnAccessor(ByteBuffer) - Constructor for class org.apache.spark.sql.columnar.DateColumnAccessor
 
DateColumnBuilder - Class in org.apache.spark.sql.columnar
 
DateColumnBuilder() - Constructor for class org.apache.spark.sql.columnar.DateColumnBuilder
 
DateColumnStats - Class in org.apache.spark.sql.columnar
 
DateColumnStats() - Constructor for class org.apache.spark.sql.columnar.DateColumnStats
 
DateConversion() - Method in class org.apache.spark.sql.jdbc.JDBCRDD
Accessor for nested Scala object
datum() - Method in class org.apache.spark.mllib.tree.impl.BaggedPoint
 
DDLException - Exception in org.apache.spark.sql.sources
The exception thrown from the DDL parser.
DDLException(String) - Constructor for exception org.apache.spark.sql.sources.DDLException
 
DDLParser - Class in org.apache.spark.sql.sources
A parser for foreign DDL commands.
DDLParser(Function1<String, LogicalPlan>) - Constructor for class org.apache.spark.sql.sources.DDLParser
 
dead(String) - Method in class org.apache.spark.scheduler.cluster.SparkDeploySchedulerBackend
 
decayFactor() - Method in class org.apache.spark.mllib.clustering.StreamingKMeans
 
decimal() - Method in class org.apache.spark.sql.ColumnName
Creates a new AttributeReference of type decimal
decimal(int, int) - Method in class org.apache.spark.sql.ColumnName
Creates a new AttributeReference of type decimal
DecimalConversion() - Method in class org.apache.spark.sql.jdbc.JDBCRDD
Accessor for nested Scala object
decimalMetadata() - Method in class org.apache.spark.sql.parquet.ParquetTypeInfo
 
decimalMetastoreString(DecimalType) - Static method in class org.apache.spark.sql.hive.HiveShim
 
decimalTypeInfo(DecimalType) - Static method in class org.apache.spark.sql.hive.HiveShim
 
decimalTypeInfoToCatalyst(PrimitiveObjectInspector) - Static method in class org.apache.spark.sql.hive.HiveShim
 
DecisionTree - Class in org.apache.spark.mllib.tree
:: Experimental :: A class which implements a decision tree learning algorithm for classification and regression.
DecisionTree(Strategy) - Constructor for class org.apache.spark.mllib.tree.DecisionTree
 
DecisionTreeMetadata - Class in org.apache.spark.mllib.tree.impl
Learning and dataset metadata for DecisionTree.
DecisionTreeMetadata(int, long, int, int, Map<Object, Object>, Set<Object>, int[], Impurity, Enumeration.Value, int, int, double, int, int) - Constructor for class org.apache.spark.mllib.tree.impl.DecisionTreeMetadata
 
DecisionTreeModel - Class in org.apache.spark.mllib.tree.model
:: Experimental :: Decision tree model for classification or regression.
DecisionTreeModel(Node, Enumeration.Value) - Constructor for class org.apache.spark.mllib.tree.model.DecisionTreeModel
 
DecisionTreeModel.SaveLoadV1_0$ - Class in org.apache.spark.mllib.tree.model
 
DecisionTreeModel.SaveLoadV1_0$() - Constructor for class org.apache.spark.mllib.tree.model.DecisionTreeModel.SaveLoadV1_0$
 
DecisionTreeModel.SaveLoadV1_0$.NodeData - Class in org.apache.spark.mllib.tree.model
Model data for model import/export
DecisionTreeModel.SaveLoadV1_0$.NodeData(int, int, org.apache.spark.mllib.tree.model.DecisionTreeModel.SaveLoadV1_0.PredictData, double, boolean, Option<org.apache.spark.mllib.tree.model.DecisionTreeModel.SaveLoadV1_0.SplitData>, Option<Object>, Option<Object>, Option<Object>) - Constructor for class org.apache.spark.mllib.tree.model.DecisionTreeModel.SaveLoadV1_0$.NodeData
 
DecisionTreeModel.SaveLoadV1_0$.PredictData - Class in org.apache.spark.mllib.tree.model
 
DecisionTreeModel.SaveLoadV1_0$.PredictData(double, double) - Constructor for class org.apache.spark.mllib.tree.model.DecisionTreeModel.SaveLoadV1_0$.PredictData
 
DecisionTreeModel.SaveLoadV1_0$.SplitData - Class in org.apache.spark.mllib.tree.model
 
DecisionTreeModel.SaveLoadV1_0$.SplitData(int, double, int, Seq<Object>) - Constructor for class org.apache.spark.mllib.tree.model.DecisionTreeModel.SaveLoadV1_0$.SplitData
 
decoder(ByteBuffer, NativeColumnType<T>) - Static method in class org.apache.spark.sql.columnar.compression.BooleanBitSet
 
decoder() - Method in interface org.apache.spark.sql.columnar.compression.CompressibleColumnAccessor
 
decoder(ByteBuffer, NativeColumnType<T>) - Method in interface org.apache.spark.sql.columnar.compression.CompressionScheme
 
Decoder<T extends org.apache.spark.sql.types.NativeType> - Interface in org.apache.spark.sql.columnar.compression
 
decoder(ByteBuffer, NativeColumnType<T>) - Static method in class org.apache.spark.sql.columnar.compression.DictionaryEncoding
 
decoder(ByteBuffer, NativeColumnType<T>) - Static method in class org.apache.spark.sql.columnar.compression.IntDelta
 
decoder(ByteBuffer, NativeColumnType<T>) - Static method in class org.apache.spark.sql.columnar.compression.LongDelta
 
decoder(ByteBuffer, NativeColumnType<T>) - Static method in class org.apache.spark.sql.columnar.compression.PassThrough
 
decoder(ByteBuffer, NativeColumnType<T>) - Static method in class org.apache.spark.sql.columnar.compression.RunLengthEncoding
 
decreaseRunningTasks(int) - Method in class org.apache.spark.scheduler.Pool
 
deepCopy() - Method in class org.apache.spark.mllib.tree.model.Node
Returns a deep copy of the subtree rooted at this node.
DEFAULT_BUFFER_SIZE() - Static method in class org.apache.spark.util.logging.RollingFileAppender
 
DEFAULT_CLEANER_TTL() - Static method in class org.apache.spark.streaming.StreamingContext
 
DEFAULT_DATA_SOURCE_NAME() - Static method in class org.apache.spark.sql.SQLConf
 
DEFAULT_LOG_DIR() - Static method in class org.apache.spark.scheduler.EventLoggingListener
 
DEFAULT_MINIMUM_SHARE() - Method in class org.apache.spark.scheduler.FairSchedulableBuilder
 
DEFAULT_PARTITION_NAME() - Static method in class org.apache.spark.sql.parquet.ParquetRelation2
 
DEFAULT_POOL_NAME() - Method in class org.apache.spark.scheduler.FairSchedulableBuilder
 
DEFAULT_POOL_NAME() - Static method in class org.apache.spark.ui.jobs.JobProgressListener
 
DEFAULT_PORT() - Static method in class org.apache.spark.ui.SparkUI
 
DEFAULT_RETAINED_JOBS() - Static method in class org.apache.spark.ui.jobs.JobProgressListener
 
DEFAULT_RETAINED_STAGES() - Static method in class org.apache.spark.ui.jobs.JobProgressListener
 
DEFAULT_SCHEDULER_FILE() - Method in class org.apache.spark.scheduler.FairSchedulableBuilder
 
DEFAULT_SCHEDULING_MODE() - Method in class org.apache.spark.scheduler.FairSchedulableBuilder
 
DEFAULT_SIZE_IN_BYTES() - Static method in class org.apache.spark.sql.SQLConf
 
DEFAULT_WEIGHT() - Method in class org.apache.spark.scheduler.FairSchedulableBuilder
 
defaultCorrName() - Static method in class org.apache.spark.mllib.stat.correlation.CorrelationNames
 
defaultDataSourceName() - Method in class org.apache.spark.sql.SQLConf
 
defaultFilter(Path) - Static method in class org.apache.spark.streaming.dstream.FileInputDStream
 
defaultFormat() - Method in class org.apache.spark.sql.hive.execution.HiveScriptIOSchema
 
defaultMinPartitions() - Method in class org.apache.spark.api.java.JavaSparkContext
Default min number of partitions for Hadoop RDDs when not given by user
defaultMinPartitions() - Method in class org.apache.spark.SparkContext
Default min number of partitions for Hadoop RDDs when not given by user. Note that we use math.min, so "defaultMinPartitions" cannot be higher than 2.
defaultMinSplits() - Method in class org.apache.spark.api.java.JavaSparkContext
Deprecated.
As of Spark 1.0.0, defaultMinSplits is deprecated; use JavaSparkContext.defaultMinPartitions() instead.
defaultMinSplits() - Method in class org.apache.spark.SparkContext
Default min number of partitions for Hadoop RDDs when not given by user
defaultParallelism() - Method in class org.apache.spark.api.java.JavaSparkContext
Default level of parallelism to use when not given by user (e.g.
defaultParallelism() - Method in class org.apache.spark.scheduler.cluster.CoarseGrainedSchedulerBackend
 
defaultParallelism() - Method in class org.apache.spark.scheduler.cluster.mesos.MesosSchedulerBackend
 
defaultParallelism() - Method in class org.apache.spark.scheduler.local.LocalBackend
 
defaultParallelism() - Method in interface org.apache.spark.scheduler.SchedulerBackend
 
defaultParallelism() - Method in interface org.apache.spark.scheduler.TaskScheduler
 
defaultParallelism() - Method in class org.apache.spark.scheduler.TaskSchedulerImpl
 
defaultParallelism() - Method in class org.apache.spark.SparkContext
Default level of parallelism to use when not given by user (e.g.
defaultParams(String) - Static method in class org.apache.spark.mllib.tree.configuration.BoostingStrategy
Returns default configuration for the boosting algorithm
defaultParams(Enumeration.Value) - Static method in class org.apache.spark.mllib.tree.configuration.BoostingStrategy
Returns default configuration for the boosting algorithm
defaultPartitioner(RDD<?>, Seq<RDD<?>>) - Static method in class org.apache.spark.Partitioner
Choose a partitioner to use for a cogroup-like operation between a number of RDDs.
defaultPartitioner(int) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
 
defaultProbabilities() - Method in class org.apache.spark.util.Distribution
 
defaultSize() - Method in class org.apache.spark.sql.columnar.ColumnType
 
defaultSizeInBytes() - Method in class org.apache.spark.sql.SQLConf
The default size in bytes to assign to a logical operator's estimation statistics.
DefaultSource - Class in org.apache.spark.sql.jdbc
Given a partitioning schematic (a column of integral type, a number of partitions, and upper and lower bounds on the column's value), generate WHERE clauses for each partition so that each row in the table appears exactly once.
DefaultSource() - Constructor for class org.apache.spark.sql.jdbc.DefaultSource
 
DefaultSource - Class in org.apache.spark.sql.json
 
DefaultSource() - Constructor for class org.apache.spark.sql.json.DefaultSource
 
DefaultSource - Class in org.apache.spark.sql.parquet
Allows creation of Parquet-based tables using the syntax:
DefaultSource() - Constructor for class org.apache.spark.sql.parquet.DefaultSource
 
defaultStategy(Enumeration.Value) - Static method in class org.apache.spark.mllib.tree.configuration.Strategy
Construct a default set of parameters for DecisionTree
defaultStrategy(String) - Static method in class org.apache.spark.mllib.tree.configuration.Strategy
Construct a default set of parameters for DecisionTree
defaultStrategy() - Static method in class org.apache.spark.streaming.receiver.ActorSupervisorStrategy
 
defaultValue() - Method in class org.apache.spark.ml.param.Param
 
DeferredObjectAdapter - Class in org.apache.spark.sql.hive
 
DeferredObjectAdapter(ObjectInspector) - Constructor for class org.apache.spark.sql.hive.DeferredObjectAdapter
 
degrees() - Method in class org.apache.spark.graphx.GraphOps
The degree of each vertex in the graph.
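A small sketch, assuming an existing Graph named graph:

    import org.apache.spark.graphx._

    val degrees: VertexRDD[Int] = graph.degrees   // vertex id -> (in + out) degree
    // Find the vertex with the highest degree.
    val (busiestVertex, degree) = degrees.reduce((a, b) => if (a._2 > b._2) a else b)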
degreesOfFreedom() - Method in class org.apache.spark.mllib.stat.test.ChiSqTestResult
 
degreesOfFreedom() - Method in interface org.apache.spark.mllib.stat.test.TestResult
Returns the degree(s) of freedom of the hypothesis test.
delaySeconds() - Method in class org.apache.spark.streaming.Checkpoint
 
delegate() - Method in class org.apache.spark.InterruptibleIterator
 
deleteAllCheckpoints() - Method in class org.apache.spark.mllib.impl.PeriodicGraphCheckpointer
Call this at the end to delete any remaining checkpoint files.
deleteAllCheckpoints() - Method in class org.apache.spark.mllib.tree.impl.NodeIdCache
Call this after training is finished to delete any remaining checkpoints.
deleteOldFiles() - Method in class org.apache.spark.util.logging.RollingFileAppender
Retain only the last few files.
deleteRecursively(File) - Static method in class org.apache.spark.util.Utils
Delete a file or directory and its contents recursively.
deleteRecursively(TachyonFile, TachyonFS) - Static method in class org.apache.spark.util.Utils
Delete a file or directory and its contents recursively.
dense(int, int, double[]) - Static method in class org.apache.spark.mllib.linalg.Matrices
Creates a column-major dense matrix.
dense(double, double...) - Static method in class org.apache.spark.mllib.linalg.Vectors
Creates a dense vector from its values.
dense(double, Seq<Object>) - Static method in class org.apache.spark.mllib.linalg.Vectors
Creates a dense vector from its values.
dense(double[]) - Static method in class org.apache.spark.mllib.linalg.Vectors
Creates a dense vector from a double array.
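A small illustrative sketch of the dense factory methods:

    import org.apache.spark.mllib.linalg.{Matrices, Vectors}

    val v1 = Vectors.dense(1.0, 0.0, 3.0)           // varargs form
    val v2 = Vectors.dense(Array(1.0, 0.0, 3.0))    // from a double array
    // 2 rows x 3 columns; values are laid out column by column.
    val m = Matrices.dense(2, 3, Array(1.0, 2.0, 3.0, 4.0, 5.0, 6.0))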
DenseMatrix - Class in org.apache.spark.mllib.linalg
Column-major dense matrix.
DenseMatrix(int, int, double[], boolean) - Constructor for class org.apache.spark.mllib.linalg.DenseMatrix
 
DenseMatrix(int, int, double[]) - Constructor for class org.apache.spark.mllib.linalg.DenseMatrix
Column-major dense matrix.
DenseVector - Class in org.apache.spark.mllib.linalg
A dense vector represented by a value array.
DenseVector(double[]) - Constructor for class org.apache.spark.mllib.linalg.DenseVector
 
dependencies() - Method in class org.apache.spark.rdd.RDD
Get the list of dependencies of this RDD, taking into account whether the RDD is checkpointed or not.
dependencies() - Method in class org.apache.spark.streaming.dstream.DStream
List of parent DStreams on which this DStream depends.
dependencies() - Method in class org.apache.spark.streaming.dstream.FilteredDStream
 
dependencies() - Method in class org.apache.spark.streaming.dstream.FlatMappedDStream
 
dependencies() - Method in class org.apache.spark.streaming.dstream.FlatMapValuedDStream
 
dependencies() - Method in class org.apache.spark.streaming.dstream.ForEachDStream
 
dependencies() - Method in class org.apache.spark.streaming.dstream.GlommedDStream
 
dependencies() - Method in class org.apache.spark.streaming.dstream.InputDStream
 
dependencies() - Method in class org.apache.spark.streaming.dstream.MapPartitionedDStream
 
dependencies() - Method in class org.apache.spark.streaming.dstream.MappedDStream
 
dependencies() - Method in class org.apache.spark.streaming.dstream.MapValuedDStream
 
dependencies() - Method in class org.apache.spark.streaming.dstream.ReducedWindowedDStream
 
dependencies() - Method in class org.apache.spark.streaming.dstream.ShuffledDStream
 
dependencies() - Method in class org.apache.spark.streaming.dstream.StateDStream
 
dependencies() - Method in class org.apache.spark.streaming.dstream.TransformedDStream
 
dependencies() - Method in class org.apache.spark.streaming.dstream.UnionDStream
 
dependencies() - Method in class org.apache.spark.streaming.dstream.WindowedDStream
 
Dependency<T> - Class in org.apache.spark
:: DeveloperApi :: Base class for dependencies.
Dependency() - Constructor for class org.apache.spark.Dependency
 
deps() - Method in class org.apache.spark.rdd.CoGroupPartition
 
depth() - Method in class org.apache.spark.mllib.tree.model.DecisionTreeModel
Get depth of tree.
DeregisterReceiver - Class in org.apache.spark.streaming.scheduler
 
DeregisterReceiver(int, String, String) - Constructor for class org.apache.spark.streaming.scheduler.DeregisterReceiver
 
desc() - Method in class org.apache.spark.serializer.SerializationDebugger.ObjectStreamClassMethods
 
desc() - Method in class org.apache.spark.sql.Column
Returns an ordering used in sorting.
desc(String) - Static method in class org.apache.spark.sql.functions
Returns a sort expression based on the descending order of the column.
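For instance, two equivalent ways to sort a DataFrame in descending order, assuming an existing DataFrame named df with an age column:

    import org.apache.spark.sql.functions.desc

    df.orderBy(desc("age"))      // using the functions helper
    df.orderBy(df("age").desc)   // using the Column API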
desc() - Method in class org.apache.spark.sql.hive.execution.CreateTableAsSelect
 
DescribeCommand - Class in org.apache.spark.sql.sources
Returned for the "DESCRIBE [EXTENDED] [dbName.]tableName" command.
DescribeCommand(LogicalPlan, boolean) - Constructor for class org.apache.spark.sql.sources.DescribeCommand
 
DescribeHiveTableCommand - Class in org.apache.spark.sql.hive.execution
Implementation for "describe [extended] table".
DescribeHiveTableCommand(MetastoreRelation, Seq<Attribute>, boolean) - Constructor for class org.apache.spark.sql.hive.execution.DescribeHiveTableCommand
 
describeTopics(int) - Method in class org.apache.spark.mllib.clustering.DistributedLDAModel
 
describeTopics(int) - Method in class org.apache.spark.mllib.clustering.LDAModel
Return the topics described by weighted terms.
describeTopics() - Method in class org.apache.spark.mllib.clustering.LDAModel
Return the topics described by weighted terms.
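A sketch of inspecting topics, assuming an existing LDAModel named ldaModel; describeTopics returns, per topic, an array of term indices paired with an array of their weights:

    // Top 5 terms per topic, printed as (termIndex, weight) pairs.
    val topics = ldaModel.describeTopics(maxTermsPerTopic = 5)
    topics.zipWithIndex.foreach { case ((terms, weights), i) =>
      println(s"topic $i: " + terms.zip(weights).mkString(", "))
    }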
describeTopics(int) - Method in class org.apache.spark.mllib.clustering.LocalLDAModel
 
description() - Method in class org.apache.spark.ExceptionFailure
 
description() - Method in class org.apache.spark.storage.StorageLevel
 
description() - Method in class org.apache.spark.ui.jobs.UIData.StageUIData
 
DeserializationStream - Class in org.apache.spark.serializer
:: DeveloperApi :: A stream for reading serialized objects.
DeserializationStream() - Constructor for class org.apache.spark.serializer.DeserializationStream
 
deserialize(Object) - Method in class org.apache.spark.mllib.linalg.VectorUDT
 
deserialize(ByteBuffer, ClassTag<T>) - Method in class org.apache.spark.serializer.JavaSerializerInstance
 
deserialize(ByteBuffer, ClassLoader, ClassTag<T>) - Method in class org.apache.spark.serializer.JavaSerializerInstance
 
deserialize(ByteBuffer, ClassTag<T>) - Method in class org.apache.spark.serializer.KryoSerializerInstance
 
deserialize(ByteBuffer, ClassLoader, ClassTag<T>) - Method in class org.apache.spark.serializer.KryoSerializerInstance
 
deserialize(ByteBuffer, ClassTag<T>) - Method in class org.apache.spark.serializer.SerializerInstance
 
deserialize(ByteBuffer, ClassLoader, ClassTag<T>) - Method in class org.apache.spark.serializer.SerializerInstance
 
deserialize(Object) - Method in class org.apache.spark.sql.test.ExamplePointUDT
 
deserialize(byte[]) - Static method in class org.apache.spark.util.Utils
Deserialize an object using Java serialization
deserialize(byte[], ClassLoader) - Static method in class org.apache.spark.util.Utils
Deserialize an object using Java serialization and the given ClassLoader
deserialized() - Method in class org.apache.spark.storage.MemoryEntry
 
deserialized() - Method in class org.apache.spark.storage.StorageLevel
 
deserializeFilterExpressions(Configuration) - Static method in class org.apache.spark.sql.parquet.ParquetFilters
Note: Inside the Hadoop API we only have access to Configuration, not to SparkContext, so we cannot use broadcasts to convey the actual filter predicate.
deserializeLongValue(byte[]) - Static method in class org.apache.spark.util.Utils
Deserialize a Long value (used for PythonPartitioner)
deserializeMapStatuses(byte[]) - Static method in class org.apache.spark.MapOutputTracker
 
deserializePlan(InputStream, Class<?>) - Method in class org.apache.spark.sql.hive.HiveFunctionWrapper
 
deserializeStream(InputStream) - Method in class org.apache.spark.serializer.JavaSerializerInstance
 
deserializeStream(InputStream, ClassLoader) - Method in class org.apache.spark.serializer.JavaSerializerInstance
 
deserializeStream(InputStream) - Method in class org.apache.spark.serializer.KryoSerializerInstance
 
deserializeStream(InputStream) - Method in class org.apache.spark.serializer.SerializerInstance
 
deserializeViaNestedStream(InputStream, SerializerInstance, Function1<DeserializationStream, BoxedUnit>) - Static method in class org.apache.spark.util.Utils
Deserialize via nested stream using specific serializer
deserializeWithDependencies(ByteBuffer) - Static method in class org.apache.spark.scheduler.Task
Deserialize the list of dependencies in a task serialized with serializeWithDependencies, and return the task itself as a serialized ByteBuffer.
destinationToken() - Static method in class org.apache.spark.sql.hive.HiveQl
 
destroy() - Method in class org.apache.spark.broadcast.Broadcast
Destroy all data and metadata related to this broadcast variable.
destroy(boolean) - Method in class org.apache.spark.broadcast.Broadcast
Destroy all data and metadata related to this broadcast variable.
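A short sketch, assuming an existing SparkContext named sc:

    val lookup = sc.broadcast(Map("a" -> 1, "b" -> 2))
    val total = sc.parallelize(Seq("a", "b", "a"))
      .map(k => lookup.value.getOrElse(k, 0))
      .reduce(_ + _)
    lookup.destroy()   // frees all data and metadata; the broadcast cannot be used afterwards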
destroyPythonWorker(String, Map<String, String>, Socket) - Method in class org.apache.spark.SparkEnv
 
destTableId() - Method in class org.apache.spark.sql.hive.ShimFileSinkDesc
 
details() - Method in class org.apache.spark.scheduler.Stage
 
details() - Method in class org.apache.spark.scheduler.StageInfo
 
determineBounds(ArrayBuffer<Tuple2<K, Object>>, int, Ordering<K>, ClassTag<K>) - Static method in class org.apache.spark.RangePartitioner
Determines the bounds for range partitioning from candidates with weights indicating how many items each represents.
DeveloperApi - Annotation Type in org.apache.spark.annotation
A lower-level, unstable API intended for developers.
df() - Method in class org.apache.spark.sql.DataFrameHolder
 
diag(Vector) - Static method in class org.apache.spark.mllib.linalg.DenseMatrix
Generate a diagonal matrix in DenseMatrix format from the supplied values.
diag(Vector) - Static method in class org.apache.spark.mllib.linalg.Matrices
Generate a diagonal matrix in Matrix format from the supplied values.
DIALECT() - Static method in class org.apache.spark.sql.SQLConf
 
dialect() - Method in class org.apache.spark.sql.SQLConf
The SQL dialect that is used when parsing queries.
DictionaryEncoding - Class in org.apache.spark.sql.columnar.compression
 
DictionaryEncoding() - Constructor for class org.apache.spark.sql.columnar.compression.DictionaryEncoding
 
DictionaryEncoding.Decoder<T extends org.apache.spark.sql.types.NativeType> - Class in org.apache.spark.sql.columnar.compression
 
DictionaryEncoding.Decoder(ByteBuffer, NativeColumnType<T>) - Constructor for class org.apache.spark.sql.columnar.compression.DictionaryEncoding.Decoder
 
DictionaryEncoding.Encoder<T extends org.apache.spark.sql.types.NativeType> - Class in org.apache.spark.sql.columnar.compression
 
DictionaryEncoding.Encoder(NativeColumnType<T>) - Constructor for class org.apache.spark.sql.columnar.compression.DictionaryEncoding.Encoder
 
diff(Self) - Method in class org.apache.spark.graphx.impl.VertexPartitionBaseOps
Hides vertices that are the same between this and other.
diff(VertexRDD<VD>) - Method in class org.apache.spark.graphx.impl.VertexRDDImpl
 
diff(VertexRDD<VD>) - Method in class org.apache.spark.graphx.VertexRDD
Hides vertices that are the same between this and other; for vertices that are different, keeps the values from other.
dir() - Method in class org.apache.spark.mllib.optimization.NNLS.Workspace
 
dir() - Method in class org.apache.spark.sql.hive.ShimFileSinkDesc
 
DirectKafkaInputDStream<K,V,U extends kafka.serializer.Decoder<K>,T extends kafka.serializer.Decoder<V>,R> - Class in org.apache.spark.streaming.kafka
A stream of KafkaRDD where each given Kafka topic/partition corresponds to an RDD partition.
DirectKafkaInputDStream(StreamingContext, Map<String, String>, Map<TopicAndPartition, Object>, Function1<MessageAndMetadata<K, V>, R>, ClassTag<K>, ClassTag<V>, ClassTag<U>, ClassTag<T>, ClassTag<R>) - Constructor for class org.apache.spark.streaming.kafka.DirectKafkaInputDStream
 
DirectKafkaInputDStream.DirectKafkaInputDStreamCheckpointData - Class in org.apache.spark.streaming.kafka
 
DirectKafkaInputDStream.DirectKafkaInputDStreamCheckpointData() - Constructor for class org.apache.spark.streaming.kafka.DirectKafkaInputDStream.DirectKafkaInputDStreamCheckpointData
 
DirectTaskResult<T> - Class in org.apache.spark.scheduler
A TaskResult that contains the task's return value and accumulator updates.
DirectTaskResult(ByteBuffer, Map<Object, Object>, TaskMetrics) - Constructor for class org.apache.spark.scheduler.DirectTaskResult
 
DirectTaskResult() - Constructor for class org.apache.spark.scheduler.DirectTaskResult
 
disableOutputSpecValidation() - Static method in class org.apache.spark.rdd.PairRDDFunctions
Allows for the spark.hadoop.validateOutputSpecs checks to be disabled on a case-by-case basis; see SPARK-4835 for more details.
disconnected(SchedulerDriver) - Method in class org.apache.spark.scheduler.cluster.mesos.CoarseMesosSchedulerBackend
 
disconnected(SchedulerDriver) - Method in class org.apache.spark.scheduler.cluster.mesos.MesosSchedulerBackend
 
disconnected() - Method in class org.apache.spark.scheduler.cluster.SparkDeploySchedulerBackend
 
DISK_ONLY - Static variable in class org.apache.spark.api.java.StorageLevels
 
DISK_ONLY() - Static method in class org.apache.spark.storage.StorageLevel
 
DISK_ONLY_2 - Static variable in class org.apache.spark.api.java.StorageLevels
 
DISK_ONLY_2() - Static method in class org.apache.spark.storage.StorageLevel
 
diskBlockManager() - Method in class org.apache.spark.storage.BlockManager
 
DiskBlockManager - Class in org.apache.spark.storage
Creates and maintains the logical mapping between logical blocks and physical on-disk locations.
DiskBlockManager(BlockManager, SparkConf) - Constructor for class org.apache.spark.storage.DiskBlockManager
 
DiskBlockObjectWriter - Class in org.apache.spark.storage
BlockObjectWriter which writes directly to a file on disk.
DiskBlockObjectWriter(BlockId, File, Serializer, int, Function1<OutputStream, OutputStream>, boolean, ShuffleWriteMetrics) - Constructor for class org.apache.spark.storage.DiskBlockObjectWriter
 
diskBytesSpilled() - Method in class org.apache.spark.ui.jobs.UIData.ExecutorSummary
 
diskBytesSpilled() - Method in class org.apache.spark.ui.jobs.UIData.StageUIData
 
diskSize() - Method in class org.apache.spark.storage.BlockManagerMessages.UpdateBlockInfo
 
diskSize() - Method in class org.apache.spark.storage.BlockStatus
 
diskSize() - Method in class org.apache.spark.storage.RDDInfo
 
diskStore() - Method in class org.apache.spark.storage.BlockManager
 
DiskStore - Class in org.apache.spark.storage
Stores BlockManager blocks on disk.
DiskStore(BlockManager, DiskBlockManager) - Constructor for class org.apache.spark.storage.DiskStore
 
diskUsed() - Method in class org.apache.spark.storage.StorageStatus
Return the disk space used by this block manager.
diskUsed() - Method in class org.apache.spark.ui.exec.ExecutorSummaryInfo
 
diskUsedByRdd(int) - Method in class org.apache.spark.storage.StorageStatus
Return the disk space used by the given RDD in this block manager in O(1) time.
dispose(ByteBuffer) - Static method in class org.apache.spark.storage.BlockManager
Attempt to clean up a ByteBuffer if it is memory-mapped.
dist(Vector) - Method in class org.apache.spark.util.Vector
 
distinct() - Method in class org.apache.spark.api.java.JavaDoubleRDD
Return a new RDD containing the distinct elements in this RDD.
distinct(int) - Method in class org.apache.spark.api.java.JavaDoubleRDD
Return a new RDD containing the distinct elements in this RDD.
distinct() - Method in class org.apache.spark.api.java.JavaPairRDD
Return a new RDD containing the distinct elements in this RDD.
distinct(int) - Method in class org.apache.spark.api.java.JavaPairRDD
Return a new RDD containing the distinct elements in this RDD.
distinct() - Method in class org.apache.spark.api.java.JavaRDD
Return a new RDD containing the distinct elements in this RDD.
distinct(int) - Method in class org.apache.spark.api.java.JavaRDD
Return a new RDD containing the distinct elements in this RDD.
distinct(int, Ordering<T>) - Method in class org.apache.spark.rdd.RDD
Return a new RDD containing the distinct elements in this RDD.
distinct() - Method in class org.apache.spark.rdd.RDD
Return a new RDD containing the distinct elements in this RDD.
distinct() - Method in class org.apache.spark.sql.DataFrame
Returns a new DataFrame that contains only the unique rows from this DataFrame.
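A quick sketch, assuming an existing SparkContext named sc and DataFrame named df:

    val unique = sc.parallelize(Seq(1, 2, 2, 3, 3, 3)).distinct()   // elements 1, 2, 3
    val repartitioned = sc.parallelize(1 to 100).distinct(4)        // also sets the partition count
    val uniqueRows = df.distinct                                    // drop duplicate DataFrame rows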
distinct() - Method in interface org.apache.spark.sql.RDDApi
 
DistributedLDAModel - Class in org.apache.spark.mllib.clustering
:: Experimental ::
DistributedLDAModel(LDA.EMOptimizer, double[]) - Constructor for class org.apache.spark.mllib.clustering.DistributedLDAModel
 
DistributedMatrix - Interface in org.apache.spark.mllib.linalg.distributed
Represents a distributively stored matrix backed by one or more RDDs.
Distribution - Class in org.apache.spark.util
Util for getting some stats from a small sample of numeric values, with some handy summary functions.
Distribution(double[], int, int) - Constructor for class org.apache.spark.util.Distribution
 
Distribution(Traversable<Object>) - Constructor for class org.apache.spark.util.Distribution
 
DIV() - Static method in class org.apache.spark.sql.hive.HiveQl
 
div(Duration) - Method in class org.apache.spark.streaming.Duration
 
divide(Object) - Method in class org.apache.spark.sql.Column
Divide this expression by another expression.
divide(double) - Method in class org.apache.spark.util.Vector
 
doc() - Method in class org.apache.spark.ml.param.Param
 
doCancelAllJobs() - Method in class org.apache.spark.scheduler.DAGScheduler
 
docConcentration() - Method in class org.apache.spark.mllib.clustering.LDA.EMOptimizer
 
doCheckpoint() - Method in class org.apache.spark.rdd.RDD
Performs the checkpointing of this RDD by saving it.
doCheckpoint() - Method in class org.apache.spark.rdd.RDDCheckpointData
 
DoCheckpoint - Class in org.apache.spark.streaming.scheduler
 
DoCheckpoint(Time) - Constructor for class org.apache.spark.streaming.scheduler.DoCheckpoint
 
doCleanupBroadcast(long, boolean) - Method in class org.apache.spark.ContextCleaner
Perform broadcast cleanup.
doCleanupRDD(int, boolean) - Method in class org.apache.spark.ContextCleaner
Perform RDD cleanup.
doCleanupShuffle(int, boolean) - Method in class org.apache.spark.ContextCleaner
Perform shuffle cleanup, asynchronously.
doesDirectoryContainAnyNewFiles(File, long) - Static method in class org.apache.spark.util.Utils
Determines if a directory contains any files newer than cutoff seconds.
doKillExecutors(Seq<String>) - Method in class org.apache.spark.scheduler.cluster.YarnSchedulerBackend
Request that the ApplicationMaster kill the specified executors.
doRequestTotalExecutors(int) - Method in class org.apache.spark.scheduler.cluster.YarnSchedulerBackend
Request executors from the ApplicationMaster by specifying the total number desired.
dot(Vector, Vector) - Static method in class org.apache.spark.mllib.linalg.BLAS
dot(x, y)
dot(Vector) - Method in class org.apache.spark.util.Vector
 
DOUBLE - Class in org.apache.spark.sql.columnar
 
DOUBLE() - Constructor for class org.apache.spark.sql.columnar.DOUBLE
 
doubleAccumulator(double) - Method in class org.apache.spark.api.java.JavaSparkContext
Create an Accumulator double variable, which tasks can "add" values to using the add method.
doubleAccumulator(double, String) - Method in class org.apache.spark.api.java.JavaSparkContext
Create an Accumulator double variable, which tasks can "add" values to using the add method.
DoubleColumnAccessor - Class in org.apache.spark.sql.columnar
 
DoubleColumnAccessor(ByteBuffer) - Constructor for class org.apache.spark.sql.columnar.DoubleColumnAccessor
 
DoubleColumnBuilder - Class in org.apache.spark.sql.columnar
 
DoubleColumnBuilder() - Constructor for class org.apache.spark.sql.columnar.DoubleColumnBuilder
 
DoubleColumnStats - Class in org.apache.spark.sql.columnar
 
DoubleColumnStats() - Constructor for class org.apache.spark.sql.columnar.DoubleColumnStats
 
DoubleConversion() - Method in class org.apache.spark.sql.jdbc.JDBCRDD
Accessor for nested Scala object
DoubleFlatMapFunction<T> - Interface in org.apache.spark.api.java.function
A function that returns zero or more records of type Double from each input record.
DoubleFunction<T> - Interface in org.apache.spark.api.java.function
A function that returns Doubles, and can be used to construct DoubleRDDs.
DoubleParam - Class in org.apache.spark.ml.param
Specialized version of Param[Double] for Java.
DoubleParam(Params, String, String, Option<Object>) - Constructor for class org.apache.spark.ml.param.DoubleParam
 
DoubleParam(Params, String, String) - Constructor for class org.apache.spark.ml.param.DoubleParam
 
DoubleRDDFunctions - Class in org.apache.spark.rdd
Extra functions available on RDDs of Doubles through an implicit conversion.
DoubleRDDFunctions(RDD<Object>) - Constructor for class org.apache.spark.rdd.DoubleRDDFunctions
 
doubleRDDToDoubleRDDFunctions(RDD<Object>) - Static method in class org.apache.spark.rdd.RDD
 
doubleRDDToDoubleRDDFunctions(RDD<Object>) - Static method in class org.apache.spark.SparkContext
 
doubleToDoubleWritable(double) - Static method in class org.apache.spark.SparkContext
 
doubleToMultiplier(double) - Static method in class org.apache.spark.util.Vector
 
doubleWritableConverter() - Static method in class org.apache.spark.SparkContext
 
doubleWritableConverter() - Static method in class org.apache.spark.WritableConverter
 
doubleWritableFactory() - Static method in class org.apache.spark.WritableFactory
 
driver() - Method in class org.apache.spark.scheduler.cluster.mesos.CoarseMesosSchedulerBackend
 
driver() - Method in class org.apache.spark.scheduler.cluster.mesos.MesosSchedulerBackend
 
DRIVER_AKKA_ACTOR_NAME() - Method in class org.apache.spark.storage.BlockManagerMaster
 
DRIVER_IDENTIFIER() - Static method in class org.apache.spark.SparkContext
 
driverActor() - Method in class org.apache.spark.scheduler.cluster.CoarseGrainedSchedulerBackend
 
driverActor() - Method in class org.apache.spark.storage.BlockManagerMaster
 
driverActorSystemName() - Static method in class org.apache.spark.SparkEnv
 
DriverQuirks - Class in org.apache.spark.sql.jdbc
Encapsulates workarounds for the extensions, quirks, and bugs in various databases.
DriverQuirks() - Constructor for class org.apache.spark.sql.jdbc.DriverQuirks
 
driverSideSetup() - Method in class org.apache.spark.sql.hive.SparkHiveWriterContainer
 
dropFromMemory(BlockId, Either<Object[], ByteBuffer>) - Method in class org.apache.spark.storage.BlockManager
Drop a block from memory, possibly putting it on disk if applicable.
droppedBlocks() - Method in class org.apache.spark.storage.PutResult
 
droppedBlocks() - Method in class org.apache.spark.storage.ResultWithDroppedBlocks
 
DropTable - Class in org.apache.spark.sql.hive.execution
Drops a table from the metastore and removes it if it is cached.
DropTable(String, boolean) - Constructor for class org.apache.spark.sql.hive.execution.DropTable
 
dropTempTable(String) - Method in class org.apache.spark.sql.SQLContext
Drops the temporary table with the given table name in the catalog.
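For example, assuming a temporary table registered under the hypothetical name "people":
    sqlContext.dropTempTable("people")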
Dst - Static variable in class org.apache.spark.graphx.TripletFields
Expose the destination and edge fields but not the source field.
dstAttr() - Method in class org.apache.spark.graphx.EdgeContext
The vertex attribute of the edge's destination vertex.
dstAttr() - Method in class org.apache.spark.graphx.EdgeTriplet
The destination vertex attribute
dstAttr() - Method in class org.apache.spark.graphx.impl.AggregatingEdgeContext
 
dstEncodedIndices() - Method in class org.apache.spark.ml.recommendation.ALS.InBlock
 
dstEncodedIndices() - Method in class org.apache.spark.ml.recommendation.ALS.UncompressedInBlock
 
dstId() - Method in class org.apache.spark.graphx.Edge
 
dstId() - Method in class org.apache.spark.graphx.EdgeContext
The vertex id of the edge's destination vertex.
dstId() - Method in class org.apache.spark.graphx.impl.AggregatingEdgeContext
 
dstId() - Method in class org.apache.spark.graphx.impl.EdgeWithLocalIds
 
dstIds() - Method in class org.apache.spark.ml.recommendation.ALS.RatingBlock
 
dstPtrs() - Method in class org.apache.spark.ml.recommendation.ALS.InBlock
 
dstream() - Method in class org.apache.spark.streaming.api.java.JavaDStream
 
dstream() - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
 
dstream() - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
 
DStream<T> - Class in org.apache.spark.streaming.dstream
A Discretized Stream (DStream), the basic abstraction in Spark Streaming, is a continuous sequence of RDDs (of the same type) representing a continuous stream of data (see org.apache.spark.rdd.RDD in the Spark core documentation for more details on RDDs).
DStream(StreamingContext, ClassTag<T>) - Constructor for class org.apache.spark.streaming.dstream.DStream
 
DStreamCheckpointData<T> - Class in org.apache.spark.streaming.dstream
 
DStreamCheckpointData(DStream<T>, ClassTag<T>) - Constructor for class org.apache.spark.streaming.dstream.DStreamCheckpointData
 
DStreamGraph - Class in org.apache.spark.streaming
 
DStreamGraph() - Constructor for class org.apache.spark.streaming.DStreamGraph
 
DTStatsAggregator - Class in org.apache.spark.mllib.tree.impl
DecisionTree statistics aggregator for a node.
DTStatsAggregator(DecisionTreeMetadata, Option<int[]>) - Constructor for class org.apache.spark.mllib.tree.impl.DTStatsAggregator
 
dtypes() - Method in class org.apache.spark.sql.DataFrame
Returns all column names and their data types as an array.
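A short Scala sketch, assuming a DataFrame df:
    df.dtypes.foreach { case (name, dataType) => println(s"$name: $dataType") }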
DummyCategoricalSplit - Class in org.apache.spark.mllib.tree.model
Split with no acceptable feature values for categorical features.
DummyCategoricalSplit(int, Enumeration.Value) - Constructor for class org.apache.spark.mllib.tree.model.DummyCategoricalSplit
 
DummyHighSplit - Class in org.apache.spark.mllib.tree.model
Split with maximum threshold for continuous features.
DummyHighSplit(int, Enumeration.Value) - Constructor for class org.apache.spark.mllib.tree.model.DummyHighSplit
 
DummyLowSplit - Class in org.apache.spark.mllib.tree.model
Split with minimum threshold for continuous features.
DummyLowSplit(int, Enumeration.Value) - Constructor for class org.apache.spark.mllib.tree.model.DummyLowSplit
 
dumpTree(Node, StringBuilder, int) - Static method in class org.apache.spark.sql.hive.HiveQl
 
duration() - Method in class org.apache.spark.scheduler.TaskInfo
 
Duration - Class in org.apache.spark.streaming
 
Duration(long) - Constructor for class org.apache.spark.streaming.Duration
 
duration() - Method in class org.apache.spark.streaming.Interval
 
Durations - Class in org.apache.spark.streaming
 
Durations() - Constructor for class org.apache.spark.streaming.Durations
 

E

e() - Method in class org.apache.spark.storage.ShuffleBlockFetcherIterator.FailureFetchResult
 
e() - Method in class org.apache.spark.streaming.scheduler.ErrorReported
 
Edge<ED> - Class in org.apache.spark.graphx
A single directed edge consisting of a source id, target id, and the data associated with the edge.
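A minimal Scala sketch constructing an edge with a String attribute (hypothetical ids and value):
    import org.apache.spark.graphx.Edge
    val e = Edge(1L, 2L, "follows")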
Edge(long, long, ED) - Constructor for class org.apache.spark.graphx.Edge
 
EdgeActiveness - Enum in org.apache.spark.graphx.impl
Criteria for filtering edges based on activeness.
edgeArraySortDataFormat() - Static method in class org.apache.spark.graphx.Edge
 
edgeArraySortDataFormat() - Static method in class org.apache.spark.graphx.impl.EdgeWithLocalIds
 
EdgeContext<VD,ED,A> - Class in org.apache.spark.graphx
Represents an edge along with its neighboring vertices and allows sending messages along the edge.
EdgeContext() - Constructor for class org.apache.spark.graphx.EdgeContext
 
EdgeDirection - Class in org.apache.spark.graphx
The direction of a directed edge relative to a vertex.
edgeListFile(SparkContext, String, boolean, int, StorageLevel, StorageLevel) - Static method in class org.apache.spark.graphx.GraphLoader
Loads a graph from an edge list formatted file where each line contains two integers: a source id and a target id.
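A sketch matching the listed signature, assuming an edge list file at a hypothetical path; the boolean and int are read here as the canonical-orientation flag and the number of edge partitions (an assumption based on the signature):
    import org.apache.spark.graphx.GraphLoader
    import org.apache.spark.storage.StorageLevel
    val graph = GraphLoader.edgeListFile(sc, "hdfs:///data/edges.txt", false, -1,
      StorageLevel.MEMORY_ONLY, StorageLevel.MEMORY_ONLY)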
EdgeOnly - Static variable in class org.apache.spark.graphx.TripletFields
Expose only the edge field and not the source or destination field.
EdgePartition<ED,VD> - Class in org.apache.spark.graphx.impl
A collection of edges, along with referenced vertex attributes and an optional active vertex set for filtering computation on the edges.
EdgePartition(int[], int[], Object, GraphXPrimitiveKeyOpenHashMap<Object, Object>, GraphXPrimitiveKeyOpenHashMap<Object, Object>, long[], Object, Option<OpenHashSet<Object>>, ClassTag<ED>, ClassTag<VD>) - Constructor for class org.apache.spark.graphx.impl.EdgePartition
 
EdgePartitionBuilder<ED,VD> - Class in org.apache.spark.graphx.impl
Constructs an EdgePartition from scratch.
EdgePartitionBuilder(int, ClassTag<ED>, ClassTag<VD>) - Constructor for class org.apache.spark.graphx.impl.EdgePartitionBuilder
 
edgePartitionToMsgs(int, EdgePartition<?, ?>) - Static method in class org.apache.spark.graphx.impl.RoutingTablePartition
Generate a `RoutingTableMessage` for each vertex referenced in `edgePartition`.
EdgeRDD<ED> - Class in org.apache.spark.graphx
EdgeRDD[ED, VD] extends RDD[Edge[ED]] by storing the edges in columnar format on each partition for performance.
EdgeRDD(SparkContext, Seq<Dependency<?>>) - Constructor for class org.apache.spark.graphx.EdgeRDD
 
EdgeRDDImpl<ED,VD> - Class in org.apache.spark.graphx.impl
 
EdgeRDDImpl(RDD<Tuple2<Object, EdgePartition<ED, VD>>>, StorageLevel, ClassTag<ED>, ClassTag<VD>) - Constructor for class org.apache.spark.graphx.impl.EdgeRDDImpl
 
edges() - Method in class org.apache.spark.graphx.Graph
An RDD containing the edges and their associated attributes.
edges() - Method in class org.apache.spark.graphx.impl.GraphImpl
 
edges() - Method in class org.apache.spark.graphx.impl.ReplicatedVertexView
 
EdgeTriplet<VD,ED> - Class in org.apache.spark.graphx
An edge triplet represents an edge along with the vertex attributes of its neighboring vertices.
EdgeTriplet() - Constructor for class org.apache.spark.graphx.EdgeTriplet
 
EdgeWithLocalIds<ED> - Class in org.apache.spark.graphx.impl
An edge that additionally carries the local (per-partition) ids of its source and destination vertices.
EdgeWithLocalIds(long, long, int, int, ED) - Constructor for class org.apache.spark.graphx.impl.EdgeWithLocalIds
 
EigenValueDecomposition - Class in org.apache.spark.mllib.linalg
:: Experimental :: Compute eigen-decomposition.
EigenValueDecomposition() - Constructor for class org.apache.spark.mllib.linalg.EigenValueDecomposition
 
Either() - Static method in class org.apache.spark.graphx.EdgeDirection
Edges originating from *or* arriving at a vertex of interest.
elementClassTag() - Method in class org.apache.spark.rdd.RDD
 
elements() - Method in class org.apache.spark.util.Vector
 
elementType() - Method in class org.apache.spark.sql.parquet.CatalystArrayContainsNullConverter
 
elementType() - Method in class org.apache.spark.sql.parquet.CatalystArrayConverter
 
elementType() - Method in class org.apache.spark.sql.parquet.CatalystNativeArrayConverter
 
emittedTaskSizeWarning() - Method in class org.apache.spark.scheduler.TaskSetManager
 
empty() - Static method in class org.apache.spark.graphx.impl.RoutingTablePartition
 
empty() - Static method in class org.apache.spark.ml.param.ParamMap
Returns an empty param map.
empty() - Static method in class org.apache.spark.storage.BlockStatus
 
empty() - Method in class org.apache.spark.util.TimeStampedHashMap
 
empty() - Method in class org.apache.spark.util.TimeStampedHashSet
 
empty() - Method in class org.apache.spark.util.TimeStampedWeakValueHashMap
 
emptyDataFrame() - Method in class org.apache.spark.sql.SQLContext
:: Experimental :: Returns a DataFrame with no rows or columns.
emptyJson() - Static method in class org.apache.spark.util.Utils
Return an empty JSON object
emptyNode(int) - Static method in class org.apache.spark.mllib.tree.model.Node
Return a node with the given node id (but nothing else set).
emptyRDD() - Method in class org.apache.spark.api.java.JavaSparkContext
Get an RDD that has no partitions or elements.
EmptyRDD<T> - Class in org.apache.spark.rdd
An RDD that has no partitions and no elements.
EmptyRDD(SparkContext, ClassTag<T>) - Constructor for class org.apache.spark.rdd.EmptyRDD
 
emptyRDD(ClassTag<T>) - Method in class org.apache.spark.SparkContext
Get an RDD that has no partitions or elements.
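For example, in Scala the ClassTag is supplied implicitly:
    val noData = sc.emptyRDD[String]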
enabled() - Method in class org.apache.spark.SSLOptions
 
enabledAlgorithms() - Method in class org.apache.spark.SSLOptions
 
enableDebugging() - Static method in class org.apache.spark.serializer.SerializationDebugger
 
enableLogForwarding() - Static method in class org.apache.spark.sql.parquet.ParquetRelation
 
encode(int, int) - Method in class org.apache.spark.ml.recommendation.ALS.LocalIndexEncoder
Encodes a (blockId, localIndex) into a single integer.
encoder(NativeColumnType<T>) - Static method in class org.apache.spark.sql.columnar.compression.BooleanBitSet
 
encoder(NativeColumnType<T>) - Method in interface org.apache.spark.sql.columnar.compression.CompressionScheme
 
encoder(NativeColumnType<T>) - Static method in class org.apache.spark.sql.columnar.compression.DictionaryEncoding
 
Encoder<T extends org.apache.spark.sql.types.NativeType> - Interface in org.apache.spark.sql.columnar.compression
 
encoder(NativeColumnType<T>) - Static method in class org.apache.spark.sql.columnar.compression.IntDelta
 
encoder(NativeColumnType<T>) - Static method in class org.apache.spark.sql.columnar.compression.LongDelta
 
encoder(NativeColumnType<T>) - Static method in class org.apache.spark.sql.columnar.compression.PassThrough
 
encoder(NativeColumnType<T>) - Static method in class org.apache.spark.sql.columnar.compression.RunLengthEncoding
 
end() - Method in class org.apache.spark.sql.parquet.CatalystArrayContainsNullConverter
 
end() - Method in class org.apache.spark.sql.parquet.CatalystArrayConverter
 
end() - Method in class org.apache.spark.sql.parquet.CatalystGroupConverter
 
end() - Method in class org.apache.spark.sql.parquet.CatalystMapConverter
 
end() - Method in class org.apache.spark.sql.parquet.CatalystNativeArrayConverter
 
end() - Method in class org.apache.spark.sql.parquet.CatalystPrimitiveRowConverter
 
end() - Method in class org.apache.spark.sql.parquet.CatalystStructConverter
 
endIdx() - Method in class org.apache.spark.util.Distribution
 
endsWith(Column) - Method in class org.apache.spark.sql.Column
String ends with.
endsWith(String) - Method in class org.apache.spark.sql.Column
String ends with another string literal.
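A minimal sketch, assuming a DataFrame df with a string column "name" (hypothetical):
    val smiths = df.filter(df("name").endsWith("smith"))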
endTime() - Method in class org.apache.spark.scheduler.ApplicationEventListener
 
endTime() - Method in class org.apache.spark.streaming.Interval
 
endTime() - Method in class org.apache.spark.streaming.util.WriteAheadLogManager.LogInfo
 
enforceCorrectType(Object, DataType) - Static method in class org.apache.spark.sql.json.JsonRDD
 
enqueueFailedTask(TaskSetManager, long, Enumeration.Value, ByteBuffer) - Method in class org.apache.spark.scheduler.TaskResultGetter
 
enqueueSuccessfulTask(TaskSetManager, long, ByteBuffer) - Method in class org.apache.spark.scheduler.TaskResultGetter
 
EnsembleCombiningStrategy - Class in org.apache.spark.mllib.tree.configuration
Enum to select ensemble combining strategy for base learners
EnsembleCombiningStrategy() - Constructor for class org.apache.spark.mllib.tree.configuration.EnsembleCombiningStrategy
 
entries() - Method in class org.apache.spark.mllib.linalg.distributed.CoordinateMatrix
 
Entropy - Class in org.apache.spark.mllib.tree.impurity
:: Experimental :: Class for calculating entropy during binary classification.
Entropy() - Constructor for class org.apache.spark.mllib.tree.impurity.Entropy
 
EntropyAggregator - Class in org.apache.spark.mllib.tree.impurity
Class for updating views of a vector of sufficient statistics, in order to compute impurity from a sample.
EntropyAggregator(int) - Constructor for class org.apache.spark.mllib.tree.impurity.EntropyAggregator
 
EntropyCalculator - Class in org.apache.spark.mllib.tree.impurity
Stores statistics for one (node, feature, bin) for calculating impurity.
EntropyCalculator(double[]) - Constructor for class org.apache.spark.mllib.tree.impurity.EntropyCalculator
 
entrySet() - Method in class org.apache.spark.api.java.JavaUtils.SerializableMapWrapper
 
env() - Method in class org.apache.spark.api.java.JavaSparkContext
 
env() - Method in class org.apache.spark.scheduler.TaskSetManager
 
env() - Method in class org.apache.spark.SparkContext
 
env() - Method in class org.apache.spark.streaming.scheduler.ReceiverTracker.ReceiverLauncher
 
env() - Method in class org.apache.spark.streaming.StreamingContext
 
environmentDetails() - Method in class org.apache.spark.scheduler.SparkListenerEnvironmentUpdate
 
environmentDetails(SparkConf, String, Seq<String>, Seq<String>) - Static method in class org.apache.spark.SparkEnv
Return a map representation of jvm information, Spark properties, system properties, and class paths.
EnvironmentListener - Class in org.apache.spark.ui.env
:: DeveloperApi :: A SparkListener that prepares information to be displayed on the EnvironmentTab
EnvironmentListener() - Constructor for class org.apache.spark.ui.env.EnvironmentListener
 
environmentListener() - Method in class org.apache.spark.ui.SparkUI
 
EnvironmentPage - Class in org.apache.spark.ui.env
 
EnvironmentPage(EnvironmentTab) - Constructor for class org.apache.spark.ui.env.EnvironmentPage
 
EnvironmentTab - Class in org.apache.spark.ui.env
 
EnvironmentTab(SparkUI) - Constructor for class org.apache.spark.ui.env.EnvironmentTab
 
environmentUpdateFromJson(JsonAST.JValue) - Static method in class org.apache.spark.util.JsonProtocol
 
environmentUpdateToJson(SparkListenerEnvironmentUpdate) - Static method in class org.apache.spark.util.JsonProtocol
 
envVars() - Method in class org.apache.spark.sql.UserDefinedPythonFunction
 
epoch() - Method in class org.apache.spark.scheduler.Task
 
epoch() - Method in class org.apache.spark.scheduler.TaskSetManager
 
EPSILON() - Static method in class org.apache.spark.mllib.util.MLUtils
 
eqNullSafe(Object) - Method in class org.apache.spark.sql.Column
Equality test that is safe for null values.
equals(Object) - Method in class org.apache.spark.graphx.EdgeDirection
 
equals(Object) - Method in class org.apache.spark.HashPartitioner
 
equals(Object) - Method in class org.apache.spark.mllib.linalg.DenseMatrix
 
equals(Object) - Method in class org.apache.spark.mllib.linalg.distributed.GridPartitioner
 
equals(Object) - Method in interface org.apache.spark.mllib.linalg.Vector
 
equals(IndexedSeq<Object>, double[], IndexedSeq<Object>, double[]) - Static method in class org.apache.spark.mllib.linalg.Vectors
Check equality between sparse/dense vectors
equals(Object) - Method in class org.apache.spark.mllib.linalg.VectorUDT
 
equals(Object) - Method in class org.apache.spark.mllib.tree.model.InformationGainStats
 
equals(Object) - Method in class org.apache.spark.mllib.tree.model.Predict
 
equals(Object) - Method in class org.apache.spark.RangePartitioner
 
equals(Object) - Method in class org.apache.spark.rdd.ParallelCollectionPartition
 
equals(Object) - Method in class org.apache.spark.scheduler.AccumulableInfo
 
equals(Object) - Method in class org.apache.spark.scheduler.cluster.ExecutorInfo
 
equals(Object) - Method in class org.apache.spark.scheduler.InputFormatInfo
 
equals(Object) - Method in class org.apache.spark.scheduler.SplitInfo
 
equals(Object) - Method in class org.apache.spark.scheduler.Stage
 
equals(Object) - Method in class org.apache.spark.sql.Column
 
equals(Object) - Method in class org.apache.spark.sql.json.JSONRelation
 
equals(Object) - Method in class org.apache.spark.sql.parquet.ParquetRelation
 
equals(Object) - Method in class org.apache.spark.sql.parquet.ParquetRelation2
 
equals(Object) - Method in class org.apache.spark.sql.sources.LogicalRelation
 
equals(Object) - Method in class org.apache.spark.storage.BlockId
 
equals(Object) - Method in class org.apache.spark.storage.BlockManagerId
 
equals(Object) - Method in class org.apache.spark.storage.StorageLevel
 
equals(Object) - Method in class org.apache.spark.streaming.kafka.Broker
Checks equality with another Broker.
equals(Object) - Method in class org.apache.spark.streaming.kafka.OffsetRange
Checks equality with another OffsetRange.
equalTo(Object) - Method in class org.apache.spark.sql.Column
Equality test.
EqualTo - Class in org.apache.spark.sql.sources
 
EqualTo(String, Object) - Constructor for class org.apache.spark.sql.sources.EqualTo
 
error(SchedulerDriver, String) - Method in class org.apache.spark.scheduler.cluster.mesos.CoarseMesosSchedulerBackend
 
error(SchedulerDriver, String) - Method in class org.apache.spark.scheduler.cluster.mesos.MesosSchedulerBackend
 
error(String) - Method in class org.apache.spark.scheduler.TaskSchedulerImpl
 
error() - Method in class org.apache.spark.streaming.scheduler.DeregisterReceiver
 
error() - Method in class org.apache.spark.streaming.scheduler.ReportError
 
errorMessage() - Method in class org.apache.spark.ui.jobs.UIData.TaskUIData
 
errorRegEx() - Static method in class org.apache.spark.sql.hive.HiveQl
 
ErrorReported - Class in org.apache.spark.streaming.scheduler
 
ErrorReported(String, Throwable) - Constructor for class org.apache.spark.streaming.scheduler.ErrorReported
 
estimate(Object) - Static method in class org.apache.spark.util.SizeEstimator
 
Estimator<M extends Model<M>> - Class in org.apache.spark.ml
:: AlphaComponent :: Abstract class for estimators that fit models to data.
Estimator() - Constructor for class org.apache.spark.ml.Estimator
 
estimator() - Method in interface org.apache.spark.ml.tuning.CrossValidatorParams
param for the estimator to be cross-validated
estimatorParamMaps() - Method in interface org.apache.spark.ml.tuning.CrossValidatorParams
param for estimator param maps
eval(Row) - Method in class org.apache.spark.sql.hive.HiveGenericUdf
 
eval(Row) - Method in class org.apache.spark.sql.hive.HiveGenericUdtf
 
eval(Row) - Method in class org.apache.spark.sql.hive.HiveSimpleUdf
 
eval(Row) - Method in class org.apache.spark.sql.hive.HiveUdafFunction
 
evaluate(DataFrame, ParamMap) - Method in class org.apache.spark.ml.evaluation.BinaryClassificationEvaluator
 
evaluate(DataFrame, ParamMap) - Method in class org.apache.spark.ml.Evaluator
Evaluates the output.
Evaluator - Class in org.apache.spark.ml
:: AlphaComponent :: Abstract class for evaluators that compute metrics from predictions.
Evaluator() - Constructor for class org.apache.spark.ml.Evaluator
 
evaluator() - Method in interface org.apache.spark.ml.tuning.CrossValidatorParams
param for the evaluator for selection
event() - Method in class org.apache.spark.streaming.flume.SparkFlumeEvent
 
eventLogCodec() - Method in class org.apache.spark.SparkContext
 
eventLogDir() - Method in class org.apache.spark.SparkContext
 
eventLogger() - Method in class org.apache.spark.SparkContext
 
EventLoggingListener - Class in org.apache.spark.scheduler
A SparkListener that logs events to persistent storage.
EventLoggingListener(String, String, SparkConf, Configuration) - Constructor for class org.apache.spark.scheduler.EventLoggingListener
 
EventLoggingListener(String, String, SparkConf) - Constructor for class org.apache.spark.scheduler.EventLoggingListener
 
EventLoop<E> - Class in org.apache.spark.util
An event loop to receive events from the caller and process all events in the event thread.
EventLoop(String) - Constructor for class org.apache.spark.util.EventLoop
 
eventProcessLoop() - Method in class org.apache.spark.scheduler.DAGScheduler
 
EventTransformer - Class in org.apache.spark.streaming.flume
A simple object that provides the implementation of readExternal and writeExternal for both the wrapper classes for Flume-style Events.
EventTransformer() - Constructor for class org.apache.spark.streaming.flume.EventTransformer
 
ExamplePoint - Class in org.apache.spark.sql.test
An example class to demonstrate UDT in Scala, Java, and Python.
ExamplePoint(double, double) - Constructor for class org.apache.spark.sql.test.ExamplePoint
 
ExamplePointUDT - Class in org.apache.spark.sql.test
User-defined type for ExamplePoint.
ExamplePointUDT() - Constructor for class org.apache.spark.sql.test.ExamplePointUDT
 
except(DataFrame) - Method in class org.apache.spark.sql.DataFrame
Returns a new DataFrame containing rows in this frame but not in another frame.
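For example, assuming two DataFrames df1 and df2 with the same schema:
    val onlyInFirst = df1.except(df2)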
exception() - Method in class org.apache.spark.scheduler.JobFailed
 
EXCEPTION_PRINT_INTERVAL() - Method in class org.apache.spark.scheduler.TaskSetManager
 
ExceptionFailure - Class in org.apache.spark
:: DeveloperApi :: Task failed due to a runtime exception.
ExceptionFailure(String, String, StackTraceElement[], String, Option<TaskMetrics>) - Constructor for class org.apache.spark.ExceptionFailure
 
ExceptionFailure(Throwable, Option<TaskMetrics>) - Constructor for class org.apache.spark.ExceptionFailure
 
exceptionFromJson(JsonAST.JValue) - Static method in class org.apache.spark.util.JsonProtocol
 
exceptionString(Throwable) - Static method in class org.apache.spark.util.Utils
Return a nice string representation of the exception.
exceptionToJson(Exception) - Static method in class org.apache.spark.util.JsonProtocol
 
execArgs() - Method in class org.apache.spark.scheduler.cluster.mesos.MesosSchedulerBackend
 
execId() - Method in class org.apache.spark.ExecutorLostFailure
 
execId() - Method in class org.apache.spark.scheduler.ExecutorAdded
 
execId() - Method in class org.apache.spark.scheduler.ExecutorLost
 
execId() - Method in class org.apache.spark.scheduler.SparkListenerExecutorMetricsUpdate
 
execId() - Method in class org.apache.spark.storage.BlockManagerMessages.RemoveExecutor
 
execute() - Method in class org.apache.spark.sql.columnar.InMemoryColumnarTableScan
 
execute() - Method in class org.apache.spark.sql.hive.execution.HiveTableScan
 
execute() - Method in class org.apache.spark.sql.hive.execution.InsertIntoHiveTable
 
execute() - Method in class org.apache.spark.sql.hive.execution.ScriptTransformation
 
execute() - Method in class org.apache.spark.sql.parquet.InsertIntoParquetTable
Inserts all rows into the Parquet file.
execute() - Method in class org.apache.spark.sql.parquet.ParquetTableScan
 
executeAndGetOutput(Seq<String>, File, Map<String, String>, boolean) - Static method in class org.apache.spark.util.Utils
Execute a command and get its output, throwing an exception if it yields a code other than 0.
executeCollect() - Method in class org.apache.spark.sql.hive.execution.InsertIntoHiveTable
 
executeCommand(Seq<String>, File, Map<String, String>, boolean) - Static method in class org.apache.spark.util.Utils
Execute a command and return the process running the command.
executor() - Method in class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.KillTask
 
executor() - Method in class org.apache.spark.streaming.CheckpointWriter
 
executor_() - Method in class org.apache.spark.streaming.receiver.Receiver
Handler object that runs the receiver.
executorActor() - Method in class org.apache.spark.scheduler.cluster.ExecutorData
 
executorActorSystemName() - Static method in class org.apache.spark.SparkEnv
 
executorAdded(String, String, String, int, int) - Method in class org.apache.spark.scheduler.cluster.SparkDeploySchedulerBackend
 
executorAdded(String, String) - Method in class org.apache.spark.scheduler.DAGScheduler
 
ExecutorAdded - Class in org.apache.spark.scheduler
 
ExecutorAdded(String, String) - Constructor for class org.apache.spark.scheduler.ExecutorAdded
 
executorAdded(String, String) - Method in class org.apache.spark.scheduler.TaskSchedulerImpl
 
executorAdded() - Method in class org.apache.spark.scheduler.TaskSetManager
 
executorAddedFromJson(JsonAST.JValue) - Static method in class org.apache.spark.util.JsonProtocol
 
executorAddedToJson(SparkListenerExecutorAdded) - Static method in class org.apache.spark.util.JsonProtocol
 
executorAddress() - Method in class org.apache.spark.scheduler.cluster.ExecutorData
 
ExecutorAllocationClient - Interface in org.apache.spark
A client that communicates with the cluster manager to request or kill executors.
ExecutorAllocationManager - Class in org.apache.spark
An agent that dynamically allocates and removes executors based on the workload.
ExecutorAllocationManager(ExecutorAllocationClient, LiveListenerBus, SparkConf) - Constructor for class org.apache.spark.ExecutorAllocationManager
 
executorAllocationManager() - Method in class org.apache.spark.SparkContext
 
ExecutorCacheTaskLocation - Class in org.apache.spark.scheduler
A location that includes both a host and an executor id on that host.
ExecutorCacheTaskLocation(String, String) - Constructor for class org.apache.spark.scheduler.ExecutorCacheTaskLocation
 
ExecutorData - Class in org.apache.spark.scheduler.cluster
Grouping of data for an executor used by CoarseGrainedSchedulerBackend.
ExecutorData(ActorRef, Address, String, int, int, Map<String, String>) - Constructor for class org.apache.spark.scheduler.cluster.ExecutorData
 
executorEnvs() - Method in class org.apache.spark.SparkContext
 
ExecutorExited - Class in org.apache.spark.scheduler
 
ExecutorExited(int) - Constructor for class org.apache.spark.scheduler.ExecutorExited
 
executorHeartbeatReceived(String, Tuple4<Object, Object, Object, TaskMetrics>[], BlockManagerId) - Method in class org.apache.spark.scheduler.DAGScheduler
Update metrics for in-progress tasks and let the master know that the BlockManager is still alive.
executorHeartbeatReceived(String, Tuple2<Object, TaskMetrics>[], BlockManagerId) - Method in interface org.apache.spark.scheduler.TaskScheduler
Update metrics for in-progress tasks and let the master know that the BlockManager is still alive.
executorHeartbeatReceived(String, Tuple2<Object, TaskMetrics>[], BlockManagerId) - Method in class org.apache.spark.scheduler.TaskSchedulerImpl
Update metrics for in-progress tasks and let the master know that the BlockManager is still alive.
executorHost() - Method in class org.apache.spark.scheduler.cluster.ExecutorData
 
executorHost() - Method in class org.apache.spark.scheduler.cluster.ExecutorInfo
 
executorId() - Method in class org.apache.spark.Heartbeat
 
executorId() - Method in class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.RegisterExecutor
 
executorId() - Method in class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.RemoveExecutor
 
executorId() - Method in class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.StatusUpdate
 
executorId() - Method in class org.apache.spark.scheduler.ExecutorCacheTaskLocation
 
executorId() - Method in class org.apache.spark.scheduler.SparkListenerExecutorAdded
 
executorId() - Method in class org.apache.spark.scheduler.SparkListenerExecutorRemoved
 
executorId() - Method in class org.apache.spark.scheduler.TaskDescription
 
executorId() - Method in class org.apache.spark.scheduler.TaskInfo
 
executorId() - Method in class org.apache.spark.scheduler.WorkerOffer
 
executorId() - Method in class org.apache.spark.SparkEnv
 
executorId() - Method in class org.apache.spark.storage.BlockManagerId
 
executorId() - Method in class org.apache.spark.storage.BlockManagerMessages.GetActorSystemHostPortForExecutor
 
executorIds() - Method in class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.KillExecutors
 
executorIdToBlockManagerId() - Method in class org.apache.spark.ui.jobs.JobProgressListener
 
executorIdToStorageStatus() - Method in class org.apache.spark.storage.StorageStatusListener
 
ExecutorInfo - Class in org.apache.spark.scheduler.cluster
:: DeveloperApi :: Stores information about an executor to pass from the scheduler to SparkListeners.
ExecutorInfo(String, int, Map<String, String>) - Constructor for class org.apache.spark.scheduler.cluster.ExecutorInfo
 
executorInfo() - Method in class org.apache.spark.scheduler.SparkListenerExecutorAdded
 
executorInfoFromJson(JsonAST.JValue) - Static method in class org.apache.spark.util.JsonProtocol
 
executorInfoToJson(ExecutorInfo) - Static method in class org.apache.spark.util.JsonProtocol
 
executorLogs() - Method in class org.apache.spark.ui.exec.ExecutorSummaryInfo
 
ExecutorLossReason - Class in org.apache.spark.scheduler
Represents an explanation for an executor or whole slave failing or exiting.
ExecutorLossReason(String) - Constructor for class org.apache.spark.scheduler.ExecutorLossReason
 
executorLost(SchedulerDriver, Protos.ExecutorID, Protos.SlaveID, int) - Method in class org.apache.spark.scheduler.cluster.mesos.CoarseMesosSchedulerBackend
 
executorLost(SchedulerDriver, Protos.ExecutorID, Protos.SlaveID, int) - Method in class org.apache.spark.scheduler.cluster.mesos.MesosSchedulerBackend
 
executorLost(String) - Method in class org.apache.spark.scheduler.DAGScheduler
 
ExecutorLost - Class in org.apache.spark.scheduler
 
ExecutorLost(String) - Constructor for class org.apache.spark.scheduler.ExecutorLost
 
executorLost(String, String) - Method in class org.apache.spark.scheduler.Pool
 
executorLost(String, String) - Method in interface org.apache.spark.scheduler.Schedulable
 
executorLost(String, ExecutorLossReason) - Method in class org.apache.spark.scheduler.TaskSchedulerImpl
 
executorLost(String, String) - Method in class org.apache.spark.scheduler.TaskSetManager
Called by TaskScheduler when an executor is lost so we can re-enqueue our tasks
ExecutorLostFailure - Class in org.apache.spark
:: DeveloperApi :: The task failed because the executor that it was running on was lost.
ExecutorLostFailure(String) - Constructor for class org.apache.spark.ExecutorLostFailure
 
executorMemory() - Method in class org.apache.spark.SparkContext
 
executorPct() - Method in class org.apache.spark.scheduler.RuntimePercentage
 
executorRemoved(String, String, Option<Object>) - Method in class org.apache.spark.scheduler.cluster.SparkDeploySchedulerBackend
 
executorRemovedFromJson(JsonAST.JValue) - Static method in class org.apache.spark.util.JsonProtocol
 
executorRemovedToJson(SparkListenerExecutorRemoved) - Static method in class org.apache.spark.util.JsonProtocol
 
executorRunTime() - Method in class org.apache.spark.ui.jobs.UIData.StageUIData
 
executorSideSetup(int, int, int) - Method in class org.apache.spark.sql.hive.SparkHiveWriterContainer
 
ExecutorsListener - Class in org.apache.spark.ui.exec
:: DeveloperApi :: A SparkListener that prepares information to be displayed on the ExecutorsTab
ExecutorsListener(StorageStatusListener) - Constructor for class org.apache.spark.ui.exec.ExecutorsListener
 
executorsListener() - Method in class org.apache.spark.ui.SparkUI
 
ExecutorsPage - Class in org.apache.spark.ui.exec
 
ExecutorsPage(ExecutorsTab, boolean) - Constructor for class org.apache.spark.ui.exec.ExecutorsPage
 
ExecutorsTab - Class in org.apache.spark.ui.exec
 
ExecutorsTab(SparkUI) - Constructor for class org.apache.spark.ui.exec.ExecutorsTab
 
executorSummary() - Method in class org.apache.spark.ui.jobs.UIData.StageUIData
 
ExecutorSummaryInfo - Class in org.apache.spark.ui.exec
Summary information about an executor to display in the UI.
ExecutorSummaryInfo(String, String, int, long, long, int, int, int, int, long, long, long, long, long, Map<String, String>) - Constructor for class org.apache.spark.ui.exec.ExecutorSummaryInfo
 
ExecutorTable - Class in org.apache.spark.ui.jobs
Stage summary grouped by executors.
ExecutorTable(int, int, StagesTab) - Constructor for class org.apache.spark.ui.jobs.ExecutorTable
 
ExecutorThreadDumpPage - Class in org.apache.spark.ui.exec
 
ExecutorThreadDumpPage(ExecutorsTab) - Constructor for class org.apache.spark.ui.exec.ExecutorThreadDumpPage
 
executorToDuration() - Method in class org.apache.spark.ui.exec.ExecutorsListener
 
executorToInputBytes() - Method in class org.apache.spark.ui.exec.ExecutorsListener
 
executorToInputRecords() - Method in class org.apache.spark.ui.exec.ExecutorsListener
 
executorToLogUrls() - Method in class org.apache.spark.ui.exec.ExecutorsListener
 
executorToOutputBytes() - Method in class org.apache.spark.ui.exec.ExecutorsListener
 
executorToOutputRecords() - Method in class org.apache.spark.ui.exec.ExecutorsListener
 
executorToShuffleRead() - Method in class org.apache.spark.ui.exec.ExecutorsListener
 
executorToShuffleWrite() - Method in class org.apache.spark.ui.exec.ExecutorsListener
 
executorToTasksActive() - Method in class org.apache.spark.ui.exec.ExecutorsListener
 
executorToTasksComplete() - Method in class org.apache.spark.ui.exec.ExecutorsListener
 
executorToTasksFailed() - Method in class org.apache.spark.ui.exec.ExecutorsListener
 
ExistingEdgePartitionBuilder<ED,VD> - Class in org.apache.spark.graphx.impl
Constructs an EdgePartition from an existing EdgePartition with the same vertex set.
ExistingEdgePartitionBuilder(GraphXPrimitiveKeyOpenHashMap<Object, Object>, long[], Object, Option<OpenHashSet<Object>>, int, ClassTag<ED>, ClassTag<VD>) - Constructor for class org.apache.spark.graphx.impl.ExistingEdgePartitionBuilder
 
exitCode() - Method in class org.apache.spark.scheduler.ExecutorExited
 
ExpectationSum - Class in org.apache.spark.mllib.clustering
 
ExpectationSum(double, double[], DenseVector<Object>[], DenseMatrix<Object>[]) - Constructor for class org.apache.spark.mllib.clustering.ExpectationSum
 
Experimental - Annotation Type in org.apache.spark.annotation
An experimental user-facing API.
experimental() - Method in class org.apache.spark.sql.SQLContext
:: Experimental :: A collection of methods that are considered experimental, but can be used to hook into the query planner for advanced functionality.
ExperimentalMethods - Class in org.apache.spark.sql
:: Experimental :: Holder for experimental methods for the bravest.
explain(boolean) - Method in class org.apache.spark.sql.Column
Prints the expression to the console for debugging purposes.
explain(boolean) - Method in class org.apache.spark.sql.DataFrame
Prints the plans (logical and physical) to the console for debugging purposes.
explain() - Method in class org.apache.spark.sql.DataFrame
Only prints the physical plan to the console for debugging purposes.
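For example:
    df.explain()      // physical plan only
    df.explain(true)  // logical and physical plans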
explainedVariance() - Method in class org.apache.spark.mllib.evaluation.RegressionMetrics
Returns the explained variance regression score.
explainParams() - Method in interface org.apache.spark.ml.param.Params
Returns the documentation of all params.
explode(Seq<Column>, Function1<Row, TraversableOnce<A>>, TypeTags.TypeTag<A>) - Method in class org.apache.spark.sql.DataFrame
(Scala-specific) Returns a new DataFrame where each row has been expanded to zero or more rows by the provided function.
explode(String, String, Function1<A, TraversableOnce<B>>, TypeTags.TypeTag<B>) - Method in class org.apache.spark.sql.DataFrame
(Scala-specific) Returns a new DataFrame where a single column has been expanded to zero or more rows by the provided function.
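A sketch of the single-column form, assuming a string column "sentence" and an output column "word" (hypothetical names):
    val words = df.explode("sentence", "word") { s: String => s.split(" ").toSeq }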
explode() - Static method in class org.apache.spark.sql.hive.HiveQl
 
ExponentialGenerator - Class in org.apache.spark.mllib.random
:: DeveloperApi :: Generates i.i.d.
ExponentialGenerator(double) - Constructor for class org.apache.spark.mllib.random.ExponentialGenerator
 
exponentialJavaRDD(JavaSparkContext, double, long, int, long) - Static method in class org.apache.spark.mllib.random.RandomRDDs
exponentialJavaRDD(JavaSparkContext, double, long, int) - Static method in class org.apache.spark.mllib.random.RandomRDDs
exponentialJavaRDD(JavaSparkContext, double, long) - Static method in class org.apache.spark.mllib.random.RandomRDDs
exponentialJavaVectorRDD(JavaSparkContext, double, long, int, int, long) - Static method in class org.apache.spark.mllib.random.RandomRDDs
exponentialJavaVectorRDD(JavaSparkContext, double, long, int, int) - Static method in class org.apache.spark.mllib.random.RandomRDDs
exponentialJavaVectorRDD(JavaSparkContext, double, long, int) - Static method in class org.apache.spark.mllib.random.RandomRDDs
exponentialRDD(SparkContext, double, long, int, long) - Static method in class org.apache.spark.mllib.random.RandomRDDs
Generates an RDD comprised of i.i.d. samples from the exponential distribution with the input mean.
exponentialVectorRDD(SparkContext, double, long, int, int, long) - Static method in class org.apache.spark.mllib.random.RandomRDDs
Generates an RDD[Vector] with vectors containing i.i.d. samples drawn from the exponential distribution with the input mean.
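A sketch of exponentialRDD matching the listed signature; the arguments are read here as mean, size, number of partitions, and seed (an assumption based on the signature):
    import org.apache.spark.mllib.random.RandomRDDs
    val samples = RandomRDDs.exponentialRDD(sc, 1.0, 100000L, 4, 11L)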
exprs() - Method in class org.apache.spark.sql.hive.HiveUdafFunction
 
ExtendedHiveQlParser - Class in org.apache.spark.sql.hive
A parser that recognizes all HiveQL constructs together with Spark SQL specific extensions.
ExtendedHiveQlParser() - Constructor for class org.apache.spark.sql.hive.ExtendedHiveQlParser
 
EXTERNAL_SORT() - Static method in class org.apache.spark.sql.SQLConf
 
externalShuffleServiceEnabled() - Method in class org.apache.spark.storage.BlockManager
 
externalSortEnabled() - Method in class org.apache.spark.sql.SQLConf
When true the planner will use the external sort, which may spill to disk.
extraCoresPerSlave() - Method in class org.apache.spark.scheduler.cluster.mesos.CoarseMesosSchedulerBackend
 
extract(long, Function1<T, Object>) - Method in class org.apache.spark.mllib.fpm.FPTree
Extracts all patterns with valid suffix and minimum count.
extract(ByteBuffer) - Static method in class org.apache.spark.sql.columnar.BOOLEAN
 
extract(ByteBuffer, MutableRow, int) - Static method in class org.apache.spark.sql.columnar.BOOLEAN
 
extract(ByteBuffer) - Static method in class org.apache.spark.sql.columnar.BYTE
 
extract(ByteBuffer, MutableRow, int) - Static method in class org.apache.spark.sql.columnar.BYTE
 
extract(ByteBuffer) - Method in class org.apache.spark.sql.columnar.ByteArrayColumnType
 
extract(ByteBuffer) - Method in class org.apache.spark.sql.columnar.ColumnType
Extracts a value out of the buffer at the buffer's current position.
extract(ByteBuffer, MutableRow, int) - Method in class org.apache.spark.sql.columnar.ColumnType
Extracts a value out of the buffer at the buffer's current position and stores it in row(ordinal).
extract(ByteBuffer) - Static method in class org.apache.spark.sql.columnar.DATE
 
extract(ByteBuffer) - Static method in class org.apache.spark.sql.columnar.DOUBLE
 
extract(ByteBuffer, MutableRow, int) - Static method in class org.apache.spark.sql.columnar.DOUBLE
 
extract(ByteBuffer) - Static method in class org.apache.spark.sql.columnar.FLOAT
 
extract(ByteBuffer, MutableRow, int) - Static method in class org.apache.spark.sql.columnar.FLOAT
 
extract(ByteBuffer) - Static method in class org.apache.spark.sql.columnar.INT
 
extract(ByteBuffer, MutableRow, int) - Static method in class org.apache.spark.sql.columnar.INT
 
extract(ByteBuffer) - Static method in class org.apache.spark.sql.columnar.LONG
 
extract(ByteBuffer, MutableRow, int) - Static method in class org.apache.spark.sql.columnar.LONG
 
extract(ByteBuffer) - Static method in class org.apache.spark.sql.columnar.SHORT
 
extract(ByteBuffer, MutableRow, int) - Static method in class org.apache.spark.sql.columnar.SHORT
 
extract(ByteBuffer) - Static method in class org.apache.spark.sql.columnar.STRING
 
extract(ByteBuffer) - Static method in class org.apache.spark.sql.columnar.TIMESTAMP
 
extractDistribution(Function1<BatchInfo, Option<Object>>) - Method in class org.apache.spark.streaming.scheduler.StatsReportListener
 
extractDoubleDistribution(Seq<Tuple2<TaskInfo, TaskMetrics>>, Function2<TaskInfo, TaskMetrics, Option<Object>>) - Static method in class org.apache.spark.scheduler.StatsReportListener
 
extractFn() - Method in class org.apache.spark.ui.JettyUtils.ServletParams
 
extractHostPortFromSparkUrl(String) - Static method in class org.apache.spark.util.Utils
Return a pair of host and port extracted from the sparkUrl.
extractLongDistribution(Seq<Tuple2<TaskInfo, TaskMetrics>>, Function2<TaskInfo, TaskMetrics, Option<Object>>) - Static method in class org.apache.spark.scheduler.StatsReportListener
 
extractMultiClassCategories(int, int) - Static method in class org.apache.spark.mllib.tree.DecisionTree
Nested method to extract list of eligible categories given an index.
extractSingle(MutableRow, int) - Method in class org.apache.spark.sql.columnar.BasicColumnAccessor
 
extractSingle(MutableRow, int) - Method in interface org.apache.spark.sql.columnar.compression.CompressibleColumnAccessor
 
extractTo(MutableRow, int) - Method in class org.apache.spark.sql.columnar.BasicColumnAccessor
 
extractTo(MutableRow, int) - Method in interface org.apache.spark.sql.columnar.ColumnAccessor
 
extractTo(MutableRow, int) - Method in interface org.apache.spark.sql.columnar.NullableColumnAccessor
 
extraStrategies() - Method in class org.apache.spark.sql.ExperimentalMethods
Allows extra strategies to be injected into the query planner at runtime.
eye(int) - Static method in class org.apache.spark.mllib.linalg.DenseMatrix
Generate an Identity Matrix in DenseMatrix format.
eye(int) - Static method in class org.apache.spark.mllib.linalg.Matrices
Generate a dense Identity Matrix in Matrix format.
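For example:
    import org.apache.spark.mllib.linalg.{DenseMatrix, Matrices}
    val i3 = DenseMatrix.eye(3)   // 3 x 3 identity as a DenseMatrix
    val m3 = Matrices.eye(3)      // 3 x 3 identity as a Matrix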

F

f() - Method in class org.apache.spark.rdd.ZippedPartitionsRDD2
 
f() - Method in class org.apache.spark.rdd.ZippedPartitionsRDD3
 
f() - Method in class org.apache.spark.rdd.ZippedPartitionsRDD4
 
f() - Method in class org.apache.spark.sql.UserDefinedFunction
 
f1Measure() - Method in class org.apache.spark.mllib.evaluation.MultilabelMetrics
Returns document-based f1-measure averaged by the number of documents
f1Measure(double) - Method in class org.apache.spark.mllib.evaluation.MultilabelMetrics
Returns f1-measure for a given label (category)
failAnalysis(String) - Method in class org.apache.spark.sql.sources.PreWriteCheck
 
failed() - Method in class org.apache.spark.scheduler.TaskInfo
 
FAILED() - Static method in class org.apache.spark.TaskState
 
failedJobs() - Method in class org.apache.spark.ui.jobs.JobProgressListener
 
failedStages() - Method in class org.apache.spark.scheduler.DAGScheduler
 
failedStages() - Method in class org.apache.spark.ui.jobs.JobProgressListener
 
FailedStageTable - Class in org.apache.spark.ui.jobs
 
FailedStageTable(Seq<StageInfo>, String, JobProgressListener, boolean) - Constructor for class org.apache.spark.ui.jobs.FailedStageTable
 
failedTasks() - Method in class org.apache.spark.ui.exec.ExecutorSummaryInfo
 
failedTasks() - Method in class org.apache.spark.ui.jobs.UIData.ExecutorSummary
 
failure() - Method in class org.apache.spark.partial.ApproximateActionListener
 
failureReason() - Method in class org.apache.spark.scheduler.StageInfo
If the stage failed, the reason why.
failuresBySlaveId() - Method in class org.apache.spark.scheduler.cluster.mesos.CoarseMesosSchedulerBackend
 
FAIR() - Static method in class org.apache.spark.scheduler.SchedulingMode
 
FAIR_SCHEDULER_PROPERTIES() - Method in class org.apache.spark.scheduler.FairSchedulableBuilder
 
FairSchedulableBuilder - Class in org.apache.spark.scheduler
 
FairSchedulableBuilder(Pool, SparkConf) - Constructor for class org.apache.spark.scheduler.FairSchedulableBuilder
 
FairSchedulingAlgorithm - Class in org.apache.spark.scheduler
 
FairSchedulingAlgorithm() - Constructor for class org.apache.spark.scheduler.FairSchedulingAlgorithm
 
fakeClassTag() - Static method in class org.apache.spark.api.java.JavaSparkContext
Produces a ClassTag[T], which is actually just a casted ClassTag[AnyRef].
fakeOutput(Seq<Attribute>) - Method in class org.apache.spark.sql.hive.HiveStrategies.ParquetConversion.PhysicalPlanHacks
 
FALSE() - Static method in class org.apache.spark.sql.hive.HiveQl
 
FalsePositiveRate - Class in org.apache.spark.mllib.evaluation.binary
False positive rate.
FalsePositiveRate() - Constructor for class org.apache.spark.mllib.evaluation.binary.FalsePositiveRate
 
falsePositiveRate(double) - Method in class org.apache.spark.mllib.evaluation.MulticlassMetrics
Returns false positive rate for a given label (category)
fastSquaredDistance(VectorWithNorm, VectorWithNorm) - Static method in class org.apache.spark.mllib.clustering.KMeans
fastSquaredDistance(Vector, double, Vector, double, double) - Static method in class org.apache.spark.mllib.util.MLUtils
Returns the squared Euclidean distance between two vectors.
feature() - Method in class org.apache.spark.mllib.tree.model.DecisionTreeModel.SaveLoadV1_0$.SplitData
 
feature() - Method in class org.apache.spark.mllib.tree.model.Split
 
featureArity() - Method in class org.apache.spark.mllib.tree.impl.DecisionTreeMetadata
 
features() - Method in class org.apache.spark.mllib.regression.LabeledPoint
 
featuresCol() - Method in interface org.apache.spark.ml.param.HasFeaturesCol
param for features column name
featureSubset() - Method in class org.apache.spark.mllib.tree.RandomForest.NodeIndexInfo
 
FeatureType - Class in org.apache.spark.mllib.tree.configuration
:: Experimental :: Enum to describe whether a feature is "continuous" or "categorical"
FeatureType() - Constructor for class org.apache.spark.mllib.tree.configuration.FeatureType
 
featureType() - Method in class org.apache.spark.mllib.tree.model.Bin
 
featureType() - Method in class org.apache.spark.mllib.tree.model.DecisionTreeModel.SaveLoadV1_0$.SplitData
 
featureType() - Method in class org.apache.spark.mllib.tree.model.Split
 
featureUpdate(int, int, double, double) - Method in class org.apache.spark.mllib.tree.impl.DTStatsAggregator
Faster version of update.
FetchFailed - Class in org.apache.spark
:: DeveloperApi :: Task failed to fetch shuffle data from a remote node.
FetchFailed(BlockManagerId, int, int, int, String) - Constructor for class org.apache.spark.FetchFailed
 
fetchFile(String, File, SparkConf, SecurityManager, Configuration, long, boolean) - Static method in class org.apache.spark.util.Utils
Download a file or directory to the target directory.
fetchHcfsFile(Path, File, FileSystem, SparkConf, Configuration, boolean, Option<String>) - Static method in class org.apache.spark.util.Utils
Fetch a file or directory from a Hadoop-compatible filesystem.
fetchPct() - Method in class org.apache.spark.scheduler.RuntimePercentage
 
field() - Method in class org.apache.spark.storage.BroadcastBlockId
 
FieldAccessFinder - Class in org.apache.spark.util
 
FieldAccessFinder(Map<Class<?>, Set<String>>) - Constructor for class org.apache.spark.util.FieldAccessFinder
 
FIFO() - Static method in class org.apache.spark.scheduler.SchedulingMode
 
FIFOSchedulableBuilder - Class in org.apache.spark.scheduler
 
FIFOSchedulableBuilder(Pool) - Constructor for class org.apache.spark.scheduler.FIFOSchedulableBuilder
 
FIFOSchedulingAlgorithm - Class in org.apache.spark.scheduler
 
FIFOSchedulingAlgorithm() - Constructor for class org.apache.spark.scheduler.FIFOSchedulingAlgorithm
 
file() - Method in class org.apache.spark.storage.FileSegment
 
file() - Method in class org.apache.spark.storage.TachyonFileSegment
 
FileAppender - Class in org.apache.spark.util.logging
Continuously appends the data from an input stream into the given file.
FileAppender(InputStream, File, int) - Constructor for class org.apache.spark.util.logging.FileAppender
 
fileDir() - Method in class org.apache.spark.HttpFileServer
 
fileExists(TachyonFile) - Method in class org.apache.spark.storage.TachyonBlockManager
 
FileInputDStream<K,V,F extends org.apache.hadoop.mapreduce.InputFormat<K,V>> - Class in org.apache.spark.streaming.dstream
This class represents an input stream that monitors a Hadoop-compatible filesystem for new files and creates a stream out of them.
FileInputDStream(StreamingContext, String, Function1<Path, Object>, boolean, Option<Configuration>, ClassTag<K>, ClassTag<V>, ClassTag<F>) - Constructor for class org.apache.spark.streaming.dstream.FileInputDStream
 
FileInputDStream.FileInputDStreamCheckpointData - Class in org.apache.spark.streaming.dstream
A custom version of the DStreamCheckpointData that stores names of Hadoop files as checkpoint data.
FileInputDStream.FileInputDStreamCheckpointData() - Constructor for class org.apache.spark.streaming.dstream.FileInputDStream.FileInputDStreamCheckpointData
 
filePath() - Method in class org.apache.spark.scheduler.cluster.SimrSchedulerBackend
 
files() - Method in class org.apache.spark.SparkContext
 
fileSegment() - Method in class org.apache.spark.storage.BlockObjectWriter
Returns the file segment of committed data that this Writer has written.
fileSegment() - Method in class org.apache.spark.storage.DiskBlockObjectWriter
 
FileSegment - Class in org.apache.spark.storage
References a particular segment of a file (potentially the entire file), based on an offset and a length.
FileSegment(File, long, long) - Constructor for class org.apache.spark.storage.FileSegment
 
fileServerSSLOptions() - Method in class org.apache.spark.SecurityManager
 
fileStream(String, Class<K>, Class<V>, Class<F>) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
Create an input stream that monitors a Hadoop-compatible filesystem for new files and reads them using the given key-value types and input format.
fileStream(String, Class<K>, Class<V>, Class<F>, Function<Path, Boolean>, boolean) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
Create an input stream that monitors a Hadoop-compatible filesystem for new files and reads them using the given key-value types and input format.
fileStream(String, Class<K>, Class<V>, Class<F>, Function<Path, Boolean>, boolean, Configuration) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
Create an input stream that monitors a Hadoop-compatible filesystem for new files and reads them using the given key-value types and input format.
fileStream(String, ClassTag<K>, ClassTag<V>, ClassTag<F>) - Method in class org.apache.spark.streaming.StreamingContext
Create an input stream that monitors a Hadoop-compatible filesystem for new files and reads them using the given key-value types and input format.
fileStream(String, Function1<Path, Object>, boolean, ClassTag<K>, ClassTag<V>, ClassTag<F>) - Method in class org.apache.spark.streaming.StreamingContext
Create an input stream that monitors a Hadoop-compatible filesystem for new files and reads them using the given key-value types and input format.
fileStream(String, Function1<Path, Object>, boolean, Configuration, ClassTag<K>, ClassTag<V>, ClassTag<F>) - Method in class org.apache.spark.streaming.StreamingContext
Create an input stream that monitors a Hadoop-compatible filesystem for new files and reads them using the given key-value types and input format.
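A minimal Scala sketch of the StreamingContext variant, assuming a StreamingContext ssc and a hypothetical directory to monitor:
    import org.apache.hadoop.io.{LongWritable, Text}
    import org.apache.hadoop.mapreduce.lib.input.TextInputFormat
    val stream = ssc.fileStream[LongWritable, Text, TextInputFormat]("hdfs:///data/incoming")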
FileSystemHelper - Class in org.apache.spark.sql.parquet
 
FileSystemHelper() - Constructor for class org.apache.spark.sql.parquet.FileSystemHelper
 
fillObject(Iterator<Writable>, Deserializer, Seq<Tuple2<Attribute, Object>>, MutableRow) - Static method in class org.apache.spark.sql.hive.HadoopTableReader
Transform all given raw Writables into Rows.
filter(Function<Double, Boolean>) - Method in class org.apache.spark.api.java.JavaDoubleRDD
Return a new RDD containing only the elements that satisfy a predicate.
filter(Function<Tuple2<K, V>, Boolean>) - Method in class org.apache.spark.api.java.JavaPairRDD
Return a new RDD containing only the elements that satisfy a predicate.
filter(Function<T, Boolean>) - Method in class org.apache.spark.api.java.JavaRDD
Return a new RDD containing only the elements that satisfy a predicate.
filter(Function1<Graph<VD, ED>, Graph<VD2, ED2>>, Function1<EdgeTriplet<VD2, ED2>, Object>, Function2<Object, VD2, Object>, ClassTag<VD2>, ClassTag<ED2>) - Method in class org.apache.spark.graphx.GraphOps
Filter the graph by computing some values to filter on, and applying the predicates.
filter(Function1<EdgeTriplet<VD, ED>, Object>, Function2<Object, VD, Object>) - Method in class org.apache.spark.graphx.impl.EdgePartition
Construct a new edge partition containing only the edges matching epred and where both vertices match vpred.
filter(Function1<EdgeTriplet<VD, ED>, Object>, Function2<Object, VD, Object>) - Method in class org.apache.spark.graphx.impl.EdgeRDDImpl
 
filter(Function2<Object, VD, Object>) - Method in class org.apache.spark.graphx.impl.VertexPartitionBaseOps
Restrict the vertex set to the set of vertices satisfying the given predicate.
filter(Function1<Tuple2<Object, VD>, Object>) - Method in class org.apache.spark.graphx.VertexRDD
Restricts the vertex set to the set of vertices satisfying the given predicate.
filter(Params) - Method in class org.apache.spark.ml.param.ParamMap
Filters this param map for the given parent.
filter(Function1<T, Object>) - Method in class org.apache.spark.rdd.RDD
Return a new RDD containing only the elements that satisfy a predicate.
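For example, keeping only the even values of an RDD[Int] named rdd (hypothetical):
    val evens = rdd.filter(_ % 2 == 0)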
filter(Column) - Method in class org.apache.spark.sql.DataFrame
Filters rows using the given condition.
filter(String) - Method in class org.apache.spark.sql.DataFrame
Filters rows using the given SQL expression.
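The two overloads are equivalent; assuming a column "age" (hypothetical):
    df.filter(df("age") > 21)
    df.filter("age > 21")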
Filter - Class in org.apache.spark.sql.sources
 
Filter() - Constructor for class org.apache.spark.sql.sources.Filter
 
filter() - Method in class org.apache.spark.storage.BlockManagerMessages.GetMatchingBlockIds
 
filter(Function<T, Boolean>) - Method in class org.apache.spark.streaming.api.java.JavaDStream
Return a new DStream containing only the elements that satisfy a predicate.
filter(Function<Tuple2<K, V>, Boolean>) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
Return a new DStream containing only the elements that satisfy a predicate.
filter(Function1<T, Object>) - Method in class org.apache.spark.streaming.dstream.DStream
Return a new DStream containing only the elements that satisfy a predicate.
filter(Function1<Tuple2<A, B>, Object>) - Method in class org.apache.spark.util.TimeStampedHashMap
 
filter(Function1<Tuple2<A, B>, Object>) - Method in class org.apache.spark.util.TimeStampedWeakValueHashMap
 
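A minimal sketch of the filter variants above on an RDD and a DataFrame; the SparkContext `sc` and DataFrame `df` are assumed to exist:

    // RDD.filter keeps only the elements satisfying the predicate.
    val evens = sc.parallelize(1 to 10).filter(_ % 2 == 0)
    evens.collect()                  // Array(2, 4, 6, 8, 10)

    // DataFrame.filter accepts either a Column or a SQL expression string.
    df.filter(df("age") > 21)
    df.filter("age > 21")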
FilteredDStream<T> - Class in org.apache.spark.streaming.dstream
 
FilteredDStream(DStream<T>, Function1<T, Object>, ClassTag<T>) - Constructor for class org.apache.spark.streaming.dstream.FilteredDStream
 
FilteringParquetRowInputFormat - Class in org.apache.spark.sql.parquet
We extend ParquetInputFormat in order to have more control over which RecordFilter we want to use.
FilteringParquetRowInputFormat() - Constructor for class org.apache.spark.sql.parquet.FilteringParquetRowInputFormat
 
filterName() - Method in class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.AddWebUIFilter
 
filterParams() - Method in class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.AddWebUIFilter
 
filterWith(Function1<Object, A>, Function2<T, A, Object>) - Method in class org.apache.spark.rdd.RDD
Filters this RDD with p, where p takes an additional parameter of type A.
finalRDD() - Method in class org.apache.spark.scheduler.JobSubmitted
 
finalStage() - Method in class org.apache.spark.scheduler.ActiveJob
 
find(Object) - Static method in class org.apache.spark.serializer.SerializationDebugger
Find the path leading to a not serializable object.
findBestSplits(RDD<BaggedPoint<TreePoint>>, DecisionTreeMetadata, Node[], Map<Object, Node[]>, Map<Object, Map<Object, RandomForest.NodeIndexInfo>>, Split[][], Bin[][], Queue<Tuple2<Object, Node>>, TimeTracker, Option<NodeIdCache>) - Static method in class org.apache.spark.mllib.tree.DecisionTree
Given a group of nodes, this finds the best split for each node.
findClass(String) - Method in class org.apache.spark.util.ParentClassLoader
 
findClosest(TraversableOnce<VectorWithNorm>, VectorWithNorm) - Static method in class org.apache.spark.mllib.clustering.KMeans
Returns the index of the closest center to the given point, as well as the squared distance.
findLeader(String, int) - Method in class org.apache.spark.streaming.kafka.KafkaCluster
 
findLeaders(Set<TopicAndPartition>) - Method in class org.apache.spark.streaming.kafka.KafkaCluster
 
findMaxTaskId(String, Configuration) - Static method in class org.apache.spark.sql.parquet.FileSystemHelper
Finds the maximum taskid in the output file names at the given path.
findSplitsForContinuousFeature(double[], DecisionTreeMetadata, int) - Static method in class org.apache.spark.mllib.tree.DecisionTree
Find splits for a continuous feature. NOTE: the returned number of splits is based on featureSamples and may differ from the specified numSplits.
findSynonyms(String, int) - Method in class org.apache.spark.mllib.feature.Word2VecModel
Find synonyms of a word
findSynonyms(Vector, int) - Method in class org.apache.spark.mllib.feature.Word2VecModel
Find synonyms of the vector representation of a word
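A hedged sketch of fitting Word2Vec and querying findSynonyms; `corpus` (an RDD[Seq[String]] of tokenized sentences) is assumed to exist and the query word is illustrative:

    import org.apache.spark.mllib.feature.Word2Vec

    val model = new Word2Vec().fit(corpus)
    // Top 5 (word, cosine similarity) pairs for an illustrative query word.
    model.findSynonyms("spark", 5).foreach { case (word, sim) =>
      println(s"$word  $sim")
    }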
finishAll() - Method in class org.apache.spark.ui.ConsoleProgressBar
Mark all stages as finished and clear the progress bar if it is shown, so that progress output does not interleave with job output.
finished() - Method in class org.apache.spark.scheduler.ActiveJob
 
finished() - Method in class org.apache.spark.scheduler.TaskInfo
 
FINISHED() - Static method in class org.apache.spark.TaskState
 
FINISHED_STATES() - Static method in class org.apache.spark.TaskState
 
finishedTasks() - Method in class org.apache.spark.partial.ApproximateActionListener
 
finishTime() - Method in class org.apache.spark.scheduler.TaskInfo
The time when the task has completed successfully (including the time to remotely fetch results, if necessary).
first() - Method in class org.apache.spark.api.java.JavaDoubleRDD
 
first() - Method in class org.apache.spark.api.java.JavaPairRDD
 
first() - Method in interface org.apache.spark.api.java.JavaRDDLike
Return the first element in this RDD.
first() - Method in class org.apache.spark.rdd.RDD
Return the first element in this RDD.
first() - Method in class org.apache.spark.sql.DataFrame
Returns the first row.
first(Column) - Static method in class org.apache.spark.sql.functions
Aggregate function: returns the first value in a group.
first(String) - Static method in class org.apache.spark.sql.functions
Aggregate function: returns the first value of a column in a group.
first() - Method in interface org.apache.spark.sql.RDDApi
 
FIRST_DELAY() - Method in class org.apache.spark.ui.ConsoleProgressBar
 
firstAvailableClass(String, String) - Method in interface org.apache.spark.mapred.SparkHadoopMapRedUtil
 
firstAvailableClass(String, String) - Method in interface org.apache.spark.mapreduce.SparkHadoopMapReduceUtil
 
fit(DataFrame, ParamPair<?>...) - Method in class org.apache.spark.ml.Estimator
Fits a single model to the input data with optional parameters.
fit(DataFrame, Seq<ParamPair<?>>) - Method in class org.apache.spark.ml.Estimator
Fits a single model to the input data with optional parameters.
fit(DataFrame, ParamMap) - Method in class org.apache.spark.ml.Estimator
Fits a single model to the input data with provided parameter map.
fit(DataFrame, ParamMap[]) - Method in class org.apache.spark.ml.Estimator
Fits multiple models to the input data with multiple sets of parameters.
fit(DataFrame, ParamMap) - Method in class org.apache.spark.ml.feature.StandardScaler
 
fit(DataFrame, ParamMap) - Method in class org.apache.spark.ml.impl.estimator.Predictor
 
fit(DataFrame, ParamMap) - Method in class org.apache.spark.ml.Pipeline
Fits the pipeline to the input dataset with additional parameters.
fit(DataFrame, ParamMap) - Method in class org.apache.spark.ml.recommendation.ALS
 
fit(DataFrame, ParamMap) - Method in class org.apache.spark.ml.tuning.CrossValidator
 
fit(RDD<LabeledPoint>) - Method in class org.apache.spark.mllib.feature.ChiSqSelector
Returns a ChiSquared feature selector.
fit(RDD<Vector>) - Method in class org.apache.spark.mllib.feature.IDF
Computes the inverse document frequency.
fit(JavaRDD<Vector>) - Method in class org.apache.spark.mllib.feature.IDF
Computes the inverse document frequency.
fit(RDD<Vector>) - Method in class org.apache.spark.mllib.feature.StandardScaler
Computes the mean and variance and stores as a model to be used for later scaling.
fit(RDD<S>) - Method in class org.apache.spark.mllib.feature.Word2Vec
Computes the vector representation of each word in vocabulary.
fit(JavaRDD<S>) - Method in class org.apache.spark.mllib.feature.Word2Vec
Computes the vector representation of each word in vocabulary (Java version).
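The fit methods above all follow the same estimator pattern: fit produces a model that is then applied with transform. A minimal mllib sketch, assuming `data` is an existing RDD[Vector]:

    import org.apache.spark.mllib.feature.StandardScaler

    val scaler = new StandardScaler(withMean = true, withStd = true)
    val scalerModel = scaler.fit(data)        // computes mean and variance from `data`
    val scaled = scalerModel.transform(data)  // applies the scaling to each vector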
fittingParamMap() - Method in class org.apache.spark.ml.classification.LogisticRegressionModel
 
fittingParamMap() - Method in class org.apache.spark.ml.feature.StandardScalerModel
 
fittingParamMap() - Method in class org.apache.spark.ml.Model
Fitting parameters, such that parent.fit(..., fittingParamMap) could reproduce the model.
fittingParamMap() - Method in class org.apache.spark.ml.PipelineModel
 
fittingParamMap() - Method in class org.apache.spark.ml.recommendation.ALSModel
 
fittingParamMap() - Method in class org.apache.spark.ml.regression.LinearRegressionModel
 
fittingParamMap() - Method in class org.apache.spark.ml.tuning.CrossValidatorModel
 
FixedLengthBinaryInputFormat - Class in org.apache.spark.input
 
FixedLengthBinaryInputFormat() - Constructor for class org.apache.spark.input.FixedLengthBinaryInputFormat
 
FixedLengthBinaryRecordReader - Class in org.apache.spark.input
FixedLengthBinaryRecordReader is returned by FixedLengthBinaryInputFormat.
FixedLengthBinaryRecordReader() - Constructor for class org.apache.spark.input.FixedLengthBinaryRecordReader
 
flatMap(FlatMapFunction<T, U>) - Method in interface org.apache.spark.api.java.JavaRDDLike
Return a new RDD by first applying a function to all elements of this RDD, and then flattening the results.
flatMap(Function1<T, TraversableOnce<U>>, ClassTag<U>) - Method in class org.apache.spark.rdd.RDD
Return a new RDD by first applying a function to all elements of this RDD, and then flattening the results.
flatMap(Function1<Row, TraversableOnce<R>>, ClassTag<R>) - Method in class org.apache.spark.sql.DataFrame
Returns a new RDD by first applying a function to all rows of this DataFrame, and then flattening the results.
flatMap(Function1<T, TraversableOnce<R>>, ClassTag<R>) - Method in interface org.apache.spark.sql.RDDApi
 
flatMap(FlatMapFunction<T, U>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
Return a new DStream by applying a function to all elements of this DStream, and then flattening the results
flatMap(Function1<T, Traversable<U>>, ClassTag<U>) - Method in class org.apache.spark.streaming.dstream.DStream
Return a new DStream by applying a function to all elements of this DStream, and then flattening the results
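A minimal sketch of flatMap, where each input element produces zero or more output elements; assumes an existing SparkContext `sc`:

    val words = sc.parallelize(Seq("to be or", "not to be"))
      .flatMap(line => line.split(" "))
    words.collect()   // Array(to, be, or, not, to, be)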
FlatMapFunction<T,R> - Interface in org.apache.spark.api.java.function
A function that returns zero or more output records from each input record.
FlatMapFunction2<T1,T2,R> - Interface in org.apache.spark.api.java.function
A function that takes two inputs and returns zero or more output records.
FlatMappedDStream<T,U> - Class in org.apache.spark.streaming.dstream
 
FlatMappedDStream(DStream<T>, Function1<T, Traversable<U>>, ClassTag<T>, ClassTag<U>) - Constructor for class org.apache.spark.streaming.dstream.FlatMappedDStream
 
flatMapToDouble(DoubleFlatMapFunction<T>) - Method in interface org.apache.spark.api.java.JavaRDDLike
Return a new RDD by first applying a function to all elements of this RDD, and then flattening the results.
flatMapToPair(PairFlatMapFunction<T, K2, V2>) - Method in interface org.apache.spark.api.java.JavaRDDLike
Return a new RDD by first applying a function to all elements of this RDD, and then flattening the results.
flatMapToPair(PairFlatMapFunction<T, K2, V2>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
Return a new DStream by applying a function to all elements of this DStream, and then flattening the results
FlatMapValuedDStream<K,V,U> - Class in org.apache.spark.streaming.dstream
 
FlatMapValuedDStream(DStream<Tuple2<K, V>>, Function1<V, TraversableOnce<U>>, ClassTag<K>, ClassTag<V>, ClassTag<U>) - Constructor for class org.apache.spark.streaming.dstream.FlatMapValuedDStream
 
flatMapValues(Function<V, Iterable<U>>) - Method in class org.apache.spark.api.java.JavaPairRDD
Pass each value in the key-value pair RDD through a flatMap function without changing the keys; this also retains the original RDD's partitioning.
flatMapValues(Function1<V, TraversableOnce<U>>) - Method in class org.apache.spark.rdd.PairRDDFunctions
Pass each value in the key-value pair RDD through a flatMap function without changing the keys; this also retains the original RDD's partitioning.
flatMapValues(Function<V, Iterable<U>>) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
Return a new DStream by applying a flatmap function to the value of each key-value pairs in 'this' DStream without changing the key.
flatMapValues(Function1<V, TraversableOnce<U>>, ClassTag<U>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
Return a new DStream by applying a flatmap function to the value of each key-value pairs in 'this' DStream without changing the key.
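A sketch of flatMapValues, which expands each value while keeping the key and the original partitioning; assumes an existing SparkContext `sc`:

    val pairs = sc.parallelize(Seq(("a", "1 2"), ("b", "3")))
    val expanded = pairs.flatMapValues(_.split(" "))
    expanded.collect()   // Array((a,1), (a,2), (b,3))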
flatMapWith(Function1<Object, A>, boolean, Function2<T, A, Seq<U>>, ClassTag<U>) - Method in class org.apache.spark.rdd.RDD
FlatMaps f over this RDD, where f takes an additional parameter of type A.
FLOAT - Class in org.apache.spark.sql.columnar
 
FLOAT() - Constructor for class org.apache.spark.sql.columnar.FLOAT
 
FloatColumnAccessor - Class in org.apache.spark.sql.columnar
 
FloatColumnAccessor(ByteBuffer) - Constructor for class org.apache.spark.sql.columnar.FloatColumnAccessor
 
FloatColumnBuilder - Class in org.apache.spark.sql.columnar
 
FloatColumnBuilder() - Constructor for class org.apache.spark.sql.columnar.FloatColumnBuilder
 
FloatColumnStats - Class in org.apache.spark.sql.columnar
 
FloatColumnStats() - Constructor for class org.apache.spark.sql.columnar.FloatColumnStats
 
FloatConversion() - Method in class org.apache.spark.sql.jdbc.JDBCRDD
Accessor for nested Scala object
FloatParam - Class in org.apache.spark.ml.param
Specialized version of Param[Float] for Java.
FloatParam(Params, String, String, Option<Object>) - Constructor for class org.apache.spark.ml.param.FloatParam
 
FloatParam(Params, String, String) - Constructor for class org.apache.spark.ml.param.FloatParam
 
floatToFloatWritable(float) - Static method in class org.apache.spark.SparkContext
 
floatWritableConverter() - Static method in class org.apache.spark.SparkContext
 
floatWritableConverter() - Static method in class org.apache.spark.WritableConverter
 
floatWritableFactory() - Static method in class org.apache.spark.WritableFactory
 
floor(Duration) - Method in class org.apache.spark.streaming.Time
 
FlumeBatchFetcher - Class in org.apache.spark.streaming.flume
This class implements the core functionality of FlumePollingReceiver.
FlumeBatchFetcher(FlumePollingReceiver) - Constructor for class org.apache.spark.streaming.flume.FlumeBatchFetcher
 
FlumeConnection - Class in org.apache.spark.streaming.flume
A wrapper around the transceiver and the Avro IPC API.
FlumeConnection(NettyTransceiver, SparkFlumeProtocol.Callback) - Constructor for class org.apache.spark.streaming.flume.FlumeConnection
 
FlumeEventServer - Class in org.apache.spark.streaming.flume
A simple server that implements Flume's Avro protocol.
FlumeEventServer(FlumeReceiver) - Constructor for class org.apache.spark.streaming.flume.FlumeEventServer
 
FlumeInputDStream<T> - Class in org.apache.spark.streaming.flume
 
FlumeInputDStream(StreamingContext, String, int, StorageLevel, boolean, ClassTag<T>) - Constructor for class org.apache.spark.streaming.flume.FlumeInputDStream
 
FlumePollingInputDStream<T> - Class in org.apache.spark.streaming.flume
A ReceiverInputDStream that can be used to read data from several Flume agents running SparkSinks.
FlumePollingInputDStream(StreamingContext, Seq<InetSocketAddress>, int, int, StorageLevel, ClassTag<T>) - Constructor for class org.apache.spark.streaming.flume.FlumePollingInputDStream
 
FlumePollingReceiver - Class in org.apache.spark.streaming.flume
 
FlumePollingReceiver(Seq<InetSocketAddress>, int, int, StorageLevel) - Constructor for class org.apache.spark.streaming.flume.FlumePollingReceiver
 
FlumeReceiver - Class in org.apache.spark.streaming.flume
A NetworkReceiver which listens for events using the Flume Avro interface.
FlumeReceiver(String, int, StorageLevel, boolean) - Constructor for class org.apache.spark.streaming.flume.FlumeReceiver
 
FlumeReceiver.CompressionChannelPipelineFactory - Class in org.apache.spark.streaming.flume
A Netty pipeline factory that decompresses incoming data from the Netty client and compresses data going back to the client.
FlumeReceiver.CompressionChannelPipelineFactory() - Constructor for class org.apache.spark.streaming.flume.FlumeReceiver.CompressionChannelPipelineFactory
 
FlumeUtils - Class in org.apache.spark.streaming.flume
 
FlumeUtils() - Constructor for class org.apache.spark.streaming.flume.FlumeUtils
 
flush() - Method in class org.apache.spark.serializer.JavaSerializationStream
 
flush() - Method in class org.apache.spark.serializer.KryoSerializationStream
 
flush() - Method in class org.apache.spark.serializer.SerializationStream
 
flush() - Method in class org.apache.spark.storage.DiskBlockObjectWriter
 
flush() - Method in class org.apache.spark.streaming.util.RateLimitedOutputStream
 
FMeasure - Class in org.apache.spark.mllib.evaluation.binary
F-Measure.
FMeasure(double) - Constructor for class org.apache.spark.mllib.evaluation.binary.FMeasure
 
fMeasure(double, double) - Method in class org.apache.spark.mllib.evaluation.MulticlassMetrics
Returns f-measure for a given label (category)
fMeasure(double) - Method in class org.apache.spark.mllib.evaluation.MulticlassMetrics
Returns f1-measure for a given label (category)
fMeasure() - Method in class org.apache.spark.mllib.evaluation.MulticlassMetrics
Returns f-measure (equal to precision and recall, because precision equals recall)
fMeasureByThreshold(double) - Method in class org.apache.spark.mllib.evaluation.BinaryClassificationMetrics
Returns the (threshold, F-Measure) curve.
fMeasureByThreshold() - Method in class org.apache.spark.mllib.evaluation.BinaryClassificationMetrics
Returns the (threshold, F-Measure) curve with beta = 1.0.
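A hedged sketch of the F-measure evaluators above; `scoreAndLabels` (an RDD[(Double, Double)] of (score, label) pairs) is assumed to exist:

    import org.apache.spark.mllib.evaluation.BinaryClassificationMetrics

    val metrics  = new BinaryClassificationMetrics(scoreAndLabels)
    val f1Curve  = metrics.fMeasureByThreshold()     // (threshold, F-measure) with beta = 1.0
    val f05Curve = metrics.fMeasureByThreshold(0.5)  // (threshold, F-measure) with beta = 0.5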
fold(T, Function2<T, T, T>) - Method in interface org.apache.spark.api.java.JavaRDDLike
Aggregate the elements of each partition, and then the results for all the partitions, using a given associative function and a neutral "zero value".
fold(T, Function2<T, T, T>) - Method in class org.apache.spark.rdd.RDD
Aggregate the elements of each partition, and then the results for all the partitions, using a given associative function and a neutral "zero value".
foldable() - Method in class org.apache.spark.sql.hive.HiveGenericUdf
 
foldable() - Method in class org.apache.spark.sql.hive.HiveSimpleUdf
 
foldByKey(V, Partitioner, Function2<V, V, V>) - Method in class org.apache.spark.api.java.JavaPairRDD
Merge the values for each key using an associative function and a neutral "zero value" which may be added to the result an arbitrary number of times, and must not change the result (e.g., Nil for list concatenation, 0 for addition, or 1 for multiplication.).
foldByKey(V, int, Function2<V, V, V>) - Method in class org.apache.spark.api.java.JavaPairRDD
Merge the values for each key using an associative function and a neutral "zero value" which may be added to the result an arbitrary number of times, and must not change the result (e.g., Nil for list concatenation, 0 for addition, or 1 for multiplication.).
foldByKey(V, Function2<V, V, V>) - Method in class org.apache.spark.api.java.JavaPairRDD
Merge the values for each key using an associative function and a neutral "zero value" which may be added to the result an arbitrary number of times, and must not change the result (e.g., Nil for list concatenation, 0 for addition, or 1 for multiplication.).
foldByKey(V, Partitioner, Function2<V, V, V>) - Method in class org.apache.spark.rdd.PairRDDFunctions
Merge the values for each key using an associative function and a neutral "zero value" which may be added to the result an arbitrary number of times, and must not change the result (e.g., Nil for list concatenation, 0 for addition, or 1 for multiplication.).
foldByKey(V, int, Function2<V, V, V>) - Method in class org.apache.spark.rdd.PairRDDFunctions
Merge the values for each key using an associative function and a neutral "zero value" which may be added to the result an arbitrary number of times, and must not change the result (e.g., Nil for list concatenation, 0 for addition, or 1 for multiplication.).
foldByKey(V, Function2<V, V, V>) - Method in class org.apache.spark.rdd.PairRDDFunctions
Merge the values for each key using an associative function and a neutral "zero value" which may be added to the result an arbitrary number of times, and must not change the result (e.g., Nil for list concatenation, 0 for addition, or 1 for multiplication.).
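A sketch of fold and foldByKey; the zero value must be neutral for the merge function. Assumes an existing SparkContext `sc`:

    sc.parallelize(1 to 4).fold(0)(_ + _)                          // 10

    val pairs = sc.parallelize(Seq(("a", 1), ("a", 2), ("b", 3)))
    pairs.foldByKey(0)(_ + _).collect()                            // Array((a,3), (b,3))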
forAttribute() - Method in class org.apache.spark.sql.columnar.PartitionStatistics
 
foreach(VoidFunction<T>) - Method in interface org.apache.spark.api.java.JavaRDDLike
Applies a function f to all elements of this RDD.
foreach(Function1<Edge<ED>, BoxedUnit>) - Method in class org.apache.spark.graphx.impl.EdgePartition
Apply the function f to all edges in this partition.
foreach(Function1<T, BoxedUnit>) - Method in class org.apache.spark.rdd.RDD
Applies a function f to all elements of this RDD.
foreach(Function1<Row, BoxedUnit>) - Method in class org.apache.spark.sql.DataFrame
Applies a function f to all rows.
foreach(Function1<T, BoxedUnit>) - Method in interface org.apache.spark.sql.RDDApi
 
foreach(Function<R, Void>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
Deprecated.
As of release 0.9.0, replaced by foreachRDD
foreach(Function2<R, Time, Void>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
Deprecated.
As of release 0.9.0, replaced by foreachRDD
foreach(Function1<RDD<T>, BoxedUnit>) - Method in class org.apache.spark.streaming.dstream.DStream
Apply a function to each RDD in this DStream.
foreach(Function2<RDD<T>, Time, BoxedUnit>) - Method in class org.apache.spark.streaming.dstream.DStream
Apply a function to each RDD in this DStream.
foreach(Function1<Tuple2<A, B>, U>) - Method in class org.apache.spark.util.TimeStampedHashMap
 
foreach(Function1<A, U>) - Method in class org.apache.spark.util.TimeStampedHashSet
 
foreach(Function1<Tuple2<A, B>, U>) - Method in class org.apache.spark.util.TimeStampedWeakValueHashMap
 
foreachActive(Function3<Object, Object, Object, BoxedUnit>) - Method in class org.apache.spark.mllib.linalg.DenseMatrix
 
foreachActive(Function2<Object, Object, BoxedUnit>) - Method in class org.apache.spark.mllib.linalg.DenseVector
 
foreachActive(Function3<Object, Object, Object, BoxedUnit>) - Method in interface org.apache.spark.mllib.linalg.Matrix
Applies a function f to all the active elements of dense and sparse matrix.
foreachActive(Function3<Object, Object, Object, BoxedUnit>) - Method in class org.apache.spark.mllib.linalg.SparseMatrix
 
foreachActive(Function2<Object, Object, BoxedUnit>) - Method in class org.apache.spark.mllib.linalg.SparseVector
 
foreachActive(Function2<Object, Object, BoxedUnit>) - Method in interface org.apache.spark.mllib.linalg.Vector
Applies a function f to all the active elements of dense and sparse vector.
foreachAsync(VoidFunction<T>) - Method in interface org.apache.spark.api.java.JavaRDDLike
The asynchronous version of the foreach action, which applies a function f to all the elements of this RDD.
foreachAsync(Function1<T, BoxedUnit>) - Method in class org.apache.spark.rdd.AsyncRDDActions
Applies a function f to all elements of this RDD.
ForEachDStream<T> - Class in org.apache.spark.streaming.dstream
 
ForEachDStream(DStream<T>, Function2<RDD<T>, Time, BoxedUnit>, ClassTag<T>) - Constructor for class org.apache.spark.streaming.dstream.ForEachDStream
 
foreachPartition(VoidFunction<Iterator<T>>) - Method in interface org.apache.spark.api.java.JavaRDDLike
Applies a function f to each partition of this RDD.
foreachPartition(Function1<Iterator<T>, BoxedUnit>) - Method in class org.apache.spark.rdd.RDD
Applies a function f to each partition of this RDD.
foreachPartition(Function1<Iterator<Row>, BoxedUnit>) - Method in class org.apache.spark.sql.DataFrame
Applies a function f to each partition of this DataFrame.
foreachPartition(Function1<Iterator<T>, BoxedUnit>) - Method in interface org.apache.spark.sql.RDDApi
 
foreachPartitionAsync(VoidFunction<Iterator<T>>) - Method in interface org.apache.spark.api.java.JavaRDDLike
The asynchronous version of the foreachPartition action, which applies a function f to each partition of this RDD.
foreachPartitionAsync(Function1<Iterator<T>, BoxedUnit>) - Method in class org.apache.spark.rdd.AsyncRDDActions
Applies a function f to each partition of this RDD.
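A sketch of foreachPartition, commonly used for per-partition setup (for example, one connection per partition) rather than per-element work; assumes an existing SparkContext `sc`:

    sc.parallelize(1 to 100, numSlices = 4).foreachPartition { iter =>
      // Per-partition setup would go here (e.g. opening a connection).
      println(s"partition with ${iter.size} elements")
    }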
foreachRDD(Function<R, Void>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
Apply a function to each RDD in this DStream.
foreachRDD(Function2<R, Time, Void>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
Apply a function to each RDD in this DStream.
foreachRDD(Function1<RDD<T>, BoxedUnit>) - Method in class org.apache.spark.streaming.dstream.DStream
Apply a function to each RDD in this DStream.
foreachRDD(Function2<RDD<T>, Time, BoxedUnit>) - Method in class org.apache.spark.streaming.dstream.DStream
Apply a function to each RDD in this DStream.
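A sketch of the usual foreachRDD output pattern; `stream` (an existing DStream) is assumed, and the closure runs on the driver once per batch:

    stream.foreachRDD { (rdd, time) =>
      // Actions on `rdd` are distributed; this closure itself runs on the driver.
      println(s"Batch at $time contains ${rdd.count()} records")
    }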
foreachWith(Function1<Object, A>, Function2<T, A, BoxedUnit>) - Method in class org.apache.spark.rdd.RDD
Applies f to each element of this RDD, where f takes an additional parameter of type A.
foreachWithinEdgePartition(int, boolean, boolean, Function1<Object, BoxedUnit>) - Method in class org.apache.spark.graphx.impl.RoutingTablePartition
Runs f on each vertex id to be sent to the specified edge partition.
formatDate(Date) - Static method in class org.apache.spark.ui.UIUtils
 
formatDate(long) - Static method in class org.apache.spark.ui.UIUtils
 
formatDuration(long) - Static method in class org.apache.spark.ui.UIUtils
 
formatDurationVerbose(long) - Static method in class org.apache.spark.ui.UIUtils
Generate a verbose human-readable string representing a duration such as "5 second 35 ms"
formatNumber(double) - Static method in class org.apache.spark.ui.UIUtils
Generate a human-readable string representing a number (e.g.
formatter() - Method in class org.apache.spark.util.logging.SizeBasedRollingPolicy
 
formatVersion() - Method in interface org.apache.spark.mllib.util.Saveable
Current version of model save/load format.
formatWindowsPath(String) - Static method in class org.apache.spark.util.Utils
Format a Windows path such that it can be safely passed to a URI.
FPGrowth - Class in org.apache.spark.mllib.fpm
:: Experimental ::
FPGrowth() - Constructor for class org.apache.spark.mllib.fpm.FPGrowth
Constructs a default instance with default parameters {minSupport: 0.3, numPartitions: same as the input data}.
FPGrowth.FreqItemset<Item> - Class in org.apache.spark.mllib.fpm
Frequent itemset.
FPGrowth.FreqItemset(Object, long) - Constructor for class org.apache.spark.mllib.fpm.FPGrowth.FreqItemset
 
FPGrowthModel<Item> - Class in org.apache.spark.mllib.fpm
:: Experimental ::
FPGrowthModel(RDD<FPGrowth.FreqItemset<Item>>, ClassTag<Item>) - Constructor for class org.apache.spark.mllib.fpm.FPGrowthModel
 
FPTree<T> - Class in org.apache.spark.mllib.fpm
FP-Tree data structure used in FP-Growth.
FPTree() - Constructor for class org.apache.spark.mllib.fpm.FPTree
 
FPTree.Node<T> - Class in org.apache.spark.mllib.fpm
Representing a node in an FP-Tree.
FPTree.Node(FPTree.Node<T>) - Constructor for class org.apache.spark.mllib.fpm.FPTree.Node
 
framework() - Method in class org.apache.spark.streaming.Checkpoint
 
frameworkMessage(SchedulerDriver, Protos.ExecutorID, Protos.SlaveID, byte[]) - Method in class org.apache.spark.scheduler.cluster.mesos.CoarseMesosSchedulerBackend
 
frameworkMessage(SchedulerDriver, Protos.ExecutorID, Protos.SlaveID, byte[]) - Method in class org.apache.spark.scheduler.cluster.mesos.MesosSchedulerBackend
 
freeCores() - Method in class org.apache.spark.scheduler.cluster.ExecutorData
 
freeMemory() - Method in class org.apache.spark.storage.MemoryStore
Free memory not occupied by existing blocks.
freq() - Method in class org.apache.spark.mllib.fpm.FPGrowth.FreqItemset
 
freqItemsets() - Method in class org.apache.spark.mllib.fpm.FPGrowthModel
 
fromAvroFlumeEvent(AvroFlumeEvent) - Static method in class org.apache.spark.streaming.flume.SparkFlumeEvent
 
fromBinary(Binary) - Static method in class org.apache.spark.sql.parquet.timestamp.NanoTime
 
fromBreeze(Matrix<Object>) - Static method in class org.apache.spark.mllib.linalg.Matrices
Creates a Matrix instance from a breeze matrix.
fromBreeze(Vector<Object>) - Static method in class org.apache.spark.mllib.linalg.Vectors
Creates a vector instance from a breeze vector.
fromByteString(ByteString) - Static method in class org.apache.spark.scheduler.cluster.mesos.MesosTaskLaunchData
 
fromCOO(int, int, Iterable<Tuple3<Object, Object, Object>>) - Static method in class org.apache.spark.mllib.linalg.SparseMatrix
Generate a SparseMatrix from Coordinate List (COO) format.
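A sketch of SparseMatrix.fromCOO building a 3 x 3 matrix from (row, column, value) triplets:

    import org.apache.spark.mllib.linalg.SparseMatrix

    val entries = Seq((0, 0, 1.0), (1, 2, 2.0), (2, 1, 3.0))
    val m = SparseMatrix.fromCOO(3, 3, entries)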
fromDataType(DataType, String, boolean, boolean, boolean) - Static method in class org.apache.spark.sql.parquet.ParquetTypesConverter
Converts a given Catalyst DataType into the corresponding Parquet Type.
fromDStream(DStream<T>, ClassTag<T>) - Static method in class org.apache.spark.streaming.api.java.JavaDStream
Convert a scala DStream to a Java-friendly JavaDStream.
fromEdgePartitions(RDD<Tuple2<Object, EdgePartition<ED, VD>>>, ClassTag<ED>, ClassTag<VD>) - Static method in class org.apache.spark.graphx.EdgeRDD
Creates an EdgeRDD from already-constructed edge partitions.
fromEdgePartitions(RDD<Tuple2<Object, EdgePartition<ED, VD>>>, VD, StorageLevel, StorageLevel, ClassTag<VD>, ClassTag<ED>) - Static method in class org.apache.spark.graphx.impl.GraphImpl
Create a graph from EdgePartitions, setting referenced vertices to `defaultVertexAttr`.
fromEdges(RDD<Edge<ED>>, ClassTag<ED>, ClassTag<VD>) - Static method in class org.apache.spark.graphx.EdgeRDD
Creates an EdgeRDD from a set of edges.
fromEdges(RDD<Edge<ED>>, VD, StorageLevel, StorageLevel, ClassTag<VD>, ClassTag<ED>) - Static method in class org.apache.spark.graphx.Graph
Construct a graph from a collection of edges.
fromEdges(EdgeRDD<?>, int, VD, ClassTag<VD>) - Static method in class org.apache.spark.graphx.VertexRDD
Constructs a VertexRDD containing all vertices referred to in edges.
fromEdgeTuples(RDD<Tuple2<Object, Object>>, VD, Option<PartitionStrategy>, StorageLevel, StorageLevel, ClassTag<VD>) - Static method in class org.apache.spark.graphx.Graph
Construct a graph from a collection of edges encoded as vertex id pairs.
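A sketch of Graph.fromEdgeTuples, where edges are given as (srcId, dstId) pairs and every vertex receives the supplied default attribute; assumes an existing SparkContext `sc`:

    import org.apache.spark.graphx.Graph

    val rawEdges = sc.parallelize(Seq((1L, 2L), (2L, 3L), (3L, 1L)))
    val graph = Graph.fromEdgeTuples(rawEdges, defaultValue = 1)
    graph.vertices.count()   // 3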
fromExistingRDDs(VertexRDD<VD>, EdgeRDD<ED>, ClassTag<VD>, ClassTag<ED>) - Static method in class org.apache.spark.graphx.impl.GraphImpl
Create a graph from a VertexRDD and an EdgeRDD with the same replicated vertex type as the vertices.
fromInputDStream(InputDStream<T>, ClassTag<T>) - Static method in class org.apache.spark.streaming.api.java.JavaInputDStream
Convert a scala InputDStream to a Java-friendly JavaInputDStream.
fromInputDStream(InputDStream<Tuple2<K, V>>, ClassTag<K>, ClassTag<V>) - Static method in class org.apache.spark.streaming.api.java.JavaPairInputDStream
Convert a scala InputDStream of pairs to a Java-friendly JavaPairInputDStream.
fromJavaDStream(JavaDStream<Tuple2<K, V>>) - Static method in class org.apache.spark.streaming.api.java.JavaPairDStream
 
fromJavaRDD(JavaRDD<Tuple2<K, V>>) - Static method in class org.apache.spark.api.java.JavaPairRDD
Convert a JavaRDD of key-value pairs to JavaPairRDD.
fromMesos(Protos.TaskState) - Static method in class org.apache.spark.TaskState
 
fromMsgs(int, Iterator<Tuple2<Object, Object>>) - Static method in class org.apache.spark.graphx.impl.RoutingTablePartition
Build a `RoutingTablePartition` from `RoutingTableMessage`s.
fromOffset() - Method in class org.apache.spark.streaming.kafka.KafkaRDDPartition
 
fromOffset() - Method in class org.apache.spark.streaming.kafka.OffsetRange
Inclusive starting offset.
fromOffsets() - Method in class org.apache.spark.streaming.kafka.DirectKafkaInputDStream
 
fromPairDStream(DStream<Tuple2<K, V>>, ClassTag<K>, ClassTag<V>) - Static method in class org.apache.spark.streaming.api.java.JavaPairDStream
 
fromPrimitiveDataType(DataType) - Static method in class org.apache.spark.sql.parquet.ParquetTypesConverter
For a given Catalyst DataType return the name of the corresponding Parquet primitive type or None if the given type is not primitive.
fromRDD(RDD<Object>) - Static method in class org.apache.spark.api.java.JavaDoubleRDD
 
fromRDD(RDD<Tuple2<K, V>>, ClassTag<K>, ClassTag<V>) - Static method in class org.apache.spark.api.java.JavaPairRDD
 
fromRDD(RDD<T>, ClassTag<T>) - Static method in class org.apache.spark.api.java.JavaRDD
 
fromRDD(RDD<T>, ClassTag<T>) - Static method in class org.apache.spark.mllib.rdd.RDDFunctions
Implicit conversion from an RDD to RDDFunctions.
fromRdd(RDD<?>) - Static method in class org.apache.spark.storage.RDDInfo
 
fromReceiverInputDStream(ReceiverInputDStream<Tuple2<K, V>>, ClassTag<K>, ClassTag<V>) - Static method in class org.apache.spark.streaming.api.java.JavaPairReceiverInputDStream
Convert a scala ReceiverInputDStream to a Java-friendly JavaReceiverInputDStream.
fromReceiverInputDStream(ReceiverInputDStream<T>, ClassTag<T>) - Static method in class org.apache.spark.streaming.api.java.JavaReceiverInputDStream
Convert a scala ReceiverInputDStream to a Java-friendly JavaReceiverInputDStream.
fromSparkContext(SparkContext) - Static method in class org.apache.spark.api.java.JavaSparkContext
 
fromStage(Stage, Option<Object>) - Static method in class org.apache.spark.scheduler.StageInfo
Construct a StageInfo from a Stage.
fromString(String) - Static method in class org.apache.spark.mllib.tree.configuration.Algo
 
fromString(String) - Static method in class org.apache.spark.mllib.tree.impurity.Impurities
 
fromString(String) - Static method in class org.apache.spark.mllib.tree.loss.Losses
 
fromString(String) - Static method in class org.apache.spark.storage.StorageLevel
:: DeveloperApi :: Return the StorageLevel object with the specified name.
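A sketch of looking up a StorageLevel by name and using it when persisting an RDD; `rdd` is assumed to be an existing RDD:

    import org.apache.spark.storage.StorageLevel

    val level = StorageLevel.fromString("MEMORY_AND_DISK_SER")
    rdd.persist(level)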
fromWeakReference(WeakReference<V>) - Static method in class org.apache.spark.util.TimeStampedWeakValueHashMap
 
fromWeakReferenceIterator(Iterator<Tuple2<K, WeakReference<V>>>) - Static method in class org.apache.spark.util.TimeStampedWeakValueHashMap
 
fromWeakReferenceMap(Map<K, WeakReference<V>>) - Static method in class org.apache.spark.util.TimeStampedWeakValueHashMap
 
fromWeakReferenceOption(Option<WeakReference<V>>) - Static method in class org.apache.spark.util.TimeStampedWeakValueHashMap
 
fromWeakReferenceTuple(Tuple2<K, WeakReference<V>>) - Static method in class org.apache.spark.util.TimeStampedWeakValueHashMap
 
fs() - Method in class org.apache.spark.rdd.CheckpointRDD
 
fullOuterJoin(JavaPairRDD<K, W>, Partitioner) - Method in class org.apache.spark.api.java.JavaPairRDD
Perform a full outer join of this and other.
fullOuterJoin(JavaPairRDD<K, W>) - Method in class org.apache.spark.api.java.JavaPairRDD
Perform a full outer join of this and other.
fullOuterJoin(JavaPairRDD<K, W>, int) - Method in class org.apache.spark.api.java.JavaPairRDD
Perform a full outer join of this and other.
fullOuterJoin(RDD<Tuple2<K, W>>, Partitioner) - Method in class org.apache.spark.rdd.PairRDDFunctions
Perform a full outer join of this and other.
fullOuterJoin(RDD<Tuple2<K, W>>) - Method in class org.apache.spark.rdd.PairRDDFunctions
Perform a full outer join of this and other.
fullOuterJoin(RDD<Tuple2<K, W>>, int) - Method in class org.apache.spark.rdd.PairRDDFunctions
Perform a full outer join of this and other.
fullOuterJoin(JavaPairDStream<K, W>) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
Return a new DStream by applying 'full outer join' between RDDs of this DStream and other DStream.
fullOuterJoin(JavaPairDStream<K, W>, int) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
Return a new DStream by applying 'full outer join' between RDDs of this DStream and other DStream.
fullOuterJoin(JavaPairDStream<K, W>, Partitioner) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
Return a new DStream by applying 'full outer join' between RDDs of this DStream and other DStream.
fullOuterJoin(DStream<Tuple2<K, W>>, ClassTag<W>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
Return a new DStream by applying 'full outer join' between RDDs of this DStream and other DStream.
fullOuterJoin(DStream<Tuple2<K, W>>, int, ClassTag<W>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
Return a new DStream by applying 'full outer join' between RDDs of this DStream and other DStream.
fullOuterJoin(DStream<Tuple2<K, W>>, Partitioner, ClassTag<W>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
Return a new DStream by applying 'full outer join' between RDDs of this DStream and other DStream.
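A sketch of fullOuterJoin on pair RDDs; keys missing on one side show up as None. Assumes an existing SparkContext `sc`:

    val left  = sc.parallelize(Seq(("a", 1), ("b", 2)))
    val right = sc.parallelize(Seq(("b", 20), ("c", 30)))
    left.fullOuterJoin(right).collect()
    // Array((a,(Some(1),None)), (b,(Some(2),Some(20))), (c,(None,Some(30))))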
fullStackTrace() - Method in class org.apache.spark.ExceptionFailure
 
func() - Method in class org.apache.spark.scheduler.ActiveJob
 
func() - Method in class org.apache.spark.scheduler.JobSubmitted
 
Function<T1,R> - Interface in org.apache.spark.api.java.function
Base interface for functions whose return types do not create special RDDs.
function() - Method in class org.apache.spark.sql.hive.HiveGenericUdf
 
function() - Method in class org.apache.spark.sql.hive.HiveSimpleUdf
 
Function2<T1,T2,R> - Interface in org.apache.spark.api.java.function
A two-argument function that takes arguments of type T1 and T2 and returns an R.
Function3<T1,T2,T3,R> - Interface in org.apache.spark.api.java.function
A three-argument function that takes arguments of type T1, T2 and T3 and returns an R.
functionClassName() - Method in class org.apache.spark.sql.hive.HiveFunctionWrapper
 
functions - Class in org.apache.spark.sql
 
functions() - Constructor for class org.apache.spark.sql.functions
 
funcWrapper() - Method in class org.apache.spark.sql.hive.HiveGenericUdaf
 
funcWrapper() - Method in class org.apache.spark.sql.hive.HiveGenericUdf
 
funcWrapper() - Method in class org.apache.spark.sql.hive.HiveGenericUdtf
 
funcWrapper() - Method in class org.apache.spark.sql.hive.HiveSimpleUdf
 
funcWrapper() - Method in class org.apache.spark.sql.hive.HiveUdaf
 
funcWrapper() - Method in class org.apache.spark.sql.hive.HiveUdafFunction
 
FutureAction<T> - Interface in org.apache.spark
A future for the result of an action to support cancellation.

G

gain() - Method in class org.apache.spark.mllib.tree.model.InformationGainStats
 
gamma1() - Method in class org.apache.spark.graphx.lib.SVDPlusPlus.Conf
 
gamma2() - Method in class org.apache.spark.graphx.lib.SVDPlusPlus.Conf
 
gamma6() - Method in class org.apache.spark.graphx.lib.SVDPlusPlus.Conf
 
gamma7() - Method in class org.apache.spark.graphx.lib.SVDPlusPlus.Conf
 
GammaGenerator - Class in org.apache.spark.mllib.random
:: DeveloperApi :: Generates i.i.d.
GammaGenerator(double, double) - Constructor for class org.apache.spark.mllib.random.GammaGenerator
 
gammaJavaRDD(JavaSparkContext, double, double, long, int, long) - Static method in class org.apache.spark.mllib.random.RandomRDDs
gammaJavaRDD(JavaSparkContext, double, double, long, int) - Static method in class org.apache.spark.mllib.random.RandomRDDs
gammaJavaRDD(JavaSparkContext, double, double, long) - Static method in class org.apache.spark.mllib.random.RandomRDDs
gammaJavaVectorRDD(JavaSparkContext, double, double, long, int, int, long) - Static method in class org.apache.spark.mllib.random.RandomRDDs
gammaJavaVectorRDD(JavaSparkContext, double, double, long, int, int) - Static method in class org.apache.spark.mllib.random.RandomRDDs
gammaJavaVectorRDD(JavaSparkContext, double, double, long, int) - Static method in class org.apache.spark.mllib.random.RandomRDDs
gammaRDD(SparkContext, double, double, long, int, long) - Static method in class org.apache.spark.mllib.random.RandomRDDs
Generates an RDD comprised of i.i.d. samples from the gamma distribution with the input shape and scale.
gammaVectorRDD(SparkContext, double, double, long, int, int, long) - Static method in class org.apache.spark.mllib.random.RandomRDDs
Generates an RDD[Vector] with vectors containing i.i.d. samples drawn from the gamma distribution with the input shape and scale.
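A sketch of the gamma samplers above; parameter values are illustrative and `sc` is an existing SparkContext:

    import org.apache.spark.mllib.random.RandomRDDs

    // 10,000 i.i.d. samples from Gamma(shape = 2.0, scale = 1.5)
    val samples = RandomRDDs.gammaRDD(sc, shape = 2.0, scale = 1.5, size = 10000L)
    // A 1,000 x 5 matrix of gamma samples, as an RDD[Vector]
    val vectors = RandomRDDs.gammaVectorRDD(sc, 2.0, 1.5, numRows = 1000L, numCols = 5)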
GapSamplingIterator<T> - Class in org.apache.spark.util.random
 
GapSamplingIterator(Iterator<T>, double, Random, double, ClassTag<T>) - Constructor for class org.apache.spark.util.random.GapSamplingIterator
 
GapSamplingReplacementIterator<T> - Class in org.apache.spark.util.random
Advances to the first sample as part of object construction.
GapSamplingReplacementIterator(Iterator<T>, double, Random, double, ClassTag<T>) - Constructor for class org.apache.spark.util.random.GapSamplingReplacementIterator
 
gatherCompressibilityStats(Row, int) - Method in class org.apache.spark.sql.columnar.compression.BooleanBitSet.Encoder
 
gatherCompressibilityStats(Row, int) - Method in interface org.apache.spark.sql.columnar.compression.CompressibleColumnBuilder
 
gatherCompressibilityStats(Row, int) - Method in class org.apache.spark.sql.columnar.compression.DictionaryEncoding.Encoder
 
gatherCompressibilityStats(Row, int) - Method in interface org.apache.spark.sql.columnar.compression.Encoder
 
gatherCompressibilityStats(Row, int) - Method in class org.apache.spark.sql.columnar.compression.IntDelta.Encoder
 
gatherCompressibilityStats(Row, int) - Method in class org.apache.spark.sql.columnar.compression.LongDelta.Encoder
 
gatherCompressibilityStats(Row, int) - Method in class org.apache.spark.sql.columnar.compression.RunLengthEncoding.Encoder
 
gatherStats(Row, int) - Method in class org.apache.spark.sql.columnar.BinaryColumnStats
 
gatherStats(Row, int) - Method in class org.apache.spark.sql.columnar.BooleanColumnStats
 
gatherStats(Row, int) - Method in class org.apache.spark.sql.columnar.ByteColumnStats
 
gatherStats(Row, int) - Method in interface org.apache.spark.sql.columnar.ColumnStats
Gathers statistics information from row(ordinal).
gatherStats(Row, int) - Method in class org.apache.spark.sql.columnar.DoubleColumnStats
 
gatherStats(Row, int) - Method in class org.apache.spark.sql.columnar.FloatColumnStats
 
gatherStats(Row, int) - Method in class org.apache.spark.sql.columnar.GenericColumnStats
 
gatherStats(Row, int) - Method in class org.apache.spark.sql.columnar.IntColumnStats
 
gatherStats(Row, int) - Method in class org.apache.spark.sql.columnar.LongColumnStats
 
gatherStats(Row, int) - Method in class org.apache.spark.sql.columnar.NoopColumnStats
 
gatherStats(Row, int) - Method in class org.apache.spark.sql.columnar.ShortColumnStats
 
gatherStats(Row, int) - Method in class org.apache.spark.sql.columnar.StringColumnStats
 
gatherStats(Row, int) - Method in class org.apache.spark.sql.columnar.TimestampColumnStats
 
GaussianMixture - Class in org.apache.spark.mllib.clustering
:: Experimental ::
GaussianMixture() - Constructor for class org.apache.spark.mllib.clustering.GaussianMixture
Constructs a default instance.
GaussianMixtureModel - Class in org.apache.spark.mllib.clustering
:: Experimental ::
GaussianMixtureModel(double[], MultivariateGaussian[]) - Constructor for class org.apache.spark.mllib.clustering.GaussianMixtureModel
 
gaussians() - Method in class org.apache.spark.mllib.clustering.GaussianMixtureModel
 
GC_TIME() - Static method in class org.apache.spark.ui.ToolTips
 
gemm(double, Matrix, DenseMatrix, double, DenseMatrix) - Static method in class org.apache.spark.mllib.linalg.BLAS
C := alpha * A * B + beta * C
gemv(double, Matrix, DenseVector, double, DenseVector) - Static method in class org.apache.spark.mllib.linalg.BLAS
y := alpha * A * x + beta * y
GeneralizedLinearAlgorithm<M extends GeneralizedLinearModel> - Class in org.apache.spark.mllib.regression
:: DeveloperApi :: GeneralizedLinearAlgorithm implements methods to train a Generalized Linear Model (GLM).
GeneralizedLinearAlgorithm() - Constructor for class org.apache.spark.mllib.regression.GeneralizedLinearAlgorithm
 
GeneralizedLinearModel - Class in org.apache.spark.mllib.regression
:: DeveloperApi :: GeneralizedLinearModel (GLM) represents a model trained using GeneralizedLinearAlgorithm.
GeneralizedLinearModel(Vector, double) - Constructor for class org.apache.spark.mllib.regression.GeneralizedLinearModel
 
generate(String, String, int, int) - Static method in class org.apache.spark.examples.streaming.KinesisWordCountProducerASL
 
generatedRDDs() - Method in class org.apache.spark.streaming.dstream.DStream
 
generateJob(Time) - Method in class org.apache.spark.streaming.dstream.DStream
Generate a SparkStreaming job for the given time.
generateJob(Time) - Method in class org.apache.spark.streaming.dstream.ForEachDStream
 
generateJobs(Time) - Method in class org.apache.spark.streaming.DStreamGraph
 
GenerateJobs - Class in org.apache.spark.streaming.scheduler
 
GenerateJobs(Time) - Constructor for class org.apache.spark.streaming.scheduler.GenerateJobs
 
generateKMeansRDD(SparkContext, int, int, int, double, int) - Static method in class org.apache.spark.mllib.util.KMeansDataGenerator
Generate an RDD containing test data for KMeans.
generateLinearInput(double, double[], int, int, double) - Static method in class org.apache.spark.mllib.util.LinearDataGenerator
 
generateLinearInputAsList(double, double[], int, int, double) - Static method in class org.apache.spark.mllib.util.LinearDataGenerator
Return a Java List of synthetic data randomly generated according to a multicollinear model.
generateLinearRDD(SparkContext, int, int, double, int, double) - Static method in class org.apache.spark.mllib.util.LinearDataGenerator
Generate an RDD containing sample data for Linear Regression models - including Ridge, Lasso, and unregularized variants.
generateLogisticRDD(SparkContext, int, int, double, int, double) - Static method in class org.apache.spark.mllib.util.LogisticRegressionDataGenerator
Generate an RDD containing test data for LogisticRegression.
generateRandomEdges(int, int, int, long) - Static method in class org.apache.spark.graphx.util.GraphGenerators
 
generateRolledOverFileSuffix() - Method in interface org.apache.spark.util.logging.RollingPolicy
Get the desired name of the rollover file
generateRolledOverFileSuffix() - Method in class org.apache.spark.util.logging.SizeBasedRollingPolicy
Get the desired name of the rollover file
generateRolledOverFileSuffix() - Method in class org.apache.spark.util.logging.TimeBasedRollingPolicy
 
generator() - Method in class org.apache.spark.mllib.rdd.RandomRDDPartition
 
GENERIC - Class in org.apache.spark.sql.columnar
 
GENERIC() - Constructor for class org.apache.spark.sql.columnar.GENERIC
 
GenericColumnAccessor - Class in org.apache.spark.sql.columnar
 
GenericColumnAccessor(ByteBuffer) - Constructor for class org.apache.spark.sql.columnar.GenericColumnAccessor
 
GenericColumnBuilder - Class in org.apache.spark.sql.columnar
 
GenericColumnBuilder() - Constructor for class org.apache.spark.sql.columnar.GenericColumnBuilder
 
GenericColumnStats - Class in org.apache.spark.sql.columnar
 
GenericColumnStats() - Constructor for class org.apache.spark.sql.columnar.GenericColumnStats
 
geq(Object) - Method in class org.apache.spark.sql.Column
Greater than or equal to an expression.
get(Object) - Method in class org.apache.spark.api.java.JavaUtils.SerializableMapWrapper
 
get() - Method in interface org.apache.spark.FutureAction
Blocks and returns the result of this job.
get() - Method in class org.apache.spark.JavaFutureActionWrapper
 
get(long, TimeUnit) - Method in class org.apache.spark.JavaFutureActionWrapper
 
get(Param<T>) - Method in class org.apache.spark.ml.param.ParamMap
Optionally returns the value associated with a param or its default.
get(Param<T>) - Method in interface org.apache.spark.ml.param.Params
Gets the value of a parameter in the embedded param map.
get(long) - Method in class org.apache.spark.partial.StudentTCacher
 
get(String) - Method in class org.apache.spark.SparkConf
Get a parameter; throws a NoSuchElementException if it's not set
get(String, String) - Method in class org.apache.spark.SparkConf
Get a parameter, falling back to a default if not set
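A sketch of SparkConf lookups: the one-argument get throws NoSuchElementException when the key is unset, while the two-argument form falls back to the default:

    import org.apache.spark.SparkConf

    val conf = new SparkConf().set("spark.app.name", "confSketch")
    conf.get("spark.app.name")                  // "confSketch"
    conf.get("spark.missing.key", "fallback")   // "fallback"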
get() - Static method in class org.apache.spark.SparkEnv
Returns the SparkEnv.
get(String) - Static method in class org.apache.spark.SparkFiles
Get the absolute path of a file added through SparkContext.addFile().
get() - Method in class org.apache.spark.sql.hive.DeferredObjectAdapter
 
get(String) - Static method in class org.apache.spark.sql.jdbc.DriverQuirks
Fetch the DriverQuirks class corresponding to a given database url.
get(String) - Method in class org.apache.spark.sql.sources.CaseInsensitiveMap
 
get(BlockId) - Method in class org.apache.spark.storage.BlockManager
Get a block from the block manager (either local or remote).
get() - Static method in class org.apache.spark.TaskContext
Return the currently active TaskContext.
get(A) - Method in class org.apache.spark.util.TimeStampedHashMap
 
get(A) - Method in class org.apache.spark.util.TimeStampedWeakValueHashMap
 
getAcceptanceResults(RDD<Tuple2<K, V>>, boolean, Map<K, Object>, Option<Map<K, Object>>, long) - Static method in class org.apache.spark.util.random.StratifiedSamplingUtils
Count the number of items instantly accepted and generate the waitlist for each stratum.
getActiveJobIds() - Method in class org.apache.spark.api.java.JavaSparkStatusTracker
Returns an array containing the ids of all active jobs.
getActiveJobIds() - Method in class org.apache.spark.SparkStatusTracker
Returns an array containing the ids of all active jobs.
getActiveStageIds() - Method in class org.apache.spark.api.java.JavaSparkStatusTracker
Returns an array containing the ids of all active stages.
getActiveStageIds() - Method in class org.apache.spark.SparkStatusTracker
Returns an array containing the ids of all active stages.
getActorSystemHostPortForExecutor(String) - Method in class org.apache.spark.storage.BlockManagerMaster
 
getAddressHostName(String) - Static method in class org.apache.spark.util.Utils
 
getAkkaConf() - Method in class org.apache.spark.SparkConf
Get all akka conf variables set on this SparkConf
getAlgo() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
 
getAll() - Method in class org.apache.spark.SparkConf
Get all parameters as a list of pairs
getAllBlocks() - Method in class org.apache.spark.storage.DiskBlockManager
List all the blocks currently stored on disk by the disk manager.
getAllConfs() - Method in class org.apache.spark.sql.SQLConf
Return all the configuration properties that have been set (i.e.
getAllConfs() - Method in class org.apache.spark.sql.SQLContext
Return all the configuration properties that have been set (i.e.
getAllFiles() - Method in class org.apache.spark.storage.DiskBlockManager
List all the files currently stored on disk by the disk manager.
getAllPartitionsOf(Hive, Table) - Static method in class org.apache.spark.sql.hive.HiveShim
 
getAllPools() - Method in class org.apache.spark.SparkContext
:: DeveloperApi :: Return pools for fair scheduler
getAlpha() - Method in interface org.apache.spark.ml.recommendation.ALSParams
 
getAlpha() - Method in class org.apache.spark.mllib.clustering.LDA
Alias for getDocConcentration
getAppId() - Method in class org.apache.spark.SparkConf
Returns the Spark application id, valid in the Driver after TaskScheduler registration and from the start in the Executor.
getAppName() - Method in class org.apache.spark.ui.SparkUI
 
getAst(String) - Static method in class org.apache.spark.sql.hive.HiveQl
Returns the AST for the given SQL string.
getBasePath() - Method in class org.apache.spark.ui.WebUI
 
getBernoulliSamplingFunction(RDD<Tuple2<K, V>>, Map<K, Object>, boolean, long) - Static method in class org.apache.spark.util.random.StratifiedSamplingUtils
Return the per partition sampling function used for sampling without replacement.
getBeta() - Method in class org.apache.spark.mllib.clustering.LDA
Alias for getTopicConcentration
getBinaryWritable(Object) - Static method in class org.apache.spark.sql.hive.HiveShim
 
getBinaryWritableConstantObjectInspector(Object) - Static method in class org.apache.spark.sql.hive.HiveShim
 
getBlock(BlockId) - Method in class org.apache.spark.storage.StorageStatus
Return the given block stored in this block manager in O(1) time.
getBlockData(BlockId) - Method in class org.apache.spark.storage.BlockManager
Interface to get local block data.
getBlocksOfBatch(Time) - Method in class org.apache.spark.streaming.scheduler.ReceivedBlockTracker
Get the blocks allocated to the given batch.
getBlocksOfBatch(Time) - Method in class org.apache.spark.streaming.scheduler.ReceiverTracker
Get the blocks for the given batch and all input streams.
getBlocksOfBatchAndStream(Time, int) - Method in class org.apache.spark.streaming.scheduler.ReceivedBlockTracker
Get the blocks allocated to the given batch and stream.
getBlocksOfBatchAndStream(Time, int) - Method in class org.apache.spark.streaming.scheduler.ReceiverTracker
Get the blocks allocated to the given batch and stream.
getBlocksOfStream(int) - Method in class org.apache.spark.streaming.scheduler.AllocatedBlocks
 
getBlockStatus(BlockId, boolean) - Method in class org.apache.spark.storage.BlockManagerMaster
Return the block's status on all block managers, if any.
getBoolean(String, boolean) - Method in class org.apache.spark.SparkConf
Get a parameter as a boolean, falling back to a default if not set
getBooleanWritable(Object) - Static method in class org.apache.spark.sql.hive.HiveShim
 
getBooleanWritableConstantObjectInspector(Object) - Static method in class org.apache.spark.sql.hive.HiveShim
 
getBytes(BlockId) - Method in class org.apache.spark.storage.BlockStore
 
getBytes(BlockId) - Method in class org.apache.spark.storage.DiskStore
 
getBytes(FileSegment) - Method in class org.apache.spark.storage.DiskStore
 
getBytes(BlockId) - Method in class org.apache.spark.storage.MemoryStore
 
getBytes(BlockId) - Method in class org.apache.spark.storage.TachyonStore
 
getByteWritable(Object) - Static method in class org.apache.spark.sql.hive.HiveShim
 
getByteWritableConstantObjectInspector(Object) - Static method in class org.apache.spark.sql.hive.HiveShim
 
getCachedBlockManagerId(BlockManagerId) - Static method in class org.apache.spark.storage.BlockManagerId
 
getCachedMetadata(String) - Static method in class org.apache.spark.rdd.HadoopRDD
The three methods below are helpers for accessing the local map, a property of the SparkEnv of the local process.
getCachedStorageLevel(StorageLevel) - Static method in class org.apache.spark.storage.StorageLevel
 
getCalculator(double[], int) - Method in class org.apache.spark.mllib.tree.impurity.EntropyAggregator
Get an ImpurityCalculator for a (node, feature, bin).
getCalculator(double[], int) - Method in class org.apache.spark.mllib.tree.impurity.GiniAggregator
Get an ImpurityCalculator for a (node, feature, bin).
getCalculator(double[], int) - Method in class org.apache.spark.mllib.tree.impurity.ImpurityAggregator
Get an ImpurityCalculator for a (node, feature, bin).
getCalculator(double[], int) - Method in class org.apache.spark.mllib.tree.impurity.VarianceAggregator
Get an ImpurityCalculator for a (node, feature, bin).
getCalendar() - Static method in class org.apache.spark.sql.parquet.CatalystTimestampConverter
 
getCallSite() - Method in class org.apache.spark.SparkContext
Capture the current user callsite and return a formatted version for printing.
getCallSite(Function1<String, Object>) - Static method in class org.apache.spark.util.Utils
When called inside a class in the spark package, returns the name of the user code class (outside the spark package) that called into Spark, as well as which Spark method they called.
getCatalystType(int, String, int, MetadataBuilder) - Method in class org.apache.spark.sql.jdbc.DriverQuirks
 
getCatalystType(int, String, int, MetadataBuilder) - Method in class org.apache.spark.sql.jdbc.MySQLQuirks
 
getCatalystType(int, String, int, MetadataBuilder) - Method in class org.apache.spark.sql.jdbc.NoQuirks
 
getCatalystType(int, String, int, MetadataBuilder) - Method in class org.apache.spark.sql.jdbc.PostgresQuirks
 
getCategoricalFeaturesInfo() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
 
getCheckpointDir() - Method in class org.apache.spark.api.java.JavaSparkContext
 
getCheckpointDir() - Method in class org.apache.spark.SparkContext
 
getCheckpointFile() - Method in interface org.apache.spark.api.java.JavaRDDLike
Gets the name of the file to which this RDD was checkpointed
getCheckpointFile() - Method in class org.apache.spark.graphx.impl.EdgeRDDImpl
 
getCheckpointFile() - Method in class org.apache.spark.graphx.impl.VertexRDDImpl
 
getCheckpointFile() - Method in class org.apache.spark.rdd.RDD
Gets the name of the file to which this RDD was checkpointed
getCheckpointFile() - Method in class org.apache.spark.rdd.RDDCheckpointData
 
getCheckpointFiles() - Method in class org.apache.spark.graphx.Graph
Gets the name of the files to which this Graph was checkpointed.
getCheckpointFiles() - Method in class org.apache.spark.graphx.impl.GraphImpl
 
getCheckpointFiles(String, FileSystem) - Static method in class org.apache.spark.streaming.Checkpoint
Get the checkpoint files present in the given directory, ordered oldest-first.
getCheckpointInterval() - Method in class org.apache.spark.mllib.clustering.LDA
Period (in iterations) between checkpoints.
getCheckpointInterval() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
 
getClause(String, Seq<Node>) - Static method in class org.apache.spark.sql.hive.HiveQl
 
getClauseOption(String, Seq<Node>) - Static method in class org.apache.spark.sql.hive.HiveQl
 
getClientSideSplits(Configuration, List<Footer>, Long, Long, ReadSupport.ReadContext) - Method in class org.apache.spark.sql.parquet.FilteringParquetRowInputFormat
 
getCombOp() - Static method in class org.apache.spark.util.random.StratifiedSamplingUtils
Returns the function used combine results returned by seqOp from different partitions.
getCommandProcessor(String[], HiveConf) - Static method in class org.apache.spark.sql.hive.HiveShim
 
getConf() - Method in class org.apache.spark.api.java.JavaSparkContext
Return a copy of this JavaSparkContext's configuration.
getConf() - Method in interface org.apache.spark.input.Configurable
 
getConf() - Method in class org.apache.spark.rdd.HadoopRDD
 
getConf() - Method in class org.apache.spark.rdd.NewHadoopRDD
 
getConf() - Method in class org.apache.spark.SparkContext
Return a copy of this SparkContext's configuration.
getConf(String) - Method in class org.apache.spark.sql.SQLConf
Return the value of Spark SQL configuration property for the given key.
getConf(String, String) - Method in class org.apache.spark.sql.SQLConf
Return the value of Spark SQL configuration property for the given key.
getConf(String) - Method in class org.apache.spark.sql.SQLContext
Return the value of Spark SQL configuration property for the given key.
getConf(String, String) - Method in class org.apache.spark.sql.SQLContext
Return the value of Spark SQL configuration property for the given key.
getConnection() - Method in interface org.apache.spark.rdd.JdbcRDD.ConnectionFactory
 
getConnections() - Method in class org.apache.spark.streaming.flume.FlumePollingReceiver
 
getConnector(String, String) - Static method in class org.apache.spark.sql.jdbc.JDBCRDD
Given a driver string and a URL, return a function that loads the specified driver string then returns a connection to the JDBC URL.
getConsumerOffsetMetadata(String, Set<TopicAndPartition>) - Method in class org.apache.spark.streaming.kafka.KafkaCluster
Requires Kafka >= 0.8.1.1
getConsumerOffsets(String, Set<TopicAndPartition>) - Method in class org.apache.spark.streaming.kafka.KafkaCluster
Requires Kafka >= 0.8.1.1
getContextOrSparkClassLoader() - Static method in class org.apache.spark.util.Utils
Get the Context ClassLoader on this thread or, if not present, the ClassLoader that loaded Spark.
getConvergenceTol() - Method in class org.apache.spark.mllib.clustering.GaussianMixture
Return the largest change in log-likelihood at which convergence is considered to have occurred.
getConversions(StructType) - Method in class org.apache.spark.sql.jdbc.JDBCRDD
Maps a StructType to a type tag list.
getConverter(int) - Method in class org.apache.spark.sql.parquet.CatalystArrayContainsNullConverter
 
getConverter(int) - Method in class org.apache.spark.sql.parquet.CatalystArrayConverter
 
getConverter(int) - Method in class org.apache.spark.sql.parquet.CatalystGroupConverter
 
getConverter(int) - Method in class org.apache.spark.sql.parquet.CatalystMapConverter
 
getConverter(int) - Method in class org.apache.spark.sql.parquet.CatalystNativeArrayConverter
 
getConverter(int) - Method in class org.apache.spark.sql.parquet.CatalystPrimitiveRowConverter
 
getCorrelationFromName(String) - Static method in class org.apache.spark.mllib.stat.correlation.Correlations
 
getCreationSite() - Method in class org.apache.spark.rdd.RDD
 
getCreationSite() - Static method in class org.apache.spark.streaming.dstream.DStream
Get the creation site of a DStream from the stack trace of when the DStream is created.
getCurrentKey() - Method in class org.apache.spark.input.FixedLengthBinaryRecordReader
 
getCurrentKey() - Method in class org.apache.spark.input.StreamBasedRecordReader
 
getCurrentKey() - Method in class org.apache.spark.input.WholeTextFileRecordReader
 
getCurrentRecord() - Method in class org.apache.spark.sql.parquet.CatalystConverter
Should only be called in the root (group) converter!
getCurrentRecord() - Method in class org.apache.spark.sql.parquet.CatalystGroupConverter
 
getCurrentRecord() - Method in class org.apache.spark.sql.parquet.CatalystPrimitiveRowConverter
 
getCurrentRecord() - Method in class org.apache.spark.sql.parquet.RowRecordMaterializer
 
getCurrentUserName() - Static method in class org.apache.spark.util.Utils
Returns the current user name.
getCurrentValue() - Method in class org.apache.spark.input.FixedLengthBinaryRecordReader
 
getCurrentValue() - Method in class org.apache.spark.input.StreamBasedRecordReader
 
getCurrentValue() - Method in class org.apache.spark.input.WholeTextFileRecordReader
 
getDataLocationPath(Partition) - Static method in class org.apache.spark.sql.hive.HiveShim
 
getDateWritable(Object) - Static method in class org.apache.spark.sql.hive.HiveShim
 
getDateWritableConstantObjectInspector(Object) - Static method in class org.apache.spark.sql.hive.HiveShim
 
getDecimalWritable(Object) - Static method in class org.apache.spark.sql.hive.HiveShim
 
getDecimalWritableConstantObjectInspector(Object) - Static method in class org.apache.spark.sql.hive.HiveShim
 
getDefaultPropertiesFile(Map<String, String>) - Static method in class org.apache.spark.util.Utils
Return the path of the default Spark properties file.
getDefaultWorkFile(TaskAttemptContext, String) - Method in class org.apache.spark.sql.parquet.AppendingParquetOutputFormat
 
getDelaySeconds(SparkConf) - Static method in class org.apache.spark.util.MetadataCleaner
 
getDelaySeconds(SparkConf, Enumeration.Value) - Static method in class org.apache.spark.util.MetadataCleaner
 
getDependencies() - Method in class org.apache.spark.rdd.CartesianRDD
 
getDependencies() - Method in class org.apache.spark.rdd.CoalescedRDD
 
getDependencies() - Method in class org.apache.spark.rdd.CoGroupedRDD
 
getDependencies() - Method in class org.apache.spark.rdd.ShuffledRDD
 
getDependencies() - Method in class org.apache.spark.rdd.SubtractedRDD
 
getDependencies() - Method in class org.apache.spark.rdd.UnionRDD
 
getDirName() - Method in class org.apache.spark.sql.hive.ShimFileSinkDesc
 
getDiskWriter(BlockId, File, Serializer, int, ShuffleWriteMetrics) - Method in class org.apache.spark.storage.BlockManager
A short-circuited method to get a block writer that can write data directly to disk.
getDocConcentration() - Method in class org.apache.spark.mllib.clustering.LDA
Concentration parameter (commonly named "alpha") for the prior placed on documents' distributions over topics ("theta").
getDouble(String, double) - Method in class org.apache.spark.SparkConf
Get a parameter as a double, falling back to a default if not set
getDoubleWritable(Object) - Static method in class org.apache.spark.sql.hive.HiveShim
 
getDoubleWritableConstantObjectInspector(Object) - Static method in class org.apache.spark.sql.hive.HiveShim
 
getEarliestLeaderOffsets(Set<TopicAndPartition>) - Method in class org.apache.spark.streaming.kafka.KafkaCluster
 
getEntrySet() - Method in class org.apache.spark.util.TimeStampedHashMap
 
getenv(String) - Method in class org.apache.spark.SparkConf
By using this instead of System.getenv(), environment variables can be mocked in unit tests.
getEpoch() - Method in class org.apache.spark.MapOutputTracker
Called to get current epoch number.
getEstimator() - Method in interface org.apache.spark.ml.tuning.CrossValidatorParams
 
getEstimatorParamMaps() - Method in interface org.apache.spark.ml.tuning.CrossValidatorParams
 
getEvaluator() - Method in interface org.apache.spark.ml.tuning.CrossValidatorParams
 
getExecutorEnv() - Method in class org.apache.spark.SparkConf
Get all executor environment variables set on this SparkConf
getExecutorMemoryStatus() - Method in class org.apache.spark.SparkContext
Return a map from each slave to the maximum memory available for caching and the remaining memory available for caching.
getExecutorsAliveOnHost(String) - Method in class org.apache.spark.scheduler.TaskSchedulerImpl
 
getExecutorStorageStatus() - Method in class org.apache.spark.SparkContext
:: DeveloperApi :: Return information about blocks stored in all of the slaves
getExecutorThreadDump(String) - Method in class org.apache.spark.SparkContext
Called by the web UI to obtain executor thread dumps.
getExternalTmpPath(Context, Path) - Static method in class org.apache.spark.sql.hive.HiveShim
 
getFeatureOffset(int) - Method in class org.apache.spark.mllib.tree.impl.DTStatsAggregator
Pre-compute feature offset for use with featureUpdate.
getFeaturesCol() - Method in interface org.apache.spark.ml.param.HasFeaturesCol
 
getField(String) - Method in class org.apache.spark.sql.Column
An expression that gets a field by name in a StructField.
getField(Row, int) - Static method in class org.apache.spark.sql.columnar.BINARY
 
getField(Row, int) - Static method in class org.apache.spark.sql.columnar.BOOLEAN
 
getField(Row, int) - Static method in class org.apache.spark.sql.columnar.BYTE
 
getField(Row, int) - Method in class org.apache.spark.sql.columnar.ColumnType
Returns row(ordinal).
getField(Row, int) - Static method in class org.apache.spark.sql.columnar.DATE
 
getField(Row, int) - Static method in class org.apache.spark.sql.columnar.DOUBLE
 
getField(Row, int) - Static method in class org.apache.spark.sql.columnar.FLOAT
 
getField(Row, int) - Static method in class org.apache.spark.sql.columnar.GENERIC
 
getField(Row, int) - Static method in class org.apache.spark.sql.columnar.INT
 
getField(Row, int) - Static method in class org.apache.spark.sql.columnar.LONG
 
getField(Row, int) - Static method in class org.apache.spark.sql.columnar.SHORT
 
getField(Row, int) - Static method in class org.apache.spark.sql.columnar.STRING
 
getField(Row, int) - Static method in class org.apache.spark.sql.columnar.TIMESTAMP
 
getFile(long) - Static method in class org.apache.spark.broadcast.HttpBroadcast
 
getFile(String) - Method in class org.apache.spark.storage.DiskBlockManager
Looks up a file by hashing it into one of our local subdirectories.
getFile(BlockId) - Method in class org.apache.spark.storage.DiskBlockManager
 
getFile(String) - Method in class org.apache.spark.storage.TachyonBlockManager
 
getFile(BlockId) - Method in class org.apache.spark.storage.TachyonBlockManager
 
getFilePath(File, String) - Static method in class org.apache.spark.util.Utils
Return the absolute path of a file in the given directory.
getFileSegmentLocations(String, long, long, Configuration) - Static method in class org.apache.spark.streaming.util.HdfsUtils
Get the locations of the HDFS blocks containing the given file segment.
getFileSystemForPath(Path, Configuration) - Static method in class org.apache.spark.streaming.util.HdfsUtils
 
getFinalValue() - Method in class org.apache.spark.partial.PartialResult
Blocking method to wait for and return the final value.
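A minimal sketch of how a PartialResult is typically consumed, assuming an existing SparkContext named sc; the RDD and timeout values are illustrative only:

    val rdd = sc.parallelize(1 to 1000000, 100)
    // countApprox returns a PartialResult[BoundedDouble] after the timeout elapses
    val approx = rdd.countApprox(timeout = 500, confidence = 0.95)
    println(approx)                  // may print a bounded estimate
    println(approx.getFinalValue())  // blocks until the exact count is available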
getFloatWritable(Object) - Static method in class org.apache.spark.sql.hive.HiveShim
 
getFloatWritableConstantObjectInspector(Object) - Static method in class org.apache.spark.sql.hive.HiveShim
 
getFormattedClassName(Object) - Static method in class org.apache.spark.util.Utils
Return the class name of the given object, removing all dollar signs
getFunctionInfo(String) - Method in class org.apache.spark.sql.hive.HiveFunctionRegistry
 
getHadoopFileSystem(URI, Configuration) - Static method in class org.apache.spark.util.Utils
Return a Hadoop FileSystem with the scheme encoded in the given path.
getHadoopFileSystem(String, Configuration) - Static method in class org.apache.spark.util.Utils
Return a Hadoop FileSystem with the scheme encoded in the given path.
getHandlers() - Method in class org.apache.spark.metrics.sink.MetricsServlet
 
getHandlers() - Method in class org.apache.spark.ui.WebUI
 
getHttpUser() - Method in class org.apache.spark.SecurityManager
Gets the user used for authenticating HTTP connections.
getImplicitPrefs() - Method in interface org.apache.spark.ml.recommendation.ALSParams
 
getImpurity() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
 
getImpurityCalculator(int, int) - Method in class org.apache.spark.mllib.tree.impl.DTStatsAggregator
Get an ImpurityCalculator for a given (node, feature, bin).
getInitialModel() - Method in class org.apache.spark.mllib.clustering.GaussianMixture
Return the user-supplied initial GMM, if any
getInputCol() - Method in interface org.apache.spark.ml.param.HasInputCol
 
getInputStream(String, Configuration) - Static method in class org.apache.spark.streaming.util.HdfsUtils
 
getInputStreams() - Method in class org.apache.spark.streaming.DStreamGraph
 
getInstance(String) - Method in class org.apache.spark.metrics.MetricsConfig
 
getInt(String, int) - Method in class org.apache.spark.SparkConf
Get a parameter as an integer, falling back to a default if not set
getIntWritable(Object) - Static method in class org.apache.spark.sql.hive.HiveShim
 
getIntWritableConstantObjectInspector(Object) - Static method in class org.apache.spark.sql.hive.HiveShim
 
getItem(int) - Method in class org.apache.spark.sql.Column
An expression that gets an item at position ordinal out of an array.
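A hedged sketch of getItem and getField (the entry above) on a DataFrame Column, assuming a DataFrame df with a hypothetical array column "scores" and struct column "address":

    // select the first score and the city field of the address struct
    df.select(df("scores").getItem(0), df("address").getField("city")).show()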
getItemCol() - Method in interface org.apache.spark.ml.recommendation.ALSParams
 
getIteratorSize(Iterator<T>) - Static method in class org.apache.spark.util.Utils
Counts the number of elements of an iterator using a while loop rather than calling TraversableOnce.size(), which uses a for loop that is slightly slower in the current version of Scala.
getJDBCType(DataType) - Method in class org.apache.spark.sql.jdbc.DriverQuirks
 
getJDBCType(DataType) - Method in class org.apache.spark.sql.jdbc.MySQLQuirks
 
getJDBCType(DataType) - Method in class org.apache.spark.sql.jdbc.NoQuirks
 
getJDBCType(DataType) - Method in class org.apache.spark.sql.jdbc.PostgresQuirks
 
getJobIdsForGroup(String) - Method in class org.apache.spark.api.java.JavaSparkStatusTracker
Return a list of all known jobs in a particular job group.
getJobIdsForGroup(String) - Method in class org.apache.spark.SparkStatusTracker
Return a list of all known jobs in a particular job group.
getJobInfo(int) - Method in class org.apache.spark.api.java.JavaSparkStatusTracker
Returns job information, or null if the job info could not be found or was garbage collected.
getJobInfo(int) - Method in class org.apache.spark.SparkStatusTracker
Returns job information, or None if the job info could not be found or was garbage collected.
getJulianDay() - Method in class org.apache.spark.sql.parquet.timestamp.NanoTime
 
getK() - Method in class org.apache.spark.mllib.clustering.GaussianMixture
Return the number of Gaussians in the mixture model
getK() - Method in class org.apache.spark.mllib.clustering.LDA
Number of topics to infer.
getLabelCol() - Method in interface org.apache.spark.ml.param.HasLabelCol
 
getLatestLeaderOffsets(Set<TopicAndPartition>) - Method in class org.apache.spark.streaming.kafka.KafkaCluster
 
getLeaderOffsets(Set<TopicAndPartition>, long) - Method in class org.apache.spark.streaming.kafka.KafkaCluster
 
getLeaderOffsets(Set<TopicAndPartition>, long, int) - Method in class org.apache.spark.streaming.kafka.KafkaCluster
 
getLearningRate() - Method in class org.apache.spark.mllib.tree.configuration.BoostingStrategy
 
getLeastGroupHash(String) - Method in class org.apache.spark.rdd.PartitionCoalescer
Sorts and gets the least element of the list associated with the key in groupHash. The returned PartitionGroup is the least loaded of all groups that represent the machine "key".
getLeftRightFeatureOffsets(int) - Method in class org.apache.spark.mllib.tree.impl.DTStatsAggregator
Pre-compute feature offset for use with featureUpdate.
getLocal(BlockId) - Method in class org.apache.spark.storage.BlockManager
Get block from local block manager.
getLocalBytes(BlockId) - Method in class org.apache.spark.storage.BlockManager
Get block from the local block manager as serialized bytes.
getLocalDir(SparkConf) - Static method in class org.apache.spark.util.Utils
Get the path of a temporary directory.
getLocalFileWriter(Row) - Method in class org.apache.spark.sql.hive.SparkHiveDynamicPartitionWriterContainer
 
getLocalFileWriter(Row) - Method in class org.apache.spark.sql.hive.SparkHiveWriterContainer
 
getLocalityIndex(Enumeration.Value) - Method in class org.apache.spark.scheduler.TaskSetManager
Find the index in myLocalityLevels for a given locality.
getLocalProperties() - Method in class org.apache.spark.SparkContext
 
getLocalProperty(String) - Method in class org.apache.spark.api.java.JavaSparkContext
Get a local property set in this thread, or null if it is missing.
getLocalProperty(String) - Method in class org.apache.spark.SparkContext
Get a local property set in this thread, or null if it is missing.
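A small illustration of thread-local properties, assuming an existing SparkContext named sc; "spark.scheduler.pool" is the property the fair scheduler reads, used here only as an example:

    sc.setLocalProperty("spark.scheduler.pool", "reporting")
    val pool = sc.getLocalProperty("spark.scheduler.pool")   // "reporting"
    sc.setLocalProperty("spark.scheduler.pool", null)        // clear the property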
getLocation() - Method in class org.apache.spark.rdd.HadoopRDD.SplitInfoReflections
 
getLocationInfo() - Method in class org.apache.spark.rdd.HadoopRDD.SplitInfoReflections
 
getLocations(BlockId) - Method in class org.apache.spark.storage.BlockManagerMaster
Get locations of the blockId from the driver
getLocations(BlockId[]) - Method in class org.apache.spark.storage.BlockManagerMaster
Get locations of multiple blockIds from the driver
getLogPath(String, String, Option<String>) - Static method in class org.apache.spark.scheduler.EventLoggingListener
Return a file-system-safe path to the log file for the given application.
getLong(String, long) - Method in class org.apache.spark.SparkConf
Get a parameter as a long, falling back to a default if not set
getLongWritable(Object) - Static method in class org.apache.spark.sql.hive.HiveShim
 
getLongWritableConstantObjectInspector(Object) - Static method in class org.apache.spark.sql.hive.HiveShim
 
getLoss() - Method in class org.apache.spark.mllib.tree.configuration.BoostingStrategy
 
getLowerBound(double, long, double) - Static method in class org.apache.spark.util.random.BinomialBounds
Returns a threshold p such that if we conduct n Bernoulli trials with success rate = p, it is very unlikely to have more than fraction * n successes.
getLowerBound(double) - Static method in class org.apache.spark.util.random.PoissonBounds
Returns a lambda such that Pr[X > s] is very small, where X ~ Pois(lambda).
GetMapOutputStatuses - Class in org.apache.spark
 
GetMapOutputStatuses(int) - Constructor for class org.apache.spark.GetMapOutputStatuses
 
getMatchingBlockIds(Function1<BlockId, Object>) - Method in class org.apache.spark.storage.BlockManager
Get the ids of existing blocks that match the given filter.
getMatchingBlockIds(Function1<BlockId, Object>, boolean) - Method in class org.apache.spark.storage.BlockManagerMaster
Return a list of ids of existing blocks such that the ids match the given filter.
getMaxBatchSize() - Method in class org.apache.spark.streaming.flume.FlumePollingReceiver
 
getMaxBins() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
 
getMaxDepth() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
 
getMaxInputStreamRememberDuration() - Method in class org.apache.spark.streaming.DStreamGraph
Get the maximum remember duration across all the input streams.
getMaxIter() - Method in interface org.apache.spark.ml.param.HasMaxIter
 
getMaxIterations() - Method in class org.apache.spark.mllib.clustering.GaussianMixture
Return the maximum number of iterations to run
getMaxIterations() - Method in class org.apache.spark.mllib.clustering.LDA
Maximum number of iterations for learning.
getMaxMemoryInMB() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
 
getMaxResultSize(SparkConf) - Static method in class org.apache.spark.util.Utils
 
getMemoryStatus() - Method in class org.apache.spark.storage.BlockManagerMaster
Return the memory status for each block manager, in the form of a map from the block manager's id to two long values.
getMessage() - Method in exception org.apache.spark.util.TaskCompletionListenerException
 
getMetricName() - Method in class org.apache.spark.ml.evaluation.BinaryClassificationEvaluator
 
getMetricsSnapshot(HttpServletRequest) - Method in class org.apache.spark.metrics.sink.MetricsServlet
 
getMinInfoGain() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
 
getMinInstancesPerNode() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
 
getModel(Estimator<M>) - Method in class org.apache.spark.ml.PipelineModel
Gets the model produced by the input estimator.
getModifyAcls() - Method in class org.apache.spark.SecurityManager
 
getNarrowAncestors() - Method in class org.apache.spark.rdd.RDD
Return the ancestors of the given RDD that are related to it only through a sequence of narrow dependencies.
getNewReceiverStreamId() - Method in class org.apache.spark.streaming.StreamingContext
 
getNode(int, Node) - Static method in class org.apache.spark.mllib.tree.model.Node
Traces down from a root node to get the node with the given node index.
getNonnegative() - Method in interface org.apache.spark.ml.recommendation.ALSParams
 
getNumClasses() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
 
getNumFeatures() - Method in class org.apache.spark.ml.feature.HashingTF
 
getNumFolds() - Method in interface org.apache.spark.ml.tuning.CrossValidatorParams
 
getNumItemBlocks() - Method in interface org.apache.spark.ml.recommendation.ALSParams
 
getNumIterations() - Method in class org.apache.spark.mllib.tree.configuration.BoostingStrategy
 
getNumObjFields() - Method in class org.apache.spark.serializer.SerializationDebugger.ObjectStreamClassMethods
 
getNumUserBlocks() - Method in interface org.apache.spark.ml.recommendation.ALSParams
 
getObjFieldValues(Object, Object[]) - Method in class org.apache.spark.serializer.SerializationDebugger.ObjectStreamClassMethods
 
getOption(String) - Method in class org.apache.spark.SparkConf
Get a parameter as an Option
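A minimal sketch of the SparkConf getters (getOption here, getInt/getLong/getDouble above), with illustrative keys and values:

    import org.apache.spark.SparkConf

    val conf = new SparkConf().setAppName("example").set("spark.executor.memory", "2g")
    conf.getOption("spark.executor.memory")   // Some("2g")
    conf.getOption("spark.not.set")           // None
    conf.getInt("spark.executor.cores", 1)    // not set, so falls back to 1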
getOrCompute(RDD<T>, Partition, TaskContext, StorageLevel) - Method in class org.apache.spark.CacheManager
Gets or computes an RDD partition.
getOrCompute(Time) - Method in class org.apache.spark.streaming.dstream.DStream
Get the RDD corresponding to the given time; either retrieve it from cache or compute-and-cache it.
getOrCreate(String, JavaStreamingContextFactory) - Static method in class org.apache.spark.streaming.api.java.JavaStreamingContext
Either recreate a StreamingContext from checkpoint data or create a new StreamingContext.
getOrCreate(String, Configuration, JavaStreamingContextFactory) - Static method in class org.apache.spark.streaming.api.java.JavaStreamingContext
Either recreate a StreamingContext from checkpoint data or create a new StreamingContext.
getOrCreate(String, Configuration, JavaStreamingContextFactory, boolean) - Static method in class org.apache.spark.streaming.api.java.JavaStreamingContext
Either recreate a StreamingContext from checkpoint data or create a new StreamingContext.
getOrCreate(String, Function0<StreamingContext>, Configuration, boolean) - Static method in class org.apache.spark.streaming.StreamingContext
Either recreate a StreamingContext from checkpoint data or create a new StreamingContext.
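A sketch of the usual getOrCreate recovery pattern; the checkpoint directory is hypothetical, and the body of createContext is only a placeholder:

    import org.apache.spark.SparkConf
    import org.apache.spark.streaming.{Seconds, StreamingContext}

    val checkpointDir = "hdfs:///tmp/checkpoints"   // hypothetical path

    def createContext(): StreamingContext = {
      val conf = new SparkConf().setAppName("example")
      val ssc = new StreamingContext(conf, Seconds(10))
      ssc.checkpoint(checkpointDir)
      // define input DStreams and transformations here...
      ssc
    }

    // Rebuilds the context from checkpoint data if present, otherwise calls createContext
    val ssc = StreamingContext.getOrCreate(checkpointDir, createContext _)
    ssc.start()
    ssc.awaitTermination()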
getOrCreateLocalRootDirs(SparkConf) - Static method in class org.apache.spark.util.Utils
Gets or creates the directories listed in spark.local.dir or SPARK_LOCAL_DIRS, and returns only the directories that exist / could be created.
getOutputCol() - Method in interface org.apache.spark.ml.param.HasOutputCol
 
getOutputStream(String, Configuration) - Static method in class org.apache.spark.streaming.util.HdfsUtils
 
getOutputStreams() - Method in class org.apache.spark.streaming.DStreamGraph
 
getParam(String) - Method in interface org.apache.spark.ml.param.Params
Gets a param by its name.
getParents(int) - Method in class org.apache.spark.NarrowDependency
Get the parent partitions for a child partition.
getParents(int) - Method in class org.apache.spark.OneToOneDependency
 
getParents(int) - Method in class org.apache.spark.RangeDependency
 
getParents(int) - Method in class org.apache.spark.rdd.PruneDependency
 
getPartition(long, long, int) - Method in class org.apache.spark.graphx.PartitionStrategy.CanonicalRandomVertexCut$
 
getPartition(long, long, int) - Method in class org.apache.spark.graphx.PartitionStrategy.EdgePartition1D$
 
getPartition(long, long, int) - Method in class org.apache.spark.graphx.PartitionStrategy.EdgePartition2D$
 
getPartition(long, long, int) - Method in interface org.apache.spark.graphx.PartitionStrategy
Returns the partition number for a given edge.
getPartition(long, long, int) - Method in class org.apache.spark.graphx.PartitionStrategy.RandomVertexCut$
 
getPartition(Object) - Method in class org.apache.spark.HashPartitioner
 
getPartition(Object) - Method in class org.apache.spark.mllib.linalg.distributed.GridPartitioner
Returns the index of the partition the input coordinate belongs to.
getPartition(Object) - Method in class org.apache.spark.Partitioner
 
getPartition(Object) - Method in class org.apache.spark.RangePartitioner
 
getPartitionMetadata(Set<String>) - Method in class org.apache.spark.streaming.kafka.KafkaCluster
 
getPartitions() - Method in class org.apache.spark.mllib.rdd.RandomRDD
 
getPartitions() - Method in class org.apache.spark.mllib.rdd.SlidingRDD
 
getPartitions() - Method in class org.apache.spark.rdd.BinaryFileRDD
 
getPartitions() - Method in class org.apache.spark.rdd.BlockRDD
 
getPartitions() - Method in class org.apache.spark.rdd.CartesianRDD
 
getPartitions() - Method in class org.apache.spark.rdd.CheckpointRDD
 
getPartitions() - Method in class org.apache.spark.rdd.CoalescedRDD
 
getPartitions() - Method in class org.apache.spark.rdd.CoGroupedRDD
 
getPartitions() - Method in class org.apache.spark.rdd.EmptyRDD
 
getPartitions() - Method in class org.apache.spark.rdd.HadoopRDD
 
getPartitions() - Method in class org.apache.spark.rdd.HadoopRDD.HadoopMapPartitionsWithSplitRDD
 
getPartitions() - Method in class org.apache.spark.rdd.JdbcRDD
 
getPartitions() - Method in class org.apache.spark.rdd.MapPartitionsRDD
 
getPartitions() - Method in class org.apache.spark.rdd.NewHadoopRDD
 
getPartitions() - Method in class org.apache.spark.rdd.NewHadoopRDD.NewHadoopMapPartitionsWithSplitRDD
 
getPartitions() - Method in class org.apache.spark.rdd.ParallelCollectionRDD
 
getPartitions() - Method in class org.apache.spark.rdd.PartitionCoalescer
 
getPartitions() - Method in class org.apache.spark.rdd.PartitionerAwareUnionRDD
 
getPartitions() - Method in class org.apache.spark.rdd.PartitionwiseSampledRDD
 
getPartitions() - Method in class org.apache.spark.rdd.PipedRDD
 
getPartitions() - Method in class org.apache.spark.rdd.RDDCheckpointData
 
getPartitions() - Method in class org.apache.spark.rdd.SampledRDD
 
getPartitions() - Method in class org.apache.spark.rdd.ShuffledRDD
 
getPartitions() - Method in class org.apache.spark.rdd.SubtractedRDD
 
getPartitions() - Method in class org.apache.spark.rdd.UnionRDD
 
getPartitions() - Method in class org.apache.spark.rdd.WholeTextFileRDD
 
getPartitions() - Method in class org.apache.spark.rdd.ZippedPartitionsBaseRDD
 
getPartitions() - Method in class org.apache.spark.rdd.ZippedWithIndexRDD
 
getPartitions() - Method in class org.apache.spark.sql.jdbc.JDBCRDD
Retrieve the list of partitions corresponding to this RDD.
getPartitions(Set<String>) - Method in class org.apache.spark.streaming.kafka.KafkaCluster
 
getPartitions() - Method in class org.apache.spark.streaming.kafka.KafkaRDD
 
getPartitions() - Method in class org.apache.spark.streaming.rdd.WriteAheadLogBackedBlockRDD
 
getPath() - Method in class org.apache.spark.input.PortableDataStream
 
getPeers(BlockManagerId) - Method in class org.apache.spark.storage.BlockManagerMaster
Get ids of other nodes in the cluster from the driver
getPendingTimes() - Method in class org.apache.spark.streaming.scheduler.JobScheduler
 
getPersistentRDDs() - Method in class org.apache.spark.SparkContext
Returns an immutable map of RDDs that have marked themselves as persistent via a cache() call.
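A small sketch of inspecting persistent RDDs, assuming an existing SparkContext named sc:

    val data = sc.parallelize(1 to 100).setName("numbers").cache()
    data.count()   // materialize the cached RDD
    sc.getPersistentRDDs.foreach { case (id, rdd) =>
      println(s"$id -> ${rdd.name} @ ${rdd.getStorageLevel}")
    }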
getPipeEnvVars() - Method in class org.apache.spark.rdd.HadoopPartition
Get any environment variables that should be added to the user's environment when running pipes
getPipeline() - Method in class org.apache.spark.streaming.flume.FlumeReceiver.CompressionChannelPipelineFactory
 
getPointIterator(RandomRDDPartition<T>, ClassTag<T>) - Static method in class org.apache.spark.mllib.rdd.RandomRDD
 
getPoissonSamplingFunction(RDD<Tuple2<K, V>>, Map<K, Object>, boolean, long, ClassTag<K>, ClassTag<V>) - Static method in class org.apache.spark.util.random.StratifiedSamplingUtils
Return the per partition sampling function used for sampling with replacement.
getPoolForName(String) - Method in class org.apache.spark.SparkContext
:: DeveloperApi :: Return the pool associated with the given name, if one exists
getPredictionCol() - Method in interface org.apache.spark.ml.param.HasPredictionCol
 
getPreferredLocations(Partition) - Method in class org.apache.spark.mllib.rdd.SlidingRDD
 
getPreferredLocations(Partition) - Method in class org.apache.spark.rdd.BlockRDD
 
getPreferredLocations(Partition) - Method in class org.apache.spark.rdd.CartesianRDD
 
getPreferredLocations(Partition) - Method in class org.apache.spark.rdd.CheckpointRDD
 
getPreferredLocations(Partition) - Method in class org.apache.spark.rdd.CoalescedRDD
Returns the preferred machine for the partition.
getPreferredLocations(Partition) - Method in class org.apache.spark.rdd.HadoopRDD
 
getPreferredLocations(Partition) - Method in class org.apache.spark.rdd.NewHadoopRDD
 
getPreferredLocations(Partition) - Method in class org.apache.spark.rdd.ParallelCollectionRDD
 
getPreferredLocations(Partition) - Method in class org.apache.spark.rdd.PartitionerAwareUnionRDD
 
getPreferredLocations(Partition) - Method in class org.apache.spark.rdd.PartitionwiseSampledRDD
 
getPreferredLocations(Partition) - Method in class org.apache.spark.rdd.RDDCheckpointData
 
getPreferredLocations(Partition) - Method in class org.apache.spark.rdd.SampledRDD
 
getPreferredLocations(Partition) - Method in class org.apache.spark.rdd.UnionRDD
 
getPreferredLocations(Partition) - Method in class org.apache.spark.rdd.ZippedPartitionsBaseRDD
 
getPreferredLocations(Partition) - Method in class org.apache.spark.rdd.ZippedWithIndexRDD
 
getPreferredLocations(Partition) - Method in class org.apache.spark.streaming.kafka.KafkaRDD
 
getPreferredLocations(Partition) - Method in class org.apache.spark.streaming.rdd.WriteAheadLogBackedBlockRDD
Get the preferred location of the partition.
getPreferredLocs(RDD<?>, int) - Method in class org.apache.spark.scheduler.DAGScheduler
Gets the locality information associated with a partition of a particular RDD.
getPreferredLocs(RDD<?>, int) - Method in class org.apache.spark.SparkContext
Gets the locality information associated with the partition in a particular RDD
getPrimitiveNullWritable() - Static method in class org.apache.spark.sql.hive.HiveShim
 
getPrimitiveNullWritableConstantObjectInspector() - Static method in class org.apache.spark.sql.hive.HiveShim
 
getProbabilityCol() - Method in interface org.apache.spark.ml.param.HasProbabilityCol
 
getProgress() - Method in class org.apache.spark.input.FixedLengthBinaryRecordReader
 
getProgress() - Method in class org.apache.spark.input.StreamBasedRecordReader
 
getProgress() - Method in class org.apache.spark.input.WholeTextFileRecordReader
 
getPropertiesFromFile(String) - Static method in class org.apache.spark.util.Utils
Load properties present in the given file.
getQuantileCalculationStrategy() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
 
getQuantiles(Traversable<Object>) - Method in class org.apache.spark.util.Distribution
Get the value of the distribution at the given probabilities.
getRackForHost(String) - Method in class org.apache.spark.scheduler.TaskSchedulerImpl
 
getRank() - Method in interface org.apache.spark.ml.recommendation.ALSParams
 
getRatingCol() - Method in interface org.apache.spark.ml.recommendation.ALSParams
 
getRawPredictionCol() - Method in interface org.apache.spark.ml.param.HasRawPredictionCol
 
getRddBlockLocations(int, Seq<StorageStatus>) - Static method in class org.apache.spark.storage.StorageUtils
Return a mapping from block ID to its locations for each block that belongs to the given RDD.
getRDDStorageInfo() - Method in class org.apache.spark.SparkContext
:: DeveloperApi :: Return information about what RDDs are cached, whether they are in memory or on disk, how much space they take, etc.
getReceiver() - Method in class org.apache.spark.streaming.dstream.PluggableInputDStream
 
getReceiver() - Method in class org.apache.spark.streaming.dstream.RawInputDStream
 
getReceiver() - Method in class org.apache.spark.streaming.dstream.ReceiverInputDStream
Gets the receiver object that will be sent to the worker nodes to receive data.
getReceiver() - Method in class org.apache.spark.streaming.dstream.SocketInputDStream
 
getReceiver() - Method in class org.apache.spark.streaming.flume.FlumeInputDStream
 
getReceiver() - Method in class org.apache.spark.streaming.flume.FlumePollingInputDStream
 
getReceiver() - Method in class org.apache.spark.streaming.kafka.KafkaInputDStream
 
getReceiver() - Method in class org.apache.spark.streaming.mqtt.MQTTInputDStream
 
getReceiver() - Method in class org.apache.spark.streaming.twitter.TwitterInputDStream
 
getReceiverInputStreams() - Method in class org.apache.spark.streaming.DStreamGraph
 
getRecordLength(JobContext) - Static method in class org.apache.spark.input.FixedLengthBinaryInputFormat
Retrieves the record length property from a Hadoop configuration
getReference(A) - Method in class org.apache.spark.util.TimeStampedWeakValueHashMap
 
getRegParam() - Method in interface org.apache.spark.ml.param.HasRegParam
 
getRemote(BlockId) - Method in class org.apache.spark.storage.BlockManager
Get block from remote block managers.
getRemoteBytes(BlockId) - Method in class org.apache.spark.storage.BlockManager
Get block from remote block managers as serialized bytes.
getResource(List<Protos.Resource>, String) - Method in class org.apache.spark.scheduler.cluster.mesos.MesosSchedulerBackend
Helper function to pull out a resource from a Mesos Resources protobuf
getResource(String) - Method in class org.apache.spark.util.ChildFirstURLClassLoader
 
getResources(String) - Method in class org.apache.spark.util.ChildFirstURLClassLoader
 
getRestartTime(long) - Method in class org.apache.spark.streaming.util.RecurringTimer
Get the time when the timer will fire if it is restarted right now.
getRootConverter() - Method in class org.apache.spark.sql.parquet.RowRecordMaterializer
 
getRootDirectory() - Static method in class org.apache.spark.SparkFiles
Get the root directory that contains files added through SparkContext.addFile().
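A hedged sketch of the addFile/SparkFiles pairing, assuming an existing SparkContext named sc; the file path is hypothetical:

    import org.apache.spark.SparkFiles

    sc.addFile("/local/path/config.json")          // hypothetical local file
    // On executors (and on the driver in local mode), resolve the distributed copy:
    val dir  = SparkFiles.getRootDirectory()
    val path = SparkFiles.get("config.json")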
getSaslUser() - Method in class org.apache.spark.SecurityManager
Gets the user used for authenticating SASL connections.
getSaslUser(String) - Method in class org.apache.spark.SecurityManager
 
getSchedulableByName(String) - Method in class org.apache.spark.scheduler.Pool
 
getSchedulableByName(String) - Method in interface org.apache.spark.scheduler.Schedulable
 
getSchedulableByName(String) - Method in class org.apache.spark.scheduler.TaskSetManager
 
getSchedulingMode() - Method in class org.apache.spark.SparkContext
Return current scheduling mode
getSchema(Configuration) - Static method in class org.apache.spark.sql.parquet.RowWriteSupport
 
getSecretKey() - Method in class org.apache.spark.SecurityManager
Gets the secret key.
getSecretKey(String) - Method in class org.apache.spark.SecurityManager
 
getSecurityManager() - Method in class org.apache.spark.ui.WebUI
 
getSeed() - Method in class org.apache.spark.mllib.clustering.GaussianMixture
Return the random seed
getSeed() - Method in class org.apache.spark.mllib.clustering.LDA
Random seed
getSeqOp(boolean, Map<K, Object>, StratifiedSamplingUtils.RandomDataGenerator, Option<Map<K, Object>>) - Static method in class org.apache.spark.util.random.StratifiedSamplingUtils
Returns the function used by aggregate to collect sampling statistics for each partition.
getSerializedMapOutputStatuses(int) - Method in class org.apache.spark.MapOutputTrackerMaster
 
getSerializer(Serializer) - Static method in class org.apache.spark.serializer.Serializer
 
getSerializer(Option<Serializer>) - Static method in class org.apache.spark.serializer.Serializer
 
getServerStatuses(int, int) - Method in class org.apache.spark.MapOutputTracker
Called from executors to get the server URIs and output sizes of the map outputs of a given shuffle.
getServletHandlers() - Method in class org.apache.spark.metrics.MetricsSystem
Get any UI handlers used by this metrics system; can only be called after start().
getShortWritable(Object) - Static method in class org.apache.spark.sql.hive.HiveShim
 
getShortWritableConstantObjectInspector(Object) - Static method in class org.apache.spark.sql.hive.HiveShim
 
getSingle(BlockId) - Method in class org.apache.spark.storage.BlockManager
Read a block consisting of a single object.
getSize(BlockId) - Method in class org.apache.spark.storage.BlockStore
Return the size of a block in bytes.
getSize(BlockId) - Method in class org.apache.spark.storage.DiskStore
 
getSize(BlockId) - Method in class org.apache.spark.storage.MemoryStore
 
getSize(BlockId) - Method in class org.apache.spark.storage.TachyonStore
 
getSizeForBlock(int) - Method in class org.apache.spark.scheduler.CompressedMapStatus
 
getSizeForBlock(int) - Method in class org.apache.spark.scheduler.HighlyCompressedMapStatus
 
getSizeForBlock(int) - Method in interface org.apache.spark.scheduler.MapStatus
Estimated size for the reduce block, in bytes.
getSizesOfActiveStateTrackingCollections() - Method in class org.apache.spark.ui.jobs.JobProgressListener
 
getSizesOfHardSizeLimitedCollections() - Method in class org.apache.spark.ui.jobs.JobProgressListener
 
getSizesOfSoftSizeLimitedCollections() - Method in class org.apache.spark.ui.jobs.JobProgressListener
 
getSlotDescs() - Method in class org.apache.spark.serializer.SerializationDebugger.ObjectStreamClassMethods
 
getSortedRolledOverFiles(String, String) - Static method in class org.apache.spark.util.logging.RollingFileAppender
Get the sorted list of rolled over files.
getSortedTaskSetQueue() - Method in class org.apache.spark.scheduler.Pool
 
getSortedTaskSetQueue() - Method in interface org.apache.spark.scheduler.Schedulable
 
getSortedTaskSetQueue() - Method in class org.apache.spark.scheduler.TaskSetManager
 
getSparkClassLoader() - Static method in class org.apache.spark.util.Utils
Get the ClassLoader which loaded Spark.
getSparkHome() - Method in class org.apache.spark.api.java.JavaSparkContext
Get Spark's home location from either a value set through the constructor, or the spark.home Java property, or the SPARK_HOME environment variable (in that order of preference).
getSparkHome() - Method in class org.apache.spark.SparkContext
Get Spark's home location from either a value set through the constructor, or the spark.home Java property, or the SPARK_HOME environment variable (in that order of preference).
getSparkOrYarnConfig(SparkConf, String, String) - Static method in class org.apache.spark.util.Utils
Return the value of a config either through the SparkConf or the Hadoop configuration if this is Yarn mode.
getSparkUI(StreamingContext) - Static method in class org.apache.spark.streaming.ui.StreamingTab
 
getSplits(JobContext) - Method in class org.apache.spark.sql.parquet.FilteringParquetRowInputFormat
 
getSplits(Configuration, List<Footer>) - Method in class org.apache.spark.sql.parquet.FilteringParquetRowInputFormat
 
getStageInfo(int) - Method in class org.apache.spark.api.java.JavaSparkStatusTracker
Returns stage information, or null if the stage info could not be found or was garbage collected.
getStageInfo(int) - Method in class org.apache.spark.SparkStatusTracker
Returns stage information, or None if the stage info could not be found or was garbage collected.
getStages() - Method in class org.apache.spark.ml.Pipeline
 
getStartTime() - Method in class org.apache.spark.streaming.util.RecurringTimer
Get the time when this timer will fire if it is started right now.
getStatsSetupConstRawDataSize() - Static method in class org.apache.spark.sql.hive.HiveShim
 
getStatsSetupConstTotalSize() - Static method in class org.apache.spark.sql.hive.HiveShim
 
getStatus(BlockId) - Method in class org.apache.spark.storage.BlockManager
Get the BlockStatus for the block identified by the given ID, if it exists.
getStatus(BlockId) - Method in class org.apache.spark.storage.BlockManagerInfo
 
getStderr(Process, long) - Static method in class org.apache.spark.util.Utils
Return the stderr of a process after waiting for the process to terminate.
getStorageLevel() - Method in interface org.apache.spark.api.java.JavaRDDLike
Get the RDD's current storage level, or StorageLevel.NONE if none is set.
getStorageLevel() - Method in class org.apache.spark.graphx.impl.EdgeRDDImpl
 
getStorageLevel() - Method in class org.apache.spark.graphx.impl.VertexRDDImpl
 
getStorageLevel() - Method in class org.apache.spark.rdd.RDD
Get the RDD's current storage level, or StorageLevel.NONE if none is set.
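A minimal illustration, assuming an existing SparkContext named sc:

    import org.apache.spark.storage.StorageLevel

    val rdd = sc.parallelize(1 to 10)
    rdd.getStorageLevel                        // StorageLevel.NONE until persisted
    rdd.persist(StorageLevel.MEMORY_AND_DISK)
    rdd.getStorageLevel                        // now MEMORY_AND_DISK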
getStorageStatus() - Method in class org.apache.spark.storage.BlockManagerMaster
 
getStringWritable(Object) - Static method in class org.apache.spark.sql.hive.HiveShim
 
getStringWritableConstantObjectInspector(Object) - Static method in class org.apache.spark.sql.hive.HiveShim
 
getSubsamplingRate() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
 
getSystemProperties() - Static method in class org.apache.spark.util.Utils
Returns the system properties map, which is thread-safe to iterate over.
getTableDesc(Class<? extends Deserializer>, Class<? extends InputFormat<?, ?>>, Class<?>, Properties) - Static method in class org.apache.spark.sql.hive.HiveShim
 
getTables(Option<String>) - Method in class org.apache.spark.sql.hive.HiveMetastoreCatalog
 
getTabs() - Method in class org.apache.spark.ui.WebUI
 
getTaskSideSplits(Configuration, List<Footer>, Long, Long, ReadSupport.ReadContext) - Method in class org.apache.spark.sql.parquet.FilteringParquetRowInputFormat
 
getThreadDump() - Static method in class org.apache.spark.util.Utils
Return a thread dump of all threads' stacktraces.
getThreadLocal() - Static method in class org.apache.spark.SparkEnv
Returns the ThreadLocal SparkEnv.
getThreshold() - Method in interface org.apache.spark.ml.param.HasThreshold
 
getThreshold() - Method in class org.apache.spark.mllib.classification.LogisticRegressionModel
:: Experimental :: Returns the threshold (if any) used for converting raw prediction scores into 0/1 predictions.
getThreshold() - Method in class org.apache.spark.mllib.classification.SVMModel
:: Experimental :: Returns the threshold (if any) used for converting raw prediction scores into 0/1 predictions.
getTimeMillis() - Method in interface org.apache.spark.util.Clock
 
getTimeMillis() - Method in class org.apache.spark.util.ManualClock
 
getTimeMillis() - Method in class org.apache.spark.util.SystemClock
 
getTimeOfDayNanos() - Method in class org.apache.spark.sql.parquet.timestamp.NanoTime
 
getTimestamp(A) - Method in class org.apache.spark.util.TimeStampedHashMap
 
getTimestamp(A) - Method in class org.apache.spark.util.TimeStampedWeakValueHashMap
 
getTimeStampedValue(A) - Method in class org.apache.spark.util.TimeStampedHashMap
 
getTimestampWritable(Object) - Static method in class org.apache.spark.sql.hive.HiveShim
 
getTimestampWritableConstantObjectInspector(Object) - Static method in class org.apache.spark.sql.hive.HiveShim
 
GETTING_RESULT_TIME() - Static method in class org.apache.spark.ui.jobs.TaskDetailsClassNames
 
GETTING_RESULT_TIME() - Static method in class org.apache.spark.ui.ToolTips
 
gettingResult() - Method in class org.apache.spark.scheduler.TaskInfo
 
GettingResultEvent - Class in org.apache.spark.scheduler
 
GettingResultEvent(TaskInfo) - Constructor for class org.apache.spark.scheduler.GettingResultEvent
 
gettingResultTime() - Method in class org.apache.spark.scheduler.TaskInfo
The time when the task started remotely getting the result.
getTopicConcentration() - Method in class org.apache.spark.mllib.clustering.LDA
Concentration parameter (commonly named "beta" or "eta") for the prior placed on topics' distributions over terms.
getTreeStrategy() - Method in class org.apache.spark.mllib.tree.configuration.BoostingStrategy
 
getUIPort(SparkConf) - Static method in class org.apache.spark.ui.SparkUI
 
getUnallocatedBlocks(int) - Method in class org.apache.spark.streaming.scheduler.ReceivedBlockTracker
Get blocks that have been added but not yet allocated to any batch.
getUpperBound(double, long, double) - Static method in class org.apache.spark.util.random.BinomialBounds
Returns a threshold p such that if we conduct n Bernoulli trials with success rate = p, it is very unlikely to have less than fraction * n successes.
getUpperBound(double) - Static method in class org.apache.spark.util.random.PoissonBounds
Returns a lambda such that Pr[X < s] is very small, where X ~ Pois(lambda).
getURLs() - Method in class org.apache.spark.util.MutableURLClassLoader
 
getUsedTimeMs(long) - Static method in class org.apache.spark.util.Utils
Return a string describing how much time has passed, in milliseconds.
getUseNodeIdCache() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
 
getUserCol() - Method in interface org.apache.spark.ml.recommendation.ALSParams
 
getValues(BlockId) - Method in class org.apache.spark.storage.BlockStore
 
getValues(BlockId) - Method in class org.apache.spark.storage.DiskStore
 
getValues(BlockId, Serializer) - Method in class org.apache.spark.storage.DiskStore
A version of getValues that allows a custom serializer.
getValues(BlockId) - Method in class org.apache.spark.storage.MemoryStore
 
getValues(BlockId) - Method in class org.apache.spark.storage.TachyonStore
 
getVectorIterator(RandomRDDPartition<Object>, int) - Static method in class org.apache.spark.mllib.rdd.RandomRDD
 
getVectors() - Method in class org.apache.spark.mllib.feature.Word2VecModel
Returns a map of words to their vector representations.
getViewAcls() - Method in class org.apache.spark.SecurityManager
 
Gini - Class in org.apache.spark.mllib.tree.impurity
:: Experimental :: Class for calculating the Gini impurity during binary classification.
Gini() - Constructor for class org.apache.spark.mllib.tree.impurity.Gini
 
GiniAggregator - Class in org.apache.spark.mllib.tree.impurity
Class for updating views of a vector of sufficient statistics, in order to compute impurity from a sample.
GiniAggregator(int) - Constructor for class org.apache.spark.mllib.tree.impurity.GiniAggregator
 
GiniCalculator - Class in org.apache.spark.mllib.tree.impurity
Stores statistics for one (node, feature, bin) for calculating impurity.
GiniCalculator(double[]) - Constructor for class org.apache.spark.mllib.tree.impurity.GiniCalculator
 
GLMClassificationModel - Class in org.apache.spark.mllib.classification.impl
Helper class for import/export of GLM classification models.
GLMClassificationModel() - Constructor for class org.apache.spark.mllib.classification.impl.GLMClassificationModel
 
GLMClassificationModel.SaveLoadV1_0$ - Class in org.apache.spark.mllib.classification.impl
 
GLMClassificationModel.SaveLoadV1_0$() - Constructor for class org.apache.spark.mllib.classification.impl.GLMClassificationModel.SaveLoadV1_0$
 
GLMClassificationModel.SaveLoadV1_0$.Data - Class in org.apache.spark.mllib.classification.impl
Model data for import/export
GLMClassificationModel.SaveLoadV1_0$.Data(Vector, double, Option<Object>) - Constructor for class org.apache.spark.mllib.classification.impl.GLMClassificationModel.SaveLoadV1_0$.Data
 
GLMRegressionModel - Class in org.apache.spark.mllib.regression.impl
Helper methods for import/export of GLM regression models.
GLMRegressionModel() - Constructor for class org.apache.spark.mllib.regression.impl.GLMRegressionModel
 
GLMRegressionModel.SaveLoadV1_0$ - Class in org.apache.spark.mllib.regression.impl
 
GLMRegressionModel.SaveLoadV1_0$() - Constructor for class org.apache.spark.mllib.regression.impl.GLMRegressionModel.SaveLoadV1_0$
 
GLMRegressionModel.SaveLoadV1_0$.Data - Class in org.apache.spark.mllib.regression.impl
Model data for model import/export
GLMRegressionModel.SaveLoadV1_0$.Data(Vector, double) - Constructor for class org.apache.spark.mllib.regression.impl.GLMRegressionModel.SaveLoadV1_0$.Data
 
globalTopicTotals() - Method in class org.apache.spark.mllib.clustering.LDA.EMOptimizer
Aggregate distributions over topics from all term vertices.
glom() - Method in interface org.apache.spark.api.java.JavaRDDLike
Return an RDD created by coalescing all elements within each partition into an array.
glom() - Method in class org.apache.spark.rdd.RDD
Return an RDD created by coalescing all elements within each partition into an array.
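A small sketch of glom, assuming an existing SparkContext named sc; the resulting arrays mirror the partitioning:

    val rdd = sc.parallelize(1 to 9, numSlices = 3)
    rdd.glom().collect()
    // Array(Array(1, 2, 3), Array(4, 5, 6), Array(7, 8, 9)) - one array per partition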
glom() - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
Return a new DStream in which each RDD is generated by applying glom() to each RDD of this DStream.
glom() - Method in class org.apache.spark.streaming.dstream.DStream
Return a new DStream in which each RDD is generated by applying glom() to each RDD of this DStream.
GlommedDStream<T> - Class in org.apache.spark.streaming.dstream
 
GlommedDStream(DStream<T>, ClassTag<T>) - Constructor for class org.apache.spark.streaming.dstream.GlommedDStream
 
goodnessOfFit() - Method in class org.apache.spark.mllib.stat.test.ChiSqTest.NullHypothesis$
 
grad() - Method in class org.apache.spark.mllib.optimization.NNLS.Workspace
 
Gradient - Class in org.apache.spark.mllib.optimization
:: DeveloperApi :: Class used to compute the gradient for a loss function, given a single data point.
Gradient() - Constructor for class org.apache.spark.mllib.optimization.Gradient
 
gradient(TreeEnsembleModel, LabeledPoint) - Static method in class org.apache.spark.mllib.tree.loss.AbsoluteError
Method to calculate the gradients for the gradient boosting calculation for least absolute error calculation.
gradient(TreeEnsembleModel, LabeledPoint) - Static method in class org.apache.spark.mllib.tree.loss.LogLoss
Method to calculate the loss gradients for the gradient boosting calculation for binary classification. The gradient with respect to F(x) is: -4y / (1 + exp(2yF(x)))
gradient(TreeEnsembleModel, LabeledPoint) - Method in interface org.apache.spark.mllib.tree.loss.Loss
Method to calculate the gradients for the gradient boosting calculation.
gradient(TreeEnsembleModel, LabeledPoint) - Static method in class org.apache.spark.mllib.tree.loss.SquaredError
Method to calculate the gradients for the gradient boosting calculation for least squares error calculation.
GradientBoostedTrees - Class in org.apache.spark.mllib.tree
:: Experimental :: A class that implements Stochastic Gradient Boosting for regression and binary classification.
GradientBoostedTrees(BoostingStrategy) - Constructor for class org.apache.spark.mllib.tree.GradientBoostedTrees
 
GradientBoostedTreesModel - Class in org.apache.spark.mllib.tree.model
:: Experimental :: Represents a gradient boosted trees model.
GradientBoostedTreesModel(Enumeration.Value, DecisionTreeModel[], double[]) - Constructor for class org.apache.spark.mllib.tree.model.GradientBoostedTreesModel
 
GradientDescent - Class in org.apache.spark.mllib.optimization
Class used to solve an optimization problem using Gradient Descent.
GradientDescent(Gradient, Updater) - Constructor for class org.apache.spark.mllib.optimization.GradientDescent
 
Graph<VD,ED> - Class in org.apache.spark.graphx
The Graph abstractly represents a graph with arbitrary objects associated with vertices and edges.
graph() - Method in class org.apache.spark.mllib.clustering.LDA.EMOptimizer
 
graph() - Method in class org.apache.spark.streaming.Checkpoint
 
graph() - Method in class org.apache.spark.streaming.dstream.DStream
 
graph() - Method in class org.apache.spark.streaming.StreamingContext
 
graphCheckpointer() - Method in class org.apache.spark.mllib.clustering.LDA.EMOptimizer
 
GraphGenerators - Class in org.apache.spark.graphx.util
A collection of graph generating functions.
GraphGenerators() - Constructor for class org.apache.spark.graphx.util.GraphGenerators
 
GraphImpl<VD,ED> - Class in org.apache.spark.graphx.impl
An implementation of Graph to support computation on graphs.
graphite() - Method in class org.apache.spark.metrics.sink.GraphiteSink
 
GRAPHITE_DEFAULT_PERIOD() - Method in class org.apache.spark.metrics.sink.GraphiteSink
 
GRAPHITE_DEFAULT_PREFIX() - Method in class org.apache.spark.metrics.sink.GraphiteSink
 
GRAPHITE_DEFAULT_UNIT() - Method in class org.apache.spark.metrics.sink.GraphiteSink
 
GRAPHITE_KEY_HOST() - Method in class org.apache.spark.metrics.sink.GraphiteSink
 
GRAPHITE_KEY_PERIOD() - Method in class org.apache.spark.metrics.sink.GraphiteSink
 
GRAPHITE_KEY_PORT() - Method in class org.apache.spark.metrics.sink.GraphiteSink
 
GRAPHITE_KEY_PREFIX() - Method in class org.apache.spark.metrics.sink.GraphiteSink
 
GRAPHITE_KEY_PROTOCOL() - Method in class org.apache.spark.metrics.sink.GraphiteSink
 
GRAPHITE_KEY_UNIT() - Method in class org.apache.spark.metrics.sink.GraphiteSink
 
GraphiteSink - Class in org.apache.spark.metrics.sink
 
GraphiteSink(Properties, MetricRegistry, SecurityManager) - Constructor for class org.apache.spark.metrics.sink.GraphiteSink
 
GraphKryoRegistrator - Class in org.apache.spark.graphx
Registers GraphX classes with Kryo for improved performance.
GraphKryoRegistrator() - Constructor for class org.apache.spark.graphx.GraphKryoRegistrator
 
GraphLoader - Class in org.apache.spark.graphx
Provides utilities for loading Graphs from files.
GraphLoader() - Constructor for class org.apache.spark.graphx.GraphLoader
 
GraphOps<VD,ED> - Class in org.apache.spark.graphx
Contains additional functionality for Graph.
GraphOps(Graph<VD, ED>, ClassTag<VD>, ClassTag<ED>) - Constructor for class org.apache.spark.graphx.GraphOps
 
graphToGraphOps(Graph<VD, ED>, ClassTag<VD>, ClassTag<ED>) - Static method in class org.apache.spark.graphx.Graph
Implicitly extracts the GraphOps member from a graph.
GraphXUtils - Class in org.apache.spark.graphx
 
GraphXUtils() - Constructor for class org.apache.spark.graphx.GraphXUtils
 
greater(Duration) - Method in class org.apache.spark.streaming.Duration
 
greater(Time) - Method in class org.apache.spark.streaming.Time
 
greaterEq(Duration) - Method in class org.apache.spark.streaming.Duration
 
greaterEq(Time) - Method in class org.apache.spark.streaming.Time
 
GreaterThan - Class in org.apache.spark.sql.sources
 
GreaterThan(String, Object) - Constructor for class org.apache.spark.sql.sources.GreaterThan
 
GreaterThanOrEqual - Class in org.apache.spark.sql.sources
 
GreaterThanOrEqual(String, Object) - Constructor for class org.apache.spark.sql.sources.GreaterThanOrEqual
 
gridGraph(SparkContext, int, int) - Static method in class org.apache.spark.graphx.util.GraphGenerators
Create a rows-by-cols grid graph with each vertex connected to its row+1 and col+1 neighbors.
GridPartitioner - Class in org.apache.spark.mllib.linalg.distributed
A grid partitioner, which uses a regular grid to partition coordinates.
GridPartitioner(int, int, int, int) - Constructor for class org.apache.spark.mllib.linalg.distributed.GridPartitioner
 
groupArr() - Method in class org.apache.spark.rdd.PartitionCoalescer
 
groupBy(Function<T, U>) - Method in interface org.apache.spark.api.java.JavaRDDLike
Return an RDD of grouped elements.
groupBy(Function<T, U>, int) - Method in interface org.apache.spark.api.java.JavaRDDLike
Return an RDD of grouped elements.
groupBy(Function1<T, K>, ClassTag<K>) - Method in class org.apache.spark.rdd.RDD
Return an RDD of grouped items.
groupBy(Function1<T, K>, int, ClassTag<K>) - Method in class org.apache.spark.rdd.RDD
Return an RDD of grouped elements.
groupBy(Function1<T, K>, Partitioner, ClassTag<K>, Ordering<K>) - Method in class org.apache.spark.rdd.RDD
Return an RDD of grouped items.
groupBy(Column...) - Method in class org.apache.spark.sql.DataFrame
Groups the DataFrame using the specified columns, so we can run aggregation on them.
groupBy(String, String...) - Method in class org.apache.spark.sql.DataFrame
Groups the DataFrame using the specified columns, so we can run aggregation on them.
groupBy(Seq<Column>) - Method in class org.apache.spark.sql.DataFrame
Groups the DataFrame using the specified columns, so we can run aggregation on them.
groupBy(String, Seq<String>) - Method in class org.apache.spark.sql.DataFrame
Groups the DataFrame using the specified columns, so we can run aggregation on them.
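A hedged sketch of groupBy followed by aggregation, assuming a DataFrame df with hypothetical columns "department" and "salary":

    import org.apache.spark.sql.functions.{avg, max}

    df.groupBy("department")
      .agg(max(df("salary")), avg(df("salary")))
      .show()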
groupByKey(Partitioner) - Method in class org.apache.spark.api.java.JavaPairRDD
Group the values for each key in the RDD into a single sequence.
groupByKey(int) - Method in class org.apache.spark.api.java.JavaPairRDD
Group the values for each key in the RDD into a single sequence.
groupByKey() - Method in class org.apache.spark.api.java.JavaPairRDD
Group the values for each key in the RDD into a single sequence.
groupByKey(Partitioner) - Method in class org.apache.spark.rdd.PairRDDFunctions
Group the values for each key in the RDD into a single sequence.
groupByKey(int) - Method in class org.apache.spark.rdd.PairRDDFunctions
Group the values for each key in the RDD into a single sequence.
groupByKey() - Method in class org.apache.spark.rdd.PairRDDFunctions
Group the values for each key in the RDD into a single sequence.
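A minimal sketch of grouping values per key on a pair RDD (assumes an existing SparkContext sc; the data is made up):

    val pairs = sc.parallelize(Seq(("a", 1), ("b", 2), ("a", 3)))
    val grouped = pairs.groupByKey()               // RDD[(String, Iterable[Int])]
    grouped.collect().foreach { case (k, vs) => println(s"$k -> ${vs.mkString(",")}") }

If only an aggregate per key is needed, reduceByKey or aggregateByKey is usually cheaper, because it combines values map-side before the shuffle.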
groupByKey() - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
Return a new DStream by applying groupByKey to each RDD.
groupByKey(int) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
Return a new DStream by applying groupByKey to each RDD.
groupByKey(Partitioner) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
Return a new DStream by applying groupByKey on each RDD of this DStream.
groupByKey() - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
Return a new DStream by applying groupByKey to each RDD.
groupByKey(int) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
Return a new DStream by applying groupByKey to each RDD.
groupByKey(Partitioner) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
Return a new DStream by applying groupByKey on each RDD.
groupByKeyAndWindow(Duration) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
Return a new DStream by applying groupByKey over a sliding window.
groupByKeyAndWindow(Duration, Duration) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
Return a new DStream by applying groupByKey over a sliding window.
groupByKeyAndWindow(Duration, Duration, int) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
Return a new DStream by applying groupByKey over a sliding window on this DStream.
groupByKeyAndWindow(Duration, Duration, Partitioner) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
Return a new DStream by applying groupByKey over a sliding window on this DStream.
groupByKeyAndWindow(Duration) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
Return a new DStream by applying groupByKey over a sliding window.
groupByKeyAndWindow(Duration, Duration) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
Return a new DStream by applying groupByKey over a sliding window.
groupByKeyAndWindow(Duration, Duration, int) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
Return a new DStream by applying groupByKey over a sliding window on this DStream.
groupByKeyAndWindow(Duration, Duration, Partitioner) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
Create a new DStream by applying groupByKey over a sliding window on this DStream.
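A sketch of windowed grouping on a pair DStream (the StreamingContext, its batch interval, and the pairs DStream of (word, count) records are assumptions):

    import org.apache.spark.streaming.Seconds

    // group the last 30 seconds of data, recomputed every 10 seconds
    val windowed = pairs.groupByKeyAndWindow(Seconds(30), Seconds(10))
    windowed.print()

Both the window duration and the slide duration must be multiples of the batch interval of the source DStream.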
groupByResultToJava(RDD<Tuple2<K, Iterable<T>>>, ClassTag<K>) - Static method in class org.apache.spark.api.java.JavaPairRDD
 
GroupedCountEvaluator<T> - Class in org.apache.spark.partial
An ApproximateEvaluator for counts by key.
GroupedCountEvaluator(int, double, ClassTag<T>) - Constructor for class org.apache.spark.partial.GroupedCountEvaluator
 
GroupedData - Class in org.apache.spark.sql
:: Experimental :: A set of methods for aggregations on a DataFrame, created by DataFrame.groupBy.
groupEdges(Function2<ED, ED, ED>) - Method in class org.apache.spark.graphx.Graph
Merges multiple edges between two vertices into a single edge.
groupEdges(Function2<ED, ED, ED>) - Method in class org.apache.spark.graphx.impl.EdgePartition
Merge all the edges with the same src and dest id into a single edge using the merge function
groupEdges(Function2<ED, ED, ED>) - Method in class org.apache.spark.graphx.impl.GraphImpl
 
GroupedMeanEvaluator<T> - Class in org.apache.spark.partial
An ApproximateEvaluator for means by key.
GroupedMeanEvaluator(int, double) - Constructor for class org.apache.spark.partial.GroupedMeanEvaluator
 
GroupedSumEvaluator<T> - Class in org.apache.spark.partial
An ApproximateEvaluator for sums by key.
GroupedSumEvaluator(int, double) - Constructor for class org.apache.spark.partial.GroupedSumEvaluator
 
groupHash() - Method in class org.apache.spark.rdd.PartitionCoalescer
 
groupId() - Method in class org.apache.spark.scheduler.JobGroupCancelled
 
groupWith(JavaPairRDD<K, W>) - Method in class org.apache.spark.api.java.JavaPairRDD
Alias for cogroup.
groupWith(JavaPairRDD<K, W1>, JavaPairRDD<K, W2>) - Method in class org.apache.spark.api.java.JavaPairRDD
Alias for cogroup.
groupWith(JavaPairRDD<K, W1>, JavaPairRDD<K, W2>, JavaPairRDD<K, W3>) - Method in class org.apache.spark.api.java.JavaPairRDD
Alias for cogroup.
groupWith(RDD<Tuple2<K, W>>) - Method in class org.apache.spark.rdd.PairRDDFunctions
Alias for cogroup.
groupWith(RDD<Tuple2<K, W1>>, RDD<Tuple2<K, W2>>) - Method in class org.apache.spark.rdd.PairRDDFunctions
Alias for cogroup.
groupWith(RDD<Tuple2<K, W1>>, RDD<Tuple2<K, W2>>, RDD<Tuple2<K, W3>>) - Method in class org.apache.spark.rdd.PairRDDFunctions
Alias for cogroup.
groupWriter() - Method in class org.apache.spark.sql.parquet.TestGroupWriteSupport
 
GrowableAccumulableParam<R,T> - Class in org.apache.spark
 
GrowableAccumulableParam(Function1<R, Growable<T>>, ClassTag<R>) - Constructor for class org.apache.spark.GrowableAccumulableParam
 
gt(Object) - Method in class org.apache.spark.sql.Column
Greater than.
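A small sketch of a greater-than filter on a DataFrame column (the DataFrame df and its "age" column are assumptions):

    val adults = df.filter(df("age").gt(21))
    adults.show()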

H

hadoopConfiguration() - Method in class org.apache.spark.api.java.JavaSparkContext
Returns the Hadoop configuration used for the Hadoop code (e.g.
hadoopConfiguration() - Method in class org.apache.spark.SparkContext
A default Hadoop Configuration for the Hadoop code (e.g.
hadoopFile(String, Class<F>, Class<K>, Class<V>, int) - Method in class org.apache.spark.api.java.JavaSparkContext
Get an RDD for a Hadoop file with an arbitrary InputFormat.
hadoopFile(String, Class<F>, Class<K>, Class<V>) - Method in class org.apache.spark.api.java.JavaSparkContext
Get an RDD for a Hadoop file with an arbitrary InputFormat
hadoopFile(String, Class<? extends InputFormat<K, V>>, Class<K>, Class<V>, int) - Method in class org.apache.spark.SparkContext
Get an RDD for a Hadoop file with an arbitrary InputFormat
hadoopFile(String, int, ClassTag<K>, ClassTag<V>, ClassTag<F>) - Method in class org.apache.spark.SparkContext
Smarter version of hadoopFile() that uses class tags to figure out the classes of keys, values and the InputFormat so that users don't need to pass them directly.
hadoopFile(String, ClassTag<K>, ClassTag<V>, ClassTag<F>) - Method in class org.apache.spark.SparkContext
Smarter version of hadoopFile() that uses class tags to figure out the classes of keys, values and the InputFormat so that users don't need to pass them directly.
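A hedged sketch of the class-tag based hadoopFile variant (assumes an existing SparkContext sc; the path is illustrative only):

    import org.apache.hadoop.io.{LongWritable, Text}
    import org.apache.hadoop.mapred.TextInputFormat

    val records = sc.hadoopFile[LongWritable, Text, TextInputFormat]("hdfs:///data/input.txt")
    val lines = records.map { case (_, text) => text.toString }  // copy out of reused Writables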
hadoopFiles() - Method in class org.apache.spark.streaming.dstream.FileInputDStream.FileInputDStreamCheckpointData
 
hadoopJobMetadata() - Method in class org.apache.spark.SparkEnv
 
HadoopPartition - Class in org.apache.spark.rdd
A Spark split class that wraps around a Hadoop InputSplit.
HadoopPartition(int, int, InputSplit) - Constructor for class org.apache.spark.rdd.HadoopPartition
 
hadoopRDD(JobConf, Class<F>, Class<K>, Class<V>, int) - Method in class org.apache.spark.api.java.JavaSparkContext
Get an RDD for a Hadoop-readable dataset from a Hadoop JobConf giving its InputFormat and any other necessary info (e.g.
hadoopRDD(JobConf, Class<F>, Class<K>, Class<V>) - Method in class org.apache.spark.api.java.JavaSparkContext
Get an RDD for a Hadoop-readable dataset from a Hadoop JobConf giving its InputFormat and any other necessary info (e.g.
HadoopRDD<K,V> - Class in org.apache.spark.rdd
:: DeveloperApi :: An RDD that provides core functionality for reading data stored in Hadoop (e.g., files in HDFS, sources in HBase, or S3), using the older MapReduce API (org.apache.hadoop.mapred).
HadoopRDD(SparkContext, Broadcast<SerializableWritable<Configuration>>, Option<Function1<JobConf, BoxedUnit>>, Class<? extends InputFormat<K, V>>, Class<K>, Class<V>, int) - Constructor for class org.apache.spark.rdd.HadoopRDD
 
HadoopRDD(SparkContext, JobConf, Class<? extends InputFormat<K, V>>, Class<K>, Class<V>, int) - Constructor for class org.apache.spark.rdd.HadoopRDD
 
hadoopRDD(JobConf, Class<? extends InputFormat<K, V>>, Class<K>, Class<V>, int) - Method in class org.apache.spark.SparkContext
Get an RDD for a Hadoop-readable dataset from a Hadoop JobConf given its InputFormat and other necessary info (e.g.
HadoopRDD.HadoopMapPartitionsWithSplitRDD<U,T> - Class in org.apache.spark.rdd
Analogous to MapPartitionsRDD, but passes in an InputSplit to the given function rather than the index of the partition.
HadoopRDD.HadoopMapPartitionsWithSplitRDD(RDD<T>, Function2<InputSplit, Iterator<T>, Iterator<U>>, boolean, ClassTag<U>, ClassTag<T>) - Constructor for class org.apache.spark.rdd.HadoopRDD.HadoopMapPartitionsWithSplitRDD
 
HadoopRDD.HadoopMapPartitionsWithSplitRDD$ - Class in org.apache.spark.rdd
 
HadoopRDD.HadoopMapPartitionsWithSplitRDD$() - Constructor for class org.apache.spark.rdd.HadoopRDD.HadoopMapPartitionsWithSplitRDD$
 
HadoopRDD.SplitInfoReflections - Class in org.apache.spark.rdd
 
HadoopRDD.SplitInfoReflections() - Constructor for class org.apache.spark.rdd.HadoopRDD.SplitInfoReflections
 
HadoopTableReader - Class in org.apache.spark.sql.hive
Helper class for scanning tables stored in Hadoop - e.g., to read Hive tables that reside in the data warehouse directory.
HadoopTableReader(Seq<Attribute>, MetastoreRelation, HiveContext, HiveConf) - Constructor for class org.apache.spark.sql.hive.HadoopTableReader
 
hammingLoss() - Method in class org.apache.spark.mllib.evaluation.MultilabelMetrics
Returns the Hamming loss.
handle() - Method in class org.apache.spark.rdd.ShuffleCoGroupSplitDep
 
handle(Signal) - Method in class org.apache.spark.util.SignalLoggerHandler
 
handleAskPermissionToCommit(int, long, long) - Method in class org.apache.spark.scheduler.OutputCommitCoordinator
 
handleBeginEvent(Task<?>, TaskInfo) - Method in class org.apache.spark.scheduler.DAGScheduler
 
handleExecutorAdded(String, String) - Method in class org.apache.spark.scheduler.DAGScheduler
 
handleExecutorLost(String, boolean, Option<Object>) - Method in class org.apache.spark.scheduler.DAGScheduler
Responds to an executor being lost.
handleFailedTask(TaskSetManager, long, Enumeration.Value, TaskEndReason) - Method in class org.apache.spark.scheduler.TaskSchedulerImpl
 
handleFailedTask(long, Enumeration.Value, TaskEndReason) - Method in class org.apache.spark.scheduler.TaskSetManager
Marks the task as failed, re-adds it to the list of pending tasks, and notifies the DAG Scheduler.
handleGetTaskResult(TaskInfo) - Method in class org.apache.spark.scheduler.DAGScheduler
 
handleJobCancellation(int, String) - Method in class org.apache.spark.scheduler.DAGScheduler
 
handleJobCompletion(Job) - Method in class org.apache.spark.streaming.scheduler.JobSet
 
handleJobGroupCancelled(String) - Method in class org.apache.spark.scheduler.DAGScheduler
 
handleJobStart(Job) - Method in class org.apache.spark.streaming.scheduler.JobSet
 
handleJobSubmitted(int, RDD<?>, Function2<TaskContext, Iterator<Object>, ?>, int[], boolean, CallSite, JobListener, Properties) - Method in class org.apache.spark.scheduler.DAGScheduler
 
handleKillRequest(HttpServletRequest) - Method in class org.apache.spark.ui.jobs.StagesTab
 
handleStageCancellation(int) - Method in class org.apache.spark.scheduler.DAGScheduler
 
handleSuccessfulTask(TaskSetManager, long, DirectTaskResult<?>) - Method in class org.apache.spark.scheduler.TaskSchedulerImpl
 
handleSuccessfulTask(long, DirectTaskResult<?>) - Method in class org.apache.spark.scheduler.TaskSetManager
Marks the task as successful and notifies the DAGScheduler that a task has ended.
handleTaskCompletion(CompletionEvent) - Method in class org.apache.spark.scheduler.DAGScheduler
Responds to a task finishing.
handleTaskGettingResult(TaskSetManager, long) - Method in class org.apache.spark.scheduler.TaskSchedulerImpl
 
handleTaskGettingResult(long) - Method in class org.apache.spark.scheduler.TaskSetManager
Marks the task as getting its result and notifies the DAGScheduler.
handleTaskSetFailed(TaskSet, String) - Method in class org.apache.spark.scheduler.DAGScheduler
 
hasBytesSpilled() - Method in class org.apache.spark.ui.jobs.UIData.StageUIData
 
hasCompleted() - Method in class org.apache.spark.streaming.scheduler.JobSet
 
hasDictionarySupport() - Method in class org.apache.spark.sql.parquet.CatalystPrimitiveStringConverter
 
hasDstId() - Method in class org.apache.spark.graphx.impl.ReplicatedVertexView
 
hasExecutorsAliveOnHost(String) - Method in class org.apache.spark.scheduler.TaskSchedulerImpl
 
HasFeaturesCol - Interface in org.apache.spark.ml.param
 
hashCode() - Method in class org.apache.spark.graphx.EdgeDirection
 
hashCode() - Method in class org.apache.spark.HashPartitioner
 
hashCode() - Method in interface org.apache.spark.mllib.linalg.Vector
 
hashCode() - Method in interface org.apache.spark.Partition
 
hashCode() - Method in class org.apache.spark.RangePartitioner
 
hashCode() - Method in class org.apache.spark.rdd.CoGroupPartition
 
hashCode() - Method in class org.apache.spark.rdd.HadoopPartition
 
hashCode() - Method in class org.apache.spark.rdd.NewHadoopPartition
 
hashCode() - Method in class org.apache.spark.rdd.ParallelCollectionPartition
 
hashCode() - Method in class org.apache.spark.rdd.PartitionerAwareUnionRDDPartition
 
hashCode() - Method in class org.apache.spark.rdd.ShuffledRDDPartition
 
hashCode() - Method in class org.apache.spark.scheduler.cluster.ExecutorInfo
 
hashCode() - Method in class org.apache.spark.scheduler.InputFormatInfo
 
hashCode() - Method in class org.apache.spark.scheduler.SplitInfo
 
hashCode() - Method in class org.apache.spark.scheduler.Stage
 
hashCode() - Method in class org.apache.spark.sql.Column
 
hashCode() - Method in class org.apache.spark.sql.json.JSONRelation
 
hashCode() - Method in class org.apache.spark.storage.BlockId
 
hashCode() - Method in class org.apache.spark.storage.BlockManagerId
 
hashCode() - Method in class org.apache.spark.storage.StorageLevel
 
hashCode() - Method in class org.apache.spark.streaming.kafka.Broker
 
hashCode() - Method in class org.apache.spark.streaming.kafka.OffsetRange
 
HashingTF - Class in org.apache.spark.ml.feature
:: AlphaComponent :: Maps a sequence of terms to their term frequencies using the hashing trick.
HashingTF() - Constructor for class org.apache.spark.ml.feature.HashingTF
 
HashingTF - Class in org.apache.spark.mllib.feature
:: Experimental :: Maps a sequence of terms to their term frequencies using the hashing trick.
HashingTF(int) - Constructor for class org.apache.spark.mllib.feature.HashingTF
 
HashingTF() - Constructor for class org.apache.spark.mllib.feature.HashingTF
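A minimal sketch of the mllib HashingTF (assumes an existing SparkContext sc; the documents are made up):

    import org.apache.spark.mllib.feature.HashingTF

    val docs = sc.parallelize(Seq(Seq("spark", "hashing", "tf"), Seq("spark", "mllib")))
    val tf = new HashingTF(1000)         // 1000-dimensional feature space
    val vectors = tf.transform(docs)     // RDD[Vector] of term-frequency vectors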
 
hasHostAliveOnRack(String) - Method in class org.apache.spark.scheduler.TaskSchedulerImpl
 
HashPartitioner - Class in org.apache.spark
A Partitioner that implements hash-based partitioning using Java's Object.hashCode.
HashPartitioner(int) - Constructor for class org.apache.spark.HashPartitioner
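A sketch of explicitly hash-partitioning a pair RDD (assumes an existing SparkContext sc):

    import org.apache.spark.HashPartitioner

    val pairs = sc.parallelize(1 to 100).map(i => (i % 10, i))
    val partitioned = pairs.partitionBy(new HashPartitioner(8))
    println(partitioned.partitions.length)   // 8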
 
hasInput() - Method in class org.apache.spark.ui.jobs.UIData.StageUIData
 
HasInputCol - Interface in org.apache.spark.ml.param
 
HasLabelCol - Interface in org.apache.spark.ml.param
 
HasMaxIter - Interface in org.apache.spark.ml.param
 
hasNext() - Method in class org.apache.spark.InterruptibleIterator
 
hasNext() - Method in class org.apache.spark.rdd.PartitionCoalescer.LocationIterator
 
hasNext() - Method in class org.apache.spark.sql.columnar.BasicColumnAccessor
 
hasNext() - Method in interface org.apache.spark.sql.columnar.ColumnAccessor
 
hasNext() - Method in class org.apache.spark.sql.columnar.compression.BooleanBitSet.Decoder
 
hasNext() - Method in interface org.apache.spark.sql.columnar.compression.CompressibleColumnAccessor
 
hasNext() - Method in interface org.apache.spark.sql.columnar.compression.Decoder
 
hasNext() - Method in class org.apache.spark.sql.columnar.compression.DictionaryEncoding.Decoder
 
hasNext() - Method in class org.apache.spark.sql.columnar.compression.IntDelta.Decoder
 
hasNext() - Method in class org.apache.spark.sql.columnar.compression.LongDelta.Decoder
 
hasNext() - Method in class org.apache.spark.sql.columnar.compression.PassThrough.Decoder
 
hasNext() - Method in class org.apache.spark.sql.columnar.compression.RunLengthEncoding.Decoder
 
hasNext() - Method in interface org.apache.spark.sql.columnar.NullableColumnAccessor
 
hasNext() - Method in class org.apache.spark.storage.ShuffleBlockFetcherIterator
 
hasNext() - Method in class org.apache.spark.streaming.util.WriteAheadLogReader
 
hasNext() - Method in class org.apache.spark.util.CompletionIterator
 
hasNext() - Method in class org.apache.spark.util.NextIterator
 
hasNext() - Method in class org.apache.spark.util.random.GapSamplingIterator
 
hasNext() - Method in class org.apache.spark.util.random.GapSamplingReplacementIterator
 
HasOffsetRanges - Interface in org.apache.spark.streaming.kafka
:: Experimental :: Represents any object that has a collection of OffsetRanges.
hasOutput() - Method in class org.apache.spark.ui.jobs.UIData.StageUIData
 
HasOutputCol - Interface in org.apache.spark.ml.param
 
HasPredictionCol - Interface in org.apache.spark.ml.param
 
HasProbabilityCol - Interface in org.apache.spark.ml.param
 
HasRawPredictionCol - Interface in org.apache.spark.ml.param
 
HasRegParam - Interface in org.apache.spark.ml.param
 
hasRootAsShutdownDeleteDir(File) - Static method in class org.apache.spark.util.Utils
 
hasRootAsShutdownDeleteDir(TachyonFile) - Static method in class org.apache.spark.util.Utils
 
hasShuffleRead() - Method in class org.apache.spark.ui.jobs.UIData.StageUIData
 
hasShuffleWrite() - Method in class org.apache.spark.ui.jobs.UIData.StageUIData
 
hasShutdownDeleteDir(File) - Static method in class org.apache.spark.util.Utils
 
hasShutdownDeleteTachyonDir(TachyonFile) - Static method in class org.apache.spark.util.Utils
 
hasSrcId() - Method in class org.apache.spark.graphx.impl.ReplicatedVertexView
 
hasStarted() - Method in class org.apache.spark.streaming.scheduler.JobSet
 
HasThreshold - Interface in org.apache.spark.ml.param
 
hasUnallocatedBlocks() - Method in class org.apache.spark.streaming.scheduler.ReceiverTracker
Check if any blocks are left to be processed
hasUnallocatedReceivedBlocks() - Method in class org.apache.spark.streaming.scheduler.ReceivedBlockTracker
Check if any blocks are left to be allocated to batches.
hasWriteObjectMethod() - Method in class org.apache.spark.serializer.SerializationDebugger.ObjectStreamClassMethods
 
hasWriteReplaceMethod() - Method in class org.apache.spark.serializer.SerializationDebugger.ObjectStreamClassMethods
 
HDFSCacheTaskLocation - Class in org.apache.spark.scheduler
A location on a host that is cached by HDFS.
HDFSCacheTaskLocation(String) - Constructor for class org.apache.spark.scheduler.HDFSCacheTaskLocation
 
HdfsUtils - Class in org.apache.spark.streaming.util
 
HdfsUtils() - Constructor for class org.apache.spark.streaming.util.HdfsUtils
 
head(int) - Method in class org.apache.spark.sql.DataFrame
Returns the first n rows.
head() - Method in class org.apache.spark.sql.DataFrame
Returns the first row.
headerSparkPage(String, Function0<Seq<Node>>, SparkUITab, Option<Object>, Option<String>) - Static method in class org.apache.spark.ui.UIUtils
Returns a Spark page with correctly formatted headers.
headerTabs() - Method in class org.apache.spark.ui.WebUITab
Get a list of header tabs from the parent UI.
Heartbeat - Class in org.apache.spark
A heartbeat from executors to the driver.
Heartbeat(String, Tuple2<Object, TaskMetrics>[], BlockManagerId) - Constructor for class org.apache.spark.Heartbeat
 
HeartbeatReceiver - Class in org.apache.spark
Lives in the driver to receive heartbeats from executors.
HeartbeatReceiver(TaskScheduler) - Constructor for class org.apache.spark.HeartbeatReceiver
 
HeartbeatResponse - Class in org.apache.spark
 
HeartbeatResponse(boolean) - Constructor for class org.apache.spark.HeartbeatResponse
 
hiccups() - Method in class org.apache.spark.streaming.receiver.ActorReceiver.Supervisor
 
high() - Method in class org.apache.spark.partial.BoundedDouble
 
HighlyCompressedMapStatus - Class in org.apache.spark.scheduler
A MapStatus implementation that only stores the average size of non-empty blocks, plus a bitmap for tracking which blocks are empty.
highSplit() - Method in class org.apache.spark.mllib.tree.model.Bin
 
HingeGradient - Class in org.apache.spark.mllib.optimization
:: DeveloperApi :: Compute gradient and loss for a Hinge loss function, as used in SVM binary classification.
HingeGradient() - Constructor for class org.apache.spark.mllib.optimization.HingeGradient
 
histogram(int) - Method in class org.apache.spark.api.java.JavaDoubleRDD
Compute a histogram of the data using bucketCount number of buckets evenly spaced between the minimum and maximum of the RDD.
histogram(double[]) - Method in class org.apache.spark.api.java.JavaDoubleRDD
Compute a histogram using the provided buckets.
histogram(Double[], boolean) - Method in class org.apache.spark.api.java.JavaDoubleRDD
 
histogram(int) - Method in class org.apache.spark.rdd.DoubleRDDFunctions
Compute a histogram of the data using bucketCount number of buckets evenly spaced between the minimum and maximum of the RDD.
histogram(double[], boolean) - Method in class org.apache.spark.rdd.DoubleRDDFunctions
Compute a histogram using the provided buckets.
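A sketch showing both histogram variants on an RDD of doubles (assumes an existing SparkContext sc; values are made up):

    val values = sc.parallelize(Seq(1.0, 2.5, 3.0, 7.5, 9.9))
    val (buckets, counts) = values.histogram(4)            // 4 evenly spaced buckets
    val custom = values.histogram(Array(0.0, 5.0, 10.0))   // counts for [0,5) and [5,10]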
HiveContext - Class in org.apache.spark.sql.hive
An instance of the Spark SQL execution engine that integrates with data stored in Hive.
HiveContext(SparkContext) - Constructor for class org.apache.spark.sql.hive.HiveContext
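A hedged sketch of creating a HiveContext and running HiveQL (the SparkContext sc and the table name "src" are assumptions):

    import org.apache.spark.sql.hive.HiveContext

    val hiveContext = new HiveContext(sc)
    val top = hiveContext.sql("SELECT key, value FROM src LIMIT 10")
    top.collect().foreach(println)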
 
hiveContext() - Method in interface org.apache.spark.sql.hive.HiveStrategies
 
HiveDDLStrategy() - Method in interface org.apache.spark.sql.hive.HiveStrategies
 
hiveDefaultTableFilePath(String) - Method in class org.apache.spark.sql.hive.HiveMetastoreCatalog
 
HiveFunctionRegistry - Class in org.apache.spark.sql.hive
 
HiveFunctionRegistry() - Constructor for class org.apache.spark.sql.hive.HiveFunctionRegistry
 
HiveFunctionWrapper - Class in org.apache.spark.sql.hive
This class provides the UDF creation and also the UDF instance serialization and de-serialization cross process boundary.
HiveFunctionWrapper(String) - Constructor for class org.apache.spark.sql.hive.HiveFunctionWrapper
 
HiveFunctionWrapper() - Constructor for class org.apache.spark.sql.hive.HiveFunctionWrapper
 
HiveGenericUdaf - Class in org.apache.spark.sql.hive
 
HiveGenericUdaf(HiveFunctionWrapper, Seq<Expression>) - Constructor for class org.apache.spark.sql.hive.HiveGenericUdaf
 
HiveGenericUdf - Class in org.apache.spark.sql.hive
 
HiveGenericUdf(HiveFunctionWrapper, Seq<Expression>) - Constructor for class org.apache.spark.sql.hive.HiveGenericUdf
 
HiveGenericUdtf - Class in org.apache.spark.sql.hive
Converts a Hive Generic User Defined Table Generating Function (UDTF) to a Generator.
HiveGenericUdtf(HiveFunctionWrapper, Seq<String>, Seq<Expression>) - Constructor for class org.apache.spark.sql.hive.HiveGenericUdtf
 
HiveInspectors - Interface in org.apache.spark.sql.hive
1.
HiveInspectors.typeInfoConversions - Class in org.apache.spark.sql.hive
 
HiveInspectors.typeInfoConversions(DataType) - Constructor for class org.apache.spark.sql.hive.HiveInspectors.typeInfoConversions
 
HiveMetastoreCatalog - Class in org.apache.spark.sql.hive
 
HiveMetastoreCatalog(HiveContext) - Constructor for class org.apache.spark.sql.hive.HiveMetastoreCatalog
 
HiveMetastoreCatalog.CreateTables - Class in org.apache.spark.sql.hive
 
HiveMetastoreCatalog.CreateTables() - Constructor for class org.apache.spark.sql.hive.HiveMetastoreCatalog.CreateTables
Creates any tables required for query execution.
HiveMetastoreCatalog.ParquetConversions - Class in org.apache.spark.sql.hive
 
HiveMetastoreCatalog.ParquetConversions() - Constructor for class org.apache.spark.sql.hive.HiveMetastoreCatalog.ParquetConversions
When scanning or writing to non-partitioned Metastore Parquet tables, convert them to Parquet data source relations for better performance.
HiveMetastoreCatalog.PreInsertionCasts - Class in org.apache.spark.sql.hive
 
HiveMetastoreCatalog.PreInsertionCasts() - Constructor for class org.apache.spark.sql.hive.HiveMetastoreCatalog.PreInsertionCasts
Casts input data to correct data types according to table definition before inserting into that table.
HiveMetastoreCatalog.QualifiedTableName - Class in org.apache.spark.sql.hive
A fully qualified identifier for a table (i.e., database.tableName)
HiveMetastoreCatalog.QualifiedTableName(String, String) - Constructor for class org.apache.spark.sql.hive.HiveMetastoreCatalog.QualifiedTableName
 
HiveMetastoreTypes - Class in org.apache.spark.sql.hive
An attribute map for determining the ordinal for non-partition columns.
HiveMetastoreTypes() - Constructor for class org.apache.spark.sql.hive.HiveMetastoreTypes
 
HiveNativeCommand - Class in org.apache.spark.sql.hive.execution
 
HiveNativeCommand(String) - Constructor for class org.apache.spark.sql.hive.execution.HiveNativeCommand
 
HiveQl - Class in org.apache.spark.sql.hive
Provides a mapping from HiveQL statements to catalyst logical plans and expression trees.
HiveQl() - Constructor for class org.apache.spark.sql.hive.HiveQl
 
HiveQl.Token$ - Class in org.apache.spark.sql.hive
Extractor for matching Hive's AST Tokens.
HiveQl.Token$() - Constructor for class org.apache.spark.sql.hive.HiveQl.Token$
 
HiveQl.TransformableNode - Class in org.apache.spark.sql.hive
A set of implicit transformations that allow Hive ASTNodes to be rewritten by transformations similar to catalyst.trees.TreeNode.
HiveQl.TransformableNode(ASTNode) - Constructor for class org.apache.spark.sql.hive.HiveQl.TransformableNode
 
hiveQlPartitions() - Method in class org.apache.spark.sql.hive.MetastoreRelation
 
hiveQlTable() - Method in class org.apache.spark.sql.hive.MetastoreRelation
 
HiveScriptIOSchema - Class in org.apache.spark.sql.hive.execution
A wrapper class for the Hive input and output schema properties.
HiveScriptIOSchema(Seq<Tuple2<String, String>>, Seq<Tuple2<String, String>>, String, String, Seq<Tuple2<String, String>>, Seq<Tuple2<String, String>>, boolean) - Constructor for class org.apache.spark.sql.hive.execution.HiveScriptIOSchema
 
HiveShim - Class in org.apache.spark.sql.hive
A compatibility layer for interacting with Hive version 0.13.1.
HiveShim() - Constructor for class org.apache.spark.sql.hive.HiveShim
 
HiveSimpleUdf - Class in org.apache.spark.sql.hive
 
HiveSimpleUdf(HiveFunctionWrapper, Seq<Expression>) - Constructor for class org.apache.spark.sql.hive.HiveSimpleUdf
 
HiveStrategies - Interface in org.apache.spark.sql.hive
 
HiveStrategies.DataSinks - Class in org.apache.spark.sql.hive
 
HiveStrategies.DataSinks() - Constructor for class org.apache.spark.sql.hive.HiveStrategies.DataSinks
 
HiveStrategies.HiveCommandStrategy - Class in org.apache.spark.sql.hive
 
HiveStrategies.HiveCommandStrategy(HiveContext) - Constructor for class org.apache.spark.sql.hive.HiveStrategies.HiveCommandStrategy
 
HiveStrategies.HiveDDLStrategy - Class in org.apache.spark.sql.hive
 
HiveStrategies.HiveDDLStrategy() - Constructor for class org.apache.spark.sql.hive.HiveStrategies.HiveDDLStrategy
 
HiveStrategies.HiveTableScans - Class in org.apache.spark.sql.hive
 
HiveStrategies.HiveTableScans() - Constructor for class org.apache.spark.sql.hive.HiveStrategies.HiveTableScans
Retrieves data using a HiveTableScan.
HiveStrategies.ParquetConversion - Class in org.apache.spark.sql.hive
 
HiveStrategies.ParquetConversion() - Constructor for class org.apache.spark.sql.hive.HiveStrategies.ParquetConversion
:: Experimental :: Finds table scans that would use the Hive SerDe and replaces them with our own native parquet table scan operator.
HiveStrategies.ParquetConversion.LogicalPlanHacks - Class in org.apache.spark.sql.hive
 
HiveStrategies.ParquetConversion.LogicalPlanHacks(DataFrame) - Constructor for class org.apache.spark.sql.hive.HiveStrategies.ParquetConversion.LogicalPlanHacks
 
HiveStrategies.ParquetConversion.PhysicalPlanHacks - Class in org.apache.spark.sql.hive
 
HiveStrategies.ParquetConversion.PhysicalPlanHacks(SparkPlan) - Constructor for class org.apache.spark.sql.hive.HiveStrategies.ParquetConversion.PhysicalPlanHacks
 
HiveStrategies.Scripts - Class in org.apache.spark.sql.hive
 
HiveStrategies.Scripts() - Constructor for class org.apache.spark.sql.hive.HiveStrategies.Scripts
 
HiveTableScan - Class in org.apache.spark.sql.hive.execution
The Hive table scan operator.
HiveTableScan(Seq<Attribute>, MetastoreRelation, Option<Expression>, HiveContext) - Constructor for class org.apache.spark.sql.hive.execution.HiveTableScan
 
HiveTableScans() - Method in interface org.apache.spark.sql.hive.HiveStrategies
 
HiveUdaf - Class in org.apache.spark.sql.hive
A wrapper for Hive functions that use the UDAF interface.
HiveUdaf(HiveFunctionWrapper, Seq<Expression>) - Constructor for class org.apache.spark.sql.hive.HiveUdaf
 
HiveUdafFunction - Class in org.apache.spark.sql.hive
 
HiveUdafFunction(HiveFunctionWrapper, Seq<Expression>, AggregateExpression, boolean) - Constructor for class org.apache.spark.sql.hive.HiveUdafFunction
 
HiveUdafFunction() - Constructor for class org.apache.spark.sql.hive.HiveUdafFunction
 
horzcat(Matrix[]) - Static method in class org.apache.spark.mllib.linalg.Matrices
Horizontally concatenate a sequence of matrices.
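A small sketch of Matrices.horzcat on local matrices (the values are column-major and made up):

    import org.apache.spark.mllib.linalg.Matrices

    val a = Matrices.dense(2, 2, Array(1.0, 2.0, 3.0, 4.0))
    val b = Matrices.dense(2, 1, Array(5.0, 6.0))
    val ab = Matrices.horzcat(Array(a, b))   // 2 x 3 matrix

All input matrices must have the same number of rows.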
host() - Method in class org.apache.spark.metrics.sink.GraphiteSink
 
host() - Method in class org.apache.spark.scheduler.ExecutorAdded
 
host() - Method in class org.apache.spark.scheduler.ExecutorCacheTaskLocation
 
host() - Method in class org.apache.spark.scheduler.HDFSCacheTaskLocation
 
host() - Method in class org.apache.spark.scheduler.HostTaskLocation
 
host() - Method in class org.apache.spark.scheduler.TaskInfo
 
host() - Method in interface org.apache.spark.scheduler.TaskLocation
 
host() - Method in class org.apache.spark.scheduler.WorkerOffer
 
host() - Method in class org.apache.spark.storage.BlockManagerId
 
host() - Method in class org.apache.spark.streaming.kafka.Broker
Broker's hostname
host() - Method in class org.apache.spark.streaming.kafka.KafkaCluster.LeaderOffset
 
host() - Method in class org.apache.spark.streaming.kafka.KafkaRDDPartition
 
host() - Method in class org.apache.spark.streaming.scheduler.RegisterReceiver
 
hostLocation() - Method in class org.apache.spark.scheduler.SplitInfo
 
hostnameVerifier() - Method in class org.apache.spark.SecurityManager
 
hostPort() - Method in class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.RegisterExecutor
 
hostPort() - Method in class org.apache.spark.storage.BlockManagerId
 
hostPort() - Method in class org.apache.spark.ui.exec.ExecutorSummaryInfo
 
HostTaskLocation - Class in org.apache.spark.scheduler
A location on a host.
HostTaskLocation(String) - Constructor for class org.apache.spark.scheduler.HostTaskLocation
 
hours() - Static method in class org.apache.spark.scheduler.StatsReportListener
 
htmlResponderToServlet(Function1<HttpServletRequest, Seq<Node>>) - Static method in class org.apache.spark.ui.JettyUtils
 
HTTP_BROADCAST() - Static method in class org.apache.spark.util.MetadataCleanerType
 
HttpBroadcast<T> - Class in org.apache.spark.broadcast
A Broadcast implementation that uses an HTTP server as a broadcast mechanism.
HttpBroadcast(T, boolean, long, ClassTag<T>) - Constructor for class org.apache.spark.broadcast.HttpBroadcast
 
HttpBroadcastFactory - Class in org.apache.spark.broadcast
A BroadcastFactory implementation that uses a HTTP server as the broadcast mechanism.
HttpBroadcastFactory() - Constructor for class org.apache.spark.broadcast.HttpBroadcastFactory
 
HttpFileServer - Class in org.apache.spark
 
HttpFileServer(SparkConf, SecurityManager, int) - Constructor for class org.apache.spark.HttpFileServer
 
httpFileServer() - Method in class org.apache.spark.SparkEnv
 
httpServer() - Method in class org.apache.spark.HttpFileServer
 
HttpServer - Class in org.apache.spark
An HTTP server for static content used to allow worker nodes to access JARs added to SparkContext as well as classes created by the interpreter when the user types in code.
HttpServer(SparkConf, File, SecurityManager, int, String) - Constructor for class org.apache.spark.HttpServer
 

I

i() - Method in class org.apache.spark.mllib.linalg.distributed.MatrixEntry
 
id() - Method in class org.apache.spark.Accumulable
 
id() - Method in interface org.apache.spark.api.java.JavaRDDLike
A unique ID for this RDD (within its SparkContext).
id() - Method in class org.apache.spark.broadcast.Broadcast
 
id() - Method in class org.apache.spark.mllib.clustering.PowerIterationClustering.Assignment
 
id() - Method in class org.apache.spark.mllib.tree.model.Node
 
id() - Method in class org.apache.spark.rdd.RDD
A unique ID for this RDD (within its SparkContext).
id() - Method in class org.apache.spark.scheduler.AccumulableInfo
 
id() - Method in class org.apache.spark.scheduler.Stage
 
id() - Method in class org.apache.spark.scheduler.TaskInfo
 
id() - Method in class org.apache.spark.scheduler.TaskSet
 
id() - Method in class org.apache.spark.storage.RDDInfo
 
id() - Method in class org.apache.spark.storage.TempLocalBlockId
 
id() - Method in class org.apache.spark.storage.TempShuffleBlockId
 
id() - Method in class org.apache.spark.storage.TestBlockId
 
id() - Method in class org.apache.spark.streaming.dstream.ReceiverInputDStream
This is a unique identifier for the receiver input stream.
id() - Method in class org.apache.spark.streaming.scheduler.Job
 
id() - Method in class org.apache.spark.ui.exec.ExecutorSummaryInfo
 
Identifiable - Interface in org.apache.spark.ml
Object with a unique id.
IDF - Class in org.apache.spark.mllib.feature
:: Experimental :: Inverse document frequency (IDF).
IDF(int) - Constructor for class org.apache.spark.mllib.feature.IDF
 
IDF() - Constructor for class org.apache.spark.mllib.feature.IDF
 
idf() - Method in class org.apache.spark.mllib.feature.IDF.DocumentFrequencyAggregator
Returns the current IDF vector.
idf() - Method in class org.apache.spark.mllib.feature.IDFModel
 
IDF.DocumentFrequencyAggregator - Class in org.apache.spark.mllib.feature
Document frequency aggregator.
IDF.DocumentFrequencyAggregator(int) - Constructor for class org.apache.spark.mllib.feature.IDF.DocumentFrequencyAggregator
 
IDF.DocumentFrequencyAggregator() - Constructor for class org.apache.spark.mllib.feature.IDF.DocumentFrequencyAggregator
 
IDFModel - Class in org.apache.spark.mllib.feature
:: Experimental :: Represents an IDF model that can transform term frequency vectors.
IDFModel(Vector) - Constructor for class org.apache.spark.mllib.feature.IDFModel
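A sketch chaining HashingTF and IDF from mllib.feature (assumes an existing SparkContext sc; the documents are made up):

    import org.apache.spark.mllib.feature.{HashingTF, IDF}

    val docs = sc.parallelize(Seq(Seq("a", "b", "a"), Seq("b", "c")))
    val tf = new HashingTF().transform(docs)
    tf.cache()                           // tf is reused by fit() and transform()
    val idfModel = new IDF().fit(tf)     // learns inverse document frequencies
    val tfidf = idfModel.transform(tf)   // RDD[Vector] of TF-IDF weights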
 
IdGenerator - Class in org.apache.spark.util
A util used to get a unique generation ID.
IdGenerator() - Constructor for class org.apache.spark.util.IdGenerator
 
idx() - Method in class org.apache.spark.mllib.rdd.SlidingRDDPartition
 
idx() - Method in class org.apache.spark.rdd.PartitionerAwareUnionRDDPartition
 
idx() - Method in class org.apache.spark.rdd.ShuffledRDDPartition
 
idx() - Method in class org.apache.spark.sql.jdbc.JDBCPartition
 
ifExists() - Method in class org.apache.spark.sql.hive.execution.DropTable
 
implicitPrefs() - Method in interface org.apache.spark.ml.recommendation.ALSParams
Param to decide whether to use implicit preference.
implicits() - Method in class org.apache.spark.sql.SQLContext
 
improveException(Object, NotSerializableException) - Static method in class org.apache.spark.serializer.SerializationDebugger
Improve the given NotSerializableException with the serialization path leading from the given object to the problematic object.
Impurities - Class in org.apache.spark.mllib.tree.impurity
Factory for Impurity instances.
Impurities() - Constructor for class org.apache.spark.mllib.tree.impurity.Impurities
 
impurity() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
 
impurity() - Method in class org.apache.spark.mllib.tree.impl.DecisionTreeMetadata
 
Impurity - Interface in org.apache.spark.mllib.tree.impurity
:: Experimental :: Trait for calculating information gain.
impurity() - Method in class org.apache.spark.mllib.tree.model.DecisionTreeModel.SaveLoadV1_0$.NodeData
 
impurity() - Method in class org.apache.spark.mllib.tree.model.InformationGainStats
 
impurity() - Method in class org.apache.spark.mllib.tree.model.Node
 
impurityAggregator() - Method in class org.apache.spark.mllib.tree.impl.DTStatsAggregator
ImpurityAggregator instance specifying the impurity type.
ImpurityAggregator - Class in org.apache.spark.mllib.tree.impurity
Interface for updating views of a vector of sufficient statistics, in order to compute impurity from a sample.
ImpurityAggregator(int) - Constructor for class org.apache.spark.mllib.tree.impurity.ImpurityAggregator
 
ImpurityCalculator - Class in org.apache.spark.mllib.tree.impurity
Stores statistics for one (node, feature, bin) for calculating impurity.
ImpurityCalculator(double[]) - Constructor for class org.apache.spark.mllib.tree.impurity.ImpurityCalculator
 
In() - Static method in class org.apache.spark.graphx.EdgeDirection
Edges arriving at a vertex.
in(Column...) - Method in class org.apache.spark.sql.Column
A boolean expression that is evaluated to true if the value of this expression is contained by the evaluated values of the arguments.
in(Seq<Column>) - Method in class org.apache.spark.sql.Column
A boolean expression that is evaluated to true if the value of this expression is contained by the evaluated values of the arguments.
IN() - Static method in class org.apache.spark.sql.hive.HiveQl
 
In - Class in org.apache.spark.sql.sources
 
In(String, Object[]) - Constructor for class org.apache.spark.sql.sources.In
 
IN_MEMORY_PARTITION_PRUNING() - Static method in class org.apache.spark.sql.SQLConf
 
IN_PROGRESS() - Static method in class org.apache.spark.scheduler.EventLoggingListener
 
increaseRunningTasks(int) - Method in class org.apache.spark.scheduler.Pool
 
incrementEpoch() - Method in class org.apache.spark.MapOutputTrackerMaster
 
inDegrees() - Method in class org.apache.spark.graphx.GraphOps
The in-degree of each vertex in the graph.
independence() - Method in class org.apache.spark.mllib.stat.test.ChiSqTest.NullHypothesis$
 
index() - Method in class org.apache.spark.graphx.impl.ShippableVertexPartition
 
index() - Method in class org.apache.spark.graphx.impl.VertexPartition
 
index() - Method in class org.apache.spark.graphx.impl.VertexPartitionBase
 
index(int, int) - Method in class org.apache.spark.mllib.linalg.DenseMatrix
 
index() - Method in class org.apache.spark.mllib.linalg.distributed.IndexedRow
 
index(int, int) - Method in interface org.apache.spark.mllib.linalg.Matrix
Return the index for the (i, j)-th element in the backing array.
index(int, int) - Method in class org.apache.spark.mllib.linalg.SparseMatrix
 
index() - Method in class org.apache.spark.mllib.rdd.RandomRDDPartition
 
index() - Method in class org.apache.spark.mllib.rdd.SlidingRDDPartition
 
index() - Method in interface org.apache.spark.Partition
Get the partition's index within its parent RDD
index() - Method in class org.apache.spark.rdd.BlockRDDPartition
 
index() - Method in class org.apache.spark.rdd.CartesianPartition
 
index() - Method in class org.apache.spark.rdd.CheckpointRDDPartition
 
index() - Method in class org.apache.spark.rdd.CoalescedRDDPartition
 
index() - Method in class org.apache.spark.rdd.CoGroupPartition
 
index() - Method in class org.apache.spark.rdd.HadoopPartition
 
index() - Method in class org.apache.spark.rdd.JdbcPartition
 
index() - Method in class org.apache.spark.rdd.NewHadoopPartition
 
index() - Method in class org.apache.spark.rdd.ParallelCollectionPartition
 
index() - Method in class org.apache.spark.rdd.PartitionerAwareUnionRDDPartition
 
index() - Method in class org.apache.spark.rdd.PartitionPruningRDDPartition
 
index() - Method in class org.apache.spark.rdd.PartitionwiseSampledRDDPartition
 
index() - Method in class org.apache.spark.rdd.SampledRDDPartition
 
index() - Method in class org.apache.spark.rdd.ShuffledRDDPartition
 
index() - Method in class org.apache.spark.rdd.UnionPartition
 
index() - Method in class org.apache.spark.rdd.ZippedPartitionsPartition
 
index() - Method in class org.apache.spark.rdd.ZippedWithIndexRDDPartition
 
index() - Method in class org.apache.spark.scheduler.TaskDescription
 
index() - Method in class org.apache.spark.scheduler.TaskInfo
 
index() - Method in class org.apache.spark.sql.jdbc.JDBCPartition
 
index() - Method in class org.apache.spark.sql.parquet.CatalystArrayContainsNullConverter
 
index() - Method in class org.apache.spark.sql.parquet.CatalystArrayConverter
 
index() - Method in class org.apache.spark.sql.parquet.CatalystNativeArrayConverter
 
index() - Method in class org.apache.spark.sql.parquet.CatalystPrimitiveRowConverter
 
index() - Method in class org.apache.spark.streaming.kafka.KafkaRDDPartition
 
index() - Method in class org.apache.spark.streaming.rdd.WriteAheadLogBackedBlockRDDPartition
 
index2term(long) - Static method in class org.apache.spark.mllib.clustering.LDA
 
IndexedRow - Class in org.apache.spark.mllib.linalg.distributed
:: Experimental :: Represents a row of IndexedRowMatrix.
IndexedRow(long, Vector) - Constructor for class org.apache.spark.mllib.linalg.distributed.IndexedRow
 
IndexedRowMatrix - Class in org.apache.spark.mllib.linalg.distributed
:: Experimental :: Represents a row-oriented DistributedMatrix with indexed rows.
IndexedRowMatrix(RDD<IndexedRow>, long, int) - Constructor for class org.apache.spark.mllib.linalg.distributed.IndexedRowMatrix
 
IndexedRowMatrix(RDD<IndexedRow>) - Constructor for class org.apache.spark.mllib.linalg.distributed.IndexedRowMatrix
Alternative constructor leaving matrix dimensions to be determined automatically.
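A sketch constructing an IndexedRowMatrix from indexed rows (assumes an existing SparkContext sc; the data is made up):

    import org.apache.spark.mllib.linalg.Vectors
    import org.apache.spark.mllib.linalg.distributed.{IndexedRow, IndexedRowMatrix}

    val rows = sc.parallelize(Seq(
      IndexedRow(0L, Vectors.dense(1.0, 2.0)),
      IndexedRow(2L, Vectors.dense(3.0, 4.0))))
    val mat = new IndexedRowMatrix(rows)       // dimensions determined from the data
    println((mat.numRows(), mat.numCols()))    // (3, 2): row indices run up to 2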
indexOf(Object) - Method in class org.apache.spark.mllib.feature.HashingTF
Returns the index of the input term.
indexSize() - Method in class org.apache.spark.graphx.impl.EdgePartition
The number of unique source vertices in the partition.
indexToLevel(int) - Static method in class org.apache.spark.mllib.tree.model.Node
Return the level of a tree which the given node is in.
indices() - Method in class org.apache.spark.mllib.linalg.SparseVector
 
IndirectTaskResult<T> - Class in org.apache.spark.scheduler
A reference to a DirectTaskResult that has been stored in the worker's BlockManager.
IndirectTaskResult(BlockId, int) - Constructor for class org.apache.spark.scheduler.IndirectTaskResult
 
inferPartitionColumnValue(String, String) - Static method in class org.apache.spark.sql.parquet.ParquetRelation2
Converts a string to a Literal with automatic type inference.
inferSchema(RDD<String>, double, String) - Static method in class org.apache.spark.sql.json.JsonRDD
 
infoGain() - Method in class org.apache.spark.mllib.tree.model.DecisionTreeModel.SaveLoadV1_0$.NodeData
 
InformationGainStats - Class in org.apache.spark.mllib.tree.model
:: DeveloperApi :: Information gain statistics for each split
InformationGainStats(double, double, double, double, Predict, Predict) - Constructor for class org.apache.spark.mllib.tree.model.InformationGainStats
 
init(RDD<BaggedPoint<TreePoint>>, int, int, int) - Static method in class org.apache.spark.mllib.tree.impl.NodeIdCache
Initialize the node Id cache with initial node Id values.
init(Configuration, Map<String, String>, MessageType) - Method in class org.apache.spark.sql.parquet.RowReadSupport
 
init(Configuration) - Method in class org.apache.spark.sql.parquet.RowWriteSupport
 
init(Configuration) - Method in class org.apache.spark.sql.parquet.TestGroupWriteSupport
 
initDegreeVector(Graph<Object, Object>) - Static method in class org.apache.spark.mllib.clustering.PowerIterationClustering
Generates the degree vector as the vertex properties (v0) to start power iteration.
initEventLog(OutputStream) - Static method in class org.apache.spark.scheduler.EventLoggingListener
Write metadata about an event log to the given stream.
initFrom(Iterator<Tuple2<Object, VD>>, ClassTag<VD>) - Static method in class org.apache.spark.graphx.impl.VertexPartitionBase
Construct the constituents of a VertexPartitionBase from the given vertices, merging duplicate entries arbitrarily.
initFrom(Iterator<Tuple2<Object, VD>>, Function2<VD, VD, VD>, ClassTag<VD>) - Static method in class org.apache.spark.graphx.impl.VertexPartitionBase
Construct the constituents of a VertexPartitionBase from the given vertices, merging duplicate entries using mergeFunc.
INITIAL_ARRAY_SIZE() - Static method in class org.apache.spark.sql.parquet.CatalystArrayConverter
 
initialCheckpoint() - Method in class org.apache.spark.streaming.StreamingContext
 
initialHash() - Method in class org.apache.spark.rdd.PartitionCoalescer
 
initialize(boolean, SparkConf, SecurityManager) - Method in interface org.apache.spark.broadcast.BroadcastFactory
 
initialize(boolean, SparkConf, SecurityManager) - Static method in class org.apache.spark.broadcast.HttpBroadcast
 
initialize(boolean, SparkConf, SecurityManager) - Method in class org.apache.spark.broadcast.HttpBroadcastFactory
 
initialize(boolean, SparkConf, SecurityManager) - Method in class org.apache.spark.broadcast.TorrentBroadcastFactory
 
initialize() - Method in class org.apache.spark.HttpFileServer
 
initialize(InputSplit, TaskAttemptContext) - Method in class org.apache.spark.input.FixedLengthBinaryRecordReader
 
initialize(InputSplit, TaskAttemptContext) - Method in class org.apache.spark.input.StreamBasedRecordReader
 
initialize(InputSplit, TaskAttemptContext) - Method in class org.apache.spark.input.WholeTextFileRecordReader
 
initialize() - Method in class org.apache.spark.metrics.MetricsConfig
 
initialize(SchedulerBackend) - Method in class org.apache.spark.scheduler.TaskSchedulerImpl
 
initialize(int, String, boolean) - Method in class org.apache.spark.sql.columnar.BasicColumnBuilder
 
initialize() - Method in interface org.apache.spark.sql.columnar.ColumnAccessor
 
initialize(int, String, boolean) - Method in interface org.apache.spark.sql.columnar.ColumnBuilder
Initializes with an approximate lower bound on the expected number of elements in this column.
initialize() - Method in interface org.apache.spark.sql.columnar.compression.CompressibleColumnAccessor
 
initialize(int, String, boolean) - Method in interface org.apache.spark.sql.columnar.compression.CompressibleColumnBuilder
 
initialize() - Method in interface org.apache.spark.sql.columnar.NullableColumnAccessor
 
initialize(int, String, boolean) - Method in interface org.apache.spark.sql.columnar.NullableColumnBuilder
 
initialize(String) - Method in class org.apache.spark.storage.BlockManager
Initializes the BlockManager with the given appId.
initialize(Time) - Method in class org.apache.spark.streaming.dstream.DStream
Initialize the DStream by setting the "zero" time, based on which the validity of future times is calculated.
initialize(String) - Method in class org.apache.spark.streaming.kinesis.KinesisRecordProcessor
The Kinesis Client Library calls this method during IRecordProcessor initialization.
initialize() - Method in class org.apache.spark.ui.SparkUI
Initialize all components of the server.
initialize() - Method in class org.apache.spark.ui.WebUI
Initialize all components of the server.
Initialized() - Static method in class org.apache.spark.rdd.CheckpointState
 
Initialized() - Method in class org.apache.spark.streaming.receiver.ReceiverSupervisor.ReceiverState
 
Initialized() - Method in class org.apache.spark.streaming.StreamingContext.StreamingContextState$
 
initializeIfNecessary() - Method in interface org.apache.spark.Logging
 
initializeLocalJobConfFunc(String, TableDesc, JobConf) - Static method in class org.apache.spark.sql.hive.HadoopTableReader
Curried.
initializeLogging() - Method in interface org.apache.spark.Logging
 
initialValue() - Method in class org.apache.spark.partial.PartialResult
 
initInputSerDe(Seq<Expression>) - Method in class org.apache.spark.sql.hive.execution.HiveScriptIOSchema
 
initInputSoi(AbstractSerDe, Seq<String>, Seq<DataType>) - Method in class org.apache.spark.sql.hive.execution.HiveScriptIOSchema
 
initLocalProperties() - Method in class org.apache.spark.SparkContext
 
initNextRecordReader() - Method in class org.apache.spark.input.ConfigurableCombineFileRecordReader
 
initOutputputSoi(AbstractSerDe) - Method in class org.apache.spark.sql.hive.execution.HiveScriptIOSchema
 
initOutputSerDe(Seq<Attribute>) - Method in class org.apache.spark.sql.hive.execution.HiveScriptIOSchema
 
initSerDe(String, Seq<String>, Seq<DataType>, Seq<Tuple2<String, String>>) - Method in class org.apache.spark.sql.hive.execution.HiveScriptIOSchema
 
InMemoryColumnarTableScan - Class in org.apache.spark.sql.columnar
 
InMemoryColumnarTableScan(Seq<Attribute>, Seq<Expression>, InMemoryRelation) - Constructor for class org.apache.spark.sql.columnar.InMemoryColumnarTableScan
 
inMemoryPartitionPruning() - Method in class org.apache.spark.sql.SQLConf
When set to true, partition pruning for in-memory columnar tables is enabled.
InMemoryRelation - Class in org.apache.spark.sql.columnar
 
InMemoryRelation(Seq<Attribute>, boolean, int, StorageLevel, SparkPlan, Option<String>, RDD<CachedBatch>, Statistics) - Constructor for class org.apache.spark.sql.columnar.InMemoryRelation
 
InnerClosureFinder - Class in org.apache.spark.util
 
InnerClosureFinder(Set<Class<?>>) - Constructor for class org.apache.spark.util.InnerClosureFinder
 
innerJoin(EdgeRDD<ED2>, Function4<Object, Object, ED, ED2, ED3>, ClassTag<ED2>, ClassTag<ED3>) - Method in class org.apache.spark.graphx.EdgeRDD
Inner joins this EdgeRDD with another EdgeRDD, assuming both are partitioned using the same PartitionStrategy.
innerJoin(EdgePartition<ED2, ?>, Function4<Object, Object, ED, ED2, ED3>, ClassTag<ED2>, ClassTag<ED3>) - Method in class org.apache.spark.graphx.impl.EdgePartition
Apply f to all edges present in both this and other and return a new EdgePartition containing the resulting edges.
innerJoin(EdgeRDD<ED2>, Function4<Object, Object, ED, ED2, ED3>, ClassTag<ED2>, ClassTag<ED3>) - Method in class org.apache.spark.graphx.impl.EdgeRDDImpl
 
innerJoin(Self, Function3<Object, VD, U, VD2>, ClassTag<U>, ClassTag<VD2>) - Method in class org.apache.spark.graphx.impl.VertexPartitionBaseOps
Inner join another VertexPartition.
innerJoin(Iterator<Product2<Object, U>>, Function3<Object, VD, U, VD2>, ClassTag<U>, ClassTag<VD2>) - Method in class org.apache.spark.graphx.impl.VertexPartitionBaseOps
Inner join an iterator of messages.
innerJoin(RDD<Tuple2<Object, U>>, Function3<Object, VD, U, VD2>, ClassTag<U>, ClassTag<VD2>) - Method in class org.apache.spark.graphx.impl.VertexRDDImpl
 
innerJoin(RDD<Tuple2<Object, U>>, Function3<Object, VD, U, VD2>, ClassTag<U>, ClassTag<VD2>) - Method in class org.apache.spark.graphx.VertexRDD
Inner joins this VertexRDD with an RDD containing vertex attribute pairs.
innerJoinKeepLeft(Iterator<Product2<Object, VD>>) - Method in class org.apache.spark.graphx.impl.VertexPartitionBaseOps
Similar to innerJoin, but vertices from the left side that don't appear in iter will remain in the partition, hidden by the bitmask.
innerZipJoin(VertexRDD<U>, Function3<Object, VD, U, VD2>, ClassTag<U>, ClassTag<VD2>) - Method in class org.apache.spark.graphx.impl.VertexRDDImpl
 
innerZipJoin(VertexRDD<U>, Function3<Object, VD, U, VD2>, ClassTag<U>, ClassTag<VD2>) - Method in class org.apache.spark.graphx.VertexRDD
Efficiently inner joins this VertexRDD with another VertexRDD sharing the same index.
input() - Method in class org.apache.spark.sql.hive.execution.ScriptTransformation
 
INPUT() - Static method in class org.apache.spark.ui.ToolTips
 
inputBytes() - Method in class org.apache.spark.ui.jobs.UIData.ExecutorSummary
 
inputBytes() - Method in class org.apache.spark.ui.jobs.UIData.StageUIData
 
inputCol() - Method in interface org.apache.spark.ml.param.HasInputCol
param for input column name
inputDStream() - Method in class org.apache.spark.streaming.api.java.JavaInputDStream
 
inputDStream() - Method in class org.apache.spark.streaming.api.java.JavaPairInputDStream
 
InputDStream<T> - Class in org.apache.spark.streaming.dstream
This is the abstract base class for all input streams.
InputDStream(StreamingContext, ClassTag<T>) - Constructor for class org.apache.spark.streaming.dstream.InputDStream
 
inputFormatClazz() - Method in class org.apache.spark.scheduler.InputFormatInfo
 
inputFormatClazz() - Method in class org.apache.spark.scheduler.SplitInfo
 
InputFormatInfo - Class in org.apache.spark.scheduler
:: DeveloperApi :: Parses and holds information about inputFormat (and files) specified as a parameter.
InputFormatInfo(Configuration, Class<?>, String) - Constructor for class org.apache.spark.scheduler.InputFormatInfo
 
inputMetrics() - Method in class org.apache.spark.storage.BlockResult
 
inputMetricsFromJson(JsonAST.JValue) - Static method in class org.apache.spark.util.JsonProtocol
 
inputMetricsToJson(InputMetrics) - Static method in class org.apache.spark.util.JsonProtocol
 
inputProjection() - Method in class org.apache.spark.sql.hive.HiveUdafFunction
 
inputRecords() - Method in class org.apache.spark.ui.jobs.UIData.ExecutorSummary
 
inputRecords() - Method in class org.apache.spark.ui.jobs.UIData.StageUIData
 
inputRowFormat() - Method in class org.apache.spark.sql.hive.execution.HiveScriptIOSchema
 
inputRowFormatMap() - Method in class org.apache.spark.sql.hive.execution.HiveScriptIOSchema
 
inputSerdeClass() - Method in class org.apache.spark.sql.hive.execution.HiveScriptIOSchema
 
inputSerdeProps() - Method in class org.apache.spark.sql.hive.execution.HiveScriptIOSchema
 
inputSplit() - Method in class org.apache.spark.rdd.HadoopPartition
 
inputSplitWithLocationInfo() - Method in class org.apache.spark.rdd.HadoopRDD.SplitInfoReflections
 
insert(DataFrame, boolean) - Method in class org.apache.spark.sql.json.JSONRelation
 
insert(DataFrame, boolean) - Method in class org.apache.spark.sql.parquet.ParquetRelation2
 
insert(DataFrame, boolean) - Method in interface org.apache.spark.sql.sources.InsertableRelation
 
InsertableRelation - Interface in org.apache.spark.sql.sources
::DeveloperApi:: A BaseRelation that can be used to insert data into it through the insert method.
insertInto(String, boolean) - Method in class org.apache.spark.sql.DataFrame
:: Experimental :: Adds the rows from this RDD to the specified table, optionally overwriting the existing data.
insertInto(String) - Method in class org.apache.spark.sql.DataFrame
:: Experimental :: Adds the rows from this RDD to the specified table.
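A hedged sketch of appending a DataFrame into an existing table (the DataFrame df and the table name "events" are assumptions):

    df.insertInto("events")                    // append rows to the table
    df.insertInto("events", overwrite = true)  // replace the table's existing data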
InsertIntoDataSource - Class in org.apache.spark.sql.sources
 
InsertIntoDataSource(LogicalRelation, LogicalPlan, boolean) - Constructor for class org.apache.spark.sql.sources.InsertIntoDataSource
 
InsertIntoHiveTable - Class in org.apache.spark.sql.hive.execution
 
InsertIntoHiveTable(MetastoreRelation, Map<String, Option<String>>, SparkPlan, boolean) - Constructor for class org.apache.spark.sql.hive.execution.InsertIntoHiveTable
 
InsertIntoHiveTable - Class in org.apache.spark.sql.hive
A logical plan representing insertion into Hive table.
InsertIntoHiveTable(LogicalPlan, Map<String, Option<String>>, LogicalPlan, boolean) - Constructor for class org.apache.spark.sql.hive.InsertIntoHiveTable
 
insertIntoJDBC(String, String, boolean) - Method in class org.apache.spark.sql.DataFrame
Save this RDD to a JDBC database at url under the table name table.
InsertIntoParquetTable - Class in org.apache.spark.sql.parquet
:: DeveloperApi :: Operator that acts as a sink for queries on RDDs and can be used to store the output inside a directory of Parquet files.
InsertIntoParquetTable(ParquetRelation, SparkPlan, boolean) - Constructor for class org.apache.spark.sql.parquet.InsertIntoParquetTable
 
inShutdown() - Static method in class org.apache.spark.util.Utils
Detect whether this thread might be executing a shutdown hook.
inspectorToDataType(ObjectInspector) - Method in interface org.apache.spark.sql.hive.HiveInspectors
 
instance() - Method in class org.apache.spark.metrics.MetricsSystem
 
instance() - Static method in class org.apache.spark.mllib.tree.impurity.Entropy
Get this impurity instance.
instance() - Static method in class org.apache.spark.mllib.tree.impurity.Gini
Get this impurity instance.
instance() - Static method in class org.apache.spark.mllib.tree.impurity.Variance
Get this impurity instance.
INT - Class in org.apache.spark.sql.columnar
 
INT() - Constructor for class org.apache.spark.sql.columnar.INT
 
intAccumulator(int) - Method in class org.apache.spark.api.java.JavaSparkContext
Create an Accumulator integer variable, which tasks can "add" values to using the add method.
intAccumulator(int, String) - Method in class org.apache.spark.api.java.JavaSparkContext
Create an Accumulator integer variable, which tasks can "add" values to using the add method.
IntColumnAccessor - Class in org.apache.spark.sql.columnar
 
IntColumnAccessor(ByteBuffer) - Constructor for class org.apache.spark.sql.columnar.IntColumnAccessor
 
IntColumnBuilder - Class in org.apache.spark.sql.columnar
 
IntColumnBuilder() - Constructor for class org.apache.spark.sql.columnar.IntColumnBuilder
 
IntColumnStats - Class in org.apache.spark.sql.columnar
 
IntColumnStats() - Constructor for class org.apache.spark.sql.columnar.IntColumnStats
 
IntDelta - Class in org.apache.spark.sql.columnar.compression
 
IntDelta() - Constructor for class org.apache.spark.sql.columnar.compression.IntDelta
 
IntDelta.Decoder - Class in org.apache.spark.sql.columnar.compression
 
IntDelta.Decoder(ByteBuffer, NativeColumnType<IntegerType$>) - Constructor for class org.apache.spark.sql.columnar.compression.IntDelta.Decoder
 
IntDelta.Encoder - Class in org.apache.spark.sql.columnar.compression
 
IntDelta.Encoder() - Constructor for class org.apache.spark.sql.columnar.compression.IntDelta.Encoder
 
IntegerConversion() - Method in class org.apache.spark.sql.jdbc.JDBCRDD
Accessor for nested Scala object
INTER_JOB_WAIT_MS() - Static method in class org.apache.spark.ui.UIWorkloadGenerator
 
intercept() - Method in class org.apache.spark.ml.classification.LogisticRegressionModel
 
intercept() - Method in class org.apache.spark.ml.regression.LinearRegressionModel
 
intercept() - Method in class org.apache.spark.mllib.classification.impl.GLMClassificationModel.SaveLoadV1_0$.Data
 
intercept() - Method in class org.apache.spark.mllib.classification.LogisticRegressionModel
 
intercept() - Method in class org.apache.spark.mllib.classification.SVMModel
 
intercept() - Method in class org.apache.spark.mllib.regression.GeneralizedLinearModel
 
intercept() - Method in class org.apache.spark.mllib.regression.impl.GLMRegressionModel.SaveLoadV1_0$.Data
 
intercept() - Method in class org.apache.spark.mllib.regression.LassoModel
 
intercept() - Method in class org.apache.spark.mllib.regression.LinearRegressionModel
 
intercept() - Method in class org.apache.spark.mllib.regression.RidgeRegressionModel
 
internalMap() - Method in class org.apache.spark.util.TimeStampedHashSet
 
InterruptibleIterator<T> - Class in org.apache.spark
:: DeveloperApi :: An iterator that wraps around an existing iterator to provide task killing functionality.
InterruptibleIterator(TaskContext, Iterator<T>) - Constructor for class org.apache.spark.InterruptibleIterator
 
interruptThread() - Method in class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.KillTask
 
interruptThread() - Method in class org.apache.spark.scheduler.local.KillTask
 
intersect(DataFrame) - Method in class org.apache.spark.sql.DataFrame
Returns a new DataFrame containing rows only in both this frame and another frame.
intersection(JavaDoubleRDD) - Method in class org.apache.spark.api.java.JavaDoubleRDD
Return the intersection of this RDD and another one.
intersection(JavaPairRDD<K, V>) - Method in class org.apache.spark.api.java.JavaPairRDD
Return the intersection of this RDD and another one.
intersection(JavaRDD<T>) - Method in class org.apache.spark.api.java.JavaRDD
Return the intersection of this RDD and another one.
intersection(RDD<T>) - Method in class org.apache.spark.rdd.RDD
Return the intersection of this RDD and another one.
intersection(RDD<T>, Partitioner, Ordering<T>) - Method in class org.apache.spark.rdd.RDD
Return the intersection of this RDD and another one.
intersection(RDD<T>, int) - Method in class org.apache.spark.rdd.RDD
Return the intersection of this RDD and another one.
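A minimal sketch of RDD.intersection, assuming an existing SparkContext sc:

    val a = sc.parallelize(Seq(1, 2, 3, 4))
    val b = sc.parallelize(Seq(3, 4, 5, 6))
    // Elements present in both RDDs; duplicates are removed and ordering is not guaranteed.
    val common = a.intersection(b)
    common.collect()   // e.g. Array(3, 4)
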
Interval - Class in org.apache.spark.streaming
 
Interval(Time, Time) - Constructor for class org.apache.spark.streaming.Interval
 
Interval(long, long) - Constructor for class org.apache.spark.streaming.Interval
 
INTERVAL_DEFAULT() - Static method in class org.apache.spark.util.logging.RollingFileAppender
 
INTERVAL_PROPERTY() - Static method in class org.apache.spark.util.logging.RollingFileAppender
 
IntParam - Class in org.apache.spark.ml.param
Specialized version of Param[Int] for Java.
IntParam(Params, String, String, Option<Object>) - Constructor for class org.apache.spark.ml.param.IntParam
 
IntParam(Params, String, String) - Constructor for class org.apache.spark.ml.param.IntParam
 
IntParam - Class in org.apache.spark.util
An extractor object for parsing strings into integers.
IntParam() - Constructor for class org.apache.spark.util.IntParam
 
intRddToDataFrameHolder(RDD<Object>) - Method in class org.apache.spark.sql.SQLContext.implicits
Creates a single column DataFrame from an RDD[Int].
intToIntWritable(int) - Static method in class org.apache.spark.SparkContext
 
intWritableConverter() - Static method in class org.apache.spark.SparkContext
 
intWritableConverter() - Static method in class org.apache.spark.WritableConverter
 
intWritableFactory() - Static method in class org.apache.spark.WritableFactory
 
invalidateCache(LogicalPlan) - Method in class org.apache.spark.sql.CacheManager
Invalidates the cache of any data that contains plan.
invalidateTable(String, String) - Method in class org.apache.spark.sql.hive.HiveMetastoreCatalog
 
invalidInformationGainStats() - Static method in class org.apache.spark.mllib.tree.model.InformationGainStats
An InformationGainStats object used to denote that the current split doesn't satisfy the minimum info gain or the minimum number of instances per node.
invoke(Class<?>, Object, String, Seq<Tuple2<Class<?>, Object>>) - Static method in class org.apache.spark.util.Utils
 
invokedMethod(Object, Class<?>, String) - Static method in class org.apache.spark.graphx.util.BytecodeUtils
Test whether the given closure invokes the specified method in the specified class.
invokeWriteReplace(Object) - Method in class org.apache.spark.serializer.SerializationDebugger.ObjectStreamClassMethods
 
ioschema() - Method in class org.apache.spark.sql.hive.execution.ScriptTransformation
 
isActive(long) - Method in class org.apache.spark.graphx.impl.EdgePartition
Look up vid in activeSet, throwing an exception if it is None.
isActive() - Method in class org.apache.spark.util.EventLoop
Return whether the event thread has been started but not yet stopped.
isAkkaConf(String) - Static method in class org.apache.spark.SparkConf
Return whether the given config is an akka config (e.g.
isAllowed(Enumeration.Value, Enumeration.Value) - Static method in class org.apache.spark.scheduler.TaskLocality
 
isAuthenticationEnabled() - Method in class org.apache.spark.SecurityManager
Check to see if authentication for the Spark communication protocols is enabled
isAvailable() - Method in class org.apache.spark.scheduler.Stage
 
isBindCollision(Throwable) - Static method in class org.apache.spark.util.Utils
Return whether the exception is caused by an address-port collision when binding.
isBroadcast() - Method in class org.apache.spark.storage.BlockId
 
isCached(String) - Method in class org.apache.spark.sql.CacheManager
Returns true if the table is currently cached in-memory.
isCached(String) - Method in class org.apache.spark.sql.SQLContext
Returns true if the table is currently cached in-memory.
isCached() - Method in class org.apache.spark.storage.BlockStatus
 
isCached() - Method in class org.apache.spark.storage.RDDInfo
 
isCancelled() - Method in class org.apache.spark.ComplexFutureAction
 
isCancelled() - Method in interface org.apache.spark.FutureAction
Returns whether the action has been cancelled.
isCancelled() - Method in class org.apache.spark.JavaFutureActionWrapper
 
isCancelled() - Method in class org.apache.spark.SimpleFutureAction
 
isCategorical(int) - Method in class org.apache.spark.mllib.tree.impl.DecisionTreeMetadata
 
isCheckpointed() - Method in interface org.apache.spark.api.java.JavaRDDLike
Return whether this RDD has been checkpointed or not
isCheckpointed() - Method in class org.apache.spark.graphx.Graph
Return whether this Graph has been checkpointed or not.
isCheckpointed() - Method in class org.apache.spark.graphx.impl.EdgeRDDImpl
 
isCheckpointed() - Method in class org.apache.spark.graphx.impl.GraphImpl
 
isCheckpointed() - Method in class org.apache.spark.graphx.impl.VertexRDDImpl
 
isCheckpointed() - Method in class org.apache.spark.rdd.RDD
Return whether this RDD has been checkpointed or not
isCheckpointed() - Method in class org.apache.spark.rdd.RDDCheckpointData
 
isCheckpointPresent() - Method in class org.apache.spark.streaming.StreamingContext
 
isClassification() - Method in class org.apache.spark.mllib.tree.impl.DecisionTreeMetadata
 
isCompleted() - Method in class org.apache.spark.ComplexFutureAction
 
isCompleted() - Method in interface org.apache.spark.FutureAction
Returns whether the action has already been completed with a value or an exception.
isCompleted() - Method in class org.apache.spark.SimpleFutureAction
 
isCompleted() - Method in class org.apache.spark.TaskContext
Returns true if the task has completed.
isCompleted() - Method in class org.apache.spark.TaskContextImpl
 
isContinuous(int) - Method in class org.apache.spark.mllib.tree.impl.DecisionTreeMetadata
 
isDefined(long) - Method in class org.apache.spark.graphx.impl.VertexPartitionBase
 
isDocumentVertex(Tuple2<Object, ?>) - Static method in class org.apache.spark.mllib.clustering.LDA
 
isDone() - Method in class org.apache.spark.JavaFutureActionWrapper
 
isDriver() - Method in class org.apache.spark.broadcast.BroadcastManager
 
isDriver() - Method in class org.apache.spark.storage.BlockManagerId
 
isEmpty() - Method in interface org.apache.spark.api.java.JavaRDDLike
 
isEmpty() - Method in class org.apache.spark.rdd.PartitionCoalescer.LocationIterator
 
isEmpty() - Method in class org.apache.spark.rdd.RDD
 
isEmpty() - Method in class org.apache.spark.sql.CacheManager
Checks if the cache is empty.
isEventLogEnabled() - Method in class org.apache.spark.SparkContext
 
isExecutorAlive(String) - Method in class org.apache.spark.scheduler.TaskSchedulerImpl
 
isExecutorStartupConf(String) - Static method in class org.apache.spark.SparkConf
Return whether the given config should be passed to an executor on start-up.
isExtended() - Method in class org.apache.spark.sql.hive.execution.DescribeHiveTableCommand
 
isExtended() - Method in class org.apache.spark.sql.sources.DescribeCommand
 
isFairScheduler() - Method in class org.apache.spark.ui.jobs.JobsTab
 
isFairScheduler() - Method in class org.apache.spark.ui.jobs.StagesTab
 
isFatalError(Throwable) - Static method in class org.apache.spark.util.Utils
Returns true if the given exception was fatal.
isFinished(Protos.TaskState) - Method in class org.apache.spark.scheduler.cluster.mesos.MesosSchedulerBackend
Check whether a Mesos task state represents a finished task
isFinished(Enumeration.Value) - Static method in class org.apache.spark.TaskState
 
isInitialized() - Method in class org.apache.spark.streaming.dstream.DStream
 
isInitialValueFinal() - Method in class org.apache.spark.partial.PartialResult
 
isInMemory() - Method in class org.apache.spark.rdd.HadoopRDD.SplitInfoReflections
 
isInterrupted() - Method in class org.apache.spark.TaskContext
Returns true if the task has been killed.
isInterrupted() - Method in class org.apache.spark.TaskContextImpl
 
isLeaf() - Method in class org.apache.spark.mllib.tree.model.DecisionTreeModel.SaveLoadV1_0$.NodeData
 
isLeaf() - Method in class org.apache.spark.mllib.tree.model.Node
 
isLeftChild(int) - Static method in class org.apache.spark.mllib.tree.model.Node
Returns true if this is a left child.
isLocal() - Method in class org.apache.spark.api.java.JavaSparkContext
 
isLocal() - Method in class org.apache.spark.SparkContext
 
isLocal() - Method in class org.apache.spark.sql.DataFrame
Returns true if the collect and take methods can be run locally (without any Spark executors).
isLocal() - Method in class org.apache.spark.storage.BlockManagerMasterActor
 
isLogManagerEnabled() - Method in class org.apache.spark.streaming.scheduler.ReceivedBlockTracker
Check if the log manager is enabled.
isMac() - Static method in class org.apache.spark.util.Utils
Whether the underlying operating system is Mac OS X.
isMulticlass() - Method in class org.apache.spark.mllib.tree.impl.DecisionTreeMetadata
 
isMulticlassClassification() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
 
isMulticlassWithCategoricalFeatures() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
 
isMulticlassWithCategoricalFeatures() - Method in class org.apache.spark.mllib.tree.impl.DecisionTreeMetadata
 
isMultipleOf(Duration) - Method in class org.apache.spark.streaming.Duration
 
isMultipleOf(Duration) - Method in class org.apache.spark.streaming.Time
 
isNotNull() - Method in class org.apache.spark.sql.Column
True if the current expression is NOT null.
IsNotNull - Class in org.apache.spark.sql.sources
 
IsNotNull(String) - Constructor for class org.apache.spark.sql.sources.IsNotNull
 
isNull() - Method in class org.apache.spark.sql.Column
True if the current expression is null.
IsNull - Class in org.apache.spark.sql.sources
 
IsNull(String) - Constructor for class org.apache.spark.sql.sources.IsNull
 
isOpen() - Method in class org.apache.spark.storage.BlockObjectWriter
 
isOpen() - Method in class org.apache.spark.storage.DiskBlockObjectWriter
 
isotonic() - Method in class org.apache.spark.mllib.regression.IsotonicRegressionModel
 
IsotonicRegression - Class in org.apache.spark.mllib.regression
:: Experimental ::
IsotonicRegression() - Constructor for class org.apache.spark.mllib.regression.IsotonicRegression
Constructs an IsotonicRegression instance with the default parameter isotonic = true.
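A minimal sketch of fitting an isotonic model, assuming an existing SparkContext sc; the (label, feature, weight) tuples below are illustrative data only:

    import org.apache.spark.mllib.regression.IsotonicRegression

    val data = sc.parallelize(Seq(
      (1.0, 1.0, 1.0),   // (label, feature, weight)
      (2.0, 2.0, 1.0),
      (4.0, 3.0, 1.0)))
    val model = new IsotonicRegression().setIsotonic(true).run(data)
    model.predict(2.5)   // interpolates between the fitted boundaries
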
IsotonicRegressionModel - Class in org.apache.spark.mllib.regression
:: Experimental ::
IsotonicRegressionModel(double[], double[], boolean) - Constructor for class org.apache.spark.mllib.regression.IsotonicRegressionModel
 
isParquetBinaryAsString() - Method in class org.apache.spark.sql.SQLConf
When set to true, we always treat byte arrays in Parquet files as strings.
isParquetINT96AsTimestamp() - Method in class org.apache.spark.sql.SQLConf
When set to true, we always treat INT96 values in Parquet files as timestamps.
isPartitioned() - Method in class org.apache.spark.sql.parquet.ParquetRelation2
 
isPrimitiveType(DataType) - Static method in class org.apache.spark.sql.parquet.ParquetTypesConverter
 
isRDD() - Method in class org.apache.spark.storage.BlockId
 
isReady() - Method in class org.apache.spark.scheduler.cluster.CoarseGrainedSchedulerBackend
 
isReady() - Method in interface org.apache.spark.scheduler.SchedulerBackend
 
isReceiverStarted() - Method in class org.apache.spark.streaming.receiver.ReceiverSupervisor
Check if receiver has been marked for starting
isReceiverStopped() - Method in class org.apache.spark.streaming.receiver.ReceiverSupervisor
Check if receiver has been marked for stopping
isRegistered() - Method in class org.apache.spark.scheduler.cluster.mesos.CoarseMesosSchedulerBackend
 
isRegistered() - Method in class org.apache.spark.scheduler.cluster.mesos.MesosSchedulerBackend
 
isRoot() - Method in class org.apache.spark.mllib.fpm.FPTree.Node
 
isRunningInYarnContainer(SparkConf) - Static method in class org.apache.spark.util.Utils
 
isRunningLocally() - Method in class org.apache.spark.TaskContext
Returns true if the task is running locally in the driver program.
isRunningLocally() - Method in class org.apache.spark.TaskContextImpl
 
isSet(Param<?>) - Method in interface org.apache.spark.ml.param.Params
Checks whether a param is explicitly set.
isShuffle() - Method in class org.apache.spark.storage.BlockId
 
isShuffleMap() - Method in class org.apache.spark.scheduler.Stage
 
isSparkPortConf(String) - Static method in class org.apache.spark.SparkConf
Return true if the given config matches either spark.*.port or spark.port.*.
isSplitable(JobContext, Path) - Method in class org.apache.spark.input.FixedLengthBinaryInputFormat
Override of isSplitable to ensure initial computation of the record length
isStarted() - Method in class org.apache.spark.streaming.receiver.Receiver
Check if the receiver has started or not.
isStopped() - Method in class org.apache.spark.SparkEnv
 
isStopped() - Method in class org.apache.spark.streaming.receiver.Receiver
Check if receiver has been marked for stopping.
isSymlink(File) - Static method in class org.apache.spark.util.Utils
Check to see if file is a symbolic link.
isTermVertex(Tuple2<Object, ?>) - Static method in class org.apache.spark.mllib.clustering.LDA
 
isTesting() - Static method in class org.apache.spark.util.Utils
Indicates whether Spark is currently running unit tests.
isTimeValid(Time) - Method in class org.apache.spark.streaming.dstream.DStream
Checks whether the 'time' is valid with respect to slideDuration for generating an RDD
isTimeValid(Time) - Method in class org.apache.spark.streaming.dstream.InputDStream
Checks whether the 'time' is valid with respect to slideDuration for generating an RDD.
isTraceEnabled() - Method in interface org.apache.spark.Logging
 
isTransposed() - Method in class org.apache.spark.mllib.linalg.DenseMatrix
 
isTransposed() - Method in interface org.apache.spark.mllib.linalg.Matrix
Flag that keeps track whether the matrix is transposed or not.
isTransposed() - Method in class org.apache.spark.mllib.linalg.SparseMatrix
 
isUDAFBridgeRequired() - Method in class org.apache.spark.sql.hive.HiveUdafFunction
 
isUnordered(int) - Method in class org.apache.spark.mllib.tree.impl.DecisionTreeMetadata
 
isValid() - Method in class org.apache.spark.broadcast.Broadcast
Whether this Broadcast is actually usable.
isValid() - Method in class org.apache.spark.rdd.BlockRDD
Whether this BlockRDD is actually usable.
isValid() - Method in class org.apache.spark.storage.StorageLevel
 
isWindows() - Static method in class org.apache.spark.util.Utils
Whether the underlying operating system is Windows.
isWorthCompressing(Encoder<T>) - Method in interface org.apache.spark.sql.columnar.compression.CompressibleColumnBuilder
 
isZero() - Method in class org.apache.spark.streaming.Duration
 
isZombie() - Method in class org.apache.spark.scheduler.TaskSetManager
 
it() - Method in class org.apache.spark.rdd.PartitionCoalescer.LocationIterator
 
item() - Method in class org.apache.spark.ml.recommendation.ALS.Rating
 
item() - Method in class org.apache.spark.mllib.fpm.FPTree.Node
 
item() - Method in class org.apache.spark.streaming.receiver.SingleItemData
 
itemCol() - Method in interface org.apache.spark.ml.recommendation.ALSParams
Param for the column name for item ids.
items() - Method in class org.apache.spark.mllib.fpm.FPGrowth.FreqItemset
 
iterationTimes() - Method in class org.apache.spark.mllib.clustering.DistributedLDAModel
 
iterator(Partition, TaskContext) - Method in interface org.apache.spark.api.java.JavaRDDLike
Internal method to this RDD; will read from cache if applicable, or otherwise compute it.
iterator() - Method in class org.apache.spark.graphx.impl.EdgePartition
Get an iterator over the edges in this partition.
iterator() - Method in class org.apache.spark.graphx.impl.RoutingTablePartition
Returns an iterator over all vertex ids stored in this `RoutingTablePartition`.
iterator() - Method in class org.apache.spark.graphx.impl.VertexAttributeBlock
 
iterator() - Method in class org.apache.spark.graphx.impl.VertexPartitionBase
 
iterator() - Method in class org.apache.spark.rdd.ParallelCollectionPartition
 
iterator(Partition, TaskContext) - Method in class org.apache.spark.rdd.RDD
Internal method to this RDD; will read from cache if applicable, or otherwise compute it.
iterator() - Method in class org.apache.spark.sql.sources.CaseInsensitiveMap
 
iterator() - Method in class org.apache.spark.storage.IteratorValues
 
iterator() - Method in class org.apache.spark.streaming.receiver.IteratorBlock
 
iterator() - Method in class org.apache.spark.streaming.receiver.IteratorData
 
iterator() - Method in class org.apache.spark.util.BoundedPriorityQueue
 
iterator() - Method in class org.apache.spark.util.TimeStampedHashMap
 
iterator() - Method in class org.apache.spark.util.TimeStampedHashSet
 
iterator() - Method in class org.apache.spark.util.TimeStampedWeakValueHashMap
 
IteratorBlock - Class in org.apache.spark.streaming.receiver
Class representing a block received as an Iterator.
IteratorBlock(Iterator<Object>) - Constructor for class org.apache.spark.streaming.receiver.IteratorBlock
 
IteratorData<T> - Class in org.apache.spark.streaming.receiver
 
IteratorData(Iterator<T>) - Constructor for class org.apache.spark.streaming.receiver.IteratorData
 
IteratorValues - Class in org.apache.spark.storage
 
IteratorValues(Iterator<Object>) - Constructor for class org.apache.spark.storage.IteratorValues
 

J

j() - Method in class org.apache.spark.mllib.linalg.distributed.MatrixEntry
 
jarDir() - Method in class org.apache.spark.HttpFileServer
 
jarOfClass(Class<?>) - Static method in class org.apache.spark.api.java.JavaSparkContext
Find the JAR from which a given class was loaded, to make it easy for users to pass their JARs to SparkContext.
jarOfClass(Class<?>) - Static method in class org.apache.spark.SparkContext
Find the JAR from which a given class was loaded, to make it easy for users to pass their JARs to SparkContext.
jarOfClass(Class<?>) - Static method in class org.apache.spark.streaming.api.java.JavaStreamingContext
Find the JAR from which a given class was loaded, to make it easy for users to pass their JARs to StreamingContext.
jarOfClass(Class<?>) - Static method in class org.apache.spark.streaming.StreamingContext
Find the JAR from which a given class was loaded, to make it easy for users to pass their JARs to StreamingContext.
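A minimal sketch of the common use of jarOfClass, where MyApp is a hypothetical application object packaged into a JAR:

    import org.apache.spark.{SparkConf, SparkContext}

    object MyApp {   // hypothetical application entry point
      def main(args: Array[String]): Unit = {
        val conf = new SparkConf()
          .setAppName("my-app")
          // Ship the JAR that contains MyApp itself to the executors.
          .setJars(SparkContext.jarOfClass(this.getClass).toSeq)
        val sc = new SparkContext(conf)
        // ... job logic ...
        sc.stop()
      }
    }
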
jarOfObject(Object) - Static method in class org.apache.spark.api.java.JavaSparkContext
Find the JAR that contains the class of a particular object, to make it easy for users to pass their JARs to SparkContext.
jarOfObject(Object) - Static method in class org.apache.spark.SparkContext
Find the JAR that contains the class of a particular object, to make it easy for users to pass their JARs to SparkContext.
jars() - Method in class org.apache.spark.api.java.JavaSparkContext
 
jars() - Method in class org.apache.spark.SparkContext
 
jars() - Method in class org.apache.spark.streaming.Checkpoint
 
javaClassToDataType(Class<?>) - Method in interface org.apache.spark.sql.hive.HiveInspectors
 
JavaDeserializationStream - Class in org.apache.spark.serializer
 
JavaDeserializationStream(InputStream, ClassLoader) - Constructor for class org.apache.spark.serializer.JavaDeserializationStream
 
JavaDoubleRDD - Class in org.apache.spark.api.java
 
JavaDoubleRDD(RDD<Object>) - Constructor for class org.apache.spark.api.java.JavaDoubleRDD
 
JavaDStream<T> - Class in org.apache.spark.streaming.api.java
A Java-friendly interface to DStream, the basic abstraction in Spark Streaming that represents a continuous stream of data.
JavaDStream(DStream<T>, ClassTag<T>) - Constructor for class org.apache.spark.streaming.api.java.JavaDStream
 
JavaDStreamLike<T,This extends JavaDStreamLike<T,This,R>,R extends JavaRDDLike<T,R>> - Interface in org.apache.spark.streaming.api.java
 
JavaFutureAction<T> - Interface in org.apache.spark.api.java
 
JavaFutureActionWrapper<S,T> - Class in org.apache.spark
 
JavaFutureActionWrapper(FutureAction<S>, Function1<S, T>) - Constructor for class org.apache.spark.JavaFutureActionWrapper
 
JavaHadoopRDD<K,V> - Class in org.apache.spark.api.java
 
JavaHadoopRDD(HadoopRDD<K, V>, ClassTag<K>, ClassTag<V>) - Constructor for class org.apache.spark.api.java.JavaHadoopRDD
 
JavaInputDStream<T> - Class in org.apache.spark.streaming.api.java
A Java-friendly interface to InputDStream.
JavaInputDStream(InputDStream<T>, ClassTag<T>) - Constructor for class org.apache.spark.streaming.api.java.JavaInputDStream
 
javaItems() - Method in class org.apache.spark.mllib.fpm.FPGrowth.FreqItemset
Returns items in a Java List.
JavaIterableWrapperSerializer - Class in org.apache.spark.serializer
A Kryo serializer for serializing results returned by asJavaIterable.
JavaIterableWrapperSerializer() - Constructor for class org.apache.spark.serializer.JavaIterableWrapperSerializer
 
JavaKinesisWordCountASL - Class in org.apache.spark.examples.streaming
Java-friendly Kinesis Spark Streaming WordCount example. See http://spark.apache.org/docs/latest/streaming-kinesis.html for more details on the Kinesis Spark Streaming integration.
JavaNewHadoopRDD<K,V> - Class in org.apache.spark.api.java
 
JavaNewHadoopRDD(NewHadoopRDD<K, V>, ClassTag<K>, ClassTag<V>) - Constructor for class org.apache.spark.api.java.JavaNewHadoopRDD
 
JavaPairDStream<K,V> - Class in org.apache.spark.streaming.api.java
A Java-friendly interface to a DStream of key-value pairs, which provides extra methods like reduceByKey and join.
JavaPairDStream(DStream<Tuple2<K, V>>, ClassTag<K>, ClassTag<V>) - Constructor for class org.apache.spark.streaming.api.java.JavaPairDStream
 
JavaPairInputDStream<K,V> - Class in org.apache.spark.streaming.api.java
A Java-friendly interface to InputDStream of key-value pairs.
JavaPairInputDStream(InputDStream<Tuple2<K, V>>, ClassTag<K>, ClassTag<V>) - Constructor for class org.apache.spark.streaming.api.java.JavaPairInputDStream
 
JavaPairRDD<K,V> - Class in org.apache.spark.api.java
 
JavaPairRDD(RDD<Tuple2<K, V>>, ClassTag<K>, ClassTag<V>) - Constructor for class org.apache.spark.api.java.JavaPairRDD
 
JavaPairReceiverInputDStream<K,V> - Class in org.apache.spark.streaming.api.java
A Java-friendly interface to ReceiverInputDStream, the abstract class for defining any input stream that receives data over the network.
JavaPairReceiverInputDStream(ReceiverInputDStream<Tuple2<K, V>>, ClassTag<K>, ClassTag<V>) - Constructor for class org.apache.spark.streaming.api.java.JavaPairReceiverInputDStream
 
JavaRDD<T> - Class in org.apache.spark.api.java
 
JavaRDD(RDD<T>, ClassTag<T>) - Constructor for class org.apache.spark.api.java.JavaRDD
 
javaRDD() - Method in class org.apache.spark.sql.DataFrame
Returns the content of the DataFrame as a JavaRDD of Rows.
JavaRDDLike<T,This extends JavaRDDLike<T,This>> - Interface in org.apache.spark.api.java
Defines operations common to several Java RDD implementations.
JavaReceiverInputDStream<T> - Class in org.apache.spark.streaming.api.java
A Java-friendly interface to ReceiverInputDStream, the abstract class for defining any input stream that receives data over the network.
JavaReceiverInputDStream(ReceiverInputDStream<T>, ClassTag<T>) - Constructor for class org.apache.spark.streaming.api.java.JavaReceiverInputDStream
 
JavaSerializationStream - Class in org.apache.spark.serializer
 
JavaSerializationStream(OutputStream, int, boolean) - Constructor for class org.apache.spark.serializer.JavaSerializationStream
 
JavaSerializer - Class in org.apache.spark.serializer
:: DeveloperApi :: A Spark serializer that uses Java's built-in serialization.
JavaSerializer(SparkConf) - Constructor for class org.apache.spark.serializer.JavaSerializer
 
JavaSerializerInstance - Class in org.apache.spark.serializer
 
JavaSerializerInstance(int, boolean, ClassLoader) - Constructor for class org.apache.spark.serializer.JavaSerializerInstance
 
JavaSparkContext - Class in org.apache.spark.api.java
A Java-friendly version of SparkContext that returns JavaRDDs and works with Java collections instead of Scala ones.
JavaSparkContext(SparkContext) - Constructor for class org.apache.spark.api.java.JavaSparkContext
 
JavaSparkContext() - Constructor for class org.apache.spark.api.java.JavaSparkContext
Create a JavaSparkContext that loads settings from system properties (for instance, when launching with ./bin/spark-submit).
JavaSparkContext(SparkConf) - Constructor for class org.apache.spark.api.java.JavaSparkContext
 
JavaSparkContext(String, String) - Constructor for class org.apache.spark.api.java.JavaSparkContext
 
JavaSparkContext(String, String, SparkConf) - Constructor for class org.apache.spark.api.java.JavaSparkContext
 
JavaSparkContext(String, String, String, String) - Constructor for class org.apache.spark.api.java.JavaSparkContext
 
JavaSparkContext(String, String, String, String[]) - Constructor for class org.apache.spark.api.java.JavaSparkContext
 
JavaSparkContext(String, String, String, String[], Map<String, String>) - Constructor for class org.apache.spark.api.java.JavaSparkContext
 
JavaSparkListener - Class in org.apache.spark
Java clients should extend this class instead of implementing SparkListener directly.
JavaSparkListener() - Constructor for class org.apache.spark.JavaSparkListener
 
JavaSparkStatusTracker - Class in org.apache.spark.api.java
Low-level status reporting APIs for monitoring job and stage progress.
JavaSparkStatusTracker(SparkContext) - Constructor for class org.apache.spark.api.java.JavaSparkStatusTracker
 
JavaStreamingContext - Class in org.apache.spark.streaming.api.java
A Java-friendly version of StreamingContext which is the main entry point for Spark Streaming functionality.
JavaStreamingContext(StreamingContext) - Constructor for class org.apache.spark.streaming.api.java.JavaStreamingContext
 
JavaStreamingContext(String, String, Duration) - Constructor for class org.apache.spark.streaming.api.java.JavaStreamingContext
Create a StreamingContext.
JavaStreamingContext(String, String, Duration, String, String) - Constructor for class org.apache.spark.streaming.api.java.JavaStreamingContext
Create a StreamingContext.
JavaStreamingContext(String, String, Duration, String, String[]) - Constructor for class org.apache.spark.streaming.api.java.JavaStreamingContext
Create a StreamingContext.
JavaStreamingContext(String, String, Duration, String, String[], Map<String, String>) - Constructor for class org.apache.spark.streaming.api.java.JavaStreamingContext
Create a StreamingContext.
JavaStreamingContext(JavaSparkContext, Duration) - Constructor for class org.apache.spark.streaming.api.java.JavaStreamingContext
Create a JavaStreamingContext using an existing JavaSparkContext.
JavaStreamingContext(SparkConf, Duration) - Constructor for class org.apache.spark.streaming.api.java.JavaStreamingContext
Create a JavaStreamingContext using a SparkConf configuration.
JavaStreamingContext(String) - Constructor for class org.apache.spark.streaming.api.java.JavaStreamingContext
Recreate a JavaStreamingContext from a checkpoint file.
JavaStreamingContext(String, Configuration) - Constructor for class org.apache.spark.streaming.api.java.JavaStreamingContext
Re-creates a JavaStreamingContext from a checkpoint file.
JavaStreamingContextFactory - Interface in org.apache.spark.streaming.api.java
Factory interface for creating a new JavaStreamingContext
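The checkpoint-recovery constructors and JavaStreamingContextFactory above exist so that a context can be rebuilt from checkpoint data on restart; a minimal sketch using the Scala counterpart, StreamingContext.getOrCreate (the checkpoint directory is a placeholder):

    import org.apache.spark.SparkConf
    import org.apache.spark.streaming.{Seconds, StreamingContext}

    def createContext(): StreamingContext = {
      val conf = new SparkConf().setAppName("checkpointed-app")
      val ssc = new StreamingContext(conf, Seconds(1))
      ssc.checkpoint("hdfs://path/to/checkpoints")   // placeholder path
      // define the DStream computation here
      ssc
    }

    // Recreates the context from the checkpoint if one exists, otherwise calls createContext().
    val ssc = StreamingContext.getOrCreate("hdfs://path/to/checkpoints", createContext _)
    ssc.start()
    ssc.awaitTermination()
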
JavaUtils - Class in org.apache.spark.api.java
 
JavaUtils() - Constructor for class org.apache.spark.api.java.JavaUtils
 
JavaUtils.SerializableMapWrapper<A,B> - Class in org.apache.spark.api.java
 
JavaUtils.SerializableMapWrapper(Map<A, B>) - Constructor for class org.apache.spark.api.java.JavaUtils.SerializableMapWrapper
 
jdbc(String, String) - Method in class org.apache.spark.sql.SQLContext
:: Experimental :: Construct a DataFrame representing the database table accessible via JDBC URL url named table.
jdbc(String, String, String, long, long, int) - Method in class org.apache.spark.sql.SQLContext
:: Experimental :: Construct a DataFrame representing the database table accessible via JDBC URL url named table.
jdbc(String, String, String[]) - Method in class org.apache.spark.sql.SQLContext
:: Experimental :: Construct a DataFrame representing the database table accessible via JDBC URL url named table.
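A minimal sketch of the single-partition jdbc overload, assuming an existing SQLContext sqlContext and that the appropriate JDBC driver is on the classpath; the URL and table name are placeholders:

    // Reads the whole "people" table through one JDBC connection.
    val people = sqlContext.jdbc("jdbc:postgresql://localhost/test", "people")
    people.printSchema()
    people.show()
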
JdbcPartition - Class in org.apache.spark.rdd
 
JdbcPartition(int, long, long) - Constructor for class org.apache.spark.rdd.JdbcPartition
 
JDBCPartition - Class in org.apache.spark.sql.jdbc
Data corresponding to one partition of a JDBCRDD.
JDBCPartition(String, int) - Constructor for class org.apache.spark.sql.jdbc.JDBCPartition
 
JDBCPartitioningInfo - Class in org.apache.spark.sql.jdbc
Instructions on how to partition the table among workers.
JDBCPartitioningInfo(String, long, long, int) - Constructor for class org.apache.spark.sql.jdbc.JDBCPartitioningInfo
 
JdbcRDD<T> - Class in org.apache.spark.rdd
An RDD that executes an SQL query on a JDBC connection and reads results.
JdbcRDD(SparkContext, Function0<Connection>, String, long, long, int, Function1<ResultSet, T>, ClassTag<T>) - Constructor for class org.apache.spark.rdd.JdbcRDD
 
JDBCRDD - Class in org.apache.spark.sql.jdbc
An RDD representing a table in a database accessed via JDBC.
JDBCRDD(SparkContext, Function0<Connection>, StructType, String, String[], Filter[], Partition[]) - Constructor for class org.apache.spark.sql.jdbc.JDBCRDD
 
JDBCRDD.BinaryConversion$ - Class in org.apache.spark.sql.jdbc
 
JDBCRDD.BinaryConversion$() - Constructor for class org.apache.spark.sql.jdbc.JDBCRDD.BinaryConversion$
 
JDBCRDD.BinaryLongConversion$ - Class in org.apache.spark.sql.jdbc
 
JDBCRDD.BinaryLongConversion$() - Constructor for class org.apache.spark.sql.jdbc.JDBCRDD.BinaryLongConversion$
 
JDBCRDD.BooleanConversion$ - Class in org.apache.spark.sql.jdbc
 
JDBCRDD.BooleanConversion$() - Constructor for class org.apache.spark.sql.jdbc.JDBCRDD.BooleanConversion$
 
JdbcRDD.ConnectionFactory - Interface in org.apache.spark.rdd
 
JDBCRDD.DateConversion$ - Class in org.apache.spark.sql.jdbc
 
JDBCRDD.DateConversion$() - Constructor for class org.apache.spark.sql.jdbc.JDBCRDD.DateConversion$
 
JDBCRDD.DecimalConversion$ - Class in org.apache.spark.sql.jdbc
 
JDBCRDD.DecimalConversion$() - Constructor for class org.apache.spark.sql.jdbc.JDBCRDD.DecimalConversion$
 
JDBCRDD.DoubleConversion$ - Class in org.apache.spark.sql.jdbc
 
JDBCRDD.DoubleConversion$() - Constructor for class org.apache.spark.sql.jdbc.JDBCRDD.DoubleConversion$
 
JDBCRDD.FloatConversion$ - Class in org.apache.spark.sql.jdbc
 
JDBCRDD.FloatConversion$() - Constructor for class org.apache.spark.sql.jdbc.JDBCRDD.FloatConversion$
 
JDBCRDD.IntegerConversion$ - Class in org.apache.spark.sql.jdbc
 
JDBCRDD.IntegerConversion$() - Constructor for class org.apache.spark.sql.jdbc.JDBCRDD.IntegerConversion$
 
JDBCRDD.JDBCConversion - Class in org.apache.spark.sql.jdbc
 
JDBCRDD.JDBCConversion() - Constructor for class org.apache.spark.sql.jdbc.JDBCRDD.JDBCConversion
 
JDBCRDD.LongConversion$ - Class in org.apache.spark.sql.jdbc
 
JDBCRDD.LongConversion$() - Constructor for class org.apache.spark.sql.jdbc.JDBCRDD.LongConversion$
 
JDBCRDD.StringConversion$ - Class in org.apache.spark.sql.jdbc
 
JDBCRDD.StringConversion$() - Constructor for class org.apache.spark.sql.jdbc.JDBCRDD.StringConversion$
 
JDBCRDD.TimestampConversion$ - Class in org.apache.spark.sql.jdbc
 
JDBCRDD.TimestampConversion$() - Constructor for class org.apache.spark.sql.jdbc.JDBCRDD.TimestampConversion$
 
JDBCRelation - Class in org.apache.spark.sql.jdbc
 
JDBCRelation(String, String, Partition[], SQLContext) - Constructor for class org.apache.spark.sql.jdbc.JDBCRelation
 
JettyUtils - Class in org.apache.spark.ui
Utilities for launching a web server using Jetty's HTTP Server class
JettyUtils() - Constructor for class org.apache.spark.ui.JettyUtils
 
JettyUtils.ServletParams<T> - Class in org.apache.spark.ui
 
JettyUtils.ServletParams(Function1<HttpServletRequest, T>, String, Function1<T, String>, Function1<T, Object>) - Constructor for class org.apache.spark.ui.JettyUtils.ServletParams
 
JettyUtils.ServletParams$ - Class in org.apache.spark.ui
 
JettyUtils.ServletParams$() - Constructor for class org.apache.spark.ui.JettyUtils.ServletParams$
 
JmxSink - Class in org.apache.spark.metrics.sink
 
JmxSink(Properties, MetricRegistry, SecurityManager) - Constructor for class org.apache.spark.metrics.sink.JmxSink
 
Job - Class in org.apache.spark.streaming.scheduler
Class representing a Spark computation.
Job(Time, Function0<?>) - Constructor for class org.apache.spark.streaming.scheduler.Job
 
job() - Method in class org.apache.spark.streaming.scheduler.JobCompleted
 
job() - Method in class org.apache.spark.streaming.scheduler.JobStarted
 
JobCancelled - Class in org.apache.spark.scheduler
 
JobCancelled(int) - Constructor for class org.apache.spark.scheduler.JobCancelled
 
JobCompleted - Class in org.apache.spark.streaming.scheduler
 
JobCompleted(Job) - Constructor for class org.apache.spark.streaming.scheduler.JobCompleted
 
jobEndFromJson(JsonAST.JValue) - Static method in class org.apache.spark.util.JsonProtocol
 
jobEndToJson(SparkListenerJobEnd) - Static method in class org.apache.spark.util.JsonProtocol
 
JobExecutionStatus - Enum in org.apache.spark
 
jobFailed(Exception) - Method in class org.apache.spark.partial.ApproximateActionListener
 
JobFailed - Class in org.apache.spark.scheduler
 
JobFailed(Exception) - Constructor for class org.apache.spark.scheduler.JobFailed
 
jobFailed(Exception) - Method in interface org.apache.spark.scheduler.JobListener
 
jobFailed(Exception) - Method in class org.apache.spark.scheduler.JobWaiter
 
jobFinished() - Method in class org.apache.spark.scheduler.JobWaiter
 
JobGenerator - Class in org.apache.spark.streaming.scheduler
This class generates jobs from DStreams as well as drives checkpointing and cleaning up DStream metadata.
JobGenerator(JobScheduler) - Constructor for class org.apache.spark.streaming.scheduler.JobGenerator
 
JobGeneratorEvent - Interface in org.apache.spark.streaming.scheduler
Event classes for JobGenerator
jobGroup() - Method in class org.apache.spark.ui.jobs.UIData.JobUIData
 
JobGroupCancelled - Class in org.apache.spark.scheduler
 
JobGroupCancelled(String) - Constructor for class org.apache.spark.scheduler.JobGroupCancelled
 
jobId() - Method in class org.apache.spark.scheduler.ActiveJob
 
jobId() - Method in class org.apache.spark.scheduler.JobCancelled
 
jobId() - Method in class org.apache.spark.scheduler.JobSubmitted
 
jobId() - Method in class org.apache.spark.scheduler.JobWaiter
 
jobId() - Method in class org.apache.spark.scheduler.SparkListenerJobEnd
 
jobId() - Method in class org.apache.spark.scheduler.SparkListenerJobStart
 
jobId() - Method in class org.apache.spark.scheduler.Stage
 
jobId() - Method in interface org.apache.spark.SparkJobInfo
 
jobId() - Method in class org.apache.spark.SparkJobInfoImpl
 
jobID() - Method in class org.apache.spark.TaskCommitDenied
 
jobId() - Method in class org.apache.spark.ui.jobs.UIData.JobUIData
 
jobIds() - Method in interface org.apache.spark.api.java.JavaFutureAction
Returns the job IDs run by the underlying async operation.
jobIds() - Method in class org.apache.spark.ComplexFutureAction
 
jobIds() - Method in interface org.apache.spark.FutureAction
Returns the job IDs run by the underlying async operation.
jobIds() - Method in class org.apache.spark.JavaFutureActionWrapper
 
jobIds() - Method in class org.apache.spark.scheduler.Stage
Set of jobs that this stage belongs to.
jobIds() - Method in class org.apache.spark.SimpleFutureAction
 
jobIdToActiveJob() - Method in class org.apache.spark.scheduler.DAGScheduler
 
jobIdToData() - Method in class org.apache.spark.ui.jobs.JobProgressListener
 
jobIdToStageIds() - Method in class org.apache.spark.scheduler.DAGScheduler
 
JobListener - Interface in org.apache.spark.scheduler
Interface used to listen for job completion or failure events after submitting a job to the DAGScheduler.
JobLogger - Class in org.apache.spark.scheduler
:: DeveloperApi :: A logger class to record runtime information for jobs in Spark.
JobLogger(String, String) - Constructor for class org.apache.spark.scheduler.JobLogger
 
JobLogger() - Constructor for class org.apache.spark.scheduler.JobLogger
 
JobPage - Class in org.apache.spark.ui.jobs
Page showing statistics and stage list for a given job
JobPage(JobsTab) - Constructor for class org.apache.spark.ui.jobs.JobPage
 
jobProgressListener() - Method in class org.apache.spark.SparkContext
 
JobProgressListener - Class in org.apache.spark.ui.jobs
:: DeveloperApi :: Tracks task-level information to be displayed in the UI.
JobProgressListener(SparkConf) - Constructor for class org.apache.spark.ui.jobs.JobProgressListener
 
jobProgressListener() - Method in class org.apache.spark.ui.SparkUI
 
JobResult - Interface in org.apache.spark.scheduler
:: DeveloperApi :: A result of a job in the DAGScheduler.
jobResult() - Method in class org.apache.spark.scheduler.SparkListenerJobEnd
 
jobResultFromJson(JsonAST.JValue) - Static method in class org.apache.spark.util.JsonProtocol
 
jobResultToJson(JobResult) - Static method in class org.apache.spark.util.JsonProtocol
 
jobs() - Method in class org.apache.spark.streaming.scheduler.JobSet
 
JobScheduler - Class in org.apache.spark.streaming.scheduler
This class schedules jobs to be run on Spark.
JobScheduler(StreamingContext) - Constructor for class org.apache.spark.streaming.scheduler.JobScheduler
 
JobSchedulerEvent - Interface in org.apache.spark.streaming.scheduler
 
JobSet - Class in org.apache.spark.streaming.scheduler
Class representing a set of Jobs belonging to the same batch.
JobSet(Time, Seq<Job>, Map<Object, ReceivedBlockInfo[]>) - Constructor for class org.apache.spark.streaming.scheduler.JobSet
 
JobsTab - Class in org.apache.spark.ui.jobs
Web UI showing progress status of all jobs in the given SparkContext.
JobsTab(SparkUI) - Constructor for class org.apache.spark.ui.jobs.JobsTab
 
JobStarted - Class in org.apache.spark.streaming.scheduler
 
JobStarted(Job) - Constructor for class org.apache.spark.streaming.scheduler.JobStarted
 
jobStartFromJson(JsonAST.JValue) - Static method in class org.apache.spark.util.JsonProtocol
 
jobStartToJson(SparkListenerJobStart) - Static method in class org.apache.spark.util.JsonProtocol
 
JobSubmitted - Class in org.apache.spark.scheduler
 
JobSubmitted(int, RDD<?>, Function2<TaskContext, Iterator<Object>, ?>, int[], boolean, CallSite, JobListener, Properties) - Constructor for class org.apache.spark.scheduler.JobSubmitted
 
JobSucceeded - Class in org.apache.spark.scheduler
 
JobSucceeded() - Constructor for class org.apache.spark.scheduler.JobSucceeded
 
JobWaiter<T> - Class in org.apache.spark.scheduler
An object that waits for a DAGScheduler job to complete.
JobWaiter(DAGScheduler, int, int, Function2<Object, T, BoxedUnit>) - Constructor for class org.apache.spark.scheduler.JobWaiter
 
join(JavaPairRDD<K, W>, Partitioner) - Method in class org.apache.spark.api.java.JavaPairRDD
Return an RDD containing all pairs of elements with matching keys in this and other.
join(JavaPairRDD<K, W>) - Method in class org.apache.spark.api.java.JavaPairRDD
Return an RDD containing all pairs of elements with matching keys in this and other.
join(JavaPairRDD<K, W>, int) - Method in class org.apache.spark.api.java.JavaPairRDD
Return an RDD containing all pairs of elements with matching keys in this and other.
join(RDD<Tuple2<K, W>>, Partitioner) - Method in class org.apache.spark.rdd.PairRDDFunctions
Return an RDD containing all pairs of elements with matching keys in this and other.
join(RDD<Tuple2<K, W>>) - Method in class org.apache.spark.rdd.PairRDDFunctions
Return an RDD containing all pairs of elements with matching keys in this and other.
join(RDD<Tuple2<K, W>>, int) - Method in class org.apache.spark.rdd.PairRDDFunctions
Return an RDD containing all pairs of elements with matching keys in this and other.
join(DataFrame) - Method in class org.apache.spark.sql.DataFrame
Cartesian join with another DataFrame.
join(DataFrame, Column) - Method in class org.apache.spark.sql.DataFrame
Inner join with another DataFrame, using the given join expression.
join(DataFrame, Column, String) - Method in class org.apache.spark.sql.DataFrame
Join with another DataFrame, using the given join expression.
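A minimal sketch of the three-argument DataFrame join, assuming existing DataFrames orders and customers that both contain a customerId column (hypothetical names):

    val joined = orders.join(
      customers,
      orders("customerId") === customers("customerId"),
      "left_outer")
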
join(JavaPairDStream<K, W>) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
Return a new DStream by applying 'join' between RDDs of this DStream and other DStream.
join(JavaPairDStream<K, W>, int) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
Return a new DStream by applying 'join' between RDDs of this DStream and other DStream.
join(JavaPairDStream<K, W>, Partitioner) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
Return a new DStream by applying 'join' between RDDs of this DStream and other DStream.
join(DStream<Tuple2<K, W>>, ClassTag<W>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
Return a new DStream by applying 'join' between RDDs of this DStream and other DStream.
join(DStream<Tuple2<K, W>>, int, ClassTag<W>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
Return a new DStream by applying 'join' between RDDs of this DStream and other DStream.
join(DStream<Tuple2<K, W>>, Partitioner, ClassTag<W>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
Return a new DStream by applying 'join' between RDDs of this DStream and other DStream.
joinVertices(RDD<Tuple2<Object, U>>, Function3<Object, VD, U, VD>, ClassTag<U>) - Method in class org.apache.spark.graphx.GraphOps
Join the vertices with an RDD and then apply a function from the vertex and RDD entry to a new vertex value.
jsonFile(String) - Method in class org.apache.spark.sql.SQLContext
Loads a JSON file (one object per line), returning the result as a DataFrame.
jsonFile(String, StructType) - Method in class org.apache.spark.sql.SQLContext
:: Experimental :: Loads a JSON file (one object per line) and applies the given schema, returning the result as a DataFrame.
jsonFile(String, double) - Method in class org.apache.spark.sql.SQLContext
:: Experimental ::
jsonOption(JsonAST.JValue) - Static method in class org.apache.spark.util.Utils
Return an option that translates JNothing to None
JsonProtocol - Class in org.apache.spark.util
Serializes SparkListener events to/from JSON.
JsonProtocol() - Constructor for class org.apache.spark.util.JsonProtocol
 
JsonRDD - Class in org.apache.spark.sql.json
 
JsonRDD() - Constructor for class org.apache.spark.sql.json.JsonRDD
 
jsonRDD(RDD<String>) - Method in class org.apache.spark.sql.SQLContext
Loads an RDD[String] storing JSON objects (one object per record), returning the result as a DataFrame.
jsonRDD(JavaRDD<String>) - Method in class org.apache.spark.sql.SQLContext
Loads an RDD[String] storing JSON objects (one object per record), returning the result as a DataFrame.
jsonRDD(RDD<String>, StructType) - Method in class org.apache.spark.sql.SQLContext
:: Experimental :: Loads an RDD[String] storing JSON objects (one object per record) and applies the given schema, returning the result as a DataFrame.
jsonRDD(JavaRDD<String>, StructType) - Method in class org.apache.spark.sql.SQLContext
:: Experimental :: Loads an JavaRDD storing JSON objects (one object per record) and applies the given schema, returning the result as a DataFrame.
jsonRDD(RDD<String>, double) - Method in class org.apache.spark.sql.SQLContext
:: Experimental :: Loads an RDD[String] storing JSON objects (one object per record) inferring the schema, returning the result as a DataFrame.
jsonRDD(JavaRDD<String>, double) - Method in class org.apache.spark.sql.SQLContext
:: Experimental :: Loads a JavaRDD[String] storing JSON objects (one object per record) inferring the schema, returning the result as a DataFrame.
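A minimal sketch of jsonFile and jsonRDD, assuming an existing SQLContext sqlContext and SparkContext sc; the file path is a placeholder:

    val fromFile = sqlContext.jsonFile("hdfs://path/to/people.json")   // one JSON object per line
    val fromRdd = sqlContext.jsonRDD(
      sc.parallelize(Seq("""{"name":"Alice","age":30}""")))
    fromRdd.printSchema()   // schema is inferred from the records
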
JSONRelation - Class in org.apache.spark.sql.json
 
JSONRelation(String, double, Option<StructType>, SQLContext) - Constructor for class org.apache.spark.sql.json.JSONRelation
 
jsonResponderToServlet(Function1<HttpServletRequest, JsonAST.JValue>) - Static method in class org.apache.spark.ui.JettyUtils
 
jsonStringToRow(RDD<String>, StructType, String) - Static method in class org.apache.spark.sql.json.JsonRDD
 
jvmInformation() - Method in class org.apache.spark.ui.env.EnvironmentListener
 
JvmSource - Class in org.apache.spark.metrics.source
 
JvmSource() - Constructor for class org.apache.spark.metrics.source.JvmSource
 

K

k() - Method in class org.apache.spark.ml.recommendation.ALS.NormalEquation
 
k() - Method in class org.apache.spark.mllib.clustering.DistributedLDAModel
 
k() - Method in class org.apache.spark.mllib.clustering.ExpectationSum
 
k() - Method in class org.apache.spark.mllib.clustering.GaussianMixtureModel
Number of gaussians in mixture
k() - Method in class org.apache.spark.mllib.clustering.KMeansModel
Total number of clusters.
k() - Method in class org.apache.spark.mllib.clustering.LDA.EMOptimizer
 
k() - Method in class org.apache.spark.mllib.clustering.LDAModel
Number of topics
k() - Method in class org.apache.spark.mllib.clustering.LocalLDAModel
 
k() - Method in class org.apache.spark.mllib.clustering.PowerIterationClusteringModel
 
k() - Method in class org.apache.spark.mllib.clustering.StreamingKMeans
 
K_MEANS_PARALLEL() - Static method in class org.apache.spark.mllib.clustering.KMeans
 
KafkaCluster - Class in org.apache.spark.streaming.kafka
Convenience methods for interacting with a Kafka cluster.
KafkaCluster(Map<String, String>) - Constructor for class org.apache.spark.streaming.kafka.KafkaCluster
 
KafkaCluster.LeaderOffset - Class in org.apache.spark.streaming.kafka
 
KafkaCluster.LeaderOffset(String, int, long) - Constructor for class org.apache.spark.streaming.kafka.KafkaCluster.LeaderOffset
 
KafkaCluster.LeaderOffset$ - Class in org.apache.spark.streaming.kafka
 
KafkaCluster.LeaderOffset$() - Constructor for class org.apache.spark.streaming.kafka.KafkaCluster.LeaderOffset$
 
KafkaCluster.SimpleConsumerConfig - Class in org.apache.spark.streaming.kafka
High-level Kafka consumers connect to ZK.
KafkaCluster.SimpleConsumerConfig$ - Class in org.apache.spark.streaming.kafka
 
KafkaCluster.SimpleConsumerConfig$() - Constructor for class org.apache.spark.streaming.kafka.KafkaCluster.SimpleConsumerConfig$
 
KafkaInputDStream<K,V,U extends kafka.serializer.Decoder<?>,T extends kafka.serializer.Decoder<?>> - Class in org.apache.spark.streaming.kafka
Input stream that pulls messages from a Kafka Broker.
KafkaInputDStream(StreamingContext, Map<String, String>, Map<String, Object>, boolean, StorageLevel, ClassTag<K>, ClassTag<V>, ClassTag<U>, ClassTag<T>) - Constructor for class org.apache.spark.streaming.kafka.KafkaInputDStream
 
kafkaParams() - Method in class org.apache.spark.streaming.kafka.DirectKafkaInputDStream
 
kafkaParams() - Method in class org.apache.spark.streaming.kafka.KafkaCluster
 
KafkaRDD<K,V,U extends kafka.serializer.Decoder<?>,T extends kafka.serializer.Decoder<?>,R> - Class in org.apache.spark.streaming.kafka
A batch-oriented interface for consuming from Kafka.
KafkaRDD(SparkContext, Map<String, String>, OffsetRange[], Map<TopicAndPartition, Tuple2<String, Object>>, Function1<MessageAndMetadata<K, V>, R>, ClassTag<K>, ClassTag<V>, ClassTag<U>, ClassTag<T>, ClassTag<R>) - Constructor for class org.apache.spark.streaming.kafka.KafkaRDD
 
KafkaRDDPartition - Class in org.apache.spark.streaming.kafka
 
KafkaRDDPartition(int, String, int, long, long, String, int) - Constructor for class org.apache.spark.streaming.kafka.KafkaRDDPartition
 
KafkaReceiver<K,V,U extends kafka.serializer.Decoder<?>,T extends kafka.serializer.Decoder<?>> - Class in org.apache.spark.streaming.kafka
 
KafkaReceiver(Map<String, String>, Map<String, Object>, StorageLevel, ClassTag<K>, ClassTag<V>, ClassTag<U>, ClassTag<T>) - Constructor for class org.apache.spark.streaming.kafka.KafkaReceiver
 
KafkaUtils - Class in org.apache.spark.streaming.kafka
 
KafkaUtils() - Constructor for class org.apache.spark.streaming.kafka.KafkaUtils
 
KafkaUtilsPythonHelper - Class in org.apache.spark.streaming.kafka
This is a helper class that wraps KafkaUtils.createStream() into a more Python-friendly class and function so that it can be easily instantiated and called from Python's KafkaUtils (see SPARK-6027).
KafkaUtilsPythonHelper() - Constructor for class org.apache.spark.streaming.kafka.KafkaUtilsPythonHelper
 
kClassTag() - Method in class org.apache.spark.api.java.JavaHadoopRDD
 
kClassTag() - Method in class org.apache.spark.api.java.JavaNewHadoopRDD
 
kClassTag() - Method in class org.apache.spark.api.java.JavaPairRDD
 
kClassTag() - Method in class org.apache.spark.streaming.api.java.JavaPairInputDStream
 
kClassTag() - Method in class org.apache.spark.streaming.api.java.JavaPairReceiverInputDStream
 
keyBy(Function<T, U>) - Method in interface org.apache.spark.api.java.JavaRDDLike
Creates tuples of the elements in this RDD by applying f.
keyBy(Function1<T, K>) - Method in class org.apache.spark.rdd.RDD
Creates tuples of the elements in this RDD by applying f.
keyClass() - Method in class org.apache.spark.rdd.PairRDDFunctions
 
keyOrdering() - Method in class org.apache.spark.rdd.PairRDDFunctions
 
keyOrdering() - Method in class org.apache.spark.ShuffleDependency
 
keyPassword() - Method in class org.apache.spark.SSLOptions
 
keys() - Method in class org.apache.spark.api.java.JavaPairRDD
Return an RDD with the keys of each tuple.
keys() - Method in class org.apache.spark.rdd.PairRDDFunctions
Return an RDD with the keys of each tuple.
keyStore() - Method in class org.apache.spark.SSLOptions
 
keyStorePassword() - Method in class org.apache.spark.SSLOptions
 
kFold(RDD<T>, int, int, ClassTag<T>) - Static method in class org.apache.spark.mllib.util.MLUtils
:: Experimental :: Return a k-element array of pairs of RDDs, where the first element of each pair contains the training data (the complement of the validation data) and the second element contains the validation data (a unique 1/kth of the data).
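A minimal sketch of kFold, assuming an existing RDD named data (for example an RDD[LabeledPoint]); 5 folds with a fixed seed:

    import org.apache.spark.mllib.util.MLUtils

    val folds = MLUtils.kFold(data, 5, 11)   // Array of (training, validation) RDD pairs
    folds.foreach { case (training, validation) =>
      // fit a model on `training`, evaluate it on `validation`
    }
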
kill(boolean) - Method in class org.apache.spark.scheduler.Task
Kills a task by setting the interrupted flag to true.
killed() - Method in class org.apache.spark.scheduler.Task
Whether the task has been killed.
KILLED() - Static method in class org.apache.spark.TaskState
 
killEnabled() - Method in class org.apache.spark.ui.jobs.JobsTab
 
killEnabled() - Method in class org.apache.spark.ui.jobs.StagesTab
 
killEnabled() - Method in class org.apache.spark.ui.SparkUI
 
killExecutor(String) - Method in interface org.apache.spark.ExecutorAllocationClient
Request that the cluster manager kill the specified executor.
killExecutor(String) - Method in class org.apache.spark.SparkContext
:: DeveloperApi :: Request that the cluster manager kill the specified executor.
killExecutors(Seq<String>) - Method in interface org.apache.spark.ExecutorAllocationClient
Request that the cluster manager kill the specified executors.
killExecutors(Seq<String>) - Method in class org.apache.spark.scheduler.cluster.CoarseGrainedSchedulerBackend
Request that the cluster manager kill the specified executors.
killExecutors(Seq<String>) - Method in class org.apache.spark.SparkContext
:: DeveloperApi :: Request that the cluster manager kill the specified executors.
killTask(long, String, boolean) - Method in class org.apache.spark.scheduler.cluster.CoarseGrainedSchedulerBackend
 
killTask(long, String, boolean) - Method in class org.apache.spark.scheduler.cluster.mesos.MesosSchedulerBackend
 
KillTask - Class in org.apache.spark.scheduler.local
 
KillTask(long, boolean) - Constructor for class org.apache.spark.scheduler.local.KillTask
 
killTask(long, String, boolean) - Method in class org.apache.spark.scheduler.local.LocalBackend
 
killTask(long, String, boolean) - Method in interface org.apache.spark.scheduler.SchedulerBackend
 
KinesisCheckpointState - Class in org.apache.spark.streaming.kinesis
This is a helper class for managing checkpoint clocks.
KinesisCheckpointState(Duration, Clock) - Constructor for class org.apache.spark.streaming.kinesis.KinesisCheckpointState
 
kinesisClientLibConfiguration() - Method in class org.apache.spark.streaming.kinesis.KinesisReceiver
 
KinesisReceiver - Class in org.apache.spark.streaming.kinesis
Custom AWS Kinesis-specific implementation of Spark Streaming's Receiver.
KinesisReceiver(String, String, String, Duration, InitialPositionInStream, StorageLevel) - Constructor for class org.apache.spark.streaming.kinesis.KinesisReceiver
 
KinesisRecordProcessor - Class in org.apache.spark.streaming.kinesis
Kinesis-specific implementation of the Kinesis Client Library (KCL) IRecordProcessor.
KinesisRecordProcessor(KinesisReceiver, String, KinesisCheckpointState) - Constructor for class org.apache.spark.streaming.kinesis.KinesisRecordProcessor
 
KinesisUtils - Class in org.apache.spark.streaming.kinesis
:: Experimental :: Helper class to create an Amazon Kinesis input stream.
KinesisUtils() - Constructor for class org.apache.spark.streaming.kinesis.KinesisUtils
 
KinesisWordCountASL - Class in org.apache.spark.examples.streaming
Kinesis Spark Streaming WordCount example.
KinesisWordCountASL() - Constructor for class org.apache.spark.examples.streaming.KinesisWordCountASL
 
KinesisWordCountProducerASL - Class in org.apache.spark.examples.streaming
Usage: KinesisWordCountProducerASL is the name of the Kinesis stream (ie.
KinesisWordCountProducerASL() - Constructor for class org.apache.spark.examples.streaming.KinesisWordCountProducerASL
 
kManifest() - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
 
KMeans - Class in org.apache.spark.mllib.clustering
K-means clustering with support for multiple parallel runs and a k-means++-like initialization mode (the k-means|| algorithm by Bahmani et al.).
KMeans() - Constructor for class org.apache.spark.mllib.clustering.KMeans
Constructs a KMeans instance with default parameters: {k: 2, maxIterations: 20, runs: 1, initializationMode: "k-means||", initializationSteps: 5, epsilon: 1e-4, seed: random}.
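A hedged sketch of training and using a KMeansModel through the static KMeans.train helper; the RDD[Vector] named points is an assumption here:

    import org.apache.spark.mllib.clustering.KMeans
    import org.apache.spark.mllib.linalg.Vectors

    // points: RDD[Vector] is assumed to exist already
    val model = KMeans.train(points, k = 2, maxIterations = 20)
    val cost  = model.computeCost(points)              // sum of squared distances to closest centers
    val cluster = model.predict(Vectors.dense(1.0, 2.0))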
kMeans(VertexRDD<Object>, int) - Static method in class org.apache.spark.mllib.clustering.PowerIterationClustering
Runs k-means clustering.
KMeansDataGenerator - Class in org.apache.spark.mllib.util
:: DeveloperApi :: Generate test data for KMeans.
KMeansDataGenerator() - Constructor for class org.apache.spark.mllib.util.KMeansDataGenerator
 
KMeansModel - Class in org.apache.spark.mllib.clustering
A clustering model for K-means.
KMeansModel(Vector[]) - Constructor for class org.apache.spark.mllib.clustering.KMeansModel
 
kMeansPlusPlus(int, VectorWithNorm[], double[], int, int) - Static method in class org.apache.spark.mllib.clustering.LocalKMeans
Run K-means++ on the weighted point set points.
KryoDeserializationStream - Class in org.apache.spark.serializer
 
KryoDeserializationStream(Kryo, InputStream) - Constructor for class org.apache.spark.serializer.KryoDeserializationStream
 
KryoRegistrator - Interface in org.apache.spark.serializer
Interface implemented by clients to register their classes with Kryo when using Kryo serialization.
KryoSerializationStream - Class in org.apache.spark.serializer
 
KryoSerializationStream(Kryo, OutputStream) - Constructor for class org.apache.spark.serializer.KryoSerializationStream
 
KryoSerializer - Class in org.apache.spark.serializer
A Spark serializer that uses the Kryo serialization library.
KryoSerializer(SparkConf) - Constructor for class org.apache.spark.serializer.KryoSerializer
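A sketch of enabling Kryo serialization and registering application classes through a KryoRegistrator; MyClass and MyRegistrator are hypothetical names used only for illustration:

    import com.esotericsoftware.kryo.Kryo
    import org.apache.spark.SparkConf
    import org.apache.spark.serializer.KryoRegistrator

    case class MyClass(id: Int, name: String)             // hypothetical user class

    class MyRegistrator extends KryoRegistrator {         // hypothetical registrator
      override def registerClasses(kryo: Kryo): Unit = {
        kryo.register(classOf[MyClass])
      }
    }

    val conf = new SparkConf()
      .set("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
      .set("spark.kryo.registrator", classOf[MyRegistrator].getName)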
 
KryoSerializerInstance - Class in org.apache.spark.serializer
 
KryoSerializerInstance(KryoSerializer) - Constructor for class org.apache.spark.serializer.KryoSerializerInstance
 

L

L1Updater - Class in org.apache.spark.mllib.optimization
:: DeveloperApi :: Updater for L1 regularized problems.
L1Updater() - Constructor for class org.apache.spark.mllib.optimization.L1Updater
 
label() - Method in class org.apache.spark.mllib.regression.LabeledPoint
 
label() - Method in class org.apache.spark.mllib.tree.impl.TreePoint
 
labelCol() - Method in interface org.apache.spark.ml.param.HasLabelCol
param for label column name
LabeledPoint - Class in org.apache.spark.mllib.regression
Class that represents the features and labels of a data point.
LabeledPoint(double, Vector) - Constructor for class org.apache.spark.mllib.regression.LabeledPoint
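A minimal sketch constructing LabeledPoints with dense and sparse feature vectors:

    import org.apache.spark.mllib.linalg.Vectors
    import org.apache.spark.mllib.regression.LabeledPoint

    // A positive example with a dense feature vector.
    val pos = LabeledPoint(1.0, Vectors.dense(1.0, 0.0, 3.0))
    // A negative example with a sparse feature vector (size 3, non-zeros at indices 0 and 2).
    val neg = LabeledPoint(0.0, Vectors.sparse(3, Array(0, 2), Array(1.0, 3.0)))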
 
LabelPropagation - Class in org.apache.spark.graphx.lib
Label Propagation algorithm.
LabelPropagation() - Constructor for class org.apache.spark.graphx.lib.LabelPropagation
 
labels() - Method in class org.apache.spark.mllib.classification.NaiveBayesModel
 
labels() - Method in class org.apache.spark.mllib.evaluation.MulticlassMetrics
Returns the sequence of labels in ascending order
labels() - Method in class org.apache.spark.mllib.evaluation.MultilabelMetrics
Returns the sequence of labels in ascending order
LassoModel - Class in org.apache.spark.mllib.regression
Regression model trained using Lasso.
LassoModel(Vector, double) - Constructor for class org.apache.spark.mllib.regression.LassoModel
 
LassoWithSGD - Class in org.apache.spark.mllib.regression
Train a regression model with L1-regularization using Stochastic Gradient Descent.
LassoWithSGD() - Constructor for class org.apache.spark.mllib.regression.LassoWithSGD
Construct a Lasso object with default parameters: {stepSize: 1.0, numIterations: 100, regParam: 0.01, miniBatchFraction: 1.0}.
last(Column) - Static method in class org.apache.spark.sql.functions
Aggregate function: returns the last value in a group.
last(String) - Static method in class org.apache.spark.sql.functions
Aggregate function: returns the last value of the column in a group.
lastCompletedBatch() - Method in class org.apache.spark.streaming.ui.StreamingJobProgressListener
 
lastDir() - Method in class org.apache.spark.mllib.optimization.NNLS.Workspace
 
lastError() - Method in class org.apache.spark.streaming.scheduler.ReceiverInfo
 
lastErrorMessage() - Method in class org.apache.spark.streaming.scheduler.ReceiverInfo
 
lastFinishTime() - Method in class org.apache.spark.ui.ConsoleProgressBar
 
lastId() - Static method in class org.apache.spark.Accumulators
 
lastLaunchTime() - Method in class org.apache.spark.scheduler.TaskSetManager
 
lastProgressBar() - Method in class org.apache.spark.ui.ConsoleProgressBar
 
lastReceivedBatch() - Method in class org.apache.spark.streaming.ui.StreamingJobProgressListener
 
lastReceivedBatchRecords() - Method in class org.apache.spark.streaming.ui.StreamingJobProgressListener
 
lastSeenMs() - Method in class org.apache.spark.storage.BlockManagerInfo
 
lastUpdateTime() - Method in class org.apache.spark.ui.ConsoleProgressBar
 
lastValidTime() - Method in class org.apache.spark.streaming.dstream.InputDStream
 
laterViewToken() - Static method in class org.apache.spark.sql.hive.HiveQl
 
latestInfo() - Method in class org.apache.spark.scheduler.Stage
Pointer to the latest StageInfo object, set by DAGScheduler.
latestModel() - Method in class org.apache.spark.mllib.clustering.StreamingKMeans
Return the latest model.
latestModel() - Method in class org.apache.spark.mllib.regression.StreamingLinearAlgorithm
Return the latest model.
LAUNCHING() - Static method in class org.apache.spark.TaskState
 
launchTasks(Seq<Seq<TaskDescription>>) - Method in class org.apache.spark.scheduler.cluster.CoarseGrainedSchedulerBackend.DriverActor
 
launchTime() - Method in class org.apache.spark.scheduler.TaskInfo
 
LBFGS - Class in org.apache.spark.mllib.optimization
:: DeveloperApi :: Class used to solve an optimization problem using Limited-memory BFGS.
LBFGS(Gradient, Updater) - Constructor for class org.apache.spark.mllib.optimization.LBFGS
 
LDA - Class in org.apache.spark.mllib.clustering
:: Experimental ::
LDA() - Constructor for class org.apache.spark.mllib.clustering.LDA
 
LDA.EMOptimizer - Class in org.apache.spark.mllib.clustering
Optimizer for EM algorithm which stores data + parameter graph, plus algorithm parameters.
LDA.EMOptimizer(Graph<DenseVector<Object>, Object>, int, int, double, double, int) - Constructor for class org.apache.spark.mllib.clustering.LDA.EMOptimizer
 
LDAModel - Class in org.apache.spark.mllib.clustering
:: Experimental ::
LDAModel() - Constructor for class org.apache.spark.mllib.clustering.LDAModel
 
learningRate() - Method in class org.apache.spark.mllib.tree.configuration.BoostingStrategy
 
LeastSquaresGradient - Class in org.apache.spark.mllib.optimization
:: DeveloperApi :: Compute gradient and loss for a Least-squared loss function, as used in linear regression.
LeastSquaresGradient() - Constructor for class org.apache.spark.mllib.optimization.LeastSquaresGradient
 
left() - Method in class org.apache.spark.sql.sources.And
 
left() - Method in class org.apache.spark.sql.sources.Or
 
leftChildIndex(int) - Static method in class org.apache.spark.mllib.tree.model.Node
Return the index of the left child of this node.
leftImpurity() - Method in class org.apache.spark.mllib.tree.model.InformationGainStats
 
leftJoin(Self, Function3<Object, VD, Option<VD2>, VD3>, ClassTag<VD2>, ClassTag<VD3>) - Method in class org.apache.spark.graphx.impl.VertexPartitionBaseOps
Left outer join another VertexPartition.
leftJoin(Iterator<Tuple2<Object, VD2>>, Function3<Object, VD, Option<VD2>, VD3>, ClassTag<VD2>, ClassTag<VD3>) - Method in class org.apache.spark.graphx.impl.VertexPartitionBaseOps
Left outer join another iterator of messages.
leftJoin(RDD<Tuple2<Object, VD2>>, Function3<Object, VD, Option<VD2>, VD3>, ClassTag<VD2>, ClassTag<VD3>) - Method in class org.apache.spark.graphx.impl.VertexRDDImpl
 
leftJoin(RDD<Tuple2<Object, VD2>>, Function3<Object, VD, Option<VD2>, VD3>, ClassTag<VD2>, ClassTag<VD3>) - Method in class org.apache.spark.graphx.VertexRDD
Left joins this VertexRDD with an RDD containing vertex attribute pairs.
leftNode() - Method in class org.apache.spark.mllib.tree.model.Node
 
leftNodeId() - Method in class org.apache.spark.mllib.tree.model.DecisionTreeModel.SaveLoadV1_0$.NodeData
 
leftOuterJoin(JavaPairRDD<K, W>, Partitioner) - Method in class org.apache.spark.api.java.JavaPairRDD
Perform a left outer join of this and other.
leftOuterJoin(JavaPairRDD<K, W>) - Method in class org.apache.spark.api.java.JavaPairRDD
Perform a left outer join of this and other.
leftOuterJoin(JavaPairRDD<K, W>, int) - Method in class org.apache.spark.api.java.JavaPairRDD
Perform a left outer join of this and other.
leftOuterJoin(RDD<Tuple2<K, W>>, Partitioner) - Method in class org.apache.spark.rdd.PairRDDFunctions
Perform a left outer join of this and other.
leftOuterJoin(RDD<Tuple2<K, W>>) - Method in class org.apache.spark.rdd.PairRDDFunctions
Perform a left outer join of this and other.
leftOuterJoin(RDD<Tuple2<K, W>>, int) - Method in class org.apache.spark.rdd.PairRDDFunctions
Perform a left outer join of this and other.
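A hedged sketch of a left outer join between two pair RDDs; keys missing on the right side come back as None. The SparkContext sc is assumed to exist:

    // sc: SparkContext is assumed to exist already
    val ages   = sc.parallelize(Seq(("alice", 30), ("bob", 25)))
    val cities = sc.parallelize(Seq(("alice", "Paris")))

    // RDD[(String, (Int, Option[String]))]
    val joined = ages.leftOuterJoin(cities)
    joined.collect().foreach(println)
    // (alice,(30,Some(Paris)))
    // (bob,(25,None))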
leftOuterJoin(JavaPairDStream<K, W>) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
Return a new DStream by applying 'left outer join' between RDDs of this DStream and other DStream.
leftOuterJoin(JavaPairDStream<K, W>, int) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
Return a new DStream by applying 'left outer join' between RDDs of this DStream and other DStream.
leftOuterJoin(JavaPairDStream<K, W>, Partitioner) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
Return a new DStream by applying 'left outer join' between RDDs of this DStream and other DStream.
leftOuterJoin(DStream<Tuple2<K, W>>, ClassTag<W>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
Return a new DStream by applying 'left outer join' between RDDs of this DStream and other DStream.
leftOuterJoin(DStream<Tuple2<K, W>>, int, ClassTag<W>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
Return a new DStream by applying 'left outer join' between RDDs of this DStream and other DStream.
leftOuterJoin(DStream<Tuple2<K, W>>, Partitioner, ClassTag<W>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
Return a new DStream by applying 'left outer join' between RDDs of this DStream and other DStream.
leftPredict() - Method in class org.apache.spark.mllib.tree.model.InformationGainStats
 
leftZipJoin(VertexRDD<VD2>, Function3<Object, VD, Option<VD2>, VD3>, ClassTag<VD2>, ClassTag<VD3>) - Method in class org.apache.spark.graphx.impl.VertexRDDImpl
 
leftZipJoin(VertexRDD<VD2>, Function3<Object, VD, Option<VD2>, VD3>, ClassTag<VD2>, ClassTag<VD3>) - Method in class org.apache.spark.graphx.VertexRDD
Left joins this RDD with another VertexRDD with the same index.
length() - Method in class org.apache.spark.ml.recommendation.ALS.UncompressedInBlock
Size of the block.
length() - Method in class org.apache.spark.scheduler.SplitInfo
 
length() - Method in class org.apache.spark.sql.parquet.ParquetTypeInfo
 
length() - Method in class org.apache.spark.storage.FileSegment
 
length() - Method in class org.apache.spark.storage.TachyonFileSegment
 
length() - Method in class org.apache.spark.streaming.util.WriteAheadLogFileSegment
 
length() - Method in class org.apache.spark.util.Distribution
 
length() - Method in class org.apache.spark.util.Vector
 
leq(Object) - Method in class org.apache.spark.sql.Column
Less than or equal to.
less(Duration) - Method in class org.apache.spark.streaming.Duration
 
less(Time) - Method in class org.apache.spark.streaming.Time
 
lessEq(Duration) - Method in class org.apache.spark.streaming.Duration
 
lessEq(Time) - Method in class org.apache.spark.streaming.Time
 
LessThan - Class in org.apache.spark.sql.sources
 
LessThan(String, Object) - Constructor for class org.apache.spark.sql.sources.LessThan
 
LessThanOrEqual - Class in org.apache.spark.sql.sources
 
LessThanOrEqual(String, Object) - Constructor for class org.apache.spark.sql.sources.LessThanOrEqual
 
level() - Method in class org.apache.spark.storage.BlockInfo
 
lexicographicOrdering() - Static method in class org.apache.spark.graphx.Edge
 
lexicographicOrdering() - Static method in class org.apache.spark.graphx.impl.EdgeWithLocalIds
 
libraryPathEnvName() - Static method in class org.apache.spark.util.Utils
Return the current system LD_LIBRARY_PATH name
libraryPathEnvPrefix(Seq<String>) - Static method in class org.apache.spark.util.Utils
Return the prefix of a command that appends the given library paths to the system-specific library path environment variable.
like(String) - Method in class org.apache.spark.sql.Column
SQL like expression.
LIKE() - Static method in class org.apache.spark.sql.hive.HiveQl
 
limit(int) - Method in class org.apache.spark.sql.DataFrame
Returns a new DataFrame by taking the first n rows.
LinearDataGenerator - Class in org.apache.spark.mllib.util
:: DeveloperApi :: Generate sample data for linear regression.
LinearDataGenerator() - Constructor for class org.apache.spark.mllib.util.LinearDataGenerator
 
LinearRegression - Class in org.apache.spark.ml.regression
:: AlphaComponent ::
LinearRegression() - Constructor for class org.apache.spark.ml.regression.LinearRegression
 
LinearRegressionModel - Class in org.apache.spark.ml.regression
:: AlphaComponent ::
LinearRegressionModel(LinearRegression, ParamMap, Vector, double) - Constructor for class org.apache.spark.ml.regression.LinearRegressionModel
 
LinearRegressionModel - Class in org.apache.spark.mllib.regression
Regression model trained using LinearRegression.
LinearRegressionModel(Vector, double) - Constructor for class org.apache.spark.mllib.regression.LinearRegressionModel
 
LinearRegressionParams - Interface in org.apache.spark.ml.regression
Params for linear regression.
LinearRegressionWithSGD - Class in org.apache.spark.mllib.regression
Train a linear regression model with no regularization using Stochastic Gradient Descent.
LinearRegressionWithSGD(double, int, double) - Constructor for class org.apache.spark.mllib.regression.LinearRegressionWithSGD
 
LinearRegressionWithSGD() - Constructor for class org.apache.spark.mllib.regression.LinearRegressionWithSGD
Construct a LinearRegression object with default parameters: {stepSize: 1.0, numIterations: 100, miniBatchFraction: 1.0}.
listener() - Method in class org.apache.spark.scheduler.ActiveJob
 
listener() - Method in class org.apache.spark.scheduler.JobSubmitted
 
listener() - Method in class org.apache.spark.streaming.ui.StreamingTab
 
listener() - Method in class org.apache.spark.ui.env.EnvironmentTab
 
listener() - Method in class org.apache.spark.ui.exec.ExecutorsTab
 
listener() - Method in class org.apache.spark.ui.jobs.JobsTab
 
listener() - Method in class org.apache.spark.ui.jobs.StagesTab
 
listener() - Method in class org.apache.spark.ui.storage.StorageTab
 
listenerBus() - Method in class org.apache.spark.scheduler.cluster.mesos.MesosSchedulerBackend
 
listenerBus() - Method in class org.apache.spark.SparkContext
 
listenerBus() - Method in class org.apache.spark.streaming.scheduler.JobScheduler
 
ListenerBus<L,E> - Interface in org.apache.spark.util
An event bus which posts events to its listeners.
listeners() - Method in interface org.apache.spark.util.ListenerBus
 
listenerThreadIsAlive() - Method in class org.apache.spark.util.AsynchronousListenerBus
For testing only.
listFiles(String, Configuration) - Static method in class org.apache.spark.sql.parquet.FileSystemHelper
 
listingTable(Seq<String>, Function1<T, Seq<Node>>, Iterable<T>, boolean, Option<String>, Seq<String>, boolean) - Static method in class org.apache.spark.ui.UIUtils
Returns an HTML table constructed by generating a row for each object in a sequence.
lit(Object) - Static method in class org.apache.spark.sql.functions
Creates a Column of literal value.
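A hedged DataFrame sketch combining lit with Column comparisons (leq) and limit; the DataFrame df with a numeric "age" column is an assumption:

    import org.apache.spark.sql.functions._

    // df: DataFrame with an "age" column is assumed to exist already
    val withFlag = df.withColumn("isAdult", lit(true))   // constant column built from a literal
    val young    = df.filter(df("age").leq(30))          // Column.leq: less than or equal
    val firstTen = young.limit(10)                       // keep only the first 10 rows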
literals() - Method in class org.apache.spark.sql.parquet.ParquetRelation2.PartitionValues
 
LiveListenerBus - Class in org.apache.spark.scheduler
Asynchronously passes SparkListenerEvents to registered SparkListeners.
LiveListenerBus() - Constructor for class org.apache.spark.scheduler.LiveListenerBus
 
load(SparkContext, String) - Static method in class org.apache.spark.mllib.classification.LogisticRegressionModel
 
load(SparkContext, String) - Static method in class org.apache.spark.mllib.classification.NaiveBayesModel
 
load(SparkContext, String) - Static method in class org.apache.spark.mllib.classification.SVMModel
 
load(SparkContext, String) - Static method in class org.apache.spark.mllib.recommendation.MatrixFactorizationModel
 
load(SparkContext, String) - Method in class org.apache.spark.mllib.recommendation.MatrixFactorizationModel.SaveLoadV1_0$
 
load(SparkContext, String) - Static method in class org.apache.spark.mllib.regression.LassoModel
 
load(SparkContext, String) - Static method in class org.apache.spark.mllib.regression.LinearRegressionModel
 
load(SparkContext, String) - Static method in class org.apache.spark.mllib.regression.RidgeRegressionModel
 
load(SparkContext, String) - Static method in class org.apache.spark.mllib.tree.model.DecisionTreeModel
 
load(SparkContext, String, String, int) - Method in class org.apache.spark.mllib.tree.model.DecisionTreeModel.SaveLoadV1_0$
 
load(SparkContext, String) - Static method in class org.apache.spark.mllib.tree.model.GradientBoostedTreesModel
 
load(SparkContext, String) - Static method in class org.apache.spark.mllib.tree.model.RandomForestModel
 
load(SparkContext, String) - Method in interface org.apache.spark.mllib.util.Loader
Load a model from the given path.
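A sketch of the save/load round trip these load(SparkContext, String) methods support, shown with LogisticRegressionModel as one example; the path is hypothetical, and sc and model are assumed to exist:

    import org.apache.spark.mllib.classification.LogisticRegressionModel

    // model: LogisticRegressionModel and sc: SparkContext are assumed to exist already
    val path = "/tmp/lrModel"                  // hypothetical path
    model.save(sc, path)                       // persist model metadata and data
    val restored = LogisticRegressionModel.load(sc, path)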
load(String) - Method in class org.apache.spark.sql.SQLContext
:: Experimental :: Returns the dataset stored at path as a DataFrame, using the default data source configured by spark.sql.sources.default.
load(String, String) - Method in class org.apache.spark.sql.SQLContext
:: Experimental :: Returns the dataset stored at path as a DataFrame, using the given data source.
load(String, Map<String, String>) - Method in class org.apache.spark.sql.SQLContext
:: Experimental :: (Java-specific) Returns the dataset specified by the given data source and a set of options as a DataFrame.
load(String, Map<String, String>) - Method in class org.apache.spark.sql.SQLContext
:: Experimental :: (Scala-specific) Returns the dataset specified by the given data source and a set of options as a DataFrame.
load(String, StructType, Map<String, String>) - Method in class org.apache.spark.sql.SQLContext
:: Experimental :: (Java-specific) Returns the dataset specified by the given data source and a set of options as a DataFrame, using the given schema as the schema of the DataFrame.
load(String, StructType, Map<String, String>) - Method in class org.apache.spark.sql.SQLContext
:: Experimental :: (Scala-specific) Returns the dataset specified by the given data source and a set of options as a DataFrame, using the given schema as the schema of the DataFrame.
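A hedged sketch of SQLContext.load with the default data source and with an explicit one; the paths and the sqlContext value are assumptions:

    // sqlContext: org.apache.spark.sql.SQLContext is assumed to exist already
    // Default data source (spark.sql.sources.default, typically parquet):
    val users  = sqlContext.load("examples/users.parquet")          // hypothetical path

    // Explicit data source name:
    val people = sqlContext.load("examples/people.json", "json")    // hypothetical path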
loadClass(String, boolean) - Method in class org.apache.spark.util.ChildFirstURLClassLoader
 
loadClass(String) - Method in class org.apache.spark.util.ParentClassLoader
 
loadClass(String, boolean) - Method in class org.apache.spark.util.ParentClassLoader
 
loadData(SparkContext, String, String) - Method in class org.apache.spark.mllib.classification.impl.GLMClassificationModel.SaveLoadV1_0$
Helper method for loading GLM classification model data.
loadData(SparkContext, String, String, int) - Method in class org.apache.spark.mllib.regression.impl.GLMRegressionModel.SaveLoadV1_0$
Helper method for loading GLM regression model data.
loadDefaultSparkProperties(SparkConf, String) - Static method in class org.apache.spark.util.Utils
Load default Spark properties from the given file.
Loader<M extends Saveable> - Interface in org.apache.spark.mllib.util
:: DeveloperApi ::
loadLabeledData(SparkContext, String) - Static method in class org.apache.spark.mllib.util.MLUtils
loadLabeledPoints(SparkContext, String, int) - Static method in class org.apache.spark.mllib.util.MLUtils
Loads labeled points saved using RDD[LabeledPoint].saveAsTextFile.
loadLabeledPoints(SparkContext, String) - Static method in class org.apache.spark.mllib.util.MLUtils
Loads labeled points saved using RDD[LabeledPoint].saveAsTextFile with the default number of partitions.
loadLibSVMFile(SparkContext, String, int, int) - Static method in class org.apache.spark.mllib.util.MLUtils
Loads labeled data in the LIBSVM format into an RDD[LabeledPoint].
loadLibSVMFile(SparkContext, String, boolean, int, int) - Static method in class org.apache.spark.mllib.util.MLUtils
 
loadLibSVMFile(SparkContext, String, int) - Static method in class org.apache.spark.mllib.util.MLUtils
Loads labeled data in the LIBSVM format into an RDD[LabeledPoint], with the default number of partitions.
loadLibSVMFile(SparkContext, String, boolean, int) - Static method in class org.apache.spark.mllib.util.MLUtils
 
loadLibSVMFile(SparkContext, String, boolean) - Static method in class org.apache.spark.mllib.util.MLUtils
 
loadLibSVMFile(SparkContext, String) - Static method in class org.apache.spark.mllib.util.MLUtils
Loads binary labeled data in the LIBSVM format into an RDD[LabeledPoint], with number of features determined automatically and the default number of partitions.
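A minimal sketch reading a LIBSVM-format file into an RDD[LabeledPoint]; the path is hypothetical and sc is assumed to exist:

    import org.apache.spark.mllib.util.MLUtils

    // sc: SparkContext is assumed to exist already
    val data = MLUtils.loadLibSVMFile(sc, "data/sample_libsvm_data.txt")   // hypothetical path
    println(s"loaded ${data.count()} labeled points, " +
            s"numFeatures = ${data.first().features.size}")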
loadTrees(SparkContext, String, String) - Method in class org.apache.spark.mllib.tree.model.TreeEnsembleModel.SaveLoadV1_0$
Load trees for an ensemble, and return them in order.
loadVectors(SparkContext, String, int) - Static method in class org.apache.spark.mllib.util.MLUtils
Loads vectors saved using RDD[Vector].saveAsTextFile.
loadVectors(SparkContext, String) - Static method in class org.apache.spark.mllib.util.MLUtils
Loads vectors saved using RDD[Vector].saveAsTextFile with the default number of partitions.
localAccums() - Static method in class org.apache.spark.Accumulators
 
LocalActor - Class in org.apache.spark.scheduler.local
Calls to LocalBackend are all serialized through LocalActor.
LocalActor(TaskSchedulerImpl, LocalBackend, int) - Constructor for class org.apache.spark.scheduler.local.LocalActor
 
localActor() - Method in class org.apache.spark.scheduler.local.LocalBackend
 
LocalBackend - Class in org.apache.spark.scheduler.local
LocalBackend is used when running a local version of Spark where the executor, backend, and master all run in the same JVM.
LocalBackend(TaskSchedulerImpl, int) - Constructor for class org.apache.spark.scheduler.local.LocalBackend
 
localDirs() - Method in class org.apache.spark.storage.DiskBlockManager
 
localDstId() - Method in class org.apache.spark.graphx.impl.EdgeWithLocalIds
 
localFraction() - Method in class org.apache.spark.rdd.CoalescedRDDPartition
Computes the fraction of the parents' partitions containing preferredLocation within their getPreferredLocs.
localHostName() - Static method in class org.apache.spark.util.Utils
Get the local machine's hostname.
localIndex(int) - Method in class org.apache.spark.ml.recommendation.ALS.LocalIndexEncoder
Gets the local index from an encoded index.
localIpAddress() - Static method in class org.apache.spark.util.Utils
Get the local host's IP address in dotted-quad format (e.g.
localIpAddressHostname() - Static method in class org.apache.spark.util.Utils
 
localityWaits() - Method in class org.apache.spark.scheduler.TaskSetManager
 
LocalKMeans - Class in org.apache.spark.mllib.clustering
A utility object to run K-means locally.
LocalKMeans() - Constructor for class org.apache.spark.mllib.clustering.LocalKMeans
 
LocalLDAModel - Class in org.apache.spark.mllib.clustering
:: Experimental ::
LocalLDAModel(Matrix) - Constructor for class org.apache.spark.mllib.clustering.LocalLDAModel
 
localSeqToDataFrameHolder(Seq<A>, TypeTags.TypeTag<A>) - Method in class org.apache.spark.sql.SQLContext.implicits
Creates a DataFrame from a local Seq of Product.
localSrcId() - Method in class org.apache.spark.graphx.impl.EdgeWithLocalIds
 
localValue() - Method in class org.apache.spark.Accumulable
Get the current value of this accumulator from within a task.
location() - Method in class org.apache.spark.scheduler.CompressedMapStatus
 
location() - Method in class org.apache.spark.scheduler.HighlyCompressedMapStatus
 
location() - Method in interface org.apache.spark.scheduler.MapStatus
Location where this task was run.
location() - Method in class org.apache.spark.streaming.scheduler.ReceiverInfo
 
locations_() - Method in class org.apache.spark.rdd.BlockRDD
 
log() - Method in interface org.apache.spark.Logging
 
log() - Method in interface org.apache.spark.util.ActorLogReceive
 
log1pExp(double) - Static method in class org.apache.spark.mllib.util.MLUtils
When x is positive and large, computing math.log(1 + math.exp(x)) will lead to arithmetic overflow.
log2(double) - Static method in class org.apache.spark.mllib.tree.impurity.Entropy
 
log_() - Method in interface org.apache.spark.Logging
 
logDebug(Function0<String>) - Method in interface org.apache.spark.Logging
 
logDebug(Function0<String>, Throwable) - Method in interface org.apache.spark.Logging
 
logDirName() - Method in class org.apache.spark.scheduler.JobLogger
 
logError(Function0<String>) - Method in interface org.apache.spark.Logging
 
logError(Function0<String>, Throwable) - Method in interface org.apache.spark.Logging
 
logFileRegex() - Static method in class org.apache.spark.streaming.util.WriteAheadLogManager
 
logFilesTologInfo(Seq<Path>) - Static method in class org.apache.spark.streaming.util.WriteAheadLogManager
Convert a sequence of files to a sequence of sorted LogInfo objects
loggedEvents() - Method in class org.apache.spark.scheduler.EventLoggingListener
 
Logging - Interface in org.apache.spark
:: DeveloperApi :: Utility trait for classes that want to log data.
logicalRelation() - Method in class org.apache.spark.sql.sources.InsertIntoDataSource
 
LogicalRelation - Class in org.apache.spark.sql.sources
Used to link a BaseRelation in to a logical query plan.
LogicalRelation(BaseRelation) - Constructor for class org.apache.spark.sql.sources.LogicalRelation
 
logInfo(Function0<String>) - Method in interface org.apache.spark.Logging
 
logInfo(Function0<String>, Throwable) - Method in interface org.apache.spark.Logging
 
LogisticGradient - Class in org.apache.spark.mllib.optimization
:: DeveloperApi :: Compute gradient and loss for a multinomial logistic loss function, as used in multi-class classification (it is also used in binary logistic regression).
LogisticGradient(int) - Constructor for class org.apache.spark.mllib.optimization.LogisticGradient
 
LogisticGradient() - Constructor for class org.apache.spark.mllib.optimization.LogisticGradient
 
LogisticRegression - Class in org.apache.spark.ml.classification
:: AlphaComponent ::
LogisticRegression() - Constructor for class org.apache.spark.ml.classification.LogisticRegression
 
LogisticRegressionDataGenerator - Class in org.apache.spark.mllib.util
:: DeveloperApi :: Generate test data for LogisticRegression.
LogisticRegressionDataGenerator() - Constructor for class org.apache.spark.mllib.util.LogisticRegressionDataGenerator
 
LogisticRegressionModel - Class in org.apache.spark.ml.classification
:: AlphaComponent ::
LogisticRegressionModel(LogisticRegression, ParamMap, Vector, double) - Constructor for class org.apache.spark.ml.classification.LogisticRegressionModel
 
LogisticRegressionModel - Class in org.apache.spark.mllib.classification
Classification model trained using Multinomial/Binary Logistic Regression.
LogisticRegressionModel(Vector, double, int, int) - Constructor for class org.apache.spark.mllib.classification.LogisticRegressionModel
 
LogisticRegressionModel(Vector, double) - Constructor for class org.apache.spark.mllib.classification.LogisticRegressionModel
Constructs a LogisticRegressionModel with weights and intercept for binary classification.
LogisticRegressionParams - Interface in org.apache.spark.ml.classification
Params for logistic regression.
LogisticRegressionWithLBFGS - Class in org.apache.spark.mllib.classification
Train a classification model for Multinomial/Binary Logistic Regression using Limited-memory BFGS.
LogisticRegressionWithLBFGS() - Constructor for class org.apache.spark.mllib.classification.LogisticRegressionWithLBFGS
 
LogisticRegressionWithSGD - Class in org.apache.spark.mllib.classification
Train a classification model for Binary Logistic Regression using Stochastic Gradient Descent.
LogisticRegressionWithSGD(double, int, double, double) - Constructor for class org.apache.spark.mllib.classification.LogisticRegressionWithSGD
 
LogisticRegressionWithSGD() - Constructor for class org.apache.spark.mllib.classification.LogisticRegressionWithSGD
Construct a LogisticRegression object with default parameters: {stepSize: 1.0, numIterations: 100, regParam: 0.01, miniBatchFraction: 1.0}.
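A hedged sketch training binary logistic regression with the SGD-based trainer and scoring a point; the RDD[LabeledPoint] named training is an assumption:

    import org.apache.spark.mllib.classification.LogisticRegressionWithSGD
    import org.apache.spark.mllib.linalg.Vectors

    // training: RDD[LabeledPoint] is assumed to exist already
    val model = LogisticRegressionWithSGD.train(training, numIterations = 100)
    val prediction = model.predict(Vectors.dense(0.5, 1.5))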
logLikelihood() - Method in class org.apache.spark.mllib.clustering.DistributedLDAModel
Log likelihood of the observed tokens in the training set, given the current parameter estimates: log P(docs | topics, topic distributions for docs, alpha, eta)
logLikelihood() - Method in class org.apache.spark.mllib.clustering.ExpectationSum
 
LogLoss - Class in org.apache.spark.mllib.tree.loss
:: DeveloperApi :: Class for log loss calculation (for classification).
LogLoss() - Constructor for class org.apache.spark.mllib.tree.loss.LogLoss
 
logMemoryUsage() - Method in class org.apache.spark.storage.MemoryStore
Log information about current memory usage.
logName() - Method in interface org.apache.spark.Logging
 
LogNormalGenerator - Class in org.apache.spark.mllib.random
:: DeveloperApi :: Generates i.i.d.
LogNormalGenerator(double, double) - Constructor for class org.apache.spark.mllib.random.LogNormalGenerator
 
logNormalGraph(SparkContext, int, int, double, double, long) - Static method in class org.apache.spark.graphx.util.GraphGenerators
Generate a graph whose vertex out degree distribution is log normal.
logNormalJavaRDD(JavaSparkContext, double, double, long, int, long) - Static method in class org.apache.spark.mllib.random.RandomRDDs
logNormalJavaRDD(JavaSparkContext, double, double, long, int) - Static method in class org.apache.spark.mllib.random.RandomRDDs
logNormalJavaRDD(JavaSparkContext, double, double, long) - Static method in class org.apache.spark.mllib.random.RandomRDDs
logNormalJavaVectorRDD(JavaSparkContext, double, double, long, int, int, long) - Static method in class org.apache.spark.mllib.random.RandomRDDs
logNormalJavaVectorRDD(JavaSparkContext, double, double, long, int, int) - Static method in class org.apache.spark.mllib.random.RandomRDDs
logNormalJavaVectorRDD(JavaSparkContext, double, double, long, int) - Static method in class org.apache.spark.mllib.random.RandomRDDs
logNormalRDD(SparkContext, double, double, long, int, long) - Static method in class org.apache.spark.mllib.random.RandomRDDs
Generates an RDD comprised of i.i.d. samples from the log normal distribution with the input mean and standard deviation
logNormalVectorRDD(SparkContext, double, double, long, int, int, long) - Static method in class org.apache.spark.mllib.random.RandomRDDs
Generates an RDD[Vector] with vectors containing i.i.d. samples drawn from a log normal distribution.
logPath() - Method in class org.apache.spark.scheduler.EventLoggingListener
 
logpdf(Vector) - Method in class org.apache.spark.mllib.stat.distribution.MultivariateGaussian
Returns the log-density of this multivariate Gaussian at the given point x.
logpdf(Vector<Object>) - Method in class org.apache.spark.mllib.stat.distribution.MultivariateGaussian
Returns the log-density of this multivariate Gaussian at the given point x.
logPrior() - Method in class org.apache.spark.mllib.clustering.DistributedLDAModel
Log probability of the current parameter estimate: log P(topics, topic distributions for docs | alpha, eta)
logStartFromJson(JsonAST.JValue) - Static method in class org.apache.spark.util.JsonProtocol
 
logStartToJson(SparkListenerLogStart) - Static method in class org.apache.spark.util.JsonProtocol
 
logTrace(Function0<String>) - Method in interface org.apache.spark.Logging
 
logTrace(Function0<String>, Throwable) - Method in interface org.apache.spark.Logging
 
logUncaughtExceptions(Function0<T>) - Static method in class org.apache.spark.util.Utils
Execute the given block, logging and re-throwing any uncaught exception.
logUnrollFailureMessage(BlockId, long) - Method in class org.apache.spark.storage.MemoryStore
Log a warning for failing to unroll a block.
logUrlMap() - Method in class org.apache.spark.scheduler.cluster.ExecutorData
 
logUrlMap() - Method in class org.apache.spark.scheduler.cluster.ExecutorInfo
 
logUrls() - Method in class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.RegisterExecutor
 
logWarning(Function0<String>) - Method in interface org.apache.spark.Logging
 
logWarning(Function0<String>, Throwable) - Method in interface org.apache.spark.Logging
 
LONG - Class in org.apache.spark.sql.columnar
 
LONG() - Constructor for class org.apache.spark.sql.columnar.LONG
 
LONG_FORM() - Static method in class org.apache.spark.util.CallSite
 
LongColumnAccessor - Class in org.apache.spark.sql.columnar
 
LongColumnAccessor(ByteBuffer) - Constructor for class org.apache.spark.sql.columnar.LongColumnAccessor
 
LongColumnBuilder - Class in org.apache.spark.sql.columnar
 
LongColumnBuilder() - Constructor for class org.apache.spark.sql.columnar.LongColumnBuilder
 
LongColumnStats - Class in org.apache.spark.sql.columnar
 
LongColumnStats() - Constructor for class org.apache.spark.sql.columnar.LongColumnStats
 
LongConversion() - Method in class org.apache.spark.sql.jdbc.JDBCRDD
Accessor for nested Scala object
LongDelta - Class in org.apache.spark.sql.columnar.compression
 
LongDelta() - Constructor for class org.apache.spark.sql.columnar.compression.LongDelta
 
LongDelta.Decoder - Class in org.apache.spark.sql.columnar.compression
 
LongDelta.Decoder(ByteBuffer, NativeColumnType<LongType$>) - Constructor for class org.apache.spark.sql.columnar.compression.LongDelta.Decoder
 
LongDelta.Encoder - Class in org.apache.spark.sql.columnar.compression
 
LongDelta.Encoder() - Constructor for class org.apache.spark.sql.columnar.compression.LongDelta.Encoder
 
longForm() - Method in class org.apache.spark.util.CallSite
 
LongParam - Class in org.apache.spark.ml.param
Specialized version of Param[Long] for Java.
LongParam(Params, String, String, Option<Object>) - Constructor for class org.apache.spark.ml.param.LongParam
 
LongParam(Params, String, String) - Constructor for class org.apache.spark.ml.param.LongParam
 
longRddToDataFrameHolder(RDD<Object>) - Method in class org.apache.spark.sql.SQLContext.implicits
Creates a single column DataFrame from an RDD[Long].
longToLongWritable(long) - Static method in class org.apache.spark.SparkContext
 
longWritableConverter() - Static method in class org.apache.spark.SparkContext
 
longWritableConverter() - Static method in class org.apache.spark.WritableConverter
 
longWritableFactory() - Static method in class org.apache.spark.WritableFactory
 
lookup(K) - Method in class org.apache.spark.api.java.JavaPairRDD
Return the list of values in the RDD for key key.
lookup(K) - Method in class org.apache.spark.rdd.PairRDDFunctions
Return the list of values in the RDD for key key.
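A minimal sketch of lookup on a pair RDD, which collects every value for a single key; sc is assumed to exist:

    // sc: SparkContext is assumed to exist already
    val pairs = sc.parallelize(Seq(("a", 1), ("b", 2), ("a", 3)))
    val as: Seq[Int] = pairs.lookup("a")      // Seq(1, 3): all values for key "a"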
lookupCachedData(DataFrame) - Method in class org.apache.spark.sql.CacheManager
Optionally returns cached data for the given DataFrame
lookupCachedData(LogicalPlan) - Method in class org.apache.spark.sql.CacheManager
Optionally returns cached data for the given LogicalPlan.
lookupDataSource(String) - Static method in class org.apache.spark.sql.sources.ResolvedDataSource
Given a provider name, look up the data source class definition.
lookupFunction(String, Seq<Expression>) - Method in class org.apache.spark.sql.hive.HiveFunctionRegistry
 
lookupRelation(Seq<String>, Option<String>) - Method in class org.apache.spark.sql.hive.HiveMetastoreCatalog
 
lookupTimeout(SparkConf) - Static method in class org.apache.spark.util.AkkaUtils
Returns the default Spark timeout to use for Akka remote actor lookup.
loss() - Method in class org.apache.spark.mllib.tree.configuration.BoostingStrategy
 
Loss - Interface in org.apache.spark.mllib.tree.loss
:: DeveloperApi :: Trait for adding "pluggable" loss functions for the gradient boosting algorithm.
Losses - Class in org.apache.spark.mllib.tree.loss
 
Losses() - Constructor for class org.apache.spark.mllib.tree.loss.Losses
 
LOST() - Static method in class org.apache.spark.TaskState
 
low() - Method in class org.apache.spark.partial.BoundedDouble
 
lower() - Method in class org.apache.spark.rdd.JdbcPartition
 
lower(Column) - Static method in class org.apache.spark.sql.functions
Converts a string expression to lower case.
LOWER() - Static method in class org.apache.spark.sql.hive.HiveQl
 
lowerBound() - Method in class org.apache.spark.sql.columnar.ColumnStatisticsSchema
 
lowerBound() - Method in class org.apache.spark.sql.jdbc.JDBCPartitioningInfo
 
lowerCase() - Method in class org.apache.spark.sql.hive.HiveStrategies.ParquetConversion.LogicalPlanHacks
 
lowSplit() - Method in class org.apache.spark.mllib.tree.model.Bin
 
lt(Object) - Method in class org.apache.spark.sql.Column
Less than.
LZ4CompressionCodec - Class in org.apache.spark.io
:: DeveloperApi :: LZ4 implementation of CompressionCodec.
LZ4CompressionCodec(SparkConf) - Constructor for class org.apache.spark.io.LZ4CompressionCodec
 
LZFCompressionCodec - Class in org.apache.spark.io
:: DeveloperApi :: LZF implementation of CompressionCodec.
LZFCompressionCodec(SparkConf) - Constructor for class org.apache.spark.io.LZFCompressionCodec
 

M

main(String[]) - Static method in class org.apache.spark.examples.streaming.JavaKinesisWordCountASL
 
main(String[]) - Static method in class org.apache.spark.examples.streaming.KinesisWordCountASL
 
main(String[]) - Static method in class org.apache.spark.examples.streaming.KinesisWordCountProducerASL
 
main(String[]) - Static method in class org.apache.spark.mllib.util.KMeansDataGenerator
 
main(String[]) - Static method in class org.apache.spark.mllib.util.LinearDataGenerator
 
main(String[]) - Static method in class org.apache.spark.mllib.util.LogisticRegressionDataGenerator
 
main(String[]) - Static method in class org.apache.spark.mllib.util.MFDataGenerator
 
main(String[]) - Static method in class org.apache.spark.mllib.util.SVMDataGenerator
 
main(String[]) - Static method in class org.apache.spark.rdd.CheckpointRDD
 
main(String[]) - Static method in class org.apache.spark.streaming.util.RawTextSender
 
main(String[]) - Static method in class org.apache.spark.streaming.util.RecurringTimer
 
main(String[]) - Static method in class org.apache.spark.ui.UIWorkloadGenerator
 
main(String[]) - Static method in class org.apache.spark.util.random.XORShiftRandom
Main method for running benchmark
makeBinarySearch(Ordering<K>, ClassTag<K>) - Static method in class org.apache.spark.util.CollectionsUtils
 
makeDriverRef(String, SparkConf, ActorSystem) - Static method in class org.apache.spark.util.AkkaUtils
 
makeExecutorRef(String, SparkConf, String, int, ActorSystem) - Static method in class org.apache.spark.util.AkkaUtils
 
makeOffers() - Method in class org.apache.spark.scheduler.cluster.CoarseGrainedSchedulerBackend.DriverActor
 
makeOffers(String) - Method in class org.apache.spark.scheduler.cluster.CoarseGrainedSchedulerBackend.DriverActor
 
makeParquetFile(Seq<T>, File, ClassTag<T>, TypeTags.TypeTag<T>) - Method in interface org.apache.spark.sql.parquet.ParquetTest
 
makeParquetFile(DataFrame, File, ClassTag<T>, TypeTags.TypeTag<T>) - Method in interface org.apache.spark.sql.parquet.ParquetTest
 
makePartitionDir(File, String, Seq<Tuple2<String, Object>>) - Method in interface org.apache.spark.sql.parquet.ParquetTest
 
makeProgressBar(int, int, int, int, int) - Static method in class org.apache.spark.ui.UIUtils
 
makeRDD(Seq<T>, int, ClassTag<T>) - Method in class org.apache.spark.SparkContext
Distribute a local Scala collection to form an RDD.
makeRDD(Seq<Tuple2<T, Seq<String>>>, ClassTag<T>) - Method in class org.apache.spark.SparkContext
Distribute a local Scala collection to form an RDD, with one or more location preferences (hostnames of Spark nodes) for each object.
makeRDDForPartitionedTable(Seq<Partition>) - Method in class org.apache.spark.sql.hive.HadoopTableReader
 
makeRDDForPartitionedTable(Map<Partition, Class<? extends Deserializer>>, Option<PathFilter>) - Method in class org.apache.spark.sql.hive.HadoopTableReader
Create a HadoopRDD for every partition key specified in the query.
makeRDDForPartitionedTable(Seq<Partition>) - Method in interface org.apache.spark.sql.hive.TableReader
 
makeRDDForTable(Table) - Method in class org.apache.spark.sql.hive.HadoopTableReader
 
makeRDDForTable(Table, Class<? extends Deserializer>, Option<PathFilter>) - Method in class org.apache.spark.sql.hive.HadoopTableReader
Creates a Hadoop RDD to read data from the target table's data directory.
makeRDDForTable(Table) - Method in interface org.apache.spark.sql.hive.TableReader
 
managedIfNoPath() - Method in class org.apache.spark.sql.hive.execution.CreateMetastoreDataSource
 
managedIfNoPath() - Method in class org.apache.spark.sql.sources.CreateTableUsing
 
ManualClock - Class in org.apache.spark.util
A Clock whose time can be manually set and modified.
ManualClock(long) - Constructor for class org.apache.spark.util.ManualClock
 
ManualClock() - Constructor for class org.apache.spark.util.ManualClock
 
map(Function<T, R>) - Method in interface org.apache.spark.api.java.JavaRDDLike
Return a new RDD by applying a function to all elements of this RDD.
map(Function1<Edge<ED>, ED2>, ClassTag<ED2>) - Method in class org.apache.spark.graphx.impl.EdgePartition
Construct a new edge partition by applying the function f to all edges in this partition.
map(Iterator<ED2>, ClassTag<ED2>) - Method in class org.apache.spark.graphx.impl.EdgePartition
Construct a new edge partition by using the edge attributes contained in the iterator.
map(Function2<Object, VD, VD2>, ClassTag<VD2>) - Method in class org.apache.spark.graphx.impl.VertexPartitionBaseOps
Pass each vertex attribute along with the vertex id through a map function and retain the original RDD's partitioning and index.
map(Function1<Object, Object>) - Method in class org.apache.spark.mllib.linalg.DenseMatrix
 
map(Function1<Object, Object>) - Method in interface org.apache.spark.mllib.linalg.Matrix
Map the values of this matrix using a function.
map(Function1<Object, Object>) - Method in class org.apache.spark.mllib.linalg.SparseMatrix
 
map(Function1<R, T>) - Method in class org.apache.spark.partial.PartialResult
Transform this PartialResult into a PartialResult of type T.
map(Function1<T, U>, ClassTag<U>) - Method in class org.apache.spark.rdd.RDD
Return a new RDD by applying a function to all elements of this RDD.
map(DataType, DataType) - Method in class org.apache.spark.sql.ColumnName
Creates a new AttributeReference of type map
map(MapType) - Method in class org.apache.spark.sql.ColumnName
 
map(Function1<Row, R>, ClassTag<R>) - Method in class org.apache.spark.sql.DataFrame
Returns a new RDD by applying a function to all rows of this DataFrame.
map(Function1<T, R>, ClassTag<R>) - Method in interface org.apache.spark.sql.RDDApi
 
map(Function<T, R>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
Return a new DStream by applying a function to all elements of this DStream.
map(Function1<T, U>, ClassTag<U>) - Method in class org.apache.spark.streaming.dstream.DStream
Return a new DStream by applying a function to all elements of this DStream.
MAP_KEY_SCHEMA_NAME() - Static method in class org.apache.spark.sql.parquet.CatalystConverter
 
MAP_OUTPUT_TRACKER() - Static method in class org.apache.spark.util.MetadataCleanerType
 
MAP_SCHEMA_NAME() - Static method in class org.apache.spark.sql.parquet.CatalystConverter
 
MAP_VALUE_SCHEMA_NAME() - Static method in class org.apache.spark.sql.parquet.CatalystConverter
 
mapAsSerializableJavaMap(Map<A, B>) - Static method in class org.apache.spark.api.java.JavaUtils
 
mapEdgePartitions(Function2<Object, EdgePartition<ED, VD>, EdgePartition<ED2, VD2>>, ClassTag<ED2>, ClassTag<VD2>) - Method in class org.apache.spark.graphx.impl.EdgeRDDImpl
 
mapEdges(Function1<Edge<ED>, ED2>, ClassTag<ED2>) - Method in class org.apache.spark.graphx.Graph
Transforms each edge attribute in the graph using the map function.
mapEdges(Function2<Object, Iterator<Edge<ED>>, Iterator<ED2>>, ClassTag<ED2>) - Method in class org.apache.spark.graphx.Graph
Transforms each edge attribute using the map function, passing it a whole partition at a time.
mapEdges(Function2<Object, Iterator<Edge<ED>>, Iterator<ED2>>, ClassTag<ED2>) - Method in class org.apache.spark.graphx.impl.GraphImpl
 
mapFromJson(JsonAST.JValue) - Static method in class org.apache.spark.util.JsonProtocol
Util JSON deserialization methods.
mapId() - Method in class org.apache.spark.FetchFailed
 
mapId() - Method in class org.apache.spark.storage.ShuffleBlockId
 
mapId() - Method in class org.apache.spark.storage.ShuffleDataBlockId
 
mapId() - Method in class org.apache.spark.storage.ShuffleIndexBlockId
 
MapOutputTracker - Class in org.apache.spark
Class that keeps track of the location of the map output of a stage.
MapOutputTracker(SparkConf) - Constructor for class org.apache.spark.MapOutputTracker
 
mapOutputTracker() - Method in class org.apache.spark.scheduler.TaskSchedulerImpl
 
mapOutputTracker() - Method in class org.apache.spark.SparkEnv
 
MapOutputTrackerMaster - Class in org.apache.spark
MapOutputTracker for the driver.
MapOutputTrackerMaster(SparkConf) - Constructor for class org.apache.spark.MapOutputTrackerMaster
 
MapOutputTrackerMasterActor - Class in org.apache.spark
Actor class for MapOutputTrackerMaster
MapOutputTrackerMasterActor(MapOutputTrackerMaster, SparkConf) - Constructor for class org.apache.spark.MapOutputTrackerMasterActor
 
MapOutputTrackerMessage - Interface in org.apache.spark
 
MapOutputTrackerWorker - Class in org.apache.spark
MapOutputTracker for the executors, which fetches map output information from the driver's MapOutputTrackerMaster.
MapOutputTrackerWorker(SparkConf) - Constructor for class org.apache.spark.MapOutputTrackerWorker
 
MapPartitionedDStream<T,U> - Class in org.apache.spark.streaming.dstream
 
MapPartitionedDStream(DStream<T>, Function1<Iterator<T>, Iterator<U>>, boolean, ClassTag<T>, ClassTag<U>) - Constructor for class org.apache.spark.streaming.dstream.MapPartitionedDStream
 
mapPartitions(FlatMapFunction<Iterator<T>, U>) - Method in interface org.apache.spark.api.java.JavaRDDLike
Return a new RDD by applying a function to each partition of this RDD.
mapPartitions(FlatMapFunction<Iterator<T>, U>, boolean) - Method in interface org.apache.spark.api.java.JavaRDDLike
Return a new RDD by applying a function to each partition of this RDD.
mapPartitions(Function1<Iterator<T>, Iterator<U>>, boolean, ClassTag<U>) - Method in class org.apache.spark.rdd.RDD
Return a new RDD by applying a function to each partition of this RDD.
mapPartitions(Function1<Iterator<Row>, Iterator<R>>, ClassTag<R>) - Method in class org.apache.spark.sql.DataFrame
Returns a new RDD by applying a function to each partition of this DataFrame.
mapPartitions(Function1<Iterator<T>, Iterator<R>>, ClassTag<R>) - Method in interface org.apache.spark.sql.RDDApi
 
mapPartitions(FlatMapFunction<Iterator<T>, U>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
Return a new DStream in which each RDD is generated by applying mapPartitions() to each RDDs of this DStream.
mapPartitions(Function1<Iterator<T>, Iterator<U>>, boolean, ClassTag<U>) - Method in class org.apache.spark.streaming.dstream.DStream
Return a new DStream in which each RDD is generated by applying mapPartitions() to each RDDs of this DStream.
MapPartitionsRDD<U,T> - Class in org.apache.spark.rdd
 
MapPartitionsRDD(RDD<T>, Function3<TaskContext, Object, Iterator<T>, Iterator<U>>, boolean, ClassTag<U>, ClassTag<T>) - Constructor for class org.apache.spark.rdd.MapPartitionsRDD
 
mapPartitionsToDouble(DoubleFlatMapFunction<Iterator<T>>) - Method in interface org.apache.spark.api.java.JavaRDDLike
Return a new RDD by applying a function to each partition of this RDD.
mapPartitionsToDouble(DoubleFlatMapFunction<Iterator<T>>, boolean) - Method in interface org.apache.spark.api.java.JavaRDDLike
Return a new RDD by applying a function to each partition of this RDD.
mapPartitionsToPair(PairFlatMapFunction<Iterator<T>, K2, V2>) - Method in interface org.apache.spark.api.java.JavaRDDLike
Return a new RDD by applying a function to each partition of this RDD.
mapPartitionsToPair(PairFlatMapFunction<Iterator<T>, K2, V2>, boolean) - Method in interface org.apache.spark.api.java.JavaRDDLike
Return a new RDD by applying a function to each partition of this RDD.
mapPartitionsToPair(PairFlatMapFunction<Iterator<T>, K2, V2>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
Return a new DStream in which each RDD is generated by applying mapPartitions() to each RDDs of this DStream.
mapPartitionsWithContext(Function2<TaskContext, Iterator<T>, Iterator<U>>, boolean, ClassTag<U>) - Method in class org.apache.spark.rdd.RDD
:: DeveloperApi :: Return a new RDD by applying a function to each partition of this RDD.
mapPartitionsWithIndex(Function2<Integer, Iterator<T>, Iterator<R>>, boolean) - Method in interface org.apache.spark.api.java.JavaRDDLike
Return a new RDD by applying a function to each partition of this RDD, while tracking the index of the original partition.
mapPartitionsWithIndex(Function2<Object, Iterator<T>, Iterator<U>>, boolean, ClassTag<U>) - Method in class org.apache.spark.rdd.RDD
Return a new RDD by applying a function to each partition of this RDD, while tracking the index of the original partition.
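A hedged sketch tagging each element with the index of the partition it lives in; sc is assumed to exist:

    // sc: SparkContext is assumed to exist already
    val rdd = sc.parallelize(1 to 6, numSlices = 3)
    val tagged = rdd.mapPartitionsWithIndex { (partitionIndex, iter) =>
      iter.map(x => (partitionIndex, x))
    }
    tagged.collect().foreach(println)   // e.g. (0,1), (0,2), (1,3), ...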
mapPartitionsWithInputSplit(Function2<InputSplit, Iterator<Tuple2<K, V>>, Iterator<R>>, boolean) - Method in class org.apache.spark.api.java.JavaHadoopRDD
Maps over a partition, providing the InputSplit that was used as the base of the partition.
mapPartitionsWithInputSplit(Function2<InputSplit, Iterator<Tuple2<K, V>>, Iterator<R>>, boolean) - Method in class org.apache.spark.api.java.JavaNewHadoopRDD
Maps over a partition, providing the InputSplit that was used as the base of the partition.
mapPartitionsWithInputSplit(Function2<InputSplit, Iterator<Tuple2<K, V>>, Iterator<U>>, boolean, ClassTag<U>) - Method in class org.apache.spark.rdd.HadoopRDD
Maps over a partition, providing the InputSplit that was used as the base of the partition.
mapPartitionsWithInputSplit(Function2<InputSplit, Iterator<Tuple2<K, V>>, Iterator<U>>, boolean, ClassTag<U>) - Method in class org.apache.spark.rdd.NewHadoopRDD
Maps over a partition, providing the InputSplit that was used as the base of the partition.
mapPartitionsWithSplit(Function2<Object, Iterator<T>, Iterator<U>>, boolean, ClassTag<U>) - Method in class org.apache.spark.rdd.RDD
Return a new RDD by applying a function to each partition of this RDD, while tracking the index of the original partition.
MappedDStream<T,U> - Class in org.apache.spark.streaming.dstream
 
MappedDStream(DStream<T>, Function1<T, U>, ClassTag<T>, ClassTag<U>) - Constructor for class org.apache.spark.streaming.dstream.MappedDStream
 
mapper() - Method in class org.apache.spark.metrics.sink.MetricsServlet
 
MAPRED_REDUCE_TASKS() - Method in class org.apache.spark.sql.SQLConf.Deprecated$
 
mapredInputFormat() - Method in class org.apache.spark.scheduler.InputFormatInfo
 
mapreduceInputFormat() - Method in class org.apache.spark.scheduler.InputFormatInfo
 
mapReduceTriplets(Function1<EdgeTriplet<VD, ED>, Iterator<Tuple2<Object, A>>>, Function2<A, A, A>, Option<Tuple2<VertexRDD<?>, EdgeDirection>>, ClassTag<A>) - Method in class org.apache.spark.graphx.Graph
Aggregates values from the neighboring edges and vertices of each vertex.
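A hedged GraphX sketch using mapReduceTriplets to sum, for each vertex, the attributes of its incoming edges; the Graph[Double, Int] named graph is an assumption:

    import org.apache.spark.graphx.{Graph, VertexRDD}

    // graph: Graph[Double, Int] is assumed to exist already
    // Send each edge's Int attribute to its destination vertex, then sum per vertex.
    val incomingWeight: VertexRDD[Int] = graph.mapReduceTriplets[Int](
      triplet => Iterator((triplet.dstId, triplet.attr)),
      (a, b) => a + b
    )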
mapReduceTriplets(Function1<EdgeTriplet<VD, ED>, Iterator<Tuple2<Object, A>>>, Function2<A, A, A>, Option<Tuple2<VertexRDD<?>, EdgeDirection>>, ClassTag<A>) - Method in class org.apache.spark.graphx.impl.GraphImpl
 
mapSideCombine() - Method in class org.apache.spark.ShuffleDependency
 
MapStatus - Interface in org.apache.spark.scheduler
Result returned by a ShuffleMapTask to a scheduler.
mapToDouble(DoubleFunction<T>) - Method in interface org.apache.spark.api.java.JavaRDDLike
Return a new RDD by applying a function to all elements of this RDD.
mapToJson(Map<String, String>) - Static method in class org.apache.spark.util.JsonProtocol
Util JSON serialization methods.
mapToPair(PairFunction<T, K2, V2>) - Method in interface org.apache.spark.api.java.JavaRDDLike
Return a new RDD by applying a function to all elements of this RDD.
mapToPair(PairFunction<T, K2, V2>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
Return a new DStream by applying a function to all elements of this DStream.
mapTriplets(Function1<EdgeTriplet<VD, ED>, ED2>, ClassTag<ED2>) - Method in class org.apache.spark.graphx.Graph
Transforms each edge attribute using the map function, passing it the adjacent vertex attributes as well.
mapTriplets(Function1<EdgeTriplet<VD, ED>, ED2>, TripletFields, ClassTag<ED2>) - Method in class org.apache.spark.graphx.Graph
Transforms each edge attribute using the map function, passing it the adjacent vertex attributes as well.
mapTriplets(Function2<Object, Iterator<EdgeTriplet<VD, ED>>, Iterator<ED2>>, TripletFields, ClassTag<ED2>) - Method in class org.apache.spark.graphx.Graph
Transforms each edge attribute a partition at a time using the map function, passing it the adjacent vertex attributes as well.
mapTriplets(Function2<Object, Iterator<EdgeTriplet<VD, ED>>, Iterator<ED2>>, TripletFields, ClassTag<ED2>) - Method in class org.apache.spark.graphx.impl.GraphImpl
 
MapValuedDStream<K,V,U> - Class in org.apache.spark.streaming.dstream
 
MapValuedDStream(DStream<Tuple2<K, V>>, Function1<V, U>, ClassTag<K>, ClassTag<V>, ClassTag<U>) - Constructor for class org.apache.spark.streaming.dstream.MapValuedDStream
 
mapValues(Function<V, U>) - Method in class org.apache.spark.api.java.JavaPairRDD
Pass each value in the key-value pair RDD through a map function without changing the keys; this also retains the original RDD's partitioning.
mapValues(Function1<Edge<ED>, ED2>, ClassTag<ED2>) - Method in class org.apache.spark.graphx.EdgeRDD
Map the values in an edge partitioning preserving the structure but changing the values.
mapValues(Function1<Edge<ED>, ED2>, ClassTag<ED2>) - Method in class org.apache.spark.graphx.impl.EdgeRDDImpl
 
mapValues(Function1<VD, VD2>, ClassTag<VD2>) - Method in class org.apache.spark.graphx.impl.VertexRDDImpl
 
mapValues(Function2<Object, VD, VD2>, ClassTag<VD2>) - Method in class org.apache.spark.graphx.impl.VertexRDDImpl
 
mapValues(Function1<VD, VD2>, ClassTag<VD2>) - Method in class org.apache.spark.graphx.VertexRDD
Maps each vertex attribute, preserving the index.
mapValues(Function2<Object, VD, VD2>, ClassTag<VD2>) - Method in class org.apache.spark.graphx.VertexRDD
Maps each vertex attribute, additionally supplying the vertex ID.
mapValues(Function1<V, U>) - Method in class org.apache.spark.rdd.PairRDDFunctions
Pass each value in the key-value pair RDD through a map function without changing the keys; this also retains the original RDD's partitioning.
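A minimal sketch transforming only the values of a pair RDD while keeping keys and the original partitioning; sc is assumed to exist:

    // sc: SparkContext is assumed to exist already
    val counts  = sc.parallelize(Seq(("spark", 2), ("kryo", 5)))
    val doubled = counts.mapValues(_ * 2)     // keys and partitioning are unchanged
    doubled.collect().foreach(println)        // (spark,4), (kryo,10)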
mapValues(Function<V, U>) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
Return a new DStream by applying a map function to the value of each key-value pair in 'this' DStream without changing the key.
mapValues(Function1<V, U>, ClassTag<U>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
Return a new DStream by applying a map function to the value of each key-value pair in 'this' DStream without changing the key.
mapVertexPartitions(Function1<ShippableVertexPartition<VD>, ShippableVertexPartition<VD2>>, ClassTag<VD2>) - Method in class org.apache.spark.graphx.impl.VertexRDDImpl
 
mapVertexPartitions(Function1<ShippableVertexPartition<VD>, ShippableVertexPartition<VD2>>, ClassTag<VD2>) - Method in class org.apache.spark.graphx.VertexRDD
Applies a function to each VertexPartition of this RDD and returns a new VertexRDD.
mapVertices(Function2<Object, VD, VD2>, ClassTag<VD2>, Predef.$eq$colon$eq<VD, VD2>) - Method in class org.apache.spark.graphx.Graph
Transforms each vertex attribute in the graph using the map function.
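A small hedged sketch of mapVertices in Scala (names and attributes invented); the map function receives both the vertex ID and the current attribute:

    import org.apache.spark.{SparkConf, SparkContext}
    import org.apache.spark.graphx.{Edge, Graph}

    val sc = new SparkContext(new SparkConf().setAppName("mapVerticesSketch").setMaster("local[*]"))
    val graph = Graph(
      sc.parallelize(Seq((1L, "alice"), (2L, "bob"))),
      sc.parallelize(Seq(Edge(1L, 2L, 1))))

    // Prefix each vertex attribute with its vertex ID.
    val labeled = graph.mapVertices((id, name) => s"$id:${name.toUpperCase}")
    labeled.vertices.collect().foreach(println)   // (1,"1:ALICE"), (2,"2:BOB")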
mapVertices(Function2<Object, VD, VD2>, ClassTag<VD2>, Predef.$eq$colon$eq<VD, VD2>) - Method in class org.apache.spark.graphx.impl.GraphImpl
 
mapWith(Function1<Object, A>, boolean, Function2<T, A, U>, ClassTag<U>) - Method in class org.apache.spark.rdd.RDD
Maps f over this RDD, where f takes an additional parameter of type A.
markCheckpointed(RDD<?>) - Method in class org.apache.spark.rdd.RDD
Changes the dependencies of this RDD from its original parents to a new RDD (newRDD) created from the checkpoint file, and forgets its old dependencies and partitions.
MarkedForCheckpoint() - Static method in class org.apache.spark.rdd.CheckpointState
 
markFailed(long) - Method in class org.apache.spark.scheduler.TaskInfo
 
markFailure() - Method in class org.apache.spark.storage.BlockInfo
Mark this BlockInfo as ready but failed
markForCheckpoint() - Method in class org.apache.spark.rdd.RDDCheckpointData
 
markGettingResult(long) - Method in class org.apache.spark.scheduler.TaskInfo
 
markInterrupted() - Method in class org.apache.spark.TaskContextImpl
Marks the task for interruption, i.e.
markPartiallyConstructed(SparkContext, boolean) - Static method in class org.apache.spark.SparkContext
Called at the beginning of the SparkContext constructor to ensure that no SparkContext is running.
markReady(long) - Method in class org.apache.spark.storage.BlockInfo
Mark this BlockInfo as ready (i.e.
markSuccessful(long) - Method in class org.apache.spark.scheduler.TaskInfo
 
markTaskCompleted() - Method in class org.apache.spark.TaskContextImpl
Marks the task as completed and triggers the listeners.
mask(Graph<VD2, ED2>, ClassTag<VD2>, ClassTag<ED2>) - Method in class org.apache.spark.graphx.Graph
Restricts the graph to only the vertices and edges that are also in other, but keeps the attributes from this graph.
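A brief Scala sketch of mask (graph contents invented): the structure comes from the other graph, while the attributes come from this one. Here the "other" graph is simply a subgraph of the same data:

    import org.apache.spark.{SparkConf, SparkContext}
    import org.apache.spark.graphx.{Edge, Graph}

    val sc = new SparkContext(new SparkConf().setAppName("maskSketch").setMaster("local[*]"))
    val graph = Graph(
      sc.parallelize(Seq((1L, "a"), (2L, "b"), (3L, "c"))),
      sc.parallelize(Seq(Edge(1L, 2L, 1), Edge(2L, 3L, 1))))

    // Some previously computed restriction; here, simply the vertices with ID < 3.
    val other = graph.subgraph(vpred = (id, _) => id < 3L)

    // Keep only vertices/edges present in `other`, but retain `graph`'s attributes.
    val restricted = graph.mask(other)
    restricted.vertices.collect().foreach(println)   // (1,"a"), (2,"b")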
mask(Graph<VD2, ED2>, ClassTag<VD2>, ClassTag<ED2>) - Method in class org.apache.spark.graphx.impl.GraphImpl
 
mask() - Method in class org.apache.spark.graphx.impl.ShippableVertexPartition
 
mask() - Method in class org.apache.spark.graphx.impl.VertexPartition
 
mask() - Method in class org.apache.spark.graphx.impl.VertexPartitionBase
 
master() - Method in class org.apache.spark.api.java.JavaSparkContext
 
master() - Method in class org.apache.spark.SparkContext
 
master() - Method in class org.apache.spark.storage.BlockManager
 
master() - Method in class org.apache.spark.storage.TachyonBlockManager
 
master() - Method in class org.apache.spark.streaming.Checkpoint
 
Matrices - Class in org.apache.spark.mllib.linalg
Factory methods for Matrix.
Matrices() - Constructor for class org.apache.spark.mllib.linalg.Matrices
 
Matrix - Interface in org.apache.spark.mllib.linalg
Trait for a local matrix.
MatrixEntry - Class in org.apache.spark.mllib.linalg.distributed
:: Experimental :: Represents an entry in a distributed matrix.
MatrixEntry(long, long, double) - Constructor for class org.apache.spark.mllib.linalg.distributed.MatrixEntry
 
MatrixFactorizationModel - Class in org.apache.spark.mllib.recommendation
Model representing the result of matrix factorization.
MatrixFactorizationModel(int, RDD<Tuple2<Object, double[]>>, RDD<Tuple2<Object, double[]>>) - Constructor for class org.apache.spark.mllib.recommendation.MatrixFactorizationModel
 
MatrixFactorizationModel.SaveLoadV1_0$ - Class in org.apache.spark.mllib.recommendation
 
MatrixFactorizationModel.SaveLoadV1_0$() - Constructor for class org.apache.spark.mllib.recommendation.MatrixFactorizationModel.SaveLoadV1_0$
 
max(Comparator<T>) - Method in interface org.apache.spark.api.java.JavaRDDLike
Returns the maximum element from this RDD as defined by the specified Comparator[T].
max() - Method in class org.apache.spark.mllib.stat.MultivariateOnlineSummarizer
 
max() - Method in interface org.apache.spark.mllib.stat.MultivariateStatisticalSummary
Maximum value of each column.
max(Ordering<T>) - Method in class org.apache.spark.rdd.RDD
Returns the max of this RDD as defined by the implicit Ordering[T].
max(Column) - Static method in class org.apache.spark.sql.functions
Aggregate function: returns the maximum value of the expression in a group.
max(String) - Static method in class org.apache.spark.sql.functions
Aggregate function: returns the maximum value of the column in a group.
max(String...) - Method in class org.apache.spark.sql.GroupedData
Compute the max value for each numeric column for each group.
max(Seq<String>) - Method in class org.apache.spark.sql.GroupedData
Compute the max value for each numeric column for each group.
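For illustration, a short Scala sketch of grouped aggregation with the DataFrame API (table and column names invented). Calling max() with no arguments aggregates every numeric column; naming columns restricts the aggregation:

    import org.apache.spark.{SparkConf, SparkContext}
    import org.apache.spark.sql.SQLContext

    val sc = new SparkContext(new SparkConf().setAppName("groupedMaxSketch").setMaster("local[*]"))
    val sqlContext = new SQLContext(sc)
    import sqlContext.implicits._

    val df = sc.parallelize(Seq(("dev", 100, 2), ("dev", 80, 4), ("ops", 60, 1)))
      .toDF("dept", "salary", "years")

    df.groupBy("dept").max().show()          // max of every numeric column per group
    df.groupBy("dept").max("salary").show()  // max of the named column only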
MAX() - Static method in class org.apache.spark.sql.hive.HiveQl
 
max(Duration) - Method in class org.apache.spark.streaming.Duration
 
max(Time) - Method in class org.apache.spark.streaming.Time
 
max(long, long) - Static method in class org.apache.spark.streaming.util.RawTextHelper
 
max() - Method in class org.apache.spark.util.StatCounter
 
MAX_ATTEMPTS() - Method in class org.apache.spark.streaming.CheckpointWriter
 
MAX_DICT_SIZE() - Static method in class org.apache.spark.sql.columnar.compression.DictionaryEncoding
 
MAX_SLAVE_FAILURES() - Method in class org.apache.spark.scheduler.cluster.mesos.CoarseMesosSchedulerBackend
 
maxAkkaFrameSize() - Method in class org.apache.spark.MapOutputTrackerMasterActor
 
maxBatchSize() - Method in class org.apache.spark.streaming.flume.FlumePollingInputDStream
 
maxBins() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
 
maxBins() - Method in class org.apache.spark.mllib.tree.impl.DecisionTreeMetadata
 
maxCores() - Method in class org.apache.spark.scheduler.cluster.mesos.CoarseMesosSchedulerBackend
 
maxCores() - Method in class org.apache.spark.scheduler.cluster.SimrSchedulerBackend
 
maxCores() - Method in class org.apache.spark.scheduler.cluster.SparkDeploySchedulerBackend
 
maxDepth() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
 
maxDepth() - Method in class org.apache.spark.mllib.tree.impl.DecisionTreeMetadata
 
maxFrameSizeBytes(SparkConf) - Static method in class org.apache.spark.util.AkkaUtils
Returns the configured max frame size for Akka messages in bytes.
maxIter() - Method in interface org.apache.spark.ml.param.HasMaxIter
param for max number of iterations
maxIters() - Method in class org.apache.spark.graphx.lib.SVDPlusPlus.Conf
 
maxMem() - Method in class org.apache.spark.scheduler.SparkListenerBlockManagerAdded
 
maxMem() - Method in class org.apache.spark.storage.BlockManagerInfo
 
maxMem() - Method in class org.apache.spark.storage.StorageStatus
 
maxMemory() - Method in class org.apache.spark.ui.exec.ExecutorSummaryInfo
 
maxMemoryInMB() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
 
maxMemSize() - Method in class org.apache.spark.storage.BlockManagerMessages.RegisterBlockManager
 
maxNodesInLevel(int) - Static method in class org.apache.spark.mllib.tree.model.Node
Return the maximum number of nodes which can be in the given level of the tree.
maxRegisteredWaitingTime() - Method in class org.apache.spark.scheduler.cluster.CoarseGrainedSchedulerBackend
 
maxResultSize() - Method in class org.apache.spark.scheduler.TaskSetManager
 
maxRetries() - Method in class org.apache.spark.streaming.kafka.DirectKafkaInputDStream
 
maxTaskFailures() - Method in class org.apache.spark.scheduler.TaskSchedulerImpl
 
maxTaskFailures() - Method in class org.apache.spark.scheduler.TaskSetManager
 
maxVal() - Method in class org.apache.spark.graphx.lib.SVDPlusPlus.Conf
 
maybePartitionSpec() - Method in class org.apache.spark.sql.parquet.ParquetRelation2
 
maybeSchema() - Method in class org.apache.spark.sql.parquet.ParquetRelation2
 
mean() - Method in class org.apache.spark.api.java.JavaDoubleRDD
Compute the mean of this RDD's elements.
mean() - Method in class org.apache.spark.mllib.feature.StandardScalerModel
 
mean() - Method in class org.apache.spark.mllib.random.ExponentialGenerator
 
mean() - Method in class org.apache.spark.mllib.random.LogNormalGenerator
 
mean() - Method in class org.apache.spark.mllib.random.PoissonGenerator
 
mean() - Method in class org.apache.spark.mllib.stat.MultivariateOnlineSummarizer
 
mean() - Method in interface org.apache.spark.mllib.stat.MultivariateStatisticalSummary
Sample mean vector.
mean() - Method in class org.apache.spark.partial.BoundedDouble
 
mean() - Method in class org.apache.spark.rdd.DoubleRDDFunctions
Compute the mean of this RDD's elements.
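A tiny Scala sketch (numbers invented): for an RDD[Double], the DoubleRDDFunctions are picked up implicitly, and max/min use the implicit Ordering:

    import org.apache.spark.{SparkConf, SparkContext}

    val sc = new SparkContext(new SparkConf().setAppName("meanSketch").setMaster("local[*]"))
    val nums = sc.parallelize(Seq(1.0, 2.0, 3.0, 4.0))

    println(nums.mean())   // 2.5
    println(nums.max())    // 4.0
    println(nums.min())    // 1.0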
mean(String...) - Method in class org.apache.spark.sql.GroupedData
Compute the average value for each numeric column for each group.
mean(Seq<String>) - Method in class org.apache.spark.sql.GroupedData
Compute the average value for each numeric column for each group.
mean() - Method in class org.apache.spark.util.StatCounter
 
meanAbsoluteError() - Method in class org.apache.spark.mllib.evaluation.RegressionMetrics
Returns the mean absolute error, which is a risk function corresponding to the expected value of the absolute error loss or l1-norm loss.
meanApprox(long, Double) - Method in class org.apache.spark.api.java.JavaDoubleRDD
Return the approximate mean of the elements in this RDD.
meanApprox(long) - Method in class org.apache.spark.api.java.JavaDoubleRDD
:: Experimental :: Approximate operation to return the mean within a timeout.
meanApprox(long, double) - Method in class org.apache.spark.rdd.DoubleRDDFunctions
:: Experimental :: Approximate operation to return the mean within a timeout.
meanAveragePrecision() - Method in class org.apache.spark.mllib.evaluation.RankingMetrics
Returns the mean average precision (MAP) of all the queries.
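A minimal Scala sketch of RankingMetrics (item IDs invented): each record pairs a predicted ranking with the set of relevant items for a query:

    import org.apache.spark.{SparkConf, SparkContext}
    import org.apache.spark.mllib.evaluation.RankingMetrics

    val sc = new SparkContext(new SparkConf().setAppName("rankingSketch").setMaster("local[*]"))

    // (predicted ranking, ground-truth relevant items) per query.
    val predictionAndLabels = sc.parallelize(Seq(
      (Array(1, 2, 3, 4), Array(1, 3)),
      (Array(5, 6, 7),    Array(7))))

    val metrics = new RankingMetrics(predictionAndLabels)
    println(metrics.meanAveragePrecision)
    println(metrics.ndcgAt(3))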
MeanEvaluator - Class in org.apache.spark.partial
An ApproximateEvaluator for means.
MeanEvaluator(int, double) - Constructor for class org.apache.spark.partial.MeanEvaluator
 
means() - Method in class org.apache.spark.mllib.clustering.ExpectationSum
 
meanSquaredError() - Method in class org.apache.spark.mllib.evaluation.RegressionMetrics
Returns the mean squared error, which is a risk function corresponding to the expected value of the squared error loss or quadratic loss.
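A short Scala sketch of RegressionMetrics on invented (prediction, observation) pairs, e.g. as produced by scoring a regression model on a test set:

    import org.apache.spark.{SparkConf, SparkContext}
    import org.apache.spark.mllib.evaluation.RegressionMetrics

    val sc = new SparkContext(new SparkConf().setAppName("regressionMetricsSketch").setMaster("local[*]"))

    val predictionAndObservations = sc.parallelize(Seq((2.5, 3.0), (0.0, -0.5), (2.0, 2.0)))
    val metrics = new RegressionMetrics(predictionAndObservations)

    println(metrics.meanAbsoluteError)
    println(metrics.meanSquaredError)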
megabytesToString(long) - Static method in class org.apache.spark.util.Utils
Convert a quantity in megabytes to a human-readable string such as "4.0 MB".
MEMORY_AND_DISK - Static variable in class org.apache.spark.api.java.StorageLevels
 
MEMORY_AND_DISK() - Static method in class org.apache.spark.storage.StorageLevel
 
MEMORY_AND_DISK_2 - Static variable in class org.apache.spark.api.java.StorageLevels
 
MEMORY_AND_DISK_2() - Static method in class org.apache.spark.storage.StorageLevel
 
MEMORY_AND_DISK_SER - Static variable in class org.apache.spark.api.java.StorageLevels
 
MEMORY_AND_DISK_SER() - Static method in class org.apache.spark.storage.StorageLevel
 
MEMORY_AND_DISK_SER_2 - Static variable in class org.apache.spark.api.java.StorageLevels
 
MEMORY_AND_DISK_SER_2() - Static method in class org.apache.spark.storage.StorageLevel
 
MEMORY_ONLY - Static variable in class org.apache.spark.api.java.StorageLevels
 
MEMORY_ONLY() - Static method in class org.apache.spark.storage.StorageLevel
 
MEMORY_ONLY_2 - Static variable in class org.apache.spark.api.java.StorageLevels
 
MEMORY_ONLY_2() - Static method in class org.apache.spark.storage.StorageLevel
 
MEMORY_ONLY_SER - Static variable in class org.apache.spark.api.java.StorageLevels
 
MEMORY_ONLY_SER() - Static method in class org.apache.spark.storage.StorageLevel
 
MEMORY_ONLY_SER_2 - Static variable in class org.apache.spark.api.java.StorageLevels
 
MEMORY_ONLY_SER_2() - Static method in class org.apache.spark.storage.StorageLevel
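To show how these storage level constants are typically used, a minimal Scala sketch (data invented): persist keeps the RDD serialized in memory and spills partitions to disk when they do not fit:

    import org.apache.spark.{SparkConf, SparkContext}
    import org.apache.spark.storage.StorageLevel

    val sc = new SparkContext(new SparkConf().setAppName("persistSketch").setMaster("local[*]"))
    val data = sc.parallelize(1 to 1000000)

    // Serialized in memory, spilling to disk if memory runs out.
    data.persist(StorageLevel.MEMORY_AND_DISK_SER)
    println(data.count())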
 
memoryBytesSpilled() - Method in class org.apache.spark.ui.jobs.UIData.ExecutorSummary
 
memoryBytesSpilled() - Method in class org.apache.spark.ui.jobs.UIData.StageUIData
 
MemoryEntry - Class in org.apache.spark.storage
 
MemoryEntry(Object, long, boolean) - Constructor for class org.apache.spark.storage.MemoryEntry
 
MemoryParam - Class in org.apache.spark.util
An extractor object for parsing JVM memory strings, such as "10g", into an Int representing the number of megabytes.
MemoryParam() - Constructor for class org.apache.spark.util.MemoryParam
 
memoryStore() - Method in class org.apache.spark.storage.BlockManager
 
MemoryStore - Class in org.apache.spark.storage
Stores blocks in memory, either as Arrays of deserialized Java objects or as serialized ByteBuffers.
MemoryStore(BlockManager, long) - Constructor for class org.apache.spark.storage.MemoryStore
 
memoryStringToMb(String) - Static method in class org.apache.spark.util.Utils
Convert a Java memory parameter passed to -Xmx (such as 300m or 1g) to a number of megabytes.
memoryUsed() - Method in class org.apache.spark.ui.exec.ExecutorSummaryInfo
 
MemoryUtils - Class in org.apache.spark.scheduler.cluster.mesos
 
MemoryUtils() - Constructor for class org.apache.spark.scheduler.cluster.mesos.MemoryUtils
 
memRemaining() - Method in class org.apache.spark.storage.StorageStatus
Return the memory remaining in this block manager.
memSize() - Method in class org.apache.spark.storage.BlockManagerMessages.UpdateBlockInfo
 
memSize() - Method in class org.apache.spark.storage.BlockStatus
 
memSize() - Method in class org.apache.spark.storage.RDDInfo
 
memUsed() - Method in class org.apache.spark.storage.StorageStatus
Return the memory used by this block manager.
memUsedByRdd(int) - Method in class org.apache.spark.storage.StorageStatus
Return the memory used by the given RDD in this block manager in O(1) time.
merge(R) - Method in class org.apache.spark.Accumulable
Merge two accumulable objects together
merge(ALS.NormalEquation) - Method in class org.apache.spark.ml.recommendation.ALS.NormalEquation
Merges another normal equation object.
merge(ALS.RatingBlock<ID>) - Method in class org.apache.spark.ml.recommendation.ALS.RatingBlockBuilder
Merges another ALS.RatingBlockBuilder.
merge(IDF.DocumentFrequencyAggregator) - Method in class org.apache.spark.mllib.feature.IDF.DocumentFrequencyAggregator
Merges another.
merge(FPTree<T>) - Method in class org.apache.spark.mllib.fpm.FPTree
Merges another FP-Tree.
merge(MultivariateOnlineSummarizer) - Method in class org.apache.spark.mllib.stat.MultivariateOnlineSummarizer
Merge another MultivariateOnlineSummarizer, and update the statistical summary.
merge(DTStatsAggregator) - Method in class org.apache.spark.mllib.tree.impl.DTStatsAggregator
Merge this aggregator with another, and returns this aggregator.
merge(double[], int, int) - Method in class org.apache.spark.mllib.tree.impurity.ImpurityAggregator
Merge the stats from one bin into another.
merge(int, U) - Method in interface org.apache.spark.partial.ApproximateEvaluator
 
merge(int, long) - Method in class org.apache.spark.partial.CountEvaluator
 
merge(int, OpenHashMap<T, Object>) - Method in class org.apache.spark.partial.GroupedCountEvaluator
 
merge(int, HashMap<T, StatCounter>) - Method in class org.apache.spark.partial.GroupedMeanEvaluator
 
merge(int, HashMap<T, StatCounter>) - Method in class org.apache.spark.partial.GroupedSumEvaluator
 
merge(int, StatCounter) - Method in class org.apache.spark.partial.MeanEvaluator
 
merge(int, StatCounter) - Method in class org.apache.spark.partial.SumEvaluator
 
merge(Option<AcceptanceResult>) - Method in class org.apache.spark.util.random.AcceptanceResult
 
merge(double) - Method in class org.apache.spark.util.StatCounter
Add a value into this StatCounter, updating the internal statistics.
merge(TraversableOnce<Object>) - Method in class org.apache.spark.util.StatCounter
Add multiple values into this StatCounter, updating the internal statistics.
merge(StatCounter) - Method in class org.apache.spark.util.StatCounter
Merge another StatCounter into this one, adding up the internal statistics.
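A small Scala sketch of merging StatCounters (values invented): merging partial summaries gives the same result as summarizing all values at once:

    import org.apache.spark.util.StatCounter

    val a = new StatCounter(Seq(1.0, 2.0, 3.0))
    val b = new StatCounter()
    b.merge(Seq(4.0, 5.0))   // add multiple values to one summary

    a.merge(b)               // fold the second summary into the first
    println(a.count)         // 5
    println(a.mean)          // 3.0
    println(a.max)           // 5.0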
MERGE_SCHEMA() - Static method in class org.apache.spark.sql.parquet.ParquetRelation2
 
mergeCombiners() - Method in class org.apache.spark.Aggregator
 
mergeForFeature(int, int, int) - Method in class org.apache.spark.mllib.tree.impl.DTStatsAggregator
For a given feature, merge the stats for two bins.
mergeMetastoreParquetSchema(StructType, StructType) - Static method in class org.apache.spark.sql.parquet.ParquetRelation2
Reconciles Hive Metastore case insensitivity issue and data type conflicts between Metastore schema and Parquet schema.
mergeValue() - Method in class org.apache.spark.Aggregator
 
MesosSchedulerBackend - Class in org.apache.spark.scheduler.cluster.mesos
A SchedulerBackend for running fine-grained tasks on Mesos.
MesosSchedulerBackend(TaskSchedulerImpl, SparkContext, String) - Constructor for class org.apache.spark.scheduler.cluster.mesos.MesosSchedulerBackend
 
MesosTaskLaunchData - Class in org.apache.spark.scheduler.cluster.mesos
Wrapper for serializing the data sent when launching Mesos tasks.
MesosTaskLaunchData(ByteBuffer, int) - Constructor for class org.apache.spark.scheduler.cluster.mesos.MesosTaskLaunchData
 
message() - Method in class org.apache.spark.FetchFailed
 
message() - Method in class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.RegisterExecutorFailed
 
message() - Method in class org.apache.spark.scheduler.ExecutorLossReason
 
message() - Method in exception org.apache.spark.storage.BlockException
 
message() - Method in class org.apache.spark.streaming.scheduler.ReportError
 
metadata() - Method in class org.apache.spark.mllib.tree.impl.DTStatsAggregator
 
metadataCleaner() - Method in class org.apache.spark.SparkContext
 
MetadataCleaner - Class in org.apache.spark.util
Runs a timer task to periodically clean up metadata (e.g.
MetadataCleaner(Enumeration.Value, Function1<Object, BoxedUnit>, SparkConf) - Constructor for class org.apache.spark.util.MetadataCleaner
 
MetadataCleanerType - Class in org.apache.spark.util
 
MetadataCleanerType() - Constructor for class org.apache.spark.util.MetadataCleanerType
 
METASTORE_SCHEMA() - Static method in class org.apache.spark.sql.parquet.ParquetRelation2
 
MetastoreRelation - Class in org.apache.spark.sql.hive
 
MetastoreRelation(String, String, Option<String>, Table, Seq<Partition>, SQLContext) - Constructor for class org.apache.spark.sql.hive.MetastoreRelation
 
MetastoreRelation.SchemaAttribute - Class in org.apache.spark.sql.hive
 
MetastoreRelation.SchemaAttribute(FieldSchema) - Constructor for class org.apache.spark.sql.hive.MetastoreRelation.SchemaAttribute
 
method() - Method in class org.apache.spark.mllib.stat.test.ChiSqTestResult
 
metricName() - Method in class org.apache.spark.ml.evaluation.BinaryClassificationEvaluator
param for metric name in evaluation
metricRegistry() - Method in class org.apache.spark.metrics.source.JvmSource
 
metricRegistry() - Method in interface org.apache.spark.metrics.source.Source
 
metricRegistry() - Method in class org.apache.spark.scheduler.DAGSchedulerSource
 
metricRegistry() - Method in class org.apache.spark.storage.BlockManagerSource
 
metricRegistry() - Method in class org.apache.spark.streaming.StreamingSource
 
metrics() - Method in class org.apache.spark.ExceptionFailure
 
metrics() - Method in class org.apache.spark.scheduler.DirectTaskResult
 
metrics() - Method in class org.apache.spark.scheduler.Task
 
MetricsConfig - Class in org.apache.spark.metrics
 
MetricsConfig(Option<String>) - Constructor for class org.apache.spark.metrics.MetricsConfig
 
MetricsServlet - Class in org.apache.spark.metrics.sink
 
MetricsServlet(Properties, MetricRegistry, SecurityManager) - Constructor for class org.apache.spark.metrics.sink.MetricsServlet
 
MetricsSystem - Class in org.apache.spark.metrics
Spark Metrics System, created for a specific "instance" and composed of sources and sinks; it periodically polls metrics data from the sources and delivers it to the sink destinations.
metricsSystem() - Method in class org.apache.spark.SparkContext
 
metricsSystem() - Method in class org.apache.spark.SparkEnv
 
MFDataGenerator - Class in org.apache.spark.mllib.util
:: DeveloperApi :: Generate RDD(s) containing data for Matrix Factorization.
MFDataGenerator() - Constructor for class org.apache.spark.mllib.util.MFDataGenerator
 
microF1Measure() - Method in class org.apache.spark.mllib.evaluation.MultilabelMetrics
Returns the micro-averaged label-based f1-measure (equal to the micro-averaged document-based f1-measure).
microPrecision() - Method in class org.apache.spark.mllib.evaluation.MultilabelMetrics
Returns the micro-averaged label-based precision (equal to the micro-averaged document-based precision).
microRecall() - Method in class org.apache.spark.mllib.evaluation.MultilabelMetrics
Returns the micro-averaged label-based recall (equal to the micro-averaged document-based recall).
milliseconds() - Method in class org.apache.spark.streaming.Duration
 
milliseconds(long) - Static method in class org.apache.spark.streaming.Durations
 
Milliseconds - Class in org.apache.spark.streaming
Helper object that creates an instance of Duration representing a given number of milliseconds.
Milliseconds() - Constructor for class org.apache.spark.streaming.Milliseconds
 
milliseconds() - Method in class org.apache.spark.streaming.Time
 
millisToString(long) - Static method in class org.apache.spark.scheduler.StatsReportListener
Reformat a time interval in milliseconds to a prettier format for output
min(Comparator<T>) - Method in interface org.apache.spark.api.java.JavaRDDLike
Returns the minimum element from this RDD as defined by the specified Comparator[T].
min() - Method in class org.apache.spark.mllib.stat.MultivariateOnlineSummarizer
 
min() - Method in interface org.apache.spark.mllib.stat.MultivariateStatisticalSummary
Minimum value of each column.
min(Ordering<T>) - Method in class org.apache.spark.rdd.RDD
Returns the min of this RDD as defined by the implicit Ordering[T].
min(Column) - Static method in class org.apache.spark.sql.functions
Aggregate function: returns the minimum value of the expression in a group.
min(String) - Static method in class org.apache.spark.sql.functions
Aggregate function: returns the minimum value of the column in a group.
min(String...) - Method in class org.apache.spark.sql.GroupedData
Compute the min value for each numeric column for each group.
min(Seq<String>) - Method in class org.apache.spark.sql.GroupedData
Compute the min value for each numeric column for each group.
MIN() - Static method in class org.apache.spark.sql.hive.HiveQl
 
min(Duration) - Method in class org.apache.spark.streaming.Duration
 
min(Time) - Method in class org.apache.spark.streaming.Time
 
min() - Method in class org.apache.spark.util.StatCounter
 
minDocFreq() - Method in class org.apache.spark.mllib.feature.IDF.DocumentFrequencyAggregator
 
minDocFreq() - Method in class org.apache.spark.mllib.feature.IDF
 
MINIMUM_INTERVAL_SECONDS() - Static method in class org.apache.spark.util.logging.TimeBasedRollingPolicy
 
MINIMUM_SHARES_PROPERTY() - Method in class org.apache.spark.scheduler.FairSchedulableBuilder
 
MINIMUM_SIZE_BYTES() - Static method in class org.apache.spark.util.logging.SizeBasedRollingPolicy
 
minInfoGain() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
 
minInfoGain() - Method in class org.apache.spark.mllib.tree.impl.DecisionTreeMetadata
 
minInstancesPerNode() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
 
minInstancesPerNode() - Method in class org.apache.spark.mllib.tree.impl.DecisionTreeMetadata
 
MinMax() - Static method in class org.apache.spark.mllib.tree.configuration.QuantileStrategy
 
minMemoryMapBytes() - Method in class org.apache.spark.storage.DiskStore
 
minPollTime() - Method in class org.apache.spark.util.SystemClock
 
minRegisteredRatio() - Method in class org.apache.spark.scheduler.cluster.CoarseGrainedSchedulerBackend
 
minSamplingRate() - Static method in class org.apache.spark.util.random.BinomialBounds
 
minShare() - Method in class org.apache.spark.scheduler.Pool
 
minShare() - Method in interface org.apache.spark.scheduler.Schedulable
 
minShare() - Method in class org.apache.spark.scheduler.TaskSetManager
 
minus(Object) - Method in class org.apache.spark.sql.Column
Subtraction.
minus(Duration) - Method in class org.apache.spark.streaming.Duration
 
minus(Time) - Method in class org.apache.spark.streaming.Time
 
minus(Duration) - Method in class org.apache.spark.streaming.Time
 
minutes() - Static method in class org.apache.spark.scheduler.StatsReportListener
 
minutes(long) - Static method in class org.apache.spark.streaming.Durations
 
Minutes - Class in org.apache.spark.streaming
Helper object that creates an instance of Duration representing a given number of minutes.
Minutes() - Constructor for class org.apache.spark.streaming.Minutes
 
MINUTES_PER_HOUR() - Static method in class org.apache.spark.sql.parquet.CatalystTimestampConverter
 
minVal() - Method in class org.apache.spark.graphx.lib.SVDPlusPlus.Conf
 
MLUtils - Class in org.apache.spark.mllib.util
Helper methods to load, save and pre-process data used in ML Lib.
MLUtils() - Constructor for class org.apache.spark.mllib.util.MLUtils
 
mod(Object) - Method in class org.apache.spark.sql.Column
Modulo (a.k.a. remainder) expression.
mode() - Method in class org.apache.spark.sql.hive.execution.CreateMetastoreDataSourceAsSelect
 
mode() - Method in class org.apache.spark.sql.sources.CreateTableUsingAsSelect
 
mode() - Method in class org.apache.spark.sql.sources.CreateTempTableUsingAsSelect
 
Model<M extends Model<M>> - Class in org.apache.spark.ml
:: AlphaComponent :: A fitted model, i.e., a Transformer produced by an Estimator.
Model() - Constructor for class org.apache.spark.ml.Model
 
MODULE$ - Static variable in class org.apache.spark.AccumulatorParam.DoubleAccumulatorParam$
Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.AccumulatorParam.FloatAccumulatorParam$
Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.AccumulatorParam.IntAccumulatorParam$
Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.AccumulatorParam.LongAccumulatorParam$
Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.graphx.impl.ShippableVertexPartition.ShippableVertexPartitionOpsConstructor$
Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.graphx.impl.VertexPartition.VertexPartitionOpsConstructor$
Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.graphx.PartitionStrategy.CanonicalRandomVertexCut$
Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.graphx.PartitionStrategy.EdgePartition1D$
Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.graphx.PartitionStrategy.EdgePartition2D$
Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.graphx.PartitionStrategy.RandomVertexCut$
Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.ml.recommendation.ALS.InBlock$
Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.ml.recommendation.ALS.Rating$
Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.ml.recommendation.ALS.RatingBlock$
Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.mllib.classification.impl.GLMClassificationModel.SaveLoadV1_0$
Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.mllib.recommendation.MatrixFactorizationModel.SaveLoadV1_0$
Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.mllib.regression.impl.GLMRegressionModel.SaveLoadV1_0$
Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.mllib.stat.test.ChiSqTest.Method$
Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.mllib.stat.test.ChiSqTest.NullHypothesis$
Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.mllib.tree.model.DecisionTreeModel.SaveLoadV1_0$
Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.mllib.tree.model.TreeEnsembleModel.SaveLoadV1_0$
Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.rdd.HadoopRDD.HadoopMapPartitionsWithSplitRDD$
Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.rdd.NewHadoopRDD.NewHadoopMapPartitionsWithSplitRDD$
Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.AddWebUIFilter$
Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.KillExecutors$
Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.KillTask$
Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.LaunchTask$
Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.RegisterClusterManager$
Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.RegisteredExecutor$
Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.RegisterExecutor$
Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.RegisterExecutorFailed$
Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.RemoveExecutor$
Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.RequestExecutors$
Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.RetrieveSparkProps$
Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.ReviveOffers$
Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.StatusUpdate$
Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.StopDriver$
Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.StopExecutor$
Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.StopExecutors$
Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.serializer.SerializationDebugger.ObjectStreamClassMethods$
Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.SparkContext.DoubleAccumulatorParam$
Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.SparkContext.FloatAccumulatorParam$
Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.SparkContext.IntAccumulatorParam$
Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.SparkContext.LongAccumulatorParam$
Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.sql.hive.HiveQl.Token$
Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.sql.parquet.ParquetRelation2.PartitionValues$
Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.sql.SQLConf.Deprecated$
Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.storage.BlockManagerMessages.BlockManagerHeartbeat$
Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.storage.BlockManagerMessages.ExpireDeadHosts$
Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.storage.BlockManagerMessages.GetActorSystemHostPortForExecutor$
Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.storage.BlockManagerMessages.GetBlockStatus$
Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.storage.BlockManagerMessages.GetLocations$
Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.storage.BlockManagerMessages.GetLocationsMultipleBlockIds$
Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.storage.BlockManagerMessages.GetMatchingBlockIds$
Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.storage.BlockManagerMessages.GetMemoryStatus$
Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.storage.BlockManagerMessages.GetPeers$
Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.storage.BlockManagerMessages.GetStorageStatus$
Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.storage.BlockManagerMessages.RegisterBlockManager$
Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.storage.BlockManagerMessages.RemoveBlock$
Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.storage.BlockManagerMessages.RemoveBroadcast$
Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.storage.BlockManagerMessages.RemoveExecutor$
Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.storage.BlockManagerMessages.RemoveRdd$
Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.storage.BlockManagerMessages.RemoveShuffle$
Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.storage.BlockManagerMessages.StopBlockManagerMaster$
Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.storage.BlockManagerMessages.UpdateBlockInfo$
Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.storage.ShuffleBlockFetcherIterator.FailureFetchResult$
Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.storage.ShuffleBlockFetcherIterator.FetchRequest$
Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.storage.ShuffleBlockFetcherIterator.SuccessFetchResult$
Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.streaming.kafka.KafkaCluster.LeaderOffset$
Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.streaming.kafka.KafkaCluster.SimpleConsumerConfig$
Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.streaming.util.WriteAheadLogManager.LogInfo$
Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.ui.JettyUtils.ServletParams$
Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.ui.jobs.UIData.JobUIData$
Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.ui.jobs.UIData.TaskUIData$
Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.util.Vector.VectorAccumParam$
Static reference to the singleton instance of this Scala object.
MQTTInputDStream - Class in org.apache.spark.streaming.mqtt
Input stream that subscribes to messages from an MQTT broker.
MQTTInputDStream(StreamingContext, String, String, StorageLevel) - Constructor for class org.apache.spark.streaming.mqtt.MQTTInputDStream
 
MQTTReceiver - Class in org.apache.spark.streaming.mqtt
 
MQTTReceiver(String, String, StorageLevel) - Constructor for class org.apache.spark.streaming.mqtt.MQTTReceiver
 
MQTTUtils - Class in org.apache.spark.streaming.mqtt
 
MQTTUtils() - Constructor for class org.apache.spark.streaming.mqtt.MQTTUtils
 
msDurationToString(long) - Static method in class org.apache.spark.util.Utils
Returns a human-readable string representing a duration such as "35ms"
msg() - Method in class org.apache.spark.streaming.scheduler.DeregisterReceiver
 
msg() - Method in class org.apache.spark.streaming.scheduler.ErrorReported
 
mu() - Method in class org.apache.spark.mllib.stat.distribution.MultivariateGaussian
 
MulticlassMetrics - Class in org.apache.spark.mllib.evaluation
::Experimental:: Evaluator for multiclass classification.
MulticlassMetrics(RDD<Tuple2<Object, Object>>) - Constructor for class org.apache.spark.mllib.evaluation.MulticlassMetrics
 
MultilabelMetrics - Class in org.apache.spark.mllib.evaluation
Evaluator for multilabel classification.
MultilabelMetrics(RDD<Tuple2<double[], double[]>>) - Constructor for class org.apache.spark.mllib.evaluation.MultilabelMetrics
 
multiLabelValidator(int) - Static method in class org.apache.spark.mllib.util.DataValidators
Function to check if labels used for k class multi-label classification are in the range of {0, 1, ..., k - 1}.
multiply(BlockMatrix) - Method in class org.apache.spark.mllib.linalg.distributed.BlockMatrix
Multiplies this BlockMatrix by other, another BlockMatrix, with this matrix on the left (i.e., computes this * other).
multiply(Matrix) - Method in class org.apache.spark.mllib.linalg.distributed.IndexedRowMatrix
Multiply this matrix by a local matrix on the right.
multiply(Matrix) - Method in class org.apache.spark.mllib.linalg.distributed.RowMatrix
Multiply this matrix by a local matrix on the right.
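A brief Scala sketch of multiplying a distributed RowMatrix by a local matrix on the right (matrix contents invented):

    import org.apache.spark.{SparkConf, SparkContext}
    import org.apache.spark.mllib.linalg.{Matrices, Vectors}
    import org.apache.spark.mllib.linalg.distributed.RowMatrix

    val sc = new SparkContext(new SparkConf().setAppName("rowMatrixSketch").setMaster("local[*]"))

    // A 3 x 2 distributed matrix, one row per RDD element.
    val rows = sc.parallelize(Seq(
      Vectors.dense(1.0, 2.0),
      Vectors.dense(3.0, 4.0),
      Vectors.dense(5.0, 6.0)))
    val mat = new RowMatrix(rows)

    // 2 x 2 local identity matrix (values are column-major).
    val local  = Matrices.dense(2, 2, Array(1.0, 0.0, 0.0, 1.0))
    val result = mat.multiply(local)   // another RowMatrix
    result.rows.collect().foreach(println)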
multiply(DenseMatrix) - Method in interface org.apache.spark.mllib.linalg.Matrix
Convenience method for `Matrix`-`DenseMatrix` multiplication.
multiply(DenseVector) - Method in interface org.apache.spark.mllib.linalg.Matrix
Convenience method for `Matrix`-`DenseVector` multiplication.
multiply(Object) - Method in class org.apache.spark.sql.Column
Multiplication of this expression and another expression.
multiply(double) - Method in class org.apache.spark.util.Vector
 
multiplyGramianMatrixBy(DenseVector<Object>) - Method in class org.apache.spark.mllib.linalg.distributed.RowMatrix
Multiplies the Gramian matrix A^T A by a dense vector on the right without computing A^T A.
MultivariateGaussian - Class in org.apache.spark.mllib.stat.distribution
:: DeveloperApi :: This class provides basic functionality for a Multivariate Gaussian (Normal) Distribution.
MultivariateGaussian(Vector, Matrix) - Constructor for class org.apache.spark.mllib.stat.distribution.MultivariateGaussian
 
MultivariateGaussian(DenseVector<Object>, DenseMatrix<Object>) - Constructor for class org.apache.spark.mllib.stat.distribution.MultivariateGaussian
private[mllib] constructor
MultivariateOnlineSummarizer - Class in org.apache.spark.mllib.stat
:: DeveloperApi :: MultivariateOnlineSummarizer implements MultivariateStatisticalSummary to compute the mean, variance, minimum, maximum, counts, and nonzero counts for samples in sparse or dense vector format in an online fashion.
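A short Scala sketch (vectors invented): partial summaries can be built independently, e.g. one per partition, and then merged:

    import org.apache.spark.mllib.linalg.Vectors
    import org.apache.spark.mllib.stat.MultivariateOnlineSummarizer

    val s1 = new MultivariateOnlineSummarizer()
    s1.add(Vectors.dense(1.0, 10.0))
    s1.add(Vectors.dense(2.0, 20.0))

    val s2 = new MultivariateOnlineSummarizer()
    s2.add(Vectors.dense(3.0, 30.0))

    s1.merge(s2)
    println(s1.count)   // 3
    println(s1.mean)    // [2.0,20.0]
    println(s1.max)     // [3.0,30.0]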
MultivariateOnlineSummarizer() - Constructor for class org.apache.spark.mllib.stat.MultivariateOnlineSummarizer
 
MultivariateStatisticalSummary - Interface in org.apache.spark.mllib.stat
Trait for multivariate statistical summary of a data matrix.
mustCheckpoint() - Method in class org.apache.spark.streaming.dstream.DStream
 
mustCheckpoint() - Method in class org.apache.spark.streaming.dstream.ReducedWindowedDStream
 
mustCheckpoint() - Method in class org.apache.spark.streaming.dstream.StateDStream
 
MutablePair<T1,T2> - Class in org.apache.spark.util
:: DeveloperApi :: A tuple of 2 elements.
MutablePair(T1, T2) - Constructor for class org.apache.spark.util.MutablePair
 
MutablePair() - Constructor for class org.apache.spark.util.MutablePair
No-arg constructor for serialization
MutableRowWriteSupport - Class in org.apache.spark.sql.parquet
 
MutableRowWriteSupport() - Constructor for class org.apache.spark.sql.parquet.MutableRowWriteSupport
 
MutableURLClassLoader - Class in org.apache.spark.util
URL class loader that exposes the addURL and getURLs methods in URLClassLoader.
MutableURLClassLoader(URL[], ClassLoader) - Constructor for class org.apache.spark.util.MutableURLClassLoader
 
myLocalityLevels() - Method in class org.apache.spark.scheduler.TaskSetManager
 
myName() - Method in class org.apache.spark.util.InnerClosureFinder
 
MySQLQuirks - Class in org.apache.spark.sql.jdbc
 
MySQLQuirks() - Constructor for class org.apache.spark.sql.jdbc.MySQLQuirks
 

N

n() - Method in class org.apache.spark.ml.recommendation.ALS.NormalEquation
Number of observations.
n() - Method in class org.apache.spark.mllib.optimization.NNLS.Workspace
 
n() - Method in class org.apache.spark.streaming.receiver.ActorReceiver.Supervisor
 
NaiveBayes - Class in org.apache.spark.mllib.classification
Trains a Naive Bayes model given an RDD of (label, features) pairs.
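For illustration, a tiny Scala training sketch (the labels and feature counts are made up):

    import org.apache.spark.{SparkConf, SparkContext}
    import org.apache.spark.mllib.classification.NaiveBayes
    import org.apache.spark.mllib.linalg.Vectors
    import org.apache.spark.mllib.regression.LabeledPoint

    val sc = new SparkContext(new SparkConf().setAppName("naiveBayesSketch").setMaster("local[*]"))

    val training = sc.parallelize(Seq(
      LabeledPoint(0.0, Vectors.dense(1.0, 0.0)),
      LabeledPoint(0.0, Vectors.dense(2.0, 0.0)),
      LabeledPoint(1.0, Vectors.dense(0.0, 1.0)),
      LabeledPoint(1.0, Vectors.dense(0.0, 2.0))))

    // lambda is the additive smoothing parameter.
    val model = NaiveBayes.train(training, lambda = 1.0)
    println(model.predict(Vectors.dense(0.0, 1.0)))   // expected: 1.0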
NaiveBayes() - Constructor for class org.apache.spark.mllib.classification.NaiveBayes
 
NaiveBayesModel - Class in org.apache.spark.mllib.classification
Model for Naive Bayes Classifiers.
NaiveBayesModel(double[], double[], double[][]) - Constructor for class org.apache.spark.mllib.classification.NaiveBayesModel
 
name() - Method in class org.apache.spark.Accumulable
 
name() - Method in interface org.apache.spark.api.java.JavaRDDLike
 
name() - Method in class org.apache.spark.ml.param.Param
 
name() - Method in class org.apache.spark.mllib.stat.test.ChiSqTest.Method
 
name() - Method in class org.apache.spark.rdd.RDD
A friendly name for this RDD
name() - Method in class org.apache.spark.scheduler.AccumulableInfo
 
name() - Method in class org.apache.spark.scheduler.Pool
 
name() - Method in interface org.apache.spark.scheduler.Schedulable
 
name() - Method in class org.apache.spark.scheduler.Stage
 
name() - Method in class org.apache.spark.scheduler.StageInfo
 
name() - Method in class org.apache.spark.scheduler.TaskDescription
 
name() - Method in class org.apache.spark.scheduler.TaskSetManager
 
name() - Method in interface org.apache.spark.SparkStageInfo
 
name() - Method in class org.apache.spark.SparkStageInfoImpl
 
name() - Method in class org.apache.spark.sql.hive.HiveMetastoreCatalog.QualifiedTableName
 
name() - Method in class org.apache.spark.sql.UserDefinedPythonFunction
 
name() - Method in class org.apache.spark.storage.BlockId
A globally unique identifier for this Block.
name() - Method in class org.apache.spark.storage.BroadcastBlockId
 
name() - Method in class org.apache.spark.storage.RDDBlockId
 
name() - Method in class org.apache.spark.storage.RDDInfo
 
name() - Method in class org.apache.spark.storage.ShuffleBlockId
 
name() - Method in class org.apache.spark.storage.ShuffleDataBlockId
 
name() - Method in class org.apache.spark.storage.ShuffleIndexBlockId
 
name() - Method in class org.apache.spark.storage.StreamBlockId
 
name() - Method in class org.apache.spark.storage.TaskResultBlockId
 
name() - Method in class org.apache.spark.storage.TempLocalBlockId
 
name() - Method in class org.apache.spark.storage.TempShuffleBlockId
 
name() - Method in class org.apache.spark.storage.TestBlockId
 
name() - Method in class org.apache.spark.streaming.scheduler.ReceiverInfo
 
name() - Method in class org.apache.spark.ui.WebUITab
 
name() - Method in class org.apache.spark.util.MetadataCleaner
 
namedThreadFactory(String) - Static method in class org.apache.spark.util.Utils
Create a thread factory that names threads with a prefix and also sets the threads to daemon.
nameToObjectMap() - Static method in class org.apache.spark.mllib.stat.correlation.CorrelationNames
 
NANOS_PER_MILLI() - Static method in class org.apache.spark.sql.parquet.CatalystTimestampConverter
 
NANOS_PER_SECOND() - Static method in class org.apache.spark.sql.parquet.CatalystTimestampConverter
 
NanoTime - Class in org.apache.spark.sql.parquet.timestamp
 
NanoTime() - Constructor for class org.apache.spark.sql.parquet.timestamp.NanoTime
 
NarrowCoGroupSplitDep - Class in org.apache.spark.rdd
 
NarrowCoGroupSplitDep(RDD<?>, int, Partition) - Constructor for class org.apache.spark.rdd.NarrowCoGroupSplitDep
 
NarrowDependency<T> - Class in org.apache.spark
:: DeveloperApi :: Base class for dependencies where each partition of the child RDD depends on a small number of partitions of the parent RDD.
NarrowDependency(RDD<T>) - Constructor for class org.apache.spark.NarrowDependency
 
NativeColumnAccessor<T extends org.apache.spark.sql.types.NativeType> - Class in org.apache.spark.sql.columnar
 
NativeColumnAccessor(ByteBuffer, NativeColumnType<T>) - Constructor for class org.apache.spark.sql.columnar.NativeColumnAccessor
 
NativeColumnBuilder<T extends org.apache.spark.sql.types.NativeType> - Class in org.apache.spark.sql.columnar
 
NativeColumnBuilder(ColumnStats, NativeColumnType<T>) - Constructor for class org.apache.spark.sql.columnar.NativeColumnBuilder
 
NativeColumnType<T extends org.apache.spark.sql.types.NativeType> - Class in org.apache.spark.sql.columnar
 
NativeColumnType(T, int, int) - Constructor for class org.apache.spark.sql.columnar.NativeColumnType
 
NativePlaceholder - Class in org.apache.spark.sql.hive
Used when we need to start parsing the AST before deciding that we are going to pass the command back for Hive to execute natively.
NativePlaceholder() - Constructor for class org.apache.spark.sql.hive.NativePlaceholder
 
ndcgAt(int) - Method in class org.apache.spark.mllib.evaluation.RankingMetrics
Compute the average NDCG value of all the queries, truncated at ranking position k.
negate(Column) - Static method in class org.apache.spark.sql.functions
Unary minus, i.e. negate the expression.
networkStream(Receiver<T>, ClassTag<T>) - Method in class org.apache.spark.streaming.StreamingContext
Create an input stream with any arbitrary user implemented receiver.
newAPIHadoopFile(String, Class<F>, Class<K>, Class<V>, Configuration) - Method in class org.apache.spark.api.java.JavaSparkContext
Get an RDD for a given Hadoop file with an arbitrary new API InputFormat and extra configuration options to pass to the input format.
newAPIHadoopFile(String, ClassTag<K>, ClassTag<V>, ClassTag<F>) - Method in class org.apache.spark.SparkContext
Get an RDD for a Hadoop file with an arbitrary new API InputFormat.
newAPIHadoopFile(String, Class<F>, Class<K>, Class<V>, Configuration) - Method in class org.apache.spark.SparkContext
Get an RDD for a given Hadoop file with an arbitrary new API InputFormat and extra configuration options to pass to the input format.
newAPIHadoopRDD(Configuration, Class<F>, Class<K>, Class<V>) - Method in class org.apache.spark.api.java.JavaSparkContext
Get an RDD for a given Hadoop file with an arbitrary new API InputFormat and extra configuration options to pass to the input format.
newAPIHadoopRDD(Configuration, Class<F>, Class<K>, Class<V>) - Method in class org.apache.spark.SparkContext
Get an RDD for a given Hadoop file with an arbitrary new API InputFormat and extra configuration options to pass to the input format.
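A hedged Scala sketch of reading a plain text file through the new MapReduce API; the HDFS path is a placeholder, and for ordinary text data textFile is usually simpler:

    import org.apache.hadoop.conf.Configuration
    import org.apache.hadoop.io.{LongWritable, Text}
    import org.apache.hadoop.mapreduce.lib.input.TextInputFormat
    import org.apache.spark.{SparkConf, SparkContext}

    val sc = new SparkContext(new SparkConf().setAppName("newApiSketch").setMaster("local[*]"))

    val lines = sc.newAPIHadoopFile(
      "hdfs:///tmp/input.txt",   // hypothetical path
      classOf[TextInputFormat],
      classOf[LongWritable],
      classOf[Text],
      new Configuration())

    println(lines.map(_._2.toString).first())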
newAttemptId() - Method in class org.apache.spark.scheduler.Stage
Return a new attempt id, starting with 0.
newBroadcast(T, boolean, long, ClassTag<T>) - Method in interface org.apache.spark.broadcast.BroadcastFactory
Creates a new broadcast variable.
newBroadcast(T, boolean, ClassTag<T>) - Method in class org.apache.spark.broadcast.BroadcastManager
 
newBroadcast(T, boolean, long, ClassTag<T>) - Method in class org.apache.spark.broadcast.HttpBroadcastFactory
 
newBroadcast(T, boolean, long, ClassTag<T>) - Method in class org.apache.spark.broadcast.TorrentBroadcastFactory
 
newDaemonCachedThreadPool(String) - Static method in class org.apache.spark.util.Utils
Wrapper over newCachedThreadPool.
newDaemonFixedThreadPool(int, String) - Static method in class org.apache.spark.util.Utils
Wrapper over newFixedThreadPool.
newGetLocationInfo() - Method in class org.apache.spark.rdd.HadoopRDD.SplitInfoReflections
 
NewHadoopPartition - Class in org.apache.spark.rdd
 
NewHadoopPartition(int, int, InputSplit) - Constructor for class org.apache.spark.rdd.NewHadoopPartition
 
NewHadoopRDD<K,V> - Class in org.apache.spark.rdd
:: DeveloperApi :: An RDD that provides core functionality for reading data stored in Hadoop (e.g., files in HDFS, sources in HBase, or S3), using the new MapReduce API (org.apache.hadoop.mapreduce).
NewHadoopRDD(SparkContext, Class<? extends InputFormat<K, V>>, Class<K>, Class<V>, Configuration) - Constructor for class org.apache.spark.rdd.NewHadoopRDD
 
NewHadoopRDD.NewHadoopMapPartitionsWithSplitRDD<U,T> - Class in org.apache.spark.rdd
Analogous to MapPartitionsRDD, but passes in an InputSplit to the given function rather than the index of the partition.
NewHadoopRDD.NewHadoopMapPartitionsWithSplitRDD(RDD<T>, Function2<InputSplit, Iterator<T>, Iterator<U>>, boolean, ClassTag<U>, ClassTag<T>) - Constructor for class org.apache.spark.rdd.NewHadoopRDD.NewHadoopMapPartitionsWithSplitRDD
 
NewHadoopRDD.NewHadoopMapPartitionsWithSplitRDD$ - Class in org.apache.spark.rdd
 
NewHadoopRDD.NewHadoopMapPartitionsWithSplitRDD$() - Constructor for class org.apache.spark.rdd.NewHadoopRDD.NewHadoopMapPartitionsWithSplitRDD$
 
newId() - Static method in class org.apache.spark.Accumulators
 
newInputSplit() - Method in class org.apache.spark.rdd.HadoopRDD.SplitInfoReflections
 
newInstance() - Method in class org.apache.spark.serializer.JavaSerializer
 
newInstance() - Method in class org.apache.spark.serializer.KryoSerializer
 
newInstance() - Method in class org.apache.spark.serializer.Serializer
Creates a new SerializerInstance.
newInstance() - Method in class org.apache.spark.sql.columnar.InMemoryRelation
 
newInstance() - Method in class org.apache.spark.sql.hive.HiveGenericUdaf
 
newInstance() - Method in class org.apache.spark.sql.hive.HiveUdaf
 
newInstance() - Method in class org.apache.spark.sql.parquet.ParquetRelation
 
newInstance() - Method in class org.apache.spark.sql.sources.LogicalRelation
 
newJobContext(JobConf, JobID) - Method in interface org.apache.spark.mapred.SparkHadoopMapRedUtil
 
newJobContext(Configuration, JobID) - Method in interface org.apache.spark.mapreduce.SparkHadoopMapReduceUtil
 
newKryo() - Method in class org.apache.spark.serializer.KryoSerializer
 
newKryoOutput() - Method in class org.apache.spark.serializer.KryoSerializer
 
newMesosTaskId() - Method in class org.apache.spark.scheduler.cluster.mesos.CoarseMesosSchedulerBackend
 
newRddId() - Method in class org.apache.spark.SparkContext
Register a new RDD, returning its RDD ID
newShuffleId() - Method in class org.apache.spark.SparkContext
 
newTaskAttemptContext(JobConf, TaskAttemptID) - Method in interface org.apache.spark.mapred.SparkHadoopMapRedUtil
 
newTaskAttemptContext(Configuration, TaskAttemptID) - Method in interface org.apache.spark.mapreduce.SparkHadoopMapReduceUtil
 
newTaskAttemptID(String, int, boolean, int, int) - Method in interface org.apache.spark.mapred.SparkHadoopMapRedUtil
 
newTaskAttemptID(String, int, boolean, int, int) - Method in interface org.apache.spark.mapreduce.SparkHadoopMapReduceUtil
 
newTaskId() - Method in class org.apache.spark.scheduler.TaskSchedulerImpl
 
next() - Method in class org.apache.spark.InterruptibleIterator
 
next() - Method in class org.apache.spark.mllib.clustering.LDA.EMOptimizer
 
next() - Method in class org.apache.spark.rdd.PartitionCoalescer.LocationIterator
 
next(MutableRow, int) - Method in class org.apache.spark.sql.columnar.compression.BooleanBitSet.Decoder
 
next(MutableRow, int) - Method in interface org.apache.spark.sql.columnar.compression.Decoder
 
next(MutableRow, int) - Method in class org.apache.spark.sql.columnar.compression.DictionaryEncoding.Decoder
 
next(MutableRow, int) - Method in class org.apache.spark.sql.columnar.compression.IntDelta.Decoder
 
next(MutableRow, int) - Method in class org.apache.spark.sql.columnar.compression.LongDelta.Decoder
 
next(MutableRow, int) - Method in class org.apache.spark.sql.columnar.compression.PassThrough.Decoder
 
next(MutableRow, int) - Method in class org.apache.spark.sql.columnar.compression.RunLengthEncoding.Decoder
 
next() - Method in class org.apache.spark.storage.ShuffleBlockFetcherIterator
 
next() - Method in class org.apache.spark.streaming.util.WriteAheadLogReader
 
next() - Method in class org.apache.spark.util.CompletionIterator
 
next() - Method in class org.apache.spark.util.IdGenerator
 
next() - Method in class org.apache.spark.util.NextIterator
 
next() - Method in class org.apache.spark.util.random.GapSamplingIterator
 
next() - Method in class org.apache.spark.util.random.GapSamplingReplacementIterator
 
NextIterator<U> - Class in org.apache.spark.util
Provides a basic/boilerplate Iterator implementation.
NextIterator() - Constructor for class org.apache.spark.util.NextIterator
 
nextJobId() - Method in class org.apache.spark.scheduler.DAGScheduler
 
nextKeyValue() - Method in class org.apache.spark.input.FixedLengthBinaryRecordReader
 
nextKeyValue() - Method in class org.apache.spark.input.StreamBasedRecordReader
 
nextKeyValue() - Method in class org.apache.spark.input.WholeTextFileRecordReader
 
nextMesosTaskId() - Method in class org.apache.spark.scheduler.cluster.mesos.CoarseMesosSchedulerBackend
 
nextNullIndex() - Method in interface org.apache.spark.sql.columnar.NullableColumnAccessor
 
nextTaskId() - Method in class org.apache.spark.scheduler.TaskSchedulerImpl
 
nextValue() - Method in class org.apache.spark.mllib.random.ExponentialGenerator
 
nextValue() - Method in class org.apache.spark.mllib.random.GammaGenerator
 
nextValue() - Method in class org.apache.spark.mllib.random.LogNormalGenerator
 
nextValue() - Method in class org.apache.spark.mllib.random.PoissonGenerator
 
nextValue() - Method in interface org.apache.spark.mllib.random.RandomDataGenerator
Returns an i.i.d. sample from an underlying distribution.
nextValue() - Method in class org.apache.spark.mllib.random.StandardNormalGenerator
 
nextValue() - Method in class org.apache.spark.mllib.random.UniformGenerator
 
NNLS - Class in org.apache.spark.mllib.optimization
Object used to solve nonnegative least squares problems using a modified projected gradient method.
NNLS() - Constructor for class org.apache.spark.mllib.optimization.NNLS
 
NNLS.Workspace - Class in org.apache.spark.mllib.optimization
 
NNLS.Workspace(int) - Constructor for class org.apache.spark.mllib.optimization.NNLS.Workspace
 
NO_PREF() - Static method in class org.apache.spark.scheduler.TaskLocality
 
Node - Class in org.apache.spark.mllib.tree.model
:: DeveloperApi :: Node in a decision tree.
Node(int, Predict, double, boolean, Option<Split>, Option<Node>, Option<Node>, Option<InformationGainStats>) - Constructor for class org.apache.spark.mllib.tree.model.Node
 
node() - Method in class org.apache.spark.mllib.tree.model.TreeEnsembleModel.SaveLoadV1_0$.EnsembleNodeData
 
NODE_LOCAL() - Static method in class org.apache.spark.scheduler.TaskLocality
 
nodeId() - Method in class org.apache.spark.mllib.tree.model.DecisionTreeModel.SaveLoadV1_0$.NodeData
 
NodeIdCache - Class in org.apache.spark.mllib.tree.impl
:: DeveloperApi :: A given TreePoint belongs to a particular node in each tree; this cache tracks those node ids per training instance.
NodeIdCache(RDD<int[]>, int) - Constructor for class org.apache.spark.mllib.tree.impl.NodeIdCache
 
nodeIdsForInstances() - Method in class org.apache.spark.mllib.tree.impl.NodeIdCache
 
nodeIndex() - Method in class org.apache.spark.mllib.tree.impl.NodeIndexUpdater
 
nodeIndexInGroup() - Method in class org.apache.spark.mllib.tree.RandomForest.NodeIndexInfo
 
NodeIndexUpdater - Class in org.apache.spark.mllib.tree.impl
:: DeveloperApi :: This is used by the node id cache to find the child id that a data point would belong to.
NodeIndexUpdater(Split, int) - Constructor for class org.apache.spark.mllib.tree.impl.NodeIndexUpdater
 
nodesToGenerator(Seq<Node>) - Static method in class org.apache.spark.sql.hive.HiveQl
 
nodeToRelation(Node) - Static method in class org.apache.spark.sql.hive.HiveQl
 
nodeToSortOrder(Node) - Static method in class org.apache.spark.sql.hive.HiveQl
 
noLocality() - Method in class org.apache.spark.rdd.PartitionCoalescer
 
NONE - Static variable in class org.apache.spark.api.java.StorageLevels
 
None - Static variable in class org.apache.spark.graphx.TripletFields
None of the triplet fields are exposed.
NONE() - Static method in class org.apache.spark.scheduler.SchedulingMode
 
NONE() - Static method in class org.apache.spark.storage.StorageLevel
 
nonLocalPaths(String, boolean) - Static method in class org.apache.spark.util.Utils
Return all non-local paths from a comma-separated list of paths.
nonnegative() - Method in interface org.apache.spark.ml.recommendation.ALSParams
Param for whether to apply nonnegativity constraints.
nonNegativeHash(Object) - Static method in class org.apache.spark.util.Utils
 
nonNegativeMod(int, int) - Static method in class org.apache.spark.util.Utils
 
NoopColumnStats - Class in org.apache.spark.sql.columnar
A no-op ColumnStats only used for testing purposes.
NoopColumnStats() - Constructor for class org.apache.spark.sql.columnar.NoopColumnStats
 
NoQuirks - Class in org.apache.spark.sql.jdbc
 
NoQuirks() - Constructor for class org.apache.spark.sql.jdbc.NoQuirks
 
norm() - Method in class org.apache.spark.mllib.clustering.VectorWithNorm
 
norm(Vector, double) - Static method in class org.apache.spark.mllib.linalg.Vectors
Returns the p-norm of this vector.
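A small sketch of the norm helper above; the vector values and results are illustrative only:
    import org.apache.spark.mllib.linalg.Vectors

    val v = Vectors.dense(3.0, 4.0)
    Vectors.norm(v, 2.0)   // 5.0, the Euclidean (L2) norm
    Vectors.norm(v, 1.0)   // 7.0, the L1 norm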
NORMAL_APPROX_SAMPLE_SIZE() - Method in class org.apache.spark.partial.StudentTCacher
 
normalApprox() - Method in class org.apache.spark.partial.StudentTCacher
 
normalize(RDD<Tuple3<Object, Object, Object>>) - Static method in class org.apache.spark.mllib.clustering.PowerIterationClustering
Normalizes the affinity matrix (A) by row sums and returns the normalized affinity matrix (W).
Normalizer - Class in org.apache.spark.mllib.feature
:: Experimental :: Normalizes samples individually to unit L^p norm.
Normalizer(double) - Constructor for class org.apache.spark.mllib.feature.Normalizer
 
Normalizer() - Constructor for class org.apache.spark.mllib.feature.Normalizer
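A minimal usage sketch for Normalizer; the no-arg constructor is assumed to default to p = 2 (unit L2 norm), and the input vector is illustrative:
    import org.apache.spark.mllib.feature.Normalizer
    import org.apache.spark.mllib.linalg.Vectors

    val normalizer = new Normalizer()                          // unit L2 norm
    val unit = normalizer.transform(Vectors.dense(3.0, 4.0))   // [0.6, 0.8]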
 
normalJavaRDD(JavaSparkContext, long, int, long) - Static method in class org.apache.spark.mllib.random.RandomRDDs
normalJavaRDD(JavaSparkContext, long, int) - Static method in class org.apache.spark.mllib.random.RandomRDDs
normalJavaRDD(JavaSparkContext, long) - Static method in class org.apache.spark.mllib.random.RandomRDDs
RandomRDDs.normalJavaRDD(org.apache.spark.api.java.JavaSparkContext, long, int, long) with the default number of partitions and the default seed.
normalJavaVectorRDD(JavaSparkContext, long, int, int, long) - Static method in class org.apache.spark.mllib.random.RandomRDDs
normalJavaVectorRDD(JavaSparkContext, long, int, int) - Static method in class org.apache.spark.mllib.random.RandomRDDs
normalJavaVectorRDD(JavaSparkContext, long, int) - Static method in class org.apache.spark.mllib.random.RandomRDDs
normalRDD(SparkContext, long, int, long) - Static method in class org.apache.spark.mllib.random.RandomRDDs
Generates an RDD composed of i.i.d. samples from the standard normal distribution.
normalVectorRDD(SparkContext, long, int, int, long) - Static method in class org.apache.spark.mllib.random.RandomRDDs
Generates an RDD[Vector] with vectors containing i.i.d. samples drawn from the standard normal distribution.
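A brief sketch of the normalRDD / normalVectorRDD generators above; `sc` is an assumed SparkContext and the sizes and seed are arbitrary:
    import org.apache.spark.mllib.random.RandomRDDs

    val scalars = RandomRDDs.normalRDD(sc, 1000L, 4, 42L)          // RDD[Double] of N(0, 1) samples
    val vectors = RandomRDDs.normalVectorRDD(sc, 1000L, 3, 4, 42L) // RDD[Vector], 3 i.i.d. N(0, 1) entries per row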
normL1() - Method in class org.apache.spark.mllib.stat.MultivariateOnlineSummarizer
 
normL1() - Method in interface org.apache.spark.mllib.stat.MultivariateStatisticalSummary
L1 norm of each column
normL2() - Method in class org.apache.spark.mllib.stat.MultivariateOnlineSummarizer
 
normL2() - Method in interface org.apache.spark.mllib.stat.MultivariateStatisticalSummary
Euclidean magnitude of each column
not(Column) - Static method in class org.apache.spark.sql.functions
Inversion of a boolean expression, i.e. NOT.
NOT() - Static method in class org.apache.spark.sql.hive.HiveQl
 
Not - Class in org.apache.spark.sql.sources
 
Not(Filter) - Constructor for class org.apache.spark.sql.sources.Not
 
NOT_SET() - Static method in class org.apache.spark.ExecutorAllocationManager
 
notEqual(Object) - Method in class org.apache.spark.sql.Column
Inequality test.
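A short sketch combining the not and notEqual entries above; `df`, "active", and "age" are assumed example names:
    import org.apache.spark.sql.functions.{col, not}

    df.filter(not(col("active")))        // rows where the boolean column is false
    df.filter(col("age").notEqual(21))   // rows where age != 21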
notifyError(Throwable) - Method in class org.apache.spark.streaming.ContextWaiter
 
notifyStop() - Method in class org.apache.spark.streaming.ContextWaiter
 
nullable() - Method in class org.apache.spark.sql.hive.HiveGenericUdaf
 
nullable() - Method in class org.apache.spark.sql.hive.HiveGenericUdf
 
nullable() - Method in class org.apache.spark.sql.hive.HiveSimpleUdf
 
nullable() - Method in class org.apache.spark.sql.hive.HiveUdaf
 
NullableColumnAccessor - Interface in org.apache.spark.sql.columnar
 
NullableColumnBuilder - Interface in org.apache.spark.sql.columnar
A stackable trait used for building byte buffer for a column containing null values.
nullCount() - Method in class org.apache.spark.sql.columnar.ColumnStatisticsSchema
 
nullCount() - Method in interface org.apache.spark.sql.columnar.ColumnStats
 
nullCount() - Method in interface org.apache.spark.sql.columnar.NullableColumnAccessor
 
nullCount() - Method in interface org.apache.spark.sql.columnar.NullableColumnBuilder
 
nullHypothesis() - Method in class org.apache.spark.mllib.stat.test.ChiSqTestResult
 
nullHypothesis() - Method in interface org.apache.spark.mllib.stat.test.TestResult
Null hypothesis of the test.
nulls() - Method in interface org.apache.spark.sql.columnar.NullableColumnBuilder
 
nullsBuffer() - Method in interface org.apache.spark.sql.columnar.NullableColumnAccessor
 
nullTypeToStringType(StructType) - Static method in class org.apache.spark.sql.json.JsonRDD
 
NUM_PARTITIONS() - Static method in class org.apache.spark.ui.UIWorkloadGenerator
 
numAccepted() - Method in class org.apache.spark.util.random.AcceptanceResult
 
numActives() - Method in class org.apache.spark.graphx.impl.EdgePartition
The number of active vertices, if any exist.
numActiveStages() - Method in class org.apache.spark.ui.jobs.UIData.JobUIData
 
numActiveTasks() - Method in interface org.apache.spark.SparkStageInfo
 
numActiveTasks() - Method in class org.apache.spark.SparkStageInfoImpl
 
numActiveTasks() - Method in class org.apache.spark.ui.jobs.UIData.JobUIData
 
numActiveTasks() - Method in class org.apache.spark.ui.jobs.UIData.StageUIData
 
numAvailableOutputs() - Method in class org.apache.spark.scheduler.Stage
 
numberOfHiccups() - Method in class org.apache.spark.streaming.receiver.Statistics
 
numberOfMsgs() - Method in class org.apache.spark.streaming.receiver.Statistics
 
numberOfWorkers() - Method in class org.apache.spark.streaming.receiver.Statistics
 
numBins() - Method in class org.apache.spark.mllib.evaluation.BinaryClassificationMetrics
 
numBins() - Method in class org.apache.spark.mllib.tree.impl.DecisionTreeMetadata
 
numBlocks() - Method in class org.apache.spark.storage.StorageStatus
Return the number of blocks stored in this block manager in O(RDDs) time.
numCachedPartitions() - Method in class org.apache.spark.storage.RDDInfo
 
numClasses() - Method in class org.apache.spark.ml.classification.ClassificationModel
Number of classes (values which the label can take).
numClasses() - Method in class org.apache.spark.ml.classification.LogisticRegressionModel
 
numClasses() - Method in class org.apache.spark.mllib.classification.LogisticRegressionModel
 
numClasses() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
 
numClasses() - Method in class org.apache.spark.mllib.tree.impl.DecisionTreeMetadata
 
numColBlocks() - Method in class org.apache.spark.mllib.linalg.distributed.BlockMatrix
 
numCols() - Method in class org.apache.spark.mllib.linalg.DenseMatrix
 
numCols() - Method in class org.apache.spark.mllib.linalg.distributed.BlockMatrix
 
numCols() - Method in class org.apache.spark.mllib.linalg.distributed.CoordinateMatrix
Gets or computes the number of columns.
numCols() - Method in interface org.apache.spark.mllib.linalg.distributed.DistributedMatrix
Gets or computes the number of columns.
numCols() - Method in class org.apache.spark.mllib.linalg.distributed.IndexedRowMatrix
 
numCols() - Method in class org.apache.spark.mllib.linalg.distributed.RowMatrix
Gets or computes the number of columns.
numCols() - Method in interface org.apache.spark.mllib.linalg.Matrix
Number of columns.
numCols() - Method in class org.apache.spark.mllib.linalg.SparseMatrix
 
numCompletedStages() - Method in class org.apache.spark.ui.jobs.JobProgressListener
 
numCompletedTasks() - Method in interface org.apache.spark.SparkStageInfo
 
numCompletedTasks() - Method in class org.apache.spark.SparkStageInfoImpl
 
numCompletedTasks() - Method in class org.apache.spark.ui.jobs.UIData.JobUIData
 
numCompleteTasks() - Method in class org.apache.spark.ui.jobs.UIData.StageUIData
 
numDescendants() - Method in class org.apache.spark.mllib.tree.model.Node
Get the number of nodes in tree below this node, including leaf nodes.
numEdgePartitions() - Method in class org.apache.spark.graphx.impl.RoutingTablePartition
The maximum number of edge partitions this `RoutingTablePartition` is built to join with.
numEdges() - Method in class org.apache.spark.graphx.GraphOps
The number of edges in the graph.
numericAstTypes() - Static method in class org.apache.spark.sql.hive.HiveQl
 
NumericParser - Class in org.apache.spark.mllib.util
Simple parser for a numeric structure consisting of three types: numbers, arrays, and tuples.
NumericParser() - Constructor for class org.apache.spark.mllib.util.NumericParser
 
numericRDDToDoubleRDDFunctions(RDD<T>, Numeric<T>) - Static method in class org.apache.spark.rdd.RDD
 
numericRDDToDoubleRDDFunctions(RDD<T>, Numeric<T>) - Static method in class org.apache.spark.SparkContext
 
numExamples() - Method in class org.apache.spark.mllib.tree.impl.DecisionTreeMetadata
 
numExistingExecutors() - Method in class org.apache.spark.scheduler.cluster.CoarseGrainedSchedulerBackend
Return the number of executors currently registered with this backend.
numFailedStages() - Method in class org.apache.spark.ui.jobs.JobProgressListener
 
numFailedStages() - Method in class org.apache.spark.ui.jobs.UIData.JobUIData
 
numFailedTasks() - Method in interface org.apache.spark.SparkStageInfo
 
numFailedTasks() - Method in class org.apache.spark.SparkStageInfoImpl
 
numFailedTasks() - Method in class org.apache.spark.ui.jobs.UIData.JobUIData
 
numFailedTasks() - Method in class org.apache.spark.ui.jobs.UIData.StageUIData
 
numFalseNegatives() - Method in interface org.apache.spark.mllib.evaluation.binary.BinaryConfusionMatrix
number of false negatives
numFalseNegatives() - Method in class org.apache.spark.mllib.evaluation.binary.BinaryConfusionMatrixImpl
number of false negatives
numFalsePositives() - Method in interface org.apache.spark.mllib.evaluation.binary.BinaryConfusionMatrix
number of false positives
numFalsePositives() - Method in class org.apache.spark.mllib.evaluation.binary.BinaryConfusionMatrixImpl
number of false positives
numFeatures() - Method in class org.apache.spark.ml.feature.HashingTF
number of features
numFeatures() - Method in class org.apache.spark.mllib.classification.LogisticRegressionModel
 
numFeatures() - Method in class org.apache.spark.mllib.feature.HashingTF
 
numFeatures() - Method in class org.apache.spark.mllib.tree.impl.DecisionTreeMetadata
 
numFeaturesPerNode() - Method in class org.apache.spark.mllib.tree.impl.DecisionTreeMetadata
 
numFinished() - Method in class org.apache.spark.scheduler.ActiveJob
 
numFolds() - Method in interface org.apache.spark.ml.tuning.CrossValidatorParams
Param for the number of folds for cross-validation.
numItemBlocks() - Method in interface org.apache.spark.ml.recommendation.ALSParams
Param for number of item blocks.
numItems() - Method in class org.apache.spark.util.random.AcceptanceResult
 
numIterations() - Method in class org.apache.spark.mllib.tree.configuration.BoostingStrategy
 
numNegatives() - Method in interface org.apache.spark.mllib.evaluation.binary.BinaryConfusionMatrix
number of negatives
numNegatives() - Method in class org.apache.spark.mllib.evaluation.binary.BinaryConfusionMatrixImpl
number of negatives
numNegatives() - Method in class org.apache.spark.mllib.evaluation.binary.BinaryLabelCounter
 
numNodes() - Method in class org.apache.spark.mllib.tree.model.DecisionTreeModel
Get number of nodes in tree, including leaf nodes.
numNonzeros() - Method in class org.apache.spark.mllib.stat.MultivariateOnlineSummarizer
 
numNonzeros() - Method in interface org.apache.spark.mllib.stat.MultivariateStatisticalSummary
Number of nonzero elements (including explicitly presented zero values) in each column.
numPartitions() - Method in class org.apache.spark.HashPartitioner
 
numPartitions() - Method in class org.apache.spark.mllib.linalg.distributed.GridPartitioner
 
numPartitions() - Method in class org.apache.spark.Partitioner
 
numPartitions() - Method in class org.apache.spark.RangePartitioner
 
numPartitions() - Method in class org.apache.spark.scheduler.ActiveJob
 
numPartitions() - Method in class org.apache.spark.scheduler.Stage
 
numPartitions() - Method in class org.apache.spark.sql.jdbc.JDBCPartitioningInfo
 
numPartitions() - Method in class org.apache.spark.storage.RDDInfo
 
numPartitionsInRdd2() - Method in class org.apache.spark.rdd.CartesianRDD
 
numPositives() - Method in interface org.apache.spark.mllib.evaluation.binary.BinaryConfusionMatrix
number of positives
numPositives() - Method in class org.apache.spark.mllib.evaluation.binary.BinaryConfusionMatrixImpl
number of positives
numPositives() - Method in class org.apache.spark.mllib.evaluation.binary.BinaryLabelCounter
 
numRddBlocks() - Method in class org.apache.spark.storage.StorageStatus
Return the number of RDD blocks stored in this block manager in O(RDDs) time.
numRddBlocksById(int) - Method in class org.apache.spark.storage.StorageStatus
Return the number of blocks that belong to the given RDD in O(1) time.
numReceivers() - Method in class org.apache.spark.streaming.ui.StreamingJobProgressListener
 
numRecords() - Method in class org.apache.spark.streaming.scheduler.ReceivedBlockInfo
 
numRetries(SparkConf) - Static method in class org.apache.spark.util.AkkaUtils
Returns the configured number of times to retry connecting
numRowBlocks() - Method in class org.apache.spark.mllib.linalg.distributed.BlockMatrix
 
numRows() - Method in class org.apache.spark.mllib.linalg.DenseMatrix
 
numRows() - Method in class org.apache.spark.mllib.linalg.distributed.BlockMatrix
 
numRows() - Method in class org.apache.spark.mllib.linalg.distributed.CoordinateMatrix
Gets or computes the number of rows.
numRows() - Method in interface org.apache.spark.mllib.linalg.distributed.DistributedMatrix
Gets or computes the number of rows.
numRows() - Method in class org.apache.spark.mllib.linalg.distributed.IndexedRowMatrix
 
numRows() - Method in class org.apache.spark.mllib.linalg.distributed.RowMatrix
Gets or computes the number of rows.
numRows() - Method in interface org.apache.spark.mllib.linalg.Matrix
Number of rows.
numRows() - Method in class org.apache.spark.mllib.linalg.SparseMatrix
 
numShufflePartitions() - Method in class org.apache.spark.sql.SQLConf
Number of partitions to use for shuffle operators.
numSkippedStages() - Method in class org.apache.spark.ui.jobs.UIData.JobUIData
 
numSkippedTasks() - Method in class org.apache.spark.ui.jobs.UIData.JobUIData
 
numSplits(int) - Method in class org.apache.spark.mllib.tree.impl.DecisionTreeMetadata
Number of splits for the given feature.
numTasks() - Method in class org.apache.spark.scheduler.Stage
 
numTasks() - Method in class org.apache.spark.scheduler.StageInfo
 
numTasks() - Method in class org.apache.spark.scheduler.TaskSetManager
 
numTasks() - Method in interface org.apache.spark.SparkStageInfo
 
numTasks() - Method in class org.apache.spark.SparkStageInfoImpl
 
numTasks() - Method in class org.apache.spark.ui.jobs.UIData.JobUIData
 
numThreadsUnrolling() - Method in class org.apache.spark.storage.MemoryStore
Return the number of threads currently unrolling blocks.
numTopFeatures() - Method in class org.apache.spark.mllib.feature.ChiSqSelector
 
numTotalCompletedBatches() - Method in class org.apache.spark.streaming.ui.StreamingJobProgressListener
 
numTotalJobs() - Method in class org.apache.spark.scheduler.DAGScheduler
 
numTotalProcessedRecords() - Method in class org.apache.spark.streaming.ui.StreamingJobProgressListener
 
numTotalReceivedRecords() - Method in class org.apache.spark.streaming.ui.StreamingJobProgressListener
 
numTrees() - Method in class org.apache.spark.mllib.tree.impl.DecisionTreeMetadata
 
numTrees() - Method in class org.apache.spark.mllib.tree.model.TreeEnsembleModel
Get number of trees in forest.
numTrueNegatives() - Method in interface org.apache.spark.mllib.evaluation.binary.BinaryConfusionMatrix
number of true negatives
numTrueNegatives() - Method in class org.apache.spark.mllib.evaluation.binary.BinaryConfusionMatrixImpl
number of true negatives
numTruePositives() - Method in interface org.apache.spark.mllib.evaluation.binary.BinaryConfusionMatrix
number of true positives
numTruePositives() - Method in class org.apache.spark.mllib.evaluation.binary.BinaryConfusionMatrixImpl
number of true positives
numUnorderedBins(int) - Static method in class org.apache.spark.mllib.tree.impl.DecisionTreeMetadata
Given the arity of a categorical feature (arity = number of categories), return the number of bins for the feature if it is to be treated as an unordered feature.
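As a hedged aside on the combinatorics behind this entry: a categorical feature with M categories can be split into two non-empty groups in 2^(M-1) - 1 distinct ways (each pair of complementary subsets counted once), so for example M = 4 gives 7 candidate splits; how those splits map onto bins is an implementation detail of DecisionTreeMetadata.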
numUnprocessedBatches() - Method in class org.apache.spark.streaming.ui.StreamingJobProgressListener
 
numUserBlocks() - Method in interface org.apache.spark.ml.recommendation.ALSParams
Param for number of user blocks.
numVertices() - Method in class org.apache.spark.graphx.GraphOps
The number of vertices in the graph.

O

objectFile(String, int) - Method in class org.apache.spark.api.java.JavaSparkContext
Load an RDD saved as a SequenceFile containing serialized objects, with NullWritable keys and BytesWritable values that contain a serialized partition.
objectFile(String) - Method in class org.apache.spark.api.java.JavaSparkContext
Load an RDD saved as a SequenceFile containing serialized objects, with NullWritable keys and BytesWritable values that contain a serialized partition.
objectFile(String, int, ClassTag<T>) - Method in class org.apache.spark.SparkContext
Load an RDD saved as a SequenceFile containing serialized objects, with NullWritable keys and BytesWritable values that contain a serialized partition.
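A round-trip sketch for the objectFile loaders above; `sc` is an assumed SparkContext and the path is illustrative:
    val data = sc.parallelize(1 to 100)
    data.saveAsObjectFile("/tmp/ints")                 // write serialized objects as a SequenceFile
    val restored = sc.objectFile[Int]("/tmp/ints", 4)  // reload, requesting at least 4 partitions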
ObjectInputStreamWithLoader - Class in org.apache.spark.streaming
 
ObjectInputStreamWithLoader(InputStream, ClassLoader) - Constructor for class org.apache.spark.streaming.ObjectInputStreamWithLoader
 
of(RDD<Tuple2<Object, Object>>) - Static method in class org.apache.spark.mllib.evaluation.AreaUnderCurve
Returns the area under the given curve.
of(Iterable<Tuple2<Object, Object>>) - Static method in class org.apache.spark.mllib.evaluation.AreaUnderCurve
Returns the area under the given curve.
OFF_HEAP - Static variable in class org.apache.spark.api.java.StorageLevels
 
OFF_HEAP() - Static method in class org.apache.spark.storage.StorageLevel
 
offerRescinded(SchedulerDriver, Protos.OfferID) - Method in class org.apache.spark.scheduler.cluster.mesos.CoarseMesosSchedulerBackend
 
offerRescinded(SchedulerDriver, Protos.OfferID) - Method in class org.apache.spark.scheduler.cluster.mesos.MesosSchedulerBackend
 
offHeapUsed() - Method in class org.apache.spark.storage.StorageStatus
Return the off-heap space used by this block manager.
offHeapUsedByRdd(int) - Method in class org.apache.spark.storage.StorageStatus
Return the off-heap space used by the given RDD in this block manager in O(1) time.
offset() - Method in class org.apache.spark.storage.FileSegment
 
offset() - Method in class org.apache.spark.storage.TachyonFileSegment
 
offset() - Method in class org.apache.spark.streaming.kafka.KafkaCluster.LeaderOffset
 
offset() - Method in class org.apache.spark.streaming.util.WriteAheadLogFileSegment
 
offsetBytes(String, long, long) - Static method in class org.apache.spark.util.Utils
Return a string containing part of a file from byte 'start' to 'end'.
offsetBytes(Seq<File>, long, long) - Static method in class org.apache.spark.util.Utils
Return a string containing data across a set of files.
OffsetRange - Class in org.apache.spark.streaming.kafka
:: Experimental :: Represents a range of offsets from a single Kafka TopicAndPartition.
offsetRanges() - Method in interface org.apache.spark.streaming.kafka.HasOffsetRanges
 
offsetRanges() - Method in class org.apache.spark.streaming.kafka.KafkaRDD
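A sketch of how these ranges are typically read back; `stream` is assumed to be a DStream created with KafkaUtils.createDirectStream:
    import org.apache.spark.streaming.kafka.{HasOffsetRanges, OffsetRange}

    stream.foreachRDD { rdd =>
      val ranges: Array[OffsetRange] = rdd.asInstanceOf[HasOffsetRanges].offsetRanges
      ranges.foreach(r => println(s"${r.topic} ${r.partition}: ${r.fromOffset} -> ${r.untilOffset}"))
    }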
 
onAddData(Object, Object) - Method in interface org.apache.spark.streaming.receiver.BlockGeneratorListener
Called after a data item is added into the BlockGenerator.
onApplicationEnd(SparkListenerApplicationEnd) - Method in class org.apache.spark.JavaSparkListener
 
onApplicationEnd(SparkListenerApplicationEnd) - Method in class org.apache.spark.scheduler.ApplicationEventListener
 
onApplicationEnd(SparkListenerApplicationEnd) - Method in class org.apache.spark.scheduler.EventLoggingListener
 
onApplicationEnd(SparkListenerApplicationEnd) - Method in interface org.apache.spark.scheduler.SparkListener
Called when the application ends
onApplicationEnd(SparkListenerApplicationEnd) - Method in class org.apache.spark.SparkFirehoseListener
 
onApplicationStart(SparkListenerApplicationStart) - Method in class org.apache.spark.JavaSparkListener
 
onApplicationStart(SparkListenerApplicationStart) - Method in class org.apache.spark.scheduler.ApplicationEventListener
 
onApplicationStart(SparkListenerApplicationStart) - Method in class org.apache.spark.scheduler.EventLoggingListener
 
onApplicationStart(SparkListenerApplicationStart) - Method in interface org.apache.spark.scheduler.SparkListener
Called when the application starts
onApplicationStart(SparkListenerApplicationStart) - Method in class org.apache.spark.SparkFirehoseListener
 
onBatchCompleted(StreamingListenerBatchCompleted) - Method in class org.apache.spark.streaming.scheduler.StatsReportListener
 
onBatchCompleted(StreamingListenerBatchCompleted) - Method in interface org.apache.spark.streaming.scheduler.StreamingListener
Called when processing of a batch of jobs has completed.
onBatchCompleted(StreamingListenerBatchCompleted) - Method in class org.apache.spark.streaming.ui.StreamingJobProgressListener
 
onBatchCompletion(Time) - Method in class org.apache.spark.streaming.scheduler.JobGenerator
Callback called when a batch has been completely processed.
onBatchStarted(StreamingListenerBatchStarted) - Method in interface org.apache.spark.streaming.scheduler.StreamingListener
Called when processing of a batch of jobs has started.
onBatchStarted(StreamingListenerBatchStarted) - Method in class org.apache.spark.streaming.ui.StreamingJobProgressListener
 
onBatchSubmitted(StreamingListenerBatchSubmitted) - Method in interface org.apache.spark.streaming.scheduler.StreamingListener
Called when a batch of jobs has been submitted for processing.
onBatchSubmitted(StreamingListenerBatchSubmitted) - Method in class org.apache.spark.streaming.ui.StreamingJobProgressListener
 
onBlockManagerAdded(SparkListenerBlockManagerAdded) - Method in class org.apache.spark.JavaSparkListener
 
onBlockManagerAdded(SparkListenerBlockManagerAdded) - Method in class org.apache.spark.scheduler.EventLoggingListener
 
onBlockManagerAdded(SparkListenerBlockManagerAdded) - Method in interface org.apache.spark.scheduler.SparkListener
Called when a new block manager has joined
onBlockManagerAdded(SparkListenerBlockManagerAdded) - Method in class org.apache.spark.SparkFirehoseListener
 
onBlockManagerAdded(SparkListenerBlockManagerAdded) - Method in class org.apache.spark.storage.StorageStatusListener
 
onBlockManagerAdded(SparkListenerBlockManagerAdded) - Method in class org.apache.spark.ui.jobs.JobProgressListener
 
onBlockManagerRemoved(SparkListenerBlockManagerRemoved) - Method in class org.apache.spark.JavaSparkListener
 
onBlockManagerRemoved(SparkListenerBlockManagerRemoved) - Method in class org.apache.spark.scheduler.EventLoggingListener
 
onBlockManagerRemoved(SparkListenerBlockManagerRemoved) - Method in interface org.apache.spark.scheduler.SparkListener
Called when an existing block manager has been removed
onBlockManagerRemoved(SparkListenerBlockManagerRemoved) - Method in class org.apache.spark.SparkFirehoseListener
 
onBlockManagerRemoved(SparkListenerBlockManagerRemoved) - Method in class org.apache.spark.storage.StorageStatusListener
 
onBlockManagerRemoved(SparkListenerBlockManagerRemoved) - Method in class org.apache.spark.ui.jobs.JobProgressListener
 
onCheckpointCompletion(Time) - Method in class org.apache.spark.streaming.scheduler.JobGenerator
Callback called when the checkpoint of a batch has been written.
onComplete(Function1<Try<T>, U>, ExecutionContext) - Method in class org.apache.spark.ComplexFutureAction
 
onComplete(Function1<Try<T>, U>, ExecutionContext) - Method in interface org.apache.spark.FutureAction
When this action is completed, either through an exception, or a value, applies the provided function.
onComplete(Function1<R, BoxedUnit>) - Method in class org.apache.spark.partial.PartialResult
Set a handler to be called when this PartialResult completes.
onComplete(Function1<Try<T>, U>, ExecutionContext) - Method in class org.apache.spark.SimpleFutureAction
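An illustration of the PartialResult variant of onComplete above; `rdd` is an assumed RDD and countApprox returns a PartialResult[BoundedDouble]:
    val partial = rdd.countApprox(timeout = 1000L)   // approximate count within one second
    partial.onComplete { bound =>
      println(s"final estimate: ${bound.mean} in [${bound.low}, ${bound.high}]")
    }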
 
onDropEvent(SparkListenerEvent) - Method in class org.apache.spark.scheduler.LiveListenerBus
 
onDropEvent(StreamingListenerEvent) - Method in class org.apache.spark.streaming.scheduler.StreamingListenerBus
 
onDropEvent(E) - Method in class org.apache.spark.util.AsynchronousListenerBus
If the event queue exceeds its capacity, the new events will be dropped.
onEnvironmentUpdate(SparkListenerEnvironmentUpdate) - Method in class org.apache.spark.JavaSparkListener
 
onEnvironmentUpdate(SparkListenerEnvironmentUpdate) - Method in class org.apache.spark.scheduler.ApplicationEventListener
 
onEnvironmentUpdate(SparkListenerEnvironmentUpdate) - Method in class org.apache.spark.scheduler.EventLoggingListener
 
onEnvironmentUpdate(SparkListenerEnvironmentUpdate) - Method in interface org.apache.spark.scheduler.SparkListener
Called when environment properties have been updated
onEnvironmentUpdate(SparkListenerEnvironmentUpdate) - Method in class org.apache.spark.SparkFirehoseListener
 
onEnvironmentUpdate(SparkListenerEnvironmentUpdate) - Method in class org.apache.spark.ui.env.EnvironmentListener
 
onEnvironmentUpdate(SparkListenerEnvironmentUpdate) - Method in class org.apache.spark.ui.jobs.JobProgressListener
 
onError(Throwable) - Method in class org.apache.spark.scheduler.DAGSchedulerEventProcessLoop
 
onError(String, Throwable) - Method in interface org.apache.spark.streaming.receiver.BlockGeneratorListener
Called when an error has occurred in the BlockGenerator.
ones(int, int) - Static method in class org.apache.spark.mllib.linalg.DenseMatrix
Generate a DenseMatrix consisting of ones.
ones(int, int) - Static method in class org.apache.spark.mllib.linalg.Matrices
Generate a DenseMatrix consisting of ones.
ones(int) - Static method in class org.apache.spark.util.Vector
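A minimal sketch of the matrix factory methods above:
    import org.apache.spark.mllib.linalg.{DenseMatrix, Matrices}

    val m = Matrices.ones(2, 3)      // 2 x 3 matrix filled with 1.0
    val d = DenseMatrix.ones(2, 3)   // same contents, statically typed as DenseMatrix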
 
OneToOneDependency<T> - Class in org.apache.spark
:: DeveloperApi :: Represents a one-to-one dependency between partitions of the parent and child RDDs.
OneToOneDependency(RDD<T>) - Constructor for class org.apache.spark.OneToOneDependency
 
onEvent(SparkListenerEvent) - Method in class org.apache.spark.SparkFirehoseListener
 
onExecutorAdded(SparkListenerExecutorAdded) - Method in class org.apache.spark.JavaSparkListener
 
onExecutorAdded(SparkListenerExecutorAdded) - Method in class org.apache.spark.scheduler.EventLoggingListener
 
onExecutorAdded(SparkListenerExecutorAdded) - Method in interface org.apache.spark.scheduler.SparkListener
Called when the driver registers a new executor.
onExecutorAdded(SparkListenerExecutorAdded) - Method in class org.apache.spark.SparkFirehoseListener
 
onExecutorAdded(SparkListenerExecutorAdded) - Method in class org.apache.spark.ui.exec.ExecutorsListener
 
onExecutorMetricsUpdate(SparkListenerExecutorMetricsUpdate) - Method in class org.apache.spark.JavaSparkListener
 
onExecutorMetricsUpdate(SparkListenerExecutorMetricsUpdate) - Method in class org.apache.spark.scheduler.EventLoggingListener
 
onExecutorMetricsUpdate(SparkListenerExecutorMetricsUpdate) - Method in interface org.apache.spark.scheduler.SparkListener
Called when the driver receives task metrics from an executor in a heartbeat.
onExecutorMetricsUpdate(SparkListenerExecutorMetricsUpdate) - Method in class org.apache.spark.SparkFirehoseListener
 
onExecutorMetricsUpdate(SparkListenerExecutorMetricsUpdate) - Method in class org.apache.spark.ui.jobs.JobProgressListener
 
onExecutorRemoved(SparkListenerExecutorRemoved) - Method in class org.apache.spark.JavaSparkListener
 
onExecutorRemoved(SparkListenerExecutorRemoved) - Method in class org.apache.spark.scheduler.EventLoggingListener
 
onExecutorRemoved(SparkListenerExecutorRemoved) - Method in interface org.apache.spark.scheduler.SparkListener
Called when the driver removes an executor.
onExecutorRemoved(SparkListenerExecutorRemoved) - Method in class org.apache.spark.SparkFirehoseListener
 
onFail(Function1<Exception, BoxedUnit>) - Method in class org.apache.spark.partial.PartialResult
Set a handler to be called if this PartialResult's job fails.
onGenerateBlock(StreamBlockId) - Method in interface org.apache.spark.streaming.receiver.BlockGeneratorListener
Called when a new block of data is generated by the block generator.
onJobEnd(SparkListenerJobEnd) - Method in class org.apache.spark.JavaSparkListener
 
onJobEnd(SparkListenerJobEnd) - Method in class org.apache.spark.scheduler.EventLoggingListener
 
onJobEnd(SparkListenerJobEnd) - Method in class org.apache.spark.scheduler.JobLogger
When a job ends, records the job completion status and closes the log file.
onJobEnd(SparkListenerJobEnd) - Method in interface org.apache.spark.scheduler.SparkListener
Called when a job ends
onJobEnd(SparkListenerJobEnd) - Method in class org.apache.spark.SparkFirehoseListener
 
onJobEnd(SparkListenerJobEnd) - Method in class org.apache.spark.ui.jobs.JobProgressListener
 
onJobStart(SparkListenerJobStart) - Method in class org.apache.spark.JavaSparkListener
 
onJobStart(SparkListenerJobStart) - Method in class org.apache.spark.scheduler.EventLoggingListener
 
onJobStart(SparkListenerJobStart) - Method in class org.apache.spark.scheduler.JobLogger
When a job starts, records the job properties and stage graph.
onJobStart(SparkListenerJobStart) - Method in interface org.apache.spark.scheduler.SparkListener
Called when a job starts
onJobStart(SparkListenerJobStart) - Method in class org.apache.spark.SparkFirehoseListener
 
onJobStart(SparkListenerJobStart) - Method in class org.apache.spark.ui.jobs.JobProgressListener
 
onPostEvent(SparkListener, SparkListenerEvent) - Method in interface org.apache.spark.scheduler.SparkListenerBus
 
onPostEvent(StreamingListener, StreamingListenerEvent) - Method in class org.apache.spark.streaming.scheduler.StreamingListenerBus
 
onPostEvent(L, E) - Method in interface org.apache.spark.util.ListenerBus
Post an event to the specified listener.
onPushBlock(StreamBlockId, ArrayBuffer<?>) - Method in interface org.apache.spark.streaming.receiver.BlockGeneratorListener
Called when a new block is ready to be pushed.
onReceive(DAGSchedulerEvent) - Method in class org.apache.spark.scheduler.DAGSchedulerEventProcessLoop
The main event loop of the DAG scheduler.
onReceiverError(StreamingListenerReceiverError) - Method in interface org.apache.spark.streaming.scheduler.StreamingListener
Called when a receiver has reported an error
onReceiverError(StreamingListenerReceiverError) - Method in class org.apache.spark.streaming.ui.StreamingJobProgressListener
 
onReceiverStarted(StreamingListenerReceiverStarted) - Method in interface org.apache.spark.streaming.scheduler.StreamingListener
Called when a receiver has been started
onReceiverStarted(StreamingListenerReceiverStarted) - Method in class org.apache.spark.streaming.ui.StreamingJobProgressListener
 
onReceiverStopped(StreamingListenerReceiverStopped) - Method in interface org.apache.spark.streaming.scheduler.StreamingListener
Called when a receiver has been stopped
onReceiverStopped(StreamingListenerReceiverStopped) - Method in class org.apache.spark.streaming.ui.StreamingJobProgressListener
 
onStageCompleted(SparkListenerStageCompleted) - Method in class org.apache.spark.JavaSparkListener
 
onStageCompleted(SparkListenerStageCompleted) - Method in class org.apache.spark.scheduler.EventLoggingListener
 
onStageCompleted(SparkListenerStageCompleted) - Method in class org.apache.spark.scheduler.JobLogger
When a stage completes, records the stage completion status.
onStageCompleted(SparkListenerStageCompleted) - Method in interface org.apache.spark.scheduler.SparkListener
Called when a stage completes successfully or fails, with information on the completed stage.
onStageCompleted(SparkListenerStageCompleted) - Method in class org.apache.spark.scheduler.StatsReportListener
 
onStageCompleted(SparkListenerStageCompleted) - Method in class org.apache.spark.SparkFirehoseListener
 
onStageCompleted(SparkListenerStageCompleted) - Method in class org.apache.spark.ui.jobs.JobProgressListener
 
onStageCompleted(SparkListenerStageCompleted) - Method in class org.apache.spark.ui.storage.StorageListener
 
onStageSubmitted(SparkListenerStageSubmitted) - Method in class org.apache.spark.JavaSparkListener
 
onStageSubmitted(SparkListenerStageSubmitted) - Method in class org.apache.spark.scheduler.EventLoggingListener
 
onStageSubmitted(SparkListenerStageSubmitted) - Method in class org.apache.spark.scheduler.JobLogger
When a stage is submitted, records the stage submission info.
onStageSubmitted(SparkListenerStageSubmitted) - Method in interface org.apache.spark.scheduler.SparkListener
Called when a stage is submitted
onStageSubmitted(SparkListenerStageSubmitted) - Method in class org.apache.spark.SparkFirehoseListener
 
onStageSubmitted(SparkListenerStageSubmitted) - Method in class org.apache.spark.ui.jobs.JobProgressListener
For FIFO scheduling, all stages are contained in the "default" pool, though the pool is meaningless in that mode.
onStageSubmitted(SparkListenerStageSubmitted) - Method in class org.apache.spark.ui.storage.StorageListener
 
onStart() - Method in class org.apache.spark.streaming.dstream.RawNetworkReceiver
 
onStart() - Method in class org.apache.spark.streaming.dstream.SocketReceiver
 
onStart() - Method in class org.apache.spark.streaming.flume.FlumePollingReceiver
 
onStart() - Method in class org.apache.spark.streaming.flume.FlumeReceiver
 
onStart() - Method in class org.apache.spark.streaming.kafka.KafkaReceiver
 
onStart() - Method in class org.apache.spark.streaming.kafka.ReliableKafkaReceiver
 
onStart() - Method in class org.apache.spark.streaming.kinesis.KinesisReceiver
This is called when the KinesisReceiver starts and must be non-blocking.
onStart() - Method in class org.apache.spark.streaming.mqtt.MQTTReceiver
 
onStart() - Method in class org.apache.spark.streaming.receiver.ActorReceiver
 
onStart() - Method in class org.apache.spark.streaming.receiver.Receiver
This method is called by the system when the receiver is started.
onStart() - Method in class org.apache.spark.streaming.twitter.TwitterReceiver
 
onStop() - Method in class org.apache.spark.scheduler.DAGSchedulerEventProcessLoop
 
onStop() - Method in class org.apache.spark.streaming.dstream.RawNetworkReceiver
 
onStop() - Method in class org.apache.spark.streaming.dstream.SocketReceiver
 
onStop() - Method in class org.apache.spark.streaming.flume.FlumePollingReceiver
 
onStop() - Method in class org.apache.spark.streaming.flume.FlumeReceiver
 
onStop() - Method in class org.apache.spark.streaming.kafka.KafkaReceiver
 
onStop() - Method in class org.apache.spark.streaming.kafka.ReliableKafkaReceiver
 
onStop() - Method in class org.apache.spark.streaming.kinesis.KinesisReceiver
This is called when the KinesisReceiver stops.
onStop() - Method in class org.apache.spark.streaming.mqtt.MQTTReceiver
 
onStop() - Method in class org.apache.spark.streaming.receiver.ActorReceiver
 
onStop() - Method in class org.apache.spark.streaming.receiver.Receiver
This method is called by the system when the receiver is stopped.
onStop() - Method in class org.apache.spark.streaming.twitter.TwitterReceiver
 
onTaskCompletion(TaskContext) - Method in interface org.apache.spark.util.TaskCompletionListener
 
onTaskEnd(SparkListenerTaskEnd) - Method in class org.apache.spark.JavaSparkListener
 
onTaskEnd(SparkListenerTaskEnd) - Method in class org.apache.spark.scheduler.EventLoggingListener
 
onTaskEnd(SparkListenerTaskEnd) - Method in class org.apache.spark.scheduler.JobLogger
When a task ends, records the task completion status and metrics.
onTaskEnd(SparkListenerTaskEnd) - Method in interface org.apache.spark.scheduler.SparkListener
Called when a task ends
onTaskEnd(SparkListenerTaskEnd) - Method in class org.apache.spark.scheduler.StatsReportListener
 
onTaskEnd(SparkListenerTaskEnd) - Method in class org.apache.spark.SparkFirehoseListener
 
onTaskEnd(SparkListenerTaskEnd) - Method in class org.apache.spark.storage.StorageStatusListener
 
onTaskEnd(SparkListenerTaskEnd) - Method in class org.apache.spark.ui.exec.ExecutorsListener
 
onTaskEnd(SparkListenerTaskEnd) - Method in class org.apache.spark.ui.jobs.JobProgressListener
 
onTaskEnd(SparkListenerTaskEnd) - Method in class org.apache.spark.ui.storage.StorageListener
Assumes the storage status list is fully up-to-date.
onTaskGettingResult(SparkListenerTaskGettingResult) - Method in class org.apache.spark.JavaSparkListener
 
onTaskGettingResult(SparkListenerTaskGettingResult) - Method in class org.apache.spark.scheduler.EventLoggingListener
 
onTaskGettingResult(SparkListenerTaskGettingResult) - Method in interface org.apache.spark.scheduler.SparkListener
Called when a task begins remotely fetching its result (will not be called for tasks that do not need to fetch the result remotely).
onTaskGettingResult(SparkListenerTaskGettingResult) - Method in class org.apache.spark.SparkFirehoseListener
 
onTaskGettingResult(SparkListenerTaskGettingResult) - Method in class org.apache.spark.ui.jobs.JobProgressListener
 
onTaskStart(SparkListenerTaskStart) - Method in class org.apache.spark.JavaSparkListener
 
onTaskStart(SparkListenerTaskStart) - Method in class org.apache.spark.scheduler.EventLoggingListener
 
onTaskStart(SparkListenerTaskStart) - Method in interface org.apache.spark.scheduler.SparkListener
Called when a task starts
onTaskStart(SparkListenerTaskStart) - Method in class org.apache.spark.SparkFirehoseListener
 
onTaskStart(SparkListenerTaskStart) - Method in class org.apache.spark.ui.exec.ExecutorsListener
 
onTaskStart(SparkListenerTaskStart) - Method in class org.apache.spark.ui.jobs.JobProgressListener
 
onUnpersistRDD(SparkListenerUnpersistRDD) - Method in class org.apache.spark.JavaSparkListener
 
onUnpersistRDD(SparkListenerUnpersistRDD) - Method in class org.apache.spark.scheduler.EventLoggingListener
 
onUnpersistRDD(SparkListenerUnpersistRDD) - Method in interface org.apache.spark.scheduler.SparkListener
Called when an RDD is manually unpersisted by the application
onUnpersistRDD(SparkListenerUnpersistRDD) - Method in class org.apache.spark.SparkFirehoseListener
 
onUnpersistRDD(SparkListenerUnpersistRDD) - Method in class org.apache.spark.storage.StorageStatusListener
 
onUnpersistRDD(SparkListenerUnpersistRDD) - Method in class org.apache.spark.ui.storage.StorageListener
 
OOM() - Static method in class org.apache.spark.util.SparkExitCode
The default uncaught exception handler was reached, and the uncaught exception was an OutOfMemoryError.
open() - Method in class org.apache.spark.input.PortableDataStream
Create a new DataInputStream from the split and context
open() - Method in class org.apache.spark.SparkHadoopWriter
 
open() - Method in class org.apache.spark.storage.BlockObjectWriter
 
open() - Method in class org.apache.spark.storage.DiskBlockObjectWriter
 
openEventLog(Path, FileSystem) - Static method in class org.apache.spark.scheduler.EventLoggingListener
Opens an event log file and returns an input stream that contains the event data.
ops() - Method in class org.apache.spark.graphx.Graph
The associated GraphOps object.
optimize(RDD<Tuple2<Object, Vector>>, Vector) - Method in class org.apache.spark.mllib.optimization.GradientDescent
:: DeveloperApi :: Runs gradient descent on the given training data.
optimize(RDD<Tuple2<Object, Vector>>, Vector) - Method in class org.apache.spark.mllib.optimization.LBFGS
 
optimize(RDD<Tuple2<Object, Vector>>, Vector) - Method in interface org.apache.spark.mllib.optimization.Optimizer
Solve the provided convex optimization problem.
optimizer() - Method in class org.apache.spark.mllib.classification.LogisticRegressionWithLBFGS
 
optimizer() - Method in class org.apache.spark.mllib.classification.LogisticRegressionWithSGD
 
optimizer() - Method in class org.apache.spark.mllib.classification.SVMWithSGD
 
Optimizer - Interface in org.apache.spark.mllib.optimization
:: DeveloperApi :: Trait for optimization problem solvers.
optimizer() - Method in class org.apache.spark.mllib.regression.GeneralizedLinearAlgorithm
The optimizer to solve the problem.
optimizer() - Method in class org.apache.spark.mllib.regression.LassoWithSGD
 
optimizer() - Method in class org.apache.spark.mllib.regression.LinearRegressionWithSGD
 
optimizer() - Method in class org.apache.spark.mllib.regression.RidgeRegressionWithSGD
 
options() - Method in class org.apache.spark.sql.hive.execution.CreateMetastoreDataSource
 
options() - Method in class org.apache.spark.sql.hive.execution.CreateMetastoreDataSourceAsSelect
 
options() - Method in class org.apache.spark.sql.sources.CreateTableUsing
 
options() - Method in class org.apache.spark.sql.sources.CreateTableUsingAsSelect
 
options() - Method in class org.apache.spark.sql.sources.CreateTempTableUsing
 
options() - Method in class org.apache.spark.sql.sources.CreateTempTableUsingAsSelect
 
optionToOptional(Option<T>) - Static method in class org.apache.spark.api.java.JavaUtils
 
or(Column) - Method in class org.apache.spark.sql.Column
Boolean OR.
OR() - Static method in class org.apache.spark.sql.hive.HiveQl
 
Or - Class in org.apache.spark.sql.sources
 
Or(Filter, Filter) - Constructor for class org.apache.spark.sql.sources.Or
 
orderBy(String, String...) - Method in class org.apache.spark.sql.DataFrame
Returns a new DataFrame sorted by the given expressions.
orderBy(Column...) - Method in class org.apache.spark.sql.DataFrame
Returns a new DataFrame sorted by the given expressions.
orderBy(String, Seq<String>) - Method in class org.apache.spark.sql.DataFrame
Returns a new DataFrame sorted by the given expressions.
orderBy(Seq<Column>) - Method in class org.apache.spark.sql.DataFrame
Returns a new DataFrame sorted by the given expressions.
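A small sketch of the orderBy overloads above; `df`, "dept", and "salary" are assumed example names:
    import org.apache.spark.sql.functions.col

    df.orderBy("dept", "salary")                  // ascending on both columns
    df.orderBy(col("dept"), col("salary").desc)   // Column expressions allow mixed sort directions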
OrderedRDDFunctions<K,V,P extends scala.Product2<K,V>> - Class in org.apache.spark.rdd
Extra functions available on RDDs of (key, value) pairs where the key is sortable through an implicit conversion.
OrderedRDDFunctions(RDD<P>, Ordering<K>, ClassTag<K>, ClassTag<V>, ClassTag<P>) - Constructor for class org.apache.spark.rdd.OrderedRDDFunctions
 
ordering() - Static method in class org.apache.spark.streaming.Time
 
org.apache.spark - package org.apache.spark
Core Spark classes in Scala.
org.apache.spark.annotation - package org.apache.spark.annotation
Spark annotations to mark an API experimental or intended only for advanced usages by developers.
org.apache.spark.api.java - package org.apache.spark.api.java
Spark Java programming APIs.
org.apache.spark.api.java.function - package org.apache.spark.api.java.function
Set of interfaces to represent functions in Spark's Java API.
org.apache.spark.broadcast - package org.apache.spark.broadcast
Spark's broadcast variables, used to broadcast immutable datasets to all nodes.
org.apache.spark.examples.streaming - package org.apache.spark.examples.streaming
 
org.apache.spark.graphx - package org.apache.spark.graphx
ALPHA COMPONENT: GraphX is a graph processing framework built on top of Spark.
org.apache.spark.graphx.impl - package org.apache.spark.graphx.impl
 
org.apache.spark.graphx.lib - package org.apache.spark.graphx.lib
Various analytics functions for graphs.
org.apache.spark.graphx.util - package org.apache.spark.graphx.util
Collections of utilities used by graphx.
org.apache.spark.input - package org.apache.spark.input
 
org.apache.spark.io - package org.apache.spark.io
IO codecs used for compression.
org.apache.spark.mapred - package org.apache.spark.mapred
 
org.apache.spark.mapreduce - package org.apache.spark.mapreduce
 
org.apache.spark.metrics - package org.apache.spark.metrics
 
org.apache.spark.metrics.sink - package org.apache.spark.metrics.sink
 
org.apache.spark.metrics.source - package org.apache.spark.metrics.source
 
org.apache.spark.ml - package org.apache.spark.ml
Spark ML is an ALPHA component that adds a new set of machine learning APIs to let users quickly assemble and configure practical machine learning pipelines.
org.apache.spark.ml.classification - package org.apache.spark.ml.classification
 
org.apache.spark.ml.evaluation - package org.apache.spark.ml.evaluation
 
org.apache.spark.ml.feature - package org.apache.spark.ml.feature
 
org.apache.spark.ml.impl.estimator - package org.apache.spark.ml.impl.estimator
 
org.apache.spark.ml.param - package org.apache.spark.ml.param
 
org.apache.spark.ml.recommendation - package org.apache.spark.ml.recommendation
 
org.apache.spark.ml.regression - package org.apache.spark.ml.regression
 
org.apache.spark.ml.tuning - package org.apache.spark.ml.tuning
 
org.apache.spark.mllib.classification - package org.apache.spark.mllib.classification
 
org.apache.spark.mllib.classification.impl - package org.apache.spark.mllib.classification.impl
 
org.apache.spark.mllib.clustering - package org.apache.spark.mllib.clustering
 
org.apache.spark.mllib.evaluation - package org.apache.spark.mllib.evaluation
 
org.apache.spark.mllib.evaluation.binary - package org.apache.spark.mllib.evaluation.binary
 
org.apache.spark.mllib.feature - package org.apache.spark.mllib.feature
 
org.apache.spark.mllib.fpm - package org.apache.spark.mllib.fpm
 
org.apache.spark.mllib.impl - package org.apache.spark.mllib.impl
 
org.apache.spark.mllib.linalg - package org.apache.spark.mllib.linalg
 
org.apache.spark.mllib.linalg.distributed - package org.apache.spark.mllib.linalg.distributed
 
org.apache.spark.mllib.optimization - package org.apache.spark.mllib.optimization
 
org.apache.spark.mllib.random - package org.apache.spark.mllib.random
 
org.apache.spark.mllib.rdd - package org.apache.spark.mllib.rdd
 
org.apache.spark.mllib.recommendation - package org.apache.spark.mllib.recommendation
 
org.apache.spark.mllib.regression - package org.apache.spark.mllib.regression
 
org.apache.spark.mllib.regression.impl - package org.apache.spark.mllib.regression.impl
 
org.apache.spark.mllib.stat - package org.apache.spark.mllib.stat
 
org.apache.spark.mllib.stat.correlation - package org.apache.spark.mllib.stat.correlation
 
org.apache.spark.mllib.stat.distribution - package org.apache.spark.mllib.stat.distribution
 
org.apache.spark.mllib.stat.test - package org.apache.spark.mllib.stat.test
 
org.apache.spark.mllib.tree - package org.apache.spark.mllib.tree
 
org.apache.spark.mllib.tree.configuration - package org.apache.spark.mllib.tree.configuration
 
org.apache.spark.mllib.tree.impl - package org.apache.spark.mllib.tree.impl
 
org.apache.spark.mllib.tree.impurity - package org.apache.spark.mllib.tree.impurity
 
org.apache.spark.mllib.tree.loss - package org.apache.spark.mllib.tree.loss
 
org.apache.spark.mllib.tree.model - package org.apache.spark.mllib.tree.model
 
org.apache.spark.mllib.util - package org.apache.spark.mllib.util
 
org.apache.spark.partial - package org.apache.spark.partial
 
org.apache.spark.rdd - package org.apache.spark.rdd
Provides implementations of various RDDs.
org.apache.spark.scheduler - package org.apache.spark.scheduler
Spark's DAG scheduler.
org.apache.spark.scheduler.cluster - package org.apache.spark.scheduler.cluster
 
org.apache.spark.scheduler.cluster.mesos - package org.apache.spark.scheduler.cluster.mesos
 
org.apache.spark.scheduler.local - package org.apache.spark.scheduler.local
 
org.apache.spark.serializer - package org.apache.spark.serializer
Pluggable serializers for RDD and shuffle data.
org.apache.spark.sql - package org.apache.spark.sql
 
org.apache.spark.sql.api.java - package org.apache.spark.sql.api.java
Allows the execution of relational queries, including those expressed in SQL using Spark.
org.apache.spark.sql.columnar - package org.apache.spark.sql.columnar
 
org.apache.spark.sql.columnar.compression - package org.apache.spark.sql.columnar.compression
 
org.apache.spark.sql.hive - package org.apache.spark.sql.hive
 
org.apache.spark.sql.hive.execution - package org.apache.spark.sql.hive.execution
 
org.apache.spark.sql.jdbc - package org.apache.spark.sql.jdbc
 
org.apache.spark.sql.json - package org.apache.spark.sql.json
 
org.apache.spark.sql.parquet - package org.apache.spark.sql.parquet
 
org.apache.spark.sql.parquet.timestamp - package org.apache.spark.sql.parquet.timestamp
 
org.apache.spark.sql.sources - package org.apache.spark.sql.sources
 
org.apache.spark.sql.test - package org.apache.spark.sql.test
 
org.apache.spark.storage - package org.apache.spark.storage
 
org.apache.spark.streaming - package org.apache.spark.streaming
 
org.apache.spark.streaming.api.java - package org.apache.spark.streaming.api.java
Java APIs for Spark Streaming.
org.apache.spark.streaming.dstream - package org.apache.spark.streaming.dstream
Various implementations of DStreams.
org.apache.spark.streaming.flume - package org.apache.spark.streaming.flume
Spark Streaming receiver for Flume.
org.apache.spark.streaming.kafka - package org.apache.spark.streaming.kafka
Kafka receiver for Spark Streaming.
org.apache.spark.streaming.kinesis - package org.apache.spark.streaming.kinesis
 
org.apache.spark.streaming.mqtt - package org.apache.spark.streaming.mqtt
MQTT receiver for Spark Streaming.
org.apache.spark.streaming.rdd - package org.apache.spark.streaming.rdd
 
org.apache.spark.streaming.receiver - package org.apache.spark.streaming.receiver
 
org.apache.spark.streaming.scheduler - package org.apache.spark.streaming.scheduler
 
org.apache.spark.streaming.twitter - package org.apache.spark.streaming.twitter
Twitter feed receiver for Spark Streaming.
org.apache.spark.streaming.ui - package org.apache.spark.streaming.ui
 
org.apache.spark.streaming.util - package org.apache.spark.streaming.util
 
org.apache.spark.streaming.zeromq - package org.apache.spark.streaming.zeromq
ZeroMQ receiver for Spark Streaming.
org.apache.spark.ui - package org.apache.spark.ui
 
org.apache.spark.ui.env - package org.apache.spark.ui.env
 
org.apache.spark.ui.exec - package org.apache.spark.ui.exec
 
org.apache.spark.ui.jobs - package org.apache.spark.ui.jobs
 
org.apache.spark.ui.storage - package org.apache.spark.ui.storage
 
org.apache.spark.util - package org.apache.spark.util
Spark utilities.
org.apache.spark.util.io - package org.apache.spark.util.io
 
org.apache.spark.util.logging - package org.apache.spark.util.logging
 
org.apache.spark.util.random - package org.apache.spark.util.random
Utilities for random number generation.
originals() - Static method in class org.apache.spark.Accumulators
 
originalType() - Method in class org.apache.spark.sql.parquet.ParquetTypeInfo
 
other() - Method in class org.apache.spark.scheduler.RuntimePercentage
 
otherCopyArgs() - Method in class org.apache.spark.sql.hive.execution.ScriptTransformation
 
otherInfo() - Method in class org.apache.spark.streaming.receiver.Statistics
 
otherVertexAttr(long) - Method in class org.apache.spark.graphx.EdgeTriplet
Given one vertex in the edge return the other vertex.
otherVertexId(long) - Method in class org.apache.spark.graphx.Edge
Given one vertex in the edge return the other vertex.
Out() - Static method in class org.apache.spark.graphx.EdgeDirection
Edges originating from a vertex.
outDegrees() - Method in class org.apache.spark.graphx.GraphOps
The out-degree of each vertex in the graph.
outerJoinVertices(RDD<Tuple2<Object, U>>, Function3<Object, VD, Option<U>, VD2>, ClassTag<U>, ClassTag<VD2>, Predef.$eq$colon$eq<VD, VD2>) - Method in class org.apache.spark.graphx.Graph
Joins the vertices with entries in the table RDD and merges the results using mapFunc.
outerJoinVertices(RDD<Tuple2<Object, U>>, Function3<Object, VD, Option<U>, VD2>, ClassTag<U>, ClassTag<VD2>, Predef.$eq$colon$eq<VD, VD2>) - Method in class org.apache.spark.graphx.impl.GraphImpl
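A sketch of outerJoinVertices, assuming a Graph `graph`; each vertex attribute is replaced by its out-degree, defaulting to 0 for vertices with no outgoing edges:
    val degreeGraph = graph.outerJoinVertices(graph.outDegrees) {
      (vid, oldAttr, degOpt) => degOpt.getOrElse(0)   // mapFunc merges the old attribute with the joined value
    }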
 
output() - Method in class org.apache.spark.serializer.KryoSerializationStream
 
output() - Method in class org.apache.spark.sql.columnar.InMemoryColumnarTableScan
 
output() - Method in class org.apache.spark.sql.columnar.InMemoryRelation
 
output() - Method in class org.apache.spark.sql.hive.execution.DescribeHiveTableCommand
 
output() - Method in class org.apache.spark.sql.hive.execution.HiveNativeCommand
 
output() - Method in class org.apache.spark.sql.hive.execution.HiveTableScan
 
output() - Method in class org.apache.spark.sql.hive.execution.InsertIntoHiveTable
 
output() - Method in class org.apache.spark.sql.hive.execution.ScriptTransformation
 
output() - Method in class org.apache.spark.sql.hive.InsertIntoHiveTable
 
output() - Method in class org.apache.spark.sql.hive.MetastoreRelation
 
output() - Method in class org.apache.spark.sql.parquet.InsertIntoParquetTable
 
output() - Method in class org.apache.spark.sql.parquet.ParquetRelation
Attributes
output() - Method in class org.apache.spark.sql.parquet.ParquetTableScan
 
output() - Method in class org.apache.spark.sql.sources.CreateTableUsingAsSelect
 
output() - Method in class org.apache.spark.sql.sources.DescribeCommand
 
output() - Method in class org.apache.spark.sql.sources.LogicalRelation
 
OUTPUT() - Static method in class org.apache.spark.ui.ToolTips
 
outputBytes() - Method in class org.apache.spark.ui.jobs.UIData.ExecutorSummary
 
outputBytes() - Method in class org.apache.spark.ui.jobs.UIData.StageUIData
 
outputClass() - Method in class org.apache.spark.sql.hive.execution.InsertIntoHiveTable
 
outputCol() - Method in interface org.apache.spark.ml.param.HasOutputCol
param for output column name
OutputCommitCoordinationMessage - Interface in org.apache.spark.scheduler
 
OutputCommitCoordinator - Class in org.apache.spark.scheduler
Authority that decides whether tasks can commit output to HDFS.
OutputCommitCoordinator(SparkConf) - Constructor for class org.apache.spark.scheduler.OutputCommitCoordinator
 
outputCommitCoordinator() - Method in class org.apache.spark.SparkEnv
 
OutputCommitCoordinator.OutputCommitCoordinatorActor - Class in org.apache.spark.scheduler
 
OutputCommitCoordinator.OutputCommitCoordinatorActor(OutputCommitCoordinator) - Constructor for class org.apache.spark.scheduler.OutputCommitCoordinator.OutputCommitCoordinatorActor
 
outputId() - Method in class org.apache.spark.scheduler.ResultTask
 
outputLocs() - Method in class org.apache.spark.scheduler.Stage
 
outputMetricsFromJson(JsonAST.JValue) - Static method in class org.apache.spark.util.JsonProtocol
 
outputMetricsToJson(OutputMetrics) - Static method in class org.apache.spark.util.JsonProtocol
 
outputRecords() - Method in class org.apache.spark.ui.jobs.UIData.ExecutorSummary
 
outputRecords() - Method in class org.apache.spark.ui.jobs.UIData.StageUIData
 
outputRowFormat() - Method in class org.apache.spark.sql.hive.execution.HiveScriptIOSchema
 
outputRowFormatMap() - Method in class org.apache.spark.sql.hive.execution.HiveScriptIOSchema
 
outputSerdeClass() - Method in class org.apache.spark.sql.hive.execution.HiveScriptIOSchema
 
outputSerdeProps() - Method in class org.apache.spark.sql.hive.execution.HiveScriptIOSchema
 
outputsMerged() - Method in class org.apache.spark.partial.CountEvaluator
 
outputsMerged() - Method in class org.apache.spark.partial.GroupedCountEvaluator
 
outputsMerged() - Method in class org.apache.spark.partial.GroupedMeanEvaluator
 
outputsMerged() - Method in class org.apache.spark.partial.GroupedSumEvaluator
 
outputsMerged() - Method in class org.apache.spark.partial.MeanEvaluator
 
outputsMerged() - Method in class org.apache.spark.partial.SumEvaluator
 
OVERHEAD_FRACTION() - Static method in class org.apache.spark.scheduler.cluster.mesos.MemoryUtils
 
OVERHEAD_MINIMUM() - Static method in class org.apache.spark.scheduler.cluster.mesos.MemoryUtils
 
overwrite() - Method in class org.apache.spark.sql.hive.execution.InsertIntoHiveTable
 
overwrite() - Method in class org.apache.spark.sql.hive.InsertIntoHiveTable
 
overwrite() - Method in class org.apache.spark.sql.parquet.InsertIntoParquetTable
 
overwrite() - Method in class org.apache.spark.sql.sources.InsertIntoDataSource
 

P

pageRank(double, double) - Method in class org.apache.spark.graphx.GraphOps
Run a dynamic version of PageRank returning a graph with vertex attributes containing the PageRank and edge attributes containing the normalized edge weight.
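A minimal sketch of running dynamic PageRank with a convergence tolerance (assumes an existing graph: Graph[VD, ED]):
    val ranks = graph.pageRank(0.0001).vertices        // RDD of (vertexId, rank) pairs
    ranks.take(5).foreach(println)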
PageRank - Class in org.apache.spark.graphx.lib
PageRank algorithm implementation.
PageRank() - Constructor for class org.apache.spark.graphx.lib.PageRank
 
pages() - Method in class org.apache.spark.ui.WebUITab
 
PairDStreamFunctions<K,V> - Class in org.apache.spark.streaming.dstream
Extra functions available on DStream of (key, value) pairs through an implicit conversion.
PairDStreamFunctions(DStream<Tuple2<K, V>>, ClassTag<K>, ClassTag<V>, Ordering<K>) - Constructor for class org.apache.spark.streaming.dstream.PairDStreamFunctions
 
PairFlatMapFunction<T,K,V> - Interface in org.apache.spark.api.java.function
A function that returns zero or more key-value pair records from each input record.
PairFunction<T,K,V> - Interface in org.apache.spark.api.java.function
A function that returns key-value pairs (Tuple2<K, V>), and can be used to construct PairRDDs.
pairFunToScalaFun(PairFunction<A, B, C>) - Static method in class org.apache.spark.api.java.JavaPairRDD
 
PairRDDFunctions<K,V> - Class in org.apache.spark.rdd
Extra functions available on RDDs of (key, value) pairs through an implicit conversion.
PairRDDFunctions(RDD<Tuple2<K, V>>, ClassTag<K>, ClassTag<V>, Ordering<K>) - Constructor for class org.apache.spark.rdd.PairRDDFunctions
 
ParallelCollectionPartition<T> - Class in org.apache.spark.rdd
 
ParallelCollectionPartition(long, int, Seq<T>, ClassTag<T>) - Constructor for class org.apache.spark.rdd.ParallelCollectionPartition
 
ParallelCollectionRDD<T> - Class in org.apache.spark.rdd
 
ParallelCollectionRDD(SparkContext, Seq<T>, int, Map<Object, Seq<String>>, ClassTag<T>) - Constructor for class org.apache.spark.rdd.ParallelCollectionRDD
 
parallelism() - Method in class org.apache.spark.streaming.flume.FlumePollingInputDStream
 
parallelize(List<T>, int) - Method in class org.apache.spark.api.java.JavaSparkContext
Distribute a local Scala collection to form an RDD.
parallelize(List<T>) - Method in class org.apache.spark.api.java.JavaSparkContext
Distribute a local Scala collection to form an RDD.
parallelize(Seq<T>, int, ClassTag<T>) - Method in class org.apache.spark.SparkContext
Distribute a local Scala collection to form an RDD.
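A minimal sketch of distributing a local collection (assumes an already-constructed SparkContext named sc):
    val rdd = sc.parallelize(Seq(1, 2, 3, 4), 2)       // distribute across 2 partitions
    rdd.map(_ * 2).collect()                           // Array(2, 4, 6, 8)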
parallelizeDoubles(List<Double>, int) - Method in class org.apache.spark.api.java.JavaSparkContext
Distribute a local Scala collection to form an RDD.
parallelizeDoubles(List<Double>) - Method in class org.apache.spark.api.java.JavaSparkContext
Distribute a local Scala collection to form an RDD.
parallelizePairs(List<Tuple2<K, V>>, int) - Method in class org.apache.spark.api.java.JavaSparkContext
Distribute a local Scala collection to form an RDD.
parallelizePairs(List<Tuple2<K, V>>) - Method in class org.apache.spark.api.java.JavaSparkContext
Distribute a local Scala collection to form an RDD.
Param<T> - Class in org.apache.spark.ml.param
:: AlphaComponent :: A param with self-contained documentation and an optional default value.
Param(Params, String, String, Option<T>) - Constructor for class org.apache.spark.ml.param.Param
 
param() - Method in class org.apache.spark.ml.param.ParamPair
 
parameters() - Method in class org.apache.spark.sql.parquet.ParquetRelation2
 
ParamGridBuilder - Class in org.apache.spark.ml.tuning
:: AlphaComponent :: Builder for a param grid used in grid search-based model selection.
ParamGridBuilder() - Constructor for class org.apache.spark.ml.tuning.ParamGridBuilder
 
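A minimal sketch of building a grid of candidate parameter values (assumes the spark.ml LogisticRegression estimator and its regParam/maxIter params):
    import org.apache.spark.ml.classification.LogisticRegression
    import org.apache.spark.ml.tuning.ParamGridBuilder
    val lr = new LogisticRegression()
    val paramGrid = new ParamGridBuilder()
      .addGrid(lr.regParam, Array(0.1, 0.01))
      .addGrid(lr.maxIter, Array(10, 50))
      .build()                                         // one ParamMap per parameter combination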
ParamMap - Class in org.apache.spark.ml.param
:: AlphaComponent :: A param to value map.
ParamMap(Map<Param<Object>, Object>) - Constructor for class org.apache.spark.ml.param.ParamMap
 
ParamMap() - Constructor for class org.apache.spark.ml.param.ParamMap
Creates an empty param map.
paramMap() - Method in interface org.apache.spark.ml.param.Params
Internal param map.
ParamPair<T> - Class in org.apache.spark.ml.param
A param and its value.
ParamPair(Param<T>, T) - Constructor for class org.apache.spark.ml.param.ParamPair
 
Params - Interface in org.apache.spark.ml.param
:: AlphaComponent :: Trait for components that take parameters.
params() - Method in interface org.apache.spark.ml.param.Params
Returns all params.
parent() - Method in class org.apache.spark.ml.classification.LogisticRegressionModel
 
parent() - Method in class org.apache.spark.ml.feature.StandardScalerModel
 
parent() - Method in class org.apache.spark.ml.Model
The parent estimator that produced this model.
parent() - Method in class org.apache.spark.ml.param.Param
 
parent() - Method in class org.apache.spark.ml.PipelineModel
 
parent() - Method in class org.apache.spark.ml.recommendation.ALSModel
 
parent() - Method in class org.apache.spark.ml.regression.LinearRegressionModel
 
parent() - Method in class org.apache.spark.ml.tuning.CrossValidatorModel
 
parent() - Method in class org.apache.spark.mllib.fpm.FPTree.Node
 
parent() - Method in class org.apache.spark.mllib.rdd.SlidingRDD
 
parent() - Method in class org.apache.spark.scheduler.Pool
 
parent() - Method in interface org.apache.spark.scheduler.Schedulable
 
parent() - Method in class org.apache.spark.scheduler.TaskSetManager
 
parent() - Method in class org.apache.spark.sql.parquet.CatalystPrimitiveRowConverter
 
parent() - Method in class org.apache.spark.streaming.ui.StreamingTab
 
ParentClassLoader - Class in org.apache.spark.util
A class loader which makes some protected methods in ClassLoader accessible.
ParentClassLoader(ClassLoader) - Constructor for class org.apache.spark.util.ParentClassLoader
 
parentIndex(int) - Static method in class org.apache.spark.mllib.tree.model.Node
Get the parent index of the given node, or 0 if it is the root.
parentPartition() - Method in class org.apache.spark.rdd.UnionPartition
 
parentRddIndex() - Method in class org.apache.spark.rdd.UnionPartition
 
parentRememberDuration() - Method in class org.apache.spark.streaming.dstream.DStream
 
parentRememberDuration() - Method in class org.apache.spark.streaming.dstream.ReducedWindowedDStream
 
parentRememberDuration() - Method in class org.apache.spark.streaming.dstream.WindowedDStream
 
parents() - Method in class org.apache.spark.rdd.CoalescedRDDPartition
 
parents() - Method in class org.apache.spark.rdd.PartitionerAwareUnionRDDPartition
 
parents() - Method in class org.apache.spark.scheduler.Stage
 
parentsIndices() - Method in class org.apache.spark.rdd.CoalescedRDDPartition
 
parentSplit() - Method in class org.apache.spark.rdd.PartitionPruningRDDPartition
 
PARQUET_BINARY_AS_STRING() - Static method in class org.apache.spark.sql.SQLConf
 
PARQUET_CACHE_METADATA() - Static method in class org.apache.spark.sql.SQLConf
 
PARQUET_COMPRESSION() - Static method in class org.apache.spark.sql.SQLConf
 
PARQUET_FILTER_DATA() - Static method in class org.apache.spark.sql.parquet.ParquetFilters
 
PARQUET_FILTER_PUSHDOWN_ENABLED() - Static method in class org.apache.spark.sql.SQLConf
 
PARQUET_INT96_AS_TIMESTAMP() - Static method in class org.apache.spark.sql.SQLConf
 
PARQUET_USE_DATA_SOURCE_API() - Static method in class org.apache.spark.sql.SQLConf
 
parquetCompressionCodec() - Method in class org.apache.spark.sql.SQLConf
The compression codec used when writing a Parquet file.
ParquetConversion() - Method in interface org.apache.spark.sql.hive.HiveStrategies
 
ParquetConversions() - Method in class org.apache.spark.sql.hive.HiveMetastoreCatalog
 
parquetFile(String...) - Method in class org.apache.spark.sql.SQLContext
Loads a Parquet file, returning the result as a DataFrame.
parquetFile(Seq<String>) - Method in class org.apache.spark.sql.SQLContext
Loads a Parquet file, returning the result as a DataFrame.
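A minimal sketch of loading Parquet data (assumes an existing SQLContext named sqlContext; the path is hypothetical):
    val df = sqlContext.parquetFile("/path/to/data.parquet")
    df.printSchema()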
parquetFilterPushDown() - Method in class org.apache.spark.sql.SQLConf
When true, predicates will be passed to the Parquet record reader when possible.
ParquetFilters - Class in org.apache.spark.sql.parquet
 
ParquetFilters() - Constructor for class org.apache.spark.sql.parquet.ParquetFilters
 
ParquetRelation - Class in org.apache.spark.sql.parquet
Relation that consists of data stored in a Parquet columnar format.
ParquetRelation(String, Option<Configuration>, SQLContext, Seq<Attribute>) - Constructor for class org.apache.spark.sql.parquet.ParquetRelation
 
ParquetRelation2 - Class in org.apache.spark.sql.parquet
An alternative to ParquetRelation that plugs in using the data sources API.
ParquetRelation2(Seq<String>, Map<String, String>, Option<StructType>, Option<PartitionSpec>, SQLContext) - Constructor for class org.apache.spark.sql.parquet.ParquetRelation2
 
ParquetRelation2.PartitionValues - Class in org.apache.spark.sql.parquet
 
ParquetRelation2.PartitionValues(Seq<String>, Seq<Literal>) - Constructor for class org.apache.spark.sql.parquet.ParquetRelation2.PartitionValues
 
ParquetRelation2.PartitionValues$ - Class in org.apache.spark.sql.parquet
 
ParquetRelation2.PartitionValues$() - Constructor for class org.apache.spark.sql.parquet.ParquetRelation2.PartitionValues$
 
parquetSchema() - Method in class org.apache.spark.sql.parquet.ParquetRelation
Schema derived from the Parquet file.
ParquetTableScan - Class in org.apache.spark.sql.parquet
:: DeveloperApi :: Parquet table scan operator.
ParquetTableScan(Seq<Attribute>, ParquetRelation, Seq<Expression>) - Constructor for class org.apache.spark.sql.parquet.ParquetTableScan
 
ParquetTest - Interface in org.apache.spark.sql.parquet
A helper trait that provides convenient facilities for Parquet testing.
ParquetTestData - Class in org.apache.spark.sql.parquet
 
ParquetTestData() - Constructor for class org.apache.spark.sql.parquet.ParquetTestData
 
parquetTsCalendar() - Static method in class org.apache.spark.sql.parquet.CatalystTimestampConverter
 
ParquetTypeInfo - Class in org.apache.spark.sql.parquet
A class representing the Parquet info fields we care about, for passing back to Parquet.
ParquetTypeInfo(PrimitiveType.PrimitiveTypeName, Option<OriginalType>, Option<DecimalMetadata>, Option<Object>) - Constructor for class org.apache.spark.sql.parquet.ParquetTypeInfo
 
ParquetTypesConverter - Class in org.apache.spark.sql.parquet
 
ParquetTypesConverter() - Constructor for class org.apache.spark.sql.parquet.ParquetTypesConverter
 
parquetUseDataSourceApi() - Method in class org.apache.spark.sql.SQLConf
When true, uses the Parquet implementation based on the data sources API.
parse(String) - Static method in class org.apache.spark.mllib.linalg.Vectors
Parses a string resulting from Vector.toString into a Vector.
parse(String) - Static method in class org.apache.spark.mllib.regression.LabeledPoint
Parses a string resulting from LabeledPoint#toString into a LabeledPoint.
parse(String) - Static method in class org.apache.spark.mllib.util.NumericParser
Parses a string into a Double, an Array[Double], or a Seq[Any].
parse(SparkConf, String, Option<SSLOptions>) - Static method in class org.apache.spark.SSLOptions
Resolves SSLOptions settings from a given Spark configuration object at a given namespace.
parseAttrs(Seq<Expression>) - Method in class org.apache.spark.sql.hive.execution.HiveScriptIOSchema
 
parseDdl(String) - Static method in class org.apache.spark.sql.hive.HiveQl
 
parseHostPort(String) - Static method in class org.apache.spark.util.Utils
 
parseNumeric(Object) - Static method in class org.apache.spark.mllib.linalg.Vectors
 
parsePartition(Path, String) - Static method in class org.apache.spark.sql.parquet.ParquetRelation2
Parses a single partition, returns column names and values of each partition column.
parsePartitions(Seq<Path>, String) - Static method in class org.apache.spark.sql.parquet.ParquetRelation2
Given a group of qualified paths, tries to parse them and returns a partition specification.
parseSql(String) - Static method in class org.apache.spark.sql.hive.HiveQl
Returns a LogicalPlan for a given HiveQL string.
parseStream(PortableDataStream) - Method in class org.apache.spark.input.StreamBasedRecordReader
Parse the stream (and close it afterwards) and return the value as type T.
parseStream(PortableDataStream) - Method in class org.apache.spark.input.StreamRecordReader
 
parseType(String) - Method in class org.apache.spark.sql.sources.DDLParser
 
PartialResult<R> - Class in org.apache.spark.partial
 
PartialResult(R, boolean) - Constructor for class org.apache.spark.partial.PartialResult
 
Partition - Interface in org.apache.spark
An identifier for a partition in an RDD.
partition() - Method in class org.apache.spark.sql.hive.execution.InsertIntoHiveTable
 
partition() - Method in class org.apache.spark.sql.hive.InsertIntoHiveTable
 
Partition - Class in org.apache.spark.sql.parquet
 
Partition(Row, String) - Constructor for class org.apache.spark.sql.parquet.Partition
 
partition() - Method in class org.apache.spark.streaming.kafka.KafkaRDDPartition
 
partition() - Method in class org.apache.spark.streaming.kafka.OffsetRange
Kafka partition id
partitionBy(Partitioner) - Method in class org.apache.spark.api.java.JavaPairRDD
Return a copy of the RDD partitioned using the specified partitioner.
partitionBy(PartitionStrategy) - Method in class org.apache.spark.graphx.Graph
Repartitions the edges in the graph according to partitionStrategy.
partitionBy(PartitionStrategy, int) - Method in class org.apache.spark.graphx.Graph
Repartitions the edges in the graph according to partitionStrategy.
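A minimal sketch of repartitioning a graph's edges with a built-in strategy (assumes an existing graph: Graph[VD, ED]):
    import org.apache.spark.graphx.PartitionStrategy
    val repartitioned = graph.partitionBy(PartitionStrategy.EdgePartition2D)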
partitionBy(PartitionStrategy) - Method in class org.apache.spark.graphx.impl.GraphImpl
 
partitionBy(PartitionStrategy, int) - Method in class org.apache.spark.graphx.impl.GraphImpl
 
partitionBy(Partitioner) - Method in class org.apache.spark.rdd.PairRDDFunctions
Return a copy of the RDD partitioned using the specified partitioner.
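A minimal sketch of hash-partitioning a pair RDD by key (assumes an existing SparkContext named sc):
    import org.apache.spark.HashPartitioner
    val pairs = sc.parallelize(Seq(("a", 1), ("b", 2), ("a", 3)))
    val byKey = pairs.partitionBy(new HashPartitioner(4))   // hash-partition by key into 4 partitions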
PartitionCoalescer - Class in org.apache.spark.rdd
Coalesce the partitions of a parent RDD (prev) into fewer partitions, so that each partition of this RDD computes one or more of the parent ones.
PartitionCoalescer(int, RDD<?>, double) - Constructor for class org.apache.spark.rdd.PartitionCoalescer
 
PartitionCoalescer.LocationIterator - Class in org.apache.spark.rdd
 
PartitionCoalescer.LocationIterator(RDD<?>) - Constructor for class org.apache.spark.rdd.PartitionCoalescer.LocationIterator
 
partitionColumns() - Method in class org.apache.spark.sql.parquet.ParquetRelation2
 
partitionColumns() - Method in class org.apache.spark.sql.parquet.PartitionSpec
 
partitioner() - Method in class org.apache.spark.graphx.impl.EdgeRDDImpl
If partitionsRDD already has a partitioner, use it.
partitioner() - Method in class org.apache.spark.graphx.impl.VertexRDDImpl
 
Partitioner - Class in org.apache.spark
An object that defines how the elements in a key-value pair RDD are partitioned by key.
Partitioner() - Constructor for class org.apache.spark.Partitioner
 
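A minimal sketch of a custom Partitioner (the class and parameter names are illustrative, not part of the API):
    import org.apache.spark.Partitioner
    class ModPartitioner(parts: Int) extends Partitioner {
      override def numPartitions: Int = parts
      override def getPartition(key: Any): Int = {
        val h = key.hashCode % parts
        if (h < 0) h + parts else h                        // keep the result non-negative
      }
    }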
partitioner() - Method in class org.apache.spark.rdd.CoGroupedRDD
 
partitioner() - Method in class org.apache.spark.rdd.HadoopRDD.HadoopMapPartitionsWithSplitRDD
 
partitioner() - Method in class org.apache.spark.rdd.MapPartitionsRDD
 
partitioner() - Method in class org.apache.spark.rdd.NewHadoopRDD.NewHadoopMapPartitionsWithSplitRDD
 
partitioner() - Method in class org.apache.spark.rdd.PartitionerAwareUnionRDD
 
partitioner() - Method in class org.apache.spark.rdd.PartitionwiseSampledRDD
 
partitioner() - Method in class org.apache.spark.rdd.RDD
Optionally overridden by subclasses to specify how they are partitioned.
partitioner() - Method in class org.apache.spark.rdd.ShuffledRDD
 
partitioner() - Method in class org.apache.spark.rdd.SubtractedRDD
 
partitioner() - Method in class org.apache.spark.rdd.ZippedPartitionsBaseRDD
 
partitioner() - Method in class org.apache.spark.ShuffleDependency
 
PartitionerAwareUnionRDD<T> - Class in org.apache.spark.rdd
Class representing an RDD that can take multiple RDDs partitioned by the same partitioner and unify them into a single RDD while preserving the partitioner.
PartitionerAwareUnionRDD(SparkContext, Seq<RDD<T>>, ClassTag<T>) - Constructor for class org.apache.spark.rdd.PartitionerAwareUnionRDD
 
PartitionerAwareUnionRDDPartition - Class in org.apache.spark.rdd
Class representing partitions of PartitionerAwareUnionRDD, which maintains the list of corresponding partitions of parent RDDs.
PartitionerAwareUnionRDDPartition(Seq<RDD<?>>, int) - Constructor for class org.apache.spark.rdd.PartitionerAwareUnionRDDPartition
 
partitionFilters() - Method in class org.apache.spark.sql.columnar.InMemoryColumnarTableScan
 
PartitionGroup - Class in org.apache.spark.rdd
 
PartitionGroup(Option<String>) - Constructor for class org.apache.spark.rdd.PartitionGroup
 
partitionId() - Method in class org.apache.spark.scheduler.Task
 
partitionID() - Method in class org.apache.spark.TaskCommitDenied
 
partitionId() - Method in class org.apache.spark.TaskContext
The ID of the RDD partition that is computed by this task.
partitionId() - Method in class org.apache.spark.TaskContextImpl
 
partitioningAttributes() - Method in class org.apache.spark.sql.parquet.ParquetRelation
 
partitionKeys() - Method in class org.apache.spark.sql.hive.MetastoreRelation
 
partitionPruningPred() - Method in class org.apache.spark.sql.hive.execution.HiveTableScan
 
PartitionPruningRDD<T> - Class in org.apache.spark.rdd
:: DeveloperApi :: An RDD used to prune RDD partitions so we can avoid launching tasks on all partitions.
PartitionPruningRDD(RDD<T>, Function1<Object, Object>, ClassTag<T>) - Constructor for class org.apache.spark.rdd.PartitionPruningRDD
 
PartitionPruningRDDPartition - Class in org.apache.spark.rdd
 
PartitionPruningRDDPartition(int, Partition) - Constructor for class org.apache.spark.rdd.PartitionPruningRDDPartition
 
partitions() - Method in interface org.apache.spark.api.java.JavaRDDLike
Set of partitions in this RDD.
partitions() - Method in class org.apache.spark.rdd.PruneDependency
 
partitions() - Method in class org.apache.spark.rdd.RDD
Get the array of partitions of this RDD, taking into account whether the RDD is checkpointed or not.
partitions() - Method in class org.apache.spark.rdd.ZippedPartitionsPartition
 
partitions() - Method in class org.apache.spark.scheduler.ActiveJob
 
partitions() - Method in class org.apache.spark.scheduler.JobSubmitted
 
partitions() - Method in class org.apache.spark.sql.hive.MetastoreRelation
 
partitions() - Method in class org.apache.spark.sql.parquet.ParquetRelation2
 
partitions() - Method in class org.apache.spark.sql.parquet.PartitionSpec
 
partitionSize(int) - Method in class org.apache.spark.graphx.impl.RoutingTablePartition
Returns the number of vertices that will be sent to the specified edge partition.
partitionSpec() - Method in class org.apache.spark.sql.parquet.ParquetRelation2
 
PartitionSpec - Class in org.apache.spark.sql.parquet
 
PartitionSpec(StructType, Seq<Partition>) - Constructor for class org.apache.spark.sql.parquet.PartitionSpec
 
partitionsRDD() - Method in class org.apache.spark.graphx.EdgeRDD
 
partitionsRDD() - Method in class org.apache.spark.graphx.impl.EdgeRDDImpl
 
partitionsRDD() - Method in class org.apache.spark.graphx.impl.VertexRDDImpl
 
partitionsRDD() - Method in class org.apache.spark.graphx.VertexRDD
 
partitionStatistics() - Method in class org.apache.spark.sql.columnar.InMemoryRelation
 
PartitionStatistics - Class in org.apache.spark.sql.columnar
 
PartitionStatistics(Seq<Attribute>) - Constructor for class org.apache.spark.sql.columnar.PartitionStatistics
 
PartitionStrategy - Interface in org.apache.spark.graphx
Represents the way edges are assigned to edge partitions based on their source and destination vertex IDs.
PartitionStrategy.CanonicalRandomVertexCut$ - Class in org.apache.spark.graphx
Assigns edges to partitions by hashing the source and destination vertex IDs in a canonical direction, resulting in a random vertex cut that colocates all edges between two vertices, regardless of direction.
PartitionStrategy.CanonicalRandomVertexCut$() - Constructor for class org.apache.spark.graphx.PartitionStrategy.CanonicalRandomVertexCut$
 
PartitionStrategy.EdgePartition1D$ - Class in org.apache.spark.graphx
Assigns edges to partitions using only the source vertex ID, colocating edges with the same source.
PartitionStrategy.EdgePartition1D$() - Constructor for class org.apache.spark.graphx.PartitionStrategy.EdgePartition1D$
 
PartitionStrategy.EdgePartition2D$ - Class in org.apache.spark.graphx
Assigns edges to partitions using a 2D partitioning of the sparse edge adjacency matrix, guaranteeing a 2 * sqrt(numParts) - 1 bound on vertex replication.
PartitionStrategy.EdgePartition2D$() - Constructor for class org.apache.spark.graphx.PartitionStrategy.EdgePartition2D$
 
PartitionStrategy.RandomVertexCut$ - Class in org.apache.spark.graphx
Assigns edges to partitions by hashing the source and destination vertex IDs, resulting in a random vertex cut that colocates all same-direction edges between two vertices.
PartitionStrategy.RandomVertexCut$() - Constructor for class org.apache.spark.graphx.PartitionStrategy.RandomVertexCut$
 
partitionToOps(VertexPartition<VD>, ClassTag<VD>) - Static method in class org.apache.spark.graphx.impl.VertexPartition
Implicit conversion to allow invoking VertexPartitionBase operations directly on a VertexPartition.
partitionValues() - Method in class org.apache.spark.rdd.ZippedPartitionsPartition
 
PartitionwiseSampledRDD<T,U> - Class in org.apache.spark.rdd
An RDD sampled from its parent RDD partition-wise.
PartitionwiseSampledRDD(RDD<T>, RandomSampler<T, U>, boolean, long, ClassTag<T>, ClassTag<U>) - Constructor for class org.apache.spark.rdd.PartitionwiseSampledRDD
 
PartitionwiseSampledRDDPartition - Class in org.apache.spark.rdd
 
PartitionwiseSampledRDDPartition(Partition, long) - Constructor for class org.apache.spark.rdd.PartitionwiseSampledRDDPartition
 
parts() - Method in class org.apache.spark.sql.jdbc.JDBCRelation
 
PassThrough - Class in org.apache.spark.sql.columnar.compression
 
PassThrough() - Constructor for class org.apache.spark.sql.columnar.compression.PassThrough
 
PassThrough.Decoder<T extends org.apache.spark.sql.types.NativeType> - Class in org.apache.spark.sql.columnar.compression
 
PassThrough.Decoder(ByteBuffer, NativeColumnType<T>) - Constructor for class org.apache.spark.sql.columnar.compression.PassThrough.Decoder
 
PassThrough.Encoder<T extends org.apache.spark.sql.types.NativeType> - Class in org.apache.spark.sql.columnar.compression
 
PassThrough.Encoder(NativeColumnType<T>) - Constructor for class org.apache.spark.sql.columnar.compression.PassThrough.Encoder
 
path() - Method in class org.apache.spark.scheduler.InputFormatInfo
 
path() - Method in class org.apache.spark.scheduler.SplitInfo
 
path() - Method in class org.apache.spark.sql.hive.execution.AddFile
 
path() - Method in class org.apache.spark.sql.hive.execution.AddJar
 
path() - Method in class org.apache.spark.sql.json.JSONRelation
 
path() - Method in class org.apache.spark.sql.parquet.ParquetRelation
 
path() - Method in class org.apache.spark.sql.parquet.Partition
 
path() - Method in class org.apache.spark.streaming.util.WriteAheadLogFileSegment
 
path() - Method in class org.apache.spark.streaming.util.WriteAheadLogManager.LogInfo
 
paths() - Method in class org.apache.spark.sql.parquet.ParquetRelation2
 
pdf(Vector) - Method in class org.apache.spark.mllib.stat.distribution.MultivariateGaussian
Returns the density of this multivariate Gaussian at the given point x.
pdf(Vector<Object>) - Method in class org.apache.spark.mllib.stat.distribution.MultivariateGaussian
Returns the density of this multivariate Gaussian at the given point x.
PEARSON() - Static method in class org.apache.spark.mllib.stat.test.ChiSqTest
 
PearsonCorrelation - Class in org.apache.spark.mllib.stat.correlation
Compute Pearson correlation for two RDDs of the type RDD[Double] or the correlation matrix for an RDD of the type RDD[Vector].
PearsonCorrelation() - Constructor for class org.apache.spark.mllib.stat.correlation.PearsonCorrelation
 
pendingStages() - Method in class org.apache.spark.ui.jobs.JobProgressListener
 
pendingTasks() - Method in class org.apache.spark.scheduler.Stage
 
pendingTasksWithNoPrefs() - Method in class org.apache.spark.scheduler.TaskSetManager
 
pendingTimes() - Method in class org.apache.spark.streaming.Checkpoint
 
percentiles() - Static method in class org.apache.spark.scheduler.StatsReportListener
 
percentilesHeader() - Static method in class org.apache.spark.scheduler.StatsReportListener
 
PeriodicGraphCheckpointer<VD,ED> - Class in org.apache.spark.mllib.impl
This class helps with persisting and checkpointing Graphs.
PeriodicGraphCheckpointer(Graph<VD, ED>, int) - Constructor for class org.apache.spark.mllib.impl.PeriodicGraphCheckpointer
 
persist(StorageLevel) - Method in class org.apache.spark.api.java.JavaDoubleRDD
Set this RDD's storage level to persist its values across operations after the first time it is computed.
persist(StorageLevel) - Method in class org.apache.spark.api.java.JavaPairRDD
Set this RDD's storage level to persist its values across operations after the first time it is computed.
persist(StorageLevel) - Method in class org.apache.spark.api.java.JavaRDD
Set this RDD's storage level to persist its values across operations after the first time it is computed.
persist(StorageLevel) - Method in class org.apache.spark.graphx.Graph
Caches the vertices and edges associated with this graph at the specified storage level, ignoring any target storage levels previously set.
persist(StorageLevel) - Method in class org.apache.spark.graphx.impl.EdgeRDDImpl
Persists the edge partitions at the specified storage level, ignoring any existing target storage level.
persist(StorageLevel) - Method in class org.apache.spark.graphx.impl.GraphImpl
 
persist(StorageLevel) - Method in class org.apache.spark.graphx.impl.VertexRDDImpl
Persists the vertex partitions at the specified storage level, ignoring any existing target storage level.
persist(StorageLevel) - Method in class org.apache.spark.mllib.linalg.distributed.BlockMatrix
Persists the underlying RDD with the specified storage level.
persist(StorageLevel) - Method in class org.apache.spark.rdd.HadoopRDD
 
persist(StorageLevel) - Method in class org.apache.spark.rdd.NewHadoopRDD
 
persist(StorageLevel) - Method in class org.apache.spark.rdd.RDD
Set this RDD's storage level to persist its values across operations after the first time it is computed.
persist() - Method in class org.apache.spark.rdd.RDD
Persist this RDD with the default storage level (`MEMORY_ONLY`).
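A minimal sketch of persisting at an explicit storage level (assumes an existing rdd: RDD[_]):
    import org.apache.spark.storage.StorageLevel
    rdd.persist(StorageLevel.MEMORY_AND_DISK)             // keep partitions in memory, spill to disk
    rdd.count()                                            // the first action materializes the cached data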
persist() - Method in class org.apache.spark.sql.DataFrame
 
persist(StorageLevel) - Method in class org.apache.spark.sql.DataFrame
 
persist() - Method in interface org.apache.spark.sql.RDDApi
 
persist(StorageLevel) - Method in interface org.apache.spark.sql.RDDApi
 
persist() - Method in class org.apache.spark.streaming.api.java.JavaDStream
Persist RDDs of this DStream with the default storage level (MEMORY_ONLY_SER)
persist(StorageLevel) - Method in class org.apache.spark.streaming.api.java.JavaDStream
Persist the RDDs of this DStream with the given storage level
persist() - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
Persist RDDs of this DStream with the default storage level (MEMORY_ONLY_SER)
persist(StorageLevel) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
Persist the RDDs of this DStream with the given storage level
persist(StorageLevel) - Method in class org.apache.spark.streaming.dstream.DStream
Persist the RDDs of this DStream with the given storage level
persist() - Method in class org.apache.spark.streaming.dstream.DStream
Persist RDDs of this DStream with the default storage level (MEMORY_ONLY_SER)
persist(StorageLevel) - Method in class org.apache.spark.streaming.dstream.ReducedWindowedDStream
 
persist(StorageLevel) - Method in class org.apache.spark.streaming.dstream.WindowedDStream
 
persistentRdds() - Method in class org.apache.spark.SparkContext
 
persistRDD(RDD<?>) - Method in class org.apache.spark.SparkContext
Register an RDD to be persisted in memory and/or disk storage
pi() - Method in class org.apache.spark.mllib.classification.NaiveBayesModel
 
pickBin(Partition) - Method in class org.apache.spark.rdd.PartitionCoalescer
Takes a parent RDD partition and decides which partition group to put it in. Takes locality into account, but also uses the power-of-two-choices technique to load balance. It strikes a balance between the two using the balanceSlack variable.
pickRandomVertex() - Method in class org.apache.spark.graphx.GraphOps
Picks a random vertex from the graph and returns its ID.
pipe(String) - Method in interface org.apache.spark.api.java.JavaRDDLike
Return an RDD created by piping elements to a forked external process.
pipe(List<String>) - Method in interface org.apache.spark.api.java.JavaRDDLike
Return an RDD created by piping elements to a forked external process.
pipe(List<String>, Map<String, String>) - Method in interface org.apache.spark.api.java.JavaRDDLike
Return an RDD created by piping elements to a forked external process.
pipe(String) - Method in class org.apache.spark.rdd.RDD
Return an RDD created by piping elements to a forked external process.
pipe(String, Map<String, String>) - Method in class org.apache.spark.rdd.RDD
Return an RDD created by piping elements to a forked external process.
pipe(Seq<String>, Map<String, String>, Function1<Function1<String, BoxedUnit>, BoxedUnit>, Function2<T, Function1<String, BoxedUnit>, BoxedUnit>, boolean) - Method in class org.apache.spark.rdd.RDD
Return an RDD created by piping elements to a forked external process.
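A minimal sketch of piping partition contents through an external command (assumes an existing SparkContext named sc and the tr command available on each worker):
    val shouted = sc.parallelize(Seq("spark", "pipe")).pipe("tr a-z A-Z").collect()
    // Array("SPARK", "PIPE")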
PipedRDD<T> - Class in org.apache.spark.rdd
An RDD that pipes the contents of each parent partition through an external command (printing them one per line) and returns the output as a collection of strings.
PipedRDD(RDD<T>, Seq<String>, Map<String, String>, Function1<Function1<String, BoxedUnit>, BoxedUnit>, Function2<T, Function1<String, BoxedUnit>, BoxedUnit>, boolean, ClassTag<T>) - Constructor for class org.apache.spark.rdd.PipedRDD
 
PipedRDD(RDD<T>, String, Map<String, String>, Function1<Function1<String, BoxedUnit>, BoxedUnit>, Function2<T, Function1<String, BoxedUnit>, BoxedUnit>, boolean, ClassTag<T>) - Constructor for class org.apache.spark.rdd.PipedRDD
 
PipedRDD.NotEqualsFileNameFilter - Class in org.apache.spark.rdd
A FilenameFilter that accepts anything that isn't equal to the name passed in.
PipedRDD.NotEqualsFileNameFilter(String) - Constructor for class org.apache.spark.rdd.PipedRDD.NotEqualsFileNameFilter
 
Pipeline - Class in org.apache.spark.ml
:: AlphaComponent :: A simple pipeline, which acts as an estimator.
Pipeline() - Constructor for class org.apache.spark.ml.Pipeline
 
PipelineModel - Class in org.apache.spark.ml
:: AlphaComponent :: Represents a compiled pipeline.
PipelineModel(Pipeline, ParamMap, Transformer[]) - Constructor for class org.apache.spark.ml.PipelineModel
 
PipelineStage - Class in org.apache.spark.ml
:: AlphaComponent :: A stage in a pipeline, either an Estimator or a Transformer.
PipelineStage() - Constructor for class org.apache.spark.ml.PipelineStage
 
plan() - Method in class org.apache.spark.sql.CachedData
 
PluggableInputDStream<T> - Class in org.apache.spark.streaming.dstream
 
PluggableInputDStream(StreamingContext, Receiver<T>, ClassTag<T>) - Constructor for class org.apache.spark.streaming.dstream.PluggableInputDStream
 
plus(Object) - Method in class org.apache.spark.sql.Column
Sum of this expression and another expression.
plus(Duration) - Method in class org.apache.spark.streaming.Duration
 
plus(Duration) - Method in class org.apache.spark.streaming.Time
 
plusDot(Vector, Vector) - Method in class org.apache.spark.util.Vector
Return (this + plus) dot other, without creating any intermediate storage.
point() - Method in class org.apache.spark.mllib.feature.VocabWord
 
pointCost(TraversableOnce<VectorWithNorm>, VectorWithNorm) - Static method in class org.apache.spark.mllib.clustering.KMeans
Returns the K-means cost of a given point against the given cluster centers.
POINTS() - Static method in class org.apache.spark.mllib.clustering.StreamingKMeans
 
PoissonBounds - Class in org.apache.spark.util.random
Utility functions that help us determine bounds on adjusted sampling rate to guarantee exact sample sizes with high confidence when sampling with replacement.
PoissonBounds() - Constructor for class org.apache.spark.util.random.PoissonBounds
 
PoissonGenerator - Class in org.apache.spark.mllib.random
:: DeveloperApi :: Generates i.i.d.
PoissonGenerator(double) - Constructor for class org.apache.spark.mllib.random.PoissonGenerator
 
poissonJavaRDD(JavaSparkContext, double, long, int, long) - Static method in class org.apache.spark.mllib.random.RandomRDDs
poissonJavaRDD(JavaSparkContext, double, long, int) - Static method in class org.apache.spark.mllib.random.RandomRDDs
poissonJavaRDD(JavaSparkContext, double, long) - Static method in class org.apache.spark.mllib.random.RandomRDDs
poissonJavaVectorRDD(JavaSparkContext, double, long, int, int, long) - Static method in class org.apache.spark.mllib.random.RandomRDDs
poissonJavaVectorRDD(JavaSparkContext, double, long, int, int) - Static method in class org.apache.spark.mllib.random.RandomRDDs
poissonJavaVectorRDD(JavaSparkContext, double, long, int) - Static method in class org.apache.spark.mllib.random.RandomRDDs
poissonRDD(SparkContext, double, long, int, long) - Static method in class org.apache.spark.mllib.random.RandomRDDs
Generates an RDD comprised of i.i.d. samples from the Poisson distribution with the input mean.
PoissonSampler<T> - Class in org.apache.spark.util.random
:: DeveloperApi :: A sampler for sampling with replacement, based on values drawn from Poisson distribution.
PoissonSampler(double, ClassTag<T>) - Constructor for class org.apache.spark.util.random.PoissonSampler
 
poissonVectorRDD(SparkContext, double, long, int, int, long) - Static method in class org.apache.spark.mllib.random.RandomRDDs
Generates an RDD[Vector] with vectors containing i.i.d. samples drawn from the Poisson distribution with the input mean.
pollDir() - Method in class org.apache.spark.metrics.sink.CsvSink
 
pollPeriod() - Method in class org.apache.spark.metrics.sink.ConsoleSink
 
pollPeriod() - Method in class org.apache.spark.metrics.sink.CsvSink
 
pollPeriod() - Method in class org.apache.spark.metrics.sink.GraphiteSink
 
pollUnit() - Method in class org.apache.spark.metrics.sink.ConsoleSink
 
pollUnit() - Method in class org.apache.spark.metrics.sink.CsvSink
 
pollUnit() - Method in class org.apache.spark.metrics.sink.GraphiteSink
 
Pool - Class in org.apache.spark.scheduler
A Schedulable entity that represents a collection of Pools or TaskSetManagers.
Pool(String, Enumeration.Value, int, int) - Constructor for class org.apache.spark.scheduler.Pool
 
POOL_NAME_PROPERTY() - Method in class org.apache.spark.scheduler.FairSchedulableBuilder
 
poolName() - Method in class org.apache.spark.scheduler.Pool
 
PoolPage - Class in org.apache.spark.ui.jobs
Page showing specific pool details
PoolPage(StagesTab) - Constructor for class org.apache.spark.ui.jobs.PoolPage
 
POOLS_PROPERTY() - Method in class org.apache.spark.scheduler.FairSchedulableBuilder
 
PoolTable - Class in org.apache.spark.ui.jobs
Table showing list of pools
PoolTable(Seq<Schedulable>, StagesTab) - Constructor for class org.apache.spark.ui.jobs.PoolTable
 
poolToActiveStages() - Method in class org.apache.spark.ui.jobs.JobProgressListener
 
port() - Method in class org.apache.spark.metrics.sink.GraphiteSink
 
port() - Method in class org.apache.spark.storage.BlockManagerId
 
port() - Method in class org.apache.spark.streaming.kafka.Broker
Broker's port
port() - Method in class org.apache.spark.streaming.kafka.KafkaCluster.LeaderOffset
 
port() - Method in class org.apache.spark.streaming.kafka.KafkaRDDPartition
 
PortableDataStream - Class in org.apache.spark.input
A class that allows DataStreams to be serialized and moved around by not creating them until they need to be read
PortableDataStream(CombineFileSplit, TaskAttemptContext, Integer) - Constructor for class org.apache.spark.input.PortableDataStream
 
portMaxRetries(SparkConf) - Static method in class org.apache.spark.util.Utils
Maximum number of retries when binding to a port before giving up.
pos() - Method in interface org.apache.spark.sql.columnar.NullableColumnAccessor
 
pos() - Method in interface org.apache.spark.sql.columnar.NullableColumnBuilder
 
post(E) - Method in class org.apache.spark.util.AsynchronousListenerBus
 
post(E) - Method in class org.apache.spark.util.EventLoop
Put the event into the event queue.
PostgresQuirks - Class in org.apache.spark.sql.jdbc
 
PostgresQuirks() - Constructor for class org.apache.spark.sql.jdbc.PostgresQuirks
 
postStartHook() - Method in interface org.apache.spark.scheduler.TaskScheduler
 
postStartHook() - Method in class org.apache.spark.scheduler.TaskSchedulerImpl
 
postToAll(E) - Method in interface org.apache.spark.util.ListenerBus
Post the event to all registered listeners.
powerIter(Graph<Object, Object>, int) - Static method in class org.apache.spark.mllib.clustering.PowerIterationClustering
Runs power iteration.
PowerIterationClustering - Class in org.apache.spark.mllib.clustering
:: Experimental ::
PowerIterationClustering(int, int, String) - Constructor for class org.apache.spark.mllib.clustering.PowerIterationClustering
 
PowerIterationClustering() - Constructor for class org.apache.spark.mllib.clustering.PowerIterationClustering
Constructs a PIC instance with default parameters: {k: 2, maxIterations: 100, initMode: "random"}.
PowerIterationClustering.Assignment - Class in org.apache.spark.mllib.clustering
:: Experimental :: Cluster assignment.
PowerIterationClustering.Assignment(long, int) - Constructor for class org.apache.spark.mllib.clustering.PowerIterationClustering.Assignment
 
PowerIterationClusteringModel - Class in org.apache.spark.mllib.clustering
:: Experimental ::
PowerIterationClusteringModel(int, RDD<PowerIterationClustering.Assignment>) - Constructor for class org.apache.spark.mllib.clustering.PowerIterationClusteringModel
 
pr() - Method in class org.apache.spark.mllib.evaluation.BinaryClassificationMetrics
Returns the precision-recall curve, which is an RDD of (recall, precision), NOT (precision, recall), with (0.0, 1.0) prepended to it.
Precision - Class in org.apache.spark.mllib.evaluation.binary
Precision.
Precision() - Constructor for class org.apache.spark.mllib.evaluation.binary.Precision
 
precision(double) - Method in class org.apache.spark.mllib.evaluation.MulticlassMetrics
Returns precision for a given label (category)
precision() - Method in class org.apache.spark.mllib.evaluation.MulticlassMetrics
Returns precision
precision() - Method in class org.apache.spark.mllib.evaluation.MultilabelMetrics
Returns document-based precision averaged by the number of documents
precision(double) - Method in class org.apache.spark.mllib.evaluation.MultilabelMetrics
Returns precision for a given label (category)
precisionAt(int) - Method in class org.apache.spark.mllib.evaluation.RankingMetrics
Compute the average precision of all the queries, truncated at ranking position k.
precisionByThreshold() - Method in class org.apache.spark.mllib.evaluation.BinaryClassificationMetrics
Returns the (threshold, precision) curve.
predicates() - Method in class org.apache.spark.sql.columnar.InMemoryColumnarTableScan
 
predict(RDD<Vector>) - Method in interface org.apache.spark.mllib.classification.ClassificationModel
Predict values for the given data set using the model trained.
predict(Vector) - Method in interface org.apache.spark.mllib.classification.ClassificationModel
Predict values for a single data point using the model trained.
predict(JavaRDD<Vector>) - Method in interface org.apache.spark.mllib.classification.ClassificationModel
Predict values for examples stored in a JavaRDD.
predict(RDD<Vector>) - Method in class org.apache.spark.mllib.classification.NaiveBayesModel
 
predict(Vector) - Method in class org.apache.spark.mllib.classification.NaiveBayesModel
 
predict(RDD<Vector>) - Method in class org.apache.spark.mllib.clustering.GaussianMixtureModel
Maps given points to their cluster indices.
predict(Vector) - Method in class org.apache.spark.mllib.clustering.KMeansModel
Returns the cluster index that a given point belongs to.
predict(RDD<Vector>) - Method in class org.apache.spark.mllib.clustering.KMeansModel
Maps given points to their cluster indices.
predict(JavaRDD<Vector>) - Method in class org.apache.spark.mllib.clustering.KMeansModel
Maps given points to their cluster indices.
predict(int, int) - Method in class org.apache.spark.mllib.recommendation.MatrixFactorizationModel
Predict the rating of one user for one product.
predict(RDD<Tuple2<Object, Object>>) - Method in class org.apache.spark.mllib.recommendation.MatrixFactorizationModel
Predict the rating of many users for many products.
predict(JavaPairRDD<Integer, Integer>) - Method in class org.apache.spark.mllib.recommendation.MatrixFactorizationModel
Java-friendly version of MatrixFactorizationModel.predict.
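A minimal sketch of scoring with a trained factorization model (assumes an existing model: MatrixFactorizationModel and SparkContext sc):
    val one  = model.predict(42, 17)                                 // predicted rating of product 17 by user 42
    val many = model.predict(sc.parallelize(Seq((42, 17), (1, 3))))  // RDD[Rating] for many (user, product) pairs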
predict(RDD<Vector>) - Method in class org.apache.spark.mllib.regression.GeneralizedLinearModel
Predict values for the given data set using the model trained.
predict(Vector) - Method in class org.apache.spark.mllib.regression.GeneralizedLinearModel
Predict values for a single data point using the model trained.
predict(RDD<Object>) - Method in class org.apache.spark.mllib.regression.IsotonicRegressionModel
Predict labels for provided features.
predict(JavaDoubleRDD) - Method in class org.apache.spark.mllib.regression.IsotonicRegressionModel
Predict labels for provided features.
predict(double) - Method in class org.apache.spark.mllib.regression.IsotonicRegressionModel
Predict a single label.
predict(RDD<Vector>) - Method in interface org.apache.spark.mllib.regression.RegressionModel
Predict values for the given data set using the model trained.
predict(Vector) - Method in interface org.apache.spark.mllib.regression.RegressionModel
Predict values for a single data point using the model trained.
predict(JavaRDD<Vector>) - Method in interface org.apache.spark.mllib.regression.RegressionModel
Predict values for examples stored in a JavaRDD.
predict() - Method in class org.apache.spark.mllib.tree.impurity.EntropyCalculator
Prediction which should be made based on the sufficient statistics.
predict() - Method in class org.apache.spark.mllib.tree.impurity.GiniCalculator
Prediction which should be made based on the sufficient statistics.
predict() - Method in class org.apache.spark.mllib.tree.impurity.ImpurityCalculator
Prediction which should be made based on the sufficient statistics.
predict() - Method in class org.apache.spark.mllib.tree.impurity.VarianceCalculator
Prediction which should be made based on the sufficient statistics.
predict(Vector) - Method in class org.apache.spark.mllib.tree.model.DecisionTreeModel
Predict values for a single data point using the model trained.
predict(RDD<Vector>) - Method in class org.apache.spark.mllib.tree.model.DecisionTreeModel
Predict values for the given data set using the model trained.
predict(JavaRDD<Vector>) - Method in class org.apache.spark.mllib.tree.model.DecisionTreeModel
Predict values for the given data set using the model trained.
predict() - Method in class org.apache.spark.mllib.tree.model.DecisionTreeModel.SaveLoadV1_0$.NodeData
 
predict() - Method in class org.apache.spark.mllib.tree.model.DecisionTreeModel.SaveLoadV1_0$.PredictData
 
predict() - Method in class org.apache.spark.mllib.tree.model.Node
 
predict(Vector) - Method in class org.apache.spark.mllib.tree.model.Node
Predicted value if the node is not a leaf.
Predict - Class in org.apache.spark.mllib.tree.model
Predicted value for a node
Predict(double, double) - Constructor for class org.apache.spark.mllib.tree.model.Predict
 
predict() - Method in class org.apache.spark.mllib.tree.model.Predict
 
predict(Vector) - Method in class org.apache.spark.mllib.tree.model.TreeEnsembleModel
Predict values for a single data point using the model trained.
predict(RDD<Vector>) - Method in class org.apache.spark.mllib.tree.model.TreeEnsembleModel
Predict values for the given data set.
predict(JavaRDD<Vector>) - Method in class org.apache.spark.mllib.tree.model.TreeEnsembleModel
predictionCol() - Method in interface org.apache.spark.ml.param.HasPredictionCol
param for prediction column name
PredictionModel<FeaturesType,M extends PredictionModel<FeaturesType,M>> - Class in org.apache.spark.ml.impl.estimator
:: AlphaComponent ::
PredictionModel() - Constructor for class org.apache.spark.ml.impl.estimator.PredictionModel
 
predictions() - Method in class org.apache.spark.mllib.regression.IsotonicRegressionModel
 
predictOn(DStream<Vector>) - Method in class org.apache.spark.mllib.clustering.StreamingKMeans
Use the clustering model to make predictions on batches of data from a DStream.
predictOn(DStream<Vector>) - Method in class org.apache.spark.mllib.regression.StreamingLinearAlgorithm
Use the model to make predictions on batches of data from a DStream
predictOn(JavaDStream<Vector>) - Method in class org.apache.spark.mllib.regression.StreamingLinearAlgorithm
Java-friendly version of `predictOn`.
predictOnValues(DStream<Tuple2<K, Vector>>, ClassTag<K>) - Method in class org.apache.spark.mllib.clustering.StreamingKMeans
Use the model to make predictions on the values of a DStream and carry over its keys.
predictOnValues(DStream<Tuple2<K, Vector>>, ClassTag<K>) - Method in class org.apache.spark.mllib.regression.StreamingLinearAlgorithm
Use the model to make predictions on the values of a DStream and carry over its keys.
predictOnValues(JavaPairDStream<K, Vector>) - Method in class org.apache.spark.mllib.regression.StreamingLinearAlgorithm
Java-friendly version of `predictOnValues`.
Predictor<FeaturesType,Learner extends Predictor<FeaturesType,Learner,M>,M extends PredictionModel<FeaturesType,M>> - Class in org.apache.spark.ml.impl.estimator
:: AlphaComponent ::
Predictor() - Constructor for class org.apache.spark.ml.impl.estimator.Predictor
 
PredictorParams - Interface in org.apache.spark.ml.impl.estimator
:: DeveloperApi ::
predictSoft(RDD<Vector>) - Method in class org.apache.spark.mllib.clustering.GaussianMixtureModel
Given the input vectors, return the membership value of each vector to all mixture components.
preferredLocation() - Method in class org.apache.spark.rdd.CoalescedRDDPartition
 
preferredLocation() - Method in class org.apache.spark.streaming.flume.FlumeReceiver
 
preferredLocation() - Method in class org.apache.spark.streaming.receiver.Receiver
Override this to specify a preferred location (hostname).
preferredLocations(Partition) - Method in class org.apache.spark.rdd.RDD
Get the preferred locations of a partition, taking into account whether the RDD is checkpointed.
preferredLocations() - Method in class org.apache.spark.rdd.UnionPartition
 
preferredLocations() - Method in class org.apache.spark.rdd.ZippedPartitionsPartition
 
preferredLocations() - Method in class org.apache.spark.scheduler.ResultTask
 
preferredLocations() - Method in class org.apache.spark.scheduler.ShuffleMapTask
 
preferredLocations() - Method in class org.apache.spark.scheduler.Task
 
preferredNodeLocationData() - Method in class org.apache.spark.SparkContext
 
prefix() - Method in class org.apache.spark.metrics.sink.GraphiteSink
 
PREFIX() - Static method in class org.apache.spark.streaming.Checkpoint
 
prefix() - Method in class org.apache.spark.ui.WebUIPage
 
prefix() - Method in class org.apache.spark.ui.WebUITab
 
prefLoc() - Method in class org.apache.spark.rdd.PartitionGroup
 
pregel(A, int, EdgeDirection, Function3<Object, VD, A, VD>, Function1<EdgeTriplet<VD, ED>, Iterator<Tuple2<Object, A>>>, Function2<A, A, A>, ClassTag<A>) - Method in class org.apache.spark.graphx.GraphOps
Execute a Pregel-like iterative vertex-parallel abstraction.
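A minimal sketch of single-source shortest paths in the Pregel style (assumes graph: Graph[Double, Double] whose vertex attributes hold the current distance, infinite everywhere except the source):
    val sssp = graph.pregel(Double.PositiveInfinity)(
      (id, dist, newDist) => math.min(dist, newDist),                 // vertex program
      triplet =>                                                      // send shorter distances downstream
        if (triplet.srcAttr + triplet.attr < triplet.dstAttr)
          Iterator((triplet.dstId, triplet.srcAttr + triplet.attr))
        else Iterator.empty,
      (a, b) => math.min(a, b)                                        // merge incoming messages
    )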
Pregel - Class in org.apache.spark.graphx
Implements a Pregel-like bulk-synchronous message-passing API.
Pregel() - Constructor for class org.apache.spark.graphx.Pregel
 
PreInsertCastAndRename - Class in org.apache.spark.sql.sources
A rule to do pre-insert data type casting and field renaming.
PreInsertCastAndRename() - Constructor for class org.apache.spark.sql.sources.PreInsertCastAndRename
 
PreInsertionCasts() - Method in class org.apache.spark.sql.hive.HiveMetastoreCatalog
 
prepare(int) - Method in class org.apache.spark.sql.hive.DeferredObjectAdapter
 
prepareForRead(Configuration, Map<String, String>, MessageType, ReadSupport.ReadContext) - Method in class org.apache.spark.sql.parquet.RowReadSupport
 
prepareForWrite(RecordConsumer) - Method in class org.apache.spark.sql.parquet.RowWriteSupport
 
prepareForWrite(RecordConsumer) - Method in class org.apache.spark.sql.parquet.TestGroupWriteSupport
 
prepareWritable(Writable) - Static method in class org.apache.spark.sql.hive.HiveShim
 
prependBaseUri(String, String) - Static method in class org.apache.spark.ui.UIUtils
 
preSetup() - Method in class org.apache.spark.SparkHadoopWriter
 
preStart() - Method in class org.apache.spark.scheduler.cluster.CoarseGrainedSchedulerBackend.DriverActor
 
preStart() - Method in class org.apache.spark.storage.BlockManagerMasterActor
 
preStart() - Method in class org.apache.spark.streaming.zeromq.ZeroMQReceiver
 
prettyPrint() - Method in class org.apache.spark.streaming.Duration
 
prev() - Method in class org.apache.spark.mllib.rdd.SlidingRDDPartition
 
prev() - Method in class org.apache.spark.rdd.CoalescedRDD
 
prev() - Method in class org.apache.spark.rdd.PartitionwiseSampledRDDPartition
 
prev() - Method in class org.apache.spark.rdd.SampledRDDPartition
 
prev() - Method in class org.apache.spark.rdd.ShuffledRDD
 
prev() - Method in class org.apache.spark.rdd.ZippedWithIndexRDDPartition
 
prevHandler() - Method in class org.apache.spark.util.SignalLoggerHandler
 
PreWriteCheck - Class in org.apache.spark.sql.sources
A rule to do various checks before inserting into or writing to a data source table.
PreWriteCheck(Catalog) - Constructor for class org.apache.spark.sql.sources.PreWriteCheck
 
primitiveType() - Method in class org.apache.spark.sql.parquet.ParquetTypeInfo
 
print() - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
Print the first ten elements of each RDD generated in this DStream.
print(int) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
Print the first num elements of each RDD generated in this DStream.
print() - Method in class org.apache.spark.streaming.dstream.DStream
Print the first ten elements of each RDD generated in this DStream.
print(int) - Method in class org.apache.spark.streaming.dstream.DStream
Print the first num elements of each RDD generated in this DStream.
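A minimal sketch of printing a DStream's output each batch (assumes an existing ssc: StreamingContext and lines: DStream[String]):
    val words = lines.flatMap(_.split(" "))
    words.print()                                        // first ten elements of each generated RDD
    ssc.start()
    ssc.awaitTermination()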
printSchema() - Method in class org.apache.spark.sql.DataFrame
Prints the schema to the console in a nice tree format.
printStats() - Method in class org.apache.spark.streaming.scheduler.StatsReportListener
 
prioritizeContainers(HashMap<K, ArrayBuffer<T>>) - Static method in class org.apache.spark.scheduler.TaskSchedulerImpl
Used to balance containers across hosts.
priority() - Method in class org.apache.spark.scheduler.Pool
 
priority() - Method in interface org.apache.spark.scheduler.Schedulable
 
priority() - Method in class org.apache.spark.scheduler.TaskSet
 
priority() - Method in class org.apache.spark.scheduler.TaskSetManager
 
prob(double) - Method in class org.apache.spark.mllib.tree.impurity.EntropyCalculator
Probability of the label given by predict.
prob(double) - Method in class org.apache.spark.mllib.tree.impurity.GiniCalculator
Probability of the label given by predict.
prob(double) - Method in class org.apache.spark.mllib.tree.impurity.ImpurityCalculator
Probability of the label given by predict, or -1 if no probability is available.
prob() - Method in class org.apache.spark.mllib.tree.model.DecisionTreeModel.SaveLoadV1_0$.PredictData
 
prob() - Method in class org.apache.spark.mllib.tree.model.Predict
 
ProbabilisticClassificationModel<FeaturesType,M extends ProbabilisticClassificationModel<FeaturesType,M>> - Class in org.apache.spark.ml.classification
:: AlphaComponent ::
ProbabilisticClassificationModel() - Constructor for class org.apache.spark.ml.classification.ProbabilisticClassificationModel
 
ProbabilisticClassifier<FeaturesType,E extends ProbabilisticClassifier<FeaturesType,E,M>,M extends ProbabilisticClassificationModel<FeaturesType,M>> - Class in org.apache.spark.ml.classification
:: AlphaComponent ::
ProbabilisticClassifier() - Constructor for class org.apache.spark.ml.classification.ProbabilisticClassifier
 
ProbabilisticClassifierParams - Interface in org.apache.spark.ml.classification
Params for probabilistic classification.
probabilities() - Static method in class org.apache.spark.scheduler.StatsReportListener
 
probabilityCol() - Method in interface org.apache.spark.ml.param.HasProbabilityCol
param for predicted class conditional probabilities column name
PROCESS_LOCAL() - Static method in class org.apache.spark.scheduler.TaskLocality
 
processingDelay() - Method in class org.apache.spark.streaming.scheduler.BatchInfo
Time taken for all the jobs of this batch to finish processing, from the time they started processing.
processingDelay() - Method in class org.apache.spark.streaming.scheduler.JobSet
 
processingDelayDistribution() - Method in class org.apache.spark.streaming.ui.StreamingJobProgressListener
 
processingEndTime() - Method in class org.apache.spark.streaming.scheduler.BatchInfo
 
processingStartTime() - Method in class org.apache.spark.streaming.scheduler.BatchInfo
 
processRecords(List<Record>, IRecordProcessorCheckpointer) - Method in class org.apache.spark.streaming.kinesis.KinesisRecordProcessor
This method is called by the KCL when a batch of records is pulled from the Kinesis stream.
processResults(ArrayList<Object>) - Static method in class org.apache.spark.sql.hive.HiveShim
 
processStreamByLine(String, InputStream, Function1<String, BoxedUnit>) - Static method in class org.apache.spark.util.Utils
Return and start a daemon thread that processes the content of the input stream line by line.
product() - Method in class org.apache.spark.mllib.recommendation.Rating
 
productFeatures() - Method in class org.apache.spark.mllib.recommendation.MatrixFactorizationModel
 
progressBar() - Method in class org.apache.spark.SparkContext
 
progressListener() - Method in class org.apache.spark.streaming.StreamingContext
 
properties() - Method in class org.apache.spark.metrics.MetricsConfig
 
properties() - Method in class org.apache.spark.scheduler.ActiveJob
 
properties() - Method in class org.apache.spark.scheduler.JobSubmitted
 
properties() - Method in class org.apache.spark.scheduler.SparkListenerJobStart
 
properties() - Method in class org.apache.spark.scheduler.SparkListenerStageSubmitted
 
properties() - Method in class org.apache.spark.scheduler.TaskSet
 
propertiesFromJson(JsonAST.JValue) - Static method in class org.apache.spark.util.JsonProtocol
 
propertiesToJson(Properties) - Static method in class org.apache.spark.util.JsonProtocol
 
property() - Method in class org.apache.spark.metrics.sink.ConsoleSink
 
property() - Method in class org.apache.spark.metrics.sink.CsvSink
 
property() - Method in class org.apache.spark.metrics.sink.GraphiteSink
 
property() - Method in class org.apache.spark.metrics.sink.JmxSink
 
property() - Method in class org.apache.spark.metrics.sink.MetricsServlet
 
propertyCategories() - Method in class org.apache.spark.metrics.MetricsConfig
 
propertyToOption(String) - Method in class org.apache.spark.metrics.sink.GraphiteSink
 
protocol() - Method in class org.apache.spark.SSLOptions
 
protocol(ActorSystem) - Static method in class org.apache.spark.util.AkkaUtils
 
protocol(boolean) - Static method in class org.apache.spark.util.AkkaUtils
 
provider() - Method in class org.apache.spark.sql.hive.execution.CreateMetastoreDataSource
 
provider() - Method in class org.apache.spark.sql.hive.execution.CreateMetastoreDataSourceAsSelect
 
provider() - Method in class org.apache.spark.sql.sources.CreateTableUsing
 
provider() - Method in class org.apache.spark.sql.sources.CreateTableUsingAsSelect
 
provider() - Method in class org.apache.spark.sql.sources.CreateTempTableUsing
 
provider() - Method in class org.apache.spark.sql.sources.CreateTempTableUsingAsSelect
 
provider() - Method in class org.apache.spark.sql.sources.ResolvedDataSource
 
proxyBase() - Method in class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.AddWebUIFilter
 
pruneColumns(Seq<Attribute>) - Method in class org.apache.spark.sql.parquet.ParquetTableScan
Applies a (candidate) projection.
PruneDependency<T> - Class in org.apache.spark.rdd
Represents a dependency between the PartitionPruningRDD and its parent.
PruneDependency(RDD<T>, Function1<Object, Object>) - Constructor for class org.apache.spark.rdd.PruneDependency
 
PrunedFilteredScan - Interface in org.apache.spark.sql.sources
::DeveloperApi:: A BaseRelation that can eliminate unneeded columns and filter using selected predicates before producing an RDD containing all matching tuples as Row objects.
PrunedScan - Interface in org.apache.spark.sql.sources
::DeveloperApi:: A BaseRelation that can eliminate unneeded columns before producing an RDD containing all of its tuples as Row objects.
prunePartitions(Seq<Partition>) - Method in class org.apache.spark.sql.hive.execution.HiveTableScan
Prunes partitions not involved in the query plan.
Pseudorandom - Interface in org.apache.spark.util.random
:: DeveloperApi :: A class with pseudorandom behavior.
pushAndReportBlock(ReceivedBlock, Option<Object>, Option<StreamBlockId>) - Method in class org.apache.spark.streaming.receiver.ReceiverSupervisorImpl
Store a block and report it to the driver.
pushArrayBuffer(ArrayBuffer<?>, Option<Object>, Option<StreamBlockId>) - Method in class org.apache.spark.streaming.receiver.ReceiverSupervisor
Store an ArrayBuffer of received data as a data block into Spark's memory.
pushArrayBuffer(ArrayBuffer<?>, Option<Object>, Option<StreamBlockId>) - Method in class org.apache.spark.streaming.receiver.ReceiverSupervisorImpl
Store an ArrayBuffer of received data as a data block into Spark's memory.
pushBytes(ByteBuffer, Option<Object>, Option<StreamBlockId>) - Method in class org.apache.spark.streaming.receiver.ReceiverSupervisor
Store the bytes of received data as a data block into Spark's memory.
pushBytes(ByteBuffer, Option<Object>, Option<StreamBlockId>) - Method in class org.apache.spark.streaming.receiver.ReceiverSupervisorImpl
Store the bytes of received data as a data block into Spark's memory.
pushIterator(Iterator<Object>, Option<Object>, Option<StreamBlockId>) - Method in class org.apache.spark.streaming.receiver.ReceiverSupervisor
Store an iterator of received data as a data block into Spark's memory.
pushIterator(Iterator<Object>, Option<Object>, Option<StreamBlockId>) - Method in class org.apache.spark.streaming.receiver.ReceiverSupervisorImpl
Store an iterator of received data as a data block into Spark's memory.
pushSingle(Object) - Method in class org.apache.spark.streaming.receiver.ReceiverSupervisor
Push a single data item to backend data store.
pushSingle(Object) - Method in class org.apache.spark.streaming.receiver.ReceiverSupervisorImpl
Push a single record of received data into block generator.
put(ParamPair<?>...) - Method in class org.apache.spark.ml.param.ParamMap
Puts a list of param pairs (overwrites if the input params exist).
put(Param<T>, T) - Method in class org.apache.spark.ml.param.ParamMap
Puts a (param, value) pair (overwrites if the input param exists).
put(Seq<ParamPair<?>>) - Method in class org.apache.spark.ml.param.ParamMap
Puts a list of param pairs (overwrites if the input params exist).
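For illustration, a sketch assuming lr is an existing org.apache.spark.ml.classification.LogisticRegression estimator:

    import org.apache.spark.ml.param.ParamMap

    val paramMap = ParamMap(lr.maxIter -> 20)
    paramMap.put(lr.maxIter, 30)     // overwrites the previously set maxIter
    paramMap.put(lr.regParam, 0.1)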
putAll(Map<A, B>) - Method in class org.apache.spark.util.TimeStampedHashMap
 
putArray(BlockId, Object[], StorageLevel, boolean, Option<StorageLevel>) - Method in class org.apache.spark.storage.BlockManager
Put a new block of values to the block manager.
putArray(BlockId, Object[], StorageLevel, boolean) - Method in class org.apache.spark.storage.BlockStore
 
putArray(BlockId, Object[], StorageLevel, boolean) - Method in class org.apache.spark.storage.DiskStore
 
putArray(BlockId, Object[], StorageLevel, boolean) - Method in class org.apache.spark.storage.MemoryStore
 
putArray(BlockId, Object[], StorageLevel, boolean) - Method in class org.apache.spark.storage.TachyonStore
 
putBlockData(BlockId, ManagedBuffer, StorageLevel) - Method in class org.apache.spark.storage.BlockManager
Put the block locally, using the given storage level.
putBytes(BlockId, ByteBuffer, StorageLevel, boolean, Option<StorageLevel>) - Method in class org.apache.spark.storage.BlockManager
Put a new block of serialized bytes to the block manager.
putBytes(BlockId, ByteBuffer, StorageLevel) - Method in class org.apache.spark.storage.BlockStore
 
putBytes(BlockId, ByteBuffer, StorageLevel) - Method in class org.apache.spark.storage.DiskStore
 
putBytes(BlockId, ByteBuffer, StorageLevel) - Method in class org.apache.spark.storage.MemoryStore
 
putBytes(BlockId, ByteBuffer, StorageLevel) - Method in class org.apache.spark.storage.TachyonStore
 
putCachedMetadata(String, Object) - Static method in class org.apache.spark.rdd.HadoopRDD
 
putIfAbsent(A, B) - Method in class org.apache.spark.util.TimeStampedHashMap
 
putIfAbsent(A, B) - Method in class org.apache.spark.util.TimeStampedWeakValueHashMap
 
putIterator(BlockId, Iterator<Object>, StorageLevel, boolean, Option<StorageLevel>) - Method in class org.apache.spark.storage.BlockManager
 
putIterator(BlockId, Iterator<Object>, StorageLevel, boolean) - Method in class org.apache.spark.storage.BlockStore
Put in a block and, possibly, also return its content as either bytes or another Iterator.
putIterator(BlockId, Iterator<Object>, StorageLevel, boolean) - Method in class org.apache.spark.storage.DiskStore
 
putIterator(BlockId, Iterator<Object>, StorageLevel, boolean) - Method in class org.apache.spark.storage.MemoryStore
 
putIterator(BlockId, Iterator<Object>, StorageLevel, boolean, boolean) - Method in class org.apache.spark.storage.MemoryStore
Attempt to put the given block in memory store.
putIterator(BlockId, Iterator<Object>, StorageLevel, boolean) - Method in class org.apache.spark.storage.TachyonStore
 
PutResult - Class in org.apache.spark.storage
Result of adding a block into a BlockStore.
PutResult(long, Either<Iterator<Object>, ByteBuffer>, Seq<Tuple2<BlockId, BlockStatus>>) - Constructor for class org.apache.spark.storage.PutResult
 
putSingle(BlockId, Object, StorageLevel, boolean) - Method in class org.apache.spark.storage.BlockManager
Write a block consisting of a single object.
pValue() - Method in class org.apache.spark.mllib.stat.test.ChiSqTestResult
 
pValue() - Method in interface org.apache.spark.mllib.stat.test.TestResult
The probability of obtaining a test statistic result at least as extreme as the one that was actually observed, assuming that the null hypothesis is true.
pythonExec() - Method in class org.apache.spark.sql.UserDefinedPythonFunction
 
pythonIncludes() - Method in class org.apache.spark.sql.UserDefinedPythonFunction
 
pyUDT() - Method in class org.apache.spark.mllib.linalg.VectorUDT
 
pyUDT() - Method in class org.apache.spark.sql.test.ExamplePointUDT
 

Q

quantileCalculationStrategy() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
 
QuantileStrategy - Class in org.apache.spark.mllib.tree.configuration
:: Experimental :: Enum for selecting the quantile calculation strategy
QuantileStrategy() - Constructor for class org.apache.spark.mllib.tree.configuration.QuantileStrategy
 
quantileStrategy() - Method in class org.apache.spark.mllib.tree.impl.DecisionTreeMetadata
 
query() - Method in class org.apache.spark.sql.hive.execution.CreateMetastoreDataSourceAsSelect
 
query() - Method in class org.apache.spark.sql.hive.execution.CreateTableAsSelect
 
query() - Method in class org.apache.spark.sql.sources.CreateTempTableUsingAsSelect
 
query() - Method in class org.apache.spark.sql.sources.InsertIntoDataSource
 
queryExecution() - Method in class org.apache.spark.sql.DataFrame
 
queue() - Method in class org.apache.spark.streaming.dstream.QueueInputDStream
 
QueueInputDStream<T> - Class in org.apache.spark.streaming.dstream
 
QueueInputDStream(StreamingContext, Queue<RDD<T>>, boolean, RDD<T>, ClassTag<T>) - Constructor for class org.apache.spark.streaming.dstream.QueueInputDStream
 
queueStream(Queue<JavaRDD<T>>) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
Create an input stream from a queue of RDDs.
queueStream(Queue<JavaRDD<T>>, boolean) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
Create an input stream from a queue of RDDs.
queueStream(Queue<JavaRDD<T>>, boolean, JavaRDD<T>) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
Create an input stream from a queue of RDDs.
queueStream(Queue<RDD<T>>, boolean, ClassTag<T>) - Method in class org.apache.spark.streaming.StreamingContext
Create an input stream from a queue of RDDs.
queueStream(Queue<RDD<T>>, boolean, RDD<T>, ClassTag<T>) - Method in class org.apache.spark.streaming.StreamingContext
Create an input stream from a queue of RDDs.
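For illustration, a sketch assuming an existing StreamingContext named ssc; RDDs pushed onto the queue become the batches of the resulting DStream:

    import scala.collection.mutable.Queue
    import org.apache.spark.rdd.RDD

    val rddQueue = new Queue[RDD[Int]]()
    val inputStream = ssc.queueStream(rddQueue)          // one queued RDD per batch by default
    rddQueue += ssc.sparkContext.makeRDD(1 to 100, 2)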

R

r2() - Method in class org.apache.spark.mllib.evaluation.RegressionMetrics
Returns R^2^, the coefficient of determination.
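For illustration, a sketch assuming predictionAndObservations is an RDD[(Double, Double)] of (prediction, label) pairs:

    import org.apache.spark.mllib.evaluation.RegressionMetrics

    val metrics = new RegressionMetrics(predictionAndObservations)
    println(metrics.r2)    // coefficient of determination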
RACK_LOCAL() - Static method in class org.apache.spark.scheduler.TaskLocality
 
rand(int, int, Random) - Static method in class org.apache.spark.mllib.linalg.DenseMatrix
Generate a DenseMatrix consisting of i.i.d. uniform random numbers.
rand(int, int, Random) - Static method in class org.apache.spark.mllib.linalg.Matrices
Generate a DenseMatrix consisting of i.i.d. uniform random numbers.
RAND() - Static method in class org.apache.spark.sql.hive.HiveQl
 
randn(int, int, Random) - Static method in class org.apache.spark.mllib.linalg.DenseMatrix
Generate a DenseMatrix consisting of i.i.d. gaussian random numbers.
randn(int, int, Random) - Static method in class org.apache.spark.mllib.linalg.Matrices
Generate a DenseMatrix consisting of i.i.d. gaussian random numbers.
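For illustration, a sketch of the factory methods above; the fixed seed only makes the example deterministic:

    import java.util.Random
    import org.apache.spark.mllib.linalg.Matrices

    val u = Matrices.rand(3, 2, new Random(42))    // 3x2 matrix of uniform(0, 1) entries
    val g = Matrices.randn(3, 2, new Random(42))   // 3x2 matrix of standard normal entries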
RANDOM() - Static method in class org.apache.spark.mllib.clustering.KMeans
 
random() - Static method in class org.apache.spark.util.Utils
 
random(int, Random) - Static method in class org.apache.spark.util.Vector
Creates a Vector of the given length containing random numbers between 0.0 and 1.0.
RandomDataGenerator<T> - Interface in org.apache.spark.mllib.random
:: DeveloperApi :: Trait for random data generators that generate i.i.d.
RandomForest - Class in org.apache.spark.mllib.tree
:: Experimental :: A class that implements a Random Forest learning algorithm for classification and regression.
RandomForest(Strategy, int, String, int) - Constructor for class org.apache.spark.mllib.tree.RandomForest
 
RandomForest.NodeIndexInfo - Class in org.apache.spark.mllib.tree
 
RandomForest.NodeIndexInfo(int, Option<int[]>) - Constructor for class org.apache.spark.mllib.tree.RandomForest.NodeIndexInfo
 
RandomForestModel - Class in org.apache.spark.mllib.tree.model
:: Experimental :: Represents a random forest model.
RandomForestModel(Enumeration.Value, DecisionTreeModel[]) - Constructor for class org.apache.spark.mllib.tree.model.RandomForestModel
 
randomInit(Graph<Object, Object>) - Static method in class org.apache.spark.mllib.clustering.PowerIterationClustering
Generates random vertex properties (v0) to start power iteration.
randomize(TraversableOnce<T>, ClassTag<T>) - Static method in class org.apache.spark.util.Utils
Shuffle the elements of a collection into a random order, returning the result in a new collection.
randomizeInPlace(Object, Random) - Static method in class org.apache.spark.util.Utils
Shuffle the elements of an array into a random order, modifying the original array.
randomRDD(SparkContext, RandomDataGenerator<T>, long, int, long, ClassTag<T>) - Static method in class org.apache.spark.mllib.random.RandomRDDs
:: DeveloperApi :: Generates an RDD comprised of i.i.d. samples produced by the input RandomDataGenerator.
RandomRDD<T> - Class in org.apache.spark.mllib.rdd
 
RandomRDD(SparkContext, long, int, RandomDataGenerator<T>, long, ClassTag<T>) - Constructor for class org.apache.spark.mllib.rdd.RandomRDD
 
RandomRDDPartition<T> - Class in org.apache.spark.mllib.rdd
 
RandomRDDPartition(int, int, RandomDataGenerator<T>, long) - Constructor for class org.apache.spark.mllib.rdd.RandomRDDPartition
 
RandomRDDs - Class in org.apache.spark.mllib.random
:: Experimental :: Generator methods for creating RDDs comprised of i.i.d. samples from some distribution.
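For illustration, a sketch assuming an existing SparkContext named sc; normalRDD is assumed to be one of the generator methods on RandomRDDs:

    import org.apache.spark.mllib.random.RandomRDDs

    val normals = RandomRDDs.normalRDD(sc, 1000000L, 10)   // one million i.i.d. N(0, 1) samples in 10 partitions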
RandomRDDs() - Constructor for class org.apache.spark.mllib.random.RandomRDDs
 
RandomSampler<T,U> - Interface in org.apache.spark.util.random
:: DeveloperApi :: A pseudorandom sampler.
randomSplit(double[]) - Method in class org.apache.spark.api.java.JavaRDD
Randomly splits this RDD with the provided weights.
randomSplit(double[], long) - Method in class org.apache.spark.api.java.JavaRDD
Randomly splits this RDD with the provided weights.
randomSplit(double[], long) - Method in class org.apache.spark.rdd.RDD
Randomly splits this RDD with the provided weights.
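For illustration, a sketch assuming an existing RDD named data:

    val Array(training, test) = data.randomSplit(Array(0.7, 0.3), seed = 11L)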
randomVectorRDD(SparkContext, RandomDataGenerator<Object>, long, int, int, long) - Static method in class org.apache.spark.mllib.random.RandomRDDs
:: DeveloperApi :: Generates an RDD[Vector] with vectors containing i.i.d. samples produced by the input RandomDataGenerator.
RandomVectorRDD - Class in org.apache.spark.mllib.rdd
 
RandomVectorRDD(SparkContext, long, int, int, RandomDataGenerator<Object>, long) - Constructor for class org.apache.spark.mllib.rdd.RandomVectorRDD
 
RangeDependency<T> - Class in org.apache.spark
:: DeveloperApi :: Represents a one-to-one dependency between ranges of partitions in the parent and child RDDs.
RangeDependency(RDD<T>, int, int, int) - Constructor for class org.apache.spark.RangeDependency
 
RangePartitioner<K,V> - Class in org.apache.spark
A Partitioner that partitions sortable records by range into roughly equal ranges.
RangePartitioner(int, RDD<? extends Product2<K, V>>, boolean, Ordering<K>, ClassTag<K>) - Constructor for class org.apache.spark.RangePartitioner
 
rank() - Method in class org.apache.spark.graphx.lib.SVDPlusPlus.Conf
 
rank() - Method in interface org.apache.spark.ml.recommendation.ALSParams
Param for rank of the matrix factorization.
rank() - Method in class org.apache.spark.mllib.recommendation.MatrixFactorizationModel
 
RankingMetrics<T> - Class in org.apache.spark.mllib.evaluation
::Experimental:: Evaluator for ranking algorithms.
RankingMetrics(RDD<Tuple2<Object, Object>>, ClassTag<T>) - Constructor for class org.apache.spark.mllib.evaluation.RankingMetrics
 
RateLimitedOutputStream - Class in org.apache.spark.streaming.util
 
RateLimitedOutputStream(OutputStream, int) - Constructor for class org.apache.spark.streaming.util.RateLimitedOutputStream
 
RateLimiter - Class in org.apache.spark.streaming.receiver
Provides a waitToPush() method to limit the rate at which receivers consume data.
RateLimiter(SparkConf) - Constructor for class org.apache.spark.streaming.receiver.RateLimiter
 
rating() - Method in class org.apache.spark.ml.recommendation.ALS.Rating
 
Rating - Class in org.apache.spark.mllib.recommendation
A more compact class to represent a rating than Tuple3[Int, Int, Double].
Rating(int, int, double) - Constructor for class org.apache.spark.mllib.recommendation.Rating
 
rating() - Method in class org.apache.spark.mllib.recommendation.Rating
 
ratingCol() - Method in interface org.apache.spark.ml.recommendation.ALSParams
Param for the column name for ratings.
ratings() - Method in class org.apache.spark.ml.recommendation.ALS.InBlock
 
ratings() - Method in class org.apache.spark.ml.recommendation.ALS.RatingBlock
 
ratings() - Method in class org.apache.spark.ml.recommendation.ALS.UncompressedInBlock
 
RawInputDStream<T> - Class in org.apache.spark.streaming.dstream
An input stream that reads blocks of serialized objects from a given network address.
RawInputDStream(StreamingContext, String, int, StorageLevel, ClassTag<T>) - Constructor for class org.apache.spark.streaming.dstream.RawInputDStream
 
RawNetworkReceiver - Class in org.apache.spark.streaming.dstream
 
RawNetworkReceiver(String, int, StorageLevel) - Constructor for class org.apache.spark.streaming.dstream.RawNetworkReceiver
 
rawPredictionCol() - Method in interface org.apache.spark.ml.param.HasRawPredictionCol
param for raw prediction column name
rawSocketStream(String, int, StorageLevel) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
Create an input stream from network source hostname:port, where data is received as serialized blocks (serialized using Spark's serializer) that can be directly pushed into the block manager without deserializing them.
rawSocketStream(String, int) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
Create an input stream from network source hostname:port, where data is received as serialized blocks (serialized using Spark's serializer) that can be directly pushed into the block manager without deserializing them.
rawSocketStream(String, int, StorageLevel, ClassTag<T>) - Method in class org.apache.spark.streaming.StreamingContext
Create an input stream from network source hostname:port, where data is received as serialized blocks (serialized using Spark's serializer) that can be directly pushed into the block manager without deserializing them.
RawTextHelper - Class in org.apache.spark.streaming.util
 
RawTextHelper() - Constructor for class org.apache.spark.streaming.util.RawTextHelper
 
RawTextSender - Class in org.apache.spark.streaming.util
A helper program that sends blocks of Kryo-serialized text strings out on a socket at a specified rate.
RawTextSender() - Constructor for class org.apache.spark.streaming.util.RawTextSender
 
rdd() - Method in class org.apache.spark.api.java.JavaDoubleRDD
 
rdd() - Method in class org.apache.spark.api.java.JavaPairRDD
 
rdd() - Method in class org.apache.spark.api.java.JavaRDD
 
rdd() - Method in interface org.apache.spark.api.java.JavaRDDLike
 
rdd() - Method in class org.apache.spark.Dependency
 
rdd() - Method in class org.apache.spark.NarrowDependency
 
rdd() - Method in class org.apache.spark.rdd.CoalescedRDDPartition
 
rdd() - Method in class org.apache.spark.rdd.NarrowCoGroupSplitDep
 
RDD<T> - Class in org.apache.spark.rdd
A Resilient Distributed Dataset (RDD), the basic abstraction in Spark.
RDD(SparkContext, Seq<Dependency<?>>, ClassTag<T>) - Constructor for class org.apache.spark.rdd.RDD
 
RDD(RDD<?>, ClassTag<T>) - Constructor for class org.apache.spark.rdd.RDD
Construct an RDD with just a one-to-one dependency on one parent
rdd() - Method in class org.apache.spark.scheduler.Stage
 
rdd() - Method in class org.apache.spark.ShuffleDependency
 
rdd() - Method in class org.apache.spark.sql.DataFrame
Returns the content of the DataFrame as an RDD of Rows.
RDD() - Static method in class org.apache.spark.storage.BlockId
 
rdd1() - Method in class org.apache.spark.rdd.CartesianRDD
 
rdd1() - Method in class org.apache.spark.rdd.SubtractedRDD
 
rdd1() - Method in class org.apache.spark.rdd.ZippedPartitionsRDD2
 
rdd1() - Method in class org.apache.spark.rdd.ZippedPartitionsRDD3
 
rdd1() - Method in class org.apache.spark.rdd.ZippedPartitionsRDD4
 
rdd2() - Method in class org.apache.spark.rdd.CartesianRDD
 
rdd2() - Method in class org.apache.spark.rdd.SubtractedRDD
 
rdd2() - Method in class org.apache.spark.rdd.ZippedPartitionsRDD2
 
rdd2() - Method in class org.apache.spark.rdd.ZippedPartitionsRDD3
 
rdd2() - Method in class org.apache.spark.rdd.ZippedPartitionsRDD4
 
rdd3() - Method in class org.apache.spark.rdd.ZippedPartitionsRDD3
 
rdd3() - Method in class org.apache.spark.rdd.ZippedPartitionsRDD4
 
rdd4() - Method in class org.apache.spark.rdd.ZippedPartitionsRDD4
 
RDDApi<T> - Interface in org.apache.spark.sql
An internal interface defining the RDD-like methods for DataFrame.
RDDBlockId - Class in org.apache.spark.storage
 
RDDBlockId(int, int) - Constructor for class org.apache.spark.storage.RDDBlockId
 
rddBlocks() - Method in class org.apache.spark.storage.StorageStatus
Return the RDD blocks stored in this block manager.
rddBlocks() - Method in class org.apache.spark.ui.exec.ExecutorSummaryInfo
 
rddBlocksById(int) - Method in class org.apache.spark.storage.StorageStatus
Return the blocks that belong to the given RDD stored in this block manager.
RDDCheckpointData<T> - Class in org.apache.spark.rdd
This class contains all the information related to RDD checkpointing.
RDDCheckpointData(RDD<T>, ClassTag<T>) - Constructor for class org.apache.spark.rdd.RDDCheckpointData
 
rddCleaned(int) - Method in interface org.apache.spark.CleanerListener
 
RDDFunctions<T> - Class in org.apache.spark.mllib.rdd
Machine learning specific RDD functions.
RDDFunctions(RDD<T>, ClassTag<T>) - Constructor for class org.apache.spark.mllib.rdd.RDDFunctions
 
rddId() - Method in class org.apache.spark.CleanRDD
 
rddId() - Method in class org.apache.spark.rdd.ParallelCollectionPartition
 
rddId() - Method in class org.apache.spark.scheduler.SparkListenerUnpersistRDD
 
rddId() - Method in class org.apache.spark.storage.BlockManagerMessages.RemoveRdd
 
rddId() - Method in class org.apache.spark.storage.RDDBlockId
 
RDDInfo - Class in org.apache.spark.storage
 
RDDInfo(int, String, int, StorageLevel) - Constructor for class org.apache.spark.storage.RDDInfo
 
rddInfoFromJson(JsonAST.JValue) - Static method in class org.apache.spark.util.JsonProtocol
 
rddInfoList() - Method in class org.apache.spark.ui.storage.StorageListener
Filter RDD info to include only those with cached partitions
rddInfos() - Method in class org.apache.spark.scheduler.StageInfo
 
rddInfoToJson(RDDInfo) - Static method in class org.apache.spark.util.JsonProtocol
 
RDDPage - Class in org.apache.spark.ui.storage
Page showing storage details for a given RDD
RDDPage(StorageTab) - Constructor for class org.apache.spark.ui.storage.RDDPage
 
rdds() - Method in class org.apache.spark.rdd.CoGroupedRDD
 
rdds() - Method in class org.apache.spark.rdd.PartitionerAwareUnionRDD
 
rdds() - Method in class org.apache.spark.rdd.PartitionerAwareUnionRDDPartition
 
rdds() - Method in class org.apache.spark.rdd.UnionRDD
 
rdds() - Method in class org.apache.spark.rdd.ZippedPartitionsBaseRDD
 
rddStorageLevel(int) - Method in class org.apache.spark.storage.StorageStatus
Return the storage level, if any, used by the given RDD in this block manager.
rddToAsyncRDDActions(RDD<T>, ClassTag<T>) - Static method in class org.apache.spark.rdd.RDD
 
rddToAsyncRDDActions(RDD<T>, ClassTag<T>) - Static method in class org.apache.spark.SparkContext
 
rddToDataFrameHolder(RDD<A>, TypeTags.TypeTag<A>) - Method in class org.apache.spark.sql.SQLContext.implicits
Creates a DataFrame from an RDD of case classes or tuples.
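For illustration, a sketch assuming a SQLContext named sqlContext, a top-level case class Person(name: String, age: Int), and an RDD[Person] named people:

    import sqlContext.implicits._

    val df = people.toDF()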
rddToFileName(String, String, Time) - Static method in class org.apache.spark.streaming.StreamingContext
 
rddToOrderedRDDFunctions(RDD<Tuple2<K, V>>, Ordering<K>, ClassTag<K>, ClassTag<V>) - Static method in class org.apache.spark.rdd.RDD
 
rddToOrderedRDDFunctions(RDD<Tuple2<K, V>>, Ordering<K>, ClassTag<K>, ClassTag<V>) - Static method in class org.apache.spark.SparkContext
 
rddToPairRDDFunctions(RDD<Tuple2<K, V>>, ClassTag<K>, ClassTag<V>, Ordering<K>) - Static method in class org.apache.spark.rdd.RDD
 
rddToPairRDDFunctions(RDD<Tuple2<K, V>>, ClassTag<K>, ClassTag<V>, Ordering<K>) - Static method in class org.apache.spark.SparkContext
 
rddToSequenceFileRDDFunctions(RDD<Tuple2<K, V>>, ClassTag<K>, ClassTag<V>, WritableFactory<K>, WritableFactory<V>) - Static method in class org.apache.spark.rdd.RDD
 
rddToSequenceFileRDDFunctions(RDD<Tuple2<K, V>>, Function1<K, Writable>, ClassTag<K>, Function1<V, Writable>, ClassTag<V>) - Static method in class org.apache.spark.SparkContext
 
read(Kryo, Input, Class<Iterable<?>>) - Method in class org.apache.spark.serializer.JavaIterableWrapperSerializer
 
read(String, SparkConf, Configuration) - Static method in class org.apache.spark.streaming.CheckpointReader
 
read(WriteAheadLogFileSegment) - Method in class org.apache.spark.streaming.util.WriteAheadLogRandomReader
 
read() - Method in class org.apache.spark.util.ByteBufferInputStream
 
read(byte[]) - Method in class org.apache.spark.util.ByteBufferInputStream
 
read(byte[], int, int) - Method in class org.apache.spark.util.ByteBufferInputStream
 
readBatches() - Method in class org.apache.spark.sql.columnar.InMemoryColumnarTableScan
 
readExternal(ObjectInput) - Method in class org.apache.spark.scheduler.CompressedMapStatus
 
readExternal(ObjectInput) - Method in class org.apache.spark.scheduler.DirectTaskResult
 
readExternal(ObjectInput) - Method in class org.apache.spark.scheduler.HighlyCompressedMapStatus
 
readExternal(ObjectInput) - Method in class org.apache.spark.serializer.JavaSerializer
 
readExternal(ObjectInput) - Method in class org.apache.spark.sql.hive.HiveFunctionWrapper
 
readExternal(ObjectInput) - Method in class org.apache.spark.storage.BlockManagerId
 
readExternal(ObjectInput) - Method in class org.apache.spark.storage.BlockManagerMessages.UpdateBlockInfo
 
readExternal(ObjectInput) - Method in class org.apache.spark.storage.StorageLevel
 
readExternal(ObjectInput) - Static method in class org.apache.spark.streaming.flume.EventTransformer
 
readExternal(ObjectInput) - Method in class org.apache.spark.streaming.flume.SparkFlumeEvent
 
readFromFile(Path, Broadcast<SerializableWritable<Configuration>>, TaskContext) - Static method in class org.apache.spark.rdd.CheckpointRDD
 
readFromLog() - Method in class org.apache.spark.streaming.util.WriteAheadLogManager
Read all the existing logs from the log directory.
readMetadata(JsonAST.JValue) - Method in class org.apache.spark.mllib.tree.model.TreeEnsembleModel.SaveLoadV1_0$
Read metadata from the loaded JSON metadata.
readMetaData(Path, Option<Configuration>) - Static method in class org.apache.spark.sql.parquet.ParquetTypesConverter
Try to read Parquet metadata at the given Path.
readObject(ClassTag<T>) - Method in class org.apache.spark.serializer.DeserializationStream
 
readObject(ClassTag<T>) - Method in class org.apache.spark.serializer.JavaDeserializationStream
 
readObject(ClassTag<T>) - Method in class org.apache.spark.serializer.KryoDeserializationStream
 
readPartitions() - Method in class org.apache.spark.sql.columnar.InMemoryColumnarTableScan
 
readSchema(Seq<Footer>, SQLContext) - Static method in class org.apache.spark.sql.parquet.ParquetRelation2
 
readSchemaFromFile(Path, Option<Configuration>, boolean, boolean) - Static method in class org.apache.spark.sql.parquet.ParquetTypesConverter
Reads in Parquet Metadata from the given path and tries to extract the schema (Catalyst attributes) from the application-specific key-value map.
ready(Duration, CanAwait) - Method in class org.apache.spark.ComplexFutureAction
 
ready(Duration, CanAwait) - Method in interface org.apache.spark.FutureAction
Blocks until this action completes.
ready(Duration, CanAwait) - Method in class org.apache.spark.SimpleFutureAction
 
reason() - Method in class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.RemoveExecutor
 
reason() - Method in class org.apache.spark.scheduler.CompletionEvent
 
reason() - Method in class org.apache.spark.scheduler.SparkListenerExecutorRemoved
 
reason() - Method in class org.apache.spark.scheduler.SparkListenerTaskEnd
 
reason() - Method in class org.apache.spark.scheduler.TaskSetFailed
 
recache() - Method in class org.apache.spark.sql.columnar.InMemoryRelation
 
Recall - Class in org.apache.spark.mllib.evaluation.binary
Recall.
Recall() - Constructor for class org.apache.spark.mllib.evaluation.binary.Recall
 
recall(double) - Method in class org.apache.spark.mllib.evaluation.MulticlassMetrics
Returns recall for a given label (category)
recall() - Method in class org.apache.spark.mllib.evaluation.MulticlassMetrics
Returns recall (equal to precision for a multiclass classifier, because the sum of all false positives equals the sum of all false negatives).
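For illustration, a sketch assuming predictionAndLabels is an RDD[(Double, Double)] of (prediction, label) pairs:

    import org.apache.spark.mllib.evaluation.MulticlassMetrics

    val metrics = new MulticlassMetrics(predictionAndLabels)
    println(metrics.recall)        // overall recall
    println(metrics.recall(1.0))   // recall for label 1.0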
recall() - Method in class org.apache.spark.mllib.evaluation.MultilabelMetrics
Returns document-based recall averaged by the number of documents
recall(double) - Method in class org.apache.spark.mllib.evaluation.MultilabelMetrics
Returns recall for a given label (category)
recallByThreshold() - Method in class org.apache.spark.mllib.evaluation.BinaryClassificationMetrics
Returns the (threshold, recall) curve.
receive() - Method in class org.apache.spark.streaming.dstream.SocketReceiver
Create a socket connection and receive data until the receiver is stopped.
receive() - Method in class org.apache.spark.streaming.receiver.ActorReceiver.Supervisor
 
receive() - Method in class org.apache.spark.streaming.zeromq.ZeroMQReceiver
 
receive() - Method in interface org.apache.spark.util.ActorLogReceive
 
ReceivedBlock - Interface in org.apache.spark.streaming.receiver
Trait representing a received block
ReceivedBlockHandler - Interface in org.apache.spark.streaming.receiver
Trait that represents a class that handles the storage of blocks received by a receiver.
receivedBlockInfo() - Method in class org.apache.spark.streaming.scheduler.AddBlock
 
receivedBlockInfo() - Method in class org.apache.spark.streaming.scheduler.BatchInfo
 
receivedBlockInfo() - Method in class org.apache.spark.streaming.scheduler.BlockAdditionEvent
 
receivedBlockInfo() - Method in class org.apache.spark.streaming.scheduler.JobSet
 
ReceivedBlockInfo - Class in org.apache.spark.streaming.scheduler
Information about blocks received by the receiver
ReceivedBlockInfo(int, long, ReceivedBlockStoreResult) - Constructor for class org.apache.spark.streaming.scheduler.ReceivedBlockInfo
 
ReceivedBlockStoreResult - Interface in org.apache.spark.streaming.receiver
Trait that represents the metadata related to storage of blocks
ReceivedBlockTracker - Class in org.apache.spark.streaming.scheduler
Class that keeps track of all the received blocks and allocates them to batches when required.
ReceivedBlockTracker(SparkConf, Configuration, Seq<Object>, Clock, Option<String>) - Constructor for class org.apache.spark.streaming.scheduler.ReceivedBlockTracker
 
ReceivedBlockTrackerLogEvent - Interface in org.apache.spark.streaming.scheduler
Trait representing any event in the ReceivedBlockTracker that updates its state.
receivedRecordsDistributions() - Method in class org.apache.spark.streaming.ui.StreamingJobProgressListener
 
Receiver<T> - Class in org.apache.spark.streaming.receiver
:: DeveloperApi :: Abstract class of a receiver that can be run on worker nodes to receive external data.
Receiver(StorageLevel) - Constructor for class org.apache.spark.streaming.receiver.Receiver
 
receiverActor() - Method in class org.apache.spark.streaming.scheduler.RegisterReceiver
 
receiverExecutor() - Method in class org.apache.spark.streaming.flume.FlumePollingReceiver
 
ReceiverInfo - Class in org.apache.spark.streaming.scheduler
:: DeveloperApi :: Class having information about a receiver
ReceiverInfo(int, String, ActorRef, boolean, String, String, String) - Constructor for class org.apache.spark.streaming.scheduler.ReceiverInfo
 
receiverInfo() - Method in class org.apache.spark.streaming.scheduler.StreamingListenerReceiverError
 
receiverInfo() - Method in class org.apache.spark.streaming.scheduler.StreamingListenerReceiverStarted
 
receiverInfo() - Method in class org.apache.spark.streaming.scheduler.StreamingListenerReceiverStopped
 
receiverInfo(int) - Method in class org.apache.spark.streaming.ui.StreamingJobProgressListener
 
receiverInputDStream() - Method in class org.apache.spark.streaming.api.java.JavaPairReceiverInputDStream
 
receiverInputDStream() - Method in class org.apache.spark.streaming.api.java.JavaReceiverInputDStream
 
ReceiverInputDStream<T> - Class in org.apache.spark.streaming.dstream
Abstract class for defining any InputDStream that has to start a receiver on worker nodes to receive external data.
ReceiverInputDStream(StreamingContext, ClassTag<T>) - Constructor for class org.apache.spark.streaming.dstream.ReceiverInputDStream
 
ReceiverMessage - Interface in org.apache.spark.streaming.receiver
Messages sent to the Receiver.
ReceiverState() - Method in class org.apache.spark.streaming.receiver.ReceiverSupervisor
 
receiverState() - Method in class org.apache.spark.streaming.receiver.ReceiverSupervisor
State of the receiver
receiverStream(Receiver<T>) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
Create an input stream with any arbitrary user implemented receiver.
receiverStream(Receiver<T>, ClassTag<T>) - Method in class org.apache.spark.streaming.StreamingContext
Create an input stream with any arbitrary user implemented receiver.
ReceiverSupervisor - Class in org.apache.spark.streaming.receiver
Abstract class that is responsible for supervising a Receiver in the worker.
ReceiverSupervisor(Receiver<?>, SparkConf) - Constructor for class org.apache.spark.streaming.receiver.ReceiverSupervisor
 
ReceiverSupervisor.ReceiverState - Class in org.apache.spark.streaming.receiver
 
ReceiverSupervisor.ReceiverState() - Constructor for class org.apache.spark.streaming.receiver.ReceiverSupervisor.ReceiverState
Enumeration to identify the current state of the receiver
ReceiverSupervisorImpl - Class in org.apache.spark.streaming.receiver
Concrete implementation of ReceiverSupervisor which provides all the necessary functionality for handling the data received by the receiver.
ReceiverSupervisorImpl(Receiver<?>, SparkEnv, Configuration, Option<String>) - Constructor for class org.apache.spark.streaming.receiver.ReceiverSupervisorImpl
 
receiverTracker() - Method in class org.apache.spark.streaming.scheduler.JobScheduler
 
ReceiverTracker - Class in org.apache.spark.streaming.scheduler
This class manages the execution of the receivers of ReceiverInputDStreams.
ReceiverTracker(StreamingContext, boolean) - Constructor for class org.apache.spark.streaming.scheduler.ReceiverTracker
 
ReceiverTracker.ReceiverLauncher - Class in org.apache.spark.streaming.scheduler
This thread class runs all the receivers on the cluster.
ReceiverTracker.ReceiverLauncher() - Constructor for class org.apache.spark.streaming.scheduler.ReceiverTracker.ReceiverLauncher
 
ReceiverTrackerMessage - Interface in org.apache.spark.streaming.scheduler
Messages used by the NetworkReceiver and the ReceiverTracker to communicate with each other.
receiveWithLogging() - Method in class org.apache.spark.HeartbeatReceiver
 
receiveWithLogging() - Method in class org.apache.spark.MapOutputTrackerMasterActor
 
receiveWithLogging() - Method in class org.apache.spark.scheduler.cluster.CoarseGrainedSchedulerBackend.DriverActor
 
receiveWithLogging() - Method in class org.apache.spark.scheduler.local.LocalActor
 
receiveWithLogging() - Method in class org.apache.spark.scheduler.OutputCommitCoordinator.OutputCommitCoordinatorActor
 
receiveWithLogging() - Method in class org.apache.spark.storage.BlockManagerMasterActor
 
receiveWithLogging() - Method in class org.apache.spark.storage.BlockManagerSlaveActor
 
receiveWithLogging() - Method in interface org.apache.spark.util.ActorLogReceive
 
recentExceptions() - Method in class org.apache.spark.scheduler.TaskSetManager
 
recommendProducts(int, int) - Method in class org.apache.spark.mllib.recommendation.MatrixFactorizationModel
Recommends products to a user.
recommendUsers(int, int) - Method in class org.apache.spark.mllib.recommendation.MatrixFactorizationModel
Recommends users to a product.
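For illustration, a sketch assuming ratings is an RDD[Rating] of (user, product, rating) triples:

    import org.apache.spark.mllib.recommendation.{ALS, Rating}

    val model = ALS.train(ratings, 10, 10, 0.01)   // rank = 10, 10 iterations, lambda = 0.01
    val top5 = model.recommendProducts(1, 5)       // top 5 products for user 1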
recomputeLocality() - Method in class org.apache.spark.scheduler.TaskSetManager
 
RECORD_LENGTH_PROPERTY() - Static method in class org.apache.spark.input.FixedLengthBinaryInputFormat
Property name to set in Hadoop JobConfs for record length
recordProcessorFactory() - Method in class org.apache.spark.streaming.kinesis.KinesisReceiver
 
RECORDS_BETWEEN_BYTES_READ_METRIC_UPDATES() - Static method in class org.apache.spark.rdd.HadoopRDD
Update the input bytes read metric each time this number of records has been read
RECORDS_BETWEEN_BYTES_WRITTEN_METRIC_UPDATES() - Static method in class org.apache.spark.rdd.PairRDDFunctions
 
RecurringTimer - Class in org.apache.spark.streaming.util
 
RecurringTimer(Clock, long, Function1<Object, BoxedUnit>, String) - Constructor for class org.apache.spark.streaming.util.RecurringTimer
 
RedirectThread - Class in org.apache.spark.util
A utility class to redirect the child process's stdout or stderr.
RedirectThread(InputStream, OutputStream, String, boolean) - Constructor for class org.apache.spark.util.RedirectThread
 
reduce(Function2<T, T, T>) - Method in interface org.apache.spark.api.java.JavaRDDLike
Reduces the elements of this RDD using the specified commutative and associative binary operator.
reduce(Function2<T, T, T>) - Method in class org.apache.spark.rdd.RDD
Reduces the elements of this RDD using the specified commutative and associative binary operator.
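For illustration, a sketch assuming an existing RDD[Int] named nums:

    val sum = nums.reduce(_ + _)   // the operator must be commutative and associative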
reduce(Function2<T, T, T>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
Return a new DStream in which each RDD has a single element generated by reducing each RDD of this DStream.
reduce(Function2<T, T, T>) - Method in class org.apache.spark.streaming.dstream.DStream
Return a new DStream in which each RDD has a single element generated by reducing each RDD of this DStream.
reduceByKey(Partitioner, Function2<V, V, V>) - Method in class org.apache.spark.api.java.JavaPairRDD
Merge the values for each key using an associative reduce function.
reduceByKey(Function2<V, V, V>, int) - Method in class org.apache.spark.api.java.JavaPairRDD
Merge the values for each key using an associative reduce function.
reduceByKey(Function2<V, V, V>) - Method in class org.apache.spark.api.java.JavaPairRDD
Merge the values for each key using an associative reduce function.
reduceByKey(Partitioner, Function2<V, V, V>) - Method in class org.apache.spark.rdd.PairRDDFunctions
Merge the values for each key using an associative reduce function.
reduceByKey(Function2<V, V, V>, int) - Method in class org.apache.spark.rdd.PairRDDFunctions
Merge the values for each key using an associative reduce function.
reduceByKey(Function2<V, V, V>) - Method in class org.apache.spark.rdd.PairRDDFunctions
Merge the values for each key using an associative reduce function.
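For illustration, a sketch assuming words is an RDD[String]:

    val counts = words.map(w => (w, 1)).reduceByKey(_ + _)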
reduceByKey(Function2<V, V, V>) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
Return a new DStream by applying reduceByKey to each RDD.
reduceByKey(Function2<V, V, V>, int) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
Return a new DStream by applying reduceByKey to each RDD.
reduceByKey(Function2<V, V, V>, Partitioner) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
Return a new DStream by applying reduceByKey to each RDD.
reduceByKey(Function2<V, V, V>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
Return a new DStream by applying reduceByKey to each RDD.
reduceByKey(Function2<V, V, V>, int) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
Return a new DStream by applying reduceByKey to each RDD.
reduceByKey(Function2<V, V, V>, Partitioner) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
Return a new DStream by applying reduceByKey to each RDD.
reduceByKeyAndWindow(Function2<V, V, V>, Duration) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
Create a new DStream by applying reduceByKey over a sliding window on this DStream.
reduceByKeyAndWindow(Function2<V, V, V>, Duration, Duration) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
Return a new DStream by applying reduceByKey over a sliding window.
reduceByKeyAndWindow(Function2<V, V, V>, Duration, Duration, int) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
Return a new DStream by applying reduceByKey over a sliding window.
reduceByKeyAndWindow(Function2<V, V, V>, Duration, Duration, Partitioner) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
Return a new DStream by applying reduceByKey over a sliding window.
reduceByKeyAndWindow(Function2<V, V, V>, Function2<V, V, V>, Duration, Duration) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
Return a new DStream by applying incremental reduceByKey over a sliding window.
reduceByKeyAndWindow(Function2<V, V, V>, Function2<V, V, V>, Duration, Duration, int, Function<Tuple2<K, V>, Boolean>) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
Return a new DStream by applying incremental reduceByKey over a sliding window.
reduceByKeyAndWindow(Function2<V, V, V>, Function2<V, V, V>, Duration, Duration, Partitioner, Function<Tuple2<K, V>, Boolean>) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
Return a new DStream by applying incremental reduceByKey over a sliding window.
reduceByKeyAndWindow(Function2<V, V, V>, Duration) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
Return a new DStream by applying reduceByKey over a sliding window on this DStream.
reduceByKeyAndWindow(Function2<V, V, V>, Duration, Duration) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
Return a new DStream by applying reduceByKey over a sliding window.
reduceByKeyAndWindow(Function2<V, V, V>, Duration, Duration, int) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
Return a new DStream by applying reduceByKey over a sliding window.
reduceByKeyAndWindow(Function2<V, V, V>, Duration, Duration, Partitioner) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
Return a new DStream by applying reduceByKey over a sliding window.
reduceByKeyAndWindow(Function2<V, V, V>, Function2<V, V, V>, Duration, Duration, int, Function1<Tuple2<K, V>, Object>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
Return a new DStream by applying incremental reduceByKey over a sliding window.
reduceByKeyAndWindow(Function2<V, V, V>, Function2<V, V, V>, Duration, Duration, Partitioner, Function1<Tuple2<K, V>, Object>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
Return a new DStream by applying incremental reduceByKey over a sliding window.
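For illustration, a sketch assuming pairs is a DStream[(String, Int)]; the incremental variant with an inverse reduce function requires checkpointing to be enabled and, in this short form, relies on the remaining parameters having default values:

    import org.apache.spark.streaming.Seconds

    val windowed    = pairs.reduceByKeyAndWindow(_ + _, Seconds(30), Seconds(10))
    val incremental = pairs.reduceByKeyAndWindow(_ + _, _ - _, Seconds(30), Seconds(10))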
reduceByKeyLocally(Function2<V, V, V>) - Method in class org.apache.spark.api.java.JavaPairRDD
Merge the values for each key using an associative reduce function, but return the results immediately to the master as a Map.
reduceByKeyLocally(Function2<V, V, V>) - Method in class org.apache.spark.rdd.PairRDDFunctions
Merge the values for each key using an associative reduce function, but return the results immediately to the master as a Map.
reduceByKeyToDriver(Function2<V, V, V>) - Method in class org.apache.spark.rdd.PairRDDFunctions
Alias for reduceByKeyLocally
reduceByWindow(Function2<T, T, T>, Duration, Duration) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
Deprecated.
As this API is not Java compatible.
reduceByWindow(Function2<T, T, T>, Duration, Duration) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
Return a new DStream in which each RDD has a single element generated by reducing all elements in a sliding window over this DStream.
reduceByWindow(Function2<T, T, T>, Function2<T, T, T>, Duration, Duration) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
Return a new DStream in which each RDD has a single element generated by reducing all elements in a sliding window over this DStream.
reduceByWindow(Function2<T, T, T>, Duration, Duration) - Method in class org.apache.spark.streaming.dstream.DStream
Return a new DStream in which each RDD has a single element generated by reducing all elements in a sliding window over this DStream.
reduceByWindow(Function2<T, T, T>, Function2<T, T, T>, Duration, Duration) - Method in class org.apache.spark.streaming.dstream.DStream
Return a new DStream in which each RDD has a single element generated by reducing all elements in a sliding window over this DStream.
reducedStream() - Method in class org.apache.spark.streaming.dstream.ReducedWindowedDStream
 
ReducedWindowedDStream<K,V> - Class in org.apache.spark.streaming.dstream
 
ReducedWindowedDStream(DStream<Tuple2<K, V>>, Function2<V, V, V>, Function2<V, V, V>, Option<Function1<Tuple2<K, V>, Object>>, Duration, Duration, Partitioner, ClassTag<K>, ClassTag<V>) - Constructor for class org.apache.spark.streaming.dstream.ReducedWindowedDStream
 
reduceId() - Method in class org.apache.spark.FetchFailed
 
reduceId() - Method in class org.apache.spark.storage.ShuffleBlockId
 
reduceId() - Method in class org.apache.spark.storage.ShuffleDataBlockId
 
reduceId() - Method in class org.apache.spark.storage.ShuffleIndexBlockId
 
refreshTable(String) - Method in class org.apache.spark.sql.hive.HiveContext
Invalidate and refresh all the cached metadata of the given table.
refreshTable(String, String) - Method in class org.apache.spark.sql.hive.HiveMetastoreCatalog
 
RefreshTable - Class in org.apache.spark.sql.sources
 
RefreshTable(String, String) - Constructor for class org.apache.spark.sql.sources.RefreshTable
 
REGEX() - Static method in class org.apache.spark.streaming.Checkpoint
 
REGEXP() - Static method in class org.apache.spark.sql.hive.HiveQl
 
register(Accumulable<?, ?>, boolean) - Static method in class org.apache.spark.Accumulators
 
register(String, Function0<RT>, TypeTags.TypeTag<RT>) - Method in class org.apache.spark.sql.UDFRegistration
Register a Scala closure of 0 arguments as user-defined function (UDF).
register(String, Function1<A1, RT>, TypeTags.TypeTag<RT>, TypeTags.TypeTag<A1>) - Method in class org.apache.spark.sql.UDFRegistration
Register a Scala closure of 1 argument as user-defined function (UDF).
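For illustration, a sketch assuming a SQLContext named sqlContext and a registered temporary table named people with a string column name:

    sqlContext.udf.register("strLen", (s: String) => s.length)
    val withLengths = sqlContext.sql("SELECT name, strLen(name) FROM people")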
register(String, Function2<A1, A2, RT>, TypeTags.TypeTag<RT>, TypeTags.TypeTag<A1>, TypeTags.TypeTag<A2>) - Method in class org.apache.spark.sql.UDFRegistration
Register a Scala closure of 2 arguments as user-defined function (UDF).
register(String, Function3<A1, A2, A3, RT>, TypeTags.TypeTag<RT>, TypeTags.TypeTag<A1>, TypeTags.TypeTag<A2>, TypeTags.TypeTag<A3>) - Method in class org.apache.spark.sql.UDFRegistration
Register a Scala closure of 3 arguments as user-defined function (UDF).
register(String, Function4<A1, A2, A3, A4, RT>, TypeTags.TypeTag<RT>, TypeTags.TypeTag<A1>, TypeTags.TypeTag<A2>, TypeTags.TypeTag<A3>, TypeTags.TypeTag<A4>) - Method in class org.apache.spark.sql.UDFRegistration
Register a Scala closure of 4 arguments as user-defined function (UDF).
register(String, Function5<A1, A2, A3, A4, A5, RT>, TypeTags.TypeTag<RT>, TypeTags.TypeTag<A1>, TypeTags.TypeTag<A2>, TypeTags.TypeTag<A3>, TypeTags.TypeTag<A4>, TypeTags.TypeTag<A5>) - Method in class org.apache.spark.sql.UDFRegistration
Register a Scala closure of 5 arguments as user-defined function (UDF).
register(String, Function6<A1, A2, A3, A4, A5, A6, RT>, TypeTags.TypeTag<RT>, TypeTags.TypeTag<A1>, TypeTags.TypeTag<A2>, TypeTags.TypeTag<A3>, TypeTags.TypeTag<A4>, TypeTags.TypeTag<A5>, TypeTags.TypeTag<A6>) - Method in class org.apache.spark.sql.UDFRegistration
Register a Scala closure of 6 arguments as user-defined function (UDF).
register(String, Function7<A1, A2, A3, A4, A5, A6, A7, RT>, TypeTags.TypeTag<RT>, TypeTags.TypeTag<A1>, TypeTags.TypeTag<A2>, TypeTags.TypeTag<A3>, TypeTags.TypeTag<A4>, TypeTags.TypeTag<A5>, TypeTags.TypeTag<A6>, TypeTags.TypeTag<A7>) - Method in class org.apache.spark.sql.UDFRegistration
Register a Scala closure of 7 arguments as user-defined function (UDF).
register(String, Function8<A1, A2, A3, A4, A5, A6, A7, A8, RT>, TypeTags.TypeTag<RT>, TypeTags.TypeTag<A1>, TypeTags.TypeTag<A2>, TypeTags.TypeTag<A3>, TypeTags.TypeTag<A4>, TypeTags.TypeTag<A5>, TypeTags.TypeTag<A6>, TypeTags.TypeTag<A7>, TypeTags.TypeTag<A8>) - Method in class org.apache.spark.sql.UDFRegistration
Register a Scala closure of 8 arguments as user-defined function (UDF).
register(String, Function9<A1, A2, A3, A4, A5, A6, A7, A8, A9, RT>, TypeTags.TypeTag<RT>, TypeTags.TypeTag<A1>, TypeTags.TypeTag<A2>, TypeTags.TypeTag<A3>, TypeTags.TypeTag<A4>, TypeTags.TypeTag<A5>, TypeTags.TypeTag<A6>, TypeTags.TypeTag<A7>, TypeTags.TypeTag<A8>, TypeTags.TypeTag<A9>) - Method in class org.apache.spark.sql.UDFRegistration
Register a Scala closure of 9 arguments as user-defined function (UDF).
register(String, Function10<A1, A2, A3, A4, A5, A6, A7, A8, A9, A10, RT>, TypeTags.TypeTag<RT>, TypeTags.TypeTag<A1>, TypeTags.TypeTag<A2>, TypeTags.TypeTag<A3>, TypeTags.TypeTag<A4>, TypeTags.TypeTag<A5>, TypeTags.TypeTag<A6>, TypeTags.TypeTag<A7>, TypeTags.TypeTag<A8>, TypeTags.TypeTag<A9>, TypeTags.TypeTag<A10>) - Method in class org.apache.spark.sql.UDFRegistration
Register a Scala closure of 10 arguments as user-defined function (UDF).
register(String, Function11<A1, A2, A3, A4, A5, A6, A7, A8, A9, A10, A11, RT>, TypeTags.TypeTag<RT>, TypeTags.TypeTag<A1>, TypeTags.TypeTag<A2>, TypeTags.TypeTag<A3>, TypeTags.TypeTag<A4>, TypeTags.TypeTag<A5>, TypeTags.TypeTag<A6>, TypeTags.TypeTag<A7>, TypeTags.TypeTag<A8>, TypeTags.TypeTag<A9>, TypeTags.TypeTag<A10>, TypeTags.TypeTag<A11>) - Method in class org.apache.spark.sql.UDFRegistration
Register a Scala closure of 11 arguments as user-defined function (UDF).
register(String, Function12<A1, A2, A3, A4, A5, A6, A7, A8, A9, A10, A11, A12, RT>, TypeTags.TypeTag<RT>, TypeTags.TypeTag<A1>, TypeTags.TypeTag<A2>, TypeTags.TypeTag<A3>, TypeTags.TypeTag<A4>, TypeTags.TypeTag<A5>, TypeTags.TypeTag<A6>, TypeTags.TypeTag<A7>, TypeTags.TypeTag<A8>, TypeTags.TypeTag<A9>, TypeTags.TypeTag<A10>, TypeTags.TypeTag<A11>, TypeTags.TypeTag<A12>) - Method in class org.apache.spark.sql.UDFRegistration
Register a Scala closure of 12 arguments as user-defined function (UDF).
register(String, Function13<A1, A2, A3, A4, A5, A6, A7, A8, A9, A10, A11, A12, A13, RT>, TypeTags.TypeTag<RT>, TypeTags.TypeTag<A1>, TypeTags.TypeTag<A2>, TypeTags.TypeTag<A3>, TypeTags.TypeTag<A4>, TypeTags.TypeTag<A5>, TypeTags.TypeTag<A6>, TypeTags.TypeTag<A7>, TypeTags.TypeTag<A8>, TypeTags.TypeTag<A9>, TypeTags.TypeTag<A10>, TypeTags.TypeTag<A11>, TypeTags.TypeTag<A12>, TypeTags.TypeTag<A13>) - Method in class org.apache.spark.sql.UDFRegistration
Register a Scala closure of 13 arguments as user-defined function (UDF).
register(String, Function14<A1, A2, A3, A4, A5, A6, A7, A8, A9, A10, A11, A12, A13, A14, RT>, TypeTags.TypeTag<RT>, TypeTags.TypeTag<A1>, TypeTags.TypeTag<A2>, TypeTags.TypeTag<A3>, TypeTags.TypeTag<A4>, TypeTags.TypeTag<A5>, TypeTags.TypeTag<A6>, TypeTags.TypeTag<A7>, TypeTags.TypeTag<A8>, TypeTags.TypeTag<A9>, TypeTags.TypeTag<A10>, TypeTags.TypeTag<A11>, TypeTags.TypeTag<A12>, TypeTags.TypeTag<A13>, TypeTags.TypeTag<A14>) - Method in class org.apache.spark.sql.UDFRegistration
Register a Scala closure of 14 arguments as user-defined function (UDF).
register(String, Function15<A1, A2, A3, A4, A5, A6, A7, A8, A9, A10, A11, A12, A13, A14, A15, RT>, TypeTags.TypeTag<RT>, TypeTags.TypeTag<A1>, TypeTags.TypeTag<A2>, TypeTags.TypeTag<A3>, TypeTags.TypeTag<A4>, TypeTags.TypeTag<A5>, TypeTags.TypeTag<A6>, TypeTags.TypeTag<A7>, TypeTags.TypeTag<A8>, TypeTags.TypeTag<A9>, TypeTags.TypeTag<A10>, TypeTags.TypeTag<A11>, TypeTags.TypeTag<A12>, TypeTags.TypeTag<A13>, TypeTags.TypeTag<A14>, TypeTags.TypeTag<A15>) - Method in class org.apache.spark.sql.UDFRegistration
Register a Scala closure of 15 arguments as user-defined function (UDF).
register(String, Function16<A1, A2, A3, A4, A5, A6, A7, A8, A9, A10, A11, A12, A13, A14, A15, A16, RT>, TypeTags.TypeTag<RT>, TypeTags.TypeTag<A1>, TypeTags.TypeTag<A2>, TypeTags.TypeTag<A3>, TypeTags.TypeTag<A4>, TypeTags.TypeTag<A5>, TypeTags.TypeTag<A6>, TypeTags.TypeTag<A7>, TypeTags.TypeTag<A8>, TypeTags.TypeTag<A9>, TypeTags.TypeTag<A10>, TypeTags.TypeTag<A11>, TypeTags.TypeTag<A12>, TypeTags.TypeTag<A13>, TypeTags.TypeTag<A14>, TypeTags.TypeTag<A15>, TypeTags.TypeTag<A16>) - Method in class org.apache.spark.sql.UDFRegistration
Register a Scala closure of 16 arguments as user-defined function (UDF).
register(String, Function17<A1, A2, A3, A4, A5, A6, A7, A8, A9, A10, A11, A12, A13, A14, A15, A16, A17, RT>, TypeTags.TypeTag<RT>, TypeTags.TypeTag<A1>, TypeTags.TypeTag<A2>, TypeTags.TypeTag<A3>, TypeTags.TypeTag<A4>, TypeTags.TypeTag<A5>, TypeTags.TypeTag<A6>, TypeTags.TypeTag<A7>, TypeTags.TypeTag<A8>, TypeTags.TypeTag<A9>, TypeTags.TypeTag<A10>, TypeTags.TypeTag<A11>, TypeTags.TypeTag<A12>, TypeTags.TypeTag<A13>, TypeTags.TypeTag<A14>, TypeTags.TypeTag<A15>, TypeTags.TypeTag<A16>, TypeTags.TypeTag<A17>) - Method in class org.apache.spark.sql.UDFRegistration
Register a Scala closure of 17 arguments as user-defined function (UDF).
register(String, Function18<A1, A2, A3, A4, A5, A6, A7, A8, A9, A10, A11, A12, A13, A14, A15, A16, A17, A18, RT>, TypeTags.TypeTag<RT>, TypeTags.TypeTag<A1>, TypeTags.TypeTag<A2>, TypeTags.TypeTag<A3>, TypeTags.TypeTag<A4>, TypeTags.TypeTag<A5>, TypeTags.TypeTag<A6>, TypeTags.TypeTag<A7>, TypeTags.TypeTag<A8>, TypeTags.TypeTag<A9>, TypeTags.TypeTag<A10>, TypeTags.TypeTag<A11>, TypeTags.TypeTag<A12>, TypeTags.TypeTag<A13>, TypeTags.TypeTag<A14>, TypeTags.TypeTag<A15>, TypeTags.TypeTag<A16>, TypeTags.TypeTag<A17>, TypeTags.TypeTag<A18>) - Method in class org.apache.spark.sql.UDFRegistration
Register a Scala closure of 18 arguments as user-defined function (UDF).
register(String, Function19<A1, A2, A3, A4, A5, A6, A7, A8, A9, A10, A11, A12, A13, A14, A15, A16, A17, A18, A19, RT>, TypeTags.TypeTag<RT>, TypeTags.TypeTag<A1>, TypeTags.TypeTag<A2>, TypeTags.TypeTag<A3>, TypeTags.TypeTag<A4>, TypeTags.TypeTag<A5>, TypeTags.TypeTag<A6>, TypeTags.TypeTag<A7>, TypeTags.TypeTag<A8>, TypeTags.TypeTag<A9>, TypeTags.TypeTag<A10>, TypeTags.TypeTag<A11>, TypeTags.TypeTag<A12>, TypeTags.TypeTag<A13>, TypeTags.TypeTag<A14>, TypeTags.TypeTag<A15>, TypeTags.TypeTag<A16>, TypeTags.TypeTag<A17>, TypeTags.TypeTag<A18>, TypeTags.TypeTag<A19>) - Method in class org.apache.spark.sql.UDFRegistration
Register a Scala closure of 19 arguments as user-defined function (UDF).
register(String, Function20<A1, A2, A3, A4, A5, A6, A7, A8, A9, A10, A11, A12, A13, A14, A15, A16, A17, A18, A19, A20, RT>, TypeTags.TypeTag<RT>, TypeTags.TypeTag<A1>, TypeTags.TypeTag<A2>, TypeTags.TypeTag<A3>, TypeTags.TypeTag<A4>, TypeTags.TypeTag<A5>, TypeTags.TypeTag<A6>, TypeTags.TypeTag<A7>, TypeTags.TypeTag<A8>, TypeTags.TypeTag<A9>, TypeTags.TypeTag<A10>, TypeTags.TypeTag<A11>, TypeTags.TypeTag<A12>, TypeTags.TypeTag<A13>, TypeTags.TypeTag<A14>, TypeTags.TypeTag<A15>, TypeTags.TypeTag<A16>, TypeTags.TypeTag<A17>, TypeTags.TypeTag<A18>, TypeTags.TypeTag<A19>, TypeTags.TypeTag<A20>) - Method in class org.apache.spark.sql.UDFRegistration
Register a Scala closure of 20 arguments as user-defined function (UDF).
register(String, Function21<A1, A2, A3, A4, A5, A6, A7, A8, A9, A10, A11, A12, A13, A14, A15, A16, A17, A18, A19, A20, A21, RT>, TypeTags.TypeTag<RT>, TypeTags.TypeTag<A1>, TypeTags.TypeTag<A2>, TypeTags.TypeTag<A3>, TypeTags.TypeTag<A4>, TypeTags.TypeTag<A5>, TypeTags.TypeTag<A6>, TypeTags.TypeTag<A7>, TypeTags.TypeTag<A8>, TypeTags.TypeTag<A9>, TypeTags.TypeTag<A10>, TypeTags.TypeTag<A11>, TypeTags.TypeTag<A12>, TypeTags.TypeTag<A13>, TypeTags.TypeTag<A14>, TypeTags.TypeTag<A15>, TypeTags.TypeTag<A16>, TypeTags.TypeTag<A17>, TypeTags.TypeTag<A18>, TypeTags.TypeTag<A19>, TypeTags.TypeTag<A20>, TypeTags.TypeTag<A21>) - Method in class org.apache.spark.sql.UDFRegistration
Register a Scala closure of 21 arguments as user-defined function (UDF).
register(String, Function22<A1, A2, A3, A4, A5, A6, A7, A8, A9, A10, A11, A12, A13, A14, A15, A16, A17, A18, A19, A20, A21, A22, RT>, TypeTags.TypeTag<RT>, TypeTags.TypeTag<A1>, TypeTags.TypeTag<A2>, TypeTags.TypeTag<A3>, TypeTags.TypeTag<A4>, TypeTags.TypeTag<A5>, TypeTags.TypeTag<A6>, TypeTags.TypeTag<A7>, TypeTags.TypeTag<A8>, TypeTags.TypeTag<A9>, TypeTags.TypeTag<A10>, TypeTags.TypeTag<A11>, TypeTags.TypeTag<A12>, TypeTags.TypeTag<A13>, TypeTags.TypeTag<A14>, TypeTags.TypeTag<A15>, TypeTags.TypeTag<A16>, TypeTags.TypeTag<A17>, TypeTags.TypeTag<A18>, TypeTags.TypeTag<A19>, TypeTags.TypeTag<A20>, TypeTags.TypeTag<A21>, TypeTags.TypeTag<A22>) - Method in class org.apache.spark.sql.UDFRegistration
Register a Scala closure of 22 arguments as user-defined function (UDF).
register(String, UDF1<?, ?>, DataType) - Method in class org.apache.spark.sql.UDFRegistration
Register a user-defined function with 1 argument.
register(String, UDF2<?, ?, ?>, DataType) - Method in class org.apache.spark.sql.UDFRegistration
Register a user-defined function with 2 arguments.
register(String, UDF3<?, ?, ?, ?>, DataType) - Method in class org.apache.spark.sql.UDFRegistration
Register a user-defined function with 3 arguments.
register(String, UDF4<?, ?, ?, ?, ?>, DataType) - Method in class org.apache.spark.sql.UDFRegistration
Register a user-defined function with 4 arguments.
register(String, UDF5<?, ?, ?, ?, ?, ?>, DataType) - Method in class org.apache.spark.sql.UDFRegistration
Register a user-defined function with 5 arguments.
register(String, UDF6<?, ?, ?, ?, ?, ?, ?>, DataType) - Method in class org.apache.spark.sql.UDFRegistration
Register a user-defined function with 6 arguments.
register(String, UDF7<?, ?, ?, ?, ?, ?, ?, ?>, DataType) - Method in class org.apache.spark.sql.UDFRegistration
Register a user-defined function with 7 arguments.
register(String, UDF8<?, ?, ?, ?, ?, ?, ?, ?, ?>, DataType) - Method in class org.apache.spark.sql.UDFRegistration
Register a user-defined function with 8 arguments.
register(String, UDF9<?, ?, ?, ?, ?, ?, ?, ?, ?, ?>, DataType) - Method in class org.apache.spark.sql.UDFRegistration
Register a user-defined function with 9 arguments.
register(String, UDF10<?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?>, DataType) - Method in class org.apache.spark.sql.UDFRegistration
Register a user-defined function with 10 arguments.
register(String, UDF11<?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?>, DataType) - Method in class org.apache.spark.sql.UDFRegistration
Register a user-defined function with 11 arguments.
register(String, UDF12<?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?>, DataType) - Method in class org.apache.spark.sql.UDFRegistration
Register a user-defined function with 12 arguments.
register(String, UDF13<?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?>, DataType) - Method in class org.apache.spark.sql.UDFRegistration
Register a user-defined function with 13 arguments.
register(String, UDF14<?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?>, DataType) - Method in class org.apache.spark.sql.UDFRegistration
Register a user-defined function with 14 arguments.
register(String, UDF15<?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?>, DataType) - Method in class org.apache.spark.sql.UDFRegistration
Register a user-defined function with 15 arguments.
register(String, UDF16<?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?>, DataType) - Method in class org.apache.spark.sql.UDFRegistration
Register a user-defined function with 16 arguments.
register(String, UDF17<?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?>, DataType) - Method in class org.apache.spark.sql.UDFRegistration
Register a user-defined function with 17 arguments.
register(String, UDF18<?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?>, DataType) - Method in class org.apache.spark.sql.UDFRegistration
Register a user-defined function with 18 arguments.
register(String, UDF19<?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?>, DataType) - Method in class org.apache.spark.sql.UDFRegistration
Register a user-defined function with 19 arguments.
register(String, UDF20<?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?>, DataType) - Method in class org.apache.spark.sql.UDFRegistration
Register a user-defined function with 20 arguments.
register(String, UDF21<?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?>, DataType) - Method in class org.apache.spark.sql.UDFRegistration
Register a user-defined function with 21 arguments.
register(String, UDF22<?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?>, DataType) - Method in class org.apache.spark.sql.UDFRegistration
Register a user-defined function with 22 arguments.
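For illustration, a minimal Scala sketch of registering a one-argument UDF through SQLContext.udf; the application name, the UDF name "strLen", and the local master setting are assumptions for the example, not part of the API description above.

    import org.apache.spark.{SparkConf, SparkContext}
    import org.apache.spark.sql.SQLContext

    val sc = new SparkContext(new SparkConf().setAppName("udf-example").setMaster("local[*]"))
    val sqlContext = new SQLContext(sc)
    // Register a Scala closure of 1 argument as a UDF named "strLen" (hypothetical name);
    // the return type and argument type are inferred from the closure's TypeTags.
    sqlContext.udf.register("strLen", (s: String) => s.length)
    // The UDF can then be used by name in SQL queries against registered tables.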
register() - Method in class org.apache.spark.streaming.dstream.DStream
Register this DStream as an output stream.
register(Logger) - Static method in class org.apache.spark.util.SignalLogger
Register a signal handler to log signals on UNIX-like systems.
registerBlockManager(BlockManagerId, long, ActorRef) - Method in class org.apache.spark.storage.BlockManagerMaster
Register the BlockManager's id with the driver.
registerBroadcastForCleanup(Broadcast<T>) - Method in class org.apache.spark.ContextCleaner
Register a Broadcast for cleanup when it is garbage collected.
registerClasses(Kryo) - Method in class org.apache.spark.graphx.GraphKryoRegistrator
 
registerClasses(Kryo) - Method in interface org.apache.spark.serializer.KryoRegistrator
 
registerDataFrameAsTable(DataFrame, String) - Method in class org.apache.spark.sql.SQLContext
Registers the given DataFrame as a temporary table in the catalog.
registered(SchedulerDriver, Protos.FrameworkID, Protos.MasterInfo) - Method in class org.apache.spark.scheduler.cluster.mesos.CoarseMesosSchedulerBackend
 
registered(SchedulerDriver, Protos.FrameworkID, Protos.MasterInfo) - Method in class org.apache.spark.scheduler.cluster.mesos.MesosSchedulerBackend
 
registeredLock() - Method in class org.apache.spark.scheduler.cluster.mesos.CoarseMesosSchedulerBackend
 
registeredLock() - Method in class org.apache.spark.scheduler.cluster.mesos.MesosSchedulerBackend
 
registerKryoClasses(SparkConf) - Static method in class org.apache.spark.graphx.GraphXUtils
Registers classes that GraphX uses with Kryo.
registerKryoClasses(Class<?>[]) - Method in class org.apache.spark.SparkConf
Use Kryo serialization and register the given set of classes with Kryo.
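A short sketch of SparkConf.registerKryoClasses; the classes MyVertex and MyEdge are hypothetical application types used only to show the call shape.

    import org.apache.spark.SparkConf

    // Hypothetical application classes to register with Kryo.
    class MyVertex(val id: Long)
    class MyEdge(val src: Long, val dst: Long)

    val conf = new SparkConf()
      .setAppName("kryo-example")
      // Enables Kryo serialization and registers the listed classes,
      // which avoids writing full class names into serialized data.
      .registerKryoClasses(Array(classOf[MyVertex], classOf[MyEdge]))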
registerMapOutput(int, int, MapStatus) - Method in class org.apache.spark.MapOutputTrackerMaster
 
registerMapOutputs(int, MapStatus[], boolean) - Method in class org.apache.spark.MapOutputTrackerMaster
Register multiple map output statuses for the given shuffle.
registerRDDForCleanup(RDD<?>) - Method in class org.apache.spark.ContextCleaner
Register an RDD for cleanup when it is garbage collected.
RegisterReceiver - Class in org.apache.spark.streaming.scheduler
 
RegisterReceiver(int, String, String, ActorRef) - Constructor for class org.apache.spark.streaming.scheduler.RegisterReceiver
 
registerShuffle(int, int) - Method in class org.apache.spark.MapOutputTrackerMaster
 
registerShuffleForCleanup(ShuffleDependency<?, ?, ?>) - Method in class org.apache.spark.ContextCleaner
Register a ShuffleDependency for cleanup when it is garbage collected.
registerShutdownDeleteDir(File) - Static method in class org.apache.spark.util.Utils
 
registerShutdownDeleteDir(TachyonFile) - Static method in class org.apache.spark.util.Utils
 
registerSource(Source) - Method in class org.apache.spark.metrics.MetricsSystem
 
registerTable(Seq<String>, LogicalPlan) - Method in class org.apache.spark.sql.hive.HiveMetastoreCatalog
UNIMPLEMENTED: It needs to be decided how we will persist in-memory tables to the metastore.
registerTempTable(String) - Method in class org.apache.spark.sql.DataFrame
Registers this RDD as a temporary table using the given name.
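A minimal sketch of registerTempTable; it assumes an existing SQLContext named sqlContext and a DataFrame df with name and age columns, and the table name "people" is hypothetical.

    // Register the DataFrame under a temporary table name visible to SQL queries
    // issued through the same SQLContext.
    df.registerTempTable("people")
    val adults = sqlContext.sql("SELECT name FROM people WHERE age >= 18")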
registrationDone() - Method in class org.apache.spark.scheduler.cluster.SparkDeploySchedulerBackend
 
registrationLock() - Method in class org.apache.spark.scheduler.cluster.SparkDeploySchedulerBackend
 
registry() - Method in class org.apache.spark.metrics.sink.ConsoleSink
 
registry() - Method in class org.apache.spark.metrics.sink.CsvSink
 
registry() - Method in class org.apache.spark.metrics.sink.GraphiteSink
 
registry() - Method in class org.apache.spark.metrics.sink.JmxSink
 
registry() - Method in class org.apache.spark.metrics.sink.MetricsServlet
 
regParam() - Method in interface org.apache.spark.ml.param.HasRegParam
Param for the regularization parameter.
Regression() - Static method in class org.apache.spark.mllib.tree.configuration.Algo
 
RegressionMetrics - Class in org.apache.spark.mllib.evaluation
:: Experimental :: Evaluator for regression.
RegressionMetrics(RDD<Tuple2<Object, Object>>) - Constructor for class org.apache.spark.mllib.evaluation.RegressionMetrics
 
RegressionModel<FeaturesType,M extends RegressionModel<FeaturesType,M>> - Class in org.apache.spark.ml.regression
:: AlphaComponent ::
RegressionModel() - Constructor for class org.apache.spark.ml.regression.RegressionModel
 
RegressionModel - Interface in org.apache.spark.mllib.regression
 
Regressor<FeaturesType,Learner extends Regressor<FeaturesType,Learner,M>,M extends RegressionModel<FeaturesType,M>> - Class in org.apache.spark.ml.regression
:: AlphaComponent ::
Regressor() - Constructor for class org.apache.spark.ml.regression.Regressor
 
RegressorParams - Interface in org.apache.spark.ml.regression
:: DeveloperApi :: Params for regression.
reindex() - Method in class org.apache.spark.graphx.impl.VertexPartitionBaseOps
Construct a new VertexPartition whose index contains only the vertices in the mask.
reindex() - Method in class org.apache.spark.graphx.impl.VertexRDDImpl
 
reindex() - Method in class org.apache.spark.graphx.VertexRDD
Construct a new VertexRDD that is indexed by only the visible vertices.
relation() - Method in class org.apache.spark.sql.columnar.InMemoryColumnarTableScan
 
relation() - Method in class org.apache.spark.sql.hive.execution.HiveTableScan
 
relation() - Method in class org.apache.spark.sql.parquet.InsertIntoParquetTable
 
relation() - Method in class org.apache.spark.sql.parquet.ParquetTableScan
 
relation() - Method in class org.apache.spark.sql.sources.LogicalRelation
 
relation() - Method in class org.apache.spark.sql.sources.ResolvedDataSource
 
RelationProvider - Interface in org.apache.spark.sql.sources
::DeveloperApi:: Implemented by objects that produce relations for a specific kind of data source.
relativeDirection(long) - Method in class org.apache.spark.graphx.Edge
Return the relative direction of the edge to the corresponding vertex.
releasePythonWorker(String, Map<String, String>, Socket) - Method in class org.apache.spark.SparkEnv
 
releaseUnrollMemoryForThisThread(long) - Method in class org.apache.spark.storage.MemoryStore
Release memory used by this thread for unrolling blocks.
ReliableKafkaReceiver<K,V,U extends kafka.serializer.Decoder<?>,T extends kafka.serializer.Decoder<?>> - Class in org.apache.spark.streaming.kafka
ReliableKafkaReceiver offers the ability to reliably store data into BlockManager without loss.
ReliableKafkaReceiver(Map<String, String>, Map<String, Object>, StorageLevel, ClassTag<K>, ClassTag<V>, ClassTag<U>, ClassTag<T>) - Constructor for class org.apache.spark.streaming.kafka.ReliableKafkaReceiver
 
remainingMem() - Method in class org.apache.spark.storage.BlockManagerInfo
 
remember(Duration) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
Sets each DStream in this context to remember the RDDs it generated in the last given duration.
remember(Duration) - Method in class org.apache.spark.streaming.dstream.DStream
 
remember(Duration) - Method in class org.apache.spark.streaming.DStreamGraph
 
remember(Duration) - Method in class org.apache.spark.streaming.StreamingContext
Sets each DStream in this context to remember the RDDs it generated in the last given duration.
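A small sketch of StreamingContext.remember; it assumes an existing SparkContext sc, and the batch interval and remember duration are illustrative values.

    import org.apache.spark.streaming.{Minutes, Seconds, StreamingContext}

    val ssc = new StreamingContext(sc, Seconds(10))
    // Keep the RDDs generated by each DStream for at least one minute,
    // e.g. so they can still be queried after their batch has completed.
    ssc.remember(Minutes(1))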
rememberDuration() - Method in class org.apache.spark.streaming.dstream.DStream
 
rememberDuration() - Method in class org.apache.spark.streaming.DStreamGraph
 
remove(String) - Method in class org.apache.spark.SparkConf
Remove a parameter from the configuration
remove(BlockId) - Method in class org.apache.spark.storage.BlockStore
Remove a block, if it exists.
remove(BlockId) - Method in class org.apache.spark.storage.DiskStore
 
remove(BlockId) - Method in class org.apache.spark.storage.MemoryStore
 
remove(BlockId) - Method in class org.apache.spark.storage.TachyonStore
 
removeBlock(BlockId, boolean) - Method in class org.apache.spark.storage.BlockManager
Remove a block from both memory and disk.
removeBlock(BlockId) - Method in class org.apache.spark.storage.BlockManagerInfo
 
removeBlock(BlockId) - Method in class org.apache.spark.storage.BlockManagerMaster
Remove a block from the slaves that have it.
removeBlock(BlockId) - Method in class org.apache.spark.storage.StorageStatus
Remove the given block from this storage status.
removeBlocks() - Method in class org.apache.spark.rdd.BlockRDD
Remove the data blocks that this BlockRDD is made from.
removeBroadcast(long, boolean) - Method in class org.apache.spark.storage.BlockManager
Remove all blocks belonging to the given broadcast.
removeBroadcast(long, boolean, boolean) - Method in class org.apache.spark.storage.BlockManagerMaster
Remove all blocks belonging to the given broadcast.
removeExecutor(String, String) - Method in class org.apache.spark.scheduler.cluster.CoarseGrainedSchedulerBackend.DriverActor
 
removeExecutor(String, String) - Method in class org.apache.spark.scheduler.cluster.CoarseGrainedSchedulerBackend
 
removeExecutor(String) - Method in class org.apache.spark.storage.BlockManagerMaster
Remove a dead executor from the driver actor.
removeFile(TachyonFile) - Method in class org.apache.spark.storage.TachyonBlockManager
 
removeFromDriver() - Method in class org.apache.spark.storage.BlockManagerMessages.RemoveBroadcast
 
removeOutputLoc(int, BlockManagerId) - Method in class org.apache.spark.scheduler.Stage
 
removeOutputsOnExecutor(String) - Method in class org.apache.spark.scheduler.Stage
Removes all shuffle outputs associated with this executor.
removeRdd(int) - Method in class org.apache.spark.storage.BlockManager
Remove all blocks belonging to the given RDD.
removeRdd(int, boolean) - Method in class org.apache.spark.storage.BlockManagerMaster
Remove all blocks belonging to the given RDD.
removeRunningTask(long) - Method in class org.apache.spark.scheduler.TaskSetManager
If the given task ID is in the set of running tasks, removes it.
removeSchedulable(Schedulable) - Method in class org.apache.spark.scheduler.Pool
 
removeSchedulable(Schedulable) - Method in interface org.apache.spark.scheduler.Schedulable
 
removeSchedulable(Schedulable) - Method in class org.apache.spark.scheduler.TaskSetManager
 
removeShuffle(int, boolean) - Method in class org.apache.spark.storage.BlockManagerMaster
Remove all blocks belonging to the given shuffle.
removeSource(Source) - Method in class org.apache.spark.metrics.MetricsSystem
 
render(HttpServletRequest) - Method in class org.apache.spark.streaming.ui.StreamingPage
Render the page
render(HttpServletRequest) - Method in class org.apache.spark.ui.env.EnvironmentPage
 
render(HttpServletRequest) - Method in class org.apache.spark.ui.exec.ExecutorsPage
 
render(HttpServletRequest) - Method in class org.apache.spark.ui.exec.ExecutorThreadDumpPage
 
render(HttpServletRequest) - Method in class org.apache.spark.ui.jobs.AllJobsPage
 
render(HttpServletRequest) - Method in class org.apache.spark.ui.jobs.AllStagesPage
 
render(HttpServletRequest) - Method in class org.apache.spark.ui.jobs.JobPage
 
render(HttpServletRequest) - Method in class org.apache.spark.ui.jobs.PoolPage
 
render(HttpServletRequest) - Method in class org.apache.spark.ui.jobs.StagePage
 
render(HttpServletRequest) - Method in class org.apache.spark.ui.storage.RDDPage
 
render(HttpServletRequest) - Method in class org.apache.spark.ui.storage.StoragePage
 
render(HttpServletRequest) - Method in class org.apache.spark.ui.WebUIPage
 
renderJson(HttpServletRequest) - Method in class org.apache.spark.ui.WebUIPage
 
repartition(int) - Method in class org.apache.spark.api.java.JavaDoubleRDD
Return a new RDD that has exactly numPartitions partitions.
repartition(int) - Method in class org.apache.spark.api.java.JavaPairRDD
Return a new RDD that has exactly numPartitions partitions.
repartition(int) - Method in class org.apache.spark.api.java.JavaRDD
Return a new RDD that has exactly numPartitions partitions.
repartition(int, Ordering<T>) - Method in class org.apache.spark.rdd.RDD
Return a new RDD that has exactly numPartitions partitions.
repartition(int) - Method in class org.apache.spark.sql.DataFrame
Returns a new DataFrame that has exactly numPartitions partitions.
repartition(int) - Method in interface org.apache.spark.sql.RDDApi
 
repartition(int) - Method in class org.apache.spark.streaming.api.java.JavaDStream
Return a new DStream with an increased or decreased level of parallelism.
repartition(int) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
Return a new DStream with an increased or decreased level of parallelism.
repartition(int) - Method in class org.apache.spark.streaming.dstream.DStream
Return a new DStream with an increased or decreased level of parallelism.
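A minimal sketch of repartition on an RDD; it assumes an existing SparkContext sc, and the sizes are illustrative.

    val rdd = sc.parallelize(1 to 1000, 2)
    // Produce a new RDD with exactly 8 partitions; the data is redistributed via a shuffle.
    val wider = rdd.repartition(8)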
repartitionAndSortWithinPartitions(Partitioner) - Method in class org.apache.spark.api.java.JavaPairRDD
Repartition the RDD according to the given partitioner and, within each resulting partition, sort records by their keys.
repartitionAndSortWithinPartitions(Partitioner, Comparator<K>) - Method in class org.apache.spark.api.java.JavaPairRDD
Repartition the RDD according to the given partitioner and, within each resulting partition, sort records by their keys.
repartitionAndSortWithinPartitions(Partitioner) - Method in class org.apache.spark.rdd.OrderedRDDFunctions
Repartition the RDD according to the given partitioner and, within each resulting partition, sort records by their keys.
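A short sketch of repartitionAndSortWithinPartitions; it assumes an existing SparkContext sc, and the key-value pairs are illustrative.

    import org.apache.spark.HashPartitioner

    val pairs = sc.parallelize(Seq((3, "c"), (1, "a"), (2, "b")))
    // Partition by key hash and sort records by key inside each partition,
    // pushing the sort into the shuffle machinery instead of sorting afterwards.
    val partitionedAndSorted = pairs.repartitionAndSortWithinPartitions(new HashPartitioner(2))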
replay(InputStream, String) - Method in class org.apache.spark.scheduler.ReplayListenerBus
Replay each event in the order maintained in the given stream.
ReplayListenerBus - Class in org.apache.spark.scheduler
A SparkListenerBus that can be used to replay events from serialized event data.
ReplayListenerBus() - Constructor for class org.apache.spark.scheduler.ReplayListenerBus
 
replicatedVertexView() - Method in class org.apache.spark.graphx.impl.GraphImpl
 
ReplicatedVertexView<VD,ED> - Class in org.apache.spark.graphx.impl
Manages shipping vertex attributes to the edge partitions of an EdgeRDD.
ReplicatedVertexView(EdgeRDDImpl<ED, VD>, boolean, boolean, ClassTag<VD>, ClassTag<ED>) - Constructor for class org.apache.spark.graphx.impl.ReplicatedVertexView
 
replication() - Method in class org.apache.spark.storage.StorageLevel
 
report() - Method in class org.apache.spark.metrics.MetricsSystem
 
report() - Method in class org.apache.spark.metrics.sink.ConsoleSink
 
report() - Method in class org.apache.spark.metrics.sink.CsvSink
 
report() - Method in class org.apache.spark.metrics.sink.GraphiteSink
 
report() - Method in class org.apache.spark.metrics.sink.JmxSink
 
report() - Method in class org.apache.spark.metrics.sink.MetricsServlet
 
report() - Method in interface org.apache.spark.metrics.sink.Sink
 
reporter() - Method in class org.apache.spark.metrics.sink.ConsoleSink
 
reporter() - Method in class org.apache.spark.metrics.sink.CsvSink
 
reporter() - Method in class org.apache.spark.metrics.sink.GraphiteSink
 
reporter() - Method in class org.apache.spark.metrics.sink.JmxSink
 
reportError(String, Throwable) - Method in class org.apache.spark.streaming.receiver.Receiver
Report exceptions in receiving data.
reportError(String, Throwable) - Method in class org.apache.spark.streaming.receiver.ReceiverSupervisor
Report errors.
reportError(String, Throwable) - Method in class org.apache.spark.streaming.receiver.ReceiverSupervisorImpl
Report error to the receiver tracker
reportError(String, Throwable) - Method in class org.apache.spark.streaming.scheduler.JobScheduler
 
ReportError - Class in org.apache.spark.streaming.scheduler
 
ReportError(int, String, String) - Constructor for class org.apache.spark.streaming.scheduler.ReportError
 
requestedAttributes() - Method in class org.apache.spark.sql.hive.execution.HiveTableScan
 
requestedPartitionOrdinals() - Method in class org.apache.spark.sql.parquet.ParquetTableScan
 
requestedTotal() - Method in class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.RequestExecutors
 
requestExecutors(int) - Method in interface org.apache.spark.ExecutorAllocationClient
Request an additional number of executors from the cluster manager.
requestExecutors(int) - Method in class org.apache.spark.scheduler.cluster.CoarseGrainedSchedulerBackend
Request an additional number of executors from the cluster manager.
requestExecutors(int) - Method in class org.apache.spark.SparkContext
:: DeveloperApi :: Request an additional number of executors from the cluster manager.
requestTotalExecutors(int) - Method in interface org.apache.spark.ExecutorAllocationClient
Express a preference to the cluster manager for a given total number of executors.
requestTotalExecutors(int) - Method in class org.apache.spark.scheduler.cluster.CoarseGrainedSchedulerBackend
Express a preference to the cluster manager for a given total number of executors.
requestTotalExecutors(int) - Method in class org.apache.spark.SparkContext
Express a preference to the cluster manager for a given total number of executors.
reregister() - Method in class org.apache.spark.storage.BlockManager
Re-register with the master and report all blocks to it.
reregisterBlockManager() - Method in class org.apache.spark.HeartbeatResponse
 
reregistered(SchedulerDriver, Protos.MasterInfo) - Method in class org.apache.spark.scheduler.cluster.mesos.CoarseMesosSchedulerBackend
 
reregistered(SchedulerDriver, Protos.MasterInfo) - Method in class org.apache.spark.scheduler.cluster.mesos.MesosSchedulerBackend
 
res() - Method in class org.apache.spark.mllib.optimization.NNLS.Workspace
 
reservedSizeBytes() - Static method in class org.apache.spark.util.AkkaUtils
Space reserved for extra data in an Akka message besides serialized task or task result.
reserveUnrollMemoryForThisThread(long) - Method in class org.apache.spark.storage.MemoryStore
Reserve additional memory for unrolling blocks used by this thread.
reservoirSampleAndCount(Iterator<T>, int, long, ClassTag<T>) - Static method in class org.apache.spark.util.random.SamplingUtils
Reservoir sampling implementation that also returns the input size.
reset() - Method in class org.apache.spark.ml.recommendation.ALS.NormalEquation
Resets everything to zero, which should be called after each solve.
resetIterator() - Method in class org.apache.spark.rdd.PartitionCoalescer.LocationIterator
 
resolveClass(ObjectStreamClass) - Method in class org.apache.spark.streaming.ObjectInputStreamWithLoader
 
resolved() - Method in class org.apache.spark.sql.hive.InsertIntoHiveTable
 
ResolvedDataSource - Class in org.apache.spark.sql.sources
Create a ResolvedDataSource for saving the content of the given DataFrame.
ResolvedDataSource(Class<?>, BaseRelation) - Constructor for class org.apache.spark.sql.sources.ResolvedDataSource
 
resolvePartitions(Seq<ParquetRelation2.PartitionValues>) - Static method in class org.apache.spark.sql.parquet.ParquetRelation2
Resolves possible type conflicts between partitions by up-casting "lower" types.
resolveTable(String, String) - Static method in class org.apache.spark.sql.jdbc.JDBCRDD
Takes a (schema, table) specification and returns the table's Catalyst schema.
ResolveUdtfsAlias - Class in org.apache.spark.sql.hive
Resolve Udtfs Alias.
ResolveUdtfsAlias() - Constructor for class org.apache.spark.sql.hive.ResolveUdtfsAlias
 
resolveURI(String, boolean) - Static method in class org.apache.spark.util.Utils
Return a well-formed URI for the file described by a user input string.
resolveURIs(String, boolean) - Static method in class org.apache.spark.util.Utils
Resolve a comma-separated list of paths.
resourceOffer(String, String, Enumeration.Value) - Method in class org.apache.spark.scheduler.TaskSetManager
Respond to an offer of a single executor from the scheduler by finding a task
resourceOffers(SchedulerDriver, List<Protos.Offer>) - Method in class org.apache.spark.scheduler.cluster.mesos.CoarseMesosSchedulerBackend
Method called by Mesos to offer resources on slaves.
resourceOffers(SchedulerDriver, List<Protos.Offer>) - Method in class org.apache.spark.scheduler.cluster.mesos.MesosSchedulerBackend
Method called by Mesos to offer resources on slaves.
resourceOffers(Seq<WorkerOffer>) - Method in class org.apache.spark.scheduler.TaskSchedulerImpl
Called by cluster manager to offer resources on slaves.
responder() - Method in class org.apache.spark.streaming.flume.FlumeReceiver
 
responder() - Method in class org.apache.spark.ui.JettyUtils.ServletParams
 
restart(Time) - Method in class org.apache.spark.streaming.DStreamGraph
 
restart(String) - Method in class org.apache.spark.streaming.receiver.Receiver
Restart the receiver.
restart(String, Throwable) - Method in class org.apache.spark.streaming.receiver.Receiver
Restart the receiver.
restart(String, Throwable, int) - Method in class org.apache.spark.streaming.receiver.Receiver
Restart the receiver.
restartReceiver(String, Option<Throwable>) - Method in class org.apache.spark.streaming.receiver.ReceiverSupervisor
Restart receiver with delay
restartReceiver(String, Option<Throwable>, int) - Method in class org.apache.spark.streaming.receiver.ReceiverSupervisor
Restart receiver with delay
restore() - Method in class org.apache.spark.streaming.dstream.DStreamCheckpointData
Restore the checkpoint data.
restore() - Method in class org.apache.spark.streaming.dstream.FileInputDStream.FileInputDStreamCheckpointData
 
restore() - Method in class org.apache.spark.streaming.kafka.DirectKafkaInputDStream.DirectKafkaInputDStreamCheckpointData
 
restoreCheckpointData() - Method in class org.apache.spark.streaming.dstream.DStream
Restore the RDDs in generatedRDDs from the checkpointData.
restoreCheckpointData() - Method in class org.apache.spark.streaming.DStreamGraph
 
RESUBMIT_TIMEOUT() - Static method in class org.apache.spark.scheduler.DAGScheduler
 
resubmitFailedStages() - Method in class org.apache.spark.scheduler.DAGScheduler
Resubmit any failed stages.
ResubmitFailedStages - Class in org.apache.spark.scheduler
 
ResubmitFailedStages() - Constructor for class org.apache.spark.scheduler.ResubmitFailedStages
 
Resubmitted - Class in org.apache.spark
:: DeveloperApi :: A ShuffleMapTask that completed successfully earlier, but we lost the executor before the stage completed.
Resubmitted() - Constructor for class org.apache.spark.Resubmitted
 
result(Duration, CanAwait) - Method in class org.apache.spark.ComplexFutureAction
 
result(Duration, CanAwait) - Method in interface org.apache.spark.FutureAction
Awaits and returns the result (of type T) of this action.
result() - Method in class org.apache.spark.scheduler.CompletionEvent
 
result(Duration, CanAwait) - Method in class org.apache.spark.SimpleFutureAction
 
result() - Method in class org.apache.spark.streaming.scheduler.Job
 
RESULT_SERIALIZATION_TIME() - Static method in class org.apache.spark.ui.jobs.TaskDetailsClassNames
 
RESULT_SERIALIZATION_TIME() - Static method in class org.apache.spark.ui.ToolTips
 
resultObject() - Method in class org.apache.spark.partial.ApproximateActionListener
 
resultOfJob() - Method in class org.apache.spark.scheduler.Stage
For final stages (those consisting only of ResultTasks), link to the ActiveJob.
resultSetToObjectArray(ResultSet) - Static method in class org.apache.spark.rdd.JdbcRDD
 
ResultTask<T,U> - Class in org.apache.spark.scheduler
A task that sends back the output to the driver application.
ResultTask(int, Broadcast<byte[]>, Partition, Seq<TaskLocation>, int) - Constructor for class org.apache.spark.scheduler.ResultTask
 
ResultWithDroppedBlocks - Class in org.apache.spark.storage
 
ResultWithDroppedBlocks(boolean, Seq<Tuple2<BlockId, BlockStatus>>) - Constructor for class org.apache.spark.storage.ResultWithDroppedBlocks
 
retag(Class<T>) - Method in class org.apache.spark.rdd.RDD
Private API for changing an RDD's ClassTag.
retag(ClassTag<T>) - Method in class org.apache.spark.rdd.RDD
Private API for changing an RDD's ClassTag.
RETAINED_FILES_PROPERTY() - Static method in class org.apache.spark.util.logging.RollingFileAppender
 
retainedCompletedBatches() - Method in class org.apache.spark.streaming.ui.StreamingJobProgressListener
 
retainedJobs() - Method in class org.apache.spark.ui.jobs.JobProgressListener
 
retainedStages() - Method in class org.apache.spark.ui.jobs.JobProgressListener
 
retryRandom(Function0<T>, int, int) - Static method in class org.apache.spark.streaming.kinesis.KinesisRecordProcessor
Retry the given number of times with a random backoff time (in milliseconds) less than the given maxBackOffMillis.
retryWaitMs(SparkConf) - Static method in class org.apache.spark.util.AkkaUtils
Returns the configured number of milliseconds to wait on each retry
returnInspector() - Method in class org.apache.spark.sql.hive.HiveSimpleUdf
 
ReturnStatementFinder - Class in org.apache.spark.util
 
ReturnStatementFinder() - Constructor for class org.apache.spark.util.ReturnStatementFinder
 
reverse() - Method in class org.apache.spark.graphx.EdgeDirection
Reverse the direction of an edge.
reverse() - Method in class org.apache.spark.graphx.EdgeRDD
Reverse all the edges in this RDD.
reverse() - Method in class org.apache.spark.graphx.Graph
Reverses all edges in the graph.
reverse() - Method in class org.apache.spark.graphx.impl.EdgePartition
Reverse all the edges in this partition.
reverse() - Method in class org.apache.spark.graphx.impl.EdgeRDDImpl
 
reverse() - Method in class org.apache.spark.graphx.impl.GraphImpl
 
reverse() - Method in class org.apache.spark.graphx.impl.ReplicatedVertexView
Return a new ReplicatedVertexView where edges are reversed and shipping levels are swapped to match.
reverse() - Method in class org.apache.spark.graphx.impl.RoutingTablePartition
Returns a new RoutingTablePartition reflecting a reversal of all edge directions.
reverseRoutingTables() - Method in class org.apache.spark.graphx.impl.VertexRDDImpl
 
reverseRoutingTables() - Method in class org.apache.spark.graphx.VertexRDD
Returns a new VertexRDD reflecting a reversal of all edge directions in the corresponding EdgeRDD.
revertPartialWritesAndClose() - Method in class org.apache.spark.storage.BlockObjectWriter
Reverts writes that haven't been flushed yet.
revertPartialWritesAndClose() - Method in class org.apache.spark.storage.DiskBlockObjectWriter
 
reviveOffers() - Method in class org.apache.spark.scheduler.cluster.CoarseGrainedSchedulerBackend
 
reviveOffers() - Method in class org.apache.spark.scheduler.cluster.mesos.MesosSchedulerBackend
 
reviveOffers() - Method in class org.apache.spark.scheduler.local.LocalActor
 
reviveOffers() - Method in class org.apache.spark.scheduler.local.LocalBackend
 
ReviveOffers - Class in org.apache.spark.scheduler.local
 
ReviveOffers() - Constructor for class org.apache.spark.scheduler.local.ReviveOffers
 
reviveOffers() - Method in interface org.apache.spark.scheduler.SchedulerBackend
 
RidgeRegressionModel - Class in org.apache.spark.mllib.regression
Regression model trained using RidgeRegression.
RidgeRegressionModel(Vector, double) - Constructor for class org.apache.spark.mllib.regression.RidgeRegressionModel
 
RidgeRegressionWithSGD - Class in org.apache.spark.mllib.regression
Train a regression model with L2-regularization using Stochastic Gradient Descent.
RidgeRegressionWithSGD() - Constructor for class org.apache.spark.mllib.regression.RidgeRegressionWithSGD
Construct a RidgeRegression object with default parameters: {stepSize: 1.0, numIterations: 100, regParam: 0.01, miniBatchFraction: 1.0}.
right() - Method in class org.apache.spark.sql.sources.And
 
right() - Method in class org.apache.spark.sql.sources.Or
 
rightChildIndex(int) - Static method in class org.apache.spark.mllib.tree.model.Node
Return the index of the right child of this node.
rightImpurity() - Method in class org.apache.spark.mllib.tree.model.InformationGainStats
 
rightNode() - Method in class org.apache.spark.mllib.tree.model.Node
 
rightNodeId() - Method in class org.apache.spark.mllib.tree.model.DecisionTreeModel.SaveLoadV1_0$.NodeData
 
rightOuterJoin(JavaPairRDD<K, W>, Partitioner) - Method in class org.apache.spark.api.java.JavaPairRDD
Perform a right outer join of this and other.
rightOuterJoin(JavaPairRDD<K, W>) - Method in class org.apache.spark.api.java.JavaPairRDD
Perform a right outer join of this and other.
rightOuterJoin(JavaPairRDD<K, W>, int) - Method in class org.apache.spark.api.java.JavaPairRDD
Perform a right outer join of this and other.
rightOuterJoin(RDD<Tuple2<K, W>>, Partitioner) - Method in class org.apache.spark.rdd.PairRDDFunctions
Perform a right outer join of this and other.
rightOuterJoin(RDD<Tuple2<K, W>>) - Method in class org.apache.spark.rdd.PairRDDFunctions
Perform a right outer join of this and other.
rightOuterJoin(RDD<Tuple2<K, W>>, int) - Method in class org.apache.spark.rdd.PairRDDFunctions
Perform a right outer join of this and other.
rightOuterJoin(JavaPairDStream<K, W>) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
Return a new DStream by applying 'right outer join' between RDDs of this DStream and other DStream.
rightOuterJoin(JavaPairDStream<K, W>, int) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
Return a new DStream by applying 'right outer join' between RDDs of this DStream and other DStream.
rightOuterJoin(JavaPairDStream<K, W>, Partitioner) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
Return a new DStream by applying 'right outer join' between RDDs of this DStream and other DStream.
rightOuterJoin(DStream<Tuple2<K, W>>, ClassTag<W>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
Return a new DStream by applying 'right outer join' between RDDs of this DStream and other DStream.
rightOuterJoin(DStream<Tuple2<K, W>>, int, ClassTag<W>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
Return a new DStream by applying 'right outer join' between RDDs of this DStream and other DStream.
rightOuterJoin(DStream<Tuple2<K, W>>, Partitioner, ClassTag<W>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
Return a new DStream by applying 'right outer join' between RDDs of this DStream and other DStream.
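A minimal sketch of rightOuterJoin on pair RDDs; it assumes an existing SparkContext sc, and the keys and values are illustrative.

    val left  = sc.parallelize(Seq(("k1", 1), ("k2", 2)))
    val right = sc.parallelize(Seq(("k2", "x"), ("k3", "y")))
    // Every key in `right` appears in the result; missing left-side values become None.
    // Result type: RDD[(String, (Option[Int], String))]
    val joined = left.rightOuterJoin(right)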
rightPredict() - Method in class org.apache.spark.mllib.tree.model.InformationGainStats
 
rlike(String) - Method in class org.apache.spark.sql.Column
SQL RLIKE expression (LIKE with Regex).
RLIKE() - Static method in class org.apache.spark.sql.hive.HiveQl
 
RMATa() - Static method in class org.apache.spark.graphx.util.GraphGenerators
 
RMATb() - Static method in class org.apache.spark.graphx.util.GraphGenerators
 
RMATc() - Static method in class org.apache.spark.graphx.util.GraphGenerators
 
RMATd() - Static method in class org.apache.spark.graphx.util.GraphGenerators
 
rmatGraph(SparkContext, int, int) - Static method in class org.apache.spark.graphx.util.GraphGenerators
A random graph generator using the R-MAT model, proposed in "R-MAT: A Recursive Model for Graph Mining" by Chakrabarti et al.
rnd() - Method in class org.apache.spark.rdd.PartitionCoalescer
 
roc() - Method in class org.apache.spark.mllib.evaluation.BinaryClassificationMetrics
Returns the receiver operating characteristic (ROC) curve, which is an RDD of (false positive rate, true positive rate) with (0.0, 0.0) prepended and (1.0, 1.0) appended to it.
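A minimal sketch of computing the ROC curve with BinaryClassificationMetrics; it assumes an existing SparkContext sc, and the (score, label) pairs are illustrative.

    import org.apache.spark.mllib.evaluation.BinaryClassificationMetrics

    val scoreAndLabels = sc.parallelize(Seq((0.9, 1.0), (0.7, 1.0), (0.4, 0.0), (0.1, 0.0)))
    val metrics = new BinaryClassificationMetrics(scoreAndLabels)
    // RDD of (false positive rate, true positive rate) points along the ROC curve.
    val rocCurve = metrics.roc()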
rolledOver() - Method in interface org.apache.spark.util.logging.RollingPolicy
Notify that rollover has occurred
rolledOver() - Method in class org.apache.spark.util.logging.SizeBasedRollingPolicy
Rollover has occurred, so reset the counter
rolledOver() - Method in class org.apache.spark.util.logging.TimeBasedRollingPolicy
Rollover has occurred, so find the next time to rollover
RollingFileAppender - Class in org.apache.spark.util.logging
Continuously appends data from the input stream to the given file, and rolls the file over after the given interval.
RollingFileAppender(InputStream, File, RollingPolicy, SparkConf, int) - Constructor for class org.apache.spark.util.logging.RollingFileAppender
 
rollingPolicy() - Method in class org.apache.spark.util.logging.RollingFileAppender
 
RollingPolicy - Interface in org.apache.spark.util.logging
Defines the policy based on which RollingFileAppender will generate rolling files.
rolloverIntervalMillis() - Method in class org.apache.spark.util.logging.TimeBasedRollingPolicy
 
rolloverSizeBytes() - Method in class org.apache.spark.util.logging.SizeBasedRollingPolicy
 
root() - Method in class org.apache.spark.mllib.fpm.FPTree
 
rootHandler() - Method in class org.apache.spark.ui.ServerInfo
 
rootMeanSquaredError() - Method in class org.apache.spark.mllib.evaluation.RegressionMetrics
Returns the root mean squared error, which is defined as the square root of the mean squared error.
rootPool() - Method in class org.apache.spark.scheduler.FairSchedulableBuilder
 
rootPool() - Method in class org.apache.spark.scheduler.FIFOSchedulableBuilder
 
rootPool() - Method in interface org.apache.spark.scheduler.SchedulableBuilder
 
rootPool() - Method in interface org.apache.spark.scheduler.TaskScheduler
 
rootPool() - Method in class org.apache.spark.scheduler.TaskSchedulerImpl
 
routingTable() - Method in class org.apache.spark.graphx.impl.ShippableVertexPartition
 
RoutingTablePartition - Class in org.apache.spark.graphx.impl
Stores the locations of edge-partition join sites for each vertex attribute in a particular vertex partition.
RoutingTablePartition(Tuple3<long[], BitSet, BitSet>[]) - Constructor for class org.apache.spark.graphx.impl.RoutingTablePartition
 
rowIndices() - Method in class org.apache.spark.mllib.linalg.SparseMatrix
 
RowMatrix - Class in org.apache.spark.mllib.linalg.distributed
:: Experimental :: Represents a row-oriented distributed Matrix with no meaningful row indices.
RowMatrix(RDD<Vector>, long, int) - Constructor for class org.apache.spark.mllib.linalg.distributed.RowMatrix
 
RowMatrix(RDD<Vector>) - Constructor for class org.apache.spark.mllib.linalg.distributed.RowMatrix
Alternative constructor leaving matrix dimensions to be determined automatically.
RowReadSupport - Class in org.apache.spark.sql.parquet
A parquet.hadoop.api.ReadSupport for Row objects.
RowReadSupport() - Constructor for class org.apache.spark.sql.parquet.RowReadSupport
 
RowRecordMaterializer - Class in org.apache.spark.sql.parquet
A parquet.io.api.RecordMaterializer for Rows.
RowRecordMaterializer(CatalystConverter) - Constructor for class org.apache.spark.sql.parquet.RowRecordMaterializer
 
RowRecordMaterializer(MessageType, Seq<Attribute>) - Constructor for class org.apache.spark.sql.parquet.RowRecordMaterializer
 
rows() - Method in class org.apache.spark.mllib.linalg.distributed.GridPartitioner
 
rows() - Method in class org.apache.spark.mllib.linalg.distributed.IndexedRowMatrix
 
rows() - Method in class org.apache.spark.mllib.linalg.distributed.RowMatrix
 
rowsPerBlock() - Method in class org.apache.spark.mllib.linalg.distributed.BlockMatrix
 
rowsPerPart() - Method in class org.apache.spark.mllib.linalg.distributed.GridPartitioner
 
rowToJSON(StructType, JsonGenerator, Row) - Static method in class org.apache.spark.sql.json.JsonRDD
Transforms a single Row to JSON using Jackson
RowWriteSupport - Class in org.apache.spark.sql.parquet
A parquet.hadoop.api.WriteSupport for Row objects.
RowWriteSupport() - Constructor for class org.apache.spark.sql.parquet.RowWriteSupport
 
run(Function0<T>, ExecutionContext) - Method in class org.apache.spark.ComplexFutureAction
Executes some action enclosed in the closure.
run(Graph<VD, ED>, ClassTag<VD>, ClassTag<ED>) - Static method in class org.apache.spark.graphx.lib.ConnectedComponents
Compute the connected component membership of each vertex and return a graph with the vertex value containing the lowest vertex id in the connected component containing that vertex.
run(Graph<VD, ED>, int, ClassTag<ED>) - Static method in class org.apache.spark.graphx.lib.LabelPropagation
Run static Label Propagation for detecting communities in networks.
run(Graph<VD, ED>, int, double, ClassTag<VD>, ClassTag<ED>) - Static method in class org.apache.spark.graphx.lib.PageRank
Run PageRank for a fixed number of iterations returning a graph with vertex attributes containing the PageRank and edge attributes the normalized edge weight.
run(Graph<VD, ED>, Seq<Object>, ClassTag<ED>) - Static method in class org.apache.spark.graphx.lib.ShortestPaths
Computes shortest paths to the given set of landmark vertices.
run(Graph<VD, ED>, int, ClassTag<VD>, ClassTag<ED>) - Static method in class org.apache.spark.graphx.lib.StronglyConnectedComponents
Compute the strongly connected component (SCC) of each vertex and return a graph with the vertex value containing the lowest vertex id in the SCC containing that vertex.
run(RDD<Edge<Object>>, SVDPlusPlus.Conf) - Static method in class org.apache.spark.graphx.lib.SVDPlusPlus
This method is deprecated in favor of runSVDPlusPlus(), which replaces DoubleMatrix with Array[Double] in its return value.
run(Graph<VD, ED>, ClassTag<VD>, ClassTag<ED>) - Static method in class org.apache.spark.graphx.lib.TriangleCount
 
run(RDD<LabeledPoint>) - Method in class org.apache.spark.mllib.classification.NaiveBayes
Run the algorithm with the configured parameters on an input RDD of LabeledPoint entries.
run(RDD<Vector>) - Method in class org.apache.spark.mllib.clustering.GaussianMixture
Perform expectation maximization
run(RDD<Vector>) - Method in class org.apache.spark.mllib.clustering.KMeans
Train a K-means model on the given set of points; data should be cached for high performance, because this is an iterative algorithm.
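A minimal sketch of KMeans.run; it assumes an existing SparkContext sc, and the points and parameter values are illustrative.

    import org.apache.spark.mllib.clustering.KMeans
    import org.apache.spark.mllib.linalg.Vectors

    val points = sc.parallelize(Seq(
      Vectors.dense(0.0, 0.0), Vectors.dense(0.1, 0.1),
      Vectors.dense(9.0, 9.0), Vectors.dense(9.1, 9.1))).cache() // cached: K-means is iterative
    val model = new KMeans().setK(2).setMaxIterations(20).run(points)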
run(RDD<Tuple2<Object, Vector>>) - Method in class org.apache.spark.mllib.clustering.LDA
Learn an LDA model using the given dataset.
run(JavaPairRDD<Long, Vector>) - Method in class org.apache.spark.mllib.clustering.LDA
Java-friendly version of run()
run(RDD<Tuple3<Object, Object, Object>>) - Method in class org.apache.spark.mllib.clustering.PowerIterationClustering
Run the PIC algorithm.
run(JavaRDD<Tuple3<Long, Long, Double>>) - Method in class org.apache.spark.mllib.clustering.PowerIterationClustering
A Java-friendly version of PowerIterationClustering.run.
run(RDD<Object>, ClassTag<Item>) - Method in class org.apache.spark.mllib.fpm.FPGrowth
Computes an FP-Growth model that contains frequent itemsets.
run(JavaRDD<Basket>) - Method in class org.apache.spark.mllib.fpm.FPGrowth
 
run(RDD<Rating>) - Method in class org.apache.spark.mllib.recommendation.ALS
Run ALS with the configured parameters on an input RDD of (user, product, rating) triples.
run(JavaRDD<Rating>) - Method in class org.apache.spark.mllib.recommendation.ALS
Java-friendly version of ALS.run.
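A short sketch of ALS.run on (user, product, rating) triples; it assumes an existing SparkContext sc, and the ratings and parameter values are illustrative.

    import org.apache.spark.mllib.recommendation.{ALS, Rating}

    val ratings = sc.parallelize(Seq(Rating(1, 10, 4.0), Rating(1, 20, 2.0), Rating(2, 10, 5.0)))
    val model = new ALS().setRank(8).setIterations(10).run(ratings)
    // Predict the rating of product 20 by user 2 from the learned factors.
    val predicted = model.predict(2, 20)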
run(RDD<LabeledPoint>) - Method in class org.apache.spark.mllib.regression.GeneralizedLinearAlgorithm
Run the algorithm with the configured parameters on an input RDD of LabeledPoint entries.
run(RDD<LabeledPoint>, Vector) - Method in class org.apache.spark.mllib.regression.GeneralizedLinearAlgorithm
Run the algorithm with the configured parameters on an input RDD of LabeledPoint entries starting from the initial weights provided.
run(RDD<Tuple3<Object, Object, Object>>) - Method in class org.apache.spark.mllib.regression.IsotonicRegression
Run IsotonicRegression algorithm to obtain isotonic regression model.
run(JavaRDD<Tuple3<Double, Double, Double>>) - Method in class org.apache.spark.mllib.regression.IsotonicRegression
Run pool adjacent violators algorithm to obtain isotonic regression model.
run(RDD<LabeledPoint>) - Method in class org.apache.spark.mllib.tree.DecisionTree
Method to train a decision tree model over an RDD
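A minimal sketch of DecisionTree.run with a default classification Strategy; it assumes an existing SparkContext sc, and the two labeled points are illustrative.

    import org.apache.spark.mllib.linalg.Vectors
    import org.apache.spark.mllib.regression.LabeledPoint
    import org.apache.spark.mllib.tree.DecisionTree
    import org.apache.spark.mllib.tree.configuration.Strategy

    val data = sc.parallelize(Seq(
      LabeledPoint(1.0, Vectors.dense(1.0, 0.0)),
      LabeledPoint(0.0, Vectors.dense(0.0, 1.0))))
    // Build a tree with the default classification settings.
    val strategy = Strategy.defaultStrategy("Classification")
    val model = new DecisionTree(strategy).run(data)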
run(RDD<LabeledPoint>) - Method in class org.apache.spark.mllib.tree.GradientBoostedTrees
Method to train a gradient boosting model
run(JavaRDD<LabeledPoint>) - Method in class org.apache.spark.mllib.tree.GradientBoostedTrees
Java-friendly API for org.apache.spark.mllib.tree.GradientBoostedTrees.run.
run(RDD<LabeledPoint>) - Method in class org.apache.spark.mllib.tree.RandomForest
Method to train a decision tree model over an RDD
run() - Method in class org.apache.spark.rdd.PartitionCoalescer
Runs the packing algorithm and returns an array of PartitionGroups that are, where possible, load-balanced and grouped by locality.
run(long, int) - Method in class org.apache.spark.scheduler.Task
Called by Executor to run this task.
run(SQLContext) - Method in class org.apache.spark.sql.hive.execution.AddFile
 
run(SQLContext) - Method in class org.apache.spark.sql.hive.execution.AddJar
 
run(SQLContext) - Method in class org.apache.spark.sql.hive.execution.AnalyzeTable
 
run(SQLContext) - Method in class org.apache.spark.sql.hive.execution.CreateMetastoreDataSource
 
run(SQLContext) - Method in class org.apache.spark.sql.hive.execution.CreateMetastoreDataSourceAsSelect
 
run(SQLContext) - Method in class org.apache.spark.sql.hive.execution.CreateTableAsSelect
 
run(SQLContext) - Method in class org.apache.spark.sql.hive.execution.DescribeHiveTableCommand
 
run(SQLContext) - Method in class org.apache.spark.sql.hive.execution.DropTable
 
run(SQLContext) - Method in class org.apache.spark.sql.hive.execution.HiveNativeCommand
 
run(SQLContext) - Method in class org.apache.spark.sql.sources.CreateTempTableUsing
 
run(SQLContext) - Method in class org.apache.spark.sql.sources.CreateTempTableUsingAsSelect
 
run(SQLContext) - Method in class org.apache.spark.sql.sources.InsertIntoDataSource
 
run(SQLContext) - Method in class org.apache.spark.sql.sources.RefreshTable
 
run() - Method in class org.apache.spark.streaming.CheckpointWriter.CheckpointWriteHandler
 
run() - Method in class org.apache.spark.streaming.flume.FlumeBatchFetcher
 
run() - Method in class org.apache.spark.streaming.scheduler.Job
 
run() - Method in class org.apache.spark.util.RedirectThread
 
runApproximateJob(RDD<T>, Function2<TaskContext, Iterator<T>, U>, ApproximateEvaluator<U, R>, CallSite, long, Properties) - Method in class org.apache.spark.scheduler.DAGScheduler
 
runApproximateJob(RDD<T>, Function2<TaskContext, Iterator<T>, U>, ApproximateEvaluator<U, R>, long) - Method in class org.apache.spark.SparkContext
:: DeveloperApi :: Run a job that can return approximate results.
runJob(RDD<T>, Function1<Iterator<T>, U>, Seq<Object>, Function2<Object, U, BoxedUnit>, Function0<R>) - Method in class org.apache.spark.ComplexFutureAction
Runs a Spark job.
runJob(RDD<T>, Function2<TaskContext, Iterator<T>, U>, Seq<Object>, CallSite, boolean, Function2<Object, U, BoxedUnit>, Properties, ClassTag<U>) - Method in class org.apache.spark.scheduler.DAGScheduler
 
runJob(RDD<T>, Function2<TaskContext, Iterator<T>, U>, Seq<Object>, boolean, Function2<Object, U, BoxedUnit>, ClassTag<U>) - Method in class org.apache.spark.SparkContext
Run a function on a given set of partitions in an RDD and pass the results to the given handler function.
runJob(RDD<T>, Function2<TaskContext, Iterator<T>, U>, Seq<Object>, boolean, ClassTag<U>) - Method in class org.apache.spark.SparkContext
Run a function on a given set of partitions in an RDD and return the results as an array.
runJob(RDD<T>, Function1<Iterator<T>, U>, Seq<Object>, boolean, ClassTag<U>) - Method in class org.apache.spark.SparkContext
Run a job on a given set of partitions of an RDD, but take a function of type Iterator[T] => U instead of (TaskContext, Iterator[T]) => U.
runJob(RDD<T>, Function2<TaskContext, Iterator<T>, U>, ClassTag<U>) - Method in class org.apache.spark.SparkContext
Run a job on all partitions in an RDD and return the results in an array.
runJob(RDD<T>, Function1<Iterator<T>, U>, ClassTag<U>) - Method in class org.apache.spark.SparkContext
Run a job on all partitions in an RDD and return the results in an array.
runJob(RDD<T>, Function2<TaskContext, Iterator<T>, U>, Function2<Object, U, BoxedUnit>, ClassTag<U>) - Method in class org.apache.spark.SparkContext
Run a job on all partitions in an RDD and pass the results to a handler function.
runJob(RDD<T>, Function1<Iterator<T>, U>, Function2<Object, U, BoxedUnit>, ClassTag<U>) - Method in class org.apache.spark.SparkContext
Run a job on all partitions in an RDD and pass the results to a handler function.
runLBFGS(RDD<Tuple2<Object, Vector>>, Gradient, Updater, int, double, int, double, Vector) - Static method in class org.apache.spark.mllib.optimization.LBFGS
Run Limited-memory BFGS (L-BFGS) in parallel.
RunLengthEncoding - Class in org.apache.spark.sql.columnar.compression
 
RunLengthEncoding() - Constructor for class org.apache.spark.sql.columnar.compression.RunLengthEncoding
 
RunLengthEncoding.Decoder<T extends org.apache.spark.sql.types.NativeType> - Class in org.apache.spark.sql.columnar.compression
 
RunLengthEncoding.Decoder(ByteBuffer, NativeColumnType<T>) - Constructor for class org.apache.spark.sql.columnar.compression.RunLengthEncoding.Decoder
 
RunLengthEncoding.Encoder<T extends org.apache.spark.sql.types.NativeType> - Class in org.apache.spark.sql.columnar.compression
 
RunLengthEncoding.Encoder(NativeColumnType<T>) - Constructor for class org.apache.spark.sql.columnar.compression.RunLengthEncoding.Encoder
 
runMiniBatchSGD(RDD<Tuple2<Object, Vector>>, Gradient, Updater, double, int, double, double, Vector) - Static method in class org.apache.spark.mllib.optimization.GradientDescent
Run stochastic gradient descent (SGD) in parallel using mini batches.
running() - Method in class org.apache.spark.scheduler.TaskInfo
 
RUNNING() - Static method in class org.apache.spark.TaskState
 
runningBatches() - Method in class org.apache.spark.streaming.ui.StreamingJobProgressListener
 
runningLocally() - Method in class org.apache.spark.TaskContext
 
runningLocally() - Method in class org.apache.spark.TaskContextImpl
 
runningStages() - Method in class org.apache.spark.scheduler.DAGScheduler
 
runningTasks() - Method in class org.apache.spark.scheduler.Pool
 
runningTasks() - Method in interface org.apache.spark.scheduler.Schedulable
 
runningTasks() - Method in class org.apache.spark.scheduler.TaskSetManager
 
runningTasksSet() - Method in class org.apache.spark.scheduler.TaskSetManager
 
runSVDPlusPlus(RDD<Edge<Object>>, SVDPlusPlus.Conf) - Static method in class org.apache.spark.graphx.lib.SVDPlusPlus
:: Experimental ::
runTask(TaskContext) - Method in class org.apache.spark.scheduler.ResultTask
 
runTask(TaskContext) - Method in class org.apache.spark.scheduler.ShuffleMapTask
 
runTask(TaskContext) - Method in class org.apache.spark.scheduler.Task
 
RuntimePercentage - Class in org.apache.spark.scheduler
 
RuntimePercentage(double, Option<Object>, double) - Constructor for class org.apache.spark.scheduler.RuntimePercentage
 
runUntilConvergence(Graph<VD, ED>, double, double, ClassTag<VD>, ClassTag<ED>) - Static method in class org.apache.spark.graphx.lib.PageRank
Run a dynamic version of PageRank returning a graph with vertex attributes containing the PageRank and edge attributes containing the normalized edge weight.

S

s() - Method in class org.apache.spark.mllib.linalg.SingularValueDecomposition
 
s1() - Method in class org.apache.spark.rdd.CartesianPartition
 
s2() - Method in class org.apache.spark.rdd.CartesianPartition
 
sameResult(LogicalPlan) - Method in class org.apache.spark.sql.hive.MetastoreRelation
Only compares the database and table name, not the alias.
sameResult(LogicalPlan) - Method in class org.apache.spark.sql.sources.LogicalRelation
 
sample(boolean, Double) - Method in class org.apache.spark.api.java.JavaDoubleRDD
Return a sampled subset of this RDD.
sample(boolean, Double, long) - Method in class org.apache.spark.api.java.JavaDoubleRDD
Return a sampled subset of this RDD.
sample(boolean, double) - Method in class org.apache.spark.api.java.JavaPairRDD
Return a sampled subset of this RDD.
sample(boolean, double, long) - Method in class org.apache.spark.api.java.JavaPairRDD
Return a sampled subset of this RDD.
sample(boolean, double) - Method in class org.apache.spark.api.java.JavaRDD
Return a sampled subset of this RDD.
sample(boolean, double, long) - Method in class org.apache.spark.api.java.JavaRDD
Return a sampled subset of this RDD.
sample(boolean, double, long) - Method in class org.apache.spark.rdd.RDD
Return a sampled subset of this RDD.
sample(boolean, double, long) - Method in class org.apache.spark.sql.DataFrame
Returns a new DataFrame by sampling a fraction of rows.
sample(boolean, double) - Method in class org.apache.spark.sql.DataFrame
Returns a new DataFrame by sampling a fraction of rows, using a random seed.
sample(Iterator<T>) - Method in class org.apache.spark.util.random.BernoulliCellSampler
 
sample(Iterator<T>) - Method in class org.apache.spark.util.random.BernoulliSampler
 
sample(Iterator<T>) - Method in class org.apache.spark.util.random.PoissonSampler
 
sample(Iterator<T>) - Method in interface org.apache.spark.util.random.RandomSampler
Take a random sample.
sampleByKey(boolean, Map<K, Object>, long) - Method in class org.apache.spark.api.java.JavaPairRDD
Return a subset of this RDD sampled by key (via stratified sampling).
sampleByKey(boolean, Map<K, Object>) - Method in class org.apache.spark.api.java.JavaPairRDD
Return a subset of this RDD sampled by key (via stratified sampling).
sampleByKey(boolean, Map<K, Object>, long) - Method in class org.apache.spark.rdd.PairRDDFunctions
Return a subset of this RDD sampled by key (via stratified sampling).
sampleByKeyExact(boolean, Map<K, Object>, long) - Method in class org.apache.spark.api.java.JavaPairRDD
::Experimental:: Return a subset of this RDD sampled by key (via stratified sampling) containing exactly math.ceil(numItems * samplingRate) for each stratum (group of pairs with the same key).
sampleByKeyExact(boolean, Map<K, Object>) - Method in class org.apache.spark.api.java.JavaPairRDD
::Experimental:: Return a subset of this RDD sampled by key (via stratified sampling) containing exactly math.ceil(numItems * samplingRate) for each stratum (group of pairs with the same key).
sampleByKeyExact(boolean, Map<K, Object>, long) - Method in class org.apache.spark.rdd.PairRDDFunctions
::Experimental:: Return a subset of this RDD sampled by key (via stratified sampling) containing exactly math.ceil(numItems * samplingRate) for each stratum (group of pairs with the same key).
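A minimal sketch of sampleByKey (stratified sampling) on a pair RDD; it assumes an existing SparkContext sc, and the per-key fractions are illustrative.

    val pairs = sc.parallelize(Seq(("a", 1), ("a", 2), ("a", 3), ("b", 4)))
    // Expected sampling fraction per key (stratum).
    val fractions = Map("a" -> 0.5, "b" -> 1.0)
    // Roughly 50% of the "a" records and all of the "b" records, without replacement.
    val sampled = pairs.sampleByKey(withReplacement = false, fractions, seed = 42L)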
SampledRDD<T> - Class in org.apache.spark.rdd
 
SampledRDD(RDD<T>, boolean, double, int, ClassTag<T>) - Constructor for class org.apache.spark.rdd.SampledRDD
 
SampledRDDPartition - Class in org.apache.spark.rdd
 
SampledRDDPartition(Partition, int) - Constructor for class org.apache.spark.rdd.SampledRDDPartition
 
sampleLogNormal(double, double, int, long) - Static method in class org.apache.spark.graphx.util.GraphGenerators
Randomly samples from a log normal distribution whose corresponding normal distribution has the given mean and standard deviation.
sampleStdev() - Method in class org.apache.spark.api.java.JavaDoubleRDD
Compute the sample standard deviation of this RDD's elements (which corrects for bias in estimating the standard deviation by dividing by N-1 instead of N).
sampleStdev() - Method in class org.apache.spark.rdd.DoubleRDDFunctions
Compute the sample standard deviation of this RDD's elements (which corrects for bias in estimating the standard deviation by dividing by N-1 instead of N).
sampleStdev() - Method in class org.apache.spark.util.StatCounter
Return the sample standard deviation of the values, which corrects for bias in estimating the variance by dividing by N-1 instead of N.
sampleVariance() - Method in class org.apache.spark.api.java.JavaDoubleRDD
Compute the sample variance of this RDD's elements (which corrects for bias in estimating the variance by dividing by N-1 instead of N).
sampleVariance() - Method in class org.apache.spark.rdd.DoubleRDDFunctions
Compute the sample variance of this RDD's elements (which corrects for bias in estimating the variance by dividing by N-1 instead of N).
sampleVariance() - Method in class org.apache.spark.util.StatCounter
Return the sample variance, which corrects for bias in estimating the variance by dividing by N-1 instead of N.
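A small Scala sketch, assuming an existing SparkContext sc; both methods divide by N-1 rather than N:

    val values = sc.parallelize(Seq(1.0, 2.0, 3.0, 4.0))
    val stats = values.stats()              // org.apache.spark.util.StatCounter
    println(stats.sampleStdev)              // sample standard deviation
    println(stats.sampleVariance)           // sample variance
    println(values.sampleStdev())           // same result, computed directly on the RDD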
samplingRatio() - Method in class org.apache.spark.sql.json.JSONRelation
 
SamplingUtils - Class in org.apache.spark.util.random
 
SamplingUtils() - Constructor for class org.apache.spark.util.random.SamplingUtils
 
save(SparkContext, String, String, int, int, Vector, double, Option<Object>) - Method in class org.apache.spark.mllib.classification.impl.GLMClassificationModel.SaveLoadV1_0$
Helper method for saving GLM classification model metadata and data.
save(SparkContext, String) - Method in class org.apache.spark.mllib.classification.LogisticRegressionModel
 
save(SparkContext, String) - Method in class org.apache.spark.mllib.classification.NaiveBayesModel
 
save(SparkContext, String) - Method in class org.apache.spark.mllib.classification.SVMModel
 
save(SparkContext, String) - Method in class org.apache.spark.mllib.recommendation.MatrixFactorizationModel
 
save(MatrixFactorizationModel, String) - Method in class org.apache.spark.mllib.recommendation.MatrixFactorizationModel.SaveLoadV1_0$
Saves a MatrixFactorizationModel, where user features are saved under data/users and product features are saved under data/products.
save(SparkContext, String, String, Vector, double) - Method in class org.apache.spark.mllib.regression.impl.GLMRegressionModel.SaveLoadV1_0$
Helper method for saving GLM regression model metadata and data.
save(SparkContext, String) - Method in class org.apache.spark.mllib.regression.LassoModel
 
save(SparkContext, String) - Method in class org.apache.spark.mllib.regression.LinearRegressionModel
 
save(SparkContext, String) - Method in class org.apache.spark.mllib.regression.RidgeRegressionModel
 
save(SparkContext, String) - Method in class org.apache.spark.mllib.tree.model.DecisionTreeModel
 
save(SparkContext, String, DecisionTreeModel) - Method in class org.apache.spark.mllib.tree.model.DecisionTreeModel.SaveLoadV1_0$
 
save(SparkContext, String) - Method in class org.apache.spark.mllib.tree.model.GradientBoostedTreesModel
 
save(SparkContext, String) - Method in class org.apache.spark.mllib.tree.model.RandomForestModel
 
save(SparkContext, String, TreeEnsembleModel, String) - Method in class org.apache.spark.mllib.tree.model.TreeEnsembleModel.SaveLoadV1_0$
 
save(SparkContext, String) - Method in interface org.apache.spark.mllib.util.Saveable
Save this model to the given path.
save(String) - Method in class org.apache.spark.sql.DataFrame
:: Experimental :: Saves the contents of this DataFrame to the given path, using the default data source configured by spark.sql.sources.default and SaveMode.ErrorIfExists as the save mode.
save(String, SaveMode) - Method in class org.apache.spark.sql.DataFrame
:: Experimental :: Saves the contents of this DataFrame to the given path and SaveMode specified by mode, using the default data source configured by spark.sql.sources.default.
save(String, String) - Method in class org.apache.spark.sql.DataFrame
:: Experimental :: Saves the contents of this DataFrame to the given path based on the given data source, using SaveMode.ErrorIfExists as the save mode.
save(String, String, SaveMode) - Method in class org.apache.spark.sql.DataFrame
:: Experimental :: Saves the contents of this DataFrame to the given path based on the given data source and SaveMode specified by mode.
save(String, SaveMode, Map<String, String>) - Method in class org.apache.spark.sql.DataFrame
:: Experimental :: Saves the contents of this DataFrame based on the given data source, SaveMode specified by mode, and a set of options.
save(String, SaveMode, Map<String, String>) - Method in class org.apache.spark.sql.DataFrame
:: Experimental :: (Scala-specific) Saves the contents of this DataFrame based on the given data source, SaveMode specified by mode, and a set of options
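A minimal Scala sketch of the save overloads, assuming an existing DataFrame df; the paths and the "json" source name are illustrative:

    import org.apache.spark.sql.SaveMode
    df.save("/tmp/out-default")                            // default source, SaveMode.ErrorIfExists
    df.save("/tmp/out-parquet", SaveMode.Overwrite)        // default source, explicit mode
    df.save("/tmp/out-json", "json", SaveMode.Append)      // explicit source and mode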
Saveable - Interface in org.apache.spark.mllib.util
:: DeveloperApi ::
saveAsHadoopDataset(JobConf) - Method in class org.apache.spark.api.java.JavaPairRDD
Output the RDD to any Hadoop-supported storage system, using a Hadoop JobConf object for that storage system.
saveAsHadoopDataset(JobConf) - Method in class org.apache.spark.rdd.PairRDDFunctions
Output the RDD to any Hadoop-supported storage system, using a Hadoop JobConf object for that storage system.
saveAsHadoopFile(String, Class<?>, Class<?>, Class<F>, JobConf) - Method in class org.apache.spark.api.java.JavaPairRDD
Output the RDD to any Hadoop-supported file system.
saveAsHadoopFile(String, Class<?>, Class<?>, Class<F>) - Method in class org.apache.spark.api.java.JavaPairRDD
Output the RDD to any Hadoop-supported file system.
saveAsHadoopFile(String, Class<?>, Class<?>, Class<F>, Class<? extends CompressionCodec>) - Method in class org.apache.spark.api.java.JavaPairRDD
Output the RDD to any Hadoop-supported file system, compressing with the supplied codec.
saveAsHadoopFile(String, ClassTag<F>) - Method in class org.apache.spark.rdd.PairRDDFunctions
Output the RDD to any Hadoop-supported file system, using a Hadoop OutputFormat class supporting the key and value types K and V in this RDD.
saveAsHadoopFile(String, Class<? extends CompressionCodec>, ClassTag<F>) - Method in class org.apache.spark.rdd.PairRDDFunctions
Output the RDD to any Hadoop-supported file system, using a Hadoop OutputFormat class supporting the key and value types K and V in this RDD.
saveAsHadoopFile(String, Class<?>, Class<?>, Class<? extends OutputFormat<?, ?>>, Class<? extends CompressionCodec>) - Method in class org.apache.spark.rdd.PairRDDFunctions
Output the RDD to any Hadoop-supported file system, using a Hadoop OutputFormat class supporting the key and value types K and V in this RDD.
saveAsHadoopFile(String, Class<?>, Class<?>, Class<? extends OutputFormat<?, ?>>, JobConf, Option<Class<? extends CompressionCodec>>) - Method in class org.apache.spark.rdd.PairRDDFunctions
Output the RDD to any Hadoop-supported file system, using a Hadoop OutputFormat class supporting the key and value types K and V in this RDD.
saveAsHadoopFiles(String, String) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
Save each RDD in this DStream as a Hadoop file.
saveAsHadoopFiles(String, String, Class<?>, Class<?>, Class<F>) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
Save each RDD in this DStream as a Hadoop file.
saveAsHadoopFiles(String, String, Class<?>, Class<?>, Class<F>, JobConf) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
Save each RDD in this DStream as a Hadoop file.
saveAsHadoopFiles(String, String, ClassTag<F>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
Save each RDD in this DStream as a Hadoop file.
saveAsHadoopFiles(String, String, Class<?>, Class<?>, Class<? extends OutputFormat<?, ?>>, JobConf) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
Save each RDD in this DStream as a Hadoop file.
saveAsHiveFile(RDD<Row>, Class<?>, ShimFileSinkDesc, SerializableWritable<JobConf>, SparkHiveWriterContainer) - Method in class org.apache.spark.sql.hive.execution.InsertIntoHiveTable
 
saveAsLibSVMFile(RDD<LabeledPoint>, String) - Static method in class org.apache.spark.mllib.util.MLUtils
Save labeled data in LIBSVM format.
saveAsNewAPIHadoopDataset(Configuration) - Method in class org.apache.spark.api.java.JavaPairRDD
Output the RDD to any Hadoop-supported storage system, using a Configuration object for that storage system.
saveAsNewAPIHadoopDataset(Configuration) - Method in class org.apache.spark.rdd.PairRDDFunctions
Output the RDD to any Hadoop-supported storage system with new Hadoop API, using a Hadoop Configuration object for that storage system.
saveAsNewAPIHadoopFile(String, Class<?>, Class<?>, Class<F>, Configuration) - Method in class org.apache.spark.api.java.JavaPairRDD
Output the RDD to any Hadoop-supported file system.
saveAsNewAPIHadoopFile(String, Class<?>, Class<?>, Class<F>) - Method in class org.apache.spark.api.java.JavaPairRDD
Output the RDD to any Hadoop-supported file system.
saveAsNewAPIHadoopFile(String, ClassTag<F>) - Method in class org.apache.spark.rdd.PairRDDFunctions
Output the RDD to any Hadoop-supported file system, using a new Hadoop API OutputFormat (mapreduce.OutputFormat) object supporting the key and value types K and V in this RDD.
saveAsNewAPIHadoopFile(String, Class<?>, Class<?>, Class<? extends OutputFormat<?, ?>>, Configuration) - Method in class org.apache.spark.rdd.PairRDDFunctions
Output the RDD to any Hadoop-supported file system, using a new Hadoop API OutputFormat (mapreduce.OutputFormat) object supporting the key and value types K and V in this RDD.
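A small Scala sketch for the new Hadoop API, assuming an existing SparkContext sc; the output path is illustrative:

    import org.apache.hadoop.io.{IntWritable, Text}
    import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat

    val counts = sc.parallelize(Seq(("spark", 1), ("hadoop", 2)))
      .map { case (k, v) => (new Text(k), new IntWritable(v)) }      // convert to Writables
    counts.saveAsNewAPIHadoopFile("/tmp/counts-newapi",
      classOf[Text], classOf[IntWritable], classOf[TextOutputFormat[Text, IntWritable]])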
saveAsNewAPIHadoopFiles(String, String) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
Save each RDD in this DStream as a Hadoop file.
saveAsNewAPIHadoopFiles(String, String, Class<?>, Class<?>, Class<F>) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
Save each RDD in this DStream as a Hadoop file.
saveAsNewAPIHadoopFiles(String, String, Class<?>, Class<?>, Class<F>, Configuration) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
Save each RDD in this DStream as a Hadoop file.
saveAsNewAPIHadoopFiles(String, String, ClassTag<F>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
Save each RDD in this DStream as a Hadoop file.
saveAsNewAPIHadoopFiles(String, String, Class<?>, Class<?>, Class<? extends OutputFormat<?, ?>>, Configuration) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
Save each RDD in this DStream as a Hadoop file.
saveAsObjectFile(String) - Method in interface org.apache.spark.api.java.JavaRDDLike
Save this RDD as a SequenceFile of serialized objects.
saveAsObjectFile(String) - Method in class org.apache.spark.rdd.RDD
Save this RDD as a SequenceFile of serialized objects.
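A minimal round trip, assuming an existing SparkContext sc and an illustrative path:

    val nums = sc.parallelize(1 to 100)
    nums.saveAsObjectFile("/tmp/nums-objects")          // SequenceFile of serialized Java objects
    val restored = sc.objectFile[Int]("/tmp/nums-objects")
    println(restored.count())                           // 100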
saveAsObjectFiles(String, String) - Method in class org.apache.spark.streaming.dstream.DStream
Save each RDD in this DStream as a SequenceFile of serialized objects.
saveAsParquetFile(String) - Method in class org.apache.spark.sql.DataFrame
Saves the contents of this DataFrame as a parquet file, preserving the schema.
saveAsSequenceFile(String, Option<Class<? extends CompressionCodec>>) - Method in class org.apache.spark.rdd.SequenceFileRDDFunctions
Output the RDD as a Hadoop SequenceFile using the Writable types we infer from the RDD's key and value types.
saveAsTable(String) - Method in class org.apache.spark.sql.DataFrame
:: Experimental :: Creates a table from the contents of this DataFrame.
saveAsTable(String, SaveMode) - Method in class org.apache.spark.sql.DataFrame
:: Experimental :: Creates a table from the contents of this DataFrame, using the default data source configured by spark.sql.sources.default and SaveMode.ErrorIfExists as the save mode.
saveAsTable(String, String) - Method in class org.apache.spark.sql.DataFrame
:: Experimental :: Creates a table at the given path from the contents of this DataFrame based on a given data source and a set of options, using SaveMode.ErrorIfExists as the save mode.
saveAsTable(String, String, SaveMode) - Method in class org.apache.spark.sql.DataFrame
:: Experimental :: Creates a table at the given path from the contents of this DataFrame based on a given data source, SaveMode specified by mode, and a set of options.
saveAsTable(String, String, SaveMode, Map<String, String>) - Method in class org.apache.spark.sql.DataFrame
:: Experimental :: Creates a table at the given path from the contents of this DataFrame based on a given data source, SaveMode specified by mode, and a set of options.
saveAsTable(String, String, SaveMode, Map<String, String>) - Method in class org.apache.spark.sql.DataFrame
:: Experimental :: (Scala-specific) Creates a table from the contents of this DataFrame based on a given data source, SaveMode specified by mode, and a set of options.
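A minimal Scala sketch of saveAsTable, assuming df is a DataFrame created from a HiveContext (persistent tables require one); the table names are illustrative:

    import org.apache.spark.sql.SaveMode
    df.saveAsTable("events")                                // default source, fails if the table exists
    df.saveAsTable("events", SaveMode.Overwrite)            // replace existing contents
    df.saveAsTable("events_json", "json", SaveMode.Append)  // explicit data source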
saveAsTextFile(String) - Method in interface org.apache.spark.api.java.JavaRDDLike
Save this RDD as a text file, using string representations of elements.
saveAsTextFile(String, Class<? extends CompressionCodec>) - Method in interface org.apache.spark.api.java.JavaRDDLike
Save this RDD as a compressed text file, using string representations of elements.
saveAsTextFile(String) - Method in class org.apache.spark.rdd.RDD
Save this RDD as a text file, using string representations of elements.
saveAsTextFile(String, Class<? extends CompressionCodec>) - Method in class org.apache.spark.rdd.RDD
Save this RDD as a compressed text file, using string representations of elements.
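A minimal sketch of plain and compressed text output, assuming an existing SparkContext sc and illustrative paths:

    import org.apache.hadoop.io.compress.GzipCodec
    val lines = sc.parallelize(Seq("alpha", "beta", "gamma"))
    lines.saveAsTextFile("/tmp/lines-plain")
    lines.saveAsTextFile("/tmp/lines-gzip", classOf[GzipCodec])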
saveAsTextFiles(String, String) - Method in class org.apache.spark.streaming.dstream.DStream
Save each RDD in this DStream as a text file, using string representations of elements.
saveLabeledData(RDD<LabeledPoint>, String) - Static method in class org.apache.spark.mllib.util.MLUtils
SaveMode - Enum in org.apache.spark.sql
SaveMode is used to specify the expected behavior of saving a DataFrame to a data source.
sc() - Method in class org.apache.spark.api.java.JavaSparkContext
 
sc() - Method in class org.apache.spark.scheduler.DAGScheduler
 
sc() - Method in class org.apache.spark.scheduler.TaskSchedulerImpl
 
sc() - Method in class org.apache.spark.sql.hive.execution.InsertIntoHiveTable
 
sc() - Method in class org.apache.spark.sql.SQLContext.implicits.StringToColumn
 
sc() - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
 
sc() - Method in class org.apache.spark.streaming.StreamingContext
 
sc() - Method in class org.apache.spark.ui.exec.ExecutorsTab
 
sc() - Method in class org.apache.spark.ui.jobs.JobsTab
 
sc() - Method in class org.apache.spark.ui.jobs.StagesTab
 
sc() - Method in class org.apache.spark.ui.SparkUI
 
scal(double, Vector) - Static method in class org.apache.spark.mllib.linalg.BLAS
x = a * x
scalaIntToJavaLong(DStream<Object>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
 
scalaTag() - Method in class org.apache.spark.sql.columnar.NativeColumnType
Scala TypeTag.
scalaToJavaLong(JavaPairDStream<K, Object>, ClassTag<K>) - Static method in class org.apache.spark.streaming.api.java.JavaPairDStream
 
scale() - Method in class org.apache.spark.mllib.random.GammaGenerator
 
scanTable(SparkContext, StructType, String, String, String, String[], Filter[], Partition[]) - Static method in class org.apache.spark.sql.jdbc.JDBCRDD
Build and return JDBCRDD from the given information.
Schedulable - Interface in org.apache.spark.scheduler
An interface for schedulable entities.
SchedulableBuilder - Interface in org.apache.spark.scheduler
An interface for building the Schedulable tree: buildPools builds the tree nodes (pools) and addTaskSetManager builds the leaf nodes (TaskSetManagers).
schedulableBuilder() - Method in class org.apache.spark.scheduler.TaskSchedulerImpl
 
schedulableNameToSchedulable() - Method in class org.apache.spark.scheduler.Pool
 
schedulableQueue() - Method in class org.apache.spark.scheduler.Pool
 
schedulableQueue() - Method in interface org.apache.spark.scheduler.Schedulable
 
schedulableQueue() - Method in class org.apache.spark.scheduler.TaskSetManager
 
scheduler() - Method in class org.apache.spark.streaming.StreamingContext
 
SCHEDULER_DELAY() - Static method in class org.apache.spark.ui.jobs.TaskDetailsClassNames
 
SCHEDULER_DELAY() - Static method in class org.apache.spark.ui.ToolTips
 
schedulerAllocFile() - Method in class org.apache.spark.scheduler.FairSchedulableBuilder
 
SchedulerBackend - Interface in org.apache.spark.scheduler
A backend interface for scheduling systems that allows plugging in different ones under TaskSchedulerImpl.
schedulerBackend() - Method in class org.apache.spark.SparkContext
 
SCHEDULING_MODE_PROPERTY() - Method in class org.apache.spark.scheduler.FairSchedulableBuilder
 
SchedulingAlgorithm - Interface in org.apache.spark.scheduler
An interface for sorting algorithms: FIFO ordering between TaskSetManagers, fair sharing (FS) between Pools, and FIFO or FS within Pools.
schedulingDelay() - Method in class org.apache.spark.streaming.scheduler.BatchInfo
Time taken for the first job of this batch to start processing from the time this batch was submitted to the streaming scheduler.
schedulingDelayDistribution() - Method in class org.apache.spark.streaming.ui.StreamingJobProgressListener
 
schedulingMode() - Method in class org.apache.spark.scheduler.Pool
 
schedulingMode() - Method in interface org.apache.spark.scheduler.Schedulable
 
SchedulingMode - Class in org.apache.spark.scheduler
"FAIR" and "FIFO" determines which policy is used to order tasks amongst a Schedulable's sub-queues "NONE" is used when the a Schedulable has no sub-queues.
SchedulingMode() - Constructor for class org.apache.spark.scheduler.SchedulingMode
 
schedulingMode() - Method in interface org.apache.spark.scheduler.TaskScheduler
 
schedulingMode() - Method in class org.apache.spark.scheduler.TaskSchedulerImpl
 
schedulingMode() - Method in class org.apache.spark.scheduler.TaskSetManager
 
schedulingMode() - Method in class org.apache.spark.ui.jobs.JobProgressListener
 
schedulingPool() - Method in class org.apache.spark.ui.jobs.UIData.StageUIData
 
schema() - Method in class org.apache.spark.sql.columnar.ColumnStatisticsSchema
 
schema() - Method in class org.apache.spark.sql.columnar.PartitionStatistics
 
schema() - Method in class org.apache.spark.sql.DataFrame
Returns the schema of this DataFrame.
schema() - Method in class org.apache.spark.sql.jdbc.JDBCRelation
 
schema() - Method in class org.apache.spark.sql.json.JSONRelation
 
schema() - Method in class org.apache.spark.sql.parquet.ParquetRelation2
 
schema() - Method in class org.apache.spark.sql.sources.BaseRelation
 
SCHEMA_STRING_LENGTH_THRESHOLD() - Static method in class org.apache.spark.sql.SQLConf
 
schemaLess() - Method in class org.apache.spark.sql.hive.execution.HiveScriptIOSchema
 
SchemaRelationProvider - Interface in org.apache.spark.sql.sources
::DeveloperApi:: Implemented by objects that produce relations for a specific kind of data source with a given schema.
schemaStringLengthThreshold() - Method in class org.apache.spark.sql.SQLConf
 
schemes() - Method in interface org.apache.spark.sql.columnar.compression.AllCompressionSchemes
 
schemes() - Method in interface org.apache.spark.sql.columnar.compression.WithCompressionSchemes
 
scoreAndLabels() - Method in class org.apache.spark.mllib.evaluation.BinaryClassificationMetrics
 
scratch() - Method in class org.apache.spark.mllib.optimization.NNLS.Workspace
 
script() - Method in class org.apache.spark.sql.hive.execution.ScriptTransformation
 
Scripts() - Method in interface org.apache.spark.sql.hive.HiveStrategies
 
ScriptTransformation - Class in org.apache.spark.sql.hive.execution
Transforms the input by forking and running the specified script.
ScriptTransformation(Seq<Expression>, String, Seq<Attribute>, SparkPlan, HiveScriptIOSchema, HiveContext) - Constructor for class org.apache.spark.sql.hive.execution.ScriptTransformation
 
seconds() - Static method in class org.apache.spark.scheduler.StatsReportListener
 
seconds(long) - Static method in class org.apache.spark.streaming.Durations
 
Seconds - Class in org.apache.spark.streaming
Helper object that creates an instance of Duration representing a given number of seconds.
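A minimal sketch of constructing a streaming batch duration, assuming an existing SparkContext sc:

    import org.apache.spark.streaming.{Durations, Seconds, StreamingContext}
    val ssc = new StreamingContext(sc, Seconds(10))     // 10-second batches
    // Java-friendly equivalent: Durations.seconds(10)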
Seconds() - Constructor for class org.apache.spark.streaming.Seconds
 
SECONDS_PER_MINUTE() - Static method in class org.apache.spark.sql.parquet.CatalystTimestampConverter
 
SecurityManager - Class in org.apache.spark
Spark class responsible for security.
SecurityManager(SparkConf) - Constructor for class org.apache.spark.SecurityManager
 
securityManager() - Method in class org.apache.spark.SparkEnv
 
securityManager() - Method in class org.apache.spark.ui.SparkUI
 
seed() - Method in class org.apache.spark.mllib.rdd.RandomRDDPartition
 
seed() - Method in class org.apache.spark.rdd.PartitionwiseSampledRDDPartition
 
seed() - Method in class org.apache.spark.rdd.SampledRDDPartition
 
seedBrokers() - Method in class org.apache.spark.streaming.kafka.KafkaCluster.SimpleConsumerConfig
 
seenNulls() - Method in interface org.apache.spark.sql.columnar.NullableColumnAccessor
 
segment() - Method in class org.apache.spark.streaming.rdd.WriteAheadLogBackedBlockRDDPartition
 
segment() - Method in class org.apache.spark.streaming.receiver.WriteAheadLogBasedStoreResult
 
select(Column...) - Method in class org.apache.spark.sql.DataFrame
Selects a set of expressions.
select(String, String...) - Method in class org.apache.spark.sql.DataFrame
Selects a set of columns.
select(Seq<Column>) - Method in class org.apache.spark.sql.DataFrame
Selects a set of expressions.
select(String, Seq<String>) - Method in class org.apache.spark.sql.DataFrame
Selects a set of columns.
selectedFeatures() - Method in class org.apache.spark.mllib.feature.ChiSqSelectorModel
 
selectExpr(String...) - Method in class org.apache.spark.sql.DataFrame
Selects a set of SQL expressions.
selectExpr(Seq<String>) - Method in class org.apache.spark.sql.DataFrame
Selects a set of SQL expressions.
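A minimal Scala sketch, assuming a DataFrame people with columns "name" and "age":

    people.select("name", "age")                          // by column name
    people.select(people("name"), people("age") + 1)      // by Column expressions
    people.selectExpr("name", "age + 1 AS age_next_year") // by SQL expression strings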
selectNodesToSplit(Queue<Tuple2<Object, Node>>, long, DecisionTreeMetadata, Random) - Static method in class org.apache.spark.mllib.tree.RandomForest
Pull nodes off of the queue, and collect a group of nodes to be split on this iteration.
sender() - Method in class org.apache.spark.storage.BlockManagerMessages.RegisterBlockManager
 
sendToDst(A) - Method in class org.apache.spark.graphx.EdgeContext
Sends a message to the destination vertex.
sendToDst(A) - Method in class org.apache.spark.graphx.impl.AggregatingEdgeContext
 
sendToSrc(A) - Method in class org.apache.spark.graphx.EdgeContext
Sends a message to the source vertex.
sendToSrc(A) - Method in class org.apache.spark.graphx.impl.AggregatingEdgeContext
 
sequenceFile(String, Class<K>, Class<V>, int) - Method in class org.apache.spark.api.java.JavaSparkContext
Get an RDD for a Hadoop SequenceFile with given key and value types.
sequenceFile(String, Class<K>, Class<V>) - Method in class org.apache.spark.api.java.JavaSparkContext
Get an RDD for a Hadoop SequenceFile.
sequenceFile(String, Class<K>, Class<V>, int) - Method in class org.apache.spark.SparkContext
Get an RDD for a Hadoop SequenceFile with given key and value types.
sequenceFile(String, Class<K>, Class<V>) - Method in class org.apache.spark.SparkContext
Get an RDD for a Hadoop SequenceFile with given key and value types.
sequenceFile(String, int, ClassTag<K>, ClassTag<V>, Function0<WritableConverter<K>>, Function0<WritableConverter<V>>) - Method in class org.apache.spark.SparkContext
Version of sequenceFile() for types implicitly convertible to Writables through a WritableConverter.
SequenceFileRDDFunctions<K,V> - Class in org.apache.spark.rdd
Extra functions available on RDDs of (key, value) pairs to create a Hadoop SequenceFile, through an implicit conversion.
SequenceFileRDDFunctions(RDD<Tuple2<K, V>>, Class<? extends Writable>, Class<? extends Writable>, Function1<K, Writable>, ClassTag<K>, Function1<V, Writable>, ClassTag<V>) - Constructor for class org.apache.spark.rdd.SequenceFileRDDFunctions
 
SequenceFileRDDFunctions(RDD<Tuple2<K, V>>, Function1<K, Writable>, ClassTag<K>, Function1<V, Writable>, ClassTag<V>) - Constructor for class org.apache.spark.rdd.SequenceFileRDDFunctions
 
ser() - Method in class org.apache.spark.scheduler.TaskSetManager
 
SerializableBuffer - Class in org.apache.spark.util
A wrapper around a java.nio.ByteBuffer that is serializable through Java serialization, to make it easier to pass ByteBuffers in case class messages.
SerializableBuffer(ByteBuffer) - Constructor for class org.apache.spark.util.SerializableBuffer
 
serializableHadoopSplit() - Method in class org.apache.spark.rdd.NewHadoopPartition
 
SerializableWritable<T extends org.apache.hadoop.io.Writable> - Class in org.apache.spark
 
SerializableWritable(T) - Constructor for class org.apache.spark.SerializableWritable
 
SerializationDebugger - Class in org.apache.spark.serializer
 
SerializationDebugger() - Constructor for class org.apache.spark.serializer.SerializationDebugger
 
SerializationDebugger.ObjectStreamClassMethods - Class in org.apache.spark.serializer
An implicit class that allows us to call private methods of ObjectStreamClass.
SerializationDebugger.ObjectStreamClassMethods(ObjectStreamClass) - Constructor for class org.apache.spark.serializer.SerializationDebugger.ObjectStreamClassMethods
 
SerializationDebugger.ObjectStreamClassMethods$ - Class in org.apache.spark.serializer
 
SerializationDebugger.ObjectStreamClassMethods$() - Constructor for class org.apache.spark.serializer.SerializationDebugger.ObjectStreamClassMethods$
 
SerializationStream - Class in org.apache.spark.serializer
:: DeveloperApi :: A stream for writing serialized objects.
SerializationStream() - Constructor for class org.apache.spark.serializer.SerializationStream
 
serialize(Object) - Method in class org.apache.spark.mllib.linalg.VectorUDT
 
serialize(T, ClassTag<T>) - Method in class org.apache.spark.serializer.JavaSerializerInstance
 
serialize(T, ClassTag<T>) - Method in class org.apache.spark.serializer.KryoSerializerInstance
 
serialize(T, ClassTag<T>) - Method in class org.apache.spark.serializer.SerializerInstance
 
serialize(Object) - Method in class org.apache.spark.sql.test.ExamplePointUDT
 
serialize(T) - Static method in class org.apache.spark.util.Utils
Serialize an object using Java serialization
serializedData() - Method in class org.apache.spark.scheduler.local.StatusUpdate
 
serializedTask() - Method in class org.apache.spark.scheduler.cluster.mesos.MesosTaskLaunchData
 
serializedTask() - Method in class org.apache.spark.scheduler.TaskDescription
 
serializeFilterExpressions(Seq<Expression>, Configuration) - Static method in class org.apache.spark.sql.parquet.ParquetFilters
Note: Inside the Hadoop API we only have access to Configuration, not to SparkContext, so we cannot use broadcasts to convey the actual filter predicate.
serializeMapStatuses(MapStatus[]) - Static method in class org.apache.spark.MapOutputTracker
 
serializePlan(Object, OutputStream) - Method in class org.apache.spark.sql.hive.HiveFunctionWrapper
 
Serializer - Class in org.apache.spark.serializer
:: DeveloperApi :: A serializer.
Serializer() - Constructor for class org.apache.spark.serializer.Serializer
 
serializer() - Method in class org.apache.spark.ShuffleDependency
 
serializer() - Method in class org.apache.spark.SparkEnv
 
SerializerInstance - Class in org.apache.spark.serializer
:: DeveloperApi :: An instance of a serializer, for use by one thread at a time.
SerializerInstance() - Constructor for class org.apache.spark.serializer.SerializerInstance
 
serializeStream(OutputStream) - Method in class org.apache.spark.serializer.JavaSerializerInstance
 
serializeStream(OutputStream) - Method in class org.apache.spark.serializer.KryoSerializerInstance
 
serializeStream(OutputStream) - Method in class org.apache.spark.serializer.SerializerInstance
 
serializeViaNestedStream(OutputStream, SerializerInstance, Function1<SerializationStream, BoxedUnit>) - Static method in class org.apache.spark.util.Utils
Serialize via nested stream using specific serializer
serializeWithDependencies(Task<?>, HashMap<String, Object>, HashMap<String, Object>, SerializerInstance) - Static method in class org.apache.spark.scheduler.Task
Serialize a task and the current app dependencies (files and JARs added to the SparkContext)
server() - Method in class org.apache.spark.streaming.flume.FlumeReceiver
 
server() - Method in class org.apache.spark.ui.ServerInfo
 
ServerInfo - Class in org.apache.spark.ui
 
ServerInfo(Server, int, ContextHandlerCollection) - Constructor for class org.apache.spark.ui.ServerInfo
 
ServerStateException - Exception in org.apache.spark
Exception type thrown by HttpServer when it is in the wrong state for an operation.
ServerStateException(String) - Constructor for exception org.apache.spark.ServerStateException
 
serverUri() - Method in class org.apache.spark.HttpFileServer
 
SERVLET_DEFAULT_SAMPLE() - Method in class org.apache.spark.metrics.sink.MetricsServlet
 
SERVLET_KEY_PATH() - Method in class org.apache.spark.metrics.sink.MetricsServlet
 
SERVLET_KEY_SAMPLE() - Method in class org.apache.spark.metrics.sink.MetricsServlet
 
servletPath() - Method in class org.apache.spark.metrics.sink.MetricsServlet
 
servletShowSample() - Method in class org.apache.spark.metrics.sink.MetricsServlet
 
set(long, long, int, int, VD, VD, ED) - Method in class org.apache.spark.graphx.impl.AggregatingEdgeContext
 
set(Param<T>, T) - Method in interface org.apache.spark.ml.param.Params
Sets a parameter in the embedded param map.
set(String, Object) - Method in interface org.apache.spark.ml.param.Params
Sets a parameter (by name) in the embedded param map.
set(String, String) - Method in class org.apache.spark.SparkConf
Set a configuration variable.
set(SparkEnv) - Static method in class org.apache.spark.SparkEnv
 
set(Function0<Object>) - Method in class org.apache.spark.sql.hive.DeferredObjectAdapter
 
set(int, long) - Method in class org.apache.spark.sql.parquet.timestamp.NanoTime
 
setAcls(boolean) - Method in class org.apache.spark.SecurityManager
 
setActiveContext(SparkContext, boolean) - Static method in class org.apache.spark.SparkContext
Called at the end of the SparkContext constructor to ensure that no other SparkContext has raced with this constructor and started.
setAdminAcls(String) - Method in class org.apache.spark.SecurityManager
Admin acls should be set before the view or modify acls.
setAggregator(Aggregator<K, V, C>) - Method in class org.apache.spark.rdd.ShuffledRDD
Set aggregator for RDD's shuffle.
setAlgo(String) - Method in class org.apache.spark.mllib.tree.configuration.Strategy
Sets Algorithm using a String.
setAll(Traversable<Tuple2<String, String>>) - Method in class org.apache.spark.SparkConf
Set multiple parameters together
setAlpha(double) - Method in class org.apache.spark.ml.recommendation.ALS
 
setAlpha(double) - Method in class org.apache.spark.mllib.clustering.LDA
Alias for setDocConcentration()
setAlpha(double) - Method in class org.apache.spark.mllib.recommendation.ALS
Sets the constant used in computing confidence in implicit ALS.
setAppName(String) - Method in class org.apache.spark.SparkConf
Set a name for your application.
setAppName(String) - Method in class org.apache.spark.ui.SparkUI
Set the app name for this UI.
setBatchDuration(Duration) - Method in class org.apache.spark.streaming.DStreamGraph
 
setBeta(double) - Method in class org.apache.spark.mllib.clustering.LDA
Alias for setTopicConcentration()
setBlocks(int) - Method in class org.apache.spark.mllib.recommendation.ALS
Set the number of blocks for both user blocks and product blocks to parallelize the computation into; pass -1 for an auto-configured number of blocks.
setCallSite(String) - Method in class org.apache.spark.api.java.JavaSparkContext
Pass-through to SparkContext.setCallSite.
setCallSite(String) - Method in class org.apache.spark.SparkContext
Set the thread-local property for overriding the call sites of actions and RDDs.
setCallSite(CallSite) - Method in class org.apache.spark.SparkContext
Set the thread-local property for overriding the call sites of actions and RDDs.
setCategoricalFeaturesInfo(Map<Integer, Integer>) - Method in class org.apache.spark.mllib.tree.configuration.Strategy
Sets categoricalFeaturesInfo using a Java Map.
setCheckpointDir(String) - Method in class org.apache.spark.api.java.JavaSparkContext
Set the directory under which RDDs are going to be checkpointed.
setCheckpointDir(String) - Method in class org.apache.spark.SparkContext
Set the directory under which RDDs are going to be checkpointed.
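A minimal sketch, assuming an existing SparkContext sc and an illustrative checkpoint directory:

    sc.setCheckpointDir("/tmp/spark-checkpoints")
    val derived = sc.parallelize(1 to 10).map(_ * 2)
    derived.checkpoint()          // lineage is truncated once the RDD is materialized
    derived.count()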
setCheckpointInterval(int) - Method in class org.apache.spark.mllib.clustering.LDA
Period (in iterations) between checkpoints (default = 10).
setCheckpointInterval(int) - Method in class org.apache.spark.mllib.tree.configuration.Strategy
 
setClock(Clock) - Method in class org.apache.spark.ExecutorAllocationManager
Use a different clock for this allocation manager.
setCompressCodec(String) - Method in class org.apache.spark.sql.hive.ShimFileSinkDesc
 
setCompressed(boolean) - Method in class org.apache.spark.sql.hive.ShimFileSinkDesc
 
setCompressType(String) - Method in class org.apache.spark.sql.hive.ShimFileSinkDesc
 
setConf(Configuration) - Method in interface org.apache.spark.input.Configurable
 
setConf(String, String) - Method in class org.apache.spark.sql.hive.HiveContext
 
setConf(Properties) - Method in class org.apache.spark.sql.SQLConf
Set Spark SQL configuration properties.
setConf(String, String) - Method in class org.apache.spark.sql.SQLConf
Set the given Spark SQL configuration property.
setConf(Properties) - Method in class org.apache.spark.sql.SQLContext
Set Spark SQL configuration properties.
setConf(String, String) - Method in class org.apache.spark.sql.SQLContext
Set the given Spark SQL configuration property.
setConsumerOffsetMetadata(String, Map<TopicAndPartition, OffsetMetadataAndError>) - Method in class org.apache.spark.streaming.kafka.KafkaCluster
Requires Kafka >= 0.8.1.1
setConsumerOffsets(String, Map<TopicAndPartition, Object>) - Method in class org.apache.spark.streaming.kafka.KafkaCluster
Requires Kafka >= 0.8.1.1
setContext(StreamingContext) - Method in class org.apache.spark.streaming.dstream.DStream
 
setContext(StreamingContext) - Method in class org.apache.spark.streaming.DStreamGraph
 
setConvergenceTol(double) - Method in class org.apache.spark.mllib.clustering.GaussianMixture
Set the largest change in log-likelihood at which convergence is considered to have occurred.
setConvergenceTol(double) - Method in class org.apache.spark.mllib.optimization.LBFGS
Set the convergence tolerance of iterations for L-BFGS.
setCustomHostname(String) - Static method in class org.apache.spark.util.Utils
Allow setting a custom host name because when we run on Mesos we need to use the same hostname it reports to the master.
setDAGScheduler(DAGScheduler) - Method in interface org.apache.spark.scheduler.TaskScheduler
 
setDAGScheduler(DAGScheduler) - Method in class org.apache.spark.scheduler.TaskSchedulerImpl
 
setDecayFactor(double) - Method in class org.apache.spark.mllib.clustering.StreamingKMeans
Set the decay factor directly (for forgetful algorithms).
setDefaultClassLoader(ClassLoader) - Method in class org.apache.spark.serializer.Serializer
Sets a class loader for the serializer to use in deserialization.
setDelaySeconds(SparkConf, Enumeration.Value, int) - Static method in class org.apache.spark.util.MetadataCleaner
 
setDelaySeconds(SparkConf, int, boolean) - Static method in class org.apache.spark.util.MetadataCleaner
Set the default delay time (in seconds).
setDestTableId(int) - Method in class org.apache.spark.sql.hive.ShimFileSinkDesc
 
setDictionary(Dictionary) - Method in class org.apache.spark.sql.parquet.CatalystPrimitiveStringConverter
 
setDocConcentration(double) - Method in class org.apache.spark.mllib.clustering.LDA
Concentration parameter (commonly named "alpha") for the prior placed on documents' distributions over topics ("theta").
setEpsilon(double) - Method in class org.apache.spark.mllib.clustering.KMeans
Set the distance threshold within which we consider centers to have converged.
setEstimator(Estimator<?>) - Method in class org.apache.spark.ml.tuning.CrossValidator
 
setEstimatorParamMaps(ParamMap[]) - Method in class org.apache.spark.ml.tuning.CrossValidator
 
setEvaluator(Evaluator) - Method in class org.apache.spark.ml.tuning.CrossValidator
 
setExecutorEnv(String, String) - Method in class org.apache.spark.SparkConf
Set an environment variable to be used when launching executors for this application.
setExecutorEnv(Seq<Tuple2<String, String>>) - Method in class org.apache.spark.SparkConf
Set multiple environment variables to be used when launching executors.
setExecutorEnv(Tuple2<String, String>[]) - Method in class org.apache.spark.SparkConf
Set multiple environment variables to be used when launching executors.
setFailure(Exception) - Method in class org.apache.spark.partial.PartialResult
 
setFeatureScaling(boolean) - Method in class org.apache.spark.mllib.regression.GeneralizedLinearAlgorithm
Set if the algorithm should use feature scaling to improve the convergence during optimization.
setFeaturesCol(String) - Method in class org.apache.spark.ml.impl.estimator.PredictionModel
 
setFeaturesCol(String) - Method in class org.apache.spark.ml.impl.estimator.Predictor
 
setField(MutableRow, int, byte[]) - Static method in class org.apache.spark.sql.columnar.BINARY
 
setField(MutableRow, int, boolean) - Static method in class org.apache.spark.sql.columnar.BOOLEAN
 
setField(MutableRow, int, byte) - Static method in class org.apache.spark.sql.columnar.BYTE
 
setField(MutableRow, int, JvmType) - Method in class org.apache.spark.sql.columnar.ColumnType
Sets row(ordinal) to field.
setField(MutableRow, int, int) - Static method in class org.apache.spark.sql.columnar.DATE
 
setField(MutableRow, int, double) - Static method in class org.apache.spark.sql.columnar.DOUBLE
 
setField(MutableRow, int, float) - Static method in class org.apache.spark.sql.columnar.FLOAT
 
setField(MutableRow, int, byte[]) - Static method in class org.apache.spark.sql.columnar.GENERIC
 
setField(MutableRow, int, int) - Static method in class org.apache.spark.sql.columnar.INT
 
setField(MutableRow, int, long) - Static method in class org.apache.spark.sql.columnar.LONG
 
setField(MutableRow, int, short) - Static method in class org.apache.spark.sql.columnar.SHORT
 
setField(MutableRow, int, String) - Static method in class org.apache.spark.sql.columnar.STRING
 
setField(MutableRow, int, Timestamp) - Static method in class org.apache.spark.sql.columnar.TIMESTAMP
 
setFinalRDDStorageLevel(StorageLevel) - Method in class org.apache.spark.mllib.recommendation.ALS
:: DeveloperApi :: Sets storage level for final RDDs (user/product used in MatrixFactorizationModel).
setFinalValue(R) - Method in class org.apache.spark.partial.PartialResult
 
setGradient(Gradient) - Method in class org.apache.spark.mllib.optimization.GradientDescent
Set the gradient function (of the loss function of one single data example) to be used for SGD.
setGradient(Gradient) - Method in class org.apache.spark.mllib.optimization.LBFGS
Set the gradient function (of the loss function of one single data example) to be used for L-BFGS.
setGraph(DStreamGraph) - Method in class org.apache.spark.streaming.dstream.DStream
 
setHalfLife(double, String) - Method in class org.apache.spark.mllib.clustering.StreamingKMeans
Set the half life and time unit ("batches" or "points") for forgetful algorithms.
setId(int) - Method in class org.apache.spark.streaming.scheduler.Job
 
setIfMissing(String, String) - Method in class org.apache.spark.SparkConf
Set a parameter if it isn't already configured
setImplicitPrefs(boolean) - Method in class org.apache.spark.ml.recommendation.ALS
 
setImplicitPrefs(boolean) - Method in class org.apache.spark.mllib.recommendation.ALS
Sets whether to use implicit preference.
setImpurity(Impurity) - Method in class org.apache.spark.mllib.tree.configuration.Strategy
 
setInitialCenters(Vector[], double[]) - Method in class org.apache.spark.mllib.clustering.StreamingKMeans
Specify initial centers directly.
setInitializationMode(String) - Method in class org.apache.spark.mllib.clustering.KMeans
Set the initialization algorithm.
setInitializationMode(String) - Method in class org.apache.spark.mllib.clustering.PowerIterationClustering
Set the initialization mode.
setInitializationSteps(int) - Method in class org.apache.spark.mllib.clustering.KMeans
Set the number of steps for the k-means|| initialization mode.
setInitialModel(GaussianMixtureModel) - Method in class org.apache.spark.mllib.clustering.GaussianMixture
Set the initial GMM starting point, bypassing the random initialization.
setInitialWeights(Vector) - Method in class org.apache.spark.mllib.classification.StreamingLogisticRegressionWithSGD
Set the initial weights.
setInitialWeights(Vector) - Method in class org.apache.spark.mllib.regression.StreamingLinearRegressionWithSGD
Set the initial weights.
setInputCol(String) - Method in class org.apache.spark.ml.feature.StandardScaler
 
setInputCol(String) - Method in class org.apache.spark.ml.feature.StandardScalerModel
 
setInputCol(String) - Method in class org.apache.spark.ml.UnaryTransformer
 
setIntercept(boolean) - Method in class org.apache.spark.mllib.regression.GeneralizedLinearAlgorithm
Set if the algorithm should add an intercept.
setIntermediateRDDStorageLevel(StorageLevel) - Method in class org.apache.spark.mllib.recommendation.ALS
:: DeveloperApi :: Sets storage level for intermediate RDDs (user/product in/out links).
setIsotonic(boolean) - Method in class org.apache.spark.mllib.regression.IsotonicRegression
Sets the isotonic parameter.
setItemCol(String) - Method in class org.apache.spark.ml.recommendation.ALS
 
setIterations(int) - Method in class org.apache.spark.mllib.recommendation.ALS
Set the number of iterations to run.
setJars(Seq<String>) - Method in class org.apache.spark.SparkConf
Set JAR files to distribute to the cluster.
setJars(String[]) - Method in class org.apache.spark.SparkConf
Set JAR files to distribute to the cluster.
setJobDescription(String) - Method in class org.apache.spark.SparkContext
Set a human readable description of the current job.
setJobGroup(String, String, boolean) - Method in class org.apache.spark.api.java.JavaSparkContext
Assigns a group ID to all the jobs started by this thread until the group ID is set to a different value or cleared.
setJobGroup(String, String) - Method in class org.apache.spark.api.java.JavaSparkContext
Assigns a group ID to all the jobs started by this thread until the group ID is set to a different value or cleared.
setJobGroup(String, String, boolean) - Method in class org.apache.spark.SparkContext
Assigns a group ID to all the jobs started by this thread until the group ID is set to a different value or cleared.
setK(int) - Method in class org.apache.spark.mllib.clustering.GaussianMixture
Set the number of Gaussians in the mixture model.
setK(int) - Method in class org.apache.spark.mllib.clustering.KMeans
Set the number of clusters to create (k).
setK(int) - Method in class org.apache.spark.mllib.clustering.LDA
Number of topics to infer.
setK(int) - Method in class org.apache.spark.mllib.clustering.PowerIterationClustering
Set the number of clusters.
setK(int) - Method in class org.apache.spark.mllib.clustering.StreamingKMeans
Set the number of clusters.
setKeyOrdering(Ordering<K>) - Method in class org.apache.spark.rdd.ShuffledRDD
Set key ordering for RDD's shuffle.
setLabelCol(String) - Method in class org.apache.spark.ml.evaluation.BinaryClassificationEvaluator
 
setLabelCol(String) - Method in class org.apache.spark.ml.impl.estimator.Predictor
 
setLambda(double) - Method in class org.apache.spark.mllib.classification.NaiveBayes
Set the smoothing parameter.
setLambda(double) - Method in class org.apache.spark.mllib.recommendation.ALS
Set the regularization parameter, lambda.
setLearningRate(double) - Method in class org.apache.spark.mllib.feature.Word2Vec
Sets initial learning rate (default: 0.025).
setLearningRate(double) - Method in class org.apache.spark.mllib.tree.configuration.BoostingStrategy
 
setLocalProperties(Properties) - Method in class org.apache.spark.SparkContext
 
setLocalProperty(String, String) - Method in class org.apache.spark.api.java.JavaSparkContext
Set a local property that affects jobs submitted from this thread, such as the Spark fair scheduler pool.
setLocalProperty(String, String) - Method in class org.apache.spark.SparkContext
Set a local property that affects jobs submitted from this thread, such as the Spark fair scheduler pool.
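A small sketch combining setJobGroup and setLocalProperty, assuming an existing SparkContext sc; the group id and pool name are illustrative:

    sc.setJobGroup("nightly-etl", "Nightly ETL load", interruptOnCancel = true)
    sc.setLocalProperty("spark.scheduler.pool", "production")   // fair scheduler pool for this thread
    sc.parallelize(1 to 100).count()                            // runs under the group and pool above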
setLocation(Table, CreateTableDesc) - Static method in class org.apache.spark.sql.hive.HiveShim
 
setLoss(Loss) - Method in class org.apache.spark.mllib.tree.configuration.BoostingStrategy
 
setMapSideCombine(boolean) - Method in class org.apache.spark.rdd.ShuffledRDD
Set mapSideCombine flag for RDD's shuffle.
setMaster(String) - Method in class org.apache.spark.SparkConf
The master URL to connect to, such as "local" to run locally with one thread, "local[4]" to run locally with 4 cores, or "spark://master:7077" to run on a Spark standalone cluster.
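A minimal SparkConf sketch; the application name, master URL, and extra settings are illustrative:

    import org.apache.spark.{SparkConf, SparkContext}
    val conf = new SparkConf()
      .setAppName("IndexExample")
      .setMaster("local[4]")
      .set("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
    conf.setIfMissing("spark.ui.port", "4041")
    val sc = new SparkContext(conf)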
setMaxBins(int) - Method in class org.apache.spark.mllib.tree.configuration.Strategy
 
setMaxDepth(int) - Method in class org.apache.spark.mllib.tree.configuration.Strategy
 
setMaxIter(int) - Method in class org.apache.spark.ml.classification.LogisticRegression
 
setMaxIter(int) - Method in class org.apache.spark.ml.recommendation.ALS
 
setMaxIter(int) - Method in class org.apache.spark.ml.regression.LinearRegression
 
setMaxIterations(int) - Method in class org.apache.spark.mllib.clustering.GaussianMixture
Set the maximum number of iterations to run.
setMaxIterations(int) - Method in class org.apache.spark.mllib.clustering.KMeans
Set maximum number of iterations to run.
setMaxIterations(int) - Method in class org.apache.spark.mllib.clustering.LDA
Maximum number of iterations for learning.
setMaxIterations(int) - Method in class org.apache.spark.mllib.clustering.PowerIterationClustering
Set maximum number of iterations of the power iteration loop
setMaxMemoryInMB(int) - Method in class org.apache.spark.mllib.tree.configuration.Strategy
 
setMaxNumIterations(int) - Method in class org.apache.spark.mllib.optimization.LBFGS
Deprecated.
setMetricName(String) - Method in class org.apache.spark.ml.evaluation.BinaryClassificationEvaluator
 
setMinCount(int) - Method in class org.apache.spark.mllib.feature.Word2Vec
Sets minCount, the minimum number of times a token must appear to be included in the word2vec model's vocabulary (default: 5).
setMiniBatchFraction(double) - Method in class org.apache.spark.mllib.classification.StreamingLogisticRegressionWithSGD
Set the fraction of each batch to use for updates.
setMiniBatchFraction(double) - Method in class org.apache.spark.mllib.optimization.GradientDescent
:: Experimental :: Set fraction of data to be used for each SGD iteration.
setMiniBatchFraction(double) - Method in class org.apache.spark.mllib.regression.StreamingLinearRegressionWithSGD
Set the fraction of each batch to use for updates.
setMinInfoGain(double) - Method in class org.apache.spark.mllib.tree.configuration.Strategy
 
setMinInstancesPerNode(int) - Method in class org.apache.spark.mllib.tree.configuration.Strategy
 
setMinPartitions(JobContext, int) - Method in class org.apache.spark.input.StreamFileInputFormat
Allow minPartitions to be set by the end user, to keep compatibility with the old Hadoop API, where it is set through setMaxSplitSize.
setMinPartitions(JobContext, int) - Method in class org.apache.spark.input.WholeTextFileInputFormat
Allow minPartitions to be set by the end user, to keep compatibility with the old Hadoop API, where it is set through setMaxSplitSize.
setMinSupport(double) - Method in class org.apache.spark.mllib.fpm.FPGrowth
Sets the minimal support level (default: 0.3).
setModifyAcls(Set<String>, String) - Method in class org.apache.spark.SecurityManager
Admin acls should be set before the view or modify acls.
setName(String) - Method in class org.apache.spark.api.java.JavaDoubleRDD
Assign a name to this RDD
setName(String) - Method in class org.apache.spark.api.java.JavaPairRDD
Assign a name to this RDD
setName(String) - Method in class org.apache.spark.api.java.JavaRDD
Assign a name to this RDD
setName(String) - Method in class org.apache.spark.graphx.impl.EdgeRDDImpl
 
setName(String) - Method in class org.apache.spark.graphx.impl.VertexRDDImpl
 
setName(String) - Method in class org.apache.spark.rdd.RDD
Assign a name to this RDD
setNonnegative(boolean) - Method in class org.apache.spark.ml.recommendation.ALS
 
setNonnegative(boolean) - Method in class org.apache.spark.mllib.recommendation.ALS
Set whether the least-squares problems solved at each iteration should have nonnegativity constraints.
setNumBlocks(int) - Method in class org.apache.spark.ml.recommendation.ALS
Sets both numUserBlocks and numItemBlocks to the specific value.
setNumClasses(int) - Method in class org.apache.spark.mllib.classification.LogisticRegressionWithLBFGS
:: Experimental :: Set the number of possible outcomes for k classes classification problem in Multinomial Logistic Regression.
setNumClasses(int) - Method in class org.apache.spark.mllib.tree.configuration.Strategy
 
setNumCorrections(int) - Method in class org.apache.spark.mllib.optimization.LBFGS
Set the number of corrections used in the LBFGS update.
setNumFeatures(int) - Method in class org.apache.spark.ml.feature.HashingTF
 
setNumFolds(int) - Method in class org.apache.spark.ml.tuning.CrossValidator
 
setNumItemBlocks(int) - Method in class org.apache.spark.ml.recommendation.ALS
 
setNumIterations(int) - Method in class org.apache.spark.mllib.classification.StreamingLogisticRegressionWithSGD
Set the number of iterations of gradient descent to run per update.
setNumIterations(int) - Method in class org.apache.spark.mllib.feature.Word2Vec
Sets number of iterations (default: 1), which should be smaller than or equal to number of partitions.
setNumIterations(int) - Method in class org.apache.spark.mllib.optimization.GradientDescent
Set the number of iterations for SGD.
setNumIterations(int) - Method in class org.apache.spark.mllib.optimization.LBFGS
Set the maximal number of iterations for L-BFGS.
setNumIterations(int) - Method in class org.apache.spark.mllib.regression.StreamingLinearRegressionWithSGD
Set the number of iterations of gradient descent to run per update.
setNumIterations(int) - Method in class org.apache.spark.mllib.tree.configuration.BoostingStrategy
 
setNumPartitions(int) - Method in class org.apache.spark.mllib.feature.Word2Vec
Sets number of partitions (default: 1).
setNumPartitions(int) - Method in class org.apache.spark.mllib.fpm.FPGrowth
Sets the number of partitions used by parallel FP-growth (default: same as input data).
setNumSplits(int, int) - Method in class org.apache.spark.mllib.tree.impl.DecisionTreeMetadata
Set number of splits for a continuous feature.
setNumUserBlocks(int) - Method in class org.apache.spark.ml.recommendation.ALS
 
setOutputCol(String) - Method in class org.apache.spark.ml.feature.StandardScaler
 
setOutputCol(String) - Method in class org.apache.spark.ml.feature.StandardScalerModel
 
setOutputCol(String) - Method in class org.apache.spark.ml.UnaryTransformer
 
setPredictionCol(String) - Method in class org.apache.spark.ml.impl.estimator.PredictionModel
 
setPredictionCol(String) - Method in class org.apache.spark.ml.impl.estimator.Predictor
 
setPredictionCol(String) - Method in class org.apache.spark.ml.recommendation.ALS
 
setPredictionCol(String) - Method in class org.apache.spark.ml.recommendation.ALSModel
 
setProbabilityCol(String) - Method in class org.apache.spark.ml.classification.ProbabilisticClassificationModel
 
setProbabilityCol(String) - Method in class org.apache.spark.ml.classification.ProbabilisticClassifier
 
setProductBlocks(int) - Method in class org.apache.spark.mllib.recommendation.ALS
Set the number of product blocks to parallelize the computation.
setQuantileCalculationStrategy(Enumeration.Value) - Method in class org.apache.spark.mllib.tree.configuration.Strategy
 
setRandomCenters(int, double, long) - Method in class org.apache.spark.mllib.clustering.StreamingKMeans
Initialize random centers, requiring only the number of dimensions.
setRank(int) - Method in class org.apache.spark.ml.recommendation.ALS
 
setRank(int) - Method in class org.apache.spark.mllib.recommendation.ALS
Set the rank of the feature matrices computed (number of features).
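A minimal mllib ALS sketch, assuming an existing SparkContext sc; the ratings and parameter values are illustrative:

    import org.apache.spark.mllib.recommendation.{ALS, Rating}
    val ratings = sc.parallelize(Seq(Rating(1, 1, 5.0), Rating(1, 2, 1.0), Rating(2, 1, 4.0)))
    val model = new ALS()
      .setRank(10)
      .setIterations(10)
      .setLambda(0.01)
      .setSeed(42L)
      .run(ratings)
    println(model.predict(2, 2))      // predicted rating for user 2, product 2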
setRatingCol(String) - Method in class org.apache.spark.ml.recommendation.ALS
 
setRawPredictionCol(String) - Method in class org.apache.spark.ml.classification.ClassificationModel
 
setRawPredictionCol(String) - Method in class org.apache.spark.ml.classification.Classifier
 
setReceiverId(int) - Method in class org.apache.spark.streaming.receiver.Receiver
Set the ID of the DStream that this receiver is associated with.
setRegParam(double) - Method in class org.apache.spark.ml.classification.LogisticRegression
 
setRegParam(double) - Method in class org.apache.spark.ml.recommendation.ALS
 
setRegParam(double) - Method in class org.apache.spark.ml.regression.LinearRegression
 
setRegParam(double) - Method in class org.apache.spark.mllib.classification.StreamingLogisticRegressionWithSGD
Set the regularization parameter.
setRegParam(double) - Method in class org.apache.spark.mllib.optimization.GradientDescent
Set the regularization parameter.
setRegParam(double) - Method in class org.apache.spark.mllib.optimization.LBFGS
Set the regularization parameter.
setRest(long, int, VD, ED) - Method in class org.apache.spark.graphx.impl.AggregatingEdgeContext
 
setRuns(int) - Method in class org.apache.spark.mllib.clustering.KMeans
:: Experimental :: Set the number of runs of the algorithm to execute in parallel.
setSchema(Seq<Attribute>, Configuration) - Static method in class org.apache.spark.sql.parquet.RowWriteSupport
 
setScoreCol(String) - Method in class org.apache.spark.ml.evaluation.BinaryClassificationEvaluator
 
setSeed(long) - Method in class org.apache.spark.mllib.clustering.GaussianMixture
Set the random seed
setSeed(long) - Method in class org.apache.spark.mllib.clustering.KMeans
Set the random seed for cluster initialization.
setSeed(long) - Method in class org.apache.spark.mllib.clustering.LDA
Random seed
setSeed(long) - Method in class org.apache.spark.mllib.feature.Word2Vec
Sets random seed (default: a random long integer).
setSeed(long) - Method in class org.apache.spark.mllib.random.ExponentialGenerator
 
setSeed(long) - Method in class org.apache.spark.mllib.random.GammaGenerator
 
setSeed(long) - Method in class org.apache.spark.mllib.random.LogNormalGenerator
 
setSeed(long) - Method in class org.apache.spark.mllib.random.PoissonGenerator
 
setSeed(long) - Method in class org.apache.spark.mllib.random.StandardNormalGenerator
 
setSeed(long) - Method in class org.apache.spark.mllib.random.UniformGenerator
 
setSeed(long) - Method in class org.apache.spark.mllib.recommendation.ALS
Sets a random seed to have deterministic results.
setSeed(long) - Method in class org.apache.spark.util.random.BernoulliCellSampler
 
setSeed(long) - Method in class org.apache.spark.util.random.BernoulliSampler
 
setSeed(long) - Method in class org.apache.spark.util.random.PoissonSampler
 
setSeed(long) - Method in interface org.apache.spark.util.random.Pseudorandom
Set random seed.
setSeed(long) - Method in class org.apache.spark.util.random.XORShiftRandom
 
setSerializer(Serializer) - Method in class org.apache.spark.rdd.CoGroupedRDD
Set a serializer for this RDD's shuffle, or null to use the default (spark.serializer)
setSerializer(Serializer) - Method in class org.apache.spark.rdd.ShuffledRDD
Set a serializer for this RDD's shuffle, or null to use the default (spark.serializer)
setSerializer(Serializer) - Method in class org.apache.spark.rdd.SubtractedRDD
Set a serializer for this RDD's shuffle, or null to use the default (spark.serializer)
setSparkHome(String) - Method in class org.apache.spark.SparkConf
Set the location where Spark is installed on worker nodes.
setSrcOnly(long, int, VD) - Method in class org.apache.spark.graphx.impl.AggregatingEdgeContext
 
setStages(PipelineStage[]) - Method in class org.apache.spark.ml.Pipeline
 
setStepSize(double) - Method in class org.apache.spark.mllib.classification.StreamingLogisticRegressionWithSGD
Set the step size for gradient descent.
setStepSize(double) - Method in class org.apache.spark.mllib.optimization.GradientDescent
Set the initial step size of SGD for the first step.
setStepSize(double) - Method in class org.apache.spark.mllib.regression.StreamingLinearRegressionWithSGD
Set the step size for gradient descent.
setStreamingLogLevels() - Static method in class org.apache.spark.examples.streaming.StreamingExamples
Set reasonable logging levels for streaming if the user has not configured log4j.
setSubsamplingRate(double) - Method in class org.apache.spark.mllib.tree.configuration.Strategy
 
setTableInfo(TableDesc) - Method in class org.apache.spark.sql.hive.ShimFileSinkDesc
 
setTaskContext(TaskContext) - Static method in class org.apache.spark.TaskContextHelper
 
setTblNullFormat(CreateTableDesc, Table) - Static method in class org.apache.spark.sql.hive.HiveShim
 
setThreshold(double) - Method in class org.apache.spark.ml.classification.LogisticRegression
 
setThreshold(double) - Method in class org.apache.spark.ml.classification.LogisticRegressionModel
 
setThreshold(double) - Method in class org.apache.spark.mllib.classification.LogisticRegressionModel
:: Experimental :: Sets the threshold that separates positive predictions from negative predictions in Binary Logistic Regression.
setThreshold(double) - Method in class org.apache.spark.mllib.classification.SVMModel
:: Experimental :: Sets the threshold that separates positive predictions from negative predictions.
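A short sketch of adjusting the decision threshold on a trained binary classifier; the tiny training set and the 0.6 threshold are illustrative, and `sc` is an assumed SparkContext.

    import org.apache.spark.mllib.classification.LogisticRegressionWithLBFGS
    import org.apache.spark.mllib.linalg.Vectors
    import org.apache.spark.mllib.regression.LabeledPoint

    val training = sc.parallelize(Seq(
      LabeledPoint(0.0, Vectors.dense(0.0, 1.0)),
      LabeledPoint(1.0, Vectors.dense(1.0, 0.0))))

    val model = new LogisticRegressionWithLBFGS().run(training)
    model.setThreshold(0.6)          // scores above 0.6 are labeled positive
    val label = model.predict(Vectors.dense(0.9, 0.1))
    // model.clearThreshold() would make predict return raw scores instead of labels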
setTime(long) - Method in class org.apache.spark.util.ManualClock
 
setTopicConcentration(double) - Method in class org.apache.spark.mllib.clustering.LDA
Concentration parameter (commonly named "beta" or "eta") for the prior placed on topics' distributions over terms.
setTreeStrategy(Strategy) - Method in class org.apache.spark.mllib.tree.configuration.BoostingStrategy
 
setup(int, int, int) - Method in class org.apache.spark.SparkHadoopWriter
 
setUpdater(Updater) - Method in class org.apache.spark.mllib.optimization.GradientDescent
Set the updater function to actually perform a gradient step in a given direction.
setUpdater(Updater) - Method in class org.apache.spark.mllib.optimization.LBFGS
Set the updater function to actually perform a gradient step in a given direction.
setupGroups(int) - Method in class org.apache.spark.rdd.PartitionCoalescer
Initializes targetLen partition groups and assigns a preferredLocation to each. This uses the coupon-collector problem to estimate how many preferredLocations it must rotate through until it has seen most of the preferred locations (2 * n log(n)).
setupSecureURLConnection(URLConnection, SecurityManager) - Static method in class org.apache.spark.util.Utils
If the given URL connection is HttpsURLConnection, it sets the SSL socket factory and the host verifier from the given security manager.
setUseNodeIdCache(boolean) - Method in class org.apache.spark.mllib.tree.configuration.Strategy
 
setUserBlocks(int) - Method in class org.apache.spark.mllib.recommendation.ALS
Set the number of user blocks to parallelize the computation.
setUserCol(String) - Method in class org.apache.spark.ml.recommendation.ALS
 
setValidateData(boolean) - Method in class org.apache.spark.mllib.regression.GeneralizedLinearAlgorithm
Set if the algorithm should validate data before training.
setValue(R) - Method in class org.apache.spark.Accumulable
Set the accumulator's value; only allowed on master
setVectorSize(int) - Method in class org.apache.spark.mllib.feature.Word2Vec
Sets vector size (default: 100).
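A small Word2Vec sketch combining setVectorSize and setSeed; the corpus path, the query word, and `sc` are assumptions for illustration.

    import org.apache.spark.mllib.feature.Word2Vec
    import org.apache.spark.rdd.RDD

    val corpus: RDD[Seq[String]] = sc.textFile("docs.txt").map(_.split(" ").toSeq)

    val model = new Word2Vec()
      .setVectorSize(100)   // dimensionality of the learned word vectors
      .setSeed(42L)         // deterministic initialization
      .fit(corpus)

    val synonyms = model.findSynonyms("spark", 5)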
setViewAcls(Set<String>, String) - Method in class org.apache.spark.SecurityManager
Admin acls should be set before the view or modify acls.
setViewAcls(String, String) - Method in class org.apache.spark.SecurityManager
 
setWithMean(boolean) - Method in class org.apache.spark.mllib.feature.StandardScalerModel
 
setWithStd(boolean) - Method in class org.apache.spark.mllib.feature.StandardScalerModel
 
shape() - Method in class org.apache.spark.mllib.random.GammaGenerator
 
shardId() - Method in class org.apache.spark.streaming.kinesis.KinesisRecordProcessor
 
ShimFileSinkDesc - Class in org.apache.spark.sql.hive
 
ShimFileSinkDesc(String, TableDesc, boolean) - Constructor for class org.apache.spark.sql.hive.ShimFileSinkDesc
 
shippablePartitionToOps(ShippableVertexPartition<VD>, ClassTag<VD>) - Static method in class org.apache.spark.graphx.impl.ShippableVertexPartition
Implicit conversion to allow invoking VertexPartitionBase operations directly on a ShippableVertexPartition.
ShippableVertexPartition<VD> - Class in org.apache.spark.graphx.impl
A map from vertex id to vertex attribute that additionally stores edge partition join sites for each vertex attribute, enabling joining with an EdgeRDD.
ShippableVertexPartition(OpenHashSet<Object>, Object, BitSet, RoutingTablePartition, ClassTag<VD>) - Constructor for class org.apache.spark.graphx.impl.ShippableVertexPartition
 
ShippableVertexPartition.ShippableVertexPartitionOpsConstructor$ - Class in org.apache.spark.graphx.impl
Implicit evidence that ShippableVertexPartition is a member of the VertexPartitionBaseOpsConstructor typeclass.
ShippableVertexPartition.ShippableVertexPartitionOpsConstructor$() - Constructor for class org.apache.spark.graphx.impl.ShippableVertexPartition.ShippableVertexPartitionOpsConstructor$
 
ShippableVertexPartitionOps<VD> - Class in org.apache.spark.graphx.impl
 
ShippableVertexPartitionOps(ShippableVertexPartition<VD>, ClassTag<VD>) - Constructor for class org.apache.spark.graphx.impl.ShippableVertexPartitionOps
 
shipVertexAttributes(boolean, boolean) - Method in class org.apache.spark.graphx.impl.ShippableVertexPartition
Generate a VertexAttributeBlock for each edge partition keyed on the edge partition ID.
shipVertexAttributes(boolean, boolean) - Method in class org.apache.spark.graphx.impl.VertexRDDImpl
 
shipVertexAttributes(boolean, boolean) - Method in class org.apache.spark.graphx.VertexRDD
Generates an RDD of vertex attributes suitable for shipping to the edge partitions.
shipVertexIds() - Method in class org.apache.spark.graphx.impl.ShippableVertexPartition
Generate a VertexId array for each edge partition keyed on the edge partition ID.
shipVertexIds() - Method in class org.apache.spark.graphx.impl.VertexRDDImpl
 
shipVertexIds() - Method in class org.apache.spark.graphx.VertexRDD
Generates an RDD of vertex IDs suitable for shipping to the edge partitions.
SHORT - Class in org.apache.spark.sql.columnar
 
SHORT() - Constructor for class org.apache.spark.sql.columnar.SHORT
 
SHORT_FORM() - Static method in class org.apache.spark.util.CallSite
 
ShortColumnAccessor - Class in org.apache.spark.sql.columnar
 
ShortColumnAccessor(ByteBuffer) - Constructor for class org.apache.spark.sql.columnar.ShortColumnAccessor
 
ShortColumnBuilder - Class in org.apache.spark.sql.columnar
 
ShortColumnBuilder() - Constructor for class org.apache.spark.sql.columnar.ShortColumnBuilder
 
ShortColumnStats - Class in org.apache.spark.sql.columnar
 
ShortColumnStats() - Constructor for class org.apache.spark.sql.columnar.ShortColumnStats
 
ShortestPaths - Class in org.apache.spark.graphx.lib
Computes shortest paths to the given set of landmark vertices, returning a graph where each vertex attribute is a map containing the shortest-path distance to each reachable landmark.
ShortestPaths() - Constructor for class org.apache.spark.graphx.lib.ShortestPaths
 
shortForm() - Method in class org.apache.spark.util.CallSite
 
shortParquetCompressionCodecNames() - Static method in class org.apache.spark.sql.parquet.ParquetRelation
 
shouldCheckpoint() - Method in class org.apache.spark.streaming.kinesis.KinesisCheckpointState
Check if it's time to checkpoint based on the current time and the derived time for the next checkpoint
shouldRollover(long) - Method in interface org.apache.spark.util.logging.RollingPolicy
Whether rollover should be initiated at this moment
shouldRollover(long) - Method in class org.apache.spark.util.logging.SizeBasedRollingPolicy
Should roll over if the next set of bytes would exceed the size limit
shouldRollover(long) - Method in class org.apache.spark.util.logging.TimeBasedRollingPolicy
Should roll over if the current time has exceeded the next rollover time
show(int) - Method in class org.apache.spark.sql.DataFrame
Displays the DataFrame in a tabular form.
show() - Method in class org.apache.spark.sql.DataFrame
Displays the top 20 rows of the DataFrame in tabular form.
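For illustration, a minimal sketch of both overloads, assuming an existing SQLContext `sqlContext` and a hypothetical JSON file:

    val df = sqlContext.jsonFile("examples/people.json")
    df.show()    // prints the first 20 rows in tabular form
    df.show(5)   // prints only the first 5 rows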
showBytesDistribution(String, Function2<TaskInfo, TaskMetrics, Option<Object>>, Seq<Tuple2<TaskInfo, TaskMetrics>>) - Static method in class org.apache.spark.scheduler.StatsReportListener
 
showBytesDistribution(String, Option<Distribution>) - Static method in class org.apache.spark.scheduler.StatsReportListener
 
showBytesDistribution(String, Distribution) - Static method in class org.apache.spark.scheduler.StatsReportListener
 
showDistribution(String, Distribution, Function1<Object, String>) - Static method in class org.apache.spark.scheduler.StatsReportListener
 
showDistribution(String, Option<Distribution>, Function1<Object, String>) - Static method in class org.apache.spark.scheduler.StatsReportListener
 
showDistribution(String, Option<Distribution>, String) - Static method in class org.apache.spark.scheduler.StatsReportListener
 
showDistribution(String, String, Function2<TaskInfo, TaskMetrics, Option<Object>>, Seq<Tuple2<TaskInfo, TaskMetrics>>) - Static method in class org.apache.spark.scheduler.StatsReportListener
 
showMillisDistribution(String, Option<Distribution>) - Static method in class org.apache.spark.scheduler.StatsReportListener
 
showMillisDistribution(String, Function2<TaskInfo, TaskMetrics, Option<Object>>, Seq<Tuple2<TaskInfo, TaskMetrics>>) - Static method in class org.apache.spark.scheduler.StatsReportListener
 
showMillisDistribution(String, Function1<BatchInfo, Option<Object>>) - Method in class org.apache.spark.streaming.scheduler.StatsReportListener
 
showQuantiles(PrintStream) - Method in class org.apache.spark.util.Distribution
 
showString(int) - Method in class org.apache.spark.sql.DataFrame
Internal API for Python
SHUFFLE() - Static method in class org.apache.spark.storage.BlockId
 
SHUFFLE_BLOCK_MANAGER() - Static method in class org.apache.spark.util.MetadataCleanerType
 
SHUFFLE_DATA() - Static method in class org.apache.spark.storage.BlockId
 
SHUFFLE_INDEX() - Static method in class org.apache.spark.storage.BlockId
 
SHUFFLE_PARTITIONS() - Static method in class org.apache.spark.sql.SQLConf
 
SHUFFLE_READ() - Static method in class org.apache.spark.ui.ToolTips
 
SHUFFLE_READ_BLOCKED_TIME() - Static method in class org.apache.spark.ui.jobs.TaskDetailsClassNames
 
SHUFFLE_READ_BLOCKED_TIME() - Static method in class org.apache.spark.ui.ToolTips
 
SHUFFLE_READ_REMOTE_SIZE() - Static method in class org.apache.spark.ui.jobs.TaskDetailsClassNames
 
SHUFFLE_READ_REMOTE_SIZE() - Static method in class org.apache.spark.ui.ToolTips
 
SHUFFLE_WRITE() - Static method in class org.apache.spark.ui.ToolTips
 
ShuffleBlockFetcherIterator - Class in org.apache.spark.storage
An iterator that fetches multiple blocks.
ShuffleBlockFetcherIterator(TaskContext, ShuffleClient, BlockManager, Seq<Tuple2<BlockManagerId, Seq<Tuple2<BlockId, Object>>>>, Serializer, long) - Constructor for class org.apache.spark.storage.ShuffleBlockFetcherIterator
 
ShuffleBlockFetcherIterator.FailureFetchResult - Class in org.apache.spark.storage
Result of an unsuccessful fetch of a remote block.
ShuffleBlockFetcherIterator.FailureFetchResult(BlockId, Throwable) - Constructor for class org.apache.spark.storage.ShuffleBlockFetcherIterator.FailureFetchResult
 
ShuffleBlockFetcherIterator.FailureFetchResult$ - Class in org.apache.spark.storage
 
ShuffleBlockFetcherIterator.FailureFetchResult$() - Constructor for class org.apache.spark.storage.ShuffleBlockFetcherIterator.FailureFetchResult$
 
ShuffleBlockFetcherIterator.FetchRequest - Class in org.apache.spark.storage
A request to fetch blocks from a remote BlockManager.
ShuffleBlockFetcherIterator.FetchRequest(BlockManagerId, Seq<Tuple2<BlockId, Object>>) - Constructor for class org.apache.spark.storage.ShuffleBlockFetcherIterator.FetchRequest
 
ShuffleBlockFetcherIterator.FetchRequest$ - Class in org.apache.spark.storage
 
ShuffleBlockFetcherIterator.FetchRequest$() - Constructor for class org.apache.spark.storage.ShuffleBlockFetcherIterator.FetchRequest$
 
ShuffleBlockFetcherIterator.FetchResult - Interface in org.apache.spark.storage
Result of a fetch from a remote block.
ShuffleBlockFetcherIterator.SuccessFetchResult - Class in org.apache.spark.storage
Result of a successful fetch of a remote block.
ShuffleBlockFetcherIterator.SuccessFetchResult(BlockId, long, ManagedBuffer) - Constructor for class org.apache.spark.storage.ShuffleBlockFetcherIterator.SuccessFetchResult
 
ShuffleBlockFetcherIterator.SuccessFetchResult$ - Class in org.apache.spark.storage
 
ShuffleBlockFetcherIterator.SuccessFetchResult$() - Constructor for class org.apache.spark.storage.ShuffleBlockFetcherIterator.SuccessFetchResult$
 
ShuffleBlockId - Class in org.apache.spark.storage
 
ShuffleBlockId(int, int, int) - Constructor for class org.apache.spark.storage.ShuffleBlockId
 
shuffleCleaned(int) - Method in interface org.apache.spark.CleanerListener
 
shuffleClient() - Method in class org.apache.spark.storage.BlockManager
 
ShuffleCoGroupSplitDep - Class in org.apache.spark.rdd
 
ShuffleCoGroupSplitDep(ShuffleHandle) - Constructor for class org.apache.spark.rdd.ShuffleCoGroupSplitDep
 
ShuffleDataBlockId - Class in org.apache.spark.storage
 
ShuffleDataBlockId(int, int, int) - Constructor for class org.apache.spark.storage.ShuffleDataBlockId
 
ShuffledDStream<K,V,C> - Class in org.apache.spark.streaming.dstream
 
ShuffledDStream(DStream<Tuple2<K, V>>, Function1<V, C>, Function2<C, V, C>, Function2<C, C, C>, Partitioner, boolean, ClassTag<K>, ClassTag<V>, ClassTag<C>) - Constructor for class org.apache.spark.streaming.dstream.ShuffledDStream
 
shuffleDep() - Method in class org.apache.spark.scheduler.Stage
 
ShuffleDependency<K,V,C> - Class in org.apache.spark
:: DeveloperApi :: Represents a dependency on the output of a shuffle stage.
ShuffleDependency(RDD<? extends Product2<K, V>>, Partitioner, Option<Serializer>, Option<Ordering<K>>, Option<Aggregator<K, V, C>>, boolean) - Constructor for class org.apache.spark.ShuffleDependency
 
ShuffledRDD<K,V,C> - Class in org.apache.spark.rdd
:: DeveloperApi :: The resulting RDD from a shuffle (e.g. repartitioning of data).
ShuffledRDD(RDD<? extends Product2<K, V>>, Partitioner) - Constructor for class org.apache.spark.rdd.ShuffledRDD
 
ShuffledRDDPartition - Class in org.apache.spark.rdd
 
ShuffledRDDPartition(int) - Constructor for class org.apache.spark.rdd.ShuffledRDDPartition
 
shuffleHandle() - Method in class org.apache.spark.ShuffleDependency
 
shuffleId() - Method in class org.apache.spark.CleanShuffle
 
shuffleId() - Method in class org.apache.spark.FetchFailed
 
shuffleId() - Method in class org.apache.spark.GetMapOutputStatuses
 
shuffleId() - Method in class org.apache.spark.ShuffleDependency
 
shuffleId() - Method in class org.apache.spark.storage.BlockManagerMessages.RemoveShuffle
 
shuffleId() - Method in class org.apache.spark.storage.ShuffleBlockId
 
shuffleId() - Method in class org.apache.spark.storage.ShuffleDataBlockId
 
shuffleId() - Method in class org.apache.spark.storage.ShuffleIndexBlockId
 
ShuffleIndexBlockId - Class in org.apache.spark.storage
 
ShuffleIndexBlockId(int, int, int) - Constructor for class org.apache.spark.storage.ShuffleIndexBlockId
 
shuffleManager() - Method in class org.apache.spark.SparkEnv
 
ShuffleMapTask - Class in org.apache.spark.scheduler
A ShuffleMapTask divides the elements of an RDD into multiple buckets (based on a partitioner specified in the ShuffleDependency).
ShuffleMapTask(int, Broadcast<byte[]>, Partition, Seq<TaskLocation>) - Constructor for class org.apache.spark.scheduler.ShuffleMapTask
 
ShuffleMapTask(int) - Constructor for class org.apache.spark.scheduler.ShuffleMapTask
A constructor used only in test suites.
shuffleMemoryManager() - Method in class org.apache.spark.SparkEnv
 
shuffleRead() - Method in class org.apache.spark.ui.jobs.UIData.ExecutorSummary
 
shuffleReadMetricsFromJson(JsonAST.JValue) - Static method in class org.apache.spark.util.JsonProtocol
 
shuffleReadMetricsToJson(ShuffleReadMetrics) - Static method in class org.apache.spark.util.JsonProtocol
 
shuffleReadRecords() - Method in class org.apache.spark.ui.jobs.UIData.ExecutorSummary
 
shuffleReadRecords() - Method in class org.apache.spark.ui.jobs.UIData.StageUIData
 
shuffleReadTotalBytes() - Method in class org.apache.spark.ui.jobs.UIData.StageUIData
 
shuffleServerId() - Method in class org.apache.spark.storage.BlockManager
 
shuffleToMapStage() - Method in class org.apache.spark.scheduler.DAGScheduler
 
shuffleWrite() - Method in class org.apache.spark.ui.jobs.UIData.ExecutorSummary
 
shuffleWriteBytes() - Method in class org.apache.spark.ui.jobs.UIData.StageUIData
 
shuffleWriteMetricsFromJson(JsonAST.JValue) - Static method in class org.apache.spark.util.JsonProtocol
 
shuffleWriteMetricsToJson(ShuffleWriteMetrics) - Static method in class org.apache.spark.util.JsonProtocol
 
shuffleWriteRecords() - Method in class org.apache.spark.ui.jobs.UIData.ExecutorSummary
 
shuffleWriteRecords() - Method in class org.apache.spark.ui.jobs.UIData.StageUIData
 
shutdown(IRecordProcessorCheckpointer, ShutdownReason) - Method in class org.apache.spark.streaming.kinesis.KinesisRecordProcessor
The Kinesis Client Library shuts down this Worker for one of two reasons: (1) the stream is resharding by splitting or merging adjacent shards (ShutdownReason.TERMINATE), or (2) the failed or latent Worker has stopped sending heartbeats for whatever reason (ShutdownReason.ZOMBIE).
shutdownCallback() - Method in class org.apache.spark.scheduler.cluster.SparkDeploySchedulerBackend
 
sigma() - Method in class org.apache.spark.mllib.stat.distribution.MultivariateGaussian
 
sigmas() - Method in class org.apache.spark.mllib.clustering.ExpectationSum
 
SignalLogger - Class in org.apache.spark.util
Used to log signals received.
SignalLogger() - Constructor for class org.apache.spark.util.SignalLogger
 
SignalLoggerHandler - Class in org.apache.spark.util
 
SignalLoggerHandler(String, Logger) - Constructor for class org.apache.spark.util.SignalLoggerHandler
 
SimpleFutureAction<T> - Class in org.apache.spark
A FutureAction holding the result of an action that triggers a single job.
SimpleFutureAction(JobWaiter<?>, Function0<T>) - Constructor for class org.apache.spark.SimpleFutureAction
 
simpleString() - Method in class org.apache.spark.sql.sources.LogicalRelation
 
SimpleUpdater - Class in org.apache.spark.mllib.optimization
:: DeveloperApi :: A simple updater for gradient descent *without* any regularization.
SimpleUpdater() - Constructor for class org.apache.spark.mllib.optimization.SimpleUpdater
 
simpleWritableConverter(Function1<W, T>, ClassTag<W>) - Static method in class org.apache.spark.WritableConverter
 
simpleWritableFactory(Function1<T, W>, ClassTag<T>, ClassTag<W>) - Static method in class org.apache.spark.WritableFactory
 
SimrSchedulerBackend - Class in org.apache.spark.scheduler.cluster
 
SimrSchedulerBackend(TaskSchedulerImpl, SparkContext, String) - Constructor for class org.apache.spark.scheduler.cluster.SimrSchedulerBackend
 
SingleItemData<T> - Class in org.apache.spark.streaming.receiver
 
SingleItemData(T) - Constructor for class org.apache.spark.streaming.receiver.SingleItemData
 
SingularValueDecomposition<UType,VType> - Class in org.apache.spark.mllib.linalg
:: Experimental :: Represents singular value decomposition (SVD) factors.
SingularValueDecomposition(UType, Vector, VType) - Constructor for class org.apache.spark.mllib.linalg.SingularValueDecomposition
 
Sink - Interface in org.apache.spark.metrics.sink
 
SINK_REGEX() - Static method in class org.apache.spark.metrics.MetricsSystem
 
size() - Method in class org.apache.spark.api.java.JavaUtils.SerializableMapWrapper
 
size() - Method in class org.apache.spark.graphx.impl.EdgePartition
The number of edges in this partition
size() - Method in class org.apache.spark.graphx.impl.VertexPartitionBase
 
size() - Method in class org.apache.spark.ml.param.ParamMap
Number of param pairs in this set.
size() - Method in class org.apache.spark.ml.recommendation.ALS.InBlock
Size of the block.
size() - Method in class org.apache.spark.ml.recommendation.ALS.RatingBlock
Size of the block.
size() - Method in class org.apache.spark.ml.recommendation.ALS.RatingBlockBuilder
 
size() - Method in class org.apache.spark.mllib.linalg.DenseVector
 
size() - Method in class org.apache.spark.mllib.linalg.SparseVector
 
size() - Method in interface org.apache.spark.mllib.linalg.Vector
Size of the vector.
size() - Method in class org.apache.spark.mllib.rdd.RandomRDDPartition
 
size() - Method in class org.apache.spark.rdd.PartitionGroup
 
size() - Method in class org.apache.spark.scheduler.IndirectTaskResult
 
size() - Method in class org.apache.spark.sql.parquet.CatalystArrayContainsNullConverter
 
size() - Method in class org.apache.spark.sql.parquet.CatalystArrayConverter
 
size() - Method in class org.apache.spark.sql.parquet.CatalystGroupConverter
 
size() - Method in class org.apache.spark.sql.parquet.CatalystNativeArrayConverter
 
size() - Method in class org.apache.spark.sql.parquet.CatalystPrimitiveRowConverter
 
size() - Method in class org.apache.spark.storage.BlockInfo
 
size() - Method in class org.apache.spark.storage.MemoryEntry
 
size() - Method in class org.apache.spark.storage.PutResult
 
size() - Method in class org.apache.spark.storage.ShuffleBlockFetcherIterator.FetchRequest
 
size() - Method in class org.apache.spark.storage.ShuffleBlockFetcherIterator.SuccessFetchResult
 
size() - Method in class org.apache.spark.util.BoundedPriorityQueue
 
size() - Method in class org.apache.spark.util.TimeStampedHashMap
 
size() - Method in class org.apache.spark.util.TimeStampedHashSet
 
size() - Method in class org.apache.spark.util.TimeStampedWeakValueHashMap
 
SIZE_DEFAULT() - Static method in class org.apache.spark.util.logging.RollingFileAppender
 
SIZE_PROPERTY() - Static method in class org.apache.spark.util.logging.RollingFileAppender
 
SizeBasedRollingPolicy - Class in org.apache.spark.util.logging
Defines a RollingPolicy by which files will be rolled over after reaching a particular size.
SizeBasedRollingPolicy(long, boolean) - Constructor for class org.apache.spark.util.logging.SizeBasedRollingPolicy
 
SizeEstimator - Class in org.apache.spark.util
Estimates the sizes of Java objects (number of bytes of memory they occupy), for use in memory-aware caches.
SizeEstimator() - Constructor for class org.apache.spark.util.SizeEstimator
 
sizeInBytes() - Method in class org.apache.spark.sql.columnar.ColumnStatisticsSchema
 
sizeInBytes() - Method in interface org.apache.spark.sql.columnar.ColumnStats
 
sizeInBytes() - Method in class org.apache.spark.sql.parquet.ParquetRelation2
 
sizeInBytes() - Method in class org.apache.spark.sql.sources.BaseRelation
Returns an estimated size of this relation in bytes.
sketch(RDD<K>, int, ClassTag<K>) - Static method in class org.apache.spark.RangePartitioner
Sketches the input RDD via reservoir sampling on each partition.
skip(long) - Method in class org.apache.spark.util.ByteBufferInputStream
 
skippedStages() - Method in class org.apache.spark.ui.jobs.JobProgressListener
 
slack() - Method in class org.apache.spark.rdd.PartitionCoalescer
 
slaveActor() - Method in class org.apache.spark.storage.BlockManagerInfo
 
slaveIdsWithExecutors() - Method in class org.apache.spark.scheduler.cluster.mesos.CoarseMesosSchedulerBackend
 
slaveIdsWithExecutors() - Method in class org.apache.spark.scheduler.cluster.mesos.MesosSchedulerBackend
 
slaveLost(SchedulerDriver, Protos.SlaveID) - Method in class org.apache.spark.scheduler.cluster.mesos.CoarseMesosSchedulerBackend
 
slaveLost(SchedulerDriver, Protos.SlaveID) - Method in class org.apache.spark.scheduler.cluster.mesos.MesosSchedulerBackend
 
SlaveLost - Class in org.apache.spark.scheduler
 
SlaveLost(String) - Constructor for class org.apache.spark.scheduler.SlaveLost
 
slaveTimeout() - Method in class org.apache.spark.storage.BlockManagerMasterActor
 
slice() - Method in class org.apache.spark.rdd.ParallelCollectionPartition
 
slice(Seq<T>, int, ClassTag<T>) - Static method in class org.apache.spark.rdd.ParallelCollectionRDD
Slice a collection into numSlices sub-collections.
slice(Time, Time) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
Return all the RDDs between 'fromDuration' and 'toDuration' (both included)
slice(Interval) - Method in class org.apache.spark.streaming.dstream.DStream
Return all the RDDs defined by the Interval object (both end times included)
slice(Time, Time) - Method in class org.apache.spark.streaming.dstream.DStream
Return all the RDDs between 'fromTime' and 'toTime' (both included)
slideDuration() - Method in class org.apache.spark.streaming.dstream.DStream
Time interval after which the DStream generates an RDD
slideDuration() - Method in class org.apache.spark.streaming.dstream.FilteredDStream
 
slideDuration() - Method in class org.apache.spark.streaming.dstream.FlatMappedDStream
 
slideDuration() - Method in class org.apache.spark.streaming.dstream.FlatMapValuedDStream
 
slideDuration() - Method in class org.apache.spark.streaming.dstream.ForEachDStream
 
slideDuration() - Method in class org.apache.spark.streaming.dstream.GlommedDStream
 
slideDuration() - Method in class org.apache.spark.streaming.dstream.InputDStream
 
slideDuration() - Method in class org.apache.spark.streaming.dstream.MapPartitionedDStream
 
slideDuration() - Method in class org.apache.spark.streaming.dstream.MappedDStream
 
slideDuration() - Method in class org.apache.spark.streaming.dstream.MapValuedDStream
 
slideDuration() - Method in class org.apache.spark.streaming.dstream.ReducedWindowedDStream
 
slideDuration() - Method in class org.apache.spark.streaming.dstream.ShuffledDStream
 
slideDuration() - Method in class org.apache.spark.streaming.dstream.StateDStream
 
slideDuration() - Method in class org.apache.spark.streaming.dstream.TransformedDStream
 
slideDuration() - Method in class org.apache.spark.streaming.dstream.UnionDStream
 
slideDuration() - Method in class org.apache.spark.streaming.dstream.WindowedDStream
 
sliding(int) - Method in class org.apache.spark.mllib.rdd.RDDFunctions
Returns an RDD created by grouping items of its parent RDD into fixed-size blocks by passing a sliding window over them.
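A small sketch of sliding, which is brought in by the RDDFunctions implicit conversion; `sc` is an assumed SparkContext.

    import org.apache.spark.mllib.rdd.RDDFunctions._

    // Windows of length 3 over 1..5, crossing partition boundaries.
    val windows = sc.parallelize(1 to 5, numSlices = 2).sliding(3).collect()
    // Array(Array(1, 2, 3), Array(2, 3, 4), Array(3, 4, 5))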
SlidingRDD<T> - Class in org.apache.spark.mllib.rdd
Represents an RDD created by grouping items of its parent RDD into fixed-size blocks by passing a sliding window over them.
SlidingRDD(RDD<T>, int, ClassTag<T>) - Constructor for class org.apache.spark.mllib.rdd.SlidingRDD
 
SlidingRDDPartition<T> - Class in org.apache.spark.mllib.rdd
 
SlidingRDDPartition(int, Partition, Seq<T>) - Constructor for class org.apache.spark.mllib.rdd.SlidingRDDPartition
 
SnappyCompressionCodec - Class in org.apache.spark.io
:: DeveloperApi :: Snappy implementation of CompressionCodec.
SnappyCompressionCodec(SparkConf) - Constructor for class org.apache.spark.io.SnappyCompressionCodec
 
SocketInputDStream<T> - Class in org.apache.spark.streaming.dstream
 
SocketInputDStream(StreamingContext, String, int, Function1<InputStream, Iterator<T>>, StorageLevel, ClassTag<T>) - Constructor for class org.apache.spark.streaming.dstream.SocketInputDStream
 
SocketReceiver<T> - Class in org.apache.spark.streaming.dstream
 
SocketReceiver(String, int, Function1<InputStream, Iterator<T>>, StorageLevel, ClassTag<T>) - Constructor for class org.apache.spark.streaming.dstream.SocketReceiver
 
socketStream(String, int, Function<InputStream, Iterable<T>>, StorageLevel) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
Create an input stream from network source hostname:port.
socketStream(String, int, Function1<InputStream, Iterator<T>>, StorageLevel, ClassTag<T>) - Method in class org.apache.spark.streaming.StreamingContext
Create an input stream from a TCP source hostname:port.
socketTextStream(String, int, StorageLevel) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
Create an input stream from network source hostname:port.
socketTextStream(String, int) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
Create an input stream from network source hostname:port.
socketTextStream(String, int, StorageLevel) - Method in class org.apache.spark.streaming.StreamingContext
Create an input stream from a TCP source hostname:port.
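A minimal streaming word count using socketTextStream; the host, port, and batch interval are placeholders.

    import org.apache.spark.SparkConf
    import org.apache.spark.streaming.{Seconds, StreamingContext}

    val conf = new SparkConf().setAppName("SocketWordCount").setMaster("local[2]")
    val ssc = new StreamingContext(conf, Seconds(1))

    val lines = ssc.socketTextStream("localhost", 9999)
    lines.flatMap(_.split(" ")).map((_, 1)).reduceByKey(_ + _).print()

    ssc.start()
    ssc.awaitTermination()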
solve(ALS.NormalEquation, double) - Method in class org.apache.spark.ml.recommendation.ALS.CholeskySolver
Solves a least squares problem with L2 regularization:
solve(ALS.NormalEquation, double) - Method in interface org.apache.spark.ml.recommendation.ALS.LeastSquaresNESolver
Solves a least squares problem (possibly with other constraints).
solve(ALS.NormalEquation, double) - Method in class org.apache.spark.ml.recommendation.ALS.NNLSSolver
Solves a nonnegative least squares problem with L2 regularization:
solve(DoubleMatrix, DoubleMatrix, NNLS.Workspace) - Static method in class org.apache.spark.mllib.optimization.NNLS
Solve a least squares problem, possibly with nonnegativity constraints, by a modified projected gradient method.
Sort() - Static method in class org.apache.spark.mllib.tree.configuration.QuantileStrategy
 
sort(String, String...) - Method in class org.apache.spark.sql.DataFrame
Returns a new DataFrame sorted by the specified column, all in ascending order.
sort(Column...) - Method in class org.apache.spark.sql.DataFrame
Returns a new DataFrame sorted by the given expressions.
sort(String, Seq<String>) - Method in class org.apache.spark.sql.DataFrame
Returns a new DataFrame sorted by the specified column, all in ascending order.
sort(Seq<Column>) - Method in class org.apache.spark.sql.DataFrame
Returns a new DataFrame sorted by the given expressions.
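A brief sketch of the two sort variants, assuming a hypothetical DataFrame `df` with columns "age" and "name":

    import org.apache.spark.sql.functions.col

    df.sort("age", "name")      // ascending on both columns
    df.sort(col("age").desc)    // descending, using a Column expression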
sortBy(Function<T, S>, boolean, int) - Method in class org.apache.spark.api.java.JavaRDD
Return this RDD sorted by the given key function.
sortBy(Function1<T, K>, boolean, int, Ordering<K>, ClassTag<K>) - Method in class org.apache.spark.rdd.RDD
Return this RDD sorted by the given key function.
sortByKey() - Method in class org.apache.spark.api.java.JavaPairRDD
Sort the RDD by key, so that each partition contains a sorted range of the elements in ascending order.
sortByKey(boolean) - Method in class org.apache.spark.api.java.JavaPairRDD
Sort the RDD by key, so that each partition contains a sorted range of the elements.
sortByKey(boolean, int) - Method in class org.apache.spark.api.java.JavaPairRDD
Sort the RDD by key, so that each partition contains a sorted range of the elements.
sortByKey(Comparator<K>) - Method in class org.apache.spark.api.java.JavaPairRDD
Sort the RDD by key, so that each partition contains a sorted range of the elements.
sortByKey(Comparator<K>, boolean) - Method in class org.apache.spark.api.java.JavaPairRDD
Sort the RDD by key, so that each partition contains a sorted range of the elements.
sortByKey(Comparator<K>, boolean, int) - Method in class org.apache.spark.api.java.JavaPairRDD
Sort the RDD by key, so that each partition contains a sorted range of the elements.
sortByKey(boolean, int) - Method in class org.apache.spark.rdd.OrderedRDDFunctions
Sort the RDD by key, so that each partition contains a sorted range of the elements.
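A small sortByKey sketch; `sc` is an assumed SparkContext. Each partition of the result holds a contiguous, sorted range of keys.

    val pairs = sc.parallelize(Seq(("b", 2), ("a", 1), ("c", 3)))
    val ascending  = pairs.sortByKey()
    val descending = pairs.sortByKey(false, numPartitions = 2)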
Source - Interface in org.apache.spark.metrics.source
 
SOURCE_REGEX() - Static method in class org.apache.spark.metrics.MetricsSystem
 
sourceName() - Method in class org.apache.spark.metrics.source.JvmSource
 
sourceName() - Method in interface org.apache.spark.metrics.source.Source
 
sourceName() - Method in class org.apache.spark.scheduler.DAGSchedulerSource
 
sourceName() - Method in class org.apache.spark.storage.BlockManagerSource
 
sourceName() - Method in class org.apache.spark.streaming.StreamingSource
 
SPARK_CONTEXT() - Static method in class org.apache.spark.util.MetadataCleanerType
 
SPARK_JOB_DESCRIPTION() - Static method in class org.apache.spark.SparkContext
 
SPARK_JOB_GROUP_ID() - Static method in class org.apache.spark.SparkContext
 
SPARK_JOB_INTERRUPT_ON_CANCEL() - Static method in class org.apache.spark.SparkContext
 
SPARK_METADATA_KEY() - Static method in class org.apache.spark.sql.parquet.RowReadSupport
 
SPARK_ROW_REQUESTED_SCHEMA() - Static method in class org.apache.spark.sql.parquet.RowReadSupport
 
SPARK_ROW_SCHEMA() - Static method in class org.apache.spark.sql.parquet.RowWriteSupport
 
SPARK_VERSION_KEY() - Static method in class org.apache.spark.scheduler.EventLoggingListener
 
SparkConf - Class in org.apache.spark
Configuration for a Spark application.
SparkConf(boolean) - Constructor for class org.apache.spark.SparkConf
 
SparkConf() - Constructor for class org.apache.spark.SparkConf
Create a SparkConf that loads defaults from system properties and the classpath
sparkConf() - Method in class org.apache.spark.streaming.Checkpoint
 
sparkConfPairs() - Method in class org.apache.spark.streaming.Checkpoint
 
sparkContext() - Method in class org.apache.spark.rdd.RDD
The SparkContext that created this RDD.
SparkContext - Class in org.apache.spark
Main entry point for Spark functionality.
SparkContext(SparkConf) - Constructor for class org.apache.spark.SparkContext
 
SparkContext() - Constructor for class org.apache.spark.SparkContext
Create a SparkContext that loads settings from system properties (for instance, when launching with ./bin/spark-submit).
SparkContext(SparkConf, Map<String, Set<SplitInfo>>) - Constructor for class org.apache.spark.SparkContext
:: DeveloperApi :: Alternative constructor for setting preferred locations where Spark will create executors.
SparkContext(String, String, SparkConf) - Constructor for class org.apache.spark.SparkContext
Alternative constructor that allows setting common Spark properties directly
SparkContext(String, String, String, Seq<String>, Map<String, String>, Map<String, Set<SplitInfo>>) - Constructor for class org.apache.spark.SparkContext
Alternative constructor that allows setting common Spark properties directly
SparkContext(String, String) - Constructor for class org.apache.spark.SparkContext
Alternative constructor that allows setting common Spark properties directly
SparkContext(String, String, String) - Constructor for class org.apache.spark.SparkContext
Alternative constructor that allows setting common Spark properties directly
SparkContext(String, String, String, Seq<String>) - Constructor for class org.apache.spark.SparkContext
Alternative constructor that allows setting common Spark properties directly
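A minimal sketch of the preferred construction path, configuring a SparkConf and passing it to the SparkContext constructor; the application name, master URL, and Spark home are placeholders.

    import org.apache.spark.{SparkConf, SparkContext}

    val conf = new SparkConf()
      .setAppName("example")
      .setMaster("local[2]")
      .setSparkHome("/opt/spark")   // hypothetical install location on the workers

    val sc = new SparkContext(conf)
    val evens = sc.parallelize(1 to 100).filter(_ % 2 == 0).count()
    sc.stop()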
sparkContext() - Method in class org.apache.spark.sql.parquet.ParquetRelation2
 
sparkContext() - Method in class org.apache.spark.sql.SQLContext
 
sparkContext() - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
The underlying SparkContext
sparkContext() - Method in class org.apache.spark.streaming.StreamingContext
Return the associated Spark context
SparkContext.DoubleAccumulatorParam$ - Class in org.apache.spark
 
SparkContext.DoubleAccumulatorParam$() - Constructor for class org.apache.spark.SparkContext.DoubleAccumulatorParam$
 
SparkContext.FloatAccumulatorParam$ - Class in org.apache.spark
 
SparkContext.FloatAccumulatorParam$() - Constructor for class org.apache.spark.SparkContext.FloatAccumulatorParam$
 
SparkContext.IntAccumulatorParam$ - Class in org.apache.spark
 
SparkContext.IntAccumulatorParam$() - Constructor for class org.apache.spark.SparkContext.IntAccumulatorParam$
 
SparkContext.LongAccumulatorParam$ - Class in org.apache.spark
 
SparkContext.LongAccumulatorParam$() - Constructor for class org.apache.spark.SparkContext.LongAccumulatorParam$
 
SparkDeploySchedulerBackend - Class in org.apache.spark.scheduler.cluster
 
SparkDeploySchedulerBackend(TaskSchedulerImpl, SparkContext, String[]) - Constructor for class org.apache.spark.scheduler.cluster.SparkDeploySchedulerBackend
 
SparkDriverExecutionException - Exception in org.apache.spark
Exception thrown when execution of some user code in the driver process fails, e.g.
SparkDriverExecutionException(Throwable) - Constructor for exception org.apache.spark.SparkDriverExecutionException
 
SparkEnv - Class in org.apache.spark
:: DeveloperApi :: Holds all the runtime environment objects for a running Spark instance (either master or worker), including the serializer, Akka actor system, block manager, map output tracker, etc.
SparkEnv(String, ActorSystem, Serializer, Serializer, CacheManager, MapOutputTracker, ShuffleManager, BroadcastManager, BlockTransferService, BlockManager, SecurityManager, HttpFileServer, String, MetricsSystem, ShuffleMemoryManager, OutputCommitCoordinator, SparkConf) - Constructor for class org.apache.spark.SparkEnv
 
sparkEventFromJson(JsonAST.JValue) - Static method in class org.apache.spark.util.JsonProtocol
JSON deserialization methods for SparkListenerEvents.
sparkEventToJson(SparkListenerEvent) - Static method in class org.apache.spark.util.JsonProtocol
JSON serialization methods for SparkListenerEvents.
SparkException - Exception in org.apache.spark
 
SparkException(String, Throwable) - Constructor for exception org.apache.spark.SparkException
 
SparkException(String) - Constructor for exception org.apache.spark.SparkException
 
SparkExitCode - Class in org.apache.spark.util
 
SparkExitCode() - Constructor for class org.apache.spark.util.SparkExitCode
 
SparkFiles - Class in org.apache.spark
Resolves paths to files added through SparkContext.addFile().
SparkFiles() - Constructor for class org.apache.spark.SparkFiles
 
sparkFilesDir() - Method in class org.apache.spark.SparkEnv
 
SparkFirehoseListener - Class in org.apache.spark
Class that allows users to receive all SparkListener events.
SparkFirehoseListener() - Constructor for class org.apache.spark.SparkFirehoseListener
 
SparkFlumeEvent - Class in org.apache.spark.streaming.flume
A wrapper class for AvroFlumeEvents with a custom serialization format.
SparkFlumeEvent() - Constructor for class org.apache.spark.streaming.flume.SparkFlumeEvent
 
SparkHadoopMapReduceUtil - Interface in org.apache.spark.mapreduce
 
SparkHadoopMapRedUtil - Interface in org.apache.spark.mapred
 
SparkHadoopWriter - Class in org.apache.spark
Internal helper class that saves an RDD using a Hadoop OutputFormat.
SparkHadoopWriter(JobConf) - Constructor for class org.apache.spark.SparkHadoopWriter
 
SparkHiveDynamicPartitionWriterContainer - Class in org.apache.spark.sql.hive
 
SparkHiveDynamicPartitionWriterContainer(JobConf, ShimFileSinkDesc, String[]) - Constructor for class org.apache.spark.sql.hive.SparkHiveDynamicPartitionWriterContainer
 
SparkHiveWriterContainer - Class in org.apache.spark.sql.hive
Internal helper class that saves an RDD using a Hive OutputFormat.
SparkHiveWriterContainer(JobConf, ShimFileSinkDesc) - Constructor for class org.apache.spark.sql.hive.SparkHiveWriterContainer
 
sparkJavaOpts(SparkConf, Function1<String, Object>) - Static method in class org.apache.spark.util.Utils
Convert all spark properties set in the given SparkConf to a sequence of java options.
SparkJobInfo - Interface in org.apache.spark
Exposes information about Spark Jobs.
SparkJobInfoImpl - Class in org.apache.spark
 
SparkJobInfoImpl(int, int[], JobExecutionStatus) - Constructor for class org.apache.spark.SparkJobInfoImpl
 
SparkListener - Interface in org.apache.spark.scheduler
:: DeveloperApi :: Interface for listening to events from the Spark scheduler.
SparkListenerApplicationEnd - Class in org.apache.spark.scheduler
 
SparkListenerApplicationEnd(long) - Constructor for class org.apache.spark.scheduler.SparkListenerApplicationEnd
 
SparkListenerApplicationStart - Class in org.apache.spark.scheduler
 
SparkListenerApplicationStart(String, Option<String>, long, String) - Constructor for class org.apache.spark.scheduler.SparkListenerApplicationStart
 
SparkListenerBlockManagerAdded - Class in org.apache.spark.scheduler
 
SparkListenerBlockManagerAdded(long, BlockManagerId, long) - Constructor for class org.apache.spark.scheduler.SparkListenerBlockManagerAdded
 
SparkListenerBlockManagerRemoved - Class in org.apache.spark.scheduler
 
SparkListenerBlockManagerRemoved(long, BlockManagerId) - Constructor for class org.apache.spark.scheduler.SparkListenerBlockManagerRemoved
 
SparkListenerBus - Interface in org.apache.spark.scheduler
A SparkListenerEvent bus that relays SparkListenerEvents to its listeners
SparkListenerEnvironmentUpdate - Class in org.apache.spark.scheduler
 
SparkListenerEnvironmentUpdate(Map<String, Seq<Tuple2<String, String>>>) - Constructor for class org.apache.spark.scheduler.SparkListenerEnvironmentUpdate
 
SparkListenerEvent - Interface in org.apache.spark.scheduler
 
SparkListenerExecutorAdded - Class in org.apache.spark.scheduler
 
SparkListenerExecutorAdded(long, String, ExecutorInfo) - Constructor for class org.apache.spark.scheduler.SparkListenerExecutorAdded
 
SparkListenerExecutorMetricsUpdate - Class in org.apache.spark.scheduler
Periodic updates from executors.
SparkListenerExecutorMetricsUpdate(String, Seq<Tuple4<Object, Object, Object, TaskMetrics>>) - Constructor for class org.apache.spark.scheduler.SparkListenerExecutorMetricsUpdate
 
SparkListenerExecutorRemoved - Class in org.apache.spark.scheduler
 
SparkListenerExecutorRemoved(long, String, String) - Constructor for class org.apache.spark.scheduler.SparkListenerExecutorRemoved
 
SparkListenerJobEnd - Class in org.apache.spark.scheduler
 
SparkListenerJobEnd(int, long, JobResult) - Constructor for class org.apache.spark.scheduler.SparkListenerJobEnd
 
SparkListenerJobStart - Class in org.apache.spark.scheduler
 
SparkListenerJobStart(int, long, Seq<StageInfo>, Properties) - Constructor for class org.apache.spark.scheduler.SparkListenerJobStart
 
SparkListenerLogStart - Class in org.apache.spark.scheduler
An internal class that describes the metadata of an event log.
SparkListenerLogStart(String) - Constructor for class org.apache.spark.scheduler.SparkListenerLogStart
 
SparkListenerStageCompleted - Class in org.apache.spark.scheduler
 
SparkListenerStageCompleted(StageInfo) - Constructor for class org.apache.spark.scheduler.SparkListenerStageCompleted
 
SparkListenerStageSubmitted - Class in org.apache.spark.scheduler
 
SparkListenerStageSubmitted(StageInfo, Properties) - Constructor for class org.apache.spark.scheduler.SparkListenerStageSubmitted
 
SparkListenerTaskEnd - Class in org.apache.spark.scheduler
 
SparkListenerTaskEnd(int, int, String, TaskEndReason, TaskInfo, TaskMetrics) - Constructor for class org.apache.spark.scheduler.SparkListenerTaskEnd
 
SparkListenerTaskGettingResult - Class in org.apache.spark.scheduler
 
SparkListenerTaskGettingResult(TaskInfo) - Constructor for class org.apache.spark.scheduler.SparkListenerTaskGettingResult
 
SparkListenerTaskStart - Class in org.apache.spark.scheduler
 
SparkListenerTaskStart(int, int, TaskInfo) - Constructor for class org.apache.spark.scheduler.SparkListenerTaskStart
 
SparkListenerUnpersistRDD - Class in org.apache.spark.scheduler
 
SparkListenerUnpersistRDD(int) - Constructor for class org.apache.spark.scheduler.SparkListenerUnpersistRDD
 
sparkProperties() - Method in class org.apache.spark.ui.env.EnvironmentListener
 
SparkSQLParser - Class in org.apache.spark.sql
The top level Spark SQL parser.
SparkSQLParser(Function1<String, LogicalPlan>) - Constructor for class org.apache.spark.sql.SparkSQLParser
 
SparkStageInfo - Interface in org.apache.spark
Exposes information about Spark Stages.
SparkStageInfoImpl - Class in org.apache.spark
 
SparkStageInfoImpl(int, int, long, String, int, int, int, int) - Constructor for class org.apache.spark.SparkStageInfoImpl
 
SparkStatusTracker - Class in org.apache.spark
Low-level status reporting APIs for monitoring job and stage progress.
SparkStatusTracker(SparkContext) - Constructor for class org.apache.spark.SparkStatusTracker
 
SparkUI - Class in org.apache.spark.ui
Top level user interface for a Spark application.
SparkUITab - Class in org.apache.spark.ui
 
SparkUITab(SparkUI, String) - Constructor for class org.apache.spark.ui.SparkUITab
 
SparkUncaughtExceptionHandler - Class in org.apache.spark.util
The default uncaught exception handler for Executors terminates the whole process, to avoid getting into a bad state indefinitely.
SparkUncaughtExceptionHandler() - Constructor for class org.apache.spark.util.SparkUncaughtExceptionHandler
 
sparkUser() - Method in class org.apache.spark.api.java.JavaSparkContext
 
sparkUser() - Method in class org.apache.spark.scheduler.ApplicationEventListener
 
sparkUser() - Method in class org.apache.spark.scheduler.SparkListenerApplicationStart
 
sparkUser() - Method in class org.apache.spark.SparkContext
 
sparkVersion() - Method in class org.apache.spark.scheduler.SparkListenerLogStart
 
sparse(int, int, int[], int[], double[]) - Static method in class org.apache.spark.mllib.linalg.Matrices
Creates a column-major sparse matrix in Compressed Sparse Column (CSC) format.
sparse(int, int[], double[]) - Static method in class org.apache.spark.mllib.linalg.Vectors
Creates a sparse vector providing its index array and value array.
sparse(int, Seq<Tuple2<Object, Object>>) - Static method in class org.apache.spark.mllib.linalg.Vectors
Creates a sparse vector using unordered (index, value) pairs.
sparse(int, Iterable<Tuple2<Integer, Double>>) - Static method in class org.apache.spark.mllib.linalg.Vectors
Creates a sparse vector using unordered (index, value) pairs in a Java friendly way.
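A quick sketch of the sparse-vector factory overloads; both lines build the vector (1.0, 0.0, 3.0).

    import org.apache.spark.mllib.linalg.Vectors

    val byArrays = Vectors.sparse(3, Array(0, 2), Array(1.0, 3.0))
    val byPairs  = Vectors.sparse(3, Seq((0, 1.0), (2, 3.0)))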
SparseMatrix - Class in org.apache.spark.mllib.linalg
Column-major sparse matrix.
SparseMatrix(int, int, int[], int[], double[], boolean) - Constructor for class org.apache.spark.mllib.linalg.SparseMatrix
 
SparseMatrix(int, int, int[], int[], double[]) - Constructor for class org.apache.spark.mllib.linalg.SparseMatrix
Column-major sparse matrix.
SparseVector - Class in org.apache.spark.mllib.linalg
A sparse vector represented by an index array and a value array.
SparseVector(int, int[], double[]) - Constructor for class org.apache.spark.mllib.linalg.SparseVector
 
spdiag(Vector) - Static method in class org.apache.spark.mllib.linalg.SparseMatrix
Generate a diagonal matrix in SparseMatrix format from the supplied values.
SpearmanCorrelation - Class in org.apache.spark.mllib.stat.correlation
Compute Spearman's correlation for two RDDs of the type RDD[Double] or the correlation matrix for an RDD of the type RDD[Vector].
SpearmanCorrelation() - Constructor for class org.apache.spark.mllib.stat.correlation.SpearmanCorrelation
 
speculatableTasks() - Method in class org.apache.spark.scheduler.TaskSetManager
 
SPECULATION_INTERVAL() - Method in class org.apache.spark.scheduler.TaskSchedulerImpl
 
SPECULATION_MULTIPLIER() - Method in class org.apache.spark.scheduler.TaskSetManager
 
SPECULATION_QUANTILE() - Method in class org.apache.spark.scheduler.TaskSetManager
 
speculative() - Method in class org.apache.spark.scheduler.TaskInfo
 
speye(int) - Static method in class org.apache.spark.mllib.linalg.Matrices
Generate a sparse Identity Matrix in Matrix format.
speye(int) - Static method in class org.apache.spark.mllib.linalg.SparseMatrix
Generate an Identity Matrix in SparseMatrix format.
split() - Method in class org.apache.spark.mllib.tree.impl.NodeIndexUpdater
 
split() - Method in class org.apache.spark.mllib.tree.model.DecisionTreeModel.SaveLoadV1_0$.NodeData
 
split() - Method in class org.apache.spark.mllib.tree.model.Node
 
Split - Class in org.apache.spark.mllib.tree.model
:: DeveloperApi :: Split applied to a feature
Split(int, double, Enumeration.Value, List<Object>) - Constructor for class org.apache.spark.mllib.tree.model.Split
 
split() - Method in class org.apache.spark.rdd.NarrowCoGroupSplitDep
 
SPLIT_INFO_REFLECTIONS() - Static method in class org.apache.spark.rdd.HadoopRDD
 
splitAndCountPartitions(Iterator<String>) - Static method in class org.apache.spark.streaming.util.RawTextHelper
Splits lines and counts the words.
splitCommandString(String) - Static method in class org.apache.spark.util.Utils
Split a string of potentially quoted arguments from the command line the way that a shell would do it to determine arguments to a command.
splitIdToFile(int) - Static method in class org.apache.spark.rdd.CheckpointRDD
 
splitIndex() - Method in class org.apache.spark.rdd.NarrowCoGroupSplitDep
 
splitIndex() - Method in class org.apache.spark.storage.RDDBlockId
 
SplitInfo - Class in org.apache.spark.scheduler
 
SplitInfo(Class<?>, String, String, long, Object) - Constructor for class org.apache.spark.scheduler.SplitInfo
 
splitLocationInfo() - Method in class org.apache.spark.rdd.HadoopRDD.SplitInfoReflections
 
splits() - Method in interface org.apache.spark.api.java.JavaRDDLike
 
sprand(int, int, double, Random) - Static method in class org.apache.spark.mllib.linalg.Matrices
Generate a SparseMatrix consisting of i.i.d. uniform random numbers.
sprand(int, int, double, Random) - Static method in class org.apache.spark.mllib.linalg.SparseMatrix
Generate a SparseMatrix consisting of i.i.d. uniform random numbers.
sprandn(int, int, double, Random) - Static method in class org.apache.spark.mllib.linalg.Matrices
Generate a SparseMatrix consisting of i.i.d. gaussian random numbers.
sprandn(int, int, double, Random) - Static method in class org.apache.spark.mllib.linalg.SparseMatrix
Generate a SparseMatrix consisting of i.i.d. gaussian random numbers.
sqdist(Vector, Vector) - Static method in class org.apache.spark.mllib.linalg.Vectors
Returns the squared distance between two Vectors.
sqdist(SparseVector, DenseVector) - Static method in class org.apache.spark.mllib.linalg.Vectors
Returns the squared distance between DenseVector and SparseVector.
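A short sqdist sketch; mixing dense and sparse arguments is supported.

    import org.apache.spark.mllib.linalg.Vectors

    val dense  = Vectors.dense(1.0, 2.0, 3.0)
    val sparse = Vectors.sparse(3, Array(0, 2), Array(1.0, 3.0))
    val d2 = Vectors.sqdist(dense, sparse)   // (1-1)^2 + (2-0)^2 + (3-3)^2 = 4.0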
sql() - Method in class org.apache.spark.sql.hive.execution.HiveNativeCommand
 
sql(String) - Method in class org.apache.spark.sql.hive.HiveContext
 
sql(String) - Method in class org.apache.spark.sql.SQLContext
Executes a SQL query using Spark, returning the result as a DataFrame.
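For illustration, assuming an existing SQLContext `sqlContext` and a previously registered temporary table named "people" (hypothetical):

    val adults = sqlContext.sql("SELECT name, age FROM people WHERE age >= 18")
    adults.show()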
SQLConf - Class in org.apache.spark.sql
A class that enables the setting and getting of mutable config parameters/hints.
SQLConf() - Constructor for class org.apache.spark.sql.SQLConf
 
SQLConf.Deprecated$ - Class in org.apache.spark.sql
 
SQLConf.Deprecated$() - Constructor for class org.apache.spark.sql.SQLConf.Deprecated$
 
sqlContext() - Method in class org.apache.spark.sql.DataFrame
 
sqlContext() - Method in class org.apache.spark.sql.jdbc.JDBCRelation
 
sqlContext() - Method in class org.apache.spark.sql.json.JSONRelation
 
sqlContext() - Method in class org.apache.spark.sql.parquet.ParquetRelation
 
sqlContext() - Method in class org.apache.spark.sql.parquet.ParquetRelation2
 
sqlContext() - Method in interface org.apache.spark.sql.parquet.ParquetTest
 
sqlContext() - Method in class org.apache.spark.sql.sources.BaseRelation
 
SQLContext - Class in org.apache.spark.sql
The entry point for working with structured data (rows and columns) in Spark.
SQLContext(SparkContext) - Constructor for class org.apache.spark.sql.SQLContext
 
SQLContext(JavaSparkContext) - Constructor for class org.apache.spark.sql.SQLContext
 
SQLContext.implicits - Class in org.apache.spark.sql
 
SQLContext.implicits() - Constructor for class org.apache.spark.sql.SQLContext.implicits
:: Experimental :: (Scala-specific) Implicit methods available in Scala for converting common Scala objects into DataFrames.
SQLContext.implicits.StringToColumn - Class in org.apache.spark.sql
Converts $"col name" into a Column.
SQLContext.implicits.StringToColumn(StringContext) - Constructor for class org.apache.spark.sql.SQLContext.implicits.StringToColumn
 
sqlType() - Method in class org.apache.spark.mllib.linalg.VectorUDT
 
sqlType() - Method in class org.apache.spark.sql.test.ExamplePointUDT
 
sqrt(Column) - Static method in class org.apache.spark.sql.functions
Computes the square root of the specified float value.
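A one-line usage sketch, assuming a hypothetical DataFrame `df` with a numeric column "x":

    import org.apache.spark.sql.functions.{col, sqrt}

    df.select(sqrt(col("x")).as("sqrt_x")).show()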
SQRT() - Static method in class org.apache.spark.sql.hive.HiveQl
 
squaredDist(Vector) - Method in class org.apache.spark.util.Vector
 
SquaredError - Class in org.apache.spark.mllib.tree.loss
:: DeveloperApi :: Class for squared error loss calculation.
SquaredError() - Constructor for class org.apache.spark.mllib.tree.loss.SquaredError
 
SquaredL2Updater - Class in org.apache.spark.mllib.optimization
:: DeveloperApi :: Updater for L2 regularized problems.
SquaredL2Updater() - Constructor for class org.apache.spark.mllib.optimization.SquaredL2Updater
 
Src - Static variable in class org.apache.spark.graphx.TripletFields
Expose the source and edge fields but not the destination field.
srcAttr() - Method in class org.apache.spark.graphx.EdgeContext
The vertex attribute of the edge's source vertex.
srcAttr() - Method in class org.apache.spark.graphx.EdgeTriplet
The source vertex attribute
srcAttr() - Method in class org.apache.spark.graphx.impl.AggregatingEdgeContext
 
srcId() - Method in class org.apache.spark.graphx.Edge
 
srcId() - Method in class org.apache.spark.graphx.EdgeContext
The vertex id of the edge's source vertex.
srcId() - Method in class org.apache.spark.graphx.impl.AggregatingEdgeContext
 
srcId() - Method in class org.apache.spark.graphx.impl.EdgeWithLocalIds
 
srcIds() - Method in class org.apache.spark.ml.recommendation.ALS.InBlock
 
srcIds() - Method in class org.apache.spark.ml.recommendation.ALS.RatingBlock
 
srcIds() - Method in class org.apache.spark.ml.recommendation.ALS.UncompressedInBlock
 
srdd() - Method in class org.apache.spark.api.java.JavaDoubleRDD
 
ssc() - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
 
ssc() - Method in class org.apache.spark.streaming.dstream.DStream
 
ssc() - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
 
ssc() - Method in class org.apache.spark.streaming.scheduler.JobScheduler
 
SSLOptions - Class in org.apache.spark
SSLOptions class is a common container for SSL configuration options.
SSLOptions(boolean, Option<File>, Option<String>, Option<String>, Option<File>, Option<String>, Option<String>, Set<String>) - Constructor for class org.apache.spark.SSLOptions
 
sslSocketFactory() - Method in class org.apache.spark.SecurityManager
 
stackTrace() - Method in class org.apache.spark.ExceptionFailure
 
stackTrace() - Method in class org.apache.spark.util.ThreadStackTrace
 
stackTraceFromJson(JsonAST.JValue) - Static method in class org.apache.spark.util.JsonProtocol
 
stackTraceToJson(StackTraceElement[]) - Static method in class org.apache.spark.util.JsonProtocol
 
stage() - Method in class org.apache.spark.scheduler.AskPermissionToCommitOutput
 
Stage - Class in org.apache.spark.scheduler
A stage is a set of independent tasks all computing the same function that need to run as part of a Spark job, where all the tasks have the same shuffle dependencies.
Stage(int, RDD<?>, int, Option<ShuffleDependency<?, ?, ?>>, List<Stage>, int, CallSite) - Constructor for class org.apache.spark.scheduler.Stage
 
stageAttemptId() - Method in class org.apache.spark.scheduler.SparkListenerTaskEnd
 
stageAttemptId() - Method in class org.apache.spark.scheduler.SparkListenerTaskStart
 
StageCancelled - Class in org.apache.spark.scheduler
 
StageCancelled(int) - Constructor for class org.apache.spark.scheduler.StageCancelled
 
stageCompletedFromJson(JsonAST.JValue) - Static method in class org.apache.spark.util.JsonProtocol
 
stageCompletedToJson(SparkListenerStageCompleted) - Static method in class org.apache.spark.util.JsonProtocol
 
stageEnd(int) - Method in class org.apache.spark.scheduler.OutputCommitCoordinator
 
stageFailed(String) - Method in class org.apache.spark.scheduler.StageInfo
 
stageId() - Method in class org.apache.spark.scheduler.Pool
 
stageId() - Method in interface org.apache.spark.scheduler.Schedulable
 
stageId() - Method in class org.apache.spark.scheduler.SparkListenerTaskEnd
 
stageId() - Method in class org.apache.spark.scheduler.SparkListenerTaskStart
 
stageId() - Method in class org.apache.spark.scheduler.StageCancelled
 
stageId() - Method in class org.apache.spark.scheduler.StageInfo
 
stageId() - Method in class org.apache.spark.scheduler.Task
 
stageId() - Method in class org.apache.spark.scheduler.TaskSet
 
stageId() - Method in class org.apache.spark.scheduler.TaskSetManager
 
stageId() - Method in interface org.apache.spark.SparkStageInfo
 
stageId() - Method in class org.apache.spark.SparkStageInfoImpl
 
stageId() - Method in class org.apache.spark.TaskContext
The ID of the stage that this task belongs to.
stageId() - Method in class org.apache.spark.TaskContextImpl
 
stageIds() - Method in class org.apache.spark.scheduler.SparkListenerJobStart
 
stageIds() - Method in interface org.apache.spark.SparkJobInfo
 
stageIds() - Method in class org.apache.spark.SparkJobInfoImpl
 
stageIds() - Method in class org.apache.spark.ui.jobs.UIData.JobUIData
 
stageIdToActiveJobIds() - Method in class org.apache.spark.ui.jobs.JobProgressListener
 
stageIdToData() - Method in class org.apache.spark.ui.jobs.JobProgressListener
 
stageIdToInfo() - Method in class org.apache.spark.ui.jobs.JobProgressListener
 
stageIdToStage() - Method in class org.apache.spark.scheduler.DAGScheduler
 
stageInfo() - Method in class org.apache.spark.scheduler.SparkListenerStageCompleted
 
stageInfo() - Method in class org.apache.spark.scheduler.SparkListenerStageSubmitted
 
StageInfo - Class in org.apache.spark.scheduler
:: DeveloperApi :: Stores information about a stage to pass from the scheduler to SparkListeners.
StageInfo(int, int, String, int, Seq<RDDInfo>, String) - Constructor for class org.apache.spark.scheduler.StageInfo
 
stageInfoFromJson(JsonAST.JValue) - Static method in class org.apache.spark.util.JsonProtocol
JSON deserialization methods for classes SparkListenerEvents depend on
stageInfos() - Method in class org.apache.spark.scheduler.SparkListenerJobStart
 
stageInfoToJson(StageInfo) - Static method in class org.apache.spark.util.JsonProtocol
JSON serialization methods for classes SparkListenerEvents depend on
StagePage - Class in org.apache.spark.ui.jobs
Page showing statistics and task list for a given stage
StagePage(StagesTab) - Constructor for class org.apache.spark.ui.jobs.StagePage
 
stages() - Method in class org.apache.spark.ml.Pipeline
param for pipeline stages
stages() - Method in class org.apache.spark.ml.PipelineModel
 
StagesTab - Class in org.apache.spark.ui.jobs
Web UI showing progress status of all stages in the given SparkContext.
StagesTab(SparkUI) - Constructor for class org.apache.spark.ui.jobs.StagesTab
 
stageStart(int) - Method in class org.apache.spark.scheduler.OutputCommitCoordinator
 
stageSubmittedFromJson(JsonAST.JValue) - Static method in class org.apache.spark.util.JsonProtocol
 
stageSubmittedToJson(SparkListenerStageSubmitted) - Static method in class org.apache.spark.util.JsonProtocol
 
StageTableBase - Class in org.apache.spark.ui.jobs
Page showing list of all ongoing and recently finished stages
StageTableBase(Seq<StageInfo>, String, JobProgressListener, boolean, boolean) - Constructor for class org.apache.spark.ui.jobs.StageTableBase
 
StandardNormalGenerator - Class in org.apache.spark.mllib.random
:: DeveloperApi :: Generates i.i.d. samples from the standard normal distribution.
StandardNormalGenerator() - Constructor for class org.apache.spark.mllib.random.StandardNormalGenerator
 
StandardScaler - Class in org.apache.spark.ml.feature
:: AlphaComponent :: Standardizes features by removing the mean and scaling to unit variance using column summary statistics on the samples in the training set.
StandardScaler() - Constructor for class org.apache.spark.ml.feature.StandardScaler
 
StandardScaler - Class in org.apache.spark.mllib.feature
:: Experimental :: Standardizes features by removing the mean and scaling to unit std using column summary statistics on the samples in the training set.
StandardScaler(boolean, boolean) - Constructor for class org.apache.spark.mllib.feature.StandardScaler
 
StandardScaler() - Constructor for class org.apache.spark.mllib.feature.StandardScaler
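A minimal usage sketch for the MLlib StandardScaler (illustrative only; assumes an existing SparkContext named sc and made-up data):

    import org.apache.spark.mllib.feature.StandardScaler
    import org.apache.spark.mllib.linalg.Vectors

    val data = sc.parallelize(Seq(
      Vectors.dense(1.0, 10.0),
      Vectors.dense(2.0, 20.0),
      Vectors.dense(3.0, 30.0)))

    // Fit a scaler that centers to zero mean and scales to unit standard deviation.
    val scaler = new StandardScaler(withMean = true, withStd = true).fit(data)
    val scaled = scaler.transform(data)   // RDD of standardized vectors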
 
StandardScalerModel - Class in org.apache.spark.ml.feature
:: AlphaComponent :: Model fitted by StandardScaler.
StandardScalerModel(StandardScaler, ParamMap, StandardScalerModel) - Constructor for class org.apache.spark.ml.feature.StandardScalerModel
 
StandardScalerModel - Class in org.apache.spark.mllib.feature
:: Experimental :: Represents a StandardScaler model that can transform vectors.
StandardScalerModel(Vector, Vector, boolean, boolean) - Constructor for class org.apache.spark.mllib.feature.StandardScalerModel
 
StandardScalerModel(Vector, Vector) - Constructor for class org.apache.spark.mllib.feature.StandardScalerModel
 
StandardScalerModel(Vector) - Constructor for class org.apache.spark.mllib.feature.StandardScalerModel
 
StandardScalerParams - Interface in org.apache.spark.ml.feature
starGraph(SparkContext, int) - Static method in class org.apache.spark.graphx.util.GraphGenerators
Create a star graph with vertex 0 being the center.
start() - Method in class org.apache.spark.ContextCleaner
Start the cleaner.
start() - Method in class org.apache.spark.ExecutorAllocationManager
Register for scheduler callbacks to decide when to add and remove executors.
start() - Method in class org.apache.spark.HttpServer
 
start() - Method in class org.apache.spark.metrics.MetricsSystem
 
start() - Method in class org.apache.spark.metrics.sink.ConsoleSink
 
start() - Method in class org.apache.spark.metrics.sink.CsvSink
 
start() - Method in class org.apache.spark.metrics.sink.GraphiteSink
 
start() - Method in class org.apache.spark.metrics.sink.JmxSink
 
start() - Method in class org.apache.spark.metrics.sink.MetricsServlet
 
start() - Method in interface org.apache.spark.metrics.sink.Sink
 
start(String) - Method in class org.apache.spark.mllib.tree.impl.TimeTracker
Starts a new timer, or re-starts a stopped timer.
start() - Method in class org.apache.spark.scheduler.cluster.CoarseGrainedSchedulerBackend
 
start() - Method in class org.apache.spark.scheduler.cluster.mesos.CoarseMesosSchedulerBackend
 
start() - Method in class org.apache.spark.scheduler.cluster.mesos.MesosSchedulerBackend
 
start() - Method in class org.apache.spark.scheduler.cluster.SimrSchedulerBackend
 
start() - Method in class org.apache.spark.scheduler.cluster.SparkDeploySchedulerBackend
 
start() - Method in class org.apache.spark.scheduler.EventLoggingListener
Creates the log file in the configured log directory.
start() - Method in class org.apache.spark.scheduler.local.LocalBackend
 
start() - Method in interface org.apache.spark.scheduler.SchedulerBackend
 
start() - Method in interface org.apache.spark.scheduler.TaskScheduler
 
start() - Method in class org.apache.spark.scheduler.TaskSchedulerImpl
 
start() - Method in class org.apache.spark.sql.parquet.CatalystArrayContainsNullConverter
 
start() - Method in class org.apache.spark.sql.parquet.CatalystArrayConverter
 
start() - Method in class org.apache.spark.sql.parquet.CatalystGroupConverter
 
start() - Method in class org.apache.spark.sql.parquet.CatalystMapConverter
 
start() - Method in class org.apache.spark.sql.parquet.CatalystNativeArrayConverter
 
start() - Method in class org.apache.spark.sql.parquet.CatalystPrimitiveRowConverter
 
start() - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
Start the execution of the streams.
start() - Method in class org.apache.spark.streaming.dstream.ConstantInputDStream
 
start() - Method in class org.apache.spark.streaming.dstream.FileInputDStream
 
start() - Method in class org.apache.spark.streaming.dstream.InputDStream
Method called to start receiving data.
start() - Method in class org.apache.spark.streaming.dstream.QueueInputDStream
 
start() - Method in class org.apache.spark.streaming.dstream.ReceiverInputDStream
 
start(Time) - Method in class org.apache.spark.streaming.DStreamGraph
 
start() - Method in class org.apache.spark.streaming.kafka.DirectKafkaInputDStream
 
start() - Method in class org.apache.spark.streaming.receiver.BlockGenerator
Start block generating and pushing threads.
start() - Method in class org.apache.spark.streaming.receiver.ReceiverSupervisor
Start the supervisor
start() - Method in class org.apache.spark.streaming.scheduler.JobGenerator
Start generation of jobs
start() - Method in class org.apache.spark.streaming.scheduler.JobScheduler
 
start() - Method in class org.apache.spark.streaming.scheduler.ReceiverTracker.ReceiverLauncher
 
start() - Method in class org.apache.spark.streaming.scheduler.ReceiverTracker
Start the actor and receiver execution thread.
start() - Method in class org.apache.spark.streaming.StreamingContext
Start the execution of the streams.
start(long) - Method in class org.apache.spark.streaming.util.RecurringTimer
Start at the given start time.
start() - Method in class org.apache.spark.streaming.util.RecurringTimer
Start at the earliest time it can start based on the period.
start() - Method in class org.apache.spark.util.AsynchronousListenerBus
Start sending events to attached listeners.
start() - Method in class org.apache.spark.util.EventLoop
 
Started() - Method in class org.apache.spark.streaming.receiver.ReceiverSupervisor.ReceiverState
 
Started() - Method in class org.apache.spark.streaming.StreamingContext.StreamingContextState$
 
startIdx() - Method in class org.apache.spark.util.Distribution
 
startIndex() - Method in class org.apache.spark.rdd.ZippedWithIndexRDDPartition
 
startIndexInLevel(int) - Static method in class org.apache.spark.mllib.tree.model.Node
Return the index of the first node in the given level.
startJettyServer(String, int, Seq<ServletContextHandler>, SparkConf, String) - Static method in class org.apache.spark.ui.JettyUtils
Attempt to start a Jetty server bound to the supplied hostName:port using the given context handlers.
startReceiver() - Method in class org.apache.spark.streaming.receiver.ReceiverSupervisor
Start receiver
startServiceOnPort(int, Function1<Object, Tuple2<T, Object>>, SparkConf, String) - Static method in class org.apache.spark.util.Utils
Attempt to start a service on the given port, or fail after a number of attempts.
startsWith(Column) - Method in class org.apache.spark.sql.Column
String starts with.
startsWith(String) - Method in class org.apache.spark.sql.Column
String starts with another string literal.
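An illustrative sketch of Column.startsWith in a DataFrame filter (assumes an existing SparkContext named sc; the data is hypothetical):

    import org.apache.spark.sql.SQLContext

    val sqlContext = new SQLContext(sc)
    import sqlContext.implicits._

    val df = sc.parallelize(Seq(("spark", 1), ("hadoop", 2))).toDF("name", "count")
    // Keep only rows whose name column starts with the literal "sp".
    df.filter(df("name").startsWith("sp")).show()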
startTime() - Method in class org.apache.spark.api.java.JavaSparkContext
 
startTime() - Method in class org.apache.spark.partial.ApproximateActionListener
 
startTime() - Method in class org.apache.spark.scheduler.ApplicationEventListener
 
startTime() - Method in class org.apache.spark.SparkContext
 
startTime() - Method in class org.apache.spark.streaming.DStreamGraph
 
startTime() - Method in class org.apache.spark.streaming.util.WriteAheadLogManager.LogInfo
 
STARVATION_TIMEOUT() - Method in class org.apache.spark.scheduler.TaskSchedulerImpl
 
statCounter() - Method in class org.apache.spark.util.Distribution
 
StatCounter - Class in org.apache.spark.util
A class for tracking the statistics of a set of numbers (count, mean and variance) in a numerically robust way.
StatCounter(TraversableOnce<Object>) - Constructor for class org.apache.spark.util.StatCounter
 
StatCounter() - Constructor for class org.apache.spark.util.StatCounter
Initialize the StatCounter with no values.
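A small sketch of StatCounter used on its own (the values are made up):

    import org.apache.spark.util.StatCounter

    val stats = new StatCounter()
    Seq(1.0, 2.0, 3.0, 4.0).foreach(v => stats.merge(v))   // merge values one at a time
    println(s"count=${stats.count} mean=${stats.mean} stdev=${stats.stdev}")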
state() - Method in class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.StatusUpdate
 
state() - Method in class org.apache.spark.scheduler.local.StatusUpdate
 
state() - Method in class org.apache.spark.streaming.StreamingContext
 
StateDStream<K,V,S> - Class in org.apache.spark.streaming.dstream
 
StateDStream(DStream<Tuple2<K, V>>, Function1<Iterator<Tuple3<K, Seq<V>, Option<S>>>, Iterator<Tuple2<K, S>>>, Partitioner, boolean, Option<RDD<Tuple2<K, S>>>, ClassTag<K>, ClassTag<V>, ClassTag<S>) - Constructor for class org.apache.spark.streaming.dstream.StateDStream
 
STATIC_RESOURCE_DIR() - Static method in class org.apache.spark.ui.SparkUI
 
staticPageRank(int, double) - Method in class org.apache.spark.graphx.GraphOps
Run PageRank for a fixed number of iterations returning a graph with vertex attributes containing the PageRank and edge attributes the normalized edge weight.
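An illustrative combination of starGraph and staticPageRank (assumes an existing SparkContext named sc):

    import org.apache.spark.graphx.util.GraphGenerators

    val graph = GraphGenerators.starGraph(sc, 20)          // vertex 0 is the center
    val ranks = graph.staticPageRank(10, 0.15).vertices    // (vertexId, rank) pairs
    ranks.collect().sortBy(-_._2).take(3).foreach(println)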
statistic() - Method in class org.apache.spark.mllib.stat.test.ChiSqTestResult
 
statistic() - Method in interface org.apache.spark.mllib.stat.test.TestResult
Test statistic.
Statistics - Class in org.apache.spark.mllib.stat
:: Experimental :: API for statistical functions in MLlib.
Statistics() - Constructor for class org.apache.spark.mllib.stat.Statistics
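A brief sketch of the Statistics entry point for column-wise summaries (assumes an existing SparkContext named sc; the data is hypothetical):

    import org.apache.spark.mllib.linalg.Vectors
    import org.apache.spark.mllib.stat.Statistics

    val observations = sc.parallelize(Seq(
      Vectors.dense(1.0, 2.0),
      Vectors.dense(3.0, 4.0)))
    val summary = Statistics.colStats(observations)
    println(summary.mean)       // per-column means
    println(summary.variance)   // per-column variances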
 
statistics() - Method in class org.apache.spark.sql.columnar.InMemoryRelation
 
statistics() - Method in class org.apache.spark.sql.hive.MetastoreRelation
 
statistics() - Method in class org.apache.spark.sql.parquet.ParquetRelation
 
statistics() - Method in class org.apache.spark.sql.sources.LogicalRelation
 
Statistics - Class in org.apache.spark.streaming.receiver
:: DeveloperApi :: Statistics for querying the supervisor about state of workers.
Statistics(int, int, int, String) - Constructor for class org.apache.spark.streaming.receiver.Statistics
 
stats() - Method in class org.apache.spark.api.java.JavaDoubleRDD
Return a StatCounter object that captures the mean, variance and count of the RDD's elements in one operation.
stats() - Method in class org.apache.spark.mllib.tree.impurity.ImpurityCalculator
 
stats() - Method in class org.apache.spark.mllib.tree.model.Node
 
stats() - Method in class org.apache.spark.rdd.DoubleRDDFunctions
Return a StatCounter object that captures the mean, variance and count of the RDD's elements in one operation.
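A one-pass statistics sketch using stats() on an RDD of doubles (assumes an existing SparkContext named sc):

    val nums = sc.parallelize(Seq(1.0, 2.0, 3.0, 4.0))
    val s = nums.stats()   // count, mean and variance computed in a single pass
    println(s"count=${s.count} mean=${s.mean} stdev=${s.stdev}")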
stats() - Method in class org.apache.spark.sql.columnar.CachedBatch
 
StatsReportListener - Class in org.apache.spark.scheduler
:: DeveloperApi :: Simple SparkListener that logs a few summary statistics when each stage completes
StatsReportListener() - Constructor for class org.apache.spark.scheduler.StatsReportListener
 
StatsReportListener - Class in org.apache.spark.streaming.scheduler
:: DeveloperApi :: A simple StreamingListener that logs summary statistics across Spark Streaming batches
StatsReportListener(int) - Constructor for class org.apache.spark.streaming.scheduler.StatsReportListener
 
statsSize() - Method in class org.apache.spark.mllib.tree.impurity.ImpurityAggregator
 
status() - Method in class org.apache.spark.scheduler.TaskInfo
 
status() - Method in interface org.apache.spark.SparkJobInfo
 
status() - Method in class org.apache.spark.SparkJobInfoImpl
 
status() - Method in class org.apache.spark.ui.jobs.UIData.JobUIData
 
statusTracker() - Method in class org.apache.spark.api.java.JavaSparkContext
 
statusTracker() - Method in class org.apache.spark.SparkContext
 
statusUpdate(SchedulerDriver, Protos.TaskStatus) - Method in class org.apache.spark.scheduler.cluster.mesos.CoarseMesosSchedulerBackend
 
statusUpdate(SchedulerDriver, Protos.TaskStatus) - Method in class org.apache.spark.scheduler.cluster.mesos.MesosSchedulerBackend
 
statusUpdate(long, Enumeration.Value, ByteBuffer) - Method in class org.apache.spark.scheduler.local.LocalBackend
 
StatusUpdate - Class in org.apache.spark.scheduler.local
 
StatusUpdate(long, Enumeration.Value, ByteBuffer) - Constructor for class org.apache.spark.scheduler.local.StatusUpdate
 
statusUpdate(long, Enumeration.Value, ByteBuffer) - Method in class org.apache.spark.scheduler.TaskSchedulerImpl
 
std() - Method in class org.apache.spark.mllib.feature.StandardScalerModel
 
std() - Method in class org.apache.spark.mllib.random.LogNormalGenerator
 
stdev() - Method in class org.apache.spark.api.java.JavaDoubleRDD
Compute the standard deviation of this RDD's elements.
stdev() - Method in class org.apache.spark.rdd.DoubleRDDFunctions
Compute the standard deviation of this RDD's elements.
stdev() - Method in class org.apache.spark.util.StatCounter
Return the standard deviation of the values.
stop() - Method in class org.apache.spark.api.java.JavaSparkContext
Shut down the SparkContext.
stop() - Method in interface org.apache.spark.broadcast.BroadcastFactory
 
stop() - Method in class org.apache.spark.broadcast.BroadcastManager
 
stop() - Static method in class org.apache.spark.broadcast.HttpBroadcast
 
stop() - Method in class org.apache.spark.broadcast.HttpBroadcastFactory
 
stop() - Method in class org.apache.spark.broadcast.TorrentBroadcastFactory
 
stop() - Method in class org.apache.spark.ContextCleaner
Stop the cleaner.
stop() - Method in class org.apache.spark.HttpFileServer
 
stop() - Method in class org.apache.spark.HttpServer
 
stop() - Method in class org.apache.spark.MapOutputTracker
Stop the tracker.
stop() - Method in class org.apache.spark.MapOutputTrackerMaster
 
stop() - Method in class org.apache.spark.metrics.MetricsSystem
 
stop() - Method in class org.apache.spark.metrics.sink.ConsoleSink
 
stop() - Method in class org.apache.spark.metrics.sink.CsvSink
 
stop() - Method in class org.apache.spark.metrics.sink.GraphiteSink
 
stop() - Method in class org.apache.spark.metrics.sink.JmxSink
 
stop() - Method in class org.apache.spark.metrics.sink.MetricsServlet
 
stop() - Method in interface org.apache.spark.metrics.sink.Sink
 
stop(String) - Method in class org.apache.spark.mllib.tree.impl.TimeTracker
Stops a timer and returns the elapsed time in seconds.
stop() - Method in class org.apache.spark.scheduler.cluster.CoarseGrainedSchedulerBackend
 
stop() - Method in class org.apache.spark.scheduler.cluster.mesos.CoarseMesosSchedulerBackend
 
stop() - Method in class org.apache.spark.scheduler.cluster.mesos.MesosSchedulerBackend
 
stop() - Method in class org.apache.spark.scheduler.cluster.SimrSchedulerBackend
 
stop() - Method in class org.apache.spark.scheduler.cluster.SparkDeploySchedulerBackend
 
stop() - Method in class org.apache.spark.scheduler.DAGScheduler
 
stop() - Method in class org.apache.spark.scheduler.EventLoggingListener
Stop logging events.
stop() - Method in class org.apache.spark.scheduler.local.LocalBackend
 
stop() - Method in class org.apache.spark.scheduler.OutputCommitCoordinator
 
stop() - Method in interface org.apache.spark.scheduler.SchedulerBackend
 
stop() - Method in class org.apache.spark.scheduler.TaskResultGetter
 
stop() - Method in interface org.apache.spark.scheduler.TaskScheduler
 
stop() - Method in class org.apache.spark.scheduler.TaskSchedulerImpl
 
stop() - Method in class org.apache.spark.SparkContext
Shut down the SparkContext.
stop() - Method in class org.apache.spark.SparkEnv
 
stop() - Method in class org.apache.spark.storage.BlockManager
 
stop() - Method in class org.apache.spark.storage.BlockManagerMaster
Stop the driver actor, called only on the Spark driver node
stop() - Method in class org.apache.spark.storage.DiskBlockManager
Cleanup local dirs and stop shuffle sender.
stop() - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
Stop the execution of the streams.
stop(boolean) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
Stop the execution of the streams.
stop(boolean, boolean) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
Stop the execution of the streams.
stop() - Method in class org.apache.spark.streaming.CheckpointWriter
 
stop() - Method in class org.apache.spark.streaming.dstream.ConstantInputDStream
 
stop() - Method in class org.apache.spark.streaming.dstream.FileInputDStream
 
stop() - Method in class org.apache.spark.streaming.dstream.InputDStream
Method called to stop receiving data.
stop() - Method in class org.apache.spark.streaming.dstream.QueueInputDStream
 
stop() - Method in class org.apache.spark.streaming.dstream.ReceiverInputDStream
 
stop() - Method in class org.apache.spark.streaming.DStreamGraph
 
stop() - Method in class org.apache.spark.streaming.kafka.DirectKafkaInputDStream
 
stop() - Method in class org.apache.spark.streaming.receiver.BlockGenerator
Stop all threads.
stop(String) - Method in class org.apache.spark.streaming.receiver.Receiver
Stop the receiver completely.
stop(String, Throwable) - Method in class org.apache.spark.streaming.receiver.Receiver
Stop the receiver completely due to an exception
stop(String, Option<Throwable>) - Method in class org.apache.spark.streaming.receiver.ReceiverSupervisor
Mark the supervisor and the receiver for stopping
stop() - Method in class org.apache.spark.streaming.receiver.WriteAheadLogBasedBlockHandler
 
stop(boolean) - Method in class org.apache.spark.streaming.scheduler.JobGenerator
Stop generation of jobs.
stop(boolean) - Method in class org.apache.spark.streaming.scheduler.JobScheduler
 
stop() - Method in class org.apache.spark.streaming.scheduler.ReceivedBlockTracker
Stop the block tracker.
stop(boolean) - Method in class org.apache.spark.streaming.scheduler.ReceiverTracker.ReceiverLauncher
 
stop(boolean) - Method in class org.apache.spark.streaming.scheduler.ReceiverTracker
Stop the receiver execution thread.
stop(boolean) - Method in class org.apache.spark.streaming.StreamingContext
Stop the execution of the streams immediately (does not wait for all received data to be processed).
stop(boolean, boolean) - Method in class org.apache.spark.streaming.StreamingContext
Stop the execution of the streams, with option of ensuring all received data has been processed.
stop(boolean) - Method in class org.apache.spark.streaming.util.RecurringTimer
Stop the timer, and return the last time the callback was made.
stop() - Method in class org.apache.spark.streaming.util.WriteAheadLogManager
Stop the manager, close any open log writer
stop() - Method in class org.apache.spark.ui.ConsoleProgressBar
Tear down the timer thread.
stop() - Method in class org.apache.spark.ui.SparkUI
Stop the server behind this web interface.
stop() - Method in class org.apache.spark.ui.WebUI
Stop the server behind this web interface.
stop() - Method in class org.apache.spark.util.AsynchronousListenerBus
Stop the listener bus.
stop() - Method in class org.apache.spark.util.EventLoop
 
stop() - Method in class org.apache.spark.util.logging.FileAppender
Stop the appender
stop() - Method in class org.apache.spark.util.logging.RollingFileAppender
Stop the appender
StopCoordinator - Class in org.apache.spark.scheduler
 
StopCoordinator() - Constructor for class org.apache.spark.scheduler.StopCoordinator
 
StopExecutor - Class in org.apache.spark.scheduler.local
 
StopExecutor() - Constructor for class org.apache.spark.scheduler.local.StopExecutor
 
stopExecutors() - Method in class org.apache.spark.scheduler.cluster.CoarseGrainedSchedulerBackend
 
StopMapOutputTracker - Class in org.apache.spark
 
StopMapOutputTracker() - Constructor for class org.apache.spark.StopMapOutputTracker
 
Stopped() - Method in class org.apache.spark.streaming.receiver.ReceiverSupervisor.ReceiverState
 
Stopped() - Method in class org.apache.spark.streaming.StreamingContext.StreamingContextState$
 
stopping() - Method in class org.apache.spark.scheduler.cluster.SparkDeploySchedulerBackend
 
stopReceiver(String, Option<Throwable>) - Method in class org.apache.spark.streaming.receiver.ReceiverSupervisor
Stop receiver
StopReceiver - Class in org.apache.spark.streaming.receiver
 
StopReceiver() - Constructor for class org.apache.spark.streaming.receiver.StopReceiver
 
storageLevel() - Method in class org.apache.spark.sql.columnar.InMemoryRelation
 
storageLevel() - Method in class org.apache.spark.storage.BlockManagerMessages.UpdateBlockInfo
 
storageLevel() - Method in class org.apache.spark.storage.BlockStatus
 
storageLevel() - Method in class org.apache.spark.storage.RDDInfo
 
StorageLevel - Class in org.apache.spark.storage
:: DeveloperApi :: Flags for controlling the storage of an RDD.
StorageLevel() - Constructor for class org.apache.spark.storage.StorageLevel
 
storageLevel() - Method in class org.apache.spark.streaming.dstream.DStream
 
storageLevel() - Method in class org.apache.spark.streaming.receiver.Receiver
 
storageLevelCache() - Static method in class org.apache.spark.storage.StorageLevel
:: DeveloperApi :: Read StorageLevel object from ObjectInput stream.
storageLevelFromJson(JsonAST.JValue) - Static method in class org.apache.spark.util.JsonProtocol
 
StorageLevels - Class in org.apache.spark.api.java
Expose some commonly useful storage level constants.
StorageLevels() - Constructor for class org.apache.spark.api.java.StorageLevels
 
storageLevelToJson(StorageLevel) - Static method in class org.apache.spark.util.JsonProtocol
 
storageListener() - Method in class org.apache.spark.ui.SparkUI
 
StorageListener - Class in org.apache.spark.ui.storage
:: DeveloperApi :: A SparkListener that prepares information to be displayed on the BlockManagerUI.
StorageListener(StorageStatusListener) - Constructor for class org.apache.spark.ui.storage.StorageListener
 
StoragePage - Class in org.apache.spark.ui.storage
Page showing the list of RDDs currently stored in the cluster
StoragePage(StorageTab) - Constructor for class org.apache.spark.ui.storage.StoragePage
 
StorageStatus - Class in org.apache.spark.storage
:: DeveloperApi :: Storage information for each BlockManager.
StorageStatus(BlockManagerId, long) - Constructor for class org.apache.spark.storage.StorageStatus
 
StorageStatus(BlockManagerId, long, Map<BlockId, BlockStatus>) - Constructor for class org.apache.spark.storage.StorageStatus
Create a storage status with an initial set of blocks, leaving the source unmodified.
storageStatusList() - Method in class org.apache.spark.storage.StorageStatusListener
 
storageStatusList() - Method in class org.apache.spark.ui.exec.ExecutorsListener
 
storageStatusList() - Method in class org.apache.spark.ui.storage.StorageListener
 
StorageStatusListener - Class in org.apache.spark.storage
:: DeveloperApi :: A SparkListener that maintains executor storage status.
StorageStatusListener() - Constructor for class org.apache.spark.storage.StorageStatusListener
 
storageStatusListener() - Method in class org.apache.spark.ui.SparkUI
 
StorageTab - Class in org.apache.spark.ui.storage
Web UI showing storage status of all RDDs in the given SparkContext.
StorageTab(SparkUI) - Constructor for class org.apache.spark.ui.storage.StorageTab
 
StorageUtils - Class in org.apache.spark.storage
Helper methods for storage-related objects.
StorageUtils() - Constructor for class org.apache.spark.storage.StorageUtils
 
store(Iterator<T>) - Method in interface org.apache.spark.streaming.receiver.ActorHelper
Store an iterator of received data as a data block into Spark's memory.
store(ByteBuffer) - Method in interface org.apache.spark.streaming.receiver.ActorHelper
Store the bytes of received data as a data block into Spark's memory.
store(T) - Method in interface org.apache.spark.streaming.receiver.ActorHelper
Store a single item of received data to Spark's memory.
store(T) - Method in class org.apache.spark.streaming.receiver.Receiver
Store a single item of received data to Spark's memory.
store(ArrayBuffer<T>) - Method in class org.apache.spark.streaming.receiver.Receiver
Store an ArrayBuffer of received data as a data block into Spark's memory.
store(ArrayBuffer<T>, Object) - Method in class org.apache.spark.streaming.receiver.Receiver
Store an ArrayBuffer of received data as a data block into Spark's memory.
store(Iterator<T>) - Method in class org.apache.spark.streaming.receiver.Receiver
Store an iterator of received data as a data block into Spark's memory.
store(Iterator<T>, Object) - Method in class org.apache.spark.streaming.receiver.Receiver
Store an iterator of received data as a data block into Spark's memory.
store(Iterator<T>) - Method in class org.apache.spark.streaming.receiver.Receiver
Store an iterator of received data as a data block into Spark's memory.
store(Iterator<T>, Object) - Method in class org.apache.spark.streaming.receiver.Receiver
Store an iterator of received data as a data block into Spark's memory.
store(ByteBuffer) - Method in class org.apache.spark.streaming.receiver.Receiver
Store the bytes of received data as a data block into Spark's memory.
store(ByteBuffer, Object) - Method in class org.apache.spark.streaming.receiver.Receiver
Store the bytes of received data as a data block into Spark's memory.
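A minimal custom receiver sketch showing store() and stop(); the ConstantReceiver class and its data are purely illustrative:

    import org.apache.spark.storage.StorageLevel
    import org.apache.spark.streaming.receiver.Receiver

    class ConstantReceiver(values: Seq[String])
      extends Receiver[String](StorageLevel.MEMORY_ONLY) {

      def onStart(): Unit = {
        // Spawn a thread so onStart() returns quickly, as the receiver API expects.
        new Thread("constant-receiver") {
          override def run(): Unit = {
            values.foreach(v => store(v))   // hand each record to Spark
            stop("All values stored")
          }
        }.start()
      }

      def onStop(): Unit = { /* nothing to clean up in this sketch */ }
    }
    // ssc.receiverStream(new ConstantReceiver(Seq("a", "b"))) would create the input DStream.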
storeBlock(StreamBlockId, ReceivedBlock) - Method in class org.apache.spark.streaming.receiver.BlockManagerBasedBlockHandler
 
storeBlock(StreamBlockId, ReceivedBlock) - Method in interface org.apache.spark.streaming.receiver.ReceivedBlockHandler
Store a received block with the given block id and return related metadata
storeBlock(StreamBlockId, ReceivedBlock) - Method in class org.apache.spark.streaming.receiver.WriteAheadLogBasedBlockHandler
This implementation stores the block into the block manager as well as a write ahead log.
Strategy - Class in org.apache.spark.mllib.tree.configuration
:: Experimental :: Stores all the configuration options for tree construction
Strategy(Enumeration.Value, Impurity, int, int, int, Enumeration.Value, Map<Object, Object>, int, double, int, double, boolean, int) - Constructor for class org.apache.spark.mllib.tree.configuration.Strategy
 
Strategy(Enumeration.Value, Impurity, int, int, int, Map<Integer, Integer>) - Constructor for class org.apache.spark.mllib.tree.configuration.Strategy
Java-friendly constructor for Strategy
STRATEGY_DEFAULT() - Static method in class org.apache.spark.util.logging.RollingFileAppender
 
STRATEGY_PROPERTY() - Static method in class org.apache.spark.util.logging.RollingFileAppender
 
StratifiedSamplingUtils - Class in org.apache.spark.util.random
Auxiliary functions and data structures for the sampleByKey method in PairRDDFunctions.
StratifiedSamplingUtils() - Constructor for class org.apache.spark.util.random.StratifiedSamplingUtils
 
STREAM() - Static method in class org.apache.spark.storage.BlockId
 
StreamBasedRecordReader<T> - Class in org.apache.spark.input
An abstract class of RecordReader for reading files out as streams
StreamBasedRecordReader(CombineFileSplit, TaskAttemptContext, Integer) - Constructor for class org.apache.spark.input.StreamBasedRecordReader
 
StreamBlockId - Class in org.apache.spark.storage
 
StreamBlockId(int, long) - Constructor for class org.apache.spark.storage.StreamBlockId
 
StreamFileInputFormat<T> - Class in org.apache.spark.input
A general format for reading whole files in as streams, byte arrays, or other formats to be added
StreamFileInputFormat() - Constructor for class org.apache.spark.input.StreamFileInputFormat
 
streamId() - Method in class org.apache.spark.storage.StreamBlockId
 
streamId() - Method in class org.apache.spark.streaming.receiver.Receiver
Get the unique identifier of the receiver input stream that this receiver is associated with.
streamId() - Method in class org.apache.spark.streaming.scheduler.DeregisterReceiver
 
streamId() - Method in class org.apache.spark.streaming.scheduler.ReceivedBlockInfo
 
streamId() - Method in class org.apache.spark.streaming.scheduler.ReceiverInfo
 
streamId() - Method in class org.apache.spark.streaming.scheduler.RegisterReceiver
 
streamId() - Method in class org.apache.spark.streaming.scheduler.ReportError
 
streamIdToAllocatedBlocks() - Method in class org.apache.spark.streaming.scheduler.AllocatedBlocks
 
StreamingContext - Class in org.apache.spark.streaming
Main entry point for Spark Streaming functionality.
StreamingContext(SparkContext, Checkpoint, Duration) - Constructor for class org.apache.spark.streaming.StreamingContext
 
StreamingContext(SparkContext, Duration) - Constructor for class org.apache.spark.streaming.StreamingContext
Create a StreamingContext using an existing SparkContext.
StreamingContext(SparkConf, Duration) - Constructor for class org.apache.spark.streaming.StreamingContext
Create a StreamingContext by providing the configuration necessary for a new SparkContext.
StreamingContext(String, String, Duration, String, Seq<String>, Map<String, String>) - Constructor for class org.apache.spark.streaming.StreamingContext
Create a StreamingContext by providing the details necessary for creating a new SparkContext.
StreamingContext(String, Configuration) - Constructor for class org.apache.spark.streaming.StreamingContext
Recreate a StreamingContext from a checkpoint file.
StreamingContext(String) - Constructor for class org.apache.spark.streaming.StreamingContext
Recreate a StreamingContext from a checkpoint file.
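An illustrative end-to-end sketch of creating, starting and awaiting a StreamingContext (the host, port and application name are hypothetical):

    import org.apache.spark.SparkConf
    import org.apache.spark.streaming.{Seconds, StreamingContext}

    val conf = new SparkConf().setAppName("example").setMaster("local[2]")
    val ssc = new StreamingContext(conf, Seconds(1))          // 1-second batches
    val lines = ssc.socketTextStream("localhost", 9999)
    lines.flatMap(_.split(" ")).map((_, 1)).reduceByKey(_ + _).print()
    ssc.start()              // start the computation
    ssc.awaitTermination()   // block until stop() is called or an error occurs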
StreamingContext.StreamingContextState$ - Class in org.apache.spark.streaming
Enumeration to identify current state of the StreamingContext
StreamingContext.StreamingContextState$() - Constructor for class org.apache.spark.streaming.StreamingContext.StreamingContextState$
 
StreamingContextState() - Method in class org.apache.spark.streaming.StreamingContext
Accessor for nested Scala object
StreamingExamples - Class in org.apache.spark.examples.streaming
Utility functions for Spark Streaming examples.
StreamingExamples() - Constructor for class org.apache.spark.examples.streaming.StreamingExamples
 
StreamingJobProgressListener - Class in org.apache.spark.streaming.ui
 
StreamingJobProgressListener(StreamingContext) - Constructor for class org.apache.spark.streaming.ui.StreamingJobProgressListener
 
StreamingKMeans - Class in org.apache.spark.mllib.clustering
:: Experimental :: K-means clustering on streaming data, with cluster centers updated as new batches arrive.
StreamingKMeans(int, double, String) - Constructor for class org.apache.spark.mllib.clustering.StreamingKMeans
 
StreamingKMeans() - Constructor for class org.apache.spark.mllib.clustering.StreamingKMeans
 
StreamingKMeansModel - Class in org.apache.spark.mllib.clustering
:: Experimental :: Cluster model produced and updated by StreamingKMeans.
StreamingKMeansModel(Vector[], double[]) - Constructor for class org.apache.spark.mllib.clustering.StreamingKMeansModel
 
StreamingLinearAlgorithm<M extends GeneralizedLinearModel,A extends GeneralizedLinearAlgorithm<M>> - Class in org.apache.spark.mllib.regression
:: DeveloperApi :: StreamingLinearAlgorithm implements methods for continuously training a generalized linear model on streaming data, and using it for prediction on (possibly different) streaming data.
StreamingLinearAlgorithm() - Constructor for class org.apache.spark.mllib.regression.StreamingLinearAlgorithm
 
StreamingLinearRegressionWithSGD - Class in org.apache.spark.mllib.regression
:: Experimental :: Train or predict a linear regression model on streaming data.
StreamingLinearRegressionWithSGD(double, int, double) - Constructor for class org.apache.spark.mllib.regression.StreamingLinearRegressionWithSGD
 
StreamingLinearRegressionWithSGD() - Constructor for class org.apache.spark.mllib.regression.StreamingLinearRegressionWithSGD
Construct a StreamingLinearRegression object with default parameters: {stepSize: 0.1, numIterations: 50, miniBatchFraction: 1.0}.
StreamingListener - Interface in org.apache.spark.streaming.scheduler
:: DeveloperApi :: A listener interface for receiving information about an ongoing streaming computation.
StreamingListenerBatchCompleted - Class in org.apache.spark.streaming.scheduler
 
StreamingListenerBatchCompleted(BatchInfo) - Constructor for class org.apache.spark.streaming.scheduler.StreamingListenerBatchCompleted
 
StreamingListenerBatchStarted - Class in org.apache.spark.streaming.scheduler
 
StreamingListenerBatchStarted(BatchInfo) - Constructor for class org.apache.spark.streaming.scheduler.StreamingListenerBatchStarted
 
StreamingListenerBatchSubmitted - Class in org.apache.spark.streaming.scheduler
 
StreamingListenerBatchSubmitted(BatchInfo) - Constructor for class org.apache.spark.streaming.scheduler.StreamingListenerBatchSubmitted
 
StreamingListenerBus - Class in org.apache.spark.streaming.scheduler
Asynchronously passes StreamingListenerEvents to registered StreamingListeners.
StreamingListenerBus() - Constructor for class org.apache.spark.streaming.scheduler.StreamingListenerBus
 
StreamingListenerEvent - Interface in org.apache.spark.streaming.scheduler
:: DeveloperApi :: Base trait for events related to StreamingListener
StreamingListenerReceiverError - Class in org.apache.spark.streaming.scheduler
 
StreamingListenerReceiverError(ReceiverInfo) - Constructor for class org.apache.spark.streaming.scheduler.StreamingListenerReceiverError
 
StreamingListenerReceiverStarted - Class in org.apache.spark.streaming.scheduler
 
StreamingListenerReceiverStarted(ReceiverInfo) - Constructor for class org.apache.spark.streaming.scheduler.StreamingListenerReceiverStarted
 
StreamingListenerReceiverStopped - Class in org.apache.spark.streaming.scheduler
 
StreamingListenerReceiverStopped(ReceiverInfo) - Constructor for class org.apache.spark.streaming.scheduler.StreamingListenerReceiverStopped
 
StreamingLogisticRegressionWithSGD - Class in org.apache.spark.mllib.classification
:: Experimental :: Train or predict a logistic regression model on streaming data.
StreamingLogisticRegressionWithSGD(double, int, double, double) - Constructor for class org.apache.spark.mllib.classification.StreamingLogisticRegressionWithSGD
 
StreamingLogisticRegressionWithSGD() - Constructor for class org.apache.spark.mllib.classification.StreamingLogisticRegressionWithSGD
Construct a StreamingLogisticRegression object with default parameters: {stepSize: 0.1, numIterations: 50, miniBatchFraction: 1.0, regParam: 0.0}.
StreamingPage - Class in org.apache.spark.streaming.ui
Page for Spark Web UI that shows statistics of a streaming job
StreamingPage(StreamingTab) - Constructor for class org.apache.spark.streaming.ui.StreamingPage
 
StreamingSource - Class in org.apache.spark.streaming
 
StreamingSource(StreamingContext) - Constructor for class org.apache.spark.streaming.StreamingSource
 
StreamingTab - Class in org.apache.spark.streaming.ui
Spark Web UI tab that shows statistics of a streaming job.
StreamingTab(StreamingContext) - Constructor for class org.apache.spark.streaming.ui.StreamingTab
 
StreamInputFormat - Class in org.apache.spark.input
The format for the PortableDataStream files
StreamInputFormat() - Constructor for class org.apache.spark.input.StreamInputFormat
 
StreamRecordReader - Class in org.apache.spark.input
Reads the record in directly as a stream for other objects to manipulate and handle
StreamRecordReader(CombineFileSplit, TaskAttemptContext, Integer) - Constructor for class org.apache.spark.input.StreamRecordReader
 
STRING - Class in org.apache.spark.sql.columnar
 
STRING() - Constructor for class org.apache.spark.sql.columnar.STRING
 
string() - Method in class org.apache.spark.sql.ColumnName
Creates a new AttributeReference of type string
StringColumnAccessor - Class in org.apache.spark.sql.columnar
 
StringColumnAccessor(ByteBuffer) - Constructor for class org.apache.spark.sql.columnar.StringColumnAccessor
 
StringColumnBuilder - Class in org.apache.spark.sql.columnar
 
StringColumnBuilder() - Constructor for class org.apache.spark.sql.columnar.StringColumnBuilder
 
StringColumnStats - Class in org.apache.spark.sql.columnar
 
StringColumnStats() - Constructor for class org.apache.spark.sql.columnar.StringColumnStats
 
StringConversion() - Method in class org.apache.spark.sql.jdbc.JDBCRDD
Accessor for nested Scala object
stringifyPartialValue(Object) - Static method in class org.apache.spark.Accumulators
 
stringifyValue(Object) - Static method in class org.apache.spark.Accumulators
 
stringRddToDataFrameHolder(RDD<String>) - Method in class org.apache.spark.sql.SQLContext.implicits
Creates a single column DataFrame from an RDD[String].
stringToText(String) - Static method in class org.apache.spark.SparkContext
 
stringWritableConverter() - Static method in class org.apache.spark.SparkContext
 
stringWritableConverter() - Static method in class org.apache.spark.WritableConverter
 
stringWritableFactory() - Static method in class org.apache.spark.WritableFactory
 
stripDirectory(String) - Static method in class org.apache.spark.util.Utils
Strip the directory from a path name
stronglyConnectedComponents(int) - Method in class org.apache.spark.graphx.GraphOps
Compute the strongly connected component (SCC) of each vertex and return a graph with the vertex value containing the lowest vertex id in the SCC containing that vertex.
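A small sketch of stronglyConnectedComponents on a toy graph (assumes an existing SparkContext named sc; the edges are made up):

    import org.apache.spark.graphx._

    val edges = sc.parallelize(Seq(Edge(1L, 2L, 1), Edge(2L, 1L, 1), Edge(2L, 3L, 1)))
    val graph = Graph.fromEdges(edges, 0)   // default vertex attribute 0
    // Each vertex ends up labelled with the smallest vertex id in its SCC.
    val scc = graph.stronglyConnectedComponents(5).vertices
    scc.collect().foreach(println)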
StronglyConnectedComponents - Class in org.apache.spark.graphx.lib
Strongly connected components algorithm implementation.
StronglyConnectedComponents() - Constructor for class org.apache.spark.graphx.lib.StronglyConnectedComponents
 
struct(Seq<StructField>) - Method in class org.apache.spark.sql.ColumnName
Creates a new AttributeReference of type struct
struct(StructType) - Method in class org.apache.spark.sql.ColumnName
 
StudentTCacher - Class in org.apache.spark.partial
A utility class for caching Student's T distribution values for a given confidence level and various sample sizes.
StudentTCacher(double) - Constructor for class org.apache.spark.partial.StudentTCacher
 
subDirsPerLocalDir() - Method in class org.apache.spark.storage.DiskBlockManager
 
subgraph(Function1<EdgeTriplet<VD, ED>, Object>, Function2<Object, VD, Object>) - Method in class org.apache.spark.graphx.Graph
Restricts the graph to only the vertices and edges satisfying the predicates.
subgraph(Function1<EdgeTriplet<VD, ED>, Object>, Function2<Object, VD, Object>) - Method in class org.apache.spark.graphx.impl.GraphImpl
 
submissionTime() - Method in class org.apache.spark.scheduler.StageInfo
When this stage was submitted from the DAGScheduler to a TaskScheduler.
submissionTime() - Method in interface org.apache.spark.SparkStageInfo
 
submissionTime() - Method in class org.apache.spark.SparkStageInfoImpl
 
submissionTime() - Method in class org.apache.spark.streaming.scheduler.BatchInfo
 
submissionTime() - Method in class org.apache.spark.ui.jobs.UIData.JobUIData
 
submitJob(RDD<T>, Function2<TaskContext, Iterator<T>, U>, Seq<Object>, CallSite, boolean, Function2<Object, U, BoxedUnit>, Properties) - Method in class org.apache.spark.scheduler.DAGScheduler
Submit a job to the job scheduler and get a JobWaiter object back.
submitJob(RDD<T>, Function1<Iterator<T>, U>, Seq<Object>, Function2<Object, U, BoxedUnit>, Function0<R>) - Method in class org.apache.spark.SparkContext
:: Experimental :: Submit a job for execution and return a FutureJob holding the result.
submitJobSet(JobSet) - Method in class org.apache.spark.streaming.scheduler.JobScheduler
 
submitTasks(TaskSet) - Method in interface org.apache.spark.scheduler.TaskScheduler
 
submitTasks(TaskSet) - Method in class org.apache.spark.scheduler.TaskSchedulerImpl
 
subProperties(Properties, Regex) - Method in class org.apache.spark.metrics.MetricsConfig
 
subsampleWeights() - Method in class org.apache.spark.mllib.tree.impl.BaggedPoint
 
subsamplingFeatures() - Method in class org.apache.spark.mllib.tree.impl.DecisionTreeMetadata
Indicates if feature subsampling is being used.
subsamplingRate() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
 
subsetAccuracy() - Method in class org.apache.spark.mllib.evaluation.MultilabelMetrics
Returns subset accuracy (for equal sets of labels)
substr(Column, Column) - Method in class org.apache.spark.sql.Column
An expression that returns a substring.
substr(int, int) - Method in class org.apache.spark.sql.Column
An expression that returns a substring.
SUBSTR() - Static method in class org.apache.spark.sql.hive.HiveQl
 
subTestSchema() - Static method in class org.apache.spark.sql.parquet.ParquetTestData
 
subTestSchemaFieldNames() - Static method in class org.apache.spark.sql.parquet.ParquetTestData
 
subtract(JavaDoubleRDD) - Method in class org.apache.spark.api.java.JavaDoubleRDD
Return an RDD with the elements from this that are not in other.
subtract(JavaDoubleRDD, int) - Method in class org.apache.spark.api.java.JavaDoubleRDD
Return an RDD with the elements from this that are not in other.
subtract(JavaDoubleRDD, Partitioner) - Method in class org.apache.spark.api.java.JavaDoubleRDD
Return an RDD with the elements from this that are not in other.
subtract(JavaPairRDD<K, V>) - Method in class org.apache.spark.api.java.JavaPairRDD
Return an RDD with the elements from this that are not in other.
subtract(JavaPairRDD<K, V>, int) - Method in class org.apache.spark.api.java.JavaPairRDD
Return an RDD with the elements from this that are not in other.
subtract(JavaPairRDD<K, V>, Partitioner) - Method in class org.apache.spark.api.java.JavaPairRDD
Return an RDD with the elements from this that are not in other.
subtract(JavaRDD<T>) - Method in class org.apache.spark.api.java.JavaRDD
Return an RDD with the elements from this that are not in other.
subtract(JavaRDD<T>, int) - Method in class org.apache.spark.api.java.JavaRDD
Return an RDD with the elements from this that are not in other.
subtract(JavaRDD<T>, Partitioner) - Method in class org.apache.spark.api.java.JavaRDD
Return an RDD with the elements from this that are not in other.
subtract(ImpurityCalculator) - Method in class org.apache.spark.mllib.tree.impurity.ImpurityCalculator
Subtract another calculator's stats from this one, modifying and returning this calculator.
subtract(RDD<T>) - Method in class org.apache.spark.rdd.RDD
Return an RDD with the elements from this that are not in other.
subtract(RDD<T>, int) - Method in class org.apache.spark.rdd.RDD
Return an RDD with the elements from this that are not in other.
subtract(RDD<T>, Partitioner, Ordering<T>) - Method in class org.apache.spark.rdd.RDD
Return an RDD with the elements from this that are not in other.
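An illustrative sketch of RDD.subtract (assumes an existing SparkContext named sc):

    val a = sc.parallelize(Seq(1, 2, 3, 4))
    val b = sc.parallelize(Seq(3, 4, 5))
    // Elements of a that do not appear in b.
    a.subtract(b).collect()   // Array(1, 2), in some order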
subtract(long, long) - Static method in class org.apache.spark.streaming.util.RawTextHelper
 
subtract(Vector) - Method in class org.apache.spark.util.Vector
 
subtractByKey(JavaPairRDD<K, W>) - Method in class org.apache.spark.api.java.JavaPairRDD
Return an RDD with the pairs from this whose keys are not in other.
subtractByKey(JavaPairRDD<K, W>, int) - Method in class org.apache.spark.api.java.JavaPairRDD
Return an RDD with the pairs from `this` whose keys are not in `other`.
subtractByKey(JavaPairRDD<K, W>, Partitioner) - Method in class org.apache.spark.api.java.JavaPairRDD
Return an RDD with the pairs from `this` whose keys are not in `other`.
subtractByKey(RDD<Tuple2<K, W>>, ClassTag<W>) - Method in class org.apache.spark.rdd.PairRDDFunctions
Return an RDD with the pairs from this whose keys are not in other.
subtractByKey(RDD<Tuple2<K, W>>, int, ClassTag<W>) - Method in class org.apache.spark.rdd.PairRDDFunctions
Return an RDD with the pairs from `this` whose keys are not in `other`.
subtractByKey(RDD<Tuple2<K, W>>, Partitioner, ClassTag<W>) - Method in class org.apache.spark.rdd.PairRDDFunctions
Return an RDD with the pairs from `this` whose keys are not in `other`.
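A companion sketch of subtractByKey on pair RDDs (assumes an existing SparkContext named sc; the data is hypothetical):

    val left  = sc.parallelize(Seq(("a", 1), ("b", 2), ("c", 3)))
    val right = sc.parallelize(Seq(("b", 99)))
    // Keep only the pairs of left whose key does not occur in right.
    left.subtractByKey(right).collect()   // Array((a,1), (c,3)), in some order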
SubtractedRDD<K,V,W> - Class in org.apache.spark.rdd
An optimized version of cogroup for set difference/subtraction.
SubtractedRDD(RDD<? extends Product2<K, V>>, RDD<? extends Product2<K, W>>, Partitioner, ClassTag<K>, ClassTag<V>, ClassTag<W>) - Constructor for class org.apache.spark.rdd.SubtractedRDD
 
subtreeDepth() - Method in class org.apache.spark.mllib.tree.model.Node
Get depth of tree from this node.
subtreeIterator() - Method in class org.apache.spark.mllib.tree.model.Node
Returns an iterator that traverses (DFS, left to right) the subtree of this node.
subtreeToString(int) - Method in class org.apache.spark.mllib.tree.model.Node
Recursive print function.
succeededTasks() - Method in class org.apache.spark.ui.jobs.UIData.ExecutorSummary
 
success() - Method in class org.apache.spark.storage.ResultWithDroppedBlocks
 
Success - Class in org.apache.spark
:: DeveloperApi :: Task succeeded.
Success() - Constructor for class org.apache.spark.Success
 
successful() - Method in class org.apache.spark.scheduler.TaskInfo
 
successful() - Method in class org.apache.spark.scheduler.TaskSetManager
 
SUCCESSFUL_JOB_OUTPUT_DIR_MARKER() - Static method in class org.apache.spark.sql.hive.SparkHiveDynamicPartitionWriterContainer
 
sufficientResourcesRegistered() - Method in class org.apache.spark.scheduler.cluster.CoarseGrainedSchedulerBackend
 
sufficientResourcesRegistered() - Method in class org.apache.spark.scheduler.cluster.SparkDeploySchedulerBackend
 
sufficientResourcesRegistered() - Method in class org.apache.spark.scheduler.cluster.YarnSchedulerBackend
 
sum() - Method in class org.apache.spark.api.java.JavaDoubleRDD
Add up the elements in this RDD.
Sum() - Static method in class org.apache.spark.mllib.tree.configuration.EnsembleCombiningStrategy
 
sum() - Method in class org.apache.spark.partial.CountEvaluator
 
sum() - Method in class org.apache.spark.rdd.DoubleRDDFunctions
Add up the elements in this RDD.
sum(Column) - Static method in class org.apache.spark.sql.functions
Aggregate function: returns the sum of all values in the expression.
sum(String) - Static method in class org.apache.spark.sql.functions
Aggregate function: returns the sum of all values in the given column.
sum(String...) - Method in class org.apache.spark.sql.GroupedData
Compute the sum for each numeric column for each group.
sum(Seq<String>) - Method in class org.apache.spark.sql.GroupedData
Compute the sum for each numeric column for each group.
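An illustrative sketch of grouped sums on a DataFrame (assumes an existing SparkContext named sc; column names and data are hypothetical):

    import org.apache.spark.sql.SQLContext

    val sqlContext = new SQLContext(sc)
    import sqlContext.implicits._

    val df = sc.parallelize(Seq(("a", 1, 10), ("a", 2, 20), ("b", 3, 30)))
      .toDF("key", "x", "y")
    df.groupBy("key").sum().show()          // sum every numeric column per group
    df.groupBy("key").sum("x", "y").show()  // or name the columns explicitly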
SUM() - Static method in class org.apache.spark.sql.hive.HiveQl
 
sum() - Method in class org.apache.spark.util.StatCounter
 
sum() - Method in class org.apache.spark.util.Vector
 
sumApprox(long, Double) - Method in class org.apache.spark.api.java.JavaDoubleRDD
:: Experimental :: Approximate operation to return the sum within a timeout.
sumApprox(long) - Method in class org.apache.spark.api.java.JavaDoubleRDD
:: Experimental :: Approximate operation to return the sum within a timeout.
sumApprox(long, double) - Method in class org.apache.spark.rdd.DoubleRDDFunctions
:: Experimental :: Approximate operation to return the sum within a timeout.
sumDistinct(Column) - Static method in class org.apache.spark.sql.functions
Aggregate function: returns the sum of distinct values in the expression.
sumDistinct(String) - Static method in class org.apache.spark.sql.functions
Aggregate function: returns the sum of distinct values in the expression.
SumEvaluator - Class in org.apache.spark.partial
An ApproximateEvaluator for sums.
SumEvaluator(int, double) - Constructor for class org.apache.spark.partial.SumEvaluator
 
summary(PrintStream) - Method in class org.apache.spark.util.Distribution
Print a summary of this distribution to the given PrintStream.
sums() - Method in class org.apache.spark.partial.GroupedCountEvaluator
 
sums() - Method in class org.apache.spark.partial.GroupedMeanEvaluator
 
sums() - Method in class org.apache.spark.partial.GroupedSumEvaluator
 
supervisorStrategy() - Method in class org.apache.spark.streaming.receiver.ActorReceiver.Supervisor
 
supportedFeatureSubsetStrategies() - Static method in class org.apache.spark.mllib.tree.RandomForest
List of supported feature subset sampling strategies.
supports(ColumnType<?, ?>) - Static method in class org.apache.spark.sql.columnar.compression.BooleanBitSet
 
supports(ColumnType<?, ?>) - Method in interface org.apache.spark.sql.columnar.compression.CompressionScheme
 
supports(ColumnType<?, ?>) - Static method in class org.apache.spark.sql.columnar.compression.DictionaryEncoding
 
supports(ColumnType<?, ?>) - Static method in class org.apache.spark.sql.columnar.compression.IntDelta
 
supports(ColumnType<?, ?>) - Static method in class org.apache.spark.sql.columnar.compression.LongDelta
 
supports(ColumnType<?, ?>) - Static method in class org.apache.spark.sql.columnar.compression.PassThrough
 
supports(ColumnType<?, ?>) - Static method in class org.apache.spark.sql.columnar.compression.RunLengthEncoding
 
SVDPlusPlus - Class in org.apache.spark.graphx.lib
Implementation of SVD++ algorithm.
SVDPlusPlus() - Constructor for class org.apache.spark.graphx.lib.SVDPlusPlus
 
SVDPlusPlus.Conf - Class in org.apache.spark.graphx.lib
Configuration parameters for SVDPlusPlus.
SVDPlusPlus.Conf(int, int, double, double, double, double, double, double) - Constructor for class org.apache.spark.graphx.lib.SVDPlusPlus.Conf
 
SVMDataGenerator - Class in org.apache.spark.mllib.util
:: DeveloperApi :: Generate sample data used for SVM.
SVMDataGenerator() - Constructor for class org.apache.spark.mllib.util.SVMDataGenerator
 
SVMModel - Class in org.apache.spark.mllib.classification
Model for Support Vector Machines (SVMs).
SVMModel(Vector, double) - Constructor for class org.apache.spark.mllib.classification.SVMModel
 
SVMWithSGD - Class in org.apache.spark.mllib.classification
Train a Support Vector Machine (SVM) using Stochastic Gradient Descent.
SVMWithSGD() - Constructor for class org.apache.spark.mllib.classification.SVMWithSGD
Construct an SVM object with default parameters: {stepSize: 1.0, numIterations: 100, regParam: 0.01, miniBatchFraction: 1.0}.
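A minimal training sketch for SVMWithSGD (assumes an existing SparkContext named sc; the two-point training set is purely illustrative):

    import org.apache.spark.mllib.classification.SVMWithSGD
    import org.apache.spark.mllib.linalg.Vectors
    import org.apache.spark.mllib.regression.LabeledPoint

    val training = sc.parallelize(Seq(
      LabeledPoint(0.0, Vectors.dense(0.0, 1.0)),
      LabeledPoint(1.0, Vectors.dense(1.0, 0.0))))

    val model = SVMWithSGD.train(training, 100)   // 100 iterations of SGD
    model.predict(Vectors.dense(0.9, 0.1))        // returns 0.0 or 1.0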
symbolToColumn(Symbol) - Method in class org.apache.spark.sql.SQLContext.implicits
An implicit conversion that turns a Scala `Symbol` into a Column.
symlink(File, File) - Static method in class org.apache.spark.util.Utils
Creates a symlink.
symmetricEigs(Function1<DenseVector<Object>, DenseVector<Object>>, int, int, double, int) - Static method in class org.apache.spark.mllib.linalg.EigenValueDecomposition
Compute the leading k eigenvalues and eigenvectors on a symmetric square matrix using ARPACK.
syr(double, Vector, DenseMatrix) - Static method in class org.apache.spark.mllib.linalg.BLAS
A := alpha * x * x^T + A
SystemClock - Class in org.apache.spark.util
A clock backed by the actual time from the OS as reported by the System API.
SystemClock() - Constructor for class org.apache.spark.util.SystemClock
 
systemProperties() - Method in class org.apache.spark.ui.env.EnvironmentListener
 
systemProperty(Enumeration.Value) - Static method in class org.apache.spark.util.MetadataCleanerType
 

T

t() - Method in class org.apache.spark.SerializableWritable
 
table() - Method in class org.apache.spark.sql.hive.execution.DescribeHiveTableCommand
 
table() - Method in class org.apache.spark.sql.hive.execution.InsertIntoHiveTable
 
table() - Method in class org.apache.spark.sql.hive.InsertIntoHiveTable
 
table() - Method in class org.apache.spark.sql.hive.MetastoreRelation
 
table() - Method in class org.apache.spark.sql.jdbc.JDBCRelation
 
table() - Method in class org.apache.spark.sql.sources.DescribeCommand
 
table(String) - Method in class org.apache.spark.sql.SQLContext
Returns the specified table as a DataFrame.
TABLE_CLASS_NOT_STRIPED() - Static method in class org.apache.spark.ui.UIUtils
 
TABLE_CLASS_STRIPED() - Static method in class org.apache.spark.ui.UIUtils
 
tableDesc() - Method in class org.apache.spark.sql.hive.MetastoreRelation
 
tableExists(Seq<String>) - Method in class org.apache.spark.sql.hive.HiveMetastoreCatalog
 
tableInfo() - Method in class org.apache.spark.sql.hive.ShimFileSinkDesc
 
tableName() - Method in class org.apache.spark.sql.columnar.InMemoryRelation
 
tableName() - Method in class org.apache.spark.sql.hive.execution.AnalyzeTable
 
tableName() - Method in class org.apache.spark.sql.hive.execution.CreateMetastoreDataSource
 
tableName() - Method in class org.apache.spark.sql.hive.execution.CreateMetastoreDataSourceAsSelect
 
tableName() - Method in class org.apache.spark.sql.hive.execution.CreateTableAsSelect
 
tableName() - Method in class org.apache.spark.sql.hive.execution.DropTable
 
tableName() - Method in class org.apache.spark.sql.hive.MetastoreRelation
 
tableName() - Method in class org.apache.spark.sql.sources.CreateTableUsing
 
tableName() - Method in class org.apache.spark.sql.sources.CreateTableUsingAsSelect
 
tableName() - Method in class org.apache.spark.sql.sources.CreateTempTableUsing
 
tableName() - Method in class org.apache.spark.sql.sources.CreateTempTableUsingAsSelect
 
tableName() - Method in class org.apache.spark.sql.sources.RefreshTable
 
tableNames() - Method in class org.apache.spark.sql.SQLContext
Returns the names of tables in the current database as an array.
tableNames(String) - Method in class org.apache.spark.sql.SQLContext
Returns the names of tables in the given database as an array.
TableReader - Interface in org.apache.spark.sql.hive
A trait for subclasses that handle table scans.
tables() - Method in class org.apache.spark.sql.SQLContext
Returns a DataFrame containing names of existing tables in the current database.
tables(String) - Method in class org.apache.spark.sql.SQLContext
Returns a DataFrame containing names of existing tables in the given database.
TableScan - Interface in org.apache.spark.sql.sources
:: DeveloperApi :: A BaseRelation that can produce all of its tuples as an RDD of Row objects.
TachyonBlockManager - Class in org.apache.spark.storage
Creates and maintains the logical mapping between logical blocks and tachyon fs locations.
TachyonBlockManager(BlockManager, String, String) - Constructor for class org.apache.spark.storage.TachyonBlockManager
 
TachyonFileSegment - Class in org.apache.spark.storage
References a particular segment of a file (potentially the entire file), based on an offset and a length.
TachyonFileSegment(TachyonFile, long, long) - Constructor for class org.apache.spark.storage.TachyonFileSegment
 
tachyonFolderName() - Method in class org.apache.spark.SparkContext
 
tachyonSize() - Method in class org.apache.spark.storage.BlockManagerMessages.UpdateBlockInfo
 
tachyonSize() - Method in class org.apache.spark.storage.BlockStatus
 
tachyonSize() - Method in class org.apache.spark.storage.RDDInfo
 
tachyonStore() - Method in class org.apache.spark.storage.BlockManager
 
TachyonStore - Class in org.apache.spark.storage
Stores BlockManager blocks on Tachyon.
TachyonStore(BlockManager, TachyonBlockManager) - Constructor for class org.apache.spark.storage.TachyonStore
 
tail() - Method in class org.apache.spark.mllib.rdd.SlidingRDDPartition
 
take(int) - Method in interface org.apache.spark.api.java.JavaRDDLike
Take the first num elements of the RDD.
take(int) - Method in class org.apache.spark.rdd.RDD
Take the first num elements of the RDD.
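As a small hedged example (Scala; "sc" is assumed to be an existing SparkContext):
    // take(n) returns the first n elements to the driver as a local array,
    // scanning partitions in order until enough elements are collected.
    val rdd = sc.parallelize(Seq(5, 3, 8, 1))
    val firstTwo: Array[Int] = rdd.take(2)   // Array(5, 3)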
take(int) - Method in class org.apache.spark.sql.DataFrame
Returns the first n rows in the DataFrame.
take(int) - Method in interface org.apache.spark.sql.RDDApi
 
takeAsync(int) - Method in interface org.apache.spark.api.java.JavaRDDLike
The asynchronous version of the take action, which returns a future for retrieving the first num elements of this RDD.
takeAsync(int) - Method in class org.apache.spark.rdd.AsyncRDDActions
Returns a future for retrieving the first num elements of the RDD.
takeOrdered(int, Comparator<T>) - Method in interface org.apache.spark.api.java.JavaRDDLike
Returns the first k (smallest) elements from this RDD as defined by the specified Comparator[T] and maintains the order.
takeOrdered(int) - Method in interface org.apache.spark.api.java.JavaRDDLike
Returns the first k (smallest) elements from this RDD using the natural ordering for T while maintaining the order.
takeOrdered(int, Ordering<T>) - Method in class org.apache.spark.rdd.RDD
Returns the first k (smallest) elements from this RDD as defined by the specified implicit Ordering[T] and maintains the ordering.
takeSample(boolean, int) - Method in interface org.apache.spark.api.java.JavaRDDLike
 
takeSample(boolean, int, long) - Method in interface org.apache.spark.api.java.JavaRDDLike
 
takeSample(boolean, int, long) - Method in class org.apache.spark.rdd.RDD
Return a fixed-size sampled subset of this RDD in an array.
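A brief sketch of takeSample (Scala; "sc" assumed as above):
    // Returns a fixed-size random sample as a local array; the arguments are
    // (withReplacement, num, seed), with the seed given here for reproducibility.
    val data = sc.parallelize(1 to 100)
    val sample: Array[Int] = data.takeSample(false, 10, 42L)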
targetStorageLevel() - Method in class org.apache.spark.graphx.impl.EdgeRDDImpl
 
targetStorageLevel() - Method in class org.apache.spark.graphx.impl.VertexRDDImpl
 
task() - Method in class org.apache.spark.CleanupTaskWeakReference
 
task() - Method in class org.apache.spark.scheduler.AskPermissionToCommitOutput
 
task() - Method in class org.apache.spark.scheduler.BeginEvent
 
task() - Method in class org.apache.spark.scheduler.CompletionEvent
 
Task<T> - Class in org.apache.spark.scheduler
A unit of execution.
Task(int, int) - Constructor for class org.apache.spark.scheduler.Task
 
TASK_DESERIALIZATION_TIME() - Static method in class org.apache.spark.ui.jobs.TaskDetailsClassNames
 
TASK_DESERIALIZATION_TIME() - Static method in class org.apache.spark.ui.ToolTips
 
TASK_SIZE_TO_WARN_KB() - Static method in class org.apache.spark.scheduler.TaskSetManager
 
taskAttempt() - Method in class org.apache.spark.scheduler.AskPermissionToCommitOutput
 
taskAttemptId() - Method in class org.apache.spark.TaskContext
An ID that is unique to this task attempt (within the same SparkContext, no two task attempts will share the same attempt ID).
taskAttemptId() - Method in class org.apache.spark.TaskContextImpl
 
taskAttempts() - Method in class org.apache.spark.scheduler.TaskSetManager
 
TaskCommitDenied - Class in org.apache.spark
:: DeveloperApi :: Task requested the driver to commit, but was denied.
TaskCommitDenied(int, int, int) - Constructor for class org.apache.spark.TaskCommitDenied
 
taskCompleted(int, long, long, TaskEndReason) - Method in class org.apache.spark.scheduler.OutputCommitCoordinator
 
TaskCompletionListener - Interface in org.apache.spark.util
:: DeveloperApi ::
TaskCompletionListenerException - Exception in org.apache.spark.util
Exception thrown when there is an exception in executing the callback in TaskCompletionListener.
TaskCompletionListenerException(Seq<String>) - Constructor for exception org.apache.spark.util.TaskCompletionListenerException
 
TaskContext - Class in org.apache.spark
Contextual information about a task which can be read or mutated during execution.
TaskContext() - Constructor for class org.apache.spark.TaskContext
 
TaskContextHelper - Class in org.apache.spark
This class exists to restrict the visibility of TaskContext setters.
TaskContextHelper() - Constructor for class org.apache.spark.TaskContextHelper
 
TaskContextImpl - Class in org.apache.spark
 
TaskContextImpl(int, int, long, int, boolean, TaskMetrics) - Constructor for class org.apache.spark.TaskContextImpl
 
taskData() - Method in class org.apache.spark.ui.jobs.UIData.StageUIData
 
TaskDescription - Class in org.apache.spark.scheduler
Description of a task that gets passed onto executors to be executed, usually created by TaskSetManager.resourceOffer.
TaskDescription(long, int, String, String, int, ByteBuffer) - Constructor for class org.apache.spark.scheduler.TaskDescription
 
TaskDetailsClassNames - Class in org.apache.spark.ui.jobs
Names of the CSS classes corresponding to each type of task detail.
TaskDetailsClassNames() - Constructor for class org.apache.spark.ui.jobs.TaskDetailsClassNames
 
taskEnded(Task<?>, TaskEndReason, Object, Map<Object, Object>, TaskInfo, TaskMetrics) - Method in class org.apache.spark.scheduler.DAGScheduler
 
taskEndFromJson(JsonAST.JValue) - Static method in class org.apache.spark.util.JsonProtocol
 
TaskEndReason - Interface in org.apache.spark
:: DeveloperApi :: Various possible reasons why a task ended.
taskEndReasonFromJson(JsonAST.JValue) - Static method in class org.apache.spark.util.JsonProtocol
 
taskEndReasonToJson(TaskEndReason) - Static method in class org.apache.spark.util.JsonProtocol
 
taskEndToJson(SparkListenerTaskEnd) - Static method in class org.apache.spark.util.JsonProtocol
 
TaskFailedReason - Interface in org.apache.spark
:: DeveloperApi :: Various possible reasons why a task failed.
taskGettingResult(TaskInfo) - Method in class org.apache.spark.scheduler.DAGScheduler
 
taskGettingResultFromJson(JsonAST.JValue) - Static method in class org.apache.spark.util.JsonProtocol
 
taskGettingResultToJson(SparkListenerTaskGettingResult) - Static method in class org.apache.spark.util.JsonProtocol
 
taskId() - Method in class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.KillTask
 
taskId() - Method in class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.StatusUpdate
 
taskId() - Method in class org.apache.spark.scheduler.local.KillTask
 
taskId() - Method in class org.apache.spark.scheduler.local.StatusUpdate
 
taskId() - Method in class org.apache.spark.scheduler.TaskDescription
 
taskId() - Method in class org.apache.spark.scheduler.TaskInfo
 
taskId() - Method in class org.apache.spark.storage.TaskResultBlockId
 
taskIdsOnSlave() - Method in class org.apache.spark.scheduler.cluster.CoarseGrainedSchedulerBackend
 
taskIdToExecutorId() - Method in class org.apache.spark.scheduler.TaskSchedulerImpl
 
taskIdToSlaveId() - Method in class org.apache.spark.scheduler.cluster.mesos.CoarseMesosSchedulerBackend
 
taskIdToSlaveId() - Method in class org.apache.spark.scheduler.cluster.mesos.MesosSchedulerBackend
 
taskIdToTaskSetId() - Method in class org.apache.spark.scheduler.TaskSchedulerImpl
 
taskInfo() - Method in class org.apache.spark.scheduler.BeginEvent
 
taskInfo() - Method in class org.apache.spark.scheduler.CompletionEvent
 
taskInfo() - Method in class org.apache.spark.scheduler.GettingResultEvent
 
taskInfo() - Method in class org.apache.spark.scheduler.SparkListenerTaskEnd
 
taskInfo() - Method in class org.apache.spark.scheduler.SparkListenerTaskGettingResult
 
taskInfo() - Method in class org.apache.spark.scheduler.SparkListenerTaskStart
 
TaskInfo - Class in org.apache.spark.scheduler
:: DeveloperApi :: Information about a running task attempt inside a TaskSet.
TaskInfo(long, int, int, long, String, String, Enumeration.Value, boolean) - Constructor for class org.apache.spark.scheduler.TaskInfo
 
taskInfo() - Method in class org.apache.spark.ui.jobs.UIData.TaskUIData
 
taskInfoFromJson(JsonAST.JValue) - Static method in class org.apache.spark.util.JsonProtocol
 
taskInfos() - Method in class org.apache.spark.scheduler.TaskSetManager
 
taskInfoToJson(TaskInfo) - Static method in class org.apache.spark.util.JsonProtocol
 
TaskKilled - Class in org.apache.spark
:: DeveloperApi :: Task was killed intentionally and needs to be rescheduled.
TaskKilled() - Constructor for class org.apache.spark.TaskKilled
 
TaskKilledException - Exception in org.apache.spark
:: DeveloperApi :: Exception thrown when a task is explicitly killed (i.e., task failure is expected).
TaskKilledException() - Constructor for exception org.apache.spark.TaskKilledException
 
taskLocality() - Method in class org.apache.spark.scheduler.TaskInfo
 
TaskLocality - Class in org.apache.spark.scheduler
 
TaskLocality() - Constructor for class org.apache.spark.scheduler.TaskLocality
 
TaskLocation - Interface in org.apache.spark.scheduler
A location where a task should run.
taskMetrics() - Method in class org.apache.spark.Heartbeat
 
taskMetrics() - Method in class org.apache.spark.scheduler.CompletionEvent
 
taskMetrics() - Method in class org.apache.spark.scheduler.SparkListenerExecutorMetricsUpdate
 
taskMetrics() - Method in class org.apache.spark.scheduler.SparkListenerTaskEnd
 
taskMetrics() - Method in class org.apache.spark.TaskContext
::DeveloperApi::
taskMetrics() - Method in class org.apache.spark.TaskContextImpl
 
taskMetrics() - Method in class org.apache.spark.ui.jobs.UIData.TaskUIData
 
taskMetricsFromJson(JsonAST.JValue) - Static method in class org.apache.spark.util.JsonProtocol
 
taskMetricsToJson(TaskMetrics) - Static method in class org.apache.spark.util.JsonProtocol
 
TaskNotSerializableException - Exception in org.apache.spark
Exception thrown when a task cannot be serialized.
TaskNotSerializableException(Throwable) - Constructor for exception org.apache.spark.TaskNotSerializableException
 
TaskResult<T> - Interface in org.apache.spark.scheduler
 
TASKRESULT() - Static method in class org.apache.spark.storage.BlockId
 
TaskResultBlockId - Class in org.apache.spark.storage
 
TaskResultBlockId(long) - Constructor for class org.apache.spark.storage.TaskResultBlockId
 
TaskResultGetter - Class in org.apache.spark.scheduler
Runs a thread pool that deserializes and remotely fetches (if necessary) task results.
TaskResultGetter(SparkEnv, TaskSchedulerImpl) - Constructor for class org.apache.spark.scheduler.TaskResultGetter
 
taskResultGetter() - Method in class org.apache.spark.scheduler.TaskSchedulerImpl
 
TaskResultLost - Class in org.apache.spark
:: DeveloperApi :: The task finished successfully, but the result was lost from the executor's block manager before it was fetched.
TaskResultLost() - Constructor for class org.apache.spark.TaskResultLost
 
taskRow(boolean, boolean, boolean, boolean, boolean, boolean, UIData.TaskUIData) - Method in class org.apache.spark.ui.jobs.StagePage
 
tasks() - Method in class org.apache.spark.scheduler.TaskSet
 
tasks() - Method in class org.apache.spark.scheduler.TaskSetManager
 
taskScheduler() - Method in class org.apache.spark.scheduler.DAGScheduler
 
TaskScheduler - Interface in org.apache.spark.scheduler
Low-level task scheduler interface, currently implemented exclusively by TaskSchedulerImpl.
taskScheduler() - Method in class org.apache.spark.SparkContext
 
TaskSchedulerImpl - Class in org.apache.spark.scheduler
Schedules tasks for multiple types of clusters by acting through a SchedulerBackend.
TaskSchedulerImpl(SparkContext, int, boolean) - Constructor for class org.apache.spark.scheduler.TaskSchedulerImpl
 
TaskSchedulerImpl(SparkContext) - Constructor for class org.apache.spark.scheduler.TaskSchedulerImpl
 
TaskSet - Class in org.apache.spark.scheduler
A set of tasks submitted together to the low-level TaskScheduler, usually representing missing partitions of a particular stage.
TaskSet(Task<?>[], int, int, int, Properties) - Constructor for class org.apache.spark.scheduler.TaskSet
 
taskSet() - Method in class org.apache.spark.scheduler.TaskSetFailed
 
taskSet() - Method in class org.apache.spark.scheduler.TaskSetManager
 
taskSetFailed(TaskSet, String) - Method in class org.apache.spark.scheduler.DAGScheduler
 
TaskSetFailed - Class in org.apache.spark.scheduler
 
TaskSetFailed(TaskSet, String) - Constructor for class org.apache.spark.scheduler.TaskSetFailed
 
taskSetFinished(TaskSetManager) - Method in class org.apache.spark.scheduler.TaskSchedulerImpl
Called to indicate that all task attempts (including speculated tasks) associated with the given TaskSetManager have completed, so state associated with the TaskSetManager should be cleaned up.
TaskSetManager - Class in org.apache.spark.scheduler
Schedules the tasks within a single TaskSet in the TaskSchedulerImpl.
TaskSetManager(TaskSchedulerImpl, TaskSet, int, Clock) - Constructor for class org.apache.spark.scheduler.TaskSetManager
 
taskSetSchedulingAlgorithm() - Method in class org.apache.spark.scheduler.Pool
 
tasksSuccessful() - Method in class org.apache.spark.scheduler.TaskSetManager
 
taskStarted(Task<?>, TaskInfo) - Method in class org.apache.spark.scheduler.DAGScheduler
 
taskStartFromJson(JsonAST.JValue) - Static method in class org.apache.spark.util.JsonProtocol
 
taskStartToJson(SparkListenerTaskStart) - Static method in class org.apache.spark.util.JsonProtocol
 
TaskState - Class in org.apache.spark
 
TaskState() - Constructor for class org.apache.spark.TaskState
 
taskSucceeded(int, Object) - Method in class org.apache.spark.partial.ApproximateActionListener
 
taskSucceeded(int, Object) - Method in interface org.apache.spark.scheduler.JobListener
 
taskSucceeded(int, Object) - Method in class org.apache.spark.scheduler.JobWaiter
 
taskTime() - Method in class org.apache.spark.ui.jobs.UIData.ExecutorSummary
 
taskType() - Method in class org.apache.spark.scheduler.SparkListenerTaskEnd
 
tellMaster() - Method in class org.apache.spark.storage.BlockInfo
 
TempLocalBlockId - Class in org.apache.spark.storage
Id associated with temporary local data managed as blocks.
TempLocalBlockId(UUID) - Constructor for class org.apache.spark.storage.TempLocalBlockId
 
temporary() - Method in class org.apache.spark.sql.sources.CreateTableUsing
 
temporary() - Method in class org.apache.spark.sql.sources.CreateTableUsingAsSelect
 
TempShuffleBlockId - Class in org.apache.spark.storage
Id associated with temporary shuffle data managed as blocks.
TempShuffleBlockId(UUID) - Constructor for class org.apache.spark.storage.TempShuffleBlockId
 
term2index(int) - Static method in class org.apache.spark.mllib.clustering.LDA
Term vertex IDs are {-1, -2, ..., -vocabSize}
TerminalWidth() - Method in class org.apache.spark.ui.ConsoleProgressBar
 
TEST() - Static method in class org.apache.spark.storage.BlockId
 
TestBlockId - Class in org.apache.spark.storage
 
TestBlockId(String) - Constructor for class org.apache.spark.storage.TestBlockId
 
testData() - Static method in class org.apache.spark.sql.parquet.ParquetTestData
 
testDir() - Static method in class org.apache.spark.sql.parquet.ParquetTestData
 
testFilterDir() - Static method in class org.apache.spark.sql.parquet.ParquetTestData
 
testFilterSchema() - Static method in class org.apache.spark.sql.parquet.ParquetTestData
 
testGlobDir() - Static method in class org.apache.spark.sql.parquet.ParquetTestData
 
testGlobSubDir1() - Static method in class org.apache.spark.sql.parquet.ParquetTestData
 
testGlobSubDir2() - Static method in class org.apache.spark.sql.parquet.ParquetTestData
 
testGlobSubDir3() - Static method in class org.apache.spark.sql.parquet.ParquetTestData
 
TestGroupWriteSupport - Class in org.apache.spark.sql.parquet
 
TestGroupWriteSupport(MessageType) - Constructor for class org.apache.spark.sql.parquet.TestGroupWriteSupport
 
testNestedData1() - Static method in class org.apache.spark.sql.parquet.ParquetTestData
 
testNestedData2() - Static method in class org.apache.spark.sql.parquet.ParquetTestData
 
testNestedDir1() - Static method in class org.apache.spark.sql.parquet.ParquetTestData
 
testNestedDir2() - Static method in class org.apache.spark.sql.parquet.ParquetTestData
 
testNestedDir3() - Static method in class org.apache.spark.sql.parquet.ParquetTestData
 
testNestedDir4() - Static method in class org.apache.spark.sql.parquet.ParquetTestData
 
testNestedSchema1() - Static method in class org.apache.spark.sql.parquet.ParquetTestData
 
testNestedSchema2() - Static method in class org.apache.spark.sql.parquet.ParquetTestData
 
testNestedSchema3() - Static method in class org.apache.spark.sql.parquet.ParquetTestData
 
testNestedSchema4() - Static method in class org.apache.spark.sql.parquet.ParquetTestData
 
TestResult<DF> - Interface in org.apache.spark.mllib.stat.test
:: Experimental :: Trait for hypothesis test results.
testSchema() - Static method in class org.apache.spark.sql.parquet.ParquetTestData
 
testSchemaFieldNames() - Static method in class org.apache.spark.sql.parquet.ParquetTestData
 
TestSQLContext - Class in org.apache.spark.sql.test
A SQLContext that can be used for local testing.
TestSQLContext() - Constructor for class org.apache.spark.sql.test.TestSQLContext
 
TestUtils - Class in org.apache.spark
Utilities for tests.
TestUtils() - Constructor for class org.apache.spark.TestUtils
 
textFile(String) - Method in class org.apache.spark.api.java.JavaSparkContext
Read a text file from HDFS, a local file system (available on all nodes), or any Hadoop-supported file system URI, and return it as an RDD of Strings.
textFile(String, int) - Method in class org.apache.spark.api.java.JavaSparkContext
Read a text file from HDFS, a local file system (available on all nodes), or any Hadoop-supported file system URI, and return it as an RDD of Strings.
textFile(String, int) - Method in class org.apache.spark.SparkContext
Read a text file from HDFS, a local file system (available on all nodes), or any Hadoop-supported file system URI, and return it as an RDD of Strings.
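A minimal sketch (Scala; the HDFS path below is purely illustrative):
    // Each line of the file becomes one String element of the RDD; the optional
    // second argument is a hint for the minimum number of partitions.
    val lines = sc.textFile("hdfs:///logs/input.txt", 4)
    val lineCount = lines.count()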
textFileStream(String) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
Create an input stream that monitors a Hadoop-compatible filesystem for new files and reads them as text files (using key as LongWritable, value as Text and input format as TextInputFormat).
textFileStream(String) - Method in class org.apache.spark.streaming.StreamingContext
Create an input stream that monitors a Hadoop-compatible filesystem for new files and reads them as text files (using key as LongWritable, value as Text and input format as TextInputFormat).
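A hedged sketch (Scala; "ssc" is assumed to be an initialized StreamingContext and the directory name is illustrative):
    // Files newly written into the monitored directory are read as text,
    // one String per line, and surfaced as a DStream.
    val fileLines = ssc.textFileStream("/data/incoming")
    fileLines.print()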
textResponderToServlet(Function1<HttpServletRequest, String>) - Static method in class org.apache.spark.ui.JettyUtils
 
theta() - Method in class org.apache.spark.mllib.classification.NaiveBayesModel
 
thisClassName() - Method in class org.apache.spark.mllib.recommendation.MatrixFactorizationModel.SaveLoadV1_0$
 
thisClassName() - Method in class org.apache.spark.mllib.tree.model.DecisionTreeModel.SaveLoadV1_0$
 
thisFormatVersion() - Method in class org.apache.spark.mllib.classification.impl.GLMClassificationModel.SaveLoadV1_0$
 
thisFormatVersion() - Method in class org.apache.spark.mllib.regression.impl.GLMRegressionModel.SaveLoadV1_0$
 
thisFormatVersion() - Method in class org.apache.spark.mllib.tree.model.DecisionTreeModel.SaveLoadV1_0$
 
thisFormatVersion() - Method in class org.apache.spark.mllib.tree.model.TreeEnsembleModel.SaveLoadV1_0$
 
thread() - Method in class org.apache.spark.streaming.scheduler.ReceiverTracker.ReceiverLauncher
 
threadDumpEnabled() - Method in class org.apache.spark.ui.exec.ExecutorsTab
 
threadId() - Method in class org.apache.spark.util.ThreadStackTrace
 
threadName() - Method in class org.apache.spark.util.ThreadStackTrace
 
ThreadStackTrace - Class in org.apache.spark.util
Used for shipping per-thread stack traces from the executors to the driver.
ThreadStackTrace(long, String, Thread.State, String) - Constructor for class org.apache.spark.util.ThreadStackTrace
 
threadState() - Method in class org.apache.spark.util.ThreadStackTrace
 
threshold() - Method in interface org.apache.spark.ml.param.HasThreshold
param for threshold in (binary) prediction
threshold() - Method in class org.apache.spark.mllib.classification.impl.GLMClassificationModel.SaveLoadV1_0$.Data
 
threshold() - Method in class org.apache.spark.mllib.tree.model.DecisionTreeModel.SaveLoadV1_0$.SplitData
 
threshold() - Method in class org.apache.spark.mllib.tree.model.Split
 
thresholds() - Method in class org.apache.spark.mllib.evaluation.BinaryClassificationMetrics
Returns thresholds in descending order.
threshTime() - Method in class org.apache.spark.streaming.receiver.CleanupOldBlocks
 
THRIFT_ARRAY_ELEMENTS_SCHEMA_NAME_SUFFIX() - Static method in class org.apache.spark.sql.parquet.CatalystConverter
 
THRIFTSERVER_POOL() - Static method in class org.apache.spark.sql.SQLConf
 
throwBalls() - Method in class org.apache.spark.rdd.PartitionCoalescer
 
time() - Method in class org.apache.spark.scheduler.SparkListenerApplicationEnd
 
time() - Method in class org.apache.spark.scheduler.SparkListenerApplicationStart
 
time() - Method in class org.apache.spark.scheduler.SparkListenerBlockManagerAdded
 
time() - Method in class org.apache.spark.scheduler.SparkListenerBlockManagerRemoved
 
time() - Method in class org.apache.spark.scheduler.SparkListenerExecutorAdded
 
time() - Method in class org.apache.spark.scheduler.SparkListenerExecutorRemoved
 
time() - Method in class org.apache.spark.scheduler.SparkListenerJobEnd
 
time() - Method in class org.apache.spark.scheduler.SparkListenerJobStart
 
time() - Method in class org.apache.spark.streaming.scheduler.BatchAllocationEvent
 
time() - Method in class org.apache.spark.streaming.scheduler.ClearCheckpointData
 
time() - Method in class org.apache.spark.streaming.scheduler.ClearMetadata
 
time() - Method in class org.apache.spark.streaming.scheduler.DoCheckpoint
 
time() - Method in class org.apache.spark.streaming.scheduler.GenerateJobs
 
time() - Method in class org.apache.spark.streaming.scheduler.Job
 
time() - Method in class org.apache.spark.streaming.scheduler.JobSet
 
Time - Class in org.apache.spark.streaming
This is a simple class that represents an absolute instant of time.
Time(long) - Constructor for class org.apache.spark.streaming.Time
 
TimeBasedRollingPolicy - Class in org.apache.spark.util.logging
Defines a RollingPolicy by which files will be rolled over at a fixed interval.
TimeBasedRollingPolicy(long, String, boolean) - Constructor for class org.apache.spark.util.logging.TimeBasedRollingPolicy
 
timeIt(int, Function0<BoxedUnit>, Option<Function0<BoxedUnit>>) - Static method in class org.apache.spark.util.Utils
Timing method based on iterations that permit JVM JIT optimization.
timeout() - Method in class org.apache.spark.storage.BlockManagerMaster
 
timeoutCheckingTask() - Method in class org.apache.spark.storage.BlockManagerMasterActor
 
timeRunning(long) - Method in class org.apache.spark.scheduler.TaskInfo
 
times(int) - Method in class org.apache.spark.streaming.Duration
 
times() - Method in class org.apache.spark.streaming.scheduler.BatchCleanupEvent
 
times(int, Function0<BoxedUnit>) - Static method in class org.apache.spark.util.Utils
Executes a task a given number of times, for its side effects.
TIMESTAMP - Class in org.apache.spark.sql.columnar
 
TIMESTAMP() - Constructor for class org.apache.spark.sql.columnar.TIMESTAMP
 
timestamp() - Method in class org.apache.spark.sql.ColumnName
Creates a new AttributeReference of type timestamp
timestamp() - Method in class org.apache.spark.util.TimeStampedValue
 
TimestampColumnAccessor - Class in org.apache.spark.sql.columnar
 
TimestampColumnAccessor(ByteBuffer) - Constructor for class org.apache.spark.sql.columnar.TimestampColumnAccessor
 
TimestampColumnBuilder - Class in org.apache.spark.sql.columnar
 
TimestampColumnBuilder() - Constructor for class org.apache.spark.sql.columnar.TimestampColumnBuilder
 
TimestampColumnStats - Class in org.apache.spark.sql.columnar
 
TimestampColumnStats() - Constructor for class org.apache.spark.sql.columnar.TimestampColumnStats
 
TimestampConversion() - Method in class org.apache.spark.sql.jdbc.JDBCRDD
Accessor for nested Scala object
TimeStampedHashMap<A,B> - Class in org.apache.spark.util
This is a custom implementation of scala.collection.mutable.Map which stores the insertion timestamp along with each key-value pair.
TimeStampedHashMap(boolean) - Constructor for class org.apache.spark.util.TimeStampedHashMap
 
TimeStampedHashSet<A> - Class in org.apache.spark.util
 
TimeStampedHashSet() - Constructor for class org.apache.spark.util.TimeStampedHashSet
 
TimeStampedValue<V> - Class in org.apache.spark.util
 
TimeStampedValue(V, long) - Constructor for class org.apache.spark.util.TimeStampedValue
 
TimeStampedWeakValueHashMap<A,B> - Class in org.apache.spark.util
A wrapper of TimeStampedHashMap that ensures the values are weakly referenced and timestamped.
TimeStampedWeakValueHashMap(boolean) - Constructor for class org.apache.spark.util.TimeStampedWeakValueHashMap
 
timeToLogFile(long, long) - Static method in class org.apache.spark.streaming.util.WriteAheadLogManager
 
TimeTracker - Class in org.apache.spark.mllib.tree.impl
Time tracker implementation which holds labeled timers.
TimeTracker() - Constructor for class org.apache.spark.mllib.tree.impl.TimeTracker
 
timeUnit() - Method in class org.apache.spark.mllib.clustering.StreamingKMeans
 
tmpPath() - Method in class org.apache.spark.scheduler.cluster.SimrSchedulerBackend
 
to(Time, Duration) - Method in class org.apache.spark.streaming.Time
 
toArray() - Method in interface org.apache.spark.api.java.JavaRDDLike
Deprecated.
As of Spark 1.0.0, toArray() is deprecated, use JavaRDDLike.collect() instead
toArray() - Method in class org.apache.spark.input.PortableDataStream
Read the file as a byte array
toArray() - Method in class org.apache.spark.mllib.linalg.DenseVector
 
toArray() - Method in interface org.apache.spark.mllib.linalg.Matrix
Converts to a dense array in column-major order.
toArray() - Method in class org.apache.spark.mllib.linalg.SparseVector
 
toArray() - Method in interface org.apache.spark.mllib.linalg.Vector
Converts the instance to a double array.
toArray() - Method in class org.apache.spark.rdd.RDD
Return an array that contains all of the elements in this RDD.
toArrays() - Method in class org.apache.spark.util.io.ByteArrayChunkOutputStream
 
toAttribute() - Method in class org.apache.spark.sql.hive.MetastoreRelation.SchemaAttribute
 
toBatchInfo() - Method in class org.apache.spark.streaming.scheduler.JobSet
 
toBinary() - Method in class org.apache.spark.sql.parquet.timestamp.NanoTime
 
toBlockMatrix() - Method in class org.apache.spark.mllib.linalg.distributed.CoordinateMatrix
Converts to BlockMatrix.
toBlockMatrix(int, int) - Method in class org.apache.spark.mllib.linalg.distributed.CoordinateMatrix
Converts to BlockMatrix.
toBlockMatrix() - Method in class org.apache.spark.mllib.linalg.distributed.IndexedRowMatrix
Converts to BlockMatrix.
toBlockMatrix(int, int) - Method in class org.apache.spark.mllib.linalg.distributed.IndexedRowMatrix
Converts to BlockMatrix.
toBreeze() - Method in class org.apache.spark.mllib.linalg.DenseMatrix
 
toBreeze() - Method in class org.apache.spark.mllib.linalg.DenseVector
 
toBreeze() - Method in class org.apache.spark.mllib.linalg.distributed.BlockMatrix
Collects data and assembles a local dense breeze matrix (for test only).
toBreeze() - Method in class org.apache.spark.mllib.linalg.distributed.CoordinateMatrix
Collects data and assembles a local matrix.
toBreeze() - Method in interface org.apache.spark.mllib.linalg.distributed.DistributedMatrix
Collects data and assembles a local dense breeze matrix (for test only).
toBreeze() - Method in class org.apache.spark.mllib.linalg.distributed.IndexedRowMatrix
 
toBreeze() - Method in class org.apache.spark.mllib.linalg.distributed.RowMatrix
 
toBreeze() - Method in interface org.apache.spark.mllib.linalg.Matrix
Converts to a breeze matrix.
toBreeze() - Method in class org.apache.spark.mllib.linalg.SparseMatrix
 
toBreeze() - Method in class org.apache.spark.mllib.linalg.SparseVector
 
toBreeze() - Method in interface org.apache.spark.mllib.linalg.Vector
Converts the instance to a breeze vector.
toByteString() - Method in class org.apache.spark.scheduler.cluster.mesos.MesosTaskLaunchData
 
toCatalystDecimal(HiveDecimalObjectInspector, Object) - Static method in class org.apache.spark.sql.hive.HiveShim
 
toCoordinateMatrix() - Method in class org.apache.spark.mllib.linalg.distributed.BlockMatrix
Converts to CoordinateMatrix.
toCoordinateMatrix() - Method in class org.apache.spark.mllib.linalg.distributed.IndexedRowMatrix
Converts this matrix to a CoordinateMatrix.
toDataType(String) - Static method in class org.apache.spark.sql.hive.HiveMetastoreTypes
 
toDataType(Type, boolean, boolean) - Static method in class org.apache.spark.sql.parquet.ParquetTypesConverter
Converts a given Parquet Type into the corresponding DataType.
toDebugString() - Method in interface org.apache.spark.api.java.JavaRDDLike
A description of this RDD and its recursive dependencies for debugging.
toDebugString() - Method in class org.apache.spark.mllib.tree.model.DecisionTreeModel
Print the full model to a string.
toDebugString() - Method in class org.apache.spark.mllib.tree.model.TreeEnsembleModel
Print the full model to a string.
toDebugString() - Method in class org.apache.spark.rdd.RDD
A description of this RDD and its recursive dependencies for debugging.
toDebugString() - Method in class org.apache.spark.SparkConf
Return a string listing all keys and values, one per line.
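For example (Scala; "sc" assumed, and the file name is illustrative), both an RDD and a SparkConf can be dumped for debugging:
    // Prints the RDD's lineage, then the effective Spark configuration.
    val words = sc.textFile("README.md").flatMap(_.split(" "))
    println(words.toDebugString)
    println(sc.getConf.toDebugString)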
toDense() - Method in class org.apache.spark.mllib.clustering.VectorWithNorm
Converts the vector to a dense vector.
toDense() - Method in class org.apache.spark.mllib.linalg.SparseMatrix
Generate a DenseMatrix from the given SparseMatrix.
toDF(String...) - Method in class org.apache.spark.sql.DataFrame
Returns a new DataFrame with columns renamed.
toDF() - Method in class org.apache.spark.sql.DataFrame
Returns the object itself.
toDF(Seq<String>) - Method in class org.apache.spark.sql.DataFrame
Returns a new DataFrame with columns renamed.
toDF() - Method in class org.apache.spark.sql.DataFrameHolder
 
toDF(Seq<String>) - Method in class org.apache.spark.sql.DataFrameHolder
 
toEdgePartition() - Method in class org.apache.spark.graphx.impl.EdgePartitionBuilder
 
toEdgePartition() - Method in class org.apache.spark.graphx.impl.ExistingEdgePartitionBuilder
 
toEdgeTriplet() - Method in class org.apache.spark.graphx.EdgeContext
Converts the edge and vertex properties into an EdgeTriplet for convenience.
toErrorString() - Method in class org.apache.spark.ExceptionFailure
 
toErrorString() - Method in class org.apache.spark.ExecutorLostFailure
 
toErrorString() - Method in class org.apache.spark.FetchFailed
 
toErrorString() - Static method in class org.apache.spark.Resubmitted
 
toErrorString() - Method in class org.apache.spark.TaskCommitDenied
 
toErrorString() - Method in interface org.apache.spark.TaskFailedReason
Error message displayed in the web UI.
toErrorString() - Static method in class org.apache.spark.TaskKilled
 
toErrorString() - Static method in class org.apache.spark.TaskResultLost
 
toErrorString() - Static method in class org.apache.spark.UnknownReason
 
toFormattedString() - Method in class org.apache.spark.streaming.Duration
 
toIndexedRowMatrix() - Method in class org.apache.spark.mllib.linalg.distributed.BlockMatrix
Converts to IndexedRowMatrix.
toIndexedRowMatrix() - Method in class org.apache.spark.mllib.linalg.distributed.CoordinateMatrix
Converts to IndexedRowMatrix.
toInspector(DataType) - Method in interface org.apache.spark.sql.hive.HiveInspectors
 
toInspector(Expression) - Method in interface org.apache.spark.sql.hive.HiveInspectors
Maps the Catalyst expression to an ObjectInspector; if the expression is a Literal or is foldable, a constant writable object inspector is returned, otherwise the object inspector is derived from the expression's Catalyst data type.
toInt() - Method in class org.apache.spark.storage.StorageLevel
 
toJavaDStream() - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
Convert to a JavaDStream
toJavaRDD() - Method in class org.apache.spark.rdd.RDD
 
toJavaRDD() - Method in class org.apache.spark.sql.DataFrame
Returns the content of the DataFrame as a JavaRDD of Rows.
toJSON() - Method in class org.apache.spark.sql.DataFrame
Returns the content of the DataFrame as an RDD of JSON strings.
tokenize(String) - Static method in class org.apache.spark.rdd.PipedRDD
 
Tokenizer - Class in org.apache.spark.ml.feature
:: AlphaComponent :: A tokenizer that converts the input string to lowercase and then splits it by white spaces.
Tokenizer() - Constructor for class org.apache.spark.ml.feature.Tokenizer
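A small sketch of using the Tokenizer transformer (Scala): the DataFrame "sentences" with a string column "text" is hypothetical.
    import org.apache.spark.ml.feature.Tokenizer
    // Lowercases the input column and splits it on whitespace into "words".
    val tokenizer = new Tokenizer().setInputCol("text").setOutputCol("words")
    val tokenized = tokenizer.transform(sentences)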
 
toLocal() - Method in class org.apache.spark.mllib.clustering.DistributedLDAModel
Convert model to a local model.
toLocalIterator() - Method in interface org.apache.spark.api.java.JavaRDDLike
Return an iterator that contains all of the elements in this RDD.
toLocalIterator() - Method in class org.apache.spark.rdd.RDD
Return an iterator that contains all of the elements in this RDD.
toLocalMatrix() - Method in class org.apache.spark.mllib.linalg.distributed.BlockMatrix
Collect the distributed matrix on the driver as a `DenseMatrix`.
toLowerCase() - Method in class org.apache.spark.sql.hive.HiveMetastoreCatalog.QualifiedTableName
 
toMap() - Method in class org.apache.spark.util.TimeStampedHashMap
 
toMap() - Method in class org.apache.spark.util.TimeStampedWeakValueHashMap
 
toMesos(Enumeration.Value) - Static method in class org.apache.spark.TaskState
 
toMetastoreType(DataType) - Static method in class org.apache.spark.sql.hive.HiveMetastoreTypes
 
toNodeSeq() - Method in class org.apache.spark.ui.jobs.ExecutorTable
 
toNodeSeq() - Method in class org.apache.spark.ui.jobs.PoolTable
 
toNodeSeq() - Method in class org.apache.spark.ui.jobs.StageTableBase
 
ToolTips - Class in org.apache.spark.ui
 
ToolTips() - Constructor for class org.apache.spark.ui.ToolTips
 
toOps(ShippableVertexPartition<VD>, ClassTag<VD>) - Method in class org.apache.spark.graphx.impl.ShippableVertexPartition.ShippableVertexPartitionOpsConstructor$
 
toOps(VertexPartition<VD>, ClassTag<VD>) - Method in class org.apache.spark.graphx.impl.VertexPartition.VertexPartitionOpsConstructor$
 
toOps(T, ClassTag<VD>) - Method in interface org.apache.spark.graphx.impl.VertexPartitionBaseOpsConstructor
 
top(int, Comparator<T>) - Method in interface org.apache.spark.api.java.JavaRDDLike
Returns the top k (largest) elements from this RDD as defined by the specified Comparator[T].
top(int) - Method in interface org.apache.spark.api.java.JavaRDDLike
Returns the top k (largest) elements from this RDD using the natural ordering for T.
top(int, Ordering<T>) - Method in class org.apache.spark.rdd.RDD
 
toPairDStreamFunctions(DStream<Tuple2<K, V>>, ClassTag<K>, ClassTag<V>, Ordering<K>) - Static method in class org.apache.spark.streaming.dstream.DStream
 
toPairDStreamFunctions(DStream<Tuple2<K, V>>, ClassTag<K>, ClassTag<V>, Ordering<K>) - Static method in class org.apache.spark.streaming.StreamingContext
 
topic() - Method in class org.apache.spark.streaming.kafka.KafkaRDDPartition
 
topic() - Method in class org.apache.spark.streaming.kafka.OffsetRange
Kafka topic name
topicConcentration() - Method in class org.apache.spark.mllib.clustering.LDA.EMOptimizer
 
topicDistributions() - Method in class org.apache.spark.mllib.clustering.DistributedLDAModel
For each document in the training set, return the distribution over topics for that document ("theta_doc").
topicsMatrix() - Method in class org.apache.spark.mllib.clustering.DistributedLDAModel
Inferred topics, where each topic is represented by a distribution over terms.
topicsMatrix() - Method in class org.apache.spark.mllib.clustering.LDAModel
Inferred topics, where each topic is represented by a distribution over terms.
topicsMatrix() - Method in class org.apache.spark.mllib.clustering.LocalLDAModel
 
topK(Iterator<Tuple2<String, Object>>, int) - Static method in class org.apache.spark.streaming.util.RawTextHelper
Gets the top k words in terms of word counts.
topNode() - Method in class org.apache.spark.mllib.tree.model.DecisionTreeModel
 
toPredict() - Method in class org.apache.spark.mllib.tree.model.DecisionTreeModel.SaveLoadV1_0$.PredictData
 
toPrimitiveDataType(PrimitiveType, boolean, boolean) - Static method in class org.apache.spark.sql.parquet.ParquetTypesConverter
 
toRDD(JavaDoubleRDD) - Static method in class org.apache.spark.api.java.JavaDoubleRDD
 
toRDD(JavaPairRDD<K, V>) - Static method in class org.apache.spark.api.java.JavaPairRDD
 
toRDD(JavaRDD<T>) - Static method in class org.apache.spark.api.java.JavaRDD
 
toRowMatrix() - Method in class org.apache.spark.mllib.linalg.distributed.CoordinateMatrix
Converts to RowMatrix, dropping row indices after grouping by row index.
toRowMatrix() - Method in class org.apache.spark.mllib.linalg.distributed.IndexedRowMatrix
Drops row indices and converts this matrix to a RowMatrix.
TorrentBroadcast<T> - Class in org.apache.spark.broadcast
A BitTorrent-like implementation of Broadcast.
TorrentBroadcast(T, long, ClassTag<T>) - Constructor for class org.apache.spark.broadcast.TorrentBroadcast
 
TorrentBroadcastFactory - Class in org.apache.spark.broadcast
A Broadcast implementation that uses a BitTorrent-like protocol to do a distributed transfer of the broadcasted data to the executors.
TorrentBroadcastFactory() - Constructor for class org.apache.spark.broadcast.TorrentBroadcastFactory
 
toScalaFunction(Function<T, R>) - Static method in class org.apache.spark.api.java.JavaPairRDD
 
toScalaFunction2(Function2<T1, T2, R>) - Static method in class org.apache.spark.api.java.JavaPairRDD
 
toSchemaRDD() - Method in class org.apache.spark.sql.DataFrame
Left here for backward compatibility.
toSeq() - Method in class org.apache.spark.ml.param.ParamMap
Converts this param map to a sequence of param pairs.
toSparkContext(JavaSparkContext) - Static method in class org.apache.spark.api.java.JavaSparkContext
 
toSparse() - Method in class org.apache.spark.mllib.linalg.DenseMatrix
Generate a SparseMatrix from the given DenseMatrix.
toSplit() - Method in class org.apache.spark.mllib.tree.model.DecisionTreeModel.SaveLoadV1_0$.SplitData
 
toSplitInfo(Class<?>, String, InputSplit) - Static method in class org.apache.spark.scheduler.SplitInfo
 
toSplitInfo(Class<?>, String, InputSplit) - Static method in class org.apache.spark.scheduler.SplitInfo
 
toString() - Method in class org.apache.spark.Accumulable
 
toString() - Method in class org.apache.spark.api.java.JavaRDD
 
toString() - Method in class org.apache.spark.broadcast.Broadcast
 
toString() - Method in class org.apache.spark.graphx.EdgeDirection
 
toString() - Method in class org.apache.spark.graphx.EdgeTriplet
 
toString() - Method in class org.apache.spark.ml.param.Param
 
toString() - Method in class org.apache.spark.ml.param.ParamMap
 
toString() - Method in class org.apache.spark.mllib.evaluation.binary.BinaryLabelCounter
 
toString() - Method in class org.apache.spark.mllib.linalg.DenseVector
 
toString() - Method in interface org.apache.spark.mllib.linalg.Matrix
A human-readable representation of the matrix.
toString() - Method in class org.apache.spark.mllib.linalg.SparseVector
 
toString() - Method in class org.apache.spark.mllib.regression.GeneralizedLinearModel
 
toString() - Method in class org.apache.spark.mllib.regression.LabeledPoint
 
toString() - Method in class org.apache.spark.mllib.stat.test.ChiSqTestResult
 
toString() - Method in interface org.apache.spark.mllib.stat.test.TestResult
String explaining the hypothesis test result.
toString() - Method in class org.apache.spark.mllib.tree.impl.TimeTracker
Print all timing results in seconds.
toString() - Method in class org.apache.spark.mllib.tree.impurity.EntropyCalculator
 
toString() - Method in class org.apache.spark.mllib.tree.impurity.GiniCalculator
 
toString() - Method in class org.apache.spark.mllib.tree.impurity.VarianceCalculator
 
toString() - Method in class org.apache.spark.mllib.tree.model.DecisionTreeModel
Print a summary of the model.
toString() - Method in class org.apache.spark.mllib.tree.model.InformationGainStats
 
toString() - Method in class org.apache.spark.mllib.tree.model.Node
 
toString() - Method in class org.apache.spark.mllib.tree.model.Predict
 
toString() - Method in class org.apache.spark.mllib.tree.model.Split
 
toString() - Method in class org.apache.spark.mllib.tree.model.TreeEnsembleModel
Print a summary of the model.
toString() - Method in class org.apache.spark.partial.BoundedDouble
 
toString() - Method in class org.apache.spark.partial.PartialResult
 
toString() - Method in class org.apache.spark.rdd.RDD
 
toString() - Method in class org.apache.spark.scheduler.ExecutorLossReason
 
toString() - Method in class org.apache.spark.scheduler.HDFSCacheTaskLocation
 
toString() - Method in class org.apache.spark.scheduler.HostTaskLocation
 
toString() - Method in class org.apache.spark.scheduler.InputFormatInfo
 
toString() - Method in class org.apache.spark.scheduler.ResultTask
 
toString() - Method in class org.apache.spark.scheduler.ShuffleMapTask
 
toString() - Method in class org.apache.spark.scheduler.SplitInfo
 
toString() - Method in class org.apache.spark.scheduler.Stage
 
toString() - Method in class org.apache.spark.scheduler.TaskDescription
 
toString() - Method in class org.apache.spark.scheduler.TaskSet
 
toString() - Method in class org.apache.spark.SerializableWritable
 
toString() - Method in class org.apache.spark.sql.Column
 
toString() - Method in class org.apache.spark.sql.columnar.ColumnType
 
toString() - Method in class org.apache.spark.sql.DataFrame
 
toString() - Method in class org.apache.spark.sql.hive.HiveGenericUdaf
 
toString() - Method in class org.apache.spark.sql.hive.HiveGenericUdf
 
toString() - Method in class org.apache.spark.sql.hive.HiveGenericUdtf
 
toString() - Method in class org.apache.spark.sql.hive.HiveSimpleUdf
 
toString() - Method in class org.apache.spark.sql.hive.HiveUdaf
 
toString() - Method in class org.apache.spark.sql.parquet.timestamp.NanoTime
 
toString() - Method in class org.apache.spark.SSLOptions
Returns a string representation of this SSLOptions with all the passwords masked.
toString() - Method in class org.apache.spark.storage.BlockId
 
toString() - Method in class org.apache.spark.storage.BlockManagerId
 
toString() - Method in class org.apache.spark.storage.BlockManagerInfo
 
toString() - Method in class org.apache.spark.storage.FileSegment
 
toString() - Method in class org.apache.spark.storage.RDDInfo
 
toString() - Method in class org.apache.spark.storage.StorageLevel
 
toString() - Method in class org.apache.spark.storage.TachyonFileSegment
 
toString() - Method in class org.apache.spark.streaming.dstream.DStreamCheckpointData
 
toString() - Method in class org.apache.spark.streaming.dstream.FileInputDStream.FileInputDStreamCheckpointData
 
toString() - Method in class org.apache.spark.streaming.Duration
 
toString() - Method in class org.apache.spark.streaming.Interval
 
toString() - Method in class org.apache.spark.streaming.kafka.Broker
 
toString() - Method in class org.apache.spark.streaming.kafka.OffsetRange
 
toString() - Method in class org.apache.spark.streaming.scheduler.Job
 
toString() - Method in class org.apache.spark.streaming.Time
 
toString() - Method in class org.apache.spark.util.MutablePair
 
toString() - Method in class org.apache.spark.util.StatCounter
 
toString() - Method in class org.apache.spark.util.Vector
 
totalCoreCount() - Method in class org.apache.spark.scheduler.cluster.CoarseGrainedSchedulerBackend
 
totalCores() - Method in class org.apache.spark.scheduler.cluster.ExecutorData
 
totalCores() - Method in class org.apache.spark.scheduler.cluster.ExecutorInfo
 
totalCores() - Method in class org.apache.spark.scheduler.local.LocalBackend
 
totalCoresAcquired() - Method in class org.apache.spark.scheduler.cluster.mesos.CoarseMesosSchedulerBackend
 
totalCount() - Method in class org.apache.spark.mllib.evaluation.binary.BinaryConfusionMatrixImpl
 
totalDelay() - Method in class org.apache.spark.streaming.scheduler.BatchInfo
Time taken for all the jobs of this batch to finish processing from the time they were submitted.
totalDelay() - Method in class org.apache.spark.streaming.scheduler.JobSet
 
totalDelayDistribution() - Method in class org.apache.spark.streaming.ui.StreamingJobProgressListener
 
totalDuration() - Method in class org.apache.spark.ui.exec.ExecutorSummaryInfo
 
totalExpectedCores() - Method in class org.apache.spark.scheduler.cluster.SparkDeploySchedulerBackend
 
totalInputBytes() - Method in class org.apache.spark.ui.exec.ExecutorSummaryInfo
 
totalNumNodes() - Method in class org.apache.spark.mllib.tree.model.TreeEnsembleModel
Get total number of nodes, summed over all trees in the forest.
totalRegisteredExecutors() - Method in class org.apache.spark.scheduler.cluster.CoarseGrainedSchedulerBackend
 
totalResultSize() - Method in class org.apache.spark.scheduler.TaskSetManager
 
totalShuffleRead() - Method in class org.apache.spark.ui.exec.ExecutorSummaryInfo
 
totalShuffleWrite() - Method in class org.apache.spark.ui.exec.ExecutorSummaryInfo
 
totalTasks() - Method in class org.apache.spark.partial.ApproximateActionListener
 
totalTasks() - Method in class org.apache.spark.ui.exec.ExecutorSummaryInfo
 
toTuple() - Method in class org.apache.spark.graphx.EdgeTriplet
 
toTuple() - Method in class org.apache.spark.streaming.kafka.OffsetRange
This is to avoid a ClassNotFoundException during checkpoint restore.
toTypeInfo() - Method in class org.apache.spark.sql.hive.HiveInspectors.typeInfoConversions
 
toWeakReference(V) - Static method in class org.apache.spark.util.TimeStampedWeakValueHashMap
 
toWeakReferenceFunction(Function1<Tuple2<K, V>, R>) - Static method in class org.apache.spark.util.TimeStampedWeakValueHashMap
 
toWeakReferenceTuple(Tuple2<K, V>) - Static method in class org.apache.spark.util.TimeStampedWeakValueHashMap
 
trackerActor() - Method in class org.apache.spark.MapOutputTracker
Set to the MapOutputTrackerActor living on the driver.
train(RDD<ALS.Rating<ID>>, int, int, int, int, double, boolean, double, boolean, StorageLevel, StorageLevel, long, ClassTag<ID>, Ordering<ID>) - Static method in class org.apache.spark.ml.recommendation.ALS
Implementation of the ALS algorithm.
train(RDD<LabeledPoint>, int, double, double, Vector) - Static method in class org.apache.spark.mllib.classification.LogisticRegressionWithSGD
Train a logistic regression model given an RDD of (label, features) pairs.
train(RDD<LabeledPoint>, int, double, double) - Static method in class org.apache.spark.mllib.classification.LogisticRegressionWithSGD
Train a logistic regression model given an RDD of (label, features) pairs.
train(RDD<LabeledPoint>, int, double) - Static method in class org.apache.spark.mllib.classification.LogisticRegressionWithSGD
Train a logistic regression model given an RDD of (label, features) pairs.
train(RDD<LabeledPoint>, int) - Static method in class org.apache.spark.mllib.classification.LogisticRegressionWithSGD
Train a logistic regression model given an RDD of (label, features) pairs.
train(RDD<LabeledPoint>) - Static method in class org.apache.spark.mllib.classification.NaiveBayes
Trains a Naive Bayes model given an RDD of (label, features) pairs.
train(RDD<LabeledPoint>, double) - Static method in class org.apache.spark.mllib.classification.NaiveBayes
Trains a Naive Bayes model given an RDD of (label, features) pairs.
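A brief sketch (Scala; "training" is an assumed RDD[LabeledPoint]):
    import org.apache.spark.mllib.classification.NaiveBayes
    // The optional smoothing parameter lambda defaults to 1.0.
    val model = NaiveBayes.train(training, lambda = 1.0)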
train(RDD<LabeledPoint>, int, double, double, double, Vector) - Static method in class org.apache.spark.mllib.classification.SVMWithSGD
Train an SVM model given an RDD of (label, features) pairs.
train(RDD<LabeledPoint>, int, double, double, double) - Static method in class org.apache.spark.mllib.classification.SVMWithSGD
Train an SVM model given an RDD of (label, features) pairs.
train(RDD<LabeledPoint>, int, double, double) - Static method in class org.apache.spark.mllib.classification.SVMWithSGD
Train an SVM model given an RDD of (label, features) pairs.
train(RDD<LabeledPoint>, int) - Static method in class org.apache.spark.mllib.classification.SVMWithSGD
Train an SVM model given an RDD of (label, features) pairs.
train(RDD<Vector>, int, int, int, String, long) - Static method in class org.apache.spark.mllib.clustering.KMeans
Trains a k-means model using the given set of parameters.
train(RDD<Vector>, int, int, int, String) - Static method in class org.apache.spark.mllib.clustering.KMeans
Trains a k-means model using the given set of parameters.
train(RDD<Vector>, int, int) - Static method in class org.apache.spark.mllib.clustering.KMeans
Trains a k-means model using specified parameters and the default values for unspecified.
train(RDD<Vector>, int, int, int) - Static method in class org.apache.spark.mllib.clustering.KMeans
Trains a k-means model using specified parameters and the default values for unspecified.
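A minimal sketch (Scala; "vectors" is an assumed RDD[Vector] of feature vectors):
    import org.apache.spark.mllib.clustering.KMeans
    // Cluster into k = 3 groups with at most 20 iterations.
    val model = KMeans.train(vectors, 3, 20)
    val cost = model.computeCost(vectors)   // within-cluster sum of squared distances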
train(RDD<Rating>, int, int, double, int, long) - Static method in class org.apache.spark.mllib.recommendation.ALS
Train a matrix factorization model given an RDD of ratings given by users to some products, in the form of (userID, productID, rating) pairs.
train(RDD<Rating>, int, int, double, int) - Static method in class org.apache.spark.mllib.recommendation.ALS
Train a matrix factorization model given an RDD of ratings given by users to some products, in the form of (userID, productID, rating) pairs.
train(RDD<Rating>, int, int, double) - Static method in class org.apache.spark.mllib.recommendation.ALS
Train a matrix factorization model given an RDD of ratings given by users to some products, in the form of (userID, productID, rating) pairs.
train(RDD<Rating>, int, int) - Static method in class org.apache.spark.mllib.recommendation.ALS
Train a matrix factorization model given an RDD of ratings given by users to some products, in the form of (userID, productID, rating) pairs.
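A sketch of explicit-feedback ALS (Scala; "ratings" is an assumed RDD[Rating]):
    import org.apache.spark.mllib.recommendation.{ALS, Rating}
    // rank = 10 latent factors, 10 iterations, regularization parameter 0.01.
    val model = ALS.train(ratings, 10, 10, 0.01)
    val score = model.predict(1, 42)   // predicted rating of product 42 by user 1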
train(RDD<LabeledPoint>, int, double, double, double, Vector) - Static method in class org.apache.spark.mllib.regression.LassoWithSGD
Train a Lasso model given an RDD of (label, features) pairs.
train(RDD<LabeledPoint>, int, double, double, double) - Static method in class org.apache.spark.mllib.regression.LassoWithSGD
Train a Lasso model given an RDD of (label, features) pairs.
train(RDD<LabeledPoint>, int, double, double) - Static method in class org.apache.spark.mllib.regression.LassoWithSGD
Train a Lasso model given an RDD of (label, features) pairs.
train(RDD<LabeledPoint>, int) - Static method in class org.apache.spark.mllib.regression.LassoWithSGD
Train a Lasso model given an RDD of (label, features) pairs.
train(RDD<LabeledPoint>, int, double, double, Vector) - Static method in class org.apache.spark.mllib.regression.LinearRegressionWithSGD
Train a Linear Regression model given an RDD of (label, features) pairs.
train(RDD<LabeledPoint>, int, double, double) - Static method in class org.apache.spark.mllib.regression.LinearRegressionWithSGD
Train a LinearRegression model given an RDD of (label, features) pairs.
train(RDD<LabeledPoint>, int, double) - Static method in class org.apache.spark.mllib.regression.LinearRegressionWithSGD
Train a LinearRegression model given an RDD of (label, features) pairs.
train(RDD<LabeledPoint>, int) - Static method in class org.apache.spark.mllib.regression.LinearRegressionWithSGD
Train a LinearRegression model given an RDD of (label, features) pairs.
train(RDD<LabeledPoint>, int, double, double, double, Vector) - Static method in class org.apache.spark.mllib.regression.RidgeRegressionWithSGD
Train a RidgeRegression model given an RDD of (label, features) pairs.
train(RDD<LabeledPoint>, int, double, double, double) - Static method in class org.apache.spark.mllib.regression.RidgeRegressionWithSGD
Train a RidgeRegression model given an RDD of (label, features) pairs.
train(RDD<LabeledPoint>, int, double, double) - Static method in class org.apache.spark.mllib.regression.RidgeRegressionWithSGD
Train a RidgeRegression model given an RDD of (label, features) pairs.
train(RDD<LabeledPoint>, int) - Static method in class org.apache.spark.mllib.regression.RidgeRegressionWithSGD
Train a RidgeRegression model given an RDD of (label, features) pairs.
train(RDD<LabeledPoint>, Strategy) - Static method in class org.apache.spark.mllib.tree.DecisionTree
Method to train a decision tree model.
train(RDD<LabeledPoint>, Enumeration.Value, Impurity, int) - Static method in class org.apache.spark.mllib.tree.DecisionTree
Method to train a decision tree model.
train(RDD<LabeledPoint>, Enumeration.Value, Impurity, int, int) - Static method in class org.apache.spark.mllib.tree.DecisionTree
Method to train a decision tree model.
train(RDD<LabeledPoint>, Enumeration.Value, Impurity, int, int, int, Enumeration.Value, Map<Object, Object>) - Static method in class org.apache.spark.mllib.tree.DecisionTree
Method to train a decision tree model.
train(RDD<LabeledPoint>, BoostingStrategy) - Static method in class org.apache.spark.mllib.tree.GradientBoostedTrees
Method to train a gradient boosting model.
train(JavaRDD<LabeledPoint>, BoostingStrategy) - Static method in class org.apache.spark.mllib.tree.GradientBoostedTrees
Java-friendly API for GradientBoostedTrees$.train(org.apache.spark.rdd.RDD<org.apache.spark.mllib.regression.LabeledPoint>, org.apache.spark.mllib.tree.configuration.BoostingStrategy)
trainClassifier(RDD<LabeledPoint>, int, Map<Object, Object>, String, int, int) - Static method in class org.apache.spark.mllib.tree.DecisionTree
Method to train a decision tree model for binary or multiclass classification.
trainClassifier(JavaRDD<LabeledPoint>, int, Map<Integer, Integer>, String, int, int) - Static method in class org.apache.spark.mllib.tree.DecisionTree
Java-friendly API for DecisionTree$.trainClassifier(org.apache.spark.rdd.RDD<org.apache.spark.mllib.regression.LabeledPoint>, int, scala.collection.immutable.Map<java.lang.Object, java.lang.Object>, java.lang.String, int, int)
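For illustration (Scala; "labeled" is an assumed RDD[LabeledPoint] with 0/1 labels):
    import org.apache.spark.mllib.tree.DecisionTree
    // Two classes, no categorical features, Gini impurity, depth 5, 32 bins.
    val model = DecisionTree.trainClassifier(labeled, 2, Map[Int, Int](), "gini", 5, 32)
    println(model.toDebugString)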
trainClassifier(RDD<LabeledPoint>, Strategy, int, String, int) - Static method in class org.apache.spark.mllib.tree.RandomForest
Method to train a decision tree model for binary or multiclass classification.
trainClassifier(RDD<LabeledPoint>, int, Map<Object, Object>, int, String, String, int, int, int) - Static method in class org.apache.spark.mllib.tree.RandomForest
Method to train a decision tree model for binary or multiclass classification.
trainClassifier(JavaRDD<LabeledPoint>, int, Map<Integer, Integer>, int, String, String, int, int, int) - Static method in class org.apache.spark.mllib.tree.RandomForest
Java-friendly API for RandomForest$.trainClassifier(org.apache.spark.rdd.RDD<org.apache.spark.mllib.regression.LabeledPoint>, org.apache.spark.mllib.tree.configuration.Strategy, int, java.lang.String, int)
trainImplicit(RDD<Rating>, int, int, double, int, double, long) - Static method in class org.apache.spark.mllib.recommendation.ALS
Train a matrix factorization model given an RDD of 'implicit preferences' given by users to some products, in the form of (userID, productID, preference) pairs.
trainImplicit(RDD<Rating>, int, int, double, int, double) - Static method in class org.apache.spark.mllib.recommendation.ALS
Train a matrix factorization model given an RDD of 'implicit preferences' given by users to some products, in the form of (userID, productID, preference) pairs.
trainImplicit(RDD<Rating>, int, int, double, double) - Static method in class org.apache.spark.mllib.recommendation.ALS
Train a matrix factorization model given an RDD of 'implicit preferences' given by users to some products, in the form of (userID, productID, preference) pairs.
trainImplicit(RDD<Rating>, int, int) - Static method in class org.apache.spark.mllib.recommendation.ALS
Train a matrix factorization model given an RDD of 'implicit preferences' given by users to some products, in the form of (userID, productID, preference) pairs.
trainOn(DStream<Vector>) - Method in class org.apache.spark.mllib.clustering.StreamingKMeans
Update the clustering model by training on batches of data from a DStream.
trainOn(DStream<LabeledPoint>) - Method in class org.apache.spark.mllib.regression.StreamingLinearAlgorithm
Update the model by training on batches of data from a DStream.
trainOn(JavaDStream<LabeledPoint>) - Method in class org.apache.spark.mllib.regression.StreamingLinearAlgorithm
Java-friendly version of `trainOn`.
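For context, a minimal Scala sketch of streaming training with one StreamingLinearAlgorithm subclass, StreamingLinearRegressionWithSGD (assumptions: an existing SparkContext sc and a hypothetical directory that receives new files of LabeledPoint-formatted text such as (1.0,[0.5,0.3])):

    import org.apache.spark.mllib.linalg.Vectors
    import org.apache.spark.mllib.regression.{LabeledPoint, StreamingLinearRegressionWithSGD}
    import org.apache.spark.streaming.{Seconds, StreamingContext}

    val ssc = new StreamingContext(sc, Seconds(1))
    val trainingData = ssc.textFileStream("hdfs://namenode/path/to/train").map(LabeledPoint.parse)

    val model = new StreamingLinearRegressionWithSGD()
      .setInitialWeights(Vectors.zeros(2))   // two features in this toy setup

    model.trainOn(trainingData)   // the model is updated on every batch
    ssc.start()
    ssc.awaitTermination()
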
trainRegressor(RDD<LabeledPoint>, Map<Object, Object>, String, int, int) - Static method in class org.apache.spark.mllib.tree.DecisionTree
Method to train a decision tree model for regression.
trainRegressor(JavaRDD<LabeledPoint>, Map<Integer, Integer>, String, int, int) - Static method in class org.apache.spark.mllib.tree.DecisionTree
Java-friendly API for DecisionTree$.trainRegressor(org.apache.spark.rdd.RDD<org.apache.spark.mllib.regression.LabeledPoint>, scala.collection.immutable.Map<java.lang.Object, java.lang.Object>, java.lang.String, int, int)
trainRegressor(RDD<LabeledPoint>, Strategy, int, String, int) - Static method in class org.apache.spark.mllib.tree.RandomForest
Method to train a decision tree model for regression.
trainRegressor(RDD<LabeledPoint>, Map<Object, Object>, int, String, String, int, int, int) - Static method in class org.apache.spark.mllib.tree.RandomForest
Method to train a decision tree model for regression.
trainRegressor(JavaRDD<LabeledPoint>, Map<Integer, Integer>, int, String, String, int, int, int) - Static method in class org.apache.spark.mllib.tree.RandomForest
Java-friendly API for RandomForest$.trainRegressor(org.apache.spark.rdd.RDD<org.apache.spark.mllib.regression.LabeledPoint>, org.apache.spark.mllib.tree.configuration.Strategy, int, java.lang.String, int)
transactions() - Method in class org.apache.spark.mllib.fpm.FPTree
Returns all transactions in an iterator.
transceiver() - Method in class org.apache.spark.streaming.flume.FlumeConnection
 
transform(DataFrame, ParamMap) - Method in class org.apache.spark.ml.classification.ClassificationModel
Transforms dataset by reading from featuresCol and appending new columns as specified by parameters: predicted labels as predictionCol of type Double, and raw predictions (confidences) as rawPredictionCol of type Vector.
transform(DataFrame, ParamMap) - Method in class org.apache.spark.ml.classification.LogisticRegressionModel
 
transform(DataFrame, ParamMap) - Method in class org.apache.spark.ml.classification.ProbabilisticClassificationModel
Transforms dataset by reading from featuresCol and appending new columns as specified by parameters: predicted labels as predictionCol of type Double, raw predictions (confidences) as rawPredictionCol of type Vector, and the probability of each class as probabilityCol of type Vector.
transform(DataFrame, ParamMap) - Method in class org.apache.spark.ml.feature.StandardScalerModel
 
transform(DataFrame, ParamMap) - Method in class org.apache.spark.ml.impl.estimator.PredictionModel
Transforms dataset by reading from featuresCol, calling predict(), and storing the predictions as a new column predictionCol.
transform(DataFrame, ParamMap) - Method in class org.apache.spark.ml.PipelineModel
 
transform(DataFrame, ParamMap) - Method in class org.apache.spark.ml.recommendation.ALSModel
 
transform(DataFrame, ParamPair<?>...) - Method in class org.apache.spark.ml.Transformer
Transforms the dataset with optional parameters.
transform(DataFrame, Seq<ParamPair<?>>) - Method in class org.apache.spark.ml.Transformer
Transforms the dataset with optional parameters.
transform(DataFrame, ParamMap) - Method in class org.apache.spark.ml.Transformer
Transforms the dataset with provided parameter map as additional parameters.
transform(DataFrame, ParamMap) - Method in class org.apache.spark.ml.tuning.CrossValidatorModel
 
transform(DataFrame, ParamMap) - Method in class org.apache.spark.ml.UnaryTransformer
 
transform(Vector) - Method in class org.apache.spark.mllib.feature.ChiSqSelectorModel
Applies transformation on a vector.
transform(Iterable<Object>) - Method in class org.apache.spark.mllib.feature.HashingTF
Transforms the input document into a sparse term frequency vector.
transform(Iterable<?>) - Method in class org.apache.spark.mllib.feature.HashingTF
Transforms the input document into a sparse term frequency vector (Java version).
transform(RDD<D>) - Method in class org.apache.spark.mllib.feature.HashingTF
Transforms the input documents to term frequency vectors.
transform(JavaRDD<D>) - Method in class org.apache.spark.mllib.feature.HashingTF
Transforms the input documents to term frequency vectors (Java version).
transform(RDD<Vector>) - Method in class org.apache.spark.mllib.feature.IDFModel
Transforms term frequency (TF) vectors to TF-IDF vectors.
transform(Vector) - Method in class org.apache.spark.mllib.feature.IDFModel
Transforms a term frequency (TF) vector to a TF-IDF vector.
transform(JavaRDD<Vector>) - Method in class org.apache.spark.mllib.feature.IDFModel
Transforms term frequency (TF) vectors to TF-IDF vectors (Java version).
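A short Scala sketch of the usual TF-IDF flow through HashingTF and IDFModel (assuming an existing SparkContext sc; the documents are toy data):

    import org.apache.spark.mllib.feature.{HashingTF, IDF}
    import org.apache.spark.mllib.linalg.Vector
    import org.apache.spark.rdd.RDD

    val documents: RDD[Seq[String]] = sc.parallelize(Seq(
      Seq("spark", "streaming"), Seq("spark", "sql"), Seq("graphx", "pregel")))

    val tf: RDD[Vector] = new HashingTF().transform(documents)   // term-frequency vectors
    tf.cache()                                                   // IDF makes two passes over the data
    val idfModel = new IDF().fit(tf)                             // learn inverse document frequencies
    val tfidf: RDD[Vector] = idfModel.transform(tf)              // rescale TF to TF-IDF
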
transform(Vector) - Method in class org.apache.spark.mllib.feature.Normalizer
Applies unit length normalization on a vector.
transform(Vector) - Method in class org.apache.spark.mllib.feature.StandardScalerModel
Applies standardization transformation on a vector.
transform(Vector) - Method in interface org.apache.spark.mllib.feature.VectorTransformer
Applies transformation on a vector.
transform(RDD<Vector>) - Method in interface org.apache.spark.mllib.feature.VectorTransformer
Applies transformation on an RDD[Vector].
transform(JavaRDD<Vector>) - Method in interface org.apache.spark.mllib.feature.VectorTransformer
Applies transformation on a JavaRDD[Vector].
transform(String) - Method in class org.apache.spark.mllib.feature.Word2VecModel
Transforms a word to its vector representation.
transform(PartialFunction<ASTNode, ASTNode>) - Method in class org.apache.spark.sql.hive.HiveQl.TransformableNode
Returns a copy of this node where rule has been recursively applied to it and all of its children.
transform(Function<R, JavaRDD<U>>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
Return a new DStream in which each RDD is generated by applying a function on each RDD of 'this' DStream.
transform(Function2<R, Time, JavaRDD<U>>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
Return a new DStream in which each RDD is generated by applying a function on each RDD of 'this' DStream.
transform(List<JavaDStream<?>>, Function2<List<JavaRDD<?>>, Time, JavaRDD<T>>) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
Create a new DStream in which each RDD is generated by applying a function on RDDs of the DStreams.
transform(Function1<RDD<T>, RDD<U>>, ClassTag<U>) - Method in class org.apache.spark.streaming.dstream.DStream
Return a new DStream in which each RDD is generated by applying a function on each RDD of 'this' DStream.
transform(Function2<RDD<T>, Time, RDD<U>>, ClassTag<U>) - Method in class org.apache.spark.streaming.dstream.DStream
Return a new DStream in which each RDD is generated by applying a function on each RDD of 'this' DStream.
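To make the contrast with ordinary DStream operations concrete, a hedged Scala sketch (assuming an existing SparkContext sc and a hypothetical socket source on localhost:9999):

    import org.apache.spark.streaming.{Seconds, StreamingContext}

    val ssc = new StreamingContext(sc, Seconds(5))
    val lines = ssc.socketTextStream("localhost", 9999)

    // transform exposes the underlying RDD of each batch, so any RDD operation
    // (here filter + sortBy) can be applied even if DStream has no direct equivalent.
    val cleaned = lines.transform { rdd =>
      rdd.filter(_.nonEmpty).sortBy(identity)
    }
    cleaned.print()
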
transform(Seq<DStream<?>>, Function2<Seq<RDD<?>>, Time, RDD<T>>, ClassTag<T>) - Method in class org.apache.spark.streaming.StreamingContext
Create a new DStream in which each RDD is generated by applying a function on RDDs of the DStreams.
transformColumnsImpl(DataFrame, ClassificationModel<FeaturesType, ?>, ParamMap) - Static method in class org.apache.spark.ml.classification.ClassificationModel
Adds prediction column(s).
TransformedDStream<U> - Class in org.apache.spark.streaming.dstream
 
TransformedDStream(Seq<DStream<?>>, Function2<Seq<RDD<?>>, Time, RDD<U>>, ClassTag<U>) - Constructor for class org.apache.spark.streaming.dstream.TransformedDStream
 
Transformer - Class in org.apache.spark.ml
:: AlphaComponent :: Abstract class for transformers that transform one dataset into another.
Transformer() - Constructor for class org.apache.spark.ml.Transformer
 
transformSchema(StructType, ParamMap) - Method in class org.apache.spark.ml.feature.StandardScaler
 
transformSchema(StructType, ParamMap) - Method in class org.apache.spark.ml.feature.StandardScalerModel
 
transformSchema(StructType, ParamMap) - Method in class org.apache.spark.ml.impl.estimator.PredictionModel
 
transformSchema(StructType, ParamMap) - Method in class org.apache.spark.ml.impl.estimator.Predictor
 
transformSchema(StructType, ParamMap) - Method in class org.apache.spark.ml.Pipeline
 
transformSchema(StructType, ParamMap) - Method in class org.apache.spark.ml.PipelineModel
 
transformSchema(StructType, ParamMap) - Method in class org.apache.spark.ml.PipelineStage
:: DeveloperApi ::
transformSchema(StructType, ParamMap) - Method in class org.apache.spark.ml.recommendation.ALS
 
transformSchema(StructType, ParamMap) - Method in class org.apache.spark.ml.recommendation.ALSModel
 
transformSchema(StructType, ParamMap) - Method in class org.apache.spark.ml.tuning.CrossValidator
 
transformSchema(StructType, ParamMap) - Method in class org.apache.spark.ml.tuning.CrossValidatorModel
 
transformSchema(StructType, ParamMap) - Method in class org.apache.spark.ml.UnaryTransformer
 
transformToPair(Function<R, JavaPairRDD<K2, V2>>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
Return a new DStream in which each RDD is generated by applying a function on each RDD of 'this' DStream.
transformToPair(Function2<R, Time, JavaPairRDD<K2, V2>>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
Return a new DStream in which each RDD is generated by applying a function on each RDD of 'this' DStream.
transformToPair(List<JavaDStream<?>>, Function2<List<JavaRDD<?>>, Time, JavaPairRDD<K, V>>) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
Create a new DStream in which each RDD is generated by applying a function on RDDs of the DStreams.
transformWith(JavaDStream<U>, Function3<R, JavaRDD<U>, Time, JavaRDD<W>>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
Return a new DStream in which each RDD is generated by applying a function on each RDD of 'this' DStream and 'other' DStream.
transformWith(JavaPairDStream<K2, V2>, Function3<R, JavaPairRDD<K2, V2>, Time, JavaRDD<W>>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
Return a new DStream in which each RDD is generated by applying a function on each RDD of 'this' DStream and 'other' DStream.
transformWith(DStream<U>, Function2<RDD<T>, RDD<U>, RDD<V>>, ClassTag<U>, ClassTag<V>) - Method in class org.apache.spark.streaming.dstream.DStream
Return a new DStream in which each RDD is generated by applying a function on each RDD of 'this' DStream and 'other' DStream.
transformWith(DStream<U>, Function3<RDD<T>, RDD<U>, Time, RDD<V>>, ClassTag<U>, ClassTag<V>) - Method in class org.apache.spark.streaming.dstream.DStream
Return a new DStream in which each RDD is generated by applying a function on each RDD of 'this' DStream and 'other' DStream.
transformWithToPair(JavaDStream<U>, Function3<R, JavaRDD<U>, Time, JavaPairRDD<K2, V2>>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
Return a new DStream in which each RDD is generated by applying a function on each RDD of 'this' DStream and 'other' DStream.
transformWithToPair(JavaPairDStream<K2, V2>, Function3<R, JavaPairRDD<K2, V2>, Time, JavaPairRDD<K3, V3>>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
Return a new DStream in which each RDD is generated by applying a function on each RDD of 'this' DStream and 'other' DStream.
translateConfKey(String, boolean) - Static method in class org.apache.spark.SparkConf
Translate the configuration key if it is deprecated and has a replacement; otherwise just return the provided key.
transpose() - Method in class org.apache.spark.mllib.linalg.DenseMatrix
 
transpose() - Method in class org.apache.spark.mllib.linalg.distributed.BlockMatrix
Transpose this BlockMatrix.
transpose() - Method in class org.apache.spark.mllib.linalg.distributed.CoordinateMatrix
Transposes this CoordinateMatrix.
transpose() - Method in interface org.apache.spark.mllib.linalg.Matrix
Transpose the Matrix.
transpose() - Method in class org.apache.spark.mllib.linalg.SparseMatrix
 
treeAggregate(U, Function2<U, T, U>, Function2<U, U, U>, int) - Method in interface org.apache.spark.api.java.JavaRDDLike
Aggregates the elements of this RDD in a multi-level tree pattern.
treeAggregate(U, Function2<U, T, U>, Function2<U, U, U>) - Method in interface org.apache.spark.api.java.JavaRDDLike
treeAggregate(U, Function2<U, T, U>, Function2<U, U, U>, int, ClassTag<U>) - Method in class org.apache.spark.mllib.rdd.RDDFunctions
treeAggregate(U, Function2<U, T, U>, Function2<U, U, U>, int, ClassTag<U>) - Method in class org.apache.spark.rdd.RDD
Aggregates the elements of this RDD in a multi-level tree pattern.
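A small Scala sketch of the tree-pattern aggregation (assuming an existing SparkContext sc; the numbers are arbitrary):

    val data = sc.parallelize(1 to 1000000, 100)

    // Sum of squares. With depth = 2 the per-partition results are first combined
    // on a subset of executors before reaching the driver, instead of all 100
    // partition results being sent to the driver at once.
    val sumOfSquares = data.treeAggregate(0.0)(
      (acc, x) => acc + x.toDouble * x,
      (a, b) => a + b,
      2)
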
treeAlgo() - Method in class org.apache.spark.mllib.tree.model.TreeEnsembleModel.SaveLoadV1_0$.Metadata
 
TreeEnsembleModel - Class in org.apache.spark.mllib.tree.model
Represents a tree ensemble model.
TreeEnsembleModel(Enumeration.Value, DecisionTreeModel[], double[], Enumeration.Value) - Constructor for class org.apache.spark.mllib.tree.model.TreeEnsembleModel
 
TreeEnsembleModel.SaveLoadV1_0$ - Class in org.apache.spark.mllib.tree.model
 
TreeEnsembleModel.SaveLoadV1_0$() - Constructor for class org.apache.spark.mllib.tree.model.TreeEnsembleModel.SaveLoadV1_0$
 
TreeEnsembleModel.SaveLoadV1_0$.EnsembleNodeData - Class in org.apache.spark.mllib.tree.model
Model data for model import/export.
TreeEnsembleModel.SaveLoadV1_0$.EnsembleNodeData(int, org.apache.spark.mllib.tree.model.DecisionTreeModel.SaveLoadV1_0.NodeData) - Constructor for class org.apache.spark.mllib.tree.model.TreeEnsembleModel.SaveLoadV1_0$.EnsembleNodeData
 
TreeEnsembleModel.SaveLoadV1_0$.Metadata - Class in org.apache.spark.mllib.tree.model
 
TreeEnsembleModel.SaveLoadV1_0$.Metadata(String, String, String, double[]) - Constructor for class org.apache.spark.mllib.tree.model.TreeEnsembleModel.SaveLoadV1_0$.Metadata
 
treeId() - Method in class org.apache.spark.mllib.tree.model.DecisionTreeModel.SaveLoadV1_0$.NodeData
 
treeId() - Method in class org.apache.spark.mllib.tree.model.TreeEnsembleModel.SaveLoadV1_0$.EnsembleNodeData
 
TreePoint - Class in org.apache.spark.mllib.tree.impl
Internal representation of LabeledPoint for DecisionTree.
TreePoint(double, int[]) - Constructor for class org.apache.spark.mllib.tree.impl.TreePoint
 
treeReduce(Function2<T, T, T>, int) - Method in interface org.apache.spark.api.java.JavaRDDLike
Reduces the elements of this RDD in a multi-level tree pattern.
treeReduce(Function2<T, T, T>) - Method in interface org.apache.spark.api.java.JavaRDDLike
treeReduce(Function2<T, T, T>, int) - Method in class org.apache.spark.mllib.rdd.RDDFunctions
treeReduce(Function2<T, T, T>, int) - Method in class org.apache.spark.rdd.RDD
Reduces the elements of this RDD in a multi-level tree pattern.
trees() - Method in class org.apache.spark.mllib.tree.model.GradientBoostedTreesModel
 
trees() - Method in class org.apache.spark.mllib.tree.model.RandomForestModel
 
treeStrategy() - Method in class org.apache.spark.mllib.tree.configuration.BoostingStrategy
 
treeWeights() - Method in class org.apache.spark.mllib.tree.model.GradientBoostedTreesModel
 
treeWeights() - Method in class org.apache.spark.mllib.tree.model.TreeEnsembleModel.SaveLoadV1_0$.Metadata
 
triangleCount() - Method in class org.apache.spark.graphx.GraphOps
Compute the number of triangles passing through each vertex.
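For illustration, a minimal GraphX sketch (assuming an existing SparkContext sc; edges are given in canonical direction, srcId < dstId, and the graph is partitioned before counting):

    import org.apache.spark.graphx.{Edge, Graph, PartitionStrategy}

    val edges = sc.parallelize(Seq(
      Edge(1L, 2L, 1), Edge(2L, 3L, 1), Edge(1L, 3L, 1), Edge(3L, 4L, 1)))
    val graph = Graph.fromEdges(edges, 0).partitionBy(PartitionStrategy.RandomVertexCut)

    // Vertices 1, 2 and 3 each participate in one triangle; vertex 4 in none.
    val triangles = graph.triangleCount().vertices
    triangles.collect().foreach { case (id, count) => println(s"$id: $count") }
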
TriangleCount - Class in org.apache.spark.graphx.lib
Compute the number of triangles passing through each vertex.
TriangleCount() - Constructor for class org.apache.spark.graphx.lib.TriangleCount
 
triK() - Method in class org.apache.spark.ml.recommendation.ALS.NormalEquation
Number of entries in the upper triangular part of a k-by-k matrix.
TripletFields - Class in org.apache.spark.graphx
Represents a subset of the fields of an EdgeTriplet or EdgeContext.
TripletFields() - Constructor for class org.apache.spark.graphx.TripletFields
Constructs a default TripletFields in which all fields are included.
TripletFields(boolean, boolean, boolean) - Constructor for class org.apache.spark.graphx.TripletFields
 
tripletIterator(boolean, boolean) - Method in class org.apache.spark.graphx.impl.EdgePartition
Get an iterator over the edge triplets in this partition.
triplets() - Method in class org.apache.spark.graphx.Graph
An RDD containing the edge triplets, which are edges along with the vertex data associated with the adjacent vertices.
triplets() - Method in class org.apache.spark.graphx.impl.GraphImpl
Return an RDD that brings edges together with their source and destination vertices.
TRUE() - Static method in class org.apache.spark.sql.hive.HiveQl
 
truePositiveRate(double) - Method in class org.apache.spark.mllib.evaluation.MulticlassMetrics
Returns the true positive rate for a given label (category).
trustStore() - Method in class org.apache.spark.SSLOptions
 
trustStorePassword() - Method in class org.apache.spark.SSLOptions
 
tryLog(Function0<T>) - Static method in class org.apache.spark.util.Utils
Executes the given block in a Try, logging any uncaught exceptions.
tryOrExit(Function0<BoxedUnit>) - Static method in class org.apache.spark.util.Utils
Execute a block of code that evaluates to Unit, forwarding any uncaught exceptions to the default UncaughtExceptionHandler.
tryOrIOException(Function0<BoxedUnit>) - Static method in class org.apache.spark.util.Utils
Execute a block of code that evaluates to Unit, re-throwing any non-fatal uncaught exceptions as IOException.
tryOrIOException(Function0<T>) - Static method in class org.apache.spark.util.Utils
Execute a block of code that returns a value, re-throwing any non-fatal uncaught exceptions as IOException.
tryUncacheQuery(DataFrame, boolean) - Method in class org.apache.spark.sql.CacheManager
Tries to remove the data for the given DataFrame from the cache if it's cached.
TwitterInputDStream - Class in org.apache.spark.streaming.twitter
 
TwitterInputDStream(StreamingContext, Option<Authorization>, Seq<String>, StorageLevel) - Constructor for class org.apache.spark.streaming.twitter.TwitterInputDStream
 
TwitterReceiver - Class in org.apache.spark.streaming.twitter
 
TwitterReceiver(Authorization, Seq<String>, StorageLevel) - Constructor for class org.apache.spark.streaming.twitter.TwitterReceiver
 
TwitterUtils - Class in org.apache.spark.streaming.twitter
 
TwitterUtils() - Constructor for class org.apache.spark.streaming.twitter.TwitterUtils
 
typ() - Method in class org.apache.spark.streaming.scheduler.RegisterReceiver
 
typeId() - Method in class org.apache.spark.sql.columnar.ColumnType
 
typeId() - Static method in class org.apache.spark.sql.columnar.compression.BooleanBitSet
 
typeId() - Method in interface org.apache.spark.sql.columnar.compression.CompressionScheme
 
typeId() - Static method in class org.apache.spark.sql.columnar.compression.DictionaryEncoding
 
typeId() - Static method in class org.apache.spark.sql.columnar.compression.IntDelta
 
typeId() - Static method in class org.apache.spark.sql.columnar.compression.LongDelta
 
typeId() - Static method in class org.apache.spark.sql.columnar.compression.PassThrough
 
typeId() - Static method in class org.apache.spark.sql.columnar.compression.RunLengthEncoding
 

U

U() - Method in class org.apache.spark.mllib.linalg.SingularValueDecomposition
 
udf(Function0<RT>, TypeTags.TypeTag<RT>) - Static method in class org.apache.spark.sql.functions
Defines a user-defined function of 0 arguments as user-defined function (UDF).
udf(Function1<A1, RT>, TypeTags.TypeTag<RT>, TypeTags.TypeTag<A1>) - Static method in class org.apache.spark.sql.functions
Defines a user-defined function of 1 argument as user-defined function (UDF).
udf(Function2<A1, A2, RT>, TypeTags.TypeTag<RT>, TypeTags.TypeTag<A1>, TypeTags.TypeTag<A2>) - Static method in class org.apache.spark.sql.functions
Defines a user-defined function of 2 arguments as user-defined function (UDF).
udf(Function3<A1, A2, A3, RT>, TypeTags.TypeTag<RT>, TypeTags.TypeTag<A1>, TypeTags.TypeTag<A2>, TypeTags.TypeTag<A3>) - Static method in class org.apache.spark.sql.functions
Defines a user-defined function of 3 arguments as user-defined function (UDF).
udf(Function4<A1, A2, A3, A4, RT>, TypeTags.TypeTag<RT>, TypeTags.TypeTag<A1>, TypeTags.TypeTag<A2>, TypeTags.TypeTag<A3>, TypeTags.TypeTag<A4>) - Static method in class org.apache.spark.sql.functions
Defines a user-defined function of 4 arguments as user-defined function (UDF).
udf(Function5<A1, A2, A3, A4, A5, RT>, TypeTags.TypeTag<RT>, TypeTags.TypeTag<A1>, TypeTags.TypeTag<A2>, TypeTags.TypeTag<A3>, TypeTags.TypeTag<A4>, TypeTags.TypeTag<A5>) - Static method in class org.apache.spark.sql.functions
Defines a user-defined function of 5 arguments as user-defined function (UDF).
udf(Function6<A1, A2, A3, A4, A5, A6, RT>, TypeTags.TypeTag<RT>, TypeTags.TypeTag<A1>, TypeTags.TypeTag<A2>, TypeTags.TypeTag<A3>, TypeTags.TypeTag<A4>, TypeTags.TypeTag<A5>, TypeTags.TypeTag<A6>) - Static method in class org.apache.spark.sql.functions
Defines a user-defined function of 6 arguments as user-defined function (UDF).
udf(Function7<A1, A2, A3, A4, A5, A6, A7, RT>, TypeTags.TypeTag<RT>, TypeTags.TypeTag<A1>, TypeTags.TypeTag<A2>, TypeTags.TypeTag<A3>, TypeTags.TypeTag<A4>, TypeTags.TypeTag<A5>, TypeTags.TypeTag<A6>, TypeTags.TypeTag<A7>) - Static method in class org.apache.spark.sql.functions
Defines a user-defined function of 7 arguments as user-defined function (UDF).
udf(Function8<A1, A2, A3, A4, A5, A6, A7, A8, RT>, TypeTags.TypeTag<RT>, TypeTags.TypeTag<A1>, TypeTags.TypeTag<A2>, TypeTags.TypeTag<A3>, TypeTags.TypeTag<A4>, TypeTags.TypeTag<A5>, TypeTags.TypeTag<A6>, TypeTags.TypeTag<A7>, TypeTags.TypeTag<A8>) - Static method in class org.apache.spark.sql.functions
Defines a user-defined function of 8 arguments as user-defined function (UDF).
udf(Function9<A1, A2, A3, A4, A5, A6, A7, A8, A9, RT>, TypeTags.TypeTag<RT>, TypeTags.TypeTag<A1>, TypeTags.TypeTag<A2>, TypeTags.TypeTag<A3>, TypeTags.TypeTag<A4>, TypeTags.TypeTag<A5>, TypeTags.TypeTag<A6>, TypeTags.TypeTag<A7>, TypeTags.TypeTag<A8>, TypeTags.TypeTag<A9>) - Static method in class org.apache.spark.sql.functions
Defines a user-defined function of 9 arguments as user-defined function (UDF).
udf(Function10<A1, A2, A3, A4, A5, A6, A7, A8, A9, A10, RT>, TypeTags.TypeTag<RT>, TypeTags.TypeTag<A1>, TypeTags.TypeTag<A2>, TypeTags.TypeTag<A3>, TypeTags.TypeTag<A4>, TypeTags.TypeTag<A5>, TypeTags.TypeTag<A6>, TypeTags.TypeTag<A7>, TypeTags.TypeTag<A8>, TypeTags.TypeTag<A9>, TypeTags.TypeTag<A10>) - Static method in class org.apache.spark.sql.functions
Defines a user-defined function of 10 arguments as user-defined function (UDF).
udf() - Method in class org.apache.spark.sql.SQLContext
A collection of methods for registering user-defined functions (UDF).
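A hedged Scala sketch of defining and applying a UDF as a column expression (assuming an existing SparkContext sc; the DataFrame contents are made up):

    import org.apache.spark.sql.SQLContext
    import org.apache.spark.sql.functions.{col, udf}

    val sqlContext = new SQLContext(sc)
    import sqlContext.implicits._

    val df = sc.parallelize(Seq(("alice", 3), ("bob", 5))).toDF("name", "count")

    // Wrap an ordinary Scala function; the result type is captured via its TypeTag.
    val capitalize = udf((s: String) => s.capitalize)
    df.select(capitalize(col("name")), col("count")).show()
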
UDF1<T1,R> - Interface in org.apache.spark.sql.api.java
A Spark SQL UDF that has 1 argument.
UDF10<T1,T2,T3,T4,T5,T6,T7,T8,T9,T10,R> - Interface in org.apache.spark.sql.api.java
A Spark SQL UDF that has 10 arguments.
UDF11<T1,T2,T3,T4,T5,T6,T7,T8,T9,T10,T11,R> - Interface in org.apache.spark.sql.api.java
A Spark SQL UDF that has 11 arguments.
UDF12<T1,T2,T3,T4,T5,T6,T7,T8,T9,T10,T11,T12,R> - Interface in org.apache.spark.sql.api.java
A Spark SQL UDF that has 12 arguments.
UDF13<T1,T2,T3,T4,T5,T6,T7,T8,T9,T10,T11,T12,T13,R> - Interface in org.apache.spark.sql.api.java
A Spark SQL UDF that has 13 arguments.
UDF14<T1,T2,T3,T4,T5,T6,T7,T8,T9,T10,T11,T12,T13,T14,R> - Interface in org.apache.spark.sql.api.java
A Spark SQL UDF that has 14 arguments.
UDF15<T1,T2,T3,T4,T5,T6,T7,T8,T9,T10,T11,T12,T13,T14,T15,R> - Interface in org.apache.spark.sql.api.java
A Spark SQL UDF that has 15 arguments.
UDF16<T1,T2,T3,T4,T5,T6,T7,T8,T9,T10,T11,T12,T13,T14,T15,T16,R> - Interface in org.apache.spark.sql.api.java
A Spark SQL UDF that has 16 arguments.
UDF17<T1,T2,T3,T4,T5,T6,T7,T8,T9,T10,T11,T12,T13,T14,T15,T16,T17,R> - Interface in org.apache.spark.sql.api.java
A Spark SQL UDF that has 17 arguments.
UDF18<T1,T2,T3,T4,T5,T6,T7,T8,T9,T10,T11,T12,T13,T14,T15,T16,T17,T18,R> - Interface in org.apache.spark.sql.api.java
A Spark SQL UDF that has 18 arguments.
UDF19<T1,T2,T3,T4,T5,T6,T7,T8,T9,T10,T11,T12,T13,T14,T15,T16,T17,T18,T19,R> - Interface in org.apache.spark.sql.api.java
A Spark SQL UDF that has 19 arguments.
UDF2<T1,T2,R> - Interface in org.apache.spark.sql.api.java
A Spark SQL UDF that has 2 arguments.
UDF20<T1,T2,T3,T4,T5,T6,T7,T8,T9,T10,T11,T12,T13,T14,T15,T16,T17,T18,T19,T20,R> - Interface in org.apache.spark.sql.api.java
A Spark SQL UDF that has 20 arguments.
UDF21<T1,T2,T3,T4,T5,T6,T7,T8,T9,T10,T11,T12,T13,T14,T15,T16,T17,T18,T19,T20,T21,R> - Interface in org.apache.spark.sql.api.java
A Spark SQL UDF that has 21 arguments.
UDF22<T1,T2,T3,T4,T5,T6,T7,T8,T9,T10,T11,T12,T13,T14,T15,T16,T17,T18,T19,T20,T21,T22,R> - Interface in org.apache.spark.sql.api.java
A Spark SQL UDF that has 22 arguments.
UDF3<T1,T2,T3,R> - Interface in org.apache.spark.sql.api.java
A Spark SQL UDF that has 3 arguments.
UDF4<T1,T2,T3,T4,R> - Interface in org.apache.spark.sql.api.java
A Spark SQL UDF that has 4 arguments.
UDF5<T1,T2,T3,T4,T5,R> - Interface in org.apache.spark.sql.api.java
A Spark SQL UDF that has 5 arguments.
UDF6<T1,T2,T3,T4,T5,T6,R> - Interface in org.apache.spark.sql.api.java
A Spark SQL UDF that has 6 arguments.
UDF7<T1,T2,T3,T4,T5,T6,T7,R> - Interface in org.apache.spark.sql.api.java
A Spark SQL UDF that has 7 arguments.
UDF8<T1,T2,T3,T4,T5,T6,T7,T8,R> - Interface in org.apache.spark.sql.api.java
A Spark SQL UDF that has 8 arguments.
UDF9<T1,T2,T3,T4,T5,T6,T7,T8,T9,R> - Interface in org.apache.spark.sql.api.java
A Spark SQL UDF that has 9 arguments.
UDFRegistration - Class in org.apache.spark.sql
Functions for registering user-defined functions.
UDFRegistration(SQLContext) - Constructor for class org.apache.spark.sql.UDFRegistration
 
ui() - Method in class org.apache.spark.SparkContext
 
uid() - Method in interface org.apache.spark.ml.Identifiable
A unique id for the object.
UIData - Class in org.apache.spark.ui.jobs
 
UIData() - Constructor for class org.apache.spark.ui.jobs.UIData
 
UIData.ExecutorSummary - Class in org.apache.spark.ui.jobs
 
UIData.ExecutorSummary() - Constructor for class org.apache.spark.ui.jobs.UIData.ExecutorSummary
 
UIData.JobUIData - Class in org.apache.spark.ui.jobs
 
UIData.JobUIData(int, Option<Object>, Option<Object>, Seq<Object>, Option<String>, JobExecutionStatus, int, int, int, int, int, int, OpenHashSet<Object>, int, int) - Constructor for class org.apache.spark.ui.jobs.UIData.JobUIData
 
UIData.JobUIData$ - Class in org.apache.spark.ui.jobs
 
UIData.JobUIData$() - Constructor for class org.apache.spark.ui.jobs.UIData.JobUIData$
 
UIData.StageUIData - Class in org.apache.spark.ui.jobs
 
UIData.StageUIData() - Constructor for class org.apache.spark.ui.jobs.UIData.StageUIData
 
UIData.TaskUIData - Class in org.apache.spark.ui.jobs
These are kept mutable and reused throughout a task's lifetime to avoid excessive reallocation.
UIData.TaskUIData(TaskInfo, Option<TaskMetrics>, Option<String>) - Constructor for class org.apache.spark.ui.jobs.UIData.TaskUIData
 
UIData.TaskUIData$ - Class in org.apache.spark.ui.jobs
 
UIData.TaskUIData$() - Constructor for class org.apache.spark.ui.jobs.UIData.TaskUIData$
 
uiRoot() - Static method in class org.apache.spark.ui.UIUtils
 
uiTab() - Method in class org.apache.spark.streaming.StreamingContext
 
UIUtils - Class in org.apache.spark.ui
Utility functions for generating XML pages with spark content.
UIUtils() - Constructor for class org.apache.spark.ui.UIUtils
 
UIWorkloadGenerator - Class in org.apache.spark.ui
Continuously generates jobs that expose various features of the WebUI (internal testing tool).
UIWorkloadGenerator() - Constructor for class org.apache.spark.ui.UIWorkloadGenerator
 
unapply(DenseVector) - Static method in class org.apache.spark.mllib.linalg.DenseVector
Extracts the value array from a dense vector.
unapply(SparseVector) - Static method in class org.apache.spark.mllib.linalg.SparseVector
 
unapply(Column) - Static method in class org.apache.spark.sql.Column
 
unapply(Object) - Method in class org.apache.spark.sql.hive.HiveQl.Token$
 
unapply(Broker) - Static method in class org.apache.spark.streaming.kafka.Broker
 
unapply(String) - Static method in class org.apache.spark.util.IntParam
 
unapply(String) - Static method in class org.apache.spark.util.MemoryParam
 
UnaryTransformer<IN,OUT,T extends UnaryTransformer<IN,OUT,T>> - Class in org.apache.spark.ml
Abstract class for transformers that take one input column, apply transformation, and output the result as a new column.
UnaryTransformer() - Constructor for class org.apache.spark.ml.UnaryTransformer
 
unBlockifyObject(ByteBuffer[], Serializer, Option<CompressionCodec>, ClassTag<T>) - Static method in class org.apache.spark.broadcast.TorrentBroadcast
 
unbroadcast(long, boolean, boolean) - Method in interface org.apache.spark.broadcast.BroadcastFactory
 
unbroadcast(long, boolean, boolean) - Method in class org.apache.spark.broadcast.BroadcastManager
 
unbroadcast(long, boolean, boolean) - Method in class org.apache.spark.broadcast.HttpBroadcastFactory
Remove all persisted state associated with the HTTP broadcast with the given ID.
unbroadcast(long, boolean, boolean) - Method in class org.apache.spark.broadcast.TorrentBroadcastFactory
Remove all persisted state associated with the torrent broadcast with the given ID.
uncacheQuery(DataFrame, boolean) - Method in class org.apache.spark.sql.CacheManager
Removes the data for the given DataFrame from the cache.
uncacheTable(String) - Method in class org.apache.spark.sql.CacheManager
Removes the specified table from the in-memory cache.
uncacheTable(String) - Method in class org.apache.spark.sql.SQLContext
Removes the specified table from the in-memory cache.
UNCAUGHT_EXCEPTION() - Static method in class org.apache.spark.util.SparkExitCode
The default uncaught exception handler was reached.
UNCAUGHT_EXCEPTION_TWICE() - Static method in class org.apache.spark.util.SparkExitCode
The default uncaught exception handler was called and an exception was encountered while logging the exception.
uncaughtException(Thread, Throwable) - Static method in class org.apache.spark.util.SparkUncaughtExceptionHandler
 
uncaughtException(Throwable) - Static method in class org.apache.spark.util.SparkUncaughtExceptionHandler
 
uncompressedSize() - Method in class org.apache.spark.sql.columnar.compression.BooleanBitSet.Encoder
 
uncompressedSize() - Method in class org.apache.spark.sql.columnar.compression.DictionaryEncoding.Encoder
 
uncompressedSize() - Method in interface org.apache.spark.sql.columnar.compression.Encoder
 
uncompressedSize() - Method in class org.apache.spark.sql.columnar.compression.IntDelta.Encoder
 
uncompressedSize() - Method in class org.apache.spark.sql.columnar.compression.LongDelta.Encoder
 
uncompressedSize() - Method in class org.apache.spark.sql.columnar.compression.PassThrough.Encoder
 
uncompressedSize() - Method in class org.apache.spark.sql.columnar.compression.RunLengthEncoding.Encoder
 
underlyingBuffer() - Method in interface org.apache.spark.sql.columnar.ColumnAccessor
 
underlyingSplit() - Method in class org.apache.spark.scheduler.SplitInfo
 
UniformGenerator - Class in org.apache.spark.mllib.random
:: DeveloperApi :: Generates i.i.d. samples from the uniform distribution U(0.0, 1.0).
UniformGenerator() - Constructor for class org.apache.spark.mllib.random.UniformGenerator
 
uniformJavaRDD(JavaSparkContext, long, int, long) - Static method in class org.apache.spark.mllib.random.RandomRDDs
uniformJavaRDD(JavaSparkContext, long, int) - Static method in class org.apache.spark.mllib.random.RandomRDDs
uniformJavaRDD(JavaSparkContext, long) - Static method in class org.apache.spark.mllib.random.RandomRDDs
RandomRDDs.uniformJavaRDD(org.apache.spark.api.java.JavaSparkContext, long, int, long) with the default number of partitions and the default seed.
uniformJavaVectorRDD(JavaSparkContext, long, int, int, long) - Static method in class org.apache.spark.mllib.random.RandomRDDs
uniformJavaVectorRDD(JavaSparkContext, long, int, int) - Static method in class org.apache.spark.mllib.random.RandomRDDs
uniformJavaVectorRDD(JavaSparkContext, long, int) - Static method in class org.apache.spark.mllib.random.RandomRDDs
uniformRDD(SparkContext, long, int, long) - Static method in class org.apache.spark.mllib.random.RandomRDDs
Generates an RDD comprised of i.i.d. samples from the uniform distribution U(0.0, 1.0).
uniformVectorRDD(SparkContext, long, int, int, long) - Static method in class org.apache.spark.mllib.random.RandomRDDs
Generates an RDD[Vector] with vectors containing i.i.d. samples drawn from the uniform distribution U(0.0, 1.0).
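A brief Scala sketch of generating random data (assuming an existing SparkContext sc; the size, partition count and seed are arbitrary):

    import org.apache.spark.mllib.random.RandomRDDs

    // One million i.i.d. samples from U(0.0, 1.0) in 4 partitions, with a fixed seed.
    val u = RandomRDDs.uniformRDD(sc, 1000000L, 4, 42L)

    // Rescale to the interval [-1.0, 1.0].
    val shifted = u.map(x => 2.0 * x - 1.0)
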
union(JavaDoubleRDD) - Method in class org.apache.spark.api.java.JavaDoubleRDD
Return the union of this RDD and another one.
union(JavaPairRDD<K, V>) - Method in class org.apache.spark.api.java.JavaPairRDD
Return the union of this RDD and another one.
union(JavaRDD<T>) - Method in class org.apache.spark.api.java.JavaRDD
Return the union of this RDD and another one.
union(JavaRDD<T>, List<JavaRDD<T>>) - Method in class org.apache.spark.api.java.JavaSparkContext
Build the union of two or more RDDs.
union(JavaPairRDD<K, V>, List<JavaPairRDD<K, V>>) - Method in class org.apache.spark.api.java.JavaSparkContext
Build the union of two or more RDDs.
union(JavaDoubleRDD, List<JavaDoubleRDD>) - Method in class org.apache.spark.api.java.JavaSparkContext
Build the union of two or more RDDs.
union(RDD<T>) - Method in class org.apache.spark.rdd.RDD
Return the union of this RDD and another one.
union(Seq<RDD<T>>, ClassTag<T>) - Method in class org.apache.spark.SparkContext
Build the union of a list of RDDs.
union(RDD<T>, Seq<RDD<T>>, ClassTag<T>) - Method in class org.apache.spark.SparkContext
Build the union of a list of RDDs passed as variable-length arguments.
union(JavaDStream<T>) - Method in class org.apache.spark.streaming.api.java.JavaDStream
Return a new DStream by unifying data of another DStream with this DStream.
union(JavaPairDStream<K, V>) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
Return a new DStream by unifying data of another DStream with this DStream.
union(JavaDStream<T>, List<JavaDStream<T>>) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
Create a unified DStream from multiple DStreams of the same type and same slide duration.
union(JavaPairDStream<K, V>, List<JavaPairDStream<K, V>>) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
Create a unified DStream from multiple DStreams of the same type and same slide duration.
union(DStream<T>) - Method in class org.apache.spark.streaming.dstream.DStream
Return a new DStream by unifying data of another DStream with this DStream.
union(Seq<DStream<T>>, ClassTag<T>) - Method in class org.apache.spark.streaming.StreamingContext
Create a unified DStream from multiple DStreams of the same type and same slide duration.
unionAll(DataFrame) - Method in class org.apache.spark.sql.DataFrame
Returns a new DataFrame containing union of rows in this frame and another frame.
UnionDStream<T> - Class in org.apache.spark.streaming.dstream
 
UnionDStream(DStream<T>[], ClassTag<T>) - Constructor for class org.apache.spark.streaming.dstream.UnionDStream
 
UnionPartition<T> - Class in org.apache.spark.rdd
Partition for UnionRDD.
UnionPartition(int, RDD<T>, int, int, ClassTag<T>) - Constructor for class org.apache.spark.rdd.UnionPartition
 
UnionRDD<T> - Class in org.apache.spark.rdd
 
UnionRDD(SparkContext, Seq<RDD<T>>, ClassTag<T>) - Constructor for class org.apache.spark.rdd.UnionRDD
 
uniqueId() - Method in class org.apache.spark.storage.StreamBlockId
 
UnknownReason - Class in org.apache.spark
:: DeveloperApi :: We don't know why the task ended -- for example, because of a ClassNotFound exception when deserializing the task result.
UnknownReason() - Constructor for class org.apache.spark.UnknownReason
 
unorderedFeatures() - Method in class org.apache.spark.mllib.tree.impl.DecisionTreeMetadata
 
unpersist() - Method in class org.apache.spark.api.java.JavaDoubleRDD
Mark the RDD as non-persistent, and remove all blocks for it from memory and disk.
unpersist(boolean) - Method in class org.apache.spark.api.java.JavaDoubleRDD
Mark the RDD as non-persistent, and remove all blocks for it from memory and disk.
unpersist() - Method in class org.apache.spark.api.java.JavaPairRDD
Mark the RDD as non-persistent, and remove all blocks for it from memory and disk.
unpersist(boolean) - Method in class org.apache.spark.api.java.JavaPairRDD
Mark the RDD as non-persistent, and remove all blocks for it from memory and disk.
unpersist() - Method in class org.apache.spark.api.java.JavaRDD
Mark the RDD as non-persistent, and remove all blocks for it from memory and disk.
unpersist(boolean) - Method in class org.apache.spark.api.java.JavaRDD
Mark the RDD as non-persistent, and remove all blocks for it from memory and disk.
unpersist() - Method in class org.apache.spark.broadcast.Broadcast
Asynchronously delete cached copies of this broadcast on the executors.
unpersist(boolean) - Method in class org.apache.spark.broadcast.Broadcast
Delete cached copies of this broadcast on the executors.
unpersist(long, boolean, boolean) - Static method in class org.apache.spark.broadcast.HttpBroadcast
Remove all persisted blocks associated with this HTTP broadcast on the executors.
unpersist(long, boolean, boolean) - Static method in class org.apache.spark.broadcast.TorrentBroadcast
Remove all persisted blocks associated with this torrent broadcast on the executors.
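To show where unpersist fits in a broadcast's lifecycle, a minimal Scala sketch (assuming an existing SparkContext sc; the lookup table is toy data):

    val lookup = sc.broadcast(Map("a" -> 1, "b" -> 2))

    val data = sc.parallelize(Seq("a", "b", "a", "c"))
    val total = data.map(k => lookup.value.getOrElse(k, 0)).reduce(_ + _)

    // Drop the cached copies on the executors once the value is no longer needed;
    // blocking = true waits until the blocks are actually removed.
    lookup.unpersist(blocking = true)
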
unpersist(boolean) - Method in class org.apache.spark.graphx.Graph
Uncaches both vertices and edges of this graph.
unpersist(boolean) - Method in class org.apache.spark.graphx.impl.EdgeRDDImpl
 
unpersist(boolean) - Method in class org.apache.spark.graphx.impl.GraphImpl
 
unpersist(boolean) - Method in class org.apache.spark.graphx.impl.VertexRDDImpl
 
unpersist() - Method in class org.apache.spark.mllib.evaluation.BinaryClassificationMetrics
Unpersist intermediate RDDs used in the computation.
unpersist(boolean) - Method in class org.apache.spark.rdd.RDD
Mark the RDD as non-persistent, and remove all blocks for it from memory and disk.
unpersist(boolean) - Method in class org.apache.spark.sql.DataFrame
 
unpersist() - Method in class org.apache.spark.sql.DataFrame
 
unpersist() - Method in interface org.apache.spark.sql.RDDApi
 
unpersist(boolean) - Method in interface org.apache.spark.sql.RDDApi
 
unpersistRDD(int, boolean) - Method in class org.apache.spark.SparkContext
Unpersist an RDD from memory and/or disk storage.
unpersistRDDFromJson(JsonAST.JValue) - Static method in class org.apache.spark.util.JsonProtocol
 
unpersistRDDToJson(SparkListenerUnpersistRDD) - Static method in class org.apache.spark.util.JsonProtocol
 
unpersistVertices(boolean) - Method in class org.apache.spark.graphx.Graph
Uncaches only the vertices of this graph, leaving the edges alone.
unpersistVertices(boolean) - Method in class org.apache.spark.graphx.impl.GraphImpl
 
unregisterAllTables() - Method in class org.apache.spark.sql.hive.HiveMetastoreCatalog
 
unregisterMapOutput(int, int, BlockManagerId) - Method in class org.apache.spark.MapOutputTrackerMaster
Unregister map output information of the given shuffle, mapper and block manager.
unregisterShuffle(int) - Method in class org.apache.spark.MapOutputTracker
Unregister shuffle data.
unregisterShuffle(int) - Method in class org.apache.spark.MapOutputTrackerMaster
Unregister shuffle data.
unregisterTable(Seq<String>) - Method in class org.apache.spark.sql.hive.HiveMetastoreCatalog
UNIMPLEMENTED: It needs to be decided how we will persist in-memory tables to the metastore.
unrollSafely(BlockId, Iterator<Object>, ArrayBuffer<Tuple2<BlockId, BlockStatus>>) - Method in class org.apache.spark.storage.MemoryStore
Unroll the given block in memory safely.
unset() - Static method in class org.apache.spark.TaskContextHelper
 
unsetConf(String) - Method in class org.apache.spark.sql.SQLConf
 
until(Time, Duration) - Method in class org.apache.spark.streaming.Time
 
untilOffset() - Method in class org.apache.spark.streaming.kafka.KafkaRDDPartition
 
untilOffset() - Method in class org.apache.spark.streaming.kafka.OffsetRange
Exclusive ending offset.
unwrap(Object, ObjectInspector) - Method in interface org.apache.spark.sql.hive.HiveInspectors
Converts hive types to native catalyst types.
update(RDD<Vector>, double, String) - Method in class org.apache.spark.mllib.clustering.StreamingKMeansModel
Perform a k-means update on a batch of data.
update(int, int, double) - Method in class org.apache.spark.mllib.linalg.DenseMatrix
 
update(Function1<Object, Object>) - Method in class org.apache.spark.mllib.linalg.DenseMatrix
 
update(int, int, double) - Method in interface org.apache.spark.mllib.linalg.Matrix
Update element at (i, j).
update(Function1<Object, Object>) - Method in interface org.apache.spark.mllib.linalg.Matrix
Update all the values of this matrix using the function f.
update(int, int, double) - Method in class org.apache.spark.mllib.linalg.SparseMatrix
 
update(Function1<Object, Object>) - Method in class org.apache.spark.mllib.linalg.SparseMatrix
 
update(int, int, double, double) - Method in class org.apache.spark.mllib.tree.impl.DTStatsAggregator
Update the stats for a given (feature, bin) for ordered features, using the given label.
update(double[], int, double, double) - Method in class org.apache.spark.mllib.tree.impurity.EntropyAggregator
Update stats for one (node, feature, bin) with the given label.
update(double[], int, double, double) - Method in class org.apache.spark.mllib.tree.impurity.GiniAggregator
Update stats for one (node, feature, bin) with the given label.
update(double[], int, double, double) - Method in class org.apache.spark.mllib.tree.impurity.ImpurityAggregator
Update stats for one (node, feature, bin) with the given label.
update(double[], int, double, double) - Method in class org.apache.spark.mllib.tree.impurity.VarianceAggregator
Update stats for one (node, feature, bin) with the given label.
update() - Method in class org.apache.spark.scheduler.AccumulableInfo
 
update(Row) - Method in class org.apache.spark.sql.hive.HiveUdafFunction
 
update(Time) - Method in class org.apache.spark.streaming.dstream.DStreamCheckpointData
Updates the checkpoint data of the DStream.
update(Time) - Method in class org.apache.spark.streaming.dstream.FileInputDStream.FileInputDStreamCheckpointData
 
update(Time) - Method in class org.apache.spark.streaming.kafka.DirectKafkaInputDStream.DirectKafkaInputDStreamCheckpointData
 
update(T1, T2) - Method in class org.apache.spark.util.MutablePair
Updates this pair with new values and returns itself.
update(A, B) - Method in class org.apache.spark.util.TimeStampedHashMap
 
update(A, B) - Method in class org.apache.spark.util.TimeStampedWeakValueHashMap
 
UPDATE_PERIOD() - Method in class org.apache.spark.ui.ConsoleProgressBar
 
updateAggregateMetrics(UIData.StageUIData, String, TaskMetrics, Option<TaskMetrics>) - Method in class org.apache.spark.ui.jobs.JobProgressListener
Upon receiving new metrics for a task, updates the per-stage and per-executor-per-stage aggregate metrics by calculating deltas between the currently recorded metrics and the new metrics.
updateBlock(BlockId, BlockStatus) - Method in class org.apache.spark.storage.StorageStatus
Update the given block in this storage status.
updateBlockInfo(BlockId, StorageLevel, long, long, long) - Method in class org.apache.spark.storage.BlockManagerInfo
 
updateBlockInfo(BlockManagerId, BlockId, StorageLevel, long, long, long) - Method in class org.apache.spark.storage.BlockManagerMaster
 
updateCheckpointData(Time) - Method in class org.apache.spark.streaming.dstream.DStream
Refresh the list of checkpointed RDDs that will be saved along with checkpoint of this stream.
updateCheckpointData(Time) - Method in class org.apache.spark.streaming.DStreamGraph
 
updatedConf(SparkConf, String, String, String, Seq<String>, Map<String, String>) - Static method in class org.apache.spark.SparkContext
Creates a modified version of a SparkConf with the parameters that can be passed separately to SparkContext, to make it easier to write SparkContext's constructors.
updateEpoch(long) - Method in class org.apache.spark.MapOutputTracker
Called from executors to update the epoch number, potentially clearing old outputs because of a fetch failure.
updateGraph(Graph<VD, ED>) - Method in class org.apache.spark.mllib.impl.PeriodicGraphCheckpointer
Update currentGraph with a new graph.
updateLastSeenMs() - Method in class org.apache.spark.storage.BlockManagerInfo
 
updateNodeIndex(int[], Bin[][]) - Method in class org.apache.spark.mllib.tree.impl.NodeIndexUpdater
Determine a child node index based on the feature value and the split.
updateNodeIndices(RDD<BaggedPoint<TreePoint>>, Map<Object, NodeIndexUpdater>[], Bin[][]) - Method in class org.apache.spark.mllib.tree.impl.NodeIdCache
Update the node index values in the cache.
Updater - Class in org.apache.spark.mllib.optimization
:: DeveloperApi :: Class used to perform steps (weight update) using Gradient Descent methods.
Updater() - Constructor for class org.apache.spark.mllib.optimization.Updater
 
updateRddInfo(Seq<RDDInfo>, Seq<StorageStatus>) - Static method in class org.apache.spark.storage.StorageUtils
Update the given list of RDDInfo with the given list of storage statuses.
updateStateByKey(Function2<List<V>, Optional<S>, Optional<S>>) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
Return a new "state" DStream where the state for each key is updated by applying the given function on the previous state of the key and the new values of each key.
updateStateByKey(Function2<List<V>, Optional<S>, Optional<S>>, int) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
Return a new "state" DStream where the state for each key is updated by applying the given function on the previous state of the key and the new values of each key.
updateStateByKey(Function2<List<V>, Optional<S>, Optional<S>>, Partitioner) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
Return a new "state" DStream where the state for each key is updated by applying the given function on the previous state of the key and the new values of the key.
updateStateByKey(Function2<List<V>, Optional<S>, Optional<S>>, Partitioner, JavaPairRDD<K, S>) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
Return a new "state" DStream where the state for each key is updated by applying the given function on the previous state of the key and the new values of the key.
updateStateByKey(Function2<Seq<V>, Option<S>, Option<S>>, ClassTag<S>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
Return a new "state" DStream where the state for each key is updated by applying the given function on the previous state of the key and the new values of each key.
updateStateByKey(Function2<Seq<V>, Option<S>, Option<S>>, int, ClassTag<S>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
Return a new "state" DStream where the state for each key is updated by applying the given function on the previous state of the key and the new values of each key.
updateStateByKey(Function2<Seq<V>, Option<S>, Option<S>>, Partitioner, ClassTag<S>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
Return a new "state" DStream where the state for each key is updated by applying the given function on the previous state of the key and the new values of the key.
updateStateByKey(Function1<Iterator<Tuple3<K, Seq<V>, Option<S>>>, Iterator<Tuple2<K, S>>>, Partitioner, boolean, ClassTag<S>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
Return a new "state" DStream where the state for each key is updated by applying the given function on the previous state of the key and the new values of each key.
updateStateByKey(Function2<Seq<V>, Option<S>, Option<S>>, Partitioner, RDD<Tuple2<K, S>>, ClassTag<S>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
Return a new "state" DStream where the state for each key is updated by applying the given function on the previous state of the key and the new values of the key.
updateStateByKey(Function1<Iterator<Tuple3<K, Seq<V>, Option<S>>>, Iterator<Tuple2<K, S>>>, Partitioner, boolean, RDD<Tuple2<K, S>>, ClassTag<S>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
Return a new "state" DStream where the state for each key is updated by applying the given function on the previous state of the key and the new values of each key.
updateVertices(Iterator<Tuple2<Object, VD>>) - Method in class org.apache.spark.graphx.impl.EdgePartition
Return a new `EdgePartition` with updates to vertex attributes specified in `iter`.
updateVertices(VertexRDD<VD>) - Method in class org.apache.spark.graphx.impl.ReplicatedVertexView
Return a new ReplicatedVertexView where vertex attributes in edge partition are updated using updates.
upgrade(VertexRDD<VD>, boolean, boolean) - Method in class org.apache.spark.graphx.impl.ReplicatedVertexView
Upgrade the shipping level in-place to the specified levels by shipping vertex attributes from vertices.
upper() - Method in class org.apache.spark.rdd.JdbcPartition
 
upper(Column) - Static method in class org.apache.spark.sql.functions
Converts a string expression to upper case.
UPPER() - Static method in class org.apache.spark.sql.hive.HiveQl
 
upperBound() - Method in class org.apache.spark.sql.columnar.ColumnStatisticsSchema
 
upperBound() - Method in class org.apache.spark.sql.jdbc.JDBCPartitioningInfo
 
uri() - Method in class org.apache.spark.HttpServer
Get the URI of this HTTP server (http://host:port or https://host:port)
url() - Method in class org.apache.spark.sql.jdbc.JDBCRelation
 
useCachedData(LogicalPlan) - Method in class org.apache.spark.sql.CacheManager
Replaces segments of the given logical plan with cached versions where possible.
useCompression() - Method in class org.apache.spark.sql.columnar.InMemoryRelation
 
useCompression() - Method in class org.apache.spark.sql.SQLConf
When true, tables cached using in-memory columnar caching will be compressed.
useDisk() - Method in class org.apache.spark.storage.StorageLevel
 
useDst - Variable in class org.apache.spark.graphx.TripletFields
Indicates whether the destination vertex attribute is included.
useEdge - Variable in class org.apache.spark.graphx.TripletFields
Indicates whether the edge attribute is included.
useMemory() - Method in class org.apache.spark.storage.StorageLevel
 
useNodeIdCache() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
 
useOffHeap() - Method in class org.apache.spark.storage.StorageLevel
 
user() - Method in class org.apache.spark.ml.recommendation.ALS.Rating
 
user() - Method in class org.apache.spark.mllib.recommendation.Rating
 
user() - Method in class org.apache.spark.scheduler.JobLogger
 
userClass() - Method in class org.apache.spark.mllib.linalg.VectorUDT
 
userClass() - Method in class org.apache.spark.sql.test.ExamplePointUDT
 
userCol() - Method in interface org.apache.spark.ml.recommendation.ALSParams
Param for the column name for user ids.
UserDefinedFunction - Class in org.apache.spark.sql
A user-defined function.
UserDefinedPythonFunction - Class in org.apache.spark.sql
A user-defined Python function.
UserDefinedPythonFunction(String, byte[], Map<String, String>, List<String>, String, List<Broadcast<PythonBroadcast>>, Accumulator<List<byte[]>>, DataType) - Constructor for class org.apache.spark.sql.UserDefinedPythonFunction
 
userFeatures() - Method in class org.apache.spark.mllib.recommendation.MatrixFactorizationModel
 
userSpecifiedSchema() - Method in class org.apache.spark.sql.hive.execution.CreateMetastoreDataSource
 
userSpecifiedSchema() - Method in class org.apache.spark.sql.json.JSONRelation
 
userSpecifiedSchema() - Method in class org.apache.spark.sql.sources.CreateTableUsing
 
userSpecifiedSchema() - Method in class org.apache.spark.sql.sources.CreateTempTableUsing
 
useSrc - Variable in class org.apache.spark.graphx.TripletFields
Indicates whether the source vertex attribute is included.
Utils - Class in org.apache.spark.util
Various utility methods used by Spark.
Utils() - Constructor for class org.apache.spark.util.Utils
 
UUIDFromJson(JsonAST.JValue) - Static method in class org.apache.spark.util.JsonProtocol
 
UUIDToJson(UUID) - Static method in class org.apache.spark.util.JsonProtocol
 

V

V() - Method in class org.apache.spark.mllib.linalg.SingularValueDecomposition
 
validate(ParamMap) - Method in interface org.apache.spark.ml.param.Params
Validates parameter values stored internally plus the input parameter map.
validate() - Method in interface org.apache.spark.ml.param.Params
Validates parameter values stored internally.
validate() - Method in class org.apache.spark.mllib.linalg.distributed.BlockMatrix
Validates the block matrix info against the matrix data (blocks) and throws an exception if any error is found.
validate() - Method in class org.apache.spark.streaming.Checkpoint
 
validate() - Method in class org.apache.spark.streaming.dstream.DStream
 
validate() - Method in class org.apache.spark.streaming.DStreamGraph
 
validateAndTransformSchema(StructType, ParamMap, boolean, DataType) - Method in interface org.apache.spark.ml.classification.ClassifierParams
 
validateAndTransformSchema(StructType, ParamMap, boolean, DataType) - Method in interface org.apache.spark.ml.classification.ProbabilisticClassifierParams
 
validateAndTransformSchema(StructType, ParamMap, boolean, DataType) - Method in interface org.apache.spark.ml.impl.estimator.PredictorParams
Validates and transforms the input schema with the provided param map.
validateAndTransformSchema(StructType, ParamMap) - Method in interface org.apache.spark.ml.recommendation.ALSParams
Validates and transforms the input schema.
validateSettings() - Method in class org.apache.spark.SparkConf
Checks for illegal or deprecated config settings.
value() - Method in class org.apache.spark.Accumulable
Access the accumulator's current value; only allowed on master.
value() - Method in class org.apache.spark.broadcast.Broadcast
Get the broadcasted value.
value() - Method in class org.apache.spark.ComplexFutureAction
 
value() - Method in interface org.apache.spark.FutureAction
The value of this Future.
value() - Method in class org.apache.spark.ml.param.ParamPair
 
value() - Method in class org.apache.spark.mllib.linalg.distributed.MatrixEntry
 
value() - Method in class org.apache.spark.scheduler.AccumulableInfo
 
value() - Method in class org.apache.spark.scheduler.DirectTaskResult
 
value() - Method in class org.apache.spark.SerializableWritable
 
value() - Method in class org.apache.spark.SimpleFutureAction
 
value() - Method in class org.apache.spark.sql.sources.EqualTo
 
value() - Method in class org.apache.spark.sql.sources.GreaterThan
 
value() - Method in class org.apache.spark.sql.sources.GreaterThanOrEqual
 
value() - Method in class org.apache.spark.sql.sources.LessThan
 
value() - Method in class org.apache.spark.sql.sources.LessThanOrEqual
 
value() - Method in class org.apache.spark.storage.MemoryEntry
 
value() - Method in class org.apache.spark.util.SerializableBuffer
 
value() - Method in class org.apache.spark.util.TimeStampedValue
 
value_() - Method in class org.apache.spark.broadcast.HttpBroadcast
 
valueBytes() - Method in class org.apache.spark.scheduler.DirectTaskResult
 
valueClass() - Method in class org.apache.spark.rdd.PairRDDFunctions
 
valueOf(String) - Static method in enum org.apache.spark.graphx.impl.EdgeActiveness
Returns the enum constant of this type with the specified name.
valueOf(String) - Static method in enum org.apache.spark.JobExecutionStatus
Returns the enum constant of this type with the specified name.
valueOf(String) - Static method in enum org.apache.spark.sql.SaveMode
Returns the enum constant of this type with the specified name.
values() - Static method in class org.apache.spark.Accumulators
 
values() - Method in class org.apache.spark.api.java.JavaPairRDD
Return an RDD with the values of each tuple.
values() - Static method in enum org.apache.spark.graphx.impl.EdgeActiveness
Returns an array containing the constants of this enum type, in the order they are declared.
values() - Method in class org.apache.spark.graphx.impl.ShippableVertexPartition
 
values() - Method in class org.apache.spark.graphx.impl.VertexPartition
 
values() - Method in class org.apache.spark.graphx.impl.VertexPartitionBase
 
values() - Static method in enum org.apache.spark.JobExecutionStatus
Returns an array containing the constants of this enum type, in the order they are declared.
values() - Method in class org.apache.spark.mllib.linalg.DenseMatrix
 
values() - Method in class org.apache.spark.mllib.linalg.DenseVector
 
values() - Method in class org.apache.spark.mllib.linalg.SparseMatrix
 
values() - Method in class org.apache.spark.mllib.linalg.SparseVector
 
values() - Method in class org.apache.spark.rdd.PairRDDFunctions
Return an RDD with the values of each tuple.
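A small sketch of values() on a pair RDD, assuming an existing SparkContext named sc; the import brings the pair-RDD implicits into scope:

    import org.apache.spark.SparkContext._   // implicit conversion to PairRDDFunctions
    val pairs = sc.parallelize(Seq(("a", 1), ("b", 2), ("c", 3)))
    val vals  = pairs.values                 // RDD[Int] containing 1, 2, 3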
values() - Method in class org.apache.spark.rdd.ParallelCollectionPartition
 
values() - Method in class org.apache.spark.sql.parquet.Partition
 
values() - Static method in enum org.apache.spark.sql.SaveMode
Returns an array containing the constants of this enum type, in the order they are declared.
values() - Method in class org.apache.spark.sql.sources.In
 
variance() - Method in class org.apache.spark.api.java.JavaDoubleRDD
Compute the variance of this RDD's elements.
variance() - Method in class org.apache.spark.mllib.stat.MultivariateOnlineSummarizer
 
variance() - Method in interface org.apache.spark.mllib.stat.MultivariateStatisticalSummary
Sample variance vector.
Variance - Class in org.apache.spark.mllib.tree.impurity
:: Experimental :: Class for calculating variance during regression
Variance() - Constructor for class org.apache.spark.mllib.tree.impurity.Variance
 
variance() - Method in class org.apache.spark.rdd.DoubleRDDFunctions
Compute the variance of this RDD's elements.
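A brief sketch of computing the variance of a numeric RDD, assuming an existing SparkContext named sc; the same value is also available through the StatCounter returned by stats():

    import org.apache.spark.SparkContext._   // implicit conversion to DoubleRDDFunctions
    val nums = sc.parallelize(Seq(1.0, 2.0, 3.0, 4.0))
    println(nums.variance())                 // 1.25 (population variance)
    println(nums.stats().variance)           // same value via StatCounter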
variance() - Method in class org.apache.spark.util.StatCounter
Return the variance of the values.
VarianceAggregator - Class in org.apache.spark.mllib.tree.impurity
Class for updating views of a vector of sufficient statistics, in order to compute impurity from a sample.
VarianceAggregator() - Constructor for class org.apache.spark.mllib.tree.impurity.VarianceAggregator
 
VarianceCalculator - Class in org.apache.spark.mllib.tree.impurity
Stores statistics for one (node, feature, bin) for calculating impurity.
VarianceCalculator(double[]) - Constructor for class org.apache.spark.mllib.tree.impurity.VarianceCalculator
 
vClassTag() - Method in class org.apache.spark.api.java.JavaHadoopRDD
 
vClassTag() - Method in class org.apache.spark.api.java.JavaNewHadoopRDD
 
vClassTag() - Method in class org.apache.spark.api.java.JavaPairRDD
 
vClassTag() - Method in class org.apache.spark.streaming.api.java.JavaPairInputDStream
 
vClassTag() - Method in class org.apache.spark.streaming.api.java.JavaPairReceiverInputDStream
 
vector() - Method in class org.apache.spark.mllib.clustering.VectorWithNorm
 
vector() - Method in class org.apache.spark.mllib.linalg.distributed.IndexedRow
 
Vector - Interface in org.apache.spark.mllib.linalg
Represents a numeric vector, whose index type is Int and value type is Double.
Vector - Class in org.apache.spark.util
 
Vector(double[]) - Constructor for class org.apache.spark.util.Vector
 
Vector.Multiplier - Class in org.apache.spark.util
 
Vector.Multiplier(double) - Constructor for class org.apache.spark.util.Vector.Multiplier
 
Vector.VectorAccumParam$ - Class in org.apache.spark.util
 
Vector.VectorAccumParam$() - Constructor for class org.apache.spark.util.Vector.VectorAccumParam$
 
Vectors - Class in org.apache.spark.mllib.linalg
 
Vectors() - Constructor for class org.apache.spark.mllib.linalg.Vectors
 
VectorTransformer - Interface in org.apache.spark.mllib.feature
:: DeveloperApi :: Trait for transformation of a vector
VectorUDT - Class in org.apache.spark.mllib.linalg
:: DeveloperApi :: User-defined type for Vector, allowing it to be used in DataFrames.
VectorUDT() - Constructor for class org.apache.spark.mllib.linalg.VectorUDT
 
VectorWithNorm - Class in org.apache.spark.mllib.clustering
A vector with its norm for fast distance computation.
VectorWithNorm(Vector, double) - Constructor for class org.apache.spark.mllib.clustering.VectorWithNorm
 
VectorWithNorm(Vector) - Constructor for class org.apache.spark.mllib.clustering.VectorWithNorm
 
VectorWithNorm(double[]) - Constructor for class org.apache.spark.mllib.clustering.VectorWithNorm
 
version() - Method in class org.apache.spark.api.java.JavaSparkContext
The version of Spark on which this application is running.
version() - Method in class org.apache.spark.SparkContext
The version of Spark on which this application is running.
version() - Static method in class org.apache.spark.sql.hive.HiveShim
 
vertcat(Matrix[]) - Static method in class org.apache.spark.mllib.linalg.Matrices
Vertically concatenate a sequence of matrices.
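A minimal sketch of vertically concatenating two dense matrices with Matrices.vertcat; the values are supplied in column-major order, as Matrices.dense expects:

    import org.apache.spark.mllib.linalg.Matrices
    val top     = Matrices.dense(2, 2, Array(1.0, 2.0, 3.0, 4.0))
    val bottom  = Matrices.dense(2, 2, Array(5.0, 6.0, 7.0, 8.0))
    val stacked = Matrices.vertcat(Array(top, bottom))   // 4 rows, 2 columns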
vertexAttr(long) - Method in class org.apache.spark.graphx.EdgeTriplet
Get the vertex object for the given vertex in the edge.
VertexAttributeBlock<VD> - Class in org.apache.spark.graphx.impl
Stores vertex attributes to ship to an edge partition.
VertexAttributeBlock(long[], Object, ClassTag<VD>) - Constructor for class org.apache.spark.graphx.impl.VertexAttributeBlock
 
VertexPartition<VD> - Class in org.apache.spark.graphx.impl
A map from vertex id to vertex attribute.
VertexPartition(OpenHashSet<Object>, Object, BitSet, ClassTag<VD>) - Constructor for class org.apache.spark.graphx.impl.VertexPartition
 
VertexPartition.VertexPartitionOpsConstructor$ - Class in org.apache.spark.graphx.impl
Implicit evidence that VertexPartition is a member of the VertexPartitionBaseOpsConstructor typeclass.
VertexPartition.VertexPartitionOpsConstructor$() - Constructor for class org.apache.spark.graphx.impl.VertexPartition.VertexPartitionOpsConstructor$
 
VertexPartitionBase<VD> - Class in org.apache.spark.graphx.impl
An abstract map from vertex id to vertex attribute.
VertexPartitionBase(ClassTag<VD>) - Constructor for class org.apache.spark.graphx.impl.VertexPartitionBase
 
VertexPartitionBaseOps<VD,Self extends VertexPartitionBase<Object>> - Class in org.apache.spark.graphx.impl
A class containing additional operations for subclasses of VertexPartitionBase that provide implicit evidence of membership in the VertexPartitionBaseOpsConstructor typeclass (for example, VertexPartition.VertexPartitionOpsConstructor).
VertexPartitionBaseOps(Self, ClassTag<VD>, VertexPartitionBaseOpsConstructor<Self>) - Constructor for class org.apache.spark.graphx.impl.VertexPartitionBaseOps
 
VertexPartitionBaseOpsConstructor<T extends VertexPartitionBase<Object>> - Interface in org.apache.spark.graphx.impl
A typeclass for subclasses of VertexPartitionBase representing the ability to wrap them in a VertexPartitionBaseOps.
VertexPartitionOps<VD> - Class in org.apache.spark.graphx.impl
 
VertexPartitionOps(VertexPartition<VD>, ClassTag<VD>) - Constructor for class org.apache.spark.graphx.impl.VertexPartitionOps
 
VertexRDD<VD> - Class in org.apache.spark.graphx
Extends RDD[(VertexId, VD)] by ensuring that there is only one entry for each vertex and by pre-indexing the entries for fast, efficient joins.
VertexRDD(SparkContext, Seq<Dependency<?>>) - Constructor for class org.apache.spark.graphx.VertexRDD
 
VertexRDDImpl<VD> - Class in org.apache.spark.graphx.impl
 
VertexRDDImpl(RDD<ShippableVertexPartition<VD>>, StorageLevel, ClassTag<VD>) - Constructor for class org.apache.spark.graphx.impl.VertexRDDImpl
 
vertices() - Method in class org.apache.spark.graphx.Graph
An RDD containing the vertices and their associated attributes.
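A small sketch of building a property graph and reading its vertices RDD, assuming an existing SparkContext named sc and made-up vertex and edge data:

    import org.apache.spark.graphx.{Edge, Graph}
    val users = sc.parallelize(Seq((1L, "alice"), (2L, "bob")))
    val rels  = sc.parallelize(Seq(Edge(1L, 2L, "follows")))
    val graph = Graph(users, rels)
    graph.vertices.collect().foreach { case (id, name) => println(s"$id -> $name") }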
vertices() - Method in class org.apache.spark.graphx.impl.GraphImpl
 
vids() - Method in class org.apache.spark.graphx.impl.VertexAttributeBlock
 
viewAcls() - Method in class org.apache.spark.scheduler.ApplicationEventListener
 
visit(int, int, String, String, String, String[]) - Method in class org.apache.spark.util.InnerClosureFinder
 
visitMethod(int, String, String, String, String[]) - Method in class org.apache.spark.util.FieldAccessFinder
 
visitMethod(int, String, String, String, String[]) - Method in class org.apache.spark.util.InnerClosureFinder
 
visitMethod(int, String, String, String, String[]) - Method in class org.apache.spark.util.ReturnStatementFinder
 
vManifest() - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
 
vocabSize() - Method in class org.apache.spark.mllib.clustering.DistributedLDAModel
 
vocabSize() - Method in class org.apache.spark.mllib.clustering.LDA.EMOptimizer
 
vocabSize() - Method in class org.apache.spark.mllib.clustering.LDAModel
Vocabulary size (number of distinct terms in the vocabulary).
vocabSize() - Method in class org.apache.spark.mllib.clustering.LocalLDAModel
 
VocabWord - Class in org.apache.spark.mllib.feature
Entry in vocabulary
VocabWord(String, int, int[], int[], int) - Constructor for class org.apache.spark.mllib.feature.VocabWord
 
VoidFunction<T> - Interface in org.apache.spark.api.java.function
A function with no return value.
Vote() - Static method in class org.apache.spark.mllib.tree.configuration.EnsembleCombiningStrategy
 

W

w(boolean) - Method in class org.apache.spark.ml.param.BooleanParam
 
w(double) - Method in class org.apache.spark.ml.param.DoubleParam
 
w(float) - Method in class org.apache.spark.ml.param.FloatParam
 
w(int) - Method in class org.apache.spark.ml.param.IntParam
 
w(long) - Method in class org.apache.spark.ml.param.LongParam
 
w(T) - Method in class org.apache.spark.ml.param.Param
Creates a param pair with the given value (for Java).
waiter() - Method in class org.apache.spark.streaming.StreamingContext
 
waitForAsyncReregister() - Method in class org.apache.spark.storage.BlockManager
For testing.
waitForProcess(Process, long) - Static method in class org.apache.spark.util.Utils
Wait for a process to terminate for at most the specified duration.
waitForReady() - Method in class org.apache.spark.storage.BlockInfo
Wait for this BlockInfo to be marked as ready (i.e. the block has either been successfully stored or the attempt to store it has failed).
waitForRegister() - Method in class org.apache.spark.scheduler.cluster.mesos.CoarseMesosSchedulerBackend
 
waitForRegister() - Method in class org.apache.spark.scheduler.cluster.mesos.MesosSchedulerBackend
 
waitForStopOrError(long) - Method in class org.apache.spark.streaming.ContextWaiter
Returns true if the context has stopped; throws the reported error if notifyError has been called; or returns false if the waiting time elapsed before the method returned.
waitingBatches() - Method in class org.apache.spark.streaming.ui.StreamingJobProgressListener
 
waitingStages() - Method in class org.apache.spark.scheduler.DAGScheduler
 
waitList() - Method in class org.apache.spark.util.random.AcceptanceResult
 
waitListBound() - Method in class org.apache.spark.util.random.AcceptanceResult
 
waitTillTime(long) - Method in interface org.apache.spark.util.Clock
 
waitTillTime(long) - Method in class org.apache.spark.util.ManualClock
 
waitTillTime(long) - Method in class org.apache.spark.util.SystemClock
 
waitToPush() - Method in class org.apache.spark.streaming.receiver.RateLimiter
 
waitUntilEmpty(int) - Method in class org.apache.spark.util.AsynchronousListenerBus
For testing only.
warmUp(SparkContext) - Static method in class org.apache.spark.streaming.util.RawTextHelper
Warms up the SparkContext in master and slave by running tasks to force the JIT to kick in before the real workload starts.
WebUI - Class in org.apache.spark.ui
The top level component of the UI hierarchy that contains the server.
WebUI(SecurityManager, int, SparkConf, String, String) - Constructor for class org.apache.spark.ui.WebUI
 
WebUIPage - Class in org.apache.spark.ui
A page that represents the leaf node in the UI hierarchy.
WebUIPage(String) - Constructor for class org.apache.spark.ui.WebUIPage
 
WebUITab - Class in org.apache.spark.ui
A tab that represents a collection of pages.
WebUITab(WebUI, String) - Constructor for class org.apache.spark.ui.WebUITab
 
weight() - Method in class org.apache.spark.scheduler.Pool
 
weight() - Method in interface org.apache.spark.scheduler.Schedulable
 
weight() - Method in class org.apache.spark.scheduler.TaskSetManager
 
WEIGHT_PROPERTY() - Method in class org.apache.spark.scheduler.FairSchedulableBuilder
 
weightedFalsePositiveRate() - Method in class org.apache.spark.mllib.evaluation.MulticlassMetrics
Returns weighted false positive rate
weightedFMeasure(double) - Method in class org.apache.spark.mllib.evaluation.MulticlassMetrics
Returns weighted averaged f-measure
weightedFMeasure() - Method in class org.apache.spark.mllib.evaluation.MulticlassMetrics
Returns weighted averaged f1-measure
weightedPrecision() - Method in class org.apache.spark.mllib.evaluation.MulticlassMetrics
Returns weighted averaged precision
weightedRecall() - Method in class org.apache.spark.mllib.evaluation.MulticlassMetrics
Returns weighted averaged recall (equal to precision, recall, and f-measure)
weightedTruePositiveRate() - Method in class org.apache.spark.mllib.evaluation.MulticlassMetrics
Returns weighted true positive rate (equal to precision, recall, and f-measure)
weights() - Method in class org.apache.spark.ml.classification.LogisticRegressionModel
 
weights() - Method in class org.apache.spark.ml.regression.LinearRegressionModel
 
weights() - Method in class org.apache.spark.mllib.classification.impl.GLMClassificationModel.SaveLoadV1_0$.Data
 
weights() - Method in class org.apache.spark.mllib.classification.LogisticRegressionModel
 
weights() - Method in class org.apache.spark.mllib.classification.SVMModel
 
weights() - Method in class org.apache.spark.mllib.clustering.ExpectationSum
 
weights() - Method in class org.apache.spark.mllib.clustering.GaussianMixtureModel
 
weights() - Method in class org.apache.spark.mllib.regression.GeneralizedLinearModel
 
weights() - Method in class org.apache.spark.mllib.regression.impl.GLMRegressionModel.SaveLoadV1_0$.Data
 
weights() - Method in class org.apache.spark.mllib.regression.LassoModel
 
weights() - Method in class org.apache.spark.mllib.regression.LinearRegressionModel
 
weights() - Method in class org.apache.spark.mllib.regression.RidgeRegressionModel
 
WHEN() - Static method in class org.apache.spark.sql.hive.HiveQl
 
where(Column) - Method in class org.apache.spark.sql.DataFrame
Filters rows using the given condition.
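A minimal sketch of where() on a DataFrame, assuming a DataFrame named df that has an "age" column (the column name is illustrative only):

    val adults = df.where(df("age") > 21)
    adults.show()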
whereClause() - Method in class org.apache.spark.sql.jdbc.JDBCPartition
 
WholeTextFileInputFormat - Class in org.apache.spark.input
A CombineFileInputFormat for reading whole text files.
WholeTextFileInputFormat() - Constructor for class org.apache.spark.input.WholeTextFileInputFormat
 
WholeTextFileRDD - Class in org.apache.spark.rdd
Analogous to MapPartitionsRDD, but passes in an InputSplit to the given function rather than the index of the partition.
WholeTextFileRDD(SparkContext, Class<? extends WholeTextFileInputFormat>, Class<String>, Class<String>, Configuration, int) - Constructor for class org.apache.spark.rdd.WholeTextFileRDD
 
WholeTextFileRecordReader - Class in org.apache.spark.input
A RecordReader for reading a single whole text file out in a key-value pair, where the key is the file path and the value is the entire content of the file.
WholeTextFileRecordReader(CombineFileSplit, TaskAttemptContext, Integer) - Constructor for class org.apache.spark.input.WholeTextFileRecordReader
 
wholeTextFiles(String, int) - Method in class org.apache.spark.api.java.JavaSparkContext
Read a directory of text files from HDFS, a local file system (available on all nodes), or any Hadoop-supported file system URI.
wholeTextFiles(String) - Method in class org.apache.spark.api.java.JavaSparkContext
Read a directory of text files from HDFS, a local file system (available on all nodes), or any Hadoop-supported file system URI.
wholeTextFiles(String, int) - Method in class org.apache.spark.SparkContext
Read a directory of text files from HDFS, a local file system (available on all nodes), or any Hadoop-supported file system URI.
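A short sketch of wholeTextFiles(), assuming an existing SparkContext named sc; the directory path is a placeholder. Each record is a (path, fullContent) pair, which suits many small files better than textFile():

    import org.apache.spark.SparkContext._   // implicit conversion for mapValues
    val files      = sc.wholeTextFiles("/data/docs")     // RDD[(String, String)]
    val lineCounts = files.mapValues(content => content.split("\n").length)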
window(Duration) - Method in class org.apache.spark.streaming.api.java.JavaDStream
Return a new DStream in which each RDD contains all the elements seen in a sliding window of time over this DStream.
window(Duration, Duration) - Method in class org.apache.spark.streaming.api.java.JavaDStream
Return a new DStream in which each RDD contains all the elements seen in a sliding window of time over this DStream.
window(Duration) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
Return a new DStream which is computed based on windowed batches of this DStream.
window(Duration, Duration) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
Return a new DStream which is computed based on windowed batches of this DStream.
window(Duration) - Method in class org.apache.spark.streaming.dstream.DStream
Return a new DStream in which each RDD contains all the elements seen in a sliding window of time over this DStream.
window(Duration, Duration) - Method in class org.apache.spark.streaming.dstream.DStream
Return a new DStream in which each RDD contains all the elements seen in a sliding window of time over this DStream.
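A brief sketch of a sliding window over a DStream, assuming a StreamingContext with a 10-second batch interval and an existing input stream named lines; the window spans 30 seconds and slides every 10 seconds:

    import org.apache.spark.streaming.Seconds
    val windowed = lines.window(Seconds(30), Seconds(10))
    windowed.count().print()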
windowDuration() - Method in class org.apache.spark.streaming.dstream.ReducedWindowedDStream
 
windowDuration() - Method in class org.apache.spark.streaming.dstream.WindowedDStream
 
WindowedDStream<T> - Class in org.apache.spark.streaming.dstream
 
WindowedDStream(DStream<T>, Duration, Duration, ClassTag<T>) - Constructor for class org.apache.spark.streaming.dstream.WindowedDStream
 
windowsDrive() - Static method in class org.apache.spark.util.Utils
Pattern for matching a Windows drive, which contains only a single alphabetic character.
windowSize() - Method in class org.apache.spark.mllib.rdd.SlidingRDD
 
wipe() - Method in class org.apache.spark.mllib.optimization.NNLS.Workspace
 
withActiveSet(Iterator<Object>) - Method in class org.apache.spark.graphx.impl.EdgePartition
Return a new `EdgePartition` with the specified active set, provided as an iterator.
withActiveSet(VertexRDD<?>) - Method in class org.apache.spark.graphx.impl.ReplicatedVertexView
Return a new ReplicatedVertexView where the activeSet in each edge partition contains only vertex ids present in actives.
withChildren(Seq<ASTNode>) - Method in class org.apache.spark.sql.hive.HiveQl.TransformableNode
Returns this ASTNode with the children changed to newChildren.
withColumn(String, Column) - Method in class org.apache.spark.sql.DataFrame
Returns a new DataFrame by adding a column.
withColumnRenamed(String, String) - Method in class org.apache.spark.sql.DataFrame
Returns a new DataFrame with a column renamed.
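A minimal sketch of withColumn and withColumnRenamed, assuming a DataFrame named df with a "price" column (the column names are illustrative only):

    val withTax = df.withColumn("priceWithTax", df("price") * 1.2)
    val renamed = withTax.withColumnRenamed("price", "netPrice")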
WithCompressionSchemes - Interface in org.apache.spark.sql.columnar.compression
 
withData(Object, ClassTag<ED2>) - Method in class org.apache.spark.graphx.impl.EdgePartition
Return a new `EdgePartition` with the specified edge data.
withEdges(EdgeRDDImpl<ED2, VD2>, ClassTag<VD2>, ClassTag<ED2>) - Method in class org.apache.spark.graphx.impl.ReplicatedVertexView
Return a new ReplicatedVertexView with the specified EdgeRDD, which must have the same shipping level.
withEdges(EdgeRDD<?>) - Method in class org.apache.spark.graphx.impl.VertexRDDImpl
 
withEdges(EdgeRDD<?>) - Method in class org.apache.spark.graphx.VertexRDD
Prepares this VertexRDD for efficient joins with the given EdgeRDD.
withIndex(OpenHashSet<Object>) - Method in class org.apache.spark.graphx.impl.ShippableVertexPartitionOps
 
withIndex(OpenHashSet<Object>) - Method in class org.apache.spark.graphx.impl.VertexPartitionBaseOps
 
withIndex(OpenHashSet<Object>) - Method in class org.apache.spark.graphx.impl.VertexPartitionOps
 
withMask(BitSet) - Method in class org.apache.spark.graphx.impl.ShippableVertexPartitionOps
 
withMask(BitSet) - Method in class org.apache.spark.graphx.impl.VertexPartitionBaseOps
 
withMask(BitSet) - Method in class org.apache.spark.graphx.impl.VertexPartitionOps
 
withMean() - Method in class org.apache.spark.mllib.feature.StandardScalerModel
 
withOutput(Seq<Attribute>) - Method in class org.apache.spark.sql.columnar.InMemoryRelation
 
withoutVertexAttributes(ClassTag<VD2>) - Method in class org.apache.spark.graphx.impl.EdgePartition
Return a new `EdgePartition` without any locally cached vertex attributes.
withParquetDataFrame(Seq<T>, Function1<DataFrame, BoxedUnit>, ClassTag<T>, TypeTags.TypeTag<T>) - Method in interface org.apache.spark.sql.parquet.ParquetTest
Writes data to a Parquet file and reads it back as a DataFrame, which is then passed to f.
withParquetFile(Seq<T>, Function1<String, BoxedUnit>, ClassTag<T>, TypeTags.TypeTag<T>) - Method in interface org.apache.spark.sql.parquet.ParquetTest
Writes data to a Parquet file, which is then passed to f and will be deleted after f returns.
withParquetTable(Seq<T>, String, Function0<BoxedUnit>, ClassTag<T>, TypeTags.TypeTag<T>) - Method in interface org.apache.spark.sql.parquet.ParquetTest
Writes data to a Parquet file, reads it back as a DataFrame and registers it as a temporary table named tableName, then calls f.
withPartitionsRDD(RDD<Tuple2<Object, EdgePartition<ED2, VD2>>>, ClassTag<ED2>, ClassTag<VD2>) - Method in class org.apache.spark.graphx.impl.EdgeRDDImpl
 
withPartitionsRDD(RDD<ShippableVertexPartition<VD2>>, ClassTag<VD2>) - Method in class org.apache.spark.graphx.impl.VertexRDDImpl
 
withPartitionsRDD(RDD<ShippableVertexPartition<VD2>>, ClassTag<VD2>) - Method in class org.apache.spark.graphx.VertexRDD
Replaces the vertex partitions while preserving all other properties of the VertexRDD.
withRoutingTable(RoutingTablePartition) - Method in class org.apache.spark.graphx.impl.ShippableVertexPartition
Return a new ShippableVertexPartition with the specified routing table.
withSQLConf(Seq<Tuple2<String, String>>, Function0<BoxedUnit>) - Method in interface org.apache.spark.sql.parquet.ParquetTest
Sets all SQL configurations specified in pairs, calls f, and then restores all SQL configurations.
withStd() - Method in class org.apache.spark.mllib.feature.StandardScalerModel
 
withTargetStorageLevel(StorageLevel) - Method in class org.apache.spark.graphx.EdgeRDD
Changes the target storage level while preserving all other properties of the EdgeRDD.
withTargetStorageLevel(StorageLevel) - Method in class org.apache.spark.graphx.impl.EdgeRDDImpl
 
withTargetStorageLevel(StorageLevel) - Method in class org.apache.spark.graphx.impl.VertexRDDImpl
 
withTargetStorageLevel(StorageLevel) - Method in class org.apache.spark.graphx.VertexRDD
Changes the target storage level while preserving all other properties of the VertexRDD.
withTempDir(Function1<File, BoxedUnit>) - Method in interface org.apache.spark.sql.parquet.ParquetTest
Creates a temporary directory, which is then passed to f and will be deleted after f returns.
withTempPath(Function1<File, BoxedUnit>) - Method in interface org.apache.spark.sql.parquet.ParquetTest
Generates a temporary path without creating the actual file/directory, then passes it to f.
withTempTable(String, Function0<BoxedUnit>) - Method in interface org.apache.spark.sql.parquet.ParquetTest
Drops temporary table tableName after calling f.
withText(String) - Method in class org.apache.spark.sql.hive.HiveQl.TransformableNode
Returns this ASTNode with the text changed to newText.
withValues(Object, ClassTag<VD2>) - Method in class org.apache.spark.graphx.impl.ShippableVertexPartitionOps
 
withValues(Object, ClassTag<VD2>) - Method in class org.apache.spark.graphx.impl.VertexPartitionBaseOps
 
withValues(Object, ClassTag<VD2>) - Method in class org.apache.spark.graphx.impl.VertexPartitionOps
 
word() - Method in class org.apache.spark.mllib.feature.VocabWord
 
Word2Vec - Class in org.apache.spark.mllib.feature
:: Experimental :: Word2Vec creates vector representations of words in a text corpus.
Word2Vec() - Constructor for class org.apache.spark.mllib.feature.Word2Vec
 
Word2VecModel - Class in org.apache.spark.mllib.feature
:: Experimental :: Word2Vec model
Word2VecModel(Map<String, float[]>) - Constructor for class org.apache.spark.mllib.feature.Word2VecModel
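A short sketch of training a Word2Vec model and querying the resulting Word2VecModel, assuming an existing SparkContext named sc and a placeholder corpus path; fit() expects an RDD of token sequences:

    import org.apache.spark.mllib.feature.Word2Vec
    val sentences = sc.textFile("corpus.txt").map(_.split(" ").toSeq)
    val model     = new Word2Vec().setVectorSize(100).fit(sentences)
    model.findSynonyms("spark", 5).foreach { case (word, sim) => println(s"$word: $sim") }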
 
worker() - Method in class org.apache.spark.streaming.kinesis.KinesisReceiver
 
worker() - Method in class org.apache.spark.streaming.receiver.ActorReceiver.Supervisor
 
workerId() - Method in class org.apache.spark.streaming.kinesis.KinesisReceiver
 
WorkerOffer - Class in org.apache.spark.scheduler
Represents free resources available on an executor.
WorkerOffer(String, String, int) - Constructor for class org.apache.spark.scheduler.WorkerOffer
 
wrap(Object, ObjectInspector) - Method in interface org.apache.spark.sql.hive.HiveInspectors
Converts native catalyst types to the types expected by Hive
wrap(Row, Seq<ObjectInspector>, Object[]) - Method in interface org.apache.spark.sql.hive.HiveInspectors
 
wrap(Seq<Object>, Seq<ObjectInspector>, Object[]) - Method in interface org.apache.spark.sql.hive.HiveInspectors
 
wrapForCompression(BlockId, OutputStream) - Method in class org.apache.spark.storage.BlockManager
Wrap an output stream for compression if block compression is enabled for its block type
wrapForCompression(BlockId, InputStream) - Method in class org.apache.spark.storage.BlockManager
Wrap an input stream for compression if block compression is enabled for its block type
wrapperClass() - Static method in class org.apache.spark.serializer.JavaIterableWrapperSerializer
 
wrapperFor(ObjectInspector) - Method in interface org.apache.spark.sql.hive.HiveInspectors
Wraps with Hive types based on object inspector.
wrapperToFileSinkDesc(ShimFileSinkDesc) - Static method in class org.apache.spark.sql.hive.HiveShim
 
wrapRDD(RDD<Double>) - Method in class org.apache.spark.api.java.JavaDoubleRDD
 
wrapRDD(RDD<Tuple2<K, V>>) - Method in class org.apache.spark.api.java.JavaPairRDD
 
wrapRDD(RDD<T>) - Method in class org.apache.spark.api.java.JavaRDD
 
wrapRDD(RDD<T>) - Method in interface org.apache.spark.api.java.JavaRDDLike
 
wrapRDD(RDD<T>) - Method in class org.apache.spark.streaming.api.java.JavaDStream
 
wrapRDD(RDD<T>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
 
wrapRDD(RDD<Tuple2<K, V>>) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
 
writableClass() - Method in class org.apache.spark.WritableConverter
 
writableClass() - Method in class org.apache.spark.WritableFactory
 
WritableConverter<T> - Class in org.apache.spark
A class encapsulating how to convert a Writable back to some type T.
WritableConverter(Function1<ClassTag<T>, Class<? extends Writable>>, Function1<Writable, T>) - Constructor for class org.apache.spark.WritableConverter
 
WritableFactory<T> - Class in org.apache.spark
A class encapsulating how to convert some type T to Writable.
WritableFactory(Function1<ClassTag<T>, Class<? extends Writable>>, Function1<T, Writable>) - Constructor for class org.apache.spark.WritableFactory
 
writableWritableConverter() - Static method in class org.apache.spark.SparkContext
 
writableWritableConverter() - Static method in class org.apache.spark.WritableConverter
 
writableWritableFactory(ClassTag<T>) - Static method in class org.apache.spark.WritableFactory
 
write(Kryo, Output, Iterable<?>) - Method in class org.apache.spark.serializer.JavaIterableWrapperSerializer
 
write(Object, Object) - Method in class org.apache.spark.SparkHadoopWriter
 
write(Row) - Method in class org.apache.spark.sql.parquet.MutableRowWriteSupport
 
write(Row) - Method in class org.apache.spark.sql.parquet.RowWriteSupport
 
write(Group) - Method in class org.apache.spark.sql.parquet.TestGroupWriteSupport
 
write(Object) - Method in class org.apache.spark.storage.BlockObjectWriter
Writes an object.
write(Object) - Method in class org.apache.spark.storage.DiskBlockObjectWriter
 
write(Checkpoint) - Method in class org.apache.spark.streaming.CheckpointWriter
 
write(int) - Method in class org.apache.spark.streaming.util.RateLimitedOutputStream
 
write(byte[]) - Method in class org.apache.spark.streaming.util.RateLimitedOutputStream
 
write(byte[], int, int) - Method in class org.apache.spark.streaming.util.RateLimitedOutputStream
 
write(ByteBuffer) - Method in class org.apache.spark.streaming.util.WriteAheadLogWriter
Write the ByteBuffer to the log file.
write(int) - Method in class org.apache.spark.util.io.ByteArrayChunkOutputStream
 
write(byte[], int, int) - Method in class org.apache.spark.util.io.ByteArrayChunkOutputStream
 
WriteAheadLogBackedBlockRDD<T> - Class in org.apache.spark.streaming.rdd
This class represents a special case of the BlockRDD where the data blocks in the block manager are also backed by segments in write ahead logs.
WriteAheadLogBackedBlockRDD(SparkContext, BlockId[], WriteAheadLogFileSegment[], boolean, StorageLevel, ClassTag<T>) - Constructor for class org.apache.spark.streaming.rdd.WriteAheadLogBackedBlockRDD
 
WriteAheadLogBackedBlockRDDPartition - Class in org.apache.spark.streaming.rdd
Partition class for WriteAheadLogBackedBlockRDD.
WriteAheadLogBackedBlockRDDPartition(int, BlockId, WriteAheadLogFileSegment) - Constructor for class org.apache.spark.streaming.rdd.WriteAheadLogBackedBlockRDDPartition
 
WriteAheadLogBasedBlockHandler - Class in org.apache.spark.streaming.receiver
Implementation of a ReceivedBlockHandler which stores the received blocks in both a write ahead log and a block manager.
WriteAheadLogBasedBlockHandler(BlockManager, int, StorageLevel, SparkConf, Configuration, String, Clock) - Constructor for class org.apache.spark.streaming.receiver.WriteAheadLogBasedBlockHandler
 
WriteAheadLogBasedStoreResult - Class in org.apache.spark.streaming.receiver
Implementation of ReceivedBlockStoreResult that stores the metadata related to storage of blocks using WriteAheadLogBasedBlockHandler
WriteAheadLogBasedStoreResult(StreamBlockId, WriteAheadLogFileSegment) - Constructor for class org.apache.spark.streaming.receiver.WriteAheadLogBasedStoreResult
 
WriteAheadLogFileSegment - Class in org.apache.spark.streaming.util
Class for representing a segment of data in a write ahead log file
WriteAheadLogFileSegment(String, long, int) - Constructor for class org.apache.spark.streaming.util.WriteAheadLogFileSegment
 
WriteAheadLogManager - Class in org.apache.spark.streaming.util
This class manages write ahead log files.
WriteAheadLogManager(String, Configuration, int, int, String, Clock) - Constructor for class org.apache.spark.streaming.util.WriteAheadLogManager
 
WriteAheadLogManager.LogInfo - Class in org.apache.spark.streaming.util
 
WriteAheadLogManager.LogInfo(long, long, String) - Constructor for class org.apache.spark.streaming.util.WriteAheadLogManager.LogInfo
 
WriteAheadLogManager.LogInfo$ - Class in org.apache.spark.streaming.util
 
WriteAheadLogManager.LogInfo$() - Constructor for class org.apache.spark.streaming.util.WriteAheadLogManager.LogInfo$
 
WriteAheadLogRandomReader - Class in org.apache.spark.streaming.util
A random access reader for reading write ahead log files written using WriteAheadLogWriter.
WriteAheadLogRandomReader(String, Configuration) - Constructor for class org.apache.spark.streaming.util.WriteAheadLogRandomReader
 
WriteAheadLogReader - Class in org.apache.spark.streaming.util
A reader for reading write ahead log files written using WriteAheadLogWriter.
WriteAheadLogReader(String, Configuration) - Constructor for class org.apache.spark.streaming.util.WriteAheadLogReader
 
WriteAheadLogWriter - Class in org.apache.spark.streaming.util
A writer for writing byte-buffers to a write ahead log file.
WriteAheadLogWriter(String, Configuration) - Constructor for class org.apache.spark.streaming.util.WriteAheadLogWriter
 
writeAll(Iterator<T>, ClassTag<T>) - Method in class org.apache.spark.serializer.SerializationStream
 
writeArray(ArrayType, Seq<Object>) - Method in class org.apache.spark.sql.parquet.RowWriteSupport
 
writeByteBuffer(ByteBuffer, ObjectOutput) - Static method in class org.apache.spark.util.Utils
Primitive often used when writing a ByteBuffer to a DataOutput.
writeDecimal(Decimal, int) - Method in class org.apache.spark.sql.parquet.RowWriteSupport
 
writeExternal(ObjectOutput) - Method in class org.apache.spark.scheduler.CompressedMapStatus
 
writeExternal(ObjectOutput) - Method in class org.apache.spark.scheduler.DirectTaskResult
 
writeExternal(ObjectOutput) - Method in class org.apache.spark.scheduler.HighlyCompressedMapStatus
 
writeExternal(ObjectOutput) - Method in class org.apache.spark.serializer.JavaSerializer
 
writeExternal(ObjectOutput) - Method in class org.apache.spark.sql.hive.HiveFunctionWrapper
 
writeExternal(ObjectOutput) - Method in class org.apache.spark.storage.BlockManagerId
 
writeExternal(ObjectOutput) - Method in class org.apache.spark.storage.BlockManagerMessages.UpdateBlockInfo
 
writeExternal(ObjectOutput) - Method in class org.apache.spark.storage.StorageLevel
 
writeExternal(ObjectOutput, Map<CharSequence, CharSequence>, byte[]) - Static method in class org.apache.spark.streaming.flume.EventTransformer
 
writeExternal(ObjectOutput) - Method in class org.apache.spark.streaming.flume.SparkFlumeEvent
 
writeFile() - Static method in class org.apache.spark.sql.parquet.ParquetTestData
 
writeFilterFile(int) - Static method in class org.apache.spark.sql.parquet.ParquetTestData
 
writeGlobFiles() - Static method in class org.apache.spark.sql.parquet.ParquetTestData
 
writeMap(MapType, Map<?, Object>) - Method in class org.apache.spark.sql.parquet.RowWriteSupport
 
writeMetaData(Seq<Attribute>, Path, Configuration) - Static method in class org.apache.spark.sql.parquet.ParquetTypesConverter
 
writeNestedFile1() - Static method in class org.apache.spark.sql.parquet.ParquetTestData
 
writeNestedFile2() - Static method in class org.apache.spark.sql.parquet.ParquetTestData
 
writeNestedFile3() - Static method in class org.apache.spark.sql.parquet.ParquetTestData
 
writeNestedFile4() - Static method in class org.apache.spark.sql.parquet.ParquetTestData
 
writeObject(T, ClassTag<T>) - Method in class org.apache.spark.serializer.JavaSerializationStream
Calls reset() to avoid a memory leak (see http://stackoverflow.com/questions/1281549/memory-leak-traps-in-the-java-standard-api), but only every 100th write to avoid bloated serialization streams, since each reset forces object class descriptions to be re-written.
writeObject(T, ClassTag<T>) - Method in class org.apache.spark.serializer.KryoSerializationStream
 
writeObject(T, ClassTag<T>) - Method in class org.apache.spark.serializer.SerializationStream
 
writePrimitive(DataType, Object) - Method in class org.apache.spark.sql.parquet.RowWriteSupport
 
writer() - Method in class org.apache.spark.sql.parquet.RowWriteSupport
 
writeStruct(StructType, Row) - Method in class org.apache.spark.sql.parquet.RowWriteSupport
 
writeTimestamp(Timestamp) - Method in class org.apache.spark.sql.parquet.RowWriteSupport
 
writeToFile(String, Broadcast<SerializableWritable<Configuration>>, int, TaskContext, Iterator<T>, ClassTag<T>) - Static method in class org.apache.spark.rdd.CheckpointRDD
 
writeToLog(ByteBuffer) - Method in class org.apache.spark.streaming.util.WriteAheadLogManager
Write a byte buffer to the log file.
writeValue(DataType, Object) - Method in class org.apache.spark.sql.parquet.RowWriteSupport
 
writeValue(RecordConsumer) - Method in class org.apache.spark.sql.parquet.timestamp.NanoTime
 

X

x() - Method in class org.apache.spark.mllib.optimization.NNLS.Workspace
 
x() - Method in class org.apache.spark.sql.test.ExamplePoint
 
XORShiftRandom - Class in org.apache.spark.util.random
This class implements an XORShift random number generator algorithm (source: Marsaglia, G.).
XORShiftRandom(long) - Constructor for class org.apache.spark.util.random.XORShiftRandom
 
XORShiftRandom() - Constructor for class org.apache.spark.util.random.XORShiftRandom
 

Y

y() - Method in class org.apache.spark.sql.test.ExamplePoint
 
YarnSchedulerBackend - Class in org.apache.spark.scheduler.cluster
Abstract Yarn scheduler backend that contains common logic between the client and cluster Yarn scheduler backends.
YarnSchedulerBackend(TaskSchedulerImpl, SparkContext) - Constructor for class org.apache.spark.scheduler.cluster.YarnSchedulerBackend
 

Z

zero() - Method in class org.apache.spark.Accumulable
 
zero(R) - Method in interface org.apache.spark.AccumulableParam
Return the "zero" (identity) value for an accumulator type, given its initial value.
zero(double) - Method in class org.apache.spark.AccumulatorParam.DoubleAccumulatorParam$
 
zero(float) - Method in class org.apache.spark.AccumulatorParam.FloatAccumulatorParam$
 
zero(int) - Method in class org.apache.spark.AccumulatorParam.IntAccumulatorParam$
 
zero(long) - Method in class org.apache.spark.AccumulatorParam.LongAccumulatorParam$
 
zero(R) - Method in class org.apache.spark.GrowableAccumulableParam
 
zero(int, int) - Static method in class org.apache.spark.mllib.clustering.ExpectationSum
 
zero(double) - Method in class org.apache.spark.SparkContext.DoubleAccumulatorParam$
 
zero(float) - Method in class org.apache.spark.SparkContext.FloatAccumulatorParam$
 
zero(int) - Method in class org.apache.spark.SparkContext.IntAccumulatorParam$
 
zero(long) - Method in class org.apache.spark.SparkContext.LongAccumulatorParam$
 
zero(Vector) - Method in class org.apache.spark.util.Vector.VectorAccumParam$
 
ZeroMQReceiver<T> - Class in org.apache.spark.streaming.zeromq
A receiver to subscribe to a ZeroMQ stream.
ZeroMQReceiver(String, Subscribe, Function1<Seq<ByteString>, Iterator<T>>, ClassTag<T>) - Constructor for class org.apache.spark.streaming.zeromq.ZeroMQReceiver
 
ZeroMQUtils - Class in org.apache.spark.streaming.zeromq
 
ZeroMQUtils() - Constructor for class org.apache.spark.streaming.zeromq.ZeroMQUtils
 
zeros(int, int) - Static method in class org.apache.spark.mllib.linalg.DenseMatrix
Generate a DenseMatrix consisting of zeros.
zeros(int, int) - Static method in class org.apache.spark.mllib.linalg.Matrices
Generate a Matrix consisting of zeros.
zeros(int) - Static method in class org.apache.spark.mllib.linalg.Vectors
Creates a vector of all zeros.
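A tiny sketch of the common Vectors factory methods, including zeros():

    import org.apache.spark.mllib.linalg.Vectors
    val dense  = Vectors.dense(1.0, 0.0, 3.0)
    val sparse = Vectors.sparse(3, Array(0, 2), Array(1.0, 3.0))
    val empty  = Vectors.zeros(3)   // dense vector of three zeros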
zeros(int) - Static method in class org.apache.spark.util.Vector
 
zeroTime() - Method in class org.apache.spark.streaming.dstream.DStream
 
zeroTime() - Method in class org.apache.spark.streaming.DStreamGraph
 
zip(JavaRDDLike<U, ?>) - Method in interface org.apache.spark.api.java.JavaRDDLike
Zips this RDD with another one, returning key-value pairs with the first element in each RDD, second element in each RDD, etc.
zip(RDD<U>, ClassTag<U>) - Method in class org.apache.spark.rdd.RDD
Zips this RDD with another one, returning key-value pairs with the first element in each RDD, second element in each RDD, etc.
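A short sketch of zip() and zipWithIndex(), assuming an existing SparkContext named sc; zip requires both RDDs to have the same number of partitions and the same number of elements per partition:

    val names   = sc.parallelize(Seq("a", "b", "c"), 2)
    val scores  = sc.parallelize(Seq(1, 2, 3), 2)
    val paired  = names.zip(scores)      // RDD[(String, Int)]
    val indexed = names.zipWithIndex()   // RDD[(String, Long)]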
zipPartitions(JavaRDDLike<U, ?>, FlatMapFunction2<Iterator<T>, Iterator<U>, V>) - Method in interface org.apache.spark.api.java.JavaRDDLike
Zip this RDD's partitions with one (or more) RDD(s) and return a new RDD by applying a function to the zipped partitions.
zipPartitions(RDD<B>, boolean, Function2<Iterator<T>, Iterator<B>, Iterator<V>>, ClassTag<B>, ClassTag<V>) - Method in class org.apache.spark.rdd.RDD
Zip this RDD's partitions with one (or more) RDD(s) and return a new RDD by applying a function to the zipped partitions.
zipPartitions(RDD<B>, Function2<Iterator<T>, Iterator<B>, Iterator<V>>, ClassTag<B>, ClassTag<V>) - Method in class org.apache.spark.rdd.RDD
 
zipPartitions(RDD<B>, RDD<C>, boolean, Function3<Iterator<T>, Iterator<B>, Iterator<C>, Iterator<V>>, ClassTag<B>, ClassTag<C>, ClassTag<V>) - Method in class org.apache.spark.rdd.RDD
 
zipPartitions(RDD<B>, RDD<C>, Function3<Iterator<T>, Iterator<B>, Iterator<C>, Iterator<V>>, ClassTag<B>, ClassTag<C>, ClassTag<V>) - Method in class org.apache.spark.rdd.RDD
 
zipPartitions(RDD<B>, RDD<C>, RDD<D>, boolean, Function4<Iterator<T>, Iterator<B>, Iterator<C>, Iterator<D>, Iterator<V>>, ClassTag<B>, ClassTag<C>, ClassTag<D>, ClassTag<V>) - Method in class org.apache.spark.rdd.RDD
 
zipPartitions(RDD<B>, RDD<C>, RDD<D>, Function4<Iterator<T>, Iterator<B>, Iterator<C>, Iterator<D>, Iterator<V>>, ClassTag<B>, ClassTag<C>, ClassTag<D>, ClassTag<V>) - Method in class org.apache.spark.rdd.RDD
 
ZippedPartitionsBaseRDD<V> - Class in org.apache.spark.rdd
 
ZippedPartitionsBaseRDD(SparkContext, Seq<RDD<?>>, boolean, ClassTag<V>) - Constructor for class org.apache.spark.rdd.ZippedPartitionsBaseRDD
 
ZippedPartitionsPartition - Class in org.apache.spark.rdd
 
ZippedPartitionsPartition(int, Seq<RDD<?>>, Seq<String>) - Constructor for class org.apache.spark.rdd.ZippedPartitionsPartition
 
ZippedPartitionsRDD2<A,B,V> - Class in org.apache.spark.rdd
 
ZippedPartitionsRDD2(SparkContext, Function2<Iterator<A>, Iterator<B>, Iterator<V>>, RDD<A>, RDD<B>, boolean, ClassTag<A>, ClassTag<B>, ClassTag<V>) - Constructor for class org.apache.spark.rdd.ZippedPartitionsRDD2
 
ZippedPartitionsRDD3<A,B,C,V> - Class in org.apache.spark.rdd
 
ZippedPartitionsRDD3(SparkContext, Function3<Iterator<A>, Iterator<B>, Iterator<C>, Iterator<V>>, RDD<A>, RDD<B>, RDD<C>, boolean, ClassTag<A>, ClassTag<B>, ClassTag<C>, ClassTag<V>) - Constructor for class org.apache.spark.rdd.ZippedPartitionsRDD3
 
ZippedPartitionsRDD4<A,B,C,D,V> - Class in org.apache.spark.rdd
 
ZippedPartitionsRDD4(SparkContext, Function4<Iterator<A>, Iterator<B>, Iterator<C>, Iterator<D>, Iterator<V>>, RDD<A>, RDD<B>, RDD<C>, RDD<D>, boolean, ClassTag<A>, ClassTag<B>, ClassTag<C>, ClassTag<D>, ClassTag<V>) - Constructor for class org.apache.spark.rdd.ZippedPartitionsRDD4
 
ZippedWithIndexRDD<T> - Class in org.apache.spark.rdd
Represents an RDD zipped with its element indices.
ZippedWithIndexRDD(RDD<T>, ClassTag<T>) - Constructor for class org.apache.spark.rdd.ZippedWithIndexRDD
 
ZippedWithIndexRDDPartition - Class in org.apache.spark.rdd
 
ZippedWithIndexRDDPartition(Partition, long) - Constructor for class org.apache.spark.rdd.ZippedWithIndexRDDPartition
 
zipWithIndex() - Method in interface org.apache.spark.api.java.JavaRDDLike
Zips this RDD with its element indices.
zipWithIndex() - Method in class org.apache.spark.rdd.RDD
Zips this RDD with its element indices.
zipWithUniqueId() - Method in interface org.apache.spark.api.java.JavaRDDLike
Zips this RDD with generated unique Long ids.
zipWithUniqueId() - Method in class org.apache.spark.rdd.RDD
Zips this RDD with generated unique Long ids.

_

_1() - Method in class org.apache.spark.util.MutablePair
 
_2() - Method in class org.apache.spark.util.MutablePair
 
_message() - Method in class org.apache.spark.scheduler.SlaveLost
 
_rddInfoMap() - Method in class org.apache.spark.ui.storage.StorageListener
 
A B C D E F G H I J K L M N O P Q R S T U V W X Y Z _