
A

abort(String) - Method in class org.apache.spark.scheduler.TaskSetManager
 
abortStage(Stage, String) - Method in class org.apache.spark.scheduler.DAGScheduler
Aborts all jobs depending on a particular Stage.
abs(Column) - Static method in class org.apache.spark.sql.functions
Computes the absolute value.
AbsoluteError - Class in org.apache.spark.mllib.tree.loss
:: DeveloperApi :: Class for absolute error loss calculation (for regression).
AbsoluteError() - Constructor for class org.apache.spark.mllib.tree.loss.AbsoluteError
 
AbstractJavaDStreamLike<T,This extends JavaDStreamLike<T,This,R>,R extends JavaRDDLike<T,R>> - Class in org.apache.spark.streaming.api.java
As a workaround for https://issues.scala-lang.org/browse/SI-8905, implementations of JavaDStreamLike should extend this dummy abstract class instead of directly inheriting from the trait.
AbstractJavaDStreamLike() - Constructor for class org.apache.spark.streaming.api.java.AbstractJavaDStreamLike
 
AbstractJavaRDDLike<T,This extends JavaRDDLike<T,This>> - Class in org.apache.spark.api.java
As a workaround for https://issues.scala-lang.org/browse/SI-8905, implementations of JavaRDDLike should extend this dummy abstract class instead of directly inheriting from the trait.
AbstractJavaRDDLike() - Constructor for class org.apache.spark.api.java.AbstractJavaRDDLike
 
accept(File, String) - Method in class org.apache.spark.rdd.PipedRDD.NotEqualsFileNameFilter
 
AcceptanceResult - Class in org.apache.spark.util.random
Object used by seqOp to keep track of the number of items accepted and items waitlisted per stratum, as well as the bounds for accepting and waitlisting items.
AcceptanceResult(long, long) - Constructor for class org.apache.spark.util.random.AcceptanceResult
 
acceptBound() - Method in class org.apache.spark.util.random.AcceptanceResult
 
Accumulable<R,T> - Class in org.apache.spark
A data type that can be accumulated, i.e. has a commutative and associative "add" operation, but where the result type, R, may be different from the element type being added, T.
Accumulable(R, AccumulableParam<R, T>, Option<String>) - Constructor for class org.apache.spark.Accumulable
 
Accumulable(R, AccumulableParam<R, T>) - Constructor for class org.apache.spark.Accumulable
 
accumulable(T, AccumulableParam<T, R>) - Method in class org.apache.spark.api.java.JavaSparkContext
Create an Accumulable shared variable of the given type, to which tasks can "add" values with add.
accumulable(T, String, AccumulableParam<T, R>) - Method in class org.apache.spark.api.java.JavaSparkContext
Create an Accumulable shared variable of the given type, to which tasks can "add" values with add.
accumulable(R, AccumulableParam<R, T>) - Method in class org.apache.spark.SparkContext
Create an Accumulable shared variable, to which tasks can add values with +=.
accumulable(R, String, AccumulableParam<R, T>) - Method in class org.apache.spark.SparkContext
Create an Accumulable shared variable, with a name for display in the Spark UI.
accumulableCollection(R, Function1<R, Growable<T>>, ClassTag<R>) - Method in class org.apache.spark.SparkContext
Create an accumulator from a "mutable collection" type.
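A minimal, hedged sketch of accumulating into a collection on the driver with accumulableCollection; the SparkConf settings and record values below are placeholders:

    import scala.collection.mutable.ArrayBuffer
    import org.apache.spark.{SparkConf, SparkContext}

    val sc = new SparkContext(new SparkConf().setAppName("acc-sketch").setMaster("local[2]"))

    // Collect malformed records into a growable collection readable on the driver.
    val badRecords = sc.accumulableCollection(ArrayBuffer[String]())
    sc.parallelize(Seq("1", "x", "3")).foreach { s =>
      if (!s.forall(_.isDigit)) badRecords += s   // tasks may only add
    }
    println(badRecords.value)                     // ArrayBuffer(x), read on the driver only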
AccumulableInfo - Class in org.apache.spark.scheduler
:: DeveloperApi :: Information about an Accumulable modified during a task or stage.
AccumulableInfo(long, String, Option<String>, String) - Constructor for class org.apache.spark.scheduler.AccumulableInfo
 
accumulableInfoFromJson(JsonAST.JValue) - Static method in class org.apache.spark.util.JsonProtocol
 
accumulableInfoToJson(AccumulableInfo) - Static method in class org.apache.spark.util.JsonProtocol
 
AccumulableParam<R,T> - Interface in org.apache.spark
Helper object defining how to accumulate values of a particular type.
accumulables() - Method in class org.apache.spark.scheduler.StageInfo
Terminal values of accumulables updated during this stage.
accumulables() - Method in class org.apache.spark.scheduler.TaskInfo
Intermediate updates to accumulables during this task.
accumulables() - Method in class org.apache.spark.ui.jobs.UIData.StageUIData
 
Accumulator<T> - Class in org.apache.spark
A simpler value of Accumulable where the result type being accumulated is the same as the types of elements being merged, i.e. variables that are only "added" to through an associative operation.
Accumulator(T, AccumulatorParam<T>, Option<String>) - Constructor for class org.apache.spark.Accumulator
 
Accumulator(T, AccumulatorParam<T>) - Constructor for class org.apache.spark.Accumulator
 
accumulator(int) - Method in class org.apache.spark.api.java.JavaSparkContext
Create an Accumulator integer variable, which tasks can "add" values to using the add method.
accumulator(int, String) - Method in class org.apache.spark.api.java.JavaSparkContext
Create an Accumulator integer variable, which tasks can "add" values to using the add method.
accumulator(double) - Method in class org.apache.spark.api.java.JavaSparkContext
Create an Accumulator double variable, which tasks can "add" values to using the add method.
accumulator(double, String) - Method in class org.apache.spark.api.java.JavaSparkContext
Create an Accumulator double variable, which tasks can "add" values to using the add method.
accumulator(T, AccumulatorParam<T>) - Method in class org.apache.spark.api.java.JavaSparkContext
Create an Accumulator variable of a given type, which tasks can "add" values to using the add method.
accumulator(T, String, AccumulatorParam<T>) - Method in class org.apache.spark.api.java.JavaSparkContext
Create an Accumulator variable of a given type, which tasks can "add" values to using the add method.
accumulator(T, AccumulatorParam<T>) - Method in class org.apache.spark.SparkContext
Create an Accumulator variable of a given type, which tasks can "add" values to using the += method.
accumulator(T, String, AccumulatorParam<T>) - Method in class org.apache.spark.SparkContext
Create an Accumulator variable of a given type, with a name for display in the Spark UI.
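A short sketch of a named integer accumulator, assuming an existing SparkContext sc; the name "errors" only controls how the accumulator is displayed in the UI:

    // Tasks add with +=; only the driver reads the value.
    val errorCount = sc.accumulator(0, "errors")
    sc.parallelize(1 to 100).foreach { i =>
      if (i % 7 == 0) errorCount += 1
    }
    println(errorCount.value)   // 14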
accumulator() - Method in class org.apache.spark.sql.UserDefinedPythonFunction
 
AccumulatorParam<T> - Interface in org.apache.spark
A simpler version of AccumulableParam where the only data type you can add in is the same type as the accumulated value.
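A hedged sketch of a custom AccumulatorParam; the element type (Array[Double]) and the object name are illustrative only, and sc is assumed to be an existing SparkContext:

    import org.apache.spark.AccumulatorParam

    // Element-wise sum of fixed-length Double arrays.
    object ArraySumParam extends AccumulatorParam[Array[Double]] {
      def zero(initial: Array[Double]): Array[Double] = new Array[Double](initial.length)
      def addInPlace(a: Array[Double], b: Array[Double]): Array[Double] = {
        var i = 0
        while (i < a.length) { a(i) += b(i); i += 1 }
        a
      }
    }

    val sums = sc.accumulator(Array(0.0, 0.0))(ArraySumParam)
    sc.parallelize(Seq(Array(1.0, 2.0), Array(3.0, 4.0))).foreach(sums += _)
    println(sums.value.mkString(","))   // 4.0,6.0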
AccumulatorParam.DoubleAccumulatorParam$ - Class in org.apache.spark
 
AccumulatorParam.DoubleAccumulatorParam$() - Constructor for class org.apache.spark.AccumulatorParam.DoubleAccumulatorParam$
 
AccumulatorParam.FloatAccumulatorParam$ - Class in org.apache.spark
 
AccumulatorParam.FloatAccumulatorParam$() - Constructor for class org.apache.spark.AccumulatorParam.FloatAccumulatorParam$
 
AccumulatorParam.IntAccumulatorParam$ - Class in org.apache.spark
 
AccumulatorParam.IntAccumulatorParam$() - Constructor for class org.apache.spark.AccumulatorParam.IntAccumulatorParam$
 
AccumulatorParam.LongAccumulatorParam$ - Class in org.apache.spark
 
AccumulatorParam.LongAccumulatorParam$() - Constructor for class org.apache.spark.AccumulatorParam.LongAccumulatorParam$
 
Accumulators - Class in org.apache.spark
 
Accumulators() - Constructor for class org.apache.spark.Accumulators
 
accumUpdates() - Method in class org.apache.spark.scheduler.CompletionEvent
 
accumUpdates() - Method in class org.apache.spark.scheduler.DirectTaskResult
 
accuracy() - Method in class org.apache.spark.mllib.evaluation.MultilabelMetrics
Returns accuracy
aclsEnabled() - Method in class org.apache.spark.SecurityManager
Check to see if Acls for the UI are enabled
active() - Method in class org.apache.spark.streaming.scheduler.ReceiverInfo
 
activeExecutorIds() - Method in class org.apache.spark.scheduler.TaskSchedulerImpl
 
ActiveJob - Class in org.apache.spark.scheduler
Tracks information about an active job in the DAGScheduler.
ActiveJob(int, Stage, Function2<TaskContext, Iterator<Object>, ?>, int[], CallSite, JobListener, Properties) - Constructor for class org.apache.spark.scheduler.ActiveJob
 
activeJobs() - Method in class org.apache.spark.scheduler.DAGScheduler
 
activeJobs() - Method in class org.apache.spark.ui.jobs.JobProgressListener
 
activeStages() - Method in class org.apache.spark.ui.jobs.JobProgressListener
 
activeTasks() - Method in class org.apache.spark.ui.exec.ExecutorSummaryInfo
 
activeTaskSets() - Method in class org.apache.spark.scheduler.TaskSchedulerImpl
 
actor() - Method in class org.apache.spark.streaming.scheduler.ReceiverInfo
 
ACTOR_NAME() - Static method in class org.apache.spark.scheduler.cluster.CoarseGrainedSchedulerBackend
 
ACTOR_NAME() - Static method in class org.apache.spark.scheduler.cluster.YarnSchedulerBackend
 
ActorHelper - Interface in org.apache.spark.streaming.receiver
:: DeveloperApi :: A receiver trait to be mixed in with your Actor to gain access to the API for pushing received data into Spark Streaming for being processed.
ActorLogReceive - Interface in org.apache.spark.util
A trait to enable logging all Akka actor messages.
ActorReceiver<T> - Class in org.apache.spark.streaming.receiver
Provides Actors as receivers for receiving streams.
ActorReceiver(Props, String, StorageLevel, SupervisorStrategy, ClassTag<T>) - Constructor for class org.apache.spark.streaming.receiver.ActorReceiver
 
ActorReceiver.Supervisor - Class in org.apache.spark.streaming.receiver
 
ActorReceiver.Supervisor() - Constructor for class org.apache.spark.streaming.receiver.ActorReceiver.Supervisor
 
ActorReceiverData - Interface in org.apache.spark.streaming.receiver
Case class to receive data sent by child actors
actorStream(Props, String, StorageLevel, SupervisorStrategy) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
Create an input stream with any arbitrary user implemented actor receiver.
actorStream(Props, String, StorageLevel) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
Create an input stream with any arbitrary user implemented actor receiver.
actorStream(Props, String) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
Create an input stream with any arbitrary user implemented actor receiver.
actorStream(Props, String, StorageLevel, SupervisorStrategy, ClassTag<T>) - Method in class org.apache.spark.streaming.StreamingContext
Create an input stream with any arbitrary user implemented actor receiver.
ActorSupervisorStrategy - Class in org.apache.spark.streaming.receiver
:: DeveloperApi :: A helper with a set of defaults for supervisor strategy
ActorSupervisorStrategy() - Constructor for class org.apache.spark.streaming.receiver.ActorSupervisorStrategy
 
actorSystem() - Method in class org.apache.spark.scheduler.cluster.CoarseGrainedSchedulerBackend
 
actorSystem() - Method in class org.apache.spark.SparkEnv
 
actualSize(Row, int) - Method in class org.apache.spark.sql.columnar.ByteArrayColumnType
 
actualSize(Row, int) - Method in class org.apache.spark.sql.columnar.ColumnType
Returns the size of the value row(ordinal).
actualSize(Row, int) - Static method in class org.apache.spark.sql.columnar.STRING
 
add(T) - Method in class org.apache.spark.Accumulable
Add more data to this accumulator / accumulable
add(Map<Object, Object>) - Static method in class org.apache.spark.Accumulators
 
add(long, long, ED) - Method in class org.apache.spark.graphx.impl.EdgePartitionBuilder
Add a new edge to the partition.
add(long, long, int, int, ED) - Method in class org.apache.spark.graphx.impl.ExistingEdgePartitionBuilder
Add a new edge to the partition.
add(float[], double, double) - Method in class org.apache.spark.ml.recommendation.ALS.NormalEquation
Adds an observation.
add(ALS.Rating<ID>) - Method in class org.apache.spark.ml.recommendation.ALS.RatingBlockBuilder
Adds a rating.
add(int, Object, int[], float[]) - Method in class org.apache.spark.ml.recommendation.ALS.UncompressedInBlockBuilder
Adds a dst block of (srcId, dstLocalIndex, rating) tuples.
add(double[], MultivariateGaussian[], ExpectationSum, Vector<Object>) - Static method in class org.apache.spark.mllib.clustering.ExpectationSum
 
add(Vector) - Method in class org.apache.spark.mllib.feature.IDF.DocumentFrequencyAggregator
Adds a new document.
add(Iterable<T>, long) - Method in class org.apache.spark.mllib.fpm.FPTree
Adds a transaction with count.
add(BlockMatrix) - Method in class org.apache.spark.mllib.linalg.distributed.BlockMatrix
Adds two block matrices together.
add(Vector) - Method in class org.apache.spark.mllib.stat.MultivariateOnlineSummarizer
Add a new sample to this summarizer, and update the statistical summary.
add(ImpurityCalculator) - Method in class org.apache.spark.mllib.tree.impurity.ImpurityCalculator
Add the stats from another calculator into this one, modifying and returning this calculator.
add(long, long) - Static method in class org.apache.spark.streaming.util.RawTextHelper
 
add(Vector) - Method in class org.apache.spark.util.Vector
 
addAccumulator(R, T) - Method in interface org.apache.spark.AccumulableParam
Add additional data to the accumulator value.
addAccumulator(T, T) - Method in interface org.apache.spark.AccumulatorParam
 
addAccumulator(R, T) - Method in class org.apache.spark.GrowableAccumulableParam
 
addBinary(Binary) - Method in class org.apache.spark.sql.parquet.CatalystPrimitiveConverter
 
addBinary(Binary) - Method in class org.apache.spark.sql.parquet.CatalystPrimitiveStringConverter
 
addBlock(BlockId, BlockStatus) - Method in class org.apache.spark.storage.StorageStatus
Add the given block to this storage status.
AddBlock - Class in org.apache.spark.streaming.scheduler
 
AddBlock(ReceivedBlockInfo) - Constructor for class org.apache.spark.streaming.scheduler.AddBlock
 
addBlock(ReceivedBlockInfo) - Method in class org.apache.spark.streaming.scheduler.ReceivedBlockTracker
Add received block.
addBoolean(boolean) - Method in class org.apache.spark.sql.parquet.CatalystPrimitiveConverter
 
addData(Object) - Method in class org.apache.spark.streaming.receiver.BlockGenerator
Push a single data item into the buffer.
addDataWithCallback(Object, Object) - Method in class org.apache.spark.streaming.receiver.BlockGenerator
Push a single data item into the buffer.
addDouble(double) - Method in class org.apache.spark.sql.parquet.CatalystPrimitiveConverter
 
addedFiles() - Method in class org.apache.spark.SparkContext
 
addedJars() - Method in class org.apache.spark.SparkContext
 
addFile(String) - Method in class org.apache.spark.api.java.JavaSparkContext
Add a file to be downloaded with this Spark job on every node.
addFile(File) - Method in class org.apache.spark.HttpFileServer
 
addFile(String) - Method in class org.apache.spark.SparkContext
Add a file to be downloaded with this Spark job on every node.
addFile(String, boolean) - Method in class org.apache.spark.SparkContext
Add a file to be downloaded with this Spark job on every node.
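A minimal sketch of pairing addFile with SparkFiles.get to read the distributed copy on executors; the path is a placeholder and sc is assumed to be an existing SparkContext:

    import org.apache.spark.SparkFiles

    sc.addFile("/path/to/lookup.txt")

    val lineCounts = sc.parallelize(1 to 4).map { _ =>
      // Each executor resolves its local copy of the file by name.
      scala.io.Source.fromFile(SparkFiles.get("lookup.txt")).getLines().length
    }.collect()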
AddFile - Class in org.apache.spark.sql.hive.execution
 
AddFile(String) - Constructor for class org.apache.spark.sql.hive.execution.AddFile
 
addFileToDir(File, File) - Method in class org.apache.spark.HttpFileServer
 
addFilters(Seq<ServletContextHandler>, SparkConf) - Static method in class org.apache.spark.ui.JettyUtils
Add filters, if any, to the given list of ServletContextHandlers
addFloat(float) - Method in class org.apache.spark.sql.parquet.CatalystPrimitiveConverter
 
addGrid(Param<T>, Iterable<T>) - Method in class org.apache.spark.ml.tuning.ParamGridBuilder
Adds a param with multiple values (overwrites if the input param exists).
addGrid(DoubleParam, double[]) - Method in class org.apache.spark.ml.tuning.ParamGridBuilder
Adds a double param with multiple values.
addGrid(IntParam, int[]) - Method in class org.apache.spark.ml.tuning.ParamGridBuilder
Adds an int param with multiple values.
addGrid(FloatParam, float[]) - Method in class org.apache.spark.ml.tuning.ParamGridBuilder
Adds a float param with multiple values.
addGrid(LongParam, long[]) - Method in class org.apache.spark.ml.tuning.ParamGridBuilder
Adds a long param with multiple values.
addGrid(BooleanParam) - Method in class org.apache.spark.ml.tuning.ParamGridBuilder
Adds a boolean param with true and false.
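A hedged sketch of building a parameter grid, assuming the spark.ml LogisticRegression estimator and its regParam and maxIter params:

    import org.apache.spark.ml.classification.LogisticRegression
    import org.apache.spark.ml.tuning.ParamGridBuilder

    val lr = new LogisticRegression()
    val paramGrid = new ParamGridBuilder()
      .addGrid(lr.regParam, Array(0.1, 0.01))     // 2 values
      .addGrid(lr.maxIter, Array(10, 50, 100))    // x 3 values
      .build()

    println(paramGrid.length)   // 6 ParamMaps, one per combination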
addInPlace(R, R) - Method in interface org.apache.spark.AccumulableParam
Merge two accumulated values together.
addInPlace(double, double) - Method in class org.apache.spark.AccumulatorParam.DoubleAccumulatorParam$
 
addInPlace(float, float) - Method in class org.apache.spark.AccumulatorParam.FloatAccumulatorParam$
 
addInPlace(int, int) - Method in class org.apache.spark.AccumulatorParam.IntAccumulatorParam$
 
addInPlace(long, long) - Method in class org.apache.spark.AccumulatorParam.LongAccumulatorParam$
 
addInPlace(R, R) - Method in class org.apache.spark.GrowableAccumulableParam
 
addInPlace(double, double) - Method in class org.apache.spark.SparkContext.DoubleAccumulatorParam$
 
addInPlace(float, float) - Method in class org.apache.spark.SparkContext.FloatAccumulatorParam$
 
addInPlace(int, int) - Method in class org.apache.spark.SparkContext.IntAccumulatorParam$
 
addInPlace(long, long) - Method in class org.apache.spark.SparkContext.LongAccumulatorParam$
 
addInPlace(Vector) - Method in class org.apache.spark.util.Vector
 
addInPlace(Vector, Vector) - Method in class org.apache.spark.util.Vector.VectorAccumParam$
 
addInputStream(InputDStream<?>) - Method in class org.apache.spark.streaming.DStreamGraph
 
addInt(int) - Method in class org.apache.spark.sql.parquet.CatalystPrimitiveConverter
 
addJar(String) - Method in class org.apache.spark.api.java.JavaSparkContext
Adds a JAR dependency for all tasks to be executed on this SparkContext in the future.
addJar(File) - Method in class org.apache.spark.HttpFileServer
 
addJar(String) - Method in class org.apache.spark.SparkContext
Adds a JAR dependency for all tasks to be executed on this SparkContext in the future.
AddJar - Class in org.apache.spark.sql.hive.execution
 
AddJar(String) - Constructor for class org.apache.spark.sql.hive.execution.AddJar
 
addListener(L) - Method in interface org.apache.spark.util.ListenerBus
Add a listener to listen to events.
addLocalConfiguration(String, int, int, int, JobConf) - Static method in class org.apache.spark.rdd.HadoopRDD
Add Hadoop configuration specific to a single partition and attempt.
addLong(long) - Method in class org.apache.spark.sql.parquet.CatalystPrimitiveConverter
 
addOnCompleteCallback(Function0<BoxedUnit>) - Method in class org.apache.spark.TaskContext
Adds a callback function to be executed on task completion.
addOnCompleteCallback(Function0<BoxedUnit>) - Method in class org.apache.spark.TaskContextImpl
 
addOutputColumn(StructType, String, DataType) - Method in interface org.apache.spark.ml.param.Params
 
addOutputLoc(int, MapStatus) - Method in class org.apache.spark.scheduler.Stage
 
addOutputStream(DStream<?>) - Method in class org.apache.spark.streaming.DStreamGraph
 
addPartitioningAttributes(Seq<Attribute>) - Method in class org.apache.spark.sql.hive.HiveStrategies.ParquetConversion.LogicalPlanHacks
 
addPartToPGroup(Partition, PartitionGroup) - Method in class org.apache.spark.rdd.PartitionCoalescer
 
address() - Method in class org.apache.spark.storage.ShuffleBlockFetcherIterator.FetchRequest
 
address(String, String, String, Object, String) - Static method in class org.apache.spark.util.AkkaUtils
 
addresses() - Method in class org.apache.spark.streaming.flume.FlumePollingInputDStream
 
addRunningTask(long) - Method in class org.apache.spark.scheduler.TaskSetManager
If the given task ID is not in the set of running tasks, adds it.
addSchedulable(Schedulable) - Method in class org.apache.spark.scheduler.Pool
 
addSchedulable(Schedulable) - Method in interface org.apache.spark.scheduler.Schedulable
 
addSchedulable(Schedulable) - Method in class org.apache.spark.scheduler.TaskSetManager
 
addSparkListener(SparkListener) - Method in class org.apache.spark.SparkContext
:: DeveloperApi :: Register a listener to receive up-calls from events that happen during execution.
addStreamingListener(StreamingListener) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
Add a StreamingListener object for receiving system events related to streaming.
addStreamingListener(StreamingListener) - Method in class org.apache.spark.streaming.StreamingContext
Add a StreamingListener object for receiving system events related to streaming.
addTaskCompletionListener(TaskCompletionListener) - Method in class org.apache.spark.TaskContext
Adds a (Java friendly) listener to be executed on task completion.
addTaskCompletionListener(Function1<TaskContext, BoxedUnit>) - Method in class org.apache.spark.TaskContext
Adds a listener in the form of a Scala closure to be executed on task completion.
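A small sketch of registering a completion callback from inside a partition computation; sc is assumed to exist, and TaskContext.get() is used to reach the current task's context:

    import org.apache.spark.TaskContext

    sc.parallelize(1 to 10, 2).mapPartitions { iter =>
      val ctx = TaskContext.get()
      ctx.addTaskCompletionListener { _ =>
        // Runs when the task finishes, whether it succeeded or failed (executor-side log).
        println(s"task for partition ${ctx.partitionId()} done")
      }
      iter.map(_ * 2)
    }.collect()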
addTaskCompletionListener(TaskCompletionListener) - Method in class org.apache.spark.TaskContextImpl
 
addTaskCompletionListener(Function1<TaskContext, BoxedUnit>) - Method in class org.apache.spark.TaskContextImpl
 
addTaskSetManager(Schedulable, Properties) - Method in class org.apache.spark.scheduler.FairSchedulableBuilder
 
addTaskSetManager(Schedulable, Properties) - Method in class org.apache.spark.scheduler.FIFOSchedulableBuilder
 
addTaskSetManager(Schedulable, Properties) - Method in interface org.apache.spark.scheduler.SchedulableBuilder
 
addURL(URL) - Method in class org.apache.spark.util.ChildFirstURLClassLoader
 
addURL(URL) - Method in class org.apache.spark.util.MutableURLClassLoader
 
addValueFromDictionary(int) - Method in class org.apache.spark.sql.parquet.CatalystPrimitiveStringConverter
 
adminAcls() - Method in class org.apache.spark.scheduler.ApplicationEventListener
 
advance(long) - Method in class org.apache.spark.util.ManualClock
 
advanceCheckpoint() - Method in class org.apache.spark.streaming.kinesis.KinesisCheckpointState
Advance the checkpoint clock by the checkpoint interval.
agg(Column, Column...) - Method in class org.apache.spark.sql.DataFrame
Aggregates on the entire DataFrame without groups.
agg(Tuple2<String, String>, Seq<Tuple2<String, String>>) - Method in class org.apache.spark.sql.DataFrame
(Scala-specific) Compute aggregates by specifying a map from column name to aggregate methods.
agg(Map<String, String>) - Method in class org.apache.spark.sql.DataFrame
(Scala-specific) Aggregates on the entire DataFrame without groups.
agg(Map<String, String>) - Method in class org.apache.spark.sql.DataFrame
(Java-specific) Aggregates on the entire DataFrame without groups.
agg(Column, Seq<Column>) - Method in class org.apache.spark.sql.DataFrame
Aggregates on the entire DataFrame without groups.
agg(Column, Column...) - Method in class org.apache.spark.sql.GroupedData
Compute aggregates by specifying a series of aggregate columns.
agg(Tuple2<String, String>, Seq<Tuple2<String, String>>) - Method in class org.apache.spark.sql.GroupedData
(Scala-specific) Compute aggregates by specifying a map from column name to aggregate methods.
agg(Map<String, String>) - Method in class org.apache.spark.sql.GroupedData
(Scala-specific) Compute aggregates by specifying a map from column name to aggregate methods.
agg(Map<String, String>) - Method in class org.apache.spark.sql.GroupedData
(Java-specific) Compute aggregates by specifying a map from column name to aggregate methods.
agg(Column, Seq<Column>) - Method in class org.apache.spark.sql.GroupedData
Compute aggregates by specifying a series of aggregate columns.
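A hedged sketch of both agg styles on grouped data; df and its "department", "salary", and "age" columns are hypothetical:

    import org.apache.spark.sql.functions._

    // Column-based aggregates.
    df.groupBy(df("department"))
      .agg(max(df("salary")), avg(df("salary")))
      .show()

    // Map-based variant: column name -> aggregate method name.
    df.groupBy("department").agg(Map("salary" -> "max", "age" -> "avg")).show()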
aggregate(U, Function2<U, T, U>, Function2<U, U, U>) - Method in interface org.apache.spark.api.java.JavaRDDLike
Aggregate the elements of each partition, and then the results for all the partitions, using given combine functions and a neutral "zero value".
aggregate(U, Function2<U, T, U>, Function2<U, U, U>, ClassTag<U>) - Method in class org.apache.spark.rdd.RDD
Aggregate the elements of each partition, and then the results for all the partitions, using given combine functions and a neutral "zero value".
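A short sketch of aggregate computing a (sum, count) pair in one pass, assuming an existing SparkContext sc:

    val data = sc.parallelize(1 to 100)
    val (sum, count) = data.aggregate((0L, 0L))(
      (acc, x) => (acc._1 + x, acc._2 + 1),      // fold an element into a partition's accumulator
      (a, b)   => (a._1 + b._1, a._2 + b._2)     // merge accumulators across partitions
    )
    println(sum.toDouble / count)                // 50.5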
aggregateByKey(U, Partitioner, Function2<U, V, U>, Function2<U, U, U>) - Method in class org.apache.spark.api.java.JavaPairRDD
Aggregate the values of each key, using given combine functions and a neutral "zero value".
aggregateByKey(U, int, Function2<U, V, U>, Function2<U, U, U>) - Method in class org.apache.spark.api.java.JavaPairRDD
Aggregate the values of each key, using given combine functions and a neutral "zero value".
aggregateByKey(U, Function2<U, V, U>, Function2<U, U, U>) - Method in class org.apache.spark.api.java.JavaPairRDD
Aggregate the values of each key, using given combine functions and a neutral "zero value".
aggregateByKey(U, Partitioner, Function2<U, V, U>, Function2<U, U, U>, ClassTag<U>) - Method in class org.apache.spark.rdd.PairRDDFunctions
Aggregate the values of each key, using given combine functions and a neutral "zero value".
aggregateByKey(U, int, Function2<U, V, U>, Function2<U, U, U>, ClassTag<U>) - Method in class org.apache.spark.rdd.PairRDDFunctions
Aggregate the values of each key, using given combine functions and a neutral "zero value".
aggregateByKey(U, Function2<U, V, U>, Function2<U, U, U>, ClassTag<U>) - Method in class org.apache.spark.rdd.PairRDDFunctions
Aggregate the values of each key, using given combine functions and a neutral "zero value".
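A hedged sketch of aggregateByKey computing a per-key mean, with toy data and an assumed SparkContext sc:

    val pairs = sc.parallelize(Seq(("a", 1.0), ("b", 2.0), ("a", 3.0)))

    val means = pairs
      .aggregateByKey((0.0, 0L))(
        (acc, v) => (acc._1 + v, acc._2 + 1),    // within a partition
        (a, b)   => (a._1 + b._1, a._2 + b._2))  // across partitions
      .mapValues { case (sum, count) => sum / count }

    means.collect()   // Array((a,2.0), (b,2.0))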
aggregateMessages(Function1<EdgeContext<VD, ED, A>, BoxedUnit>, Function2<A, A, A>, TripletFields, ClassTag<A>) - Method in class org.apache.spark.graphx.Graph
Aggregates values from the neighboring edges and vertices of each vertex.
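A minimal sketch of aggregateMessages; graph is assumed to be an existing Graph[Double, Int], so vertex attributes are Doubles and edge attributes are Ints:

    import org.apache.spark.graphx._

    val weightedInSums: VertexRDD[Double] = graph.aggregateMessages[Double](
      ctx => ctx.sendToDst(ctx.srcAttr * ctx.attr),   // send a message along every edge
      (a, b) => a + b,                                // merge messages at each destination vertex
      TripletFields.Src                               // only source attributes are needed
    )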
aggregateMessagesEdgeScan(Function1<EdgeContext<VD, ED, A>, BoxedUnit>, Function2<A, A, A>, TripletFields, EdgeActiveness, ClassTag<A>) - Method in class org.apache.spark.graphx.impl.EdgePartition
Send messages along edges and aggregate them at the receiving vertices.
aggregateMessagesIndexScan(Function1<EdgeContext<VD, ED, A>, BoxedUnit>, Function2<A, A, A>, TripletFields, EdgeActiveness, ClassTag<A>) - Method in class org.apache.spark.graphx.impl.EdgePartition
Send messages along edges and aggregate them at the receiving vertices.
aggregateMessagesWithActiveSet(Function1<EdgeContext<VD, ED, A>, BoxedUnit>, Function2<A, A, A>, TripletFields, Option<Tuple2<VertexRDD<?>, EdgeDirection>>, ClassTag<A>) - Method in class org.apache.spark.graphx.Graph
Aggregates values from the neighboring edges and vertices of each vertex.
aggregateMessagesWithActiveSet(Function1<EdgeContext<VD, ED, A>, BoxedUnit>, Function2<A, A, A>, TripletFields, Option<Tuple2<VertexRDD<?>, EdgeDirection>>, ClassTag<A>) - Method in class org.apache.spark.graphx.impl.GraphImpl
 
aggregateSizeForNode(DecisionTreeMetadata, Option<int[]>) - Static method in class org.apache.spark.mllib.tree.RandomForest
Get the number of values to be stored for this node in the bin aggregates.
aggregateUsingIndex(Iterator<Product2<Object, VD2>>, Function2<VD2, VD2, VD2>, ClassTag<VD2>) - Method in class org.apache.spark.graphx.impl.VertexPartitionBaseOps
 
aggregateUsingIndex(RDD<Tuple2<Object, VD2>>, Function2<VD2, VD2, VD2>, ClassTag<VD2>) - Method in class org.apache.spark.graphx.impl.VertexRDDImpl
 
aggregateUsingIndex(RDD<Tuple2<Object, VD2>>, Function2<VD2, VD2, VD2>, ClassTag<VD2>) - Method in class org.apache.spark.graphx.VertexRDD
Aggregates vertices in messages that have the same ids using reduceFunc, returning a VertexRDD co-indexed with this.
AggregatingEdgeContext<VD,ED,A> - Class in org.apache.spark.graphx.impl
 
AggregatingEdgeContext(Function2<A, A, A>, Object, BitSet) - Constructor for class org.apache.spark.graphx.impl.AggregatingEdgeContext
 
Aggregator<K,V,C> - Class in org.apache.spark
:: DeveloperApi :: A set of functions used to aggregate data.
Aggregator(Function1<V, C>, Function2<C, V, C>, Function2<C, C, C>) - Constructor for class org.apache.spark.Aggregator
 
aggregator() - Method in class org.apache.spark.ShuffleDependency
 
akkaSSLOptions() - Method in class org.apache.spark.SecurityManager
 
AkkaUtils - Class in org.apache.spark.util
Various utility classes for working with Akka.
AkkaUtils() - Constructor for class org.apache.spark.util.AkkaUtils
 
Algo - Class in org.apache.spark.mllib.tree.configuration
:: Experimental :: Enum to select the algorithm for the decision tree
Algo() - Constructor for class org.apache.spark.mllib.tree.configuration.Algo
 
algo() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
 
algo() - Method in class org.apache.spark.mllib.tree.model.DecisionTreeModel
 
algo() - Method in class org.apache.spark.mllib.tree.model.GradientBoostedTreesModel
 
algo() - Method in class org.apache.spark.mllib.tree.model.RandomForestModel
 
algo() - Method in class org.apache.spark.mllib.tree.model.TreeEnsembleModel.SaveLoadV1_0$.Metadata
 
algorithm() - Method in class org.apache.spark.mllib.regression.StreamingLinearRegressionWithSGD
 
alias() - Method in class org.apache.spark.sql.hive.MetastoreRelation
 
aliasNames() - Method in class org.apache.spark.sql.hive.HiveGenericUdtf
 
All - Static variable in class org.apache.spark.graphx.TripletFields
Expose all the fields (source, edge, and destination).
all() - Static method in class org.apache.spark.sql.types.NativeType
 
AllCompressionSchemes - Interface in org.apache.spark.sql.columnar.compression
 
AllJobsCancelled - Class in org.apache.spark.scheduler
 
AllJobsCancelled() - Constructor for class org.apache.spark.scheduler.AllJobsCancelled
 
AllJobsPage - Class in org.apache.spark.ui.jobs
Page showing list of all ongoing and recently finished jobs
AllJobsPage(JobsTab) - Constructor for class org.apache.spark.ui.jobs.AllJobsPage
 
allJoinTokens() - Static method in class org.apache.spark.sql.hive.HiveQl
 
allocateBlocksToBatch(Time) - Method in class org.apache.spark.streaming.scheduler.ReceivedBlockTracker
Allocate all unallocated blocks to the given batch.
allocateBlocksToBatch(Time) - Method in class org.apache.spark.streaming.scheduler.ReceiverTracker
Allocate all unallocated blocks to the given batch.
AllocatedBlocks - Class in org.apache.spark.streaming.scheduler
Class representing the blocks of all the streams allocated to a batch
AllocatedBlocks(Map<Object, Seq<ReceivedBlockInfo>>) - Constructor for class org.apache.spark.streaming.scheduler.AllocatedBlocks
 
allocatedBlocks() - Method in class org.apache.spark.streaming.scheduler.BatchAllocationEvent
 
allowExisting() - Method in class org.apache.spark.sql.hive.execution.CreateMetastoreDataSource
 
allowExisting() - Method in class org.apache.spark.sql.hive.execution.CreateTableAsSelect
 
allowExisting() - Method in class org.apache.spark.sql.sources.CreateTableUsing
 
allowLocal() - Method in class org.apache.spark.scheduler.JobSubmitted
 
allPendingTasks() - Method in class org.apache.spark.scheduler.TaskSetManager
 
AllStagesPage - Class in org.apache.spark.ui.jobs
Page showing list of all ongoing and recently finished stages and pools
AllStagesPage(StagesTab) - Constructor for class org.apache.spark.ui.jobs.AllStagesPage
 
alpha() - Method in interface org.apache.spark.ml.recommendation.ALSParams
Param for the alpha parameter in the implicit preference formulation.
AlphaComponent - Annotation Type in org.apache.spark.annotation
A new component of Spark which may have unstable API's.
ALS - Class in org.apache.spark.ml.recommendation
Alternating Least Squares (ALS) matrix factorization.
ALS() - Constructor for class org.apache.spark.ml.recommendation.ALS
 
ALS - Class in org.apache.spark.mllib.recommendation
Alternating Least Squares matrix factorization.
ALS() - Constructor for class org.apache.spark.mllib.recommendation.ALS
Constructs an ALS instance with default parameters: {numBlocks: -1, rank: 10, iterations: 10, lambda: 0.01, implicitPrefs: false, alpha: 1.0}.
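A hedged sketch of training an explicit-feedback model with the mllib ALS object; the ratings below are toy values and sc is assumed to be an existing SparkContext:

    import org.apache.spark.mllib.recommendation.{ALS, Rating}

    val ratings = sc.parallelize(Seq(
      Rating(1, 10, 4.0),
      Rating(1, 20, 1.0),
      Rating(2, 10, 5.0)))

    // rank = 10, iterations = 10, lambda = 0.01
    val model = ALS.train(ratings, 10, 10, 0.01)
    model.predict(2, 20)   // predicted rating for user 2, product 20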
ALS.CholeskySolver - Class in org.apache.spark.ml.recommendation
Cholesky solver for least square problems.
ALS.CholeskySolver() - Constructor for class org.apache.spark.ml.recommendation.ALS.CholeskySolver
 
ALS.InBlock<ID> - Class in org.apache.spark.ml.recommendation
In-link block for computing src (user/item) factors.
ALS.InBlock(Object, int[], int[], float[], ClassTag<ID>) - Constructor for class org.apache.spark.ml.recommendation.ALS.InBlock
 
ALS.InBlock$ - Class in org.apache.spark.ml.recommendation
 
ALS.InBlock$() - Constructor for class org.apache.spark.ml.recommendation.ALS.InBlock$
 
ALS.LeastSquaresNESolver - Interface in org.apache.spark.ml.recommendation
Trait for least squares solvers applied to the normal equation.
ALS.LocalIndexEncoder - Class in org.apache.spark.ml.recommendation
Encoder for storing (blockId, localIndex) into a single integer.
ALS.LocalIndexEncoder(int) - Constructor for class org.apache.spark.ml.recommendation.ALS.LocalIndexEncoder
 
ALS.NNLSSolver - Class in org.apache.spark.ml.recommendation
NNLS solver.
ALS.NNLSSolver() - Constructor for class org.apache.spark.ml.recommendation.ALS.NNLSSolver
 
ALS.NormalEquation - Class in org.apache.spark.ml.recommendation
Representing a normal equation to solve the following weighted least squares problem:
ALS.NormalEquation(int) - Constructor for class org.apache.spark.ml.recommendation.ALS.NormalEquation
 
ALS.Rating<ID> - Class in org.apache.spark.ml.recommendation
Rating class for better code readability.
ALS.Rating(ID, ID, float) - Constructor for class org.apache.spark.ml.recommendation.ALS.Rating
 
ALS.Rating$ - Class in org.apache.spark.ml.recommendation
 
ALS.Rating$() - Constructor for class org.apache.spark.ml.recommendation.ALS.Rating$
 
ALS.RatingBlock<ID> - Class in org.apache.spark.ml.recommendation
A rating block that contains src IDs, dst IDs, and ratings, stored in primitive arrays.
ALS.RatingBlock(Object, Object, float[], ClassTag<ID>) - Constructor for class org.apache.spark.ml.recommendation.ALS.RatingBlock
 
ALS.RatingBlock$ - Class in org.apache.spark.ml.recommendation
 
ALS.RatingBlock$() - Constructor for class org.apache.spark.ml.recommendation.ALS.RatingBlock$
 
ALS.RatingBlockBuilder<ID> - Class in org.apache.spark.ml.recommendation
Builder for ALS.RatingBlock.
ALS.RatingBlockBuilder(ClassTag<ID>) - Constructor for class org.apache.spark.ml.recommendation.ALS.RatingBlockBuilder
 
ALS.UncompressedInBlock<ID> - Class in org.apache.spark.ml.recommendation
A block of (srcId, dstEncodedIndex, rating) tuples stored in primitive arrays.
ALS.UncompressedInBlock(Object, int[], float[], ClassTag<ID>, Ordering<ID>) - Constructor for class org.apache.spark.ml.recommendation.ALS.UncompressedInBlock
 
ALS.UncompressedInBlockBuilder<ID> - Class in org.apache.spark.ml.recommendation
Builder for uncompressed in-blocks of (srcId, dstEncodedIndex, rating) tuples.
ALS.UncompressedInBlockBuilder(ALS.LocalIndexEncoder, ClassTag<ID>, Ordering<ID>) - Constructor for class org.apache.spark.ml.recommendation.ALS.UncompressedInBlockBuilder
 
ALSModel - Class in org.apache.spark.ml.recommendation
Model fitted by ALS.
ALSModel(ALS, ParamMap, int, RDD<Tuple2<Object, float[]>>, RDD<Tuple2<Object, float[]>>) - Constructor for class org.apache.spark.ml.recommendation.ALSModel
 
ALSParams - Interface in org.apache.spark.ml.recommendation
Common params for ALS.
AnalysisException - Exception in org.apache.spark.sql
:: DeveloperApi :: Thrown when a query fails to analyze, usually because the query itself is invalid.
analyze(String) - Method in class org.apache.spark.sql.hive.HiveContext
Analyzes the given table in the current database to generate statistics, which will be used in query optimizations.
AnalyzeTable - Class in org.apache.spark.sql.hive.execution
Analyzes the given table in the current database to generate statistics, which will be used in query optimizations.
AnalyzeTable(String) - Constructor for class org.apache.spark.sql.hive.execution.AnalyzeTable
 
and(Column) - Method in class org.apache.spark.sql.Column
Boolean AND.
AND() - Static method in class org.apache.spark.sql.hive.HiveQl
 
And - Class in org.apache.spark.sql.sources
A filter that evaluates to true iff both left and right evaluate to true.
And(Filter, Filter) - Constructor for class org.apache.spark.sql.sources.And
 
ANY() - Static method in class org.apache.spark.scheduler.TaskLocality
 
anyNull() - Method in interface org.apache.spark.sql.Row
Returns true if there are any NULL values in this row.
append(boolean, ByteBuffer) - Static method in class org.apache.spark.sql.columnar.BOOLEAN
 
append(Row, int, ByteBuffer) - Static method in class org.apache.spark.sql.columnar.BOOLEAN
 
append(byte, ByteBuffer) - Static method in class org.apache.spark.sql.columnar.BYTE
 
append(Row, int, ByteBuffer) - Static method in class org.apache.spark.sql.columnar.BYTE
 
append(byte[], ByteBuffer) - Method in class org.apache.spark.sql.columnar.ByteArrayColumnType
 
append(JvmType, ByteBuffer) - Method in class org.apache.spark.sql.columnar.ColumnType
Appends the given value v of type T into the given ByteBuffer.
append(Row, int, ByteBuffer) - Method in class org.apache.spark.sql.columnar.ColumnType
Appends row(ordinal) of type T into the given ByteBuffer.
append(int, ByteBuffer) - Static method in class org.apache.spark.sql.columnar.DATE
 
append(double, ByteBuffer) - Static method in class org.apache.spark.sql.columnar.DOUBLE
 
append(Row, int, ByteBuffer) - Static method in class org.apache.spark.sql.columnar.DOUBLE
 
append(float, ByteBuffer) - Static method in class org.apache.spark.sql.columnar.FLOAT
 
append(Row, int, ByteBuffer) - Static method in class org.apache.spark.sql.columnar.FLOAT
 
append(int, ByteBuffer) - Static method in class org.apache.spark.sql.columnar.INT
 
append(Row, int, ByteBuffer) - Static method in class org.apache.spark.sql.columnar.INT
 
append(long, ByteBuffer) - Static method in class org.apache.spark.sql.columnar.LONG
 
append(Row, int, ByteBuffer) - Static method in class org.apache.spark.sql.columnar.LONG
 
append(short, ByteBuffer) - Static method in class org.apache.spark.sql.columnar.SHORT
 
append(Row, int, ByteBuffer) - Static method in class org.apache.spark.sql.columnar.SHORT
 
append(String, ByteBuffer) - Static method in class org.apache.spark.sql.columnar.STRING
 
append(Timestamp, ByteBuffer) - Static method in class org.apache.spark.sql.columnar.TIMESTAMP
 
append(AvroFlumeEvent) - Method in class org.apache.spark.streaming.flume.FlumeEventServer
 
appendBatch(List<AvroFlumeEvent>) - Method in class org.apache.spark.streaming.flume.FlumeEventServer
 
appendBias(Vector) - Static method in class org.apache.spark.mllib.util.MLUtils
Returns a new vector with 1.0 (bias) appended to the input vector.
appendFrom(Row, int) - Method in class org.apache.spark.sql.columnar.BasicColumnBuilder
 
appendFrom(Row, int) - Method in interface org.apache.spark.sql.columnar.ColumnBuilder
Appends row(ordinal) to the column builder.
appendFrom(Row, int) - Method in interface org.apache.spark.sql.columnar.compression.CompressibleColumnBuilder
 
appendFrom(Row, int) - Method in interface org.apache.spark.sql.columnar.NullableColumnBuilder
 
AppendingParquetOutputFormat - Class in org.apache.spark.sql.parquet
TODO: this will be able to append to directories it created itself, not necessarily to imported ones.
AppendingParquetOutputFormat(int) - Constructor for class org.apache.spark.sql.parquet.AppendingParquetOutputFormat
 
appendReadColumns(Configuration, Seq<Integer>, Seq<String>) - Static method in class org.apache.spark.sql.hive.HiveShim
 
appId() - Method in class org.apache.spark.scheduler.ApplicationEventListener
 
appId() - Method in class org.apache.spark.scheduler.cluster.mesos.CoarseMesosSchedulerBackend
 
appId() - Method in class org.apache.spark.scheduler.cluster.mesos.MesosSchedulerBackend
 
appId() - Method in class org.apache.spark.scheduler.cluster.SparkDeploySchedulerBackend
 
appId() - Method in interface org.apache.spark.scheduler.SchedulerBackend
 
appId() - Method in class org.apache.spark.scheduler.SparkListenerApplicationStart
 
appId() - Method in interface org.apache.spark.scheduler.TaskScheduler
 
applicationEndFromJson(JsonAST.JValue) - Static method in class org.apache.spark.util.JsonProtocol
 
applicationEndToJson(SparkListenerApplicationEnd) - Static method in class org.apache.spark.util.JsonProtocol
 
ApplicationEventListener - Class in org.apache.spark.scheduler
A simple listener for application events.
ApplicationEventListener() - Constructor for class org.apache.spark.scheduler.ApplicationEventListener
 
applicationId() - Method in class org.apache.spark.scheduler.cluster.mesos.CoarseMesosSchedulerBackend
 
applicationId() - Method in class org.apache.spark.scheduler.cluster.mesos.MesosSchedulerBackend
 
applicationId() - Method in class org.apache.spark.scheduler.cluster.SparkDeploySchedulerBackend
 
applicationId() - Method in class org.apache.spark.scheduler.local.LocalBackend
 
applicationId() - Method in interface org.apache.spark.scheduler.SchedulerBackend
Get an application ID associated with the job.
applicationId() - Method in interface org.apache.spark.scheduler.TaskScheduler
Get an application ID associated with the job.
applicationId() - Method in class org.apache.spark.scheduler.TaskSchedulerImpl
 
applicationId() - Method in class org.apache.spark.SparkContext
 
applicationStartFromJson(JsonAST.JValue) - Static method in class org.apache.spark.util.JsonProtocol
 
applicationStartToJson(SparkListenerApplicationStart) - Static method in class org.apache.spark.util.JsonProtocol
 
apply(RDD<Tuple2<Object, VD>>, RDD<Edge<ED>>, VD, StorageLevel, StorageLevel, ClassTag<VD>, ClassTag<ED>) - Static method in class org.apache.spark.graphx.Graph
Construct a graph from a collection of vertices and edges with attributes.
apply(RDD<Edge<ED>>, VD, StorageLevel, StorageLevel, ClassTag<VD>, ClassTag<ED>) - Static method in class org.apache.spark.graphx.impl.GraphImpl
Create a graph from edges, setting referenced vertices to `defaultVertexAttr`.
apply(RDD<Tuple2<Object, VD>>, RDD<Edge<ED>>, VD, StorageLevel, StorageLevel, ClassTag<VD>, ClassTag<ED>) - Static method in class org.apache.spark.graphx.impl.GraphImpl
Create a graph from vertices and edges, setting missing vertices to `defaultVertexAttr`.
apply(VertexRDD<VD>, EdgeRDD<ED>, ClassTag<VD>, ClassTag<ED>) - Static method in class org.apache.spark.graphx.impl.GraphImpl
Create a graph from a VertexRDD and an EdgeRDD with arbitrary replicated vertices.
apply(Iterator<Tuple2<Object, VD>>, ClassTag<VD>) - Static method in class org.apache.spark.graphx.impl.ShippableVertexPartition
Construct a `ShippableVertexPartition` from the given vertices without any routing table.
apply(Iterator<Tuple2<Object, VD>>, RoutingTablePartition, VD, ClassTag<VD>) - Static method in class org.apache.spark.graphx.impl.ShippableVertexPartition
Construct a ShippableVertexPartition from the given vertices with the specified routing table, filling in missing vertices mentioned in the routing table using defaultVal.
apply(Iterator<Tuple2<Object, VD>>, RoutingTablePartition, VD, Function2<VD, VD, VD>, ClassTag<VD>) - Static method in class org.apache.spark.graphx.impl.ShippableVertexPartition
Construct a ShippableVertexPartition from the given vertices with the specified routing table, filling in missing vertices mentioned in the routing table using defaultVal, and merging duplicate vertex attributes with mergeFunc.
apply(Iterator<Tuple2<Object, VD>>, ClassTag<VD>) - Static method in class org.apache.spark.graphx.impl.VertexPartition
Construct a `VertexPartition` from the given vertices.
apply(long) - Method in class org.apache.spark.graphx.impl.VertexPartitionBase
Return the vertex attribute for the given vertex ID.
apply(Graph<VD, ED>, A, int, EdgeDirection, Function3<Object, VD, A, VD>, Function1<EdgeTriplet<VD, ED>, Iterator<Tuple2<Object, A>>>, Function2<A, A, A>, ClassTag<VD>, ClassTag<ED>, ClassTag<A>) - Static method in class org.apache.spark.graphx.Pregel
Execute a Pregel-like iterative vertex-parallel abstraction.
apply(K) - Method in class org.apache.spark.graphx.util.collection.GraphXPrimitiveKeyOpenHashMap
Get the value for a given key
apply(RDD<Tuple2<Object, VD>>, ClassTag<VD>) - Static method in class org.apache.spark.graphx.VertexRDD
Constructs a standalone VertexRDD (one that is not set up for efficient joins with an EdgeRDD) from an RDD of vertex-attribute pairs.
apply(RDD<Tuple2<Object, VD>>, EdgeRDD<?>, VD, ClassTag<VD>) - Static method in class org.apache.spark.graphx.VertexRDD
Constructs a VertexRDD from an RDD of vertex-attribute pairs.
apply(RDD<Tuple2<Object, VD>>, EdgeRDD<?>, VD, Function2<VD, VD, VD>, ClassTag<VD>) - Static method in class org.apache.spark.graphx.VertexRDD
Constructs a VertexRDD from an RDD of vertex-attribute pairs.
apply(Param<T>) - Method in class org.apache.spark.ml.param.ParamMap
Gets the value of the input param or its default value if it does not exist.
apply(BinaryConfusionMatrix) - Method in interface org.apache.spark.mllib.evaluation.binary.BinaryClassificationMetricComputer
 
apply(BinaryConfusionMatrix) - Static method in class org.apache.spark.mllib.evaluation.binary.FalsePositiveRate
 
apply(BinaryConfusionMatrix) - Method in class org.apache.spark.mllib.evaluation.binary.FMeasure
 
apply(BinaryConfusionMatrix) - Static method in class org.apache.spark.mllib.evaluation.binary.Precision
 
apply(BinaryConfusionMatrix) - Static method in class org.apache.spark.mllib.evaluation.binary.Recall
 
apply(int) - Method in class org.apache.spark.mllib.linalg.DenseMatrix
 
apply(int, int) - Method in class org.apache.spark.mllib.linalg.DenseMatrix
 
apply(int) - Method in class org.apache.spark.mllib.linalg.DenseVector
 
apply(int, int, int, int) - Static method in class org.apache.spark.mllib.linalg.distributed.GridPartitioner
Creates a new GridPartitioner instance.
apply(int, int, int) - Static method in class org.apache.spark.mllib.linalg.distributed.GridPartitioner
Creates a new GridPartitioner instance with the input suggested number of partitions.
apply(int, int) - Method in interface org.apache.spark.mllib.linalg.Matrix
Gets the (i, j)-th element.
apply(int, int) - Method in class org.apache.spark.mllib.linalg.SparseMatrix
 
apply(int) - Method in interface org.apache.spark.mllib.linalg.Vector
Gets the value of the ith element.
apply(int, Predict, double, boolean) - Static method in class org.apache.spark.mllib.tree.model.Node
Construct a node with nodeIndex, predict, impurity and isLeaf parameters.
apply(String) - Static method in class org.apache.spark.rdd.PartitionGroup
 
apply(long, String, Option<String>, String) - Static method in class org.apache.spark.scheduler.AccumulableInfo
 
apply(long, String, String) - Static method in class org.apache.spark.scheduler.AccumulableInfo
 
apply(String, long, Enumeration.Value, ByteBuffer) - Method in class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.StatusUpdate$
Alternate factory method that takes a ByteBuffer directly for the data field
apply(BlockManagerId, long[]) - Static method in class org.apache.spark.scheduler.HighlyCompressedMapStatus
 
apply(long, TaskMetrics) - Static method in class org.apache.spark.scheduler.RuntimePercentage
 
apply(String) - Static method in class org.apache.spark.sql.Column
 
apply(Expression) - Static method in class org.apache.spark.sql.Column
 
apply(DataType) - Static method in class org.apache.spark.sql.columnar.ColumnType
 
apply(boolean, int, StorageLevel, SparkPlan, Option<String>) - Static method in class org.apache.spark.sql.columnar.InMemoryRelation
 
apply(String) - Method in class org.apache.spark.sql.DataFrame
Selects column based on the column name and return it as a Column.
apply(LogicalPlan) - Method in class org.apache.spark.sql.hive.HiveMetastoreCatalog.CreateTables
 
apply(LogicalPlan) - Method in class org.apache.spark.sql.hive.HiveMetastoreCatalog.ParquetConversions
 
apply(LogicalPlan) - Method in class org.apache.spark.sql.hive.HiveMetastoreCatalog.PreInsertionCasts
 
apply(LogicalPlan) - Method in class org.apache.spark.sql.hive.HiveStrategies.DataSinks
 
apply(LogicalPlan) - Method in class org.apache.spark.sql.hive.HiveStrategies.HiveCommandStrategy
 
apply(LogicalPlan) - Method in class org.apache.spark.sql.hive.HiveStrategies.HiveDDLStrategy
 
apply(LogicalPlan) - Method in class org.apache.spark.sql.hive.HiveStrategies.HiveTableScans
 
apply(LogicalPlan) - Method in class org.apache.spark.sql.hive.HiveStrategies.ParquetConversion
 
apply(LogicalPlan) - Method in class org.apache.spark.sql.hive.HiveStrategies.Scripts
 
apply(LogicalPlan) - Static method in class org.apache.spark.sql.hive.ResolveUdtfsAlias
 
apply(int, long) - Static method in class org.apache.spark.sql.parquet.timestamp.NanoTime
 
apply(int) - Method in interface org.apache.spark.sql.Row
Returns the value at position i.
apply(LogicalPlan) - Static method in class org.apache.spark.sql.sources.DataSourceStrategy
 
apply(String, boolean) - Method in class org.apache.spark.sql.sources.DDLParser
 
apply(LogicalPlan) - Static method in class org.apache.spark.sql.sources.PreInsertCastAndRename
 
apply(LogicalPlan) - Method in class org.apache.spark.sql.sources.PreWriteCheck
 
apply(SQLContext, Option<StructType>, String, Map<String, String>) - Static method in class org.apache.spark.sql.sources.ResolvedDataSource
Create a ResolvedDataSource for reading data in.
apply(SQLContext, String, SaveMode, Map<String, String>, DataFrame) - Static method in class org.apache.spark.sql.sources.ResolvedDataSource
Create a ResolvedDataSource for saving the content of the given DataFrame.
apply(DataType) - Static method in class org.apache.spark.sql.types.ArrayType
Construct an ArrayType object with the given element type.
apply(double) - Static method in class org.apache.spark.sql.types.Decimal
 
apply(long) - Static method in class org.apache.spark.sql.types.Decimal
 
apply(int) - Static method in class org.apache.spark.sql.types.Decimal
 
apply(BigDecimal) - Static method in class org.apache.spark.sql.types.Decimal
 
apply(BigDecimal) - Static method in class org.apache.spark.sql.types.Decimal
 
apply(BigDecimal, int, int) - Static method in class org.apache.spark.sql.types.Decimal
 
apply(long, int, int) - Static method in class org.apache.spark.sql.types.Decimal
 
apply(String) - Static method in class org.apache.spark.sql.types.Decimal
 
apply() - Static method in class org.apache.spark.sql.types.DecimalType
 
apply(int, int) - Static method in class org.apache.spark.sql.types.DecimalType
 
apply(DataType, DataType) - Static method in class org.apache.spark.sql.types.MapType
Construct a MapType object with the given key type and value type.
apply(String) - Method in class org.apache.spark.sql.types.StructType
Extracts a StructField of the given name.
apply(Set<String>) - Method in class org.apache.spark.sql.types.StructType
Returns a StructType containing StructFields of the given names, preserving the original order of fields.
apply(int) - Method in class org.apache.spark.sql.types.StructType
 
apply(Seq<Column>) - Method in class org.apache.spark.sql.UserDefinedFunction
 
apply(Seq<Column>) - Method in class org.apache.spark.sql.UserDefinedPythonFunction
Returns a Column that will evaluate to calling this UDF with the given input.
apply(String) - Static method in class org.apache.spark.storage.BlockId
Converts a BlockId "name" String back into a BlockId.
apply(String, String, int) - Static method in class org.apache.spark.storage.BlockManagerId
Returns a BlockManagerId for the given configuration.
apply(ObjectInput) - Static method in class org.apache.spark.storage.BlockManagerId
 
apply(boolean, boolean, boolean, boolean, int) - Static method in class org.apache.spark.storage.StorageLevel
:: DeveloperApi :: Create a new StorageLevel object without setting useOffHeap.
apply(boolean, boolean, boolean, int) - Static method in class org.apache.spark.storage.StorageLevel
:: DeveloperApi :: Create a new StorageLevel object.
apply(int, int) - Static method in class org.apache.spark.storage.StorageLevel
:: DeveloperApi :: Create a new StorageLevel object from its integer representation.
apply(ObjectInput) - Static method in class org.apache.spark.storage.StorageLevel
:: DeveloperApi :: Read StorageLevel object from ObjectInput stream.
apply(String, int) - Static method in class org.apache.spark.streaming.kafka.Broker
 
apply(Map<String, String>) - Method in class org.apache.spark.streaming.kafka.KafkaCluster.SimpleConsumerConfig$
Make a consumer config without requiring group.id or zookeeper.connect, since communicating with brokers also needs common settings such as timeout
apply(SparkContext, Map<String, String>, Map<TopicAndPartition, Object>, Map<TopicAndPartition, KafkaCluster.LeaderOffset>, Function1<MessageAndMetadata<K, V>, R>, ClassTag<K>, ClassTag<V>, ClassTag<U>, ClassTag<T>, ClassTag<R>) - Static method in class org.apache.spark.streaming.kafka.KafkaRDD
 
apply(String, int, long, long) - Static method in class org.apache.spark.streaming.kafka.OffsetRange
 
apply(TopicAndPartition, long, long) - Static method in class org.apache.spark.streaming.kafka.OffsetRange
 
apply(Tuple4<String, Object, Object, Object>) - Static method in class org.apache.spark.streaming.kafka.OffsetRange
 
apply(long) - Static method in class org.apache.spark.streaming.Milliseconds
 
apply(long) - Static method in class org.apache.spark.streaming.Minutes
 
apply(long) - Static method in class org.apache.spark.streaming.Seconds
 
apply(I, Function0<BoxedUnit>) - Static method in class org.apache.spark.util.CompletionIterator
 
apply(Traversable<Object>) - Static method in class org.apache.spark.util.Distribution
 
apply(InputStream, File, SparkConf) - Static method in class org.apache.spark.util.logging.FileAppender
Create the right appender based on Spark configuration
apply(TraversableOnce<Object>) - Static method in class org.apache.spark.util.StatCounter
Build a StatCounter from a list of values.
apply(Seq<Object>) - Static method in class org.apache.spark.util.StatCounter
Build a StatCounter from a list of values passed as variable-length arguments.
apply(A) - Method in class org.apache.spark.util.TimeStampedHashMap
 
apply(A) - Method in class org.apache.spark.util.TimeStampedWeakValueHashMap
 
apply(int) - Method in class org.apache.spark.util.Vector
 
applySchema(RDD<Row>, StructType) - Method in class org.apache.spark.sql.SQLContext
:: DeveloperApi :: Creates a DataFrame from an RDD containing Rows by applying a schema to this RDD.
applySchema(JavaRDD<Row>, StructType) - Method in class org.apache.spark.sql.SQLContext
 
applySchema(RDD<?>, Class<?>) - Method in class org.apache.spark.sql.SQLContext
Applies a schema to an RDD of Java Beans.
applySchema(JavaRDD<?>, Class<?>) - Method in class org.apache.spark.sql.SQLContext
Applies a schema to an RDD of Java Beans.
appName() - Method in class org.apache.spark.api.java.JavaSparkContext
 
appName() - Method in class org.apache.spark.scheduler.ApplicationEventListener
 
appName() - Method in class org.apache.spark.scheduler.SparkListenerApplicationStart
 
appName() - Method in class org.apache.spark.SparkContext
 
appName() - Method in class org.apache.spark.ui.SparkUI
 
appName() - Method in class org.apache.spark.ui.SparkUITab
 
approxCountDistinct(Column) - Static method in class org.apache.spark.sql.functions
Aggregate function: returns the approximate number of distinct items in a group.
approxCountDistinct(String) - Static method in class org.apache.spark.sql.functions
Aggregate function: returns the approximate number of distinct items in a group.
approxCountDistinct(Column, double) - Static method in class org.apache.spark.sql.functions
Aggregate function: returns the approximate number of distinct items in a group.
approxCountDistinct(String, double) - Static method in class org.apache.spark.sql.functions
Aggregate function: returns the approximate number of distinct items in a group.
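A small sketch of approxCountDistinct in a DataFrame aggregation; df and its "user_id" column are hypothetical:

    import org.apache.spark.sql.functions._

    df.agg(approxCountDistinct(df("user_id"))).show()

    // With an explicit relative standard deviation for the estimate.
    df.agg(approxCountDistinct(df("user_id"), 0.05)).show()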
ApproxHist() - Static method in class org.apache.spark.mllib.tree.configuration.QuantileStrategy
 
ApproximateActionListener<T,U,R> - Class in org.apache.spark.partial
A JobListener for an approximate single-result action, such as count() or non-parallel reduce().
ApproximateActionListener(RDD<T>, Function2<TaskContext, Iterator<T>, U>, ApproximateEvaluator<U, R>, long) - Constructor for class org.apache.spark.partial.ApproximateActionListener
 
ApproximateEvaluator<U,R> - Interface in org.apache.spark.partial
An object that computes a function incrementally by merging in results of type U from multiple tasks.
appUIAddress() - Method in class org.apache.spark.ui.SparkUI
 
appUIHostPort() - Method in class org.apache.spark.ui.SparkUI
Return the application UI host:port.
AreaUnderCurve - Class in org.apache.spark.mllib.evaluation
Computes the area under the curve (AUC) using the trapezoidal rule.
AreaUnderCurve() - Constructor for class org.apache.spark.mllib.evaluation.AreaUnderCurve
 
areaUnderPR() - Method in class org.apache.spark.mllib.evaluation.BinaryClassificationMetrics
Computes the area under the precision-recall curve.
areaUnderROC() - Method in class org.apache.spark.mllib.evaluation.BinaryClassificationMetrics
Computes the area under the receiver operating characteristic (ROC) curve.
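A minimal sketch, assuming `scoreAndLabels` is an existing RDD[(Double, Double)] of (score, label) pairs:

```scala
import org.apache.spark.mllib.evaluation.BinaryClassificationMetrics

val metrics = new BinaryClassificationMetrics(scoreAndLabels)
println(s"area under PR  = ${metrics.areaUnderPR()}")
println(s"area under ROC = ${metrics.areaUnderROC()}")
```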
areBoundsEmpty() - Method in class org.apache.spark.util.random.AcceptanceResult
 
argString() - Method in class org.apache.spark.sql.hive.execution.CreateTableAsSelect
 
arr() - Method in class org.apache.spark.rdd.PartitionGroup
 
array(DataType) - Method in class org.apache.spark.sql.ColumnName
Creates a new AttributeReference of type array
ARRAY() - Static method in class org.apache.spark.sql.hive.HiveQl
 
ARRAY_CONTAINS_NULL_BAG_SCHEMA_NAME() - Static method in class org.apache.spark.sql.parquet.CatalystConverter
 
ARRAY_ELEMENTS_SCHEMA_NAME() - Static method in class org.apache.spark.sql.parquet.CatalystConverter
 
arrayBuffer() - Method in class org.apache.spark.streaming.receiver.ArrayBufferBlock
 
ArrayBufferBlock - Class in org.apache.spark.streaming.receiver
Class representing a block received as an ArrayBuffer.
ArrayBufferBlock(ArrayBuffer<?>) - Constructor for class org.apache.spark.streaming.receiver.ArrayBufferBlock
 
ArrayType - Class in org.apache.spark.sql.types
:: DeveloperApi :: The data type for collections of multiple values.
ArrayType(DataType, boolean) - Constructor for class org.apache.spark.sql.types.ArrayType
 
arrayType() - Method in interface org.apache.spark.sql.types.DataTypeParser
 
ArrayValues - Class in org.apache.spark.storage
 
ArrayValues(Object[]) - Constructor for class org.apache.spark.storage.ArrayValues
 
as(String) - Method in class org.apache.spark.sql.Column
Gives the column an alias.
as(Symbol) - Method in class org.apache.spark.sql.Column
Gives the column an alias.
as(String) - Method in class org.apache.spark.sql.DataFrame
Returns a new DataFrame with an alias set.
as(Symbol) - Method in class org.apache.spark.sql.DataFrame
(Scala-specific) Returns a new DataFrame with an alias set.
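A minimal sketch of both alias forms, assuming a DataFrame `df` with an "age" column:

```scala
// Alias a single column in a projection.
df.select(df("age").as("years")).show()

// Alias the whole DataFrame, e.g. to disambiguate columns in a self-join.
val people = df.as("people")
```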
asc() - Method in class org.apache.spark.sql.Column
Returns an ordering used in sorting.
asc(String) - Static method in class org.apache.spark.sql.functions
Returns a sort expression based on ascending order of the column.
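A minimal sketch; `df` and its "age" column are assumptions:

```scala
import org.apache.spark.sql.functions.asc

df.sort(asc("age")).show()        // sort expression built from a column name
df.orderBy(df("age").asc).show()  // equivalent ordering built from the Column itself
```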
asIntegral() - Method in class org.apache.spark.sql.types.DecimalType
 
asIntegral() - Method in class org.apache.spark.sql.types.DoubleType
 
asIntegral() - Method in class org.apache.spark.sql.types.FloatType
 
asIntegral() - Method in class org.apache.spark.sql.types.FractionalType
 
asIterator() - Method in class org.apache.spark.serializer.DeserializationStream
Read the elements of this stream through an iterator.
AskPermissionToCommitOutput - Class in org.apache.spark.scheduler
 
AskPermissionToCommitOutput(int, long, long) - Constructor for class org.apache.spark.scheduler.AskPermissionToCommitOutput
 
askSlaves() - Method in class org.apache.spark.storage.BlockManagerMessages.GetBlockStatus
 
askSlaves() - Method in class org.apache.spark.storage.BlockManagerMessages.GetMatchingBlockIds
 
askTimeout(SparkConf) - Static method in class org.apache.spark.util.AkkaUtils
Returns the default Spark timeout to use for Akka ask operations.
askWithReply(Object, ActorRef, FiniteDuration) - Static method in class org.apache.spark.util.AkkaUtils
Send a message to the given actor and get its result within a default timeout, or throw a SparkException if this fails.
askWithReply(Object, ActorRef, int, int, FiniteDuration) - Static method in class org.apache.spark.util.AkkaUtils
Send a message to the given actor and get its result within a default timeout, or throw a SparkException if this fails even after the specified number of retries.
asNullable() - Method in class org.apache.spark.mllib.linalg.VectorUDT
 
asNullable() - Method in class org.apache.spark.sql.test.ExamplePointUDT
 
asNullable() - Method in class org.apache.spark.sql.types.ArrayType
 
asNullable() - Method in class org.apache.spark.sql.types.BinaryType
 
asNullable() - Method in class org.apache.spark.sql.types.BooleanType
 
asNullable() - Method in class org.apache.spark.sql.types.ByteType
 
asNullable() - Method in class org.apache.spark.sql.types.DataType
Returns the same data type but with all nullability fields set to true (StructField.nullable, ArrayType.containsNull, and MapType.valueContainsNull).
asNullable() - Method in class org.apache.spark.sql.types.DateType
 
asNullable() - Method in class org.apache.spark.sql.types.DecimalType
 
asNullable() - Method in class org.apache.spark.sql.types.DoubleType
 
asNullable() - Method in class org.apache.spark.sql.types.FloatType
 
asNullable() - Method in class org.apache.spark.sql.types.IntegerType
 
asNullable() - Method in class org.apache.spark.sql.types.LongType
 
asNullable() - Method in class org.apache.spark.sql.types.MapType
 
asNullable() - Method in class org.apache.spark.sql.types.NullType
 
asNullable() - Method in class org.apache.spark.sql.types.ShortType
 
asNullable() - Method in class org.apache.spark.sql.types.StringType
 
asNullable() - Method in class org.apache.spark.sql.types.StructType
 
asNullable() - Method in class org.apache.spark.sql.types.TimestampType
 
asNullable() - Method in class org.apache.spark.sql.types.UserDefinedType
For a UDT, asNullable does not change the nullability of its internal sqlType and simply returns the UDT itself.
asRDDId() - Method in class org.apache.spark.storage.BlockId
 
assertValid() - Method in class org.apache.spark.mllib.tree.configuration.BoostingStrategy
Check validity of parameters.
assertValid() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
Check validity of parameters.
assertValid() - Method in class org.apache.spark.rdd.BlockRDD
Check if this BlockRDD is valid.
assignments() - Method in class org.apache.spark.mllib.clustering.PowerIterationClusteringModel
 
AsynchronousListenerBus<L,E> - Class in org.apache.spark.util
Asynchronously passes events to registered listeners.
AsynchronousListenerBus(String) - Constructor for class org.apache.spark.util.AsynchronousListenerBus
 
AsyncRDDActions<T> - Class in org.apache.spark.rdd
A set of asynchronous RDD actions available through an implicit conversion.
AsyncRDDActions(RDD<T>, ClassTag<T>) - Constructor for class org.apache.spark.rdd.AsyncRDDActions
 
ata() - Method in class org.apache.spark.ml.recommendation.ALS.NormalEquation
A^T^ * A
atb() - Method in class org.apache.spark.ml.recommendation.ALS.NormalEquation
A^T^ * b
attachExecutor(ReceiverSupervisor) - Method in class org.apache.spark.streaming.receiver.Receiver
Attach the network receiver executor to this receiver.
attachHandler(ServletContextHandler) - Method in class org.apache.spark.ui.WebUI
Attach a handler to this UI.
attachListener(CleanerListener) - Method in class org.apache.spark.ContextCleaner
Attach a listener object to get information of when objects are cleaned.
attachPage(WebUIPage) - Method in class org.apache.spark.ui.WebUI
Attach a page to this UI.
attachPage(WebUIPage) - Method in class org.apache.spark.ui.WebUITab
Attach a page to this tab.
attachTab(WebUITab) - Method in class org.apache.spark.ui.WebUI
Attach a tab to this UI, along with all of its attached pages.
attempt() - Method in class org.apache.spark.scheduler.TaskInfo
 
attempt() - Method in class org.apache.spark.scheduler.TaskSet
 
attemptId() - Method in class org.apache.spark.scheduler.Stage
 
attemptId() - Method in class org.apache.spark.scheduler.StageInfo
 
attemptID() - Method in class org.apache.spark.TaskCommitDenied
 
attemptId() - Method in class org.apache.spark.TaskContext
 
attemptId() - Method in class org.apache.spark.TaskContextImpl
 
attemptNumber() - Method in class org.apache.spark.scheduler.cluster.mesos.MesosTaskLaunchData
 
attemptNumber() - Method in class org.apache.spark.scheduler.TaskDescription
 
attemptNumber() - Method in class org.apache.spark.TaskContext
How many times this task has been attempted.
attemptNumber() - Method in class org.apache.spark.TaskContextImpl
 
attr() - Method in class org.apache.spark.graphx.Edge
 
attr() - Method in class org.apache.spark.graphx.EdgeContext
The attribute associated with the edge.
attr() - Method in class org.apache.spark.graphx.impl.AggregatingEdgeContext
 
attr() - Method in class org.apache.spark.graphx.impl.EdgeWithLocalIds
 
attribute() - Method in class org.apache.spark.sql.sources.EqualTo
 
attribute() - Method in class org.apache.spark.sql.sources.GreaterThan
 
attribute() - Method in class org.apache.spark.sql.sources.GreaterThanOrEqual
 
attribute() - Method in class org.apache.spark.sql.sources.In
 
attribute() - Method in class org.apache.spark.sql.sources.IsNotNull
 
attribute() - Method in class org.apache.spark.sql.sources.IsNull
 
attribute() - Method in class org.apache.spark.sql.sources.LessThan
 
attribute() - Method in class org.apache.spark.sql.sources.LessThanOrEqual
 
attribute() - Method in class org.apache.spark.sql.sources.StringContains
 
attribute() - Method in class org.apache.spark.sql.sources.StringEndsWith
 
attribute() - Method in class org.apache.spark.sql.sources.StringStartsWith
 
attributeMap() - Method in class org.apache.spark.sql.hive.MetastoreRelation
An attribute map that can be used to lookup original attributes based on expression id.
attributeMap() - Method in class org.apache.spark.sql.parquet.ParquetRelation
 
attributeMap() - Method in class org.apache.spark.sql.sources.LogicalRelation
Used to look up the original attribute capitalization.
attributes() - Method in class org.apache.spark.sql.columnar.InMemoryColumnarTableScan
 
attributes() - Method in class org.apache.spark.sql.hive.execution.HiveTableScan
 
attributes() - Method in class org.apache.spark.sql.hive.MetastoreRelation
Non-partitionKey attributes
attributes() - Method in class org.apache.spark.sql.parquet.ParquetTableScan
 
attributes() - Method in class org.apache.spark.sql.parquet.RowWriteSupport
 
attrs() - Method in class org.apache.spark.graphx.impl.VertexAttributeBlock
 
AUTO_BROADCASTJOIN_THRESHOLD() - Static method in class org.apache.spark.sql.SQLConf
 
autoBroadcastJoinThreshold() - Method in class org.apache.spark.sql.SQLConf
Upper bound on the size (in bytes) of tables that qualify for automatic conversion to a broadcast value during the physical execution of join operations.
Average() - Static method in class org.apache.spark.mllib.tree.configuration.EnsembleCombiningStrategy
 
avg(Column) - Static method in class org.apache.spark.sql.functions
Aggregate function: returns the average of the values in a group.
avg(String) - Static method in class org.apache.spark.sql.functions
Aggregate function: returns the average of the values in a group.
avg(String...) - Method in class org.apache.spark.sql.GroupedData
Compute the mean value for each numeric column for each group.
avg(Seq<String>) - Method in class org.apache.spark.sql.GroupedData
Compute the mean value for each numeric column for each group.
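A minimal sketch; the DataFrame `df` and the "department"/"salary" columns are assumptions:

```scala
import org.apache.spark.sql.functions.avg

// Mean of a numeric column per group, via the GroupedData shorthand ...
df.groupBy("department").avg("salary").show()
// ... or via the aggregate function explicitly.
df.groupBy("department").agg(avg("salary")).show()
```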
AVG() - Static method in class org.apache.spark.sql.hive.HiveQl
 
awaitResult() - Method in class org.apache.spark.partial.ApproximateActionListener
Waits for up to timeout milliseconds since the listener was created and then returns a PartialResult with the result so far.
awaitResult() - Method in class org.apache.spark.scheduler.JobWaiter
 
awaitTermination() - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
Wait for the execution to stop.
awaitTermination(long) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
Wait for the execution to stop.
awaitTermination() - Method in class org.apache.spark.streaming.receiver.ReceiverSupervisor
Block the calling thread until the supervisor is stopped.
awaitTermination() - Method in class org.apache.spark.streaming.StreamingContext
Wait for the execution to stop.
awaitTermination(long) - Method in class org.apache.spark.streaming.StreamingContext
Wait for the execution to stop.
awaitTermination() - Method in class org.apache.spark.util.logging.FileAppender
Wait for the appender to stop appending, either because the input stream was closed or because an error occurred while appending.
awaitTerminationOrTimeout(long) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
Wait for the execution to stop.
awaitTerminationOrTimeout(long) - Method in class org.apache.spark.streaming.StreamingContext
Wait for the execution to stop.
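A minimal sketch of the start/await pattern, assuming an existing SparkContext `sc`; the stream definitions are elided:

```scala
import org.apache.spark.streaming.{Seconds, StreamingContext}

val ssc = new StreamingContext(sc, Seconds(1))
// ... define input DStreams and output operations here ...
ssc.start()
ssc.awaitTermination()                           // block until stop() or an error
// or bound the wait: ssc.awaitTerminationOrTimeout(10000)
```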
axpy(double, Vector, Vector) - Static method in class org.apache.spark.mllib.linalg.BLAS
y += a * x

B

backend() - Method in class org.apache.spark.scheduler.TaskSchedulerImpl
 
BaggedPoint<Datum> - Class in org.apache.spark.mllib.tree.impl
Internal representation of a datapoint which belongs to several subsamples of the same dataset, particularly for bagging (e.g., for random forests).
BaggedPoint(Datum, double[]) - Constructor for class org.apache.spark.mllib.tree.impl.BaggedPoint
 
base() - Method in class org.apache.spark.sql.hive.HiveUdafFunction
 
baseDir() - Method in class org.apache.spark.HttpFileServer
 
baseMap() - Method in class org.apache.spark.sql.sources.CaseInsensitiveMap
 
baseOn(ParamPair<?>...) - Method in class org.apache.spark.ml.tuning.ParamGridBuilder
Sets the given parameters in this grid to fixed values.
baseOn(ParamMap) - Method in class org.apache.spark.ml.tuning.ParamGridBuilder
Sets the given parameters in this grid to fixed values.
baseOn(Seq<ParamPair<?>>) - Method in class org.apache.spark.ml.tuning.ParamGridBuilder
Sets the given parameters in this grid to fixed values.
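A minimal sketch combining baseOn with addGrid and build; the LogisticRegression estimator and the parameter values are illustrative:

```scala
import org.apache.spark.ml.classification.LogisticRegression
import org.apache.spark.ml.param.ParamPair
import org.apache.spark.ml.tuning.ParamGridBuilder

val lr = new LogisticRegression()
val paramGrid = new ParamGridBuilder()
  .baseOn(ParamPair(lr.maxIter, 10))            // fixed in every ParamMap
  .addGrid(lr.regParam, Array(0.01, 0.1, 1.0))  // swept across ParamMaps
  .build()                                      // Array[ParamMap] of size 3
```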
basePath() - Method in class org.apache.spark.ui.SparkUI
 
basePath() - Method in class org.apache.spark.ui.WebUITab
 
BaseRelation - Class in org.apache.spark.sql.sources
::DeveloperApi:: Represents a collection of tuples with a known schema.
BaseRelation() - Constructor for class org.apache.spark.sql.sources.BaseRelation
 
baseRelationToDataFrame(BaseRelation) - Method in class org.apache.spark.sql.SQLContext
Convert a BaseRelation created for external data sources into a DataFrame.
BasicColumnAccessor<T extends DataType,JvmType> - Class in org.apache.spark.sql.columnar
 
BasicColumnAccessor(ByteBuffer, ColumnType<T, JvmType>) - Constructor for class org.apache.spark.sql.columnar.BasicColumnAccessor
 
BasicColumnBuilder<T extends DataType,JvmType> - Class in org.apache.spark.sql.columnar
 
BasicColumnBuilder(ColumnStats, ColumnType<T, JvmType>) - Constructor for class org.apache.spark.sql.columnar.BasicColumnBuilder
 
basicSparkPage(Function0<Seq<Node>>, String) - Static method in class org.apache.spark.ui.UIUtils
Returns a page with the spark css/js and a simple format.
BatchAllocationEvent - Class in org.apache.spark.streaming.scheduler
 
BatchAllocationEvent(Time, AllocatedBlocks) - Constructor for class org.apache.spark.streaming.scheduler.BatchAllocationEvent
 
BatchCleanupEvent - Class in org.apache.spark.streaming.scheduler
 
BatchCleanupEvent(Seq<Time>) - Constructor for class org.apache.spark.streaming.scheduler.BatchCleanupEvent
 
batchDuration() - Method in class org.apache.spark.streaming.DStreamGraph
 
batchDuration() - Method in class org.apache.spark.streaming.ui.StreamingJobProgressListener
 
BATCHES() - Static method in class org.apache.spark.mllib.clustering.StreamingKMeans
 
batchForTime() - Method in class org.apache.spark.streaming.kafka.DirectKafkaInputDStream.DirectKafkaInputDStreamCheckpointData
 
BatchInfo - Class in org.apache.spark.streaming.scheduler
:: DeveloperApi :: Class having information on completed batches.
BatchInfo(Time, Map<Object, ReceivedBlockInfo[]>, long, Option<Object>, Option<Object>) - Constructor for class org.apache.spark.streaming.scheduler.BatchInfo
 
batchInfo() - Method in class org.apache.spark.streaming.scheduler.StreamingListenerBatchCompleted
 
batchInfo() - Method in class org.apache.spark.streaming.scheduler.StreamingListenerBatchStarted
 
batchInfo() - Method in class org.apache.spark.streaming.scheduler.StreamingListenerBatchSubmitted
 
batchInfos() - Method in class org.apache.spark.streaming.scheduler.StatsReportListener
 
batchSize() - Method in class org.apache.spark.sql.columnar.InMemoryRelation
 
batchTime() - Method in class org.apache.spark.streaming.scheduler.BatchInfo
 
batchTimeToSelectedFiles() - Method in class org.apache.spark.streaming.dstream.FileInputDStream
 
BeginEvent - Class in org.apache.spark.scheduler
 
BeginEvent(Task<?>, TaskInfo) - Constructor for class org.apache.spark.scheduler.BeginEvent
 
beginTime() - Method in class org.apache.spark.streaming.Interval
 
benchmark(int) - Static method in class org.apache.spark.util.random.XORShiftRandom
 
BernoulliCellSampler<T> - Class in org.apache.spark.util.random
:: DeveloperApi :: A sampler based on Bernoulli trials for partitioning a data sequence.
BernoulliCellSampler(double, double, boolean) - Constructor for class org.apache.spark.util.random.BernoulliCellSampler
 
BernoulliSampler<T> - Class in org.apache.spark.util.random
:: DeveloperApi :: A sampler based on Bernoulli trials.
BernoulliSampler(double, ClassTag<T>) - Constructor for class org.apache.spark.util.random.BernoulliSampler
 
bestModel() - Method in class org.apache.spark.ml.tuning.CrossValidatorModel
 
beta() - Method in class org.apache.spark.mllib.evaluation.binary.FMeasure
 
BETWEEN() - Static method in class org.apache.spark.sql.hive.HiveQl
 
Bin - Class in org.apache.spark.mllib.tree.model
Used for "binning" the feature values for faster best split calculation.
Bin(Split, Split, Enumeration.Value, double) - Constructor for class org.apache.spark.mllib.tree.model.Bin
 
BINARY - Class in org.apache.spark.sql.columnar
 
BINARY() - Constructor for class org.apache.spark.sql.columnar.BINARY
 
binary() - Method in class org.apache.spark.sql.ColumnName
Creates a new AttributeReference of type binary
BinaryClassificationEvaluator - Class in org.apache.spark.ml.evaluation
:: AlphaComponent ::
BinaryClassificationEvaluator() - Constructor for class org.apache.spark.ml.evaluation.BinaryClassificationEvaluator
 
BinaryClassificationMetricComputer - Interface in org.apache.spark.mllib.evaluation.binary
Trait for a binary classification evaluation metric computer.
BinaryClassificationMetrics - Class in org.apache.spark.mllib.evaluation
:: Experimental :: Evaluator for binary classification.
BinaryClassificationMetrics(RDD<Tuple2<Object, Object>>, int) - Constructor for class org.apache.spark.mllib.evaluation.BinaryClassificationMetrics
 
BinaryClassificationMetrics(RDD<Tuple2<Object, Object>>) - Constructor for class org.apache.spark.mllib.evaluation.BinaryClassificationMetrics
Defaults numBins to 0.
BinaryColumnAccessor - Class in org.apache.spark.sql.columnar
 
BinaryColumnAccessor(ByteBuffer) - Constructor for class org.apache.spark.sql.columnar.BinaryColumnAccessor
 
BinaryColumnBuilder - Class in org.apache.spark.sql.columnar
 
BinaryColumnBuilder() - Constructor for class org.apache.spark.sql.columnar.BinaryColumnBuilder
 
BinaryColumnStats - Class in org.apache.spark.sql.columnar
 
BinaryColumnStats() - Constructor for class org.apache.spark.sql.columnar.BinaryColumnStats
 
BinaryConfusionMatrix - Interface in org.apache.spark.mllib.evaluation.binary
Trait for a binary confusion matrix.
BinaryConfusionMatrixImpl - Class in org.apache.spark.mllib.evaluation.binary
Implementation of BinaryConfusionMatrix.
BinaryConfusionMatrixImpl(BinaryLabelCounter, BinaryLabelCounter) - Constructor for class org.apache.spark.mllib.evaluation.binary.BinaryConfusionMatrixImpl
 
BinaryConversion() - Method in class org.apache.spark.sql.jdbc.JDBCRDD
Accessor for nested Scala object
BinaryFileRDD<T> - Class in org.apache.spark.rdd
 
BinaryFileRDD(SparkContext, Class<? extends StreamFileInputFormat<T>>, Class<String>, Class<T>, Configuration, int) - Constructor for class org.apache.spark.rdd.BinaryFileRDD
 
binaryFiles(String, int) - Method in class org.apache.spark.api.java.JavaSparkContext
Read a directory of binary files from HDFS, a local file system (available on all nodes), or any Hadoop-supported file system URI as a byte array.
binaryFiles(String) - Method in class org.apache.spark.api.java.JavaSparkContext
:: Experimental ::
binaryFiles(String, int) - Method in class org.apache.spark.SparkContext
:: Experimental ::
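A minimal sketch of reading whole binary files as (path, PortableDataStream) pairs; the HDFS path is illustrative and, as noted above, the API is experimental:

```scala
// Each element is (file path, PortableDataStream); toArray() reads the full file contents.
val files = sc.binaryFiles("hdfs://namenode/data/images", minPartitions = 8)
val sizes = files.mapValues(_.toArray().length)
sizes.collect().foreach { case (path, bytes) => println(s"$path: $bytes bytes") }
```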
BinaryLabelCounter - Class in org.apache.spark.mllib.evaluation.binary
A counter for positives and negatives.
BinaryLabelCounter(long, long) - Constructor for class org.apache.spark.mllib.evaluation.binary.BinaryLabelCounter
 
binaryLabelValidator() - Static method in class org.apache.spark.mllib.util.DataValidators
Function to check if labels used for classification are either zero or one.
BinaryLongConversion() - Method in class org.apache.spark.sql.jdbc.JDBCRDD
Accessor for nested Scala object
binaryRecords(String, int) - Method in class org.apache.spark.api.java.JavaSparkContext
:: Experimental ::
binaryRecords(String, int, Configuration) - Method in class org.apache.spark.SparkContext
:: Experimental ::
binaryRecordsStream(String, int) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
:: Experimental ::
binaryRecordsStream(String, int) - Method in class org.apache.spark.streaming.StreamingContext
:: Experimental ::
BinaryType - Class in org.apache.spark.sql.types
:: DeveloperApi :: The data type representing Array[Byte] values.
BinaryType - Static variable in class org.apache.spark.sql.types.DataTypes
Gets the BinaryType object.
bind() - Method in class org.apache.spark.ui.WebUI
Bind to the HTTP server behind this web interface.
binnedFeatures() - Method in class org.apache.spark.mllib.tree.impl.TreePoint
 
BinomialBounds - Class in org.apache.spark.util.random
Utility functions that help determine bounds on the adjusted sampling rate needed to guarantee an exact sample size with high confidence when sampling without replacement.
BinomialBounds() - Constructor for class org.apache.spark.util.random.BinomialBounds
 
BITS_PER_LONG() - Static method in class org.apache.spark.sql.columnar.compression.BooleanBitSet
 
BLAS - Class in org.apache.spark.mllib.linalg
BLAS routines for MLlib's vectors and matrices.
BLAS() - Constructor for class org.apache.spark.mllib.linalg.BLAS
 
BLOCK_MANAGER() - Static method in class org.apache.spark.util.MetadataCleanerType
 
BlockAdditionEvent - Class in org.apache.spark.streaming.scheduler
 
BlockAdditionEvent(ReceivedBlockInfo) - Constructor for class org.apache.spark.streaming.scheduler.BlockAdditionEvent
 
BlockException - Exception in org.apache.spark.storage
 
BlockException(BlockId, String) - Constructor for exception org.apache.spark.storage.BlockException
 
BlockGenerator - Class in org.apache.spark.streaming.receiver
Generates batches of objects received by a Receiver and puts them into appropriately named blocks at regular intervals.
BlockGenerator(BlockGeneratorListener, int, SparkConf) - Constructor for class org.apache.spark.streaming.receiver.BlockGenerator
 
BlockGeneratorListener - Interface in org.apache.spark.streaming.receiver
Listener object for BlockGenerator events
blockId(int) - Method in class org.apache.spark.ml.recommendation.ALS.LocalIndexEncoder
Gets the block id from an encoded index.
blockId() - Method in class org.apache.spark.rdd.BlockRDDPartition
 
blockId() - Method in class org.apache.spark.scheduler.IndirectTaskResult
 
blockId() - Method in exception org.apache.spark.storage.BlockException
 
BlockId - Class in org.apache.spark.storage
:: DeveloperApi :: Identifies a particular Block of data, usually associated with a single file.
BlockId() - Constructor for class org.apache.spark.storage.BlockId
 
blockId() - Method in class org.apache.spark.storage.BlockManagerMessages.GetBlockStatus
 
blockId() - Method in class org.apache.spark.storage.BlockManagerMessages.GetLocations
 
blockId() - Method in class org.apache.spark.storage.BlockManagerMessages.RemoveBlock
 
blockId() - Method in class org.apache.spark.storage.BlockManagerMessages.UpdateBlockInfo
 
blockId() - Method in class org.apache.spark.storage.BlockObjectWriter
 
blockId() - Method in class org.apache.spark.storage.ShuffleBlockFetcherIterator.FailureFetchResult
 
blockId() - Method in interface org.apache.spark.storage.ShuffleBlockFetcherIterator.FetchResult
 
blockId() - Method in class org.apache.spark.storage.ShuffleBlockFetcherIterator.SuccessFetchResult
 
blockId() - Method in class org.apache.spark.streaming.rdd.WriteAheadLogBackedBlockRDDPartition
 
blockId() - Method in class org.apache.spark.streaming.receiver.BlockManagerBasedStoreResult
 
blockId() - Method in interface org.apache.spark.streaming.receiver.ReceivedBlockStoreResult
 
blockId() - Method in class org.apache.spark.streaming.receiver.WriteAheadLogBasedStoreResult
 
blockIds() - Method in class org.apache.spark.rdd.BlockRDD
 
blockIds() - Method in class org.apache.spark.storage.BlockManagerMessages.GetLocationsMultipleBlockIds
 
blockIdsToBlockManagers(BlockId[], SparkEnv, BlockManagerMaster) - Static method in class org.apache.spark.storage.BlockManager
 
blockIdsToExecutorIds(BlockId[], SparkEnv, BlockManagerMaster) - Static method in class org.apache.spark.storage.BlockManager
 
blockIdsToHosts(BlockId[], SparkEnv, BlockManagerMaster) - Static method in class org.apache.spark.storage.BlockManager
 
blockifyObject(T, int, Serializer, Option<CompressionCodec>, ClassTag<T>) - Static method in class org.apache.spark.broadcast.TorrentBroadcast
 
BlockInfo - Class in org.apache.spark.storage
 
BlockInfo(StorageLevel, boolean) - Constructor for class org.apache.spark.storage.BlockInfo
 
blockManager() - Method in class org.apache.spark.SparkEnv
 
BlockManager - Class in org.apache.spark.storage
Manager running on every node (driver and executors) which provides interfaces for putting and retrieving blocks both locally and remotely into various stores (memory, disk, and off-heap).
BlockManager(String, ActorSystem, BlockManagerMaster, Serializer, long, SparkConf, MapOutputTracker, ShuffleManager, BlockTransferService, SecurityManager, int) - Constructor for class org.apache.spark.storage.BlockManager
 
BlockManager(String, ActorSystem, BlockManagerMaster, Serializer, SparkConf, MapOutputTracker, ShuffleManager, BlockTransferService, SecurityManager, int) - Constructor for class org.apache.spark.storage.BlockManager
Construct a BlockManager with a memory limit set based on system properties.
blockManager() - Method in class org.apache.spark.storage.BlockManagerSource
 
blockManager() - Method in class org.apache.spark.storage.BlockStore
 
blockManagerAddedFromJson(JsonAST.JValue) - Static method in class org.apache.spark.util.JsonProtocol
 
blockManagerAddedToJson(SparkListenerBlockManagerAdded) - Static method in class org.apache.spark.util.JsonProtocol
 
BlockManagerBasedBlockHandler - Class in org.apache.spark.streaming.receiver
Implementation of a ReceivedBlockHandler which stores the received blocks into a block manager with the specified storage level.
BlockManagerBasedBlockHandler(BlockManager, StorageLevel) - Constructor for class org.apache.spark.streaming.receiver.BlockManagerBasedBlockHandler
 
BlockManagerBasedStoreResult - Class in org.apache.spark.streaming.receiver
Implementation of ReceivedBlockStoreResult that stores the metadata related to storage of blocks using BlockManagerBasedBlockHandler
BlockManagerBasedStoreResult(StreamBlockId) - Constructor for class org.apache.spark.streaming.receiver.BlockManagerBasedStoreResult
 
blockManagerId() - Method in class org.apache.spark.Heartbeat
 
blockManagerId() - Method in class org.apache.spark.scheduler.SparkListenerBlockManagerAdded
 
blockManagerId() - Method in class org.apache.spark.scheduler.SparkListenerBlockManagerRemoved
 
blockManagerId() - Method in class org.apache.spark.storage.BlockManager
 
BlockManagerId - Class in org.apache.spark.storage
:: DeveloperApi :: This class represents a unique identifier for a BlockManager.
blockManagerId() - Method in class org.apache.spark.storage.BlockManagerInfo
 
blockManagerId() - Method in class org.apache.spark.storage.BlockManagerMessages.BlockManagerHeartbeat
 
blockManagerId() - Method in class org.apache.spark.storage.BlockManagerMessages.GetPeers
 
blockManagerId() - Method in class org.apache.spark.storage.BlockManagerMessages.RegisterBlockManager
 
blockManagerId() - Method in class org.apache.spark.storage.BlockManagerMessages.UpdateBlockInfo
 
blockManagerId() - Method in class org.apache.spark.storage.StorageStatus
 
blockManagerIdCache() - Static method in class org.apache.spark.storage.BlockManagerId
 
blockManagerIdFromJson(JsonAST.JValue) - Static method in class org.apache.spark.util.JsonProtocol
 
blockManagerIds() - Method in class org.apache.spark.ui.jobs.JobProgressListener
 
blockManagerIdToJson(BlockManagerId) - Static method in class org.apache.spark.util.JsonProtocol
 
BlockManagerInfo - Class in org.apache.spark.storage
 
BlockManagerInfo(BlockManagerId, long, long, ActorRef) - Constructor for class org.apache.spark.storage.BlockManagerInfo
 
BlockManagerMaster - Class in org.apache.spark.storage
 
BlockManagerMaster(ActorRef, SparkConf, boolean) - Constructor for class org.apache.spark.storage.BlockManagerMaster
 
BlockManagerMasterActor - Class in org.apache.spark.storage
BlockManagerMasterActor is an actor on the master node to track statuses of all slaves' block managers.
BlockManagerMasterActor(boolean, SparkConf, LiveListenerBus) - Constructor for class org.apache.spark.storage.BlockManagerMasterActor
 
BlockManagerMessages - Class in org.apache.spark.storage
 
BlockManagerMessages() - Constructor for class org.apache.spark.storage.BlockManagerMessages
 
BlockManagerMessages.BlockManagerHeartbeat - Class in org.apache.spark.storage
 
BlockManagerMessages.BlockManagerHeartbeat(BlockManagerId) - Constructor for class org.apache.spark.storage.BlockManagerMessages.BlockManagerHeartbeat
 
BlockManagerMessages.BlockManagerHeartbeat$ - Class in org.apache.spark.storage
 
BlockManagerMessages.BlockManagerHeartbeat$() - Constructor for class org.apache.spark.storage.BlockManagerMessages.BlockManagerHeartbeat$
 
BlockManagerMessages.ExpireDeadHosts$ - Class in org.apache.spark.storage
 
BlockManagerMessages.ExpireDeadHosts$() - Constructor for class org.apache.spark.storage.BlockManagerMessages.ExpireDeadHosts$
 
BlockManagerMessages.GetActorSystemHostPortForExecutor - Class in org.apache.spark.storage
 
BlockManagerMessages.GetActorSystemHostPortForExecutor(String) - Constructor for class org.apache.spark.storage.BlockManagerMessages.GetActorSystemHostPortForExecutor
 
BlockManagerMessages.GetActorSystemHostPortForExecutor$ - Class in org.apache.spark.storage
 
BlockManagerMessages.GetActorSystemHostPortForExecutor$() - Constructor for class org.apache.spark.storage.BlockManagerMessages.GetActorSystemHostPortForExecutor$
 
BlockManagerMessages.GetBlockStatus - Class in org.apache.spark.storage
 
BlockManagerMessages.GetBlockStatus(BlockId, boolean) - Constructor for class org.apache.spark.storage.BlockManagerMessages.GetBlockStatus
 
BlockManagerMessages.GetBlockStatus$ - Class in org.apache.spark.storage
 
BlockManagerMessages.GetBlockStatus$() - Constructor for class org.apache.spark.storage.BlockManagerMessages.GetBlockStatus$
 
BlockManagerMessages.GetLocations - Class in org.apache.spark.storage
 
BlockManagerMessages.GetLocations(BlockId) - Constructor for class org.apache.spark.storage.BlockManagerMessages.GetLocations
 
BlockManagerMessages.GetLocations$ - Class in org.apache.spark.storage
 
BlockManagerMessages.GetLocations$() - Constructor for class org.apache.spark.storage.BlockManagerMessages.GetLocations$
 
BlockManagerMessages.GetLocationsMultipleBlockIds - Class in org.apache.spark.storage
 
BlockManagerMessages.GetLocationsMultipleBlockIds(BlockId[]) - Constructor for class org.apache.spark.storage.BlockManagerMessages.GetLocationsMultipleBlockIds
 
BlockManagerMessages.GetLocationsMultipleBlockIds$ - Class in org.apache.spark.storage
 
BlockManagerMessages.GetLocationsMultipleBlockIds$() - Constructor for class org.apache.spark.storage.BlockManagerMessages.GetLocationsMultipleBlockIds$
 
BlockManagerMessages.GetMatchingBlockIds - Class in org.apache.spark.storage
 
BlockManagerMessages.GetMatchingBlockIds(Function1<BlockId, Object>, boolean) - Constructor for class org.apache.spark.storage.BlockManagerMessages.GetMatchingBlockIds
 
BlockManagerMessages.GetMatchingBlockIds$ - Class in org.apache.spark.storage
 
BlockManagerMessages.GetMatchingBlockIds$() - Constructor for class org.apache.spark.storage.BlockManagerMessages.GetMatchingBlockIds$
 
BlockManagerMessages.GetMemoryStatus$ - Class in org.apache.spark.storage
 
BlockManagerMessages.GetMemoryStatus$() - Constructor for class org.apache.spark.storage.BlockManagerMessages.GetMemoryStatus$
 
BlockManagerMessages.GetPeers - Class in org.apache.spark.storage
 
BlockManagerMessages.GetPeers(BlockManagerId) - Constructor for class org.apache.spark.storage.BlockManagerMessages.GetPeers
 
BlockManagerMessages.GetPeers$ - Class in org.apache.spark.storage
 
BlockManagerMessages.GetPeers$() - Constructor for class org.apache.spark.storage.BlockManagerMessages.GetPeers$
 
BlockManagerMessages.GetStorageStatus$ - Class in org.apache.spark.storage
 
BlockManagerMessages.GetStorageStatus$() - Constructor for class org.apache.spark.storage.BlockManagerMessages.GetStorageStatus$
 
BlockManagerMessages.RegisterBlockManager - Class in org.apache.spark.storage
 
BlockManagerMessages.RegisterBlockManager(BlockManagerId, long, ActorRef) - Constructor for class org.apache.spark.storage.BlockManagerMessages.RegisterBlockManager
 
BlockManagerMessages.RegisterBlockManager$ - Class in org.apache.spark.storage
 
BlockManagerMessages.RegisterBlockManager$() - Constructor for class org.apache.spark.storage.BlockManagerMessages.RegisterBlockManager$
 
BlockManagerMessages.RemoveBlock - Class in org.apache.spark.storage
 
BlockManagerMessages.RemoveBlock(BlockId) - Constructor for class org.apache.spark.storage.BlockManagerMessages.RemoveBlock
 
BlockManagerMessages.RemoveBlock$ - Class in org.apache.spark.storage
 
BlockManagerMessages.RemoveBlock$() - Constructor for class org.apache.spark.storage.BlockManagerMessages.RemoveBlock$
 
BlockManagerMessages.RemoveBroadcast - Class in org.apache.spark.storage
 
BlockManagerMessages.RemoveBroadcast(long, boolean) - Constructor for class org.apache.spark.storage.BlockManagerMessages.RemoveBroadcast
 
BlockManagerMessages.RemoveBroadcast$ - Class in org.apache.spark.storage
 
BlockManagerMessages.RemoveBroadcast$() - Constructor for class org.apache.spark.storage.BlockManagerMessages.RemoveBroadcast$
 
BlockManagerMessages.RemoveExecutor - Class in org.apache.spark.storage
 
BlockManagerMessages.RemoveExecutor(String) - Constructor for class org.apache.spark.storage.BlockManagerMessages.RemoveExecutor
 
BlockManagerMessages.RemoveExecutor$ - Class in org.apache.spark.storage
 
BlockManagerMessages.RemoveExecutor$() - Constructor for class org.apache.spark.storage.BlockManagerMessages.RemoveExecutor$
 
BlockManagerMessages.RemoveRdd - Class in org.apache.spark.storage
 
BlockManagerMessages.RemoveRdd(int) - Constructor for class org.apache.spark.storage.BlockManagerMessages.RemoveRdd
 
BlockManagerMessages.RemoveRdd$ - Class in org.apache.spark.storage
 
BlockManagerMessages.RemoveRdd$() - Constructor for class org.apache.spark.storage.BlockManagerMessages.RemoveRdd$
 
BlockManagerMessages.RemoveShuffle - Class in org.apache.spark.storage
 
BlockManagerMessages.RemoveShuffle(int) - Constructor for class org.apache.spark.storage.BlockManagerMessages.RemoveShuffle
 
BlockManagerMessages.RemoveShuffle$ - Class in org.apache.spark.storage
 
BlockManagerMessages.RemoveShuffle$() - Constructor for class org.apache.spark.storage.BlockManagerMessages.RemoveShuffle$
 
BlockManagerMessages.StopBlockManagerMaster$ - Class in org.apache.spark.storage
 
BlockManagerMessages.StopBlockManagerMaster$() - Constructor for class org.apache.spark.storage.BlockManagerMessages.StopBlockManagerMaster$
 
BlockManagerMessages.ToBlockManagerMaster - Interface in org.apache.spark.storage
 
BlockManagerMessages.ToBlockManagerSlave - Interface in org.apache.spark.storage
 
BlockManagerMessages.UpdateBlockInfo - Class in org.apache.spark.storage
 
BlockManagerMessages.UpdateBlockInfo(BlockManagerId, BlockId, StorageLevel, long, long, long) - Constructor for class org.apache.spark.storage.BlockManagerMessages.UpdateBlockInfo
 
BlockManagerMessages.UpdateBlockInfo() - Constructor for class org.apache.spark.storage.BlockManagerMessages.UpdateBlockInfo
 
BlockManagerMessages.UpdateBlockInfo$ - Class in org.apache.spark.storage
 
BlockManagerMessages.UpdateBlockInfo$() - Constructor for class org.apache.spark.storage.BlockManagerMessages.UpdateBlockInfo$
 
blockManagerRemovedFromJson(JsonAST.JValue) - Static method in class org.apache.spark.util.JsonProtocol
 
blockManagerRemovedToJson(SparkListenerBlockManagerRemoved) - Static method in class org.apache.spark.util.JsonProtocol
 
BlockManagerSlaveActor - Class in org.apache.spark.storage
An actor that takes commands from the master and executes them on the slave's BlockManager.
BlockManagerSlaveActor(BlockManager, MapOutputTracker) - Constructor for class org.apache.spark.storage.BlockManagerSlaveActor
 
BlockManagerSource - Class in org.apache.spark.storage
 
BlockManagerSource(BlockManager) - Constructor for class org.apache.spark.storage.BlockManagerSource
 
BlockMatrix - Class in org.apache.spark.mllib.linalg.distributed
:: Experimental ::
BlockMatrix(RDD<Tuple2<Tuple2<Object, Object>, Matrix>>, int, int, long, long) - Constructor for class org.apache.spark.mllib.linalg.distributed.BlockMatrix
 
BlockMatrix(RDD<Tuple2<Tuple2<Object, Object>, Matrix>>, int, int) - Constructor for class org.apache.spark.mllib.linalg.distributed.BlockMatrix
Alternate constructor for BlockMatrix without the input of the number of rows and columns.
BlockNotFoundException - Exception in org.apache.spark.storage
 
BlockNotFoundException(String) - Constructor for exception org.apache.spark.storage.BlockNotFoundException
 
BlockObjectWriter - Class in org.apache.spark.storage
An interface for writing JVM objects to some underlying storage.
BlockObjectWriter(BlockId) - Constructor for class org.apache.spark.storage.BlockObjectWriter
 
blockPushingThread() - Method in class org.apache.spark.streaming.dstream.RawNetworkReceiver
 
BlockRDD<T> - Class in org.apache.spark.rdd
 
BlockRDD(SparkContext, BlockId[], ClassTag<T>) - Constructor for class org.apache.spark.rdd.BlockRDD
 
BlockRDDPartition - Class in org.apache.spark.rdd
 
BlockRDDPartition(BlockId, int) - Constructor for class org.apache.spark.rdd.BlockRDDPartition
 
BlockResult - Class in org.apache.spark.storage
 
BlockResult(Iterator<Object>, Enumeration.Value, long) - Constructor for class org.apache.spark.storage.BlockResult
 
blocks() - Method in class org.apache.spark.mllib.linalg.distributed.BlockMatrix
 
blocks() - Method in class org.apache.spark.storage.BlockManagerInfo
 
blocks() - Method in class org.apache.spark.storage.ShuffleBlockFetcherIterator.FetchRequest
 
blocks() - Method in class org.apache.spark.storage.StorageStatus
Return the blocks stored in this block manager.
BlockStatus - Class in org.apache.spark.storage
 
BlockStatus(StorageLevel, long, long, long) - Constructor for class org.apache.spark.storage.BlockStatus
 
blockStatusFromJson(JsonAST.JValue) - Static method in class org.apache.spark.util.JsonProtocol
 
blockStatusToJson(BlockStatus) - Static method in class org.apache.spark.util.JsonProtocol
 
BlockStore - Class in org.apache.spark.storage
Abstract class to store blocks.
BlockStore(BlockManager) - Constructor for class org.apache.spark.storage.BlockStore
 
blockStoreResult() - Method in class org.apache.spark.streaming.scheduler.ReceivedBlockInfo
 
blockTransferService() - Method in class org.apache.spark.SparkEnv
 
BlockValues - Interface in org.apache.spark.storage
 
bmAddress() - Method in class org.apache.spark.FetchFailed
 
BOOLEAN - Class in org.apache.spark.sql.columnar
 
BOOLEAN() - Constructor for class org.apache.spark.sql.columnar.BOOLEAN
 
BooleanBitSet - Class in org.apache.spark.sql.columnar.compression
 
BooleanBitSet() - Constructor for class org.apache.spark.sql.columnar.compression.BooleanBitSet
 
BooleanBitSet.Decoder - Class in org.apache.spark.sql.columnar.compression
 
BooleanBitSet.Decoder(ByteBuffer) - Constructor for class org.apache.spark.sql.columnar.compression.BooleanBitSet.Decoder
 
BooleanBitSet.Encoder - Class in org.apache.spark.sql.columnar.compression
 
BooleanBitSet.Encoder() - Constructor for class org.apache.spark.sql.columnar.compression.BooleanBitSet.Encoder
 
BooleanColumnAccessor - Class in org.apache.spark.sql.columnar
 
BooleanColumnAccessor(ByteBuffer) - Constructor for class org.apache.spark.sql.columnar.BooleanColumnAccessor
 
BooleanColumnBuilder - Class in org.apache.spark.sql.columnar
 
BooleanColumnBuilder() - Constructor for class org.apache.spark.sql.columnar.BooleanColumnBuilder
 
BooleanColumnStats - Class in org.apache.spark.sql.columnar
 
BooleanColumnStats() - Constructor for class org.apache.spark.sql.columnar.BooleanColumnStats
 
BooleanConversion() - Method in class org.apache.spark.sql.jdbc.JDBCRDD
Accessor for nested Scala object
BooleanParam - Class in org.apache.spark.ml.param
Specialized version of Param[Boolean] for Java.
BooleanParam(Params, String, String, Option<Object>) - Constructor for class org.apache.spark.ml.param.BooleanParam
 
BooleanParam(Params, String, String) - Constructor for class org.apache.spark.ml.param.BooleanParam
 
BooleanType - Class in org.apache.spark.sql.types
:: DeveloperApi :: The data type representing Boolean values.
BooleanType - Static variable in class org.apache.spark.sql.types.DataTypes
Gets the BooleanType object.
booleanWritableConverter() - Static method in class org.apache.spark.SparkContext
 
booleanWritableConverter() - Static method in class org.apache.spark.WritableConverter
 
booleanWritableFactory() - Static method in class org.apache.spark.WritableFactory
 
boolToBoolWritable(boolean) - Static method in class org.apache.spark.SparkContext
 
BoostingStrategy - Class in org.apache.spark.mllib.tree.configuration
:: Experimental :: Configuration options for GradientBoostedTrees.
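A minimal sketch of configuring GradientBoostedTrees through a BoostingStrategy, assuming `trainingData` is an existing RDD[LabeledPoint]:

```scala
import org.apache.spark.mllib.tree.GradientBoostedTrees
import org.apache.spark.mllib.tree.configuration.BoostingStrategy

// Start from the defaults for classification, then adjust a couple of knobs.
val boostingStrategy = BoostingStrategy.defaultParams("Classification")
boostingStrategy.numIterations = 10
boostingStrategy.treeStrategy.maxDepth = 3
val model = GradientBoostedTrees.train(trainingData, boostingStrategy)
```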
BoostingStrategy(Strategy, Loss, int, double) - Constructor for class org.apache.spark.mllib.tree.configuration.BoostingStrategy
 
Both() - Static method in class org.apache.spark.graphx.EdgeDirection
Edges originating from *and* arriving at a vertex of interest.
boundaries() - Method in class org.apache.spark.mllib.regression.IsotonicRegressionModel
 
BoundedDouble - Class in org.apache.spark.partial
:: Experimental :: A Double value with error bars and associated confidence.
BoundedDouble(double, double, double, double) - Constructor for class org.apache.spark.partial.BoundedDouble
 
BoundedPriorityQueue<A> - Class in org.apache.spark.util
Bounded priority queue.
BoundedPriorityQueue(int, Ordering<A>) - Constructor for class org.apache.spark.util.BoundedPriorityQueue
 
boundPort() - Method in class org.apache.spark.ui.ServerInfo
 
boundPort() - Method in class org.apache.spark.ui.WebUI
Return the actual port to which this server is bound.
broadcast(T) - Method in class org.apache.spark.api.java.JavaSparkContext
Broadcast a read-only variable to the cluster, returning a Broadcast object for reading it in distributed functions.
Broadcast<T> - Class in org.apache.spark.broadcast
A broadcast variable.
Broadcast(long, ClassTag<T>) - Constructor for class org.apache.spark.broadcast.Broadcast
 
broadcast(T, ClassTag<T>) - Method in class org.apache.spark.SparkContext
Broadcast a read-only variable to the cluster, returning a Broadcast object for reading it in distributed functions.
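A minimal sketch; the lookup table is illustrative:

```scala
// Ship a small read-only map to every executor once, instead of with every task.
val lookup = sc.broadcast(Map(1 -> "a", 2 -> "b"))
val decoded = sc.parallelize(Seq(1, 2, 1)).map(id => lookup.value.getOrElse(id, "?"))
decoded.collect()  // Array("a", "b", "a")
```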
BROADCAST() - Static method in class org.apache.spark.storage.BlockId
 
BROADCAST_TIMEOUT() - Static method in class org.apache.spark.sql.SQLConf
 
BROADCAST_VARS() - Static method in class org.apache.spark.util.MetadataCleanerType
 
BroadcastBlockId - Class in org.apache.spark.storage
 
BroadcastBlockId(long, String) - Constructor for class org.apache.spark.storage.BroadcastBlockId
 
broadcastCleaned(long) - Method in interface org.apache.spark.CleanerListener
 
broadcastedConf() - Method in class org.apache.spark.rdd.CheckpointRDD
 
BroadcastFactory - Interface in org.apache.spark.broadcast
:: DeveloperApi :: An interface for all the broadcast implementations in Spark (to allow multiple broadcast implementations).
broadcastId() - Method in class org.apache.spark.CleanBroadcast
 
broadcastId() - Method in class org.apache.spark.storage.BlockManagerMessages.RemoveBroadcast
 
broadcastId() - Method in class org.apache.spark.storage.BroadcastBlockId
 
BroadcastManager - Class in org.apache.spark.broadcast
 
BroadcastManager(boolean, SparkConf, SecurityManager) - Constructor for class org.apache.spark.broadcast.BroadcastManager
 
broadcastManager() - Method in class org.apache.spark.SparkEnv
 
broadcastTimeout() - Method in class org.apache.spark.sql.SQLConf
Timeout in seconds for the broadcast wait time in hash joins.
broadcastVars() - Method in class org.apache.spark.sql.UserDefinedPythonFunction
 
Broker - Class in org.apache.spark.streaming.kafka
:: Experimental :: Represent the host and port info for a Kafka broker.
buf() - Method in class org.apache.spark.storage.ShuffleBlockFetcherIterator.SuccessFetchResult
 
buffer() - Method in class org.apache.spark.storage.ArrayValues
 
buffer() - Method in class org.apache.spark.storage.ByteBufferValues
 
buffer() - Method in class org.apache.spark.util.SerializableBuffer
 
buffers() - Method in class org.apache.spark.sql.columnar.CachedBatch
 
build() - Method in class org.apache.spark.ml.recommendation.ALS.RatingBlockBuilder
Builds an ALS.RatingBlock.
build() - Method in class org.apache.spark.ml.recommendation.ALS.UncompressedInBlockBuilder
build() - Method in class org.apache.spark.ml.tuning.ParamGridBuilder
Builds and returns all combinations of parameters specified by the param grid.
build(Node[]) - Method in class org.apache.spark.mllib.tree.model.Node
Build the left and right child nodes if this node is not a leaf.
build() - Method in class org.apache.spark.sql.columnar.BasicColumnBuilder
 
build() - Method in interface org.apache.spark.sql.columnar.ColumnBuilder
Returns the final columnar byte buffer.
build() - Method in interface org.apache.spark.sql.columnar.compression.CompressibleColumnBuilder
 
build() - Method in interface org.apache.spark.sql.columnar.NullableColumnBuilder
 
build() - Method in class org.apache.spark.sql.types.MetadataBuilder
Builds the Metadata instance.
buildFilter() - Method in class org.apache.spark.sql.columnar.InMemoryColumnarTableScan
 
buildFormattedString(String, StringBuilder) - Method in class org.apache.spark.sql.types.ArrayType
 
buildFormattedString(String, StringBuilder) - Method in class org.apache.spark.sql.types.MapType
 
buildFormattedString(String, StringBuilder) - Method in class org.apache.spark.sql.types.StructField
 
buildFormattedString(String, StringBuilder) - Method in class org.apache.spark.sql.types.StructType
 
buildMetadata(RDD<LabeledPoint>, Strategy, int, String) - Static method in class org.apache.spark.mllib.tree.impl.DecisionTreeMetadata
Construct a DecisionTreeMetadata instance for this dataset and parameters.
buildMetadata(RDD<LabeledPoint>, Strategy) - Static method in class org.apache.spark.mllib.tree.impl.DecisionTreeMetadata
buildNonNulls() - Method in interface org.apache.spark.sql.columnar.NullableColumnBuilder
 
buildPools() - Method in class org.apache.spark.scheduler.FairSchedulableBuilder
 
buildPools() - Method in class org.apache.spark.scheduler.FIFOSchedulableBuilder
 
buildPools() - Method in interface org.apache.spark.scheduler.SchedulableBuilder
 
buildRegistryName(Source) - Method in class org.apache.spark.metrics.MetricsSystem
Build a name that uniquely identifies each metric source.
buildScan(String[], Filter[]) - Method in class org.apache.spark.sql.jdbc.JDBCRelation
 
buildScan() - Method in class org.apache.spark.sql.json.JSONRelation
 
buildScan(Seq<Attribute>, Seq<Expression>) - Method in class org.apache.spark.sql.parquet.ParquetRelation2
 
buildScan(Seq<Attribute>, Seq<Expression>) - Method in interface org.apache.spark.sql.sources.CatalystScan
 
buildScan(String[], Filter[]) - Method in interface org.apache.spark.sql.sources.PrunedFilteredScan
 
buildScan(String[]) - Method in interface org.apache.spark.sql.sources.PrunedScan
 
buildScan() - Method in interface org.apache.spark.sql.sources.TableScan
 
BYTE - Class in org.apache.spark.sql.columnar
 
BYTE() - Constructor for class org.apache.spark.sql.columnar.BYTE
 
ByteArrayChunkOutputStream - Class in org.apache.spark.util.io
An OutputStream that writes to fixed-size chunks of byte arrays.
ByteArrayChunkOutputStream(int) - Constructor for class org.apache.spark.util.io.ByteArrayChunkOutputStream
 
ByteArrayColumnType<T extends DataType> - Class in org.apache.spark.sql.columnar
 
ByteArrayColumnType(int, int) - Constructor for class org.apache.spark.sql.columnar.ByteArrayColumnType
 
byteBuffer() - Method in class org.apache.spark.streaming.receiver.ByteBufferBlock
 
ByteBufferBlock - Class in org.apache.spark.streaming.receiver
Class representing a block received as a ByteBuffer.
ByteBufferBlock(ByteBuffer) - Constructor for class org.apache.spark.streaming.receiver.ByteBufferBlock
 
ByteBufferData - Class in org.apache.spark.streaming.receiver
 
ByteBufferData(ByteBuffer) - Constructor for class org.apache.spark.streaming.receiver.ByteBufferData
 
ByteBufferInputStream - Class in org.apache.spark.util
Reads data from a ByteBuffer, and optionally cleans it up using BlockManager.dispose() at the end of the stream.
ByteBufferInputStream(ByteBuffer, boolean) - Constructor for class org.apache.spark.util.ByteBufferInputStream
 
ByteBufferValues - Class in org.apache.spark.storage
 
ByteBufferValues(ByteBuffer) - Constructor for class org.apache.spark.storage.ByteBufferValues
 
BytecodeUtils - Class in org.apache.spark.graphx.util
Includes a utility function to test whether a function accesses a specific attribute of an object.
BytecodeUtils() - Constructor for class org.apache.spark.graphx.util.BytecodeUtils
 
ByteColumnAccessor - Class in org.apache.spark.sql.columnar
 
ByteColumnAccessor(ByteBuffer) - Constructor for class org.apache.spark.sql.columnar.ByteColumnAccessor
 
ByteColumnBuilder - Class in org.apache.spark.sql.columnar
 
ByteColumnBuilder() - Constructor for class org.apache.spark.sql.columnar.ByteColumnBuilder
 
ByteColumnStats - Class in org.apache.spark.sql.columnar
 
ByteColumnStats() - Constructor for class org.apache.spark.sql.columnar.ByteColumnStats
 
bytes() - Method in class org.apache.spark.streaming.receiver.ByteBufferData
 
BYTES_FOR_PRECISION() - Static method in class org.apache.spark.sql.parquet.ParquetTypesConverter
Compute the FIXED_LEN_BYTE_ARRAY length needed to represent a given DECIMAL precision.
bytesToBytesWritable(byte[]) - Static method in class org.apache.spark.SparkContext
 
bytesToLines(InputStream) - Static method in class org.apache.spark.streaming.dstream.SocketReceiver
This method translates the data from an input stream (say, from a socket) into '\n'-delimited strings and returns an iterator to access the strings.
bytesToString(long) - Static method in class org.apache.spark.util.Utils
Convert a quantity in bytes to a human-readable string such as "4.0 MB".
bytesWritableConverter() - Static method in class org.apache.spark.SparkContext
 
bytesWritableConverter() - Static method in class org.apache.spark.WritableConverter
 
bytesWritableFactory() - Static method in class org.apache.spark.WritableFactory
 
bytesWritten(long) - Method in interface org.apache.spark.util.logging.RollingPolicy
Notify that bytes have been written
bytesWritten(long) - Method in class org.apache.spark.util.logging.SizeBasedRollingPolicy
Increment the bytes that have been written in the current file
bytesWritten(long) - Method in class org.apache.spark.util.logging.TimeBasedRollingPolicy
 
ByteType - Class in org.apache.spark.sql.types
:: DeveloperApi :: The data type representing Byte values.
ByteType - Static variable in class org.apache.spark.sql.types.DataTypes
Gets the ByteType object.

C

cache() - Method in class org.apache.spark.api.java.JavaDoubleRDD
Persist this RDD with the default storage level (`MEMORY_ONLY`).
cache() - Method in class org.apache.spark.api.java.JavaPairRDD
Persist this RDD with the default storage level (`MEMORY_ONLY`).
cache() - Method in class org.apache.spark.api.java.JavaRDD
Persist this RDD with the default storage level (`MEMORY_ONLY`).
cache() - Method in class org.apache.spark.graphx.Graph
Caches the vertices and edges associated with this graph at the previously-specified target storage levels, which default to MEMORY_ONLY.
cache() - Method in class org.apache.spark.graphx.impl.EdgeRDDImpl
Persists the edge partitions using `targetStorageLevel`, which defaults to MEMORY_ONLY.
cache() - Method in class org.apache.spark.graphx.impl.GraphImpl
 
cache() - Method in class org.apache.spark.graphx.impl.VertexRDDImpl
Persists the vertex partitions at `targetStorageLevel`, which defaults to MEMORY_ONLY.
cache() - Method in class org.apache.spark.mllib.linalg.distributed.BlockMatrix
Caches the underlying RDD.
cache() - Method in class org.apache.spark.partial.StudentTCacher
 
cache() - Method in class org.apache.spark.rdd.RDD
Persist this RDD with the default storage level (`MEMORY_ONLY`).
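A minimal sketch of caching an RDD; the log path is illustrative:

```scala
val lines = sc.textFile("hdfs://namenode/logs/*.log")
val errors = lines.filter(_.contains("ERROR")).cache()  // MEMORY_ONLY by default
errors.count()  // first action computes the partitions and caches them
errors.count()  // later actions reuse the in-memory data
```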
cache() - Method in class org.apache.spark.sql.DataFrame
 
cache() - Method in interface org.apache.spark.sql.RDDApi
 
cache() - Method in class org.apache.spark.streaming.api.java.JavaDStream
Persist RDDs of this DStream with the default storage level (MEMORY_ONLY_SER)
cache() - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
Persist RDDs of this DStream with the default storage level (MEMORY_ONLY_SER)
cache() - Method in class org.apache.spark.streaming.dstream.DStream
Persist RDDs of this DStream with the default storage level (MEMORY_ONLY_SER)
CachedBatch - Class in org.apache.spark.sql.columnar
 
CachedBatch(byte[][], Row) - Constructor for class org.apache.spark.sql.columnar.CachedBatch
 
cachedColumnBuffers() - Method in class org.apache.spark.sql.columnar.InMemoryRelation
 
CachedData - Class in org.apache.spark.sql
Holds a cached logical plan and its data
CachedData(LogicalPlan, InMemoryRelation) - Constructor for class org.apache.spark.sql.CachedData
 
cachedRepresentation() - Method in class org.apache.spark.sql.CachedData
 
CacheManager - Class in org.apache.spark
Spark class responsible for passing RDDs partition contents to the BlockManager and making sure a node doesn't load two copies of an RDD at once.
CacheManager(BlockManager) - Constructor for class org.apache.spark.CacheManager
 
cacheManager() - Method in class org.apache.spark.SparkEnv
 
CacheManager - Class in org.apache.spark.sql
Provides support in a SQLContext for caching query results and automatically using these cached results when subsequent queries are executed.
CacheManager(SQLContext) - Constructor for class org.apache.spark.sql.CacheManager
 
cacheQuery(DataFrame, Option<String>, StorageLevel) - Method in class org.apache.spark.sql.CacheManager
Caches the data produced by the logical representation of the given DataFrame.
cacheTable(String) - Method in class org.apache.spark.sql.CacheManager
Caches the specified table in-memory.
cacheTable(String) - Method in class org.apache.spark.sql.SQLContext
Caches the specified table in-memory.
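A minimal sketch, assuming an existing SQLContext `sqlContext` and a DataFrame `df`; the table name is illustrative:

```scala
df.registerTempTable("people")
sqlContext.cacheTable("people")                        // builds the in-memory columnar cache
sqlContext.sql("SELECT COUNT(*) FROM people").show()   // served from the cache once built
sqlContext.uncacheTable("people")
```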
calculate(double[], double) - Static method in class org.apache.spark.mllib.tree.impurity.Entropy
:: DeveloperApi :: information calculation for multiclass classification
calculate(double, double, double) - Static method in class org.apache.spark.mllib.tree.impurity.Entropy
:: DeveloperApi :: variance calculation
calculate() - Method in class org.apache.spark.mllib.tree.impurity.EntropyCalculator
Calculate the impurity from the stored sufficient statistics.
calculate(double[], double) - Static method in class org.apache.spark.mllib.tree.impurity.Gini
:: DeveloperApi :: information calculation for multiclass classification
calculate(double, double, double) - Static method in class org.apache.spark.mllib.tree.impurity.Gini
:: DeveloperApi :: variance calculation
calculate() - Method in class org.apache.spark.mllib.tree.impurity.GiniCalculator
Calculate the impurity from the stored sufficient statistics.
calculate(double[], double) - Method in interface org.apache.spark.mllib.tree.impurity.Impurity
:: DeveloperApi :: information calculation for multiclass classification
calculate(double, double, double) - Method in interface org.apache.spark.mllib.tree.impurity.Impurity
:: DeveloperApi :: information calculation for regression
calculate() - Method in class org.apache.spark.mllib.tree.impurity.ImpurityCalculator
Calculate the impurity from the stored sufficient statistics.
calculate(double[], double) - Static method in class org.apache.spark.mllib.tree.impurity.Variance
:: DeveloperApi :: information calculation for multiclass classification
calculate(double, double, double) - Static method in class org.apache.spark.mllib.tree.impurity.Variance
:: DeveloperApi :: variance calculation
calculate() - Method in class org.apache.spark.mllib.tree.impurity.VarianceCalculator
Calculate the impurity from the stored sufficient statistics.
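A small worked example of the DeveloperApi impurity calculations listed above; the numbers are illustrative:

    import org.apache.spark.mllib.tree.impurity.{Gini, Variance}

    // Gini impurity from class counts: 1 - (0.75^2 + 0.25^2) = 0.375
    val gini = Gini.calculate(Array(30.0, 10.0), 40.0)
    // Variance from sufficient statistics (count, sum, sum of squares):
    // 30/4 - (10/4)^2 = 7.5 - 6.25 = 1.25
    val variance = Variance.calculate(4.0, 10.0, 30.0)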
calculatedTasks() - Method in class org.apache.spark.scheduler.TaskSetManager
 
calculateNumBatchesToRemember(Duration) - Static method in class org.apache.spark.streaming.dstream.FileInputDStream
Calculate the number of recent batches to remember, such that all files selected within at least the last MIN_REMEMBER_DURATION can be remembered.
calculateTotalMemory(SparkContext) - Static method in class org.apache.spark.scheduler.cluster.mesos.MemoryUtils
 
call(T) - Method in interface org.apache.spark.api.java.function.DoubleFlatMapFunction
 
call(T) - Method in interface org.apache.spark.api.java.function.DoubleFunction
 
call(T) - Method in interface org.apache.spark.api.java.function.FlatMapFunction
 
call(T1, T2) - Method in interface org.apache.spark.api.java.function.FlatMapFunction2
 
call(T1) - Method in interface org.apache.spark.api.java.function.Function
 
call(T1, T2) - Method in interface org.apache.spark.api.java.function.Function2
 
call(T1, T2, T3) - Method in interface org.apache.spark.api.java.function.Function3
 
call(T) - Method in interface org.apache.spark.api.java.function.PairFlatMapFunction
 
call(T) - Method in interface org.apache.spark.api.java.function.PairFunction
 
call(T) - Method in interface org.apache.spark.api.java.function.VoidFunction
 
call(T1) - Method in interface org.apache.spark.sql.api.java.UDF1
 
call(T1, T2, T3, T4, T5, T6, T7, T8, T9, T10) - Method in interface org.apache.spark.sql.api.java.UDF10
 
call(T1, T2, T3, T4, T5, T6, T7, T8, T9, T10, T11) - Method in interface org.apache.spark.sql.api.java.UDF11
 
call(T1, T2, T3, T4, T5, T6, T7, T8, T9, T10, T11, T12) - Method in interface org.apache.spark.sql.api.java.UDF12
 
call(T1, T2, T3, T4, T5, T6, T7, T8, T9, T10, T11, T12, T13) - Method in interface org.apache.spark.sql.api.java.UDF13
 
call(T1, T2, T3, T4, T5, T6, T7, T8, T9, T10, T11, T12, T13, T14) - Method in interface org.apache.spark.sql.api.java.UDF14
 
call(T1, T2, T3, T4, T5, T6, T7, T8, T9, T10, T11, T12, T13, T14, T15) - Method in interface org.apache.spark.sql.api.java.UDF15
 
call(T1, T2, T3, T4, T5, T6, T7, T8, T9, T10, T11, T12, T13, T14, T15, T16) - Method in interface org.apache.spark.sql.api.java.UDF16
 
call(T1, T2, T3, T4, T5, T6, T7, T8, T9, T10, T11, T12, T13, T14, T15, T16, T17) - Method in interface org.apache.spark.sql.api.java.UDF17
 
call(T1, T2, T3, T4, T5, T6, T7, T8, T9, T10, T11, T12, T13, T14, T15, T16, T17, T18) - Method in interface org.apache.spark.sql.api.java.UDF18
 
call(T1, T2, T3, T4, T5, T6, T7, T8, T9, T10, T11, T12, T13, T14, T15, T16, T17, T18, T19) - Method in interface org.apache.spark.sql.api.java.UDF19
 
call(T1, T2) - Method in interface org.apache.spark.sql.api.java.UDF2
 
call(T1, T2, T3, T4, T5, T6, T7, T8, T9, T10, T11, T12, T13, T14, T15, T16, T17, T18, T19, T20) - Method in interface org.apache.spark.sql.api.java.UDF20
 
call(T1, T2, T3, T4, T5, T6, T7, T8, T9, T10, T11, T12, T13, T14, T15, T16, T17, T18, T19, T20, T21) - Method in interface org.apache.spark.sql.api.java.UDF21
 
call(T1, T2, T3, T4, T5, T6, T7, T8, T9, T10, T11, T12, T13, T14, T15, T16, T17, T18, T19, T20, T21, T22) - Method in interface org.apache.spark.sql.api.java.UDF22
 
call(T1, T2, T3) - Method in interface org.apache.spark.sql.api.java.UDF3
 
call(T1, T2, T3, T4) - Method in interface org.apache.spark.sql.api.java.UDF4
 
call(T1, T2, T3, T4, T5) - Method in interface org.apache.spark.sql.api.java.UDF5
 
call(T1, T2, T3, T4, T5, T6) - Method in interface org.apache.spark.sql.api.java.UDF6
 
call(T1, T2, T3, T4, T5, T6, T7) - Method in interface org.apache.spark.sql.api.java.UDF7
 
call(T1, T2, T3, T4, T5, T6, T7, T8) - Method in interface org.apache.spark.sql.api.java.UDF8
 
call(T1, T2, T3, T4, T5, T6, T7, T8, T9) - Method in interface org.apache.spark.sql.api.java.UDF9
 
callSite() - Method in class org.apache.spark.scheduler.ActiveJob
 
callSite() - Method in class org.apache.spark.scheduler.JobSubmitted
 
callSite() - Method in class org.apache.spark.scheduler.Stage
 
CallSite - Class in org.apache.spark.util
CallSite represents a place in user code.
CallSite(String, String) - Constructor for class org.apache.spark.util.CallSite
 
callUDF(Function0<?>, DataType) - Static method in class org.apache.spark.sql.functions
Call a Scala function of 0 arguments as user-defined function (UDF).
callUDF(Function1<?, ?>, DataType, Column) - Static method in class org.apache.spark.sql.functions
Call a Scala function of 1 arguments as user-defined function (UDF).
callUDF(Function2<?, ?, ?>, DataType, Column, Column) - Static method in class org.apache.spark.sql.functions
Call a Scala function of 2 arguments as user-defined function (UDF).
callUDF(Function3<?, ?, ?, ?>, DataType, Column, Column, Column) - Static method in class org.apache.spark.sql.functions
Call a Scala function of 3 arguments as user-defined function (UDF).
callUDF(Function4<?, ?, ?, ?, ?>, DataType, Column, Column, Column, Column) - Static method in class org.apache.spark.sql.functions
Call a Scala function of 4 arguments as user-defined function (UDF).
callUDF(Function5<?, ?, ?, ?, ?, ?>, DataType, Column, Column, Column, Column, Column) - Static method in class org.apache.spark.sql.functions
Call a Scala function of 5 arguments as user-defined function (UDF).
callUDF(Function6<?, ?, ?, ?, ?, ?, ?>, DataType, Column, Column, Column, Column, Column, Column) - Static method in class org.apache.spark.sql.functions
Call a Scala function of 6 arguments as user-defined function (UDF).
callUDF(Function7<?, ?, ?, ?, ?, ?, ?, ?>, DataType, Column, Column, Column, Column, Column, Column, Column) - Static method in class org.apache.spark.sql.functions
Call a Scala function of 7 arguments as user-defined function (UDF).
callUDF(Function8<?, ?, ?, ?, ?, ?, ?, ?, ?>, DataType, Column, Column, Column, Column, Column, Column, Column, Column) - Static method in class org.apache.spark.sql.functions
Call a Scala function of 8 arguments as user-defined function (UDF).
callUDF(Function9<?, ?, ?, ?, ?, ?, ?, ?, ?, ?>, DataType, Column, Column, Column, Column, Column, Column, Column, Column, Column) - Static method in class org.apache.spark.sql.functions
Call a Scala function of 9 arguments as user-defined function (UDF).
callUDF(Function10<?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?>, DataType, Column, Column, Column, Column, Column, Column, Column, Column, Column, Column) - Static method in class org.apache.spark.sql.functions
Call a Scala function of 10 arguments as user-defined function (UDF).
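A minimal sketch of wrapping a plain Scala function as a UDF with callUDF, assuming an existing DataFrame df with a string column "name" (both hypothetical):

    import org.apache.spark.sql.functions.callUDF
    import org.apache.spark.sql.types.IntegerType

    // Apply a one-argument Scala function to the "name" column as a UDF.
    val nameLengths = df.select(callUDF((s: String) => s.length, IntegerType, df("name")))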
cancel() - Method in class org.apache.spark.ComplexFutureAction
 
cancel() - Method in interface org.apache.spark.FutureAction
Cancels the execution of this action.
cancel(boolean) - Method in class org.apache.spark.JavaFutureActionWrapper
 
cancel() - Method in class org.apache.spark.scheduler.JobWaiter
Sends a signal to the DAGScheduler to cancel the job.
cancel() - Method in class org.apache.spark.SimpleFutureAction
 
cancel() - Method in class org.apache.spark.util.MetadataCleaner
 
cancelAllJobs() - Method in class org.apache.spark.api.java.JavaSparkContext
Cancel all jobs that have been scheduled or are running.
cancelAllJobs() - Method in class org.apache.spark.scheduler.DAGScheduler
Cancel all jobs that are running or waiting in the queue.
cancelAllJobs() - Method in class org.apache.spark.SparkContext
Cancel all jobs that have been scheduled or are running.
cancelJob(int) - Method in class org.apache.spark.scheduler.DAGScheduler
Cancel a job that is running or waiting in the queue.
cancelJob(int) - Method in class org.apache.spark.SparkContext
Cancel a given job if it's scheduled or running
cancelJobGroup(String) - Method in class org.apache.spark.api.java.JavaSparkContext
Cancel active jobs for the specified group.
cancelJobGroup(String) - Method in class org.apache.spark.scheduler.DAGScheduler
 
cancelJobGroup(String) - Method in class org.apache.spark.SparkContext
Cancel active jobs for the specified group.
cancelStage(int) - Method in class org.apache.spark.scheduler.DAGScheduler
Cancel all jobs associated with a running or scheduled stage.
cancelStage(int) - Method in class org.apache.spark.SparkContext
Cancel a given stage and all jobs associated with it
cancelTasks(int, boolean) - Method in interface org.apache.spark.scheduler.TaskScheduler
 
cancelTasks(int, boolean) - Method in class org.apache.spark.scheduler.TaskSchedulerImpl
 
canCommit(int, long, long) - Method in class org.apache.spark.scheduler.OutputCommitCoordinator
Called by tasks to ask whether they can commit their output to HDFS.
canEqual(Object) - Method in class org.apache.spark.scheduler.cluster.ExecutorInfo
 
canEqual(Object) - Method in class org.apache.spark.util.MutablePair
 
canFetchMoreResults(long) - Method in class org.apache.spark.scheduler.TaskSetManager
Check whether there is enough quota to fetch a result of the given size in bytes.
capacity() - Method in class org.apache.spark.graphx.impl.VertexPartitionBase
 
cartesian(JavaRDDLike<U, ?>) - Method in interface org.apache.spark.api.java.JavaRDDLike
Return the Cartesian product of this RDD and another one, that is, the RDD of all pairs of elements (a, b) where a is in this and b is in other.
cartesian(RDD<U>, ClassTag<U>) - Method in class org.apache.spark.rdd.RDD
Return the Cartesian product of this RDD and another one, that is, the RDD of all pairs of elements (a, b) where a is in this and b is in other.
CartesianPartition - Class in org.apache.spark.rdd
 
CartesianPartition(int, RDD<?>, RDD<?>, int, int) - Constructor for class org.apache.spark.rdd.CartesianPartition
 
CartesianRDD<T,U> - Class in org.apache.spark.rdd
 
CartesianRDD(SparkContext, RDD<T>, RDD<U>, ClassTag<T>, ClassTag<U>) - Constructor for class org.apache.spark.rdd.CartesianRDD
 
CASE() - Static method in class org.apache.spark.sql.hive.HiveQl
 
CaseInsensitiveMap - Class in org.apache.spark.sql.sources
Builds a map in which keys are case insensitive
CaseInsensitiveMap(Map<String, String>) - Constructor for class org.apache.spark.sql.sources.CaseInsensitiveMap
 
caseSensitive() - Method in class org.apache.spark.sql.hive.HiveMetastoreCatalog
 
cast(DataType) - Method in class org.apache.spark.sql.Column
Casts the column to a different data type.
cast(String) - Method in class org.apache.spark.sql.Column
Casts the column to a different data type, using the canonical string representation of the type.
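A minimal sketch of Column.cast in both forms, assuming an existing DataFrame df with a string column "age" (hypothetical):

    import org.apache.spark.sql.types.IntegerType

    df.select(df("age").cast(IntegerType))   // cast using a DataType
    df.select(df("age").cast("int"))         // cast using the type's string representation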
castAndRenameChildOutput(InsertIntoTable, Seq<Attribute>, LogicalPlan) - Static method in class org.apache.spark.sql.sources.PreInsertCastAndRename
If necessary, cast data types and rename fields to the expected types and names.
castChildOutput(InsertIntoTable, MetastoreRelation, LogicalPlan) - Method in class org.apache.spark.sql.hive.HiveMetastoreCatalog.PreInsertionCasts
 
catalog() - Method in class org.apache.spark.sql.sources.PreWriteCheck
 
CatalystArrayContainsNullConverter - Class in org.apache.spark.sql.parquet
A parquet.io.api.GroupConverter that converts single-element groups that match the characteristics of an array containing null values (see ParquetTypesConverter) into an ArrayType.
CatalystArrayContainsNullConverter(DataType, int, CatalystConverter, Buffer<Object>) - Constructor for class org.apache.spark.sql.parquet.CatalystArrayContainsNullConverter
 
CatalystArrayContainsNullConverter(DataType, int, CatalystConverter) - Constructor for class org.apache.spark.sql.parquet.CatalystArrayContainsNullConverter
 
CatalystArrayConverter - Class in org.apache.spark.sql.parquet
A parquet.io.api.GroupConverter that converts single-element groups that match the characteristics of an array (see ParquetTypesConverter) into an ArrayType.
CatalystArrayConverter(DataType, int, CatalystConverter, Buffer<Object>) - Constructor for class org.apache.spark.sql.parquet.CatalystArrayConverter
 
CatalystArrayConverter(DataType, int, CatalystConverter) - Constructor for class org.apache.spark.sql.parquet.CatalystArrayConverter
 
CatalystConverter - Class in org.apache.spark.sql.parquet
 
CatalystConverter() - Constructor for class org.apache.spark.sql.parquet.CatalystConverter
 
CatalystGroupConverter - Class in org.apache.spark.sql.parquet
A parquet.io.api.GroupConverter that is able to convert a Parquet record to a org.apache.spark.sql.catalyst.expressions.Row object.
CatalystGroupConverter(StructField[], int, CatalystConverter, ArrayBuffer<Object>, ArrayBuffer<Row>) - Constructor for class org.apache.spark.sql.parquet.CatalystGroupConverter
 
CatalystGroupConverter(StructField[], int, CatalystConverter) - Constructor for class org.apache.spark.sql.parquet.CatalystGroupConverter
 
CatalystGroupConverter(Attribute[]) - Constructor for class org.apache.spark.sql.parquet.CatalystGroupConverter
This constructor is used for the root converter only!
CatalystMapConverter - Class in org.apache.spark.sql.parquet
A parquet.io.api.GroupConverter that converts two-element groups that match the characteristics of a map (see ParquetTypesConverter) into a MapType.
CatalystMapConverter(StructField[], int, CatalystConverter) - Constructor for class org.apache.spark.sql.parquet.CatalystMapConverter
 
CatalystNativeArrayConverter - Class in org.apache.spark.sql.parquet
A parquet.io.api.GroupConverter that converts single-element groups that match the characteristics of an array (see ParquetTypesConverter) into an ArrayType.
CatalystNativeArrayConverter(NativeType, int, CatalystConverter, int) - Constructor for class org.apache.spark.sql.parquet.CatalystNativeArrayConverter
 
CatalystPrimitiveConverter - Class in org.apache.spark.sql.parquet
A parquet.io.api.PrimitiveConverter that converts Parquet types to Catalyst types.
CatalystPrimitiveConverter(CatalystConverter, int) - Constructor for class org.apache.spark.sql.parquet.CatalystPrimitiveConverter
 
CatalystPrimitiveRowConverter - Class in org.apache.spark.sql.parquet
A parquet.io.api.GroupConverter that is able to convert a Parquet record to a org.apache.spark.sql.catalyst.expressions.Row object.
CatalystPrimitiveRowConverter(StructField[], MutableRow) - Constructor for class org.apache.spark.sql.parquet.CatalystPrimitiveRowConverter
 
CatalystPrimitiveRowConverter(Attribute[]) - Constructor for class org.apache.spark.sql.parquet.CatalystPrimitiveRowConverter
 
CatalystPrimitiveStringConverter - Class in org.apache.spark.sql.parquet
A parquet.io.api.PrimitiveConverter that converts Parquet Binary to Catalyst String.
CatalystPrimitiveStringConverter(CatalystConverter, int) - Constructor for class org.apache.spark.sql.parquet.CatalystPrimitiveStringConverter
 
CatalystScan - Interface in org.apache.spark.sql.sources
::Experimental:: An interface for experimenting with a more direct connection to the query planner.
CatalystStructConverter - Class in org.apache.spark.sql.parquet
This converter is for multi-element groups of primitive or complex types that have repetition level optional or required (so struct fields).
CatalystStructConverter(StructField[], int, CatalystConverter) - Constructor for class org.apache.spark.sql.parquet.CatalystStructConverter
 
CatalystTimestampConverter - Class in org.apache.spark.sql.parquet
 
CatalystTimestampConverter() - Constructor for class org.apache.spark.sql.parquet.CatalystTimestampConverter
 
Categorical() - Static method in class org.apache.spark.mllib.tree.configuration.FeatureType
 
categoricalFeaturesInfo() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
 
categories() - Method in class org.apache.spark.mllib.tree.model.DecisionTreeModel.SaveLoadV1_0$.SplitData
 
categories() - Method in class org.apache.spark.mllib.tree.model.Split
 
category() - Method in class org.apache.spark.mllib.tree.model.Bin
 
changePrecision(int, int) - Method in class org.apache.spark.sql.types.Decimal
Update precision and scale while keeping our value the same, and return true if successful.
changeValue(K, Function0<V>, Function1<V, V>) - Method in class org.apache.spark.graphx.util.collection.GraphXPrimitiveKeyOpenHashMap
If the key doesn't exist yet in the hash map, set its value to defaultValue; otherwise, set its value to mergeValue(oldValue).
channelFactory() - Method in class org.apache.spark.streaming.flume.FlumePollingReceiver
 
channelFactoryExecutor() - Method in class org.apache.spark.streaming.flume.FlumePollingReceiver
 
checkEquals(ASTNode) - Method in class org.apache.spark.sql.hive.HiveQl.TransformableNode
Throws an error if this is not equal to other.
checkHost(String, String) - Static method in class org.apache.spark.util.Utils
 
checkHostPort(String, String) - Static method in class org.apache.spark.util.Utils
 
checkInputColumn(StructType, String, DataType) - Method in interface org.apache.spark.ml.param.Params
Check whether the given schema contains an input column.
checkMinimalPollingPeriod(TimeUnit, int) - Static method in class org.apache.spark.metrics.MetricsSystem
 
checkModifyPermissions(String) - Method in class org.apache.spark.SecurityManager
Checks the given user against the modify acl list to see if they have authorization to modify the application.
checkOutputSpecs(JobContext) - Method in class org.apache.spark.sql.parquet.AppendingParquetOutputFormat
 
checkpoint() - Method in interface org.apache.spark.api.java.JavaRDDLike
Mark this RDD for checkpointing.
checkpoint() - Method in class org.apache.spark.graphx.Graph
Mark this Graph for checkpointing.
checkpoint() - Method in class org.apache.spark.graphx.impl.EdgeRDDImpl
 
checkpoint() - Method in class org.apache.spark.graphx.impl.GraphImpl
 
checkpoint() - Method in class org.apache.spark.graphx.impl.VertexRDDImpl
 
checkpoint() - Method in class org.apache.spark.rdd.CheckpointRDD
 
checkpoint() - Method in class org.apache.spark.rdd.HadoopRDD
 
checkpoint() - Method in class org.apache.spark.rdd.RDD
Mark this RDD for checkpointing.
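A minimal sketch of RDD checkpointing, assuming an existing SparkContext named sc; the HDFS path is illustrative, and the checkpoint directory must be set before the action that materializes the RDD:

    sc.setCheckpointDir("hdfs:///tmp/rdd-checkpoints")   // illustrative path
    val rdd = sc.parallelize(1 to 100).map(_ * 2)
    rdd.checkpoint()   // mark for checkpointing
    rdd.count()        // the action triggers writing the checkpoint to reliable storage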
checkpoint(Duration) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
Enable periodic checkpointing of RDDs of this DStream.
checkpoint(String) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
Sets the context to periodically checkpoint the DStream operations for master fault-tolerance.
Checkpoint - Class in org.apache.spark.streaming
 
Checkpoint(StreamingContext, Time) - Constructor for class org.apache.spark.streaming.Checkpoint
 
checkpoint(Duration) - Method in class org.apache.spark.streaming.dstream.DStream
Enable periodic checkpointing of RDDs of this DStream
checkpoint(Duration) - Method in class org.apache.spark.streaming.dstream.ReducedWindowedDStream
 
checkpoint(String) - Method in class org.apache.spark.streaming.StreamingContext
Set the context to periodically checkpoint the DStream operations for driver fault-tolerance.
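A minimal sketch of enabling checkpointing in a streaming application, assuming an existing StreamingContext named ssc; the directory and socket source are illustrative:

    import org.apache.spark.streaming.Seconds

    ssc.checkpoint("hdfs:///tmp/streaming-checkpoints")   // metadata for driver fault-tolerance
    val lines = ssc.socketTextStream("localhost", 9999)
    lines.checkpoint(Seconds(30))                         // checkpoint this DStream's RDDs every 30 seconds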
checkpointBackupFile(String, Time) - Static method in class org.apache.spark.streaming.Checkpoint
Get the checkpoint backup file for the given checkpoint time
checkpointClock() - Method in class org.apache.spark.streaming.kinesis.KinesisCheckpointState
 
checkpointData() - Method in class org.apache.spark.rdd.RDD
 
checkpointData() - Method in class org.apache.spark.streaming.dstream.DStream
 
checkpointDir() - Method in class org.apache.spark.SparkContext
 
checkpointDir() - Method in class org.apache.spark.streaming.Checkpoint
 
checkpointDir() - Method in class org.apache.spark.streaming.StreamingContext
 
checkpointDirToLogDir(String, int) - Static method in class org.apache.spark.streaming.receiver.WriteAheadLogBasedBlockHandler
 
checkpointDirToLogDir(String) - Static method in class org.apache.spark.streaming.scheduler.ReceivedBlockTracker
 
checkpointDuration() - Method in class org.apache.spark.streaming.Checkpoint
 
checkpointDuration() - Method in class org.apache.spark.streaming.dstream.DStream
 
checkpointDuration() - Method in class org.apache.spark.streaming.StreamingContext
 
Checkpointed() - Static method in class org.apache.spark.rdd.CheckpointState
 
checkpointFile(String, Time) - Static method in class org.apache.spark.streaming.Checkpoint
Get the checkpoint file for the given checkpoint time
CheckpointingInProgress() - Static method in class org.apache.spark.rdd.CheckpointState
 
checkpointInProgress() - Method in class org.apache.spark.streaming.DStreamGraph
 
checkpointInterval() - Method in interface org.apache.spark.ml.param.HasCheckpointInterval
Param for checkpoint interval.
checkpointInterval() - Method in class org.apache.spark.mllib.impl.PeriodicGraphCheckpointer
 
checkpointInterval() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
 
checkpointInterval() - Method in class org.apache.spark.mllib.tree.impl.NodeIdCache
 
checkpointPath() - Method in class org.apache.spark.rdd.CheckpointRDD
 
CheckpointRDD<T> - Class in org.apache.spark.rdd
This RDD represents an RDD checkpoint file (similar to HadoopRDD).
CheckpointRDD(SparkContext, String, ClassTag<T>) - Constructor for class org.apache.spark.rdd.CheckpointRDD
 
checkpointRDD() - Method in class org.apache.spark.rdd.RDDCheckpointData
 
CheckpointRDDPartition - Class in org.apache.spark.rdd
 
CheckpointRDDPartition(int) - Constructor for class org.apache.spark.rdd.CheckpointRDDPartition
 
CheckpointReader - Class in org.apache.spark.streaming
 
CheckpointReader() - Constructor for class org.apache.spark.streaming.CheckpointReader
 
CheckpointState - Class in org.apache.spark.rdd
Enumeration to manage state transitions of an RDD through checkpointing [ Initialized --> marked for checkpointing --> checkpointing in progress --> checkpointed ]
CheckpointState() - Constructor for class org.apache.spark.rdd.CheckpointState
 
checkpointTime() - Method in class org.apache.spark.streaming.Checkpoint
 
CheckpointWriter - Class in org.apache.spark.streaming
Convenience class to handle the writing of a graph checkpoint to a file.
CheckpointWriter(JobGenerator, SparkConf, String, Configuration) - Constructor for class org.apache.spark.streaming.CheckpointWriter
 
CheckpointWriter.CheckpointWriteHandler - Class in org.apache.spark.streaming
 
CheckpointWriter.CheckpointWriteHandler(Time, byte[], boolean) - Constructor for class org.apache.spark.streaming.CheckpointWriter.CheckpointWriteHandler
 
checkSpeculatableTasks() - Method in class org.apache.spark.scheduler.Pool
 
checkSpeculatableTasks() - Method in interface org.apache.spark.scheduler.Schedulable
 
checkSpeculatableTasks() - Method in class org.apache.spark.scheduler.TaskSchedulerImpl
 
checkSpeculatableTasks() - Method in class org.apache.spark.scheduler.TaskSetManager
Check for tasks to be speculated and return true if there are any.
checkState(boolean, Function0<String>) - Static method in class org.apache.spark.streaming.util.HdfsUtils
 
checkTimeoutInterval() - Method in class org.apache.spark.storage.BlockManagerMasterActor
 
checkUIViewPermissions(String) - Method in class org.apache.spark.SecurityManager
Checks the given user against the view acl list to see if they have authorization to view the UI.
child() - Method in class org.apache.spark.sql.columnar.InMemoryRelation
 
child() - Method in class org.apache.spark.sql.hive.execution.InsertIntoHiveTable
 
child() - Method in class org.apache.spark.sql.hive.execution.ScriptTransformation
 
child() - Method in class org.apache.spark.sql.hive.InsertIntoHiveTable
 
child() - Method in class org.apache.spark.sql.parquet.InsertIntoParquetTable
 
child() - Method in class org.apache.spark.sql.sources.CreateTableUsingAsSelect
 
child() - Method in class org.apache.spark.sql.sources.Not
 
ChildFirstURLClassLoader - Class in org.apache.spark.util
A mutable class loader that gives preference to its own URLs over the parent class loader when loading classes and resources.
ChildFirstURLClassLoader(URL[], ClassLoader) - Constructor for class org.apache.spark.util.ChildFirstURLClassLoader
 
children() - Method in class org.apache.spark.mllib.fpm.FPTree.Node
 
children() - Method in class org.apache.spark.sql.columnar.InMemoryRelation
 
children() - Method in class org.apache.spark.sql.hive.HiveGenericUdaf
 
children() - Method in class org.apache.spark.sql.hive.HiveGenericUdf
 
children() - Method in class org.apache.spark.sql.hive.HiveGenericUdtf
 
children() - Method in class org.apache.spark.sql.hive.HiveSimpleUdf
 
children() - Method in class org.apache.spark.sql.hive.HiveUdaf
 
children() - Method in class org.apache.spark.sql.hive.InsertIntoHiveTable
 
chiSqFunc() - Method in class org.apache.spark.mllib.stat.test.ChiSqTest.Method
 
ChiSqSelector - Class in org.apache.spark.mllib.feature
:: Experimental :: Creates a ChiSquared feature selector.
ChiSqSelector(int) - Constructor for class org.apache.spark.mllib.feature.ChiSqSelector
 
ChiSqSelectorModel - Class in org.apache.spark.mllib.feature
:: Experimental :: Chi Squared selector model.
ChiSqSelectorModel(int[]) - Constructor for class org.apache.spark.mllib.feature.ChiSqSelectorModel
 
chiSqTest(Vector, Vector) - Static method in class org.apache.spark.mllib.stat.Statistics
Conduct Pearson's chi-squared goodness of fit test of the observed data against the expected distribution.
chiSqTest(Vector) - Static method in class org.apache.spark.mllib.stat.Statistics
Conduct Pearson's chi-squared goodness of fit test of the observed data against the uniform distribution, with each category having an expected frequency of 1 / observed.size.
chiSqTest(Matrix) - Static method in class org.apache.spark.mllib.stat.Statistics
Conduct Pearson's independence test on the input contingency matrix, which cannot contain negative entries or columns or rows that sum up to 0.
chiSqTest(RDD<LabeledPoint>) - Static method in class org.apache.spark.mllib.stat.Statistics
Conduct Pearson's independence test for every feature against the label across the input RDD.
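A small sketch of Statistics.chiSqTest in two of the forms listed above; the observed frequencies and contingency matrix are illustrative:

    import org.apache.spark.mllib.linalg.{Matrices, Vectors}
    import org.apache.spark.mllib.stat.Statistics

    // Goodness of fit against the uniform distribution.
    val goodnessOfFit = Statistics.chiSqTest(Vectors.dense(0.1, 0.15, 0.2, 0.3, 0.25))

    // Pearson's independence test on a 3x2 contingency matrix (values given column-major).
    val independence = Statistics.chiSqTest(Matrices.dense(3, 2, Array(1.0, 3.0, 5.0, 2.0, 4.0, 6.0)))

    println(goodnessOfFit.pValue + " " + independence.pValue)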
ChiSqTest - Class in org.apache.spark.mllib.stat.test
Conduct the chi-squared test for the input RDDs using the specified method.
ChiSqTest() - Constructor for class org.apache.spark.mllib.stat.test.ChiSqTest
 
ChiSqTest.Method - Class in org.apache.spark.mllib.stat.test
 
ChiSqTest.Method(String, Function2<Object, Object, Object>) - Constructor for class org.apache.spark.mllib.stat.test.ChiSqTest.Method
 
ChiSqTest.Method$ - Class in org.apache.spark.mllib.stat.test
 
ChiSqTest.Method$() - Constructor for class org.apache.spark.mllib.stat.test.ChiSqTest.Method$
 
ChiSqTest.NullHypothesis$ - Class in org.apache.spark.mllib.stat.test
 
ChiSqTest.NullHypothesis$() - Constructor for class org.apache.spark.mllib.stat.test.ChiSqTest.NullHypothesis$
 
ChiSqTestResult - Class in org.apache.spark.mllib.stat.test
:: Experimental :: Object containing the test results for the chi-squared hypothesis test.
ChiSqTestResult(double, int, double, String, String) - Constructor for class org.apache.spark.mllib.stat.test.ChiSqTestResult
 
chiSquared(Vector, Vector, String) - Static method in class org.apache.spark.mllib.stat.test.ChiSqTest
 
chiSquaredFeatures(RDD<LabeledPoint>, String) - Static method in class org.apache.spark.mllib.stat.test.ChiSqTest
Conduct Pearson's independence test for each feature against the label across the input RDD.
chiSquaredMatrix(Matrix, String) - Static method in class org.apache.spark.mllib.stat.test.ChiSqTest
 
chmod700(File) - Static method in class org.apache.spark.util.Utils
JDK equivalent of chmod 700 file.
classForName(String) - Static method in class org.apache.spark.util.Utils
Preferred alternative to Class.forName(className)
Classification() - Static method in class org.apache.spark.mllib.tree.configuration.Algo
 
ClassificationModel<FeaturesType,M extends ClassificationModel<FeaturesType,M>> - Class in org.apache.spark.ml.classification
:: AlphaComponent :: Model produced by a Classifier.
ClassificationModel() - Constructor for class org.apache.spark.ml.classification.ClassificationModel
 
ClassificationModel - Interface in org.apache.spark.mllib.classification
:: Experimental :: Represents a classification model that predicts to which of a set of categories an example belongs.
Classifier<FeaturesType,E extends Classifier<FeaturesType,E,M>,M extends ClassificationModel<FeaturesType,M>> - Class in org.apache.spark.ml.classification
:: AlphaComponent :: Single-label binary or multiclass classification.
Classifier() - Constructor for class org.apache.spark.ml.classification.Classifier
 
ClassifierParams - Interface in org.apache.spark.ml.classification
:: DeveloperApi :: Params for classification.
classIsLoadable(String) - Static method in class org.apache.spark.util.Utils
Determines whether the provided class is loadable in the current thread.
classLoader() - Method in class org.apache.spark.scheduler.cluster.mesos.MesosSchedulerBackend
 
className() - Method in class org.apache.spark.ExceptionFailure
 
classpathEntries() - Method in class org.apache.spark.ui.env.EnvironmentListener
 
classTag() - Method in class org.apache.spark.api.java.JavaDoubleRDD
 
classTag() - Method in class org.apache.spark.api.java.JavaPairRDD
 
classTag() - Method in class org.apache.spark.api.java.JavaRDD
 
classTag() - Method in interface org.apache.spark.api.java.JavaRDDLike
 
classTag() - Method in class org.apache.spark.sql.types.NativeType
 
classTag() - Method in class org.apache.spark.streaming.api.java.JavaDStream
 
classTag() - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
 
classTag() - Method in class org.apache.spark.streaming.api.java.JavaInputDStream
 
classTag() - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
 
classTag() - Method in class org.apache.spark.streaming.api.java.JavaReceiverInputDStream
 
clean(F, boolean) - Method in class org.apache.spark.SparkContext
Clean a closure to make it ready to be serialized and sent to tasks (removes unreferenced variables in $outer's, updates REPL variables). If checkSerializable is set, clean will also proactively check whether f is serializable and throw a SparkException if not.
clean(Object, boolean) - Static method in class org.apache.spark.util.ClosureCleaner
 
CleanBroadcast - Class in org.apache.spark
 
CleanBroadcast(long) - Constructor for class org.apache.spark.CleanBroadcast
 
cleaner() - Method in class org.apache.spark.SparkContext
 
CleanerListener - Interface in org.apache.spark
Listener class used for testing when any item has been cleaned by the Cleaner class.
CleanRDD - Class in org.apache.spark
 
CleanRDD(int) - Constructor for class org.apache.spark.CleanRDD
 
CleanShuffle - Class in org.apache.spark
 
CleanShuffle(int) - Constructor for class org.apache.spark.CleanShuffle
 
cleanup(long) - Method in class org.apache.spark.SparkContext
Called by MetadataCleaner to clean up the persistentRdds map periodically
cleanup(Time) - Method in class org.apache.spark.streaming.dstream.DStreamCheckpointData
Clean up old checkpoint data.
cleanup(Time) - Method in class org.apache.spark.streaming.dstream.FileInputDStream.FileInputDStreamCheckpointData
 
cleanup(Time) - Method in class org.apache.spark.streaming.kafka.DirectKafkaInputDStream.DirectKafkaInputDStreamCheckpointData
 
cleanUpAfterSchedulerStop() - Method in class org.apache.spark.scheduler.DAGScheduler
 
cleanupOldBatches(Time, boolean) - Method in class org.apache.spark.streaming.scheduler.ReceivedBlockTracker
Clean up block information of old batches.
cleanupOldBlocks(long) - Method in class org.apache.spark.streaming.receiver.BlockManagerBasedBlockHandler
 
CleanupOldBlocks - Class in org.apache.spark.streaming.receiver
 
CleanupOldBlocks(Time) - Constructor for class org.apache.spark.streaming.receiver.CleanupOldBlocks
 
cleanupOldBlocks(long) - Method in interface org.apache.spark.streaming.receiver.ReceivedBlockHandler
Clean up blocks older than the given threshold time.
cleanupOldBlocks(long) - Method in class org.apache.spark.streaming.receiver.WriteAheadLogBasedBlockHandler
 
cleanupOldBlocksAndBatches(Time) - Method in class org.apache.spark.streaming.scheduler.ReceiverTracker
Clean up the data and metadata of blocks and batches that are strictly older than the threshold time.
cleanupOldLogs(long, boolean) - Method in class org.apache.spark.streaming.util.WriteAheadLogManager
Delete the log files that are older than the threshold time.
CleanupTask - Interface in org.apache.spark
Classes that represent cleaning tasks.
CleanupTaskWeakReference - Class in org.apache.spark
A WeakReference associated with a CleanupTask.
CleanupTaskWeakReference(CleanupTask, Object, ReferenceQueue<Object>) - Constructor for class org.apache.spark.CleanupTaskWeakReference
 
clear() - Static method in class org.apache.spark.Accumulators
 
clear() - Method in class org.apache.spark.sql.SQLConf
 
clear() - Method in class org.apache.spark.storage.BlockManagerInfo
 
clear() - Method in class org.apache.spark.storage.BlockStore
 
clear() - Method in class org.apache.spark.storage.MemoryStore
 
clear() - Method in class org.apache.spark.util.BoundedPriorityQueue
 
CLEAR_NULL_VALUES_INTERVAL() - Static method in class org.apache.spark.util.TimeStampedWeakValueHashMap
 
clearActiveContext() - Static method in class org.apache.spark.SparkContext
Clears the active SparkContext metadata.
clearCache() - Method in class org.apache.spark.sql.CacheManager
Clears all cached tables.
clearCache() - Method in class org.apache.spark.sql.SQLContext
Removes all cached tables from the in-memory cache.
clearCallSite() - Method in class org.apache.spark.api.java.JavaSparkContext
Pass-through to SparkContext.setCallSite.
clearCallSite() - Method in class org.apache.spark.SparkContext
Clear the thread-local property for overriding the call sites of actions and RDDs.
clearCheckpointData(Time) - Method in class org.apache.spark.streaming.dstream.DStream
 
clearCheckpointData(Time) - Method in class org.apache.spark.streaming.DStreamGraph
 
ClearCheckpointData - Class in org.apache.spark.streaming.scheduler
 
ClearCheckpointData(Time) - Constructor for class org.apache.spark.streaming.scheduler.ClearCheckpointData
 
clearCheckpointDataLater() - Method in class org.apache.spark.streaming.scheduler.DoCheckpoint
 
clearDependencies() - Method in class org.apache.spark.rdd.CartesianRDD
 
clearDependencies() - Method in class org.apache.spark.rdd.CoalescedRDD
 
clearDependencies() - Method in class org.apache.spark.rdd.CoGroupedRDD
 
clearDependencies() - Method in class org.apache.spark.rdd.PartitionerAwareUnionRDD
 
clearDependencies() - Method in class org.apache.spark.rdd.ShuffledRDD
 
clearDependencies() - Method in class org.apache.spark.rdd.SubtractedRDD
 
clearDependencies() - Method in class org.apache.spark.rdd.UnionRDD
 
clearDependencies() - Method in class org.apache.spark.rdd.ZippedPartitionsBaseRDD
 
clearDependencies() - Method in class org.apache.spark.rdd.ZippedPartitionsRDD2
 
clearDependencies() - Method in class org.apache.spark.rdd.ZippedPartitionsRDD3
 
clearDependencies() - Method in class org.apache.spark.rdd.ZippedPartitionsRDD4
 
clearFiles() - Method in class org.apache.spark.api.java.JavaSparkContext
Clear the job's list of files added by addFile so that they do not get downloaded to any new nodes.
clearFiles() - Method in class org.apache.spark.SparkContext
Clear the job's list of files added by addFile so that they do not get downloaded to any new nodes.
clearJars() - Method in class org.apache.spark.api.java.JavaSparkContext
Clear the job's list of JARs added by addJar so that they do not get downloaded to any new nodes.
clearJars() - Method in class org.apache.spark.SparkContext
Clear the job's list of JARs added by addJar so that they do not get downloaded to any new nodes.
clearJobGroup() - Method in class org.apache.spark.api.java.JavaSparkContext
Clear the current thread's job group ID and its description.
clearJobGroup() - Method in class org.apache.spark.SparkContext
Clear the current thread's job group ID and its description.
clearMetadata(Time) - Method in class org.apache.spark.streaming.dstream.DStream
Clear metadata that is older than the rememberDuration of this DStream.
clearMetadata(Time) - Method in class org.apache.spark.streaming.DStreamGraph
 
ClearMetadata - Class in org.apache.spark.streaming.scheduler
 
ClearMetadata(Time) - Constructor for class org.apache.spark.streaming.scheduler.ClearMetadata
 
clearNullValues() - Method in class org.apache.spark.util.TimeStampedWeakValueHashMap
Remove entries with values that are no longer strongly reachable.
clearOldValues(long, Function2<A, B, BoxedUnit>) - Method in class org.apache.spark.util.TimeStampedHashMap
 
clearOldValues(long) - Method in class org.apache.spark.util.TimeStampedHashMap
Removes old key-value pairs whose timestamp is earlier than `threshTime`.
clearOldValues(long) - Method in class org.apache.spark.util.TimeStampedHashSet
Removes old values whose timestamp is earlier than threshTime.
clearOldValues(long) - Method in class org.apache.spark.util.TimeStampedWeakValueHashMap
Remove old key-value pairs with timestamps earlier than `threshTime`.
clearThreshold() - Method in class org.apache.spark.mllib.classification.LogisticRegressionModel
:: Experimental :: Clears the threshold so that predict will output raw prediction scores.
clearThreshold() - Method in class org.apache.spark.mllib.classification.SVMModel
:: Experimental :: Clears the threshold so that predict will output raw prediction scores.
client() - Method in class org.apache.spark.scheduler.cluster.SparkDeploySchedulerBackend
 
client() - Method in class org.apache.spark.storage.TachyonBlockManager
 
client() - Method in class org.apache.spark.streaming.flume.FlumeConnection
 
clock() - Method in class org.apache.spark.streaming.scheduler.JobGenerator
 
clock() - Method in class org.apache.spark.streaming.scheduler.JobScheduler
 
Clock - Interface in org.apache.spark.util
An interface to represent clocks, so that they can be mocked out in unit tests.
clone() - Method in class org.apache.spark.mllib.evaluation.binary.BinaryLabelCounter
 
clone() - Method in class org.apache.spark.SparkConf
Copy this object
clone(JvmType) - Method in class org.apache.spark.sql.columnar.ColumnType
Creates a copy of the value.
clone() - Method in class org.apache.spark.sql.types.Decimal
 
clone() - Method in class org.apache.spark.storage.StorageLevel
 
clone() - Method in class org.apache.spark.util.random.BernoulliCellSampler
 
clone() - Method in class org.apache.spark.util.random.BernoulliSampler
 
clone() - Method in class org.apache.spark.util.random.PoissonSampler
 
clone() - Method in interface org.apache.spark.util.random.RandomSampler
Return a copy of the RandomSampler object.
clone(T, SerializerInstance, ClassTag<T>) - Static method in class org.apache.spark.util.Utils
Clone an object using a Spark serializer.
cloneComplement() - Method in class org.apache.spark.util.random.BernoulliCellSampler
Return a sampler whose range is the complement of the current sampler's range.
close() - Method in class org.apache.spark.api.java.JavaSparkContext
 
close() - Method in class org.apache.spark.input.FixedLengthBinaryRecordReader
 
close() - Method in class org.apache.spark.input.PortableDataStream
Close the file (if it is currently open)
close() - Method in class org.apache.spark.input.StreamBasedRecordReader
 
close() - Method in class org.apache.spark.input.WholeTextFileRecordReader
 
close() - Method in class org.apache.spark.serializer.DeserializationStream
 
close() - Method in class org.apache.spark.serializer.JavaDeserializationStream
 
close() - Method in class org.apache.spark.serializer.JavaSerializationStream
 
close() - Method in class org.apache.spark.serializer.KryoDeserializationStream
 
close() - Method in class org.apache.spark.serializer.KryoSerializationStream
 
close() - Method in class org.apache.spark.serializer.SerializationStream
 
close() - Method in class org.apache.spark.SparkHadoopWriter
 
close() - Method in class org.apache.spark.sql.hive.SparkHiveDynamicPartitionWriterContainer
 
close() - Method in class org.apache.spark.sql.hive.SparkHiveWriterContainer
 
close() - Method in class org.apache.spark.storage.BlockObjectWriter
 
close() - Method in class org.apache.spark.storage.DiskBlockObjectWriter
 
close() - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
 
close() - Method in class org.apache.spark.streaming.util.RateLimitedOutputStream
 
close() - Method in class org.apache.spark.streaming.util.WriteAheadLogRandomReader
 
close() - Method in class org.apache.spark.streaming.util.WriteAheadLogReader
 
close() - Method in class org.apache.spark.streaming.util.WriteAheadLogWriter
 
closeIfNeeded() - Method in class org.apache.spark.util.NextIterator
Calls the subclass-defined close method, but only once.
ClosureCleaner - Class in org.apache.spark.util
 
ClosureCleaner() - Constructor for class org.apache.spark.util.ClosureCleaner
 
closureSerializer() - Method in class org.apache.spark.SparkEnv
 
cluster() - Method in class org.apache.spark.mllib.clustering.PowerIterationClustering.Assignment
 
clusterCenters() - Method in class org.apache.spark.mllib.clustering.KMeansModel
 
clusterCenters() - Method in class org.apache.spark.mllib.clustering.StreamingKMeansModel
 
clusterWeights() - Method in class org.apache.spark.mllib.clustering.StreamingKMeansModel
 
cn() - Method in class org.apache.spark.mllib.feature.VocabWord
 
coalesce(int) - Method in class org.apache.spark.api.java.JavaDoubleRDD
Return a new RDD that is reduced into numPartitions partitions.
coalesce(int, boolean) - Method in class org.apache.spark.api.java.JavaDoubleRDD
Return a new RDD that is reduced into numPartitions partitions.
coalesce(int) - Method in class org.apache.spark.api.java.JavaPairRDD
Return a new RDD that is reduced into numPartitions partitions.
coalesce(int, boolean) - Method in class org.apache.spark.api.java.JavaPairRDD
Return a new RDD that is reduced into numPartitions partitions.
coalesce(int) - Method in class org.apache.spark.api.java.JavaRDD
Return a new RDD that is reduced into numPartitions partitions.
coalesce(int, boolean) - Method in class org.apache.spark.api.java.JavaRDD
Return a new RDD that is reduced into numPartitions partitions.
coalesce(int, boolean, Ordering<T>) - Method in class org.apache.spark.rdd.RDD
Return a new RDD that is reduced into numPartitions partitions.
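A minimal sketch of RDD.coalesce, assuming an existing SparkContext named sc; note that increasing the partition count requires shuffle = true:

    val rdd = sc.parallelize(1 to 100, 8)
    val narrowed = rdd.coalesce(2)                      // no shuffle; new partitions combine parent partitions
    val widened  = rdd.coalesce(16, shuffle = true)     // a shuffle is needed to increase the partition count
    println(narrowed.partitions.length + " " + widened.partitions.length)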
coalesce(Column...) - Static method in class org.apache.spark.sql.functions
Returns the first column that is not null.
coalesce(Seq<Column>) - Static method in class org.apache.spark.sql.functions
Returns the first column that is not null.
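A minimal sketch of the Column-level coalesce function, assuming an existing DataFrame df with nullable columns "a" and "b" (hypothetical):

    import org.apache.spark.sql.functions.{coalesce, lit}

    // For each row, pick the first non-null of a and b, falling back to the literal 0.
    df.select(coalesce(df("a"), df("b"), lit(0)))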
COALESCE() - Static method in class org.apache.spark.sql.hive.HiveQl
 
CoalescedRDD<T> - Class in org.apache.spark.rdd
Represents a coalesced RDD that has fewer partitions than its parent RDD. This class uses the PartitionCoalescer class to find a good partitioning of the parent RDD, so that each new partition has roughly the same number of parent partitions and the preferred location of each new partition overlaps with as many preferred locations of its parent partitions as possible.
CoalescedRDD(RDD<T>, int, double, ClassTag<T>) - Constructor for class org.apache.spark.rdd.CoalescedRDD
 
CoalescedRDDPartition - Class in org.apache.spark.rdd
Class that captures a coalesced RDD by essentially keeping track of parent partitions
CoalescedRDDPartition(int, RDD<?>, int[], Option<String>) - Constructor for class org.apache.spark.rdd.CoalescedRDDPartition
 
CoarseGrainedClusterMessage - Interface in org.apache.spark.scheduler.cluster
 
CoarseGrainedClusterMessages - Class in org.apache.spark.scheduler.cluster
 
CoarseGrainedClusterMessages() - Constructor for class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages
 
CoarseGrainedClusterMessages.AddWebUIFilter - Class in org.apache.spark.scheduler.cluster
 
CoarseGrainedClusterMessages.AddWebUIFilter(String, Map<String, String>, String) - Constructor for class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.AddWebUIFilter
 
CoarseGrainedClusterMessages.AddWebUIFilter$ - Class in org.apache.spark.scheduler.cluster
 
CoarseGrainedClusterMessages.AddWebUIFilter$() - Constructor for class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.AddWebUIFilter$
 
CoarseGrainedClusterMessages.KillExecutors - Class in org.apache.spark.scheduler.cluster
 
CoarseGrainedClusterMessages.KillExecutors(Seq<String>) - Constructor for class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.KillExecutors
 
CoarseGrainedClusterMessages.KillExecutors$ - Class in org.apache.spark.scheduler.cluster
 
CoarseGrainedClusterMessages.KillExecutors$() - Constructor for class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.KillExecutors$
 
CoarseGrainedClusterMessages.KillTask - Class in org.apache.spark.scheduler.cluster
 
CoarseGrainedClusterMessages.KillTask(long, String, boolean) - Constructor for class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.KillTask
 
CoarseGrainedClusterMessages.KillTask$ - Class in org.apache.spark.scheduler.cluster
 
CoarseGrainedClusterMessages.KillTask$() - Constructor for class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.KillTask$
 
CoarseGrainedClusterMessages.LaunchTask - Class in org.apache.spark.scheduler.cluster
 
CoarseGrainedClusterMessages.LaunchTask(SerializableBuffer) - Constructor for class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.LaunchTask
 
CoarseGrainedClusterMessages.LaunchTask$ - Class in org.apache.spark.scheduler.cluster
 
CoarseGrainedClusterMessages.LaunchTask$() - Constructor for class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.LaunchTask$
 
CoarseGrainedClusterMessages.RegisterClusterManager$ - Class in org.apache.spark.scheduler.cluster
 
CoarseGrainedClusterMessages.RegisterClusterManager$() - Constructor for class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.RegisterClusterManager$
 
CoarseGrainedClusterMessages.RegisteredExecutor$ - Class in org.apache.spark.scheduler.cluster
 
CoarseGrainedClusterMessages.RegisteredExecutor$() - Constructor for class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.RegisteredExecutor$
 
CoarseGrainedClusterMessages.RegisterExecutor - Class in org.apache.spark.scheduler.cluster
 
CoarseGrainedClusterMessages.RegisterExecutor(String, String, int, Map<String, String>) - Constructor for class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.RegisterExecutor
 
CoarseGrainedClusterMessages.RegisterExecutor$ - Class in org.apache.spark.scheduler.cluster
 
CoarseGrainedClusterMessages.RegisterExecutor$() - Constructor for class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.RegisterExecutor$
 
CoarseGrainedClusterMessages.RegisterExecutorFailed - Class in org.apache.spark.scheduler.cluster
 
CoarseGrainedClusterMessages.RegisterExecutorFailed(String) - Constructor for class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.RegisterExecutorFailed
 
CoarseGrainedClusterMessages.RegisterExecutorFailed$ - Class in org.apache.spark.scheduler.cluster
 
CoarseGrainedClusterMessages.RegisterExecutorFailed$() - Constructor for class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.RegisterExecutorFailed$
 
CoarseGrainedClusterMessages.RemoveExecutor - Class in org.apache.spark.scheduler.cluster
 
CoarseGrainedClusterMessages.RemoveExecutor(String, String) - Constructor for class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.RemoveExecutor
 
CoarseGrainedClusterMessages.RemoveExecutor$ - Class in org.apache.spark.scheduler.cluster
 
CoarseGrainedClusterMessages.RemoveExecutor$() - Constructor for class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.RemoveExecutor$
 
CoarseGrainedClusterMessages.RequestExecutors - Class in org.apache.spark.scheduler.cluster
 
CoarseGrainedClusterMessages.RequestExecutors(int) - Constructor for class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.RequestExecutors
 
CoarseGrainedClusterMessages.RequestExecutors$ - Class in org.apache.spark.scheduler.cluster
 
CoarseGrainedClusterMessages.RequestExecutors$() - Constructor for class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.RequestExecutors$
 
CoarseGrainedClusterMessages.RetrieveSparkProps$ - Class in org.apache.spark.scheduler.cluster
 
CoarseGrainedClusterMessages.RetrieveSparkProps$() - Constructor for class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.RetrieveSparkProps$
 
CoarseGrainedClusterMessages.ReviveOffers$ - Class in org.apache.spark.scheduler.cluster
Alternate factory method that takes a ByteBuffer directly for the data field
CoarseGrainedClusterMessages.ReviveOffers$() - Constructor for class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.ReviveOffers$
 
CoarseGrainedClusterMessages.StatusUpdate - Class in org.apache.spark.scheduler.cluster
 
CoarseGrainedClusterMessages.StatusUpdate(String, long, Enumeration.Value, SerializableBuffer) - Constructor for class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.StatusUpdate
 
CoarseGrainedClusterMessages.StatusUpdate$ - Class in org.apache.spark.scheduler.cluster
 
CoarseGrainedClusterMessages.StatusUpdate$() - Constructor for class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.StatusUpdate$
 
CoarseGrainedClusterMessages.StopDriver$ - Class in org.apache.spark.scheduler.cluster
 
CoarseGrainedClusterMessages.StopDriver$() - Constructor for class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.StopDriver$
 
CoarseGrainedClusterMessages.StopExecutor$ - Class in org.apache.spark.scheduler.cluster
 
CoarseGrainedClusterMessages.StopExecutor$() - Constructor for class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.StopExecutor$
 
CoarseGrainedClusterMessages.StopExecutors$ - Class in org.apache.spark.scheduler.cluster
 
CoarseGrainedClusterMessages.StopExecutors$() - Constructor for class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.StopExecutors$
 
CoarseGrainedSchedulerBackend - Class in org.apache.spark.scheduler.cluster
A scheduler backend that waits for coarse-grained executors to connect to it through Akka.
CoarseGrainedSchedulerBackend(TaskSchedulerImpl, ActorSystem) - Constructor for class org.apache.spark.scheduler.cluster.CoarseGrainedSchedulerBackend
 
CoarseGrainedSchedulerBackend.DriverActor - Class in org.apache.spark.scheduler.cluster
 
CoarseGrainedSchedulerBackend.DriverActor(Seq<Tuple2<String, String>>) - Constructor for class org.apache.spark.scheduler.cluster.CoarseGrainedSchedulerBackend.DriverActor
 
CoarseMesosSchedulerBackend - Class in org.apache.spark.scheduler.cluster.mesos
A SchedulerBackend that runs tasks on Mesos, but uses "coarse-grained" tasks, where it holds onto each Mesos node for the duration of the Spark job instead of relinquishing cores whenever a task is done.
CoarseMesosSchedulerBackend(TaskSchedulerImpl, SparkContext, String) - Constructor for class org.apache.spark.scheduler.cluster.mesos.CoarseMesosSchedulerBackend
 
code() - Method in class org.apache.spark.mllib.feature.VocabWord
 
CODEGEN_ENABLED() - Static method in class org.apache.spark.sql.SQLConf
 
codegenEnabled() - Method in class org.apache.spark.sql.SQLConf
When set to true, Spark SQL will use the Scala compiler at runtime to generate custom bytecode that evaluates expressions found in queries.
codeLen() - Method in class org.apache.spark.mllib.feature.VocabWord
 
cogroup(JavaPairRDD<K, W>, Partitioner) - Method in class org.apache.spark.api.java.JavaPairRDD
For each key k in this or other, return a resulting RDD that contains a tuple with the list of values for that key in this as well as other.
cogroup(JavaPairRDD<K, W1>, JavaPairRDD<K, W2>, Partitioner) - Method in class org.apache.spark.api.java.JavaPairRDD
For each key k in this or other1 or other2, return a resulting RDD that contains a tuple with the list of values for that key in this, other1 and other2.
cogroup(JavaPairRDD<K, W1>, JavaPairRDD<K, W2>, JavaPairRDD<K, W3>, Partitioner) - Method in class org.apache.spark.api.java.JavaPairRDD
For each key k in this or other1 or other2 or other3, return a resulting RDD that contains a tuple with the list of values for that key in this, other1, other2 and other3.
cogroup(JavaPairRDD<K, W>) - Method in class org.apache.spark.api.java.JavaPairRDD
For each key k in this or other, return a resulting RDD that contains a tuple with the list of values for that key in this as well as other.
cogroup(JavaPairRDD<K, W1>, JavaPairRDD<K, W2>) - Method in class org.apache.spark.api.java.JavaPairRDD
For each key k in this or other1 or other2, return a resulting RDD that contains a tuple with the list of values for that key in this, other1 and other2.
cogroup(JavaPairRDD<K, W1>, JavaPairRDD<K, W2>, JavaPairRDD<K, W3>) - Method in class org.apache.spark.api.java.JavaPairRDD
For each key k in this or other1 or other2 or other3, return a resulting RDD that contains a tuple with the list of values for that key in this, other1, other2 and other3.
cogroup(JavaPairRDD<K, W>, int) - Method in class org.apache.spark.api.java.JavaPairRDD
For each key k in this or other, return a resulting RDD that contains a tuple with the list of values for that key in this as well as other.
cogroup(JavaPairRDD<K, W1>, JavaPairRDD<K, W2>, int) - Method in class org.apache.spark.api.java.JavaPairRDD
For each key k in this or other1 or other2, return a resulting RDD that contains a tuple with the list of values for that key in this, other1 and other2.
cogroup(JavaPairRDD<K, W1>, JavaPairRDD<K, W2>, JavaPairRDD<K, W3>, int) - Method in class org.apache.spark.api.java.JavaPairRDD
For each key k in this or other1 or other2 or other3, return a resulting RDD that contains a tuple with the list of values for that key in this, other1, other2 and other3.
cogroup(RDD<Tuple2<K, W1>>, RDD<Tuple2<K, W2>>, RDD<Tuple2<K, W3>>, Partitioner) - Method in class org.apache.spark.rdd.PairRDDFunctions
For each key k in this or other1 or other2 or other3, return a resulting RDD that contains a tuple with the list of values for that key in this, other1, other2 and other3.
cogroup(RDD<Tuple2<K, W>>, Partitioner) - Method in class org.apache.spark.rdd.PairRDDFunctions
 
cogroup(RDD<Tuple2<K, W1>>, RDD<Tuple2<K, W2>>, Partitioner) - Method in class org.apache.spark.rdd.PairRDDFunctions
 
cogroup(RDD<Tuple2<K, W1>>, RDD<Tuple2<K, W2>>, RDD<Tuple2<K, W3>>) - Method in class org.apache.spark.rdd.PairRDDFunctions
 
cogroup(RDD<Tuple2<K, W>>) - Method in class org.apache.spark.rdd.PairRDDFunctions
For each key k in this or other, return a resulting RDD that contains a tuple with the list of values for that key in this as well as other.
cogroup(RDD<Tuple2<K, W1>>, RDD<Tuple2<K, W2>>) - Method in class org.apache.spark.rdd.PairRDDFunctions
For each key k in this or other1 or other2, return a resulting RDD that contains a tuple with the list of values for that key in this, other1 and other2.
cogroup(RDD<Tuple2<K, W>>, int) - Method in class org.apache.spark.rdd.PairRDDFunctions
For each key k in this or other, return a resulting RDD that contains a tuple with the list of values for that key in this as well as other.
cogroup(RDD<Tuple2<K, W1>>, RDD<Tuple2<K, W2>>, int) - Method in class org.apache.spark.rdd.PairRDDFunctions
For each key k in this or other1 or other2, return a resulting RDD that contains a tuple with the list of values for that key in this, other1 and other2.
cogroup(RDD<Tuple2<K, W1>>, RDD<Tuple2<K, W2>>, RDD<Tuple2<K, W3>>, int) - Method in class org.apache.spark.rdd.PairRDDFunctions
For each key k in this or other1 or other2 or other3, return a resulting RDD that contains a tuple with the list of values for that key in this, other1, other2 and other3.
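A small sketch of cogroup on pair RDDs, assuming an existing SparkContext named sc; keys present in only one RDD yield an empty iterable on the other side:

    val scores = sc.parallelize(Seq(("alice", 1), ("bob", 2), ("alice", 3)))
    val ages   = sc.parallelize(Seq(("alice", 30), ("carol", 25)))

    // RDD of (key, (values from scores, values from ages))
    val grouped = scores.cogroup(ages)
    grouped.collect().foreach { case (k, (vs, ws)) => println(s"$k -> ${vs.toList} / ${ws.toList}") }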
cogroup(JavaPairDStream<K, W>) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
Return a new DStream by applying 'cogroup' between RDDs of this DStream and other DStream.
cogroup(JavaPairDStream<K, W>, int) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
Return a new DStream by applying 'cogroup' between RDDs of this DStream and other DStream.
cogroup(JavaPairDStream<K, W>, Partitioner) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
Return a new DStream by applying 'cogroup' between RDDs of this DStream and other DStream.
cogroup(DStream<Tuple2<K, W>>, ClassTag<W>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
Return a new DStream by applying 'cogroup' between RDDs of this DStream and other DStream.
cogroup(DStream<Tuple2<K, W>>, int, ClassTag<W>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
Return a new DStream by applying 'cogroup' between RDDs of this DStream and other DStream.
cogroup(DStream<Tuple2<K, W>>, Partitioner, ClassTag<W>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
Return a new DStream by applying 'cogroup' between RDDs of this DStream and other DStream.
CoGroupedRDD<K> - Class in org.apache.spark.rdd
:: DeveloperApi :: An RDD that cogroups its parents.
CoGroupedRDD(Seq<RDD<? extends Product2<K, ?>>>, Partitioner) - Constructor for class org.apache.spark.rdd.CoGroupedRDD
 
CoGroupPartition - Class in org.apache.spark.rdd
 
CoGroupPartition(int, CoGroupSplitDep[]) - Constructor for class org.apache.spark.rdd.CoGroupPartition
 
cogroupResult2ToJava(RDD<Tuple2<K, Tuple3<Iterable<V>, Iterable<W1>, Iterable<W2>>>>, ClassTag<K>) - Static method in class org.apache.spark.api.java.JavaPairRDD
 
cogroupResult3ToJava(RDD<Tuple2<K, Tuple4<Iterable<V>, Iterable<W1>, Iterable<W2>, Iterable<W3>>>>, ClassTag<K>) - Static method in class org.apache.spark.api.java.JavaPairRDD
 
cogroupResultToJava(RDD<Tuple2<K, Tuple2<Iterable<V>, Iterable<W>>>>, ClassTag<K>) - Static method in class org.apache.spark.api.java.JavaPairRDD
 
CoGroupSplitDep - Interface in org.apache.spark.rdd
 
col(String) - Method in class org.apache.spark.sql.DataFrame
Selects column based on the column name and return it as a Column.
col(String) - Static method in class org.apache.spark.sql.functions
Returns a Column based on the given column name.
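A short sketch of column selection with col (assuming a DataFrame named df that has an "age" column):

    import org.apache.spark.sql.functions.col

    // col(...) builds an untyped Column expression; DataFrame.col resolves against df.
    val adults = df.filter(col("age") >= 21).select(df.col("age"))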
collect() - Method in interface org.apache.spark.api.java.JavaRDDLike
Return an array that contains all of the elements in this RDD.
collect() - Method in class org.apache.spark.graphx.impl.EdgeRDDImpl
 
collect() - Method in class org.apache.spark.rdd.RDD
Return an array that contains all of the elements in this RDD.
collect(PartialFunction<T, U>, ClassTag<U>) - Method in class org.apache.spark.rdd.RDD
Return an RDD that contains all matching values by applying f.
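A sketch of collect with a partial function (assuming sc is a SparkContext): only elements the partial function is defined for are kept, and the function is applied to each match.

    val nums = sc.parallelize(Seq(1, 2, 3, 4, 5))
    // Keep the even numbers and double them; odd numbers are silently dropped.
    val evensDoubled = nums.collect { case x if x % 2 == 0 => x * 2 }
    // evensDoubled.collect() == Array(4, 8)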
collect() - Method in class org.apache.spark.sql.DataFrame
Returns an array that contains all Rows in this DataFrame.
collect() - Method in interface org.apache.spark.sql.RDDApi
 
collectAsList() - Method in class org.apache.spark.sql.DataFrame
Returns a Java list that contains all Rows in this DataFrame.
collectAsList() - Method in interface org.apache.spark.sql.RDDApi
 
collectAsMap() - Method in class org.apache.spark.api.java.JavaPairRDD
Return the key-value pairs in this RDD to the master as a Map.
collectAsMap() - Method in class org.apache.spark.rdd.PairRDDFunctions
Return the key-value pairs in this RDD to the master as a Map.
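A sketch of collectAsMap (assuming sc): all key-value pairs are returned to the driver as a local Map, so it should only be used on small RDDs; if a key appears more than once, only one of its values is kept.

    val pairs = sc.parallelize(Seq(("a", 1), ("b", 2), ("a", 3)))
    val asMap = pairs.collectAsMap()   // scala.collection.Map, e.g. Map(a -> 3, b -> 2)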
collectAsync() - Method in interface org.apache.spark.api.java.JavaRDDLike
The asynchronous version of collect, which returns a future for retrieving an array containing all of the elements in this RDD.
collectAsync() - Method in class org.apache.spark.rdd.AsyncRDDActions
Returns a future for retrieving all elements of this RDD.
collectEdges(EdgeDirection) - Method in class org.apache.spark.graphx.GraphOps
Returns an RDD that contains for each vertex v its local edges, i.e., the edges that are incident on v, in the user-specified direction.
collectedStatistics() - Method in class org.apache.spark.sql.columnar.BinaryColumnStats
 
collectedStatistics() - Method in class org.apache.spark.sql.columnar.BooleanColumnStats
 
collectedStatistics() - Method in class org.apache.spark.sql.columnar.ByteColumnStats
 
collectedStatistics() - Method in interface org.apache.spark.sql.columnar.ColumnStats
Column statistics represented as a single row, currently including closed lower bound, closed upper bound and null count.
collectedStatistics() - Method in class org.apache.spark.sql.columnar.DoubleColumnStats
 
collectedStatistics() - Method in class org.apache.spark.sql.columnar.FloatColumnStats
 
collectedStatistics() - Method in class org.apache.spark.sql.columnar.GenericColumnStats
 
collectedStatistics() - Method in class org.apache.spark.sql.columnar.IntColumnStats
 
collectedStatistics() - Method in class org.apache.spark.sql.columnar.LongColumnStats
 
collectedStatistics() - Method in class org.apache.spark.sql.columnar.NoopColumnStats
 
collectedStatistics() - Method in class org.apache.spark.sql.columnar.ShortColumnStats
 
collectedStatistics() - Method in class org.apache.spark.sql.columnar.StringColumnStats
 
collectedStatistics() - Method in class org.apache.spark.sql.columnar.TimestampColumnStats
 
CollectionsUtils - Class in org.apache.spark.util
 
CollectionsUtils() - Constructor for class org.apache.spark.util.CollectionsUtils
 
collectNeighborIds(EdgeDirection) - Method in class org.apache.spark.graphx.GraphOps
Collect the neighbor vertex ids for each vertex.
collectNeighbors(EdgeDirection) - Method in class org.apache.spark.graphx.GraphOps
Collect the neighbor vertex attributes for each vertex.
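A sketch of neighbor collection in GraphX (assuming sc; the toy graph is hypothetical):

    import org.apache.spark.graphx.{Edge, EdgeDirection, Graph}

    val vertices = sc.parallelize(Seq((1L, "a"), (2L, "b"), (3L, "c")))
    val edges    = sc.parallelize(Seq(Edge(1L, 2L, 1), Edge(2L, 3L, 1)))
    val graph    = Graph(vertices, edges)
    // VertexRDD[Array[VertexId]]; vertex 2L maps to Array(1L, 3L) when using Either.
    val neighborIds = graph.collectNeighborIds(EdgeDirection.Either)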
collectPartitions(int[]) - Method in interface org.apache.spark.api.java.JavaRDDLike
Return an array that contains all of the elements in a specific partition of this RDD.
collectPartitions() - Method in class org.apache.spark.rdd.RDD
A private method for tests, to look at the contents of each partition
colPtrs() - Method in class org.apache.spark.mllib.linalg.SparseMatrix
 
cols() - Method in class org.apache.spark.mllib.linalg.distributed.GridPartitioner
 
colsPerBlock() - Method in class org.apache.spark.mllib.linalg.distributed.BlockMatrix
 
colsPerPart() - Method in class org.apache.spark.mllib.linalg.distributed.GridPartitioner
 
colStats(RDD<Vector>) - Static method in class org.apache.spark.mllib.stat.Statistics
Computes column-wise summary statistics for the input RDD[Vector].
Column - Class in org.apache.spark.sql
:: Experimental :: A column in a DataFrame.
Column(Expression) - Constructor for class org.apache.spark.sql.Column
 
Column(String) - Constructor for class org.apache.spark.sql.Column
 
column(String) - Static method in class org.apache.spark.sql.functions
Returns a Column based on the given column name.
column() - Method in class org.apache.spark.sql.jdbc.JDBCPartitioningInfo
 
COLUMN_BATCH_SIZE() - Static method in class org.apache.spark.sql.SQLConf
 
COLUMN_NAME_OF_CORRUPT_RECORD() - Static method in class org.apache.spark.sql.SQLConf
 
ColumnAccessor - Interface in org.apache.spark.sql.columnar
An Iterator-like trait used to extract values from a columnar byte buffer.
columnBatchSize() - Method in class org.apache.spark.sql.SQLConf
The number of rows that will be
ColumnBuilder - Interface in org.apache.spark.sql.columnar
 
ColumnName - Class in org.apache.spark.sql
:: Experimental :: A convenient class used for constructing a schema.
ColumnName(String) - Constructor for class org.apache.spark.sql.ColumnName
 
columnNameOfCorruptRecord() - Method in class org.apache.spark.sql.SQLConf
 
columnNames() - Method in class org.apache.spark.sql.parquet.ParquetRelation2.PartitionValues
 
columnOrdinals() - Method in class org.apache.spark.sql.hive.MetastoreRelation
An attribute map for determining the ordinal for non-partition columns.
columnPartition(JDBCPartitioningInfo) - Static method in class org.apache.spark.sql.jdbc.JDBCRelation
Given a partitioning schematic (a column of integral type, a number of partitions, and upper and lower bounds on the column's value), generate WHERE clauses for each partition so that each row in the table appears exactly once.
columnPruningPred() - Method in class org.apache.spark.sql.parquet.ParquetTableScan
 
columns() - Method in class org.apache.spark.sql.DataFrame
Returns all column names as an array.
columnSimilarities() - Method in class org.apache.spark.mllib.linalg.distributed.RowMatrix
Compute all cosine similarities between columns of this matrix using the brute-force approach of computing normalized dot products.
columnSimilarities(double) - Method in class org.apache.spark.mllib.linalg.distributed.RowMatrix
Compute similarities between columns of this matrix using a sampling approach.
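A sketch of column similarities on a RowMatrix (assuming sc; the rows are hypothetical):

    import org.apache.spark.mllib.linalg.Vectors
    import org.apache.spark.mllib.linalg.distributed.RowMatrix

    val rows = sc.parallelize(Seq(
      Vectors.dense(1.0, 0.0, 2.0),
      Vectors.dense(0.0, 3.0, 4.0)))
    val mat = new RowMatrix(rows)
    val exact  = mat.columnSimilarities()      // brute-force cosine similarities
    val approx = mat.columnSimilarities(0.1)   // sampling approach with threshold 0.1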
columnSimilaritiesDIMSUM(double[], double) - Method in class org.apache.spark.mllib.linalg.distributed.RowMatrix
Find all similar columns using the DIMSUM sampling algorithm, described in two papers
ColumnStatisticsSchema - Class in org.apache.spark.sql.columnar
 
ColumnStatisticsSchema(Attribute) - Constructor for class org.apache.spark.sql.columnar.ColumnStatisticsSchema
 
columnStats() - Method in class org.apache.spark.sql.columnar.BasicColumnBuilder
 
columnStats() - Method in interface org.apache.spark.sql.columnar.ColumnBuilder
Column statistics information
ColumnStats - Interface in org.apache.spark.sql.columnar
Used to collect statistical information when building in-memory columns.
columnStats() - Method in class org.apache.spark.sql.columnar.NativeColumnBuilder
 
columnType() - Method in class org.apache.spark.sql.columnar.BasicColumnBuilder
 
ColumnType<T extends DataType,JvmType> - Class in org.apache.spark.sql.columnar
An abstract class that represents the type of a column.
ColumnType(int, int) - Constructor for class org.apache.spark.sql.columnar.ColumnType
 
columnType() - Method in class org.apache.spark.sql.columnar.NativeColumnBuilder
 
combineByKey(Function<V, C>, Function2<C, V, C>, Function2<C, C, C>, Partitioner) - Method in class org.apache.spark.api.java.JavaPairRDD
Generic function to combine the elements for each key using a custom set of aggregation functions.
combineByKey(Function<V, C>, Function2<C, V, C>, Function2<C, C, C>, int) - Method in class org.apache.spark.api.java.JavaPairRDD
Simplified version of combineByKey that hash-partitions the output RDD.
combineByKey(Function<V, C>, Function2<C, V, C>, Function2<C, C, C>) - Method in class org.apache.spark.api.java.JavaPairRDD
Simplified version of combineByKey that hash-partitions the resulting RDD using the existing partitioner/parallelism level.
combineByKey(Function1<V, C>, Function2<C, V, C>, Function2<C, C, C>, Partitioner, boolean, Serializer) - Method in class org.apache.spark.rdd.PairRDDFunctions
Generic function to combine the elements for each key using a custom set of aggregation functions.
combineByKey(Function1<V, C>, Function2<C, V, C>, Function2<C, C, C>, int) - Method in class org.apache.spark.rdd.PairRDDFunctions
Simplified version of combineByKey that hash-partitions the output RDD.
combineByKey(Function1<V, C>, Function2<C, V, C>, Function2<C, C, C>) - Method in class org.apache.spark.rdd.PairRDDFunctions
 
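A sketch of combineByKey computing per-key averages (assuming sc): createCombiner starts an accumulator for a key, mergeValue folds values within a partition, and mergeCombiners merges partial results across partitions.

    val scores = sc.parallelize(Seq(("a", 1.0), ("a", 3.0), ("b", 4.0)))
    val sumCount = scores.combineByKey(
      (v: Double) => (v, 1),                                              // createCombiner
      (acc: (Double, Int), v: Double) => (acc._1 + v, acc._2 + 1),        // mergeValue
      (a: (Double, Int), b: (Double, Int)) => (a._1 + b._1, a._2 + b._2)) // mergeCombiners
    val averages = sumCount.mapValues { case (sum, n) => sum / n }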
combineByKey(Function<V, C>, Function2<C, V, C>, Function2<C, C, C>, Partitioner) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
Combine elements of each key in DStream's RDDs using custom function.
combineByKey(Function<V, C>, Function2<C, V, C>, Function2<C, C, C>, Partitioner, boolean) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
Combine elements of each key in DStream's RDDs using custom function.
combineByKey(Function1<V, C>, Function2<C, V, C>, Function2<C, C, C>, Partitioner, boolean, ClassTag<C>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
Combine elements of each key in DStream's RDDs using custom functions.
combineCombinersByKey(Iterator<Product2<K, C>>) - Method in class org.apache.spark.Aggregator
 
combineCombinersByKey(Iterator<Product2<K, C>>, TaskContext) - Method in class org.apache.spark.Aggregator
 
combineValuesByKey(Iterator<Product2<K, V>>) - Method in class org.apache.spark.Aggregator
 
combineValuesByKey(Iterator<Product2<K, V>>, TaskContext) - Method in class org.apache.spark.Aggregator
 
combiningStrategy() - Method in class org.apache.spark.mllib.tree.model.TreeEnsembleModel.SaveLoadV1_0$.Metadata
 
command() - Method in class org.apache.spark.sql.UserDefinedPythonFunction
 
commit() - Method in class org.apache.spark.SparkHadoopWriter
 
commitAndClose() - Method in class org.apache.spark.storage.BlockObjectWriter
Flush the partial writes and commit them as a single atomic block.
commitAndClose() - Method in class org.apache.spark.storage.DiskBlockObjectWriter
 
commitJob() - Method in class org.apache.spark.SparkHadoopWriter
 
commitJob() - Method in class org.apache.spark.sql.hive.SparkHiveDynamicPartitionWriterContainer
 
commitJob() - Method in class org.apache.spark.sql.hive.SparkHiveWriterContainer
 
commonHeaderNodes() - Static method in class org.apache.spark.ui.UIUtils
 
comparator(Schedulable, Schedulable) - Method in class org.apache.spark.scheduler.FairSchedulingAlgorithm
 
comparator(Schedulable, Schedulable) - Method in class org.apache.spark.scheduler.FIFOSchedulingAlgorithm
 
comparator(Schedulable, Schedulable) - Method in interface org.apache.spark.scheduler.SchedulingAlgorithm
 
compare(PartitionGroup, PartitionGroup) - Method in class org.apache.spark.rdd.PartitionCoalescer
 
compare(Option<PartitionGroup>, Option<PartitionGroup>) - Method in class org.apache.spark.rdd.PartitionCoalescer
 
compare(Decimal) - Method in class org.apache.spark.sql.types.Decimal
 
compare(Decimal, Decimal) - Method in interface org.apache.spark.sql.types.Decimal.DecimalIsConflicted
 
compare(RDDInfo) - Method in class org.apache.spark.storage.RDDInfo
 
compatibilityBlackList() - Static method in class org.apache.spark.sql.hive.HiveShim
 
compatibleType(DataType, DataType) - Static method in class org.apache.spark.sql.json.JsonRDD
Returns the most general data type for two given data types.
completedIndices() - Method in class org.apache.spark.ui.jobs.UIData.StageUIData
 
completedJobs() - Method in class org.apache.spark.ui.jobs.JobProgressListener
 
completedStageIndices() - Method in class org.apache.spark.ui.jobs.UIData.JobUIData
 
completedStages() - Method in class org.apache.spark.ui.jobs.JobProgressListener
 
completedTasks() - Method in class org.apache.spark.ui.exec.ExecutorSummaryInfo
 
completion() - Method in class org.apache.spark.util.CompletionIterator
 
CompletionEvent - Class in org.apache.spark.scheduler
 
CompletionEvent(Task<?>, TaskEndReason, Object, Map<Object, Object>, TaskInfo, TaskMetrics) - Constructor for class org.apache.spark.scheduler.CompletionEvent
 
CompletionIterator<A,I extends scala.collection.Iterator<A>> - Class in org.apache.spark.util
Wrapper around an iterator which calls a completion method after it successfully iterates through all the elements.
CompletionIterator(I) - Constructor for class org.apache.spark.util.CompletionIterator
 
completionTime() - Method in class org.apache.spark.scheduler.StageInfo
Time when all tasks in the stage completed or when the stage was cancelled.
completionTime() - Method in class org.apache.spark.ui.jobs.UIData.JobUIData
 
ComplexColumnBuilder<T extends DataType,JvmType> - Class in org.apache.spark.sql.columnar
 
ComplexColumnBuilder(ColumnStats, ColumnType<T, JvmType>) - Constructor for class org.apache.spark.sql.columnar.ComplexColumnBuilder
 
ComplexFutureAction<T> - Class in org.apache.spark
A FutureAction for actions that could trigger multiple Spark jobs.
ComplexFutureAction() - Constructor for class org.apache.spark.ComplexFutureAction
 
compress() - Method in class org.apache.spark.ml.recommendation.ALS.UncompressedInBlock
Compresses the block into an ALS.InBlock.
compress(ByteBuffer, ByteBuffer) - Method in class org.apache.spark.sql.columnar.compression.BooleanBitSet.Encoder
 
compress(ByteBuffer, ByteBuffer) - Method in class org.apache.spark.sql.columnar.compression.DictionaryEncoding.Encoder
 
compress(ByteBuffer, ByteBuffer) - Method in interface org.apache.spark.sql.columnar.compression.Encoder
 
compress(ByteBuffer, ByteBuffer) - Method in class org.apache.spark.sql.columnar.compression.IntDelta.Encoder
 
compress(ByteBuffer, ByteBuffer) - Method in class org.apache.spark.sql.columnar.compression.LongDelta.Encoder
 
compress(ByteBuffer, ByteBuffer) - Method in class org.apache.spark.sql.columnar.compression.PassThrough.Encoder
 
compress(ByteBuffer, ByteBuffer) - Method in class org.apache.spark.sql.columnar.compression.RunLengthEncoding.Encoder
 
COMPRESS_CACHED() - Static method in class org.apache.spark.sql.SQLConf
 
compressCodec() - Method in class org.apache.spark.sql.hive.ShimFileSinkDesc
 
compressed() - Method in class org.apache.spark.sql.hive.ShimFileSinkDesc
 
compressedInputStream(InputStream) - Method in interface org.apache.spark.io.CompressionCodec
 
compressedInputStream(InputStream) - Method in class org.apache.spark.io.LZ4CompressionCodec
 
compressedInputStream(InputStream) - Method in class org.apache.spark.io.LZFCompressionCodec
 
compressedInputStream(InputStream) - Method in class org.apache.spark.io.SnappyCompressionCodec
 
CompressedMapStatus - Class in org.apache.spark.scheduler
A MapStatus implementation that tracks the size of each block.
CompressedMapStatus(BlockManagerId, byte[]) - Constructor for class org.apache.spark.scheduler.CompressedMapStatus
 
CompressedMapStatus(BlockManagerId, long[]) - Constructor for class org.apache.spark.scheduler.CompressedMapStatus
 
compressedOutputStream(OutputStream) - Method in interface org.apache.spark.io.CompressionCodec
 
compressedOutputStream(OutputStream) - Method in class org.apache.spark.io.LZ4CompressionCodec
 
compressedOutputStream(OutputStream) - Method in class org.apache.spark.io.LZFCompressionCodec
 
compressedOutputStream(OutputStream) - Method in class org.apache.spark.io.SnappyCompressionCodec
 
compressedSize() - Method in class org.apache.spark.sql.columnar.compression.BooleanBitSet.Encoder
 
compressedSize() - Method in class org.apache.spark.sql.columnar.compression.DictionaryEncoding.Encoder
 
compressedSize() - Method in interface org.apache.spark.sql.columnar.compression.Encoder
 
compressedSize() - Method in class org.apache.spark.sql.columnar.compression.IntDelta.Encoder
 
compressedSize() - Method in class org.apache.spark.sql.columnar.compression.LongDelta.Encoder
 
compressedSize() - Method in class org.apache.spark.sql.columnar.compression.PassThrough.Encoder
 
compressedSize() - Method in class org.apache.spark.sql.columnar.compression.RunLengthEncoding.Encoder
 
CompressibleColumnAccessor<T extends NativeType> - Interface in org.apache.spark.sql.columnar.compression
 
CompressibleColumnBuilder<T extends NativeType> - Interface in org.apache.spark.sql.columnar.compression
A stackable trait that builds optionally compressed byte buffer for a column.
COMPRESSION_CODEC_KEY() - Static method in class org.apache.spark.scheduler.EventLoggingListener
 
CompressionCodec - Interface in org.apache.spark.io
:: DeveloperApi :: CompressionCodec allows the customization of choosing different compression implementations to be used in block storage.
compressionCodec() - Method in class org.apache.spark.streaming.CheckpointWriter
 
compressionEncoders() - Method in interface org.apache.spark.sql.columnar.compression.CompressibleColumnBuilder
 
compressionRatio() - Method in interface org.apache.spark.sql.columnar.compression.Encoder
 
CompressionScheme - Interface in org.apache.spark.sql.columnar.compression
 
compressType() - Method in class org.apache.spark.sql.hive.ShimFileSinkDesc
 
compute(Partition, TaskContext) - Method in class org.apache.spark.graphx.EdgeRDD
 
compute(Partition, TaskContext) - Method in class org.apache.spark.graphx.VertexRDD
Provides the RDD[(VertexId, VD)] equivalent output.
compute(Vector, double, Vector) - Method in class org.apache.spark.mllib.optimization.Gradient
Compute the gradient and loss given the features of a single data point.
compute(Vector, double, Vector, Vector) - Method in class org.apache.spark.mllib.optimization.Gradient
Compute the gradient and loss given the features of a single data point, add the gradient to a provided vector to avoid creating new objects, and return loss.
compute(Vector, double, Vector) - Method in class org.apache.spark.mllib.optimization.HingeGradient
 
compute(Vector, double, Vector, Vector) - Method in class org.apache.spark.mllib.optimization.HingeGradient
 
compute(Vector, Vector, double, int, double) - Method in class org.apache.spark.mllib.optimization.L1Updater
 
compute(Vector, double, Vector) - Method in class org.apache.spark.mllib.optimization.LeastSquaresGradient
 
compute(Vector, double, Vector, Vector) - Method in class org.apache.spark.mllib.optimization.LeastSquaresGradient
 
compute(Vector, double, Vector) - Method in class org.apache.spark.mllib.optimization.LogisticGradient
 
compute(Vector, double, Vector, Vector) - Method in class org.apache.spark.mllib.optimization.LogisticGradient
 
compute(Vector, Vector, double, int, double) - Method in class org.apache.spark.mllib.optimization.SimpleUpdater
 
compute(Vector, Vector, double, int, double) - Method in class org.apache.spark.mllib.optimization.SquaredL2Updater
 
compute(Vector, Vector, double, int, double) - Method in class org.apache.spark.mllib.optimization.Updater
Compute an updated value for weights given the gradient, stepSize, iteration number and regularization parameter.
compute(Partition, TaskContext) - Method in class org.apache.spark.mllib.rdd.RandomRDD
 
compute(Partition, TaskContext) - Method in class org.apache.spark.mllib.rdd.RandomVectorRDD
 
compute(Partition, TaskContext) - Method in class org.apache.spark.mllib.rdd.SlidingRDD
 
compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.BlockRDD
 
compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.CartesianRDD
 
compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.CheckpointRDD
 
compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.CoalescedRDD
 
compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.CoGroupedRDD
 
compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.EmptyRDD
 
compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.HadoopRDD
 
compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.HadoopRDD.HadoopMapPartitionsWithSplitRDD
 
compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.JdbcRDD
 
compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.MapPartitionsRDD
 
compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.NewHadoopRDD
 
compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.NewHadoopRDD.NewHadoopMapPartitionsWithSplitRDD
 
compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.ParallelCollectionRDD
 
compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.PartitionerAwareUnionRDD
 
compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.PartitionPruningRDD
 
compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.PartitionwiseSampledRDD
 
compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.PipedRDD
 
compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.RDD
:: DeveloperApi :: Implemented by subclasses to compute a given partition.
compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.SampledRDD
 
compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.ShuffledRDD
 
compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.SubtractedRDD
 
compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.UnionRDD
 
compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.ZippedPartitionsRDD2
 
compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.ZippedPartitionsRDD3
 
compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.ZippedPartitionsRDD4
 
compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.ZippedWithIndexRDD
 
compute(Partition, TaskContext) - Method in class org.apache.spark.sql.jdbc.JDBCRDD
Runs the SQL query against the JDBC driver.
compute(Time) - Method in class org.apache.spark.streaming.api.java.JavaDStream
Generate an RDD for the given duration
compute(Time) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
Method that generates an RDD for the given Duration
compute(Time) - Method in class org.apache.spark.streaming.dstream.ConstantInputDStream
 
compute(Time) - Method in class org.apache.spark.streaming.dstream.DStream
Method that generates an RDD for the given time
compute(Time) - Method in class org.apache.spark.streaming.dstream.FileInputDStream
Finds the files that were modified since the last time this method was called and makes a union RDD out of them.
compute(Time) - Method in class org.apache.spark.streaming.dstream.FilteredDStream
 
compute(Time) - Method in class org.apache.spark.streaming.dstream.FlatMappedDStream
 
compute(Time) - Method in class org.apache.spark.streaming.dstream.FlatMapValuedDStream
 
compute(Time) - Method in class org.apache.spark.streaming.dstream.ForEachDStream
 
compute(Time) - Method in class org.apache.spark.streaming.dstream.GlommedDStream
 
compute(Time) - Method in class org.apache.spark.streaming.dstream.MapPartitionedDStream
 
compute(Time) - Method in class org.apache.spark.streaming.dstream.MappedDStream
 
compute(Time) - Method in class org.apache.spark.streaming.dstream.MapValuedDStream
 
compute(Time) - Method in class org.apache.spark.streaming.dstream.QueueInputDStream
 
compute(Time) - Method in class org.apache.spark.streaming.dstream.ReceiverInputDStream
Generates RDDs with blocks received by the receiver of this stream.
compute(Time) - Method in class org.apache.spark.streaming.dstream.ReducedWindowedDStream
 
compute(Time) - Method in class org.apache.spark.streaming.dstream.ShuffledDStream
 
compute(Time) - Method in class org.apache.spark.streaming.dstream.StateDStream
 
compute(Time) - Method in class org.apache.spark.streaming.dstream.TransformedDStream
 
compute(Time) - Method in class org.apache.spark.streaming.dstream.UnionDStream
 
compute(Time) - Method in class org.apache.spark.streaming.dstream.WindowedDStream
 
compute(Time) - Method in class org.apache.spark.streaming.kafka.DirectKafkaInputDStream
 
compute(Partition, TaskContext) - Method in class org.apache.spark.streaming.kafka.KafkaRDD
 
compute(Partition, TaskContext) - Method in class org.apache.spark.streaming.rdd.WriteAheadLogBackedBlockRDD
Gets the partition data by getting the corresponding block from the block manager.
computeColumnSummaryStatistics() - Method in class org.apache.spark.mllib.linalg.distributed.RowMatrix
Computes column-wise summary statistics.
computeCorrelation(RDD<Object>, RDD<Object>) - Method in interface org.apache.spark.mllib.stat.correlation.Correlation
Compute correlation for two datasets.
computeCorrelation(RDD<Object>, RDD<Object>) - Static method in class org.apache.spark.mllib.stat.correlation.PearsonCorrelation
Compute the Pearson correlation for two datasets.
computeCorrelation(RDD<Object>, RDD<Object>) - Static method in class org.apache.spark.mllib.stat.correlation.SpearmanCorrelation
Compute Spearman's correlation for two datasets.
computeCorrelationMatrix(RDD<Vector>) - Method in interface org.apache.spark.mllib.stat.correlation.Correlation
Compute the correlation matrix S, for the input matrix, where S(i, j) is the correlation between column i and j.
computeCorrelationMatrix(RDD<Vector>) - Static method in class org.apache.spark.mllib.stat.correlation.PearsonCorrelation
Compute the Pearson correlation matrix S, for the input matrix, where S(i, j) is the correlation between column i and j.
computeCorrelationMatrix(RDD<Vector>) - Static method in class org.apache.spark.mllib.stat.correlation.SpearmanCorrelation
Compute Spearman's correlation matrix S, for the input matrix, where S(i, j) is the correlation between column i and j.
computeCorrelationMatrixFromCovariance(Matrix) - Static method in class org.apache.spark.mllib.stat.correlation.PearsonCorrelation
Compute the Pearson correlation matrix from the covariance matrix.
computeCorrelationWithMatrixImpl(RDD<Object>, RDD<Object>) - Method in interface org.apache.spark.mllib.stat.correlation.Correlation
Combine the two input RDD[Double]s into an RDD[Vector] and compute the correlation using the correlation implementation for RDD[Vector].
computeCost(RDD<Vector>) - Method in class org.apache.spark.mllib.clustering.KMeansModel
Return the K-means cost (sum of squared distances of points to their nearest center) for this model on the given data.
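A sketch of computeCost (assuming sc; the points and k are hypothetical):

    import org.apache.spark.mllib.clustering.KMeans
    import org.apache.spark.mllib.linalg.Vectors

    val data = sc.parallelize(Seq(
      Vectors.dense(0.0, 0.0), Vectors.dense(0.1, 0.1),
      Vectors.dense(9.0, 9.0), Vectors.dense(9.1, 9.1)))
    val model = KMeans.train(data, 2, 10)   // k = 2, maxIterations = 10
    val cost  = model.computeCost(data)     // within-set sum of squared distances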
computeCovariance() - Method in class org.apache.spark.mllib.linalg.distributed.RowMatrix
Computes the covariance matrix, treating each row as an observation.
computeError(TreeEnsembleModel, RDD<LabeledPoint>) - Static method in class org.apache.spark.mllib.tree.loss.AbsoluteError
Method to calculate loss of the base learner for the gradient boosting calculation.
computeError(TreeEnsembleModel, RDD<LabeledPoint>) - Static method in class org.apache.spark.mllib.tree.loss.LogLoss
Method to calculate loss of the base learner for the gradient boosting calculation.
computeError(TreeEnsembleModel, RDD<LabeledPoint>) - Method in interface org.apache.spark.mllib.tree.loss.Loss
Method to calculate error of the base learner for the gradient boosting calculation.
computeError(TreeEnsembleModel, RDD<LabeledPoint>) - Static method in class org.apache.spark.mllib.tree.loss.SquaredError
Method to calculate loss of the base learner for the gradient boosting calculation.
computeFractionForSampleSize(int, long, boolean) - Static method in class org.apache.spark.util.random.SamplingUtils
Returns a sampling rate that guarantees a sample of size >= sampleSizeLowerBound 99.99% of the time.
computeGramianMatrix() - Method in class org.apache.spark.mllib.linalg.distributed.IndexedRowMatrix
Computes the Gramian matrix A^T A.
computeGramianMatrix() - Method in class org.apache.spark.mllib.linalg.distributed.RowMatrix
Computes the Gramian matrix A^T A.
computeOrReadCheckpoint(Partition, TaskContext) - Method in class org.apache.spark.rdd.RDD
Compute an RDD partition or read it from a checkpoint if the RDD is checkpointing.
computePreferredLocations(Seq<InputFormatInfo>) - Static method in class org.apache.spark.scheduler.InputFormatInfo
Computes the preferred locations based on input(s) and returns a location-to-block map.
computePrincipalComponents(int) - Method in class org.apache.spark.mllib.linalg.distributed.RowMatrix
Computes the top k principal components.
computeSplitSize(long, long, long) - Method in class org.apache.spark.input.FixedLengthBinaryInputFormat
This input format overrides computeSplitSize() to make sure that each split only contains full records.
computeSVD(int, boolean, double) - Method in class org.apache.spark.mllib.linalg.distributed.IndexedRowMatrix
Computes the singular value decomposition of this IndexedRowMatrix.
computeSVD(int, boolean, double) - Method in class org.apache.spark.mllib.linalg.distributed.RowMatrix
Computes singular value decomposition of this matrix.
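A sketch of computeSVD (assuming a RowMatrix named mat, such as the one built in the columnSimilarities example above):

    // Top-2 singular values/vectors; the second argument requests that U be computed.
    val svd = mat.computeSVD(2, true, 1e-9)
    val u = svd.U   // RowMatrix of left singular vectors
    val s = svd.s   // Vector of singular values
    val v = svd.V   // local Matrix of right singular vectors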
computeSVD(int, boolean, double, int, double, String) - Method in class org.apache.spark.mllib.linalg.distributed.RowMatrix
The actual SVD implementation, visible for testing.
computeThresholdByKey(Map<K, AcceptanceResult>, Map<K, Object>) - Static method in class org.apache.spark.util.random.StratifiedSamplingUtils
Given the result returned by getCounts, determine the threshold for accepting items to generate exact sample size.
conf() - Method in interface org.apache.spark.input.Configurable
 
conf() - Method in class org.apache.spark.rdd.RDD
 
conf() - Method in class org.apache.spark.scheduler.cluster.CoarseGrainedSchedulerBackend
 
conf() - Method in class org.apache.spark.scheduler.TaskSchedulerImpl
 
conf() - Method in class org.apache.spark.scheduler.TaskSetManager
 
conf() - Method in class org.apache.spark.SparkContext
 
conf() - Method in class org.apache.spark.SparkEnv
 
conf() - Method in class org.apache.spark.sql.parquet.ParquetRelation
 
conf() - Method in class org.apache.spark.storage.BlockManager
 
conf() - Method in class org.apache.spark.streaming.StreamingContext
 
conf() - Method in class org.apache.spark.ui.SparkUI
 
confidence() - Method in class org.apache.spark.partial.BoundedDouble
 
config() - Method in class org.apache.spark.streaming.kafka.KafkaCluster
 
configFile() - Method in class org.apache.spark.metrics.MetricsConfig
 
configTestLog4j(String) - Static method in class org.apache.spark.util.Utils
Configure a log4j properties file used for the test suite
Configurable - Interface in org.apache.spark.input
A trait to implement the Configurable interface.
ConfigurableCombineFileRecordReader<K,V> - Class in org.apache.spark.input
A CombineFileRecordReader that can pass Hadoop Configuration to Configurable RecordReaders.
ConfigurableCombineFileRecordReader(InputSplit, TaskAttemptContext, Class<? extends RecordReader<K, V>>) - Constructor for class org.apache.spark.input.ConfigurableCombineFileRecordReader
 
configuration() - Method in class org.apache.spark.scheduler.InputFormatInfo
 
configuration() - Method in interface org.apache.spark.sql.parquet.ParquetTest
 
CONFIGURATION_INSTANTIATION_LOCK() - Static method in class org.apache.spark.rdd.HadoopRDD
Configuration's constructor is not threadsafe (see SPARK-1097 and HADOOP-10456).
confusionMatrix() - Method in class org.apache.spark.mllib.evaluation.MulticlassMetrics
Returns the confusion matrix: predicted classes are in columns, ordered by ascending class label, as in "labels"
connect(String, int) - Method in class org.apache.spark.streaming.kafka.KafkaCluster
 
connected(String) - Method in class org.apache.spark.scheduler.cluster.SparkDeploySchedulerBackend
 
connectedComponents() - Method in class org.apache.spark.graphx.GraphOps
Compute the connected component membership of each vertex and return a graph with the vertex value containing the lowest vertex id in the connected component containing that vertex.
ConnectedComponents - Class in org.apache.spark.graphx.lib
Connected components algorithm.
ConnectedComponents() - Constructor for class org.apache.spark.graphx.lib.ConnectedComponents
 
connectLeader(String, int) - Method in class org.apache.spark.streaming.kafka.KafkaCluster
 
CONSOLE_DEFAULT_PERIOD() - Method in class org.apache.spark.metrics.sink.ConsoleSink
 
CONSOLE_DEFAULT_UNIT() - Method in class org.apache.spark.metrics.sink.ConsoleSink
 
CONSOLE_KEY_PERIOD() - Method in class org.apache.spark.metrics.sink.ConsoleSink
 
CONSOLE_KEY_UNIT() - Method in class org.apache.spark.metrics.sink.ConsoleSink
 
ConsoleProgressBar - Class in org.apache.spark.ui
ConsoleProgressBar shows the progress of stages in the next line of the console.
ConsoleProgressBar(SparkContext) - Constructor for class org.apache.spark.ui.ConsoleProgressBar
 
ConsoleSink - Class in org.apache.spark.metrics.sink
 
ConsoleSink(Properties, MetricRegistry, SecurityManager) - Constructor for class org.apache.spark.metrics.sink.ConsoleSink
 
ConstantInputDStream<T> - Class in org.apache.spark.streaming.dstream
An input stream that always returns the same RDD on each timestep.
ConstantInputDStream(StreamingContext, RDD<T>, ClassTag<T>) - Constructor for class org.apache.spark.streaming.dstream.ConstantInputDStream
 
constructTree(org.apache.spark.mllib.tree.model.DecisionTreeModel.SaveLoadV1_0.NodeData[]) - Method in class org.apache.spark.mllib.tree.model.DecisionTreeModel.SaveLoadV1_0$
Given a list of nodes from a tree, construct the tree.
constructTrees(RDD<org.apache.spark.mllib.tree.model.DecisionTreeModel.SaveLoadV1_0.NodeData>) - Method in class org.apache.spark.mllib.tree.model.DecisionTreeModel.SaveLoadV1_0$
 
constructURIForAuthentication(URI, SecurityManager) - Static method in class org.apache.spark.util.Utils
Construct a URI containing information used for authentication.
consumerConnector() - Method in class org.apache.spark.streaming.kafka.KafkaReceiver
 
contains(Param<?>) - Method in class org.apache.spark.ml.param.ParamMap
Checks whether a parameter is explicitly specified.
contains(String) - Method in class org.apache.spark.SparkConf
Does the configuration contain a given parameter?
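A sketch of contains on SparkConf:

    import org.apache.spark.SparkConf

    val conf = new SparkConf().setAppName("demo").set("spark.ui.port", "4040")
    conf.contains("spark.ui.port")   // true
    conf.contains("spark.master")    // false unless set elsewhere (e.g. by spark-submit)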
contains(Object) - Method in class org.apache.spark.sql.Column
Contains the other element.
contains(String) - Method in class org.apache.spark.sql.types.Metadata
Tests whether this Metadata contains a binding for a key.
contains(BlockId) - Method in class org.apache.spark.storage.BlockManagerMaster
Check if block manager master has a block.
contains(BlockId) - Method in class org.apache.spark.storage.BlockStore
 
contains(BlockId) - Method in class org.apache.spark.storage.DiskStore
 
contains(BlockId) - Method in class org.apache.spark.storage.MemoryStore
 
contains(BlockId) - Method in class org.apache.spark.storage.TachyonStore
 
contains(A) - Method in class org.apache.spark.util.TimeStampedHashSet
 
containsBlock(BlockId) - Method in class org.apache.spark.storage.DiskBlockManager
Check if disk block manager has a block.
containsBlock(BlockId) - Method in class org.apache.spark.storage.StorageStatus
Return whether the given block is stored in this block manager in O(1) time.
containsCachedMetadata(String) - Static method in class org.apache.spark.rdd.HadoopRDD
 
containsNull() - Method in class org.apache.spark.sql.types.ArrayType
 
containsShuffle(int) - Method in class org.apache.spark.MapOutputTrackerMaster
Check if the given shuffle is being tracked
contentType() - Method in class org.apache.spark.ui.JettyUtils.ServletParams
 
context() - Method in interface org.apache.spark.api.java.JavaRDDLike
The SparkContext that this RDD was created on.
context() - Method in class org.apache.spark.InterruptibleIterator
 
context() - Method in class org.apache.spark.rdd.RDD
The SparkContext that this RDD was created on.
context() - Method in class org.apache.spark.sql.hive.execution.HiveTableScan
 
context() - Method in class org.apache.spark.sql.hive.HiveStrategies.HiveCommandStrategy
 
context() - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
Return the StreamingContext associated with this DStream
context() - Method in class org.apache.spark.streaming.dstream.DStream
Return the StreamingContext associated with this DStream
ContextCleaner - Class in org.apache.spark
An asynchronous cleaner for RDD, shuffle, and broadcast state.
ContextCleaner(SparkContext) - Constructor for class org.apache.spark.ContextCleaner
 
ContextWaiter - Class in org.apache.spark.streaming
 
ContextWaiter() - Constructor for class org.apache.spark.streaming.ContextWaiter
 
Continuous() - Static method in class org.apache.spark.mllib.tree.configuration.FeatureType
 
convert() - Method in class org.apache.spark.WritableConverter
 
convert() - Method in class org.apache.spark.WritableFactory
 
convertFromAttributes(Seq<Attribute>, boolean) - Static method in class org.apache.spark.sql.parquet.ParquetTypesConverter
 
convertFromString(String) - Static method in class org.apache.spark.sql.parquet.ParquetTypesConverter
 
convertFromTimestamp(Timestamp) - Static method in class org.apache.spark.sql.parquet.CatalystTimestampConverter
 
convertJavaToCatalyst(Object, DataType) - Static method in class org.apache.spark.sql.types.DataTypeConversions
Converts Java objects to catalyst rows / types
convertSplitLocationInfo(Object[]) - Static method in class org.apache.spark.rdd.HadoopRDD
 
convertToAttributes(Type, boolean, boolean) - Static method in class org.apache.spark.sql.parquet.ParquetTypesConverter
 
convertToBaggedRDD(RDD<Datum>, double, int, boolean, int) - Static method in class org.apache.spark.mllib.tree.impl.BaggedPoint
Convert an input dataset into its BaggedPoint representation, choosing subsamplingRate counts for each instance.
convertToCanonicalEdges(Function2<ED, ED, ED>) - Method in class org.apache.spark.graphx.GraphOps
Convert bi-directional edges into uni-directional ones.
convertToString(Seq<Attribute>) - Static method in class org.apache.spark.sql.parquet.ParquetTypesConverter
 
convertToTimestamp(Binary) - Static method in class org.apache.spark.sql.parquet.CatalystTimestampConverter
 
convertToTreeRDD(RDD<LabeledPoint>, Bin[][], DecisionTreeMetadata) - Static method in class org.apache.spark.mllib.tree.impl.TreePoint
Convert an input dataset into its TreePoint representation, binning feature values in preparation for DecisionTree training.
CoordinateMatrix - Class in org.apache.spark.mllib.linalg.distributed
:: Experimental :: Represents a matrix in coordinate format.
CoordinateMatrix(RDD<MatrixEntry>, long, long) - Constructor for class org.apache.spark.mllib.linalg.distributed.CoordinateMatrix
 
CoordinateMatrix(RDD<MatrixEntry>) - Constructor for class org.apache.spark.mllib.linalg.distributed.CoordinateMatrix
Alternative constructor leaving matrix dimensions to be determined automatically.
coordinatorActor() - Method in class org.apache.spark.scheduler.OutputCommitCoordinator
 
copiesRunning() - Method in class org.apache.spark.scheduler.TaskSetManager
 
copy() - Method in class org.apache.spark.ml.param.ParamMap
Make a copy of this param map.
copy(Vector, Vector) - Static method in class org.apache.spark.mllib.linalg.BLAS
y = x
copy() - Method in class org.apache.spark.mllib.linalg.DenseMatrix
 
copy() - Method in class org.apache.spark.mllib.linalg.DenseVector
 
copy() - Method in interface org.apache.spark.mllib.linalg.Matrix
Get a deep copy of the matrix.
copy() - Method in class org.apache.spark.mllib.linalg.SparseMatrix
 
copy() - Method in class org.apache.spark.mllib.linalg.SparseVector
 
copy() - Method in interface org.apache.spark.mllib.linalg.Vector
Makes a deep copy of this vector.
copy() - Method in class org.apache.spark.mllib.random.ExponentialGenerator
 
copy() - Method in class org.apache.spark.mllib.random.GammaGenerator
 
copy() - Method in class org.apache.spark.mllib.random.LogNormalGenerator
 
copy() - Method in class org.apache.spark.mllib.random.PoissonGenerator
 
copy() - Method in interface org.apache.spark.mllib.random.RandomDataGenerator
Returns a copy of the RandomDataGenerator with a new instance of the rng object used in the class when applicable for non-locking concurrent usage.
copy() - Method in class org.apache.spark.mllib.random.StandardNormalGenerator
 
copy() - Method in class org.apache.spark.mllib.random.UniformGenerator
 
copy() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
Returns a shallow copy of this instance.
copy() - Method in class org.apache.spark.mllib.tree.impurity.EntropyCalculator
Make a deep copy of this ImpurityCalculator.
copy() - Method in class org.apache.spark.mllib.tree.impurity.GiniCalculator
Make a deep copy of this ImpurityCalculator.
copy() - Method in class org.apache.spark.mllib.tree.impurity.ImpurityCalculator
Make a deep copy of this ImpurityCalculator.
copy() - Method in class org.apache.spark.mllib.tree.impurity.VarianceCalculator
Make a deep copy of this ImpurityCalculator.
copy() - Method in interface org.apache.spark.sql.Row
Make a copy of the current Row object.
copy() - Method in class org.apache.spark.util.StatCounter
Clone this StatCounter
copyField(Row, int, MutableRow, int) - Static method in class org.apache.spark.sql.columnar.BOOLEAN
 
copyField(Row, int, MutableRow, int) - Static method in class org.apache.spark.sql.columnar.BYTE
 
copyField(Row, int, MutableRow, int) - Method in class org.apache.spark.sql.columnar.ColumnType
Copies the value at from(fromOrdinal) into to(toOrdinal).
copyField(Row, int, MutableRow, int) - Static method in class org.apache.spark.sql.columnar.DOUBLE
 
copyField(Row, int, MutableRow, int) - Static method in class org.apache.spark.sql.columnar.FLOAT
 
copyField(Row, int, MutableRow, int) - Static method in class org.apache.spark.sql.columnar.INT
 
copyField(Row, int, MutableRow, int) - Static method in class org.apache.spark.sql.columnar.LONG
 
copyField(Row, int, MutableRow, int) - Static method in class org.apache.spark.sql.columnar.SHORT
 
copyField(Row, int, MutableRow, int) - Static method in class org.apache.spark.sql.columnar.STRING
 
copyStream(InputStream, OutputStream, boolean, boolean) - Static method in class org.apache.spark.util.Utils
Copy all data from an InputStream to an OutputStream.
cores() - Method in class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.RegisterExecutor
 
cores() - Method in class org.apache.spark.scheduler.WorkerOffer
 
coresByTaskId() - Method in class org.apache.spark.scheduler.cluster.mesos.CoarseMesosSchedulerBackend
 
corr(RDD<Object>, RDD<Object>, String) - Static method in class org.apache.spark.mllib.stat.correlation.Correlations
 
corr(RDD<Vector>) - Static method in class org.apache.spark.mllib.stat.Statistics
Compute the Pearson correlation matrix for the input RDD of Vectors.
corr(RDD<Vector>, String) - Static method in class org.apache.spark.mllib.stat.Statistics
Compute the correlation matrix for the input RDD of Vectors using the specified method.
corr(RDD<Object>, RDD<Object>) - Static method in class org.apache.spark.mllib.stat.Statistics
Compute the Pearson correlation for the input RDDs.
corr(RDD<Object>, RDD<Object>, String) - Static method in class org.apache.spark.mllib.stat.Statistics
Compute the correlation for the input RDDs using the specified method.
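A sketch of the correlation helpers on Statistics (assuming sc; the data is hypothetical):

    import org.apache.spark.mllib.linalg.Vectors
    import org.apache.spark.mllib.stat.Statistics

    val x = sc.parallelize(Seq(1.0, 2.0, 3.0, 4.0))
    val y = sc.parallelize(Seq(2.0, 4.0, 6.0, 8.0))
    val pearson = Statistics.corr(x, y)                 // 1.0 for perfectly linear data

    val vecs = sc.parallelize(Seq(Vectors.dense(1.0, 2.0), Vectors.dense(3.0, 1.0)))
    val matrix = Statistics.corr(vecs, "spearman")      // correlation matrix over columns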
Correlation - Interface in org.apache.spark.mllib.stat.correlation
Trait for correlation algorithms.
CorrelationNames - Class in org.apache.spark.mllib.stat.correlation
Maintains supported and default correlation names.
CorrelationNames() - Constructor for class org.apache.spark.mllib.stat.correlation.CorrelationNames
 
Correlations - Class in org.apache.spark.mllib.stat.correlation
Delegates computation to the specific correlation object based on the input method name.
Correlations() - Constructor for class org.apache.spark.mllib.stat.correlation.Correlations
 
corrMatrix(RDD<Vector>, String) - Static method in class org.apache.spark.mllib.stat.correlation.Correlations
 
count() - Method in interface org.apache.spark.api.java.JavaRDDLike
Return the number of elements in the RDD.
count() - Method in class org.apache.spark.graphx.impl.EdgeRDDImpl
The number of edges in the RDD.
count() - Method in class org.apache.spark.graphx.impl.VertexRDDImpl
The number of vertices in the RDD.
count() - Method in class org.apache.spark.mllib.evaluation.binary.BinaryConfusionMatrixImpl
 
count() - Method in class org.apache.spark.mllib.fpm.FPTree.Node
 
count() - Method in class org.apache.spark.mllib.stat.MultivariateOnlineSummarizer
 
count() - Method in interface org.apache.spark.mllib.stat.MultivariateStatisticalSummary
Sample size.
count() - Method in class org.apache.spark.mllib.tree.impurity.EntropyCalculator
Number of data points accounted for in the sufficient statistics.
count() - Method in class org.apache.spark.mllib.tree.impurity.GiniCalculator
Number of data points accounted for in the sufficient statistics.
count() - Method in class org.apache.spark.mllib.tree.impurity.ImpurityCalculator
Number of data points accounted for in the sufficient statistics.
count() - Method in class org.apache.spark.mllib.tree.impurity.VarianceCalculator
Number of data points accounted for in the sufficient statistics.
count() - Method in class org.apache.spark.rdd.RDD
Return the number of elements in the RDD.
count() - Method in class org.apache.spark.sql.columnar.ColumnStatisticsSchema
 
count() - Method in interface org.apache.spark.sql.columnar.ColumnStats
 
count() - Method in class org.apache.spark.sql.DataFrame
Returns the number of rows in the DataFrame.
count(Column) - Static method in class org.apache.spark.sql.functions
Aggregate function: returns the number of items in a group.
count(String) - Static method in class org.apache.spark.sql.functions
Aggregate function: returns the number of items in a group.
count() - Method in class org.apache.spark.sql.GroupedData
Count the number of rows for each group.
COUNT() - Static method in class org.apache.spark.sql.hive.HiveQl
 
count() - Method in interface org.apache.spark.sql.RDDApi
 
count() - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
Return a new DStream in which each RDD has a single element generated by counting each RDD of this DStream.
count() - Method in class org.apache.spark.streaming.dstream.DStream
Return a new DStream in which each RDD has a single element generated by counting each RDD of this DStream.
count() - Method in class org.apache.spark.util.StatCounter
 
countApprox(long, double) - Method in interface org.apache.spark.api.java.JavaRDDLike
:: Experimental :: Approximate version of count() that returns a potentially incomplete result within a timeout, even if not all tasks have finished.
countApprox(long) - Method in interface org.apache.spark.api.java.JavaRDDLike
:: Experimental :: Approximate version of count() that returns a potentially incomplete result within a timeout, even if not all tasks have finished.
countApprox(long, double) - Method in class org.apache.spark.rdd.RDD
:: Experimental :: Approximate version of count() that returns a potentially incomplete result within a timeout, even if not all tasks have finished.
countApproxDistinct(double) - Method in interface org.apache.spark.api.java.JavaRDDLike
Return approximate number of distinct elements in the RDD.
countApproxDistinct(int, int) - Method in class org.apache.spark.rdd.RDD
:: Experimental :: Return approximate number of distinct elements in the RDD.
countApproxDistinct(double) - Method in class org.apache.spark.rdd.RDD
Return approximate number of distinct elements in the RDD.
countApproxDistinctByKey(double, Partitioner) - Method in class org.apache.spark.api.java.JavaPairRDD
Return approximate number of distinct values for each key in this RDD.
countApproxDistinctByKey(double, int) - Method in class org.apache.spark.api.java.JavaPairRDD
Return approximate number of distinct values for each key in this RDD.
countApproxDistinctByKey(double) - Method in class org.apache.spark.api.java.JavaPairRDD
Return approximate number of distinct values for each key in this RDD.
countApproxDistinctByKey(int, int, Partitioner) - Method in class org.apache.spark.rdd.PairRDDFunctions
:: Experimental ::
countApproxDistinctByKey(double, Partitioner) - Method in class org.apache.spark.rdd.PairRDDFunctions
Return approximate number of distinct values for each key in this RDD.
countApproxDistinctByKey(double, int) - Method in class org.apache.spark.rdd.PairRDDFunctions
Return approximate number of distinct values for each key in this RDD.
countApproxDistinctByKey(double) - Method in class org.apache.spark.rdd.PairRDDFunctions
Return approximate number of distinct values for each key in this RDD.
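A sketch of countApproxDistinctByKey (assuming sc): the argument is the maximum relative standard deviation allowed in the HyperLogLog-based estimate.

    val events = sc.parallelize(Seq(("user1", "a"), ("user1", "b"), ("user2", "a")))
    val distinctPerKey = events.countApproxDistinctByKey(0.05)   // RDD[(String, Long)]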
countAsync() - Method in interface org.apache.spark.api.java.JavaRDDLike
The asynchronous version of count, which returns a future for counting the number of elements in this RDD.
countAsync() - Method in class org.apache.spark.rdd.AsyncRDDActions
Returns a future for counting the number of elements in the RDD.
countByKey() - Method in class org.apache.spark.api.java.JavaPairRDD
Count the number of elements for each key, and return the result to the master as a Map.
countByKey() - Method in class org.apache.spark.rdd.PairRDDFunctions
Count the number of elements for each key, collecting the results to a local Map.
countByKeyApprox(long) - Method in class org.apache.spark.api.java.JavaPairRDD
:: Experimental :: Approximate version of countByKey that can return a partial result if it does not finish within a timeout.
countByKeyApprox(long, double) - Method in class org.apache.spark.api.java.JavaPairRDD
:: Experimental :: Approximate version of countByKey that can return a partial result if it does not finish within a timeout.
countByKeyApprox(long, double) - Method in class org.apache.spark.rdd.PairRDDFunctions
:: Experimental :: Approximate version of countByKey that can return a partial result if it does not finish within a timeout.
countByValue() - Method in interface org.apache.spark.api.java.JavaRDDLike
Return the count of each unique value in this RDD as a map of (value, count) pairs.
countByValue(Ordering<T>) - Method in class org.apache.spark.rdd.RDD
Return the count of each unique value in this RDD as a local map of (value, count) pairs.
countByValue() - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
Return a new DStream in which each RDD contains the counts of each distinct value in each RDD of this DStream.
countByValue(int) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
Return a new DStream in which each RDD contains the counts of each distinct value in each RDD of this DStream.
countByValue(int, Ordering<T>) - Method in class org.apache.spark.streaming.dstream.DStream
Return a new DStream in which each RDD contains the counts of each distinct value in each RDD of this DStream.
countByValueAndWindow(Duration, Duration) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
Return a new DStream in which each RDD contains the count of distinct elements in RDDs in a sliding window over this DStream.
countByValueAndWindow(Duration, Duration, int) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
Return a new DStream in which each RDD contains the count of distinct elements in RDDs in a sliding window over this DStream.
countByValueAndWindow(Duration, Duration, int, Ordering<T>) - Method in class org.apache.spark.streaming.dstream.DStream
Return a new DStream in which each RDD contains the count of distinct elements in RDDs in a sliding window over this DStream.
countByValueApprox(long, double) - Method in interface org.apache.spark.api.java.JavaRDDLike
(Experimental) Approximate version of countByValue().
countByValueApprox(long) - Method in interface org.apache.spark.api.java.JavaRDDLike
(Experimental) Approximate version of countByValue().
countByValueApprox(long, double, Ordering<T>) - Method in class org.apache.spark.rdd.RDD
:: Experimental :: Approximate version of countByValue().
countByWindow(Duration, Duration) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
Return a new DStream in which each RDD has a single element generated by counting the number of elements in a window over this DStream.
countByWindow(Duration, Duration) - Method in class org.apache.spark.streaming.dstream.DStream
Return a new DStream in which each RDD has a single element generated by counting the number of elements in a sliding window over this DStream.
countDistinct(Column, Column...) - Static method in class org.apache.spark.sql.functions
Aggregate function: returns the number of distinct items in a group.
countDistinct(String, String...) - Static method in class org.apache.spark.sql.functions
Aggregate function: returns the number of distinct items in a group.
countDistinct(Column, Seq<Column>) - Static method in class org.apache.spark.sql.functions
Aggregate function: returns the number of distinct items in a group.
countDistinct(String, Seq<String>) - Static method in class org.apache.spark.sql.functions
Aggregate function: returns the number of distinct items in a group.
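A sketch of count and countDistinct as DataFrame aggregates (assuming a DataFrame named df with "dept" and "name" columns):

    import org.apache.spark.sql.functions.{count, countDistinct}

    val perDept = df.groupBy("dept").agg(count("name"), countDistinct("name"))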
counter() - Method in class org.apache.spark.partial.MeanEvaluator
 
counter() - Method in class org.apache.spark.partial.SumEvaluator
 
CountEvaluator - Class in org.apache.spark.partial
An ApproximateEvaluator for counts.
CountEvaluator(int, double) - Constructor for class org.apache.spark.partial.CountEvaluator
 
cpFile() - Method in class org.apache.spark.rdd.RDDCheckpointData
 
cpRDD() - Method in class org.apache.spark.rdd.RDDCheckpointData
 
cpState() - Method in class org.apache.spark.rdd.RDDCheckpointData
 
CPUS_PER_TASK() - Method in class org.apache.spark.scheduler.TaskSchedulerImpl
 
CR() - Method in class org.apache.spark.ui.ConsoleProgressBar
 
CreatableRelationProvider - Interface in org.apache.spark.sql.sources
 
create(boolean, boolean, boolean, int) - Static method in class org.apache.spark.api.java.StorageLevels
Deprecated.
create(boolean, boolean, boolean, boolean, int) - Static method in class org.apache.spark.api.java.StorageLevels
Create a new StorageLevel object.
create(JavaSparkContext, JdbcRDD.ConnectionFactory, String, long, long, int, Function<ResultSet, T>) - Static method in class org.apache.spark.rdd.JdbcRDD
Create an RDD that executes an SQL query on a JDBC connection and reads results.
create(JavaSparkContext, JdbcRDD.ConnectionFactory, String, long, long, int) - Static method in class org.apache.spark.rdd.JdbcRDD
Create an RDD that executes an SQL query on a JDBC connection and reads results.
create(RDD<T>, Function1<Object, Object>) - Static method in class org.apache.spark.rdd.PartitionPruningRDD
Create a PartitionPruningRDD.
create(String, LogicalPlan, Configuration, SQLContext) - Static method in class org.apache.spark.sql.parquet.ParquetRelation
Creates a new ParquetRelation and underlying Parquet file for the given LogicalPlan.
create(Object...) - Static method in class org.apache.spark.sql.RowFactory
Create a Row from the given arguments.
create() - Method in interface org.apache.spark.streaming.api.java.JavaStreamingContextFactory
 
create(String, int) - Static method in class org.apache.spark.streaming.kafka.Broker
 
create(String, int, long, long) - Static method in class org.apache.spark.streaming.kafka.OffsetRange
 
create(TopicAndPartition, long, long) - Static method in class org.apache.spark.streaming.kafka.OffsetRange
 
createActorSystem(String, String, int, SparkConf, SecurityManager) - Static method in class org.apache.spark.util.AkkaUtils
Creates an ActorSystem ready for remoting, with various Spark features.
createAkkaConfig() - Method in class org.apache.spark.SSLOptions
Creates an Akka configuration object which contains all the SSL settings represented by this object.
createArrayType(DataType) - Static method in class org.apache.spark.sql.types.DataTypes
Creates an ArrayType by specifying the data type of elements (elementType).
createArrayType(DataType, boolean) - Static method in class org.apache.spark.sql.types.DataTypes
Creates an ArrayType by specifying the data type of elements (elementType) and whether the array contains null values (containsNull).
createCombiner() - Method in class org.apache.spark.Aggregator
 
createCommand(Protos.Offer, int) - Method in class org.apache.spark.scheduler.cluster.mesos.CoarseMesosSchedulerBackend
 
createCompiledClass(String, File, String, String, Seq<URL>) - Static method in class org.apache.spark.TestUtils
Creates a compiled class with the given name.
createDataFrame(RDD<A>, TypeTags.TypeTag<A>) - Method in class org.apache.spark.sql.SQLContext
:: Experimental :: Creates a DataFrame from an RDD of case classes.
createDataFrame(Seq<A>, TypeTags.TypeTag<A>) - Method in class org.apache.spark.sql.SQLContext
:: Experimental :: Creates a DataFrame from a local Seq of Product.
createDataFrame(RDD<Row>, StructType) - Method in class org.apache.spark.sql.SQLContext
:: DeveloperApi :: Creates a DataFrame from an RDD containing Rows using the given schema.
createDataFrame(RDD<Row>, StructType, boolean) - Method in class org.apache.spark.sql.SQLContext
Creates a DataFrame from an RDD[Row].
createDataFrame(JavaRDD<Row>, StructType) - Method in class org.apache.spark.sql.SQLContext
:: DeveloperApi :: Creates a DataFrame from a JavaRDD containing Rows using the given schema.
createDataFrame(JavaRDD<Row>, List<String>) - Method in class org.apache.spark.sql.SQLContext
Creates a DataFrame from a JavaRDD containing Rows by applying a sequence of column names to this RDD; the data type for each column will be inferred from the first row.
createDataFrame(RDD<?>, Class<?>) - Method in class org.apache.spark.sql.SQLContext
Applies a schema to an RDD of Java Beans.
createDataFrame(JavaRDD<?>, Class<?>) - Method in class org.apache.spark.sql.SQLContext
Applies a schema to an RDD of Java Beans.
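A sketch of the RDD[Row]-plus-StructType variant above, assuming an existing SQLContext sqlContext and SparkContext sc; the column names and values are placeholders:

    import org.apache.spark.sql.Row
    import org.apache.spark.sql.types.{StructType, StructField, StringType, IntegerType}

    val rows = sc.parallelize(Seq(Row("alice", 30), Row("bob", 25)))
    val schema = StructType(Seq(
      StructField("name", StringType, nullable = false),
      StructField("age", IntegerType, nullable = false)))
    val df = sqlContext.createDataFrame(rows, schema)
    df.printSchema()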
createDataSourceTable(String, Option<StructType>, String, Map<String, String>, boolean) - Method in class org.apache.spark.sql.hive.HiveMetastoreCatalog
Creates a data source table (a table created with USING clause) in Hive's metastore.
createDecimal(BigDecimal) - Static method in class org.apache.spark.sql.hive.HiveShim
 
createDecimalType(int, int) - Static method in class org.apache.spark.sql.types.DataTypes
 
createDecimalType() - Static method in class org.apache.spark.sql.types.DataTypes
 
createDefaultDBIfNeeded(HiveContext) - Static method in class org.apache.spark.sql.hive.HiveShim
 
createDirectory(String, String) - Static method in class org.apache.spark.util.Utils
Create a directory inside the given parent directory.
createDirectStream(StreamingContext, Map<String, String>, Map<TopicAndPartition, Object>, Function1<MessageAndMetadata<K, V>, R>, ClassTag<K>, ClassTag<V>, ClassTag<KD>, ClassTag<VD>, ClassTag<R>) - Static method in class org.apache.spark.streaming.kafka.KafkaUtils
:: Experimental :: Create an input stream that directly pulls messages from Kafka Brokers without using any receiver.
createDirectStream(StreamingContext, Map<String, String>, Set<String>, ClassTag<K>, ClassTag<V>, ClassTag<KD>, ClassTag<VD>) - Static method in class org.apache.spark.streaming.kafka.KafkaUtils
:: Experimental :: Create an input stream that directly pulls messages from Kafka Brokers without using any receiver.
createDirectStream(JavaStreamingContext, Class<K>, Class<V>, Class<KD>, Class<VD>, Class<R>, Map<String, String>, Map<TopicAndPartition, Long>, Function<MessageAndMetadata<K, V>, R>) - Static method in class org.apache.spark.streaming.kafka.KafkaUtils
:: Experimental :: Create an input stream that directly pulls messages from Kafka Brokers without using any receiver.
createDirectStream(JavaStreamingContext, Class<K>, Class<V>, Class<KD>, Class<VD>, Map<String, String>, Set<String>) - Static method in class org.apache.spark.streaming.kafka.KafkaUtils
:: Experimental :: Create an input stream that directly pulls messages from Kafka Brokers without using any receiver.
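A minimal sketch of the receiver-less (direct) Kafka stream, assuming an existing StreamingContext ssc; the broker list and topic name are placeholders:

    import kafka.serializer.StringDecoder
    import org.apache.spark.streaming.kafka.KafkaUtils

    val kafkaParams = Map("metadata.broker.list" -> "broker1:9092")   // placeholder broker
    val topics = Set("events")                                        // placeholder topic
    val stream = KafkaUtils.createDirectStream[String, String, StringDecoder, StringDecoder](
      ssc, kafkaParams, topics)
    stream.map(_._2).print()                                          // message values only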
createDriverEnv(SparkConf, boolean, LiveListenerBus, Option<OutputCommitCoordinator>) - Static method in class org.apache.spark.SparkEnv
Create a SparkEnv for the driver.
createDriverResultsArray() - Static method in class org.apache.spark.sql.hive.HiveShim
 
createEmpty(String, Seq<Attribute>, boolean, Configuration, SQLContext) - Static method in class org.apache.spark.sql.parquet.ParquetRelation
Creates an empty ParquetRelation and underlying Parquet file that consists only of the metadata for the given schema.
createExecutorEnv(SparkConf, String, String, int, int, boolean) - Static method in class org.apache.spark.SparkEnv
Create a SparkEnv for an executor.
createExecutorInfo(String) - Method in class org.apache.spark.scheduler.cluster.mesos.MesosSchedulerBackend
 
createExternalTable(String, String) - Method in class org.apache.spark.sql.SQLContext
:: Experimental :: Creates an external table from the given path and returns the corresponding DataFrame.
createExternalTable(String, String, String) - Method in class org.apache.spark.sql.SQLContext
:: Experimental :: Creates an external table from the given path based on a data source and returns the corresponding DataFrame.
createExternalTable(String, String, Map<String, String>) - Method in class org.apache.spark.sql.SQLContext
:: Experimental :: Creates an external table from the given path based on a data source and a set of options.
createExternalTable(String, String, Map<String, String>) - Method in class org.apache.spark.sql.SQLContext
:: Experimental :: (Scala-specific) Creates an external table from the given path based on a data source and a set of options.
createExternalTable(String, String, StructType, Map<String, String>) - Method in class org.apache.spark.sql.SQLContext
:: Experimental :: Create an external table from the given path based on a data source, a schema and a set of options.
createExternalTable(String, String, StructType, Map<String, String>) - Method in class org.apache.spark.sql.SQLContext
:: Experimental :: (Scala-specific) Create an external table from the given path based on a data source, a schema and a set of options.
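A sketch of registering an external table from a path, assuming an existing SQLContext sqlContext; the table name, path, and data source are placeholders:

    // Registers "logs" in the catalog, backed by Parquet files at the given path,
    // and returns the corresponding DataFrame.
    val logs = sqlContext.createExternalTable("logs", "/data/logs.parquet", "parquet")
    sqlContext.sql("SELECT COUNT(*) FROM logs").show()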
createFilter(Expression) - Static method in class org.apache.spark.sql.parquet.ParquetFilters
 
createFunction() - Method in class org.apache.spark.sql.hive.HiveFunctionWrapper
 
createHistoryUI(SparkConf, SparkListenerBus, SecurityManager, String, String) - Static method in class org.apache.spark.ui.SparkUI
 
createJar(Seq<File>, File) - Static method in class org.apache.spark.TestUtils
Create a jar file that contains this set of files.
createJarWithClasses(Seq<String>, String, Seq<Tuple2<String, String>>, Seq<URL>) - Static method in class org.apache.spark.TestUtils
Create a jar that defines classes with the given names.
createJarWithFiles(Map<String, String>, File) - Static method in class org.apache.spark.TestUtils
Create a jar file containing multiple files.
createJDBCTable(String, String, boolean) - Method in class org.apache.spark.sql.DataFrame
Save this DataFrame to a JDBC database at url under the table name table.
createJettySslContextFactory() - Method in class org.apache.spark.SSLOptions
Creates a Jetty SSL context factory according to the SSL settings represented by this object.
createJobID(Date, int) - Static method in class org.apache.spark.SparkHadoopWriter
 
createLiveUI(SparkContext, SparkConf, SparkListenerBus, JobProgressListener, SecurityManager, String) - Static method in class org.apache.spark.ui.SparkUI
 
createMapType(DataType, DataType) - Static method in class org.apache.spark.sql.types.DataTypes
Creates a MapType by specifying the data type of keys (keyType) and values (valueType).
createMapType(DataType, DataType, boolean) - Static method in class org.apache.spark.sql.types.DataTypes
Creates a MapType by specifying the data type of keys (keyType), the data type of values (valueType), and whether values contain any null value (valueContainsNull).
createMesosTask(TaskDescription, String) - Method in class org.apache.spark.scheduler.cluster.mesos.MesosSchedulerBackend
Turn a Spark TaskDescription into a Mesos task
CreateMetastoreDataSource - Class in org.apache.spark.sql.hive.execution
 
CreateMetastoreDataSource(String, Option<StructType>, String, Map<String, String>, boolean, boolean) - Constructor for class org.apache.spark.sql.hive.execution.CreateMetastoreDataSource
 
CreateMetastoreDataSourceAsSelect - Class in org.apache.spark.sql.hive.execution
 
CreateMetastoreDataSourceAsSelect(String, String, SaveMode, Map<String, String>, LogicalPlan) - Constructor for class org.apache.spark.sql.hive.execution.CreateMetastoreDataSourceAsSelect
 
createMetricsSystem(String, SparkConf, SecurityManager) - Static method in class org.apache.spark.metrics.MetricsSystem
 
createNewSparkContext(SparkConf) - Static method in class org.apache.spark.streaming.StreamingContext
 
createNewSparkContext(String, String, String, Seq<String>, Map<String, String>) - Static method in class org.apache.spark.streaming.StreamingContext
 
createPartitioner() - Method in class org.apache.spark.mllib.linalg.distributed.BlockMatrix
 
createPathFromString(String, JobConf) - Static method in class org.apache.spark.SparkHadoopWriter
 
createPathFromString(String, JobConf) - Static method in class org.apache.spark.sql.hive.SparkHiveWriterContainer
 
createPlan(String) - Static method in class org.apache.spark.sql.hive.HiveQl
Creates LogicalPlan for a given HiveQL string.
createPlanForView(Table, Option<String>) - Static method in class org.apache.spark.sql.hive.HiveQl
 
createPollingStream(StreamingContext, String, int, StorageLevel) - Static method in class org.apache.spark.streaming.flume.FlumeUtils
Creates an input stream that is to be used with the Spark Sink deployed on a Flume agent.
createPollingStream(StreamingContext, Seq<InetSocketAddress>, StorageLevel) - Static method in class org.apache.spark.streaming.flume.FlumeUtils
Creates an input stream that is to be used with the Spark Sink deployed on a Flume agent.
createPollingStream(StreamingContext, Seq<InetSocketAddress>, StorageLevel, int, int) - Static method in class org.apache.spark.streaming.flume.FlumeUtils
Creates an input stream that is to be used with the Spark Sink deployed on a Flume agent.
createPollingStream(JavaStreamingContext, String, int) - Static method in class org.apache.spark.streaming.flume.FlumeUtils
Creates an input stream that is to be used with the Spark Sink deployed on a Flume agent.
createPollingStream(JavaStreamingContext, String, int, StorageLevel) - Static method in class org.apache.spark.streaming.flume.FlumeUtils
Creates an input stream that is to be used with the Spark Sink deployed on a Flume agent.
createPollingStream(JavaStreamingContext, InetSocketAddress[], StorageLevel) - Static method in class org.apache.spark.streaming.flume.FlumeUtils
Creates an input stream that is to be used with the Spark Sink deployed on a Flume agent.
createPollingStream(JavaStreamingContext, InetSocketAddress[], StorageLevel, int, int) - Static method in class org.apache.spark.streaming.flume.FlumeUtils
Creates an input stream that is to be used with the Spark Sink deployed on a Flume agent.
createPythonWorker(String, Map<String, String>) - Method in class org.apache.spark.SparkEnv
 
createRDD(SparkContext, Map<String, String>, OffsetRange[], ClassTag<K>, ClassTag<V>, ClassTag<KD>, ClassTag<VD>) - Static method in class org.apache.spark.streaming.kafka.KafkaUtils
Create an RDD from Kafka using offset ranges for each topic and partition.
createRDD(SparkContext, Map<String, String>, OffsetRange[], Map<TopicAndPartition, Broker>, Function1<MessageAndMetadata<K, V>, R>, ClassTag<K>, ClassTag<V>, ClassTag<KD>, ClassTag<VD>, ClassTag<R>) - Static method in class org.apache.spark.streaming.kafka.KafkaUtils
:: Experimental :: Create an RDD from Kafka using offset ranges for each topic and partition.
createRDD(JavaSparkContext, Class<K>, Class<V>, Class<KD>, Class<VD>, Map<String, String>, OffsetRange[]) - Static method in class org.apache.spark.streaming.kafka.KafkaUtils
Create an RDD from Kafka using offset ranges for each topic and partition.
createRDD(JavaSparkContext, Class<K>, Class<V>, Class<KD>, Class<VD>, Class<R>, Map<String, String>, OffsetRange[], Map<TopicAndPartition, Broker>, Function<MessageAndMetadata<K, V>, R>) - Static method in class org.apache.spark.streaming.kafka.KafkaUtils
:: Experimental :: Create an RDD from Kafka using offset ranges for each topic and partition.
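A sketch of building a batch RDD from explicit Kafka offset ranges, assuming a SparkContext sc; broker, topic, and offsets are placeholders:

    import kafka.serializer.StringDecoder
    import org.apache.spark.streaming.kafka.{KafkaUtils, OffsetRange}

    val kafkaParams = Map("metadata.broker.list" -> "broker1:9092")     // placeholder broker
    // Read offsets 0 until 100 of partition 0 of a placeholder topic.
    val ranges = Array(OffsetRange.create("events", 0, 0L, 100L))
    val rdd = KafkaUtils.createRDD[String, String, StringDecoder, StringDecoder](
      sc, kafkaParams, ranges)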
createRecordFilter(Seq<Expression>) - Static method in class org.apache.spark.sql.parquet.ParquetFilters
 
createRecordReader(InputSplit, TaskAttemptContext) - Method in class org.apache.spark.input.FixedLengthBinaryInputFormat
Create a FixedLengthBinaryRecordReader
createRecordReader(InputSplit, TaskAttemptContext) - Method in class org.apache.spark.input.StreamFileInputFormat
 
createRecordReader(InputSplit, TaskAttemptContext) - Method in class org.apache.spark.input.StreamInputFormat
 
createRecordReader(InputSplit, TaskAttemptContext) - Method in class org.apache.spark.input.WholeTextFileInputFormat
 
createRecordReader(InputSplit, TaskAttemptContext) - Method in class org.apache.spark.sql.parquet.FilteringParquetRowInputFormat
 
createRedirectHandler(String, String, Function1<HttpServletRequest, BoxedUnit>, String) - Static method in class org.apache.spark.ui.JettyUtils
Create a handler that always redirects the user to the given path
createRelation(SQLContext, Map<String, String>) - Method in class org.apache.spark.sql.jdbc.DefaultSource
Returns a new base relation with the given parameters.
createRelation(SQLContext, Map<String, String>) - Method in class org.apache.spark.sql.json.DefaultSource
Returns a new base relation with the given parameters.
createRelation(SQLContext, Map<String, String>, StructType) - Method in class org.apache.spark.sql.json.DefaultSource
Returns a new base relation with the given schema and parameters.
createRelation(SQLContext, SaveMode, Map<String, String>, DataFrame) - Method in class org.apache.spark.sql.json.DefaultSource
 
createRelation(SQLContext, Map<String, String>) - Method in class org.apache.spark.sql.parquet.DefaultSource
Returns a new base relation with the given parameters.
createRelation(SQLContext, Map<String, String>, StructType) - Method in class org.apache.spark.sql.parquet.DefaultSource
Returns a new base relation with the given parameters and schema.
createRelation(SQLContext, SaveMode, Map<String, String>, DataFrame) - Method in class org.apache.spark.sql.parquet.DefaultSource
Returns a new base relation with the given parameters and save given data into it.
createRelation(SQLContext, SaveMode, Map<String, String>, DataFrame) - Method in interface org.apache.spark.sql.sources.CreatableRelationProvider
Creates a relation with the given parameters based on the contents of the given DataFrame.
createRelation(SQLContext, Map<String, String>) - Method in interface org.apache.spark.sql.sources.RelationProvider
Returns a new base relation with the given parameters.
createRelation(SQLContext, Map<String, String>, StructType) - Method in interface org.apache.spark.sql.sources.SchemaRelationProvider
Returns a new base relation with the given parameters and user defined schema.
createRoutingTables(EdgeRDD<?>, Partitioner) - Static method in class org.apache.spark.graphx.VertexRDD
 
createServlet(JettyUtils.ServletParams<T>, SecurityManager, Function1<T, Object>) - Static method in class org.apache.spark.ui.JettyUtils
 
createServletHandler(String, JettyUtils.ServletParams<T>, SecurityManager, String, Function1<T, Object>) - Static method in class org.apache.spark.ui.JettyUtils
Create a context handler that responds to a request with the given path prefix
createServletHandler(String, HttpServlet, String) - Static method in class org.apache.spark.ui.JettyUtils
Create a context handler that responds to a request with the given path prefix
createSparkConf() - Method in class org.apache.spark.streaming.Checkpoint
 
createSparkEnv(SparkConf, boolean, LiveListenerBus) - Method in class org.apache.spark.SparkContext
 
createStaticHandler(String, String) - Static method in class org.apache.spark.ui.JettyUtils
Create a handler for serving files from a static directory
createStream(StreamingContext, String, int, StorageLevel) - Static method in class org.apache.spark.streaming.flume.FlumeUtils
Create an input stream from a Flume source.
createStream(StreamingContext, String, int, StorageLevel, boolean) - Static method in class org.apache.spark.streaming.flume.FlumeUtils
Create an input stream from a Flume source.
createStream(JavaStreamingContext, String, int) - Static method in class org.apache.spark.streaming.flume.FlumeUtils
Creates an input stream from a Flume source.
createStream(JavaStreamingContext, String, int, StorageLevel) - Static method in class org.apache.spark.streaming.flume.FlumeUtils
Creates an input stream from a Flume source.
createStream(JavaStreamingContext, String, int, StorageLevel, boolean) - Static method in class org.apache.spark.streaming.flume.FlumeUtils
Creates an input stream from a Flume source.
createStream(StreamingContext, String, String, Map<String, Object>, StorageLevel) - Static method in class org.apache.spark.streaming.kafka.KafkaUtils
Create an input stream that pulls messages from Kafka Brokers.
createStream(StreamingContext, Map<String, String>, Map<String, Object>, StorageLevel, ClassTag<K>, ClassTag<V>, ClassTag<U>, ClassTag<T>) - Static method in class org.apache.spark.streaming.kafka.KafkaUtils
Create an input stream that pulls messages from Kafka Brokers.
createStream(JavaStreamingContext, String, String, Map<String, Integer>) - Static method in class org.apache.spark.streaming.kafka.KafkaUtils
Create an input stream that pulls messages from Kafka Brokers.
createStream(JavaStreamingContext, String, String, Map<String, Integer>, StorageLevel) - Static method in class org.apache.spark.streaming.kafka.KafkaUtils
Create an input stream that pulls messages from Kafka Brokers.
createStream(JavaStreamingContext, Class<K>, Class<V>, Class<U>, Class<T>, Map<String, String>, Map<String, Integer>, StorageLevel) - Static method in class org.apache.spark.streaming.kafka.KafkaUtils
Create an input stream that pulls messages from Kafka Brokers.
createStream(JavaStreamingContext, Map<String, String>, Map<String, Integer>, StorageLevel) - Method in class org.apache.spark.streaming.kafka.KafkaUtilsPythonHelper
 
createStream(StreamingContext, String, String, Duration, InitialPositionInStream, StorageLevel) - Static method in class org.apache.spark.streaming.kinesis.KinesisUtils
Create an InputDStream that pulls messages from a Kinesis stream.
createStream(JavaStreamingContext, String, String, Duration, InitialPositionInStream, StorageLevel) - Static method in class org.apache.spark.streaming.kinesis.KinesisUtils
Create a Java-friendly InputDStream that pulls messages from a Kinesis stream.
createStream(StreamingContext, String, String, StorageLevel) - Static method in class org.apache.spark.streaming.mqtt.MQTTUtils
Create an input stream that receives messages pushed by an MQTT publisher.
createStream(JavaStreamingContext, String, String) - Static method in class org.apache.spark.streaming.mqtt.MQTTUtils
Create an input stream that receives messages pushed by an MQTT publisher.
createStream(JavaStreamingContext, String, String, StorageLevel) - Static method in class org.apache.spark.streaming.mqtt.MQTTUtils
Create an input stream that receives messages pushed by an MQTT publisher.
createStream(StreamingContext, Option<Authorization>, Seq<String>, StorageLevel) - Static method in class org.apache.spark.streaming.twitter.TwitterUtils
Create an input stream that returns tweets received from Twitter.
createStream(JavaStreamingContext) - Static method in class org.apache.spark.streaming.twitter.TwitterUtils
Create an input stream that returns tweets received from Twitter using Twitter4J's default OAuth authentication; this requires the system properties twitter4j.oauth.consumerKey, twitter4j.oauth.consumerSecret, twitter4j.oauth.accessToken and twitter4j.oauth.accessTokenSecret.
createStream(JavaStreamingContext, String[]) - Static method in class org.apache.spark.streaming.twitter.TwitterUtils
Create an input stream that returns tweets received from Twitter using Twitter4J's default OAuth authentication; this requires the system properties twitter4j.oauth.consumerKey, twitter4j.oauth.consumerSecret, twitter4j.oauth.accessToken and twitter4j.oauth.accessTokenSecret.
createStream(JavaStreamingContext, String[], StorageLevel) - Static method in class org.apache.spark.streaming.twitter.TwitterUtils
Create an input stream that returns tweets received from Twitter using Twitter4J's default OAuth authentication; this requires the system properties twitter4j.oauth.consumerKey, twitter4j.oauth.consumerSecret, twitter4j.oauth.accessToken and twitter4j.oauth.accessTokenSecret.
createStream(JavaStreamingContext, Authorization) - Static method in class org.apache.spark.streaming.twitter.TwitterUtils
Create an input stream that returns tweets received from Twitter.
createStream(JavaStreamingContext, Authorization, String[]) - Static method in class org.apache.spark.streaming.twitter.TwitterUtils
Create an input stream that returns tweets received from Twitter.
createStream(JavaStreamingContext, Authorization, String[], StorageLevel) - Static method in class org.apache.spark.streaming.twitter.TwitterUtils
Create an input stream that returns tweets received from Twitter.
createStream(StreamingContext, String, Subscribe, Function1<Seq<ByteString>, Iterator<T>>, StorageLevel, SupervisorStrategy, ClassTag<T>) - Static method in class org.apache.spark.streaming.zeromq.ZeroMQUtils
Create an input stream that receives messages pushed by a ZeroMQ publisher.
createStream(JavaStreamingContext, String, Subscribe, Function<byte[][], Iterable<T>>, StorageLevel, SupervisorStrategy) - Static method in class org.apache.spark.streaming.zeromq.ZeroMQUtils
Create an input stream that receives messages pushed by a ZeroMQ publisher.
createStream(JavaStreamingContext, String, Subscribe, Function<byte[][], Iterable<T>>, StorageLevel) - Static method in class org.apache.spark.streaming.zeromq.ZeroMQUtils
Create an input stream that receives messages pushed by a ZeroMQ publisher.
createStream(JavaStreamingContext, String, Subscribe, Function<byte[][], Iterable<T>>) - Static method in class org.apache.spark.streaming.zeromq.ZeroMQUtils
Create an input stream that receives messages pushed by a ZeroMQ publisher.
createStructField(String, DataType, boolean, Metadata) - Static method in class org.apache.spark.sql.types.DataTypes
Creates a StructField by specifying the name (name), data type (dataType) and whether values of this field can be null values (nullable).
createStructField(String, DataType, boolean) - Static method in class org.apache.spark.sql.types.DataTypes
Creates a StructField with empty metadata.
createStructType(List<StructField>) - Static method in class org.apache.spark.sql.types.DataTypes
Creates a StructType with the given list of StructFields (fields).
createStructType(StructField[]) - Static method in class org.apache.spark.sql.types.DataTypes
Creates a StructType with the given StructField array (fields).
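The DataTypes factory methods above are the Java-friendly way to assemble schemas; the same calls also work from Scala. A short sketch, with placeholder field names:

    import org.apache.spark.sql.types.DataTypes

    val addressType = DataTypes.createStructType(Array(
      DataTypes.createStructField("city", DataTypes.StringType, true),
      DataTypes.createStructField("zip", DataTypes.StringType, true)))
    // An array of non-null strings and a map from string to integer.
    val tagsType = DataTypes.createArrayType(DataTypes.StringType, false)
    val countsType = DataTypes.createMapType(DataTypes.StringType, DataTypes.IntegerType)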
createTable(String, String, Seq<Attribute>, boolean, Option<CreateTableDesc>) - Method in class org.apache.spark.sql.hive.HiveMetastoreCatalog
Creates a table with the specified database, table name, table description, and schema.
CreateTableAsSelect - Class in org.apache.spark.sql.hive.execution
Create table and insert the query result into it.
CreateTableAsSelect(String, String, LogicalPlan, boolean, Option<CreateTableDesc>) - Constructor for class org.apache.spark.sql.hive.execution.CreateTableAsSelect
 
CreateTables() - Method in class org.apache.spark.sql.hive.HiveMetastoreCatalog
 
CreateTableUsing - Class in org.apache.spark.sql.sources
Used to represent the operation of creating a table using a data source.
CreateTableUsing(String, Option<StructType>, String, boolean, Map<String, String>, boolean, boolean) - Constructor for class org.apache.spark.sql.sources.CreateTableUsing
 
CreateTableUsingAsSelect - Class in org.apache.spark.sql.sources
A node used to support CTAS statements and saveAsTable for the data source API.
CreateTableUsingAsSelect(String, String, boolean, SaveMode, Map<String, String>, LogicalPlan) - Constructor for class org.apache.spark.sql.sources.CreateTableUsingAsSelect
 
createTaskSetManager(TaskSet, int) - Method in class org.apache.spark.scheduler.TaskSchedulerImpl
 
createTempDir(String, String) - Static method in class org.apache.spark.util.Utils
Create a temporary directory inside the given parent directory.
createTempLocalBlock() - Method in class org.apache.spark.storage.DiskBlockManager
Produces a unique block id and File suitable for storing local intermediate results.
createTempShuffleBlock() - Method in class org.apache.spark.storage.DiskBlockManager
Produces a unique block id and File suitable for storing shuffled intermediate results.
CreateTempTableUsing - Class in org.apache.spark.sql.sources
 
CreateTempTableUsing(String, Option<StructType>, String, Map<String, String>) - Constructor for class org.apache.spark.sql.sources.CreateTempTableUsing
 
CreateTempTableUsingAsSelect - Class in org.apache.spark.sql.sources
 
CreateTempTableUsingAsSelect(String, String, SaveMode, Map<String, String>, LogicalPlan) - Constructor for class org.apache.spark.sql.sources.CreateTempTableUsingAsSelect
 
createTime() - Method in class org.apache.spark.scheduler.cluster.CoarseGrainedSchedulerBackend
 
createUsingIndex(Iterator<Product2<Object, VD2>>, ClassTag<VD2>) - Method in class org.apache.spark.graphx.impl.VertexPartitionBaseOps
Similar effect as aggregateUsingIndex((a, b) => a)
createWorkspace(int) - Static method in class org.apache.spark.mllib.optimization.NNLS
 
creationSite() - Method in class org.apache.spark.rdd.RDD
User code that created this RDD (e.g.
creationSite() - Method in class org.apache.spark.streaming.dstream.DStream
 
credentialsProvider() - Method in class org.apache.spark.streaming.kinesis.KinesisReceiver
 
CrossValidator - Class in org.apache.spark.ml.tuning
:: AlphaComponent :: K-fold cross validation.
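A heavily abridged sketch of k-fold cross validation with this class, assuming an existing estimator lr (e.g. a LogisticRegression), an evaluator eval, a parameter grid grid built with ParamGridBuilder, and a training DataFrame training; all of these names are placeholders:

    import org.apache.spark.ml.tuning.CrossValidator

    val cv = new CrossValidator()
      .setEstimator(lr)
      .setEvaluator(eval)
      .setEstimatorParamMaps(grid)
      .setNumFolds(3)
    val cvModel = cv.fit(training)    // returns a CrossValidatorModel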
CrossValidator() - Constructor for class org.apache.spark.ml.tuning.CrossValidator
 
CrossValidatorModel - Class in org.apache.spark.ml.tuning
:: AlphaComponent :: Model from k-fold cross validation.
CrossValidatorModel(CrossValidator, ParamMap, Model<?>) - Constructor for class org.apache.spark.ml.tuning.CrossValidatorModel
 
CrossValidatorParams - Interface in org.apache.spark.ml.tuning
CSV_DEFAULT_DIR() - Method in class org.apache.spark.metrics.sink.CsvSink
 
CSV_DEFAULT_PERIOD() - Method in class org.apache.spark.metrics.sink.CsvSink
 
CSV_DEFAULT_UNIT() - Method in class org.apache.spark.metrics.sink.CsvSink
 
CSV_KEY_DIR() - Method in class org.apache.spark.metrics.sink.CsvSink
 
CSV_KEY_PERIOD() - Method in class org.apache.spark.metrics.sink.CsvSink
 
CSV_KEY_UNIT() - Method in class org.apache.spark.metrics.sink.CsvSink
 
CsvSink - Class in org.apache.spark.metrics.sink
 
CsvSink(Properties, MetricRegistry, SecurityManager) - Constructor for class org.apache.spark.metrics.sink.CsvSink
 
currentAttemptId() - Method in interface org.apache.spark.SparkStageInfo
 
currentAttemptId() - Method in class org.apache.spark.SparkStageInfoImpl
 
currentGraph() - Method in class org.apache.spark.mllib.impl.PeriodicGraphCheckpointer
 
currentInterval(Duration) - Static method in class org.apache.spark.streaming.Interval
 
currentLocalityIndex() - Method in class org.apache.spark.scheduler.TaskSetManager
 
currentResult() - Method in interface org.apache.spark.partial.ApproximateEvaluator
 
currentResult() - Method in class org.apache.spark.partial.CountEvaluator
 
currentResult() - Method in class org.apache.spark.partial.GroupedCountEvaluator
 
currentResult() - Method in class org.apache.spark.partial.GroupedMeanEvaluator
 
currentResult() - Method in class org.apache.spark.partial.GroupedSumEvaluator
 
currentResult() - Method in class org.apache.spark.partial.MeanEvaluator
 
currentResult() - Method in class org.apache.spark.partial.SumEvaluator
 
currentUnrollMemory() - Method in class org.apache.spark.storage.MemoryStore
Return the amount of memory currently occupied for unrolling blocks across all threads.
currentUnrollMemoryForThisThread() - Method in class org.apache.spark.storage.MemoryStore
Return the amount of memory currently occupied for unrolling blocks by this thread.
currPrefLocs(Partition) - Method in class org.apache.spark.rdd.PartitionCoalescer
 

D

DAGScheduler - Class in org.apache.spark.scheduler
The high-level scheduling layer that implements stage-oriented scheduling.
DAGScheduler(SparkContext, TaskScheduler, LiveListenerBus, MapOutputTrackerMaster, BlockManagerMaster, SparkEnv, Clock) - Constructor for class org.apache.spark.scheduler.DAGScheduler
 
DAGScheduler(SparkContext, TaskScheduler) - Constructor for class org.apache.spark.scheduler.DAGScheduler
 
DAGScheduler(SparkContext) - Constructor for class org.apache.spark.scheduler.DAGScheduler
 
dagScheduler() - Method in class org.apache.spark.scheduler.DAGSchedulerSource
 
dagScheduler() - Method in class org.apache.spark.scheduler.TaskSchedulerImpl
 
dagScheduler() - Method in class org.apache.spark.SparkContext
 
DAGSchedulerEvent - Interface in org.apache.spark.scheduler
Types of events that can be handled by the DAGScheduler.
DAGSchedulerEventProcessLoop - Class in org.apache.spark.scheduler
 
DAGSchedulerEventProcessLoop(DAGScheduler) - Constructor for class org.apache.spark.scheduler.DAGSchedulerEventProcessLoop
 
DAGSchedulerSource - Class in org.apache.spark.scheduler
 
DAGSchedulerSource(DAGScheduler) - Constructor for class org.apache.spark.scheduler.DAGSchedulerSource
 
data() - Method in class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.LaunchTask
 
data() - Method in class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.StatusUpdate
 
data() - Method in class org.apache.spark.storage.BlockResult
 
data() - Method in class org.apache.spark.storage.PutResult
 
data() - Method in class org.apache.spark.util.Distribution
 
data() - Method in class org.apache.spark.util.random.GapSamplingIterator
 
data() - Method in class org.apache.spark.util.random.GapSamplingReplacementIterator
 
database() - Method in class org.apache.spark.sql.hive.execution.CreateTableAsSelect
 
database() - Method in class org.apache.spark.sql.hive.HiveMetastoreCatalog.QualifiedTableName
 
databaseName() - Method in class org.apache.spark.sql.hive.MetastoreRelation
 
databaseName() - Method in class org.apache.spark.sql.sources.RefreshTable
 
dataDeserialize(BlockId, ByteBuffer, Serializer) - Method in class org.apache.spark.storage.BlockManager
Deserializes a ByteBuffer into an iterator of values and disposes of it when the end of the iterator is reached.
DataFrame - Class in org.apache.spark.sql
:: Experimental :: A distributed collection of data organized into named columns.
DataFrame(SQLContext, SQLContext.QueryExecution) - Constructor for class org.apache.spark.sql.DataFrame
 
DataFrame(SQLContext, LogicalPlan) - Constructor for class org.apache.spark.sql.DataFrame
A constructor that automatically analyzes the logical plan.
DATAFRAME_EAGER_ANALYSIS() - Static method in class org.apache.spark.sql.SQLConf
 
dataFrameEagerAnalysis() - Method in class org.apache.spark.sql.SQLConf
 
DataFrameHolder - Class in org.apache.spark.sql
A container for a DataFrame, used for implicit conversions.
DataFrameHolder(DataFrame) - Constructor for class org.apache.spark.sql.DataFrameHolder
 
DataFrameNaFunctions - Class in org.apache.spark.sql
:: Experimental :: Functionality for working with missing data in DataFrames.
DataFrameNaFunctions(DataFrame) - Constructor for class org.apache.spark.sql.DataFrameNaFunctions
 
dataSerialize(BlockId, Iterator<Object>, Serializer) - Method in class org.apache.spark.storage.BlockManager
Serializes into a byte buffer.
dataSerializeStream(BlockId, OutputStream, Iterator<Object>, Serializer) - Method in class org.apache.spark.storage.BlockManager
Serializes into a stream.
DataSinks() - Method in interface org.apache.spark.sql.hive.HiveStrategies
 
DataSourceStrategy - Class in org.apache.spark.sql.sources
A Strategy for planning scans over data sources defined using the sources API.
DataSourceStrategy() - Constructor for class org.apache.spark.sql.sources.DataSourceStrategy
 
dataType() - Method in class org.apache.spark.sql.columnar.NativeColumnType
 
dataType() - Method in class org.apache.spark.sql.hive.HiveGenericUdaf
 
dataType() - Method in class org.apache.spark.sql.hive.HiveGenericUdf
 
dataType() - Method in class org.apache.spark.sql.hive.HiveSimpleUdf
 
dataType() - Method in class org.apache.spark.sql.hive.HiveUdaf
 
DataType - Class in org.apache.spark.sql.types
:: DeveloperApi :: The base type of all Spark SQL data types.
DataType() - Constructor for class org.apache.spark.sql.types.DataType
 
dataType() - Method in interface org.apache.spark.sql.types.DataTypeParser
 
dataType() - Method in class org.apache.spark.sql.types.StructField
 
dataType() - Method in class org.apache.spark.sql.UserDefinedFunction
 
dataType() - Method in class org.apache.spark.sql.UserDefinedPythonFunction
 
DataTypeConversions - Class in org.apache.spark.sql.types
 
DataTypeConversions() - Constructor for class org.apache.spark.sql.types.DataTypeConversions
 
DataTypeException - Exception in org.apache.spark.sql.types
The exception thrown from the DataTypeParser.
DataTypeException(String) - Constructor for exception org.apache.spark.sql.types.DataTypeException
 
DataTypeParser - Interface in org.apache.spark.sql.types
This is a data type parser that can be used to parse string representations of data types provided in SQL queries.
DataTypes - Class in org.apache.spark.sql.types
To get/create a specific data type, users should use the singleton objects and factory methods provided by this class.
DataTypes() - Constructor for class org.apache.spark.sql.types.DataTypes
 
DataValidators - Class in org.apache.spark.mllib.util
:: DeveloperApi :: A collection of methods used to validate data before applying ML algorithms.
DataValidators() - Constructor for class org.apache.spark.mllib.util.DataValidators
 
DATE - Class in org.apache.spark.sql.columnar
 
DATE() - Constructor for class org.apache.spark.sql.columnar.DATE
 
date() - Method in class org.apache.spark.sql.ColumnName
Creates a new AttributeReference of type date
DateColumnAccessor - Class in org.apache.spark.sql.columnar
 
DateColumnAccessor(ByteBuffer) - Constructor for class org.apache.spark.sql.columnar.DateColumnAccessor
 
DateColumnBuilder - Class in org.apache.spark.sql.columnar
 
DateColumnBuilder() - Constructor for class org.apache.spark.sql.columnar.DateColumnBuilder
 
DateColumnStats - Class in org.apache.spark.sql.columnar
 
DateColumnStats() - Constructor for class org.apache.spark.sql.columnar.DateColumnStats
 
DateConversion() - Method in class org.apache.spark.sql.jdbc.JDBCRDD
Accessor for nested Scala object
DateType - Static variable in class org.apache.spark.sql.types.DataTypes
Gets the DateType object.
DateType - Class in org.apache.spark.sql.types
:: DeveloperApi :: The data type representing java.sql.Date values.
DateUtils - Class in org.apache.spark.sql.types
Helper functions to convert between the Int value of days since 1970-01-01 and java.sql.Date.
DateUtils() - Constructor for class org.apache.spark.sql.types.DateUtils
 
datum() - Method in class org.apache.spark.mllib.tree.impl.BaggedPoint
 
DDLException - Exception in org.apache.spark.sql.sources
The exception thrown from the DDL parser.
DDLException(String) - Constructor for exception org.apache.spark.sql.sources.DDLException
 
DDLParser - Class in org.apache.spark.sql.sources
A parser for foreign DDL commands.
DDLParser(Function1<String, LogicalPlan>) - Constructor for class org.apache.spark.sql.sources.DDLParser
 
dead(String) - Method in class org.apache.spark.scheduler.cluster.SparkDeploySchedulerBackend
 
decayFactor() - Method in class org.apache.spark.mllib.clustering.StreamingKMeans
 
decimal() - Method in class org.apache.spark.sql.ColumnName
Creates a new AttributeReference of type decimal
decimal(int, int) - Method in class org.apache.spark.sql.ColumnName
Creates a new AttributeReference of type decimal
Decimal - Class in org.apache.spark.sql.types
A mutable implementation of BigDecimal that can hold a Long if values are small enough.
Decimal() - Constructor for class org.apache.spark.sql.types.Decimal
 
Decimal.DecimalAsIfIntegral$ - Class in org.apache.spark.sql.types
An Integral evidence parameter for Decimals.
Decimal.DecimalAsIfIntegral$() - Constructor for class org.apache.spark.sql.types.Decimal.DecimalAsIfIntegral$
 
Decimal.DecimalIsConflicted - Interface in org.apache.spark.sql.types
Common methods for Decimal evidence parameters
Decimal.DecimalIsFractional$ - Class in org.apache.spark.sql.types
A Fractional evidence parameter for Decimals.
Decimal.DecimalIsFractional$() - Constructor for class org.apache.spark.sql.types.Decimal.DecimalIsFractional$
 
DecimalConversion() - Method in class org.apache.spark.sql.jdbc.JDBCRDD
Accessor for nested Scala object
decimalMetadata() - Method in class org.apache.spark.sql.parquet.ParquetTypeInfo
 
decimalMetastoreString(DecimalType) - Static method in class org.apache.spark.sql.hive.HiveShim
 
DecimalType - Class in org.apache.spark.sql.types
:: DeveloperApi :: The data type representing java.math.BigDecimal values.
DecimalType(Option<PrecisionInfo>) - Constructor for class org.apache.spark.sql.types.DecimalType
 
DecimalType.Expression$ - Class in org.apache.spark.sql.types
 
DecimalType.Expression$() - Constructor for class org.apache.spark.sql.types.DecimalType.Expression$
 
DecimalType.Fixed$ - Class in org.apache.spark.sql.types
 
DecimalType.Fixed$() - Constructor for class org.apache.spark.sql.types.DecimalType.Fixed$
 
decimalTypeInfo(DecimalType) - Static method in class org.apache.spark.sql.hive.HiveShim
 
decimalTypeInfoToCatalyst(PrimitiveObjectInspector) - Static method in class org.apache.spark.sql.hive.HiveShim
 
DecisionTree - Class in org.apache.spark.mllib.tree
:: Experimental :: A class which implements a decision tree learning algorithm for classification and regression.
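A small sketch of training a tree with the constructor-plus-run path shown here, assuming an RDD[LabeledPoint] trainingData; the strategy is built with Strategy.defaultStrategy, listed further down in this index:

    import org.apache.spark.mllib.tree.DecisionTree
    import org.apache.spark.mllib.tree.configuration.Strategy

    val strategy = Strategy.defaultStrategy("Classification")
    val model = new DecisionTree(strategy).run(trainingData)   // returns a DecisionTreeModel
    println("Tree depth: " + model.depth)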
DecisionTree(Strategy) - Constructor for class org.apache.spark.mllib.tree.DecisionTree
 
DecisionTreeMetadata - Class in org.apache.spark.mllib.tree.impl
Learning and dataset metadata for DecisionTree.
DecisionTreeMetadata(int, long, int, int, Map<Object, Object>, Set<Object>, int[], Impurity, Enumeration.Value, int, int, double, int, int) - Constructor for class org.apache.spark.mllib.tree.impl.DecisionTreeMetadata
 
DecisionTreeModel - Class in org.apache.spark.mllib.tree.model
:: Experimental :: Decision tree model for classification or regression.
DecisionTreeModel(Node, Enumeration.Value) - Constructor for class org.apache.spark.mllib.tree.model.DecisionTreeModel
 
DecisionTreeModel.SaveLoadV1_0$ - Class in org.apache.spark.mllib.tree.model
 
DecisionTreeModel.SaveLoadV1_0$() - Constructor for class org.apache.spark.mllib.tree.model.DecisionTreeModel.SaveLoadV1_0$
 
DecisionTreeModel.SaveLoadV1_0$.NodeData - Class in org.apache.spark.mllib.tree.model
Model data for model import/export
DecisionTreeModel.SaveLoadV1_0$.NodeData(int, int, org.apache.spark.mllib.tree.model.DecisionTreeModel.SaveLoadV1_0.PredictData, double, boolean, Option<org.apache.spark.mllib.tree.model.DecisionTreeModel.SaveLoadV1_0.SplitData>, Option<Object>, Option<Object>, Option<Object>) - Constructor for class org.apache.spark.mllib.tree.model.DecisionTreeModel.SaveLoadV1_0$.NodeData
 
DecisionTreeModel.SaveLoadV1_0$.PredictData - Class in org.apache.spark.mllib.tree.model
 
DecisionTreeModel.SaveLoadV1_0$.PredictData(double, double) - Constructor for class org.apache.spark.mllib.tree.model.DecisionTreeModel.SaveLoadV1_0$.PredictData
 
DecisionTreeModel.SaveLoadV1_0$.SplitData - Class in org.apache.spark.mllib.tree.model
 
DecisionTreeModel.SaveLoadV1_0$.SplitData(int, double, int, Seq<Object>) - Constructor for class org.apache.spark.mllib.tree.model.DecisionTreeModel.SaveLoadV1_0$.SplitData
 
decoder(ByteBuffer, NativeColumnType<T>) - Static method in class org.apache.spark.sql.columnar.compression.BooleanBitSet
 
decoder() - Method in interface org.apache.spark.sql.columnar.compression.CompressibleColumnAccessor
 
decoder(ByteBuffer, NativeColumnType<T>) - Method in interface org.apache.spark.sql.columnar.compression.CompressionScheme
 
Decoder<T extends NativeType> - Interface in org.apache.spark.sql.columnar.compression
 
decoder(ByteBuffer, NativeColumnType<T>) - Static method in class org.apache.spark.sql.columnar.compression.DictionaryEncoding
 
decoder(ByteBuffer, NativeColumnType<T>) - Static method in class org.apache.spark.sql.columnar.compression.IntDelta
 
decoder(ByteBuffer, NativeColumnType<T>) - Static method in class org.apache.spark.sql.columnar.compression.LongDelta
 
decoder(ByteBuffer, NativeColumnType<T>) - Static method in class org.apache.spark.sql.columnar.compression.PassThrough
 
decoder(ByteBuffer, NativeColumnType<T>) - Static method in class org.apache.spark.sql.columnar.compression.RunLengthEncoding
 
decreaseRunningTasks(int) - Method in class org.apache.spark.scheduler.Pool
 
deepCopy() - Method in class org.apache.spark.mllib.tree.model.Node
Returns a deep copy of the subtree rooted at this node.
DEFAULT_BUFFER_SIZE() - Static method in class org.apache.spark.util.logging.RollingFileAppender
 
DEFAULT_CLEANER_TTL() - Static method in class org.apache.spark.streaming.StreamingContext
 
DEFAULT_DATA_SOURCE_NAME() - Static method in class org.apache.spark.sql.SQLConf
 
DEFAULT_LOG_DIR() - Static method in class org.apache.spark.scheduler.EventLoggingListener
 
DEFAULT_MINIMUM_SHARE() - Method in class org.apache.spark.scheduler.FairSchedulableBuilder
 
DEFAULT_PARTITION_NAME() - Static method in class org.apache.spark.sql.parquet.ParquetRelation2
 
DEFAULT_POOL_NAME() - Method in class org.apache.spark.scheduler.FairSchedulableBuilder
 
DEFAULT_POOL_NAME() - Static method in class org.apache.spark.ui.jobs.JobProgressListener
 
DEFAULT_PORT() - Static method in class org.apache.spark.ui.SparkUI
 
DEFAULT_RETAINED_JOBS() - Static method in class org.apache.spark.ui.jobs.JobProgressListener
 
DEFAULT_RETAINED_STAGES() - Static method in class org.apache.spark.ui.jobs.JobProgressListener
 
DEFAULT_SCHEDULER_FILE() - Method in class org.apache.spark.scheduler.FairSchedulableBuilder
 
DEFAULT_SCHEDULING_MODE() - Method in class org.apache.spark.scheduler.FairSchedulableBuilder
 
DEFAULT_SIZE_IN_BYTES() - Static method in class org.apache.spark.sql.SQLConf
 
DEFAULT_WEIGHT() - Method in class org.apache.spark.scheduler.FairSchedulableBuilder
 
defaultCorrName() - Static method in class org.apache.spark.mllib.stat.correlation.CorrelationNames
 
defaultDataSourceName() - Method in class org.apache.spark.sql.SQLConf
 
defaultFilter(Path) - Static method in class org.apache.spark.streaming.dstream.FileInputDStream
 
defaultFormat() - Method in class org.apache.spark.sql.hive.execution.HiveScriptIOSchema
 
defaultMinPartitions() - Method in class org.apache.spark.api.java.JavaSparkContext
Default min number of partitions for Hadoop RDDs when not given by user
defaultMinPartitions() - Method in class org.apache.spark.SparkContext
Default min number of partitions for Hadoop RDDs when not given by user. Note that we use math.min, so "defaultMinPartitions" cannot be higher than 2.
defaultMinSplits() - Method in class org.apache.spark.api.java.JavaSparkContext
Deprecated.
As of Spark 1.0.0, defaultMinSplits is deprecated; use JavaSparkContext.defaultMinPartitions() instead.
defaultMinSplits() - Method in class org.apache.spark.SparkContext
Default min number of partitions for Hadoop RDDs when not given by user
defaultParallelism() - Method in class org.apache.spark.api.java.JavaSparkContext
Default level of parallelism to use when not given by user (e.g.
defaultParallelism() - Method in class org.apache.spark.scheduler.cluster.CoarseGrainedSchedulerBackend
 
defaultParallelism() - Method in class org.apache.spark.scheduler.cluster.mesos.MesosSchedulerBackend
 
defaultParallelism() - Method in class org.apache.spark.scheduler.local.LocalBackend
 
defaultParallelism() - Method in interface org.apache.spark.scheduler.SchedulerBackend
 
defaultParallelism() - Method in interface org.apache.spark.scheduler.TaskScheduler
 
defaultParallelism() - Method in class org.apache.spark.scheduler.TaskSchedulerImpl
 
defaultParallelism() - Method in class org.apache.spark.SparkContext
Default level of parallelism to use when not given by user (e.g.
defaultParams(String) - Static method in class org.apache.spark.mllib.tree.configuration.BoostingStrategy
Returns default configuration for the boosting algorithm
defaultParams(Enumeration.Value) - Static method in class org.apache.spark.mllib.tree.configuration.BoostingStrategy
Returns default configuration for the boosting algorithm
defaultPartitioner(RDD<?>, Seq<RDD<?>>) - Static method in class org.apache.spark.Partitioner
Choose a partitioner to use for a cogroup-like operation between a number of RDDs.
defaultPartitioner(int) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
 
defaultProbabilities() - Method in class org.apache.spark.util.Distribution
 
defaultSize() - Method in class org.apache.spark.sql.columnar.ColumnType
 
defaultSize() - Method in class org.apache.spark.sql.types.ArrayType
The default size of a value of the ArrayType is 100 * the default size of the element type.
defaultSize() - Method in class org.apache.spark.sql.types.BinaryType
The default size of a value of the BinaryType is 4096 bytes.
defaultSize() - Method in class org.apache.spark.sql.types.BooleanType
The default size of a value of the BooleanType is 1 byte.
defaultSize() - Method in class org.apache.spark.sql.types.ByteType
The default size of a value of the ByteType is 1 byte.
defaultSize() - Method in class org.apache.spark.sql.types.DataType
The default size of a value of this data type.
defaultSize() - Method in class org.apache.spark.sql.types.DateType
The default size of a value of the DateType is 4 bytes.
defaultSize() - Method in class org.apache.spark.sql.types.DecimalType
The default size of a value of the DecimalType is 4096 bytes.
defaultSize() - Method in class org.apache.spark.sql.types.DoubleType
The default size of a value of the DoubleType is 8 bytes.
defaultSize() - Method in class org.apache.spark.sql.types.FloatType
The default size of a value of the FloatType is 4 bytes.
defaultSize() - Method in class org.apache.spark.sql.types.IntegerType
The default size of a value of the IntegerType is 4 bytes.
defaultSize() - Method in class org.apache.spark.sql.types.LongType
The default size of a value of the LongType is 8 bytes.
defaultSize() - Method in class org.apache.spark.sql.types.MapType
The default size of a value of the MapType is 100 * (the default size of the key type + the default size of the value type).
defaultSize() - Method in class org.apache.spark.sql.types.NullType
 
defaultSize() - Method in class org.apache.spark.sql.types.ShortType
The default size of a value of the ShortType is 2 bytes.
defaultSize() - Method in class org.apache.spark.sql.types.StringType
The default size of a value of the StringType is 4096 bytes.
defaultSize() - Method in class org.apache.spark.sql.types.StructType
The default size of a value of the StructType is the total default sizes of all field types.
defaultSize() - Method in class org.apache.spark.sql.types.TimestampType
The default size of a value of the TimestampType is 12 bytes.
defaultSize() - Method in class org.apache.spark.sql.types.UserDefinedType
The default size of a value of the UserDefinedType is 4096 bytes.
defaultSizeInBytes() - Method in class org.apache.spark.sql.SQLConf
The default size in bytes to assign to a logical operator's estimation statistics.
DefaultSource - Class in org.apache.spark.sql.jdbc
Given a partitioning schematic (a column of integral type, a number of partitions, and upper and lower bounds on the column's value), generate WHERE clauses for each partition so that each row in the table appears exactly once.
DefaultSource() - Constructor for class org.apache.spark.sql.jdbc.DefaultSource
 
DefaultSource - Class in org.apache.spark.sql.json
 
DefaultSource() - Constructor for class org.apache.spark.sql.json.DefaultSource
 
DefaultSource - Class in org.apache.spark.sql.parquet
Allows creation of Parquet based tables using the syntax:
DefaultSource() - Constructor for class org.apache.spark.sql.parquet.DefaultSource
 
defaultStategy(Enumeration.Value) - Static method in class org.apache.spark.mllib.tree.configuration.Strategy
Construct a default set of parameters for DecisionTree
defaultStrategy(String) - Static method in class org.apache.spark.mllib.tree.configuration.Strategy
Construct a default set of parameters for DecisionTree
defaultStrategy() - Static method in class org.apache.spark.streaming.receiver.ActorSupervisorStrategy
 
defaultValue() - Method in class org.apache.spark.ml.param.Param
 
DeferredObjectAdapter - Class in org.apache.spark.sql.hive
 
DeferredObjectAdapter(ObjectInspector) - Constructor for class org.apache.spark.sql.hive.DeferredObjectAdapter
 
degrees() - Method in class org.apache.spark.graphx.GraphOps
The degree of each vertex in the graph.
degreesOfFreedom() - Method in class org.apache.spark.mllib.stat.test.ChiSqTestResult
 
degreesOfFreedom() - Method in interface org.apache.spark.mllib.stat.test.TestResult
Returns the degree(s) of freedom of the hypothesis test.
delaySeconds() - Method in class org.apache.spark.streaming.Checkpoint
 
delegate() - Method in class org.apache.spark.InterruptibleIterator
 
deleteAllCheckpoints() - Method in class org.apache.spark.mllib.impl.PeriodicGraphCheckpointer
Call this at the end to delete any remaining checkpoint files.
deleteAllCheckpoints() - Method in class org.apache.spark.mllib.tree.impl.NodeIdCache
Call this after training is finished to delete any remaining checkpoints.
deleteOldFiles() - Method in class org.apache.spark.util.logging.RollingFileAppender
Retain only the last few files.
deleteRecursively(File) - Static method in class org.apache.spark.util.Utils
Delete a file or directory and its contents recursively.
deleteRecursively(TachyonFile, TachyonFS) - Static method in class org.apache.spark.util.Utils
Delete a file or directory and its contents recursively.
dense(int, int, double[]) - Static method in class org.apache.spark.mllib.linalg.Matrices
Creates a column-major dense matrix.
dense(double, double...) - Static method in class org.apache.spark.mllib.linalg.Vectors
Creates a dense vector from its values.
dense(double, Seq<Object>) - Static method in class org.apache.spark.mllib.linalg.Vectors
Creates a dense vector from its values.
dense(double[]) - Static method in class org.apache.spark.mllib.linalg.Vectors
Creates a dense vector from a double array.
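A quick sketch of the dense factory methods above:

    import org.apache.spark.mllib.linalg.{Matrices, Vectors}

    // A length-3 dense vector and a 2x2 column-major dense matrix.
    val v = Vectors.dense(1.0, 0.0, 3.0)
    val m = Matrices.dense(2, 2, Array(1.0, 2.0, 3.0, 4.0))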
DenseMatrix - Class in org.apache.spark.mllib.linalg
Column-major dense matrix.
DenseMatrix(int, int, double[], boolean) - Constructor for class org.apache.spark.mllib.linalg.DenseMatrix
 
DenseMatrix(int, int, double[]) - Constructor for class org.apache.spark.mllib.linalg.DenseMatrix
Column-major dense matrix.
DenseVector - Class in org.apache.spark.mllib.linalg
A dense vector represented by a value array.
DenseVector(double[]) - Constructor for class org.apache.spark.mllib.linalg.DenseVector
 
dependencies() - Method in class org.apache.spark.rdd.RDD
Get the list of dependencies of this RDD, taking into account whether the RDD is checkpointed or not.
dependencies() - Method in class org.apache.spark.streaming.dstream.DStream
List of parent DStreams that this DStream depends on.
dependencies() - Method in class org.apache.spark.streaming.dstream.FilteredDStream
 
dependencies() - Method in class org.apache.spark.streaming.dstream.FlatMappedDStream
 
dependencies() - Method in class org.apache.spark.streaming.dstream.FlatMapValuedDStream
 
dependencies() - Method in class org.apache.spark.streaming.dstream.ForEachDStream
 
dependencies() - Method in class org.apache.spark.streaming.dstream.GlommedDStream
 
dependencies() - Method in class org.apache.spark.streaming.dstream.InputDStream
 
dependencies() - Method in class org.apache.spark.streaming.dstream.MapPartitionedDStream
 
dependencies() - Method in class org.apache.spark.streaming.dstream.MappedDStream
 
dependencies() - Method in class org.apache.spark.streaming.dstream.MapValuedDStream
 
dependencies() - Method in class org.apache.spark.streaming.dstream.ReducedWindowedDStream
 
dependencies() - Method in class org.apache.spark.streaming.dstream.ShuffledDStream
 
dependencies() - Method in class org.apache.spark.streaming.dstream.StateDStream
 
dependencies() - Method in class org.apache.spark.streaming.dstream.TransformedDStream
 
dependencies() - Method in class org.apache.spark.streaming.dstream.UnionDStream
 
dependencies() - Method in class org.apache.spark.streaming.dstream.WindowedDStream
 
Dependency<T> - Class in org.apache.spark
:: DeveloperApi :: Base class for dependencies.
Dependency() - Constructor for class org.apache.spark.Dependency
 
deps() - Method in class org.apache.spark.rdd.CoGroupPartition
 
depth() - Method in class org.apache.spark.mllib.tree.model.DecisionTreeModel
Get depth of tree.
DeregisterReceiver - Class in org.apache.spark.streaming.scheduler
 
DeregisterReceiver(int, String, String) - Constructor for class org.apache.spark.streaming.scheduler.DeregisterReceiver
 
desc() - Method in class org.apache.spark.serializer.SerializationDebugger.ObjectStreamClassMethods
 
desc() - Method in class org.apache.spark.sql.Column
Returns an ordering used in sorting.
desc(String) - Static method in class org.apache.spark.sql.functions
Returns a sort expression based on the descending order of the column.
desc() - Method in class org.apache.spark.sql.hive.execution.CreateTableAsSelect
 
describe(String...) - Method in class org.apache.spark.sql.DataFrame
Computes statistics for numeric columns, including count, mean, stddev, min, and max.
describe(Seq<String>) - Method in class org.apache.spark.sql.DataFrame
Computes statistics for numeric columns, including count, mean, stddev, min, and max.
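A small illustration of describe, assuming a DataFrame df with a hypothetical numeric column "age":

    // count, mean, stddev, min, and max for the selected numeric column.
    df.describe("age").show()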
DescribeCommand - Class in org.apache.spark.sql.sources
Returned for the "DESCRIBE [EXTENDED] [dbName.]tableName" command.
DescribeCommand(LogicalPlan, boolean) - Constructor for class org.apache.spark.sql.sources.DescribeCommand
 
DescribeHiveTableCommand - Class in org.apache.spark.sql.hive.execution
Implementation for "describe [extended] table".
DescribeHiveTableCommand(MetastoreRelation, Seq<Attribute>, boolean) - Constructor for class org.apache.spark.sql.hive.execution.DescribeHiveTableCommand
 
describeTopics(int) - Method in class org.apache.spark.mllib.clustering.DistributedLDAModel
 
describeTopics(int) - Method in class org.apache.spark.mllib.clustering.LDAModel
Return the topics described by weighted terms.
describeTopics() - Method in class org.apache.spark.mllib.clustering.LDAModel
Return the topics described by weighted terms.
describeTopics(int) - Method in class org.apache.spark.mllib.clustering.LocalLDAModel
 
description() - Method in class org.apache.spark.ExceptionFailure
 
description() - Method in class org.apache.spark.storage.StorageLevel
 
description() - Method in class org.apache.spark.ui.jobs.UIData.StageUIData
 
DeserializationStream - Class in org.apache.spark.serializer
:: DeveloperApi :: A stream for reading serialized objects.
DeserializationStream() - Constructor for class org.apache.spark.serializer.DeserializationStream
 
deserialize(Object) - Method in class org.apache.spark.mllib.linalg.VectorUDT
 
deserialize(ByteBuffer, ClassTag<T>) - Method in class org.apache.spark.serializer.JavaSerializerInstance
 
deserialize(ByteBuffer, ClassLoader, ClassTag<T>) - Method in class org.apache.spark.serializer.JavaSerializerInstance
 
deserialize(ByteBuffer, ClassTag<T>) - Method in class org.apache.spark.serializer.KryoSerializerInstance
 
deserialize(ByteBuffer, ClassLoader, ClassTag<T>) - Method in class org.apache.spark.serializer.KryoSerializerInstance
 
deserialize(ByteBuffer, ClassTag<T>) - Method in class org.apache.spark.serializer.SerializerInstance
 
deserialize(ByteBuffer, ClassLoader, ClassTag<T>) - Method in class org.apache.spark.serializer.SerializerInstance
 
deserialize(Object) - Method in class org.apache.spark.sql.test.ExamplePointUDT
 
deserialize(Object) - Method in class org.apache.spark.sql.types.UserDefinedType
Convert a SQL datum to the user type.
deserialize(byte[]) - Static method in class org.apache.spark.util.Utils
Deserialize an object using Java serialization.
deserialize(byte[], ClassLoader) - Static method in class org.apache.spark.util.Utils
Deserialize an object using Java serialization and the given ClassLoader.
deserialized() - Method in class org.apache.spark.storage.MemoryEntry
 
deserialized() - Method in class org.apache.spark.storage.StorageLevel
 
deserializeFilterExpressions(Configuration) - Static method in class org.apache.spark.sql.parquet.ParquetFilters
Note: Inside the Hadoop API we only have access to Configuration, not to SparkContext, so we cannot use broadcasts to convey the actual filter predicate.
deserializeLongValue(byte[]) - Static method in class org.apache.spark.util.Utils
Deserialize a Long value (used for PythonPartitioner).
deserializeMapStatuses(byte[]) - Static method in class org.apache.spark.MapOutputTracker
 
deserializePlan(InputStream, Class<?>) - Method in class org.apache.spark.sql.hive.HiveFunctionWrapper
 
deserializeStream(InputStream) - Method in class org.apache.spark.serializer.JavaSerializerInstance
 
deserializeStream(InputStream, ClassLoader) - Method in class org.apache.spark.serializer.JavaSerializerInstance
 
deserializeStream(InputStream) - Method in class org.apache.spark.serializer.KryoSerializerInstance
 
deserializeStream(InputStream) - Method in class org.apache.spark.serializer.SerializerInstance
 
deserializeViaNestedStream(InputStream, SerializerInstance, Function1<DeserializationStream, BoxedUnit>) - Static method in class org.apache.spark.util.Utils
Deserialize via a nested stream using a specific serializer.
deserializeWithDependencies(ByteBuffer) - Static method in class org.apache.spark.scheduler.Task
Deserialize the list of dependencies in a task serialized with serializeWithDependencies, and return the task itself as a serialized ByteBuffer.
destinationToken() - Static method in class org.apache.spark.sql.hive.HiveQl
 
destroy() - Method in class org.apache.spark.broadcast.Broadcast
Destroy all data and metadata related to this broadcast variable.
destroy(boolean) - Method in class org.apache.spark.broadcast.Broadcast
Destroy all data and metadata related to this broadcast variable.
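A minimal sketch of the broadcast lifecycle, assuming an existing SparkContext named sc; destroy is called only once no further jobs need the variable:

    val lookup = sc.broadcast(Map("a" -> 1, "b" -> 2))
    val counts = sc.parallelize(Seq("a", "b", "b")).map(x => lookup.value(x)).collect()
    lookup.destroy()  // removes all data and metadata; the broadcast cannot be used afterwards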
destroyPythonWorker(String, Map<String, String>, Socket) - Method in class org.apache.spark.SparkEnv
 
destTableId() - Method in class org.apache.spark.sql.hive.ShimFileSinkDesc
 
detach() - Method in class org.apache.spark.streaming.ui.StreamingTab
 
detachPage(WebUIPage) - Method in class org.apache.spark.ui.WebUI
 
detachTab(WebUITab) - Method in class org.apache.spark.ui.WebUI
 
details() - Method in class org.apache.spark.scheduler.Stage
 
details() - Method in class org.apache.spark.scheduler.StageInfo
 
determineBounds(ArrayBuffer<Tuple2<K, Object>>, int, Ordering<K>, ClassTag<K>) - Static method in class org.apache.spark.RangePartitioner
Determines the bounds for range partitioning from candidates with weights indicating how many items each represents.
DeveloperApi - Annotation Type in org.apache.spark.annotation
A lower-level, unstable API intended for developers.
df() - Method in class org.apache.spark.sql.DataFrameHolder
 
diag(Vector) - Static method in class org.apache.spark.mllib.linalg.DenseMatrix
Generate a diagonal matrix in DenseMatrix format from the supplied values.
diag(Vector) - Static method in class org.apache.spark.mllib.linalg.Matrices
Generate a diagonal matrix in Matrix format from the supplied values.
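For illustration, a small sketch building a 3x3 diagonal matrix from a dense vector:

    import org.apache.spark.mllib.linalg.{Matrices, Vectors}

    // Diagonal entries (1.0, 2.0, 3.0); all off-diagonal entries are zero.
    val diagonal = Matrices.diag(Vectors.dense(1.0, 2.0, 3.0))
    println(diagonal)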
DIALECT() - Static method in class org.apache.spark.sql.SQLConf
 
dialect() - Method in class org.apache.spark.sql.SQLConf
The SQL dialect that is used when parsing queries.
DictionaryEncoding - Class in org.apache.spark.sql.columnar.compression
 
DictionaryEncoding() - Constructor for class org.apache.spark.sql.columnar.compression.DictionaryEncoding
 
DictionaryEncoding.Decoder<T extends NativeType> - Class in org.apache.spark.sql.columnar.compression
 
DictionaryEncoding.Decoder(ByteBuffer, NativeColumnType<T>) - Constructor for class org.apache.spark.sql.columnar.compression.DictionaryEncoding.Decoder
 
DictionaryEncoding.Encoder<T extends NativeType> - Class in org.apache.spark.sql.columnar.compression
 
DictionaryEncoding.Encoder(NativeColumnType<T>) - Constructor for class org.apache.spark.sql.columnar.compression.DictionaryEncoding.Encoder
 
diff(Self) - Method in class org.apache.spark.graphx.impl.VertexPartitionBaseOps
Hides vertices that are the same between this and other.
diff(VertexRDD<VD>) - Method in class org.apache.spark.graphx.impl.VertexRDDImpl
 
diff(VertexRDD<VD>) - Method in class org.apache.spark.graphx.VertexRDD
Hides vertices that are the same between this and other; for vertices that are different, keeps the values from other.
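A hedged sketch of the semantics described above, assuming an existing SparkContext named sc and two small vertex sets with the same ids:

    import org.apache.spark.graphx.VertexRDD

    val before = VertexRDD(sc.parallelize(Seq((1L, 1), (2L, 2), (3L, 3))))
    val after  = VertexRDD(sc.parallelize(Seq((1L, 1), (2L, 20), (3L, 30))))
    // Keep only the vertices whose value differs, taking the value from `after`.
    val changed = before.diff(after)
    changed.collect().foreach(println)  // expected: (2,20) and (3,30)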
dir() - Method in class org.apache.spark.mllib.optimization.NNLS.Workspace
 
dir() - Method in class org.apache.spark.sql.hive.ShimFileSinkDesc
 
DirectKafkaInputDStream<K,V,U extends kafka.serializer.Decoder<K>,T extends kafka.serializer.Decoder<V>,R> - Class in org.apache.spark.streaming.kafka
A stream of KafkaRDD where each given Kafka topic/partition corresponds to an RDD partition.
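The usual entry point for this stream is KafkaUtils.createDirectStream rather than constructing the class directly; a hedged sketch, where the broker address broker1:9092 and topic name "events" are placeholders and ssc is an assumed StreamingContext:

    import kafka.serializer.StringDecoder
    import org.apache.spark.streaming.kafka.KafkaUtils

    val kafkaParams = Map("metadata.broker.list" -> "broker1:9092")
    val stream = KafkaUtils.createDirectStream[String, String, StringDecoder, StringDecoder](
      ssc, kafkaParams, Set("events"))
    stream.map(_._2).count().print()  // number of records in each batch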
DirectKafkaInputDStream(StreamingContext, Map<String, String>, Map<TopicAndPartition, Object>, Function1<MessageAndMetadata<K, V>, R>, ClassTag<K>, ClassTag<V>, ClassTag<U>, ClassTag<T>, ClassTag<R>) - Constructor for class org.apache.spark.streaming.kafka.DirectKafkaInputDStream
 
DirectKafkaInputDStream.DirectKafkaInputDStreamCheckpointData - Class in org.apache.spark.streaming.kafka
 
DirectKafkaInputDStream.DirectKafkaInputDStreamCheckpointData() - Constructor for class org.apache.spark.streaming.kafka.DirectKafkaInputDStream.DirectKafkaInputDStreamCheckpointData
 
DirectTaskResult<T> - Class in org.apache.spark.scheduler
A TaskResult that contains the task's return value and accumulator updates.
DirectTaskResult(ByteBuffer, Map<Object, Object>, TaskMetrics) - Constructor for class org.apache.spark.scheduler.DirectTaskResult
 
DirectTaskResult() - Constructor for class org.apache.spark.scheduler.DirectTaskResult
 
disableOutputSpecValidation() - Static method in class org.apache.spark.rdd.PairRDDFunctions
Allows for the spark.hadoop.validateOutputSpecs checks to be disabled on a case-by-case basis; see SPARK-4835 for more details.
disconnected(SchedulerDriver) - Method in class org.apache.spark.scheduler.cluster.mesos.CoarseMesosSchedulerBackend
 
disconnected(SchedulerDriver) - Method in class org.apache.spark.scheduler.cluster.mesos.MesosSchedulerBackend
 
disconnected() - Method in class org.apache.spark.scheduler.cluster.SparkDeploySchedulerBackend
 
DISK_ONLY - Static variable in class org.apache.spark.api.java.StorageLevels
 
DISK_ONLY() - Static method in class org.apache.spark.storage.StorageLevel
 
DISK_ONLY_2 - Static variable in class org.apache.spark.api.java.StorageLevels
 
DISK_ONLY_2() - Static method in class org.apache.spark.storage.StorageLevel
 
diskBlockManager() - Method in class org.apache.spark.storage.BlockManager
 
DiskBlockManager - Class in org.apache.spark.storage
Creates and maintains the logical mapping between logical blocks and physical on-disk locations.
DiskBlockManager(BlockManager, SparkConf) - Constructor for class org.apache.spark.storage.DiskBlockManager
 
DiskBlockObjectWriter - Class in org.apache.spark.storage
BlockObjectWriter which writes directly to a file on disk.
DiskBlockObjectWriter(BlockId, File, Serializer, int, Function1<OutputStream, OutputStream>, boolean, ShuffleWriteMetrics) - Constructor for class org.apache.spark.storage.DiskBlockObjectWriter
 
diskBytesSpilled() - Method in class org.apache.spark.ui.jobs.UIData.ExecutorSummary
 
diskBytesSpilled() - Method in class org.apache.spark.ui.jobs.UIData.StageUIData
 
diskSize() - Method in class org.apache.spark.storage.BlockManagerMessages.UpdateBlockInfo
 
diskSize() - Method in class org.apache.spark.storage.BlockStatus
 
diskSize() - Method in class org.apache.spark.storage.RDDInfo
 
diskStore() - Method in class org.apache.spark.storage.BlockManager
 
DiskStore - Class in org.apache.spark.storage
Stores BlockManager blocks on disk.
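DiskStore backs the DISK_ONLY family of storage levels listed above; a minimal sketch of requesting disk-only persistence from user code, assuming an existing SparkContext sc and a placeholder input path data.txt:

    import org.apache.spark.storage.StorageLevel

    // Partitions are written to disk and read back from there when the RDD is reused.
    val lines = sc.textFile("data.txt").persist(StorageLevel.DISK_ONLY)
    println(lines.count())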
DiskStore(BlockManager, DiskBlockManager) - Constructor for class org.apache.spark.storage.DiskStore
 
diskUsed() - Method in class org.apache.spark.storage.StorageStatus
Return the disk space used by this block manager.
diskUsed() - Method in class org.apache.spark.ui.exec.ExecutorSummaryInfo
 
diskUsedByRdd(int) - Method in class org.apache.spark.storage.StorageStatus
Return the disk space used by the given RDD in this block manager in O(1) time.
dispose(ByteBuffer) - Static method in class org.apache.spark.storage.BlockManager
Attempt to clean up a ByteBuffer if it is memory-mapped.
dist(Vector) - Method in class org.apache.spark.util.Vector
 
distinct() - Method in class org.apache.spark.api.java.JavaDoubleRDD
Return a new RDD containing the distinct elements in this RDD.
distinct(int) - Method in class org.apache.spark.api.java.JavaDoubleRDD
Return a new RDD containing the distinct elements in this RDD.
distinct() - Method in class org.apache.spark.api.java.JavaPairRDD
Return a new RDD containing the distinct elements in this RDD.
distinct(int) - Method in class org.apache.spark.api.java.JavaPairRDD
Return a new RDD containing the distinct elements in this RDD.
distinct() - Method in class org.apache.spark.api.java.JavaRDD
Return a new RDD containing the distinct elements in this RDD.
distinct(int) - Method in class org.apache.spark.api.java.JavaRDD
Return a new RDD containing the distinct elements in this RDD.
distinct(int, Ordering<T>) - Method in class org.apache.spark.rdd.RDD
Return a new RDD containing the distinct elements in this RDD.
distinct() - Method in class org.apache.spark.rdd.RDD
Return a new RDD containing the distinct elements in this RDD.
distinct() - Method in class org.apache.spark.sql.DataFrame
Returns a new DataFrame that contains only the unique rows from this DataFrame.
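A small sketch covering both variants, assuming an existing SparkContext sc and DataFrame df:

    // RDD variant: duplicates are removed, order is not preserved.
    val distinctNumbers = sc.parallelize(Seq(1, 1, 2, 3, 3)).distinct().collect()
    // DataFrame variant: whole rows are deduplicated.
    df.distinct().show()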
distinct() - Method in interface org.apache.spark.sql.RDDApi
 
DistributedLDAModel - Class in org.apache.spark.mllib.clustering
:: Experimental ::
DistributedLDAModel(LDA.EMOptimizer, double[]) - Constructor for class org.apache.spark.mllib.clustering.DistributedLDAModel
 
DistributedMatrix - Interface in org.apache.spark.mllib.linalg.distributed
Represents a distributively stored matrix backed by one or more RDDs.
Distribution - Class in org.apache.spark.util
Util for getting some stats from a small sample of numeric values, with some handy summary functions.
Distribution(double[], int, int) - Constructor for class org.apache.spark.util.Distribution
 
Distribution(Traversable<Object>) - Constructor for class org.apache.spark.util.Distribution
 
DIV() - Static method in class org.apache.spark.sql.hive.HiveQl
 
div(Decimal, Decimal) - Method in class org.apache.spark.sql.types.Decimal.DecimalIsFractional$
 
div(Duration) - Method in class org.apache.spark.streaming.Duration
 
divide(Object) - Method in class org.apache.spark.sql.Column
Divides this expression by another expression.
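A brief sketch, assuming an existing DataFrame df with numeric "total" and "count" columns:

    // Add a derived column; df("total") / df("count") is the equivalent operator form in Scala.
    val withAverage = df.withColumn("average", df("total").divide(df("count")))
    withAverage.show()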
divide(double) - Method in class org.apache.spark.util.Vector
 
doc() - Method in class org.apache.spark.ml.param.Param
 
doCancelAllJobs() - Method in class org.apache.spark.scheduler.DAGScheduler
 
docConcentration() - Method in class org.apache.spark.mllib.clustering.LDA.EMOptimizer
 
doCheckpoint() - Method in class org.apache.spark.rdd.RDD
Performs the checkpointing of this RDD by saving it.
doCheckpoint() - Method in class org.apache.spark.rdd.RDDCheckpointData
 
DoCheckpoint - Class in org.apache.spark.streaming.scheduler
 
DoCheckpoint(Time, boolean) - Constructor for class org.apache.spark.streaming.scheduler.DoCheckpoint
 
doCleanupBroadcast(long, boolean) - Method in class org.apache.spark.ContextCleaner
Perform broadcast cleanup.
doCleanupRDD(int, boolean) - Method in class org.apache.spark.ContextCleaner
Perform RDD cleanup.
doCleanupShuffle(int, boolean) - Method in class org.apache.spark.ContextCleaner
Perform shuffle cleanup, asynchronously.
doesDirectoryContainAnyNewFiles(File, long) - Static method in class org.apache.spark.util.Utils
Determines whether a directory contains any files newer than the given cutoff, in seconds.
doKillExecutors(Seq<String>) - Method in class org.apache.spark.scheduler.cluster.YarnSchedulerBackend
Request that the ApplicationMaster kill the specified executors.
doRequestTotalExecutors(int) - Method in class org.apache.spark.scheduler.cluster.YarnSchedulerBackend
Request executors from the ApplicationMaster by specifying the total number desired.
dot(Vector, Vector) - Static method in class org.apache.spark.mllib.linalg.BLAS
dot(x, y)
dot(Vector) - Method in class org.apache.spark.util.Vector
 
DOUBLE - Class in org.apache.spark.sql.columnar
 
DOUBLE() - Constructor for class org.apache.spark.sql.columnar.DOUBLE
 
doubleAccumulator(double) - Method in class org.apache.spark.api.java.JavaSparkContext
Create an Accumulator double variable, which tasks can "add" values to using the add method.
doubleAccumulator(double, String) - Method in class org.apache.spark.api.java.JavaSparkContext
Create an Accumulator double variable, which tasks can "add" values to using the add method.
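A hedged sketch of the equivalent Scala-side pattern, assuming an existing SparkContext named sc (the Java API wraps the same mechanism):

    val errorCount = sc.accumulator(0.0, "errors")
    sc.parallelize(1 to 100).foreach { i =>
      if (i % 10 == 0) errorCount += 1.0  // tasks "add" to the accumulator
    }
    println(errorCount.value)  // only the driver reads the final value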
DoubleColumnAccessor - Class in org.apache.spark.sql.columnar
 
DoubleColumnAccessor(ByteBuffer) - Constructor for class org.apache.spark.sql.columnar.DoubleColumnAccessor
 
DoubleColumnBuilder - Class in org.apache.spark.sql.columnar
 
DoubleColumnBuilder() - Constructor for class org.apache.spark.sql.columnar.DoubleColumnBuilder
 
DoubleColumnStats - Class in org.apache.spark.sql.columnar
 
DoubleColumnStats() - Constructor for class org.apache.spark.sql.columnar.DoubleColumnStats
 
DoubleConversion() - Method in class org.apache.spark.sql.jdbc.JDBCRDD
Accessor for a nested Scala object.
DoubleFlatMapFunction<T> - Interface in org.apache.spark.api.java.function
A function that returns zero or more records of type Double from each input record.
DoubleFunction<T> - Interface in org.apache.spark.api.java.function
A function that returns Doubles, and can be used to construct DoubleRDDs.
DoubleParam - Class in org.apache.spark.ml.param
Specialized version of Param[Double] for Java.
DoubleParam(Params, String, String, Option<Object>) - Constructor for class org.apache.spark.ml.param.DoubleParam
 
DoubleParam(Params, String, String) - Constructor for class org.apache.spark.ml.param.DoubleParam
 
DoubleRDDFunctions - Class in org.apache.spark.rdd
Extra functions available on RDDs of Doubles through an implicit conversion.
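A small sketch of the methods made available by the implicit conversion, assuming an existing SparkContext named sc:

    val values = sc.parallelize(Seq(1.0, 2.0, 3.0, 4.0))
    println(values.mean())                 // 2.5
    println(values.stdev())
    println(values.histogram(2)._2.toSeq)  // counts for two equally sized buckets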
DoubleRDDFunctions(RDD<Object>) - Constructor for class org.apache.spark.rdd.DoubleRDDFunctions
 
doubleRDDToDoubleRDDFunctions(RDD<Object>) - Static method in class org.apache.spark.rdd.RDD
 
doubleRDDToDoubleRDDFunctions(RDD<Object>) - Static method in class org.apache.spark.SparkContext
 
doubleToDoubleWritable(double) - Static method in class org.apache.spark.SparkContext
 
doubleToMultiplier(double) - Static method in class org.apache.spark.util.Vector
 
DoubleType - Static variable in class org.apache.spark.sql.types.DataTypes
Gets the DoubleType object.
DoubleType - Class in org.apache.spark.sql.types
:: DeveloperApi :: The data type representing Double values.
doubleWritableConverter() - Static method in class org.apache.spark.SparkContext
 
doubleWritableConverter() - Static method in class org.apache.spark.WritableConverter
 
doubleWritableFactory() - Static method in class org.apache.spark.WritableFactory
 
driver() - Method in class org.apache.spark.scheduler.cluster.mesos.CoarseMesosSchedulerBackend
 
driver() - Method in class org.apache.spark.scheduler.cluster.mesos.MesosSchedulerBackend
 
DRIVER_AKKA_ACTOR_NAME() - Method in class org.apache.spark.storage.BlockManagerMaster
 
DRIVER_IDENTIFIER() - Static method in class org.apache.spark.SparkContext
 
driverActor() - Method in class org.apache.spark.scheduler.cluster.CoarseGrainedSchedulerBackend
 
driverActor() - Method in class org.apache.spark.storage.BlockManagerMaster
 
driverActorSystemName() - Static method in class org.apache.spark.SparkEnv
 
DriverQuirks - Class in org.apache.spark.sql.jdbc
Encapsulates workarounds for the extensions, quirks, and bugs in various databases.
DriverQuirks() - Constructor for class org.apache.spark.sql.jdbc.DriverQuirks
 
driverSideSetup() - Method in class org.apache.spark.sql.hive.SparkHiveWriterContainer
 
drop() - Method in class org.apache.spark.sql.DataFrameNaFunctions
Returns a new DataFrame that drops rows containing any null values.
drop(String) - Method in class org.apache.spark.sql.DataFrameNaFunctions
Returns a new DataFrame that drops rows containing null values.
drop(String[]) - Method in class org.apache.spark.sql.DataFrameNaFunctions
Returns a new DataFrame that drops rows containing any null values in the specified columns.
drop(Seq<String>) - Method in class org.apache.spark.sql.DataFrameNaFunctions
(Scala-specific) Returns a new DataFrame that drops rows containing any null values in the specified columns.
drop(String, String[]) - Method in class org.apache.spark.sql.DataFrameNaFunctions
Returns a new DataFrame that drops rows containing null values in the specified columns.
drop(String, Seq<String>) - Method in class org.apache.spark.sql.DataFrameNaFunctions
(Scala-specific) Returns a new DataFrame that drops rows containing null values in the specified columns.
drop(int) - Method in class org.apache.spark.sql.DataFrameNaFunctions
Returns a new DataFrame that drops rows containing fewer than minNonNulls non-null values.
drop(int, String[]) - Method in class org.apache.spark.sql.DataFrameNaFunctions
Returns a new DataFrame that drops rows containing fewer than minNonNulls non-null values in the specified columns.
drop(int, Seq<String>) - Method in class org.apache.spark.sql.DataFrameNaFunctions
(Scala-specific) Returns a new DataFrame that drops rows containing fewer than minNonNulls non-null values in the specified columns.
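A brief sketch of the most common overloads, assuming an existing DataFrame df that may contain null values; the column names are placeholders:

    val noNulls    = df.na.drop()                       // drop rows with any null value
    val namedOnly  = df.na.drop(Seq("name", "age"))     // only consider these columns
    val mostlyFull = df.na.drop(2, Seq("name", "age"))  // require at least 2 non-nulls there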
dropFromMemory(BlockId, Either<Object[], ByteBuffer>) - Method in class org.apache.spark.storage.BlockManager
Drop a block from memory, possibly putting it on disk if applicable.
droppedBlocks() - Method in class org.apache.spark.storage.PutResult
 
droppedBlocks() - Method in class org.apache.spark.storage.ResultWithDroppedBlocks
 
DropTable - Class in org.apache.spark.sql.hive.execution
Drops a table from the metastore and removes it if it is cached.
DropTable(String, boolean) - Constructor for class org.apache.spark.sql.hive.execution.DropTable
 
dropTempTable(String) - Method in class org.apache.spark.sql.SQLContext
Drops the temporary table with the given table name in the catalog.
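A minimal sketch of the register/query/drop cycle, assuming an existing SQLContext named sqlContext and DataFrame df:

    df.registerTempTable("people")
    sqlContext.sql("SELECT count(*) FROM people").show()
    sqlContext.dropTempTable("people")  // the name "people" can no longer be queried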
Dst - Static variable in class org.apache.spark.graphx.TripletFields
Expose the destination and edge fields but not the source field.
dstAttr() - Method in class org.apache.spark.graphx.EdgeContext
The vertex attribute of the edge's destination vertex.
dstAttr() - Method in class org.apache.spark.graphx.EdgeTriplet
The destination vertex attribute.
dstAttr() - Method in class org.apache.spark.graphx.impl.AggregatingEdgeContext
 
dstEncodedIndices() - Method in class org.apache.spark.ml.recommendation.ALS.InBlock
 
dstEncodedIndices() - Method in class org.apache.spark.ml.recommendation.ALS.UncompressedInBlock
 
dstId() - Method in class org.apache.spark.graphx.Edge
 
dstId() - Method in class org.apache.spark.graphx.EdgeContext
The vertex id of the edge's destination vertex.
dstId() - Method in class org.apache.spark.graphx.impl.AggregatingEdgeContext
 
dstId() - Method in class org.apache.spark.graphx.impl.EdgeWithLocalIds
 
dstIds() - Method in class org.apache.spark.ml.recommendation.ALS.RatingBlock
 
dstPtrs() - Method in class org.apache.spark.ml.recommendation.ALS.InBlock
 
dstream() - Method in class org.apache.spark.streaming.api.java.JavaDStream
 
dstream() - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
 
dstream() - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
 
DStream<T> - Class in org.apache.spark.streaming.dstream
A Discretized Stream (DStream), the basic abstraction in Spark Streaming, is a continuous sequence of RDDs (of the same type) representing a continuous stream of data (see org.apache.spark.rdd.RDD in the Spark core documentation for more details on RDDs).
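A hedged word-count sketch of working with DStreams, assuming an existing SparkContext named sc and a process writing text to port 9999 on localhost:

    import org.apache.spark.streaming.{Seconds, StreamingContext}

    val ssc = new StreamingContext(sc, Seconds(1))
    val words = ssc.socketTextStream("localhost", 9999).flatMap(_.split(" "))
    words.map(w => (w, 1)).reduceByKey(_ + _).print()  // per-batch word counts
    ssc.start()
    ssc.awaitTermination()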
DStream(StreamingContext, ClassTag<T>) - Constructor for class org.apache.spark.streaming.dstream.DStream
 
DStreamCheckpointData<T> - Class in org.apache.spark.streaming.dstream
 
DStreamCheckpointData(DStream<T>, ClassTag<T>) - Constructor for class org.apache.spark.streaming.dstream.DStreamCheckpointData
 
DStreamGraph - Class in org.apache.spark.streaming
 
DStreamGraph() - Constructor for class org.apache.spark.streaming.DStreamGraph
 
DTStatsAggregator - Class in org.apache.spark.mllib.tree.impl
DecisionTree statistics aggregator for a node.
DTStatsAggregator(DecisionTreeMetadata, Option<int[]>) - Constructor for class org.apache.spark.mllib.tree.impl.DTStatsAggregator