- abort(String) - Method in class org.apache.spark.scheduler.TaskSetManager
-
- abortStage(Stage, String) - Method in class org.apache.spark.scheduler.DAGScheduler
-
Aborts all jobs depending on a particular Stage.
- AbsoluteError - Class in org.apache.spark.mllib.tree.loss
-
:: DeveloperApi ::
Class for absolute error loss calculation (for regression).
- AbsoluteError() - Constructor for class org.apache.spark.mllib.tree.loss.AbsoluteError
-
- accept(File, String) - Method in class org.apache.spark.rdd.PipedRDD.NotEqualsFileNameFilter
-
- AcceptanceResult - Class in org.apache.spark.util.random
-
Object used by seqOp to keep track of the number of items accepted and items waitlisted per
stratum, as well as the bounds for accepting and waitlisting items.
- AcceptanceResult(long, long) - Constructor for class org.apache.spark.util.random.AcceptanceResult
-
- acceptBound() - Method in class org.apache.spark.util.random.AcceptanceResult
-
- Accumulable<R,T> - Class in org.apache.spark
-
A data type that can be accumulated, i.e. has a commutative and associative "add" operation, but where the result type, R, may be different from the element type being added, T.
- Accumulable(R, AccumulableParam<R, T>, Option<String>) - Constructor for class org.apache.spark.Accumulable
-
- Accumulable(R, AccumulableParam<R, T>) - Constructor for class org.apache.spark.Accumulable
-
- accumulable(T, AccumulableParam<T, R>) - Method in class org.apache.spark.api.java.JavaSparkContext
-
Create an Accumulable shared variable of the given type, to which tasks can "add" values with add.
- accumulable(T, String, AccumulableParam<T, R>) - Method in class org.apache.spark.api.java.JavaSparkContext
-
Create an Accumulable shared variable of the given type, to which tasks can "add" values with add.
- accumulable(R, AccumulableParam<R, T>) - Method in class org.apache.spark.SparkContext
-
Create an Accumulable shared variable, to which tasks can add values with +=.
- accumulable(R, String, AccumulableParam<R, T>) - Method in class org.apache.spark.SparkContext
-
Create an Accumulable shared variable, with a name for display in the Spark UI.
- accumulableCollection(R, Function1<R, Growable<T>>, ClassTag<R>) - Method in class org.apache.spark.SparkContext
-
Create an accumulator from a "mutable collection" type.
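A minimal sketch of accumulableCollection, assuming an existing SparkContext named sc (Spark 1.x-era Scala API):

    import scala.collection.mutable

    // Accumulate arbitrary elements into a driver-side mutable collection.
    val names = sc.accumulableCollection(mutable.ArrayBuffer[String]())
    sc.parallelize(Seq("a", "b", "c")).foreach(x => names += x)
    println(names.value)   // all added elements; ordering is not guaranteed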
- AccumulableInfo - Class in org.apache.spark.scheduler
-
:: DeveloperApi ::
Information about an Accumulable modified during a task or stage.
- AccumulableInfo(long, String, Option<String>, String) - Constructor for class org.apache.spark.scheduler.AccumulableInfo
-
- accumulableInfoFromJson(JsonAST.JValue) - Static method in class org.apache.spark.util.JsonProtocol
-
- accumulableInfoToJson(AccumulableInfo) - Static method in class org.apache.spark.util.JsonProtocol
-
- AccumulableParam<R,T> - Interface in org.apache.spark
-
Helper object defining how to accumulate values of a particular type.
- accumulables() - Method in class org.apache.spark.scheduler.StageInfo
-
Terminal values of accumulables updated during this stage.
- accumulables() - Method in class org.apache.spark.scheduler.TaskInfo
-
Intermediate updates to accumulables during this task.
- accumulables() - Method in class org.apache.spark.ui.jobs.UIData.StageUIData
-
- Accumulator<T> - Class in org.apache.spark
-
A simpler value of Accumulable where the result type being accumulated is the same as the types of elements being merged.
- Accumulator(T, AccumulatorParam<T>, Option<String>) - Constructor for class org.apache.spark.Accumulator
-
- Accumulator(T, AccumulatorParam<T>) - Constructor for class org.apache.spark.Accumulator
-
- accumulator(int) - Method in class org.apache.spark.api.java.JavaSparkContext
-
Create an Accumulator integer variable, which tasks can "add" values to using the add method.
- accumulator(int, String) - Method in class org.apache.spark.api.java.JavaSparkContext
-
Create an Accumulator integer variable, which tasks can "add" values to using the add method.
- accumulator(double) - Method in class org.apache.spark.api.java.JavaSparkContext
-
Create an Accumulator double variable, which tasks can "add" values to using the add method.
- accumulator(double, String) - Method in class org.apache.spark.api.java.JavaSparkContext
-
Create an Accumulator double variable, which tasks can "add" values to using the add method.
- accumulator(T, AccumulatorParam<T>) - Method in class org.apache.spark.api.java.JavaSparkContext
-
Create an Accumulator variable of a given type, which tasks can "add" values to using the add method.
- accumulator(T, String, AccumulatorParam<T>) - Method in class org.apache.spark.api.java.JavaSparkContext
-
Create an Accumulator variable of a given type, which tasks can "add" values to using the add method.
- accumulator(T, AccumulatorParam<T>) - Method in class org.apache.spark.SparkContext
-
Create an Accumulator variable of a given type, which tasks can "add" values to using the += method.
- accumulator(T, String, AccumulatorParam<T>) - Method in class org.apache.spark.SparkContext
-
Create an Accumulator variable of a given type, with a name for display in the Spark UI.
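A minimal sketch of a named accumulator, assuming an existing SparkContext named sc (Spark 1.x-era API); the input data is made up:

    // Named accumulators appear in the Spark UI under the stages that update them.
    val errorCount = sc.accumulator(0, "errorCount")
    sc.parallelize(Seq("ok", "error", "ok", "error")).foreach { line =>
      if (line == "error") errorCount += 1   // tasks only "add"; the value is read on the driver
    }
    println(errorCount.value)                // 2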
- accumulator() - Method in class org.apache.spark.sql.execution.PythonUDF
-
- AccumulatorParam<T> - Interface in org.apache.spark
-
A simpler version of AccumulableParam where the only data type you can add in is the same type as the accumulated value.
- Accumulators - Class in org.apache.spark
-
- Accumulators() - Constructor for class org.apache.spark.Accumulators
-
- accumUpdates() - Method in class org.apache.spark.scheduler.CompletionEvent
-
- accumUpdates() - Method in class org.apache.spark.scheduler.DirectTaskResult
-
- accuracy() - Method in class org.apache.spark.mllib.evaluation.MultilabelMetrics
-
Returns the accuracy.
- aclsEnabled() - Method in class org.apache.spark.SecurityManager
-
Check whether ACLs for the UI are enabled.
- active() - Method in class org.apache.spark.streaming.scheduler.ReceiverInfo
-
- activeExecutorIds() - Method in class org.apache.spark.scheduler.TaskSchedulerImpl
-
- ActiveJob - Class in org.apache.spark.scheduler
-
Tracks information about an active job in the DAGScheduler.
- ActiveJob(int, Stage, Function2<TaskContext, Iterator<Object>, ?>, int[], CallSite, JobListener, Properties) - Constructor for class org.apache.spark.scheduler.ActiveJob
-
- activeJobs() - Method in class org.apache.spark.scheduler.DAGScheduler
-
- activeJobs() - Method in class org.apache.spark.ui.jobs.JobProgressListener
-
- activeStages() - Method in class org.apache.spark.ui.jobs.JobProgressListener
-
- activeTasks() - Method in class org.apache.spark.ui.exec.ExecutorSummaryInfo
-
- activeTaskSets() - Method in class org.apache.spark.scheduler.TaskSchedulerImpl
-
- actor() - Method in class org.apache.spark.streaming.scheduler.ReceiverInfo
-
- ACTOR_NAME() - Static method in class org.apache.spark.scheduler.cluster.CoarseGrainedSchedulerBackend
-
- ACTOR_NAME() - Static method in class org.apache.spark.scheduler.cluster.YarnSchedulerBackend
-
- ActorHelper - Interface in org.apache.spark.streaming.receiver
-
:: DeveloperApi ::
A receiver trait to be mixed in with your Actor to gain access to the API for pushing received data into Spark Streaming so that it can be processed.
- ActorLogReceive - Interface in org.apache.spark.util
-
A trait to enable logging all Akka actor messages.
- ActorReceiver<T> - Class in org.apache.spark.streaming.receiver
-
Provides Actors as receivers for receiving streams.
- ActorReceiver(Props, String, StorageLevel, SupervisorStrategy, ClassTag<T>) - Constructor for class org.apache.spark.streaming.receiver.ActorReceiver
-
- ActorReceiver.Supervisor - Class in org.apache.spark.streaming.receiver
-
- ActorReceiver.Supervisor() - Constructor for class org.apache.spark.streaming.receiver.ActorReceiver.Supervisor
-
- ActorReceiverData - Interface in org.apache.spark.streaming.receiver
-
Case class to receive data sent by child actors
- actorStream(Props, String, StorageLevel, SupervisorStrategy) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
-
Create an input stream with any arbitrary user implemented actor receiver.
- actorStream(Props, String, StorageLevel) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
-
Create an input stream with any arbitrary user implemented actor receiver.
- actorStream(Props, String) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
-
Create an input stream with any arbitrary user implemented actor receiver.
- actorStream(Props, String, StorageLevel, SupervisorStrategy, ClassTag<T>) - Method in class org.apache.spark.streaming.StreamingContext
-
Create an input stream with any arbitrary user implemented actor receiver.
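A minimal sketch of a custom actor receiver, assuming an existing StreamingContext named ssc (Spark 1.x Akka-based receiver API); the WordReceiver class is hypothetical:

    import akka.actor.{Actor, Props}
    import org.apache.spark.streaming.receiver.ActorHelper

    // ActorHelper provides store(), which pushes received data into Spark Streaming.
    class WordReceiver extends Actor with ActorHelper {
      def receive = {
        case s: String => store(s)
      }
    }

    val lines = ssc.actorStream[String](Props[WordReceiver], "WordReceiver")
    lines.print()
    ssc.start()
    ssc.awaitTermination()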
- ActorSupervisorStrategy - Class in org.apache.spark.streaming.receiver
-
:: DeveloperApi ::
A helper with a set of defaults for the supervisor strategy.
- ActorSupervisorStrategy() - Constructor for class org.apache.spark.streaming.receiver.ActorSupervisorStrategy
-
- actorSystem() - Method in class org.apache.spark.scheduler.cluster.CoarseGrainedSchedulerBackend
-
- actorSystem() - Method in class org.apache.spark.SparkEnv
-
- actualSize(Row, int) - Method in class org.apache.spark.sql.columnar.ByteArrayColumnType
-
- actualSize(Row, int) - Method in class org.apache.spark.sql.columnar.ColumnType
-
Returns the size of the value row(ordinal).
- actualSize(Row, int) - Static method in class org.apache.spark.sql.columnar.STRING
-
- add(T) - Method in class org.apache.spark.Accumulable
-
Add more data to this accumulator / accumulable
- add(Map<Object, Object>) - Static method in class org.apache.spark.Accumulators
-
- add(long, long, ED) - Method in class org.apache.spark.graphx.impl.EdgePartitionBuilder
-
Add a new edge to the partition.
- add(long, long, int, int, ED) - Method in class org.apache.spark.graphx.impl.ExistingEdgePartitionBuilder
-
Add a new edge to the partition.
- add(Vector) - Method in class org.apache.spark.mllib.feature.IDF.DocumentFrequencyAggregator
-
Adds a new document.
- add(Vector) - Method in class org.apache.spark.mllib.stat.MultivariateOnlineSummarizer
-
Add a new sample to this summarizer, and update the statistical summary.
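A minimal sketch of building a summary with add and merge over an RDD of vectors, assuming an existing SparkContext named sc:

    import org.apache.spark.mllib.linalg.Vectors
    import org.apache.spark.mllib.stat.MultivariateOnlineSummarizer

    val vectors = sc.parallelize(Seq(Vectors.dense(1.0, 2.0), Vectors.dense(3.0, 4.0)))
    val summary = vectors.aggregate(new MultivariateOnlineSummarizer)(
      (summ, v) => summ.add(v),     // fold one sample into a partition-local summarizer
      (s1, s2) => s1.merge(s2))     // merge summarizers across partitions
    println(summary.mean)           // column-wise mean
    println(summary.variance)       // column-wise variance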
- add(ImpurityCalculator) - Method in class org.apache.spark.mllib.tree.impurity.ImpurityCalculator
-
Add the stats from another calculator into this one, modifying and returning this calculator.
- add(long, long) - Static method in class org.apache.spark.streaming.util.RawTextHelper
-
- add(Vector) - Method in class org.apache.spark.util.Vector
-
- addAccumulator(R, T) - Method in interface org.apache.spark.AccumulableParam
-
Add additional data to the accumulator value.
- addAccumulator(T, T) - Method in interface org.apache.spark.AccumulatorParam
-
- addAccumulator(R, T) - Method in class org.apache.spark.GrowableAccumulableParam
-
- addBinary(Binary) - Method in class org.apache.spark.sql.parquet.CatalystPrimitiveConverter
-
- addBlock(BlockId, BlockStatus) - Method in class org.apache.spark.storage.StorageStatus
-
Add the given block to this storage status.
- AddBlock - Class in org.apache.spark.streaming.scheduler
-
- AddBlock(ReceivedBlockInfo) - Constructor for class org.apache.spark.streaming.scheduler.AddBlock
-
- addBlock(ReceivedBlockInfo) - Method in class org.apache.spark.streaming.scheduler.ReceivedBlockTracker
-
Add received block.
- addBoolean(boolean) - Method in class org.apache.spark.sql.parquet.CatalystPrimitiveConverter
-
- addData(Object) - Method in class org.apache.spark.streaming.receiver.BlockGenerator
-
Push a single data item into the buffer.
- addDataWithCallback(Object, Object) - Method in class org.apache.spark.streaming.receiver.BlockGenerator
-
Push a single data item into the buffer.
- addDouble(double) - Method in class org.apache.spark.sql.parquet.CatalystPrimitiveConverter
-
- addedFiles() - Method in class org.apache.spark.SparkContext
-
- addedJars() - Method in class org.apache.spark.SparkContext
-
- AddExchange - Class in org.apache.spark.sql.execution
-
Ensures that the Partitioning of input data meets the Distribution requirements for each operator by inserting Exchange operators where required.
- AddExchange(SQLContext) - Constructor for class org.apache.spark.sql.execution.AddExchange
-
- addFile(String) - Method in class org.apache.spark.api.java.JavaSparkContext
-
Add a file to be downloaded with this Spark job on every node.
- addFile(File) - Method in class org.apache.spark.HttpFileServer
-
- addFile(String) - Method in class org.apache.spark.SparkContext
-
Add a file to be downloaded with this Spark job on every node.
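A minimal sketch of addFile together with SparkFiles.get, assuming an existing SparkContext named sc; the file path is hypothetical:

    import org.apache.spark.SparkFiles

    sc.addFile("/tmp/lookup.txt")                  // shipped to every node before tasks run
    val firstLines = sc.parallelize(1 to 2).map { _ =>
      val path = SparkFiles.get("lookup.txt")      // local path of the file on the executor
      scala.io.Source.fromFile(path).getLines().next()
    }
    println(firstLines.collect().toSeq)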
- AddFile - Class in org.apache.spark.sql.hive
-
- AddFile(String) - Constructor for class org.apache.spark.sql.hive.AddFile
-
- AddFile - Class in org.apache.spark.sql.hive.execution
-
:: DeveloperApi ::
- AddFile(String) - Constructor for class org.apache.spark.sql.hive.execution.AddFile
-
- addFileToDir(File, File) - Method in class org.apache.spark.HttpFileServer
-
- addFilters(Seq<ServletContextHandler>, SparkConf) - Static method in class org.apache.spark.ui.JettyUtils
-
Add filters, if any, to the given list of ServletContextHandlers
- addFloat(float) - Method in class org.apache.spark.sql.parquet.CatalystPrimitiveConverter
-
- addGrid(Param<T>, Iterable<T>) - Method in class org.apache.spark.ml.tuning.ParamGridBuilder
-
Adds a param with multiple values (overwrites if the input param exists).
- addGrid(DoubleParam, double[]) - Method in class org.apache.spark.ml.tuning.ParamGridBuilder
-
Adds a double param with multiple values.
- addGrid(IntParam, int[]) - Method in class org.apache.spark.ml.tuning.ParamGridBuilder
-
Adds an int param with multiple values.
- addGrid(FloatParam, float[]) - Method in class org.apache.spark.ml.tuning.ParamGridBuilder
-
Adds a float param with multiple values.
- addGrid(LongParam, long[]) - Method in class org.apache.spark.ml.tuning.ParamGridBuilder
-
Adds a long param with multiple values.
- addGrid(BooleanParam) - Method in class org.apache.spark.ml.tuning.ParamGridBuilder
-
Adds a boolean param with true and false.
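A minimal sketch of ParamGridBuilder, assuming an estimator named lr (e.g. a spark.ml LogisticRegression) whose regParam and maxIter params are being tuned:

    import org.apache.spark.ml.tuning.ParamGridBuilder

    val paramGrid = new ParamGridBuilder()
      .addGrid(lr.regParam, Array(0.01, 0.1))   // double param with multiple values
      .addGrid(lr.maxIter, Array(10, 50))       // int param with multiple values
      .build()                                  // one ParamMap per combination (4 here)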
- addInPlace(R, R) - Method in interface org.apache.spark.AccumulableParam
-
Merge two accumulated values together.
- addInPlace(R, R) - Method in class org.apache.spark.GrowableAccumulableParam
-
- addInPlace(double, double) - Method in class org.apache.spark.SparkContext.DoubleAccumulatorParam$
-
- addInPlace(float, float) - Method in class org.apache.spark.SparkContext.FloatAccumulatorParam$
-
- addInPlace(int, int) - Method in class org.apache.spark.SparkContext.IntAccumulatorParam$
-
- addInPlace(long, long) - Method in class org.apache.spark.SparkContext.LongAccumulatorParam$
-
- addInPlace(Vector) - Method in class org.apache.spark.util.Vector
-
- addInPlace(Vector, Vector) - Method in class org.apache.spark.util.Vector.VectorAccumParam$
-
- addInputStream(InputDStream<?>) - Method in class org.apache.spark.streaming.DStreamGraph
-
- addInt(int) - Method in class org.apache.spark.sql.parquet.CatalystPrimitiveConverter
-
- addJar(String) - Method in class org.apache.spark.api.java.JavaSparkContext
-
Adds a JAR dependency for all tasks to be executed on this SparkContext in the future.
- addJar(File) - Method in class org.apache.spark.HttpFileServer
-
- addJar(String) - Method in class org.apache.spark.SparkContext
-
Adds a JAR dependency for all tasks to be executed on this SparkContext in the future.
- AddJar - Class in org.apache.spark.sql.hive
-
- AddJar(String) - Constructor for class org.apache.spark.sql.hive.AddJar
-
- AddJar - Class in org.apache.spark.sql.hive.execution
-
:: DeveloperApi ::
- AddJar(String) - Constructor for class org.apache.spark.sql.hive.execution.AddJar
-
- addListener(SparkListener) - Method in interface org.apache.spark.scheduler.SparkListenerBus
-
- addListener(StreamingListener) - Method in class org.apache.spark.streaming.scheduler.StreamingListenerBus
-
- addLocalConfiguration(String, int, int, int, JobConf) - Static method in class org.apache.spark.rdd.HadoopRDD
-
Add Hadoop configuration specific to a single partition and attempt.
- addLong(long) - Method in class org.apache.spark.sql.parquet.CatalystPrimitiveConverter
-
- addOnCompleteCallback(Function0<Unit>) - Method in class org.apache.spark.TaskContext
-
Deprecated.
- addOnCompleteCallback(Function0<BoxedUnit>) - Method in class org.apache.spark.TaskContextImpl
-
- addOutputLoc(int, MapStatus) - Method in class org.apache.spark.scheduler.Stage
-
- addOutputStream(DStream<?>) - Method in class org.apache.spark.streaming.DStreamGraph
-
- addPartitioningAttributes(Seq<Attribute>) - Method in class org.apache.spark.sql.hive.HiveStrategies.ParquetConversion.LogicalPlanHacks
-
- addPartToPGroup(Partition, PartitionGroup) - Method in class org.apache.spark.rdd.PartitionCoalescer
-
- address() - Method in class org.apache.spark.storage.ShuffleBlockFetcherIterator.FetchRequest
-
- addresses() - Method in class org.apache.spark.streaming.flume.FlumePollingInputDStream
-
- addRunningTask(long) - Method in class org.apache.spark.scheduler.TaskSetManager
-
If the given task ID is not in the set of running tasks, adds it.
- addSchedulable(Schedulable) - Method in class org.apache.spark.scheduler.Pool
-
- addSchedulable(Schedulable) - Method in interface org.apache.spark.scheduler.Schedulable
-
- addSchedulable(Schedulable) - Method in class org.apache.spark.scheduler.TaskSetManager
-
- addSparkListener(SparkListener) - Method in class org.apache.spark.SparkContext
-
:: DeveloperApi ::
Register a listener to receive up-calls from events that happen during execution.
- addStreamingListener(StreamingListener) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
-
- addStreamingListener(StreamingListener) - Method in class org.apache.spark.streaming.StreamingContext
-
- addTaskCompletionListener(TaskCompletionListener) - Method in class org.apache.spark.TaskContext
-
Add a (Java friendly) listener to be executed on task completion.
- addTaskCompletionListener(Function1<TaskContext, Unit>) - Method in class org.apache.spark.TaskContext
-
Add a listener in the form of a Scala closure to be executed on task completion.
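A minimal sketch of registering a completion callback from inside a task, assuming Spark 1.2+ where TaskContext.get() is available, an existing RDD named rdd, and hypothetical openSomeResource/process helpers:

    import org.apache.spark.TaskContext

    rdd.mapPartitions { iter =>
      val resource = openSomeResource()                   // hypothetical per-task resource
      TaskContext.get().addTaskCompletionListener { _ =>
        resource.close()                                  // runs when the task completes
      }
      iter.map(process)                                   // hypothetical per-element processing
    }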
- addTaskCompletionListener(TaskCompletionListener) - Method in class org.apache.spark.TaskContextImpl
-
- addTaskCompletionListener(Function1<TaskContext, BoxedUnit>) - Method in class org.apache.spark.TaskContextImpl
-
- addTaskSetManager(Schedulable, Properties) - Method in class org.apache.spark.scheduler.FairSchedulableBuilder
-
- addTaskSetManager(Schedulable, Properties) - Method in class org.apache.spark.scheduler.FIFOSchedulableBuilder
-
- addTaskSetManager(Schedulable, Properties) - Method in interface org.apache.spark.scheduler.SchedulableBuilder
-
- addToTime(long) - Method in class org.apache.spark.streaming.util.ManualClock
-
- adminAcls() - Method in class org.apache.spark.scheduler.ApplicationEventListener
-
- advanceCheckpoint() - Method in class org.apache.spark.streaming.kinesis.KinesisCheckpointState
-
Advance the checkpoint clock by the checkpoint interval.
- aggregate(U, Function2<U, T, U>, Function2<U, U, U>) - Method in interface org.apache.spark.api.java.JavaRDDLike
-
Aggregate the elements of each partition, and then the results for all the partitions, using
given combine functions and a neutral "zero value".
- aggregate(U, Function2<U, T, U>, Function2<U, U, U>, ClassTag<U>) - Method in class org.apache.spark.rdd.RDD
-
Aggregate the elements of each partition, and then the results for all the partitions, using
given combine functions and a neutral "zero value".
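A minimal sketch of RDD.aggregate computing a sum and a count in one pass, assuming an existing SparkContext named sc:

    val data = sc.parallelize(1 to 100)
    val (sum, count) = data.aggregate((0, 0))(
      (acc, x) => (acc._1 + x, acc._2 + 1),     // seqOp: fold an element into a partition-local pair
      (a, b)   => (a._1 + b._1, a._2 + b._2))   // combOp: merge the per-partition pairs
    println(sum.toDouble / count)               // mean of 1..100 = 50.5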
- Aggregate - Class in org.apache.spark.sql.execution
-
:: DeveloperApi ::
Groups input data by groupingExpressions and computes the aggregateExpressions for each group.
- Aggregate(boolean, Seq<Expression>, Seq<NamedExpression>, SparkPlan) - Constructor for class org.apache.spark.sql.execution.Aggregate
-
- aggregate() - Method in class org.apache.spark.sql.execution.Aggregate.ComputedAggregate
-
- aggregate(Seq<Expression>) - Method in class org.apache.spark.sql.SchemaRDD
-
Performs an aggregation over all Rows in this RDD.
- Aggregate.ComputedAggregate - Class in org.apache.spark.sql.execution
-
An aggregate that needs to be computed for each row in a group.
- Aggregate.ComputedAggregate(AggregateExpression, AggregateExpression, AttributeReference) - Constructor for class org.apache.spark.sql.execution.Aggregate.ComputedAggregate
-
- Aggregate.ComputedAggregate$ - Class in org.apache.spark.sql.execution
-
- Aggregate.ComputedAggregate$() - Constructor for class org.apache.spark.sql.execution.Aggregate.ComputedAggregate$
-
- aggregateByKey(U, Partitioner, Function2<U, V, U>, Function2<U, U, U>) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Aggregate the values of each key, using given combine functions and a neutral "zero value".
- aggregateByKey(U, int, Function2<U, V, U>, Function2<U, U, U>) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Aggregate the values of each key, using given combine functions and a neutral "zero value".
- aggregateByKey(U, Function2<U, V, U>, Function2<U, U, U>) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Aggregate the values of each key, using given combine functions and a neutral "zero value".
- aggregateByKey(U, Partitioner, Function2<U, V, U>, Function2<U, U, U>, ClassTag<U>) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
Aggregate the values of each key, using given combine functions and a neutral "zero value".
- aggregateByKey(U, int, Function2<U, V, U>, Function2<U, U, U>, ClassTag<U>) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
Aggregate the values of each key, using given combine functions and a neutral "zero value".
- aggregateByKey(U, Function2<U, V, U>, Function2<U, U, U>, ClassTag<U>) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
Aggregate the values of each key, using given combine functions and a neutral "zero value".
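A minimal sketch of aggregateByKey computing a per-key average, assuming an existing SparkContext named sc:

    val pairs = sc.parallelize(Seq(("a", 1), ("a", 3), ("b", 5)))
    val sumCount = pairs.aggregateByKey((0, 0))(
      (acc, v) => (acc._1 + v, acc._2 + 1),     // fold a value into the (sum, count) for its key
      (x, y)   => (x._1 + y._1, x._2 + y._2))   // merge (sum, count) pairs across partitions
    val averages = sumCount.mapValues { case (s, c) => s.toDouble / c }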
- AggregateEvaluation - Class in org.apache.spark.sql.execution
-
- AggregateEvaluation(Seq<Attribute>, Seq<Expression>, Seq<Expression>, Expression) - Constructor for class org.apache.spark.sql.execution.AggregateEvaluation
-
- aggregateExpressions() - Method in class org.apache.spark.sql.execution.Aggregate
-
- aggregateExpressions() - Method in class org.apache.spark.sql.execution.GeneratedAggregate
-
- aggregateMessages(Function1<EdgeContext<VD, ED, A>, BoxedUnit>, Function2<A, A, A>, TripletFields, ClassTag<A>) - Method in class org.apache.spark.graphx.Graph
-
Aggregates values from the neighboring edges and vertices of each vertex.
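A minimal sketch of aggregateMessages computing in-degrees, assuming an existing GraphX graph named graph:

    import org.apache.spark.graphx.TripletFields

    val inDegrees = graph.aggregateMessages[Int](
      ctx => ctx.sendToDst(1),   // send a 1 along every edge to its destination vertex
      _ + _,                     // sum the messages arriving at each vertex
      TripletFields.None)        // no source, edge, or destination attributes are needed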
- aggregateMessagesEdgeScan(Function1<EdgeContext<VD, ED, A>, BoxedUnit>, Function2<A, A, A>, TripletFields, EdgeActiveness, ClassTag<A>) - Method in class org.apache.spark.graphx.impl.EdgePartition
-
Send messages along edges and aggregate them at the receiving vertices.
- aggregateMessagesIndexScan(Function1<EdgeContext<VD, ED, A>, BoxedUnit>, Function2<A, A, A>, TripletFields, EdgeActiveness, ClassTag<A>) - Method in class org.apache.spark.graphx.impl.EdgePartition
-
Send messages along edges and aggregate them at the receiving vertices.
- aggregateMessagesWithActiveSet(Function1<EdgeContext<VD, ED, A>, BoxedUnit>, Function2<A, A, A>, TripletFields, Option<Tuple2<VertexRDD<?>, EdgeDirection>>, ClassTag<A>) - Method in class org.apache.spark.graphx.Graph
-
Aggregates values from the neighboring edges and vertices of each vertex.
- aggregateMessagesWithActiveSet(Function1<EdgeContext<VD, ED, A>, BoxedUnit>, Function2<A, A, A>, TripletFields, Option<Tuple2<VertexRDD<?>, EdgeDirection>>, ClassTag<A>) - Method in class org.apache.spark.graphx.impl.GraphImpl
-
- aggregateSizeForNode(DecisionTreeMetadata, Option<int[]>) - Static method in class org.apache.spark.mllib.tree.RandomForest
-
Get the number of values to be stored for this node in the bin aggregates.
- aggregateUsingIndex(Iterator<Product2<Object, VD2>>, Function2<VD2, VD2, VD2>, ClassTag<VD2>) - Method in class org.apache.spark.graphx.impl.VertexPartitionBaseOps
-
- aggregateUsingIndex(RDD<Tuple2<Object, VD2>>, Function2<VD2, VD2, VD2>, ClassTag<VD2>) - Method in class org.apache.spark.graphx.impl.VertexRDDImpl
-
- aggregateUsingIndex(RDD<Tuple2<Object, VD2>>, Function2<VD2, VD2, VD2>, ClassTag<VD2>) - Method in class org.apache.spark.graphx.VertexRDD
-
Aggregates vertices in messages that have the same ids using reduceFunc, returning a VertexRDD co-indexed with this.
- AggregatingEdgeContext<VD,ED,A> - Class in org.apache.spark.graphx.impl
-
- AggregatingEdgeContext(Function2<A, A, A>, Object, BitSet) - Constructor for class org.apache.spark.graphx.impl.AggregatingEdgeContext
-
- Aggregator<K,V,C> - Class in org.apache.spark
-
:: DeveloperApi ::
A set of functions used to aggregate data.
- Aggregator(Function1<V, C>, Function2<C, V, C>, Function2<C, C, C>) - Constructor for class org.apache.spark.Aggregator
-
- aggregator() - Method in class org.apache.spark.ShuffleDependency
-
- AkkaUtils - Class in org.apache.spark.util
-
Various utility classes for working with Akka.
- AkkaUtils() - Constructor for class org.apache.spark.util.AkkaUtils
-
- Algo - Class in org.apache.spark.mllib.tree.configuration
-
:: Experimental ::
Enum to select the algorithm for the decision tree
- Algo() - Constructor for class org.apache.spark.mllib.tree.configuration.Algo
-
- algo() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
-
- algo() - Method in class org.apache.spark.mllib.tree.model.DecisionTreeModel
-
- algo() - Method in class org.apache.spark.mllib.tree.model.GradientBoostedTreesModel
-
- algo() - Method in class org.apache.spark.mllib.tree.model.RandomForestModel
-
- algorithm() - Method in class org.apache.spark.mllib.regression.StreamingLinearRegressionWithSGD
-
- alias() - Method in class org.apache.spark.sql.hive.MetastoreRelation
-
- aliasNames() - Method in class org.apache.spark.sql.hive.HiveGenericUdtf
-
- All - Static variable in class org.apache.spark.graphx.TripletFields
-
Expose all the fields (source, edge, and destination).
- allAggregates(Seq<Expression>) - Method in class org.apache.spark.sql.execution.SparkStrategies.HashAggregation
-
- AllCompressionSchemes - Interface in org.apache.spark.sql.columnar.compression
-
- AllJobsCancelled - Class in org.apache.spark.scheduler
-
- AllJobsCancelled() - Constructor for class org.apache.spark.scheduler.AllJobsCancelled
-
- AllJobsPage - Class in org.apache.spark.ui.jobs
-
Page showing list of all ongoing and recently finished jobs
- AllJobsPage(JobsTab) - Constructor for class org.apache.spark.ui.jobs.AllJobsPage
-
- allJoinTokens() - Static method in class org.apache.spark.sql.hive.HiveQl
-
- allocateBlocksToBatch(Time) - Method in class org.apache.spark.streaming.scheduler.ReceivedBlockTracker
-
Allocate all unallocated blocks to the given batch.
- allocateBlocksToBatch(Time) - Method in class org.apache.spark.streaming.scheduler.ReceiverTracker
-
Allocate all unallocated blocks to the given batch.
- AllocatedBlocks - Class in org.apache.spark.streaming.scheduler
-
Class representing the blocks of all the streams allocated to a batch
- AllocatedBlocks(Map<Object, Seq<ReceivedBlockInfo>>) - Constructor for class org.apache.spark.streaming.scheduler.AllocatedBlocks
-
- allocatedBlocks() - Method in class org.apache.spark.streaming.scheduler.BatchAllocationEvent
-
- allowExisting() - Method in class org.apache.spark.sql.hive.execution.CreateTableAsSelect
-
- allowLocal() - Method in class org.apache.spark.scheduler.JobSubmitted
-
- allPendingTasks() - Method in class org.apache.spark.scheduler.TaskSetManager
-
- AllStagesPage - Class in org.apache.spark.ui.jobs
-
Page showing list of all ongoing and recently finished stages and pools
- AllStagesPage(StagesTab) - Constructor for class org.apache.spark.ui.jobs.AllStagesPage
-
- AlphaComponent - Annotation Type in org.apache.spark.annotation
-
A new component of Spark which may have unstable APIs.
- alreadyPlanned() - Method in class org.apache.spark.sql.execution.SparkLogicalPlan
-
- ALS - Class in org.apache.spark.mllib.recommendation
-
Alternating Least Squares matrix factorization.
- ALS() - Constructor for class org.apache.spark.mllib.recommendation.ALS
-
Constructs an ALS instance with default parameters: {numBlocks: -1, rank: 10, iterations: 10,
lambda: 0.01, implicitPrefs: false, alpha: 1.0}.
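A minimal sketch of training an ALS model (mllib.recommendation), assuming an existing SparkContext named sc; the ratings are made up:

    import org.apache.spark.mllib.recommendation.{ALS, Rating}

    val ratings = sc.parallelize(Seq(
      Rating(1, 10, 4.0), Rating(1, 20, 3.0), Rating(2, 10, 5.0)))
    val model = ALS.train(ratings, rank = 10, iterations = 10, lambda = 0.01)
    val predicted = model.predict(2, 20)   // predicted rating of product 20 by user 2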
- ALS.BlockStats - Class in org.apache.spark.mllib.recommendation
-
:: DeveloperApi ::
Statistics of a block in ALS computation.
- ALS.BlockStats(String, int, long, long, long, long) - Constructor for class org.apache.spark.mllib.recommendation.ALS.BlockStats
-
- ALS.BlockStats$ - Class in org.apache.spark.mllib.recommendation
-
- ALS.BlockStats$() - Constructor for class org.apache.spark.mllib.recommendation.ALS.BlockStats$
-
- ALSPartitioner - Class in org.apache.spark.mllib.recommendation
-
Partitioner for ALS.
- ALSPartitioner(int) - Constructor for class org.apache.spark.mllib.recommendation.ALSPartitioner
-
- analyze(String) - Method in class org.apache.spark.sql.hive.HiveContext
-
Analyzes the given table in the current database to generate statistics, which will be
used in query optimizations.
- analyzeBlocks(RDD<Rating>, int, int) - Static method in class org.apache.spark.mllib.recommendation.ALS
-
:: DeveloperApi ::
Given an RDD of ratings, number of user blocks, and number of product blocks, computes the
statistics of each block in ALS computation.
- analyzed() - Method in class org.apache.spark.sql.hive.test.TestHiveContext.QueryExecution
-
- AnalyzeTable - Class in org.apache.spark.sql.hive
-
- AnalyzeTable(String) - Constructor for class org.apache.spark.sql.hive.AnalyzeTable
-
- AnalyzeTable - Class in org.apache.spark.sql.hive.execution
-
:: DeveloperApi ::
Analyzes the given table in the current database to generate statistics, which will be
used in query optimizations.
- AnalyzeTable(String) - Constructor for class org.apache.spark.sql.hive.execution.AnalyzeTable
-
- AND() - Static method in class org.apache.spark.sql.hive.HiveQl
-
- ANY() - Static method in class org.apache.spark.scheduler.TaskLocality
-
- append(boolean, ByteBuffer) - Static method in class org.apache.spark.sql.columnar.BOOLEAN
-
- append(Row, int, ByteBuffer) - Static method in class org.apache.spark.sql.columnar.BOOLEAN
-
- append(byte, ByteBuffer) - Static method in class org.apache.spark.sql.columnar.BYTE
-
- append(Row, int, ByteBuffer) - Static method in class org.apache.spark.sql.columnar.BYTE
-
- append(byte[], ByteBuffer) - Method in class org.apache.spark.sql.columnar.ByteArrayColumnType
-
- append(JvmType, ByteBuffer) - Method in class org.apache.spark.sql.columnar.ColumnType
-
Appends the given value v of type T into the given ByteBuffer.
- append(Row, int, ByteBuffer) - Method in class org.apache.spark.sql.columnar.ColumnType
-
Appends row(ordinal) of type T into the given ByteBuffer.
- append(Date, ByteBuffer) - Static method in class org.apache.spark.sql.columnar.DATE
-
- append(double, ByteBuffer) - Static method in class org.apache.spark.sql.columnar.DOUBLE
-
- append(Row, int, ByteBuffer) - Static method in class org.apache.spark.sql.columnar.DOUBLE
-
- append(float, ByteBuffer) - Static method in class org.apache.spark.sql.columnar.FLOAT
-
- append(Row, int, ByteBuffer) - Static method in class org.apache.spark.sql.columnar.FLOAT
-
- append(int, ByteBuffer) - Static method in class org.apache.spark.sql.columnar.INT
-
- append(Row, int, ByteBuffer) - Static method in class org.apache.spark.sql.columnar.INT
-
- append(long, ByteBuffer) - Static method in class org.apache.spark.sql.columnar.LONG
-
- append(Row, int, ByteBuffer) - Static method in class org.apache.spark.sql.columnar.LONG
-
- append(short, ByteBuffer) - Static method in class org.apache.spark.sql.columnar.SHORT
-
- append(Row, int, ByteBuffer) - Static method in class org.apache.spark.sql.columnar.SHORT
-
- append(String, ByteBuffer) - Static method in class org.apache.spark.sql.columnar.STRING
-
- append(Timestamp, ByteBuffer) - Static method in class org.apache.spark.sql.columnar.TIMESTAMP
-
- append(AvroFlumeEvent) - Method in class org.apache.spark.streaming.flume.FlumeEventServer
-
- appendBatch(List<AvroFlumeEvent>) - Method in class org.apache.spark.streaming.flume.FlumeEventServer
-
- appendBias(Vector) - Static method in class org.apache.spark.mllib.util.MLUtils
-
Returns a new vector with 1.0 (bias) appended to the input vector.
- appendFrom(Row, int) - Method in class org.apache.spark.sql.columnar.BasicColumnBuilder
-
- appendFrom(Row, int) - Method in interface org.apache.spark.sql.columnar.ColumnBuilder
-
Appends row(ordinal) to the column builder.
- appendFrom(Row, int) - Method in interface org.apache.spark.sql.columnar.compression.CompressibleColumnBuilder
-
- appendFrom(Row, int) - Method in interface org.apache.spark.sql.columnar.NullableColumnBuilder
-
- AppendingParquetOutputFormat - Class in org.apache.spark.sql.parquet
-
TODO: this will be able to append to directories it created itself, not necessarily
to imported ones.
- AppendingParquetOutputFormat(int) - Constructor for class org.apache.spark.sql.parquet.AppendingParquetOutputFormat
-
- appendReadColumns(Configuration, Seq<Integer>, Seq<String>) - Static method in class org.apache.spark.sql.hive.HiveShim
-
- appId() - Method in class org.apache.spark.scheduler.ApplicationEventListener
-
- appId() - Method in class org.apache.spark.scheduler.cluster.mesos.CoarseMesosSchedulerBackend
-
- appId() - Method in class org.apache.spark.scheduler.cluster.mesos.MesosSchedulerBackend
-
- appId() - Method in class org.apache.spark.scheduler.cluster.SparkDeploySchedulerBackend
-
- appId() - Method in interface org.apache.spark.scheduler.SchedulerBackend
-
- appId() - Method in class org.apache.spark.scheduler.SparkListenerApplicationStart
-
- appId() - Method in interface org.apache.spark.scheduler.TaskScheduler
-
- APPLICATION_COMPLETE() - Static method in class org.apache.spark.scheduler.EventLoggingListener
-
- applicationComplete() - Method in class org.apache.spark.scheduler.EventLoggingInfo
-
- applicationEndFromJson(JsonAST.JValue) - Static method in class org.apache.spark.util.JsonProtocol
-
- applicationEndToJson(SparkListenerApplicationEnd) - Static method in class org.apache.spark.util.JsonProtocol
-
- ApplicationEventListener - Class in org.apache.spark.scheduler
-
A simple listener for application events.
- ApplicationEventListener() - Constructor for class org.apache.spark.scheduler.ApplicationEventListener
-
- applicationId() - Method in class org.apache.spark.scheduler.cluster.mesos.CoarseMesosSchedulerBackend
-
- applicationId() - Method in class org.apache.spark.scheduler.cluster.mesos.MesosSchedulerBackend
-
- applicationId() - Method in class org.apache.spark.scheduler.cluster.SparkDeploySchedulerBackend
-
- applicationId() - Method in class org.apache.spark.scheduler.local.LocalBackend
-
- applicationId() - Method in interface org.apache.spark.scheduler.SchedulerBackend
-
Get an application ID associated with the job.
- applicationId() - Method in interface org.apache.spark.scheduler.TaskScheduler
-
Get an application ID associated with the job.
- applicationId() - Method in class org.apache.spark.scheduler.TaskSchedulerImpl
-
- applicationId() - Method in class org.apache.spark.SparkContext
-
- applicationStartFromJson(JsonAST.JValue) - Static method in class org.apache.spark.util.JsonProtocol
-
- applicationStartToJson(SparkListenerApplicationStart) - Static method in class org.apache.spark.util.JsonProtocol
-
- apply(RDD<Tuple2<Object, VD>>, RDD<Edge<ED>>, VD, StorageLevel, StorageLevel, ClassTag<VD>, ClassTag<ED>) - Static method in class org.apache.spark.graphx.Graph
-
Construct a graph from a collection of vertices and
edges with attributes.
- apply(RDD<Edge<ED>>, VD, StorageLevel, StorageLevel, ClassTag<VD>, ClassTag<ED>) - Static method in class org.apache.spark.graphx.impl.GraphImpl
-
Create a graph from edges, setting referenced vertices to `defaultVertexAttr`.
- apply(RDD<Tuple2<Object, VD>>, RDD<Edge<ED>>, VD, StorageLevel, StorageLevel, ClassTag<VD>, ClassTag<ED>) - Static method in class org.apache.spark.graphx.impl.GraphImpl
-
Create a graph from vertices and edges, setting missing vertices to `defaultVertexAttr`.
- apply(VertexRDD<VD>, EdgeRDD<ED>, ClassTag<VD>, ClassTag<ED>) - Static method in class org.apache.spark.graphx.impl.GraphImpl
-
Create a graph from a VertexRDD and an EdgeRDD with arbitrary replicated vertices.
- apply(Iterator<Tuple2<Object, VD>>, ClassTag<VD>) - Static method in class org.apache.spark.graphx.impl.ShippableVertexPartition
-
Construct a `ShippableVertexPartition` from the given vertices without any routing table.
- apply(Iterator<Tuple2<Object, VD>>, RoutingTablePartition, VD, ClassTag<VD>) - Static method in class org.apache.spark.graphx.impl.ShippableVertexPartition
-
Construct a ShippableVertexPartition from the given vertices with the specified routing table, filling in missing vertices mentioned in the routing table using defaultVal.
- apply(Iterator<Tuple2<Object, VD>>, RoutingTablePartition, VD, Function2<VD, VD, VD>, ClassTag<VD>) - Static method in class org.apache.spark.graphx.impl.ShippableVertexPartition
-
Construct a ShippableVertexPartition from the given vertices with the specified routing table, filling in missing vertices mentioned in the routing table using defaultVal, and merging duplicate vertex attributes with mergeFunc.
- apply(Iterator<Tuple2<Object, VD>>, ClassTag<VD>) - Static method in class org.apache.spark.graphx.impl.VertexPartition
-
Construct a `VertexPartition` from the given vertices.
- apply(long) - Method in class org.apache.spark.graphx.impl.VertexPartitionBase
-
Return the vertex attribute for the given vertex ID.
- apply(Graph<VD, ED>, A, int, EdgeDirection, Function3<Object, VD, A, VD>, Function1<EdgeTriplet<VD, ED>, Iterator<Tuple2<Object, A>>>, Function2<A, A, A>, ClassTag<VD>, ClassTag<ED>, ClassTag<A>) - Static method in class org.apache.spark.graphx.Pregel
-
Execute a Pregel-like iterative vertex-parallel abstraction.
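A minimal sketch of Pregel computing single-source shortest paths, assuming a graph of type Graph[Long, Double] whose edge attributes are distances and an assumed source vertex id sourceId:

    import org.apache.spark.graphx._

    val initial = graph.mapVertices((id, _) => if (id == sourceId) 0.0 else Double.PositiveInfinity)
    val sssp = Pregel(initial, Double.PositiveInfinity)(
      (id, dist, newDist) => math.min(dist, newDist),            // vertex program
      triplet =>                                                 // send messages along improving edges
        if (triplet.srcAttr + triplet.attr < triplet.dstAttr)
          Iterator((triplet.dstId, triplet.srcAttr + triplet.attr))
        else Iterator.empty,
      (a, b) => math.min(a, b))                                  // merge incoming messages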
- apply(RDD<Tuple2<Object, VD>>, ClassTag<VD>) - Static method in class org.apache.spark.graphx.VertexRDD
-
Constructs a standalone VertexRDD (one that is not set up for efficient joins with an EdgeRDD) from an RDD of vertex-attribute pairs.
- apply(RDD<Tuple2<Object, VD>>, EdgeRDD<?>, VD, ClassTag<VD>) - Static method in class org.apache.spark.graphx.VertexRDD
-
Constructs a VertexRDD from an RDD of vertex-attribute pairs.
- apply(RDD<Tuple2<Object, VD>>, EdgeRDD<?>, VD, Function2<VD, VD, VD>, ClassTag<VD>) - Static method in class org.apache.spark.graphx.VertexRDD
-
Constructs a VertexRDD from an RDD of vertex-attribute pairs.
- apply(Param<T>) - Method in class org.apache.spark.ml.param.ParamMap
-
Gets the value of the input param or its default value if it does not exist.
- apply(BinaryConfusionMatrix) - Method in interface org.apache.spark.mllib.evaluation.binary.BinaryClassificationMetricComputer
-
- apply(BinaryConfusionMatrix) - Static method in class org.apache.spark.mllib.evaluation.binary.FalsePositiveRate
-
- apply(BinaryConfusionMatrix) - Method in class org.apache.spark.mllib.evaluation.binary.FMeasure
-
- apply(BinaryConfusionMatrix) - Static method in class org.apache.spark.mllib.evaluation.binary.Precision
-
- apply(BinaryConfusionMatrix) - Static method in class org.apache.spark.mllib.evaluation.binary.Recall
-
- apply(int) - Method in class org.apache.spark.mllib.linalg.DenseMatrix
-
- apply(int, int) - Method in class org.apache.spark.mllib.linalg.DenseMatrix
-
- apply(int) - Method in class org.apache.spark.mllib.linalg.DenseVector
-
- apply(int, int) - Method in interface org.apache.spark.mllib.linalg.Matrix
-
Gets the (i, j)-th element.
- apply(int, int) - Method in class org.apache.spark.mllib.linalg.SparseMatrix
-
- apply(int) - Method in interface org.apache.spark.mllib.linalg.Vector
-
Gets the value of the ith element.
- apply(int, Predict, double, boolean) - Static method in class org.apache.spark.mllib.tree.model.Node
-
Construct a node with nodeIndex, predict, impurity and isLeaf parameters.
- apply(String) - Static method in class org.apache.spark.rdd.PartitionGroup
-
- apply(long, String, Option<String>, String) - Static method in class org.apache.spark.scheduler.AccumulableInfo
-
- apply(long, String, String) - Static method in class org.apache.spark.scheduler.AccumulableInfo
-
- apply(String, long, Enumeration.Value, ByteBuffer) - Method in class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.StatusUpdate$
-
Alternate factory method that takes a ByteBuffer directly for the data field
- apply(BlockManagerId, long[]) - Static method in class org.apache.spark.scheduler.HighlyCompressedMapStatus
-
- apply(long, TaskMetrics) - Static method in class org.apache.spark.scheduler.RuntimePercentage
-
- apply(DataType) - Static method in class org.apache.spark.sql.columnar.ColumnType
-
- apply(boolean, int, StorageLevel, SparkPlan, Option<String>) - Static method in class org.apache.spark.sql.columnar.InMemoryRelation
-
- apply(SparkPlan) - Method in class org.apache.spark.sql.execution.AddExchange
-
- apply(PythonUDF, LogicalPlan) - Static method in class org.apache.spark.sql.execution.EvaluatePython
-
- apply(LogicalPlan) - Static method in class org.apache.spark.sql.execution.ExtractPythonUdfs
-
- apply(LogicalPlan) - Method in class org.apache.spark.sql.execution.SparkStrategies.BasicOperators
-
- apply(LogicalPlan) - Method in class org.apache.spark.sql.execution.SparkStrategies.BroadcastNestedLoopJoin
-
- apply(LogicalPlan) - Method in class org.apache.spark.sql.execution.SparkStrategies.CartesianProduct
-
- apply(LogicalPlan) - Method in class org.apache.spark.sql.execution.SparkStrategies.CommandStrategy
-
- apply(LogicalPlan) - Method in class org.apache.spark.sql.execution.SparkStrategies.HashAggregation
-
- apply(LogicalPlan) - Method in class org.apache.spark.sql.execution.SparkStrategies.HashJoin
-
- apply(LogicalPlan) - Method in class org.apache.spark.sql.execution.SparkStrategies.InMemoryScans
-
- apply(LogicalPlan) - Method in class org.apache.spark.sql.execution.SparkStrategies.LeftSemiJoin
-
- apply(LogicalPlan) - Method in class org.apache.spark.sql.execution.SparkStrategies.ParquetOperations
-
- apply(LogicalPlan) - Method in class org.apache.spark.sql.execution.SparkStrategies.TakeOrdered
-
- apply(LogicalPlan) - Method in class org.apache.spark.sql.hive.HiveMetastoreCatalog.CreateTables
-
- apply(LogicalPlan) - Method in class org.apache.spark.sql.hive.HiveMetastoreCatalog.PreInsertionCasts
-
- apply(LogicalPlan) - Method in class org.apache.spark.sql.hive.HiveStrategies.DataSinks
-
- apply(LogicalPlan) - Method in class org.apache.spark.sql.hive.HiveStrategies.HiveCommandStrategy
-
- apply(LogicalPlan) - Method in class org.apache.spark.sql.hive.HiveStrategies.HiveTableScans
-
- apply(LogicalPlan) - Method in class org.apache.spark.sql.hive.HiveStrategies.ParquetConversion
-
- apply(LogicalPlan) - Method in class org.apache.spark.sql.hive.HiveStrategies.Scripts
-
- apply(LogicalPlan) - Static method in class org.apache.spark.sql.sources.DataSourceStrategy
-
- apply(String) - Method in class org.apache.spark.sql.sources.DDLParser
-
- apply(String) - Static method in class org.apache.spark.storage.BlockId
-
Converts a BlockId "name" String back into a BlockId.
- apply(String, String, int) - Static method in class org.apache.spark.storage.BlockManagerId
-
- apply(ObjectInput) - Static method in class org.apache.spark.storage.BlockManagerId
-
- apply(boolean, boolean, boolean, boolean, int) - Static method in class org.apache.spark.storage.StorageLevel
-
:: DeveloperApi ::
Create a new StorageLevel object without setting useOffHeap.
- apply(boolean, boolean, boolean, int) - Static method in class org.apache.spark.storage.StorageLevel
-
:: DeveloperApi ::
Create a new StorageLevel object.
- apply(int, int) - Static method in class org.apache.spark.storage.StorageLevel
-
:: DeveloperApi ::
Create a new StorageLevel object from its integer representation.
- apply(ObjectInput) - Static method in class org.apache.spark.storage.StorageLevel
-
:: DeveloperApi ::
Read StorageLevel object from ObjectInput stream.
- apply(long) - Static method in class org.apache.spark.streaming.Milliseconds
-
- apply(long) - Static method in class org.apache.spark.streaming.Minutes
-
- apply(long) - Static method in class org.apache.spark.streaming.Seconds
-
- apply(I, Function0<BoxedUnit>) - Static method in class org.apache.spark.util.CompletionIterator
-
- apply(Traversable<Object>) - Static method in class org.apache.spark.util.Distribution
-
- apply(InputStream, File, SparkConf) - Static method in class org.apache.spark.util.logging.FileAppender
-
Create the right appender based on Spark configuration
- apply(TraversableOnce<Object>) - Static method in class org.apache.spark.util.StatCounter
-
Build a StatCounter from a list of values.
- apply(Seq<Object>) - Static method in class org.apache.spark.util.StatCounter
-
Build a StatCounter from a list of values passed as variable-length arguments.
- apply(A) - Method in class org.apache.spark.util.TimeStampedHashMap
-
- apply(A) - Method in class org.apache.spark.util.TimeStampedWeakValueHashMap
-
- apply(int) - Method in class org.apache.spark.util.Vector
-
- applySchema(JavaRDD<?>, Class<?>) - Method in class org.apache.spark.sql.api.java.JavaSQLContext
-
Applies a schema to an RDD of Java Beans.
- applySchema(JavaRDD<Row>, StructType) - Method in class org.apache.spark.sql.api.java.JavaSQLContext
-
:: DeveloperApi ::
Creates a JavaSchemaRDD from an RDD containing Rows by applying a schema to this RDD.
- applySchema(RDD<Row>, StructType) - Method in class org.apache.spark.sql.SQLContext
-
:: DeveloperApi ::
Creates a SchemaRDD from an RDD containing Rows by applying a schema to this RDD.
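A minimal sketch of applySchema, assuming an existing SQLContext named sqlContext and SparkContext named sc; the exact import paths for Row and the struct types differ across Spark 1.x versions:

    import org.apache.spark.sql._

    val rows = sc.parallelize(Seq(Row("alice", 30), Row("bob", 25)))
    val schema = StructType(Seq(
      StructField("name", StringType, nullable = true),
      StructField("age", IntegerType, nullable = true)))
    val people = sqlContext.applySchema(rows, schema)   // SchemaRDD with the given column names/types
    people.registerTempTable("people")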
- applySchemaToPythonRDD(RDD<Object[]>, String) - Method in class org.apache.spark.sql.SQLContext
-
Apply a schema defined by the schemaString to an RDD.
- applySchemaToPythonRDD(RDD<Object[]>, StructType) - Method in class org.apache.spark.sql.SQLContext
-
Apply a schema defined by the schema to an RDD.
- appName() - Method in class org.apache.spark.api.java.JavaSparkContext
-
- appName() - Method in class org.apache.spark.scheduler.ApplicationEventListener
-
- appName() - Method in class org.apache.spark.scheduler.SparkListenerApplicationStart
-
- appName() - Method in class org.apache.spark.SparkContext
-
- appName() - Method in class org.apache.spark.ui.SparkUI
-
- appName() - Method in class org.apache.spark.ui.SparkUITab
-
- ApproxHist() - Static method in class org.apache.spark.mllib.tree.configuration.QuantileStrategy
-
- ApproximateActionListener<T,U,R> - Class in org.apache.spark.partial
-
A JobListener for an approximate single-result action, such as count() or non-parallel reduce().
- ApproximateActionListener(RDD<T>, Function2<TaskContext, Iterator<T>, U>, ApproximateEvaluator<U, R>, long) - Constructor for class org.apache.spark.partial.ApproximateActionListener
-
- ApproximateEvaluator<U,R> - Interface in org.apache.spark.partial
-
An object that computes a function incrementally by merging in results of type U from multiple
tasks.
- appUIAddress() - Method in class org.apache.spark.ui.SparkUI
-
- appUIHostPort() - Method in class org.apache.spark.ui.SparkUI
-
Return the application UI host:port.
- AreaUnderCurve - Class in org.apache.spark.mllib.evaluation
-
Computes the area under the curve (AUC) using the trapezoidal rule.
- AreaUnderCurve() - Constructor for class org.apache.spark.mllib.evaluation.AreaUnderCurve
-
- areaUnderPR() - Method in class org.apache.spark.mllib.evaluation.BinaryClassificationMetrics
-
Computes the area under the precision-recall curve.
- areaUnderROC() - Method in class org.apache.spark.mllib.evaluation.BinaryClassificationMetrics
-
Computes the area under the receiver operating characteristic (ROC) curve.
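A minimal sketch of BinaryClassificationMetrics, assuming an existing SparkContext named sc; the (score, label) pairs are made up:

    import org.apache.spark.mllib.evaluation.BinaryClassificationMetrics

    val scoreAndLabels = sc.parallelize(Seq((0.9, 1.0), (0.7, 1.0), (0.4, 0.0), (0.2, 0.0)))
    val metrics = new BinaryClassificationMetrics(scoreAndLabels)
    println(metrics.areaUnderROC())   // area under the ROC curve
    println(metrics.areaUnderPR())    // area under the precision-recall curve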
- areBoundsEmpty() - Method in class org.apache.spark.util.random.AcceptanceResult
-
- argString() - Method in class org.apache.spark.sql.hive.execution.CreateTableAsSelect
-
- arr() - Method in class org.apache.spark.rdd.PartitionGroup
-
- ARRAY() - Static method in class org.apache.spark.sql.hive.HiveQl
-
- ARRAY_CONTAINS_NULL_BAG_SCHEMA_NAME() - Static method in class org.apache.spark.sql.parquet.CatalystConverter
-
- ARRAY_ELEMENTS_SCHEMA_NAME() - Static method in class org.apache.spark.sql.parquet.CatalystConverter
-
- arrayBuffer() - Method in class org.apache.spark.streaming.receiver.ArrayBufferBlock
-
- ArrayBufferBlock - Class in org.apache.spark.streaming.receiver
-
Class representing a block received as an ArrayBuffer.
- ArrayBufferBlock(ArrayBuffer<?>) - Constructor for class org.apache.spark.streaming.receiver.ArrayBufferBlock
-
- ArrayType - Class in org.apache.spark.sql.api.java
-
The data type representing Lists.
- ArrayValues - Class in org.apache.spark.storage
-
- ArrayValues(Object[]) - Constructor for class org.apache.spark.storage.ArrayValues
-
- as(Symbol) - Method in class org.apache.spark.sql.SchemaRDD
-
Applies a qualifier to the attributes of this relation.
- asIterator() - Method in class org.apache.spark.serializer.DeserializationStream
-
Read the elements of this stream through an iterator.
- asJavaDataType(DataType) - Static method in class org.apache.spark.sql.types.util.DataTypeConversions
-
Returns the equivalent DataType in Java for the given DataType in Scala.
- asJavaStructField(StructField) - Static method in class org.apache.spark.sql.types.util.DataTypeConversions
-
Returns the equivalent StructField in Java for the given StructField in Scala.
- askSlaves() - Method in class org.apache.spark.storage.BlockManagerMessages.GetBlockStatus
-
- askSlaves() - Method in class org.apache.spark.storage.BlockManagerMessages.GetMatchingBlockIds
-
- askTimeout(SparkConf) - Static method in class org.apache.spark.util.AkkaUtils
-
Returns the default Spark timeout to use for Akka ask operations.
- askWithReply(Object, ActorRef, FiniteDuration) - Static method in class org.apache.spark.util.AkkaUtils
-
Send a message to the given actor and get its result within a default timeout, or
throw a SparkException if this fails.
- askWithReply(Object, ActorRef, int, int, FiniteDuration) - Static method in class org.apache.spark.util.AkkaUtils
-
Send a message to the given actor and get its result within a default timeout, or
throw a SparkException if this fails even after the specified number of retries.
- asRDDId() - Method in class org.apache.spark.storage.BlockId
-
- asScalaDataType(DataType) - Static method in class org.apache.spark.sql.types.util.DataTypeConversions
-
Returns the equivalent DataType in Scala for the given DataType in Java.
- asScalaStructField(StructField) - Static method in class org.apache.spark.sql.types.util.DataTypeConversions
-
Returns the equivalent StructField in Scala for the given StructField in Java.
- assertValid() - Method in class org.apache.spark.mllib.tree.configuration.BoostingStrategy
-
Check validity of parameters.
- assertValid() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
-
Check validity of parameters.
- assertValid() - Method in class org.apache.spark.rdd.BlockRDD
-
Check if this BlockRDD is valid.
- AsyncRDDActions<T> - Class in org.apache.spark.rdd
-
A set of asynchronous RDD actions available through an implicit conversion.
- AsyncRDDActions(RDD<T>, ClassTag<T>) - Constructor for class org.apache.spark.rdd.AsyncRDDActions
-
- attachExecutor(ReceiverSupervisor) - Method in class org.apache.spark.streaming.receiver.Receiver
-
Attach Network Receiver executor to this receiver.
- attachHandler(ServletContextHandler) - Method in class org.apache.spark.ui.WebUI
-
Attach a handler to this UI.
- attachListener(CleanerListener) - Method in class org.apache.spark.ContextCleaner
-
Attach a listener object to get information of when objects are cleaned.
- attachPage(WebUIPage) - Method in class org.apache.spark.ui.WebUI
-
Attach a page to this UI.
- attachPage(WebUIPage) - Method in class org.apache.spark.ui.WebUITab
-
Attach a page to this tab.
- attachTab(WebUITab) - Method in class org.apache.spark.ui.WebUI
-
Attach a tab to this UI, along with all of its attached pages.
- attempt() - Method in class org.apache.spark.scheduler.TaskInfo
-
- attempt() - Method in class org.apache.spark.scheduler.TaskSet
-
- attemptId() - Method in class org.apache.spark.scheduler.Stage
-
- attemptId() - Method in class org.apache.spark.scheduler.StageInfo
-
- attemptId() - Method in class org.apache.spark.TaskContext
-
- attemptId() - Method in class org.apache.spark.TaskContextImpl
-
- attr() - Method in class org.apache.spark.graphx.Edge
-
- attr() - Method in class org.apache.spark.graphx.EdgeContext
-
The attribute associated with the edge.
- attr() - Method in class org.apache.spark.graphx.impl.AggregatingEdgeContext
-
- attr() - Method in class org.apache.spark.graphx.impl.EdgeWithLocalIds
-
- attribute() - Method in class org.apache.spark.sql.sources.EqualTo
-
- attribute() - Method in class org.apache.spark.sql.sources.GreaterThan
-
- attribute() - Method in class org.apache.spark.sql.sources.GreaterThanOrEqual
-
- attribute() - Method in class org.apache.spark.sql.sources.In
-
- attribute() - Method in class org.apache.spark.sql.sources.LessThan
-
- attribute() - Method in class org.apache.spark.sql.sources.LessThanOrEqual
-
- attributeMap() - Method in class org.apache.spark.sql.hive.MetastoreRelation
-
An attribute map that can be used to lookup original attributes based on expression id.
- attributeMap() - Method in class org.apache.spark.sql.parquet.ParquetRelation
-
- attributeMap() - Method in class org.apache.spark.sql.sources.LogicalRelation
-
Used to look up the original attribute capitalization.
- attributes() - Method in class org.apache.spark.sql.columnar.InMemoryColumnarTableScan
-
- attributes() - Method in class org.apache.spark.sql.hive.execution.HiveTableScan
-
- attributes() - Method in class org.apache.spark.sql.hive.MetastoreRelation
-
Non-partitionKey attributes
- attributes() - Method in class org.apache.spark.sql.parquet.ParquetTableScan
-
- attributes() - Method in class org.apache.spark.sql.parquet.RowWriteSupport
-
- attrs() - Method in class org.apache.spark.graphx.impl.VertexAttributeBlock
-
- autoBroadcastJoinThreshold() - Method in interface org.apache.spark.sql.SQLConf
-
Upper bound on the sizes (in bytes) of the tables qualified for the automatic conversion to a broadcast value during the physical execution of join operations.
- Average() - Static method in class org.apache.spark.mllib.tree.configuration.EnsembleCombiningStrategy
-
- AVG() - Static method in class org.apache.spark.sql.hive.HiveQl
-
- awaitResult() - Method in class org.apache.spark.partial.ApproximateActionListener
-
Waits for up to timeout milliseconds since the listener was created and then returns a
PartialResult with the result so far.
- awaitResult() - Method in class org.apache.spark.scheduler.JobWaiter
-
- awaitTermination() - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
-
Wait for the execution to stop.
- awaitTermination(long) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
-
Wait for the execution to stop.
- awaitTermination() - Method in class org.apache.spark.streaming.receiver.ReceiverSupervisor
-
Block the calling thread until the supervisor is stopped.
- awaitTermination() - Method in class org.apache.spark.streaming.StreamingContext
-
Wait for the execution to stop.
- awaitTermination(long) - Method in class org.apache.spark.streaming.StreamingContext
-
Wait for the execution to stop.
- awaitTermination() - Method in class org.apache.spark.util.logging.FileAppender
-
Wait for the appender to stop appending, either because the input stream is closed or because of an error while appending.
- axpy(double, Vector, Vector) - Static method in class org.apache.spark.mllib.linalg.BLAS
-
y += a * x
- cache() - Method in class org.apache.spark.api.java.JavaDoubleRDD
-
Persist this RDD with the default storage level (`MEMORY_ONLY`).
- cache() - Method in class org.apache.spark.api.java.JavaPairRDD
-
Persist this RDD with the default storage level (`MEMORY_ONLY`).
- cache() - Method in class org.apache.spark.api.java.JavaRDD
-
Persist this RDD with the default storage level (`MEMORY_ONLY`).
- cache() - Method in class org.apache.spark.graphx.Graph
-
Caches the vertices and edges associated with this graph at the previously-specified target storage levels, which default to MEMORY_ONLY.
- cache() - Method in class org.apache.spark.graphx.impl.EdgeRDDImpl
-
Persists the edge partitions using `targetStorageLevel`, which defaults to MEMORY_ONLY.
- cache() - Method in class org.apache.spark.graphx.impl.GraphImpl
-
- cache() - Method in class org.apache.spark.graphx.impl.VertexRDDImpl
-
Persists the vertex partitions at `targetStorageLevel`, which defaults to MEMORY_ONLY.
- cache() - Method in class org.apache.spark.partial.StudentTCacher
-
- cache() - Method in class org.apache.spark.rdd.RDD
-
Persist this RDD with the default storage level (`MEMORY_ONLY`).
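A short sketch of caching an RDD that feeds more than one action so it is computed only once; the input path is a placeholder.

```scala
// Cache an RDD reused by several actions; default storage level is MEMORY_ONLY.
val lines = sc.textFile("hdfs:///path/to/input")   // placeholder path
val words = lines.flatMap(_.split(" ")).cache()
val total    = words.count()
val distinct = words.distinct().count()
```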
- cache() - Method in class org.apache.spark.sql.api.java.JavaSchemaRDD
-
Persist this RDD with the default storage level (`MEMORY_ONLY`).
- cache() - Method in class org.apache.spark.sql.SchemaRDD
-
Overridden cache function will always use the in-memory columnar caching.
- cache() - Method in class org.apache.spark.streaming.api.java.JavaDStream
-
Persist RDDs of this DStream with the default storage level (MEMORY_ONLY_SER)
- cache() - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
-
Persist RDDs of this DStream with the default storage level (MEMORY_ONLY_SER)
- cache() - Method in class org.apache.spark.streaming.dstream.DStream
-
Persist RDDs of this DStream with the default storage level (MEMORY_ONLY_SER)
- CachedBatch - Class in org.apache.spark.sql.columnar
-
- CachedBatch(byte[][], Row) - Constructor for class org.apache.spark.sql.columnar.CachedBatch
-
- cachedColumnBuffers() - Method in class org.apache.spark.sql.columnar.InMemoryRelation
-
- CachedData - Class in org.apache.spark.sql
-
Holds a cached logical plan and its data
- CachedData(LogicalPlan, InMemoryRelation) - Constructor for class org.apache.spark.sql.CachedData
-
- cachedData() - Method in interface org.apache.spark.sql.CacheManager
-
- cachedRepresentation() - Method in class org.apache.spark.sql.CachedData
-
- cacheLock() - Method in interface org.apache.spark.sql.CacheManager
-
- CacheManager - Class in org.apache.spark
-
Spark class responsible for passing RDDs partition contents to the BlockManager and making
sure a node doesn't load two copies of an RDD at once.
- CacheManager(BlockManager) - Constructor for class org.apache.spark.CacheManager
-
- cacheManager() - Method in class org.apache.spark.SparkEnv
-
- CacheManager - Interface in org.apache.spark.sql
-
Provides support in a SQLContext for caching query results and automatically using these cached
results when subsequent queries are executed.
- cacheQuery(SchemaRDD, Option<String>, StorageLevel) - Method in interface org.apache.spark.sql.CacheManager
-
Caches the data produced by the logical representation of the given schema rdd.
- cacheTable(String) - Method in interface org.apache.spark.sql.CacheManager
-
Caches the specified table in-memory.
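A hedged sketch of caching a registered table in the in-memory columnar store; the file path and table name are placeholders.

```scala
// Sketch: register a SchemaRDD as a temporary table, then cache it.
val people = sqlContext.jsonFile("people.json")    // placeholder input
people.registerTempTable("people")
sqlContext.cacheTable("people")
val adults = sqlContext.sql("SELECT name FROM people WHERE age >= 18")
```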
- CacheTableCommand - Class in org.apache.spark.sql.execution
-
:: DeveloperApi ::
- CacheTableCommand(String, Option<LogicalPlan>, boolean) - Constructor for class org.apache.spark.sql.execution.CacheTableCommand
-
- cacheTables() - Method in class org.apache.spark.sql.hive.test.TestHiveContext
-
- calculate(double[], double) - Static method in class org.apache.spark.mllib.tree.impurity.Entropy
-
:: DeveloperApi ::
information calculation for multiclass classification
- calculate(double, double, double) - Static method in class org.apache.spark.mllib.tree.impurity.Entropy
-
:: DeveloperApi ::
variance calculation
- calculate() - Method in class org.apache.spark.mllib.tree.impurity.EntropyCalculator
-
Calculate the impurity from the stored sufficient statistics.
- calculate(double[], double) - Static method in class org.apache.spark.mllib.tree.impurity.Gini
-
:: DeveloperApi ::
information calculation for multiclass classification
- calculate(double, double, double) - Static method in class org.apache.spark.mllib.tree.impurity.Gini
-
:: DeveloperApi ::
variance calculation
- calculate() - Method in class org.apache.spark.mllib.tree.impurity.GiniCalculator
-
Calculate the impurity from the stored sufficient statistics.
- calculate(double[], double) - Method in interface org.apache.spark.mllib.tree.impurity.Impurity
-
:: DeveloperApi ::
information calculation for multiclass classification
- calculate(double, double, double) - Method in interface org.apache.spark.mllib.tree.impurity.Impurity
-
:: DeveloperApi ::
information calculation for regression
- calculate() - Method in class org.apache.spark.mllib.tree.impurity.ImpurityCalculator
-
Calculate the impurity from the stored sufficient statistics.
- calculate(double[], double) - Static method in class org.apache.spark.mllib.tree.impurity.Variance
-
:: DeveloperApi ::
information calculation for multiclass classification
- calculate(double, double, double) - Static method in class org.apache.spark.mllib.tree.impurity.Variance
-
:: DeveloperApi ::
variance calculation
- calculate() - Method in class org.apache.spark.mllib.tree.impurity.VarianceCalculator
-
Calculate the impurity from the stored sufficient statistics.
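For orientation, these calculators implement the standard impurity measures; writing p_i for the class proportions (classification) and the label mean for regression, they are, up to the choice of logarithm base, roughly:

\[
\mathrm{Entropy} = -\sum_{i=1}^{C} p_i \log_2 p_i, \qquad
\mathrm{Gini} = 1 - \sum_{i=1}^{C} p_i^2, \qquad
\mathrm{Variance} = \frac{1}{n} \sum_{j=1}^{n} \left(y_j - \bar{y}\right)^2
\]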
- calculatedTasks() - Method in class org.apache.spark.scheduler.TaskSetManager
-
- calculateNumBatchesToRemember(Duration) - Static method in class org.apache.spark.streaming.dstream.FileInputDStream
-
Calculate the number of recent batches to remember, such that all files selected within
at least the last MIN_REMEMBER_DURATION can be remembered.
- calculateTotalMemory(SparkContext) - Static method in class org.apache.spark.scheduler.cluster.mesos.MemoryUtils
-
- call(T) - Method in interface org.apache.spark.api.java.function.DoubleFlatMapFunction
-
- call(T) - Method in interface org.apache.spark.api.java.function.DoubleFunction
-
- call(T) - Method in interface org.apache.spark.api.java.function.FlatMapFunction
-
- call(T1, T2) - Method in interface org.apache.spark.api.java.function.FlatMapFunction2
-
- call(T1) - Method in interface org.apache.spark.api.java.function.Function
-
- call(T1, T2) - Method in interface org.apache.spark.api.java.function.Function2
-
- call(T1, T2, T3) - Method in interface org.apache.spark.api.java.function.Function3
-
- call(T) - Method in interface org.apache.spark.api.java.function.PairFlatMapFunction
-
- call(T) - Method in interface org.apache.spark.api.java.function.PairFunction
-
- call(T) - Method in interface org.apache.spark.api.java.function.VoidFunction
-
- call(T1) - Method in interface org.apache.spark.sql.api.java.UDF1
-
- call(T1, T2, T3, T4, T5, T6, T7, T8, T9, T10) - Method in interface org.apache.spark.sql.api.java.UDF10
-
- call(T1, T2, T3, T4, T5, T6, T7, T8, T9, T10, T11) - Method in interface org.apache.spark.sql.api.java.UDF11
-
- call(T1, T2, T3, T4, T5, T6, T7, T8, T9, T10, T11, T12) - Method in interface org.apache.spark.sql.api.java.UDF12
-
- call(T1, T2, T3, T4, T5, T6, T7, T8, T9, T10, T11, T12, T13) - Method in interface org.apache.spark.sql.api.java.UDF13
-
- call(T1, T2, T3, T4, T5, T6, T7, T8, T9, T10, T11, T12, T13, T14) - Method in interface org.apache.spark.sql.api.java.UDF14
-
- call(T1, T2, T3, T4, T5, T6, T7, T8, T9, T10, T11, T12, T13, T14, T15) - Method in interface org.apache.spark.sql.api.java.UDF15
-
- call(T1, T2, T3, T4, T5, T6, T7, T8, T9, T10, T11, T12, T13, T14, T15, T16) - Method in interface org.apache.spark.sql.api.java.UDF16
-
- call(T1, T2, T3, T4, T5, T6, T7, T8, T9, T10, T11, T12, T13, T14, T15, T16, T17) - Method in interface org.apache.spark.sql.api.java.UDF17
-
- call(T1, T2, T3, T4, T5, T6, T7, T8, T9, T10, T11, T12, T13, T14, T15, T16, T17, T18) - Method in interface org.apache.spark.sql.api.java.UDF18
-
- call(T1, T2, T3, T4, T5, T6, T7, T8, T9, T10, T11, T12, T13, T14, T15, T16, T17, T18, T19) - Method in interface org.apache.spark.sql.api.java.UDF19
-
- call(T1, T2) - Method in interface org.apache.spark.sql.api.java.UDF2
-
- call(T1, T2, T3, T4, T5, T6, T7, T8, T9, T10, T11, T12, T13, T14, T15, T16, T17, T18, T19, T20) - Method in interface org.apache.spark.sql.api.java.UDF20
-
- call(T1, T2, T3, T4, T5, T6, T7, T8, T9, T10, T11, T12, T13, T14, T15, T16, T17, T18, T19, T20, T21) - Method in interface org.apache.spark.sql.api.java.UDF21
-
- call(T1, T2, T3, T4, T5, T6, T7, T8, T9, T10, T11, T12, T13, T14, T15, T16, T17, T18, T19, T20, T21, T22) - Method in interface org.apache.spark.sql.api.java.UDF22
-
- call(T1, T2, T3) - Method in interface org.apache.spark.sql.api.java.UDF3
-
- call(T1, T2, T3, T4) - Method in interface org.apache.spark.sql.api.java.UDF4
-
- call(T1, T2, T3, T4, T5) - Method in interface org.apache.spark.sql.api.java.UDF5
-
- call(T1, T2, T3, T4, T5, T6) - Method in interface org.apache.spark.sql.api.java.UDF6
-
- call(T1, T2, T3, T4, T5, T6, T7) - Method in interface org.apache.spark.sql.api.java.UDF7
-
- call(T1, T2, T3, T4, T5, T6, T7, T8) - Method in interface org.apache.spark.sql.api.java.UDF8
-
- call(T1, T2, T3, T4, T5, T6, T7, T8, T9) - Method in interface org.apache.spark.sql.api.java.UDF9
-
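The UDF1 through UDF22 interfaces back Java user-defined functions of one to 22 arguments. A hedged sketch of the equivalent registration from Scala; the function name, table, and exact registration call available in this release are assumptions.

```scala
// Sketch: register a one-argument UDF and call it from SQL. "strLen" and the
// "people" table are placeholders; in the Java API the same function would be
// written by implementing the UDF1 interface.
sqlContext.registerFunction("strLen", (s: String) => s.length)
val lengths = sqlContext.sql("SELECT strLen(name) FROM people")
```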
- callSite() - Method in class org.apache.spark.scheduler.ActiveJob
-
- callSite() - Method in class org.apache.spark.scheduler.JobSubmitted
-
- callSite() - Method in class org.apache.spark.scheduler.Stage
-
- CallSite - Class in org.apache.spark.util
-
CallSite represents a place in user code.
- CallSite(String, String) - Constructor for class org.apache.spark.util.CallSite
-
- canBeCodeGened(Seq<AggregateExpression>) - Method in class org.apache.spark.sql.execution.SparkStrategies.HashAggregation
-
- cancel() - Method in class org.apache.spark.ComplexFutureAction
-
- cancel() - Method in interface org.apache.spark.FutureAction
-
Cancels the execution of this action.
- cancel(boolean) - Method in class org.apache.spark.JavaFutureActionWrapper
-
- cancel() - Method in class org.apache.spark.scheduler.JobWaiter
-
Sends a signal to the DAGScheduler to cancel the job.
- cancel() - Method in class org.apache.spark.SimpleFutureAction
-
- cancel() - Method in class org.apache.spark.util.MetadataCleaner
-
- cancelAllJobs() - Method in class org.apache.spark.api.java.JavaSparkContext
-
Cancel all jobs that have been scheduled or are running.
- cancelAllJobs() - Method in class org.apache.spark.scheduler.DAGScheduler
-
Cancel all jobs that are running or waiting in the queue.
- cancelAllJobs() - Method in class org.apache.spark.SparkContext
-
Cancel all jobs that have been scheduled or are running.
- cancelJob(int) - Method in class org.apache.spark.scheduler.DAGScheduler
-
Cancel a job that is running or waiting in the queue.
- cancelJob(int) - Method in class org.apache.spark.SparkContext
-
Cancel a given job if it's scheduled or running
- cancelJobGroup(String) - Method in class org.apache.spark.api.java.JavaSparkContext
-
Cancel active jobs for the specified group.
- cancelJobGroup(String) - Method in class org.apache.spark.scheduler.DAGScheduler
-
- cancelJobGroup(String) - Method in class org.apache.spark.SparkContext
-
Cancel active jobs for the specified group.
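A sketch of grouping jobs so they can be cancelled together; the group id and description are illustrative.

```scala
// Tag jobs submitted from this thread with a group id, then cancel the whole
// group (typically from another thread). "adhoc" is a placeholder id.
sc.setJobGroup("adhoc", "long-running ad-hoc queries", interruptOnCancel = true)
// ... trigger actions here ...
sc.cancelJobGroup("adhoc")
```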
- cancelStage(int) - Method in class org.apache.spark.scheduler.DAGScheduler
-
Cancel all jobs associated with a running or scheduled stage.
- cancelStage(int) - Method in class org.apache.spark.SparkContext
-
Cancel a given stage and all jobs associated with it
- cancelTasks(int, boolean) - Method in interface org.apache.spark.scheduler.TaskScheduler
-
- cancelTasks(int, boolean) - Method in class org.apache.spark.scheduler.TaskSchedulerImpl
-
- canEqual(Object) - Method in class org.apache.spark.sql.api.java.Row
-
- canEqual(Object) - Method in class org.apache.spark.util.MutablePair
-
- canFetchMoreResults(long) - Method in class org.apache.spark.scheduler.TaskSetManager
-
Check whether there is enough quota to fetch a result of `size` bytes.
- capacity() - Method in class org.apache.spark.graphx.impl.VertexPartitionBase
-
- cartesian(JavaRDDLike<U, ?>) - Method in interface org.apache.spark.api.java.JavaRDDLike
-
Return the Cartesian product of this RDD and another one, that is, the RDD of all pairs of
elements (a, b) where a is in `this` and b is in `other`.
- cartesian(RDD<U>, ClassTag<U>) - Method in class org.apache.spark.rdd.RDD
-
Return the Cartesian product of this RDD and another one, that is, the RDD of all pairs of
elements (a, b) where a is in `this` and b is in `other`.
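A small sketch of the Cartesian product of two RDDs; the data is made up.

```scala
// cartesian pairs every element of one RDD with every element of the other.
val letters = sc.parallelize(Seq("a", "b"))
val numbers = sc.parallelize(Seq(1, 2, 3))
val pairs   = letters.cartesian(numbers)   // 2 * 3 = 6 pairs
// pairs.collect() contains (a,1), (a,2), (a,3), (b,1), (b,2), (b,3) (order may vary)
```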
- CartesianPartition - Class in org.apache.spark.rdd
-
- CartesianPartition(int, RDD<?>, RDD<?>, int, int) - Constructor for class org.apache.spark.rdd.CartesianPartition
-
- CartesianProduct - Class in org.apache.spark.sql.execution.joins
-
:: DeveloperApi ::
- CartesianProduct(SparkPlan, SparkPlan) - Constructor for class org.apache.spark.sql.execution.joins.CartesianProduct
-
- CartesianProduct() - Method in class org.apache.spark.sql.execution.SparkStrategies
-
- CartesianRDD<T,U> - Class in org.apache.spark.rdd
-
- CartesianRDD(SparkContext, RDD<T>, RDD<U>, ClassTag<T>, ClassTag<U>) - Constructor for class org.apache.spark.rdd.CartesianRDD
-
- CASE() - Static method in class org.apache.spark.sql.hive.HiveQl
-
- caseSensitive() - Method in class org.apache.spark.sql.hive.HiveMetastoreCatalog
-
- castChildOutput(InsertIntoTable, MetastoreRelation, LogicalPlan) - Method in class org.apache.spark.sql.hive.HiveMetastoreCatalog.PreInsertionCasts
-
- CatalystArrayContainsNullConverter - Class in org.apache.spark.sql.parquet
-
A parquet.io.api.GroupConverter that converts single-element groups matching the
characteristics of an array containing null (see ParquetTypesConverter) into an ArrayType.
- CatalystArrayContainsNullConverter(DataType, int, CatalystConverter, Buffer<Object>) - Constructor for class org.apache.spark.sql.parquet.CatalystArrayContainsNullConverter
-
- CatalystArrayContainsNullConverter(DataType, int, CatalystConverter) - Constructor for class org.apache.spark.sql.parquet.CatalystArrayContainsNullConverter
-
- CatalystArrayConverter - Class in org.apache.spark.sql.parquet
-
A parquet.io.api.GroupConverter that converts single-element groups matching the
characteristics of an array (see ParquetTypesConverter) into an ArrayType.
- CatalystArrayConverter(DataType, int, CatalystConverter, Buffer<Object>) - Constructor for class org.apache.spark.sql.parquet.CatalystArrayConverter
-
- CatalystArrayConverter(DataType, int, CatalystConverter) - Constructor for class org.apache.spark.sql.parquet.CatalystArrayConverter
-
- CatalystConverter - Class in org.apache.spark.sql.parquet
-
- CatalystConverter() - Constructor for class org.apache.spark.sql.parquet.CatalystConverter
-
- CatalystGroupConverter - Class in org.apache.spark.sql.parquet
-
A parquet.io.api.GroupConverter that is able to convert a Parquet record to a Row object.
- CatalystGroupConverter(StructField[], int, CatalystConverter, ArrayBuffer<Object>, ArrayBuffer<Row>) - Constructor for class org.apache.spark.sql.parquet.CatalystGroupConverter
-
- CatalystGroupConverter(StructField[], int, CatalystConverter) - Constructor for class org.apache.spark.sql.parquet.CatalystGroupConverter
-
- CatalystGroupConverter(Attribute[]) - Constructor for class org.apache.spark.sql.parquet.CatalystGroupConverter
-
This constructor is used for the root converter only!
- CatalystMapConverter - Class in org.apache.spark.sql.parquet
-
A parquet.io.api.GroupConverter that converts two-element groups matching the
characteristics of a map (see ParquetTypesConverter) into a MapType.
- CatalystMapConverter(StructField[], int, CatalystConverter) - Constructor for class org.apache.spark.sql.parquet.CatalystMapConverter
-
- CatalystNativeArrayConverter - Class in org.apache.spark.sql.parquet
-
A parquet.io.api.GroupConverter that converts single-element groups matching the
characteristics of an array (see ParquetTypesConverter) into an ArrayType.
- CatalystNativeArrayConverter(NativeType, int, CatalystConverter, int) - Constructor for class org.apache.spark.sql.parquet.CatalystNativeArrayConverter
-
- CatalystPrimitiveConverter - Class in org.apache.spark.sql.parquet
-
A parquet.io.api.PrimitiveConverter that converts Parquet types to Catalyst types.
- CatalystPrimitiveConverter(CatalystConverter, int) - Constructor for class org.apache.spark.sql.parquet.CatalystPrimitiveConverter
-
- CatalystPrimitiveRowConverter - Class in org.apache.spark.sql.parquet
-
A parquet.io.api.GroupConverter that is able to convert a Parquet record to a Row object.
- CatalystPrimitiveRowConverter(StructField[], MutableRow) - Constructor for class org.apache.spark.sql.parquet.CatalystPrimitiveRowConverter
-
- CatalystPrimitiveRowConverter(Attribute[]) - Constructor for class org.apache.spark.sql.parquet.CatalystPrimitiveRowConverter
-
- CatalystScan - Class in org.apache.spark.sql.sources
-
::Experimental::
An interface for experimenting with a more direct connection to the query planner.
- CatalystScan() - Constructor for class org.apache.spark.sql.sources.CatalystScan
-
- CatalystStructConverter - Class in org.apache.spark.sql.parquet
-
This converter is for multi-element groups of primitive or complex types
that have repetition level optional or required (so struct fields).
- CatalystStructConverter(StructField[], int, CatalystConverter) - Constructor for class org.apache.spark.sql.parquet.CatalystStructConverter
-
- Categorical() - Static method in class org.apache.spark.mllib.tree.configuration.FeatureType
-
- categoricalFeaturesInfo() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
-
- categories() - Method in class org.apache.spark.mllib.tree.model.Split
-
- category() - Method in class org.apache.spark.mllib.recommendation.ALS.BlockStats
-
- category() - Method in class org.apache.spark.mllib.tree.model.Bin
-
- channelFactory() - Method in class org.apache.spark.streaming.flume.FlumePollingReceiver
-
- channelFactoryExecutor() - Method in class org.apache.spark.streaming.flume.FlumePollingReceiver
-
- checkEquals(ASTNode) - Method in class org.apache.spark.sql.hive.HiveQl.TransformableNode
-
Throws an error if this is not equal to other.
- checkHost(String, String) - Static method in class org.apache.spark.util.Utils
-
- checkHostPort(String, String) - Static method in class org.apache.spark.util.Utils
-
- checkMinimalPollingPeriod(TimeUnit, int) - Static method in class org.apache.spark.metrics.MetricsSystem
-
- checkModifyPermissions(String) - Method in class org.apache.spark.SecurityManager
-
Checks the given user against the modify acl list to see if they have
authorization to modify the application.
- checkOutputSpecs(JobContext) - Method in class org.apache.spark.sql.parquet.AppendingParquetOutputFormat
-
- checkpoint() - Method in interface org.apache.spark.api.java.JavaRDDLike
-
Mark this RDD for checkpointing.
- checkpoint() - Method in class org.apache.spark.graphx.Graph
-
Mark this Graph for checkpointing.
- checkpoint() - Method in class org.apache.spark.graphx.impl.EdgeRDDImpl
-
- checkpoint() - Method in class org.apache.spark.graphx.impl.GraphImpl
-
- checkpoint() - Method in class org.apache.spark.graphx.impl.VertexRDDImpl
-
- checkpoint() - Method in class org.apache.spark.rdd.CheckpointRDD
-
- checkpoint() - Method in class org.apache.spark.rdd.HadoopRDD
-
- checkpoint() - Method in class org.apache.spark.rdd.RDD
-
Mark this RDD for checkpointing.
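A sketch of RDD checkpointing; a checkpoint directory must be set first, and the path below is a placeholder on a fault-tolerant filesystem such as HDFS.

```scala
// Mark an RDD for checkpointing; it is materialized the next time an action runs.
sc.setCheckpointDir("hdfs:///tmp/spark-checkpoints")   // placeholder directory
val data = sc.parallelize(1 to 1000000).map(_ * 2)
data.checkpoint()
data.count()
```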
- checkpoint(Duration) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
-
Enable periodic checkpointing of RDDs of this DStream.
- checkpoint(String) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
-
Sets the context to periodically checkpoint the DStream operations for master
fault-tolerance.
- Checkpoint - Class in org.apache.spark.streaming
-
- Checkpoint(StreamingContext, Time) - Constructor for class org.apache.spark.streaming.Checkpoint
-
- checkpoint(Duration) - Method in class org.apache.spark.streaming.dstream.DStream
-
Enable periodic checkpointing of RDDs of this DStream
- checkpoint(Duration) - Method in class org.apache.spark.streaming.dstream.ReducedWindowedDStream
-
- checkpoint(String) - Method in class org.apache.spark.streaming.StreamingContext
-
Set the context to periodically checkpoint the DStream operations for driver
fault-tolerance.
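The streaming variant points the whole context at a checkpoint directory for driver fault-tolerance; the directory below is a placeholder.

```scala
// Sketch: enable driver fault-tolerance by checkpointing DStream metadata and data.
ssc.checkpoint("hdfs:///tmp/streaming-checkpoints")   // placeholder directory
```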
- checkpointBackupFile(String, Time) - Static method in class org.apache.spark.streaming.Checkpoint
-
Get the checkpoint backup file for the given checkpoint time
- checkpointClock() - Method in class org.apache.spark.streaming.kinesis.KinesisCheckpointState
-
- checkpointData() - Method in class org.apache.spark.rdd.RDD
-
- checkpointData() - Method in class org.apache.spark.streaming.dstream.DStream
-
- checkpointDir() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
-
- checkpointDir() - Method in class org.apache.spark.mllib.tree.impl.NodeIdCache
-
- checkpointDir() - Method in class org.apache.spark.SparkContext
-
- checkpointDir() - Method in class org.apache.spark.streaming.Checkpoint
-
- checkpointDir() - Method in class org.apache.spark.streaming.StreamingContext
-
- checkpointDirToLogDir(String, int) - Static method in class org.apache.spark.streaming.receiver.WriteAheadLogBasedBlockHandler
-
- checkpointDirToLogDir(String) - Static method in class org.apache.spark.streaming.scheduler.ReceivedBlockTracker
-
- checkpointDuration() - Method in class org.apache.spark.streaming.Checkpoint
-
- checkpointDuration() - Method in class org.apache.spark.streaming.dstream.DStream
-
- checkpointDuration() - Method in class org.apache.spark.streaming.StreamingContext
-
- Checkpointed() - Static method in class org.apache.spark.rdd.CheckpointState
-
- checkpointFile(String, Time) - Static method in class org.apache.spark.streaming.Checkpoint
-
Get the checkpoint file for the given checkpoint time
- CheckpointingInProgress() - Static method in class org.apache.spark.rdd.CheckpointState
-
- checkpointInProgress() - Method in class org.apache.spark.streaming.DStreamGraph
-
- checkpointInterval() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
-
- checkpointInterval() - Method in class org.apache.spark.mllib.tree.impl.NodeIdCache
-
- checkpointPath() - Method in class org.apache.spark.rdd.CheckpointRDD
-
- CheckpointRDD<T> - Class in org.apache.spark.rdd
-
This RDD represents an RDD checkpoint file (similar to HadoopRDD).
- CheckpointRDD(SparkContext, String, ClassTag<T>) - Constructor for class org.apache.spark.rdd.CheckpointRDD
-
- checkpointRDD() - Method in class org.apache.spark.rdd.RDDCheckpointData
-
- CheckpointRDDPartition - Class in org.apache.spark.rdd
-
- CheckpointRDDPartition(int) - Constructor for class org.apache.spark.rdd.CheckpointRDDPartition
-
- CheckpointReader - Class in org.apache.spark.streaming
-
- CheckpointReader() - Constructor for class org.apache.spark.streaming.CheckpointReader
-
- CheckpointState - Class in org.apache.spark.rdd
-
Enumeration to manage state transitions of an RDD through checkpointing
[ Initialized --> marked for checkpointing --> checkpointing in progress --> checkpointed ]
- CheckpointState() - Constructor for class org.apache.spark.rdd.CheckpointState
-
- checkpointTime() - Method in class org.apache.spark.streaming.Checkpoint
-
- CheckpointWriter - Class in org.apache.spark.streaming
-
Convenience class to handle writing the graph checkpoint to a file.
- CheckpointWriter(JobGenerator, SparkConf, String, Configuration) - Constructor for class org.apache.spark.streaming.CheckpointWriter
-
- CheckpointWriter.CheckpointWriteHandler - Class in org.apache.spark.streaming
-
- CheckpointWriter.CheckpointWriteHandler(Time, byte[]) - Constructor for class org.apache.spark.streaming.CheckpointWriter.CheckpointWriteHandler
-
- checkSpeculatableTasks() - Method in class org.apache.spark.scheduler.Pool
-
- checkSpeculatableTasks() - Method in interface org.apache.spark.scheduler.Schedulable
-
- checkSpeculatableTasks() - Method in class org.apache.spark.scheduler.TaskSchedulerImpl
-
- checkSpeculatableTasks() - Method in class org.apache.spark.scheduler.TaskSetManager
-
Check for tasks to be speculated and return true if there are any.
- checkState(boolean, Function0<String>) - Static method in class org.apache.spark.streaming.util.HdfsUtils
-
- checkTimeoutInterval() - Method in class org.apache.spark.storage.BlockManagerMasterActor
-
- checkUIViewPermissions(String) - Method in class org.apache.spark.SecurityManager
-
Checks the given user against the view acl list to see if they have
authorization to view the UI.
- child() - Method in class org.apache.spark.sql.columnar.InMemoryRelation
-
- child() - Method in class org.apache.spark.sql.execution.Aggregate
-
- child() - Method in class org.apache.spark.sql.execution.BatchPythonEvaluation
-
- child() - Method in class org.apache.spark.sql.execution.DescribeCommand
-
- child() - Method in class org.apache.spark.sql.execution.Distinct
-
- child() - Method in class org.apache.spark.sql.execution.EvaluatePython
-
- child() - Method in class org.apache.spark.sql.execution.Exchange
-
- child() - Method in class org.apache.spark.sql.execution.ExternalSort
-
- child() - Method in class org.apache.spark.sql.execution.Filter
-
- child() - Method in class org.apache.spark.sql.execution.Generate
-
- child() - Method in class org.apache.spark.sql.execution.GeneratedAggregate
-
- child() - Method in class org.apache.spark.sql.execution.Limit
-
- child() - Method in class org.apache.spark.sql.execution.OutputFaker
-
- child() - Method in class org.apache.spark.sql.execution.Project
-
- child() - Method in class org.apache.spark.sql.execution.Sample
-
- child() - Method in class org.apache.spark.sql.execution.Sort
-
- child() - Method in class org.apache.spark.sql.execution.TakeOrdered
-
- child() - Method in class org.apache.spark.sql.hive.execution.InsertIntoHiveTable
-
- child() - Method in class org.apache.spark.sql.hive.execution.ScriptTransformation
-
- child() - Method in class org.apache.spark.sql.hive.InsertIntoHiveTable
-
- child() - Method in class org.apache.spark.sql.parquet.InsertIntoParquetTable
-
- children() - Method in class org.apache.spark.sql.columnar.InMemoryRelation
-
- children() - Method in class org.apache.spark.sql.execution.BatchPythonEvaluation
-
- children() - Method in class org.apache.spark.sql.execution.ExecutedCommand
-
- children() - Method in class org.apache.spark.sql.execution.LogicalRDD
-
- children() - Method in class org.apache.spark.sql.execution.OutputFaker
-
- children() - Method in class org.apache.spark.sql.execution.PythonUDF
-
- children() - Method in class org.apache.spark.sql.execution.SparkLogicalPlan
-
- children() - Method in class org.apache.spark.sql.execution.Union
-
- children() - Method in class org.apache.spark.sql.hive.HiveGenericUdaf
-
- children() - Method in class org.apache.spark.sql.hive.HiveGenericUdf
-
- children() - Method in class org.apache.spark.sql.hive.HiveGenericUdtf
-
- children() - Method in class org.apache.spark.sql.hive.HiveSimpleUdf
-
- children() - Method in class org.apache.spark.sql.hive.HiveUdaf
-
- children() - Method in class org.apache.spark.sql.hive.InsertIntoHiveTable
-
- chiSqFunc() - Method in class org.apache.spark.mllib.stat.test.ChiSqTest.Method
-
- chiSqTest(Vector, Vector) - Static method in class org.apache.spark.mllib.stat.Statistics
-
:: Experimental ::
Conduct Pearson's chi-squared goodness of fit test of the observed data against the
expected distribution.
- chiSqTest(Vector) - Static method in class org.apache.spark.mllib.stat.Statistics
-
:: Experimental ::
Conduct Pearson's chi-squared goodness of fit test of the observed data against the uniform
distribution, with each category having an expected frequency of `1 / observed.size`.
- chiSqTest(Matrix) - Static method in class org.apache.spark.mllib.stat.Statistics
-
:: Experimental ::
Conduct Pearson's independence test on the input contingency matrix, which cannot contain
negative entries or columns or rows that sum up to 0.
- chiSqTest(RDD<LabeledPoint>) - Static method in class org.apache.spark.mllib.stat.Statistics
-
:: Experimental ::
Conduct Pearson's independence test for every feature against the label across the input RDD.
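A sketch of the goodness-of-fit variant; the observed counts below are made up.

```scala
import org.apache.spark.mllib.linalg.Vectors
import org.apache.spark.mllib.stat.Statistics

// Goodness-of-fit test of made-up observed counts against the uniform expected
// distribution (the default when no expected vector is supplied).
val observed = Vectors.dense(10.0, 20.0, 30.0)
val result = Statistics.chiSqTest(observed)
println(s"statistic=${result.statistic} pValue=${result.pValue}")
```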
- ChiSqTest - Class in org.apache.spark.mllib.stat.test
-
Conduct the chi-squared test for the input RDDs using the specified method.
- ChiSqTest() - Constructor for class org.apache.spark.mllib.stat.test.ChiSqTest
-
- ChiSqTest.Method - Class in org.apache.spark.mllib.stat.test
-
- ChiSqTest.Method(String, Function2<Object, Object, Object>) - Constructor for class org.apache.spark.mllib.stat.test.ChiSqTest.Method
-
- ChiSqTest.Method$ - Class in org.apache.spark.mllib.stat.test
-
- ChiSqTest.Method$() - Constructor for class org.apache.spark.mllib.stat.test.ChiSqTest.Method$
-
- ChiSqTest.NullHypothesis$ - Class in org.apache.spark.mllib.stat.test
-
- ChiSqTest.NullHypothesis$() - Constructor for class org.apache.spark.mllib.stat.test.ChiSqTest.NullHypothesis$
-
- ChiSqTestResult - Class in org.apache.spark.mllib.stat.test
-
:: Experimental ::
Object containing the test results for the chi-squared hypothesis test.
- ChiSqTestResult(double, int, double, String, String) - Constructor for class org.apache.spark.mllib.stat.test.ChiSqTestResult
-
- chiSquared(Vector, Vector, String) - Static method in class org.apache.spark.mllib.stat.test.ChiSqTest
-
- chiSquaredFeatures(RDD<LabeledPoint>, String) - Static method in class org.apache.spark.mllib.stat.test.ChiSqTest
-
Conduct Pearson's independence test for each feature against the label across the input RDD.
- chiSquaredMatrix(Matrix, String) - Static method in class org.apache.spark.mllib.stat.test.ChiSqTest
-
- chmod700(File) - Static method in class org.apache.spark.util.Utils
-
JDK equivalent of `chmod 700 file`.
- classForName(String) - Static method in class org.apache.spark.util.Utils
-
Preferred alternative to Class.forName(className)
- Classification() - Static method in class org.apache.spark.mllib.tree.configuration.Algo
-
- ClassificationModel - Interface in org.apache.spark.mllib.classification
-
:: Experimental ::
Represents a classification model that predicts to which of a set of categories an example
belongs.
- classIsLoadable(String) - Static method in class org.apache.spark.util.Utils
-
Determines whether the provided class is loadable in the current thread.
- classLoader() - Method in class org.apache.spark.scheduler.cluster.mesos.MesosSchedulerBackend
-
- className() - Method in class org.apache.spark.ExceptionFailure
-
- classpathEntries() - Method in class org.apache.spark.ui.env.EnvironmentListener
-
- classTag() - Method in class org.apache.spark.api.java.JavaDoubleRDD
-
- classTag() - Method in class org.apache.spark.api.java.JavaPairRDD
-
- classTag() - Method in class org.apache.spark.api.java.JavaRDD
-
- classTag() - Method in interface org.apache.spark.api.java.JavaRDDLike
-
- classTag() - Method in class org.apache.spark.sql.api.java.JavaSchemaRDD
-
- classTag() - Method in class org.apache.spark.streaming.api.java.JavaDStream
-
- classTag() - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
-
- classTag() - Method in class org.apache.spark.streaming.api.java.JavaInputDStream
-
- classTag() - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
-
- classTag() - Method in class org.apache.spark.streaming.api.java.JavaReceiverInputDStream
-
- clean(F, boolean) - Method in class org.apache.spark.SparkContext
-
Clean a closure to make it ready to be serialized and sent to tasks
(removes unreferenced variables in $outer's, updates REPL variables).
If checkSerializable is set, clean will also proactively check whether f is serializable
and throw a SparkException if it is not.
- clean(Object, boolean) - Static method in class org.apache.spark.util.ClosureCleaner
-
- CleanBroadcast - Class in org.apache.spark
-
- CleanBroadcast(long) - Constructor for class org.apache.spark.CleanBroadcast
-
- cleaner() - Method in class org.apache.spark.SparkContext
-
- CleanerListener - Interface in org.apache.spark
-
Listener class used for testing when any item has been cleaned by the Cleaner class.
- CleanRDD - Class in org.apache.spark
-
- CleanRDD(int) - Constructor for class org.apache.spark.CleanRDD
-
- CleanShuffle - Class in org.apache.spark
-
- CleanShuffle(int) - Constructor for class org.apache.spark.CleanShuffle
-
- cleanup(long) - Method in class org.apache.spark.SparkContext
-
Called by MetadataCleaner to clean up the persistentRdds map periodically
- cleanup(Time) - Method in class org.apache.spark.streaming.dstream.DStreamCheckpointData
-
Cleanup old checkpoint data.
- cleanup(Time) - Method in class org.apache.spark.streaming.dstream.FileInputDStream.FileInputDStreamCheckpointData
-
- cleanUpAfterSchedulerStop() - Method in class org.apache.spark.scheduler.DAGScheduler
-
- cleanupOldBatches(Time, boolean) - Method in class org.apache.spark.streaming.scheduler.ReceivedBlockTracker
-
Clean up block information of old batches.
- cleanupOldBlocks(long) - Method in class org.apache.spark.streaming.receiver.BlockManagerBasedBlockHandler
-
- CleanupOldBlocks - Class in org.apache.spark.streaming.receiver
-
- CleanupOldBlocks(Time) - Constructor for class org.apache.spark.streaming.receiver.CleanupOldBlocks
-
- cleanupOldBlocks(long) - Method in interface org.apache.spark.streaming.receiver.ReceivedBlockHandler
-
Clean up old blocks older than the given threshold time.
- cleanupOldBlocks(long) - Method in class org.apache.spark.streaming.receiver.WriteAheadLogBasedBlockHandler
-
- cleanupOldBlocksAndBatches(Time) - Method in class org.apache.spark.streaming.scheduler.ReceiverTracker
-
Clean up the data and metadata of blocks and batches that are strictly
older than the threshold time.
- cleanupOldLogs(long, boolean) - Method in class org.apache.spark.streaming.util.WriteAheadLogManager
-
Delete the log files that are older than the threshold time.
- CleanupTask - Interface in org.apache.spark
-
Classes that represent cleaning tasks.
- CleanupTaskWeakReference - Class in org.apache.spark
-
A WeakReference associated with a CleanupTask.
- CleanupTaskWeakReference(CleanupTask, Object, ReferenceQueue<Object>) - Constructor for class org.apache.spark.CleanupTaskWeakReference
-
- clear() - Static method in class org.apache.spark.Accumulators
-
- clear() - Method in interface org.apache.spark.sql.SQLConf
-
- clear() - Method in class org.apache.spark.storage.BlockManagerInfo
-
- clear() - Method in class org.apache.spark.storage.BlockStore
-
- clear() - Method in class org.apache.spark.storage.MemoryStore
-
- clear() - Method in class org.apache.spark.util.BoundedPriorityQueue
-
- CLEAR_NULL_VALUES_INTERVAL() - Static method in class org.apache.spark.util.TimeStampedWeakValueHashMap
-
- clearActiveContext() - Static method in class org.apache.spark.SparkContext
-
Clears the active SparkContext metadata.
- clearCache() - Method in interface org.apache.spark.sql.CacheManager
-
- clearCallSite() - Method in class org.apache.spark.api.java.JavaSparkContext
-
Pass-through to SparkContext.setCallSite.
- clearCallSite() - Method in class org.apache.spark.SparkContext
-
Clear the thread-local property for overriding the call sites
of actions and RDDs.
- clearCheckpointData(Time) - Method in class org.apache.spark.streaming.dstream.DStream
-
- clearCheckpointData(Time) - Method in class org.apache.spark.streaming.DStreamGraph
-
- ClearCheckpointData - Class in org.apache.spark.streaming.scheduler
-
- ClearCheckpointData(Time) - Constructor for class org.apache.spark.streaming.scheduler.ClearCheckpointData
-
- clearDependencies() - Method in class org.apache.spark.rdd.CartesianRDD
-
- clearDependencies() - Method in class org.apache.spark.rdd.CoalescedRDD
-
- clearDependencies() - Method in class org.apache.spark.rdd.CoGroupedRDD
-
- clearDependencies() - Method in class org.apache.spark.rdd.PartitionerAwareUnionRDD
-
- clearDependencies() - Method in class org.apache.spark.rdd.ShuffledRDD
-
- clearDependencies() - Method in class org.apache.spark.rdd.SubtractedRDD
-
- clearDependencies() - Method in class org.apache.spark.rdd.UnionRDD
-
- clearDependencies() - Method in class org.apache.spark.rdd.ZippedPartitionsBaseRDD
-
- clearDependencies() - Method in class org.apache.spark.rdd.ZippedPartitionsRDD2
-
- clearDependencies() - Method in class org.apache.spark.rdd.ZippedPartitionsRDD3
-
- clearDependencies() - Method in class org.apache.spark.rdd.ZippedPartitionsRDD4
-
- clearFiles() - Method in class org.apache.spark.api.java.JavaSparkContext
-
Clear the job's list of files added by addFile
so that they do not get downloaded to
any new nodes.
- clearFiles() - Method in class org.apache.spark.SparkContext
-
Clear the job's list of files added by addFile
so that they do not get downloaded to
any new nodes.
- clearJars() - Method in class org.apache.spark.api.java.JavaSparkContext
-
Clear the job's list of JARs added by addJar
so that they do not get downloaded to
any new nodes.
- clearJars() - Method in class org.apache.spark.SparkContext
-
Clear the job's list of JARs added by addJar
so that they do not get downloaded to
any new nodes.
- clearJobGroup() - Method in class org.apache.spark.api.java.JavaSparkContext
-
Clear the current thread's job group ID and its description.
- clearJobGroup() - Method in class org.apache.spark.SparkContext
-
Clear the current thread's job group ID and its description.
- clearMetadata(Time) - Method in class org.apache.spark.streaming.dstream.DStream
-
Clear metadata that are older than rememberDuration
of this DStream.
- clearMetadata(Time) - Method in class org.apache.spark.streaming.DStreamGraph
-
- ClearMetadata - Class in org.apache.spark.streaming.scheduler
-
- ClearMetadata(Time) - Constructor for class org.apache.spark.streaming.scheduler.ClearMetadata
-
- clearNullValues() - Method in class org.apache.spark.util.TimeStampedWeakValueHashMap
-
Remove entries with values that are no longer strongly reachable.
- clearOldValues(long, Function2<A, B, BoxedUnit>) - Method in class org.apache.spark.util.TimeStampedHashMap
-
- clearOldValues(long) - Method in class org.apache.spark.util.TimeStampedHashMap
-
Removes old key-value pairs that have timestamp earlier than `threshTime`.
- clearOldValues(long) - Method in class org.apache.spark.util.TimeStampedHashSet
-
Removes old values that have timestamp earlier than threshTime
- clearOldValues(long) - Method in class org.apache.spark.util.TimeStampedWeakValueHashMap
-
Remove old key-value pairs with timestamps earlier than `threshTime`.
- clearThreshold() - Method in class org.apache.spark.mllib.classification.LogisticRegressionModel
-
:: Experimental ::
Clears the threshold so that predict
will output raw prediction scores.
- clearThreshold() - Method in class org.apache.spark.mllib.classification.SVMModel
-
:: Experimental ::
Clears the threshold so that predict
will output raw prediction scores.
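A hedged sketch of clearing the threshold so `predict` returns raw margins instead of 0/1 labels; `training` and `test` are placeholder RDD[LabeledPoint] datasets.

```scala
import org.apache.spark.mllib.classification.SVMWithSGD

// After training, clear the threshold so predict() returns raw scores,
// e.g. for ranking or custom thresholding.
val model = SVMWithSGD.train(training, 100)
model.clearThreshold()
val scoreAndLabel = test.map(p => (model.predict(p.features), p.label))
```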
- client() - Method in class org.apache.spark.scheduler.cluster.SparkDeploySchedulerBackend
-
- client() - Method in class org.apache.spark.storage.TachyonBlockManager
-
- client() - Method in class org.apache.spark.streaming.flume.FlumeConnection
-
- Clock - Interface in org.apache.spark
-
An abstract clock for measuring elapsed time.
- clock() - Method in class org.apache.spark.streaming.scheduler.JobGenerator
-
- clock() - Method in class org.apache.spark.streaming.scheduler.JobScheduler
-
- Clock - Interface in org.apache.spark.streaming.util
-
- Clock - Interface in org.apache.spark.util
-
An interface to represent clocks, so that they can be mocked out in unit tests.
- clone() - Method in class org.apache.spark.mllib.evaluation.binary.BinaryLabelCounter
-
- clone() - Method in class org.apache.spark.SparkConf
-
Copy this object
- clone(JvmType) - Method in class org.apache.spark.sql.columnar.ColumnType
-
Creates a duplicated copy of the value.
- clone() - Method in class org.apache.spark.storage.StorageLevel
-
- clone() - Method in class org.apache.spark.util.random.BernoulliCellSampler
-
- clone() - Method in class org.apache.spark.util.random.BernoulliSampler
-
- clone() - Method in class org.apache.spark.util.random.PoissonSampler
-
- clone() - Method in interface org.apache.spark.util.random.RandomSampler
-
return a copy of the RandomSampler object
- clone(T, SerializerInstance, ClassTag<T>) - Static method in class org.apache.spark.util.Utils
-
Clone an object using a Spark serializer.
- cloneComplement() - Method in class org.apache.spark.util.random.BernoulliCellSampler
-
Return a sampler that is the complement of the range specified of the current sampler.
- close() - Method in class org.apache.spark.api.java.JavaSparkContext
-
- close() - Method in class org.apache.spark.input.FixedLengthBinaryRecordReader
-
- close() - Method in class org.apache.spark.input.PortableDataStream
-
Close the file (if it is currently open)
- close() - Method in class org.apache.spark.input.StreamBasedRecordReader
-
- close() - Method in class org.apache.spark.input.WholeTextFileRecordReader
-
- close() - Method in class org.apache.spark.serializer.DeserializationStream
-
- close() - Method in class org.apache.spark.serializer.JavaDeserializationStream
-
- close() - Method in class org.apache.spark.serializer.JavaSerializationStream
-
- close() - Method in class org.apache.spark.serializer.KryoDeserializationStream
-
- close() - Method in class org.apache.spark.serializer.KryoSerializationStream
-
- close() - Method in class org.apache.spark.serializer.SerializationStream
-
- close() - Method in class org.apache.spark.SparkHadoopWriter
-
- close() - Method in class org.apache.spark.sql.hive.SparkHiveDynamicPartitionWriterContainer
-
- close() - Method in class org.apache.spark.sql.hive.SparkHiveWriterContainer
-
- close() - Method in class org.apache.spark.storage.BlockObjectWriter
-
- close() - Method in class org.apache.spark.storage.DiskBlockObjectWriter
-
- close() - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
-
- close() - Method in class org.apache.spark.streaming.util.RateLimitedOutputStream
-
- close() - Method in class org.apache.spark.streaming.util.WriteAheadLogRandomReader
-
- close() - Method in class org.apache.spark.streaming.util.WriteAheadLogReader
-
- close() - Method in class org.apache.spark.streaming.util.WriteAheadLogWriter
-
- close() - Method in class org.apache.spark.util.FileLogger
-
Close the writer.
- closeIfNeeded() - Method in class org.apache.spark.util.NextIterator
-
Calls the subclass-defined close method, but only once.
- ClosureCleaner - Class in org.apache.spark.util
-
- ClosureCleaner() - Constructor for class org.apache.spark.util.ClosureCleaner
-
- closureSerializer() - Method in class org.apache.spark.SparkEnv
-
- clusterCenters() - Method in class org.apache.spark.mllib.clustering.KMeansModel
-
- clusterCenters() - Method in class org.apache.spark.mllib.clustering.StreamingKMeansModel
-
- clusterWeights() - Method in class org.apache.spark.mllib.clustering.StreamingKMeansModel
-
- cmd() - Method in class org.apache.spark.sql.execution.ExecutedCommand
-
- cn() - Method in class org.apache.spark.mllib.feature.VocabWord
-
- coalesce(int) - Method in class org.apache.spark.api.java.JavaDoubleRDD
-
Return a new RDD that is reduced into numPartitions
partitions.
- coalesce(int, boolean) - Method in class org.apache.spark.api.java.JavaDoubleRDD
-
Return a new RDD that is reduced into numPartitions
partitions.
- coalesce(int) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Return a new RDD that is reduced into numPartitions
partitions.
- coalesce(int, boolean) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Return a new RDD that is reduced into numPartitions
partitions.
- coalesce(int) - Method in class org.apache.spark.api.java.JavaRDD
-
Return a new RDD that is reduced into numPartitions
partitions.
- coalesce(int, boolean) - Method in class org.apache.spark.api.java.JavaRDD
-
Return a new RDD that is reduced into numPartitions
partitions.
- coalesce(int, boolean, Ordering<T>) - Method in class org.apache.spark.rdd.RDD
-
Return a new RDD that is reduced into numPartitions
partitions.
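A sketch of shrinking the partition count, for example after a heavy filter; the data is made up.

```scala
// Without shuffle this is a narrow dependency; pass shuffle = true to rebalance
// the remaining data evenly across the new partitions.
val filtered = sc.parallelize(1 to 1000000, 100).filter(_ % 1000 == 0)
val compact  = filtered.coalesce(4)
val balanced = filtered.coalesce(4, shuffle = true)
```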
- coalesce(int, boolean) - Method in class org.apache.spark.sql.api.java.JavaSchemaRDD
-
Return a new RDD that is reduced into numPartitions
partitions.
- coalesce(int, boolean, Ordering<Row>) - Method in class org.apache.spark.sql.SchemaRDD
-
- CoalescedRDD<T> - Class in org.apache.spark.rdd
-
Represents a coalesced RDD that has fewer partitions than its parent RDD.
This class uses the PartitionCoalescer class to find a good partitioning of the parent RDD
so that each new partition has roughly the same number of parent partitions and so that
the preferred location of each new partition overlaps with as many preferred locations of its
parent partitions as possible.
- CoalescedRDD(RDD<T>, int, double, ClassTag<T>) - Constructor for class org.apache.spark.rdd.CoalescedRDD
-
- CoalescedRDDPartition - Class in org.apache.spark.rdd
-
Class that captures a coalesced RDD by essentially keeping track of parent partitions
- CoalescedRDDPartition(int, RDD<?>, int[], Option<String>) - Constructor for class org.apache.spark.rdd.CoalescedRDDPartition
-
- CoarseGrainedClusterMessage - Interface in org.apache.spark.scheduler.cluster
-
- CoarseGrainedClusterMessages - Class in org.apache.spark.scheduler.cluster
-
- CoarseGrainedClusterMessages() - Constructor for class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages
-
- CoarseGrainedClusterMessages.AddWebUIFilter - Class in org.apache.spark.scheduler.cluster
-
- CoarseGrainedClusterMessages.AddWebUIFilter(String, Map<String, String>, String) - Constructor for class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.AddWebUIFilter
-
- CoarseGrainedClusterMessages.AddWebUIFilter$ - Class in org.apache.spark.scheduler.cluster
-
- CoarseGrainedClusterMessages.AddWebUIFilter$() - Constructor for class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.AddWebUIFilter$
-
- CoarseGrainedClusterMessages.KillExecutors - Class in org.apache.spark.scheduler.cluster
-
- CoarseGrainedClusterMessages.KillExecutors(Seq<String>) - Constructor for class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.KillExecutors
-
- CoarseGrainedClusterMessages.KillExecutors$ - Class in org.apache.spark.scheduler.cluster
-
- CoarseGrainedClusterMessages.KillExecutors$() - Constructor for class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.KillExecutors$
-
- CoarseGrainedClusterMessages.KillTask - Class in org.apache.spark.scheduler.cluster
-
- CoarseGrainedClusterMessages.KillTask(long, String, boolean) - Constructor for class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.KillTask
-
- CoarseGrainedClusterMessages.KillTask$ - Class in org.apache.spark.scheduler.cluster
-
- CoarseGrainedClusterMessages.KillTask$() - Constructor for class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.KillTask$
-
- CoarseGrainedClusterMessages.LaunchTask - Class in org.apache.spark.scheduler.cluster
-
- CoarseGrainedClusterMessages.LaunchTask(SerializableBuffer) - Constructor for class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.LaunchTask
-
- CoarseGrainedClusterMessages.LaunchTask$ - Class in org.apache.spark.scheduler.cluster
-
- CoarseGrainedClusterMessages.LaunchTask$() - Constructor for class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.LaunchTask$
-
- CoarseGrainedClusterMessages.RegisterClusterManager$ - Class in org.apache.spark.scheduler.cluster
-
- CoarseGrainedClusterMessages.RegisterClusterManager$() - Constructor for class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.RegisterClusterManager$
-
- CoarseGrainedClusterMessages.RegisteredExecutor$ - Class in org.apache.spark.scheduler.cluster
-
- CoarseGrainedClusterMessages.RegisteredExecutor$() - Constructor for class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.RegisteredExecutor$
-
- CoarseGrainedClusterMessages.RegisterExecutor - Class in org.apache.spark.scheduler.cluster
-
- CoarseGrainedClusterMessages.RegisterExecutor(String, String, int) - Constructor for class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.RegisterExecutor
-
- CoarseGrainedClusterMessages.RegisterExecutor$ - Class in org.apache.spark.scheduler.cluster
-
- CoarseGrainedClusterMessages.RegisterExecutor$() - Constructor for class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.RegisterExecutor$
-
- CoarseGrainedClusterMessages.RegisterExecutorFailed - Class in org.apache.spark.scheduler.cluster
-
- CoarseGrainedClusterMessages.RegisterExecutorFailed(String) - Constructor for class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.RegisterExecutorFailed
-
- CoarseGrainedClusterMessages.RegisterExecutorFailed$ - Class in org.apache.spark.scheduler.cluster
-
- CoarseGrainedClusterMessages.RegisterExecutorFailed$() - Constructor for class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.RegisterExecutorFailed$
-
- CoarseGrainedClusterMessages.RemoveExecutor - Class in org.apache.spark.scheduler.cluster
-
- CoarseGrainedClusterMessages.RemoveExecutor(String, String) - Constructor for class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.RemoveExecutor
-
- CoarseGrainedClusterMessages.RemoveExecutor$ - Class in org.apache.spark.scheduler.cluster
-
- CoarseGrainedClusterMessages.RemoveExecutor$() - Constructor for class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.RemoveExecutor$
-
- CoarseGrainedClusterMessages.RequestExecutors - Class in org.apache.spark.scheduler.cluster
-
- CoarseGrainedClusterMessages.RequestExecutors(int) - Constructor for class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.RequestExecutors
-
- CoarseGrainedClusterMessages.RequestExecutors$ - Class in org.apache.spark.scheduler.cluster
-
- CoarseGrainedClusterMessages.RequestExecutors$() - Constructor for class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.RequestExecutors$
-
- CoarseGrainedClusterMessages.RetrieveSparkProps$ - Class in org.apache.spark.scheduler.cluster
-
- CoarseGrainedClusterMessages.RetrieveSparkProps$() - Constructor for class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.RetrieveSparkProps$
-
- CoarseGrainedClusterMessages.ReviveOffers$ - Class in org.apache.spark.scheduler.cluster
-
Alternate factory method that takes a ByteBuffer directly for the data field
- CoarseGrainedClusterMessages.ReviveOffers$() - Constructor for class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.ReviveOffers$
-
- CoarseGrainedClusterMessages.StatusUpdate - Class in org.apache.spark.scheduler.cluster
-
- CoarseGrainedClusterMessages.StatusUpdate(String, long, Enumeration.Value, SerializableBuffer) - Constructor for class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.StatusUpdate
-
- CoarseGrainedClusterMessages.StatusUpdate$ - Class in org.apache.spark.scheduler.cluster
-
- CoarseGrainedClusterMessages.StatusUpdate$() - Constructor for class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.StatusUpdate$
-
- CoarseGrainedClusterMessages.StopDriver$ - Class in org.apache.spark.scheduler.cluster
-
- CoarseGrainedClusterMessages.StopDriver$() - Constructor for class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.StopDriver$
-
- CoarseGrainedClusterMessages.StopExecutor$ - Class in org.apache.spark.scheduler.cluster
-
- CoarseGrainedClusterMessages.StopExecutor$() - Constructor for class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.StopExecutor$
-
- CoarseGrainedClusterMessages.StopExecutors$ - Class in org.apache.spark.scheduler.cluster
-
- CoarseGrainedClusterMessages.StopExecutors$() - Constructor for class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.StopExecutors$
-
- CoarseGrainedSchedulerBackend - Class in org.apache.spark.scheduler.cluster
-
A scheduler backend that waits for coarse grained executors to connect to it through Akka.
- CoarseGrainedSchedulerBackend(TaskSchedulerImpl, ActorSystem) - Constructor for class org.apache.spark.scheduler.cluster.CoarseGrainedSchedulerBackend
-
- CoarseGrainedSchedulerBackend.DriverActor - Class in org.apache.spark.scheduler.cluster
-
- CoarseGrainedSchedulerBackend.DriverActor(Seq<Tuple2<String, String>>) - Constructor for class org.apache.spark.scheduler.cluster.CoarseGrainedSchedulerBackend.DriverActor
-
- CoarseMesosSchedulerBackend - Class in org.apache.spark.scheduler.cluster.mesos
-
A SchedulerBackend that runs tasks on Mesos, but uses "coarse-grained" tasks, where it holds
onto each Mesos node for the duration of the Spark job instead of relinquishing cores whenever
a task is done.
- CoarseMesosSchedulerBackend(TaskSchedulerImpl, SparkContext, String) - Constructor for class org.apache.spark.scheduler.cluster.mesos.CoarseMesosSchedulerBackend
-
- code() - Method in class org.apache.spark.mllib.feature.VocabWord
-
- codegenEnabled() - Method in class org.apache.spark.sql.execution.SparkPlan
-
- codegenEnabled() - Method in interface org.apache.spark.sql.SQLConf
-
When set to true, Spark SQL will use the Scala compiler at runtime to generate custom bytecode
that evaluates expressions found in queries.
- codeLen() - Method in class org.apache.spark.mllib.feature.VocabWord
-
- cogroup(JavaPairRDD<K, W>, Partitioner) - Method in class org.apache.spark.api.java.JavaPairRDD
-
For each key k in `this` or `other`, return a resulting RDD that contains a tuple with the
list of values for that key in `this` as well as `other`.
- cogroup(JavaPairRDD<K, W1>, JavaPairRDD<K, W2>, Partitioner) - Method in class org.apache.spark.api.java.JavaPairRDD
-
For each key k in `this` or `other1` or `other2`, return a resulting RDD that contains a
tuple with the list of values for that key in `this`, `other1` and `other2`.
- cogroup(JavaPairRDD<K, W1>, JavaPairRDD<K, W2>, JavaPairRDD<K, W3>, Partitioner) - Method in class org.apache.spark.api.java.JavaPairRDD
-
For each key k in `this` or `other1` or `other2` or `other3`, return a resulting RDD that
contains a tuple with the list of values for that key in `this`, `other1`, `other2` and `other3`.
- cogroup(JavaPairRDD<K, W>) - Method in class org.apache.spark.api.java.JavaPairRDD
-
For each key k in `this` or `other`, return a resulting RDD that contains a tuple with the
list of values for that key in `this` as well as `other`.
- cogroup(JavaPairRDD<K, W1>, JavaPairRDD<K, W2>) - Method in class org.apache.spark.api.java.JavaPairRDD
-
For each key k in `this` or `other1` or `other2`, return a resulting RDD that contains a
tuple with the list of values for that key in `this`, `other1` and `other2`.
- cogroup(JavaPairRDD<K, W1>, JavaPairRDD<K, W2>, JavaPairRDD<K, W3>) - Method in class org.apache.spark.api.java.JavaPairRDD
-
For each key k in `this` or `other1` or `other2` or `other3`, return a resulting RDD that
contains a tuple with the list of values for that key in `this`, `other1`, `other2` and `other3`.
- cogroup(JavaPairRDD<K, W>, int) - Method in class org.apache.spark.api.java.JavaPairRDD
-
For each key k in `this` or `other`, return a resulting RDD that contains a tuple with the
list of values for that key in `this` as well as `other`.
- cogroup(JavaPairRDD<K, W1>, JavaPairRDD<K, W2>, int) - Method in class org.apache.spark.api.java.JavaPairRDD
-
For each key k in `this` or `other1` or `other2`, return a resulting RDD that contains a
tuple with the list of values for that key in `this`, `other1` and `other2`.
- cogroup(JavaPairRDD<K, W1>, JavaPairRDD<K, W2>, JavaPairRDD<K, W3>, int) - Method in class org.apache.spark.api.java.JavaPairRDD
-
For each key k in `this` or `other1` or `other2` or `other3`, return a resulting RDD that
contains a tuple with the list of values for that key in `this`, `other1`, `other2` and `other3`.
- cogroup(RDD<Tuple2<K, W1>>, RDD<Tuple2<K, W2>>, RDD<Tuple2<K, W3>>, Partitioner) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
For each key k in `this` or `other1` or `other2` or `other3`, return a resulting RDD that
contains a tuple with the list of values for that key in `this`, `other1`, `other2` and `other3`.
- cogroup(RDD<Tuple2<K, W>>, Partitioner) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
- cogroup(RDD<Tuple2<K, W1>>, RDD<Tuple2<K, W2>>, Partitioner) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
- cogroup(RDD<Tuple2<K, W1>>, RDD<Tuple2<K, W2>>, RDD<Tuple2<K, W3>>) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
- cogroup(RDD<Tuple2<K, W>>) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
For each key k in `this` or `other`, return a resulting RDD that contains a tuple with the
list of values for that key in `this` as well as `other`.
- cogroup(RDD<Tuple2<K, W1>>, RDD<Tuple2<K, W2>>) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
For each key k in `this` or `other1` or `other2`, return a resulting RDD that contains a
tuple with the list of values for that key in `this`, `other1` and `other2`.
- cogroup(RDD<Tuple2<K, W>>, int) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
For each key k in this
or other
, return a resulting RDD that contains a tuple with the
list of values for that key in this
as well as other
.
- cogroup(RDD<Tuple2<K, W1>>, RDD<Tuple2<K, W2>>, int) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
For each key k in this
or other1
or other2
, return a resulting RDD that contains a
tuple with the list of values for that key in this
, other1
and other2
.
- cogroup(RDD<Tuple2<K, W1>>, RDD<Tuple2<K, W2>>, RDD<Tuple2<K, W3>>, int) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
For each key k in this
or other1
or other2
or other3
,
return a resulting RDD that contains a tuple with the list of values
for that key in this
, other1
, other2
and other3
.
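As a rough sketch of how the two-RDD cogroup overload behaves in practice (assuming an existing SparkContext named sc, the pair-RDD implicits from org.apache.spark.SparkContext._ on older Spark versions, and purely illustrative data):

    import org.apache.spark.SparkContext._

    val scores = sc.parallelize(Seq(("alice", 1), ("bob", 2), ("alice", 3)))
    val ages   = sc.parallelize(Seq(("alice", 30), ("carol", 25)))
    // cogroup yields RDD[(K, (Iterable[V], Iterable[W]))]: every key present in either input,
    // paired with all of its values from each side (an empty Iterable if the key is absent there)
    val grouped = scores.cogroup(ages)
    grouped.collect().foreach { case (k, (vs, ws)) =>
      println(s"$k -> values=${vs.toList}, other=${ws.toList}")
    }

The multi-RDD and Partitioner/numPartitions overloads work the same way, only with more value iterables per key and explicit control over partitioning.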
- cogroup(JavaPairDStream<K, W>) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
-
Return a new DStream by applying 'cogroup' between RDDs of this DStream and other DStream.
- cogroup(JavaPairDStream<K, W>, int) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
-
Return a new DStream by applying 'cogroup' between RDDs of this DStream and other DStream.
- cogroup(JavaPairDStream<K, W>, Partitioner) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
-
Return a new DStream by applying 'cogroup' between RDDs of this DStream and other DStream.
- cogroup(DStream<Tuple2<K, W>>, ClassTag<W>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
-
Return a new DStream by applying 'cogroup' between RDDs of this DStream and other DStream.
- cogroup(DStream<Tuple2<K, W>>, int, ClassTag<W>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
-
Return a new DStream by applying 'cogroup' between RDDs of this DStream and other DStream.
- cogroup(DStream<Tuple2<K, W>>, Partitioner, ClassTag<W>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
-
Return a new DStream by applying 'cogroup' between RDDs of this DStream and other DStream.
- CoGroupedRDD<K> - Class in org.apache.spark.rdd
-
:: DeveloperApi ::
An RDD that cogroups its parents.
- CoGroupedRDD(Seq<RDD<? extends Product2<K, ?>>>, Partitioner) - Constructor for class org.apache.spark.rdd.CoGroupedRDD
-
- CoGroupPartition - Class in org.apache.spark.rdd
-
- CoGroupPartition(int, CoGroupSplitDep[]) - Constructor for class org.apache.spark.rdd.CoGroupPartition
-
- cogroupResult2ToJava(RDD<Tuple2<K, Tuple3<Iterable<V>, Iterable<W1>, Iterable<W2>>>>, ClassTag<K>) - Static method in class org.apache.spark.api.java.JavaPairRDD
-
- cogroupResult3ToJava(RDD<Tuple2<K, Tuple4<Iterable<V>, Iterable<W1>, Iterable<W2>, Iterable<W3>>>>, ClassTag<K>) - Static method in class org.apache.spark.api.java.JavaPairRDD
-
- cogroupResultToJava(RDD<Tuple2<K, Tuple2<Iterable<V>, Iterable<W>>>>, ClassTag<K>) - Static method in class org.apache.spark.api.java.JavaPairRDD
-
- CoGroupSplitDep - Interface in org.apache.spark.rdd
-
- collect() - Method in interface org.apache.spark.api.java.JavaRDDLike
-
Return an array that contains all of the elements in this RDD.
- collect() - Method in class org.apache.spark.graphx.impl.EdgeRDDImpl
-
- collect() - Method in class org.apache.spark.rdd.RDD
-
Return an array that contains all of the elements in this RDD.
- collect(PartialFunction<T, U>, ClassTag<U>) - Method in class org.apache.spark.rdd.RDD
-
Return an RDD that contains all matching values by applying f.
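A minimal sketch of the partial-function overload of collect (as opposed to the zero-argument action), assuming an existing SparkContext sc; the sample data is illustrative only:

    // keep only the Int elements; values the partial function does not match are dropped
    val mixed = sc.parallelize(Seq[Any](1, "two", 3, "four"))
    val ints  = mixed.collect { case i: Int => i }   // RDD[Int] containing 1 and 3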
- collect() - Method in class org.apache.spark.sql.api.java.JavaSchemaRDD
-
- collect() - Method in class org.apache.spark.sql.SchemaRDD
-
- collectAsMap() - Method in class org.apache.spark.api.java.JavaPairRDD
-
Return the key-value pairs in this RDD to the master as a Map.
- collectAsMap() - Method in class org.apache.spark.rdd.PairRDDFunctions
-
Return the key-value pairs in this RDD to the master as a Map.
- collectAsync() - Method in interface org.apache.spark.api.java.JavaRDDLike
-
The asynchronous version of collect, which returns a future for retrieving an array containing all of the elements in this RDD.
- collectAsync() - Method in class org.apache.spark.rdd.AsyncRDDActions
-
Returns a future for retrieving all elements of this RDD.
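A small sketch of the asynchronous variant, assuming an existing SparkContext sc; on older Spark versions the implicit conversion to AsyncRDDActions comes from org.apache.spark.SparkContext._:

    import org.apache.spark.SparkContext._
    import scala.concurrent.ExecutionContext.Implicits.global

    val rdd = sc.parallelize(1 to 1000)
    val futureResult = rdd.collectAsync()               // FutureAction[Seq[Int]]
    futureResult.onComplete(result => println(result))  // callback runs when the job finishes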
- collectEdges(EdgeDirection) - Method in class org.apache.spark.graphx.GraphOps
-
Returns an RDD that contains for each vertex v its local edges,
i.e., the edges that are incident on v, in the user-specified direction.
- collectedStatistics() - Method in class org.apache.spark.sql.columnar.BinaryColumnStats
-
- collectedStatistics() - Method in class org.apache.spark.sql.columnar.BooleanColumnStats
-
- collectedStatistics() - Method in class org.apache.spark.sql.columnar.ByteColumnStats
-
- collectedStatistics() - Method in interface org.apache.spark.sql.columnar.ColumnStats
-
Column statistics represented as a single row, currently including closed lower bound, closed
upper bound and null count.
- collectedStatistics() - Method in class org.apache.spark.sql.columnar.DateColumnStats
-
- collectedStatistics() - Method in class org.apache.spark.sql.columnar.DoubleColumnStats
-
- collectedStatistics() - Method in class org.apache.spark.sql.columnar.FloatColumnStats
-
- collectedStatistics() - Method in class org.apache.spark.sql.columnar.GenericColumnStats
-
- collectedStatistics() - Method in class org.apache.spark.sql.columnar.IntColumnStats
-
- collectedStatistics() - Method in class org.apache.spark.sql.columnar.LongColumnStats
-
- collectedStatistics() - Method in class org.apache.spark.sql.columnar.NoopColumnStats
-
- collectedStatistics() - Method in class org.apache.spark.sql.columnar.ShortColumnStats
-
- collectedStatistics() - Method in class org.apache.spark.sql.columnar.StringColumnStats
-
- collectedStatistics() - Method in class org.apache.spark.sql.columnar.TimestampColumnStats
-
- CollectionsUtils - Class in org.apache.spark.util
-
- CollectionsUtils() - Constructor for class org.apache.spark.util.CollectionsUtils
-
- collectNeighborIds(EdgeDirection) - Method in class org.apache.spark.graphx.GraphOps
-
Collect the neighbor vertex ids for each vertex.
- collectNeighbors(EdgeDirection) - Method in class org.apache.spark.graphx.GraphOps
-
Collect the neighbor vertex attributes for each vertex.
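For illustration, a tiny GraphX example of collecting neighbor ids (assuming an existing SparkContext sc; the vertices and edges are made up):

    import org.apache.spark.graphx.{Edge, EdgeDirection, Graph}

    val vertices = sc.parallelize(Seq((1L, "a"), (2L, "b"), (3L, "c")))
    val edges    = sc.parallelize(Seq(Edge(1L, 2L, 0), Edge(2L, 3L, 0)))
    val graph    = Graph(vertices, edges)
    // neighbor ids reachable over edges in either direction
    val neighborIds = graph.collectNeighborIds(EdgeDirection.Either)
    neighborIds.collect().foreach { case (id, ids) => println(s"$id -> ${ids.mkString(",")}") }

collectNeighbors works the same way but returns the neighbor attributes alongside their ids.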
- collectPartitions(int[]) - Method in interface org.apache.spark.api.java.JavaRDDLike
-
Return an array that contains all of the elements in a specific partition of this RDD.
- collectPartitions() - Method in class org.apache.spark.rdd.RDD
-
A private method for tests, to look at the contents of each partition
- collectToPython() - Method in class org.apache.spark.sql.SchemaRDD
-
Serializes the Array[Row] returned by SchemaRDD's optimized collect(), using the same
format as javaToPython.
- colPtrs() - Method in class org.apache.spark.mllib.linalg.SparseMatrix
-
- colStats(RDD<Vector>) - Static method in class org.apache.spark.mllib.stat.Statistics
-
:: Experimental ::
Computes column-wise summary statistics for the input RDD[Vector].
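A short sketch of computing column summaries with Statistics.colStats, assuming an existing SparkContext sc and illustrative data:

    import org.apache.spark.mllib.linalg.Vectors
    import org.apache.spark.mllib.stat.Statistics

    val observations = sc.parallelize(Seq(
      Vectors.dense(1.0, 10.0),
      Vectors.dense(2.0, 20.0),
      Vectors.dense(3.0, 30.0)))
    val summary = Statistics.colStats(observations)   // MultivariateStatisticalSummary
    println(summary.mean)        // per-column means
    println(summary.variance)    // per-column variances
    println(summary.numNonzeros) // per-column non-zero counts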
- ColumnAccessor - Interface in org.apache.spark.sql.columnar
-
An Iterator-like trait used to extract values from a columnar byte buffer.
- columnBatchSize() - Method in interface org.apache.spark.sql.SQLConf
-
The number of rows that will be
- ColumnBuilder - Interface in org.apache.spark.sql.columnar
-
- columnNameOfCorruptRecord() - Method in interface org.apache.spark.sql.SQLConf
-
- columnOrdinals() - Method in class org.apache.spark.sql.hive.MetastoreRelation
-
An attribute map for determining the ordinal for non-partition columns.
- columnPruningPred() - Method in class org.apache.spark.sql.parquet.ParquetTableScan
-
- columnSimilarities() - Method in class org.apache.spark.mllib.linalg.distributed.RowMatrix
-
Compute all cosine similarities between columns of this matrix using the brute-force
approach of computing normalized dot products.
- columnSimilarities(double) - Method in class org.apache.spark.mllib.linalg.distributed.RowMatrix
-
Compute similarities between columns of this matrix using a sampling approach.
- columnSimilaritiesDIMSUM(double[], double) - Method in class org.apache.spark.mllib.linalg.distributed.RowMatrix
-
Find all similar columns using the DIMSUM sampling algorithm, described in two papers
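As an illustrative sketch of the exact and sampled column-similarity methods on RowMatrix (assuming an existing SparkContext sc; the matrix values are made up):

    import org.apache.spark.mllib.linalg.Vectors
    import org.apache.spark.mllib.linalg.distributed.RowMatrix

    val rows = sc.parallelize(Seq(
      Vectors.dense(1.0, 2.0, 1.0),
      Vectors.dense(2.0, 4.0, 0.0),
      Vectors.dense(3.0, 6.0, 1.0)))
    val mat    = new RowMatrix(rows)
    val exact  = mat.columnSimilarities()      // brute-force cosine similarities, as a CoordinateMatrix
    val approx = mat.columnSimilarities(0.1)   // DIMSUM sampling; similarities below the threshold may be missed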
- ColumnStatisticsSchema - Class in org.apache.spark.sql.columnar
-
- ColumnStatisticsSchema(Attribute) - Constructor for class org.apache.spark.sql.columnar.ColumnStatisticsSchema
-
- columnStats() - Method in class org.apache.spark.sql.columnar.BasicColumnBuilder
-
- columnStats() - Method in interface org.apache.spark.sql.columnar.ColumnBuilder
-
Column statistics information
- ColumnStats - Interface in org.apache.spark.sql.columnar
-
Used to collect statistical information when building in-memory columns.
- columnStats() - Method in class org.apache.spark.sql.columnar.NativeColumnBuilder
-
- columnType() - Method in class org.apache.spark.sql.columnar.BasicColumnBuilder
-
- ColumnType<T extends org.apache.spark.sql.catalyst.types.DataType,JvmType> - Class in org.apache.spark.sql.columnar
-
An abstract class that represents type of a column.
- ColumnType(int, int) - Constructor for class org.apache.spark.sql.columnar.ColumnType
-
- columnType() - Method in class org.apache.spark.sql.columnar.NativeColumnBuilder
-
- combineByKey(Function<V, C>, Function2<C, V, C>, Function2<C, C, C>, Partitioner) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Generic function to combine the elements for each key using a custom set of aggregation
functions.
- combineByKey(Function<V, C>, Function2<C, V, C>, Function2<C, C, C>, int) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Simplified version of combineByKey that hash-partitions the output RDD.
- combineByKey(Function<V, C>, Function2<C, V, C>, Function2<C, C, C>) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Simplified version of combineByKey that hash-partitions the resulting RDD using the existing
partitioner/parallelism level.
- combineByKey(Function1<V, C>, Function2<C, V, C>, Function2<C, C, C>, Partitioner, boolean, Serializer) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
Generic function to combine the elements for each key using a custom set of aggregation
functions.
- combineByKey(Function1<V, C>, Function2<C, V, C>, Function2<C, C, C>, int) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
Simplified version of combineByKey that hash-partitions the output RDD.
- combineByKey(Function1<V, C>, Function2<C, V, C>, Function2<C, C, C>) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
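The classic use of combineByKey is a per-key aggregate such as a mean; a minimal sketch, assuming an existing SparkContext sc and, on older Spark versions, the pair-RDD implicits from org.apache.spark.SparkContext._:

    import org.apache.spark.SparkContext._

    val pairs = sc.parallelize(Seq(("a", 1.0), ("a", 3.0), ("b", 4.0)))
    val sumAndCount = pairs.combineByKey(
      (v: Double) => (v, 1),                                               // createCombiner
      (acc: (Double, Int), v: Double) => (acc._1 + v, acc._2 + 1),         // mergeValue
      (a: (Double, Int), b: (Double, Int)) => (a._1 + b._1, a._2 + b._2))  // mergeCombiners
    val meanByKey = sumAndCount.mapValues { case (sum, n) => sum / n }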
- combineByKey(Function<V, C>, Function2<C, V, C>, Function2<C, C, C>, Partitioner) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
-
Combine elements of each key in DStream's RDDs using custom function.
- combineByKey(Function<V, C>, Function2<C, V, C>, Function2<C, C, C>, Partitioner, boolean) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
-
Combine elements of each key in DStream's RDDs using custom function.
- combineByKey(Function1<V, C>, Function2<C, V, C>, Function2<C, C, C>, Partitioner, boolean, ClassTag<C>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
-
Combine elements of each key in DStream's RDDs using custom functions.
- combineCombinersByKey(Iterator<Product2<K, C>>) - Method in class org.apache.spark.Aggregator
-
- combineCombinersByKey(Iterator<Product2<K, C>>, TaskContext) - Method in class org.apache.spark.Aggregator
-
- combineValuesByKey(Iterator<Product2<K, V>>) - Method in class org.apache.spark.Aggregator
-
- combineValuesByKey(Iterator<Product2<K, V>>, TaskContext) - Method in class org.apache.spark.Aggregator
-
- Command - Interface in org.apache.spark.sql.execution
-
- command() - Method in class org.apache.spark.sql.execution.PythonUDF
-
- commands() - Method in class org.apache.spark.sql.hive.test.TestHiveContext.TestTable
-
- commit() - Method in class org.apache.spark.SparkHadoopWriter
-
- commitAndClose() - Method in class org.apache.spark.storage.BlockObjectWriter
-
Flush the partial writes and commit them as a single atomic block.
- commitAndClose() - Method in class org.apache.spark.storage.DiskBlockObjectWriter
-
- commitJob() - Method in class org.apache.spark.SparkHadoopWriter
-
- commitJob() - Method in class org.apache.spark.sql.hive.SparkHiveDynamicPartitionWriterContainer
-
- commitJob() - Method in class org.apache.spark.sql.hive.SparkHiveWriterContainer
-
- commonHeaderNodes() - Static method in class org.apache.spark.ui.UIUtils
-
- comparator(Schedulable, Schedulable) - Method in class org.apache.spark.scheduler.FairSchedulingAlgorithm
-
- comparator(Schedulable, Schedulable) - Method in class org.apache.spark.scheduler.FIFOSchedulingAlgorithm
-
- comparator(Schedulable, Schedulable) - Method in interface org.apache.spark.scheduler.SchedulingAlgorithm
-
- compare(PartitionGroup, PartitionGroup) - Method in class org.apache.spark.rdd.PartitionCoalescer
-
- compare(Option<PartitionGroup>, Option<PartitionGroup>) - Method in class org.apache.spark.rdd.PartitionCoalescer
-
- compare(RDDInfo) - Method in class org.apache.spark.storage.RDDInfo
-
- compatibilityBlackList() - Static method in class org.apache.spark.sql.hive.HiveShim
-
- compatibleType(DataType, DataType) - Static method in class org.apache.spark.sql.json.JsonRDD
-
Returns the most general data type for two given data types.
- completedIndices() - Method in class org.apache.spark.ui.jobs.UIData.StageUIData
-
- completedJobs() - Method in class org.apache.spark.ui.jobs.JobProgressListener
-
- completedStageIndices() - Method in class org.apache.spark.ui.jobs.UIData.JobUIData
-
- completedStages() - Method in class org.apache.spark.ui.jobs.JobProgressListener
-
- completedTasks() - Method in class org.apache.spark.ui.exec.ExecutorSummaryInfo
-
- completion() - Method in class org.apache.spark.util.CompletionIterator
-
- CompletionEvent - Class in org.apache.spark.scheduler
-
- CompletionEvent(Task<?>, TaskEndReason, Object, Map<Object, Object>, TaskInfo, TaskMetrics) - Constructor for class org.apache.spark.scheduler.CompletionEvent
-
- CompletionIterator<A,I extends scala.collection.Iterator<A>> - Class in org.apache.spark.util
-
Wrapper around an iterator which calls a completion method after it successfully iterates
through all the elements.
- CompletionIterator(I) - Constructor for class org.apache.spark.util.CompletionIterator
-
- completionTime() - Method in class org.apache.spark.scheduler.StageInfo
-
Time when all tasks in the stage completed or when the stage was cancelled.
- ComplexColumnBuilder<T extends org.apache.spark.sql.catalyst.types.DataType,JvmType> - Class in org.apache.spark.sql.columnar
-
- ComplexColumnBuilder(ColumnStats, ColumnType<T, JvmType>) - Constructor for class org.apache.spark.sql.columnar.ComplexColumnBuilder
-
- ComplexFutureAction<T> - Class in org.apache.spark
-
A FutureAction for actions that could trigger multiple Spark jobs.
- ComplexFutureAction() - Constructor for class org.apache.spark.ComplexFutureAction
-
- compress(ByteBuffer, ByteBuffer) - Method in class org.apache.spark.sql.columnar.compression.BooleanBitSet.Encoder
-
- compress(ByteBuffer, ByteBuffer) - Method in class org.apache.spark.sql.columnar.compression.DictionaryEncoding.Encoder
-
- compress(ByteBuffer, ByteBuffer) - Method in interface org.apache.spark.sql.columnar.compression.Encoder
-
- compress(ByteBuffer, ByteBuffer) - Method in class org.apache.spark.sql.columnar.compression.IntDelta.Encoder
-
- compress(ByteBuffer, ByteBuffer) - Method in class org.apache.spark.sql.columnar.compression.LongDelta.Encoder
-
- compress(ByteBuffer, ByteBuffer) - Method in class org.apache.spark.sql.columnar.compression.PassThrough.Encoder
-
- compress(ByteBuffer, ByteBuffer) - Method in class org.apache.spark.sql.columnar.compression.RunLengthEncoding.Encoder
-
- compressCodec() - Method in class org.apache.spark.sql.hive.ShimFileSinkDesc
-
- compressed() - Method in class org.apache.spark.sql.hive.ShimFileSinkDesc
-
- compressedInputStream(InputStream) - Method in interface org.apache.spark.io.CompressionCodec
-
- compressedInputStream(InputStream) - Method in class org.apache.spark.io.LZ4CompressionCodec
-
- compressedInputStream(InputStream) - Method in class org.apache.spark.io.LZFCompressionCodec
-
- compressedInputStream(InputStream) - Method in class org.apache.spark.io.SnappyCompressionCodec
-
- CompressedMapStatus - Class in org.apache.spark.scheduler
-
A MapStatus implementation that tracks the size of each block.
- CompressedMapStatus(BlockManagerId, byte[]) - Constructor for class org.apache.spark.scheduler.CompressedMapStatus
-
- CompressedMapStatus(BlockManagerId, long[]) - Constructor for class org.apache.spark.scheduler.CompressedMapStatus
-
- compressedOutputStream(OutputStream) - Method in interface org.apache.spark.io.CompressionCodec
-
- compressedOutputStream(OutputStream) - Method in class org.apache.spark.io.LZ4CompressionCodec
-
- compressedOutputStream(OutputStream) - Method in class org.apache.spark.io.LZFCompressionCodec
-
- compressedOutputStream(OutputStream) - Method in class org.apache.spark.io.SnappyCompressionCodec
-
- compressedSize() - Method in class org.apache.spark.sql.columnar.compression.BooleanBitSet.Encoder
-
- compressedSize() - Method in class org.apache.spark.sql.columnar.compression.DictionaryEncoding.Encoder
-
- compressedSize() - Method in interface org.apache.spark.sql.columnar.compression.Encoder
-
- compressedSize() - Method in class org.apache.spark.sql.columnar.compression.IntDelta.Encoder
-
- compressedSize() - Method in class org.apache.spark.sql.columnar.compression.LongDelta.Encoder
-
- compressedSize() - Method in class org.apache.spark.sql.columnar.compression.PassThrough.Encoder
-
- compressedSize() - Method in class org.apache.spark.sql.columnar.compression.RunLengthEncoding.Encoder
-
- CompressibleColumnAccessor<T extends org.apache.spark.sql.catalyst.types.NativeType> - Interface in org.apache.spark.sql.columnar.compression
-
- CompressibleColumnBuilder<T extends org.apache.spark.sql.catalyst.types.NativeType> - Interface in org.apache.spark.sql.columnar.compression
-
A stackable trait that builds an optionally compressed byte buffer for a column.
- COMPRESSION_CODEC_PREFIX() - Static method in class org.apache.spark.scheduler.EventLoggingListener
-
- CompressionCodec - Interface in org.apache.spark.io
-
:: DeveloperApi ::
CompressionCodec allows the customization of choosing different compression implementations
to be used in block storage.
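The codec is normally selected through configuration rather than instantiated directly; a hedged sketch (the application name is arbitrary, and the set of codec names may vary by Spark version):

    import org.apache.spark.{SparkConf, SparkContext}

    // choose the codec used for compressing blocks, shuffle outputs and event logs
    val conf = new SparkConf()
      .setAppName("codec-example")
      .set("spark.io.compression.codec", "lz4")   // other commonly available values: "lzf", "snappy"
    val sc = new SparkContext(conf)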
- compressionCodec() - Method in class org.apache.spark.scheduler.EventLoggingInfo
-
- compressionCodec() - Method in class org.apache.spark.streaming.CheckpointWriter
-
- compressionEncoders() - Method in interface org.apache.spark.sql.columnar.compression.CompressibleColumnBuilder
-
- compressionRatio() - Method in interface org.apache.spark.sql.columnar.compression.Encoder
-
- CompressionScheme - Interface in org.apache.spark.sql.columnar.compression
-
- compressType() - Method in class org.apache.spark.sql.hive.ShimFileSinkDesc
-
- compute(Partition, TaskContext) - Method in class org.apache.spark.graphx.EdgeRDD
-
- compute(Partition, TaskContext) - Method in class org.apache.spark.graphx.VertexRDD
-
Provides the RDD[(VertexId, VD)] equivalent output.
- compute(Vector, double, Vector) - Method in class org.apache.spark.mllib.optimization.Gradient
-
Compute the gradient and loss given the features of a single data point.
- compute(Vector, double, Vector, Vector) - Method in class org.apache.spark.mllib.optimization.Gradient
-
Compute the gradient and loss given the features of a single data point,
add the gradient to a provided vector to avoid creating new objects, and return loss.
- compute(Vector, double, Vector) - Method in class org.apache.spark.mllib.optimization.HingeGradient
-
- compute(Vector, double, Vector, Vector) - Method in class org.apache.spark.mllib.optimization.HingeGradient
-
- compute(Vector, Vector, double, int, double) - Method in class org.apache.spark.mllib.optimization.L1Updater
-
- compute(Vector, double, Vector) - Method in class org.apache.spark.mllib.optimization.LeastSquaresGradient
-
- compute(Vector, double, Vector, Vector) - Method in class org.apache.spark.mllib.optimization.LeastSquaresGradient
-
- compute(Vector, double, Vector) - Method in class org.apache.spark.mllib.optimization.LogisticGradient
-
- compute(Vector, double, Vector, Vector) - Method in class org.apache.spark.mllib.optimization.LogisticGradient
-
- compute(Vector, Vector, double, int, double) - Method in class org.apache.spark.mllib.optimization.SimpleUpdater
-
- compute(Vector, Vector, double, int, double) - Method in class org.apache.spark.mllib.optimization.SquaredL2Updater
-
- compute(Vector, Vector, double, int, double) - Method in class org.apache.spark.mllib.optimization.Updater
-
Compute an updated value for weights given the gradient, stepSize, iteration number and
regularization parameter.
- compute(Partition, TaskContext) - Method in class org.apache.spark.mllib.rdd.RandomRDD
-
- compute(Partition, TaskContext) - Method in class org.apache.spark.mllib.rdd.RandomVectorRDD
-
- compute(Partition, TaskContext) - Method in class org.apache.spark.mllib.rdd.SlidingRDD
-
- compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.BlockRDD
-
- compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.CartesianRDD
-
- compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.CheckpointRDD
-
- compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.CoalescedRDD
-
- compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.CoGroupedRDD
-
- compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.EmptyRDD
-
- compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.FilteredRDD
-
- compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.FlatMappedRDD
-
- compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.FlatMappedValuesRDD
-
- compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.GlommedRDD
-
- compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.HadoopRDD
-
- compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.HadoopRDD.HadoopMapPartitionsWithSplitRDD
-
- compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.JdbcRDD
-
- compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.MapPartitionsRDD
-
- compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.MappedRDD
-
- compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.MappedValuesRDD
-
- compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.NewHadoopRDD
-
- compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.NewHadoopRDD.NewHadoopMapPartitionsWithSplitRDD
-
- compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.ParallelCollectionRDD
-
- compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.PartitionerAwareUnionRDD
-
- compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.PartitionPruningRDD
-
- compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.PartitionwiseSampledRDD
-
- compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.PipedRDD
-
- compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.RDD
-
:: DeveloperApi ::
Implemented by subclasses to compute a given partition.
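For a sense of what subclasses provide, here is a toy custom RDD; it is only a sketch of the DeveloperApi contract (single partition, fixed data), not a pattern taken from the Spark sources:

    import org.apache.spark.{Partition, SparkContext, TaskContext}
    import org.apache.spark.rdd.RDD

    class SinglePartition(override val index: Int) extends Partition

    // an RDD with one partition that yields the numbers 1..n
    class TinyRangeRDD(sc: SparkContext, n: Int) extends RDD[Int](sc, Nil) {
      override protected def getPartitions: Array[Partition] = Array(new SinglePartition(0))
      override def compute(split: Partition, context: TaskContext): Iterator[Int] =
        (1 to n).iterator
    }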
- compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.SampledRDD
-
- compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.ShuffledRDD
-
- compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.SubtractedRDD
-
- compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.UnionRDD
-
- compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.ZippedPartitionsRDD2
-
- compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.ZippedPartitionsRDD3
-
- compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.ZippedPartitionsRDD4
-
- compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.ZippedWithIndexRDD
-
- compute(Partition, TaskContext) - Method in class org.apache.spark.sql.SchemaRDD
-
- compute(Time) - Method in class org.apache.spark.streaming.api.java.JavaDStream
-
Generate an RDD for the given duration
- compute(Time) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
-
Method that generates an RDD for the given Duration.
- compute(Time) - Method in class org.apache.spark.streaming.dstream.ConstantInputDStream
-
- compute(Time) - Method in class org.apache.spark.streaming.dstream.DStream
-
Method that generates an RDD for the given time.
- compute(Time) - Method in class org.apache.spark.streaming.dstream.FileInputDStream
-
Finds the files that were modified since the last time this method was called and makes
a union RDD out of them.
- compute(Time) - Method in class org.apache.spark.streaming.dstream.FilteredDStream
-
- compute(Time) - Method in class org.apache.spark.streaming.dstream.FlatMappedDStream
-
- compute(Time) - Method in class org.apache.spark.streaming.dstream.FlatMapValuedDStream
-
- compute(Time) - Method in class org.apache.spark.streaming.dstream.ForEachDStream
-
- compute(Time) - Method in class org.apache.spark.streaming.dstream.GlommedDStream
-
- compute(Time) - Method in class org.apache.spark.streaming.dstream.MapPartitionedDStream
-
- compute(Time) - Method in class org.apache.spark.streaming.dstream.MappedDStream
-
- compute(Time) - Method in class org.apache.spark.streaming.dstream.MapValuedDStream
-
- compute(Time) - Method in class org.apache.spark.streaming.dstream.QueueInputDStream
-
- compute(Time) - Method in class org.apache.spark.streaming.dstream.ReceiverInputDStream
-
Generates RDDs with blocks received by the receiver of this stream.
- compute(Time) - Method in class org.apache.spark.streaming.dstream.ReducedWindowedDStream
-
- compute(Time) - Method in class org.apache.spark.streaming.dstream.ShuffledDStream
-
- compute(Time) - Method in class org.apache.spark.streaming.dstream.StateDStream
-
- compute(Time) - Method in class org.apache.spark.streaming.dstream.TransformedDStream
-
- compute(Time) - Method in class org.apache.spark.streaming.dstream.UnionDStream
-
- compute(Time) - Method in class org.apache.spark.streaming.dstream.WindowedDStream
-
- compute(Partition, TaskContext) - Method in class org.apache.spark.streaming.rdd.WriteAheadLogBackedBlockRDD
-
Gets the partition data by getting the corresponding block from the block manager.
- computeColumnSummaryStatistics() - Method in class org.apache.spark.mllib.linalg.distributed.RowMatrix
-
Computes column-wise summary statistics.
- computeCorrelation(RDD<Object>, RDD<Object>) - Method in interface org.apache.spark.mllib.stat.correlation.Correlation
-
Compute correlation for two datasets.
- computeCorrelation(RDD<Object>, RDD<Object>) - Static method in class org.apache.spark.mllib.stat.correlation.PearsonCorrelation
-
Compute the Pearson correlation for two datasets.
- computeCorrelation(RDD<Object>, RDD<Object>) - Static method in class org.apache.spark.mllib.stat.correlation.SpearmanCorrelation
-
Compute Spearman's correlation for two datasets.
- computeCorrelationMatrix(RDD<Vector>) - Method in interface org.apache.spark.mllib.stat.correlation.Correlation
-
Compute the correlation matrix S, for the input matrix, where S(i, j) is the correlation
between column i and j.
- computeCorrelationMatrix(RDD<Vector>) - Static method in class org.apache.spark.mllib.stat.correlation.PearsonCorrelation
-
Compute the Pearson correlation matrix S, for the input matrix, where S(i, j) is the
correlation between column i and j.
- computeCorrelationMatrix(RDD<Vector>) - Static method in class org.apache.spark.mllib.stat.correlation.SpearmanCorrelation
-
Compute Spearman's correlation matrix S, for the input matrix, where S(i, j) is the
correlation between column i and j.
- computeCorrelationMatrixFromCovariance(Matrix) - Static method in class org.apache.spark.mllib.stat.correlation.PearsonCorrelation
-
Compute the Pearson correlation matrix from the covariance matrix.
- computeCorrelationWithMatrixImpl(RDD<Object>, RDD<Object>) - Method in interface org.apache.spark.mllib.stat.correlation.Correlation
-
Combine the two input RDD[Double]s into an RDD[Vector] and compute the correlation using the
correlation implementation for RDD[Vector].
- computeCost(RDD<Vector>) - Method in class org.apache.spark.mllib.clustering.KMeansModel
-
Return the K-means cost (sum of squared distances of points to their nearest center) for this
model on the given data.
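A brief sketch of evaluating a clustering with computeCost, assuming an existing SparkContext sc and toy data:

    import org.apache.spark.mllib.clustering.KMeans
    import org.apache.spark.mllib.linalg.Vectors

    val points = sc.parallelize(Seq(
      Vectors.dense(0.0, 0.0), Vectors.dense(0.1, 0.1),
      Vectors.dense(9.0, 9.0), Vectors.dense(9.1, 9.1)))
    val model = KMeans.train(points, 2, 20)   // k = 2, maxIterations = 20
    val wssse = model.computeCost(points)     // within-set sum of squared errors
    println(s"WSSSE = $wssse")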
- computeCovariance() - Method in class org.apache.spark.mllib.linalg.distributed.RowMatrix
-
Computes the covariance matrix, treating each row as an observation.
- computeError(TreeEnsembleModel, RDD<LabeledPoint>) - Static method in class org.apache.spark.mllib.tree.loss.AbsoluteError
-
Method to calculate loss of the base learner for the gradient boosting calculation.
- computeError(TreeEnsembleModel, RDD<LabeledPoint>) - Static method in class org.apache.spark.mllib.tree.loss.LogLoss
-
Method to calculate loss of the base learner for the gradient boosting calculation.
- computeError(TreeEnsembleModel, RDD<LabeledPoint>) - Method in interface org.apache.spark.mllib.tree.loss.Loss
-
Method to calculate error of the base learner for the gradient boosting calculation.
- computeError(TreeEnsembleModel, RDD<LabeledPoint>) - Static method in class org.apache.spark.mllib.tree.loss.SquaredError
-
Method to calculate loss of the base learner for the gradient boosting calculation.
- computeFractionForSampleSize(int, long, boolean) - Static method in class org.apache.spark.util.random.SamplingUtils
-
Returns a sampling rate that guarantees a sample of size >= sampleSizeLowerBound 99.99% of
the time.
- computeGramianMatrix() - Method in class org.apache.spark.mllib.linalg.distributed.IndexedRowMatrix
-
Computes the Gramian matrix A^T A.
- computeGramianMatrix() - Method in class org.apache.spark.mllib.linalg.distributed.RowMatrix
-
Computes the Gramian matrix A^T A.
- computeOrReadCheckpoint(Partition, TaskContext) - Method in class org.apache.spark.rdd.RDD
-
Compute an RDD partition or read it from a checkpoint if the RDD is checkpointing.
- computePreferredLocations(Seq<InputFormatInfo>) - Static method in class org.apache.spark.scheduler.InputFormatInfo
-
Computes the preferred locations based on input(s) and returns a location-to-block map.
- computePrincipalComponents(int) - Method in class org.apache.spark.mllib.linalg.distributed.RowMatrix
-
Computes the top k principal components.
- computeSplitSize(long, long, long) - Method in class org.apache.spark.input.FixedLengthBinaryInputFormat
-
This input format overrides computeSplitSize() to make sure that each split
only contains full records.
- computeSVD(int, boolean, double) - Method in class org.apache.spark.mllib.linalg.distributed.IndexedRowMatrix
-
Computes the singular value decomposition of this IndexedRowMatrix.
- computeSVD(int, boolean, double) - Method in class org.apache.spark.mllib.linalg.distributed.RowMatrix
-
Computes singular value decomposition of this matrix.
- computeSVD(int, boolean, double, int, double, String) - Method in class org.apache.spark.mllib.linalg.distributed.RowMatrix
-
The actual SVD implementation, visible for testing.
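An illustrative sketch of the SVD and PCA entry points on RowMatrix (assuming an existing SparkContext sc; the matrix is made up):

    import org.apache.spark.mllib.linalg.Vectors
    import org.apache.spark.mllib.linalg.distributed.RowMatrix

    val rows = sc.parallelize(Seq(
      Vectors.dense(1.0, 0.0, 0.0),
      Vectors.dense(0.0, 2.0, 0.0),
      Vectors.dense(0.0, 0.0, 3.0)))
    val mat = new RowMatrix(rows)
    val svd = mat.computeSVD(2, computeU = true)   // keep the top 2 singular values
    println(svd.s)                                 // singular values, as a local Vector
    val pcs = mat.computePrincipalComponents(2)    // top 2 principal components, as a local Matrix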
- computeThresholdByKey(Map<K, AcceptanceResult>, Map<K, Object>) - Static method in class org.apache.spark.util.random.StratifiedSamplingUtils
-
Given the result returned by getCounts, determine the threshold for accepting items to
generate exact sample size.
- condition() - Method in class org.apache.spark.sql.execution.Filter
-
- condition() - Method in class org.apache.spark.sql.execution.joins.BroadcastNestedLoopJoin
-
- condition() - Method in class org.apache.spark.sql.execution.joins.HashOuterJoin
-
- condition() - Method in class org.apache.spark.sql.execution.joins.LeftSemiJoinBNL
-
- conditionEvaluator() - Method in class org.apache.spark.sql.execution.Filter
-
- conf() - Method in class org.apache.spark.rdd.RDD
-
- conf() - Method in class org.apache.spark.scheduler.cluster.CoarseGrainedSchedulerBackend
-
- conf() - Method in class org.apache.spark.scheduler.TaskSchedulerImpl
-
- conf() - Method in class org.apache.spark.scheduler.TaskSetManager
-
- conf() - Method in class org.apache.spark.SparkContext
-
- conf() - Method in class org.apache.spark.SparkEnv
-
- conf() - Method in class org.apache.spark.sql.parquet.ParquetRelation
-
- conf() - Method in class org.apache.spark.storage.BlockManager
-
- conf() - Method in class org.apache.spark.streaming.StreamingContext
-
- conf() - Method in class org.apache.spark.ui.SparkUI
-
- confidence() - Method in class org.apache.spark.partial.BoundedDouble
-
- configFile() - Method in class org.apache.spark.metrics.MetricsConfig
-
- configTestLog4j(String) - Static method in class org.apache.spark.util.Utils
-
Configure log4j properties used for the test suite.
- configuration() - Method in class org.apache.spark.scheduler.InputFormatInfo
-
- CONFIGURATION_INSTANTIATION_LOCK() - Static method in class org.apache.spark.rdd.HadoopRDD
-
Configuration's constructor is not threadsafe (see SPARK-1097 and HADOOP-10456).
- confusionMatrix() - Method in class org.apache.spark.mllib.evaluation.MulticlassMetrics
-
Returns the confusion matrix: predicted classes are in columns, ordered by ascending class label, as in "labels".
- connected(String) - Method in class org.apache.spark.scheduler.cluster.SparkDeploySchedulerBackend
-
- connectedComponents() - Method in class org.apache.spark.graphx.GraphOps
-
Compute the connected component membership of each vertex and return a graph with the vertex
value containing the lowest vertex id in the connected component containing that vertex.
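A small sketch of connected components on a toy graph (assuming an existing SparkContext sc; the vertices and edges are illustrative):

    import org.apache.spark.graphx.{Edge, Graph}

    val vertices = sc.parallelize(Seq((1L, "a"), (2L, "b"), (3L, "c"), (4L, "d")))
    val edges    = sc.parallelize(Seq(Edge(1L, 2L, 1), Edge(3L, 4L, 1)))
    val graph    = Graph(vertices, edges)
    // each vertex is labelled with the smallest vertex id in its component
    val components = graph.connectedComponents().vertices
    components.collect().foreach(println)   // expected labels: (1,1), (2,1), (3,3), (4,3)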
- ConnectedComponents - Class in org.apache.spark.graphx.lib
-
Connected components algorithm.
- ConnectedComponents() - Constructor for class org.apache.spark.graphx.lib.ConnectedComponents
-
- CONSOLE_DEFAULT_PERIOD() - Method in class org.apache.spark.metrics.sink.ConsoleSink
-
- CONSOLE_DEFAULT_UNIT() - Method in class org.apache.spark.metrics.sink.ConsoleSink
-
- CONSOLE_KEY_PERIOD() - Method in class org.apache.spark.metrics.sink.ConsoleSink
-
- CONSOLE_KEY_UNIT() - Method in class org.apache.spark.metrics.sink.ConsoleSink
-
- ConsoleProgressBar - Class in org.apache.spark.ui
-
ConsoleProgressBar shows the progress of stages in the next line of the console.
- ConsoleProgressBar(SparkContext) - Constructor for class org.apache.spark.ui.ConsoleProgressBar
-
- ConsoleSink - Class in org.apache.spark.metrics.sink
-
- ConsoleSink(Properties, MetricRegistry, SecurityManager) - Constructor for class org.apache.spark.metrics.sink.ConsoleSink
-
- ConstantInputDStream<T> - Class in org.apache.spark.streaming.dstream
-
An input stream that always returns the same RDD on each timestep.
- ConstantInputDStream(StreamingContext, RDD<T>, ClassTag<T>) - Constructor for class org.apache.spark.streaming.dstream.ConstantInputDStream
-
- constructURIForAuthentication(URI, SecurityManager) - Static method in class org.apache.spark.util.Utils
-
Construct a URI containing information used for authentication.
- consumerConnector() - Method in class org.apache.spark.streaming.kafka.KafkaReceiver
-
- contains(Param<?>) - Method in class org.apache.spark.ml.param.ParamMap
-
Checks whether a parameter is explicitly specified.
- contains(String) - Method in class org.apache.spark.SparkConf
-
Does the configuration contain a given parameter?
- contains(BlockId) - Method in class org.apache.spark.storage.BlockManagerMaster
-
Check if block manager master has a block.
- contains(BlockId) - Method in class org.apache.spark.storage.BlockStore
-
- contains(BlockId) - Method in class org.apache.spark.storage.DiskStore
-
- contains(BlockId) - Method in class org.apache.spark.storage.MemoryStore
-
- contains(BlockId) - Method in class org.apache.spark.storage.TachyonStore
-
- contains(A) - Method in class org.apache.spark.util.TimeStampedHashSet
-
- containsBlock(BlockId) - Method in class org.apache.spark.storage.DiskBlockManager
-
Check if disk block manager has a block.
- containsBlock(BlockId) - Method in class org.apache.spark.storage.StorageStatus
-
Return whether the given block is stored in this block manager in O(1) time.
- containsCachedMetadata(String) - Static method in class org.apache.spark.rdd.HadoopRDD
-
- containsShuffle(int) - Method in class org.apache.spark.MapOutputTrackerMaster
-
Check if the given shuffle is being tracked
- contentType() - Method in class org.apache.spark.ui.JettyUtils.ServletParams
-
- context() - Method in interface org.apache.spark.api.java.JavaRDDLike
-
- context() - Method in class org.apache.spark.InterruptibleIterator
-
- context() - Method in class org.apache.spark.rdd.RDD
-
- context() - Method in class org.apache.spark.sql.execution.SparkStrategies.CommandStrategy
-
- context() - Method in class org.apache.spark.sql.hive.execution.HiveTableScan
-
- context() - Method in class org.apache.spark.sql.hive.HiveStrategies.HiveCommandStrategy
-
- context() - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
-
- context() - Method in class org.apache.spark.streaming.dstream.DStream
-
Return the StreamingContext associated with this DStream
- ContextCleaner - Class in org.apache.spark
-
An asynchronous cleaner for RDD, shuffle, and broadcast state.
- ContextCleaner(SparkContext) - Constructor for class org.apache.spark.ContextCleaner
-
- ContextWaiter - Class in org.apache.spark.streaming
-
- ContextWaiter() - Constructor for class org.apache.spark.streaming.ContextWaiter
-
- Continuous() - Static method in class org.apache.spark.mllib.tree.configuration.FeatureType
-
- convert() - Method in class org.apache.spark.WritableConverter
-
- convertCatalystToJava(Object) - Static method in class org.apache.spark.sql.types.util.DataTypeConversions
-
Converts Java objects to catalyst rows / types
- convertFromAttributes(Seq<Attribute>) - Static method in class org.apache.spark.sql.parquet.ParquetTypesConverter
-
- convertFromString(String) - Static method in class org.apache.spark.sql.parquet.ParquetTypesConverter
-
- convertJavaToCatalyst(Object, DataType) - Static method in class org.apache.spark.sql.types.util.DataTypeConversions
-
Converts Java objects to catalyst rows / types
- convertMetastoreParquet() - Method in class org.apache.spark.sql.hive.HiveContext
-
When true, enables an experimental feature where metastore tables that use the parquet SerDe
are automatically converted to use the Spark SQL parquet table scan, instead of the Hive
SerDe.
- convertSplitLocationInfo(Object[]) - Static method in class org.apache.spark.rdd.HadoopRDD
-
- convertToAttributes(Type, boolean) - Static method in class org.apache.spark.sql.parquet.ParquetTypesConverter
-
- convertToBaggedRDD(RDD<Datum>, double, int, boolean, int) - Static method in class org.apache.spark.mllib.tree.impl.BaggedPoint
-
Convert an input dataset into its BaggedPoint representation,
choosing subsamplingRate counts for each instance.
- convertToString(Seq<Attribute>) - Static method in class org.apache.spark.sql.parquet.ParquetTypesConverter
-
- convertToTreeRDD(RDD<LabeledPoint>, Bin[][], DecisionTreeMetadata) - Static method in class org.apache.spark.mllib.tree.impl.TreePoint
-
Convert an input dataset into its TreePoint representation,
binning feature values in preparation for DecisionTree training.
- CoordinateMatrix - Class in org.apache.spark.mllib.linalg.distributed
-
:: Experimental ::
Represents a matrix in coordinate format.
- CoordinateMatrix(RDD<MatrixEntry>, long, long) - Constructor for class org.apache.spark.mllib.linalg.distributed.CoordinateMatrix
-
- CoordinateMatrix(RDD<MatrixEntry>) - Constructor for class org.apache.spark.mllib.linalg.distributed.CoordinateMatrix
-
Alternative constructor leaving matrix dimensions to be determined automatically.
- copiesRunning() - Method in class org.apache.spark.scheduler.TaskSetManager
-
- copy() - Method in class org.apache.spark.ml.param.ParamMap
-
Make a copy of this param map.
- copy(Vector, Vector) - Static method in class org.apache.spark.mllib.linalg.BLAS
-
y = x
- copy() - Method in class org.apache.spark.mllib.linalg.DenseMatrix
-
- copy() - Method in class org.apache.spark.mllib.linalg.DenseVector
-
- copy() - Method in interface org.apache.spark.mllib.linalg.Matrix
-
Get a deep copy of the matrix.
- copy() - Method in class org.apache.spark.mllib.linalg.SparseMatrix
-
- copy() - Method in class org.apache.spark.mllib.linalg.SparseVector
-
- copy() - Method in interface org.apache.spark.mllib.linalg.Vector
-
Makes a deep copy of this vector.
- copy() - Method in class org.apache.spark.mllib.random.PoissonGenerator
-
- copy() - Method in interface org.apache.spark.mllib.random.RandomDataGenerator
-
Returns a copy of the RandomDataGenerator with a new instance of the rng object used in the
class when applicable for non-locking concurrent usage.
- copy() - Method in class org.apache.spark.mllib.random.StandardNormalGenerator
-
- copy() - Method in class org.apache.spark.mllib.random.UniformGenerator
-
- copy() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
-
Returns a shallow copy of this instance.
- copy() - Method in class org.apache.spark.mllib.tree.impurity.EntropyCalculator
-
- copy() - Method in class org.apache.spark.mllib.tree.impurity.GiniCalculator
-
- copy() - Method in class org.apache.spark.mllib.tree.impurity.ImpurityCalculator
-
- copy() - Method in class org.apache.spark.mllib.tree.impurity.VarianceCalculator
-
- copy() - Method in class org.apache.spark.util.StatCounter
-
Clone this StatCounter
- copyField(Row, int, MutableRow, int) - Static method in class org.apache.spark.sql.columnar.BOOLEAN
-
- copyField(Row, int, MutableRow, int) - Static method in class org.apache.spark.sql.columnar.BYTE
-
- copyField(Row, int, MutableRow, int) - Method in class org.apache.spark.sql.columnar.ColumnType
-
Copies from(fromOrdinal) to to(toOrdinal).
- copyField(Row, int, MutableRow, int) - Static method in class org.apache.spark.sql.columnar.DOUBLE
-
- copyField(Row, int, MutableRow, int) - Static method in class org.apache.spark.sql.columnar.FLOAT
-
- copyField(Row, int, MutableRow, int) - Static method in class org.apache.spark.sql.columnar.INT
-
- copyField(Row, int, MutableRow, int) - Static method in class org.apache.spark.sql.columnar.LONG
-
- copyField(Row, int, MutableRow, int) - Static method in class org.apache.spark.sql.columnar.SHORT
-
- copyField(Row, int, MutableRow, int) - Static method in class org.apache.spark.sql.columnar.STRING
-
- copyStream(InputStream, OutputStream, boolean, boolean) - Static method in class org.apache.spark.util.Utils
-
Copy all data from an InputStream to an OutputStream.
- cores() - Method in class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.RegisterExecutor
-
- cores() - Method in class org.apache.spark.scheduler.WorkerOffer
-
- coresByTaskId() - Method in class org.apache.spark.scheduler.cluster.mesos.CoarseMesosSchedulerBackend
-
- corr(RDD<Object>, RDD<Object>, String) - Static method in class org.apache.spark.mllib.stat.correlation.Correlations
-
- corr(RDD<Vector>) - Static method in class org.apache.spark.mllib.stat.Statistics
-
:: Experimental ::
Compute the Pearson correlation matrix for the input RDD of Vectors.
- corr(RDD<Vector>, String) - Static method in class org.apache.spark.mllib.stat.Statistics
-
:: Experimental ::
Compute the correlation matrix for the input RDD of Vectors using the specified method.
- corr(RDD<Object>, RDD<Object>) - Static method in class org.apache.spark.mllib.stat.Statistics
-
:: Experimental ::
Compute the Pearson correlation for the input RDDs.
- corr(RDD<Object>, RDD<Object>, String) - Static method in class org.apache.spark.mllib.stat.Statistics
-
:: Experimental ::
Compute the correlation for the input RDDs using the specified method.
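A compact sketch of the correlation entry points, assuming an existing SparkContext sc and illustrative series:

    import org.apache.spark.mllib.linalg.Vectors
    import org.apache.spark.mllib.stat.Statistics

    val seriesX = sc.parallelize(Seq(1.0, 2.0, 3.0, 4.0))
    val seriesY = sc.parallelize(Seq(2.0, 4.1, 6.2, 8.3))
    val pearson  = Statistics.corr(seriesX, seriesY)              // Pearson by default
    val spearman = Statistics.corr(seriesX, seriesY, "spearman")

    val vectors = sc.parallelize(Seq(
      Vectors.dense(1.0, 2.0), Vectors.dense(2.0, 4.1), Vectors.dense(3.0, 6.2)))
    val corrMatrix = Statistics.corr(vectors)                     // Pearson correlation matrix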
- Correlation - Interface in org.apache.spark.mllib.stat.correlation
-
Trait for correlation algorithms.
- CorrelationNames - Class in org.apache.spark.mllib.stat.correlation
-
Maintains supported and default correlation names.
- CorrelationNames() - Constructor for class org.apache.spark.mllib.stat.correlation.CorrelationNames
-
- Correlations - Class in org.apache.spark.mllib.stat.correlation
-
Delegates computation to the specific correlation object based on the input method name.
- Correlations() - Constructor for class org.apache.spark.mllib.stat.correlation.Correlations
-
- corrMatrix(RDD<Vector>, String) - Static method in class org.apache.spark.mllib.stat.correlation.Correlations
-
- count() - Method in interface org.apache.spark.api.java.JavaRDDLike
-
Return the number of elements in the RDD.
- count() - Method in class org.apache.spark.graphx.impl.EdgeRDDImpl
-
The number of edges in the RDD.
- count() - Method in class org.apache.spark.graphx.impl.VertexRDDImpl
-
The number of vertices in the RDD.
- count() - Method in class org.apache.spark.mllib.evaluation.binary.BinaryConfusionMatrixImpl
-
- count() - Method in class org.apache.spark.mllib.recommendation.ALS.BlockStats
-
- count() - Method in class org.apache.spark.mllib.stat.MultivariateOnlineSummarizer
-
- count() - Method in interface org.apache.spark.mllib.stat.MultivariateStatisticalSummary
-
Sample size.
- count() - Method in class org.apache.spark.mllib.tree.impurity.EntropyCalculator
-
Number of data points accounted for in the sufficient statistics.
- count() - Method in class org.apache.spark.mllib.tree.impurity.GiniCalculator
-
Number of data points accounted for in the sufficient statistics.
- count() - Method in class org.apache.spark.mllib.tree.impurity.ImpurityCalculator
-
Number of data points accounted for in the sufficient statistics.
- count() - Method in class org.apache.spark.mllib.tree.impurity.VarianceCalculator
-
Number of data points accounted for in the sufficient statistics.
- count() - Method in class org.apache.spark.rdd.RDD
-
Return the number of elements in the RDD.
- count() - Method in class org.apache.spark.sql.api.java.JavaSchemaRDD
-
- count() - Method in class org.apache.spark.sql.columnar.ColumnStatisticsSchema
-
- count() - Method in interface org.apache.spark.sql.columnar.ColumnStats
-
- COUNT() - Static method in class org.apache.spark.sql.hive.HiveQl
-
- count() - Method in class org.apache.spark.sql.SchemaRDD
-
:: Experimental ::
Return the number of elements in the RDD.
- count() - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
-
Return a new DStream in which each RDD has a single element generated by counting each RDD
of this DStream.
- count() - Method in class org.apache.spark.streaming.dstream.DStream
-
Return a new DStream in which each RDD has a single element generated by counting each RDD
of this DStream.
- count() - Method in class org.apache.spark.util.StatCounter
-
- countApprox(long, double) - Method in interface org.apache.spark.api.java.JavaRDDLike
-
:: Experimental ::
Approximate version of count() that returns a potentially incomplete result
within a timeout, even if not all tasks have finished.
- countApprox(long) - Method in interface org.apache.spark.api.java.JavaRDDLike
-
:: Experimental ::
Approximate version of count() that returns a potentially incomplete result
within a timeout, even if not all tasks have finished.
- countApprox(long, double) - Method in class org.apache.spark.rdd.RDD
-
:: Experimental ::
Approximate version of count() that returns a potentially incomplete result
within a timeout, even if not all tasks have finished.
- countApproxDistinct(double) - Method in interface org.apache.spark.api.java.JavaRDDLike
-
Return approximate number of distinct elements in the RDD.
- countApproxDistinct(int, int) - Method in class org.apache.spark.rdd.RDD
-
:: Experimental ::
Return approximate number of distinct elements in the RDD.
- countApproxDistinct(double) - Method in class org.apache.spark.rdd.RDD
-
Return approximate number of distinct elements in the RDD.
- countApproxDistinctByKey(double, Partitioner) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Return approximate number of distinct values for each key in this RDD.
- countApproxDistinctByKey(double, int) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Return approximate number of distinct values for each key in this RDD.
- countApproxDistinctByKey(double) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Return approximate number of distinct values for each key in this RDD.
- countApproxDistinctByKey(int, int, Partitioner) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
:: Experimental ::
- countApproxDistinctByKey(double, Partitioner) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
Return approximate number of distinct values for each key in this RDD.
- countApproxDistinctByKey(double, int) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
Return approximate number of distinct values for each key in this RDD.
- countApproxDistinctByKey(double) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
Return approximate number of distinct values for each key in this RDD.
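A hedged sketch of the approximate counting family on a large pair RDD (assuming an existing SparkContext sc and, on older Spark versions, the implicits from org.apache.spark.SparkContext._; the data is synthetic):

    import org.apache.spark.SparkContext._

    val events = sc.parallelize(1 to 1000000, 100).map(i => (i % 10, i % 1000))

    // best-effort total count available within 500 ms, at 95% confidence
    val partial = events.countApprox(timeout = 500, confidence = 0.95)
    println(partial.initialValue)                               // a BoundedDouble with mean and bounds

    // approximate distinct counts, with roughly 5% relative standard deviation
    val distinctValues = events.values.countApproxDistinct(0.05)
    val distinctPerKey = events.countApproxDistinctByKey(0.05)   // RDD[(K, Long)]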
- countAsync() - Method in interface org.apache.spark.api.java.JavaRDDLike
-
The asynchronous version of count, which returns a future for counting the number of elements in this RDD.
- countAsync() - Method in class org.apache.spark.rdd.AsyncRDDActions
-
Returns a future for counting the number of elements in the RDD.
- countByKey() - Method in class org.apache.spark.api.java.JavaPairRDD
-
Count the number of elements for each key, and return the result to the master as a Map.
- countByKey() - Method in class org.apache.spark.rdd.PairRDDFunctions
-
Count the number of elements for each key, collecting the results to a local Map.
- countByKeyApprox(long) - Method in class org.apache.spark.api.java.JavaPairRDD
-
:: Experimental ::
Approximate version of countByKey that can return a partial result if it does
not finish within a timeout.
- countByKeyApprox(long, double) - Method in class org.apache.spark.api.java.JavaPairRDD
-
:: Experimental ::
Approximate version of countByKey that can return a partial result if it does
not finish within a timeout.
- countByKeyApprox(long, double) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
:: Experimental ::
Approximate version of countByKey that can return a partial result if it does
not finish within a timeout.
- countByValue() - Method in interface org.apache.spark.api.java.JavaRDDLike
-
Return the count of each unique value in this RDD as a map of (value, count) pairs.
- countByValue(Ordering<T>) - Method in class org.apache.spark.rdd.RDD
-
Return the count of each unique value in this RDD as a local map of (value, count) pairs.
- countByValue() - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
-
Return a new DStream in which each RDD contains the counts of each distinct value in
each RDD of this DStream.
- countByValue(int) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
-
Return a new DStream in which each RDD contains the counts of each distinct value in
each RDD of this DStream.
- countByValue(int, Ordering<T>) - Method in class org.apache.spark.streaming.dstream.DStream
-
Return a new DStream in which each RDD contains the counts of each distinct value in
each RDD of this DStream.
- countByValueAndWindow(Duration, Duration) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
-
Return a new DStream in which each RDD contains the count of distinct elements in
RDDs in a sliding window over this DStream.
- countByValueAndWindow(Duration, Duration, int) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
-
Return a new DStream in which each RDD contains the count of distinct elements in
RDDs in a sliding window over this DStream.
- countByValueAndWindow(Duration, Duration, int, Ordering<T>) - Method in class org.apache.spark.streaming.dstream.DStream
-
Return a new DStream in which each RDD contains the count of distinct elements in
RDDs in a sliding window over this DStream.
- countByValueApprox(long, double) - Method in interface org.apache.spark.api.java.JavaRDDLike
-
(Experimental) Approximate version of countByValue().
- countByValueApprox(long) - Method in interface org.apache.spark.api.java.JavaRDDLike
-
(Experimental) Approximate version of countByValue().
- countByValueApprox(long, double, Ordering<T>) - Method in class org.apache.spark.rdd.RDD
-
:: Experimental ::
Approximate version of countByValue().
- countByWindow(Duration, Duration) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
-
Return a new DStream in which each RDD has a single element generated by counting the number
of elements in a window over this DStream.
- countByWindow(Duration, Duration) - Method in class org.apache.spark.streaming.dstream.DStream
-
Return a new DStream in which each RDD has a single element generated by counting the number
of elements in a sliding window over this DStream.
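A sketch of the windowed counting operations on a DStream; the socket source, port and checkpoint directory are placeholders, and an existing SparkContext sc is assumed:

    import org.apache.spark.streaming.{Seconds, StreamingContext}

    val ssc = new StreamingContext(sc, Seconds(1))
    ssc.checkpoint("/tmp/streaming-checkpoint")   // windowed counting keeps state, so checkpointing is required

    val words = ssc.socketTextStream("localhost", 9999).flatMap(_.split(" "))
    // counts of each distinct word over the last 30 seconds, recomputed every 10 seconds
    val windowedCounts = words.countByValueAndWindow(Seconds(30), Seconds(10))
    // a single element per RDD: the total number of words in the window
    val windowedTotal  = words.countByWindow(Seconds(30), Seconds(10))

    windowedCounts.print()
    windowedTotal.print()
    ssc.start()
    ssc.awaitTermination()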
- counter() - Method in class org.apache.spark.partial.MeanEvaluator
-
- counter() - Method in class org.apache.spark.partial.SumEvaluator
-
- CountEvaluator - Class in org.apache.spark.partial
-
An ApproximateEvaluator for counts.
- CountEvaluator(int, double) - Constructor for class org.apache.spark.partial.CountEvaluator
-
- cpFile() - Method in class org.apache.spark.rdd.RDDCheckpointData
-
- cpRDD() - Method in class org.apache.spark.rdd.RDDCheckpointData
-
- cpState() - Method in class org.apache.spark.rdd.RDDCheckpointData
-
- CPUS_PER_TASK() - Method in class org.apache.spark.scheduler.TaskSchedulerImpl
-
- CR() - Method in class org.apache.spark.ui.ConsoleProgressBar
-
- create(boolean, boolean, boolean, int) - Static method in class org.apache.spark.api.java.StorageLevels
-
Deprecated.
- create(boolean, boolean, boolean, boolean, int) - Static method in class org.apache.spark.api.java.StorageLevels
-
Create a new StorageLevel object.
- create(JavaSparkContext, JdbcRDD.ConnectionFactory, String, long, long, int, Function<ResultSet, T>) - Static method in class org.apache.spark.rdd.JdbcRDD
-
Create an RDD that executes an SQL query on a JDBC connection and reads results.
- create(JavaSparkContext, JdbcRDD.ConnectionFactory, String, long, long, int) - Static method in class org.apache.spark.rdd.JdbcRDD
-
Create an RDD that executes an SQL query on a JDBC connection and reads results.
- create(RDD<T>, Function1<Object, Object>) - Static method in class org.apache.spark.rdd.PartitionPruningRDD
-
Create a PartitionPruningRDD.
- create(Object...) - Static method in class org.apache.spark.sql.api.java.Row
-
Creates a Row with the given values.
- create(Seq<Object>) - Static method in class org.apache.spark.sql.api.java.Row
-
Creates a Row with the given values.
- create(String, LogicalPlan, Configuration, SQLContext) - Static method in class org.apache.spark.sql.parquet.ParquetRelation
-
Creates a new ParquetRelation and underlying Parquetfile for the given LogicalPlan.
- create() - Method in interface org.apache.spark.streaming.api.java.JavaStreamingContextFactory
-
- createActorSystem(String, String, int, SparkConf, SecurityManager) - Static method in class org.apache.spark.util.AkkaUtils
-
Creates an ActorSystem ready for remoting, with various Spark features.
- createArrayType(DataType) - Static method in class org.apache.spark.sql.api.java.DataType
-
Creates an ArrayType by specifying the data type of elements (elementType).
- createArrayType(DataType, boolean) - Static method in class org.apache.spark.sql.api.java.DataType
-
Creates an ArrayType by specifying the data type of elements (elementType) and whether the array contains null values (containsNull).
- createCombiner() - Method in class org.apache.spark.Aggregator
-
- createCommand(Protos.Offer, int) - Method in class org.apache.spark.scheduler.cluster.mesos.CoarseMesosSchedulerBackend
-
- createCompiledClass(String, File, String) - Static method in class org.apache.spark.TestUtils
-
Creates a compiled class with the given name.
- createDecimal(BigDecimal) - Static method in class org.apache.spark.sql.hive.HiveShim
-
- createDefaultDBIfNeeded(HiveContext) - Static method in class org.apache.spark.sql.hive.HiveShim
-
- createDirectory(String, String) - Static method in class org.apache.spark.util.Utils
-
Create a directory inside the given parent directory.
- createDriverEnv(SparkConf, boolean, LiveListenerBus) - Static method in class org.apache.spark.SparkEnv
-
Create a SparkEnv for the driver.
- createDriverResultsArray() - Static method in class org.apache.spark.sql.hive.HiveShim
-
- createEmpty(String, Seq<Attribute>, boolean, Configuration, SQLContext) - Static method in class org.apache.spark.sql.parquet.ParquetRelation
-
Creates an empty ParquetRelation and underlying Parquet file that only
consists of the Metadata for the given schema.
- createExecutorEnv(SparkConf, String, String, int, int, boolean, ActorSystem) - Static method in class org.apache.spark.SparkEnv
-
Create a SparkEnv for an executor.
- createExecutorInfo(String) - Method in class org.apache.spark.scheduler.cluster.mesos.MesosSchedulerBackend
-
- createFilter(Expression) - Static method in class org.apache.spark.sql.parquet.ParquetFilters
-
- createFunction() - Method in class org.apache.spark.sql.hive.HiveFunctionWrapper
-
- createHistoryUI(SparkConf, SparkListenerBus, SecurityManager, String, String) - Static method in class org.apache.spark.ui.SparkUI
-
- createJar(Seq<File>, File) - Static method in class org.apache.spark.TestUtils
-
Create a jar file that contains this set of files.
- createJarWithClasses(Seq<String>, String) - Static method in class org.apache.spark.TestUtils
-
Create a jar that defines classes with the given names.
- createJobID(Date, int) - Static method in class org.apache.spark.SparkHadoopWriter
-
- createLiveUI(SparkContext, SparkConf, SparkListenerBus, JobProgressListener, SecurityManager, String) - Static method in class org.apache.spark.ui.SparkUI
-
- createMapType(DataType, DataType) - Static method in class org.apache.spark.sql.api.java.DataType
-
Creates a MapType by specifying the data type of keys (keyType) and values (valueType).
- createMapType(DataType, DataType, boolean) - Static method in class org.apache.spark.sql.api.java.DataType
-
Creates a MapType by specifying the data type of keys (keyType), the data type of values (valueType), and whether values contain any null value (valueContainsNull).
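A minimal Scala sketch of both overloads; the key and value types are arbitrary choices:

    import org.apache.spark.sql.api.java.DataType

    // Map from string keys to double values
    val scores = DataType.createMapType(DataType.StringType, DataType.DoubleType)
    // Same map type, explicitly allowing null values
    val nullableScores = DataType.createMapType(DataType.StringType, DataType.DoubleType, true)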
- createMesosTask(TaskDescription, String) - Method in class org.apache.spark.scheduler.cluster.mesos.MesosSchedulerBackend
-
Turn a Spark TaskDescription into a Mesos task
- createMetricsSystem(String, SparkConf, SecurityManager) - Static method in class org.apache.spark.metrics.MetricsSystem
-
- createNewSparkContext(SparkConf) - Static method in class org.apache.spark.streaming.StreamingContext
-
- createNewSparkContext(String, String, String, Seq<String>, Map<String, String>) - Static method in class org.apache.spark.streaming.StreamingContext
-
- createParquetFile(Class<?>, String, boolean, Configuration) - Method in class org.apache.spark.sql.api.java.JavaSQLContext
-
:: Experimental ::
Creates an empty parquet file with the schema of class beanClass, which can be registered as a table.
- createParquetFile(String, boolean, Configuration, TypeTags.TypeTag<A>) - Method in class org.apache.spark.sql.SQLContext
-
:: Experimental ::
Creates an empty parquet file with the schema of class A, which can be registered as a table.
- createPathFromString(String, JobConf) - Static method in class org.apache.spark.SparkHadoopWriter
-
- createPathFromString(String, JobConf) - Static method in class org.apache.spark.sql.hive.SparkHiveWriterContainer
-
- createPlan(String) - Static method in class org.apache.spark.sql.hive.HiveQl
-
Creates LogicalPlan for a given HiveQL string.
- createPlanForView(Table, Option<String>) - Static method in class org.apache.spark.sql.hive.HiveQl
-
Creates LogicalPlan for a given VIEW
- createPollingStream(StreamingContext, String, int, StorageLevel) - Static method in class org.apache.spark.streaming.flume.FlumeUtils
-
Creates an input stream that is to be used with the Spark Sink deployed on a Flume agent.
- createPollingStream(StreamingContext, Seq<InetSocketAddress>, StorageLevel) - Static method in class org.apache.spark.streaming.flume.FlumeUtils
-
Creates an input stream that is to be used with the Spark Sink deployed on a Flume agent.
- createPollingStream(StreamingContext, Seq<InetSocketAddress>, StorageLevel, int, int) - Static method in class org.apache.spark.streaming.flume.FlumeUtils
-
Creates an input stream that is to be used with the Spark Sink deployed on a Flume agent.
- createPollingStream(JavaStreamingContext, String, int) - Static method in class org.apache.spark.streaming.flume.FlumeUtils
-
Creates an input stream that is to be used with the Spark Sink deployed on a Flume agent.
- createPollingStream(JavaStreamingContext, String, int, StorageLevel) - Static method in class org.apache.spark.streaming.flume.FlumeUtils
-
Creates an input stream that is to be used with the Spark Sink deployed on a Flume agent.
- createPollingStream(JavaStreamingContext, InetSocketAddress[], StorageLevel) - Static method in class org.apache.spark.streaming.flume.FlumeUtils
-
Creates an input stream that is to be used with the Spark Sink deployed on a Flume agent.
- createPollingStream(JavaStreamingContext, InetSocketAddress[], StorageLevel, int, int) - Static method in class org.apache.spark.streaming.flume.FlumeUtils
-
Creates an input stream that is to be used with the Spark Sink deployed on a Flume agent.
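A minimal Scala sketch of the pull-based (Spark Sink) variant; the host, port and batch interval are placeholders:

    import org.apache.spark.SparkConf
    import org.apache.spark.storage.StorageLevel
    import org.apache.spark.streaming.{Seconds, StreamingContext}
    import org.apache.spark.streaming.flume.FlumeUtils

    val ssc = new StreamingContext(
      new SparkConf().setAppName("flume-poll-example").setMaster("local[2]"), Seconds(10))
    // Pull events from the Spark Sink exposed by the Flume agent on sink-host:9999
    val events = FlumeUtils.createPollingStream(ssc, "sink-host", 9999, StorageLevel.MEMORY_AND_DISK_SER_2)
    events.map(e => new String(e.event.getBody.array(), "UTF-8")).print()
    ssc.start()
    ssc.awaitTermination()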
- createPythonWorker(String, Map<String, String>) - Method in class org.apache.spark.SparkEnv
-
- createRecordFilter(Seq<Expression>) - Static method in class org.apache.spark.sql.parquet.ParquetFilters
-
- createRecordReader(InputSplit, TaskAttemptContext) - Method in class org.apache.spark.input.FixedLengthBinaryInputFormat
-
Create a FixedLengthBinaryRecordReader
- createRecordReader(InputSplit, TaskAttemptContext) - Method in class org.apache.spark.input.StreamFileInputFormat
-
- createRecordReader(InputSplit, TaskAttemptContext) - Method in class org.apache.spark.input.StreamInputFormat
-
- createRecordReader(InputSplit, TaskAttemptContext) - Method in class org.apache.spark.input.WholeTextFileInputFormat
-
- createRecordReader(InputSplit, TaskAttemptContext) - Method in class org.apache.spark.sql.parquet.FilteringParquetRowInputFormat
-
- createRedirectHandler(String, String, Function1<HttpServletRequest, BoxedUnit>, String) - Static method in class org.apache.spark.ui.JettyUtils
-
Create a handler that always redirects the user to the given path
- createRelation(SQLContext, Map<String, String>) - Method in class org.apache.spark.sql.json.DefaultSource
-
Returns a new base relation with the given parameters.
- createRelation(SQLContext, Map<String, String>) - Method in class org.apache.spark.sql.parquet.DefaultSource
-
Returns a new base relation with the given parameters.
- createRelation(SQLContext, Map<String, String>) - Method in interface org.apache.spark.sql.sources.RelationProvider
-
Returns a new base relation with the given parameters.
- createRoutingTables(EdgeRDD<?>, Partitioner) - Static method in class org.apache.spark.graphx.VertexRDD
-
- createSchemaRDD(RDD<A>, TypeTags.TypeTag<A>) - Method in class org.apache.spark.sql.SQLContext
-
Creates a SchemaRDD from an RDD of case classes.
- createServlet(JettyUtils.ServletParams<T>, SecurityManager, Function1<T, Object>) - Static method in class org.apache.spark.ui.JettyUtils
-
- createServletHandler(String, JettyUtils.ServletParams<T>, SecurityManager, String, Function1<T, Object>) - Static method in class org.apache.spark.ui.JettyUtils
-
Create a context handler that responds to a request with the given path prefix
- createServletHandler(String, HttpServlet, String) - Static method in class org.apache.spark.ui.JettyUtils
-
Create a context handler that responds to a request with the given path prefix
- createStaticHandler(String, String) - Static method in class org.apache.spark.ui.JettyUtils
-
Create a handler for serving files from a static directory
- createStream(StreamingContext, String, int, StorageLevel) - Static method in class org.apache.spark.streaming.flume.FlumeUtils
-
Create an input stream from a Flume source.
- createStream(StreamingContext, String, int, StorageLevel, boolean) - Static method in class org.apache.spark.streaming.flume.FlumeUtils
-
Create an input stream from a Flume source.
- createStream(JavaStreamingContext, String, int) - Static method in class org.apache.spark.streaming.flume.FlumeUtils
-
Creates an input stream from a Flume source.
- createStream(JavaStreamingContext, String, int, StorageLevel) - Static method in class org.apache.spark.streaming.flume.FlumeUtils
-
Creates an input stream from a Flume source.
- createStream(JavaStreamingContext, String, int, StorageLevel, boolean) - Static method in class org.apache.spark.streaming.flume.FlumeUtils
-
Creates an input stream from a Flume source.
- createStream(StreamingContext, String, String, Map<String, Object>, StorageLevel) - Static method in class org.apache.spark.streaming.kafka.KafkaUtils
-
Create an input stream that pulls messages from a Kafka Broker.
- createStream(StreamingContext, Map<String, String>, Map<String, Object>, StorageLevel, ClassTag<K>, ClassTag<V>, ClassTag<U>, ClassTag<T>) - Static method in class org.apache.spark.streaming.kafka.KafkaUtils
-
Create an input stream that pulls messages from a Kafka Broker.
- createStream(JavaStreamingContext, String, String, Map<String, Integer>) - Static method in class org.apache.spark.streaming.kafka.KafkaUtils
-
Create an input stream that pulls messages from a Kafka Broker.
- createStream(JavaStreamingContext, String, String, Map<String, Integer>, StorageLevel) - Static method in class org.apache.spark.streaming.kafka.KafkaUtils
-
Create an input stream that pulls messages from a Kafka Broker.
- createStream(JavaStreamingContext, Class<K>, Class<V>, Class<U>, Class<T>, Map<String, String>, Map<String, Integer>, StorageLevel) - Static method in class org.apache.spark.streaming.kafka.KafkaUtils
-
Create an input stream that pulls messages from a Kafka Broker.
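A minimal Scala sketch of the receiver-based Kafka stream; the ZooKeeper quorum, consumer group and topic name are placeholders, and the map value is the number of receiver threads per topic:

    import org.apache.spark.SparkConf
    import org.apache.spark.streaming.{Seconds, StreamingContext}
    import org.apache.spark.streaming.kafka.KafkaUtils

    val ssc = new StreamingContext(
      new SparkConf().setAppName("kafka-example").setMaster("local[2]"), Seconds(5))
    val stream = KafkaUtils.createStream(ssc, "zk-host:2181", "example-consumer-group", Map("events" -> 1))
    stream.map(_._2).print()   // keep only the message payload, drop the key
    ssc.start()
    ssc.awaitTermination()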
- createStream(StreamingContext, String, String, Duration, InitialPositionInStream, StorageLevel) - Static method in class org.apache.spark.streaming.kinesis.KinesisUtils
-
Create an InputDStream that pulls messages from a Kinesis stream.
- createStream(JavaStreamingContext, String, String, Duration, InitialPositionInStream, StorageLevel) - Static method in class org.apache.spark.streaming.kinesis.KinesisUtils
-
Create a Java-friendly InputDStream that pulls messages from a Kinesis stream.
- createStream(StreamingContext, String, String, StorageLevel) - Static method in class org.apache.spark.streaming.mqtt.MQTTUtils
-
Create an input stream that receives messages pushed by an MQTT publisher.
- createStream(JavaStreamingContext, String, String) - Static method in class org.apache.spark.streaming.mqtt.MQTTUtils
-
Create an input stream that receives messages pushed by an MQTT publisher.
- createStream(JavaStreamingContext, String, String, StorageLevel) - Static method in class org.apache.spark.streaming.mqtt.MQTTUtils
-
Create an input stream that receives messages pushed by an MQTT publisher.
- createStream(StreamingContext, Option<Authorization>, Seq<String>, StorageLevel) - Static method in class org.apache.spark.streaming.twitter.TwitterUtils
-
Create an input stream that returns tweets received from Twitter.
- createStream(JavaStreamingContext) - Static method in class org.apache.spark.streaming.twitter.TwitterUtils
-
Create an input stream that returns tweets received from Twitter using Twitter4J's default
OAuth authentication; this requires the system properties twitter4j.oauth.consumerKey,
twitter4j.oauth.consumerSecret, twitter4j.oauth.accessToken and
twitter4j.oauth.accessTokenSecret.
- createStream(JavaStreamingContext, String[]) - Static method in class org.apache.spark.streaming.twitter.TwitterUtils
-
Create an input stream that returns tweets received from Twitter using Twitter4J's default
OAuth authentication; this requires the system properties twitter4j.oauth.consumerKey,
twitter4j.oauth.consumerSecret, twitter4j.oauth.accessToken and
twitter4j.oauth.accessTokenSecret.
- createStream(JavaStreamingContext, String[], StorageLevel) - Static method in class org.apache.spark.streaming.twitter.TwitterUtils
-
Create an input stream that returns tweets received from Twitter using Twitter4J's default
OAuth authentication; this requires the system properties twitter4j.oauth.consumerKey,
twitter4j.oauth.consumerSecret, twitter4j.oauth.accessToken and
twitter4j.oauth.accessTokenSecret.
- createStream(JavaStreamingContext, Authorization) - Static method in class org.apache.spark.streaming.twitter.TwitterUtils
-
Create an input stream that returns tweets received from Twitter.
- createStream(JavaStreamingContext, Authorization, String[]) - Static method in class org.apache.spark.streaming.twitter.TwitterUtils
-
Create an input stream that returns tweets received from Twitter.
- createStream(JavaStreamingContext, Authorization, String[], StorageLevel) - Static method in class org.apache.spark.streaming.twitter.TwitterUtils
-
Create an input stream that returns tweets received from Twitter.
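A minimal Scala sketch of the default-OAuth variant; it assumes the twitter4j.oauth.* system properties described above are already set, and the filter keyword is a placeholder:

    import org.apache.spark.SparkConf
    import org.apache.spark.streaming.{Seconds, StreamingContext}
    import org.apache.spark.streaming.twitter.TwitterUtils

    val ssc = new StreamingContext(
      new SparkConf().setAppName("twitter-example").setMaster("local[2]"), Seconds(10))
    val tweets = TwitterUtils.createStream(ssc, None, Seq("spark"))   // None => use OAuth from system properties
    tweets.map(_.getText).print()
    ssc.start()
    ssc.awaitTermination()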
- createStream(StreamingContext, String, Subscribe, Function1<Seq<ByteString>, Iterator<T>>, StorageLevel, SupervisorStrategy, ClassTag<T>) - Static method in class org.apache.spark.streaming.zeromq.ZeroMQUtils
-
Create an input stream that receives messages pushed by a zeromq publisher.
- createStream(JavaStreamingContext, String, Subscribe, Function<byte[][], Iterable<T>>, StorageLevel, SupervisorStrategy) - Static method in class org.apache.spark.streaming.zeromq.ZeroMQUtils
-
Create an input stream that receives messages pushed by a zeromq publisher.
- createStream(JavaStreamingContext, String, Subscribe, Function<byte[][], Iterable<T>>, StorageLevel) - Static method in class org.apache.spark.streaming.zeromq.ZeroMQUtils
-
Create an input stream that receives messages pushed by a zeromq publisher.
- createStream(JavaStreamingContext, String, Subscribe, Function<byte[][], Iterable<T>>) - Static method in class org.apache.spark.streaming.zeromq.ZeroMQUtils
-
Create an input stream that receives messages pushed by a zeromq publisher.
- createStructField(String, DataType, boolean, Metadata) - Static method in class org.apache.spark.sql.api.java.DataType
-
Creates a StructField by specifying the name (name), data type (dataType) and whether values of this field can be null values (nullable).
- createStructField(String, DataType, boolean) - Static method in class org.apache.spark.sql.api.java.DataType
-
Creates a StructField with empty metadata.
- createStructType(List<StructField>) - Static method in class org.apache.spark.sql.api.java.DataType
-
Creates a StructType with the given list of StructFields (fields).
- createStructType(StructField[]) - Static method in class org.apache.spark.sql.api.java.DataType
-
Creates a StructType with the given StructField array (fields).
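A short Scala sketch combining the StructField and StructType factories above; the field names and types are arbitrary choices:

    import java.util.Arrays
    import org.apache.spark.sql.api.java.{DataType, StructField}

    // A two-field schema: a non-nullable string "name" and a nullable integer "age"
    val fields: java.util.List[StructField] = Arrays.asList(
      DataType.createStructField("name", DataType.StringType, false),
      DataType.createStructField("age", DataType.IntegerType, true))
    val schema = DataType.createStructType(fields)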
- createTable(String, boolean, TypeTags.TypeTag<A>) - Method in class org.apache.spark.sql.hive.HiveContext
-
Creates a table using the schema of the given class.
- createTable(String, String, Seq<Attribute>, boolean, Option<CreateTableDesc>) - Method in class org.apache.spark.sql.hive.HiveMetastoreCatalog
-
Create table with specified database, table name, table description and schema
- CreateTableAsSelect - Class in org.apache.spark.sql.hive.execution
-
:: Experimental ::
Create table and insert the query result into it.
- CreateTableAsSelect(String, String, LogicalPlan, boolean, Option<CreateTableDesc>) - Constructor for class org.apache.spark.sql.hive.execution.CreateTableAsSelect
-
- CreateTables() - Method in class org.apache.spark.sql.hive.HiveMetastoreCatalog
-
- CreateTableUsing - Class in org.apache.spark.sql.sources
-
- CreateTableUsing(String, String, Map<String, String>) - Constructor for class org.apache.spark.sql.sources.CreateTableUsing
-
- createTempDir(String, String) - Static method in class org.apache.spark.util.Utils
-
Create a temporary directory inside the given parent directory.
- createTempLocalBlock() - Method in class org.apache.spark.storage.DiskBlockManager
-
Produces a unique block id and File suitable for storing local intermediate results.
- createTempShuffleBlock() - Method in class org.apache.spark.storage.DiskBlockManager
-
Produces a unique block id and File suitable for storing shuffled intermediate results.
- createTime() - Method in class org.apache.spark.scheduler.cluster.CoarseGrainedSchedulerBackend
-
- createUsingIndex(Iterator<Product2<Object, VD2>>, ClassTag<VD2>) - Method in class org.apache.spark.graphx.impl.VertexPartitionBaseOps
-
Similar effect as aggregateUsingIndex((a, b) => a)
- createWorkspace(int) - Static method in class org.apache.spark.mllib.optimization.NNLS
-
- creationSite() - Method in class org.apache.spark.rdd.RDD
-
User code that created this RDD (e.g.
- creationSite() - Method in class org.apache.spark.streaming.dstream.DStream
-
- credentialsProvider() - Method in class org.apache.spark.streaming.kinesis.KinesisReceiver
-
- CrossValidator - Class in org.apache.spark.ml.tuning
-
:: AlphaComponent ::
K-fold cross validation.
- CrossValidator() - Constructor for class org.apache.spark.ml.tuning.CrossValidator
-
- CrossValidatorModel - Class in org.apache.spark.ml.tuning
-
:: AlphaComponent ::
Model from k-fold cross validation.
- CrossValidatorModel(CrossValidator, ParamMap, Model<?>) - Constructor for class org.apache.spark.ml.tuning.CrossValidatorModel
-
- CrossValidatorParams - Interface in org.apache.spark.ml.tuning
-
- CSV_DEFAULT_DIR() - Method in class org.apache.spark.metrics.sink.CsvSink
-
- CSV_DEFAULT_PERIOD() - Method in class org.apache.spark.metrics.sink.CsvSink
-
- CSV_DEFAULT_UNIT() - Method in class org.apache.spark.metrics.sink.CsvSink
-
- CSV_KEY_DIR() - Method in class org.apache.spark.metrics.sink.CsvSink
-
- CSV_KEY_PERIOD() - Method in class org.apache.spark.metrics.sink.CsvSink
-
- CSV_KEY_UNIT() - Method in class org.apache.spark.metrics.sink.CsvSink
-
- CsvSink - Class in org.apache.spark.metrics.sink
-
- CsvSink(Properties, MetricRegistry, SecurityManager) - Constructor for class org.apache.spark.metrics.sink.CsvSink
-
- currentAttemptId() - Method in interface org.apache.spark.SparkStageInfo
-
- currentAttemptId() - Method in class org.apache.spark.SparkStageInfoImpl
-
- currentInterval(Duration) - Static method in class org.apache.spark.streaming.Interval
-
- currentLocalityIndex() - Method in class org.apache.spark.scheduler.TaskSetManager
-
- currentResult() - Method in interface org.apache.spark.partial.ApproximateEvaluator
-
- currentResult() - Method in class org.apache.spark.partial.CountEvaluator
-
- currentResult() - Method in class org.apache.spark.partial.GroupedCountEvaluator
-
- currentResult() - Method in class org.apache.spark.partial.GroupedMeanEvaluator
-
- currentResult() - Method in class org.apache.spark.partial.GroupedSumEvaluator
-
- currentResult() - Method in class org.apache.spark.partial.MeanEvaluator
-
- currentResult() - Method in class org.apache.spark.partial.SumEvaluator
-
- currentTime() - Method in interface org.apache.spark.streaming.util.Clock
-
- currentTime() - Method in class org.apache.spark.streaming.util.ManualClock
-
- currentTime() - Method in class org.apache.spark.streaming.util.SystemClock
-
- currentUnrollMemory() - Method in class org.apache.spark.storage.MemoryStore
-
Return the amount of memory currently occupied for unrolling blocks across all threads.
- currentUnrollMemoryForThisThread() - Method in class org.apache.spark.storage.MemoryStore
-
Return the amount of memory currently occupied for unrolling blocks by this thread.
- currPrefLocs(Partition) - Method in class org.apache.spark.rdd.PartitionCoalescer
-
- DAGScheduler - Class in org.apache.spark.scheduler
-
The high-level scheduling layer that implements stage-oriented scheduling.
- DAGScheduler(SparkContext, TaskScheduler, LiveListenerBus, MapOutputTrackerMaster, BlockManagerMaster, SparkEnv, Clock) - Constructor for class org.apache.spark.scheduler.DAGScheduler
-
- DAGScheduler(SparkContext, TaskScheduler) - Constructor for class org.apache.spark.scheduler.DAGScheduler
-
- DAGScheduler(SparkContext) - Constructor for class org.apache.spark.scheduler.DAGScheduler
-
- dagScheduler() - Method in class org.apache.spark.scheduler.DAGSchedulerSource
-
- dagScheduler() - Method in class org.apache.spark.scheduler.TaskSchedulerImpl
-
- dagScheduler() - Method in class org.apache.spark.SparkContext
-
- DAGSchedulerActorSupervisor - Class in org.apache.spark.scheduler
-
- DAGSchedulerActorSupervisor(DAGScheduler) - Constructor for class org.apache.spark.scheduler.DAGSchedulerActorSupervisor
-
- DAGSchedulerEvent - Interface in org.apache.spark.scheduler
-
Types of events that can be handled by the DAGScheduler.
- DAGSchedulerEventProcessActor - Class in org.apache.spark.scheduler
-
- DAGSchedulerEventProcessActor(DAGScheduler) - Constructor for class org.apache.spark.scheduler.DAGSchedulerEventProcessActor
-
- DAGSchedulerSource - Class in org.apache.spark.scheduler
-
- DAGSchedulerSource(DAGScheduler) - Constructor for class org.apache.spark.scheduler.DAGSchedulerSource
-
- data() - Method in class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.LaunchTask
-
- data() - Method in class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.StatusUpdate
-
- data() - Method in class org.apache.spark.storage.BlockResult
-
- data() - Method in class org.apache.spark.storage.PutResult
-
- data() - Method in class org.apache.spark.util.Distribution
-
- data() - Method in class org.apache.spark.util.random.GapSamplingIterator
-
- data() - Method in class org.apache.spark.util.random.GapSamplingReplacementIterator
-
- database() - Method in class org.apache.spark.sql.hive.execution.CreateTableAsSelect
-
- databaseName() - Method in class org.apache.spark.sql.hive.MetastoreRelation
-
- dataDeserialize(BlockId, ByteBuffer, Serializer) - Method in class org.apache.spark.storage.BlockManager
-
Deserializes a ByteBuffer into an iterator of values and disposes of it when the end of
the iterator is reached.
- dataIncludesKey() - Method in class org.apache.spark.sql.parquet.ParquetRelation2
-
- dataSchema() - Method in class org.apache.spark.sql.parquet.ParquetRelation2
-
- dataSerialize(BlockId, Iterator<Object>, Serializer) - Method in class org.apache.spark.storage.BlockManager
-
Serializes into a byte buffer.
- dataSerializeStream(BlockId, OutputStream, Iterator<Object>, Serializer) - Method in class org.apache.spark.storage.BlockManager
-
Serializes into a stream.
- DataSinks() - Method in interface org.apache.spark.sql.hive.HiveStrategies
-
- DataSourceStrategy - Class in org.apache.spark.sql.sources
-
A Strategy for planning scans over data sources defined using the sources API.
- DataSourceStrategy() - Constructor for class org.apache.spark.sql.sources.DataSourceStrategy
-
- DataType - Class in org.apache.spark.sql.api.java
-
The base type of all Spark SQL data types.
- DataType() - Constructor for class org.apache.spark.sql.api.java.DataType
-
- dataType() - Method in class org.apache.spark.sql.columnar.NativeColumnType
-
- dataType() - Method in class org.apache.spark.sql.execution.PythonUDF
-
- dataType() - Method in class org.apache.spark.sql.hive.HiveGenericUdaf
-
- dataType() - Method in class org.apache.spark.sql.hive.HiveGenericUdf
-
- dataType() - Method in class org.apache.spark.sql.hive.HiveSimpleUdf
-
- dataType() - Method in class org.apache.spark.sql.hive.HiveUdaf
-
- DataTypeConversions - Class in org.apache.spark.sql.types.util
-
- DataTypeConversions() - Constructor for class org.apache.spark.sql.types.util.DataTypeConversions
-
- DataValidators - Class in org.apache.spark.mllib.util
-
:: DeveloperApi ::
A collection of methods used to validate data before applying ML algorithms.
- DataValidators() - Constructor for class org.apache.spark.mllib.util.DataValidators
-
- DATE - Class in org.apache.spark.sql.columnar
-
- DATE() - Constructor for class org.apache.spark.sql.columnar.DATE
-
- DateColumnAccessor - Class in org.apache.spark.sql.columnar
-
- DateColumnAccessor(ByteBuffer) - Constructor for class org.apache.spark.sql.columnar.DateColumnAccessor
-
- DateColumnBuilder - Class in org.apache.spark.sql.columnar
-
- DateColumnBuilder() - Constructor for class org.apache.spark.sql.columnar.DateColumnBuilder
-
- DateColumnStats - Class in org.apache.spark.sql.columnar
-
- DateColumnStats() - Constructor for class org.apache.spark.sql.columnar.DateColumnStats
-
- DateType - Static variable in class org.apache.spark.sql.api.java.DataType
-
Gets the DateType object.
- DateType - Class in org.apache.spark.sql.api.java
-
The data type representing java.sql.Date values.
- datum() - Method in class org.apache.spark.mllib.tree.impl.BaggedPoint
-
- DDLParser - Class in org.apache.spark.sql.sources
-
A parser for foreign DDL commands.
- DDLParser() - Constructor for class org.apache.spark.sql.sources.DDLParser
-
- dead(String) - Method in class org.apache.spark.scheduler.cluster.SparkDeploySchedulerBackend
-
- decayFactor() - Method in class org.apache.spark.mllib.clustering.StreamingKMeans
-
- decimalMetadata() - Method in class org.apache.spark.sql.parquet.ParquetTypeInfo
-
- decimalMetastoreString(DecimalType) - Static method in class org.apache.spark.sql.hive.HiveShim
-
- DecimalType - Class in org.apache.spark.sql.api.java
-
The data type representing java.math.BigDecimal values.
- DecimalType(int, int) - Constructor for class org.apache.spark.sql.api.java.DecimalType
-
- DecimalType() - Constructor for class org.apache.spark.sql.api.java.DecimalType
-
- decimalTypeInfo(DecimalType) - Static method in class org.apache.spark.sql.hive.HiveShim
-
- decimalTypeInfoToCatalyst(PrimitiveObjectInspector) - Static method in class org.apache.spark.sql.hive.HiveShim
-
- DecisionTree - Class in org.apache.spark.mllib.tree
-
:: Experimental ::
A class which implements a decision tree learning algorithm for classification and regression.
- DecisionTree(Strategy) - Constructor for class org.apache.spark.mllib.tree.DecisionTree
-
- DecisionTreeMetadata - Class in org.apache.spark.mllib.tree.impl
-
Learning and dataset metadata for DecisionTree.
- DecisionTreeMetadata(int, long, int, int, Map<Object, Object>, Set<Object>, int[], Impurity, Enumeration.Value, int, int, double, int, int) - Constructor for class org.apache.spark.mllib.tree.impl.DecisionTreeMetadata
-
- DecisionTreeModel - Class in org.apache.spark.mllib.tree.model
-
:: Experimental ::
Decision tree model for classification or regression.
- DecisionTreeModel(Node, Enumeration.Value) - Constructor for class org.apache.spark.mllib.tree.model.DecisionTreeModel
-
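A minimal Scala training sketch for the DecisionTree/DecisionTreeModel pair above; the LibSVM path is a placeholder and the parameters are arbitrary but typical values:

    import org.apache.spark.{SparkConf, SparkContext}
    import org.apache.spark.mllib.tree.DecisionTree
    import org.apache.spark.mllib.util.MLUtils

    val sc = new SparkContext(new SparkConf().setAppName("dt-example").setMaster("local[2]"))
    val data = MLUtils.loadLibSVMFile(sc, "data/sample_libsvm_data.txt")   // RDD[LabeledPoint]
    val numClasses = 2
    val categoricalFeaturesInfo = Map[Int, Int]()   // treat all features as continuous
    val impurity = "gini"
    val maxDepth = 5
    val maxBins = 32
    val model = DecisionTree.trainClassifier(
      data, numClasses, categoricalFeaturesInfo, impurity, maxDepth, maxBins)
    println(s"Trained a tree of depth ${model.depth}")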
- decoder(ByteBuffer, NativeColumnType<T>) - Static method in class org.apache.spark.sql.columnar.compression.BooleanBitSet
-
- decoder() - Method in interface org.apache.spark.sql.columnar.compression.CompressibleColumnAccessor
-
- decoder(ByteBuffer, NativeColumnType<T>) - Method in interface org.apache.spark.sql.columnar.compression.CompressionScheme
-
- Decoder<T extends org.apache.spark.sql.catalyst.types.NativeType> - Interface in org.apache.spark.sql.columnar.compression
-
- decoder(ByteBuffer, NativeColumnType<T>) - Static method in class org.apache.spark.sql.columnar.compression.DictionaryEncoding
-
- decoder(ByteBuffer, NativeColumnType<T>) - Static method in class org.apache.spark.sql.columnar.compression.IntDelta
-
- decoder(ByteBuffer, NativeColumnType<T>) - Static method in class org.apache.spark.sql.columnar.compression.LongDelta
-
- decoder(ByteBuffer, NativeColumnType<T>) - Static method in class org.apache.spark.sql.columnar.compression.PassThrough
-
- decoder(ByteBuffer, NativeColumnType<T>) - Static method in class org.apache.spark.sql.columnar.compression.RunLengthEncoding
-
- decreaseRunningTasks(int) - Method in class org.apache.spark.scheduler.Pool
-
- deepCopy() - Method in class org.apache.spark.mllib.tree.model.Node
-
Returns a deep copy of the subtree rooted at this node.
- DEFAULT_BUFFER_SIZE() - Static method in class org.apache.spark.util.logging.RollingFileAppender
-
- DEFAULT_CLEANER_TTL() - Static method in class org.apache.spark.streaming.StreamingContext
-
- DEFAULT_LOG_DIR() - Static method in class org.apache.spark.scheduler.EventLoggingListener
-
- DEFAULT_MINIMUM_SHARE() - Method in class org.apache.spark.scheduler.FairSchedulableBuilder
-
- DEFAULT_POOL_NAME() - Method in class org.apache.spark.scheduler.FairSchedulableBuilder
-
- DEFAULT_POOL_NAME() - Static method in class org.apache.spark.ui.jobs.JobProgressListener
-
- DEFAULT_PORT() - Static method in class org.apache.spark.ui.SparkUI
-
- DEFAULT_PREFIX() - Method in class org.apache.spark.metrics.MetricsConfig
-
- DEFAULT_RETAINED_JOBS() - Static method in class org.apache.spark.ui.jobs.JobProgressListener
-
- DEFAULT_RETAINED_STAGES() - Static method in class org.apache.spark.ui.jobs.JobProgressListener
-
- DEFAULT_SCHEDULER_FILE() - Method in class org.apache.spark.scheduler.FairSchedulableBuilder
-
- DEFAULT_SCHEDULING_MODE() - Method in class org.apache.spark.scheduler.FairSchedulableBuilder
-
- DEFAULT_WEIGHT() - Method in class org.apache.spark.scheduler.FairSchedulableBuilder
-
- defaultCorrName() - Static method in class org.apache.spark.mllib.stat.correlation.CorrelationNames
-
- defaultFilter(Path) - Static method in class org.apache.spark.streaming.dstream.FileInputDStream
-
- defaultMinPartitions() - Method in class org.apache.spark.api.java.JavaSparkContext
-
Default min number of partitions for Hadoop RDDs when not given by user
- defaultMinPartitions() - Method in class org.apache.spark.SparkContext
-
Default min number of partitions for Hadoop RDDs when not given by user
- defaultMinSplits() - Method in class org.apache.spark.api.java.JavaSparkContext
-
- defaultMinSplits() - Method in class org.apache.spark.SparkContext
-
Default min number of partitions for Hadoop RDDs when not given by user
- defaultParallelism() - Method in class org.apache.spark.api.java.JavaSparkContext
-
Default level of parallelism to use when not given by user (e.g.
- defaultParallelism() - Method in class org.apache.spark.scheduler.cluster.CoarseGrainedSchedulerBackend
-
- defaultParallelism() - Method in class org.apache.spark.scheduler.cluster.mesos.MesosSchedulerBackend
-
- defaultParallelism() - Method in class org.apache.spark.scheduler.local.LocalBackend
-
- defaultParallelism() - Method in interface org.apache.spark.scheduler.SchedulerBackend
-
- defaultParallelism() - Method in interface org.apache.spark.scheduler.TaskScheduler
-
- defaultParallelism() - Method in class org.apache.spark.scheduler.TaskSchedulerImpl
-
- defaultParallelism() - Method in class org.apache.spark.SparkContext
-
Default level of parallelism to use when not given by user (e.g.
- defaultParams(String) - Static method in class org.apache.spark.mllib.tree.configuration.BoostingStrategy
-
Returns default configuration for the boosting algorithm
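A hedged Scala sketch of starting from the default configuration and handing it to the companion GradientBoostedTrees trainer (not indexed in this section); the input path and parameter values are placeholders:

    import org.apache.spark.{SparkConf, SparkContext}
    import org.apache.spark.mllib.tree.GradientBoostedTrees
    import org.apache.spark.mllib.tree.configuration.BoostingStrategy
    import org.apache.spark.mllib.util.MLUtils

    val sc = new SparkContext(new SparkConf().setAppName("gbt-example").setMaster("local[2]"))
    val data = MLUtils.loadLibSVMFile(sc, "data/sample_libsvm_data.txt")   // placeholder path
    // Start from the defaults and adjust a couple of fields
    val boostingStrategy = BoostingStrategy.defaultParams("Classification")
    boostingStrategy.numIterations = 10
    boostingStrategy.treeStrategy.maxDepth = 4
    val model = GradientBoostedTrees.train(data, boostingStrategy)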
- defaultPartitioner(RDD<?>, Seq<RDD<?>>) - Static method in class org.apache.spark.Partitioner
-
Choose a partitioner to use for a cogroup-like operation between a number of RDDs.
- defaultPartitioner(int) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
-
- defaultProbabilities() - Method in class org.apache.spark.util.Distribution
-
- defaultSize() - Method in class org.apache.spark.sql.columnar.ColumnType
-
- defaultSizeInBytes() - Method in interface org.apache.spark.sql.SQLConf
-
The default size in bytes to assign to a logical operator's estimation statistics.
- DefaultSource - Class in org.apache.spark.sql.json
-
- DefaultSource() - Constructor for class org.apache.spark.sql.json.DefaultSource
-
- DefaultSource - Class in org.apache.spark.sql.parquet
-
Allows creation of parquet based tables using the syntax
CREATE TEMPORARY TABLE ...
- DefaultSource() - Constructor for class org.apache.spark.sql.parquet.DefaultSource
-
- defaultStrategy(String) - Static method in class org.apache.spark.mllib.tree.configuration.Strategy
-
- defaultStrategy() - Static method in class org.apache.spark.streaming.receiver.ActorSupervisorStrategy
-
- defaultValue() - Method in class org.apache.spark.ml.param.Param
-
- DeferredObjectAdapter - Class in org.apache.spark.sql.hive
-
- DeferredObjectAdapter(ObjectInspector) - Constructor for class org.apache.spark.sql.hive.DeferredObjectAdapter
-
- degrees() - Method in class org.apache.spark.graphx.GraphOps
-
The degree of each vertex in the graph.
- degreesOfFreedom() - Method in class org.apache.spark.mllib.stat.test.ChiSqTestResult
-
- degreesOfFreedom() - Method in interface org.apache.spark.mllib.stat.test.TestResult
-
Returns the degree(s) of freedom of the hypothesis test.
- delaySeconds() - Method in class org.apache.spark.streaming.Checkpoint
-
- delegate() - Method in class org.apache.spark.InterruptibleIterator
-
- deleteAllCheckpoints() - Method in class org.apache.spark.mllib.tree.impl.NodeIdCache
-
Call this after training is finished to delete any remaining checkpoints.
- deleteOldFiles() - Method in class org.apache.spark.util.logging.RollingFileAppender
-
Retain only the last few files
- deleteRecursively(File) - Static method in class org.apache.spark.util.Utils
-
Delete a file or directory and its contents recursively.
- deleteRecursively(TachyonFile, TachyonFS) - Static method in class org.apache.spark.util.Utils
-
Delete a file or directory and its contents recursively.
- dense(int, int, double[]) - Static method in class org.apache.spark.mllib.linalg.Matrices
-
Creates a column-major dense matrix.
- dense(double, double...) - Static method in class org.apache.spark.mllib.linalg.Vectors
-
Creates a dense vector from its values.
- dense(double, Seq<Object>) - Static method in class org.apache.spark.mllib.linalg.Vectors
-
Creates a dense vector from its values.
- dense(double[]) - Static method in class org.apache.spark.mllib.linalg.Vectors
-
Creates a dense vector from a double array.
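A short Scala sketch of the dense factories above:

    import org.apache.spark.mllib.linalg.{Matrices, Vectors}

    val v1 = Vectors.dense(1.0, 0.0, 3.0)            // from individual values
    val v2 = Vectors.dense(Array(1.0, 0.0, 3.0))     // from an existing array
    // A 3x2 column-major matrix:
    //   1.0  4.0
    //   2.0  5.0
    //   3.0  6.0
    val m = Matrices.dense(3, 2, Array(1.0, 2.0, 3.0, 4.0, 5.0, 6.0))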
- DenseMatrix - Class in org.apache.spark.mllib.linalg
-
Column-major dense matrix.
- DenseMatrix(int, int, double[]) - Constructor for class org.apache.spark.mllib.linalg.DenseMatrix
-
- DenseVector - Class in org.apache.spark.mllib.linalg
-
A dense vector represented by a value array.
- DenseVector(double[]) - Constructor for class org.apache.spark.mllib.linalg.DenseVector
-
- dependencies() - Method in class org.apache.spark.rdd.RDD
-
Get the list of dependencies of this RDD, taking into account whether the
RDD is checkpointed or not.
- dependencies() - Method in class org.apache.spark.streaming.dstream.DStream
-
List of parent DStreams on which this DStream depends
- dependencies() - Method in class org.apache.spark.streaming.dstream.FilteredDStream
-
- dependencies() - Method in class org.apache.spark.streaming.dstream.FlatMappedDStream
-
- dependencies() - Method in class org.apache.spark.streaming.dstream.FlatMapValuedDStream
-
- dependencies() - Method in class org.apache.spark.streaming.dstream.ForEachDStream
-
- dependencies() - Method in class org.apache.spark.streaming.dstream.GlommedDStream
-
- dependencies() - Method in class org.apache.spark.streaming.dstream.InputDStream
-
- dependencies() - Method in class org.apache.spark.streaming.dstream.MapPartitionedDStream
-
- dependencies() - Method in class org.apache.spark.streaming.dstream.MappedDStream
-
- dependencies() - Method in class org.apache.spark.streaming.dstream.MapValuedDStream
-
- dependencies() - Method in class org.apache.spark.streaming.dstream.ReducedWindowedDStream
-
- dependencies() - Method in class org.apache.spark.streaming.dstream.ShuffledDStream
-
- dependencies() - Method in class org.apache.spark.streaming.dstream.StateDStream
-
- dependencies() - Method in class org.apache.spark.streaming.dstream.TransformedDStream
-
- dependencies() - Method in class org.apache.spark.streaming.dstream.UnionDStream
-
- dependencies() - Method in class org.apache.spark.streaming.dstream.WindowedDStream
-
- Dependency<T> - Class in org.apache.spark
-
:: DeveloperApi ::
Base class for dependencies.
- Dependency() - Constructor for class org.apache.spark.Dependency
-
- deps() - Method in class org.apache.spark.rdd.CoGroupPartition
-
- depth() - Method in class org.apache.spark.mllib.tree.model.DecisionTreeModel
-
Get depth of tree.
- DeregisterReceiver - Class in org.apache.spark.streaming.scheduler
-
- DeregisterReceiver(int, String, String) - Constructor for class org.apache.spark.streaming.scheduler.DeregisterReceiver
-
- desc() - Method in class org.apache.spark.sql.hive.execution.CreateTableAsSelect
-
- DescribeCommand - Class in org.apache.spark.sql.execution
-
:: DeveloperApi ::
- DescribeCommand(SparkPlan, Seq<Attribute>, SQLContext) - Constructor for class org.apache.spark.sql.execution.DescribeCommand
-
- describedTable() - Method in class org.apache.spark.sql.hive.test.TestHiveContext
-
- DescribeHiveTableCommand - Class in org.apache.spark.sql.hive.execution
-
Implementation for "describe [extended] table".
- DescribeHiveTableCommand(MetastoreRelation, Seq<Attribute>, boolean, HiveContext) - Constructor for class org.apache.spark.sql.hive.execution.DescribeHiveTableCommand
-
- description() - Method in class org.apache.spark.ExceptionFailure
-
- description() - Method in class org.apache.spark.storage.StorageLevel
-
- description() - Method in class org.apache.spark.ui.jobs.UIData.StageUIData
-
- DeserializationStream - Class in org.apache.spark.serializer
-
:: DeveloperApi ::
A stream for reading serialized objects.
- DeserializationStream() - Constructor for class org.apache.spark.serializer.DeserializationStream
-
- deserialize(Object) - Method in class org.apache.spark.mllib.linalg.VectorUDT
-
- deserialize(ByteBuffer, ClassTag<T>) - Method in class org.apache.spark.serializer.JavaSerializerInstance
-
- deserialize(ByteBuffer, ClassLoader, ClassTag<T>) - Method in class org.apache.spark.serializer.JavaSerializerInstance
-
- deserialize(ByteBuffer, ClassTag<T>) - Method in class org.apache.spark.serializer.KryoSerializerInstance
-
- deserialize(ByteBuffer, ClassLoader, ClassTag<T>) - Method in class org.apache.spark.serializer.KryoSerializerInstance
-
- deserialize(ByteBuffer, ClassTag<T>) - Method in class org.apache.spark.serializer.SerializerInstance
-
- deserialize(ByteBuffer, ClassLoader, ClassTag<T>) - Method in class org.apache.spark.serializer.SerializerInstance
-
- deserialize(Object) - Method in class org.apache.spark.sql.api.java.JavaToScalaUDTWrapper
-
Convert a SQL datum to the user type
- deserialize(Object) - Method in class org.apache.spark.sql.api.java.ScalaToJavaUDTWrapper
-
Convert a SQL datum to the user type
- deserialize(Object) - Method in class org.apache.spark.sql.api.java.UserDefinedType
-
Convert a SQL datum to the user type
- deserialize(byte[], ClassTag<T>) - Static method in class org.apache.spark.sql.execution.SparkSqlSerializer
-
- deserialize(Writable) - Method in class org.apache.spark.sql.hive.parquet.FakeParquetSerDe
-
- deserialize(Object) - Method in class org.apache.spark.sql.test.ExamplePointUDT
-
- deserialize(byte[]) - Static method in class org.apache.spark.util.Utils
-
Deserialize an object using Java serialization
- deserialize(byte[], ClassLoader) - Static method in class org.apache.spark.util.Utils
-
Deserialize an object using Java serialization and the given ClassLoader
- deserialized() - Method in class org.apache.spark.storage.MemoryEntry
-
- deserialized() - Method in class org.apache.spark.storage.StorageLevel
-
- deserializeFilterExpressions(Configuration) - Static method in class org.apache.spark.sql.parquet.ParquetFilters
-
Note: Inside the Hadoop API we only have access to Configuration, not to SparkContext, so we cannot use broadcasts to convey the actual filter predicate.
- deserializeLongValue(byte[]) - Static method in class org.apache.spark.util.Utils
-
Deserialize a Long value (used for PythonPartitioner)
- deserializeMapStatuses(byte[]) - Static method in class org.apache.spark.MapOutputTracker
-
- deserializePlan(InputStream, Class<?>) - Method in class org.apache.spark.sql.hive.HiveFunctionWrapper
-
- deserializeStream(InputStream) - Method in class org.apache.spark.serializer.JavaSerializerInstance
-
- deserializeStream(InputStream, ClassLoader) - Method in class org.apache.spark.serializer.JavaSerializerInstance
-
- deserializeStream(InputStream) - Method in class org.apache.spark.serializer.KryoSerializerInstance
-
- deserializeStream(InputStream) - Method in class org.apache.spark.serializer.SerializerInstance
-
- deserializeViaNestedStream(InputStream, SerializerInstance, Function1<DeserializationStream, BoxedUnit>) - Static method in class org.apache.spark.util.Utils
-
Deserialize via nested stream using specific serializer
- deserializeWithDependencies(ByteBuffer) - Static method in class org.apache.spark.scheduler.Task
-
Deserialize the list of dependencies in a task serialized with serializeWithDependencies,
and return the task itself as a serialized ByteBuffer.
- destinationToken() - Static method in class org.apache.spark.sql.hive.HiveQl
-
- destroy() - Method in class org.apache.spark.broadcast.Broadcast
-
Destroy all data and metadata related to this broadcast variable.
- destroy(boolean) - Method in class org.apache.spark.broadcast.Broadcast
-
Destroy all data and metadata related to this broadcast variable.
- destroyPythonWorker(String, Map<String, String>, Socket) - Method in class org.apache.spark.SparkEnv
-
- destTableId() - Method in class org.apache.spark.sql.hive.ShimFileSinkDesc
-
- details() - Method in class org.apache.spark.scheduler.Stage
-
- details() - Method in class org.apache.spark.scheduler.StageInfo
-
- determineBounds(ArrayBuffer<Tuple2<K, Object>>, int, Ordering<K>, ClassTag<K>) - Static method in class org.apache.spark.RangePartitioner
-
Determines the bounds for range partitioning from candidates with weights indicating how many
items each represents.
- DeveloperApi - Annotation Type in org.apache.spark.annotation
-
A lower-level, unstable API intended for developers.
- diag(Vector) - Static method in class org.apache.spark.mllib.linalg.Matrices
-
Generate a diagonal matrix in DenseMatrix
format from the supplied values.
- dialect() - Method in class org.apache.spark.sql.hive.HiveContext
-
- dialect() - Method in interface org.apache.spark.sql.SQLConf
-
The SQL dialect that is used when parsing queries.
- DictionaryEncoding - Class in org.apache.spark.sql.columnar.compression
-
- DictionaryEncoding() - Constructor for class org.apache.spark.sql.columnar.compression.DictionaryEncoding
-
- DictionaryEncoding.Decoder<T extends org.apache.spark.sql.catalyst.types.NativeType> - Class in org.apache.spark.sql.columnar.compression
-
- DictionaryEncoding.Decoder(ByteBuffer, NativeColumnType<T>) - Constructor for class org.apache.spark.sql.columnar.compression.DictionaryEncoding.Decoder
-
- DictionaryEncoding.Encoder<T extends org.apache.spark.sql.catalyst.types.NativeType> - Class in org.apache.spark.sql.columnar.compression
-
- DictionaryEncoding.Encoder(NativeColumnType<T>) - Constructor for class org.apache.spark.sql.columnar.compression.DictionaryEncoding.Encoder
-
- diff(Self) - Method in class org.apache.spark.graphx.impl.VertexPartitionBaseOps
-
Hides vertices that are the same between this and other.
- diff(VertexRDD<VD>) - Method in class org.apache.spark.graphx.impl.VertexRDDImpl
-
- diff(VertexRDD<VD>) - Method in class org.apache.spark.graphx.VertexRDD
-
Hides vertices that are the same between this and other; for vertices that are different, keeps the values from other.
- dir() - Method in class org.apache.spark.mllib.optimization.NNLS.Workspace
-
- dir() - Method in class org.apache.spark.sql.hive.ShimFileSinkDesc
-
- DirectTaskResult<T> - Class in org.apache.spark.scheduler
-
A TaskResult that contains the task's return value and accumulator updates.
- DirectTaskResult(ByteBuffer, Map<Object, Object>, TaskMetrics) - Constructor for class org.apache.spark.scheduler.DirectTaskResult
-
- DirectTaskResult() - Constructor for class org.apache.spark.scheduler.DirectTaskResult
-
- disableOutputSpecValidation() - Static method in class org.apache.spark.rdd.PairRDDFunctions
-
Allows for the spark.hadoop.validateOutputSpecs
checks to be disabled on a case-by-case
basis; see SPARK-4835 for more details.
- disconnected(SchedulerDriver) - Method in class org.apache.spark.scheduler.cluster.mesos.CoarseMesosSchedulerBackend
-
- disconnected(SchedulerDriver) - Method in class org.apache.spark.scheduler.cluster.mesos.MesosSchedulerBackend
-
- disconnected() - Method in class org.apache.spark.scheduler.cluster.SparkDeploySchedulerBackend
-
- DISK_ONLY - Static variable in class org.apache.spark.api.java.StorageLevels
-
- DISK_ONLY() - Static method in class org.apache.spark.storage.StorageLevel
-
- DISK_ONLY_2 - Static variable in class org.apache.spark.api.java.StorageLevels
-
- DISK_ONLY_2() - Static method in class org.apache.spark.storage.StorageLevel
-
- diskBlockManager() - Method in class org.apache.spark.storage.BlockManager
-
- DiskBlockManager - Class in org.apache.spark.storage
-
Creates and maintains the logical mapping between logical blocks and physical on-disk
locations.
- DiskBlockManager(BlockManager, SparkConf) - Constructor for class org.apache.spark.storage.DiskBlockManager
-
- DiskBlockObjectWriter - Class in org.apache.spark.storage
-
BlockObjectWriter which writes directly to a file on disk.
- DiskBlockObjectWriter(BlockId, File, Serializer, int, Function1<OutputStream, OutputStream>, boolean, ShuffleWriteMetrics) - Constructor for class org.apache.spark.storage.DiskBlockObjectWriter
-
- diskBytesSpilled() - Method in class org.apache.spark.ui.jobs.UIData.ExecutorSummary
-
- diskBytesSpilled() - Method in class org.apache.spark.ui.jobs.UIData.StageUIData
-
- diskSize() - Method in class org.apache.spark.storage.BlockManagerMessages.UpdateBlockInfo
-
- diskSize() - Method in class org.apache.spark.storage.BlockStatus
-
- diskSize() - Method in class org.apache.spark.storage.RDDInfo
-
- diskStore() - Method in class org.apache.spark.storage.BlockManager
-
- DiskStore - Class in org.apache.spark.storage
-
Stores BlockManager blocks on disk.
- DiskStore(BlockManager, DiskBlockManager) - Constructor for class org.apache.spark.storage.DiskStore
-
- diskUsed() - Method in class org.apache.spark.storage.StorageStatus
-
Return the disk space used by this block manager.
- diskUsed() - Method in class org.apache.spark.ui.exec.ExecutorSummaryInfo
-
- diskUsedByRdd(int) - Method in class org.apache.spark.storage.StorageStatus
-
Return the disk space used by the given RDD in this block manager in O(1) time.
- dispose(ByteBuffer) - Static method in class org.apache.spark.storage.BlockManager
-
Attempt to clean up a ByteBuffer if it is memory-mapped.
- dist(Vector) - Method in class org.apache.spark.util.Vector
-
- distinct() - Method in class org.apache.spark.api.java.JavaDoubleRDD
-
Return a new RDD containing the distinct elements in this RDD.
- distinct(int) - Method in class org.apache.spark.api.java.JavaDoubleRDD
-
Return a new RDD containing the distinct elements in this RDD.
- distinct() - Method in class org.apache.spark.api.java.JavaPairRDD
-
Return a new RDD containing the distinct elements in this RDD.
- distinct(int) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Return a new RDD containing the distinct elements in this RDD.
- distinct() - Method in class org.apache.spark.api.java.JavaRDD
-
Return a new RDD containing the distinct elements in this RDD.
- distinct(int) - Method in class org.apache.spark.api.java.JavaRDD
-
Return a new RDD containing the distinct elements in this RDD.
- distinct(int, Ordering<T>) - Method in class org.apache.spark.rdd.RDD
-
Return a new RDD containing the distinct elements in this RDD.
- distinct() - Method in class org.apache.spark.rdd.RDD
-
Return a new RDD containing the distinct elements in this RDD.
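A tiny Scala sketch, assuming sc is an existing SparkContext:

    val nums = sc.parallelize(Seq(1, 2, 2, 3, 3, 3))
    nums.distinct().collect()     // Array(1, 2, 3), in no particular order
    nums.distinct(2)              // same elements, shuffled into 2 partitions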
- distinct() - Method in class org.apache.spark.sql.api.java.JavaSchemaRDD
-
Return a new RDD containing the distinct elements in this RDD.
- distinct(int) - Method in class org.apache.spark.sql.api.java.JavaSchemaRDD
-
Return a new RDD containing the distinct elements in this RDD.
- Distinct - Class in org.apache.spark.sql.execution
-
:: DeveloperApi ::
Computes the set of distinct input rows using a HashSet.
- Distinct(boolean, SparkPlan) - Constructor for class org.apache.spark.sql.execution.Distinct
-
- distinct() - Method in class org.apache.spark.sql.SchemaRDD
-
- distinct(int, Ordering<Row>) - Method in class org.apache.spark.sql.SchemaRDD
-
- DistributedMatrix - Interface in org.apache.spark.mllib.linalg.distributed
-
Represents a distributively stored matrix backed by one or more RDDs.
- Distribution - Class in org.apache.spark.util
-
Util for getting some stats from a small sample of numeric values, with some handy
summary functions.
- Distribution(double[], int, int) - Constructor for class org.apache.spark.util.Distribution
-
- Distribution(Traversable<Object>) - Constructor for class org.apache.spark.util.Distribution
-
- DIV() - Static method in class org.apache.spark.sql.hive.HiveQl
-
- div(Duration) - Method in class org.apache.spark.streaming.Duration
-
- divide(double) - Method in class org.apache.spark.util.Vector
-
- doc() - Method in class org.apache.spark.ml.param.Param
-
- doCancelAllJobs() - Method in class org.apache.spark.scheduler.DAGScheduler
-
- doCheckpoint() - Method in class org.apache.spark.rdd.RDD
-
Performs the checkpointing of this RDD by saving this.
- doCheckpoint() - Method in class org.apache.spark.rdd.RDDCheckpointData
-
- DoCheckpoint - Class in org.apache.spark.streaming.scheduler
-
- DoCheckpoint(Time) - Constructor for class org.apache.spark.streaming.scheduler.DoCheckpoint
-
- doCleanupBroadcast(long, boolean) - Method in class org.apache.spark.ContextCleaner
-
Perform broadcast cleanup.
- doCleanupRDD(int, boolean) - Method in class org.apache.spark.ContextCleaner
-
Perform RDD cleanup.
- doCleanupShuffle(int, boolean) - Method in class org.apache.spark.ContextCleaner
-
Perform shuffle cleanup, asynchronously.
- doesDirectoryContainAnyNewFiles(File, long) - Static method in class org.apache.spark.util.Utils
-
Determines if a directory contains any files newer than cutoff seconds.
- doKillExecutors(Seq<String>) - Method in class org.apache.spark.scheduler.cluster.YarnSchedulerBackend
-
Request that the ApplicationMaster kill the specified executors.
- doRequestTotalExecutors(int) - Method in class org.apache.spark.scheduler.cluster.YarnSchedulerBackend
-
Request executors from the ApplicationMaster by specifying the total number desired.
- dot(Vector, Vector) - Static method in class org.apache.spark.mllib.linalg.BLAS
-
dot(x, y)
- dot(Vector) - Method in class org.apache.spark.util.Vector
-
- DOUBLE - Class in org.apache.spark.sql.columnar
-
- DOUBLE() - Constructor for class org.apache.spark.sql.columnar.DOUBLE
-
- doubleAccumulator(double) - Method in class org.apache.spark.api.java.JavaSparkContext
-
Create an Accumulator double variable, which tasks can "add" values to using the add method.
- doubleAccumulator(double, String) - Method in class org.apache.spark.api.java.JavaSparkContext
-
Create an Accumulator double variable, which tasks can "add" values to using the add method.
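A Scala-side counterpart sketch of the Java doubleAccumulator helpers above, using SparkContext.accumulator; it assumes sc is an existing SparkContext:

    val parseErrors = sc.accumulator(0.0, "parse errors")
    sc.parallelize(Seq("1.5", "oops", "2.5")).foreach { s =>
      try s.toDouble
      catch { case _: NumberFormatException => parseErrors += 1.0 }
    }
    println(parseErrors.value)   // read on the driver once the action has finished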
- DoubleColumnAccessor - Class in org.apache.spark.sql.columnar
-
- DoubleColumnAccessor(ByteBuffer) - Constructor for class org.apache.spark.sql.columnar.DoubleColumnAccessor
-
- DoubleColumnBuilder - Class in org.apache.spark.sql.columnar
-
- DoubleColumnBuilder() - Constructor for class org.apache.spark.sql.columnar.DoubleColumnBuilder
-
- DoubleColumnStats - Class in org.apache.spark.sql.columnar
-
- DoubleColumnStats() - Constructor for class org.apache.spark.sql.columnar.DoubleColumnStats
-
- DoubleFlatMapFunction<T> - Interface in org.apache.spark.api.java.function
-
A function that returns zero or more records of type Double from each input record.
- DoubleFunction<T> - Interface in org.apache.spark.api.java.function
-
A function that returns Doubles, and can be used to construct DoubleRDDs.
- DoubleParam - Class in org.apache.spark.ml.param
-
Specialized version of Param[Double] for Java.
- DoubleParam(Params, String, String, Option<Object>) - Constructor for class org.apache.spark.ml.param.DoubleParam
-
- DoubleRDDFunctions - Class in org.apache.spark.rdd
-
Extra functions available on RDDs of Doubles through an implicit conversion.
- DoubleRDDFunctions(RDD<Object>) - Constructor for class org.apache.spark.rdd.DoubleRDDFunctions
-
- doubleRDDToDoubleRDDFunctions(RDD<Object>) - Static method in class org.apache.spark.SparkContext
-
- doubleToDoubleWritable(double) - Static method in class org.apache.spark.SparkContext
-
- doubleToMultiplier(double) - Static method in class org.apache.spark.util.Vector
-
- DoubleType - Static variable in class org.apache.spark.sql.api.java.DataType
-
Gets the DoubleType object.
- DoubleType - Class in org.apache.spark.sql.api.java
-
The data type representing double and Double values.
- doubleWritableConverter() - Static method in class org.apache.spark.SparkContext
-
- driver() - Method in class org.apache.spark.scheduler.cluster.mesos.CoarseMesosSchedulerBackend
-
- driver() - Method in class org.apache.spark.scheduler.cluster.mesos.MesosSchedulerBackend
-
- DRIVER_AKKA_ACTOR_NAME() - Method in class org.apache.spark.storage.BlockManagerMaster
-
- DRIVER_IDENTIFIER() - Static method in class org.apache.spark.SparkContext
-
- driverActor() - Method in class org.apache.spark.scheduler.cluster.CoarseGrainedSchedulerBackend
-
- driverActor() - Method in class org.apache.spark.storage.BlockManagerMaster
-
- driverActorSystemName() - Static method in class org.apache.spark.SparkEnv
-
- driverSideSetup() - Method in class org.apache.spark.sql.hive.SparkHiveWriterContainer
-
- dropFromMemory(BlockId, Either<Object[], ByteBuffer>) - Method in class org.apache.spark.storage.BlockManager
-
Drop a block from memory, possibly putting it on disk if applicable.
- droppedBlocks() - Method in class org.apache.spark.storage.PutResult
-
- droppedBlocks() - Method in class org.apache.spark.storage.ResultWithDroppedBlocks
-
- DropTable - Class in org.apache.spark.sql.hive
-
- DropTable(String, boolean) - Constructor for class org.apache.spark.sql.hive.DropTable
-
- DropTable - Class in org.apache.spark.sql.hive.execution
-
:: DeveloperApi ::
Drops a table from the metastore and removes it if it is cached.
- DropTable(String, boolean) - Constructor for class org.apache.spark.sql.hive.execution.DropTable
-
- dropTempTable(String) - Method in class org.apache.spark.sql.SQLContext
-
Drops the temporary table with the given table name in the catalog.
- Dst - Static variable in class org.apache.spark.graphx.TripletFields
-
Expose the destination and edge fields but not the source field.
- dstAttr() - Method in class org.apache.spark.graphx.EdgeContext
-
The vertex attribute of the edge's destination vertex.
- dstAttr() - Method in class org.apache.spark.graphx.EdgeTriplet
-
The destination vertex attribute
- dstAttr() - Method in class org.apache.spark.graphx.impl.AggregatingEdgeContext
-
- dstId() - Method in class org.apache.spark.graphx.Edge
-
- dstId() - Method in class org.apache.spark.graphx.EdgeContext
-
The vertex id of the edge's destination vertex.
- dstId() - Method in class org.apache.spark.graphx.impl.AggregatingEdgeContext
-
- dstId() - Method in class org.apache.spark.graphx.impl.EdgeWithLocalIds
-
- dstream() - Method in class org.apache.spark.streaming.api.java.JavaDStream
-
- dstream() - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
-
- dstream() - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
-
- DStream<T> - Class in org.apache.spark.streaming.dstream
-
A Discretized Stream (DStream), the basic abstraction in Spark Streaming, is a continuous
sequence of RDDs (of the same type) representing a continuous stream of data (see
org.apache.spark.rdd.RDD in the Spark core documentation for more details on RDDs).
- DStream(StreamingContext, ClassTag<T>) - Constructor for class org.apache.spark.streaming.dstream.DStream
-
- DStreamCheckpointData<T> - Class in org.apache.spark.streaming.dstream
-
- DStreamCheckpointData(DStream<T>, ClassTag<T>) - Constructor for class org.apache.spark.streaming.dstream.DStreamCheckpointData
-
- DStreamGraph - Class in org.apache.spark.streaming
-
- DStreamGraph() - Constructor for class org.apache.spark.streaming.DStreamGraph
-
- DTStatsAggregator - Class in org.apache.spark.mllib.tree.impl
-
DecisionTree statistics aggregator for a node.
- DTStatsAggregator(DecisionTreeMetadata, Option<int[]>) - Constructor for class org.apache.spark.mllib.tree.impl.DTStatsAggregator
-
- DummyCategoricalSplit - Class in org.apache.spark.mllib.tree.model
-
Split with no acceptable feature values for categorical features.
- DummyCategoricalSplit(int, Enumeration.Value) - Constructor for class org.apache.spark.mllib.tree.model.DummyCategoricalSplit
-
- DummyHighSplit - Class in org.apache.spark.mllib.tree.model
-
Split with maximum threshold for continuous features.
- DummyHighSplit(int, Enumeration.Value) - Constructor for class org.apache.spark.mllib.tree.model.DummyHighSplit
-
- DummyLowSplit - Class in org.apache.spark.mllib.tree.model
-
Split with minimum threshold for continuous features.
- DummyLowSplit(int, Enumeration.Value) - Constructor for class org.apache.spark.mllib.tree.model.DummyLowSplit
-
- dumpTree(Node, StringBuilder, int) - Static method in class org.apache.spark.sql.hive.HiveQl
-
- duration() - Method in class org.apache.spark.scheduler.TaskInfo
-
- Duration - Class in org.apache.spark.streaming
-
- Duration(long) - Constructor for class org.apache.spark.streaming.Duration
-
- duration() - Method in class org.apache.spark.streaming.Interval
-
- Durations - Class in org.apache.spark.streaming
-
- Durations() - Constructor for class org.apache.spark.streaming.Durations
-